
Author Topic: Camera calibration using a neural network: questions  (Read 11161 times)

32BT

Re: Camera calibration using a neural network: questions
« Reply #40 on: June 04, 2019, 02:26:06 am »

Could it be an indexing problem? Some array index off by 1?

That would make sense: it is being trained against incorrect references for the GS patches, but with correct samples in column 16. There is very likely an indexing problem in the GS patches somewhere during training.

32BT

Re: Camera calibration using a neural network: questions
« Reply #41 on: June 04, 2019, 02:38:47 am »


Regarding the Delta E calculation, I saw the dE2000 metric and got lost in the formulation. It would be great if it could be used as a loss function for the NN training, but for testing purposes I'll stick with the primitive dE76.

Regards!

You can safely stick to dE76. It's fast and fine for the purposes here. The later dE variations are more interesting for other purposes: for example, describing the perceptual differences between our grayscale perception and our color perception. That does not apply to anything here.
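For reference, dE76 is simply the Euclidean distance in Lab space; a minimal sketch in Python (the array names are placeholders, not from the repository):

import numpy as np

def delta_e_76(lab_ref, lab_pred):
    """CIE76 colour difference: Euclidean distance between Lab triplets."""
    diff = np.asarray(lab_ref, dtype=float) - np.asarray(lab_pred, dtype=float)
    return np.sqrt((diff ** 2).sum(axis=-1))

# Example summary statistics, as reported later in the thread:
# de = delta_e_76(lab_reference, lab_predicted)
# print(de.max(), de.mean(), np.median(de))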


32BT

Re: Camera calibration using a neural network: questions
« Reply #42 on: June 04, 2019, 02:47:10 am »

Note that once you get the bug resolved, I do expect a renewed comparison between (3, 3), (4, 4), etc...
No need to compare linear, we already know that won't work for perceptual output.
If you need to speed things up, you could also check whether you really need 20000 epochs each time, if your optimisation function shows that 5000 epochs already does the trick.
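A minimal sketch of that check, assuming the training uses scikit-learn's MLPRegressor (which the relu/logistic/identity naming suggests, but that is an assumption): let the tolerance-based stopping decide and then inspect the loss curve.

from sklearn.neural_network import MLPRegressor

# Assumed setup: rgb_wb are the white-balanced patch RGBs, lab_ref the reference Lab values.
model = MLPRegressor(hidden_layer_sizes=(4, 4), activation='logistic',
                     max_iter=20000, tol=1e-7, n_iter_no_change=50,
                     random_state=0)
# model.fit(rgb_wb, lab_ref)
# print(model.n_iter_)        # epochs actually run before the stop criterion kicked in
# model.loss_curve_           # plot this to confirm whether 5000 epochs is already flat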

In order to check the code, you could perhaps separate the output logs from the code in your GitHub. Currently there are a couple of lines of code followed by thousands of lines of output, which makes it pretty much unreadable.

Jack Hogan

Re: Camera calibration using a neural network: questions
« Reply #43 on: June 04, 2019, 02:59:36 am »

I think in the former, wrong prediction I applied the NN to a version already converted to ProPhoto RGB (DCRAW output). Anyway, the neutral patches are clearly the weak point of the prediction. I need to understand this, especially why only the gray patches seem to have large L errors. I could understand all patches showing this (L has a different scale than a/b after all, and I didn't normalise the Lab data to train the NN), but why only the neutral ones?

That looks much better.  I think remaining differences could be due to the different processing in the two cases.

Just thinking aloud: I believe DCRAW applies a standard tone curve.  Do you?  In my example I did not and processed the two renderings exactly the same other than for color: the left portion of each color square is white balanced on GS11 and has just the dE2k P5000 matrix adjustment, while the right portion uses GS11 and Adobe's interpolated forward matrix followed by HSV corrections with no 'look' or 'tone' applied (that goes top and bottom in the neutral patches).  DCRAW uses in-camera multipliers by default (corresponding to about 5100K as I remember).  Etc.

Jack

PS For the general public: many colors are outside of sRGB so differences may be more difficult to spot on non wide-gamut monitors.

Jack Hogan

Re: Camera calibration using a neural network: questions
« Reply #44 on: June 04, 2019, 03:12:40 am »

I need to better understand the implications of white balance and capture lighting in the whole process.

One last thing on the lighting: Flare really messes up this type of calibration by making tones look lighter than the reference, therefore affecting how lightness is modeled.  I don't know this target so I don't know whether it exhibits flare, though it looks like it might in some areas.

Jack

32BT

Re: Camera calibration using a neural network: questions
« Reply #45 on: June 04, 2019, 03:23:17 am »

PS For the general public: many colors are outside of sRGB so differences may be more difficult to spot on non wide-gamut monitors.

Just a thought: in the past I have used a diagonal split for comparison (and ended up using an S-shaped diagonal for the most noticeable difference). It made sense at the time, considering that the many horizontal and vertical patterns may obscure the differences of a rectangular comparison, especially if they happen to coincide with the L steps.
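A minimal sketch of the straight-diagonal variant (the S-shaped curve is just a fancier mask), assuming both renderings are same-sized (H, W, 3) arrays:

import numpy as np

def diagonal_split(render_a, render_b):
    """Composite two renderings along the main diagonal for comparison."""
    h, w = render_a.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    mask = (xx / w) > (yy / h)          # True above the diagonal
    return np.where(mask[..., None], render_a, render_b)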

Guillermo Luijk

Re: Camera calibration using a neural network: questions
« Reply #46 on: June 04, 2019, 08:06:16 am »

Thanks for your feedback, guys. I'm pretty sure I didn't offset the patches when training, but I will check that. I firmly believe the problem is in the chart itself as you suggest, Jack, or rather in the discrepancy between the measured lightness values (spectrophotometer) and what the camera (sensor + optics) records for column 16 vs the gray row. The lightness crossover between column 16 and the gray patches is real. I think it can even be seen in the real vs predicted L plot, where errors fall alternately below and above the expected value:



If this is true, it confirms how difficult it is to do a proper capture of these glossy charts. Years ago I had to give up because I couldn't eliminate unwanted bright reflections. My bet is that once column 16 is dropped from the training set, the results for the gray patches below it will be good.

Regarding DCRAW, Jack, I used it in such a way that it applies neither a tone curve (option -4, linear) nor any colour conversion (-o 0). This means it only performs black point subtraction, linear scaling to the 16-bit range, and per-channel white balance scaling (with the chosen WB it pushes R by 2.299 and B by 1.805), giving 100% linear, scaled (WB) RAW data. This is the DCRAW command used:

dcraw -v -r 2.299 1 1.805 1 -t 0 -o 0 -4 -T IT8.NEF
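A hedged sketch (not the repository's actual code) of sampling a patch from the 16-bit linear TIFF that this command writes; the tifffile package and the patch coordinates are assumptions for illustration:

import numpy as np
import tifffile

img = tifffile.imread('IT8.tiff').astype(np.float64) / 65535.0   # linear, WB-scaled RGB

def patch_mean(image, x, y, half=10):
    """Mean RGB over a small interior window centred on (x, y), away from patch edges."""
    return image[y - half:y + half, x - half:x + half].reshape(-1, 3).mean(axis=0)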

The GitHub repository is now clean of training data so it's easier to follow.

Regards
« Last Edit: June 04, 2019, 08:10:59 am by Guillermo Luijk »

32BT

Re: Camera calibration using a neural network: questions
« Reply #47 on: June 04, 2019, 09:04:21 am »

My bet is that once column 16 is dropped from the training set, the results for the gray patches below it will be good.

Nope.

I'm willing to take on your bet.

The more I think about it, the more I'm convinced it is an indexing problem. First of all: you can't get both lighter and darker results in the gray patches from any conversion. There is a minute difference in the source colors there, but it is so insignificant that an otherwise reasonable conversion cannot in any way result in both lighter and darker patches that still look neutral. Second: the reflections mentioned do not seem to be anywhere near significant, as proven by Jack's own conversion.

Find the indexing problem; it is there, 99.999999% sure.

32BT

Re: Camera calibration using a neural network: questions
« Reply #48 on: June 04, 2019, 09:51:36 am »

See, more proof there is an indexing problem. This is NOT some fluke coincidence...

Jack Hogan

Re: Camera calibration using a neural network: questions
« Reply #49 on: June 04, 2019, 11:51:34 am »

The more I think about it, the more I'm convinced it is an indexing problem. First of all: you can't get both lighter and darker results in the gray patches from any conversion. There is a minute difference in the source colors there, but it is so insignificant that an otherwise reasonable conversion cannot in any way result in both lighter and darker patches that still look neutral. Second: the reflections mentioned do not seem to be anywhere near significant, as proven by Jack's own conversion.

Find the indexing problem; it is there, 99.999999% sure.

Can you elaborate on what you mean by 'indexing' problem 32BT?

32BT

Re: Camera calibration using a neural network: questions
« Reply #50 on: June 04, 2019, 12:12:05 pm »

Can you elaborate on what you mean by 'indexing' problem 32BT?

An array index problem.

If we look closely at the GS result patches then the GS1 result matches the GS0 source, the GS2 result matches the GS1 source, etc...

Since this doesn't seem to be a drawing problem, it's likely that the NN is being trained to match GS1 to GS0, and so on. This would explain the original curved deviation in L and the slightly dark column 16. Whether this is a source index problem or a reference index problem, I don't know.
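To make the suspected bug concrete, a toy illustration with hypothetical numbers of how a one-step shift pairs each grey sample with the wrong reference:

import numpy as np

ref_L = np.linspace(95.0, 5.0, 24)          # GS0..GS23 reference lightness (toy values)
cam_L = ref_L.copy()                        # what the camera 'sees' for the same patches

bad_pairs  = list(zip(cam_L[1:], ref_L[:-1]))   # GS(i) sample trained against GS(i-1) reference
good_pairs = list(zip(cam_L, ref_L))            # indices line up one-to-one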


Guillermo Luijk

Re: Camera calibration using a neural network: questions
« Reply #51 on: June 04, 2019, 06:52:05 pm »

Nope.

I'm willing to take on your bet.

The more I think about it, the more I'm convinced it is an indexing problem. First of all: you can't get both lighter and darker results in the gray patches from any conversion. There is a minute difference in the source colors there, but it is so insignificant that an otherwise reasonable conversion cannot in any way result in both lighter and darker patches that still look neutral. Second: the reflections mentioned do not seem to be anywhere near significant, as proven by Jack's own conversion.

Find the indexing problem; it is there, 99.999999% sure.

Not me, not you ;) it was the measurement that caused the anomalous lightness crossover. The author provided me with another measurement of the same chart, taken on a previous day, and that one doesn't show the anomaly. However, I am not at all convinced by this measurement, because the gray patches are measured as strongly bluish.

Regarding the NN however, once trained it works really well. Mean delta E increases a bit, but max delta E is reduced and L on the gray patches looks much more adequate (except for the two darkest patches, L16 and GS23; I bet patch 22A is responsible for this, because its prediction remains darker than the target, again a measurement anomaly).
The NN keeps the gray patches neutral (white balance surely has an influence on this), so in a way it fixes the colour tint measured in the gray patches:

MLP_XYZ_()_relu_identity : ΔE_max = 31.2143 , ΔE_mean = 3.5602 , ΔE_median = 2.3695
MLP_Lab_()_relu_identity : ΔE_max = 80.7121 , ΔE_mean = 28.4108 , ΔE_median = 21.4503
MLP_XYZ_()_logistic_identity : ΔE_max = 31.2143 , ΔE_mean = 3.5602 , ΔE_median = 2.3695
MLP_Lab_()_logistic_identity : ΔE_max = 80.7667 , ΔE_mean = 28.4247 , ΔE_median = 21.4171
MLP_XYZ_(3, 3)_relu_identity : ΔE_max = 108.2972 , ΔE_mean = 41.9445 , ΔE_median = 37.2037
MLP_Lab_(3, 3)_relu_identity : ΔE_max = 112.5166 , ΔE_mean = 42.6690 , ΔE_median = 39.2641
MLP_XYZ_(3, 3)_logistic_identity : ΔE_max = 18.9269 , ΔE_mean = 3.8357 , ΔE_median = 2.7222
MLP_Lab_(3, 3)_logistic_identity : ΔE_max = 76.0081 , ΔE_mean = 25.5393 , ΔE_median = 20.7859
MLP_XYZ_(4, 4)_relu_identity : ΔE_max = 89.5776 , ΔE_mean = 22.5197 , ΔE_median = 17.3923
MLP_Lab_(4, 4)_relu_identity : ΔE_max = 83.8646 , ΔE_mean = 32.4117 , ΔE_median = 31.2353
MLP_XYZ_(4, 4)_logistic_identity : ΔE_max = 14.0973 , ΔE_mean = 2.6132 , ΔE_median = 1.9514
MLP_Lab_(4, 4)_logistic_identity : ΔE_max = 67.2113 , ΔE_mean = 17.6860 , ΔE_median = 12.3790
MLP_XYZ_(16, 16)_relu_identity : ΔE_max = 13.3266 , ΔE_mean = 2.4412 , ΔE_median = 1.6316
MLP_Lab_(16, 16)_relu_identity : ΔE_max = 18.4030 , ΔE_mean = 5.1635 , ΔE_median = 4.3559
MLP_XYZ_(16, 16)_logistic_identity : ΔE_max = 13.5694 , ΔE_mean = 2.2652 , ΔE_median = 1.5868
MLP_Lab_(16, 16)_logistic_identity : ΔE_max = 7.6640 , ΔE_mean = 1.6545 , ΔE_median = 1.3866
MLP_XYZ_(50, 50)_relu_identity : ΔE_max = 11.1737 , ΔE_mean = 2.1659 , ΔE_median = 1.5782
MLP_Lab_(50, 50)_relu_identity : ΔE_max = 10.4095 , ΔE_mean = 2.8398 , ΔE_median = 2.3855
MLP_XYZ_(50, 50)_logistic_identity : ΔE_max = 21.1758 , ΔE_mean = 3.6650 , ΔE_median = 2.5787
MLP_Lab_(50, 50)_logistic_identity : ΔE_max = 4.9034 , ΔE_mean = 1.1084 , ΔE_median = 0.8211
MLP_XYZ_(100, 100)_relu_identity : ΔE_max = 14.9724 , ΔE_mean = 1.8887 , ΔE_median = 1.5196
MLP_Lab_(100, 100)_relu_identity : ΔE_max = 6.9355 , ΔE_mean = 1.7798 , ΔE_median = 1.5225
MLP_XYZ_(100, 100)_logistic_identity : ΔE_max = 31.8153 , ΔE_mean = 5.3645 , ΔE_median = 3.2542
MLP_Lab_(100, 100)_logistic_identity : ΔE_max = 4.6104 , ΔE_mean = 1.0028 , ΔE_median = 0.6827
MLP_XYZ_(200, 200)_relu_identity : ΔE_max = 4.1306 , ΔE_mean = 0.8433 , ΔE_median = 0.5115
MLP_Lab_(200, 200)_relu_identity : ΔE_max = 3.9394 , ΔE_mean = 1.1816 , ΔE_median = 0.9588
MLP_XYZ_(200, 200)_logistic_identity : ΔE_max = 24.1265 , ΔE_mean = 3.3331 , ΔE_median = 2.4439
MLP_Lab_(200, 200)_logistic_identity : ΔE_max = 5.0772 , ΔE_mean = 0.8826 , ΔE_median = 0.5278

Simple NNs perform really badly at predicting Lab values. From 16 neurons per layer onwards, Lab models start to perform better than XYZ models, although I'd rather call it Lab/XYZ convergence.
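A hedged sketch of the kind of loop behind the table above, assuming scikit-learn's MLPRegressor and using stand-in arrays in place of the real patch data (the repository's actual code may differ):

import numpy as np
from itertools import product
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
rgb_wb  = rng.random((288, 3))                                        # stand-in for the IT8 patch RGBs
lab_ref = rng.random((288, 3)) * [100, 255, 255] - [0, 128, 128]      # stand-in Lab references

def de76(a, b):
    return np.sqrt(((a - b) ** 2).sum(axis=1))

for layers, act in product([(3, 3), (4, 4), (16, 16), (50, 50)], ['relu', 'logistic']):
    model = MLPRegressor(hidden_layer_sizes=layers, activation=act,
                         max_iter=20000, random_state=0).fit(rgb_wb, lab_ref)
    de = de76(lab_ref, model.predict(rgb_wb))
    print(f'MLP_Lab_{layers}_{act}_identity : dE_max = {de.max():.4f} , '
          f'dE_mean = {de.mean():.4f} , dE_median = {np.median(de):.4f}')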



MLP_Lab_(50, 50)_logistic_identity : ΔE_max = 4.9034 , ΔE_mean = 1.1084 , ΔE_median = 0.8211



Look at the L correlation: the alternating ups and downs are gone:






Previous vs current measurements on gray patches:


Regards
« Last Edit: June 05, 2019, 02:58:21 am by Guillermo Luijk »

32BT

Re: Camera calibration using a neural network: questions
« Reply #52 on: June 04, 2019, 07:30:31 pm »

Do you have the original it8.txt and the new it8.txt available?

Jack Hogan

Re: Camera calibration using a neural network: questions
« Reply #53 on: June 05, 2019, 02:33:50 am »

MLP_XYZ_(3, 3)_logistic_identity : ΔE_max = 18.9269 , ΔE_mean = 3.8357 , ΔE_median = 2.7222
MLP_Lab_(3, 3)_logistic_identity : ΔE_max = 76.0081 , ΔE_mean = 25.5393 , ΔE_median = 20.7859

Good show Guillermo!

If I understand correctly, 3x3 represents a 5-layer network: input, output and 3 hidden layers with 3 activation units each. The output layer explodes back out to the same size as the input; it is then compared to the given reference data, and the result feeds into the minimization algorithm.  Correct?

The logistic activation function seems to produce performance similar to the classical linear methods with an XYZ reference.  I am curious why it does not work nearly as well with a Lab reference. I suspect the linear output layer. What do you guys think?

Jack
« Last Edit: June 05, 2019, 02:55:49 am by Jack Hogan »

Guillermo Luijk

Re: Camera calibration using a neural network: questions
« Reply #54 on: June 05, 2019, 03:14:23 am »

The logistic activation function seems to produce performance similar to the classical linear methods with an XYZ reference.  I am curious why it does not work nearly as well with a Lab reference. I suspect the linear output layer. What do you guys think?

Yes, you already gave the key to understanding this: there is a nearly linear relation between RGB_WB and XYZ, but Lab is highly non-linear with respect to RGB_WB or XYZ. This means a (3, 3) NN (2 hidden layers with 3 neurons each) is not able to deal with that level of non-linearity. Once we introduce 16 neurons per layer we have more degrees of freedom and the non-linear abilities of the NN start to shine, and since the loss function is computed in the Euclidean Lab space, the Lab models perform better than the XYZ models. (An XYZ NN with a custom loss function would surely be the best model, because the XYZ to Lab transformation is deterministic, so RGB_WB -> XYZ would be the logical mapping, but I cannot define the loss function.)
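For reference, the deterministic XYZ to Lab step mentioned above is the standard CIE formula; a minimal sketch, assuming the D50 white point normally used for IT8 reference data:

import numpy as np

D50 = np.array([0.9642, 1.0000, 0.8249])     # assumed reference white

def xyz_to_lab(xyz, white=D50):
    t = np.asarray(xyz, dtype=float) / white
    f = np.where(t > (6 / 29) ** 3, np.cbrt(t), t / (3 * (6 / 29) ** 2) + 4 / 29)
    L = 116 * f[..., 1] - 16
    a = 500 * (f[..., 0] - f[..., 1])
    b = 200 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)

# A custom Lab-space loss for an XYZ-output network would simply be
# dE76(lab_reference, xyz_to_lab(xyz_predicted)).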

MLP_XYZ_(3, 3)_logistic_identity_CORR:


MLP_Lab_(3, 3)_logistic_identity_CORR:


Regards
« Last Edit: June 05, 2019, 03:18:47 am by Guillermo Luijk »

32BT

Re: Camera calibration using a neural network: questions
« Reply #55 on: June 05, 2019, 03:34:37 am »

Yes, you already gave the key to understanding this: there is a nearly linear relation between RGB_WB and XYZ, but Lab is highly non-linear with respect to RGB_WB or XYZ. This means a (3, 3) NN (2 hidden layers with 3 neurons each) is not able to deal with that level of non-linearity.


But the problem might be caused by the early assessment selecting relu and logistic as the best performers. Now that you know that the entire experiment works properly, you might retry tanh for the Lab case. tanh allows the NN to produce smooth transitions between negative and positive values within one node. Otherwise it needs increased complexity to achieve the same.

32BT

Re: Camera calibration using a neural network: questions
« Reply #56 on: June 05, 2019, 04:12:49 am »

Maybe we can introduce a new activation function! We might finally have found a potentially useful application for this formula: https://forum.luminous-landscape.com/index.php?topic=58257.0

:-)

Guillermo Luijk

Re: Camera calibration using a neural network: questions
« Reply #57 on: June 05, 2019, 05:29:47 pm »

But the problem might be caused by the early assessment selecting relu and logistic as the best performers. Now that you know that the entire experiment works properly, you might retry tanh for the Lab case. tanh allows the NN to produce smooth transitions between negative and positive values within one node. Otherwise it needs increased complexity to achieve the same.

First of all I have to say you were right: the gray patches were offset by one position, because the author made the mistake of measuring GS0 twice and not recording a measurement for patch GS23.
He has measured the GS0-GS23 patches again and merged the two sets of measurements. I don't much like the idea of mixing measurements taken at different times and possibly under different conditions, but OK.

I dropped the relu activation function and introduced tanh; thanks for the suggestion. It converges faster than the sigmoid and performs better, so my best trade-off candidate this time is MLP_Lab_(50, 50)_tanh_identity:

MLP_XYZ_()_tanh_identity : ΔE_max = 33.6562 , ΔE_mean = 3.2097 , ΔE_median = 1.8020
MLP_Lab_()_tanh_identity : ΔE_max = 82.1350 , ΔE_mean = 28.3336 , ΔE_median = 21.5002
MLP_XYZ_()_logistic_identity : ΔE_max = 33.6562 , ΔE_mean = 3.2097 , ΔE_median = 1.8020
MLP_Lab_()_logistic_identity : ΔE_max = 82.1906 , ΔE_mean = 28.3471 , ΔE_median = 21.4991
MLP_XYZ_(3, 3)_tanh_identity : ΔE_max = 15.3840 , ΔE_mean = 2.4645 , ΔE_median = 1.8418
MLP_Lab_(3, 3)_tanh_identity : ΔE_max = 43.0895 , ΔE_mean = 11.6182 , ΔE_median = 9.2091
MLP_XYZ_(3, 3)_logistic_identity : ΔE_max = 18.1560 , ΔE_mean = 3.8059 , ΔE_median = 2.2673
MLP_Lab_(3, 3)_logistic_identity : ΔE_max = 77.2547 , ΔE_mean = 25.7759 , ΔE_median = 20.7278
MLP_XYZ_(4, 4)_tanh_identity : ΔE_max = 11.2728 , ΔE_mean = 2.0082 , ΔE_median = 1.5386
MLP_Lab_(4, 4)_tanh_identity : ΔE_max = 37.5632 , ΔE_mean = 8.2402 , ΔE_median = 6.7438
MLP_XYZ_(4, 4)_logistic_identity : ΔE_max = 16.7437 , ΔE_mean = 2.3609 , ΔE_median = 1.6683
MLP_Lab_(4, 4)_logistic_identity : ΔE_max = 68.0552 , ΔE_mean = 17.6988 , ΔE_median = 12.4104
MLP_XYZ_(16, 16)_tanh_identity : ΔE_max = 13.7044 , ΔE_mean = 2.0632 , ΔE_median = 1.2234
MLP_Lab_(16, 16)_tanh_identity : ΔE_max = 3.9206 , ΔE_mean = 1.0084 , ΔE_median = 0.8693
MLP_XYZ_(16, 16)_logistic_identity : ΔE_max = 13.1919 , ΔE_mean = 1.9344 , ΔE_median = 1.2701
MLP_Lab_(16, 16)_logistic_identity : ΔE_max = 6.1042 , ΔE_mean = 1.3190 , ΔE_median = 1.0214
MLP_XYZ_(50, 50)_tanh_identity : ΔE_max = 14.7397 , ΔE_mean = 2.4049 , ΔE_median = 1.6943
MLP_Lab_(50, 50)_tanh_identity : ΔE_max = 3.9451 , ΔE_mean = 0.6966 , ΔE_median = 0.5313
MLP_XYZ_(50, 50)_logistic_identity : ΔE_max = 26.3273 , ΔE_mean = 3.5872 , ΔE_median = 2.1889
MLP_Lab_(50, 50)_logistic_identity : ΔE_max = 4.0815 , ΔE_mean = 0.7272 , ΔE_median = 0.5018
MLP_XYZ_(100, 100)_tanh_identity : ΔE_max = 9.9610 , ΔE_mean = 1.5248 , ΔE_median = 1.0915
MLP_Lab_(100, 100)_tanh_identity : ΔE_max = 3.3160 , ΔE_mean = 0.5003 , ΔE_median = 0.3757
MLP_XYZ_(100, 100)_logistic_identity : ΔE_max = 29.8789 , ΔE_mean = 3.7231 , ΔE_median = 2.1818
MLP_Lab_(100, 100)_logistic_identity : ΔE_max = 3.7744 , ΔE_mean = 0.6649 , ΔE_median = 0.4671
MLP_XYZ_(200, 200)_tanh_identity : ΔE_max = 8.4625 , ΔE_mean = 1.4752 , ΔE_median = 0.9700
MLP_Lab_(200, 200)_tanh_identity : ΔE_max = 3.3737 , ΔE_mean = 0.4059 , ΔE_median = 0.2591
MLP_XYZ_(200, 200)_logistic_identity : ΔE_max = 16.8308 , ΔE_mean = 1.8797 , ΔE_median = 1.1590
MLP_Lab_(200, 200)_logistic_identity : ΔE_max = 4.2315 , ΔE_mean = 0.6553 , ΔE_median = 0.4166

Mean delta E is lower than 1, with max delta E below 4!

Quick and smooth convergence:


Nice correlation (again lightness in dark and very bright patches gets the least accurate fit):


Good Delta E distribution, mostly gathered below 1:


Left half=prediction vs Right half=exact value (target):


Pretty impressive, right? The NN complexity is 2800 coefficients + 103 biases = 2903 numbers (stored as 32-bit floating point, that means about 11.3 KB are needed to hold the profile's definition).
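As a quick check of that count for the 3 -> 50 -> 50 -> 3 network:

weights = 3 * 50 + 50 * 50 + 50 * 3    # 2800 coefficients
biases  = 50 + 50 + 3                  # 103 biases
print(weights + biases,                # 2903 parameters
      (weights + biases) * 4 / 1024)   # ~11.3 KB as 32-bit floats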

I also ran the prediction with one of the simplest models: MLP_XYZ_(4, 4)_tanh_identity : ΔE_max = 11.2728 , ΔE_mean = 2.0082 , ΔE_median = 1.5386



Not bad for a NN defined by 51 numbers:




Now I need to check how the NN behaves for unseen colours, i.e. interpolating colours that are not in the chart. I expect the interpolations to be smooth; any ringing behaviour would be bad news. I can also validate it on real images. I think tanh could help in providing smooth transitions?
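One possible way to probe that, sketched here with hypothetical names (`model` standing for any of the fitted networks above): feed a smooth neutral ramp the chart never contained and look for non-monotonic L.

import numpy as np

ramp = np.linspace(0.02, 0.98, 256)[:, None].repeat(3, axis=1)   # smooth neutral RGB ramp
# pred = model.predict(ramp)                  # Lab predictions for unseen greys
# dL = np.diff(pred[:, 0])
# print('monotonic L:', np.all(dL >= 0))      # sign changes would hint at ringing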

Regards



« Last Edit: June 05, 2019, 08:20:25 pm by Guillermo Luijk »

Jack Hogan

Re: Camera calibration using a neural network: questions
« Reply #58 on: June 06, 2019, 03:20:40 am »

Excellent, even the 16x16 Lab network looks good.  Based on Torger's comments I have a feeling that, as long as results are 'good enough', the smaller the network the better in terms of avoiding overfitting problems.

I am curious as to how such a network would perform with a non-linear output activation function, say tanh since it seems to work well.  I am asking because the neutrals are still not quite right - perhaps because L is not linear (identity)?

Also, in order not to have to do backward somersaults with the testing setup, I think for proper validation of the performance of a chosen network one would ideally grab Spectral Sensitivity Functions for the sample camera (from here for instance), choose an illuminant SPD and generate properly spaced training and cross validation sets.  Then change the illuminant and determine how far one can go before it falls apart.  Next add some subset of the illuminant SPD as an input, all the way down to just the wb multipliers.
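A hedged sketch of that synthetic-data idea: with camera SSFs, a CIE observer and an illuminant SPD sampled on a common wavelength grid, camera RGB and reference XYZ for any set of reflectances are just weighted sums (all array names here are placeholders for real measured data).

import numpy as np

wl = np.arange(400, 701, 10)    # wavelength grid in nm

def render(refl, ssf, cmf, spd):
    """refl: (N, len(wl)) reflectances; ssf/cmf: (len(wl), 3); spd: (len(wl),)."""
    light = refl * spd                 # spectrum reaching camera and observer
    rgb = light @ ssf                  # camera raw integrals
    xyz = light @ cmf                  # colorimetric integrals
    xyz /= (spd @ cmf)[1]              # normalise so the illuminant's Y equals 1
    return rgb, xyz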

What is the name of the correct XYZ reference file now Guillermo?  I'll see if it makes much of a difference for the linear fits.

Jack

32BT

Re: Camera calibration using a neural network: questions
« Reply #59 on: June 06, 2019, 03:49:22 am »

Very interesting results indeed, Guillermo.

I don't know about anyone else, but this turns out to be a very insightful experiment.

Some thoughts and suggestions: the (50, 50) result looks remarkably like the best results you'd expect from normal profiling where the card is either slightly unevenly lit or slightly bent. Under a normal matrix conversion this has no effect on actual profile performance, because it is simply reproducing the camera capture correctly.

However:
In this case I think something else is happening and this is very important to understand. It gets to the core of NN design.

What we might be seeing is a combination of overfitting and the inability of the NN to properly represent the Lab gamma curve.

1. Overfitting
If you look at the attached annotation on your L graph, you can see that we have outliers (the arrows), but the deviation does not look random.

2. Gamma curve
In the circled area of the same attachment you can see something that looks like ringing. I suspect this is the result of an inability to properly represent the Lab gamma curve. The tanh activation curve looks somewhat like the gamma curve, but isn't. (Nor is it a linear transition, for the XYZ case.)

Now, in my never humble opinion I would assess the results as follows:
(50, 50) allows too much variation in curves and fitting. There are several reasons you should NOT want to make the layers that large. One vitally important reason is that a NN is supposed to compactly encode patterns that are either too large for us to comprehend or too hard for us to understand, or both. By applying large NN layers to what is essentially a really simple linear matrix conversion, we are not making the solution elegantly small and succinct.

So, in this case I would ask myself: what would the NN need in order to better match the gamma curve (or the linear curve)? I suspect that would better match the overall model without overfitting, while keeping it elegantly small.

My answer would be: add another hidden layer. The NN probably just needs another step to better match the gamma curves. And, to keep it as small as possible, I would first try (4, 4, 4) and then, if that confirms the suspicion, reduce to (4, 4, 3), (3, 4, 3), and maybe (3, 3, 3).

Another important and interesting approach would be culling: remove the connections whose weights fall below a certain threshold, rinse and repeat, until you reach the optimal compactness in your NN design.
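A minimal sketch of that culling pass, assuming a fitted scikit-learn MLPRegressor whose weight matrices live in `coefs_` (the threshold value is arbitrary):

import numpy as np

def prune(model, threshold=1e-2):
    """Zero out connections with small weights, in place; returns how many were removed."""
    removed = 0
    for w in model.coefs_:
        mask = np.abs(w) < threshold
        removed += int(mask.sum())
        w[mask] = 0.0
    return removed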


Please note, this is in no way criticism. I think you did a brilliant job implementing this and sharing the results. It is absolutely insightful.

