Luminous Landscape Forum
Raw & Post Processing, Printing => Colour Management => Topic started by: Guillermo Luijk on April 12, 2020, 01:13:22 pm
-
On the topic Camera calibration using a neural network (https://forum.luminous-landscape.com/index.php?topic=130281.0), I had a very productive time learning about neural networks (NN) and colour science, thanks especially to some forum members who got interested in the exercise. It consisted of training a NN to translate camera RAW values into RGB values with the main goal of colour accuracy (minimum deltaE), using an IT8 card as the training set.
(http://guillermoluijk.com/misc/nnprofiling.png)
However I always had the feeling the exercise was not complete, since I was not choosing the NN complexity in a rigorous way to prevent overfitting (https://en.wikipedia.org/wiki/Overfitting), because I didn't have a proper Test set.
In machine learning a Test set is a group of input-output pairs which are kept away from the NN during its training (the NN never 'sees' the examples in the Test set).
Reducing error on the Training set is as simple as making the NN more complex, because NNs do great at replicating input-output correspondences; but if the NN is too complex it will 'learn' the noise and errors present in the Training set (=overfitting), becoming a bad generic RAW to RGB converter. Instead, by looking at the error over the Test set, we can find an optimum NN which will do best on any arbitrary image:
(https://miro.medium.com/max/948/1*wZg_RQHPRtn62dDp2Ez86A.jpeg)
I'll try to mimic (and plot) this theoretical pair of curves by training NNs with different numbers of layers/nodes. For sure real world curves will be much uglier than these beautiful smooth plots, but my hope is to obtain a clear indication of the optimal NN complexity for the 1000 patches of the SuperChroma.
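To make the trade-off concrete before going to the camera data, here is a minimal sketch of the train/test error behaviour using numpy polynomial fits on synthetic data (the function, noise level and polynomial degrees are all made up for illustration, not taken from the profiling exercise):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D "ground truth" with noise, standing in for input-output pairs
x_train = np.linspace(0, 1, 20)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, x_train.size)
x_test = np.linspace(0, 1, 200)
y_test = np.sin(2 * np.pi * x_test)  # noise-free reference

def errors(degree):
    # Fit a polynomial of the given complexity on the Training set only
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

for d in (1, 3, 15):
    tr, te = errors(d)
    print(f"degree {d:2d}: train MSE {tr:.4f}, test MSE {te:.4f}")
```

The training error keeps falling as the degree grows, while the test error is lowest at an intermediate complexity: exactly the pair of curves in the figure above, only noisier.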
Thanks to Hugo Rodríguez creating his new HR-1 SuperChroma (https://luminous-landscape.com/hr-1-superchroma/) card, I will be able to do a complete deep learning exercise in a more orthodox way:
- The 1000 patches on the SuperChroma will be our Training set
- The 286 patches on the IT8 will be our Test set
(http://guillermoluijk.com/misc/traintest.png)
In addition to all that, I wonder how profile creation software deals with overfitting and underfitting. Can ICC profiles generally be accepted as optimum conversions, or do they tend to underfit/overfit? Generally I see colour accuracy validated with deltaE measurements over the same colour patches that were used to create the profile. In a machine learning context this would be equivalent to looking only at the Training set error, which can easily be made as tiny as desired just by adding complexity to the transformation method.
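As an aside, the deltaE figure itself is simple to compute in its basic CIE76 form (Euclidean distance in CIELAB); the Lab values below are hypothetical, just to show the calculation:

```python
import numpy as np

def delta_e_76(lab1, lab2):
    """CIE76 colour difference: Euclidean distance in CIELAB."""
    return np.linalg.norm(np.asarray(lab1) - np.asarray(lab2), axis=-1)

# Hypothetical measured vs. profile-predicted Lab values for two patches
measured  = np.array([[52.0, 10.0, -8.0], [80.0, 2.0, 3.0]])
predicted = np.array([[50.0, 11.0, -6.0], [80.5, 2.0, 3.0]])
print(delta_e_76(measured, predicted))  # per-patch deltaE
```

(More refined formulas such as deltaE2000 weight the Lab axes differently, but the validation logic is the same.)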
I will post the progress of the exercise here, taking advantage of this coronavirus confinement extra time. Any feedback (positive or negative) is highly appreciated.
Regards
-
Excellent Guillermo, I agree with pretty well all of your considerations, especially that we are all mostly doing the training and validation off the same few patches. Overfitting can definitely be a problem, that's why tools like DcamProf/Lumariver provide options on the smoothness of the fit. Also, it's useful to think of a well developed CFA as part of what should ideally be a fully linear system (Luther-Ives condition), so it is worth applying regularization to ensure that the results do not stray too far from that.
The two targets look good. One issue to contend with is the fact that they are probably made with only a few pigments (how many?) and therefore the patches could be non-orthogonal and correlated, diminishing their effective number. Think for instance of feeding the training set with captures of the targets at different exposures. Does that help much in the end? A question I have often thought about but never investigated. Off the top of my head I guess it would overweight the L channel, and that may or may not be good depending on the application.
Have you seen Jim Kasson's recent series (https://blog.kasson.com/nikon-z6-7/camera-differences-in-color-profile-making/) using a CC SG for training and CC 24 for testing?
I look forward to seeing the results when you are ready.
Jack
-
Also, it's useful to think of a well developed CFA as part of what should ideally be a fully linear system (Luther-Ives condition), so it is worth applying regularization to ensure that the results do not stray too far from that.
(...)
Think for instance of feeding the training set with captures of the targets at different exposures. Does that help much in the end? A question I have often thought about but never investigated.
Jack, I'm glad you bring up the subject of sensor linearity and how it could/should influence the design of colour charts for camera profiling. The exercise I'm going to do with the SuperChroma is perfectly fine for me and interesting as what it is: an RGB to XYZ coordinate mapping exercise, and I'm happy to consider it a complete regression ML exercise to practice on.
But if digital sensors have such highly linear behaviour, I wonder if designing charts with so many patches, especially those with similar hue but different reflectance, makes sense. To make it simple, just take the group of neutral gray patches: do we really need so many grayscale patches? Wouldn't a single high reflectance neutral patch (white) suffice to model the desired behaviour of the sensor over the entire grayscale range, by extrapolating linear behaviour? In fact we wouldn't even need a deep black patch, because according to sensor linearity this would correspond to an RGB={0,0,0} -> XYZ={0,0,0} / L=0 correspondence, which could be hard coded into the model.
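A sketch of what this single-white-patch idea would mean in code; the patch readings are assumed values, purely for illustration:

```python
import numpy as np

# Hypothetical white-patch reading: white-balanced RAW green vs. known luminance Y
g_white, y_white = 0.85, 0.90   # assumed values for illustration

# Under strict sensor linearity, one point (plus the origin) fixes the response:
scale = y_white / g_white

# Predicted luminance for arbitrary grey levels, no intermediate grey patches needed
g_patches = np.array([0.0, 0.2, 0.425, 0.85])
print(scale * g_patches)
```

The whole grayscale is then determined by a single multiplication, with G=0 -> Y=0 built in by construction.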
I talked to Hugo about my concerns on this. In the previous exercise I plotted, for the 24 gray patches of the IT8, the RAW_WB G input values vs the L values, both predicted (black) and measured with the spectrophotometer (blue):
(http://guillermoluijk.com/misc/validation_gvsl.png)
Ignoring the differences between predicted and measured values, my concern is that there is no convergence of G=0 -> L=0. Instead, if we extrapolate L to the point of expected measured L=0, in the camera shot we still have some residual G>0 reflectance. So in the end we could be modelling some parasitic reflectance into the camera profile because of the glossy paper used. This would lead to some undesired black clipping in the very deep shadows when applying the profile.
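One way to check for this residual reflectance is to fit a line with an intercept to the grey patch data and see where it crosses L=0; the (G, L) pairs below are invented to mimic the plot, not the real measurements:

```python
import numpy as np

# Hypothetical (G_raw, L_measured) pairs for grey patches, with a small
# parasitic offset baked into G to mimic glossy-paper flare
g = np.array([0.02, 0.10, 0.25, 0.50, 0.80])
L = np.array([0.0, 9.0, 26.0, 54.0, 87.0])

# Least-squares line L = a*g + b; a non-zero crossing hints at flare
A = np.column_stack([g, np.ones_like(g)])
(a, b), *_ = np.linalg.lstsq(A, L, rcond=None)
g_at_L0 = -b / a  # residual G where the fit extrapolates to L = 0
print(a, b, g_at_L0)
```

With these invented numbers the fit crosses L=0 at a positive G, which is exactly the kind of parasitic reflectance that would get baked into the profile.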
Do we really need very dark patches in the colour charts? Is it even a good idea to have them? Do we need any intermediate gray patches at all in the colour charts, or would we be better served with a white patch and assuming linear behaviour (G=0 -> L=0)? In other words, what are we really modelling with complex colour charts if in the end sensors are highly linear devices? Are we modelling their slight non-linearities? And in such a case, can we assume these non-linearities will remain unaltered over time and different exposure levels or shooting conditions?
I'll have a read of Jim's article.
Regards
-
Hello Guillermo,
Lots of questions, to which I only have vague answers; I am learning as you are. I'll give you my matrix perspective so you can give us the NN's.
With current hardware, I think we can assume linearity of the photodiodes/pixels/Raw data to, say, 1 stop below clipping. Assuming perfect Spectral Sensitivity Functions of the image acquisition hardware (i.e. one exact matrix multiplication away from Cone Fundamentals and therefore CIE Color Matching Functions), the system should then be perfectly linear, and in theory all we would need are three patches to fully determine it. 3 patches = 9 readings, 9 equations with 9 unknowns (the coefficients in the matrix), et voilà, the one and only precise solution to our problem for the given observer and illuminant.
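A numpy sketch of that exactly determined case, with made-up camera responses and XYZ references for the three patches:

```python
import numpy as np

# Hypothetical white-balanced camera responses, one row per patch
cam = np.array([[0.9, 0.1, 0.1],
                [0.2, 0.8, 0.1],
                [0.1, 0.2, 0.7]])
# Corresponding (made-up) reference XYZ values for the same three patches
xyz = np.array([[0.40, 0.21, 0.02],
                [0.35, 0.70, 0.11],
                [0.18, 0.09, 0.95]])

# Solve for the 3x3 matrix M with xyz_i = M @ cam_i for every patch:
# 9 equations, 9 unknowns, a unique solution when cam is invertible
M = np.linalg.solve(cam, xyz).T
print(M)
print(np.allclose(cam @ M.T, xyz))  # the matrix reproduces all three patches
```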
However, in reality SSFs are not a single matrix multiplication away from Cone Fundamentals, and this introduces non-linearities into the system. All we can do with a matrix is then find a best approximate compromise, and that's why the matrix in this context is properly referred to as a Compromise Color Matrix. We use more patches in order to build an overdetermined system and 'regress to the mean'. Of course in such situations we have to be mindful of bias and overfitting, as you pointed out. Since the result is just a best compromise, non-linear corrections (often LUTs) may become necessary to push down some of the outliers, always keeping bias and overfitting in mind. Your NN performs both functions, linear regression and non-linear corrections, at the same time.
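And the overdetermined version: with many (synthetic) patches and a bit of noise, least squares recovers the underlying matrix; that is the 'regression to the mean' mentioned above. All values are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical ground truth: a 3x3 mixing matrix plus measurement noise
M_true = np.array([[0.70, 0.20, 0.10],
                   [0.30, 0.60, 0.10],
                   [0.05, 0.15, 0.80]])
cam = rng.uniform(0, 1, (1000, 3))                      # one row per patch
xyz = cam @ M_true.T + rng.normal(0, 0.01, (1000, 3))   # noisy references

# Overdetermined system (3000 equations, 9 unknowns) solved by least squares
M_fit, *_ = np.linalg.lstsq(cam, xyz, rcond=None)
M_fit = M_fit.T
print(np.round(M_fit, 3))
```

With 1000 patches the per-patch noise averages out and the fitted coefficients land very close to the true ones, which is the whole point of using more patches than strictly needed.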
So the result from our overdetermined system is not perfect and may not go through every patch/value: zero and one are just two values like all others. Using 3x3 matrices implies that all three channels will have zero output with zero input. If one were to relax that assumption we could search for 12 coefficients (a 3x4 matrix that allows for an offset) and obtain the best fit that way. I sometimes do it, but typically I want all outputs to go to zero when the input is zero, so I use 3x3 matrices as a matter of course, as I think do most raw converters (and the DNG spec).
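The 3x4 (affine) variant just augments the camera data with a column of ones; with a synthetic black offset baked into the data, the extra column of coefficients recovers it. Again, every value here is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical patches generated with a small black offset (e.g. flare)
M_true = np.array([[0.70, 0.20, 0.10],
                   [0.30, 0.60, 0.10],
                   [0.05, 0.15, 0.80]])
offset = np.array([0.02, 0.02, 0.03])
cam = rng.uniform(0, 1, (200, 3))
xyz = cam @ M_true.T + offset

# Column of ones -> 12 unknowns: a 3x4 matrix that allows for an offset
cam_aug = np.column_stack([cam, np.ones(len(cam))])
M_affine, *_ = np.linalg.lstsq(cam_aug, xyz, rcond=None)
print(np.round(M_affine.T, 3))  # last column recovers the offset
```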
In theory the same applies at the clipping end of the color cube. As Oscar will tell you, we could reduce the number of unknowns to 6 by assuming that the 1 end of the cube (white balanced saturation/clipping) occurs at the coordinates of the illuminant (say the XYZ white point or L = 100, a = b = 0), but I generally don't do that because I use it as a rough check on the linearity of the fit (CCT of matrix with [1 1 1] input should be close to that of the actual illuminant) - besides, most landscapes do not need to have more 'perfect' whites at the expense of other tones. In fact, following this line of thinking perhaps I should be using 3x4 matrices...
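A rough sketch of that sanity check, using a made-up compromise matrix and the D50 white point:

```python
import numpy as np

# Hypothetical compromise matrix and D50 white point, for a rough linearity check
M = np.array([[0.72, 0.18, 0.07],
              [0.28, 0.65, 0.07],
              [0.03, 0.12, 0.68]])
wp_illuminant = np.array([0.9642, 1.0, 0.8249])  # D50 XYZ, normalised to Y = 1

wp_matrix = M @ np.ones(3)  # where white-balanced clipping [1, 1, 1] lands
print(wp_matrix, np.abs(wp_matrix - wp_illuminant))
```

If the mapped white sits far from the illuminant's XYZ, that is a hint the linear fit is straying, which is the rough check described above.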
It would be interesting to see that graph on properly chosen log scales to see deviations from a straight line.
Jack