Topic: Training a neural network to mimic image processing (Read 1615 times)

Guillermo Luijk · « **on:** April 03, 2019, 08:01:22 am »

I'm learning about neural networks, so I tried this: the goal is to train a neural network to perform the regressions needed to mimic an arbitrary image processing consisting of:

- Non-linear RGB curves
- Colour desaturation
- Hue rotation

I did this kind of exercise in the past with curves, but they only perform well in processings that can be done using curves (this excludes arbitrary desaturations and hue cycling).

Our processing takes a picture like this...

...and turns it into this (the ugly appearance is irrelevant, the point was to make things difficult for the NN):

The process consists of teaching the neural network the transformation function from every input {R, G, B} to its corresponding output {R', G', B'}. This is done via a synthetic 8-bit image containing all possible combinations: 256x256x256 = 16,777,216 (about 17 million) colours (borrowed from Bruce Lindbloom). This is called the training set:
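As a rough sketch of what such a training set contains, the full list of input colours can be generated with NumPy. This only enumerates the colours; it does not reproduce the pixel layout of Bruce Lindbloom's image, and the function name `rgb_grid` is made up for this illustration.

```python
import numpy as np

def rgb_grid(levels=256):
    """Return every (R, G, B) combination as an array of shape (levels**3, 3)."""
    axis = np.arange(levels, dtype=np.uint8)
    r, g, b = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.stack([r.ravel(), g.ravel(), b.ravel()], axis=1)

# Small demonstration with 4 levels per channel (64 colours).
# Calling rgb_grid(256) would give the full 16,777,216 rows.
demo = rgb_grid(4)
print(demo.shape)
print(demo[:3])
```

With `levels=256` the array takes about 50 MB as 8-bit integers, so it fits comfortably in memory.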

And I have applied in Photoshop the processing we want to model:

The NN doesn't really know it's dealing with image information. For the NN this is just an R3 -> R3 function regression problem. We train the NN with some adequate parameters (only one hidden layer was used, with 32 nodes):

# NN training hyperparameters
from sklearn.neural_network import MLPRegressor

regr = MLPRegressor(
    solver='adam',             # alternatives: 'sgd', 'lbfgs'
    alpha=0,                   # no L2 (ridge regression) regularization
    hidden_layer_sizes=(32,),  # one hidden layer with 32 nodes
    activation='logistic',     # hidden layer activation function (default 'relu');
                               # 'logistic' (sigmoid) seems more adequate to model continuous functions
    max_iter=30,               # max epochs
    tol=0.00001,               # tolerance for early stopping
    n_iter_no_change=10,       # number of epochs to check tol
    verbose=True)              # tell me a story
# Output layer activation function (default 'identity');
# 'relu' seems a good idea since RGB values can only be positive.
# Caveat: scikit-learn's fit() appears to reset out_activation_ to 'identity' for regressors,
# so this assignment may only affect predict() when it is made after training.
regr.out_activation_ = 'relu'
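Before a regressor like the one above can be fitted, each image has to become a table with one row per pixel and three columns. The sketch below shows one way to do that; scaling the values to [0, 1] is an assumption on my part (the post does not say how the data was scaled), and both helper names are invented for the example.

```python
import numpy as np

def image_to_samples(image):
    """Flatten an (H, W, 3) uint8 image to an (H*W, 3) float array in [0, 1]."""
    return image.reshape(-1, 3).astype(np.float64) / 255.0

def samples_to_image(samples, shape):
    """Inverse step: clip to [0, 1], rescale to 8 bits and restore (H, W, 3)."""
    scaled = np.round(np.clip(samples, 0.0, 1.0) * 255.0)
    return scaled.astype(np.uint8).reshape(shape)

# Toy 1x2 image standing in for a real photograph.
image = np.array([[[0, 128, 255], [10, 20, 30]]], dtype=np.uint8)
X = image_to_samples(image)
restored = samples_to_image(X, image.shape)
print(X.shape)
print(np.array_equal(restored, image))
```

Training would then look something like `regr.fit(image_to_samples(before), image_to_samples(after))`, where `before` and `after` are hypothetical arrays holding the unprocessed and processed training images.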

This is what a 6-node NN looks like:

Once trained, some ~200 coefficients have been calculated (227 for 3 inputs, 32 hidden nodes and 3 outputs, counting the bias terms). These roughly 200 figures define the whole NN, which models up to 16.8 million possible colour transformations. While NNs seem magic, in prediction they just apply this basic sum + activation function on each node (the weights wi are the coefficients mentioned above):
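To make the "sum + activation" step concrete, here is a minimal NumPy sketch of the prediction pass for a 3-32-3 network with a sigmoid hidden layer and a relu output. The weights are random placeholders, not the trained ones, and the function names are only illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Prediction pass: weighted sum + activation on each layer."""
    hidden = sigmoid(x @ W1 + b1)             # hidden layer, 32 nodes
    return np.maximum(0.0, hidden @ W2 + b2)  # output layer with relu

def count_params(layer_sizes):
    """Weights plus biases for a fully connected network."""
    pairs = zip(layer_sizes[:-1], layer_sizes[1:])
    return sum(n_in * n_out + n_out for n_in, n_out in pairs)

# Random placeholder weights (a trained net would supply real ones).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 32)), rng.normal(size=32)
W2, b2 = rng.normal(size=(32, 3)), rng.normal(size=3)

out = forward(np.array([[0.2, 0.5, 0.8]]), W1, b1, W2, b2)
print(out.shape)
print(count_params([3, 32, 3]))  # 3*32 + 32 + 32*3 + 3 = 227
```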

The last step is to run the net on images that are unknown to it (the NN never "saw" them during training); this is the test set. After that we compare the prediction with the exact processing applied in Photoshop. Results seem promising, although I have to improve some things (deep shadows and contrast fare worst, but colour is very good). Just 2 examples:

Original image:

Exact processing:

NN prediction:

Original image:

Exact processing:

NN prediction:
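Beyond a visual comparison, the difference between the exact processing and the NN prediction can be put into numbers. The sketch below computes a few per-channel error figures in 8-bit levels; the metric choice (mean absolute error, RMSE, maximum error) is mine rather than something used in the post, and the arrays are toy data.

```python
import numpy as np

def error_stats(exact, predicted):
    """Error between two images of equal shape, in 8-bit levels."""
    # Cast to float first so that uint8 subtraction cannot wrap around.
    diff = predicted.astype(np.float64) - exact.astype(np.float64)
    return {
        "mae": float(np.abs(diff).mean()),
        "rmse": float(np.sqrt((diff ** 2).mean())),
        "max": float(np.abs(diff).max()),
    }

# Toy one-pixel "images"; real use would pass the two full-size results.
exact = np.array([[[10, 20, 30]]], dtype=np.uint8)
predicted = np.array([[[12, 20, 27]]], dtype=np.uint8)
stats = error_stats(exact, predicted)
print(stats)
```

Computing the same figures on a mask of the darkest pixels would be one way to check the impression that deep shadows are the weakest area.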

Possible applications:

- Copy someone's processing when they don't want to share it
- Camera JPEG replication from RAW (any JPEG style on any brand)
- Sensor calibration
- Reverse engineer cinema filters
- Reverse engineer app filters (Instagram, NIK,...)
- Mimic old film (Kodachrome, Velvia,...)
- Mimic chemical cross processings

Since NNs are trained by supervised learning, they need both the before and the after image. These can be tricky to obtain in some cases (e.g. film, where we'd need a chemically developed image and would have to capture the same scene again with a digital camera).

Regards

Steve Gordon · « **Reply #1 on:** April 03, 2019, 10:18:47 am »

Fascinating, bewildering, and a little terrifying.

Thanks for posting.

This stuff is inevitably our future but we must tread ethically carefully...

nirpat89 · « **Reply #2 on:** April 03, 2019, 10:29:23 am »

Quote from: Steve Gordon on April 03, 2019, 10:18:47 am

Fascinating, bewildering, and a little terrifying.

Thanks for posting.

This stuff is inevitably our future but we must tread ethically carefully...

+1. Indeed!

Application #1 would be worrisome.

rdonson · « **Reply #3 on:** April 03, 2019, 11:39:16 am »

So much for developing your own style???

32BT · « **Reply #4 on:** April 03, 2019, 12:13:04 pm »

This doesn't make much sense, since you're designing the NN according to the solution and then using a technically horrible method to initialise that solution. Kind of using Monte Carlo on 3 curves with 32 steps to find the non-linear conversion.

I understand the need for a simplified learning example, but this isn't it. It does not provide you with insights into the power of patterns within AI. Think of AI as a non-linear matrix conversion. You should technically be able to do very complex destination spaces with a limited number of hidden nodes. So, for example, 2 hidden layers of 4 nodes may be all that is necessary for conversion to a highly contorted 3D destination space. The right design could work really well to encode printer colour management, for example.

Another design may be using 2 layers where for example 1 layer represents 5 points on a curve, and a second layer represents interpolation of those points to obtain full bit depth output. Then you first train that second layer for correct interpolation, and then train the 2 layers for your image manipulation.
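The "5 points on a curve plus interpolation" idea can be pictured without any NN at all. The sketch below uses plain linear interpolation between five made-up control points; both the point values and the choice of linear interpolation are assumptions for illustration, since the post specifies neither.

```python
import numpy as np

# Hypothetical control points of a tone curve (input level -> output level).
xs = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
ys = np.array([0.0, 0.15, 0.5, 0.85, 1.0])

def apply_curve(values, xs=xs, ys=ys):
    """Linearly interpolate the 5-point curve at the given input levels."""
    return np.interp(values, xs, ys)

# Expand the 5 points to a full 8-bit lookup table (256 entries).
levels = np.arange(256) / 255.0
lut = apply_curve(levels)
print(lut.shape)
print(apply_curve(0.125))
```

In the two-layer design described above, the first layer would hold the equivalent of `ys` and the second layer would learn what `np.interp` does here.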

I'll see if I can come up with a better example to illustrate my point.

Guillermo Luijk · « **Reply #5 on:** April 03, 2019, 12:33:33 pm »

Quote from: 32BT on April 03, 2019, 12:13:04 pm

This doesn't make much sense, since you're designing the NN according to the solution and then using a technically horrible method to initialise that solution. Kind of using Monte Carlo on 3 curves with 32 steps to find the non-linear conversion.
(...)
I'll see if I can come up with a better example to illustrate my point.

Yes, that is what supervised learning is about. I want to try different strategies this weekend (not only adding layers but also trying other activation functions). I just wanted to start with a single hidden layer and see how much impact the number of nodes had. I ended up using 32 nodes because it was still fast (less than 5 min to train the net), but about 10 nodes already started to produce quite a good result.

If you mean RGB curves applied to a single channel each, I cannot see how they could perform hue rotations. More degrees of freedom are needed.
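A small check with Python's standard `colorsys` module illustrates the point. Pure red and magenta share the same input R value, yet after a 120 degree hue rotation their output R values differ, so no single curve R' = f(R) can reproduce both. The helper `rotate_hue` is written just for this example and uses the HSV model as a stand-in for whatever Photoshop does internally.

```python
import colorsys

def rotate_hue(rgb, degrees):
    """Rotate the hue of an (r, g, b) triple with components in [0, 1]."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb((h + degrees / 360.0) % 1.0, s, v)

red = (1.0, 0.0, 0.0)
magenta = (1.0, 0.0, 1.0)  # same R input as pure red

# Red turns into (approximately) green, so its output R is close to 0.
print(rotate_hue(red, 120))
# Magenta turns into (approximately) yellow, so its output R is close to 1.
print(rotate_hue(magenta, 120))
```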

Regards

32BT · « **Reply #6 on:** April 03, 2019, 01:10:52 pm »

What would be interesting is to attempt to design a NN that converts RGB to HSB.

There are some interesting simple computations involved in the mathematical conversion, and it would be useful to find out what is necessary to get the NN to come to the same result. It might show very clearly the difference between encoding patterns vs encoding functions.
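For reference, the usual textbook RGB to HSB (HSV) conversion is sketched below. It relies on max, min and a three-way branch, which are exactly the kind of operations a small NN with smooth activations would have to approximate. The function name is illustrative, and hue is returned in degrees.

```python
import colorsys

def rgb_to_hsb(r, g, b):
    """Convert r, g, b in [0, 1] to (hue in degrees, saturation, brightness)."""
    mx, mn = max(r, g, b), min(r, g, b)
    delta = mx - mn
    if delta == 0:
        h = 0.0                      # grey: hue is undefined, use 0
    elif mx == r:
        h = ((g - b) / delta) % 6.0
    elif mx == g:
        h = (b - r) / delta + 2.0
    else:
        h = (r - g) / delta + 4.0
    s = 0.0 if mx == 0 else delta / mx
    return (60.0 * h, s, mx)

# Cross-check one colour against the standard library implementation.
h, s, v = rgb_to_hsb(0.2, 0.4, 0.6)
ref = colorsys.rgb_to_hsv(0.2, 0.4, 0.6)
print(h, s, v)
print(ref[0] * 360.0, ref[1], ref[2])
```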

Guillermo Luijk · « **Reply #7 on:** April 03, 2019, 01:54:53 pm »

Quote from: 32BT on April 03, 2019, 01:10:52 pm

What would be interesting is to attempt to design a NN that converts RGB to HSB.

There are some interesting simple computations involved in the mathematical conversion, and it would be useful to find out what is necessary to get the NN to come to the same result. It might show very clearly the difference between encoding patterns vs encoding functions.

Good point. Indeed, I wondered yesterday whether a prior conversion to HSL would improve the result, since HSL is much closer to a perceptual model. In fact I wonder if the hidden layer could devote differentiated nodes to latent features related to Hue, Saturation and Luminance. I also wonder how the NN would deal with the cyclic property of Hue; I experienced issues with Hue while doing a K-Means exercise, and it forced me to increase the number of clusters because H=359° is the farthest hue from H=0° in Euclidean distance, even though both are practically the same colour.
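One common workaround for the cyclic hue problem, sketched here as a suggestion rather than something tested in this thread, is to replace the hue angle with its cosine and sine, so that 359° and 0° end up close together in Euclidean distance. The function names are made up for the example.

```python
import math

def hue_to_xy(hue_deg):
    """Map a hue angle to a point on the unit circle."""
    angle = math.radians(hue_deg)
    return (math.cos(angle), math.sin(angle))

def hue_distance(h1, h2):
    """Shortest angular distance between two hues, in degrees."""
    d = abs(h1 - h2) % 360.0
    return min(d, 360.0 - d)

# Naive difference treats 359 and 0 as almost opposite ends of the scale.
print(abs(359.0 - 0.0))
# Circular distance and the (cos, sin) encoding both see them as neighbours.
print(hue_distance(359.0, 0.0))
print(math.dist(hue_to_xy(359.0), hue_to_xy(0.0)))
```

The same (cos, sin) pair could be fed to K-Means or to a NN in place of the raw hue value.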

Regards

Guillermo Luijk · « **Reply #8 on:** April 04, 2019, 07:13:52 pm »

I improved the NN by using 2 hidden layers of 64 nodes each. After training for over 20 min the result is nearly indistinguishable from the genuine processing. There are still very minor differences in the deep shadows, but I'm really happy with the result, bearing in mind that the NN has been trained with a generic training set. This means it can process any image with the same expected accuracy:

If anyone is interested in the code, it can be found on GitHub.

Regards

Jack Hogan · « **Reply #9 on:** April 08, 2019, 10:58:18 am »

Cool Guillermo!

Guillermo Luijk · « **Reply #10 on:** April 16, 2019, 07:00:20 pm »

Thanks Jack!

I'm planning to take the experiment a step further and try to train a NN for sensor calibration. But I need some assistance from you colour experts to make sure all this makes sense.

The plan is as follows:
1. Shoot an IT8 card and perform a linear RAW development on it (DCRAW). In principle without WB (i.e. all multipliers set to 1) and even without output colour space conversion. These two steps could later be included in the basic processing if we find out we are demanding too much from the NN.
2. Train the NN to mimic the needed functions to convert the original 200+ patches of the IT8 card into the exact RGB values of a perfect calibration.
3. After that the NN would be applied to test images to check for colour rendition.

The exercise is now much more challenging because we'll only have 200+ input/output pairs (while we had a complete mapping of nearly 17 million pairs in the original exercise), so the NN will have to interpolate a lot and care should be taken to prevent overfitting.
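One simple guard against overfitting with so few pairs is to hold some patches out of training and compare the error on both groups. The sketch below shows the mechanics on synthetic data. To keep it self-contained it fits a plain least-squares 3x3 colour matrix as a stand-in model instead of a NN, and every number in it (patch values, matrix, noise level) is invented, not measured from an IT8 card.

```python
import numpy as np

def holdout_split(n_samples, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) as a random split of the patch indices."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    n_val = max(1, int(round(n_samples * val_fraction)))
    return idx[n_val:], idx[:n_val]

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic stand-in for ~200 patch pairs (NOT real IT8 data).
rng = np.random.default_rng(1)
raw = rng.uniform(0.0, 1.0, size=(200, 3))
true_matrix = np.array([[1.6, -0.4, -0.2],
                        [-0.3, 1.5, -0.2],
                        [0.0, -0.5, 1.5]])
target = raw @ true_matrix.T + rng.normal(scale=0.01, size=(200, 3))

train, val = holdout_split(len(raw))
# Stand-in model: least-squares 3x3 matrix fitted on the training patches only.
M, *_ = np.linalg.lstsq(raw[train], target[train], rcond=None)
train_err = rmse(raw[train] @ M, target[train])
val_err = rmse(raw[val] @ M, target[val])
print("train RMSE:", train_err)
print("validation RMSE:", val_err)
```

A validation error much larger than the training error would be the warning sign. With a NN in place of the matrix fit, the same split would apply unchanged.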

If the results are good, we'll have modelled in a simple NN (roughly 1000 numerical values) the following complex transformations:
- White balance on RAW data
- Colour space conversion
- Sensor calibration (accurate expected colours and contrast).

But I need:

- To make sure I'm not talking bullshit.
- The exact expected RGB values of the IT8 in some colour space (e.g. sRGB), something like the image depicted below. Is this information available?
- Someone who can help to check the accuracy in delta E of both the IT8 card mapping done by the NN and the test images.
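As a starting point for the delta E check, here is a sketch of the simplest variant, CIE76, which is the Euclidean distance in CIELAB. It assumes the values are sRGB in [0, 1] with a D65 white point and does no chromatic adaptation; IT8 reference files are often given for other illuminants, and more recent formulas such as CIEDE2000 exist, so this is only a first approximation. The function names are illustrative.

```python
import numpy as np

# Linear sRGB -> CIE XYZ matrix and reference white, both for D65.
M_SRGB_TO_XYZ = np.array([[0.4124564, 0.3575761, 0.1804375],
                          [0.2126729, 0.7151522, 0.0721750],
                          [0.0193339, 0.1191920, 0.9503041]])
WHITE_D65 = np.array([0.95047, 1.00000, 1.08883])

def srgb_to_lab(rgb):
    """Convert sRGB values in [0, 1] (shape (..., 3)) to CIELAB."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB transfer curve to get linear light.
    linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    t = (linear @ M_SRGB_TO_XYZ.T) / WHITE_D65
    delta = 6.0 / 29.0
    f = np.where(t > delta ** 3, np.cbrt(t), t / (3.0 * delta ** 2) + 4.0 / 29.0)
    L = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)

def delta_e76(rgb1, rgb2):
    """CIE76 colour difference between two sRGB colours (or arrays of them)."""
    return np.linalg.norm(srgb_to_lab(rgb1) - srgb_to_lab(rgb2), axis=-1)

white, black = [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]
print(srgb_to_lab(white))        # L close to 100, a and b close to 0
print(delta_e76(white, black))   # close to 100
```

Passing two arrays of shape (N, 3), one with the NN output for each patch and one with the reference values, returns one delta E figure per patch.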

Regards
