In fact a pure binning strategy by averaging 2x2 pixels into 1 final pixel, which according to Emil's statistics doubles the SNR, ...
Emil Martinec has good information. Unfortunately, like many others, he does not seem to consider that the noise reduction obtained by averaging a number of images together differs from the noise reduction that results when an image is resized. I could have read his pages wrongly, but I sensed this as an underlying assumption, as in the quotes from his website below:
"Bottom line: At the cost of having half the linear resolution, the superpixel made by binning together a 2x2 block of pixels has twice the signal-to-noise ratio,"
"If one downsamples an image properly, one decreases the resolution, and noise decreases in proportion to the linear change in image size."
Additionally, Emil does not present any analysis of how the rate of change of the signal enters the SNR formulation. For example, the best SNR in simple average-based downsampling is related to the second derivative of the signal intensity.
The problem is the assumption that the mean square error used in formulating SNR depends only on noise reduction, whereas it also depends on the signal degradation due to smoothing (the bias term), which is frequently ignored. As a worst case, suppose that at each pixel position you average all the pixels in the image: you end up with a flat, constant image in which every pixel has the same value. That is a pathetic SNR.
Perhaps the easiest way to see this is that the mean square error in such an estimation is given as:
error = (bias).^2 + (variance of noise)
When you average images of the same scene to get a cleaner image, the average is taken in the time domain. The image intensity varies from pixel to pixel, but each pixel can be considered to have its own true value that is being estimated. In this case, since the average is an unbiased estimator, the bias is zero, and the variance of the noise decreases as the number of images being averaged increases. The SNR therefore increases in proportion to the square root of the number of images.
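To make the time-domain case concrete, here is a small simulation sketch. The scene (a sine pattern), the noise level, and the function name `rms_error_after_frame_averaging` are all made up for illustration, and the noise is assumed to be independent, zero-mean Gaussian. It averages N noisy frames of the same static scene and compares the remaining RMS error with sigma / sqrt(N).

```python
import math
import random

def rms_error_after_frame_averaging(true_scene, sigma, n_frames, rng):
    """Average n_frames noisy copies of true_scene; return RMS error vs the truth."""
    sq_err = 0.0
    for true_value in true_scene:
        # Every frame sees the same true value plus independent zero-mean noise.
        samples = [true_value + rng.gauss(0.0, sigma) for _ in range(n_frames)]
        mean = sum(samples) / n_frames
        sq_err += (mean - true_value) ** 2
    return math.sqrt(sq_err / len(true_scene))

rng = random.Random(0)
# Hypothetical scene: neighbouring pixels differ, but each pixel is fixed in time.
scene = [100.0 + 50.0 * math.sin(0.3 * i) for i in range(5000)]
sigma = 10.0

results = {n: rms_error_after_frame_averaging(scene, sigma, n, rng) for n in (1, 4, 16)}
for n, err in results.items():
    print(f"{n:2d} frames: RMS error {err:.2f}, sigma/sqrt(N) = {sigma / math.sqrt(n):.2f}")
```

Because each pixel is only ever averaged with itself, the scene content does not matter here: the error should track sigma / sqrt(N) no matter how quickly the signal varies from pixel to pixel.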
However, when an image is resized by averaging neighboring pixels, the average is taken in the spatial domain. Even in the absence of noise, the signal is not constant across neighboring pixels (unfortunately Emil chose a degenerate example on his website, adding noise to a constant image), so the bias term is not zero. The noise variance goes down as a larger number of pixels is averaged, but the bias goes up, and it enters the error as a square.
There is an optimal number of pixels to be averaged to get the best SNR. However, that number varies with pixel position, implying that a different number of pixels must be averaged together at each position to reach the maximum SNR there! The reason is that the rate of change of the signal and the noise level determine the optimal number of pixels to average, and these parameters change from location to location.
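A rough way to see why the optimum moves around is to plug numbers into error = bias^2 + variance for a centered 1-D moving average. The sketch below is my own simplification, not something from Emil's pages: it assumes the signal is locally well described by its second derivative (a second-order Taylor expansion), in which case a window of width w = 2h + 1 has bias of about f'' * h * (h + 1) / 6 and noise variance sigma^2 / w. The function name `best_window` and the curvature values are hypothetical.

```python
def best_window(second_derivative, sigma, max_half_width=50):
    """Return (width, mse) minimising bias^2 + variance for a centered moving average.

    Assumes a second-order Taylor model of the signal: the mean of j^2 over
    j = -h..h is h*(h+1)/3, so the bias is about f'' * h * (h + 1) / 6.
    """
    best = None
    for h in range(max_half_width + 1):
        width = 2 * h + 1
        bias = second_derivative * h * (h + 1) / 6.0
        mse = bias ** 2 + sigma ** 2 / width  # error = bias^2 + noise variance
        if best is None or mse < best[1]:
            best = (width, mse)
    return best

sigma = 10.0
# Hypothetical local curvatures: flat area, gentle gradient, texture, sharp detail.
for curvature in (0.0, 0.1, 1.0, 5.0):
    width, mse = best_window(curvature, sigma)
    print(f"f'' = {curvature:4.1f}: best width {width:3d}, mse {mse:6.2f}")
```

Under this model a flat region wants the largest window allowed, while a region with strong curvature wants a very small one, which is the position dependence described above.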
It is easy to see when SNR gets worse in resizing based upon averaging. Suppose you take an image and average each 2x2 block of pixels into 1 pixel, so that you get an image 1/4 the size of the original. In this case, spatial pixel averaging gives good noise reduction, and many people erroneously conclude that the process can be extended to averaging an even larger number of pixels, i.e., 4x4->1, 8x8->1, 16x16->1, etc. But there is a problem: although the noise is being reduced, the image is also becoming more and more blurry. If you keep on doing that, you end up with a flat, constant image and a terrible SNR.
On the other hand, it does not matter how many images you average in time to get a cleaner image, provided all those images are of the same static shot, where each pixel had a fixed true value but was corrupted by noise (hopefully zero mean). The average of a constant is the same constant, so the original signal is *not* blurred (the bias is zero), while the noise is reduced significantly and the SNR improves.
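The effect of ever larger bins can be checked with a quick 1-D stand-in for the 2x2, 4x4, 8x8 progression. Everything here is an illustrative assumption: the signal is a sine wave with a period of 64 pixels, the noise is zero-mean Gaussian with sigma = 10, and the error is measured against the true full-resolution signal, so that blur counts as error. The function name `rms_error_after_binning` is hypothetical.

```python
import math
import random

def rms_error_after_binning(true_signal, noisy_signal, k):
    """Replace each block of k samples by its mean; return RMS error vs the truth."""
    sq_err = 0.0
    for start in range(0, len(noisy_signal), k):
        block = noisy_signal[start:start + k]
        block_mean = sum(block) / len(block)
        # Every pixel in the block is now represented by the same averaged value.
        for true_value in true_signal[start:start + k]:
            sq_err += (block_mean - true_value) ** 2
    return math.sqrt(sq_err / len(true_signal))

rng = random.Random(1)
n, sigma = 4096, 10.0
truth = [100.0 + 50.0 * math.sin(2.0 * math.pi * i / 64.0) for i in range(n)]
noisy = [value + rng.gauss(0.0, sigma) for value in truth]

errors = {k: rms_error_after_binning(truth, noisy, k) for k in (1, 2, 4, 8, 16, 64)}
for k, err in errors.items():
    print(f"bin size {k:2d}: RMS error {err:.2f}")
```

With these particular numbers the error should drop for the first couple of bin sizes and then climb again as the blur takes over. At a bin size of 64 each block covers a whole period of the sine, so the output is essentially the flat image described above.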