There are a lot of threads here now in the LL forum about aliasing, anti-alias filters, foveon vs. bayer array, upsizing, downsizing, etc.
What does aliasing look like?
Here is a synthetic image of a circular sine wave that has been sampled close to the Nyquist rate.
(http://sites.google.com/site/cliffpicsmisc/home/sine204_zoom.png)
The original on the left is shown zoomed progressively larger to reveal the details in the pixel structure.
What do you see here? Is this what you would call "aliasing?"
Regards,
Cliff
What do you see here? Is this what you would call "aliasing?"
The point is that the failure to reconstruct, or reconstruction error, can be mistaken for aliasing, and can even badly affect images that have no aliasing at all.
What is a reconstruction filter?
Ok, I apologize, it was a trick question.
There is no aliasing in the posted image.
It is a sine of 2.04 samples per cycle, i.e., a frequency of about 0.49 cycles per sample, roughly 98% of the Nyquist frequency, so definitely below Nyquist.
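For anyone who wants to reproduce the test pattern, here is a minimal Octave/Matlab sketch of a circular sine at 2.04 samples per cycle; the grid size, amplitude, and phase are my assumptions, not necessarily what was used for the posted image:

    n = 256;                            % grid size (assumed)
    [x, y] = meshgrid(-n/2 : n/2-1);
    r = sqrt(x.^2 + y.^2);              % radial distance from the center
    img = 0.5 + 0.5*sin(2*pi*r/2.04);   % one cycle every 2.04 samples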
Here is a center crop of what you get if you apply a high-quality reconstruction filter to the original image:
(http://sites.google.com/site/cliffpicsmisc/home/sine204_i4yaro.png)
I assume you downsampled the original image, because this image is about 8 pixels per cycle (over-sampled 4x). If so, then how does it look at 2.04 samples per cycle with the high-quality reconstruction filter?
The problem is that reconstruction results in a more finely-sampled (bigger) image. The image marked 100% shows the original samples; it has not been downsampled. The reconstruction example is a crop of a 4x upres. If you use nearest neighbor to reduce the size of the reconstruction by 4, you get back the original (possibly with a small phase shift due to how the nearest neighbor is done).
A 4x upres is more than needed. Here is a 2x upres/reconstruction (Yaroslavsky method):
Would you mind posting the Matlab code you used on that image? I'd like to try using it for some demosaic experiments.
The problem is that reconstruction results in a more finely-sampled (bigger) image.
The image marked 100% shows the original samples; it has not been downsampled.
Just to show it's not aliased, the FFT magnitude of the 100% original:
(http://sites.google.com/site/cliffpicsmisc/home/sine204_fftmag.png)
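For anyone who wants to make the same check on their own images, a short Octave/Matlab sketch: compute the centered FFT magnitude and look for energy folded back inside the ring. An unaliased 2.04 samples/cycle pattern shows a clean ring just inside Nyquist (about 0.49 cycles/pixel) and nothing reflected below it.

    mag = abs(fftshift(fft2(img)));                   % centered spectrum magnitude
    imagesc(log1p(mag)); axis image; colormap(gray);  % log scale for visibility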
So the moral of the story is, it's bad to pixel-peep at 100%. 200% is better!
On real images, it can tend to create ripples (due to it being done in the Discrete Cosine Transform domain). There are ways around that - Yaroslavsky has papers on his site that address that problem, but I didn't find any matlab code for it.
Maybe it's my unfamiliarity with the jargon, but isn't that called interpolation instead of reconstruction?
I downloaded some of the PDFs, and found them very interesting reading, especially the one about fast discrete sinc interpolation.
Maybe it's my unfamiliarity with the jargon, but isn't that called interpolation instead of reconstruction?
Again, excuse me, but I find that hard to believe, unless you mean that it attempts to show the original samples. If those are the original pixels, then it is impossible to arrive at your interpolated/reconstructed version without knowledge of how the pattern was supposed to look.
But that is the FFT magnitude of the mathematical model, not the aliased image shown at the top of the thread as 100%.
I downloaded some of the PDFs, and found them very interesting reading, especially the one about fast discrete sinc interpolation. The experiments he did, rotating the images in multiple small increments using various resampling techniques, were particularly instructive.
The thought I had was to use some variant of his fast discrete sinc reconstruction filter to do Bayer demosaic interpolation, then apply deconvolution-based lens blur corrections using PSFs derived from images of a special test target shot with the same camera/lens combination and demosaiced with the same algorithm. The main idea behind this is that the PSFs would not only reflect the blur characteristics of the lens, but of the camera's AA filter and the demosaic algorithm as well.
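For the deconvolution step of that idea, a Wiener-style filter would be a natural starting point. A rough Octave/Matlab sketch, where psf is the point spread function measured from the test target and k is an assumed noise-regularization constant:

    H = fft2(psf, size(img,1), size(img,2));   % psf origin assumed at (1,1)
    k = 1e-3;                                  % regularization (assumed value)
    restored = real(ifft2(fft2(img) .* conj(H) ./ (abs(H).^2 + k)));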
I used interp2d.m, found in the following set from Yaroslavsky's web site: http://www.eng.tau.ac.il/~yaro/adiplab/m-files.zip
On real images, it can tend to create ripples (due to it being done in the Discrete Cosine Transform domain). There are ways around that - Yaroslavsky has papers on his site that address that problem, but I didn't find any matlab code for it.
Rgds,
Cliff
Those are the actual samples. I provided a link to the interpolating code. If you don't have access to Matlab, maybe Jonathan can confirm for you that those are the actual results.
Which paper were you referring to (about suppressing ringing artifacts)? I'm curious how that is done.
At the risk of belaboring the point, here's a real-world example of aliasing that's not really aliasing.
I'm sure everyone has looked at the examples at maxmax.com, comparing the stock 5D with the maxmax HotRod modification (AA filter removed).
Crops of the famous air conditioner (only green channels shown) -
Stock:
(http://sites.google.com/site/cliffpicsmisc/home/5dstock_maxmax.png)
HotRod as posted by MaxMax:
(http://sites.google.com/site/cliffpicsmisc/home/5dhrcrop_maxmax.png)
HotRod with better raw conversion combined with sinc upres by 2:
(http://sites.google.com/site/cliffpicsmisc/home/5dhrcropi2.png)
Not quite so bad now, is it? Still a little moire in the grill (possibly a demosaic fault), but the jaggies are gone. Jaggies do not mean aliasing.
Rgds,
Cliff
If you don't have access to Matlab, ...
All your examples involve upsampling to smooth the image. Presumably the image at its native resolution will always exhibit the artifacts?
Also, where does one get the RAW file for this image?
The upsampling is not to smooth the image - none of the information below Nyquist is affected (or hardly at all). It is smoothed only to eliminate the spectral replicas that are a side-effect of sampling, to restore the original smooth image. These artifacts at native resolution are not part of the image - upsampling/interpolation/reconstruction is a way to separate the underlying image from the spectral artifacts. This is possible because the reconstruction artifacts aren't the result of higher frequencies that have been folded down and mixed with lower frequencies, as happens with "real" aliasing.
What is the minimum reconstruction filter expansion ratio necessary to remove reconstruction artifacts? You've shown that a ratio of 2:1 works well; would a ratio of 3:2 work equally well? My intent is to incorporate this into my image processing program, and the smaller the expansion ratio, the less RAM and CPU time needed to process the image. What are your thoughts regarding the smallest effective reconstruction expansion ratio?
I got Octave downloaded, installed, and running. Where do I copy the matlab .m files so I can access them from octave's command prompt?
The files can be downloaded about 1/3 of the way down the page: MaxMax (http://maxmax.com/hot_rod_visible.htm)
When looking at the image in its "native resolution", you are looking at the samples of the image, not the smooth original analog input to the sampling system. The upsampling is not to smooth the image - none of the information below Nyquist is affected (or hardly at all). It is smoothed only to eliminate the spectral replicas that are a side-effect of sampling, to restore the original smooth image. These artifacts at native resolution are not part of the image - upsampling/interpolation/reconstruction is a way to separate the underlying image from the spectral artifacts. This is possible because the reconstruction artifacts aren't the result of higher frequencies that have been folded down and mixed with lower frequencies, as happens with "real" aliasing.
The image at its native resolution will not always exhibit the artifacts, it depends on the image content. The artifacts are mostly associated with frequencies in the upper half of the spatial frequency range.
By the way, for the air conditioner example I used the latest AMML demosaicking version of dcraw found at ojodigital.com. Nice work!
Rgds,
Cliff
Cliff,
Thanks, this is the first post that really sheds a clear light on a problem that has been discussed ad nauseam, without agreement.
It would be useful if you or someone else could discuss how raw images are really processed. When I first thought about this stuff, I naively thought that something approximating sinc reconstruction was done separately on the rgb channels, resulting in a red, blue and green image all resampled to the same grid. Of course each channel will exhibit different resolution, consistent with its sampling. I was surprised to learn (somewhere) that more ad hoc techniques were employed that supposedly yielded higher resolution, while accepting a bit of aliasing. Why isn't the basic sampling theory used in practice?
Things start to smooth out pretty well at about 1.4 with the sine 2.04 image:
Should probably check with some other images, too.
There are a lot of threads here now in the LL forum about aliasing, anti-alias filters, foveon vs. bayer array, upsizing, downsizing, etc.
What does aliasing look like?
Regards,
Cliff
I think this is a good example (index.php?act=findpost&pid=324841).
That's why I mentioned the Sqrt(2) factor, which for a "sine 2.0" image results in a diagonal sampling of 2 pixels/cycle and more oversampling at other angles.
On that note, could you give an estimate about the processing speed of this re-sampling filter on an RGB image? In the code sample I saw that Fourier transforms are involved, so processing time will increase significantly with size. I understand that an optimized code in e.g. C++ would also give different results than in MatLab, but even the MatLab version would give an indication (e.g. single channel pixels/second) of what to expect for an RGB image such as the MaxMax example.
Your example shows what Emil was just talking about.
There is definitely some color aliasing going on there. It's possible that the luminance is unaliased and could be used to restore the color without aliasing. Is the raw file available?
Cliff
Actually the diagonals are oversampled in comparison to horizontal and vertical. If you look at the FFT, you will see that there is more room in the corners for higher frequencies.
The largest image I tried was your rings1 image at 1000x1000 pixels, which took about 6.5 seconds to do a 2x interpolation (on my not-state-of-the-art pc). A limitation of the interp2d code is that it only accepts integer ratios. So to do the above series at 0.1 intervals, I had to over-interpolate by a factor of 10 (for example, interpolated by 14x for the 1.4 example), and then at the end decimate by nearest neighbor by a factor of 10. This complicates things and makes the memory requirements impractical for this kind of processing, unless you break it down and process it in blocks.
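In outline, the workaround looks like this, where sincUpres stands in for any integer-factor sinc interpolator (I am not reproducing interp2d.m's actual calling convention here):

    up14 = sincUpres(img, 14);        % over-interpolate by 14 to target 1.4x
    out  = up14(1:10:end, 1:10:end);  % nearest-neighbor decimate by 10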
As I mentioned once before, for practical purposes a good bicubic interpolator might be good enough. In the maxmax example, PS bicubic does nearly as well as the sinc filter. But then again, it depends on what kind of other processing you will be doing later in the workflow. Reconstruction errors look really bad when you sharpen them.
Here is a "classic" paper by Mitchell on reconstruction: Rconstruction Filters in Computer Graphics (http://www.cs.utexas.edu/%7Efussell/courses/cs384g/lectures/mitchell/Mitchell.pdf)
Here is a "classic" paper by Mitchell on reconstruction: Rconstruction Filters in Computer Graphics (http://www.cs.utexas.edu/%7Efussell/courses/cs384g/lectures/mitchell/Mitchell.pdf)
In Figure 10, a typical problem is seen where portions of the image near an edge have become negative and have been clamped to zero. This results in pronounced black spots (e.g., at the top of the statue's head). Similar clamping occurs to white, but is less noticeable because of the eye's nonlinear response to contrast. Schreiber and Troxel have suggested that subjectively even sharpening can only be produced by introducing ringing transients in a suitably nonlinear fashion [SCH85]. These conspicuous clamping effects could also be eliminated by reducing the dynamic range of the image or raising the DC level of the image.
This is exactly why interpolation of a linear image (at least with spline-based algorithms) is a bad idea. Interpolating after a perceptually even gamma has been applied (approximately 2) gives much better results.
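A minimal sketch of that approach in Octave/Matlab, assuming linear data in lin scaled 0-1, approximating gamma 2 with a square root, and using imresize (from the image package) as the resampler:

    g   = sqrt(max(lin, 0));          % linear -> approximately perceptual (gamma 2)
    gi  = imresize(g, 2, 'bicubic');  % interpolate in the gamma-2 space
    out = gi.^2;                      % back to linear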
If one used a simple linear filter to do the interpolation of color channels, resolution would be limited by the sampling frequency, and so would be particularly poor for R and B channels on a Bayer RGGB array. However, image data is correlated between the color channels, and if those correlations are used one can achieve much higher resolution with much less artifacting. No good demosaic just does a linear interpolation; and the better ones make some use of the correlations between R, G, and B data to achieve resolution near Nyquist for the full array rather than the individual color subsampled arrays.
Cliff, thanks for the vote of confidence.
I get your point, sometimes working in the "perceptual space" is better. My concern is that in a gamma space, 2+2 does not equal 4, and depending on the operation that could cause visible distortion (like a harmonic generator).
Some time ago I was inspired by Bart's down-sampling page (http://www.xs4all.nl/%7Ebvdwolf/main/foto/down_sample/down_sample.htm) to try down-sampling in linear, and I recall seeing better results. Do you have some examples?
The ringing effects in the dark area are much more noticeable in the linear image, and most unforgivably, the rendering of the white dot on black background is not the perceptual inverse of the black dot on white. The gamma 2 interpolation is better on both counts.
I would think that linear would better emulate the additive mixing of light in the eye.
Cliff
Try looking at them in Photoshop side by side with a calibrated/profiled monitor. Comparisons on an uncalibrated monitor aren't going to be very useful.
That's what I did. And I confirmed that L* goes from 88 in the green, dips to 49 in the seam, then goes up to 60 in the magenta.
Why not a perceptually uniform space such as Lab? I thought that would be the space where one would want to interpolate, since color differences of adjacent pixels in the source image are, well, perceptually uniform.
OK, I think I know what's happening. My guess is that you are viewing your monitor in a dark room. I've been looking at my monitor in a more "average surround" environment, with a CRT with a low black point. That's enough to explain the differences in what we are seeing, since contrast perception is affected by the surround.
I don't dispute the existence of the dark band between the green and magenta. Try counting how many pixels it takes to get to L*50 from the edge of the clipped part of the black square vs the white square, in the linear image vs the gamma-2 image. IMO accurately rendering the perceptual transition between white and black is more important than perfectly handling oddball hard-edged color transitions like the green/magenta example that only rarely occur in real-world images.
Converting to LAB before resampling would solve both issues simultaneously, though.
In the sample I provided earlier (post 5), there is no room in the FT beyond 64 x Sqrt(2) distance from DC. Here are the image, its top right corner zoomed in, and the FFT (the yellow rectangle in the next image is 64x64 pixels):
Of course there is more room in the FT, but we also know what happens with the image when we push detail to the extreme (the image turns gray at given frequencies).
The ringing artifacts in the black area of the linear image (between the white dot and magenta square) have peak RGB values between 25 and 35. The same area in the gamma-2 image peaks between 10 and 14. That's a pretty significant difference, IMO, not just monitor viewing conditions.
It would be nice if I could upload the raw file to the forum... I will e-mail it to anyone who sends me their e-mail address. The raw file is .3FR, Hasselblad Phocus format.
I suppose I could upload it somewhere and post a link to it.
Dick,
I sent you my email address in a message through the LL message system. If you want, I can put the raw file where others will be able to download it.
Thanks,
Cliff
Hi...
I have had four requests for the raw file...
The file is 80MB, and my e-mail service says that the limit is 25MB.
When I crop the file in Phocus, it does not reduce the file size, or hide the identity of the young lady.
Reducing the size by cropping would be the best way if it were possible, as it would save me having to get permission from the subject, and compression might reduce the moire effect.
Emil,
Thanks for your thoughtful response about the spatial correlations between channels. While I do not dispute anything you wrote, it does raise more questions in my mind. For example, if the red and blue channels were oversampled, relying on the partial correlations would add distortion.
As the sampling of the red and blue channel improves without limit, the need for relying on the correlation diminishes. Certainly using the correlation was much more important when we had 6 MPix cameras than now at 24 MPix. The degree of correlation must be dependent on the scene. How is that taken into account? What are the criteria for including the correlation?
If you look closely, it's all checkerboard with just the 3 white lines.
So you can reconstruct up to sqrt(2) higher resolution, super-fine detail along the diagonals. Wasn't this what Fuji exploited in their Super CCD sensor?
Yes, and not very useful at final output resolution to display even a suggestion of a sine wave pattern.
Cliff, with all due respect (and I do mean that), there is a difference between theoretical reconstruction capability, and useful output resolution. What you have shown is that one can interpolate and thus reconstruct a signal when a good filter is used, but you cannot make the signal/image look good (like the real structure, e.g. a sine wave) at its native size. One needs to enlarge/interpolate to avoid visual artifacts, but at the same time one reduces the apparent - angular, from a fixed viewing distance - resolution.
IOW, we are faced with a trade-off, especially for on screen display, not just a single choice. That's why I raised the subject of down sampling on my web page about that subject.
Wish someone would write a similar FFT PS plugin for Mac. grrr.
Hi Emil,
You could use ImageJ (http://rsb.info.nih.gov/ij/), which is all Java and operates at 32-bit FP accuracy, not just 8-bit.
Cheers,
Bart
I do use ImageJ quite a bit for analysis, along with IRIS and Mathematica (the latter has been very handy for algorithm development).
But I would love to have something that is easily integrated into an image processing workflow in Photoshop.
Here is the procedure:
1. Do the forward FFT.
2. Increase the canvas size on all sides. The ratio of the new size/original size is your interpolation factor.
3. Do the inverse FFT.
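A sketch of those three steps in Octave/Matlab for a grayscale image and an integer factor L; the L^2 factor compensates for ifft2's normalization, and real() discards the tiny imaginary residue the Nyquist bin can leave on even-sized images:

    L = 2;                                   % interpolation factor
    [M, N] = size(img);
    F = fftshift(fft2(img));                 % 1. forward FFT, centered
    P = zeros(L*M, L*N);                     % 2. bigger "canvas" of zeros
    r0 = floor((L-1)*M/2); c0 = floor((L-1)*N/2);
    P(r0+(1:M), c0+(1:N)) = F;               %    original spectrum at the center
    out = L^2 * real(ifft2(ifftshift(P)));   % 3. inverse FFT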
Zero-padded DFT-based techniques have been used successfully for higher frequency resolution for a long time, e.g., for the separation of sinusoids that are close in frequency. In this case you are applying them in the reverse direction: zero-padding in frequency to get higher resolution in the spatial domain. However, insertion of zeros in one domain only lets you have more resolution in the other domain by interpolation with existing samples in that domain. No new detail is created, and the signal in the other domain only gets "stretched", and in-between points are filled by information from the neighboring samples.
Yes, none of this is new. Do you have any suggestions for eliminating the ripples?
The solution I devised for suppressing ringing with cubic splines is simple, but seems to be very effective. I established a limit for the spline's z coefficients (as defined here (http://en.wikipedia.org/wiki/Spline_interpolation#Cubic_spline_interpolation)). By limiting z to ± (maximum - minimum) / N, where N is between 8 and 32, ringing and ripples are greatly reduced without affecting spline values in conditions where ringing or ripples are not an issue. When its z coefficients are clamped to zero, the values returned by a cubic spline are identical to linear interpolation, which of course has no issues with ringing. By intelligently limiting z coefficient values, you can alter the behavior of the spline so that it interpolates quasi-linearly in conditions where ringing is problematic (high-contrast edges), without affecting the spline's behavior in conditions where ringing is not an issue.
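As I read that description, a sketch of the z-coefficient clamp in Octave/Matlab, assuming unit knot spacing and natural end conditions; N = 16 is one value from the 8-32 range mentioned above:

    function z = clamped_spline_z(y, N)
      n = numel(y);                                      % needs n >= 4
      A = 4*eye(n-2) + diag(ones(n-3,1),1) + diag(ones(n-3,1),-1);
      rhs = 6*(y(1:n-2) - 2*y(2:n-1) + y(3:n));          % second differences
      z = [0; A\rhs(:); 0];                              % natural-spline z coefficients
      zlim = (max(y) - min(y)) / N;
      z = max(min(z, zlim), -zlim);                      % clamp to suppress ringing
    end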
How well does your algorithm do at avoiding reconstruction error?
Discrete sinc functions sincd and sincd, defined by (8.10) and (8.24), are discrete point spread functions of the ideal digital lowpass filter, whose discrete frequency response is a rectangular function. Depending on whether the number of signal samples N is an odd or even number, they are periodic or antiperiodic with period N, as illustrated in Figures 8.2(a) and 8.2(b).
If one were to do the sinc-interpolation twice, once with an odd number of samples and once with an even number of samples (perhaps by deliberately padding the ends of the data with a different number of samples), wouldn't it be possible to use this periodic/antiperiodic property to combine the two interpolations and cancel out the ripples? In Yaroslavsky's diagrams, it looks like the ripple patterns surrounding the signal impulses are approximately mirror images of each other. If so, wouldn't this be a more mathematically elegant approach than Yaroslavsky's adaptive switch-to-nearest-neighbor-in-trouble-spots approach?
So the rings image will work well because its power spectrum matches the one assumed by the sinc filter; but natural images have a quite different power spectrum and so it's not clear that a sinc filter based reconstruction is going to be optimal. It certainly won't be in images with step edges and similar structures.
I tried out the Yaroslavsky/Happonen sinc interpolation routines on a few images. It turns out that his sliding window method performs as advertised and is very effective at avoiding ripple artifacts. Unfortunately there is a trade-off in that the sliding window method rolls off the higher frequencies enough that, in the end, it was no better in my tests than conventional methods like bicubic. (The sliding window has other attractive features such as optional noise reduction, but I did not test that.)
Thanks for your experiments. They are really instructive. Do you know if the sliding window method as implemented applies a window (such as Kaiser, etc.) to the blocks/segments of data? It is not surprising that windowing would give a better visual output. Some of the drawbacks of using full image data for interpolation, etc., are well-known. It is one of the reasons that one has interpolation schemes limited to, say, cubic (x^3), etc., but you won't normally see (x^100, etc.), especially if noise is a concern. Polynomial interpolation on short segments is more useful than on whole data.

He shows a windowed sinc but I haven't found any reference to the type of window. See the figure on page 48 here. (http://www.eng.tau.ac.il/%7Eyaro/RecentPublications/ps&pdf/FastSincInterpolation_ASTbook.pdf) The answer might be in Happonen's code.
As far as ringing is concerned, it is a product of both the reconstruction filter kernel and also the method used to measure it. I mentioned before that l_1 space is better for ringing suppression, but it appears people are more interested in l_2.
I don't follow what you mean here, please explain.
Perhaps we will eventually accept that we want soft images in 100% because it gives us more latitude and we can apply the same sampling theory we use so successfully everywhere else in engineering. While a CD (sampled at 44100 Hz) can theoretically represent signals up to 22050 Hz, that is not what we do. We use a nice cut-off filter so the upper frequencies are effectively zero. That is, we actually do not want steep sample-to-sample variations.

People don't want soft images at 100%, nor do they want images with jagged edges at 100%. How about sharp, jaggie-free images at 50%?
However, for now what I see around me is that people compare 100% views and want to see really crisp images. A lit window on a distant office building at dusk should show up as a nicely rendered bright pixel. If this is what we want we have to work around the limitations of the sinc() interpolation.
Perhaps we simply need to figure out what we really want. Given that everyone and their dog have their own private ways to resize images (in particular uprezzing!) I'm not convinced that we really know what goals we are trying to attain.

True enough.
Forgive me while I continue to beat this horse. Here is something practical to try.
By interpolating ("reconstructing") before sharpening, we can minimize the sharpening of spectral replicas that cause jaggies. The real image data gets sharpened, but not the jaggies. Potentially a higher amount of sharpening can be achieved before the image starts to fall apart.
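Sketched in Octave/Matlab with functions from the image package; bicubic stands in for a sinc upres here, and the unsharp-mask parameters are assumptions:

    up    = imresize(img, 2, 'bicubic');                    % reconstruct first
    blur  = imfilter(up, fspecial('gaussian', 7, 1), 'replicate');
    sharp = up + 1.0*(up - blur);                           % sharpen the real image data
    out   = imresize(sharp, 0.5, 'bicubic');                % optional: back to native size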
I'm confused about your use of the term "spectral replicas". The upsampled image is generated (I would think) with the property that the spectral content is band-limited, with no information beyond Nyquist of the original image. So what is one gaining by upsampling? And furthermore, to the extent that upsampling is done by a linear filter, USM is a linear filter, and downsampling is a linear filter, then why isn't the sharpening equivalent to a linear filter with a slightly different kernel than USM?

The following illustration, from Pratt's "Digital Image Processing" (http://www3.interscience.wiley.com/cgi-bin/bookhome/112654057), shows the 1-dimensional case:
Indeed, a good point to stress, and it is another practical reason why one should interpolate to the native resolution of a printer with a good algorithm, instead of an unknown (most likely sub-optimal bilinear) filter in firmware, and then sharpen. BTW it's one of the reasons why people like the Qimage output better than the same print directly from Photoshop.

Thank you for your comments, Bart.
I have made very sharp (virtually artifact free) enlarged output by deconvolution sharpening of interpolated image data. The results are much better than sharpening at the native capture size and then letting the printer do the interpolation, as is advocated by some 'authorities'.
Imagine you have a row of 100 point light sources, a lens, and a sensor with 100 photosites in a row, arranged so that if the lens had no aberrations, the light from each point source would be focused exactly on one and only one photosite on the sensor. In such an ideal arrangement, altering the intensity of any given point source would affect the output of one and only one photosite, completely independent of all the others. With a real-world lens and diffraction, this ideal is not achieved; altering the intensity of one point source will affect the outputs of several photosites.
The issue I see with sinc, bicubic, and other interpolation algorithms is this: increasing a single value in a data series can cause neighboring interpolated values to decrease. But this is contrary to what happens in real life; increasing the intensity of a single point light source will increase the outputs of its associated photosite and some of its neighbors, but can never cause any photosite output value to decrease. Conversely, decreasing the intensity of a single point source will cause the outputs of its associated photosite and some of its neighbors to decrease, but can never cause any photosite output value to increase. Therefore, sinc and other common interpolation algorithms work in a way that at times is opposite of the behavior of the real world (because increasing one output value can cause some interpolated values to decrease, and vice versa), and alternative methods should be investigated.
Jonathan,
I don't follow the part about how increasing a photosite value causes a decrease in neighboring photosites. If the input to your hypothetical system is band-limited (as required for sampling), interpolation should work as expected. A point source won't light up a single pixel in a band-limited system. After passing through an optical system, a focused point source will be an Airy disk (jinc^2), spread around among local pixels.
Increasing a value doesn't decrease neighboring sampled values, but depending on the interpolation algorithm, it can cause interpolated values to decrease. The general point I'm making is that increasing the intensity of one of the point light sources cannot result in the decrease of any sampled value, for obvious reasons. It shouldn't be allowed to decrease any interpolated values between samples, either.

Ok, I think what's happening is that changing the middle value while keeping the other samples constant does not model an optical system without aliasing. If you were to convolve your samples with a realistic psf/otf before interpolating, it should work in a more realistic way. The blur of the optical system will cause a point source to affect more than a single pixel, so if the intensity of a point source changes by factor x, all the pixels within the influence of the psf should change by factor x.
If you have a natural cubic spline with a row of data points of 0.2, with one data point in the middle of 0.5, the interpolated spline value will drop to about 0.16 ±1.5 samples from the 0.5 data point. If you increase the peak data value to 0.9, the interpolated spline values will drop to about 0.11 ±1.5 samples from the 0.9 data point. If the interpolation algorithm accurately reflected the realities of optics, the interpolated values ±1.5 samples from the peak value should be ≥0.2 in both cases, with the interpolated value near the 0.9 data point being greater than the interpolated value near the 0.5 data point.
If you were to convolve your samples with a realistic psf/otf before interpolating, it should work in a more realistic way.
That's where I'm going with this line of thought, though I don't have a clear idea of exactly how to do this. I'd like to combine the deconvolution-with-PSF and upsampling if possible, with the goals of preserving the greatest possible amount of true image detail, avoiding jaggies and reconstruction artifacts, and minimizing interpolation errors and distortions like clipping and ringing. Has there been any research published along those lines, or is this a new idea?

I'd be surprised if it's a new idea. There is a vast amount of scientific literature on image reconstruction. For example a quick Google found this. (http://www.informaworld.com/smpp/jump%7Ejumptype=banner%7Efrompagename=content%7Efrommainurifile=content%7Efromdb=all%7Efromtitle=%7Efromvnxs=%7Econs=?dropin=dxdoiorg_101080_01431169308904292&to_url=http:/%2fdx%2edoi%2eorg%2f10%2e1080%2f01431169308904292) I hope you have a university library nearby so you don't have to pay to download a lot of papers! Or if you're lucky you belong to an alumni association that provides free online access to journals.
That's where I'm going with this line of thought, though I don't have a clear idea of exactly how to do this. I'd like to combine the deconvolution-with-PSF and upsampling if possible, with the goals of preserving the greatest possible amount of true image detail, avoiding jaggies and reconstruction artifacts, and minimizing interpolation errors and distortions like clipping and ringing. Has there been any research published along those lines, or is this a new idea?
The alternative is a discontinuous interpolation, where nearest neighbor is the simplest example. I think this approach has a lot going for it but it is hard - try to Google for "weak membrane". The idea is to have a piecewise smooth interpolation that can "break" at discontinuous edges.
Indeed, a good point to stress, and it is another practical reason why one should interpolate to the native resolution of a printer with a good algorithm, instead of an unknown (most likely sub-optimal bilinear) filter in firmware, and then sharpen. BTW it's one of the reasons why people like the Qimage output better than the same print directly from Photoshop.
I have made very sharp (virtually artifact free) enlarged output by deconvolution sharpening of interpolated image data. The results are much better than sharpening at the native capture size and then letting the printer do the interpolation, as is advocated by some 'authorities'.
Bart,
Your posts are always informative and thought provoking, and your statement about sending the file to the printer at the native resolution of the device makes sense, notwithstanding the advice of certain "authorities". The native resolution of the printer can be difficult to determine in the case of an inkjet using diffusion dither. The "native resolution" of Epson printers is often stated to be 360 dpi or multiples thereof; however, Rags Gardner (http://www.rags-int-inc.com/PhotoTechStuff/Epson2200/) has done some experiments using interference patterns with the Epson 2200 and found that the "native resolution" was 288 dpi. He also reported that the hardware resolution may differ from that used in the software for resampling. Have you seen this?
Sometimes, a little blurring of the image is desirable. Some time ago I had some images printed on the Fuji Pictrography device, which was a high resolution contone device. I sized the images to the native resolution of the printer and every little defect in my image stood out in great detail.
Also, what deconvolution algorithm do you use for image restoration, and how do you determine the PSF? Jonathan Wienke, who is also contributing to this thread, has used FocusMagic with good results for capture sharpening. I have read that deconvolution can correct to some extent the softening due to the blur filter.
I'm already doing something along those lines with my modified 2-D spline interpolation code. I use the standard method to calculate the z coefficients for each spline knot, where z is the parameter used in this equation:
...
Over/undershoot near high-contrast edges is effectively limited without altering the behavior of the spline away from such high-contrast edges. And the effect of clipping this parameter is paradoxically gradual; if the clipped value is only slightly different from the unclipped value, the effect on the spline is slight.
Here's a 1-D comparison between a standard natural cubic spline (blue) and my modified cubic spline (red):
[attachment=19872:SplineComparison.gif]
The data points in this curve are random values, highlighted in white. Although this example shows one instance where clipping is not prevented, you can see several areas where the modified spline avoids ringing/overshoot much better than the unmodified spline.
Here's a 1-D comparison between a standard natural cubic spline (blue) and my modified cubic spline (red):
[attachment=19872:SplineComparison.gif]
The data points in this curve are random values, highlighted in white. Although this example shows one instance where clipping is not prevented, you can see several areas where the modified spline avoids ringing/overshoot much better than the unmodified spline.
The issue with splines of course remains to find a balance between unjustified overshoot, and justified interpolation (inventing probably useful data).
Technically, the overshoot is not "unjustified"
Technically, it is, for the reasons I described earlier. When working with images, increasing one sampled value in a series should never cause a nearby interpolated value to decrease, nor should decreasing one sampled value in a series cause any nearby interpolated value to increase.
Interestingly enough, even linear interpolation will cause ringing in this case for the reasons I mentioned in my earlier message.
Exactly how can linear interpolation ever cause ringing? You simply draw a straight line from one sample to the next. Each line between samples is completely independent; you can do whatever you like to a value in the series, and the only lines affected are those between the value you just changed and its immediate neighbors. No other lines are affected in any way, shape, or form. Linear interpolation may not always be very accurate, but it's never going to cause overshoot, ringing, or clipping of interpolated values.
I mentioned in my previous messages that when you throw in linear interpolation with the L2 norm, the samples to be interpolated are not the actual samples but a certain derived set of samples from the original continuous function.
No s*&t, Sherlock. That still doesn't explain how you can get ringing with linear interpolation, regardless of whether you're doing it with the original gamma 1.0 samples or after you've converted them to gamma 2.0. .... The linear interpolated value will change depending on what gamma you convert the samples to before interpolating, but claiming that is ringing is comparing apples to aliens.
Jonathan, I don't think you are reading my posts carefully. BTW, how did the gamma thingy creep in? And when did I "claim" that gamma causes ringing, as you say above? I never alluded to anything regarding gamma. Additionally, in your example, where did you use the L2 norm I have repeatedly stressed, which causes ringing? Let's take the linear case. I mentioned before that for the best linear least-squares approximation you won't use the samples derived by passing an L2 function (possibly non-bandlimited) through a sinc filter, but rather through a filter of the following form:
Click here for linear sampling filter. (http://www.djjoofa.com/data/images/linear_correction.jpg)
And, I convolved this filter with a step edge and here is what I get:
Click here for linear ringing. (http://www.djjoofa.com/data/images/ringing_linear_interp.jpg)
Do you see the ringing?
Yes, but you're NOT using "linear interpolation" in that example, you're using a multi-sample linear approximation of a sinc function to interpolate, which is an apples-to-aliens comparison. As I said before, "linear interpolation" as defined everywhere else in the universe, is a simple a(1-d) + bd calculation, where a and b are the two nearest sample values, and d is the proportion of the distance between a and b for which you're calculating the interpolated value. Like nearest-neighbor interpolation, it can never overshoot, it can never ring, and it will never cause clipping. The interpolation function you showed can do all of those things, but it is NOT "linear interpolation".
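For reference, the definition being argued over is a one-liner in Octave/Matlab; the interpolated value is a convex combination of its two neighbors, so it can never leave their range:

    lerp = @(a, b, d) a.*(1 - d) + b.*d;   % 0 <= d <= 1
    lerp(0.2, 0.9, 0.5)                    % ans = 0.55, always between a and b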
Once you convolve your analog function with the sampling kernel I provided in my earlier message, and sample, then you WILL reconstruct using, to borrow your words, an "a(1-d) + bd calculation".
Hey - stop fighting.
@Jonathan: actually, "everywhere else in the literature" is a pretty large place and it is in fact true that Joofa's view is common in e.g. machine learning.
@Joofa: actually, Jonathan is right that connecting the samples with straight lines is how the term is used in image processing.
E.g.: let's pick linear interpolation, and one would find that even linear interpolation would exhibit ringing. Normally, we don't see that in linear interpolation, because we use the actual samples. However, if we use the actual samples, then linear interpolation is not doing the best it can do. We need to derive an alternate set of samples and then interpolate between them for minimum error.
Unfortunately, we are typically given samples and can't change the way we acquired them. However, even in such cases optimizations exist that shall try to mitigate the effect of information loss by considering the way samples were acquired and the reconstruction kernel jointly.
If you want better then we have to change the way we acquired the samples. I had all of this in my first message on this topic and I repeat it here with appropriate highlighting:
However, there is an important practical issue here also. We are normally just given the samples and have no control over how they were acquired. I even covered that in my same message and I am just repeating it here again:
... possibly workable in theory if you have access to the signal prior to sampling, but of no relevance whatsoever to the real-world scenario of having only the sampled values from the sensor to work with. Any algorithm that requires one to "change the way we acquired the samples" is useless in the context of processing image data that has already been sampled.
Unfortunately, we are typically given samples and can't change the way we acquired them. However, even in such cases optimizations exist that shall try to mitigate the effect of information loss by considering the way samples were acquired and the reconstruction kernel jointly.
EsbenHR, both Jonathan and I are in agreement that linear interpolation is connecting the samples with straight lines.
This sounds just like your claims that you can remove aliasing without distorting signal; possibly workable in theory if you have access to the signal prior to sampling, but of no relevance whatsoever to the real-world scenario of having only the sampled values from the sensor to work with.

I think it is useful to know that if we did know the original function, then the best we could do (using an L2-norm, RMS, energy, least-squares or whatever you want to call it) would be ringing under quite general conditions.
Well, you do not appear to agree that "linear interpolation" means, by definition, connecting the original samples with straight lines.
I would say that the most common use of the term "interpolating" is that the interpolating function, f(x,y), satisfies f(x_i,y_i) = z_i for the sample set (i.e. the original samples).
Ignoring the terminology sideshow, what you want to do is quite different: you want to approximate the original function (from which the samples are obtained) with a function expressed as a sum of basis functions.
Here is the intent of the interpolation in signal processing philosophy: if we are to reconstruct a continuous signal from its samples, how should the samples be obtained so that the reconstructed signal is the "best" representation of the original, possibly non-bandlimited, function? The "best" part will be measured by the error between the original, possibly unknown, function and the reconstructed function.
When you say "linear interpolation", this means that the basis functions are triangular.
Could we call this "reconstruction" to avoid the word war? Or is that too overloaded too in this context?
Firstly, there is the correct interpretation of theory. Secondly, please don't put words in my mouth that I have not said or implied, just like you said above that I claimed something about gamma.
Given that, the only way to "change the way we acquired the samples" in my world is to use a different ADC chip, or alter the mechanism feeding the analog signal to the ADC chip. This is obviously not practical in the context of processing images captured with whatever camera one has; it is unrealistic to expect someone to buy a new camera just so some flavor of "linear interpolation" will work better.
But the point is that I did not change any ADCs.
Nor did you acquire any "new" samples, you simply processed the same set of samples (the original Lena image) in different ways using different algorithms. You didn't change the way you acquired the samples, you changed the way you processed them. There's a big difference, kind of like the difference between lightning and a lightning bug.
... you risk looking like an idiot .... You seem to be a bit like Humpty Dumpty, taking the position that words mean whatever you say they mean, ...
I just worked an example for you where I took the samples as acquired by whatever "ADC" and was still able to accomplish the linear approximation in the L2 sense. Please note that I have worked out this example very quickly, it is not the best, and there are certain hacks in there, and it can be made better, but it is just to illustrate that you don't need to change "ADC's".
Click here to see an actual image of Lena. (http://www.djjoofa.com/data/images/lena.jpg)
Click here to see image doubled in size with "straight" linear interpolation. (http://www.djjoofa.com/data/images/lena_lin.jpg)
Click here to see image doubled in size with linear approximation in L2 sense. (http://www.djjoofa.com/data/images/lena_lin_ls.jpg)
The L2 approximated one looks sharper than straight linear interpolation, with a lot of ringing artifacts. But as I said it was just done in a "shortcut" way and there is hope to make it better. But the point is that I did not change any ADCs.
The extra sharpness of the L2 version is interesting (although it looks like some jpeg artifacts got in the way).
Do you have a reference that describes that method?
I'm also a bit puzzled by the 'actual' Lena image [...] since it's smaller than any version I've seen
Joofa,
The extra sharpness of the L2 version is interesting (although it looks like some jpeg artifacts got in the way).
Do you have a reference that describes that method?
Obviously he changed the way the samples were acquired
I also noticed the jpeg compression artifacts. I did not mess with quality parameters etc., as I was not aiming for anything fancy.
I just worked an example for you where I took the samples as acquired by whatever "ADC" and was still able to accomplish the linear approximation in the L2 sense. Please note that I have worked out this example very quickly, it is not the best, and there are certain hacks in there, and it can be made better, but it is just to illustrate that you don't need to change "ADC's".
The L2 approximated one looks sharper than straight linear interpolation, with a lot of ringing artifacts. But as I said it was just done in a "shortcut" way and there is hope to make it better. But the point is that I did not change any ADCs.
It seems that your Lena had the jpeg artifacts before going through your routine. I wonder how it would look on (a reduced version of) Bart's "clean" Lena?
Are you nitpicking or really not following all of this stuff?
OK, I'll bite!

If I'm reading your code correctly, you're filtering with the inverse of a triangular psf ([1 6 1]/8), then interpolating?
So, let's forget the L2-norm and say, instead, that we assume the (continuous) image on the sensor (yeah, Lenna was scanned, work with me here) is piecewise linear and a remarkable piece of luck would have it that the pieces are all between the nearest pixel centers :-)
So, what would that look like, if we assume that the acquired pixels represents the average of the function? That is we assume pixels are noise-free, the fill-factor is 100% etc.?
As attached I would think. I hope you can see which is which ;-)
If I'm reading your code correctly, you're filtering with the inverse of a triangular psf ([1 6 1]/8), then interpolating?

Yes, you are correct, that is exactly what it does.
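My reading of that scheme as a 1-D Octave/Matlab sketch on a row vector of samples; the circular FFT inverse of [1 6 1]/8 is my own shortcut (the kernel's frequency response never falls below 0.5, so the inverse is stable):

    h = [1 6 1]/8;
    n = numel(row);
    H = fft([h(2) h(3) zeros(1, n-3) h(1)]);    % circular kernel, origin at index 1
    pre = real(ifft(fft(row) ./ H));            % undo the triangular psf
    up = interp1(1:n, pre, 1:0.5:n, 'linear');  % then connect with straight lines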
It might be better to interpolate first, then inverse filter, to reduce the jaggies on the diagonal hat brim (as in my sharpening demo, above). Otherwise, the inverse filtering seems to help a lot.
It would seem that the image formation model one wants is that in a region of steep gradient, the second derivative of the image does not change sign for a given distance from the steep gradient. This would prevent or at least dampen ringing (which I for one find more objectionable than a little softness).
... your usage of the terms "linear interpolation" and "sample" is very different than their most common definitions, and as a result, it is sometimes difficult to tell whether what you are saying is insightful or gibberish.
The point was to show that if we have an image model (in this case a very simple-minded model: "the image is piecewise linear") and a sensor model (here: each pixel measures the light exposed on its area) then we can construct a plausible image. If we accept those terms, then we get ringing.