While I appreciate this initial work done by Jim, for me, it's difficult, if not impossible, to draw any practically significant conclusions from some of the data for the following reasons:
Well, this is indeed a refreshing change of pace. My work is usually criticized for being overly quantitative, and for relying on numbers and graphs when a simple picture would make the point.
1) The data set is limited; one cannot draw statistically valid inferences from an N=1 for each item under test.
That is of course a valid point. Except for Roger Cicala’s excellent work, I don’t know where you’re going to go for larger sample sets. Right here on LuLa, the sample size is usually (always?) one; are you chastising the people who charge you to read their tests for that? I do look for unreasonable results, and sometimes obtain another sample if I get them. I also check lenses for decentering and focus plane tilt, two indicators of improper assembly.
2) What is the null hypothesis (Ho) being tested? It is not stated, nor is the alternate hypothesis (Ha).
My blog posts are not intended for peer-reviewed scientific publications. I don’t have the time or inclination to test to those standards, nor would my readers have the patience to deal with writings that met the standards of scientific publications. All I am doing is applying what I call “kitchen optics” – tests that almost any reader could perform for herself, given the time and a modicum of equipment – to cameras and lenses, hoping to get insights that go beyond the usual “here are the pictures I took with the NiCanOrama QRZ – 1066, and here’s what I think of them” that most everybody else is doing.
In the case of the graphs that Erik posted, the equipment required is a razor blade, a light source, a focusing rail, MTF Mapper and/or Imatest, and Excel. As in all my reports, I explain exactly how a reader who wishes to reproduce my results can go about it, either in the post itself, or by reference to an earlier post.
2) The measurement terms are not defined. What is cycles/picture height defined and how is it measured? What does measuring this functional response mean in practical terms?
Measuring MTF50 in cycles/picture height has a long history in digital photography. Try the Imatest site for some background. If you want the paper that introduced most of us to slanted edge MTF testing, it’s here:
http://imagescienceassociates.com/mm5/pubs/26pics2000burns.pdf
If you want the Matlab demonstration code, it’s here:
http://losburns.com/imaging/software/SFRedge/index.htm
MTF50 is a well-known sharpness metric. For a discussion of it and why it’s appropriate, look at Jack Hogan’s explanation:
http://www.strollswithmydog.com/mtf50-perceived-sharpness/
3) Some of the axes for some of the graphs are not labeled, nor are the units of measure for these axes.
Erik pulled the graphs from some of my blog posts. If you read the posts, the axes are explained. In the MTF50 vs subject distance tests, the units are cm, with 0 arbitrary.
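For anyone unsure what MTF50 in cycles/picture height means in practice, here is a minimal Python sketch. It is not the code behind MTF Mapper or Imatest; it only assumes you already have an MTF curve sampled in cycles/pixel (the numbers below are made up for illustration), finds the frequency where contrast first drops to 50% by linear interpolation, and scales it by the picture height in pixels. The function names and the 4000-pixel height are hypothetical.

```python
def mtf50_cycles_per_pixel(freqs, mtf):
    """Return the first frequency (cycles/pixel) at which the MTF
    falls to 0.5, using linear interpolation between samples."""
    for i in range(1, len(freqs)):
        if mtf[i - 1] >= 0.5 > mtf[i]:
            f0, f1 = freqs[i - 1], freqs[i]
            m0, m1 = mtf[i - 1], mtf[i]
            return f0 + (m0 - 0.5) * (f1 - f0) / (m0 - m1)
    raise ValueError("MTF never crosses 0.5 in the sampled range")


def to_cycles_per_picture_height(cycles_per_pixel, height_px):
    """Convert cycles/pixel to cycles/picture height."""
    return cycles_per_pixel * height_px


# Made-up MTF samples, for illustration only.
freqs = [0.0, 0.1, 0.2, 0.3]   # cycles/pixel
mtf = [1.0, 0.8, 0.4, 0.1]     # contrast relative to zero frequency

mtf50_px = mtf50_cycles_per_pixel(freqs, mtf)
mtf50_ph = to_cycles_per_picture_height(mtf50_px, 4000)  # hypothetical 4000 px picture height
print(mtf50_px, mtf50_ph)
```

Because the result is scaled by picture height rather than quoted per pixel or per millimeter, cameras with different pixel counts and sensor sizes can be compared for the same final image size.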
4) The histogram data has no error bars. Any measure of means or median has to also provide a measure of variance e.g. SD, Variance or CV. Also, the confidence levels are not stated. There is no way to conclude that the difference in values observed in the histogram plots are statistically significant or not without providing an SD and confidence interval. The best way to do this would be a one-way ANOVA, and a Tukey HSM pair-wise comparison, reporting p-values for the differences.
I don’t see a histogram in anything that Erik posted. Thank you for the statistics lesson, though.
5) No description of a statistically valid Measurement Systems Analysis was performed, so there is no way to know that the measurement system is fit for purpose for what was being measured, and therefore, that the observed differences are real and not simply the result of sampling error or % contribution by the measurement system. There is no variance components analysis to characterize the part-to-part variance, operator-to-part interaction, operator-to-gage interaction, and the overall gage % contribution to the overall study variance (sums of squares of the overall variance). In essence, no data is provided to demonstrate the measurement system as a whole has sufficient measurement system precision to know any observed differences are real, not due to noise, part to part variability or intrinsic measurement system noise (*all* data sets contain noise, *some* contain signals).
Again, my blog posts are not scientific papers. The results are not statistically significant, to be sure. However, I think they form a useful addendum to the pretty pictures that are the alternative. To my knowledge, no one, not even Roger, is testing cameras and lenses and reporting results to the general public in the way you want them tested and reported.
You have proven yourself skillful in poking holes in the work of others. That’s not difficult. Please consider doing your own testing and reporting the results to all of us. That’s harder, but of much more benefit.
Jim