Sorry for dropping off the face of the Earth for a few days - needed to do work that I was actually being paid for. There have been a number of questions about our results as well as many useful and thoughtful discussions in this thread. I'll do my best to address them. The questions are in three general categories: (1) Testing methodology, (2) Sensors (particularly the ColorMunki), and (3) everything else.
1: Testing Methodology (aka "do you guys know what the hell you are doing?!?")
Q: What software is used to drive the instruments and how does that factor into the results?
A: When possible, we used ArgyllCMS routines. These have the advantage of being completely customizable with the appropriate source code changes. Our intent was to determine the best possible results from each instrument. For spectrophotometers, this meant using very long integration times and heavy averaging when measuring darker values. The results we report, therefore, should be viewed as best-case for each sensor.
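To give a rough picture of why the heavy averaging matters on dark patches, here is a toy simulation. The luminance, noise level, and sample counts are made-up numbers for illustration, not our actual measurement parameters; the point is simply that averaging N readings shrinks the random noise roughly as 1/sqrt(N):

import numpy as np

# Illustrative only: simulate averaging repeated dark-patch readings.
rng = np.random.default_rng(0)
true_luminance = 0.10   # cd/m^2, a typical LCD black level (hypothetical)
read_noise_sd = 0.02    # cd/m^2, per-reading noise (hypothetical)

for n_reads in (1, 16, 64):
    readings = true_luminance + rng.normal(0, read_noise_sd, n_reads)
    estimate = readings.mean()
    # Standard error of the averaged reading falls off as 1/sqrt(n_reads).
    print(n_reads, round(estimate, 4), round(read_noise_sd / np.sqrt(n_reads), 4))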
Q: How confident are you of the accuracy values reported in shadow levels?
A: That's an easy one. The spectroradiometer we used, a PR-730, has a luminance sensitivity of 0.0003 cd/m2. The minimum luminance we measured on a monitor was ~0.1 cd/m2, or over 300x the resolution of the instrument. The PR-730 is accurate to +/-2% in luminance at 0.009 cd/m2 - an absolute error of roughly +/-0.0002 cd/m2 - so to put it into other terms, we might be seeing 0.0998 or 0.1002 cd/m2 rather than 0.1. Chromaticity accuracy is a similarly ridiculous +/-0.0015 (x, y) at a luminance 10x lower than any monitor can reach. Our PR-730 was factory calibrated to NIST-traceable standards within a few weeks of our evaluations.
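For anyone who wants to check the headroom arithmetic, here it is spelled out (variable names are mine; the figures are the ones quoted above):

# Back-of-the-envelope check of the PR-730 headroom figures quoted above.
min_lum = 0.1         # dimmest black level we measured, cd/m^2
sensitivity = 0.0003  # PR-730 luminance sensitivity, cd/m^2
spec_lum = 0.009      # luminance at which the +/-2% accuracy spec applies, cd/m^2

print(min_lum / sensitivity)                 # ~333x above the instrument floor
abs_err = 0.02 * spec_lum                    # +/-2% at 0.009 cd/m^2, ~ +/-0.00018 cd/m^2
print(min_lum - abs_err, min_lum + abs_err)  # ~0.0998 to 0.1002 cd/m^2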
2: Sensor questions
Q: What version of the ColorMunki did you test?
A: The Photo/Design version -- the spectrophotometer capable of emissive measurements for monitors and reflective measurements for prints. The ColorMunki Create is a repackaged Eye-One Display.
Q: What about XRGA? Are some sensors (e.g. ColorMunki) using this and would it make a difference?
A: Not to the best of our knowledge when controlled with Argyll's code. Based on X-Rite's XRGA whitepaper, measured color differences will be minimal for the Munki with or without XRGA.
Q: Are all ColorMunkis inaccurate or is it just the one you guys measured?
A: Only having characterized a single unit, we simply don't know. Until we can measure more samples, I will neither condemn nor exonerate the ColorMunki. Our results were disturbing, showing gross errors in black point measurements, but we might have tested a lemon unit. With the help of a third-party software supplier, we hope to get several more Munkis to evaluate. After verifying our first results showing high error levels in our ColorMunki sample, we emailed X-Rite to ask if they could send a few demo units our way. No response.
Q: For the Eye-One Pro, does using the raw 3nm measurement data help vs. using the default 10nm intervals reported by X-Rite's software?
[Explanation: The i1 Pro samples light at 3nm intervals. The data are noisy, and the default values reported are pre-smoothed to 10nm intervals. The output spectrum of either a CCFL or LED backlight is spiky, with significant spikes being narrower than 10nm. The question comes down to whether the default smoothing obliterates useful data.]
A: Again, Argyll comes to the rescue. It supports measuring at full resolution. The noise levels are indeed high, and feeding the raw values can create some pretty strange results. I geeked away on my latest plane trip, running some i1 readings through the FFT filtering routines we use in our profiling code. After suitable tweaks, I could get spectral curves closely approximating the 1nm sampling curves from our PR-730. I do not know if i1Profiler uses a similar technique, but I would not be surprised if it does. The dE measurements we reported used the default 10nm, smoothed data. Given that both the absolute magnitude of error on white with an i1Pro and the intra-unit variations were low, the 3nm sampling strategy is a refinement on an already good product. The problem area is in the shadows, where the measurements are noise-limited rather than influenced by spectral spikes.
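For the curious, the sketch below shows the general idea of that kind of FFT smoothing. It is not our profiling code: the synthetic spectrum, the sampling grid, and the cutoff fraction are all made up for illustration, and a hard cutoff is far cruder than the tweaked filtering I described.

import numpy as np

def fft_lowpass(samples, keep_fraction=0.3):
    # Smooth an evenly spaced, noisy spectral trace by zeroing the
    # high-frequency portion of its Fourier transform (a simple low-pass).
    spec = np.fft.rfft(samples)
    cutoff = int(len(spec) * keep_fraction)  # number of frequency bins to keep
    spec[cutoff:] = 0.0                      # discard the noisy high frequencies
    return np.fft.irfft(spec, n=len(samples))

# Fake CCFL-like spectrum: narrow emission spikes on a broad hump,
# sampled every 3nm from 380 to 728nm, with added read noise.
wl = np.arange(380, 731, 3)
clean = (np.exp(-((wl - 550) / 120.0) ** 2)
         + 0.8 * np.exp(-((wl - 436) / 5.0) ** 2)
         + 1.0 * np.exp(-((wl - 546) / 5.0) ** 2)
         + 0.9 * np.exp(-((wl - 611) / 5.0) ** 2))
noisy = clean + np.random.default_rng(0).normal(0, 0.05, wl.shape)

smoothed = fft_lowpass(noisy)
print(np.abs(smoothed - clean).max())  # residual error after filtering

In practice the cutoff is a trade-off: too aggressive and you blunt the narrow backlight spikes, too loose and the noise stays.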
Q: Any thoughts on the BasICColor DISCUS?
A: Aside from the snazzy specs? With thanks to the good folks at CHROMiX, we hope to have one in-house for testing within a couple of weeks. Mark Paulson was kind enough to volunteer his DISCUS as well. Mark: If the offer still stands, I may take you up on it after we get a chance to run the first sample through its paces.
3: Everything else
Q: Which is the more important metric, the absolute sensor accuracy or how much sensor-to-sensor variation is seen? From Andrew:
"Now instrument variations per model is a big deal! So I’m not letting manufacturers off the hook for vastly different results from the same target requests in the same software. That’s not acceptable for those working in collaboration."A: Of the two, I would focus on the inter-unit variability. Different manufacturers calibrate their sensors to slightly different references. Seeing a few dE difference in average readings between sensor types can be attributed to calibration settings. The large unit-to-unit differences we saw in, for example, the Eye-One Display point to a sensor that cannot be relied on for accurate readings. The largest deviation we saw on the i1D2 was 14 dE-2000. To visualize this is Photoshop, fill a square with a middle grey [160, 160, 160] (sRGB or Adobe RGB - doesn't matter). Fill an adjoining square with [213, 213, 213]. That is 14 Delta E-2000, and that is not subtle. The graphic below illustrates this.
Scott and Terry pose the question of which combination of software and hardware gives the best results. Determining this is the end goal of our exercise. As a first pass, we aimed to determine the best-case capability of each instrument. We will then cherry-pick the best sensors to use in the software comparisons, eliminating as many variables as possible.
Ethan, just to follow up. I'm continuing this testing and am finding that on some displays (like a Samsung 245T and Dell 24") I'm seeing dramatically improved results (visually and statistically) using an EyeOnePro device instead of a DTP94, even when the same software is used.
The Samsung 245T is a strange beast. It is a PVA panel with a moderately wide gamut. The expanded gamut is a function of the 245T's backlight, so I am not surprised that the i1Pro spectrophotometer gives better results than the DTP-94; this correlates with our measurements. The DTP-94 contains a filter set that is farther from the standard observer than the one found in the Spyder 3. Hence, uncorrected readings get progressively less accurate as a panel's backlight spectrum departs further from that of a standard CCFL. In our tests, the DTP-94 consistently turned in the least accurate white level readings on all wide-gamut displays.
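To make that mechanism concrete, here is a toy numerical demonstration. The response curves and spectra below are invented stand-ins (simple Gaussians), not real CIE or instrument data; the point is only that the same filter mismatch produces a bigger error on a spiky, wide-gamut-style backlight than on a broad one.

import numpy as np

wl = np.arange(380.0, 731.0)  # 1nm wavelength grid

def gauss(center, width):
    return np.exp(-((wl - center) / width) ** 2)

y_bar = gauss(555, 50)        # stand-in for the standard observer's luminance curve
filter_resp = gauss(570, 50)  # hypothetical colorimeter filter, peaking 15nm too high

def relative_error(spectrum):
    true = (spectrum * y_bar).sum()            # what the standard observer sees
    measured = (spectrum * filter_resp).sum()  # what the mismatched filter reports
    return measured / true - 1.0

broad_backlight = gauss(550, 120)                                # CCFL-ish broad hump
spiky_backlight = gauss(450, 8) + gauss(530, 8) + gauss(615, 8)  # narrow RGB peaks

print(relative_error(broad_backlight))  # around -2%
print(relative_error(spiky_backlight))  # around -5%: the same mismatch hurts more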