The softproofed images match well enough that people usually don't complain. The problem is the monitor's white, where the difference is obvious, and where you can see the difference between displays with different backlight spectra calibrated to the same white point (wtpt) x,y coordinates. In my personal experience it's not about observer variability: I have calibrated monitors with dozens of clients and they saw the same issues I did. In most cases we agreed when the white was satisfactorily neutral, and we agreed when it was too greenish or pinkish.
Maybe observer variability becomes an issue with super-wide-gamut laser displays, but IMO that also sounds like it might be a problem with adaptation.
From Oicherman: "The existence of adaptation differences between broadband and narrowband stimuli
leading to additivity failures is also long acknowledged (Trezona 1953; Trezona 1954; Stiles
1963; Crawford 1965; Lozano and Palmer 1967; Zaidi 1986). In this study we have established
a relationship between the two, and have shown evidence indicating that both effects are caused
by the same mechanism of postreceptoral adaptation. To our knowledge, this report is the first
to show the consequences of additivity failure in conditions relevant to practical colorimetry.
Since the establishment of colorimetry, the additivity laws were somewhat of a sacred issue.
The general understanding seemed to be that failure of additivity essentially leads to breakdown
of CIE colorimetry and need of redesigning it. Our main conclusion concerning the failure of
additivity is that this is not so: the additivity failure can be predicted, modelled and compensated
for."
"The present research has begun as the research in basic colorimetry. We saw our target in the
development of the new standard deviate observer. The way to approach the goal seemed to be
in collection of large as possible amount of colour matching data. By the end of the first
experiment we knew that we do not need any more data: the set from S&B (Stiles and Burch
1959) study from 50 years ago provides all the information we need. The second experiment
taught us that the new SDO in not needed altogether – at least for the cross-media colour
matching: the observer metamerism does not contribute much to variations in colour matches of
spatially separated stimuli. Moreover: in these conditions, the colour matching itself does not
seem to operate according to classical cone-quantum metamerism model, as the observer’s
adaptation state changes instantaneously when the gaze is moved from one media to another. As
the result, we had to resort to advanced colour difference formulae and chromatic adaptation
transform."
I've noticed the same thing and had chalked it up to personal variation in CMF responses. I've seen a paper that investigated this and found fairly large variations among so-called "normal color perceivers", which group somewhat differently for men and women. Alternatively, I assumed it could be poor resolution of spectral spikes with the I12. My approach has been the same as yours: tweak the xy target coordinates so the white points match between my CFL and LED Eizos.
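To put a number on how large such a tweak is, here is a rough Python sketch. It converts the nominal and the visually matched xy targets to CIE 1976 u'v' and reports the distance between them. The function names and the "matched" coordinates are made up for illustration; they are not measurements from my displays.

```python
import math

def xy_to_uv(x, y):
    """CIE 1931 xy -> CIE 1976 u'v' chromaticity."""
    d = -2.0 * x + 12.0 * y + 3.0
    return 4.0 * x / d, 9.0 * y / d

def tweak_size(nominal_xy, matched_xy):
    """Return the xy offset and the u'v' distance between two white targets."""
    dx = matched_xy[0] - nominal_xy[0]
    dy = matched_xy[1] - nominal_xy[1]
    u0, v0 = xy_to_uv(*nominal_xy)
    u1, v1 = xy_to_uv(*matched_xy)
    return (dx, dy), math.hypot(u1 - u0, v1 - v0)

# Nominal target: D65. The matched value is a hypothetical example of
# what one might end up with after adjusting the second display by eye.
nominal = (0.3127, 0.3290)
matched = (0.3105, 0.3320)  # hypothetical, not measured

offset, duv = tweak_size(nominal, matched)
print("xy offset:", offset)
print("distance in u'v':", duv)
```

The offset can then be added to the nominal xy when calibrating the second display. Whether the same offset holds for other white points is an assumption that would need checking by eye.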
This xy tweak approach would be effective for both additivity failure and individual variation in CMF. That you have seen similar perceivable differences with a large number of different people strongly points to additivity failure. I had only myself as a data point.
This suggests it might be useful to compare a set of graduated printed neutral patches against the monitor showing presumably the same, in standard colorimetric terms, neutral colors. This is a little tricky because one has to illuminate the paper strip on the monitor screen at 45 degrees (to match standard colorimetric measurements) then compensate the monitor patches to add in the reflected color. For white point checking this effect is negligibly small but for darker patches it would have to be accounted for. I'm curious how close (or not) they would be.
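A rough sketch of the compensation step, assuming the light reflected off the screen simply adds to the emitted light in XYZ. The reflected value would come from measuring the screen under the 45 degree lamp with the monitor showing black (less the monitor's own black level). All numbers and function names below are hypothetical.

```python
def xyY_to_XYZ(x, y, Y):
    """Convert chromaticity plus luminance to tristimulus XYZ."""
    return (x * Y / y, Y, (1.0 - x - y) * Y / y)

def emitted_for_target(target_XYZ, reflected_XYZ):
    """XYZ the monitor must emit so that emitted + reflected = target.
    Clipped at zero: a target darker than the reflection is unreachable."""
    return tuple(max(t - r, 0.0) for t, r in zip(target_XYZ, reflected_XYZ))

def reflected_share(target_XYZ, reflected_XYZ):
    """Fraction of the target luminance contributed by the reflection."""
    return reflected_XYZ[1] / target_XYZ[1]

# Hypothetical values: lamp reflection of 0.5 cd/m^2 at D50 chromaticity,
# a 100 cd/m^2 white patch and a 2 cd/m^2 dark neutral patch.
D50 = (0.3457, 0.3585)
reflected = xyY_to_XYZ(*D50, 0.5)
white = xyY_to_XYZ(*D50, 100.0)
dark = xyY_to_XYZ(*D50, 2.0)

share_white = reflected_share(white, reflected)
share_dark = reflected_share(dark, reflected)
print("reflection share at white:", share_white)
print("reflection share at dark patch:", share_dark)
print("emit for dark patch:", emitted_for_target(dark, reflected))
```

With these made-up numbers the reflection is a small fraction of the white patch but a large fraction of the dark one, which is why the correction only matters toward the dark end of the strip.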