Let me try to answer the technical question, at least in part, but I must first object to jargon like "real" and "imaginary" pixels for distinguishing between the values at photo-sites and the interpolated values at the "gaps" between them. ALL normal digital camera pixels are produced with some interpolation (since each photo-site records only one color), and with a small enough photo-site size, the interpolated values are mostly highly accurate. Since even uninterpolated photo-site data are at best accurate, not exact, the "perfect/approximate" or "real/imaginary" dichotomy is an oversimplification. So I would simply describe pixels as being of higher or lower quality according to the extent and quality of the interpolation that has been applied.
My educated guess is that each of the three color channel values for each of the Fuji's 12 million output pixels is computed directly by interpolation from the raw values at several nearby photo-sites that record that color, rather than by interpolating twice (first to fill in the missing colors at each photo-site, then to fill in the pixels in the gaps between the photo-sites).
Given that, it is not clear whether the interpolated values at photo-sites are better or worse than the ones for the gaps. Probably the best values are the ones at a photo-site for the color that it actually records, since those need no interpolation; the values at gaps are next best; and the values at photo-sites for the two colors not directly recorded there are actually the worst, since they have to be interpolated from a bit further away than when interpolating into the gaps. So who can even say which pixels are best, let alone which are real and which are imaginary?
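To make that ranking a bit more concrete, here is a rough Python sketch of single-pass interpolation: one color channel of one output pixel is estimated straight from the nearest photo-sites that recorded that color. Everything in it is my own assumption for illustration (the function name `interpolate_channel`, the `(x, y, color, value)` site tuples, the choice of `k` neighbors, and inverse-distance weighting); it is not Fuji's actual algorithm, which I do not know.

```python
import math

def interpolate_channel(sites, x, y, channel, k=4):
    """Estimate one color channel at position (x, y).

    sites: list of (site_x, site_y, color, value) tuples (hypothetical layout).
    Uses the k nearest photo-sites that recorded `channel`, weighted by
    inverse distance (an assumed weighting, chosen only for simplicity).
    """
    same_color = [
        (math.hypot(sx - x, sy - y), value)
        for sx, sy, color, value in sites
        if color == channel
    ]
    if not same_color:
        raise ValueError("no photo-site records channel %r" % channel)
    same_color.sort(key=lambda pair: pair[0])
    nearest = same_color[:k]
    # The output pixel sits exactly on a site that recorded this color:
    # no interpolation is needed, which is the "best" case.
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [1.0 / dist for dist, _ in nearest]
    weighted = sum(w * value for w, (_, value) in zip(weights, nearest))
    return weighted / sum(weights)

# Two green sites and one red site on a toy grid.
sites = [(0, 0, "G", 10.0), (2, 0, "G", 20.0), (1, 1, "R", 99.0)]
on_site = interpolate_channel(sites, 0, 0, "G")  # taken directly from the site
in_gap = interpolate_channel(sites, 1, 0, "G")   # blended from both green sites
```

The further away the contributing sites are, the less the estimate can be trusted, which is why the colors a photo-site does not record may come out worse there than at a gap.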
If your worry is that 12 million pixels are reduced to 6 million new, larger ones by just discarding half the pixels, relax: it doesn't work that way. Instead, the value in each color channel for each of the new, larger pixels is computed as a weighted average of the values at all the nearby old, smaller pixels, with the greatest weight given to the closest ones, of course. So each of the final 6 million pixels will be computed from a mixture of values associated with photo-sites and values associated with the gaps.
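As a rough illustration of that kind of downsizing, here is a minimal Python sketch for one color channel. The names `tent_weights` and `downsample` are made up for this example, and the linear ("tent") fall-off of weight with distance is my assumption; the camera's real resampling filter is not something I know. Going from 12 to 6 million pixels would be a factor of about 1.41 along each axis; the toy example uses a factor of 2 to keep the numbers small.

```python
def tent_weights(center, n, support):
    """Weights for n source pixels: largest nearest `center`, zero beyond `support`."""
    return [max(0.0, 1.0 - abs(i - center) / support) for i in range(n)]

def downsample(src, out_h, out_w):
    """Shrink a 2-D grid of channel values by distance-weighted averaging."""
    in_h, in_w = len(src), len(src[0])
    sy, sx = in_h / out_h, in_w / out_w  # old pixels per new pixel, per axis
    out = []
    for oy in range(out_h):
        # Center of the new pixel, expressed in old-pixel coordinates.
        wy = tent_weights((oy + 0.5) * sy - 0.5, in_h, max(1.0, sy))
        row = []
        for ox in range(out_w):
            wx = tent_weights((ox + 0.5) * sx - 0.5, in_w, max(1.0, sx))
            total = sum(
                wy[j] * wx[i] * src[j][i]
                for j in range(in_h)
                for i in range(in_w)
            )
            row.append(total / (sum(wy) * sum(wx)))  # normalize the weights
        out.append(row)
    return out

# Checkerboard: 1.0 stands for "photo-site" pixels, 0.0 for "gap" pixels.
checker = [[float((x + y) % 2) for x in range(4)] for y in range(4)]
small = downsample(checker, 2, 2)
```

Every value in `small` lands strictly between 0 and 1, because each new pixel blends both kinds of old pixel; none of them is simply copied or thrown away.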