I do not see aliasing as the biggest problem with Bayer sensors. I see it as the failure to sample what the lens really transmits at every point, which shows up most in color accuracy.
Here is an interesting article comparing a few different programs. The most important comparisons are on the right: the RawTherapee (RT) AMaZE method vs the Lightroom method.
http://renderingpipeline.com/2013/04/a-look-at-the-bayer-pattern/

The first thing that stands out is the color difference between renderings of the same raw file. The second is the choice of a smoothed, soft look by Lightroom vs the sharp, blocky look of RT. What is the outcome of that choice? My guess is the Lightroom version is more accurate in color because it blends the color samples from neighboring pixels. RT definitely has the advantage in image detail, at the expense (my guess) of having to adjust colors for accuracy. I could be wrong; the fake flower could be properly captured by RT.
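To make the "blend of the color sampling from each pixel" idea concrete, here is a toy bilinear demosaic in Python/NumPy. This is an illustration only: neither Lightroom nor RT uses plain bilinear interpolation (RT's default is AMaZE), and the function names are mine.

```python
import numpy as np

def conv3(img, k):
    """3x3 convolution with zero padding, pure NumPy."""
    h, w = img.shape
    p = np.pad(img, 1)
    out = np.zeros((h, w), dtype=float)
    for i in range(3):
        for j in range(3):
            out += k[i, j] * p[i:i + h, j:j + w]
    return out

def demosaic_bilinear(mosaic):
    """Toy demosaic of a 2-D RGGB Bayer mosaic: for each color channel,
    average the known same-color samples in each 3x3 neighborhood."""
    h, w = mosaic.shape
    ys, xs = np.mgrid[:h, :w]
    r_mask = ((ys % 2 == 0) & (xs % 2 == 0)).astype(float)
    b_mask = ((ys % 2 == 1) & (xs % 2 == 1)).astype(float)
    g_mask = 1.0 - r_mask - b_mask
    k = np.ones((3, 3))
    out = np.zeros((h, w, 3))
    for c, m in enumerate([r_mask, g_mask, b_mask]):
        # known samples averaged, normalized by how many fell in the window
        out[..., c] = conv3(mosaic * m, k) / np.maximum(conv3(m, k), 1e-9)
    return out
```

The blending is exactly why edges look soft: every output value is an average of neighbors, which smooths color transitions at the cost of per-pixel detail.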
Here is another article showing that all debayer methods have problems with exactly what you are interested in: moiré.
http://www.libraw.org/articles/bayer-moire.html

Smaller pixels just have the same issue on a smaller scale, until you hit diffraction. I agree with you that, all else being equal, smaller pixels will give a better image. At the point of diffraction, all else is no longer equal: the red, green, and blue diffraction spots are very different in size. So the perfect camera would be three sensors behind a dichroic prism, with 6.5 micron pixels for red, 5.5 for green, and 4.5 for blue. The software engineering problem would be making the best image from those three layers. The camera companies could legitimately charge $4-5K for that.
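To put rough numbers on that (my assumptions: 650/550/450 nm for R/G/B and an f/4 lens), the diameter of the Airy disk out to the first dark ring is d = 2.44 · λ · N:

```python
# Airy disk diameter to the first dark ring: d = 2.44 * wavelength * f-number.
# The wavelengths and the f/4 aperture are my assumptions for illustration.
F_NUMBER = 4.0
wavelengths_um = {"red": 0.650, "green": 0.550, "blue": 0.450}

for color, lam in wavelengths_um.items():
    d = 2.44 * lam * F_NUMBER
    print(f"{color:5s}: Airy diameter ~ {d:.2f} um")
```

At f/4 that works out to roughly 6.34, 5.37, and 4.39 microns, which is where the pixel sizes above come from; stop down further and all three spots grow in proportion.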
As Bart mentioned a couple of weeks ago, with the current design you can get away with pixels half the diffraction spot size, thanks to the gap between same-color sites (red to red, blue to blue, etc.). As you go smaller, you start throwing away light, relying on chopping the top off the central diffraction peak and then boosting contrast in software. By the time the first diffraction ring hits the adjacent same-color site, you will have color shifts. Tiny P&S pixels should have poor color accuracy.
I would rather actually sample each color at every point, for the small extra cost of the sensor-shift motors and the three sub-shot data buffers. I expect there would be a gain in overall color accuracy first, then elimination of false-color artifacts second. There should also be an improvement in noise.
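A toy sketch of that idea: the base exposure plus three shifted sub-shots, each offset by one pixel, so every photosite is read through the R, G, G, and B filters in turn and no interpolation is needed. The four-shot scheme and the function are my illustration (real sensor-shift multi-shot modes work roughly this way), not any camera's actual firmware.

```python
import numpy as np

def merge_pixel_shift(shots):
    """shots: four RGGB Bayer mosaics captured at sensor offsets
    (0,0), (0,1), (1,0), (1,1). Returns full RGB at every photosite
    with no interpolation -- each site was actually measured through
    each filter color."""
    h, w = shots[0].shape
    ys, xs = np.mgrid[:h, :w]
    out = np.zeros((h, w, 3))
    offsets = [(0, 0), (0, 1), (1, 0), (1, 1)]
    for shot, (dy, dx) in zip(shots, offsets):
        # which filter color each site saw in this shot (RGGB pattern)
        ry, rx = (ys + dy) % 2, (xs + dx) % 2
        r = (ry == 0) & (rx == 0)
        b = (ry == 1) & (rx == 1)
        g = (ry + rx) % 2 == 1
        out[r, 0] = shot[r]
        out[b, 2] = shot[b]
        out[g, 1] += shot[g] / 2.0  # each site sees green twice: average
    return out
```

Because every site sees green in two of the four shots, averaging the two green readings is also where the noise improvement would come from.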