Tests of the human eye's ability to distinguish between shades show that one can tell two shades apart if their lightness differs by at least 1% (I guess this is a rounded value).
Consequently, we can distinguish between at least 70 levels within one stop of dynamic range (1.01^70 = 2.00...). In order to eliminate visible transitions, the number of levels needs to be higher. I don't know how much higher, and I am cautious about accepting anything that is not based on tests over a wide range; I remember having been told that 24 frames per second is all that is needed, because we don't "see" anything above that.
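As a quick check of the figure of 70, here is a minimal sketch in Python. It assumes the 1% threshold quoted above and that the steps stack multiplicatively; the function name levels_per_stop is only for illustration.

```python
import math

def levels_per_stop(step=0.01):
    """Number of just-distinguishable steps of relative size `step`
    that fit into one stop, i.e. into a factor of 2 in lightness."""
    return math.log(2) / math.log(1 + step)

# With a 1% step this comes out slightly under 70.
print(levels_per_stop())

# Seventy 1% steps multiply up to roughly a factor of 2.
print(1.01 ** 70)
```

If the real threshold is smaller than 1%, the same function gives the correspondingly larger number of levels, e.g. levels_per_stop(0.005).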
Anyway, let's accept this 70 as an initial value. Take the lowest one-stop range that needs "full service", i.e. all 70 levels. From there upwards, every further stop requires twice as many levels as the one below it, because the sensor is linear.
Accordingly, a 10-stop "clean" dynamic range requires 70*2^9 = 35840 levels. Add to this the levels required for the "dirty" stops (there is not much point in distinguishing between 70 levels of a noisy stop).
This range requires over 15 bits, and the dynamic range is still not very large.
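The arithmetic can be checked with a short Python sketch, assuming 70 levels in the lowest clean stop and a doubling for each stop above it. The names are only for illustration. Note that 70*2^9 is the number of levels in the topmost stop alone; if one sums all ten clean stops under the same assumption, the total is 70*(2^10 - 1) = 71610, which is just over 16 bits, so the bit count above is if anything on the low side.

```python
import math

LEVELS_IN_LOWEST_STOP = 70  # assumed starting value from the text
CLEAN_STOPS = 10            # number of "clean" stops considered

def levels_in_stop(n, base=LEVELS_IN_LOWEST_STOP):
    """Levels in the n-th clean stop (n = 1 is the darkest),
    doubling with every stop because the encoding is linear."""
    return base * 2 ** (n - 1)

# Levels in the brightest clean stop: 70 * 2^9
top_stop = levels_in_stop(CLEAN_STOPS)

# Levels summed over all clean stops: 70 * (2^10 - 1)
all_stops = sum(levels_in_stop(n) for n in range(1, CLEAN_STOPS + 1))

print(top_stop, math.log2(top_stop))    # bits needed for the top stop alone
print(all_stops, math.log2(all_stops))  # bits needed for all clean stops
```

The "dirty" stops below the lowest clean one are not counted here; they would add at most a few dozen more codes.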
However, what does one do with such an image file?
The question of how many bits Photoshop can handle is totally irrelevant. Not only because Photoshop is not the center of the universe, but because the software will support a much greater bit depth as soon as that becomes relevant.
But how does one present such an image? I don't know what is on the horizon of printer development, but monitors are not far from reproducing not 12 but 16 stops. Some consumer-grade monitors already offer a contrast ratio of 3000:1 (ok, that's a claim, but I guess it is at least 2000:1), and new developments promise contrast ratios in the tens of thousands. These monitors (even those with 3000:1) are meant not for computer work but for video (the resolution is not high enough for image processing, for example), but that too is only a question of demand - and anyway, why could images not be presented on TV monitors?
The new JPEG standard will handle such images, so we only need cameras with higher dynamic range.
As the dynamic range is limited not by the noise in the shadows but by the well capacity (the former is the consequence of the latter), all we need is fewer pixels on the same sensor with the newest technology. Unfortunately, misguided consumer considerations force manufacturers to cram more and more pixels onto the sensor.
Any time I hear people talking about digital cameras, the first question is always "how many megapixels". We should start a campaign to dispel the myth that the measure of a camera's quality is the number of pixels.