But my reading of the original camera was that it isn't measuring the distance to each pixel: it's just taking 16 simultaneous shots, each at a different fixed focus, and then letting you mix and match pixels from each of those 16 planes after the fact - hence the difference between the 'megaray' count and the actual resolution.
It is taking simultaneous shots, each from a slightly different viewpoint (not a different fixed focus).
One way to do this is to arrange 16 individual cameras in a 4x4 grid and fire their shutters simultaneously. All 16 cameras are focused at the same distance. Because you know the camera positions and orientations, you can compute a 4D lightfield from the 16 images. Once you have the 4D lightfield, you can do the refocusing and various other "tricks". If you search on the keywords 4D lightfield, plenoptic camera, lumigraph, or computational photography, you will find a wealth of information about how the computations are done. Other interesting computational photography topics are coded apertures, coded shutters, computational sensors, etc.
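To make the refocusing step concrete, here is a minimal shift-and-add sketch in Python/NumPy. It assumes you already have the 16 rectified images from the 4x4 grid plus each camera's offset in the grid plane; the function name and the `slope` parameter (pixels of shift per unit of baseline, which selects the synthetic focal plane) are illustrative, not from any particular library.

```python
import numpy as np

def refocus(images, positions, slope):
    """Shift-and-add refocusing over a camera grid.

    images    : list of HxW (or HxWx3) arrays, one per camera
    positions : list of (x, y) camera offsets from the grid center
    slope     : pixels of shift per unit of baseline; each value
                picks a different synthetic focal plane
    """
    acc = np.zeros_like(images[0], dtype=np.float64)
    for img, (x, y) in zip(images, positions):
        # Shift each view in proportion to its offset so rays from
        # the chosen depth line up, then average them all.
        # (np.roll wraps at the edges; a real implementation would
        # crop or pad instead.)
        dx = int(round(slope * x))
        dy = int(round(slope * y))
        acc += np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    return acc / len(images)

# A 4x4 grid centered on the optical axis:
positions = [(i - 1.5, j - 1.5) for j in range(4) for i in range(4)]
# near = refocus(images, positions, slope=2.0)   # focus nearer
# far  = refocus(images, positions, slope=-1.0)  # focus farther
```

Objects at the depth where the shifts cancel come out sharp; everything else gets averaged into blur, which is exactly the synthetic-aperture refocusing effect.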
The Lytro achieves the effect by placing a grid of microlenses behind the primary lens of the camera, just in front of the sensor. The result is similar, but the distance between viewpoints is limited by the diameter of the primary lens, i.e. each microlens is "looking" through a small portion of the front element of the primary lens.
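In that design, the sensor records a small image under each microlens, and picking the same pixel offset under every microlens reassembles one sub-aperture view, i.e. the scene as seen through one small region of the main lens. A rough sketch, assuming an idealized raw capture where each microlens covers exactly an n x n block of pixels (a real Lytro file would need demosaicing and grid calibration first):

```python
import numpy as np

def subaperture_views(raw, n):
    """Split an idealized plenoptic raw image into n*n sub-aperture views.

    raw : (H*n, W*n) array where each n x n block is the tiny image
          formed under one microlens
    n   : pixels per microlens along each axis (assumed square grid)

    Returns an (n, n, H, W) array: views[u, v] is the scene as seen
    through one small region (u, v) of the main lens aperture.
    """
    H, W = raw.shape[0] // n, raw.shape[1] // n
    blocks = raw[:H * n, :W * n].reshape(H, n, W, n)
    # Fixing (u, v) and varying the microlens index (h, w) gives one view.
    return blocks.transpose(1, 3, 0, 2)  # -> (n, n, H, W)
```

Refocusing a plenoptic capture is then the same shift-and-add trick as above, just with these n*n sub-aperture views standing in for the physical camera grid, and with a much smaller baseline between viewpoints for the reason given.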