I dug up an article I wrote on another forum to explain the differences between an MF and a 24x36 camera, and I might as well use it here. This is the part about what happens with lenses and how it relates to "bokeh" and sharp-unsharp transitions.
To study the case of digital sensors, I suggest comparing two cameras with similar resolution but different sensor sizes, the Nikon D800 and the Leica S. As an exercise, I suggest studying what would happen if we made the D800 bigger to match the sensor size of the S.
The two cameras have similar resolution (D800: 36.3 Mpix, S: 37.5 Mpix), but one sensor is 36x24mm and the other 45x30mm. The linear dimensions are 1.25 times bigger (25%) for the S. I will imagine that we have an expanding machine that can blow the D800 up by 25% in every dimension, creating a D800+. What happens?
The D800 dimensions are 146x123x81.5mm, so the D800+ is about 182.5x153.5x101.5mm.
We still have 36.3 million pixels; they just get bigger (we could instead have chosen to get more pixels of the same size, as would be the case between the Nikon D600 and the Hasselblad H5D-50). Since dynamic range for highlights is roughly proportional to the surface of the pixels (for a given technology), we would increase dynamic range. This is irrelevant for photographic practice, since the D800 dynamic range is already sufficient for photographic subjects (we shall come to that later).
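To put rough numbers on "they just get bigger", here is a minimal Python sketch. It assumes square pixels that tile the whole sensor area, which is only an approximation, and the function name is my own.

```python
import math

def pixel_pitch_um(width_mm, height_mm, megapixels):
    """Approximate pixel pitch in micrometres.

    Assumption: square pixels covering the full sensor area.
    """
    area_um2 = (width_mm * 1000.0) * (height_mm * 1000.0)
    return math.sqrt(area_um2 / (megapixels * 1e6))

# Same pixel count on the original and on the blown-up sensor.
d800_pitch = pixel_pitch_um(36, 24, 36.3)
d800_plus_pitch = pixel_pitch_um(45, 30, 36.3)

# Pixel surface grows with the square of the linear factor.
area_ratio = (d800_plus_pitch / d800_pitch) ** 2

print(f"D800 pitch:  {d800_pitch:.2f} um")
print(f"D800+ pitch: {d800_plus_pitch:.2f} um")
print(f"Pixel area ratio: {area_ratio:.4f}")
```

The pitch grows by the linear factor of 1.25, so the surface of each pixel grows by 1.25 squared, a bit more than half again.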
The weight is multiplied by the cube of 1.25. This is often overlooked when scaling objects: weight is proportional to volume, i.e. to the cube of the linear dimensions. Our 1000g D800 becomes a 1953g D800+.
Since we are interested in a complete system, we will fit that D800 with a standard lens, a 50mm f/1.8G with 7 elements in 6 groups, and see what happens. The lens becomes a 62.5mm. Its dimensions increase from 72x52.5mm to 90x65mm. Its weight increases (cube again) from 185g to 361g. The aperture stays identical at f/1.8, because the f-number is dimensionless (the ratio of two lengths, focal length over entrance pupil diameter).
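The scaling arithmetic above can be checked with a short Python sketch. It assumes a perfectly uniform blow-up (same materials, same density), which is the premise of the thought experiment, and the helper names are mine. The exact products for the last two body dimensions and for the lens length (153.75, 101.875 and 65.625 mm) come out slightly above the rounded figures quoted in the text.

```python
def scale_linear(dimensions_mm, factor):
    """Scale every linear dimension by the same factor."""
    return tuple(d * factor for d in dimensions_mm)

def scale_mass(mass_g, factor):
    """Mass follows volume, i.e. the cube of the linear factor."""
    return mass_g * factor ** 3

FACTOR = 45 / 36  # 1.25, ratio of the sensor widths

body_dims = scale_linear((146, 123, 81.5), FACTOR)
body_mass = scale_mass(1000, FACTOR)

lens_dims = scale_linear((72, 52.5), FACTOR)
lens_mass = scale_mass(185, FACTOR)
focal_mm = 50 * FACTOR

# The f-number is focal length / entrance pupil diameter.
# Both lengths scale by the same factor, so the ratio is unchanged.
pupil_mm = 50 / 1.8
f_number = focal_mm / (pupil_mm * FACTOR)

print("D800+ body (mm):", body_dims, "mass (g):", round(body_mass))
print("Lens (mm):", lens_dims, "mass (g):", round(lens_mass))
print("Focal length (mm):", focal_mm, "f-number:", round(f_number, 2))
```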
Let us compare the D800+ with its 62.5mm lens to the Leica S.
The D800+ is bigger than the Leica S (182.5x153.5x101.5mm versus 160x120x80mm) and heavier (1953g versus 1410g). We should expect that, since Leica does not use a blow-up machine, but uses standard components for the electronics (processor, memory, etc.). The only things which need to be bigger in the S are the sensor and the mechanics (shutter, mirror box, mount).
The real difference is in the lenses. The focal length of the Leica standard lens is a bit longer, at 70mm versus 62.5mm. But the Leica standard lens also:
-is slower: f/2.5 versus f/1.8
-uses one more element: 8 versus 7
-is longer: 93mm versus 65mm
-has the same diameter, even though it is slower: 90mm for both
-is much heavier: 740g versus 361g.
Leica lenses for the S series are particularly complex and heavy, but we would find a similar situation for Hasselblad or Phase One: medium format lenses are slower and more complex than their small format counterparts, even after accounting for linear scaling. This is even more pronounced with lenses away from the "standard" focal length: on 24x36 one is used to a 35mm f/2.0 being tiny and using only a few elements. The equivalent in medium format is slower and uses double the number of elements.
(...)
Earlier in this thread I compared a Leica S2 to a hypothetical Nikon D800+, which is a D800 blown up so that its sensor size matches that of the S2. That included the lens, so the D800+ had a 62.5mm f/1.8 lens (a blown-up 50mm f/1.8). The important part is that the f-number did not scale at all, because f-numbers are dimensionless. And this is very important, because many things depend on the f-number.
For small sensors, the important part is diffraction. What matters for us is that the size of the sensels dictates the smallest usable aperture of a lens. For example, for a sensor with 6µm sensels, the first effects of diffraction become barely noticeable when stopping down beyond about f/11-f/16. This is not a practical limitation, unless you are interested in macrophotography. For smaller sensors, however, the limitation is more serious. Typically, for the tiny sensors used in P&S cameras or cellphones with pixels under 2µm, f/2.8 may be the slowest aperture that does not degrade the picture. Typically, these cameras do not have a diaphragm at all, but use gray attenuation filters (as is also customary practice for video cameras). Typically as well, they use zoom lenses with sliding apertures, and the long end can be as slow as f/5.6 or f/8. Since the lens barely resolves the sensels at f/2.8, you will have divided your linear resolution by 2 and your pixel count by 4 at f/5.6 at the long end. And you have no depth of field control, since you don't have a real diaphragm.
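A rough way to see where these limits come from is to compare the diameter of the Airy disk (about 2.44 x wavelength x f-number) with the pixel pitch. The sketch below is only an illustration under my own assumptions: green light at 0.55µm, and a rule of thumb that diffraction starts to matter once the disk spans about 2 to 3 pixels. That threshold is a matter of taste, not a hard limit, and the function names are hypothetical.

```python
def airy_disk_diameter_um(f_number, wavelength_um=0.55):
    """Diameter of the Airy disk (to the first dark ring), in micrometres."""
    return 2.44 * wavelength_um * f_number

def diffraction_limit_f_number(pixel_pitch_um, pixels_per_disk=2.0,
                               wavelength_um=0.55):
    """F-number at which the Airy disk spans `pixels_per_disk` pixels.

    `pixels_per_disk` is an assumed rule-of-thumb threshold.
    """
    return pixels_per_disk * pixel_pitch_um / (2.44 * wavelength_um)

for pitch in (6.0, 2.0, 1.5):
    low = diffraction_limit_f_number(pitch, pixels_per_disk=2.0)
    high = diffraction_limit_f_number(pitch, pixels_per_disk=3.0)
    print(f"{pitch} um pixels: diffraction shows around f/{low:.1f} to f/{high:.1f}")

# Stopping down from f/2.8 to f/5.6 doubles the blur diameter,
# so linear resolution halves and the useful pixel count drops by 4.
blur_ratio = airy_disk_diameter_um(5.6) / airy_disk_diameter_um(2.8)
print("Blur diameter ratio f/5.6 vs f/2.8:", blur_ratio)
```

With these assumptions the 6µm case lands in the same neighbourhood as the f/11-f/16 quoted above, and the sub-2µm case around f/2.8 to f/4. A different threshold shifts the result by a stop or so.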
For medium and larger sensors, the main difference is in the bokeh. Older photographers may remember the saying that large and medium format cameras allowed better depth of field control. But this is not quite true: due to the availability of very fast lenses (f/1.4 or faster), 24x36 cameras are actually the cameras which produce the thinnest depth of field. So where did this belief come from?
The belief first comes from the fact that medium and large format cameras were used to produce larger prints. The formulas for calculating depth of field are dependent on the apparent size of the prints and large prints seen close have been particularly attractive to the average viewer since the time of classical paintings.
But even if we do not want to produce larger prints, depth of field is, in practice, dependent on the sensor size: smaller sensors need a faster aperture to produce the same apparent depth of field, all other things being equal. But aperture does not scale, and a faster aperture, with any sensor size, comes with more optical aberrations. Spherical aberration, chromatic aberrations, coma, etc. are all dependent on aperture and increase considerably faster than the scaling power. Moreover, these aberrations also tend to be more difficult to control with smaller focal lengths, so smaller sensors are at a further disadvantage.
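The "smaller sensors need a faster aperture" rule can be made concrete. For the same framing and the same final print size, the f-number giving the same apparent depth of field scales with the linear size of the sensor. The Python sketch below applies that rule. The 36mm and 45mm widths are the ones used in this thread; the 6.17mm width is my own assumption for a typical small compact-camera sensor, included only as an illustration.

```python
def equivalent_f_number(f_number, from_width_mm, to_width_mm):
    """F-number on the target format giving the same apparent depth of field.

    Assumes same framing, same subject distance and same final print size.
    """
    return f_number * to_width_mm / from_width_mm

# Leica S standard lens at f/2.5, expressed in 24x36 terms.
leica_on_24x36 = equivalent_f_number(2.5, 45, 36)

# A 24x36 lens at f/2.0, expressed for an assumed 6.17mm wide sensor.
small_sensor = equivalent_f_number(2.0, 36, 6.17)

print(f"f/2.5 on 45x30 behaves like f/{leica_on_24x36:.1f} on 36x24")
print(f"f/2.0 on 36x24 would need f/{small_sensor:.2f} on the small sensor")
```

The second result is well below f/1.0, which no practical lens reaches at these tiny focal lengths.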
What does this mean in practice for different formats?
For tiny electronic sensors with tiny pixels, we would need apertures much faster than f/1.0 if we wanted small depth of field. The optical engineer can't make these at the standard focal length of these sensors and, in practice, the best they can do is f/1.8 (and much slower for zooms at the long end). The f/1.8 lens is complex and needs aspherical surfaces and special glass, mechanical tolerances are a nightmare since everything is so small (especially at the price the user is ready to pay), and the lens is plagued by aberrations, most noticeably chromatic aberration. Software corrections are often the only solution, but can only do so much.
For 24x36 cameras, fast lenses are doable around 50mm and produce a very thin depth of field, but are also difficult to correct. When the photographer wants a depth of field small enough to emphasize the subject with a lens around the standard focal length, apertures around f/2.0-f/2.8 are chosen, and we are in a zone where the aberrations are still responsible for bad bokeh: donut-shaped or split out of focus highlights (spherical aberration) or colored out of focus highlights (longitudinal chromatic aberration). The sweet spot of these lenses is around f/5.6-f/8, but depth of field is fairly large at these apertures.
For medium format digital cameras, very fast lenses are usually not available. The reason is that these cameras use a central shutter and that limits the practical maximum aperture of the lenses. Still, when one wants depth of field control, apertures around f/5.6-f/8 are used and we are in the sweet spot: the lens is almost perfect and bokeh is neutral. Moreover, MF lenses are optimized for a different set of constraints since they do not need to be designed for large apertures but still use optical formulas more complex than their 24x36 equivalents.
For much larger sensors, i.e. large format cameras, we have so much resolution on the sensor that we can afford to waste some and stop down beyond the limits of diffraction. f/64 is an aperture value made famous by large format photographers. Even when the photographer wants small depth of field, f/11-f/16 or slower is common (*). Not only are aberrations negligible, but the out of focus highlights take a shape produced by diffraction. This shape, approximately a bell curve, is just what we need for very pleasing bokeh.
(* optimal depth of field is very much an acquired taste, but corresponds in practice to fast lenses on 24x36 cameras because this is what we are used to. Very fast lenses on large format have been emulated (search for the Brenizer method on Google), and the results are strange. The viewer interprets the results as if the subject were a miniature.)