The issue with splines, of course, remains finding a balance between unjustified overshoot and justified interpolation (inventing data that is probably useful).
Technically, the overshoot is not "unjustified"; it arises from how we measure the error. Here is the intent of interpolation from a signal-processing point of view: if we are to reconstruct a continuous signal from its samples, how should the samples be obtained so that the signal reconstructed from them is the "best" representation of the original, possibly non-bandlimited, function? "Best" is measured by the error between the original (possibly unknown) function and the reconstructed function. For ideal sinc reconstruction, it turns out that the best way to obtain the samples is to use the actual samples of the original signal itself (assuming the signal is band-limited to begin with).
However, if we are going to reconstruct using other methods, is that still the best way to obtain the samples? For example, pick linear interpolation: one would find that even linear interpolation exhibits ringing. Normally we don't see that, because we interpolate between the actual samples. But with the actual samples, linear interpolation is not doing the best it can do. To minimize the error, we need to derive an alternate set of samples and then interpolate between them.
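To make the linear-interpolation remark concrete, here is a minimal sketch, assuming the error is measured in the L2 (least-squares) sense, the knots sit at the integers 0..16, and the signal is a unit step at x = 8. The names `hat`, `step`, and `least_squares_samples` are made up for this illustration. It computes the "alternate" samples that minimize the integrated squared error of the piecewise-linear reconstruction and compares them with the plain samples.

```python
def hat(k, x):
    """Linear-interpolation (triangle) kernel centred on knot k."""
    return max(0.0, 1.0 - abs(x - k))


def step(x, edge=8.0):
    """Unit step used as the test signal (an assumption of this sketch)."""
    return 1.0 if x >= edge else 0.0


def least_squares_samples(f, n_knots, sub=200):
    """Coefficients c minimising the integral of (f - sum_k c[k]*hat_k)^2
    over [0, n_knots - 1]."""
    # Right-hand side b[k] = integral of f * hat_k, by the midpoint rule.
    h = 1.0 / sub
    xs = [(i + 0.5) * h for i in range((n_knots - 1) * sub)]
    b = [sum(f(x) * hat(k, x) for x in xs) * h for k in range(n_knots)]

    # Gram matrix of the hat functions is tridiagonal:
    # 2/3 on the diagonal (1/3 at the two end knots), 1/6 off the diagonal.
    diag = [2.0 / 3.0] * n_knots
    diag[0] = diag[-1] = 1.0 / 3.0
    off = 1.0 / 6.0

    # Thomas algorithm: forward elimination, then back substitution.
    for k in range(1, n_knots):
        w = off / diag[k - 1]
        diag[k] -= w * off
        b[k] -= w * b[k - 1]
    c = [0.0] * n_knots
    c[-1] = b[-1] / diag[-1]
    for k in range(n_knots - 2, -1, -1):
        c[k] = (b[k] - off * c[k + 1]) / diag[k]
    return c


n_knots = 17
plain_samples = [step(float(k)) for k in range(n_knots)]  # ordinary sampling
coeffs = least_squares_samples(step, n_knots)             # error-optimal samples

for k in range(5, 12):
    print(k, plain_samples[k], round(coeffs[k], 4))
```

The plain samples stay inside [0, 1], while the least-squares coefficients next to the edge go above 1 and below 0 (by roughly 13% in this setup) and then oscillate with decaying amplitude. That is the ringing that linear interpolation "should" show if it is asked to minimize L2 error.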
Unfortunately, we are typically given the samples and can't change the way they were acquired. Even in such cases, however, optimizations exist that try to mitigate the information loss by jointly considering the way the samples were acquired and the reconstruction kernel.
Now, as suggested by EsbenHR, one can use a Hermite polynomial instead of a cubic spline to get rid of "ringing-type" artifacts. However, that should be done with the understanding that the goal is to generate a visually pleasing image, and not necessarily the technically best reconstruction of the original image.
On the other hand, if we keep the reconstruction kernel fixed, different ways of measuring the error will suppress the ringing to different degrees. As I mentioned in an earlier post, the L1 norm is less sensitive to outliers (here, a sharp step that causes ringing).
For example, consider the numbers {1, 2, 5, 9, 112}, where 112 is the outlier. If we want to approximate them by a single constant, then:
L1 norm: gives the estimator 5 (the median),
L2 norm (least squares): gives the estimator 25.8 (the mean),
L_infinity (max) norm: gives the estimator 56.5 (the midpoint of the smallest and largest values).
As seen, the L2 and L_infinity estimates are way off, while the L1 estimate is reasonable. It also gives another satisfying result: the estimator, 5, is actually present in the original set of numbers.
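The three numbers are easy to check. Below is a small sketch using only the Python standard library; the cost functions and the brute-force grid search (`best_on_grid`) are just illustration helpers, used to confirm that the median, the mean, and the midrange really are the minimizers of the L1, L2, and L_infinity errors for this data set.

```python
from statistics import mean, median

data = [1, 2, 5, 9, 112]  # 112 is the outlier


def l1_cost(c):
    """Sum of absolute deviations from the constant c."""
    return sum(abs(x - c) for x in data)


def l2_cost(c):
    """Sum of squared deviations from the constant c."""
    return sum((x - c) ** 2 for x in data)


def linf_cost(c):
    """Largest absolute deviation from the constant c."""
    return max(abs(x - c) for x in data)


# Closed-form minimisers of the three costs.
l1_est = median(data)                    # L1: the median
l2_est = mean(data)                      # L2: the mean
linf_est = (min(data) + max(data)) / 2   # L_infinity: the midrange


def best_on_grid(cost):
    """Brute-force check: best constant on a 0.1-spaced grid over [0, 112]."""
    grid = [i / 10 for i in range(0, 1121)]
    return min(grid, key=cost)


print("L1   :", l1_est, best_on_grid(l1_cost))
print("L2   :", l2_est, best_on_grid(l2_cost))
print("Linf :", linf_est, best_on_grid(linf_cost))
```

Changing 112 to something even larger moves the L2 and L_infinity estimates further away, while the L1 estimate stays at 5.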
However, optimization in norms other than L2 is challenging, and that is something to be wary of.