Highlight clipping means the exposure delivers more photons to the sensor than the pixels can hold. In normal digital sensors, all pixels across the sensor have the same photon capacity, and this capacity cannot be changed by software. So your software luminosity mask idea will not work.
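To see why software cannot undo this, here is a minimal sketch assuming an idealized pixel that simply saturates at a fixed capacity. The names `FULL_WELL` and `expose` and all the numbers are made up for illustration and do not describe any real sensor.

```python
FULL_WELL = 1000  # hypothetical pixel capacity, in photoelectrons

def expose(scene_photons):
    """Idealized sensor: each pixel records at most FULL_WELL."""
    return [min(p, FULL_WELL) for p in scene_photons]

scene = [50, 900, 1500, 4000]   # the last two exceed the capacity
recorded = expose(scene)        # both highlights are stored as FULL_WELL

# A software luminosity mask can only rescale what was recorded.
mask = [1.0, 1.0, 0.5, 0.5]
masked = [r * m for r, m in zip(recorded, mask)]
```

After masking, the two highlight pixels are darker but still identical to each other, so the difference between 1500 and 4000 in the scene is gone for good.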
A mask which blocks some of the excess photons from reaching the sensor would solve the problem. A graduated neutral density filter is one common hardware solution which does this. But it is not very programmable.
Another idea is to reduce the exposure to the point where highlight clipping does not occur, then do something else to recover the shadows, which are now too dark. Combining multiple exposures with high dynamic range (HDR) post-processing is a common solution along these lines.
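The multiple-exposure idea can be sketched roughly as follows, assuming a noise-free sensor and two frames whose exposure ratio is known exactly. `FULL_WELL`, `expose`, and `merge_hdr` are illustrative names, not part of any camera or HDR package.

```python
FULL_WELL = 1000  # hypothetical pixel capacity

def expose(scene, exposure):
    """Idealized capture: scale the scene by the exposure, then clip."""
    return [min(p * exposure, FULL_WELL) for p in scene]

def merge_hdr(long_frame, short_frame, ratio):
    """Keep the long exposure where it is not clipped;
    otherwise use the short exposure scaled up by the exposure ratio."""
    merged = []
    for long_px, short_px in zip(long_frame, short_frame):
        if long_px < FULL_WELL:
            merged.append(long_px)
        else:
            merged.append(short_px * ratio)
    return merged

scene = [50, 900, 1500, 3000]
long_frame = expose(scene, 1.0)     # shadows fine, highlights clipped
short_frame = expose(scene, 0.25)   # highlights fit, shadows very dark
merged = merge_hdr(long_frame, short_frame, ratio=4.0)
```

In this toy model the merged values match the scene exactly. With a real sensor the scaled-up short exposure also carries amplified noise, and the frames have to be aligned if anything moved between them.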
In keeping with your desired in-camera software solution, you want a sensor with programmable sensitivity (ISO setting) at the pixel level. Then you could keep the exposure below the clipping level for normal pixels, and raise the electronic gain of pixels in the dark areas. A pre-exposure could determine which pixels need low gain and which pixels need high gain. No digital sensor has this capability, and it would greatly complicate the electronics to create such a sensor.
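As a thought experiment only, the pre-exposure scheme might look like the sketch below. The sensor is imaginary, and `ADC_MAX`, `HIGH_GAIN`, `DARK_THRESHOLD`, and the functions are all assumptions. It models only the quantization step of the converter and ignores noise.

```python
ADC_MAX = 255         # hypothetical 8-bit converter
HIGH_GAIN = 8         # hypothetical gain for pixels in dark areas
DARK_THRESHOLD = 16   # pre-exposure code below this counts as "dark"

def read_out(signal, gains):
    """Apply the per-pixel gain before quantizing, clipping at ADC_MAX."""
    return [min(int(s * g), ADC_MAX) for s, g in zip(signal, gains)]

def choose_gains(pre_exposure):
    """Decide per pixel: high gain where the pre-exposure was dark."""
    return [HIGH_GAIN if v < DARK_THRESHOLD else 1 for v in pre_exposure]

def normalize(codes, gains):
    """Divide the gain back out so all pixels share one scale."""
    return [c / g for c, g in zip(codes, gains)]

signal = [0.4, 1.7, 10.2, 120.0, 250.0]        # exposure kept below clipping
pre = read_out(signal, [1] * len(signal))      # uniform-gain pre-exposure
gains = choose_gains(pre)
codes = read_out(signal, gains)
normalized = normalize(codes, gains)
```

With uniform gain the three dark pixels read 0, 1 and 10. With per-pixel gain they come out as 0.375, 1.625 and 10.125, so the shadows are resolved in finer steps while the bright pixels stay unclipped.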
A few cameras do contain a mix of pixels with different photon capacity, but the pattern is fixed in hardware and not programmable. The idea is to expose so that the low-capacity pixels capture the shadows, then reconstruct the clipped highlights from the high-capacity pixels.
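A rough one-dimensional sketch of that reconstruction, assuming a fixed alternating pattern of low- and high-capacity pixels and a simple neighbor average. The pattern, the capacities, and the function names are invented for illustration and are not how any particular camera does it.

```python
LOW_CAP, HIGH_CAP = 1000, 4000   # hypothetical capacities

def capacity(i):
    """Fixed pattern: even sites are low capacity, odd sites high capacity."""
    return LOW_CAP if i % 2 == 0 else HIGH_CAP

def expose(scene):
    """Each pixel clips at its own capacity."""
    return [min(p, capacity(i)) for i, p in enumerate(scene)]

def reconstruct(raw):
    """Replace clipped low-capacity pixels with the average of the
    adjacent high-capacity pixels."""
    out = list(raw)
    for i, v in enumerate(raw):
        if i % 2 == 0 and v >= LOW_CAP:
            neighbors = [raw[j] for j in (i - 1, i + 1) if 0 <= j < len(raw)]
            out[i] = sum(neighbors) / len(neighbors)
    return out

scene = [200, 300, 2000, 2400, 2800, 3000]
raw = expose(scene)
estimate = reconstruct(raw)
```

The filled-in values are only estimates. The pixel between 2400 and 3000 comes out close to the true 2800, while the one sitting on the edge between 300 and 2400 is noticeably off, which is the usual price of interpolating across a sharp transition.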