And if the scene DR is high compared to the camera DR, you have no spare latitude for ETTR. That makes ETTR possible and relevant only when you have a low-DR scene and a high-latitude sensor with noisy shadows (noise which, really, makes the sensor's usable latitude far smaller).
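As a rough sketch of that headroom argument: if both ranges are expressed in stops, the room left for pushing exposure to the right is simply the difference, and it drops to zero once the scene DR reaches the camera DR. The function name and the example figures are made up for illustration, and "camera DR" here is assumed to mean the usable range after discounting noisy shadows.

```python
def ettr_headroom_stops(camera_dr_stops, scene_dr_stops):
    """Stops of spare latitude available for raising exposure (never negative).

    Both arguments are dynamic ranges in stops. camera_dr_stops is assumed
    to be the *usable* range, i.e. already reduced for noisy shadows.
    """
    return max(0.0, camera_dr_stops - scene_dr_stops)


# Hypothetical numbers: a flat scene on a capable sensor leaves room for ETTR,
# a contrasty scene leaves none.
flat_scene = ettr_headroom_stops(12.0, 8.0)
contrasty_scene = ettr_headroom_stops(10.0, 13.0)
```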
I tend to see "ETTR" as a concept that loosely describes the process of being well-informed about what highlights will be accurately recorded, and what will not, while setting exposure.
Any process that lets you raise exposure by, e.g., 1 stop while remaining reasonably confident that critical highlights are not clipped will raise signal levels across the histogram. Parts of the image whose SNR is marginal for your editing/presentation intent will improve accordingly.
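To put a rough number on "improved accordingly", here is a minimal sketch that assumes a simple per-pixel noise model with only photon shot noise and a fixed read noise, both in electrons. The function names and the default read noise of 3 e- are assumptions for illustration, not measurements of any particular camera.

```python
import math


def snr(signal_e, read_noise_e=3.0):
    """SNR of a pixel that collected signal_e photoelectrons.

    Assumed noise model: shot noise (variance = signal) plus read noise,
    added in quadrature.
    """
    return signal_e / math.sqrt(signal_e + read_noise_e ** 2)


def snr_gain_db(signal_e, stops=1.0, read_noise_e=3.0):
    """SNR improvement in dB from raising exposure by `stops` stops."""
    boosted_e = signal_e * 2.0 ** stops  # each stop doubles the collected signal
    return 20.0 * math.log10(snr(boosted_e, read_noise_e) / snr(signal_e, read_noise_e))


# Hypothetical tones: a bright midtone and a deep shadow, each pushed by 1 stop.
midtone_gain = snr_gain_db(10000.0)
shadow_gain = snr_gain_db(1.0)
```

Under this model, one extra stop buys about 3 dB where shot noise dominates, and the gain approaches 6 dB in shadows where read noise dominates, which is exactly where marginal SNR tends to be a problem.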
My posts about ETTR should not be seen as a claim that everyone should ETTR all of the time. Rather, I am arguing that ETTR seems to be based on sensible theory and proven practice in that it (subject to some constraints) maximizes the recorded (scene) information. Being a technical concept, it can actually be discussed and analyzed in a civil manner. The relevance for a particular user/application is left to the reader.
For many of my uses, I have limited freedom in choosing exposure time and aperture due to movement and DOF. Thus, bumping exposure time in order to capture a "hotter" signal may be an option that I want to rule out. In other cases, I can choose exposure time and/or aperture with great freedom, and would like to do so in an "optimal" manner. I have some experience with images that seemed "excellent" at the time but showed warts a few years later, when I was able to print larger, had (perhaps) higher standards, and wanted to do more processing.
-h