You can do without a tripod - in theory. It just requires a lot more storage space and a lot more computing power at the other end, as well as a camera with more pixels than you actually need and a subject far enough away that minor movements while hand-holding won't result in any meaningful parallax error.
A single shot taken handheld at 4s, ISO 100 is going to be blurry. But a single shot taken at 1/30s, ISO 12800 (seven stops faster) could well be sharp, depending on focal length and camera- or lens-based IS. Take 128 such frames in quick succession and each individual frame is likely to be sharp, although framed slightly differently. Align those 128 frames and average them, and you get roughly the equivalent of a 4s, ISO 100 exposure: about the same total light, with the averaging knocking the high-ISO noise back down. You'll have to crop slightly, since the framing will differ a little from frame to frame, and you'll almost certainly have to straighten the final output, but if your camera already has more megapixels than you need, you can afford to.
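The align-average-crop idea can be sketched in a few lines. This is only a toy, dependency-free illustration on small synthetic greyscale frames, assuming the hand-held movement is a pure whole-pixel shift (no rotation, no parallax). The function names (`find_shift`, `stack`, `simulate_burst`) are made up for this example, and real stacking software uses far more capable alignment than this brute-force search.

```python
import random

def shift_error(ref, img, dx, dy):
    """Mean squared difference between ref[y][x] and img[y+dy][x+dx] over their overlap."""
    h, w = len(ref), len(ref[0])
    total, count = 0.0, 0
    for y in range(max(0, -dy), min(h, h - dy)):
        for x in range(max(0, -dx), min(w, w - dx)):
            d = ref[y][x] - img[y + dy][x + dx]
            total += d * d
            count += 1
    return total / count

def find_shift(ref, img, max_shift):
    """Brute-force search for the whole-pixel (dx, dy) that best maps ref onto img."""
    candidates = [(dx, dy)
                  for dy in range(-max_shift, max_shift + 1)
                  for dx in range(-max_shift, max_shift + 1)]
    return min(candidates, key=lambda s: shift_error(ref, img, s[0], s[1]))

def stack(frames, max_shift):
    """Align every frame to the first one, average, and crop a max_shift border."""
    ref = frames[0]
    h, w = len(ref), len(ref[0])
    shifts = [find_shift(ref, f, max_shift) for f in frames]
    m = max_shift  # cropping m pixels per side keeps every lookup inside every frame
    stacked = [[sum(f[y + dy][x + dx] for f, (dx, dy) in zip(frames, shifts)) / len(frames)
                for x in range(m, w - m)]
               for y in range(m, h - m)]
    return stacked, shifts

def simulate_burst(n_frames, size, max_shift, noise_sigma, seed=0):
    """Synthetic burst: noisy, randomly shifted crops of one random-texture scene."""
    rng = random.Random(seed)
    m = max_shift
    scene = [[rng.random() for _ in range(size + 2 * m)] for _ in range(size + 2 * m)]
    # The first frame is unshifted; the rest wander by up to max_shift pixels.
    offsets = [(0, 0)] + [(rng.randint(-m, m), rng.randint(-m, m))
                          for _ in range(n_frames - 1)]
    frames = [[[scene[y + m + oy][x + m + ox] + rng.gauss(0, noise_sigma)
                for x in range(size)] for y in range(size)]
              for ox, oy in offsets]
    return scene, offsets, frames

if __name__ == "__main__":
    scene, offsets, frames = simulate_burst(n_frames=16, size=24, max_shift=3, noise_sigma=0.2)
    stacked, shifts = stack(frames, max_shift=3)
    print("recovered shifts:", shifts)
    print("stacked size:", len(stacked), "x", len(stacked[0]))
```

The stacked result is smaller than the input frames by the crop border, which is the price of the shifting framing, and its noise should fall roughly with the square root of the number of frames averaged.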
You could do this using individual stills shot in a high-speed burst, but a much better bet, if this method of tripod replacement ever becomes practical, would be to use an ultra-high-resolution (higher than 8K) video format.
Of course, this is very taxing on your storage space and computing resources. A 4s exposure requires 128 frames if you can keep each individual frame sharp at 1/30s. But if you need 1/500s for a sharp image, you'll need about 2048 frames (rounding to whole stops) at ISO 204800 for a 4s exposure. And if you want a 30s, or even multi-minute, exposure handheld at that sort of shutter speed, you're looking at tens or hundreds of thousands of frames. Not impossible - computing power and storage capacity increase every year - but certainly slow and inconvenient.
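The frame counts above are simple arithmetic, which can be checked with a short sketch. The function name `frames_needed` and its parameters are invented for this example. It assumes a base of ISO 100 and rounds the frame count up to a whole number of stops, which is why 1/30s gives 128 rather than exactly 120.

```python
import math

def frames_needed(target_exposure_s, shutter_denominator, base_iso=100):
    """Return (exact_frames, frames_rounded_to_whole_stops, per_frame_iso).

    shutter_denominator is the N in a 1/N s shutter speed, e.g. 30 for 1/30s.
    """
    exact = math.ceil(target_exposure_s * shutter_denominator)
    stops = math.ceil(math.log2(exact))   # whole stops between the two exposures
    rounded = 2 ** stops                  # each stop doubles the frame count
    return exact, rounded, base_iso * rounded

if __name__ == "__main__":
    for seconds, denom in [(4, 30), (4, 500), (30, 500), (300, 500)]:
        exact, rounded, iso = frames_needed(seconds, denom)
        print(f"{seconds}s at 1/{denom}s: {exact} frames exactly, "
              f"{rounded} in whole stops, ISO {iso}")
```

At 1/500s, a 30s exposure already needs 15,000 frames and a five-minute one 150,000, which is where the "tens or hundreds of thousands" figure comes from.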
This is just an extension of what I already do when wind or focal length mean that even a tripod can't make a shot sharp (I've had that happen even at 1/400s, when shooting with telephoto lenses on a tripod in windy conditions), or when doing long exposures of scenes where I'd like to keep some moving elements sharp (e.g. keeping the leaves and branches still while allowing motion blur in the waterfall). It's also what astrophotographers do to build up multi-hour exposures. Of course, all these things are already done on tripods, so you're usually talking 10-20 frames (for landscapes) to several hundred (for astrophotography). To use the same technique to replace a tripod would likely require another ten years of growth in computing power and storage space, assuming no growth in final image resolution - for now, carrying a tripod is just going to be more convenient.