These days 40-60MP sensors are readily available and it is widely rumoured that Fuji will issue an X-T5 with 40MP to go alongside the recently announced X-H2. . . . My query is whether there is a consensus that using primes with 40MP and then cropping to get the framing right is better than staying with 26MP and using zooms . . .
Depends on the particular lens, the particular sensor, the particular subject, the particular vantage point, and the particular photographer's approach to visualizing the image in the viewfinder.
I also shoot Fuji, and I plan to pop for either an X-H2 or an X-T5, just to have the additional flexibility of the higher resolution. (That's the reason I'm still hanging on to my Nikon D800E.) Although we don't know all the details of the X-T5 yet, I suspect its technical specs will closely track those of the X-H2. I'm very familiar with the X-T* series, but I want to try out the ergonomics of the X-H2 with a rental before I commit to one of the 40 Mpx alternatives.
But to get back to your main point, I don't think it's possible to generalize about primes versus zooms or cropping versus in-camera framing: there are too many variables. If you compose well, get your exposure right, and nail the focus, you will probably get an image you are satisfied with regardless of the camera and lens configuration.
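For anyone who wants to put rough numbers on the cropping side of the question, here is a minimal sketch of the pixel-count arithmetic. It assumes both sensors are the same physical size (as with the 26MP and 40MP Fuji bodies discussed here) and it only counts pixels; it says nothing about lens sharpness, noise, or focus accuracy, which are exactly the variables that make generalizing hard. The function names are made up for illustration.

```python
import math


def cropped_megapixels(sensor_mp: float, linear_crop: float) -> float:
    """Megapixels left after cropping by a linear factor (1.0 = no crop).

    Cropping narrows both width and height, so the pixel count
    falls with the square of the linear crop factor.
    """
    if linear_crop < 1.0:
        raise ValueError("linear_crop must be >= 1.0")
    return sensor_mp / linear_crop ** 2


def break_even_crop(high_mp: float, low_mp: float) -> float:
    """Linear crop at which the higher-resolution sensor drops to low_mp."""
    return math.sqrt(high_mp / low_mp)


# Hypothetical comparison: 40MP body with a prime vs. 26MP body with a zoom.
factor = break_even_crop(40, 26)
print(f"break-even linear crop: {factor:.2f}x")
print(f"40MP cropped 1.5x leaves {cropped_megapixels(40, 1.5):.1f}MP")
```

By this count, a 40MP frame can be cropped to the framing of a lens about 1.24x longer before it drops to 26MP; beyond that, the zoom on the 26MP body puts more pixels on the subject.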