Ok, good point about energy levels: so you say one photon = one electron? Does the energy of the photon in question (e.g. 380nm vs 760nm light) make a difference to the number of electrons generated?
The 'Unity Gain' concept is quite artificial and therefore, with due respect, I think that the introduction of noise as a consideration is both confusing and unnecessary.
I am all for keeping things simple; however, noise is a pretty key element of this discussion, as I hope to show below. I am thinking about the noise that is always inherently present in light, sometimes referred to as shot noise because its distribution is similar to the arrival statistics of shot from a shotgun, which was characterized by a gentleman by the name of Poisson.
So now we can address the integer nature of electrons. Let's assume that each photosite is a perfect electron counter with Unity Gain: if 10 electrons are generated by the photosite, the ADC stores a count of 10 in the Raw data with no errors. Example 1: the sensor is exposed at a known luminous exposure and the output of the photosite in question is found to result in a Raw value of 2. What is the signal?
We cannot tell by looking at just one photosite. The signal could easily be 1, 2, 3, 4, 5, 6... For instance, if it were 4, shot noise would be 2 (the square root of the signal), and a value of 2 is only one standard deviation away. To know what the signal is, we need to take a uniform sample of neighbouring photosites, say a 4x4 matrix*. We gather statistics from the Raw values of each photosite and compute a mean (the signal) and a standard deviation (the noise). In this example it turns out that the signal was 1 electron with a standard deviation/noise of 1. Interestingly, the human visual system works more or less the same way.
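To put rough numbers on why one photosite is not enough, here is a minimal sketch that asks how likely a Raw value of 2 is for each candidate signal from 1 to 6 electrons. It assumes pure Poisson shot noise and the perfect Unity Gain counter described above; the names `poisson_pmf` and `likelihoods` are mine, for illustration only.

```python
import math

def poisson_pmf(k, mean):
    """Probability of counting exactly k electrons when the true mean is `mean`."""
    return mean ** k * math.exp(-mean) / math.factorial(k)

observed = 2  # the Raw value read from the single photosite

# Likelihood of reading 2 for each candidate true signal (in electrons).
likelihoods = {signal: poisson_pmf(observed, signal) for signal in range(1, 7)}

for signal, p in likelihoods.items():
    print(f"true signal {signal} e-: P(Raw = {observed}) = {p:.3f}")
```

Under these assumptions none of the candidate signals makes a reading of 2 particularly unlikely, which is why we need the neighbouring photosites.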
Example 2: a new exposure resulting in a signal of 7 electrons for each photosite in the 4x4 matrix on our sensor. Of course each does not get exactly 7 electrons because photons arrive randomly, and in fact we know thanks to M. Poisson that the mean of the values in our 4x4 matrix should indeed be 7 but with a standard deviation of 2.646 - so some photosites will generate a value of 7 but many will also generate ...2,3,4,5,6,8,9,10,11,12.... The signal is the mean of these values.
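For anyone who wants to try Example 2 at home, the following is a rough simulation sketch, not a model of any real sensor. It assumes pure Poisson arrivals with a mean of 7 electrons and a perfect Unity Gain counter; the sampler `poisson_sample`, the fixed seed and the number of simulated photosites are my own choices for illustration.

```python
import math
import random
import statistics

def poisson_sample(mean, rng):
    """Draw one Poisson-distributed count (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

rng = random.Random(0)  # fixed seed so the run is repeatable

# One 4x4 patch of photosites, each exposed to a mean signal of 7 electrons.
patch = [poisson_sample(7, rng) for _ in range(16)]
print("4x4 patch Raw values:", patch)
print("patch mean:", statistics.mean(patch), "patch sd:", statistics.pstdev(patch))

# Many photosites at the same exposure, to see where the statistics settle.
many = [poisson_sample(7, rng) for _ in range(16 * 2000)]
print("large-sample mean:", statistics.mean(many))
print("large-sample sd:", statistics.pstdev(many), "vs sqrt(7) =", math.sqrt(7))
```

A single 4x4 patch will wander around 7 from run to run; only the large sample should settle close to a mean of 7 and a standard deviation of about 2.646.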
Example 3: a different exposure. Say we look at our 4x4 matrix of Raw values and end up with a mean of 12.30 and a standard deviation of 3.50. Using the Effective Absolute QE for the D800e above (15.5%) and ignoring for the sake of simplicity Joofa's most excellent point above, could we say that this outcome resulted from exposing each photosite to a mean of 12.3/0.155 = 79.35 photons? After all, this number of photons is a mean itself.
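The arithmetic in Example 3 can be checked with a few lines. This is only a sketch under the same simplifications as the text: Unity Gain, pure shot noise, and a single Effective Absolute QE figure of 15.5%. The variable names are mine.

```python
import math

mean_electrons = 12.30  # mean of the 4x4 Raw values (Unity Gain: 1 count = 1 electron)
measured_sd = 3.50      # standard deviation of the same 4x4 Raw values
qe = 0.155              # Effective Absolute QE quoted for the D800e

# Mean number of photons per photosite implied by the mean electron count.
mean_photons = mean_electrons / qe

# If shot noise dominates, the sd should be close to the square root of the mean.
expected_sd = math.sqrt(mean_electrons)

print(f"implied mean photons per photosite: {mean_photons:.2f}")
print(f"shot-noise sd expected from the mean: {expected_sd:.2f} (measured {measured_sd:.2f})")
```

The measured standard deviation of 3.50 sits very close to the square root of 12.30, which is what we would expect if the spread in the 4x4 matrix is essentially shot noise.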
What does this mean for Unity Gain?
*The area within the circle of confusion on an 8x12" image viewed at arm's length by a person with normal 20/20 vision corresponds to the area of about 16 sensels on a typical modern FF DSLR.