Have you by any chance looked at the deltaE2000 distribution of the points centered in the cubes against the average of their 8 surrounding points? Would that be a good way to drop patches that provide little benefit?
Sorry, it's a long time since I played with BCC sampling, and I've moved on to other sampling approaches. Certainly I've used similar ideas to estimate curvature of the device response, and the default targen "Optimised Farthest Point Sampling" uses the difference between linear device value and measured value interpolation of each Voronoi region as a curvature measure.
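A minimal sketch of the center-versus-surround check from the question, in case it helps. It is not what targen does, and the function name is made up for illustration. It assumes you have measured Lab values for the 8 corners of a cube and for its center point, and it uses plain Euclidean distance in Lab (delta E 1976) as a stand-in for deltaE2000 to keep the example short. The mean of the 8 corners is exactly what trilinear interpolation predicts at the cube center, so the distance is a rough curvature measure.

```python
import numpy as np

def center_curvature(corner_lab, center_lab):
    """Distance between the measured cube center and the mean of its 8 corners.

    corner_lab: (8, 3) measured Lab values at the cube corners.
    center_lab: (3,) measured Lab value at the cube center.
    Assumption: Euclidean Lab distance (delta E 1976), not deltaE2000.
    """
    corner_lab = np.asarray(corner_lab, dtype=float)
    center_lab = np.asarray(center_lab, dtype=float)
    if corner_lab.shape != (8, 3) or center_lab.shape != (3,):
        raise ValueError("expected 8 corner Lab values and 1 center Lab value")
    # Trilinear interpolation at the cube center weights each corner by 1/8.
    predicted = corner_lab.mean(axis=0)
    return float(np.linalg.norm(center_lab - predicted))

# Made-up corner values that vary linearly across the cube.
corners = [[50 + 10 * r, 5 * g, -5 * b]
           for r in (0, 1) for g in (0, 1) for b in (0, 1)]
print(center_curvature(corners, [55.0, 2.5, -2.5]))  # center lies on the linear prediction
print(center_curvature(corners, [58.0, 6.5, -2.5]))  # center departs from it
```

Center patches with a small value are the ones linear interpolation already predicts well, so they would be the candidates for removal.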
A lot of the original inspiration came from 3 papers written by Don Bone. In particular, the paper "Adaptive color-printer modeling using regularized linear splines", which was CSIRO technical report TR-HJ-92-19 and was also published by the SPIE, summarizes a lot of it, and also includes some interesting work on adaptive sampling. At the time (1993) strip and table reading instruments were rare, and Don was measuring point by point, so he came up with an interactive adaptive sampling approach. After measuring a smaller uniform grid, he used a couple of techniques to decide which points to print and measure next. (Since he was working with a copier, it was relatively easy to print each sample as needed.) He was effectively using a curvature criterion, in that areas which were poorly predicted by linear interpolation were sampled in more detail.
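To make the "poorly predicted by linear interpolation" idea concrete, here is a rough one-dimensional sketch. It is my own illustration and not a reconstruction of Don's techniques: the function name, the single device channel and the rule of bisecting the intervals either side of the worst point are all assumptions. Each interior sample is predicted from its two neighbours, and the next samples are placed around the worst-predicted ones.

```python
import numpy as np

def next_samples(xs, ys, n_worst=1):
    """Suggest new device values to measure, given the samples so far.

    xs: device values already measured (1-D, distinct, at least 3 of them).
    ys: measured response at each device value (a scalar, for simplicity).
    Returns the midpoints of the intervals on either side of the n_worst
    interior samples that linear interpolation predicts most poorly.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    order = np.argsort(xs)
    xs, ys = xs[order], ys[order]
    # Predict each interior sample by linear interpolation of its neighbours.
    t = (xs[1:-1] - xs[:-2]) / (xs[2:] - xs[:-2])
    predicted = ys[:-2] + t * (ys[2:] - ys[:-2])
    error = np.abs(ys[1:-1] - predicted)
    # Indices (into xs) of the worst-predicted interior samples.
    worst = np.argsort(error)[::-1][:n_worst] + 1
    new_points = set()
    for i in worst:
        new_points.add(float(0.5 * (xs[i - 1] + xs[i])))
        new_points.add(float(0.5 * (xs[i] + xs[i + 1])))
    return sorted(new_points)

# Made-up response that curves most strongly near the top of the range.
grid = np.linspace(0.0, 1.0, 5)
print(next_samples(grid, grid ** 3))
```

In practice you would print and measure the suggested points, add them to the set, and repeat until the prediction errors are small enough.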
One of the reasons I moved away from regular device grids is that they tended to interact badly with spline-like models, since they "resonated" easily. Some sort of damping or regularization is always needed with spline or polynomial models, but it's less critical if the sampling distribution is more stochastic than regular.