[3dem] [ccpem] on FSC curve (A can of worms...)

Sun Aug 30 07:05:31 PDT 2015

Hi Alexis,

My final pennies:

(1) We FULLY agree that the B&R74 formula (=F&A75 formula) is wrong when 
used in connection with a single experiment!

(2) The field of view of the image does not change upon binning, hence the 
SNR obtained from the same field of view increases upon binning. There 
is no paradox: the SNR just cannot be interpreted as an information 
metric on the object you are observing, that is all. We choose to FULLY 
disagree here.
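
The binning statement in (2) can be checked numerically. The following is only a minimal sketch under stated assumptions: the "image" is a hypothetical 64 x 64 low-frequency sinusoid with added white Gaussian noise, binning is plain 2 x 2 averaging, and SNR is taken as mean signal power over mean noise power. None of these choices come from the discussion itself.

```python
import math
import random

def bin2d(img, k):
    """Average non-overlapping k x k blocks of a square image (list of rows)."""
    n = len(img) // k
    return [[sum(img[i * k + di][j * k + dj]
                 for di in range(k) for dj in range(k)) / (k * k)
             for j in range(n)]
            for i in range(n)]

def power(img):
    """Mean squared pixel value of an image."""
    return sum(v * v for row in img for v in row) / (len(img) * len(img[0]))

rng = random.Random(1)
SIZE, K = 64, 2  # illustrative image size and binning factor (assumptions)

# Smooth signal (one sine period across the field of view) and white noise.
signal = [[math.sin(2 * math.pi * x / SIZE) for x in range(SIZE)]
          for _ in range(SIZE)]
noise = [[rng.gauss(0.0, 1.0) for _ in range(SIZE)] for _ in range(SIZE)]

# Binning is linear, so signal and noise can be binned separately.
snr_before = power(signal) / power(noise)
snr_after = power(bin2d(signal, K)) / power(bin2d(noise, K))
gain = snr_after / snr_before

print("SNR before binning:", snr_before)
print("SNR after binning: ", snr_after)
print("gain:", gain)
```

Under these assumptions the per-pixel SNR rises by roughly the number of pixels averaged per bin, while the field of view stays the same. The sketch says nothing about how that number should be interpreted, which is the point under dispute.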

(3) "N is not so large" in ALL cases of interest. The issue was settled 
more than a decade ago (but largely ignored since). Defining an agreed 
standard resolution indicator is not a "marginal" issue, it is like 
defining the "meter" as the unit of length. Of course, even a well 
defined resolution value can only be a single-valued indicator of what 
we will be able to interpret in terms of the 3D map details. But still, 
we need no further "elastic-band" criteria in cryo-EM. For example, 
calculating a local resolution map while continuously reducing the 
comparison area, with the same fixed FSC threshold value, will 
necessarily lead to absurd results. It leaves too much room for 
deliberately "polishing" the quality metrics.
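
The effect of shrinking the comparison area under a fixed threshold can be sketched as follows. This is an illustration under assumptions, not a local-resolution implementation: real-valued Gaussian samples stand in for the Fourier coefficients of a shell, both "half-maps" contain pure noise, and the threshold value 0.143 is just one example of a fixed threshold (it is not named in this message).

```python
import math
import random

def ccc(x, y):
    """Normalised cross-correlation of two equal-length, zero-mean sequences."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return sxy / math.sqrt(sxx * syy)

def fraction_above(threshold, n, trials, rng):
    """Fraction of pure-noise trials whose correlation exceeds the threshold."""
    hits = 0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if ccc(x, y) > threshold:
            hits += 1
    return hits / trials

rng = random.Random(0)
THRESHOLD = 0.143  # illustrative fixed threshold (assumption)
TRIALS = 400

# n plays the role of the number of independent samples in the comparison area.
results = {n: fraction_above(THRESHOLD, n, TRIALS, rng) for n in (10, 100, 1000)}
for n, frac in results.items():
    print(f"N = {n:5d}: fraction of noise-only trials above threshold = {frac}")
```

In this toy setting the same fixed threshold is crossed by noise alone far more often when N is small, because the spread of the correlation scales roughly as 1/sqrt(N).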

(4) Again, this is a one-off experiment that does not tolerate negative 
SNR values. It is WRONG to interpret this in terms of an expectation 
value because that does not reflect the experimental reality.

(5) "Of course, signal and noise are not orthogonal, and the cross-terms 
are not zero." We FULLY agree that this is what really matters!
"It's just that the expectation value of these cross terms is zero."  
Well yes, that is true but - again - this expectation value is 
irrelevant in a one-off experiment! You simply cannot apply that to a 
single experiment. For example, in VH&S(2005) hundreds of such 
individual FSC plots are plotted on top of each other to illustrate 
their expected "one-off" behaviour. One would get close to the 
expectation value of zero if one summed hundreds or thousands of such FSC 
plots. But that would not be relevant to any individual FSC or SSNR 
experiment contained in the set.

(6) Alexis, note that in (B&R74) even their formula 15 estimator is only 
valid "for large N".

(7) What is relevant is that the cross term between signal and noise 
cannot be neglected. At the level when we state we have collected enough 
information to achieve some predetermined resolution threshold, the 
variance of the signal is typically of the same order as that of the 
noise... Thus the cross terms between the signal and the noise will be 
of the same order of magnitude as the equally "uncorrelated" noise-to-noise 
cross terms, which "everybody" does include in their calculations. For 
consistency, "everybody" should then also use the zero expectation value for 
the noise-to-noise cross terms. In that case, both sides of the formula 
go to infinity. Ha Ha!
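
The order-of-magnitude argument in (5) and (7) can be illustrated with a small simulation. It is a sketch under assumptions: one fixed, hypothetical signal of unit variance, two independent unit-variance Gaussian noise realisations per trial, and real-valued samples in place of Fourier coefficients. The numerator of the cross-correlation of (s + n1) and (s + n2) splits into s.s, the signal-noise cross term (s.n1 + s.n2), and the noise-noise cross term n1.n2.

```python
import math
import random

def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def mean(values):
    return sum(values) / len(values)

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

rng = random.Random(2)
N, TRIALS = 200, 2000  # samples per experiment, number of repeats (assumptions)

# The object is the same in every experiment; only the noise changes.
s = [rng.gauss(0.0, 1.0) for _ in range(N)]

sn_terms, nn_terms = [], []
for _ in range(TRIALS):
    n1 = [rng.gauss(0.0, 1.0) for _ in range(N)]
    n2 = [rng.gauss(0.0, 1.0) for _ in range(N)]
    sn_terms.append((dot(s, n1) + dot(s, n2)) / N)  # signal-noise cross term
    nn_terms.append(dot(n1, n2) / N)                # noise-noise cross term

mean_sn, rms_sn, rms_nn = mean(sn_terms), rms(sn_terms), rms(nn_terms)
print("ensemble mean of signal-noise term:", mean_sn)
print("single-experiment RMS, signal-noise:", rms_sn)
print("single-experiment RMS, noise-noise: ", rms_nn)
```

With equal signal and noise variance, the ensemble mean of the signal-noise term is close to zero, while its size in any single trial is comparable to that of the noise-noise term.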

Let us look at all the extremes of the B&R formula (as any physicist is 
taught to do at university; I have already hammered on the SNR 
positivity issue):

A) Signal --> 0 (Noise =/ 0) thus: SNR --> 0 and CCC --> Noise-to-Noise 
cross term; ergo B&R gives WRONG results in any single experiment with a 
limited number of sampling points N.
It can only be made correct if the number of sampling points goes to 
infinity (which is not our case) or if the experiment is repeated an 
infinite number of times (which is not our case). In both cases the B&R 
formula would become correct (but pointless).
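
Case A can be sketched numerically using the formula as written in this thread, SNR = CCC/(1-CCC). The assumptions are mine: pure Gaussian noise in both data sets (true SNR of zero), real-valued samples, and an arbitrary N of 100 sampling points.

```python
import math
import random

def ccc(x, y):
    """Normalised cross-correlation of two equal-length, zero-mean sequences."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return sxy / math.sqrt(sxx * syy)

def snr_from_ccc(c):
    """SNR estimate from a correlation coefficient, SNR = CCC / (1 - CCC)."""
    return c / (1.0 - c)

rng = random.Random(3)
N, TRIALS = 100, 2000  # sampling points per experiment, repeats (assumptions)

estimates = []
for _ in range(TRIALS):
    # Signal is exactly zero: both measurements are independent noise.
    x = [rng.gauss(0.0, 1.0) for _ in range(N)]
    y = [rng.gauss(0.0, 1.0) for _ in range(N)]
    estimates.append(snr_from_ccc(ccc(x, y)))

negative_fraction = sum(1 for e in estimates if e < 0) / TRIALS
spread = math.sqrt(sum(e * e for e in estimates) / TRIALS)

print("fraction of negative SNR estimates:", negative_fraction)
print("RMS of single-experiment estimates:", spread)
```

In this toy case roughly half of the single-experiment estimates come out negative, and their spread around the true value of zero shrinks only as N grows.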

B) Noise --> 0 (Signal =/ 0) thus: SNR --> Infinity  and CCC --> 1  thus 
CCC/(1-CCC) --> Infinity (here one could define B&R to be correct; but 
irrelevant)

C) In all other cases the Signal-to-Noise cross terms may not be ignored 
and the B&R74 formula - which excludes these cross-terms - is wrong.

To summarise: The B&R74 formula (= the F&A75 formula) is wrong or not 
relevant. In deriving the formula, the all-important cross terms between 
Signal and Noise, which are crucial for our one-off FSC & SSNR 
experiments, have been ignored. A fixed-valued FSC threshold criterion is WRONG and 
leaves space for deliberate manipulation of the resulting metrics.

May the pennies drop!

Marin

Alexis, thank you for insisting on the nitty-gritty! You forced me to be 
explicit.
Maybe that was just what was needed to clean up this mess in cryo-EM.
The bottom line is "Marin van Heel" is right and "everybody else" is 
wrong! Or did I miss something ...?

https://en.wikipedia.org/wiki/Fourier_shell_correlation

=============================================================

On 22/08/2015 10:51, Alexis Rohou wrote:
> Hi Marin,
>
> My two cents:
>
> (1) SNR = CCC/(1-CCC).
> It is also my understanding that equating actual (as opposed to 
> estimated) SNR to CCC or FSC from a single experiment is incorrect. 
> Strictly speaking, it should be made clear that the left-hand-side is 
> an estimate only, ideally with some mention of what the error on that 
> estimate might be.
>
> (2) Information content from SNR
> I believe the correct interpretation of your thought experiment 
> requires the Shannon-Hartley theorem 
> (https://en.wikipedia.org/wiki/Shannon%E2%80%93Hartley_theorem), which 
> relates SNR to information rate (bits per second or, in our case, 
> bits per pixel). My understanding is that upon binning, the 
> information content per pixel goes up. Of course, we also lose a lot 
> of pixels and so the total information content in the image goes down. 
> Thus your paradox is gone.
>
> (3) N
> Yes, in some cases, N is not so large. This can occur with high 
> symmetry, small objects in large reconstruction volumes etc. That 
> should mean that, for example, icosahedral and C1 cases should be 
> treated differently if we want a resolution criterion that is robust.
> The reason this isn't done today is that in many cases the effect due 
> to N is believed to be marginal. Arguably this is not a good reason to 
> not do the right thing - just because something is only marginally 
> wrong doesn't mean we shouldn't fix it.
>
> (4) As SNR approaches 0.0, CCC oscillates around 0.0
> Yes, in that case, the B&R estimator will give negative SNR estimates 
> ~50% of the time. I'm OK with that, you're not.
>
> (5-6) Signal-noise cross terms in Bershad & Rockmore
> Of course, signal and noise are not orthogonal, and the cross-terms 
> are not zero. It's just that the expectation value of these cross 
> terms is zero.
> I think B&R did the right thing. In Eqn (2) they are describing the 
> expectation value of the multiplication of the two noisy data streams. 
> Therefore, when they derive Eqn (6) from it, they are justified in not 
> having cross terms. Then Eqn (7), their estimator, doesn't have the 
> cross-terms in it, but that's because it only need be correct in the 
> ensemble (unbiased), not for a single experiment. B&R's Eqns (14) and 
> (15), which give the error of their estimator, must have the 
> cross-terms - this is reflected in the prominent role of N in those 
> expressions.
>
> To summarise: B&R are correct; a fixed criterion is not always a good 
> tool for trustworthy estimates of resolution; in many cases this isn't 
> a significant problem, but we might as well improve on what we have.
>
> All the best,
> Alexis
>
>
