[3dem] [ccpem] on FSC curve (A can of worms...)

Penczek, Pawel A Pawel.A.Penczek at uth.tmc.edu
Sun Aug 30 09:51:14 PDT 2015


I specifically wrote about the Wikipedia page.

Marin, this exchange is largely pointless.  I am not sure what your agenda is.  You seem to be keen on creating confusion.  I do not know what motivates you, but you are not helping anybody.

I am signing off from this thread. 

Regards,
Pawel

> On Aug 30, 2015, at 11:44 AM, Marin van Heel <marin.vanheel at googlemail.com> wrote:
> 
> 
> Hi Pawel,
> 
> You have lost me completely now.
> 
> This thread was very much devoted to Bershad & Rockmore 1974... I don't know what you are talking about.  What do you mean by the "true source of the FSC"? Have you read the whole thread from the beginning? Are you confusing the FSC formula with the Frank & Al-Ali 1975 formula?
> 
> Sorry, you really lost me here...
> 
> Cheers
> 
> Marin
> 
>> On 30/08/2015 13:14, Penczek, Pawel A wrote:
>> Hi,
>> 
>> these are indeed interesting days, to see a scientific argument based on a Wikipedia note!  I wonder who could have written it,
>> as it conveniently omits the true source of the FSC and the one that contains the correct derivation:
>> 
>> Bershad, N. J., and Rockmore, A. J. (1974). On estimating signal-to-noise ratio using the
>> sample correlation coefficient. IEEE Trans. Inf. Theory IT20, 112–113.
>> 
>> The note about N nicely illustrates my point about misleading arguments.  In the real world, the statistical significance of a correlation
>> coefficient depends on both the number of samples and its value, so the reasoning is plain wrong.
>> 
>> Regards,
>> -
>> Pawel Penczek
>> pawel.a.penczek at uth.tmc.edu
>> 
>> 
>> 
>>> On Aug 30, 2015, at 10:58 AM, Marin van Heel <marin.vanheel at googlemail.com> wrote:
>>> 
>>> 
>>> Hi Pawel,
>>> 
>>> I was waiting for your reaction since you have mentioned more than once over a beer that we were wrong with our 2005 paper... I am glad my shaky understanding of statistics was at least good enough to come up with the FRC/FSC concept in the first place (https://en.wikipedia.org/wiki/Fourier_shell_correlation ).  :)
>>>  Now to your only concrete point so far: what size of "N" is big enough?
>>> 
>>> This thread has been going on for a while and you have apparently missed what was discussed earlier, so let me help you out by repeating it below. :)
>>> 
>>> Cheers
>>> Marin
>>> 
>>> ================================================================================
>>> 
>>> The problem with this "N" is that in the context of the FSC, "N" refers rather to the number of complex numbers in a Fourier shell which can be pretty low when close to the origin, say at 5 pixels from the Fourier space origin N~ 125 (4*Pi*R**2; with R=5 → N=~250, divide by 2 because of the Hermitian symmetry). When we are dealing with say an icosahedral structure, there is a 60-fold redundancy within that shell thus N=~125/60 ~= 2; not a very large N at all!  At R=10 … N~8; at R=20 … N~32, etc …   (the expected random correlation sigma 1/√N: 1/√2 =1/1.41; 1/√8 =1/2.82 …) Of course, for a C1 structure, say a ribosome, these N values are much higher and hence the relevance thresholds much lower (VH&S05). Everyone who still claims a fixed value threshold for any structure (icosahedral; C1 or with any other pointgroup symmetry) has a real problem counting!  I suggest all referees out here to no longer accept such ignorance!
>>> 
>>> ===============================================================
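
The counting argument in the excerpt above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the shell is taken to be one voxel thick, the function names `effective_shell_samples` and `random_correlation_sigma` are hypothetical, and the symmetry redundancy is applied as a plain division. Evaluating 4*Pi*R**2/2 literally gives about 157 at R=5 (about 2.6 after the 60-fold icosahedral division), somewhat above the rounded figures quoted in the message, but the scaling with R and with symmetry is the same.

```python
import math


def effective_shell_samples(radius_px, symmetry_order=1):
    """Approximate count of independent complex samples in a Fourier shell.

    Assumes a shell one voxel thick, so the voxel count is roughly the
    surface area of a sphere of the given radius (in Fourier pixels).
    """
    voxels_in_shell = 4.0 * math.pi * radius_px ** 2
    # Hermitian symmetry of the transform of a real map halves the count.
    independent = voxels_in_shell / 2.0
    # Point-group symmetry makes a further fraction of the samples redundant.
    return independent / symmetry_order


def random_correlation_sigma(n_samples):
    """Expected spread (1/sqrt(N)) of the correlation of pure noise."""
    return 1.0 / math.sqrt(n_samples)


for radius in (5, 10, 20):
    for label, order in (("C1", 1), ("icosahedral", 60)):
        n = effective_shell_samples(radius, order)
        sigma = random_correlation_sigma(n)
        print(f"R={radius:2d}  {label:12s} N~{n:7.1f}  sigma~{sigma:.3f}")
```

The loop prints N and the corresponding 1/sqrt(N) for a C1 and an icosahedral particle at the three radii mentioned in the message.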
>>> 
>>>> On 30/08/2015 11:16, Penczek, Pawel A wrote:
>>>> Marin,
>>>> 
>>>> your understanding of statistics is shaky.  You confuse expected value with outcome of a single experiment.  This confusion first surfaced in your paper on the half-bit criterion.
>>>> 
>>>> Most of your statements below can be quantified and shown to be wrong.
>>>> Others stem from hidden assumptions, which are also incorrect.
>>>> 
>>>> For example, what does it mean that N is not large in all cases?  What cases?  How large would satisfy you?
>>>> 
>>>> I am not sure why you picked up this subject and what your point is, but you are not helping out.
>>>> 
>>>> Regards,
>>>> Pawel
>>>> 
>>>> 
>>>>> On Aug 30, 2015, at 9:05 AM, Marin van Heel <marin.vanheel at googlemail.com> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> Hi Alexis,
>>>>> 
>>>>> My final pennies:
>>>>> 
>>>>> (1) We FULLY agree that the B&R74 formula (=F&A75 formula) is wrong when used in connection with a single experiment!
>>>>> 
>>>>> (2) The field of view of the image does not change upon binning, hence the SNR obtained from the same field of view increases upon binning. There is no paradox: the SNR just cannot be interpreted as an information metric on the object you are observing, that is all. We choose to FULLY disagree here.
>>>>> 
>>>>> (3) "N is not so large" in ALL cases of interest. The issue was settled more than a decade ago (but largely ignored since). Defining an agreed standard resolution indicator is not a "marginal" issue, it is like defining the "meter" as the unit of length. Of course, even a well defined resolution value can only be a single-valued indicator of what we will be able to interpret in terms of the 3D map details. But still, we need no further "elastic-band" criteria in cryo-EM. For example, calculating a local resolution map and continuously reducing the comparison area, with the same fixed FSC threshold value, will necessarily lead to absurd results. It leaves too much room for deliberately "polishing" the quality metrics.
>>>>> 
>>>>> (4) Again, this is a one-off experiment that does not tolerate negative SNR values. It is WRONG to interpret this in terms of an expectation value because that does not reflect the experimental reality.
>>>>> 
>>>>> (5) "Of course, signal and noise are not orthogonal, and the cross-terms are not zero." We FULLY agree that this is what really matters!
>>>>> "It's just that the expectation value of these cross terms is zero."  Well yes, that is true but - again - this expectation value is irrelevant in a one-off experiment! You simply cannot apply that to a single experiment. For example, in VH&S(2005) hundreds of such individual FSC plots are plotted on top of each other to illustrate their expected "one-off" behaviour. One would get close to the expectation value of zero if we summed hundreds or thousands of such FSC plots. But that would not be relevant to any individual FSC or SSNR experiment contained in the set.
>>>>> 
>>>>> (6) Alexis, note that in (B&R74) even their formula 15 estimator is only valid "for large N".
>>>>> 
>>>>> (7) What is relevant is that the cross term between signal and noise cannot be neglected. At the level when we state we have collected enough information to achieve some predetermined resolution threshold, the variance of the signal is typically of the same order as that of the noise... Thus the cross terms between the signal and the noise will be of the same order of magnitude as the also "uncorrelated" noise-to-noise cross terms which "everybody" does include in their calculations. For consistency "everybody" should also use the zero expectation value for the noise-to-noise cross terms. In that case, both sides of the formula go to infinity. Ha Ha!
>>>>> 
>>>>> Let us look at all the extremes of the B&R formula (as any physicist is taught to do at university; I have already hammered on the SNR positivity issue):
>>>>> 
>>>>> A) Signal --> 0 (Noise =/ 0) thus: SNR --> 0 and CCC --> Noise-to-Noise cross term; ergo B&R gives WRONG results in any single experiment with a limited number of sampling points N.
>>>>> It can only be made correct if the number of sampling points goes to infinity (which is not our case) or if the experiment is repeated an infinite number of times (which is not our case). In both cases the B&R formula would become correct (but pointless).
>>>>> 
>>>>> B) Noise --> 0 (Signal =/ 0) thus: SNR --> Infinity  and CCC --> 1  thus CCC/(1-CCC) --> Infinity (here one could define B&R to be correct; but irrelevant)
>>>>> 
>>>>> C) In all other cases the Signal-to-Noise cross terms may not be ignored and the B&R74 formula - which excludes these cross-terms - is wrong.
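
The limiting cases A) to C) can be explored numerically with a small Monte Carlo sketch. It rests on simplifying assumptions that are not part of the thread: real-valued Gaussian samples stand in for the complex Fourier coefficients of a shell, the correlation is computed without mean subtraction, and the relation SNR = CCC/(1-CCC) is used exactly as quoted in case B). The names `sample_ccc`, `simulate_cccs` and `snr_from_ccc` are hypothetical. The sketch only shows how much a single-experiment CCC scatters for a given N; it does not settle which interpretation of that scatter is appropriate.

```python
import math
import random
import statistics


def sample_ccc(x, y):
    """Normalized cross-correlation of two real sequences (no mean subtraction)."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den


def snr_from_ccc(ccc):
    """SNR estimate as quoted in the thread: CCC / (1 - CCC)."""
    return ccc / (1.0 - ccc)


def simulate_cccs(n_samples, signal_sigma, noise_sigma, trials, seed=0):
    """CCC of two noisy copies of the same random signal, one value per trial."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        signal = [rng.gauss(0.0, signal_sigma) for _ in range(n_samples)]
        x = [s + rng.gauss(0.0, noise_sigma) for s in signal]
        y = [s + rng.gauss(0.0, noise_sigma) for s in signal]
        results.append(sample_ccc(x, y))
    return results


# Case A: no signal at all, so the true SNR is zero.
for n in (2, 8, 32, 512):
    cccs = simulate_cccs(n, signal_sigma=0.0, noise_sigma=1.0, trials=500)
    negative = sum(1 for c in cccs if c < 0.0)
    print(f"N={n:4d}  mean CCC={statistics.mean(cccs):+.3f}  "
          f"spread={statistics.pstdev(cccs):.3f}  1/sqrt(N)={1 / math.sqrt(n):.3f}  "
          f"negative SNR estimates: {negative}/500")
```

In the signal-free case the mean CCC over many trials stays close to zero while individual values scatter by roughly 1/sqrt(N), and about half of the single-trial SNR estimates come out negative.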
>>>>> 
>>>>> 
>>>>> To summarise: The B&R74 formula (= the F&A75 formula) is wrong or not relevant. In deriving the formula the all-important cross terms between Signal and Noise have been ignored which are crucial for our one-off FSC & SSNR experiments. A fixed-valued FSC threshold criterion is WRONG and leaves space for deliberate manipulation of the resulting metrics.
>>>>> 
>>>>> May the pennies drop!
>>>>> 
>>>>> Marin
>>>>> 
>>>>> Alexis, thank you for insisting on the nitty-gritty! You forced me to be explicit.
>>>>> Maybe that was just what was needed to clean up this mess in cryo-EM.
>>>>> The bottom line is "Marin van Heel" is right and "everybody else" is wrong! Or did I miss something ...?
>>>>> 
>>>>> 
>>>>> https://en.wikipedia.org/wiki/Fourier_shell_correlation
>>>>>  =============================================================
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> 3dem mailing list
>>>>> 3dem at ncmir.ucsd.edu
>>>>> https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem