[3dem] Which resolution?

Ludtke, Steven J. sludtke at bcm.edu
Fri Feb 21 09:19:00 PST 2020


I've been steadfastly refusing to get myself dragged in this time, but with this very sensible statement (which I am largely in agreement with), I thought I'd throw in one thought, just to stir the pot a little more.

This is not a new idea, but I think it is the most sensible strategy I've heard proposed, and it addresses Marin's concerns in a more conventional way. What we are talking about here is the statistical noise present in the FSC curves themselves. Viewed within the framework of traditional error analysis and propagation of uncertainties, which pretty much every scientist should have been familiar with since high school (and which thus would not confuse non-statisticians), the 'correct' solution to this issue is not to adjust the threshold, but to present FSC curves with error bars.

One can then use a fixed threshold at a level based on expectation values, and simply produce a resolution value which also has an associated uncertainty. This is much better than using a variable threshold and still producing a single number with no uncertainty estimate!  Not only does this approach account for the statistical noise in the FSC curve, but it also should stop people from reporting resolutions as 2.3397 Å, as it would be silly to say 2.3397 +- 0.2.
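
A minimal sketch of that idea in Python, under stated assumptions: the FSC curve and its one-sigma error bars are already available per shell (how the error bars are estimated is not specified here), the curve is interpolated linearly between shells, and uncertainties are propagated linearly. The function name and the toy numbers are hypothetical, for illustration only.

```python
def resolution_with_uncertainty(freqs, fsc, fsc_err, threshold=0.143):
    """Return (resolution, sigma) for the first threshold crossing, or None.

    freqs   : spatial frequencies in 1/Angstrom, increasing
    fsc     : FSC value per shell
    fsc_err : one-sigma error bar per shell (assumed to be given)
    """
    for i in range(1, len(freqs)):
        if fsc[i - 1] >= threshold > fsc[i]:
            ds = freqs[i] - freqs[i - 1]
            # Linear interpolation between the two shells bracketing the crossing.
            slope = (fsc[i] - fsc[i - 1]) / ds
            s_cross = freqs[i - 1] + (threshold - fsc[i - 1]) / slope
            # Interpolate the error bar to the crossing point as well.
            t = (s_cross - freqs[i - 1]) / ds
            sigma_fsc = (1 - t) * fsc_err[i - 1] + t * fsc_err[i]
            # Linear propagation: sigma_s = sigma_FSC / |dFSC/ds|.
            sigma_s = sigma_fsc / abs(slope)
            # Resolution d = 1/s, so sigma_d = sigma_s / s^2.
            return 1.0 / s_cross, sigma_s / s_cross ** 2
    return None  # the curve never drops below the threshold


# Toy curve, invented purely to exercise the function.
freqs = [0.40, 0.42, 0.44, 0.46]
fsc = [0.30, 0.20, 0.10, 0.05]
err = [0.03, 0.03, 0.03, 0.03]
res, sigma = resolution_with_uncertainty(freqs, fsc, err)
print(f"{res:.2f} +- {sigma:.2f} Angstrom")
```

Printing the value to the precision of its uncertainty is the point: the extra digits of something like 2.3397 carry no information once the error bar is attached.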

The cross terms are not ignored, but are used in the production of the error bars. This is a very simple approach, which is certainly closer to being correct than the fixed threshold without error-bars approach, and it solves many of the issues we have with the way people report resolution.  Of course we still have people who will insist that 3.2+-0.2 is better than 3.3+-0.2, but there isn't much you can do about them... (other than beat them over the head with a statistics textbook).

The caveat, of course, is that, like all propagation of uncertainty, it is a linear approximation, and the correlation axis isn't linear, so the typical Normal distributions with linear propagation used to justify propagation of uncertainty aren't _strictly_ true. However, the approximation is fine as long as the error bars are reasonably small compared to the -1 to 1 range of the correlation axis. Each individual error bar is computed around its expectation value, so the overall nonlinearity of the correlation isn't a concern.



--------------------------------------------------------------------------------------
Steven Ludtke, Ph.D. <sludtke at bcm.edu>                      Baylor College of Medicine
Charles C. Bell Jr., Professor of Structural Biology
Dept. of Biochemistry and Molecular Biology                      (www.bcm.edu/biochem)
Academic Director, CryoEM Core                                        (cryoem.bcm.edu)
Co-Director CIBR Center                                    (www.bcm.edu/research/cibr)



On Feb 21, 2020, at 10:34 AM, Alexis Rohou <a.rohou at gmail.com> wrote:

Hi all,

For those bewildered by Marin's insistence that everyone's been messing up their stats since the bronze age, I'd like to offer my understanding of the situation. More details in this thread from a few years ago on the exact same topic:
https://mail.ncmir.ucsd.edu/pipermail/3dem/2015-August/003939.html
https://mail.ncmir.ucsd.edu/pipermail/3dem/2015-August/003944.html

Notwithstanding notational problems (e.g. strict equations as opposed to approximation symbols, or omission of symbols to denote estimation), I believe Frank & Al-Ali and "descendant" papers (e.g. appendix of Rosenthal & Henderson 2003) are fine. The cross terms that Marin is agitated about do indeed have an expectation value of 0.0 (in the ensemble; if the experiment were performed an infinite number of times with different realizations of noise). I don't believe Pawel or Jose Maria or any of the other authors really believe that the cross-terms are orthogonal.

When N (the number of independent Fourier voxels in a shell) is large enough, mean(Signal x Noise) ~ 0.0 is only an approximation, but a pretty good one, even for a single FSC experiment. This is why, in my book, derivations that depend on Frank & Al-Ali are OK, under the strict assumption that N is large. Numerically, this becomes apparent when Marin's half-bit criterion is plotted - asymptotically it has the same behavior as a constant threshold.
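
A quick numerical check of that large-N argument, as a sketch only. It assumes signal and noise are independent, real-valued, unit-variance Gaussian samples (real Fourier voxels are complex and the signal is not Gaussian), and the function names are hypothetical. Under those assumptions the mean cross term in a single realization has zero expectation and a spread of about 1/sqrt(N).

```python
import math
import random


def mean_cross_term(n, rng):
    """Mean of signal*noise over n independent samples (one 'shell')."""
    return sum(rng.gauss(0, 1) * rng.gauss(0, 1) for _ in range(n)) / n


def rms_cross_term(n, trials=2000, seed=0):
    """RMS of the mean cross term over many independent noise realizations."""
    rng = random.Random(seed)
    total = sum(mean_cross_term(n, rng) ** 2 for _ in range(trials))
    return math.sqrt(total / trials)


# Compare the simulated spread with the 1/sqrt(N) prediction.
for n in (10, 100, 1000):
    print(n, round(rms_cross_term(n, trials=300), 3), round(1 / math.sqrt(n), 3))
```

The cross term is never exactly zero in one experiment, but its typical size shrinks as the shell fills with more independent voxels, which is why the approximation is harmless at large N and questionable at small N.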

So, is Marin wrong to worry about this? No, I don't think so. There are indeed cases where the assumption of large N is broken. And under those circumstances, any fixed threshold (0.143, 0.5, whatever) is dangerous. This is illustrated in figures of van Heel & Schatz (2005). Small boxes, high-symmetry, small objects in large boxes, and a number of other conditions can make fixed thresholds dangerous.
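
To make the small-N danger concrete, here is a rough simulation. It is not a reproduction of the van Heel & Schatz figures. It assumes real-valued Gaussian noise in both half-maps and no signal at all, and counts how often a shell with N independent samples exceeds a fixed threshold purely by chance. The function names and parameter values are hypothetical.

```python
import math
import random


def noise_correlation(n, rng):
    """FSC-style normalized cross-correlation of two pure-noise shells."""
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den


def false_pass_rate(n, threshold=0.143, trials=1000, seed=1):
    """Fraction of pure-noise shells whose correlation exceeds the threshold."""
    rng = random.Random(seed)
    hits = sum(noise_correlation(n, rng) > threshold for _ in range(trials))
    return hits / trials


# Small shells pass the fixed threshold by chance far more often than large ones.
for n in (20, 200, 2000):
    print(n, false_pass_rate(n, trials=200))
```

With only a few tens of independent samples, a sizeable fraction of pure-noise shells clears 0.143, while with thousands of samples essentially none do.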

It would indeed be better to use a non-fixed threshold. So why am I not using the 1/2-bit criterion in my own work? While numerically it behaves well at most resolution ranges, I was not convinced by Marin's derivation in 2005. Philosophically though, I think he's right - we should aim for FSC thresholds that are more robust to the kinds of edge cases mentioned above. It would be the right thing to do.

Hope this helps,
Alexis



On Sun, Feb 16, 2020 at 9:00 AM Penczek, Pawel A <Pawel.A.Penczek at uth.tmc.edu> wrote:
Marin,

The statistics in the 2010 review is fine. You may disagree with the assumptions, but I can assure you the “statistics” (as you call it) is fine. Careful reading of the paper would reveal this much to you.

Regards,
Pawel

On Feb 16, 2020, at 10:38 AM, Marin van Heel <marin.vanheel at googlemail.com> wrote:


Dear Pawel and All others ....

This 2010 review is - unfortunately - largely based on the flawed statistics I mentioned before, namely on the a priori assumption that the inner product of a signal vector and a noise vector is ZERO (an orthogonality assumption).  We have refuted the (Frank & Al-Ali 1975) paper on a number of occasions (for example in 2005, and most recently in our BioRxiv paper), but you still take that as the correct relation between SNR and FRC (and you never cite the criticism...).
Sorry
Marin

On Thu, Feb 13, 2020 at 10:42 AM Penczek, Pawel A <Pawel.A.Penczek at uth.tmc.edu> wrote:
Dear Teige,

I am wondering whether you are familiar with

Resolution measures in molecular electron microscopy.
Penczek PA. Methods Enzymol. 2010.
Methods Enzymol. 2010;482:73-100. doi: 10.1016/S0076-6879(10)82003-8.

You will find there answers to all questions you asked and much more.

Regards,
Pawel Penczek

_______________________________________________
3dem mailing list
3dem at ncmir.ucsd.edu
https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem
