<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix"><br>

      Hi Alexis<br>

      <br>

      What now? My own (ex-) students questioning me? Well I guess I

      need to see that leniently and interpret this as a success: having

      educated a new generation of independent critical scientists… 

      HeHe. So, no hard feelings, but you do force me to go into much

      more detail here than I had anticipated, so let’s start:<br>

      <br>

      1)    You correctly discuss and interpret the SNR/CCC issue in the
      context of the original work by Bershad & Rockmore (1974), from
      which Frank & Al-Ali (1975) took the formulas. The
      detailed considerations of B&R(74) have not all survived into
      the F&A(75) manuscript. I mention this explicitly because the
      formula “SNR = (CCC/(1-CCC))” is used in cryo-EM today
      as if it were an exact relation between the CCC and the SNR! All
      subsequent papers from the Frank group cite only the F&A(75)
      paper, and these typically replace the “~” by “=” and apply the
      formula to everything, effectively without any restriction (as does
      almost everybody else in the cryo-EM field). It also does not help that
      the only reference by the Frank group to the vH&S (2005) paper (in
      which we criticize the F&A(75) paper) is by Liao
      & Frank (2010), in a review about resolution. Our 2005 paper is
      cited incorrectly there (quoting the opposite of what we actually say) and
      on a subject not related to our criticism!<br>

      <br>

      2)    One more thing, Alexis, before I go into the issues you raise.
      The F&A(75) paper is based on the assumption – as are very
      many papers in image processing – that the real-space SNR of an
      image is a measure of its “significant information” content. That
      assumption is simply wrong. A straightforward counter-example:
      take two 4Kx4K standard cryo-EM images of the same vitreous ice
      sample and calculate/guesstimate their SNR (SNR-4096). Then bin the
      two images to 2Kx2K each by averaging four pixels into one and
      calculate SNR-2048; do that again to find SNR-1024, SNR-512,
      etc., until you reach SNR-1. Each of these binned images obviously
      looks at the same overall area of the sample, but you will see
      that: …  SNR-128 > SNR-256 > SNR-512 > SNR-1024 >
      SNR-2048 > SNR-4096.  Does that mean the information content of
      the binned images is better than that of the un-binned originals?? NO!
      Because the high-frequency (Poisson) noise tends to be damped by
      the binning operation whereas the more low-frequency sample signal
      is largely preserved, the SNR of the binned images is better than that
      of the originals. If the image’s real-space SNR were a true
      information-content metric, we would all first bin down our
      precious 4Kx4K direct-electron cryo-EM images to 512x512 or even
      256x256 before starting our real data processing. Ergo: the
      real-space SNR is not an information-content metric and the a
      priori assumption upon which F&A(75) is based is thus
      incorrect. F&A(75) also argue that N, the number of pixels, is
      ~10,000 (in 1975; more like N ~ 16,000,000 now) and that
      the approximations are thus valid. We will see below that this is an
      entirely inappropriate argument in the context of the FSC.<br>

      <br>
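      A minimal sketch of this binning argument in Python (standard library only). The smooth test signal, the white Gaussian noise standing in for shot noise, the 64x64 starting size and all function names are illustrative assumptions. The SNR is taken here simply as signal variance over noise variance, with signal and noise binned separately because binning is linear:<br>
      <pre>
```python
import math
import random


def bin2x2(img):
    """Average each 2x2 block of pixels into a single pixel."""
    n = len(img) // 2
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(n)] for r in range(n)]


def variance(img):
    """Population variance over all pixels of a 2D list."""
    vals = [v for row in img for v in row]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def snr_series(size=64, noise_sigma=3.0, seed=0):
    """Return (image size, SNR) pairs for successive 2x2 binning steps."""
    rng = random.Random(seed)
    # Assumed low-frequency "sample" signal: one period across the field.
    signal = [[math.sin(2 * math.pi * r / size) + math.cos(2 * math.pi * c / size)
               for c in range(size)] for r in range(size)]
    # Assumed white Gaussian noise as a stand-in for high-frequency shot noise.
    noise = [[rng.gauss(0.0, noise_sigma) for _ in range(size)]
             for _ in range(size)]
    series = []
    while len(signal) >= 8:
        series.append((len(signal), variance(signal) / variance(noise)))
        signal, noise = bin2x2(signal), bin2x2(noise)
    return series


for size, snr in snr_series():
    print(size, round(snr, 3))
```
      </pre>
      Under these assumptions the noise variance drops by roughly a factor of four per binning step while the slowly varying signal is almost unchanged, so the SNR rises at every step although binning only discards data.<br>
      <br>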

      3)    In the context of the FSC, “N” refers rather to the number
      of complex numbers in a Fourier shell, which can be pretty low
      close to the origin: at 5 pixels from the Fourier-space origin,
      N ≈ 157 (4πR² with R = 5 gives ≈ 314; divide that by 2
      because of the Hermitian symmetry). When we are also dealing
      with an icosahedral structure, there is a 60-fold redundancy
      within that shell, thus N ≈ 157/60 ≈ 2.6; not a very large N at
      all!  At R = 10, N ≈ 10; at R = 20, N ≈ 42, etc. (the expected random
      correlation sigma is 1/√N: 1/√2.6 ≈ 0.62; 1/√10.5 ≈ 0.31 …). Of course,
      for a C1 structure these N values are 60 times higher and hence
      the relevance thresholds much lower (VH&S05). Everyone who
      claims a fixed-value threshold for any structure (icosahedral, C1,
      or any other point-group symmetry) has a real problem
      counting!  I hope all referees out there take good notice and
      think twice before accepting flawed fixed-valued thresholds!<br>

      <br>
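      The counting can be sketched in a few lines of Python (standard library only). The function names are illustrative, the shell is assumed to be one voxel thick, and the symmetry redundancy is applied as a plain division, which is only a rough approximation for the innermost shells:<br>
      <pre>
```python
import math


def shell_samples(radius, symmetry_order=1):
    """Approximate count of independent Fourier voxels in a shell.

    radius is in Fourier pixels; symmetry_order is 1 for C1 and 60 for
    an icosahedral particle.
    """
    full_shell = 4.0 * math.pi * radius ** 2  # surface of a one-voxel-thick shell
    unique = full_shell / 2.0                 # Hermitian symmetry halves the count
    return unique / symmetry_order            # point-group redundancy


def random_sigma(n):
    """Expected spread of a random correlation over n independent samples."""
    return 1.0 / math.sqrt(n)


for radius in (5, 10, 20):
    n_ico = shell_samples(radius, symmetry_order=60)
    n_c1 = shell_samples(radius)
    print(radius,
          round(n_ico, 1), round(random_sigma(n_ico), 2),
          round(n_c1), round(random_sigma(n_c1), 3))
```
      </pre>
      Evaluated exactly, 4πR²/2 is about 157 at R = 5, or about 2.6 after dividing by 60 for an icosahedral particle. N stays of order unity near the origin, so the expected random correlation 1/√N is far from negligible there.<br>
      <br>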

      4)    Finally, back to your points, Alexis, and to the original
      B&R(74) work. You mention B&R(74) formula #6. I had to
      look up the paper again… Actually the formula appears twice there
      with slightly different definitions. The first is formula #6, which
      is the desired one I cited in my first email; the second
      (formula #7) is the real “estimator” you are referring to, I
      believe. I agree with you that putting in CCC = -1 is a bit of an
      extreme example and maybe not in sync with the gist of
      B&R(74). However, it is fully in sync with the interpretation
      given to the formula since F&A(75). You do not discuss my main
      point here, namely what happens when the CCC fluctuates around the
      zero mark: the estimator (formula #7) will then yield negative values
      ~50% of the time, and that can only be corrected by repeating the
      experiment an infinite number of times, which would then lead to an
      exact SNR = 0. One can only hope your cryo-EM sample will survive
      the necessary infinite number of experiments/exposures! Sorry
      Alexis, your argument that negative SNRs should be accepted
      since they will eventually average out to a real SNR value of “0” is
      very far-fetched (even though mathematically justifiable...). Since
      FSC experiments are typically one-off experiments, I’d rather stay
      with two feet on the ground and not follow you along this path. We
      are dealing here with low “N” values in one-off FSC experiments.<br>

      <br>
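      A small simulation of this point, as a hedged sketch in Python (standard library only). It assumes real-valued Gaussian samples in place of the complex Fourier coefficients of an actual FSC shell, uses the simple form CCC/(1-CCC) rather than the exact B&R(74) formula #7, and all names and parameter values are illustrative:<br>
      <pre>
```python
import math
import random


def ccc(x, y):
    """Normalised cross-correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def snr_estimate(c):
    """SNR estimate from a correlation coefficient, CCC / (1 - CCC)."""
    return c / (1.0 - c)


def negative_fraction(n_samples=8, trials=4000, seed=1):
    """Fraction of pure-noise experiments that give a negative SNR estimate."""
    rng = random.Random(seed)
    negatives = 0
    for _ in range(trials):
        # Two independent noise-only measurements: the true SNR is zero.
        x = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        if 0.0 > snr_estimate(ccc(x, y)):
            negatives += 1
    return negatives / trials


print(negative_fraction())
```
      </pre>
      With no common signal the CCC is distributed symmetrically around zero, so about half of the one-off experiments return a negative SNR estimate, whatever the (small) number of samples per shell.<br>
      <br>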

      5)    Nevertheless, this is not where the real problem is! The

      real problem in terms of the B&R(74) paper surfaces earlier,

      namely at the level of formulas #4 and #5A which are the ones

      discussed in B&R(74) that are closest to the FRC/FSC formulas

      (VH&S05). These formulas still contain the CROSS TERMS between

      the (constant) signal and the random noise (x<sub>i</sub> and y<sub>i</sub> each contain
      both the signal and the noise). In formula #6 (and possibly in
      #7) these cross terms have vanished.<br>

      <br>

      6)    The basic problem in all these derivations is the inner
      product between signal and noise: s(t).n(t) (in B&R
      speak).  This we discussed extensively in VH&S05. People first state
      that the signal and the noise are UNCORRELATED and that thus
      s(t).n(t) = 0! (See the many references discussed in VH&S05,
      including the Rosenthal & Henderson 2003 “0.143” FSC paper). A
      zero inner product actually means that these two vectors are
      assumed to be ORTHOGONAL, not UNCORRELATED. By the same token,
      two realizations n(t) and n’(t) of the UNCORRELATED noise
      would also have to be ORTHOGONAL, and n(t).n’(t) should
      thus also be identical to zero! Here suddenly everybody agrees,
      correctly, that n(t).n’(t) ~ 1/√N. It doesn’t help that the concepts of
      “correlated”, “uncorrelated”, “independent” have various and often
      conflicting definitions in the statistical literature. So let us
      rather stick to the clean, unambiguous definitions of
      orthogonality and non-orthogonality. (“Two vectors are orthogonal
      if and only if their inner product is zero.”) Then let us
      henceforward ask ourselves two clean questions: “Is our signal
      orthogonal to each of our noise vectors?” and “Is one realization
      of the noise vector orthogonal to any other noise-vector
      realization?”<br>

      <br>
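      Both questions can be probed numerically. The following is a minimal Python sketch (standard library only) that assumes an arbitrary fixed real-valued signal and white Gaussian noise; the function names and parameter values are illustrative:<br>
      <pre>
```python
import math
import random


def normalised_inner_product(a, b):
    """Inner product of a and b divided by the product of their norms."""
    dot = sum(p * q for p, q in zip(a, b))
    return dot / math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))


def rms_inner_products(n, trials=1000, seed=2):
    """RMS normalised inner product: (signal vs noise, noise vs noise)."""
    rng = random.Random(seed)
    signal = [math.sin(0.7 * i) for i in range(n)]  # arbitrary fixed signal
    sum_sn, sum_nn = 0.0, 0.0
    for _ in range(trials):
        noise_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        noise_b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sum_sn += normalised_inner_product(signal, noise_a) ** 2
        sum_nn += normalised_inner_product(noise_a, noise_b) ** 2
    return math.sqrt(sum_sn / trials), math.sqrt(sum_nn / trials)


for n in (4, 100):
    signal_noise, noise_noise = rms_inner_products(n)
    print(n, round(signal_noise, 3), round(noise_noise, 3),
          round(1.0 / math.sqrt(n), 3))
```
      </pre>
      In this sketch neither inner product vanishes: both the signal-noise and the noise-noise terms scatter around zero with an RMS of about 1/√N, which is small for N = 100 but not for the handful of samples in a shell close to the origin.<br>
      <br>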

      7)    If the answer to either of these questions by any cryo-EM
      scientist is “YES”, I will not accept that person as a friend on
      Facebook! (That was a joke).<br>

      <br>

      8)    Sorry Smith Liu, this may have been far more than you
      bargained for, but the take-home lesson is: fixed-valued FSC
      thresholds are mathematically wrong and must be avoided for the
      sake of science. They confuse newcomers to the field, and the
      continued use of incorrect statistics will continue to damage the
      reputation of the cryo-EM field.<br>

      <br>

      Hope this helps,<br>

      <br>

      Marin<br>

      <br>

      ===========================================================<br>

      <br>

      On 12/08/2015 15:17, Alexis Rohou wrote:<br>

    </div>

    <blockquote cite="mid:55CB8DA6.7030006@gmail.com" type="cite">


      Hi Marin,<br>

      <br>

      So many tasty worms in there. As you & others already know, I

      agree with you on the dangers of fixed-threshold criteria. <br>

      <br>

      However, on the topic of the “SNR = (CCC/(1-CCC))” formula, I am

      not convinced by your argument involving CC=-1.<br>

      <br>

      The reason is that this formula is really an <i>estimator</i> for
      the true, unknown SNR. This is explicitly stated by Bershad
      & Rockmore (1974), on whose work Frank & Al-Ali (1975) build,
      as well as by Frank & Al-Ali themselves. See B&R

      (1974) equation 6, where the left-hand-side is an estimate for SNR

      (alpha circumflex in their notation) based on the right hand-side,

      which involves the sample cross-correlation (r in their notation).

      Or, indeed, see the title: "On estimating signal-to-noise ratio

      using the sample correlation coefficient".<br>

      <br>

      To put it bluntly, estimators should be expected to "fail" or "get

      it wrong" sometimes (i.e. if used after a single, one-off

      experiment). Thankfully, B&R derived estimates for the

      variance (error) of their estimator.  Saxton (1978) also derives

      this, and Pawel Penczek has a nice & detailed review (2010

      Methods Enzym) of confidence intervals that can be derived from

      such estimator variances. <br>

      <br>

      If the sample CC (FSC in a particular shell) comes out as -1,

      either (1) the fundamental assumption B&R used to derive the

      estimator, namely that we are measuring two noise-corrupted

      versions of the same signal, was violated and we shouldn't be

      using this estimator at all, or (2) the specific occurrences of

      the noise in our two measurements conspired to give us exactly

      anti-correlated measurements. If the number of measurements is not

      tiny, this is incredibly, incredibly unlikely. Therefore, no

      matter what the SNR estimator says (-0.5 in your example), it's OK

      that the truth is very different since if we were to repeat the

      experiment we would almost never, ever get the same (CC=-1) result

      again. <br>

      <br>

      In fact, according to B&R, if we repeated the experiment an

      infinite number of times, the average estimate would be exactly

      correct (if you used their unbiased estimator, but even the one

      you mention is basically fine). The CC=-1 measurement would just

      be seen as a freak outlier. The distribution of estimates can be

      characterized, and this freak measurement would be way out in the

      tail.<br>

      <br>

      I find no reason (yet?) to believe that B&R's estimator is

      wrong.<br>

      <br>

      Cheers,<br>

      Alexis<br>

      <br>

      <br>

      <br>

      -- <br>

      Alexis Rohou<br>

      <br>

      Research Specialist<br>

      Grigorieff Lab<br>

      <a moz-do-not-send="true" href="http://grigoriefflab.janelia.org">http://grigoriefflab.janelia.org</a><br>

      Tel. +1 571 209 4000 x3485<br>

      <br>

      <br>

      <div class="moz-cite-prefix">On 08/12/2015 08:19 AM, Marin van

        Heel wrote:<br>

      </div>

      <blockquote cite="mid:55CB39E0.7020306@googlemail.com" type="cite">


        <div class="moz-cite-prefix"><br>

          Dear Smith Liu,<br>

          <br>

          You have hit upon a can of worms here… Although the FRC/FSC

          metrics we introduced in 1982/1986 [1, 2] are now considered

          the "gold standard" cryo-EM resolution criterion, these

          resolution issues continue to be heavily debated [3]. Many FSC

          add-ons/variants and tangential issues such as “reference

          bias” have been inserted into the resolution criterion

          discussion. These discussions unfortunately confuse even

          established researchers (referees of major journals…), let

          alone newcomers to the field. Many believe the resolution

          issue is better resolved in X-ray crystallography. In fact, the

          FSC is arguably a better metric than the R-factor, the

          generally accepted resolution metric in X-ray crystallography

          [4]. Fortunately, FRC/FSC criteria are now slowly also

          becoming the standard in optical microscopy, X-ray microscopy,

          X-ray crystallography, and other fields of 2D/3D imaging.<br>

          <br>

          The most controversial part of the FSC discussion is the FSC

          threshold value to serve as a resolution criterion (such as

          the FSC 0.5 value you mention). It took more than a decade to

          remove the mathematically flawed DPR (Differential Phase

          Residual) from the literature, after I explicitly discussed

          its shortcomings and proposed a corrected phase residual in

          1987 [3]. The discussion in the field then shifted
          towards the FSC threshold at which one defines the average

          resolution of a 3D structure. The “0.5” “criterion” was just

          postulated ad hoc, without any scientific justification. Ten

          years ago, we argued that all fixed-valued FSC threshold

          criteria (such as: “0.5” and “0.143”) are based on flawed

          statistics [5]. Virtually all more formal justifications for

          resolution criteria start off referring to the old formula

          “SNR = (CCC/(1-CCC))” by Frank & Al-Ali  1975 [6].

          Unfortunately this formula is also mathematically incorrect as

          was discussed previously [5]. <br>

          <br>

          Here is another very simple argument to illustrate its flawed
          definition: the normalised CCC (or FSC) has values in the
          range −1 ≤ CCC ≤ +1, whereas the SNR (= S²/N²) is, by
          definition, non-negative. Now insert the value CCC = -1, the case
          of perfectly anti-correlated data, into the formula. This
          yields SNR = “-0.5”, a rampant violation of the SNR
          definition range. The formula could be valid in the limiting
          case where the CCC is close to unity, but such high correlation
          values are not relevant in the resolution-threshold context.
          For uncorrelated signals/noise the CCC oscillates around the
          zero mark and, through the flawed Frank & Al-Ali formula,
          produces as many erroneous negative SNR values as positive
          ones.<br>

          <br>
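          As a quick check of the arithmetic, here is a minimal Python sketch that evaluates the quoted formula at a few correlation values. The function name and the chosen sample values are illustrative assumptions:<br>
          <pre>
```python
def snr_from_ccc(ccc):
    """Evaluate the quoted formula SNR = CCC / (1 - CCC)."""
    if ccc == 1.0:
        raise ValueError("the formula diverges at CCC = 1")
    return ccc / (1.0 - ccc)


for value in (-1.0, -0.2, 0.0, 0.2, 0.5):
    print(value, round(snr_from_ccc(value), 4))
```
          </pre>
          Every negative CCC maps to a negative "SNR", with CCC = -1 giving -0.5, while CCC = 0.5 gives an SNR of 1.<br>
          <br>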

          Unfortunately, virtually all (~100?) papers on resolution

          criteria and validation tests in cryo-EM (from friends and

          foes) are based on this formula and are thus based on “flawed

          statistics” to say the least. With the great recent success of

          cryo-EM, everybody appears to have stopped thinking about the

          basics, and merrily continues to refer to incorrect stuff while

          focusing on “my resolution is better than yours”. After

          decades of funny jokes and verbal FSC controversies at GRC

          meetings, I don’t find it so funny anymore: it is time to

          clean up the mess. I have lost the patience to discuss these

          issues with referees who continue to consider the subject as

          debatable. Questionable actions are sometimes hidden behind
          this controversy, such as in Mao & Sodroski [7], who
          cynically accuse us - their critics - of not knowing how to
          interpret the FSC: “FSC estimates of resolution are known to
          be quite sensitive to statistical bias …” etc. etc.  As I
          said, this whole issue is no longer amusing; it has become a
          matter of the debatable scientific culture (integrity?) in the
          cryo-EM field. <br>

          <br>

          Oh, by the way, Smith Liu, what I really was going to say when

          I started typing an answer to your question is that if you are

          new to the field it is a good idea to read some basic

          literature in Fourier Optics. Maybe my lecture notes can help

          [8]. The horizontal axis of the FSC curve is spatial frequency,
          i.e. 1/resolution (we
          are in Fourier space), and the FSC values in the curve indicate
          the cross-correlation level at that level of resolution (=
          inside that specific 3D Fourier shell).<br>

          <br>

          Hope this helps,<br>

          <br>

          Marin<br>

          <br>

          [1] Van Heel M, Keegstra W, Schutter W, van Bruggen EFJ:

          Arthropod hemocyanin structures studied by image analysis <a
            moz-do-not-send="true" class="moz-txt-link-freetext"
            href="http://singleparticles.org/methodology/MvH_FRC_Leeds_1982.pdf">http://singleparticles.org/methodology/MvH_FRC_Leeds_1982.pdf</a><br>

          [2] Harauz G & van Heel M: Exact filters for general

          geometry three dimensional reconstruction, Optik 73 (1986)

          146-156<br>

          [3] Van Heel M: Similarity measures between images.
          Ultramicroscopy 21 (1987) 95-100.<br>
          [4] Van Heel M: Unveiling
          ribosomal structures: the final phases. Current Opinion in
          Structural Biology 10 (2000) 259-264.<br>

          [5] Van Heel M & Schatz M:  Fourier Shell Correlation

          Threshold Criteria, J. Struct. Biol. 151 (2005) 250-262<br>

          [6] Frank J & Al-Ali L:  Signal-to-noise ratio of electron

          micrographs obtained by cross correlation. Nature (1975)<br>

          [7] Mao Y, Castillo-Menendez LR, Sodroski JG: Reply to

          Subramaniam, van Heel, and Henderson: Validity of the

          cryo-electron microscopy structures of the HIV-1 envelope

          glycoprotein complex. PNAS 2013 <a moz-do-not-send="true"

            class="moz-txt-link-abbreviated"

            href="http://www.pnas.org/cgi/doi/10.1073/pnas.1316666110">www.pnas.org/cgi/doi/10.1073/pnas.1316666110</a><br>

          [8] Van Heel:  Principles of Phase Contrast (Electron)

          Microscopy. <a moz-do-not-send="true"

            class="moz-txt-link-freetext"

href="http://www.single-particles.org/methodology/MvH_Phase_Contrast.pdf">http://www.single-particles.org/methodology/MvH_Phase_Contrast.pdf</a><br>

          <br>

          ===========================================<br>

          <br>

          <br>

          <br>

          On 08/08/2015 07:45, Smith Liu wrote:<br>

        </div>

        <blockquote

          cite="mid:10144882.10d59.14f0ceab82b.Coremail.smith_liu123@163.com"

          type="cite">

          <div

            style="line-height:1.7;color:#000000;font-size:14px;font-family:Arial">

            <div style="LINE-HEIGHT: 1.7; FONT-FAMILY: Arial; COLOR:

              #000000; FONT-SIZE: 14px">

              <div>Dear All, </div>

              <div><br>

              </div>

              <div>I know the x-axis of the FSC curve is the reciprocal
                of the resolution, and that the x-axis value
                at which the FSC equals 0.5 is usually regarded as the
                reciprocal of the resolution of the whole EM map.</div>
              <div><br>
              </div>
              <div>What I do not understand is the meaning of the resolution on
                the x-axis. The whole map has only one resolution,
                corresponding to FSC 0.5, so why does the x-axis cover
                a range of resolutions (for example, from
                resolution 0 to 20 A, or the reciprocal of that range)? Is
                it because different parts of the map have different
                resolutions (caused by different parts of the map having
                different quality), or is it because the x-axis of the
                FSC curve has some relation to the Fourier shell? If the
                x-axis of the FSC is a property related to the Fourier
                shell, then what is the relation between the resolution (or
                its reciprocal) on the x-axis and the Fourier shell (and,
                in addition, what is the Fourier shell)?</div>

              <div><br>

              </div>

              <div>Best regards.</div>

              <div><br>

              </div>

              <div>Smith</div>

              <div> </div>

            </div>

            <br>

          </div>

          <br>

          <br>


        </blockquote>

        <br>

        <br>

        <br>


        <br>

        <pre wrap="">_______________________________________________

3dem mailing list

<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:3dem@ncmir.ucsd.edu">3dem@ncmir.ucsd.edu</a>

<a moz-do-not-send="true" class="moz-txt-link-freetext" href="https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem">https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem</a>

</pre>

      </blockquote>

      <br>

    </blockquote>

    <br>

    <br>

  </body>

</html>