<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix"><br>
Hi Alexis<br>
<br>
What now? My own (ex-) students questioning me? Well I guess I
need to see that leniently and interpret this as a success: having
educated a new generation of independent critical scientists…
HeHe. So, no hard feelings, but you do force me to go into much
more detail here than I had anticipated, so let’s start:<br>
<br>
1) You correctly discuss and interpret the SNR/CCC issue in the
context of the original work by Bershad & Rockmore (1974), from
which Frank & Al-Ali (1975) took the formulas. The
detailed considerations of B&R(74) have not all survived into
the F&A(75) manuscript. I mention this explicitly because the
actual use of the formula “SNR = (CCC/(1-CCC))” in cryo-EM today
is as if this were a true formula relating the CCC to SNR! All
subsequent papers from the Frank group cite only the F&A(75)
paper and these typically drop the “~” for a “=” and apply it to
everything effectively without any restriction (as does almost
everybody else in the cryo-EM field). It also doesn't help that
the only reference to the vH&S (2005) paper (in which we
criticize the F&A(75) paper) by the Frank group is by Liao
& Frank (2010) in a review about resolution. Our 2005 paper is
cited falsely (quoting the opposite of what we actually say) and
on a subject not related to our criticism!<br>
<br>
2) One more thing Alexis, before I go into the issues you raise.
The F&A (75) paper is based on the assumption – as are very
many papers in image processing – that the real-space SNR of an
image is a measure of its “significant information” content. That
assumption is simply wrong. A straightforward counter-example:
take two 4Kx4K standard cryo-EM images of the same vitreous ice
sample and calculate/guesstimate their SNR (SNR-4096). Then bin the
two images to 2Kx2K each by averaging four pixels into one and
then calculate SNR-2048; do that again to find SNR-1024; SNR-512,
etc. until you reach SNR-1. Each of these binned images obviously
looks at the same overall area of the sample, but you will see
that: … SNR-128 > SNR-256 > SNR-512 > SNR-1024 >
SNR-2048 > SNR-4096. Does that mean the information content of
the binned images is better than that of un-binned originals?? NO!
Because the high-frequency (Poisson) noise tends to be damped by
the binning operation whereas the more low-frequency sample signal
is reinforced, the SNRs of the binned images are better than those
of the originals. If the image’s real-space SNR were a real
information-content metric, we would all first bin down our
precious 4Kx4K direct-electron cryo-EM images to 512x512 or even
256x256 before starting our real data processing. Ergo: the
real-space SNR is not an information-content metric and the a
priori assumption upon which F&A(75) is based is thus
incorrect. F&A(75) also argue that N, the number of pixels, is
~10,000 (in 1975; more like N ~ 16,000,000 now) and that thus
the approximations are valid. We will see below that this is an
entirely inappropriate argument in the context of the FSC.<br>
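<br>
The binning argument can be checked numerically. What follows is only a minimal sketch under stated assumptions: Gaussian white noise stands in for Poisson noise, the smooth cosine "sample" is hypothetical, the SNR is computed from the known signal (which is not available for real micrographs), and all function names are illustrative.<br>
<pre>
```python
import numpy as np

def bin2x2(img):
    """Average each 2x2 block of pixels into one pixel."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def real_space_snr(signal, noisy):
    """Signal variance divided by noise variance (noise = noisy - signal)."""
    return signal.var() / (noisy - signal).var()

rng = np.random.default_rng(0)
n = 256
yy, xx = np.mgrid[0:n, 0:n]
# Hypothetical low-frequency "sample" (period 64 pixels) plus white noise.
signal = np.cos(2 * np.pi * xx / 64.0) + np.cos(2 * np.pi * yy / 64.0)
noisy = signal + rng.normal(0.0, 5.0, signal.shape)

snrs = {}
for size in (256, 128, 64, 32, 16):
    snrs[size] = real_space_snr(signal, noisy)
    print(f"SNR-{size}: {snrs[size]:.3f}")
    # Bin both images for the next, coarser sampling.
    signal, noisy = bin2x2(signal), bin2x2(noisy)
```
</pre>
In this toy setting the real-space SNR rises at every binning step, although binning cannot add any information to the image.<br>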
<br>
3) In the context of the FSC, “N” refers rather to the number
of complex numbers in a Fourier shell, which can be pretty low
close to the origin. Say, at 5 pixels from the Fourier-space origin,
N ≈ 157 (4πR² with R = 5 gives ≈ 314; divide that by 2
because of the Hermitian symmetry). When we are then also dealing
with an icosahedral structure, there is a 60-fold redundancy
within that shell, thus N ≈ 157/60, i.e. N ≈ 3; not a very large N at
all! At R = 10, N ≈ 10; at R = 20, N ≈ 42, etc. (the expected random
correlation sigma is 1/√N: 1/√3 ≈ 0.58; 1/√10 ≈ 0.32, …). Of course,
for a C1 structure, these N values are 60 times higher and hence
the relevance thresholds much lower (VH&S05). Everyone who
claims a fixed value threshold for any structure (icosahedral; C1
or with any other pointgroup symmetry) has a real problem
counting! I hope all referees out there take good notice and
think twice before accepting flawed fixed-valued thresholds!<br>
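<br>
The counting argument can be written out as a short calculation. This is a rough sketch that takes the shell surface as 4πR² Fourier voxels, halves it for Hermitian symmetry and divides by the point-group order; the function name is illustrative and the formula is only the order-of-magnitude estimate used in this paragraph, not an exact voxel count.<br>
<pre>
```python
import math

def shell_samples(radius, symmetry_order=1):
    """Approximate number of independent complex samples in a Fourier shell.

    Shell surface 4*pi*R^2 voxels, halved for Hermitian symmetry,
    divided by the point-group order (60 for icosahedral, 1 for C1).
    """
    return 4.0 * math.pi * radius ** 2 / 2.0 / symmetry_order

for radius in (5, 10, 20):
    for name, order in (("C1", 1), ("icosahedral", 60)):
        n = shell_samples(radius, order)
        sigma = 1.0 / math.sqrt(n)   # expected random-correlation level
        print(f"R={radius:2d} {name:12s} N~{n:7.1f}  1/sqrt(N)~{sigma:.2f}")
```
</pre>
The random-correlation level 1/√N differs by a factor of √60 between the C1 and icosahedral cases at the same radius, which is why a single fixed threshold cannot serve both.<br>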
<br>
4) Finally back to your points Alexis and to the original
B&R(74) work. You mention B&R(74) formula #6. I had to
look up the paper again… Actually the formula appears twice there
with slightly different definitions. The first is formula #6 which
is the desired one I cited in my first email; the second
(formula#7) is the real “estimator” you are referring to, I
believe. I agree with you that putting a CCC= -1 is a bit of an
extreme example and maybe not in synch with the gist of
B&R(74). However, it is fully in synch with the interpretation
given to the formula since F&A(75). You do not discuss my main
point here, namely what happens when the CCC fluctuates around the zero mark:
the estimator (formula #7) will then yield negative values ~50% of
the time, and that can only be corrected by repeating the
experiment an infinite number of times, which would then lead to an
exact SNR = 0. One can only hope your cryo-EM sample will survive
the necessary infinite number of experiments/exposures! Sorry
Alexis, your argument that negative SNRs should be accepted
since they will eventually average out to a “0” real SNR value is
very far-fetched (even though mathematically justifiable...). Since
FSC experiments are typically one-off experiments I’d rather stay
with two feet on the ground and not follow you along this path. We
are here dealing with low “N” values in one-off FSC experiments.<br>
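<br>
The behaviour for pure noise in a small shell can be simulated. The sketch below assumes complex Gaussian noise, a shell of only 8 samples and the simple mapping SNR = CCC/(1-CCC), not the exact form of B&R's formula #7; all names are illustrative.<br>
<pre>
```python
import numpy as np

def ccc(a, b):
    """Normalised cross-correlation coefficient of two complex vectors."""
    num = np.real(np.vdot(a, b))
    return num / np.sqrt(np.vdot(a, a).real * np.vdot(b, b).real)

def snr_estimate(c):
    """The contested mapping SNR = CCC / (1 - CCC)."""
    return c / (1.0 - c)

rng = np.random.default_rng(1)
n_samples = 8          # small shell, e.g. low resolution with high symmetry
n_trials = 10000       # number of hypothetical repeated experiments

estimates = np.empty(n_trials)
for i in range(n_trials):
    # Two independent noise-only "half data sets": the true SNR is 0.
    a = rng.normal(size=n_samples) + 1j * rng.normal(size=n_samples)
    b = rng.normal(size=n_samples) + 1j * rng.normal(size=n_samples)
    estimates[i] = snr_estimate(ccc(a, b))

negative_fraction = np.mean(np.signbit(estimates))
print(f"fraction of negative SNR estimates: {negative_fraction:.3f}")
print(f"mean of all estimates:              {estimates.mean():.3f}")
```
</pre>
About half of the one-off estimates come out negative in this setting; only the behaviour over many repetitions, which a real FSC experiment does not provide, tells a different story.<br>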
<br>
5) Nevertheless, this is not where the real problem is! The
real problem in terms of the B&R(74) paper surfaces earlier,
namely at the level of formulas #4 and #5A which are the ones
discussed in B&R(74) that are closest to the FRC/FSC formulas
(VH&S05). These formulas still contain the CROSS TERMS between
the (constant) signal and the random noise (xi and yi each contain
both the signal and the noise). In formulas #6 (and possibly in
#7) these cross terms have vanished.<br>
<br>
6) The basic problem in all these derivations is the inner
product between signal and noise: s(t).n(t) (in B&R
speak). This we discussed extensively in VH&S05. People state
first that the signal and the noise are UNCORRELATED and that thus
s(t).n(t) = 0! (See the many references discussed in VH&S05,
including the Rosenthal & Henderson 2003 “0.143” FSC paper). A
zero inner-product actually means that these two vectors are
assumed to be ORTHOGONAL, not UNCORRELATED. By the same token, the
inner product of two realizations of the UNCORRELATED noise
vectors should also be defined as ORTHOGONAL and n(t).n’(t) should
thus also be identical to zero! Here, suddenly, everybody agrees
(correctly) that n(t).n’(t) ~ 1/√N. It doesn’t help that the concepts of
“correlated”, “uncorrelated”, “independent” have various and often
conflicting definitions in the statistical literature. So let us
rather stick to the clean, unambiguous definitions of
orthogonality and non-orthogonality. (“Two vectors are orthogonal
if and only if their inner product is zero.”) Then let us
henceforward ask ourselves two clean questions: “Is our signal
orthogonal to each of our noise vectors?” and “Is one realization
of the noise vector orthogonal to any other noise-vector
realization?”<br>
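<br>
Both questions can be put to a quick numerical test. The following is a sketch only, assuming Gaussian noise vectors and a hypothetical fixed sinusoidal signal; it measures the spread of the normalised inner products signal·noise and noise·noise' for several vector lengths N.<br>
<pre>
```python
import numpy as np

def normalised_inner_product(u, v):
    """Inner product of u and v divided by the product of their norms."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(2)
n_trials = 1000
spread = {}
for n in (16, 256, 4096):
    signal = np.sin(np.linspace(0.0, 8.0 * np.pi, n))   # hypothetical fixed signal
    # Signal against fresh noise realizations.
    sn = [normalised_inner_product(signal, rng.normal(size=n))
          for _ in range(n_trials)]
    # One noise realization against another.
    nn = [normalised_inner_product(rng.normal(size=n), rng.normal(size=n))
          for _ in range(n_trials)]
    spread[n] = (np.std(sn), np.std(nn))
    print(f"N={n:5d}  std(s.n)={spread[n][0]:.4f}  "
          f"std(n.n')={spread[n][1]:.4f}  1/sqrt(N)={n ** -0.5:.4f}")
```
</pre>
Neither inner product is zero for any finite N in this toy model: both scatter around zero with a spread close to 1/√N, so the signal-noise cross terms are of the same order as the noise-noise terms.<br>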
<br>
7) If the answer to either of these questions by any cryo-EM
scientist is “YES”, I will not accept that person as a friend on
Facebook! (That was a joke).<br>
<br>
8) Sorry Smith Liu, this may have been far more than you
bargained for, but the take-home lesson is: fixed-valued FSC
thresholds are mathematically wrong and must be avoided for the
sake of science. They confuse newcomers to the field, and the
continued use of incorrect statistics will continue to damage the
reputation of the cryo-EM field.<br>
<br>
Hope this helps,<br>
<br>
Marin<br>
<br>
===========================================================<br>
<br>
On 12/08/2015 15:17, Alexis Rohou wrote:<br>
</div>
<blockquote cite="mid:55CB8DA6.7030006@gmail.com" type="cite">
Hi Marin,<br>
<br>
So many tasty worms in there. As you & others already know, I
agree with you on the dangers of fixed-threshold criteria. <br>
<br>
However, on the topic of the “SNR = (CCC/(1-CCC))” formula, I am
not convinced by your argument involving CC=-1.<br>
<br>
The reason is that this formula is really an <i>estimator</i> for
the true, unknown, SNR. This is explicitly stated by Bershad
& Rockmore (1974), on whose work Frank & Al-Ali (1975) build,
as well as by Frank & Al-Ali themselves. See in B&R
(1974) equation 6, where the left-hand side is an estimate for SNR
(alpha circumflex in their notation) based on the right-hand side,
which involves the sample cross-correlation (r in their notation).
Or, indeed, see the title: "On estimating signal-to-noise ratio
using the sample correlation coefficient".<br>
<br>
To put it bluntly, estimators should be expected to "fail" or "get
it wrong" sometimes (i.e. if used after a single, one-off
experiment). Thankfully, B&R derived estimates for the
variance (error) of their estimator. Saxton (1978) also derives
this, and Pawel Penczek has a nice & detailed review (2010
Methods Enzym) of confidence intervals that can be derived from
such estimator variances. <br>
<br>
If the sample CC (FSC in a particular shell) comes out as -1,
either (1) the fundamental assumption B&R used to derive the
estimator, namely that we are measuring two noise-corrupted
versions of the same signal, was violated and we shouldn't be
using this estimator at all, or (2) the specific occurrences of
the noise in our two measurements conspired to give us exactly
anti-correlated measurements. If the number of measurements is not
tiny, this is incredibly, incredibly unlikely. Therefore, no
matter what the SNR estimator says (-0.5 in your example), it's OK
that the truth is very different since if we were to repeat the
experiment we would almost never, ever get the same (CC=-1) result
again. <br>
<br>
In fact, according to B&R, if we repeated the experiment an
infinite number of times, the average estimate would be exactly
correct (if you used their unbiased estimator, but even the one
you mention is basically fine). The CC=-1 measurement would just
be seen as a freak outlier. The distribution of estimates can be
characterized, and this freak measurement would be way out in the
tail.<br>
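<br>
The repeated-experiment claim can be illustrated numerically. This is a sketch under assumptions that are not taken from B&R: a Gaussian random signal shared by both measurements, unit-variance Gaussian noise, a true SNR of 0.5, and the simple estimator r/(1-r) rather than their unbiased one.<br>
<pre>
```python
import numpy as np

def sample_cc(x, y):
    """Pearson sample correlation coefficient r of two real vectors."""
    return float(np.corrcoef(x, y)[0, 1])

rng = np.random.default_rng(3)
true_snr = 0.5                        # signal variance / noise variance
signal_sigma = np.sqrt(true_snr)      # noise variance is fixed at 1
n_trials = 2000                       # hypothetical repeated experiments

mean_estimate = {}
for n in (16, 256, 4096):
    estimates = []
    for _ in range(n_trials):
        s = rng.normal(0.0, signal_sigma, n)   # common signal
        x = s + rng.normal(size=n)             # noise-corrupted measurement 1
        y = s + rng.normal(size=n)             # noise-corrupted measurement 2
        r = sample_cc(x, y)
        estimates.append(r / (1.0 - r))
    mean_estimate[n] = float(np.mean(estimates))
    print(f"N={n:5d}  mean estimate={mean_estimate[n]:.3f}  "
          f"std of single estimates={np.std(estimates):.3f}")
```
</pre>
With many measurements per experiment the average estimate settles close to the true value, while the scatter of individual one-off estimates grows as N shrinks.<br>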
<br>
I find no reason (yet?) to believe that B&R's estimator is
wrong.<br>
<br>
Cheers,<br>
Alexis<br>
<br>
<br>
<br>
-- <br>
Alexis Rohou<br>
<br>
Research Specialist<br>
Grigorieff Lab<br>
<a moz-do-not-send="true" href="http://grigoriefflab.janelia.org">http://grigoriefflab.janelia.org</a><br>
Tel. +1 571 209 4000 x3485<br>
<br>
<br>
<div class="moz-cite-prefix">On 08/12/2015 08:19 AM, Marin van
Heel wrote:<br>
</div>
<blockquote cite="mid:55CB39E0.7020306@googlemail.com" type="cite">
<div class="moz-cite-prefix"><br>
Dear Smith Liu,<br>
<br>
You have hit upon a can of worms here… Although the FRC/FSC
metrics we introduced in 1982/1986 [1, 2] are now considered
the "gold standard" cryo-EM resolution criterion, these
resolution issues continue to be heavily debated [3]. Many FSC
add-ons/variants and tangential issues such as “reference
bias” have been inserted into the resolution criterion
discussion. These discussions unfortunately confuse even
established researchers (referees of major journals…), let
alone newcomers to the field. Many believe the resolution
issue is better resolved in X-ray crystallography. In fact, the
FSC is arguably a better metric than the R-factor, the
generally accepted resolution metric in X-ray crystallography
[4]. Fortunately, FRC/FSC criteria are now slowly also
becoming the standard in optical microscopy, X-ray microscopy,
X-ray crystallography, and other fields of 2D/3D imaging.<br>
<br>
The most controversial part of the FSC discussion is the FSC
threshold value to serve as a resolution criterion (such as
the FSC 0.5 value you mention). It took more than a decade to
remove the mathematically flawed DPR (Differential Phase
Residual) from the literature, after I explicitly discussed
its shortcomings and proposed a corrected phase residual in
1987 [3]. The discussion in the field then shifted
towards the FSC threshold at which one defines the average
resolution of a 3D structure. The “0.5” “criterion” was just
postulated ad hoc, without any scientific justification. Ten
years ago, we argued that all fixed-valued FSC threshold
criteria (such as: “0.5” and “0.143”) are based on flawed
statistics [5]. Virtually all more formal justifications for
resolution criteria start off referring to the old formula
“SNR = (CCC/(1-CCC))” by Frank & Al-Ali 1975 [6].
Unfortunately this formula is also mathematically incorrect as
was discussed previously [5]. <br>
<br>
Here is another very simple argument to illustrate its flawed
definition: the normalised CCC (or FSC) has values in the
range -1 ≤ CCC ≤ +1, whereas the SNR (= S²/N²) is, by
definition, non-negative. Now insert the value CCC = -1, the case
of perfectly anti-correlated data, into the formula. This
yields SNR = “-0.5”, a flagrant violation of the SNR
definition range. The formula could be valid for the limiting
case where the CCC is close to unity, but such high correlation
values are not relevant in the resolution-threshold context.
For uncorrelated signals/noise the CCC oscillates around the
zero mark and, through the flawed Frank & Al-Ali formula,
produces as many positive as it does erroneous negative SNR
values.<br>
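<br>
As a small worked example, the mapping can simply be evaluated across the allowed CCC range. The function name is illustrative; the formula is the one quoted above.<br>
<pre>
```python
def snr_from_ccc(ccc):
    """The quoted mapping SNR = CCC / (1 - CCC); undefined at CCC = 1."""
    return ccc / (1.0 - ccc)

for ccc in (-1.0, -0.5, -0.1, 0.0, 0.1, 0.143, 0.5, 0.9):
    print(f"CCC={ccc:+.3f} gives SNR={snr_from_ccc(ccc):+.3f}")
```
</pre>
Every negative CCC maps to a negative "SNR", with CCC = -1 giving -0.5 and CCC = 0.5 giving 1.<br>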
<br>
Unfortunately, virtually all (~100?) papers on resolution
criteria and validation tests in cryo-EM (from friends and
foes) are based on this formula and are thus based on “flawed
statistics” to say the least. With the great recent success of
cryo-EM, everybody appears to have stopped thinking about the
basics, and merrily continues to refer to incorrect stuff while
focusing on “my resolution is better than yours”. After
decades of funny jokes and verbal FSC controversies at GRC
meetings, I don’t find it so funny anymore: it is time to
clean up the mess. I have lost the patience to discuss these
issues with referees who continue to consider the subject as
debatable. Questionable actions are sometimes hidden behind
this controversy such as in Mao & Sodroski [7], who
cynically accuse us - their critics - of not knowing how to
interpret the FSC: “FSC estimates of resolution are known to
be quite sensitive to statistical bias …” etc. etc. As I
said, this whole issue is no longer amusing; it has become a
matter of the debatable scientific culture (integrity?) in the
cryo-EM field. <br>
<br>
Oh, by the way, Smith Liu, what I really was going to say when
I started typing an answer to your question is that if you are
new to the field it is a good idea to read some basic
literature in Fourier Optics. Maybe my lecture notes can help
[8]. The horizontal axis in the FSC is the spatial frequency, i.e. 1/resolution (we
are in Fourier space), and the FSC values in the curve indicate
the cross-correlation level at that level of resolution (=
inside that specific 3D Fourier shell).<br>
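<br>
To make the notion of a Fourier shell concrete, here is a minimal NumPy sketch of an FSC calculation. It is not the implementation of any particular package: the Gaussian-blob test volume, the noise level, the voxel size of 1.5 Å and all names are assumptions chosen for illustration.<br>
<pre>
```python
import numpy as np

def fsc_curve(map1, map2):
    """Fourier Shell Correlation of two equally sized cubic 3D maps.

    Returns one FSC value per integer shell radius (in Fourier voxels).
    """
    f1 = np.fft.fftn(map1)
    f2 = np.fft.fftn(map2)
    n = map1.shape[0]
    freq = np.fft.fftfreq(n) * n                      # integer frequency indices
    kz, ky, kx = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx ** 2 + ky ** 2 + kz ** 2)).astype(int)
    n_shells = n // 2
    inside = np.minimum(shell, n_shells).ravel()      # lump the cube corners together
    cross = np.bincount(inside, weights=(f1 * np.conj(f2)).real.ravel())
    p1 = np.bincount(inside, weights=(np.abs(f1) ** 2).ravel())
    p2 = np.bincount(inside, weights=(np.abs(f2) ** 2).ravel())
    return (cross / np.sqrt(p1 * p2))[:n_shells]      # drop the corner bin

rng = np.random.default_rng(4)
n = 32
z, y, x = np.mgrid[0:n, 0:n, 0:n]
# Hypothetical test object: a Gaussian blob (sigma = 3 voxels) in the box centre.
volume = np.exp(-((x - n / 2) ** 2 + (y - n / 2) ** 2 + (z - n / 2) ** 2) / 18.0)
half1 = volume + rng.normal(0.0, 0.05, volume.shape)  # two noisy "half maps"
half2 = volume + rng.normal(0.0, 0.05, volume.shape)

fsc = fsc_curve(half1, half2)
voxel_size = 1.5                                      # Angstrom per voxel (assumed)
for radius in range(1, len(fsc), 3):
    spatial_freq = radius / (n * voxel_size)          # in 1/Angstrom
    print(f"shell {radius:2d}  {spatial_freq:.3f} 1/A  "
          f"(= {1.0 / spatial_freq:5.1f} A)  FSC={fsc[radius]:+.3f}")
```
</pre>
Each point on the curve is one shell: the shell radius divided by the box length in Å is the spatial frequency on the horizontal axis, and its reciprocal is the corresponding resolution.<br>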
<br>
Hope this helps,<br>
<br>
Marin<br>
<br>
[1] Van Heel M, Keegstra W, Schutter W, van Bruggen EFJ:
Arthropod hemocyanin structures studied by image analysis <a
moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://singleparticles.org/methodology/MvH_FRC_Leeds_1982.pdf">http://singleparticles.org/methodology/MvH_FRC_Leeds_1982.pdf</a><br>
[2] Harauz G & van Heel M: Exact filters for general
geometry three dimensional reconstruction, Optik 73 (1986)
146-156<br>
[3] Van Heel M: Similarity measures between images.
Ultramicroscopy 21 (1987) 95-100.<br>
[4] Van Heel M: Unveiling
ribosomal structures: the final phases. Current Opinion in
Structural Biology 10 (2000) 259-264.<br>
[5] Van Heel M & Schatz M: Fourier Shell Correlation
Threshold Criteria, J. Struct. Biol. 151 (2005) 250-262<br>
[6] Frank J & Al-Ali L: Signal-to-noise ratio of electron
micrographs obtained by cross correlation. Nature (1975)<br>
[7] Mao Y, Castillo-Menendez LR, Sodroski JG: Reply to
Subramaniam, van Heel, and Henderson: Validity of the
cryo-electron microscopy structures of the HIV-1 envelope
glycoprotein complex. PNAS 2013 <a moz-do-not-send="true"
class="moz-txt-link-abbreviated"
href="http://www.pnas.org/cgi/doi/10.1073/pnas.1316666110">www.pnas.org/cgi/doi/10.1073/pnas.1316666110</a><br>
[8] Van Heel: Principles of Phase Contrast (Electron)
Microscopy. <a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="http://www.single-particles.org/methodology/MvH_Phase_Contrast.pdf">http://www.single-particles.org/methodology/MvH_Phase_Contrast.pdf</a><br>
<br>
===========================================<br>
<br>
<br>
<br>
On 08/08/2015 07:45, Smith Liu wrote:<br>
</div>
<blockquote
cite="mid:10144882.10d59.14f0ceab82b.Coremail.smith_liu123@163.com"
type="cite">
<div
style="line-height:1.7;color:#000000;font-size:14px;font-family:Arial">
<div style="LINE-HEIGHT: 1.7; FONT-FAMILY: Arial; COLOR:
#000000; FONT-SIZE: 14px">
<div>Dear All, </div>
<div><br>
</div>
<div>I know that the x-axis of the FSC curve is the reciprocal
of the resolution, and that the x-axis value
corresponding to FSC 0.5 is usually regarded as the reciprocal
of the resolution of the whole EM map.</div>
<div><br>
</div>
<div>What I do not understand is the meaning of the resolution on
the x-axis. The whole map has only one resolution,
corresponding to FSC 0.5, so why does the x-axis span
different resolutions (for example, from
resolution 0 to 20 Å, or the reciprocal of that range)? Is
it because different parts of the map have different
resolutions (because different parts of the map have
different quality), or is it because the x-axis of the
FSC curve is related to the Fourier shell? If the
x-axis of the FSC is a property related to the Fourier
shell, then what is the relation between the resolution (or
its reciprocal) on the x-axis and the Fourier shell (and,
in addition, what is a Fourier shell)?</div>
<div><br>
</div>
<div>Best regards.</div>
<div><br>
</div>
<div>Smith</div>
<div> </div>
</div>
<br>
</div>
<br>
<br>
</blockquote>
<br>
<br>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
3dem mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:3dem@ncmir.ucsd.edu">3dem@ncmir.ucsd.edu</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem">https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem</a>
</pre>
</blockquote>
<br>
</blockquote>
<br>
<br>
</body>
</html>