[3dem] [ccpem] MRC file format (Compressing cryo-EM data to 8-bits/pix and beyond)
Marin van Heel
marin.vanheel at googlemail.com
Wed Jun 17 14:30:35 PDT 2015
Yes, that was understood, John. The point is that with loss-less
compression you can achieve even smaller files than with 4/8/16/32 bit
integers, because those, as you state correctly, are largely filled with
zeroes.
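
To put a rough number on this, here is a minimal Python sketch comparing fixed 8-bit and 4-bit storage with loss-less compression. The frame is synthetic (about 2% of pixels holding one count, which is an assumption and not a measured K2 dose), `pack_4bit` is a hypothetical helper, and zlib/DEFLATE simply stands in for whatever loss-less codec one would actually use.

```python
import random
import zlib

def pack_4bit(counts):
    """Pack integers in 0..15 two per byte (fixed 4 bits per pixel)."""
    if len(counts) % 2:
        counts = counts + [0]
    return bytes((counts[i] << 4) | counts[i + 1]
                 for i in range(0, len(counts), 2))

random.seed(0)
# Hypothetical sparse counting frame: ~2% of pixels hold 1, the rest 0.
n_pixels = 256 * 256
frame = [1 if random.random() < 0.02 else 0 for _ in range(n_pixels)]

raw_8bit = bytes(frame)                # fixed 8 bits per pixel
raw_4bit = pack_4bit(frame)            # fixed 4 bits per pixel
deflated = zlib.compress(raw_8bit, 9)  # loss-less, variable length

# Decompression restores every pixel exactly.
restored = list(zlib.decompress(deflated))

print(len(raw_8bit), len(raw_4bit), len(deflated))
```

On data this sparse the compressed stream should come out well below the 4-bit packing, while `restored == frame` confirms nothing was lost. Real frames will compress differently depending on dose and detector.
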
Cheers
Marin
On 16/06/2015 18:02, John Rubinstein wrote:
> Dear Marin and Pawel,
>
> For the K2 camera that outputs counts (e.g. 1, 2, 3, 4, etc) there is
> no loss of information in storing these numbers as 4 bit or 8 bit as
> long as you don't exceed the highest integer that the data type can
> hold. Any excess bits just hold 0s. Storing as 32 bits does not cost
> much but it also has no purpose.
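
The condition "as long as you don't exceed the highest integer" can be checked explicitly before narrowing the data type. A minimal Python sketch, with made-up counts and a hypothetical helper `fits_in_bits`:

```python
def fits_in_bits(counts, bits):
    """Return True if every count fits in an unsigned `bits`-bit integer."""
    return min(counts) >= 0 and max(counts) < (1 << bits)

# Hypothetical per-pixel electron counts from a counting detector.
counts = [0, 1, 0, 3, 2, 0, 0, 4]
ok_4bit = fits_in_bits(counts, 4)        # largest value 4 fits in 0..15

# A pixel with 17 counts does not fit; masking to 4 bits wraps it around.
saturated = [0, 1, 17]
ok_saturated = fits_in_bits(saturated, 4)
wrapped = [c & 0x0F for c in saturated]  # 17 becomes 1: information is lost

print(ok_4bit, ok_saturated, wrapped)    # prints: True False [0, 1, 1]
```
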
>
> Best wishes,
> John
>
> *From: *Marin van Heel
> *Sent: *Tuesday, June 16, 2015 11:22 AM
> *To: *Tom Houweling; CCPEM at JISCMAIL.AC.UK
> *Cc: *3DEM
> *Subject: *Re: [3dem] [ccpem] MRC file format (Compressing cryo-EM
> data to 8-bits/pix and beyond)
>
>
> Dear All,
>
> For various reasons I don’t think this line of reasoning is very
> productive. The data compression to 8 or even 4 bits as has been
> suggested in this discussion can only lead to loss of data (see
> below). It may also represent poor management of the available EM
> resources.
>
> Point by point:
>
> A) Advanced cryo-EM equipment costs of the order of ~5000 AUs
> (Arbitrary Units: $/Eu/£) per day to own and operate, and will
> generate up to ~2 Tbyte of cryo-EM data per 24h. The costs of storing
> this precious data for “eternity” will not exceed 100 AUs per day,
> that is, one or two percent of the taxpayers' total investment in your
> data collection. NOT storing that raw data may NOT be a good idea for
> economic reasons alone (just in case you, for example, need to repeat
> the experiment to get the data back).
>
> B) Compressing all the raw data to save space can make sense as long
> as the compression is loss-less
> (https://en.wikipedia.org/wiki/Lossless_compression). The compression
> (after movie alignment) as suggested, however, may lead to a
> significant information loss.
>
> C) The dynamic range of a raw image is mainly determined by the
> low-frequency components of the data. Scaling the min-max densities
> from 0-255 for compression/truncation to 8 bit data, changes the data
> representation from image to image. The high-resolution information we
> are interested in has a contrast of probably less than 0.1% of the
> strong low-frequency components. The signal we are interested in is
> thus already much smaller than the discretisation error of 1:256 of
> the A-to-D conversion. That does not mean one will not be able to fish
> that information from the discretisation and Poisson noise in the raw
> data… But it will certainly suffer. The grey scales will change from
> image to image purely dependent on whether there is, for example, an
> ice crystal somewhere in the field of view. High-pass filtering will
> remove the large-scale details and thus also increase the dynamic range
> available for the high-frequency data components.
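
Two of the effects in point C can be illustrated with a small Python sketch. The pixel values are invented, `quantise_minmax` is a hypothetical helper for the min-max scaling to 0-255 described above, and no noise is modelled, so this shows only the scaling behaviour, not what can still be recovered from noisy data.

```python
def quantise_minmax(values, levels=256):
    """Map min..max of `values` linearly onto 0..levels-1 and round."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant image: min-max scaling is undefined")
    scale = (levels - 1) / (hi - lo)
    return [round((v - lo) * scale) for v in values]

# Amplitude of a component at 0.1% of full range, in units of one 8-bit step.
weak_in_steps = 0.001 * 255              # about a quarter of one grey level

# Hypothetical pixel values: the same five pixels, quantised with and
# without one very dense outlier (e.g. an ice crystal) in the field of view.
image = [0.40, 0.45, 0.50, 0.55, 0.60]
q_clean = quantise_minmax(image)
q_ice = quantise_minmax(image + [5.0])[:len(image)]

print(weak_in_steps)
print(q_clean)   # the five pixels span the full 0..255 range
print(q_ice)     # the same pixels squeezed into a handful of grey levels
```

The single outlier changes the grey value assigned to every other pixel, which is the image-to-image change in representation mentioned above.
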
>
> D) Note that the fact that you manage to get a 3D structure out is no
> proof that you have not lost information. It is merely proof for the
> fact that there was enough left over to create a reasonable 3D that
> satisfies you.
>
> E) There are also other reasons for never deleting the original data
> such as validation! You may be challenged – as has happened in the
> recent past (PNAS 2013) - to show the original data set to prove it is
> what you claim it is and was collected on the instrumentation you
> claim it was taken on. (In the PNAS cases the original data has still
> not been released).
>
> F) What one can or wants to do with the raw data changes over time.
> Many new movie alignment algorithms have been proposed recently;
> access to exactly the same raw data is essential for validation of the
> new algorithms. (You may even get more out of your data!)
>
> G) The raw data characterize the camera (and validate the data set
> as per E) and allow you to correct for its flaws
> (http://www.nature.com/srep/2015/150611/srep10317/full/srep10317.html). You
> may also want to see whether the camera itself deteriorated over time.
>
> H) Especially when the raw data are of some integer type, (and you are
> using data with a limited dynamic range), the data on disk will be
> written in a highly redundant fashion. You may then use loss-less
> compression algorithms to reduce the size of your data without
> suffering any information loss. You may always compress the data; you
> may never compromise on its information content!
>
> Cheers, Marin
>
> ========================================
>
> On 04/06/2015 00:15, Tom Houweling wrote:
>> What I meant is that Relion appears to have no problem reading 16 bit
>> and 8 bit formats, therefore converting to 32bit floating point
>> images should not be necessary.
>>
>> However, the verdict on whether reducing the data to 8 bits costs
>> resolution is still out. I’m motivated by conserving disk space.
>>
>> I’m currently reprocessing a good dataset that yielded a high
>> resolution structure. But this time I converted the aligned stacks of
>> 32bit per pixel to just 8 by the following method:
>>
>> 1) Calculate the mean and std. deviation
>> 2) Cut off at +/- 3 std dev
>> 3) Set lowest value to 0 and highest to 255
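
The three steps above can be sketched in Python roughly as follows. This is one reading of the recipe, not the script actually used: it assumes the mean - 3 sigma and mean + 3 sigma bounds (rather than the clipped data's own min and max) are mapped to 0 and 255, uses the population standard deviation, and `to_8bit_3sigma` is a hypothetical name.

```python
import statistics

def to_8bit_3sigma(pixels):
    """Clip to mean +/- 3 std dev, then scale linearly to 0..255."""
    mean = statistics.fmean(pixels)
    std = statistics.pstdev(pixels)          # 1) mean and std. deviation
    if std == 0:
        raise ValueError("constant image: nothing to scale")
    lo, hi = mean - 3 * std, mean + 3 * std
    scale = 255 / (hi - lo)
    out = []
    for p in pixels:
        clipped = min(max(p, lo), hi)        # 2) cut off at +/- 3 std dev
        level = round((clipped - lo) * scale)  # 3) lo -> 0, hi -> 255
        out.append(min(255, max(0, level)))  # guard against rounding drift
    return bytes(out)

# Tiny worked case: mean 5, std 5, so the bounds are -10 and 20.
small = to_8bit_3sigma([0.0, 10.0])
print(list(small))                           # prints: [85, 170]

# One strong outlier is clipped to 255 instead of setting the scale.
with_outlier = to_8bit_3sigma([0.0] * 99 + [1000.0])
print(with_outlier[-1])                      # prints: 255
```

Note that this conversion is lossy by construction: values beyond 3 sigma are clipped and everything else is rounded to one of 256 levels.
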
>>
>> Tom
>>
>>
>>> On Jun 3, 2015, at 10:58 AM, Amedee des Georges
>>> <adesgeorges at GMAIL.COM> wrote:
>>>
>>> Dear Tom,
>>>
>>> Did you see any decrease in resolution with 8bit vs 16? How did it
>>> look?
>>> It’s obviously an advantage to use 8bits for storage if it doesn’t
>>> decrease image quality significantly.
>>>
>>> Best,
>>>
>>> Amedee
>>>
>>> On Jun 3, 2015, at 1:44 PM, Tom Houweling
>>> <tom.houweling at BERKELEY.EDU> wrote:
>>>
>>>> We have successfully processed MRC images and stacks in Relion that
>>>> were in 16 bit mode 6 and also in the non MRC sanctioned mode 5 (8
>>>> bit unsigned).
>>>>
>>>> —Tom
>>>>
>>>>
>>>>> On Jun 3, 2015, at 10:22 AM, Rémi Fronzes
>>>>> <remi.fronzes at PASTEUR.FR> wrote:
>>>>>
>>>>> Dear All,
>>>>>
>>>>> Maybe a silly question but still worth asking.
>>>>> Is it a problem to extract and use in Relion particles from 16-bit
>>>>> MRC images (i.e. collected using EPU)?
>>>>> Or do we have to convert the micrographs to 32-bit MRC format?
>>>>>
>>>>> Cheers
>>>>>
>>>>> Rémi
>>>>>
>>>>>
>>>>> Rémi Fronzes
>>>>> G5 biologie structurale de la sécrétion bactérienne, institut Pasteur
>>>>> CNRS UMR 3528, institut Pasteur
>>>>>
>>>>> Office: +33 (0)145688864
>>>>> Lab: +33 (0) 145688863
>>>>> Mobile: +33 (0) 688263992
>>>>> Email: remi.fronzes at pasteur.fr
>>>>>
>>>>> 25 rue du Docteur Roux
>>>>> Bâtiment Metchnikoff, 3ème étage
>>>>> 75015 Paris, France
>>>>>
>>>>
>>>> --
>>>> Tom Houweling - QB3 Nogales Lab Computer Analyst @ Howard
>>>> Hughes Medical Institute
>>>> University of California Berkeley, 708D Stanley Hall, Berkeley, CA
>>>> 94720
>>>>
>>>>
>>>
>>
>>
>>
>
>
> --
> ================================================================
>
> Prof Dr Ir Marin van Heel
>
> Professor of Cryo-EM Data Processing
>
> Leiden University
>
>
--
================================================================
Prof Dr Ir Marin van Heel
Professor of Cryo-EM Data Processing
Leiden University
NeCEN Building Room 05.27
Einsteinweg 55
2333 CC Leiden
The Netherlands
Tel. NL: +31(0)715271424 // Mobile NL: +31(0)652736618
Skype: Marin.van.Heel
email: marin.vanheel(A_T)gmail.com
and: mvh.office(A_T)gmail.com
----------------------------------------------
Emeritus Professor of Structural Biology
Imperial College London
Faculty of Natural Sciences
Biochemistry Building (Room 512)
South Kensington Campus
London SW7 2AZ, UK
email: m.vanheel(A_T)ic.ac.uk
Tel. UK: +44(0)2075945316 //Mobile: +44(0)7941540625
----------------------------------------------
Visiting Professor at:
Laboratório Nacional de Nanotecnologia - LNNano
CNPEM/ABTLuS, Campinas, Brazil
Brazilian mobile phone +55-19-983189143
------------------------------------------------------------------
I receive many emails per day and, although I try,
there is no guarantee that I will actually read each incoming email.
Moreover, our Spam filters can be strict and sometimes make
legitimate emails disappear (try the gmail accounts, alternatively)