[3dem] Data storage and compression

Cathy Lawson cathy.lawson at rcsb.org
Thu Aug 29 05:44:54 PDT 2019


Raw images collected from the microscope, as well as aligned images and particle stacks, can be deposited to EMPIAR.

I’ll also note that EMPIAR made the 2016 map challenge possible as we were able to point challengers to this resource.

Before our EBI colleagues created EMPIAR, there was just a patchwork of resources making raw data available for research and education.


> On Aug 29, 2019, at 08:13, Jacopo Marino <jacopo.marino at psi.ch> wrote:
> 
> Hi Takanori,
> When you say raw data, do you mean the "final" particle stack or the movies? The latter might be too large for EMPIAR?
> 
> Thanks and best wishes,
> Jacopo
> 
>> On 29 Aug 2019, at 14:04, Takanori Nakane <tnakane at mrc-lmb.cam.ac.uk> wrote:
>> 
>> Hi,
>> 
>>> Most people just opt for the "hard drive on a shelf" method for completed
>>> projects, which has advantages (cheap/simple) and disadvantages (what
>>> happens if the drive dies)...
>> 
>> After publication of your structures, I recommend depositing the raw data
>> in EMPIAR.
>> Not only is it useful for reproducibility, education and method development,
>> it also serves as an additional layer of backup. You might drop your disk,
>> water might leak from the ceiling, etc. Having backups in a physically
>> distant place is good practice.
>> 
>> Best regards,
>> 
>> Takanori Nakane
>> 
>>> Julien,
>>> are you referring to the raw data, or are you trying to archive all of the
>>> files associated with a project?
>>> 
>>> Counting-mode movies are generally stored and archived as compressed TIFF
>>> stacks, though if they are collected on a Falcon, there are issues with
>>> this, as good compression is achieved only pre-normalization (or
>>> post-normalization if you decide you are willing to switch back to an
>>> integer format).
>>> 
>>> If you want to perfectly archive everything exactly as it is (losslessly),
>>> some compression algorithms may do very slightly better than others, but
>>> pretty much any of the commonly used algorithms will do about the same.
>>> Usually the slower ones will do slightly better, but you have to decide if
>>> it's worth the CPU time the compression takes.  By definition, the noisier
>>> the data is, the less compressible it is, unless you are willing to invoke
>>> "lossy" compression and throw away some of the bits of pure noise.
>>> 
>>> Most people just opt for the "hard drive on a shelf" method for completed
>>> projects, which has advantages (cheap/simple) and disadvantages (what
>>> happens if the drive dies)...
>>> 
>>> --------------------------------------------------------------------------------------
>>> Steven Ludtke, Ph.D. <sludtke at bcm.edu>
>>>        Baylor College of Medicine
>>> Charles C. Bell Jr., Professor of Structural Biology
>>> Dept. of Biochemistry and Molecular Biology
>>> (http://www.bcm.edu/biochem)
>>> Academic Director, CryoEM Core
>>> (http://cryoem.bcm.edu)
>>> Co-Director CIBR Center
>>> (http://www.bcm.edu/research/cibr)
>>> 
>>> 
>>> 
>>> On Aug 29, 2019, at 6:30 AM, Julien Bous
>>> <julien.bous at etu.umontpellier.fr> wrote:
>>> 
>>> 
>>> 
>>> Dear Community,
>>> 
>>> I have a question about the best way to store my data once SPA projects
>>> are completed. Can you advise me on which compression format is
>>> preferable?
>>> 
>>> Thank you for your interest,
>>> 
>>> Julien
>>> 
>>> 
>>> _______________________________________________
>>> 3dem mailing list
>>> 3dem at ncmir.ucsd.edu
>>> https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem
>>> 
>> 
>> 

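The trade-off described in the thread (slower lossless algorithms tend to compress slightly better, and noisier data compresses worse) can be checked with a rough sketch like the one below. It is only an illustration under stated assumptions: the "frames" are synthetic bytes in which each pixel registers a hit with a chosen probability, not real detector output, and the names synthetic_frame and benchmark are hypothetical helpers. Only Python standard-library compressors are used; timings and ratios will differ on real movies.

```python
import bz2
import lzma
import random
import time
import zlib

# Lossless compressors from the standard library, roughly fast to slow.
CODECS = {
    "zlib level 1": lambda raw: zlib.compress(raw, 1),
    "zlib level 9": lambda raw: zlib.compress(raw, 9),
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def synthetic_frame(hit_probability, n_pixels=65536, seed=0):
    """Hypothetical stand-in for a counting-mode frame: one byte per pixel,
    1 where an electron was counted and 0 elsewhere (an assumption, not a
    detector model)."""
    rng = random.Random(seed)
    return bytes(1 if rng.random() < hit_probability else 0 for _ in range(n_pixels))


def benchmark(raw):
    """Return {codec name: (compressed size / original size, seconds taken)}."""
    results = {}
    for name, compress in CODECS.items():
        start = time.perf_counter()
        packed = compress(raw)
        elapsed = time.perf_counter() - start
        results[name] = (len(packed) / len(raw), elapsed)
    return results


if __name__ == "__main__":
    # A higher hit probability (up to 0.5) means more randomness per pixel.
    for p in (0.01, 0.1, 0.5):
        print(f"hit probability {p}")
        for name, (fraction, seconds) in benchmark(synthetic_frame(p)).items():
            print(f"  {name:13s} {fraction:6.1%} of original, {seconds * 1000:7.1f} ms")
```

Running the same comparison on a handful of your own files is probably the most reliable way to decide whether the extra CPU time of a slower algorithm is worth it.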

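The remark about normalization (good compression only before normalization, or after it if you go back to an integer format) can be illustrated in the same spirit. The sketch below assumes a made-up per-pixel gain reference between 0.9 and 1.1 and small integer counts per pixel; simulate_counts, simulate_gain and compressed_sizes are hypothetical names, and the numbers say nothing about any particular detector. It only shows why multiplying integer counts by a floating-point gain makes the same frame much harder to compress losslessly.

```python
import random
import zlib
from array import array


def simulate_counts(n_pixels, seed=0):
    """Hypothetical electron counts per pixel: mostly 0, occasionally 1 to 3."""
    rng = random.Random(seed)
    return rng.choices([0, 1, 2, 3], weights=[70, 20, 8, 2], k=n_pixels)


def simulate_gain(n_pixels, seed=1):
    """Hypothetical gain reference: one float per pixel, close to 1.0."""
    rng = random.Random(seed)
    return [rng.uniform(0.9, 1.1) for _ in range(n_pixels)]


def compressed_sizes(n_pixels=65536):
    """Compressed size in bytes of the same frame stored two ways."""
    counts = simulate_counts(n_pixels)
    gain = simulate_gain(n_pixels)
    # Pre-normalization: one unsigned byte per pixel.
    as_int = array("B", counts).tobytes()
    # Post-normalization: count * gain stored as 32-bit floats.
    as_float = array("f", (c * g for c, g in zip(counts, gain))).tobytes()
    return {
        "int": len(zlib.compress(as_int, 6)),
        "float32": len(zlib.compress(as_float, 6)),
    }


if __name__ == "__main__":
    sizes = compressed_sizes()
    print(f"uint8 counts, pre-normalization : {sizes['int']} bytes")
    print(f"float32, post-normalization     : {sizes['float32']} bytes")
```

The float version is larger mainly because the low-order mantissa bits of each normalized pixel look random to the compressor, which is one reason to archive the unnormalized integers together with the gain reference.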
