[3dem] Data storage and compression

Jacopo Marino jacopo.marino at psi.ch
Thu Aug 29 05:13:51 PDT 2019


Hi Takanori,
When you say raw data, do you mean the "final" particle stack or the movies? The latter might be too large for EMPIAR?

Thanks and best wishes,
Jacopo

> On 29 Aug 2019, at 14:04, Takanori Nakane <tnakane at mrc-lmb.cam.ac.uk> wrote:
> 
> Hi,
> 
>> Most people just opt for the "hard drive on a shelf" method for completed
>> projects, which has advantages (cheap/simple) and disadvantages (what
>> happens if the drive dies)...
> 
> After publication of your structures, I recommend depositing the raw data
> in EMPIAR.
> Not only is it useful for reproducibility, education and method development,
> it also serves as an additional layer of backup. You might drop your disk,
> water might leak from the ceiling, etc. Keeping backups in a physically
> distant place is good practice.
> 
> Best regards,
> 
> Takanori Nakane
> 
>> Julien,
>> are you referring to the raw data, or are you trying to archive all of the
>> files associated with a project?
>> 
>> Counting-mode movies are generally stored and archived as compressed TIFF
>> stacks, though if they are collected on a Falcon there are issues with
>> this, because good compression is achieved only pre-normalization (or
>> post-normalization if you are willing to switch back to an integer
>> format).
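
The normalization point above can be illustrated with a rough sketch. Everything in it is an assumption made for illustration: the synthetic count distribution, the gain range, and the use of zlib as a stand-in for whatever compression a TIFF writer would apply. It is not a model of any particular detector. It only shows that small integer counts compress to far fewer bytes than the same frame after multiplication by a floating-point gain reference.

```python
import array
import random
import zlib

random.seed(0)  # fixed seed so the comparison is repeatable

N_PIXELS = 100_000

# Hypothetical counting-mode frame: small integer counts per pixel.
counts = random.choices([0, 1, 2, 3], weights=[60, 30, 8, 2], k=N_PIXELS)

# Hypothetical gain reference: a per-pixel factor close to 1.0.
gain = [random.uniform(0.9, 1.1) for _ in range(N_PIXELS)]

# Pre-normalization: one unsigned byte per pixel.
raw_int = bytes(counts)

# Post-normalization: one single-precision float per pixel.
normalized = array.array("f", (c * g for c, g in zip(counts, gain)))
raw_float = normalized.tobytes()

# Compressed size in bytes for each representation.
int_size = len(zlib.compress(raw_int, 6))
float_size = len(zlib.compress(raw_float, 6))

print(f"integer counts : {len(raw_int):7d} -> {int_size:7d} bytes")
print(f"gain-normalized: {len(raw_float):7d} -> {float_size:7d} bytes")
```

The comparison is on absolute compressed size rather than ratio, since the float frame starts out several times larger and its zero pixels can make the ratio look deceptively good.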
>> 
>> If you want to perfectly archive everything exactly as it is (losslessly),
>> some compression algorithms may do very slightly better than others, but
>> pretty much any of the commonly used algorithms will do about the same.
>> Usually the slower ones will do slightly better, but you have to decide if
>> it's worth the CPU time the compression takes.  By definition, the noisier
>> the data is, the less compressible it is, unless you are willing to invoke
>> "lossy" compression and throw away some of the bits of pure noise.
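
A minimal sketch of the trade-off described above, using only the three general-purpose codecs in the Python standard library (zlib, bz2, lzma). The synthetic "quiet" and "noisy" buffers are assumptions chosen for illustration and do not resemble real micrographs. Timings and sizes will differ on real data, so it is worth running something like this on a sample of your own files before picking an algorithm.

```python
import bz2
import lzma
import random
import time
import zlib

random.seed(1)  # fixed seed so runs are comparable
N = 200_000


def make_frame(noise_levels: int) -> bytes:
    """Hypothetical 8-bit data: a constant signal plus uniform integer noise."""
    return bytes(100 + random.randrange(noise_levels) for _ in range(N))


quiet = make_frame(4)   # about 2 bits of noise per byte
noisy = make_frame(64)  # about 6 bits of noise per byte

codecs = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}

results = {}
for name, (compress, decompress) in codecs.items():
    for label, data in (("quiet", quiet), ("noisy", noisy)):
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        # Lossless: decompressing must give back the exact input bytes.
        assert decompress(packed) == data
        results[(name, label)] = len(packed)
        print(f"{name:5s} {label:5s} {len(packed):7d} bytes  {elapsed:.3f} s")
```

Whatever the codec, the noisier buffer ends up larger, since lossless compression cannot remove the noise bits themselves.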
>> 
>> Most people just opt for the "hard drive on a shelf" method for completed
>> projects, which has advantages (cheap/simple) and disadvantages (what
>> happens if the drive dies)...
>> 
>> --------------------------------------------------------------------------------------
>> Steven Ludtke, Ph.D. <sludtke at bcm.edu>
>> Baylor College of Medicine
>> Charles C. Bell Jr., Professor of Structural Biology
>> Dept. of Biochemistry and Molecular Biology
>> (www.bcm.edu/biochem)
>> Academic Director, CryoEM Core
>> (cryoem.bcm.edu)
>> Co-Director CIBR Center
>> (www.bcm.edu/research/cibr)
>> 
>> 
>> 
>> On Aug 29, 2019, at 6:30 AM, Julien Bous
>> <julien.bous at etu.umontpellier.fr> wrote:
>> 
>> 
>> 
>> Dear Community,
>> 
>> I have a question about the best way to store my data once SPA projects
>> are completed. Can you advise me on which compression format is
>> preferable?
>> 
>> Thank you for your interest,
>> 
>> Julien
>> 
>> 
>> _______________________________________________
>> 3dem mailing list
>> 3dem at ncmir.ucsd.edu
>> https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem
>> 
>> 
> 
> 

