[3dem] Data storage system

Ludtke, Steven J. sludtke at bcm.edu
Fri Apr 2 17:59:11 PDT 2021


It is quite possible to purchase a decent 10 GbE copper switch nowadays for relatively little money. I am using a Netgear XS512EM in the lab. It can do nonblocking 10G over copper on all 12 ports (it claims). I haven't pushed those limits, as I only have a handful of machines with 10G connections.
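For anyone wondering what "nonblocking" implies numerically, here is a minimal sketch of the arithmetic in Python. It assumes the usual meaning of the term (every port can send and receive at full line rate at the same time); the function name is made up for illustration and the result is not a figure taken from any vendor datasheet.

```python
def nonblocking_fabric_gbps(ports: int, port_gbps: float) -> float:
    """Switching capacity needed so that every port can send and
    receive at line rate simultaneously (assumed meaning of 'nonblocking')."""
    return ports * port_gbps * 2  # x2 because each port is full duplex


if __name__ == "__main__":
    # 12 ports at 10 Gb/s, as in the switch described above
    print(f"{nonblocking_fabric_gbps(12, 10)} Gb/s of switching capacity")
```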

While some institutions are beginning to upgrade internally to 10G, the ones that adopted 1G over a decade ago often used Cat5 cabling internally, which isn't capable of reliable 10G connectivity. Those institutions now face the possibility of having to rewire everything they want to provide 10G support to. Anyway, my stopgap solution was to just set up a deadnet in the lab for NAS and inter-machine communications. Really no different than a small cluster :^)

I will add that I was not trying to sell Synology on labs with solid IT expertise needing to set up a petabyte of storage. I was saying that for smaller labs Synology presents a very friendly solution, and I dispute the statement that it is "lower quality" in some way.

I have been running standard Supermicro storage rackmount units for ~15 years (a common platform for TrueNAS; enterprise grade). They are excellent, of course, and I highly recommend them, but over a 5+ year period they are NOT worry-free. They develop hardware issues, such as failed power supplies, failed RAID cards, etc. Further, if you fully populate them at time of purchase, that also means that the drives will start approaching EOL all at about the same time (anywhere from 3-8 years depending on how lucky you got with a particular batch of drives), leading to very high risk of data loss even with RAID 6. These boxes are not for labs that say "we don't have much IT experience/support", even if you run something like FreeNAS. They will likely be great for 3-4 years before you start running into issues.

Before the Synology boxes, the standard solution in the lab was to buy Supermicro workstations with 8 hot-swap drive bays and an internal hardware RAID card. This would give about 800-900 MB/s of bandwidth as a RAID 5, and have a lot of space. However, this storage is all local to the machine, and moving large data over a 1 Gb network runs at ~100 MB/s, which is painful for some things. The Synology boxes can sit under the desk, will email you when there is a drive failure or any other issue, and can still provide roughly the same performance as the previous CPU-tied storage (with the 10 GbE card).
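To put those rates in perspective, here is a minimal sketch of the transfer-time arithmetic in Python. The 100 MB/s and 900 MB/s rates are the figures quoted above; the 5 TB dataset size and the function name are assumptions chosen only for illustration, and real transfers will vary with protocol overhead and disk load.

```python
def transfer_hours(size_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move size_tb terabytes at a sustained rate in MB/s."""
    size_mb = size_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return size_mb / rate_mb_per_s / 3600


# Sustained rates in MB/s, taken from the figures quoted in the text.
RATES = {
    "1 Gb network": 100,
    "10 GbE NAS or local RAID": 900,
}

if __name__ == "__main__":
    dataset_tb = 5  # hypothetical dataset size
    for label, rate in RATES.items():
        hours = transfer_hours(dataset_tb, rate)
        print(f"{label}: {hours:.1f} h for {dataset_tb} TB")
```

Under these assumptions the same dataset takes most of a working day over 1 Gb and well under two hours at the faster rate.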

I will say that one of my biggest issues over the last couple of years has been RAID cards failing in the Supermicro units, and having to go to eBay, etc., to find equivalent replacements, to avoid having to wipe 500 TB of data and copy from backup (the newer RAID cards are often not quite compatible enough with the older cards). Even with the Synology boxes, expect that you will likely have to 'refresh' the technology in 5-8 years.
--------------------------------------------------------------------------------------
Steven Ludtke, Ph.D. <sludtke at bcm.edu>                      Baylor College of Medicine
Charles C. Bell Jr., Professor of Structural Biology
Dept. of Biochemistry and Molecular Biology                      (http://www.bcm.edu/biochem)
Academic Director, CryoEM Core                                        (cryoem.bcm.edu)
Co-Director CIBR Center                                    (http://www.bcm.edu/research/cibr)




On Apr 2, 2021, at 7:12 PM, Charles Bayly-Jones <charles.bayly-jones at monash.edu> wrote:

Hi all.

Our lab is also currently weighing similar options for a moderately large server, so this email chain has been quite serendipitous.

I'm wondering whether those (Chen, Steve, Matthias, Tim) with external servers (45Drives, TrueNAS, Synology, etc.) might comment on their network setup? Did you opt for a special network switch? Has this been necessary, or can these servers directly host the workstations? What kind of local network infrastructure do you use? Connections: SFP+, RJ45, or some fancy optical thing?

I note Steve has said that in the small 12-drive Synology NAS, a 10 GbE (RJ45?) card has been sufficient for ~950/600 MB/s IO. I recall the Synology can be networked directly to workstations without the need for a switch, if I'm not mistaken. So what is people's experience with the other, larger servers?

Perhaps you might also comment on your typical use case? E.g. is your server for long term storage of processed or raw data? Or is it rather a scratch space for analysis with read & write?

Very best,
Charles
_________________

Charles Bayly-Jones
BSc(ScSchProg)(Hons)

Monash Biomedicine Discovery Institute & ARC CoE in advanced molecular imaging
Department of Biochemistry and Molecular Biology
Rm 218, Level 2, Building 77
23 Innovation Walk
Clayton VIC 3800
Australia


On Sat, 3 Apr 2021 at 04:23, Xu, Chen <Chen.Xu at umassmed.edu> wrote:
We also have very positive experience with FreeNAS/TrueNAS systems. Here is the shop for their
off-the-shelf units.

https://www.amazon.com/stores/iXsystems/page/C88EFDE3-3E4C-4951-860E-0E8A8BD91BF9?ref_=ast_bln

-Chen

________________________________________
From: 3dem <3dem-bounces at ncmir.ucsd.edu> on behalf of Matthias Wolf <matthias.wolf at oist.jp>
Sent: Friday, April 2, 2021 11:27 AM
To: Krishan Pandey; 3dem at ncmir.ucsd.edu
Subject: Re: [3dem] Data storage system

Hi Krishan,

Our 60-drive 760 TB FreeNAS has been running flawlessly for more than a year. One failed drive was easily hot swapped. If you want capacity, this does not cost more per TB than inferior hardware, yet you get good redundancy and performance.

See my old tweet here: https://twitter.com/hicryoem/status/1223966559976083458?s=21

   Matthias

________________________________
From: 3dem <3dem-bounces at ncmir.ucsd.edu> on behalf of Krishan Pandey <krishan.pandey at health.slu.edu>
Sent: Friday, April 2, 2021 03:59
To: 3dem at ncmir.ucsd.edu
Subject: [3dem] Data storage system

Hello,

I am requesting suggestions and cost estimates for off-the-shelf data storage systems to store raw cryo-EM movies and processed data for our lab. Our initial target is 150-200 TB, with options to expand it in the future.
We don't have much local IT support for Linux-based systems, which is why I am asking for an off-the-shelf system that should be easier to install and manage.

Thank you
best regards

Krishan Pandey

_______________________________________________
3dem mailing list
3dem at ncmir.ucsd.edu
https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem