[3dem] Data storage system

Matthias Wolf matthias.wolf at oist.jp
Fri Apr 2 22:04:38 PDT 2021


I second Lu’s suggestion – tool-free saves a lot of time. Hot-swap bays are a must, of course.
Also thanks for the Dropbox tip, Lu – I did not know it was that affordable. This is great for archival, even if slow!

To correct my previous post – our server is also Supermicro, not Tyan… :-]

There is one more related aspect: although the trays on this server are tool-free, they are cheaply made and quite fragile. Also, because FreeNAS is software-based, the OS only knows each drive's location logically. So when drive XY failed, I had to identify its physical position. The system has a way to identify the drive by LED, but not through FreeNAS. Steve Ludtke gave good advice about many things, including the advantage of robust hardware-based RAID controllers, where drive identification is a no-brainer. In the end I learned how to ID the correct drive, but it took some thinking – the worst case is pulling the wrong drive from a degraded array…
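
For anyone hitting the same problem, the mapping can be scripted: `zpool status` names the failed member, and on a FreeBSD-based FreeNAS with a SES-capable enclosure, `sesutil` can blink that slot's LED. A minimal sketch (pool and device names are hypothetical, and the parsing is shown on sample output):

```shell
# Pull the name of any FAULTED/UNAVAIL member out of `zpool status` output.
find_failed() {
  awk '$2 == "FAULTED" || $2 == "UNAVAIL" { print $1 }'
}

# Sample `zpool status` fragment (hypothetical pool/device names);
# on the real box you would pipe the live output in instead.
sample='  NAME        STATE     READ WRITE CKSUM
  tank        DEGRADED     0     0     0
    raidz2-0  DEGRADED     0     0     0
      da3     ONLINE       0     0     0
      da14    FAULTED      6     1     0'

bad=$(printf '%s\n' "$sample" | find_failed)
echo "$bad"    # da14

# Then, on the server itself (FreeBSD 11+ userland):
#   camcontrol identify "$bad" | grep -i serial   # SATA; matches the drive label
#   sesutil locate "$bad" on                      # blink the slot LED
#   sesutil locate "$bad" off                     # after the swap
```

With a hardware RAID controller this is one button in the management tool, which is exactly Steve's point.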

   Matthias


From: Lu Gan <lu at anaphase.org>
Sent: Saturday, April 03, 2021 11:52 AM
To: Matthias Wolf <matthias.wolf at oist.jp>
Cc: Charles Bayly-Jones <charles.bayly-jones at monash.edu>; Xu, Chen <Chen.Xu at umassmed.edu>; 3dem at ncmir.ucsd.edu
Subject: Re: [3dem] Data storage system

Make sure you get a server that has tool-free HDD / SSD caddies. The older servers use caddies that require 4 screws to secure each HDD. Why is this relevant? The HDDs you buy today will not have enough capacity in 3 - 4 years. At that point you'll need to upgrade tens of HDDs, which is a breeze if the caddies are tool-free, but a huge time sink if they're not. The Supermicro servers that Matthias and Craig mentioned should have tool-free caddies. If you order a different model, double check, because some of the older-style servers are still being sold. In addition to Supermicro, you can check this website, which has good reviews of modern servers that point out important features like the caddies:
https://www.servethehome.com/category/storage/

Regarding backups, a slower but cheaper alternative is Dropbox's advanced business plan. It's < USD 1K per year for "as much as you need" storage. The downside is that transfer to/from it is slow. In Singapore, we get 20 - 40 MB/sec. To put this range in perspective, we recently had to restore all our backed-up data (35 TB) to our local server, and it took 2 weeks.
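
Those numbers are self-consistent: restore time is just data size over bandwidth. A quick back-of-the-envelope check at the mid-range 30 MB/s:

```shell
# 35 TB at ~30 MB/s: converting TB -> MB gives seconds directly at MB/s rates
tb=35; rate_mb_s=30
seconds=$(( tb * 1000 * 1000 / rate_mb_s ))
days=$(( seconds / 86400 ))
echo "${days} days"    # 13 days -- about two weeks, as observed
```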

Cheers.
Lu

--
Lu Gan
Associate Professor
Department of Biological Sciences
Centre for BioImaging Sciences
National University of Singapore
14 Science Drive 4
S1A, Lvl 2
Singapore 117543

http://www.anaphase.org

Tel: (65) 6516 8868


On Sat, Apr 3, 2021 at 10:14 AM Matthias Wolf <matthias.wolf at oist.jp> wrote:
Hi Charles,

My Tyan FreeNAS box has a quad-port 10G SFP+ baseboard-mounted NIC (AOC brand, purchased from Tyan through Superbiiz with the server). I have configured two of its ports for link aggregation with LACP, which is supported by FreeNAS. Two optical fibers are connected to a floor switch managed by IT, which also supports LACP. The uplink from the floor switch to a core switch in the datacenter has much higher bandwidth (100 Gb/s, I think), and this is an all-optical network. My workstations in the lab have 10Gb copper ethernet NICs (Solarflare or Intel) – they are connected to the floor switch with SFP-UTP transceivers on the switch side, because the cables in the building are all ethernet. Fortunately, these are all CAT6a or CAT7, since the buildings are all less than 10 years old. The highest transfer rate I have seen between a cluster front end and our FreeNAS box by rsync was about 1 GB/s (~10 Gb/s) disk-to-disk. I have not yet measured the network speed over this route with a tool like iperf. Between this storage server and our lab workstations it is less (200-300 Mb/s or so), but that's probably limited by the local RAID5 array on the workstations, or by the lack of network tuning on the workstations, which run CentOS 8.
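
For reference, the two-port LACP aggregation described above maps to a FreeBSD lagg interface roughly like this (a sketch only – the ix0/ix1 interface names are hypothetical, and on FreeNAS you would configure this through the web UI rather than editing rc.conf):

```shell
# /etc/rc.conf fragment: aggregate two 10G SFP+ ports with LACP.
# The corresponding switch ports must be configured as an LACP group as well.
ifconfig_ix0="up"
ifconfig_ix1="up"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport ix0 laggport ix1 DHCP"
```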

We have a rack enclosure in a side room to the Titan Krios that hosts the K2 Summit processor, the Tyan FreeNAS box, a Warp PC and two FEI Falcon storage servers. All of these have 10G interfaces and are connected to a local optical switch managed by IT (also in that rack). The FreeNAS box also has a BMC. Our HPC people did not want the Tyan server in their datacenter, because they consider Tyan/Supermicro etc. inferior hardware without a service contract. They said there were many reports of Supermicro PSUs catching fire and they could not host such hardware in the datacenter. I understand this and I agree, but I bought the Tyan server because it was much cheaper than “professional” datacenter-grade servers.

Before the FreeNAS, I had bought a DDN storage server (55 drives, later expanded with an additional 80-drive expansion chassis), which is administered by IT and hosted directly in the OIST datacenter. Because the DDN has smaller drives, is a couple of years older and has ridiculous redundancy (wasting more than 30% of the raw capacity), its total usable capacity is pretty much the same as the FreeNAS. The annual service contract for the DDN alone is 1/3 of the total cost of the Tyan FreeNAS box. I have therefore decided to let the DDN service contract lapse until something fails… it's not a comfortable solution, but I just cannot afford it any more. I am considering swapping its drives for new, larger ones and installing FreeNAS on it.

For file access we use NFS and CIFS. But I still have not found time to figure out NIS on CentOS 8. This was no problem on CentOS 7, but for some reason CentOS 8 is much harder in this respect. I have one Linux box for user authentication in the lab using LDAP, because I want control over my local segment, and IT demands that I give up root access when authenticating through their central AD domain controller. FreeNAS has all the major network services preconfigured, and it is a simple matter of enabling the service. I like it quite a bit.
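
As a concrete example of how little there is to it, an NFS export on the FreeBSD side boils down to one line per share (path and subnet here are hypothetical; on FreeNAS itself you would create the share in the Sharing section of the web UI rather than editing /etc/exports by hand):

```shell
# /etc/exports fragment: share the data volume read/write with the lab subnet
/mnt/tank/cryoem -maproot=root -network 10.0.42.0/24
```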

I highly recommend working with your IT team to have them lock down your cryo-EM network at the router level – for many trouble-free years, ours has been a dedicated network segment isolated from the internet and the rest of the university, allowing routing only of the TeamViewer and RAPID ports. This way we got rid of the FEI service PCs, which acted as firewalled gateways, were flimsy hardware, and killed file-transfer bandwidth. The Thermo FSEs actually prefer TeamViewer over RAPID (which you can still run directly on the MPC on demand). All the while, we can access our scope PCs and workstations through TeamViewer, and once connected to an internal workstation I could even remotely reboot my FreeNAS via its BMC and a web interface.
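
As an illustration, that lockdown is essentially a default-deny policy with two narrow pass rules; a hypothetical pf.conf sketch (TCP 5938 is TeamViewer's native port; the interface name and RAPID port are placeholders you would replace with your own):

```shell
# /etc/pf.conf sketch for the router facing the cryo-EM segment
cryo_if = "em1"          # hypothetical interface name
rapid_port = "7788"      # placeholder -- substitute your actual RAPID port
block all                                            # default deny
pass on $cryo_if proto tcp to any port 5938          # TeamViewer
pass on $cryo_if proto tcp to any port $rapid_port   # RAPID
```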

Hope this provides some inspiration…

   Matthias

From: Charles Bayly-Jones <charles.bayly-jones at monash.edu>
Sent: Saturday, April 03, 2021 9:12 AM
To: Xu, Chen <Chen.Xu at umassmed.edu>
Cc: Matthias Wolf <matthias.wolf at oist.jp>; Krishan Pandey <krishan.pandey at health.slu.edu>; 3dem at ncmir.ucsd.edu
Subject: Re: [3dem] Data storage system

Hi all.

Our lab is also currently making similar considerations for a moderately large server. So this email chain has been quite serendipitous.

I'm wondering whether those (Chen, Steve, Matthias, Tim) with external servers (45Drives, TrueNAS, Synology, etc.) might comment on their network setup? Did you opt for a special network switch? Has this been necessary, or can these servers host the workstations directly? What kind of local network infrastructure do you use? Connections – SFP+/RJ45/some fancy optical thing?

I note Steve said that on the small 12-drive Synology NAS, a 10GbE (RJ45?) card has been sufficient for ~950/600 MB/s I/O. I recall the Synology can be networked directly to workstations without the need for a switch, if I'm not mistaken. So what is people's experience with the other, larger servers?

Perhaps you might also comment on your typical use case? E.g. is your server for long-term storage of processed or raw data? Or is it rather scratch space for analysis with read & write?

Very best,
Charles
_________________

Charles Bayly-Jones
BSc(ScSchProg)(Hons)

Monash Biomedicine Discovery Institute & ARC CoE in advanced molecular imaging
Department of Biochemistry and Molecular Biology
Rm 218, Level 2, Building 77
23 Innovation Walk
Clayton VIC 3800
Australia

On Sat, 3 Apr 2021 at 04:23, Xu, Chen <Chen.Xu at umassmed.edu> wrote:
We also have had a very positive experience with the FreeNAS/TrueNAS system. Here is the shop for their
off-the-shelf units.

https://www.amazon.com/stores/iXsystems/page/C88EFDE3-3E4C-4951-860E-0E8A8BD91BF9?ref_=ast_bln

-Chen

________________________________________
From: 3dem <3dem-bounces at ncmir.ucsd.edu> on behalf of Matthias Wolf <matthias.wolf at oist.jp>
Sent: Friday, April 2, 2021 11:27 AM
To: Krishan Pandey; 3dem at ncmir.ucsd.edu
Subject: Re: [3dem] Data storage system

Hi Krishan,

Our 60-drive 760 TB FreeNAS has been running flawlessly for more than a year. One failed drive was easily hot-swapped. If you want capacity, this does not cost more per TB than inferior hardware, yet you get good redundancy and performance.

See my old tweet here: https://twitter.com/hicryoem/status/1223966559976083458?s=21

   Matthias

________________________________
From: 3dem <3dem-bounces at ncmir.ucsd.edu> on behalf of Krishan Pandey <krishan.pandey at health.slu.edu>
Sent: Friday, April 2, 2021 03:59
To: 3dem at ncmir.ucsd.edu
Subject: [3dem] Data storage system

Hello,

I am requesting suggestions and cost estimates for off-the-shelf data storage systems to store raw cryo-EM movies and processed data for our lab. Our initial target is 150-200 TB, with options to expand in the future.
We don't have much local IT support for Linux-based systems, which is why I am asking about an off-the-shelf system that should be easier to install and manage.

Thank you
best regards

Krishan Pandey

_______________________________________________
3dem mailing list
3dem at ncmir.ucsd.edu
https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem

