[3dem] Relion Benchmarks for NVIDIA RTX 2080

Takanori Nakane tnakane at mrc-lmb.cam.ac.uk
Thu Mar 21 04:57:03 PDT 2019


Hi,

Note that our ribosome Class3D benchmark is relatively CPU-heavy.
The M-step is not GPU-accelerated and can use
only one thread per volume (i.e. half map or class),
so single-thread performance is important.
In contrast, the E-step is GPU-accelerated. Some calculations
in the E-step also run on the CPU (e.g. calculation of the CTF),
but they can use multiple threads.
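
One way to check whether a run is held back by a single busy core (as the M-step would be) rather than by overall CPU load is to look at per-core utilisation. The following is a minimal Python sketch, assuming a Linux host where /proc/stat is available; the function names are illustrative and not part of Relion.

```python
import time


def parse_proc_stat(text):
    """Return {core_name: (busy_jiffies, total_jiffies)} for each per-core line."""
    cores = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-core lines look like: "cpu0 user nice system idle iowait irq softirq steal ..."
        # Skip the aggregate "cpu" line and everything that is not a CPU line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        values = [int(v) for v in fields[1:]]
        # Count idle and iowait as "not busy".
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        # The first eight counters cover all time; guest time is already inside user/nice.
        total = sum(values[:8])
        cores[fields[0]] = (total - idle, total)
    return cores


def core_utilisation(before_text, after_text):
    """Busy fraction (0..1) of each core between two /proc/stat snapshots."""
    before = parse_proc_stat(before_text)
    after = parse_proc_stat(after_text)
    utilisation = {}
    for name, (busy_after, total_after) in after.items():
        if name not in before:
            continue
        busy_before, total_before = before[name]
        elapsed = total_after - total_before
        utilisation[name] = (busy_after - busy_before) / elapsed if elapsed > 0 else 0.0
    return utilisation


def sample_core_utilisation(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart (Linux only)."""
    with open(path) as handle:
        before = handle.read()
    time.sleep(interval)
    with open(path) as handle:
        after = handle.read()
    return core_utilisation(before, after)
```

Calling `sample_core_utilisation()` repeatedly while a job is running should show whether a few cores sit near 1.0 while the rest are idle, which would be consistent with a phase limited by single-thread performance.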

The relative fractions of the E-step and the M-step vary
with many factors, such as the number of particles, the box size
and the job type (Refine3D, Class3D, Class2D).
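
To make the effect of these fractions concrete, here is a rough Python sketch that fits a two-part model, T(n) = fixed + scalable / n, to the quoted CUDA 9.2 timings without pre-read (5.24 h on one GPU, 3.00 h on two). The model and the function names are my own illustration. The "fixed" part lumps together everything that does not speed up with more GPUs (M-step, I/O, MPI overhead), so it is not a measurement of the M-step alone.

```python
def fit_fixed_plus_scalable(t_one_gpu, t_two_gpus):
    """Solve T(n) = fixed + scalable / n from the timings at n = 1 and n = 2."""
    scalable = 2.0 * (t_one_gpu - t_two_gpus)
    fixed = t_one_gpu - scalable
    return fixed, scalable


def predict_hours(fixed, scalable, n_gpus):
    """Predicted wall-clock time for n_gpus under the two-part model."""
    return fixed + scalable / n_gpus


# Timings in hours, taken from the quoted benchmark summary (CUDA 9.2, no pre-read).
fixed, scalable = fit_fixed_plus_scalable(5.24, 3.00)
predicted_4 = predict_hours(fixed, scalable, 4)

print(f"fixed part    : {fixed:.2f} h")        # 0.76 h
print(f"scalable part : {scalable:.2f} h")     # 4.48 h
print(f"predicted 4x  : {predicted_4:.2f} h")  # 1.88 h (2.00 h was reported)
```

Under this simple model the non-scaling part already accounts for roughly 40% of the predicted 4-GPU run time, which is why adding GPUs gives diminishing returns on this particular benchmark. A job with a different E-step/M-step balance would give different numbers.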

Optimising against the ribosome Class3D benchmark might not necessarily
improve performance of typical jobs at your lab.

Best regards,

Takanori Nakane

> Hi James,
> Thanks for the information.  We’ve noticed that for GTX1080 and above
> (1080, 1080Ti, etc.) CPU performance can have a significant impact on
> benchmark time.  Can you give any sense of the CPU utilisation during the
> runs?
> Thanks,
> Dr. T.J. Ragan
> Senior Research Computation Officer
> Leicester Institute of Structural and Chemical Biology
> University of Leicester, University Road, Leicester LE1 7RH, UK
> t: +44 (0)116 223 1287
> e: TJ.Ragan at leicester.ac.uk
> w: www.le.ac.uk/liscb
> On 4 Mar 2019, at 17:36, James Montantes
> <james.montantes at exxactcorp.com> wrote:
> Dear Relion users,
> Exxact Corporation has performed a series of Relion 3.0 benchmarks
> using NVIDIA RTX 2080 GPUs. These were run using CUDA 9.2 and CUDA 10.0.
> See below for our results.  We plan to release more benchmarks in the
> future.
> Version: Relion 3.0 Git version as of Oct 21st, 2018
> System: Exxact Relion Cryo-em Workstation
> <https://www.exxactcorp.com/Relion-for-Cryo-EM-Solutions>
> 2 x Intel Xeon Scalable Silver 4114 CPUs,  4 x RTX2080, 2 x 2TB SSD Raid
> 0, NVIDIA Driver Version: 410.57
> Benchmark Set 1 - from
> https://www2.mrc-lmb.cam.ac.uk/relion/index.php?title=Benchmarks_%26_computer_hardware
> wget ftp://ftp.mrc-lmb.cam.ac.uk/pub/scheres/relion_benchmark.tar.gz
> tar -zxf relion_benchmark.tar.gz
> SUMMARY               CUDA 9.2 sm6.1 (hrs)   CUDA 10.0 sm7.5 (hrs)
> 1 x RTX2080           5.24                   5.79
> 1 x RTX2080 Preread   5.10                   5.22
> 2 x RTX2080           3.00                   3.04
> 2 x RTX2080 Preread   2.68                   2.69
> 4 x RTX2080           2.00                   2.05
> COMMANDS
> 3D Classification
> Base Command
> mpirun -n XXX `which relion_refine_mpi` --i Particles/shiny_2sets.star
> --ref emd_2660.map:mrc --firstiter_cc --ini_high 60 --ctf
> --ctf_corrected_ref --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6
> --flatten_solvent --zero_mask --oversampling 1 --healpix_order 2
> --offset_range 5 --offset_step 2 --sym C1 --norm --scale --random_seed 0
> --o class3d
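
In the runs listed below, XXX is always the number of GPUs plus one (2, 3 and 5 ranks for 1, 2 and 4 GPUs), which is consistent with relion_refine_mpi using one master rank plus one worker rank per GPU, and the `--gpu` argument lists one device ID per worker, separated by colons. As a small Python sketch of that pattern (the helper name is hypothetical and this only reproduces the layout used in these benchmarks, not a general rule for every setup):

```python
def relion_mpi_layout(gpu_ids):
    """Return (mpi_ranks, gpu_argument) for one worker rank per GPU plus one master rank."""
    n_ranks = len(gpu_ids) + 1                          # workers + master
    gpu_argument = ":".join(str(g) for g in gpu_ids)    # e.g. "0:1:2:3"
    return n_ranks, gpu_argument


for gpus in ([0], [0, 1], [0, 1, 2, 3]):
    ranks, gpu_argument = relion_mpi_layout(gpus)
    print(f"mpirun -n {ranks} ... --gpu {gpu_argument}")
```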
> CUDA 9.2, sm6.1  (Default Relion 3.0beta2 build)
> ### 1 x RTX2080 ###
> mpirun -n 2 `which relion_refine_mpi` --i Particles/shiny_2sets.star
> --ref emd_2660.map:mrc --firstiter_cc --ini_high 60 --ctf --ctf_corrected_ref
> --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 --flatten_solvent
> --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5
> --offset_step 2 --sym C1 --norm --scale --random_seed 0 --o class3d
> --scratch_dir /scr --gpu 0 --pool 100 --dont_combine_weights_via_disc
> --j 6
> 5.24 HOURS
> ### 1 x 2080 pre-read ###
> mpirun -n 2 `which relion_refine_mpi` --i Particles/shiny_2sets.star
> --ref emd_2660.map:mrc --firstiter_cc --ini_high 60 --ctf --ctf_corrected_ref
> --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 --flatten_solvent
> --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5
> --offset_step 2 --sym C1 --norm --scale --random_seed 0 --o class3d
> --preread_images --gpu 0 --pool 100 --dont_combine_weights_via_disc
> --j 6
> 5.10 HOURS
> ### 2 x RTX2080 ###
> mpirun -n 3 `which relion_refine_mpi` --i Particles/shiny_2sets.star
> --ref emd_2660.map:mrc --firstiter_cc --ini_high 60 --ctf --ctf_corrected_ref
> --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 --flatten_solvent
> --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5
> --offset_step 2 --sym C1 --norm --scale --random_seed 0 --o class3d
> --scratch_dir /scr --gpu 0:1 --pool 100 --dont_combine_weights_via_disc
> --j 6
> 3.00 HOURS
> ### 2 x 2080 pre-read ###
> mpirun -n 3 `which relion_refine_mpi` --i Particles/shiny_2sets.star
> --ref emd_2660.map:mrc --firstiter_cc --ini_high 60 --ctf --ctf_corrected_ref
> --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 --flatten_solvent
> --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5
> --offset_step 2 --sym C1 --norm --scale --random_seed 0 --o class3d
> --preread_images --gpu 0:1 --pool 100 --dont_combine_weights_via_disc
> --j 6
> 2.68 HOURS
> ### 4 x 2080 ###
> mpirun -n 5 `which relion_refine_mpi` --i Particles/shiny_2sets.star
> --ref emd_2660.map:mrc --firstiter_cc --ini_high 60 --ctf --ctf_corrected_ref
> --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 --flatten_solvent
> --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5
> --offset_step 2 --sym C1 --norm --scale --random_seed 0 --o class3d
> --scratch_dir /scr --gpu 0:1:2:3 --pool 100
> --dont_combine_weights_via_disc --j 6
> 2.00 HOURS
> CUDA 10.0, sm7.5
> ### 1 x 2080 ###
> mpirun -n 2 `which relion_refine_mpi` --i Particles/shiny_2sets.star
> --ref emd_2660.map:mrc --firstiter_cc --ini_high 60 --ctf --ctf_corrected_ref
> --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 --flatten_solvent
> --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5
> --offset_step 2 --sym C1 --norm --scale --random_seed 0 --o class3d
> --scratch_dir /scr --gpu 0 --pool 100 --dont_combine_weights_via_disc
> --j 6
> 5.79 HOURS
> ### 1 x 2080 pre-read ###
> mpirun -n 2 `which relion_refine_mpi` --i Particles/shiny_2sets.star
> --ref emd_2660.map:mrc --firstiter_cc --ini_high 60 --ctf --ctf_corrected_ref
> --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 --flatten_solvent
> --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5
> --offset_step 2 --sym C1 --norm --scale --random_seed 0 --o class3d
> --preread_images --gpu 0 --pool 100 --dont_combine_weights_via_disc
> --j 6
> 5.22 HOURS
> ### 2 x 2080 ###
> mpirun -n 3 `which relion_refine_mpi` --i Particles/shiny_2sets.star
> --ref emd_2660.map:mrc --firstiter_cc --ini_high 60 --ctf --ctf_corrected_ref
> --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 --flatten_solvent
> --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5
> --offset_step 2 --sym C1 --norm --scale --random_seed 0 --o class3d
> --scratch_dir /scr --gpu 0:1 --pool 100 --dont_combine_weights_via_disc
> --j 6
> 3.04 HOURS
> ### 2 x 2080 pre-read ###
> mpirun -n 3 `which relion_refine_mpi` --i Particles/shiny_2sets.star
> --ref emd_2660.map:mrc --firstiter_cc --ini_high 60 --ctf --ctf_corrected_ref
> --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 --flatten_solvent
> --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5
> --offset_step 2 --sym C1 --norm --scale --random_seed 0 --o class3d
> --preread_images --gpu 0:1 --pool 100 --dont_combine_weights_via_disc
> --j 6
> 2.69 HOURS
> ### 4 x 2080 ###
> mpirun -n 5 `which relion_refine_mpi` --i Particles/shiny_2sets.star
> --ref emd_2660.map:mrc --firstiter_cc --ini_high 60 --ctf --ctf_corrected_ref
> --iter 25 --tau2_fudge 4 --particle_diameter 360 --K 6 --flatten_solvent
> --zero_mask --oversampling 1 --healpix_order 2 --offset_range 5
> --offset_step 2 --sym C1 --norm --scale --random_seed 0 --o class3d
> --scratch_dir /scr --gpu 0:1:2:3 --pool 100
> --dont_combine_weights_via_disc --j 6
> 2.05 HOURS
> Best Regards,
> James Montantes
> Ph 510.226.7366 x319 | M 408.761.3071 | Fax 510.226.7367
> Exxact Corporation | www.exxactcorp.com
> Leading HPC and AV Solutions Provider
> ISO 9001:2008 Certified
> _______________________________________________
> 3dem mailing list
> 3dem at ncmir.ucsd.edu
> https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem




