[3dem] GPU-ERROR - Relion-3.0-beta

Takanori Nakane tnakane at mrc-lmb.cam.ac.uk
Mon Feb 4 23:45:29 PST 2019


Hi,

Which version of nvcc did you use to compile it?
Is the right version of the CUDA runtime in your LD_LIBRARY_PATH?
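
One way to check the second point is to list every CUDA runtime library visible on the search path and compare the version suffixes with what `nvcc --version` reports. The following is only a minimal sketch: the helper name `find_cudart` is made up for illustration, and it assumes the binary links the runtime dynamically (a statically linked runtime would not be affected by LD_LIBRARY_PATH at all).

```python
import glob
import os


def find_cudart(ld_library_path):
    """Return (directory, filename) pairs for CUDA runtime libraries found
    in a colon-separated search path, in search order."""
    hits = []
    for directory in ld_library_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing colon
        pattern = os.path.join(directory, "libcudart.so*")
        for path in sorted(glob.glob(pattern)):
            hits.append((directory, os.path.basename(path)))
    return hits


if __name__ == "__main__":
    # The first directory listed is the one the dynamic loader tries first.
    for directory, name in find_cudart(os.environ.get("LD_LIBRARY_PATH", "")):
        print(directory, name)
```

If the first hit comes from a different CUDA toolkit than the nvcc used for the build, that mismatch would be worth ruling out.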

Best regards,

Takanori Nakane

On 2019/02/05 5:20, Dieter Blaas wrote:
> Dear all,
> 
>      I did a test install of the latest Relion-3.0-beta on a
> fairly powerful workstation (CentOS Linux release 7.5.1804), and
> everything runs fine on all 64 CPUs. However, when I run e.g. Class2D
> on 2 or 4 GPUs (set to 5 MPIs/4 threads and 3 MPIs/2 threads,
> respectively), I get the error below immediately after starting the
> run. The 4 GPUs are 1080 Ti cards (11 GB). To the best of my knowledge,
> none of the possibilities listed in the error message applies. I am not
> very familiar with compilation and cannot rule out that I did something
> wrong. However, cmake3 followed by make completed without errors, and I
> previously installed this version of relion on two other machines
> without any issue!
> 
> Thanks for any hints. Best wishes, Dieter
> 
> -------------------------------------------
> 
> ERROR: unknown error in 
> /home/blaas/software/RELIONCUDA/relion-3.0_beta/src/ml_optimiser_mpi.cpp 
> at line 128 (error-code 30)
> in: 
> /home/blaas/software/RELIONCUDA/relion-3.0_beta/src/acc/cuda/cuda_settings.h, 
> line 67
> === Backtrace  ===
> /home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi(_ZN11RelionErrorC1ERKSsS1_l+0x41) 
> [0x448f81]
> /home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi() 
> [0x45e513]
> /home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi(_ZN14MlOptimiserMpi10initialiseEv+0x2284) 
> [0x468a54]
> /home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi(main+0xb79) 
> [0x4367b9]
> /usr/lib64/libc.so.6(__libc_start_main+0xf5) [0x7fc01f87a3d5]
> /home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi() 
> [0x439b0f]
> ==================
> ERROR:
> 
> A GPU-function failed to execute.
> 
>   If this occured at the start of a run, you might have GPUs which
> are incompatible with either the data or your installation of relion.
> If you
> 
>      -> INSTALLED RELION YOURSELF: if you e.g. specified -DCUDA_ARCH=50
>         and are trying ot run on a compute 3.5 GPU (-DCUDA_ARCH=3.5),
>         this may happen.
> 
>      -> HAVE MULTIPLE GPUS OF DIFFERNT VERSIONS: relion needs GPUS with
>         at least compute 3.5. You may be trying to use a GPU older than
>         this. If you have multiple generations, try specifying --gpu <X>
>         with X=0. Then try X=1 in a new run, and so on. The numbering of
>         GPUs may not be obvious from the driver or intuition. For a list
>         of GPU compute generations, see
> 
> en.wikipedia.org/wiki/CUDA#Version_features_and_specifications
> 
>      -> ARE USING DOUBLE-PRECISION GPU CODE: relion was been written so
>         as to not require this, and may thus have unforeseen requirements
>         when run in this mode. If you think it is nonetheless necessary,
>         please consult the developers with this error.
> 
> 
> _______________________________________________
> 3dem mailing list
> 3dem at ncmir.ucsd.edu
> https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem
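
The first possibility in the quoted error message (a -DCUDA_ARCH build target newer than the GPU) can be checked by hand. The sketch below only encodes that one rule; the function names are invented for illustration, and the compute capability has to be looked up separately (a GTX 1080 Ti is compute capability 6.1).

```python
def parse_cuda_arch(value):
    """Turn a CUDA_ARCH-style value such as "50", "61" or "3.5" into (major, minor)."""
    text = str(value).strip()
    if "." in text:
        major, minor = text.split(".", 1)
        return int(major), int(minor)
    return divmod(int(text), 10)


def gpu_too_old_for_build(build_arch, device_capability):
    """True when the GPU's compute capability is below the build target,
    the case the error message describes as a likely failure."""
    return tuple(device_capability) < parse_cuda_arch(build_arch)


# Example from the error message: built with -DCUDA_ARCH=50, run on a compute 3.5 GPU.
print(gpu_too_old_for_build("50", (3, 5)))   # True
# A 1080 Ti (compute 6.1) is newer than a 3.5 build target.
print(gpu_too_old_for_build("35", (6, 1)))   # False
```

A result of False only rules out this one cause; it does not show that the build and the GPU are otherwise compatible.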
