[3dem] GPU-ERROR - Relion-3.0-beta

Dieter Blaas dieter.blaas at meduniwien.ac.at
Mon Feb 4 21:20:15 PST 2019


Dear all,

     I did a test install of the latest version of Relion-3.0-beta on a 
fairly powerful workstation (CentOS Linux release 7.5.1804), and 
everything runs fine on all 64 CPUs. However, when I run e.g. Class2D 
using 2 or 4 GPUs (set to 5 MPIs/4 Threads and 3 MPIs/2 Threads, 
respectively), I receive the error below immediately upon starting the 
run. The 4 GPUs are 1080 Ti (11 GB). To the best of my knowledge, none of 
the causes listed in the error message applies. I am not very familiar 
with compilation and cannot exclude that I did something wrong. However, 
there was no error during the compilation with cmake3 followed by make, 
and I previously installed this version of relion on two other machines 
without any issue.

Thanks for any hints. Best wishes, Dieter

-------------------------------------------

ERROR: unknown error in /home/blaas/software/RELIONCUDA/relion-3.0_beta/src/ml_optimiser_mpi.cpp at line 128 (error-code 30)
in: /home/blaas/software/RELIONCUDA/relion-3.0_beta/src/acc/cuda/cuda_settings.h, line 67
=== Backtrace  ===
/home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi(_ZN11RelionErrorC1ERKSsS1_l+0x41) [0x448f81]
/home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi() [0x45e513]
/home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi(_ZN14MlOptimiserMpi10initialiseEv+0x2284) [0x468a54]
/home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi(main+0xb79) [0x4367b9]
/usr/lib64/libc.so.6(__libc_start_main+0xf5) [0x7fc01f87a3d5]
/home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi() [0x439b0f]
==================
ERROR:

A GPU-function failed to execute.

  If this occurred at the start of a run, you might have GPUs which
are incompatible with either the data or your installation of relion.
If you

     -> INSTALLED RELION YOURSELF: if you e.g. specified -DCUDA_ARCH=50
        and are trying to run on a compute 3.5 GPU (-DCUDA_ARCH=35),
        this may happen.

     -> HAVE MULTIPLE GPUS OF DIFFERENT VERSIONS: relion needs GPUs with
        at least compute 3.5. You may be trying to use a GPU older than
        this. If you have multiple generations, try specifying --gpu <X>
        with X=0. Then try X=1 in a new run, and so on. The numbering of
        GPUs may not be obvious from the driver or intuition. For a list
        of GPU compute generations, see

        en.wikipedia.org/wiki/CUDA#Version_features_and_specifications

     -> ARE USING DOUBLE-PRECISION GPU CODE: relion has been written so
        as to not require this, and may thus have unforeseen requirements
        when run in this mode. If you think it is nonetheless necessary,
        please consult the developers with this error.
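
The first two checks suggested by the error text can be sketched in a few
lines of Python. This is only an illustration under stated assumptions, not
part of relion: the function names are hypothetical, the rule "the device
must be at least as new as the architecture the binary was built for" is a
simplification of how CUDA binaries are matched to devices, and the base
command is a placeholder for whatever arguments the failing run used. The
only relion option taken from the message is --gpu. For reference, a
GTX 1080 Ti has compute capability 6.1, which is well above the 3.5 minimum.

```python
# Minimal sketch (hypothetical helper names), based on the hints in the
# error message above. It does not query any hardware.

MIN_RELION_ARCH = 35  # "at least compute 3.5", taken from the error text


def cc_to_arch(compute_capability: str) -> int:
    """Turn a compute capability such as "6.1" into the CUDA_ARCH form 61."""
    major, minor = compute_capability.split(".")
    return int(major) * 10 + int(minor)


def diagnose(build_arch: int, device_cc: str) -> str:
    """Compare the -DCUDA_ARCH value used at build time with one device.

    Assumption (simplified): a binary built for a given architecture needs
    a device whose compute capability is at least that architecture.
    """
    device_arch = cc_to_arch(device_cc)
    if device_arch < MIN_RELION_ARCH:
        return "device older than compute 3.5"
    if device_arch < build_arch:
        return "build targets a newer architecture than the device"
    return "ok"


def single_gpu_commands(base_cmd, n_gpus):
    """One command per GPU index, as in 'try --gpu <X> with X=0, then X=1'."""
    return [list(base_cmd) + ["--gpu", str(i)] for i in range(n_gpus)]


if __name__ == "__main__":
    # The example from the message: built with -DCUDA_ARCH=50, run on 3.5.
    print(diagnose(50, "3.5"))
    # A 1080 Ti (compute capability 6.1) with the same build.
    print(diagnose(50, "6.1"))
    # Placeholder base command; substitute the arguments of the failing run.
    for cmd in single_gpu_commands(["relion_refine", "<usual arguments>"], 4):
        print(" ".join(cmd))
```

If each of the four single-GPU runs fails in the same way, a mismatch
between individual cards becomes less likely, and the build settings or the
driver and CUDA toolkit installation would be the next things to compare
with the two machines where the same version works.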



