[3dem] GPU-ERROR - Relion-3.0-beta

Dieter Blaas dieter.blaas at meduniwien.ac.at
Tue Feb 5 05:12:15 PST 2019


Hi Takanori and Dario,

Thanks! Unfortunately, "-DCUDA_ARCH=61" did not help.
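As a sanity check on the flag itself, here is a minimal sketch of how the value reads: the digits of -DCUDA_ARCH are the compute capability without the dot, so 61 means 6.1, which is what a GTX 1080 Ti reports, and the RELION error text asks for at least 3.5. The function names are illustrative only and are not part of RELION or CUDA.

```shell
#!/bin/sh
# Hypothetical helpers (not part of RELION or CUDA) for reading a CUDA_ARCH value.

# Turn "61" into "6.1": the last digit is the minor version, the rest the major.
arch_to_capability() {
    a=$1
    major=${a%?}
    minor=${a#"$major"}
    echo "$major.$minor"
}

# Succeed if the value meets the minimum of compute 3.5 (i.e. 35) that the
# RELION error message states.
arch_meets_minimum() {
    [ "$1" -ge 35 ]
}

arch=61   # the value passed to cmake via -DCUDA_ARCH
if arch_meets_minimum "$arch"; then
    echo "CUDA_ARCH=$arch is compute $(arch_to_capability "$arch"), meets the 3.5 minimum"
else
    echo "CUDA_ARCH=$arch is compute $(arch_to_capability "$arch"), below the 3.5 minimum"
fi
```

This only confirms that the flag value is plausible for these cards; it does not show that the binary actually being run was rebuilt with it.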

I have:

echo $PATH
/usr/lib64/openmpi3/bin:/home/blaas/bin/:/home/blaas/software/RELIONCUDA/relion-3.0_beta/build/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/cuda/bin/

echo $LD_LIBRARY_PATH
/home/blaas/software/usr/include:/home/blaas/software/usr/lib64:/usr/lib64:/home/blaas/software/RELIONCUDA/relion-3.0_beta/build/lib:/usr/lib64/openmpi/lib:/usr/lib64:/home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/lib:/usr/lib64/openmpi/lib
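One way to answer the runtime question is to walk each LD_LIBRARY_PATH entry and look for the CUDA runtime library. A minimal sketch, assuming the runtime is installed under its usual Linux name libcudart.so*; the function name is illustrative. Note that the list above has no entry under /usr/local/cuda, although a binary can also locate the runtime through an RPATH or the ld.so cache, so an empty result here is a hint rather than proof.

```shell
#!/bin/sh
# Hypothetical helper: print every libcudart.so* found in a colon-separated
# directory list such as $LD_LIBRARY_PATH.
find_cudart() {
    old_ifs=$IFS
    IFS=:
    for dir in $1; do
        # Skip empty or non-existent entries.
        [ -d "$dir" ] || continue
        for lib in "$dir"/libcudart.so*; do
            # An unmatched glob stays literal, so test that the file exists.
            if [ -e "$lib" ]; then
                echo "$lib"
            fi
        done
    done
    IFS=$old_ifs
}

matches=$(find_cudart "${LD_LIBRARY_PATH:-}")
if [ -n "$matches" ]; then
    echo "$matches"
else
    echo "no libcudart.so* found in any LD_LIBRARY_PATH entry"
fi
```

If several versions show up, the one in the earliest directory of the list is the one the dynamic loader would normally pick from LD_LIBRARY_PATH.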

nvcc --version gives me: Cuda compilation tools, release 10.0, V10.0.130

cat /usr/local/cuda/version.txt gives me: CUDA Version 10.0.130
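The two version strings can also be compared mechanically. A minimal sketch, assuming the output formats quoted above ("release 10.0, V10.0.130" from nvcc and "CUDA Version 10.0.130" from version.txt); the function names are illustrative. It only shows that the compiler and the installed toolkit agree, and says nothing about which runtime library or driver the binary picks up when it runs.

```shell
#!/bin/sh
# Hypothetical helpers: extract the version number from the two strings.

# Reads nvcc --version output on stdin and prints the full version after "V".
nvcc_version() {
    sed -n 's/.*release [0-9.]*, V\([0-9.]*\).*/\1/p'
}

# Reads the contents of version.txt on stdin and prints the version number.
toolkit_version() {
    sed -n 's/^CUDA Version \([0-9.]*\).*/\1/p'
}

# The sample strings below are the ones reported in this message.
from_nvcc=$(echo "Cuda compilation tools, release 10.0, V10.0.130" | nvcc_version)
from_file=$(echo "CUDA Version 10.0.130" | toolkit_version)

if [ "$from_nvcc" = "$from_file" ]; then
    echo "nvcc and version.txt agree: $from_nvcc"
else
    echo "mismatch: nvcc=$from_nvcc version.txt=$from_file"
fi
```

On a real machine the two echo commands would be replaced by nvcc --version and cat /usr/local/cuda/version.txt piped into the same functions.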

Anything missing?

bw Dieter


------------------------------------------------------------------------
Dieter Blaas,
Max F. Perutz Laboratories
Medical University of Vienna,
Inst. Med. Biochem., Vienna Biocenter (VBC),
Dr. Bohr Gasse 9/3,
A-1030 Vienna, Austria,
Tel: 0043 1 4277 61630,
Fax: 0043 1 4277 9616,
e-mail: dieter.blaas at meduniwien.ac.at
------------------------------------------------------------------------

On 05.02.2019 at 08:45, Takanori Nakane wrote:
> Hi,
>
> Which version of nvcc did you use for compilation?
> Do you have the right version of CUDA runtime in LD_LIBRARY_PATH?
>
> Best regards,
>
> Takanori Nakane
>
> On 2019/02/05 5:20, Dieter Blaas wrote:
>> Dear all,
>>
>>      I did a test install of the latest version of Relion-3.0-beta on 
>> a relatively potent workstation (CentOS Linux release 7.5.1804) and 
>> everything runs fine on all 64 CPUs. However, when running e.g. 
>> Class2D using 2 or 4 GPUs (set to 5 MPIs/4 Threads and 3 MPIs/2 
>> Threads, respectively) I receive the error below immediately upon 
>> starting the run. The 4 GPUs are 1080 Ti (11 GB). To the best of my 
>> knowledge none of the possibilities below applies. I am not very 
>> familiar with compilation and cannot exclude that I did something 
>> wrong. However, there was no error during the compilation with cmake3 
>> followed by make and I previously installed this version of relion on 
>> two other machines without any issue!
>>
>> Thanks for hints, bw Dieter
>>
>> -------------------------------------------
>>
>> ERROR: unknown error in /home/blaas/software/RELIONCUDA/relion-3.0_beta/src/ml_optimiser_mpi.cpp at line 128 (error-code 30)
>> in: /home/blaas/software/RELIONCUDA/relion-3.0_beta/src/acc/cuda/cuda_settings.h, line 67
>> === Backtrace  ===
>> /home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi(_ZN11RelionErrorC1ERKSsS1_l+0x41) [0x448f81]
>> /home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi() [0x45e513]
>> /home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi(_ZN14MlOptimiserMpi10initialiseEv+0x2284) [0x468a54]
>> /home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi(main+0xb79) [0x4367b9]
>> /usr/lib64/libc.so.6(__libc_start_main+0xf5) [0x7fc01f87a3d5]
>> /home/blaas/software/RELIONCUDA/relion-3.0_beta/ttt/bin/relion_refine_mpi() [0x439b0f]
>> ==================
>> ERROR:
>>
>> A GPU-function failed to execute.
>>
>>   If this occured at the start of a run, you might have GPUs which
>> are incompatible with either the data or your installation of relion.
>> If you
>>
>>      -> INSTALLED RELION YOURSELF: if you e.g. specified -DCUDA_ARCH=50
>>         and are trying ot run on a compute 3.5 GPU (-DCUDA_ARCH=3.5),
>>         this may happen.
>>
>>      -> HAVE MULTIPLE GPUS OF DIFFERNT VERSIONS: relion needs GPUS with
>>         at least compute 3.5. You may be trying to use a GPU older than
>>         this. If you have multiple generations, try specifying --gpu <X>
>>         with X=0. Then try X=1 in a new run, and so on. The numbering of
>>         GPUs may not be obvious from the driver or intuition. For a list
>>         of GPU compute generations, see
>>
>> en.wikipedia.org/wiki/CUDA#Version_features_and_specifications
>>
>>      -> ARE USING DOUBLE-PRECISION GPU CODE: relion was been written so
>>         as to not require this, and may thus have unforeseen requirements
>>         when run in this mode. If you think it is nonetheless necessary,
>>         please consult the developers with this error.
>>
>>
>> _______________________________________________
>> 3dem mailing list
>> 3dem at ncmir.ucsd.edu
>> https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem

