[3dem] relion3.0 3d multi-body problem

Takanori Nakane tnakane at mrc-lmb.cam.ac.uk
Wed Aug 15 11:40:49 PDT 2018


Hi,

Multibody refinement uses a lot of GPU memory. What is the box size?
How many bodies do you have?

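If your pixel size is very small compared to the expected resolution,
try downsampling the particles. If the box is already small, try
"Skip padding: Yes". This reduces GPU memory consumption to about 1/8,
because the default padding factor of 2 applies along each of the
three axes of the reference volume.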
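
To make the two suggestions above concrete, here is a minimal Python sketch of the arithmetic. It is not RELION code and does not predict actual GPU usage. It only shows how a cubic volume scales with the padding factor, and how far a box can shrink while its Nyquist limit (twice the pixel size) still reaches a target resolution. The function names, the 400-pixel box, the 1.0 Angstrom pixel size and the 4 Angstrom target are all hypothetical values chosen for illustration.

```python
import math


def padded_volume_bytes(box_size, pad_factor, bytes_per_voxel=4):
    """Bytes for one cubic volume whose side is box_size * pad_factor.

    Assumption: single-precision voxels (4 bytes). Real GPU usage in
    multi-body refinement depends on much more than this one array.
    """
    side = box_size * pad_factor
    return side ** 3 * bytes_per_voxel


def downsampled_box(box_size, pixel_size, target_resolution):
    """Smallest even box whose Nyquist limit still reaches target_resolution.

    Nyquist: the best attainable resolution is 2 * pixel size, so the
    coarsest usable pixel size is target_resolution / 2.
    """
    new_pixel_size = target_resolution / 2.0
    if new_pixel_size <= pixel_size:
        return box_size  # already sampled at or below the target
    new_box = math.ceil(box_size * pixel_size / new_pixel_size)
    return new_box + (new_box % 2)  # round up to an even box size


if __name__ == "__main__":
    box = 400  # hypothetical box size in pixels
    ratio = padded_volume_bytes(box, 1) / padded_volume_bytes(box, 2)
    print(f"unpadded / padded volume size: {ratio}")
    print(f"box for a 4 A target at 1.0 A/pixel: {downsampled_box(box, 1.0, 4.0)}")
```

Halving the box size by downsampling shrinks the volume by the same factor of 8 as skipping the padding, so the two measures can be combined if memory is still short.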

Best regards,

Takanori Nakane

On 2018/08/15 18:15, Craig Yoshioka wrote:
> I’d run the last round(s) on CPU, not GPU. The 385 GB of RAM won’t help when you are limited to the 11 GB that each 1080Ti has for holding working data.
> 
> 
> On Aug 15, 2018, at 10:09 AM, Xu, Tinghai (Peter) <Tinghai.Xu at vai.org> wrote:
> 
> Dear all,
> When I run relion3.0’s 3D multi-body refinement on an HPC node (4x 1080Ti GPUs with 385 GB of memory), it always fails with an out-of-memory error. I used almost all of the default settings, with GPU acceleration, Number of MPI procs: 5 and Number of threads: 2.
> I tried both with and without "Pre-read all particles into RAM".
> 
> Here is the error message:
> “ERROR: out of memory in /primary/vari/software/relion/relion-3.0_beta/src/acc/acc_projector_impl.h at line 62 (error-code 2)
> in: /primary/vari/software/relion/relion-3.0_beta/src/acc/cuda/cuda_settings.h, line 67
> === Backtrace  ===
> /primary/vari/software/relion/relion-3.0_beta/build/bin/relion_refine_mpi(_ZN11RelionErrorC1ERKSsS1_l+0x41) [0x4417c1]
> /primary/vari/software/relion/relion-3.0_beta/build/bin/relion_refine_mpi() [0x642239]
> /primary/vari/software/relion/relion-3.0_beta/build/bin/relion_refine_mpi(_ZN12AccProjector9setMdlDimEiiiiiii+0x19d) [0x642a2d]
> /primary/vari/software/relion/relion-3.0_beta/build/bin/relion_refine_mpi(_ZN14MlDeviceBundle22setupFixedSizedObjectsEv+0x30b) [0x60813b]
> /primary/vari/software/relion/relion-3.0_beta/build/bin/relion_refine_mpi(_ZN14MlOptimiserMpi11expectationEv+0x1585) [0x4475e5]
> /primary/vari/software/relion/relion-3.0_beta/build/bin/relion_refine_mpi(_ZN14MlOptimiserMpi7iterateEv+0xaa) [0x454d9a]
> /primary/vari/software/relion/relion-3.0_beta/build/bin/relion_refine_mpi(main+0xb15) [0x433175]
> /usr/lib64/libc.so.6(__libc_start_main+0xf5) [0x2aaab9776b35]
> /primary/vari/software/relion/relion-3.0_beta/build/bin/relion_refine_mpi() [0x4364af]
> ==================
> ERROR:
> 
> A GPU-function failed to execute.
> 
> If this occured at the start of a run, you might have GPUs which
> are incompatible with either the data or your installation of relion.
> If you
> 
>                  -> INSTALLED RELION YOURSELF: if you e.g. specified -DCUDA_ARCH=50
>                     and are trying ot run on a compute 3.5 GPU (-DCUDA_ARCH=3.5),
>                     this may happen.
> 
>                  -> HAVE MULTIPLE GPUS OF DIFFERNT VERSIONS: relion needs GPUS with
>                     at least compute 3.5. You may be trying to use a GPU older than
>                     this. If you have multiple generations, try specifying --gpu <X>
>                     with X=0. Then try X=1 in a new run, and so on. The numbering of
>                     GPUs may not be obvious from the driver or intuition. For a list
>                     of GPU compute generations, see
> 
>                     en.wikipedia.org/wiki/CUDA#Version_features_and_specifications
> 
>                  -> ARE USING DOUBLE-PRECISION GPU CODE: relion was been written so
>                     as to not require this, and may thus have unforeseen requirements
>                     when run in this mode. If you think it is nonetheless necessary,
>                     please consult the developers with this error.
> 
> If this occurred at the middle or end of a run, it might be that
> 
>                  -> YOUR DATA OR PARAMETERS WERE UNEXPECTED: execution on GPUs is
>                     subject to many restrictions, and relion is written to work within
>                     common restraints. If you have exotic data or settings, unexpected
>                     configurations may occur. See also above point regarding
>                     double precision.
> If none of the above applies, please report the error to the relion
> developers at    github.com/3dem/relion/issues
> 
> The run.out message for the last iteration is:
> “Auto-refine: Iteration= 13
> Auto-refine: Resolution= 22.3816 (no gain for 13 iter)
>   Auto-refine: Changes in angles= 1.37635 degrees; and in offsets= 0.424773 pixels (no gain for 1 iter)
>   Auto-refine: Refinement has converged, entering last iteration where two halves will be combined...
> Auto-refine: The last iteration will use data to Nyquist frequency, which may take more CPU and RAM.
> Estimating accuracies in the orientational assignment ...
> 1.33/1.33 min ............................................................~~(,_,">
> Auto-refine: Estimated accuracy angles= 2.338 degrees; offsets= 0.956 pixels
> Auto-refine: Angular step= 0.9375 degrees; local searches= true
> Auto-refine: Offset search range= 2.151 pixels; offset step= 0.717 pixels
> CurrentResolution= 22.3816 Angstroms, which requires orientationSampling of at least 10 degrees for a particle of diameter 250 Angstroms
> Oversampling= 0 NrHiddenVariableSamplingPoints= 1260
> OrientationalSampling= 1.875 NrOrientations= 140
> TranslationalSampling= 1.434 NrTranslations= 9
> =============================
> Oversampling= 1 NrHiddenVariableSamplingPoints= 40320
> OrientationalSampling= 0.9375 NrOrientations= 1120
> TranslationalSampling= 0.717 NrTranslations= 36
> =============================
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
> with errorcode 1.
> 
> When I simply continue from iteration 12, the same error appears.
> 
> Has anyone come across this situation? Thank you so much.
> 
> Best Regards,
> Tinghai (Peter) Xu
> 
> 333 Bostwick Ave., N.E., Grand Rapids, Michigan 49503
> Phone: 616-234-5787 | Email: Tinghai.Xu at vai.org
> _______________________________________________
> 3dem mailing list
> 3dem at ncmir.ucsd.edu
> https://mail.ncmir.ucsd.edu/mailman/listinfo/3dem
> 
> 
> 
> 


