Initial 3D Model - RVIPER: Reproducible ab initio 3D structure determination. The program determines a validated initial intermediate-resolution structure using a subset of class averages produced by ISAC2.
Usage in command line
sp_rviper.py stack directory --radius=outer_radius --sym=sym --n_rv_runs=n_rv_runs --iteration_start=iteration_start --n_v_runs=n_v_runs --npad=npad --criterion_name=criterion_name --outlier_index_threshold_method=outlier_index_threshold_method --angle_threshold=angle_threshold --outlier_percentile=outlier_percentile --ir=inner_radius --rs=ring_step --xr=x_range --yr=y_range --ts=translational_search_step --delta=angular_step --center=center_type --maxit1=max_iter1 --maxit2=max_iter2 --mask3D=mask3D --moon_elimination=moon_elimination --L2threshold=L2threshold --ref_a=ref_a --n_shc_runs=n_shc_runs --doga=doga --fl=fl --aa=aa --pwreference=pwreference --theta1=theta1 --theta2=theta2 --dpi=dpi
sp_rviper exists only in an MPI version.
mpirun --npernode 16 -np 48 --host node1,node2,node3 sp_rviper.py stack output_directory --radius=outer_radius --outlier_percentile=95 --fl=0.25 --xr=2 --moon_elimination=750,4.84
Importantly, the number of MPI processes used must be a multiple of --n_shc_runs (default = 4).
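The constraint on the process count can be checked before submitting a job. The following is a minimal sketch; the function name check_mpi_process_count is hypothetical and not part of sp_rviper, and only the "multiple of --n_shc_runs" rule and its default of 4 come from the text above.

```python
def check_mpi_process_count(n_processes: int, n_shc_runs: int = 4) -> bool:
    """Return True when n_processes is a multiple of n_shc_runs.

    Hypothetical helper for illustration; sp_rviper performs its own checks.
    """
    if n_processes <= 0 or n_shc_runs <= 0:
        raise ValueError("process count and n_shc_runs must be positive")
    return n_processes % n_shc_runs == 0


if __name__ == "__main__":
    # 48 processes, as in the mpirun example, with the default --n_shc_runs=4.
    print(check_mpi_process_count(48))
    # 50 is not a multiple of 4.
    print(check_mpi_process_count(50))
```

With the default --n_shc_runs, 48 processes pass the check and 50 do not.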
Because RVIPER uses groups of processors working together, efficient execution requires that the processors within a group be allocated to the same node, so that data exchange within the group involves no network traffic. The --npernode option of mpirun accomplishes this. As shown in the example below, when --npernode is used, MPI allocates processor ranks sequentially and does not move to the next node until the current one is filled. If --npernode is not used, processors are allocated in a round-robin fashion (i.e., jumping to the next node with each allocation). Since groups in VIPER contain consecutively ranked processors, it is important to provide “--npernode XX”, where XX is the number of processors per node.
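The difference between the two allocation patterns can be illustrated with a small model of rank placement. This is a sketch under stated assumptions, not a description of how mpirun or sp_rviper is implemented: the function names are hypothetical, and the group size of 4 is chosen only for illustration because the text does not state the actual group size.

```python
def sequential_nodes(n_ranks, per_node):
    """Node index of each rank when a node is filled before moving on."""
    return [rank // per_node for rank in range(n_ranks)]


def round_robin_nodes(n_ranks, n_nodes):
    """Node index of each rank when every allocation jumps to the next node."""
    return [rank % n_nodes for rank in range(n_ranks)]


def groups_on_single_node(node_of_rank, group_size):
    """True when each block of group_size consecutive ranks shares one node."""
    for start in range(0, len(node_of_rank), group_size):
        if len(set(node_of_rank[start:start + group_size])) != 1:
            return False
    return True


if __name__ == "__main__":
    group_size = 4  # illustrative assumption, not taken from sp_rviper
    # 48 ranks on 3 nodes with 16 ranks per node, as in the mpirun example.
    print(groups_on_single_node(sequential_nodes(48, 16), group_size))
    print(groups_on_single_node(round_robin_nodes(48, 3), group_size))
```

In this model, sequential filling keeps every group of consecutive ranks on one node, while round-robin placement spreads each group across nodes.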
The output directory structure generated by sp_rviper is shown in the figure below. Each runXXX directory contains the output of one run of the VIPER algorithm (please see sp_viper): the structure reconstructed in stage 1, refvolf2.hdf, and the corresponding parameters, refparams2.txt. After stage 2, the final structure and parameters are written to volf.hdf and params.txt. Other output files are log.txt and previousmax.txt. Each mainXXX directory contains the output of --n_v_runs VIPER runs (default 3). The number of mainXXX directories is given by --n_rv_runs.
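The layout described above can be enumerated with a short script, for example to check that a finished job produced the expected files. This is only a sketch: the function expected_layout is hypothetical, and the three-digit zero padding, the numbering starting at 1, and the nesting of runXXX inside mainXXX are assumptions inferred from names such as main003, so they should be checked against an actual output directory.

```python
from pathlib import PurePosixPath

# File names listed in the description of a runXXX directory.
RUN_FILES = (
    "refvolf2.hdf",
    "refparams2.txt",
    "volf.hdf",
    "params.txt",
    "log.txt",
    "previousmax.txt",
)


def expected_layout(output_dir, n_rv_runs, n_v_runs=3, first_index=1):
    """List the run-level files expected under output_dir.

    Assumes names of the form mainNNN/runNNN with indices starting at
    first_index; both are illustrative assumptions.
    """
    paths = []
    for main in range(first_index, first_index + n_rv_runs):
        for run in range(first_index, first_index + n_v_runs):
            run_dir = PurePosixPath(output_dir) / f"main{main:03d}" / f"run{run:03d}"
            paths.extend(str(run_dir / name) for name in RUN_FILES)
    return paths


if __name__ == "__main__":
    for path in expected_layout("output_directory", n_rv_runs=1):
        print(path)
```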
This program uses multiple VIPER runs to find unstable projections. Based on a user-chosen criterion, it eliminates the unstable projections and reruns VIPER until all projections are stable. Since VIPER is used as a building block, all requirements of VIPER must be satisfied. The xform.projection attribute has to be set in the header of each file; if its values are not known, all should be set to zero.
Whether the --n_v_runs reconstructed maps in the current RVIPER iteration share a core set of stable projections is determined using one of the criteria shown in the figures below. The y axis represents the error angle. For example, if a projection is assigned the angles 30, 45, and 55 in three different reconstructed maps, then the error associated with this image is (abs(30-45) + abs(30-55) + abs(45-55))/3 = 16.7. The x axis represents the image index in the sorted array of error angles.
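The error-angle example above can be reproduced in a few lines. The sketch below treats each assigned angle as a single number, as in the worked example, and averages the absolute differences over all pairs of maps. The function name error_angle is hypothetical, and dividing by the number of pairs is an assumption that matches the division by 3 only for the three-map case shown in the text.

```python
from itertools import combinations


def error_angle(angles):
    """Mean absolute pairwise difference of the angles assigned to one image."""
    pairs = list(combinations(angles, 2))
    if not pairs:
        raise ValueError("at least two angles are required")
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def sorted_error_angles(per_image_angles):
    """Error angle of every image, sorted in ascending order (the plotted curve)."""
    return sorted(error_angle(angles) for angles in per_image_angles)


if __name__ == "__main__":
    # The example from the text: (15 + 25 + 10) / 3
    print(round(error_angle([30, 45, 55]), 2))
```

For the angles 30, 45, and 55 this gives 50/3, or about 16.67.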
The first criterion, called “80th percentile” (left image), is satisfied when the 80th percentile is less than or equal to 20% of the maximum. The second criterion, called “fastest increase in the last quartile”, is satisfied when the last quartile has a length greater than 20% of the maximum. If the finishing criterion is not met after executing 10 VIPER runs (that is, the criterion fails for all 120 combinations of 10 runs taken --n_v_runs (default = 3) at a time), the program stops.
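The “80th percentile” test and the count of run combinations can be sketched as follows. This is an illustration under assumptions, not the sp_rviper implementation: the function names are hypothetical, and the percentile is taken as the element at position 0.8 × N of the sorted error angles, which is one of several possible conventions. The second criterion is not sketched because the text does not define it precisely enough.

```python
import math


def percentile_value(sorted_errors, fraction):
    """Element at the given fractional position of an ascending list."""
    index = min(len(sorted_errors) - 1, int(fraction * len(sorted_errors)))
    return sorted_errors[index]


def meets_80th_percentile_criterion(errors):
    """True when the 80th percentile is at most 20% of the maximum error."""
    ordered = sorted(errors)
    if not ordered:
        raise ValueError("no error angles supplied")
    return percentile_value(ordered, 0.8) <= 0.2 * ordered[-1]


def candidate_run_subsets(total_runs=10, n_v_runs=3):
    """Number of ways to choose n_v_runs maps out of total_runs VIPER runs."""
    return math.comb(total_runs, n_v_runs)


if __name__ == "__main__":
    # Mostly stable projections with one large outlier satisfy the criterion.
    print(meets_80th_percentile_criterion([1] * 9 + [100]))
    # 10 runs taken 3 at a time, as stated in the text.
    print(candidate_run_subsets())
```

With the defaults, candidate_run_subsets returns 120, matching the number of combinations quoted above.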
Once a criterion is met, a decision is made regarding which images to keep. Currently there are three options implemented:
On our cluster, it takes about 6 hours to process 400 88×88 particles on 64 processors. Memory required is about 0.5GB per processor.
In the example below, RVIPER found in the third iteration (main003) a set of 3 structures with stable angle assignments. Based on the three structures, the program generates variance_volume.hdf and average_volume.hdf, which can be used as an initial reference.
Penczek 1994, “The ribosome at improved resolution: new techniques for merging and orientation refinement in 3D cryo-electron microscopy of biological particles”, Ultramicroscopy 53, 251-270.
Horatiu Voicu, Pawel A. Penczek
Category 1:: APPLICATIONS Category 3:: GRIDDING
sparx/bin/sp_rviper.py
Beta:: Under evaluation and testing. Please let us know if there are any bugs.
There are no known bugs so far.