ISAC - 2D Clustering: Iterative Stable Alignment and Clustering (ISAC) of a 2D image stack.
Usage in command line
sxisac.py stack_file output_directory --radius=particle_radius --img_per_grp=img_per_grp --CTF --xr=xr --thld_err=thld_err --stop_after_candidates --restart_section --target_radius=target_radius --target_nx=target_nx --ir=ir --rs=rs --yr=yr --ts=ts --maxit=maxit --center_method=center_method --dst=dst --FL=FL --FH=FH --FF=FF --init_iter=init_iter --main_iter=main_iter --iter_reali=iter_reali --match_first=match_first --max_round=max_round --match_second=match_second --stab_ali=stab_ali --indep_run=indep_run --thld_grp=thld_grp --n_generations=n_generations --rand_seed=rand_seed --new --debug --use_latest_master_directory --skip_prealignment
sxisac exists only in an MPI version.
mpirun -np 176 --host <host list> sxisac.py bdb:data fisac1 --radius=120 --CTF > 1fou &
mpirun -np 176 --host <host list> sxisac.py bdb:data fisac1 --radius=120 --CTF --restart_section=candidate_class_averages,4 --stop_after_candidates > 1fou &
Note that ISAC will change the size of input data so that they fit into a box of 76×76 pixels by default (see Description below).
The ISAC program needs an MPI environment to work properly. Importantly, the number of MPI processes must be a multiple of the number of independent runs (indep_run, see parameters below).
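The constraint on the process count can be checked before submitting a job. The following is only a minimal sketch: the function name check_mpi_process_count is hypothetical and not part of sxisac.py, the value indep_run=4 in the usage lines is an assumed example, and the returned quotient merely illustrates an even division of processes among the independent runs.

```python
def check_mpi_process_count(n_procs, indep_run):
    """Return n_procs // indep_run if n_procs is a positive multiple of indep_run.

    Hypothetical helper, not part of sxisac.py. It only encodes the rule that
    the number of MPI processes must be a multiple of --indep_run.
    """
    if indep_run < 1:
        raise ValueError("indep_run must be at least 1")
    if n_procs < indep_run or n_procs % indep_run != 0:
        raise ValueError(
            "number of MPI processes (%d) is not a multiple of indep_run (%d)"
            % (n_procs, indep_run)
        )
    return n_procs // indep_run


# Example: 176 processes as in the mpirun lines above, with an assumed indep_run of 4.
processes_per_run = check_mpi_process_count(176, 4)
```

With these example values the check passes; a count such as 30 with indep_run=4 would raise ValueError before any cluster time is used.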
The way MPI is invoked differs significantly from cluster to cluster. On some clusters,
mpirun -np 32 sxisac.py ...
will be sufficient. On some clusters, one needs to specify the host name:
mpirun -np 32 --host node1,node2,node3,node4 sxisac.py ...
On some clusters, one needs to submit a script to run MPI. Please ask your system manager how to run MPI programs on your machine.
Also, different systems have different ways of storing the printout. On some clusters, the printout is saved automatically. If it is not, we recommend running the program with the Linux command nohup, so that the printout is automatically saved to the text file nohup.out. For example:
nohup mpirun -np 32 sxisac.py bdb:test --img_per_grp=150 --generation=1
If there is no nohup on your system, you can redirect the printout to a text file.
mpirun -np 32 sxisac.py bdb:test --img_per_grp=150 --generation=1 > output.txt
To restart a run that stopped intentionally or unintentionally, use the --restart_section option.
nx=ny). The stack can be either in bdb or hdf format. (default required string)
Each generation of the program is divided into two phases. The first one is exploratory. In it, we set the criteria to be very loose and try to find as many candidate class averages as possible. This phase should typically have 10 to 20 rounds (set by --max_round, default = 20). The candidate class averages are stored in class_averages_candidate_generation_n.hdf.
The second phase is where the actual class averages are generated. It typically has 3 to 9 iterations of matching (set by --match_second, default = 5). The iterations in the first half use 2-way matching, those in the second half use 3-way matching, and the final iteration uses 4-way matching.
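The matching schedule of the second phase can be written out explicitly. This is only a sketch of the description above, not code taken from ISAC: the function name matching_schedule is hypothetical, and the text does not say how the iterations before the final one are split when their number is odd, so the sketch assumes the extra iteration goes to 2-way matching.

```python
def matching_schedule(match_second=5):
    """Return the assumed n-way matching for each second-phase iteration.

    Hypothetical illustration: the last iteration is 4-way, and the
    iterations before it are split into a 2-way first half and a 3-way
    second half. Assumption: an odd leftover iteration is 2-way.
    """
    if match_second < 1:
        raise ValueError("match_second must be at least 1")
    n_before_final = match_second - 1
    n_two_way = (n_before_final + 1) // 2
    n_three_way = n_before_final - n_two_way
    return [2] * n_two_way + [3] * n_three_way + [4]


# Default number of iterations (--match_second=5).
default_schedule = matching_schedule()
```

Under these assumptions the default of 5 iterations gives two 2-way iterations, two 3-way iterations, and one final 4-way iteration.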
After the second phase, three files will be generated:
The program performs the following steps (to save computation time after an inadvertent termination, e.g. a power failure, the program can be restarted from any saved step; see options):
Unfortunately, ISAC is very time- and memory-consuming. For example, on our cluster, it takes 15 hours to process 50,000 64×64 particles with 256 cores. Therefore, before embarking on a big dataset, we recommend running a test dataset (about 2,000 to 5,000 particles) first to get a rough idea of timing. If the run time is unacceptable, the first parameter to change is --max_round. A value of 10 or even 5 should have only mild effects on the results.
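One way to prepare such a test dataset is to draw a random subset of particle numbers and write them to a list file. The sketch below is an illustration only: the function name write_test_subset_list and the file name test_subset.txt are hypothetical, and the particle numbers are assumed to be zero-based with one number per line.

```python
import random


def write_test_subset_list(n_total, n_test, path, seed=0):
    """Write n_test distinct random particle numbers (0 .. n_total-1) to path.

    Hypothetical helper, not part of SPARX. Assumption: the list file holds
    one zero-based particle number per line. Returns the sorted numbers.
    """
    if not 0 < n_test <= n_total:
        raise ValueError("n_test must be between 1 and n_total")
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    indices = sorted(rng.sample(range(n_total), n_test))
    with open(path, "w") as list_file:
        for index in indices:
            list_file.write("%d\n" % index)
    return indices


# Example: pick 2,000 of 50,000 particles for a timing test.
subset = write_test_subset_list(50000, 2000, "test_subset.txt")
```

The resulting list file could then be passed to e2bdb.py with the --makevstack and --list options, in the same way as the list files used elsewhere on this page, to create a virtual stack for the test run.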
In case of premature termination (e.g. power failure), the program can be restarted from any saved step location with the --restart_section option.
sxprocess.py --isacselect class_averages.hdf ok.txt
e2bdb.py bdb:data --makevstack=bdb:select1 --list=ok.txt
The same steps can be performed on files containing candidate class averages.
Let us assume we want to generate an RCT reconstruction based on group number 12 from ISAC generation number 3. We have to do the following steps:
cd generation_0003
sxprocess.py bdb:../data class_averages_generation_3.hdf list3_12.txt --isacgroup=12 --params=originalid
e2bdb.py bdb:test --makevstack=bdb:RCT/group3_12 --list=list3_12.txt
e2proc2d.py --split=12 --first=12 --last=12 class_averages_generation_3.hdf group3_12.hdf
sxali2d.py bdb:RCT/group3_12 None --ou=28 --xr=3 --ts=1 --maxit=1 --template=group3_12.12.hdf
sxheader.py group3_12.12.hdf --params=xform.align2d --export=params_group3_12.txt
Yang, Z., Fang, J., Chittuluru, F., Asturias, F. and Penczek, P. A. (2012) Iterative Stable Alignment and Clustering of 2D Transmission Electron Microscope Images. Structure 20:237-247. doi:10.1016/j.str.2011.12.007
Horatiu Voicu, Zhengfan Yang, Jia Fang, Francisco Asturias, and Pawel A. Penczek
Category 1:: APPLICATIONS
sparx/bin/sxisac.py, sparx/bin/isac.py
Beta:: Under evaluation and testing. Please let us know if there are any bugs.
There are no known bugs so far.