ISAC (Iterative Stable Alignment and Clustering) is a 2D classification algorithm. It sorts a given stack of cryo-EM particles into different classes that share the same view of a target protein. ISAC is built around iterations of alternating equal-size k-means clustering and repeated 2D alignment routines.
Yang, Z., Fang, J., Chittuluru, J., Asturias, F. J. and Penczek, P. A. (2012) Iterative stable alignment and clustering of 2D transmission electron microscope images. Structure 20, 237–247.
To verify your CUDA installation, run

nvcc --version

in your terminal; the resulting output should list the version of your installed CUDA compilation tools.

To verify your SPHIRE installation, run

which sphire

in your terminal; the resulting output should give you the path to your SPHIRE installation (the path should indicate a version number of 1.3 or higher).

Before you start, make sure your SPHIRE environment is activated.
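If you installed SPHIRE via Anaconda/Miniconda, activating the environment is a single command; note that the environment name used below (sphire) is an assumption and may differ on your system:

conda activate sphire    # environment name is an assumption; use your own

Afterwards, which sphire should resolve to the activated installation.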
GPU ISAC comes with a handy installation script that can be used as follows:
./install.sh
All done!
When calling GPU ISAC from the terminal, an example call looks as follows:
mpirun python /path/to/sp_isac2_gpu.py bdb:path/to/stack path/to/output --CTF --radius=160 --img_per_grp=100 --minimum_grp_size=60 --gpu_devices=0,1
It uses a mix of both mandatory and optional parameters (see below to learn which is which).
[ ! ] - Mandatory parameters in the GPU ISAC call:

mpirun
is not a GPU ISAC parameter, but is required to launch GPU ISAC using MPI parallelization (GPU ISAC uses MPI to parallelize CPU computations and MPI/CUDA to distribute and parallelize GPU computations).

/path/to/sp_isac2_gpu.py
is the path to your sp_isac2_gpu.py file. If you followed these instructions, it should be your/installation/path/gpu_isac_2.2/bin/sp_isac2_gpu.py.

bdb:path/to/stack
is the path to your input .bdb stack. If you prefer to use an .hdf stack, simply remove the bdb: prefix.

path/to/output
is the path to your preferred output directory.

--radius=160
is the radius of your target particle (in pixels) and has to be set accordingly.

--gpu_devices
tells GPU ISAC which GPUs to use by specifying their system id values.

[?] - Optional parameters recommended to be used when running GPU ISAC:
--CTF
flag to apply phase flipping to your particles.

--VPP
flag to use with phase plate data. This flag may also be useful for non-phase-plate data, such as membrane proteins in membranes, or generally cases where low-resolution data may dominate the alignment. The --VPP option divides by the 1D rotational power spectrum of each image, or in other words “whitens” the Fourier data.

--img_per_grp
to limit the maximum size of individual classes. Empirically, a class size of 100-200 particles (30-50 for negative stain) has proven successful when dealing with around 100,000 particles. (This may differ for your data set and you can use GPU ISAC to find out; see below.)

--minimum_grp_size
to limit the minimum size of individual classes. In general, this value should be around 50-60% of your maximum class size.

To print a list of all available parameters, use the -h parameter (in this case you do not need to specify any other parameters):

mpirun python /path/to/sp_isac2_gpu.py -h

or simply

python /path/to/sp_isac2_gpu.py -h
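Note that --radius expects the particle radius in pixels, not in Å. As a quick way to convert, divide half the particle diameter (in Å) by the pixel size (in Å/px); the numbers below (290 Å diameter, 1.14 Å/px) are hypothetical example values, not values from this tutorial:

# Hypothetical values: 290 Å particle diameter at 1.14 Å/px pixel size.
# radius [px] = (diameter [Å] / 2) / pixel size [Å/px]
echo "290 / 2 / 1.14" | bc -l    # prints ~127.19 -> use --radius=127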
EXAMPLE 01: Test run
This example is a test run that can be used to confirm GPU ISAC was installed successfully. It uses a small stack of 64 artificial faces that is already included in the GPU ISAC installation package. You can process it using GPU ISAC as follows:
cd /gpu/isac/installation/folder
mpirun python bin/sp_isac2_gpu.py 'bdb:examples/isac_dummy_data_64#faces' 'isac_out_test/' --radius=32 --img_per_grp=8 --minimum_grp_size=4 --gpu_devices=0
Note that we don't care about the quality of the produced averages here; this test is only meant to ensure there are no runtime issues before a more time-consuming run is executed.
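Once the test run finishes, a quick way to confirm that it produced output is to list the output directory; as described in the output section below, the final averages are collected in ordered_class_averages.hdf:

ls isac_out_test/    # should list ordered_class_averages.hdf among other output files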
EXAMPLE 02: TcdA1 toxin data
This example uses the SPHIRE tutorial data set (link to .tar file) described in the SPHIRE tutorial (link to .pdf file). The data contains about 10,000 particles from 112 micrographs and was originally published here (Gatsogiannis et al., 2013).
After downloading the data you'll notice that the extracted folder contains a multitude of subfolders. For the purposes of this example we are only interested in the Particles/ folder that stores the original data as a .bdb file.
You can process this stack using GPU ISAC as follows:
cd /gpu/isac/installation/folder
mpirun python bin/sp_isac2_gpu.py 'bdb:/your/path/to/Particles/#stack' 'isac_out_TcdA1' --CTF --radius=145 --img_per_grp=100 --minimum_grp_size=60 --gpu_devices=0
In the call above, replace /your/path/to/Particles/ with the path to the Particles/ directory you just downloaded, and replace --gpu_devices=0 with --gpu_devices=0,1 if you have two GPUs available (and so on).
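If you are unsure which system ids your GPUs have, the nvidia-smi utility that ships with the NVIDIA driver lists all detected GPUs together with their ids:

nvidia-smi    # lists all detected GPUs and their indices

The GPU indices reported there typically correspond to the ids expected by --gpu_devices, though this correspondence is an assumption and the ordering can differ on some systems.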
The final averages can then be found in isac_out_TcdA1/ordered_class_averages.hdf. You can look at them using e2display.py (or any other displaying program of your choice) and should see averages like these:
Above: 95 class averages produced when processing the above data set using GPU ISAC. The particle stack contains 11,003 particles and the averages were computed within 6 minutes (Intel i9-7020X CPU and 2x GeForce GTX 1080 GPUs).
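For example, with EMAN2's e2display.py (available in a SPHIRE installation), the averages from the call above can be opened like this:

e2display.py isac_out_TcdA1/ordered_class_averages.hdf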
Besides producing high-quality 2D class averages, GPU ISAC is also an excellent tool to screen your data.
GPU ISAC produces a multitude of output files that can be used to analyze the success of running the program, even while it is still ongoing. These include the following:

Check path/to/output/mainXXX/generationYYY for the .hdf files that contain any newly produced class averages.

Check the processed_images.txt files. These contain the indices of all processed particles and can be used to determine how many particles GPU ISAC accounted for during classification.

The final class averages are collected in path/to/output/ordered_class_averages.hdf.
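For example, assuming each line of processed_images.txt holds one particle index (an assumption about the file layout), counting lines gives the number of accounted-for particles:

find path/to/output -name 'processed_images.txt' | xargs wc -l    # line count ≈ processed particles, assuming one index per line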
GPU ISAC limitations
Known issues
Run

nvcc --version

in your terminal to see the CUDA version you are using.

GPU ISAC v2.3.4
GPU ISAC v2.3.3
GPU ISAC v2.3.1 & v2.3.2 (hotfix releases)
Use the -h parameter to display the help.

GPU ISAC v2.3
GPU ISAC v2.2
GPU ISAC “Chimera”