pipeline:sort3d:sxrsort3d

Differences

This shows you the differences between two versions of the page.

pipeline:sort3d:sxrsort3d [2018/01/10 19:23]
penczek [Usage]
pipeline:sort3d:sxrsort3d [2018/02/23 16:55]
moriya
Line 2: Line 2:
  
 ===== sxsort3d ===== ===== sxsort3d =====
-3D Clustering - SORT3D DEPTH: Sorting heterogeneous 3D dataset by checking the reproducible members of two independent runs of K-means clustering with minimum group size constraint. Sorting requires the 3D reconstruction parameters have been determined already. +3D Clustering - SORT3D: Sort 3D heterogeneity based on the reproducible members of K-means and Equal K-means classification. It runs after 3D refinement where the alignment parameters are determined.
  
 \\ \\
 ===== Usage ===== ===== Usage =====
  
-Usage1 in command line +Usage in command line
  
-  sxrsort3d_depth.py   --refinement_dir=refinemen_out_dir  --output_dir=master_dir  --niter_for_sorting=num_of_sorting_iterations  --mask3D=mask3d_file  --focus=focus3d_file  --radius=outer_radius  --sym=symmetry  --number_of_images_per_group=num_of_images_per_group --minimum_grp_size=minimum_grp_size --depth_order=depth_of_order --noctf=no_ctf  --instack=input_stack_file --memory_per_node=memeory_per_node --orientation_groups=number_of_orientation_groups --not_include_unaccounted=not_include_unaccounted --stop_mgskmeans_percentage=MGSKmeans_stop_ratio --swap_ratio=accounted_vs_unaccounted_swap_ratio --notapplybckgnoise=do_not_use_background_noise  --do_swap_au=turn_on_swap_accounted_vs_unaccounted  +  sxsort3d.py  stack  outdir  mask  --focus=3Dmask  --radius=outer_radius  --delta=angular_step  --CTF  --sym=c1  --number_of_images_per_group=number_of_images_per_group  --nxinit=nxinit  --smallest_group=smallest_group  --chunk0=CHUNK0_FILE_NAME  --chunk1=CHUNK1_FILE_NAME  --ir=inner_radius  --maxit=max_iter  --rs=ring_step  --xr=xr  --yr=yr  --ts=ts  --an=angular_neighborhood  --center=centring_method  --nassign=nassign  --nrefine=nrefine  --stoprnct=stop_percent  --function=user_function  --independent=indenpendent_runs  --low_pass_filter=low_pass_filter  --unaccounted  --seed=random_seed  --sausage  --PWadjustment=PWadjustment  --protein_shape=protein_shape  --upscale=upscale  --wn=wn  --interpolation=method
- +
-Usage2 in command line +
- +
-  sxrsort3d_depth.py   --instack=input_stack_file  --output_dir=master_dir --mask3D=mask3d_file  --focus=focus3d_file  --radius=outer_radius  --sym=symmetry  --number_of_images_per_group=num_of_images_per_group --minimum_grp_size=minimum_grp_size --depth_order=depth_of_order --nxinit=initial_image_size --noctf=no_ctf   --memory_per_node=memeory_per_node --orientation_groups=number_of_orientation_groups --not_include_unaccounted=not_include_unaccounted --stop_mgskmeans_percentage=MGSKmeans_stop_ratio --swap_ratio=accounted_vs_unaccounted_swap_ratio --notapplybckgnoise=do_not_use_background_noise  --do_swap_au=turn_on_swap_accounted_vs_unaccounted+
  
 \\ \\
 ===== Typical usage ===== ===== Typical usage =====
  
-sxrsort3d.py exists only in MPI version. +sxsort3d exists only in MPI version.
  
-\\ __Initiate sorting from a SPHIRE/SPARX refinement__: In this mode, one can select arbitrary iteration of a 3D refinement directory. Typically, it is the master directory of a sxmeridien refinement via --niter_for_sorting option.  +  mpirun -np 192 sxsort3d.py bdb:data sort3d_outdir1 mask.hdf --focus=ribosome_focus.hdf --chunkdir=/data/n10/pawel/ribosome_frank/ri3/main013 --radius=52 --CTF --number_of_images_per_group=2000 --low_pass_filter=.125 --stoprnct=5
-  mpirun  -np  176  sxrsort3d.py  --refinement_method=SPARX  --refinement_dir=meridien_outdir  --niter_for_sorting=30  --radius=120  --sym=c5  --number_of_images_per_group=6000  --smallest_group=1500  --nindependent=5  --interpolation=trl  --low_pass_filter=0.25+
  
-\\ __Initiate sorting from a data stack__: Currently, this mode is not supported by SPHIRE GUI. +\\ 
-  mpirun  -np  176  sxrsort3d.py  --instack=bdb:data  --mask3D=mask3d.hdf  --focus=focus3d.hdf  -radius=29  --sym=c1  --nxinit=64  --number_of_images_per_group=2000  --nindependent=3  --low_pass_filter=0.25  --interpolation=4nn  --comparison_method=cross  --Kmeans_lpf=adhoc+===== Input ===== 
 +=== Main Parameters === 
 +  ; stack : Input images stack: (default required string) 
 +  ; outdir : Output directory: There is a log.txt that describes the sequences of computations in the program. (default required string) 
 +  ; mask : 3D mask: File path of the global 3D mask for clustering. (default none)
  
-\\ __Initiate sorting from a relion refinement__: For this mode, please provide relion refinement directory. The program will pick up the results of the last iteration and start sorting. Currently, this mode is not supported by SPHIRE GUI. +  ; focus : Binary Focus 3D mask: Binary 3D mask used for focused clustering. (default none)
-  mpirun  -np  160  sxrsort3d.py  --refinement_method=relion  --refinement_dir=relion_outdir  --radius=120  --sym=c5  --nindependent=3  --number_of_images_per_group=6000 +  ; radius : Particle radius [Pixels]: Used as outer radius for rotational correlation. Must be smaller than half the box size. (default -1)
- +  ; delta : Angular step for projections [Degrees]: Angular step of reference projections. (default '2')
-\\ NOTE - How to continue sxmeridien refinement using sorting results: Please use --ctrefromsort3d option of sxmeridien, then specify the directory where you wish to continue the refinement to --oldrefdir option and a subset of data to --subset option. The command will load the refinement information from the directory and continue refinement. Optionally, you can specify the iteration number for continuing refinement using --ctrefromiter option, which need not be the same iteration you used for the 3D sorting. Also, one can modify refinement parameters of the selected iteration through the other options.  +  ; CTF : Use CTF: Do full CTF correction during the alignment. (default False)
-  mpirun  -np  88  sxmeridien.py  --ctrefromsort3d  --oldrefdir=meridien_outdir  --ctrefromiter=20  --subset=Clusters3.txt ''' <<BR>><<BR>> +  ; sym : Point-group symmetry: Point-group symmetry of the target structure. (default c1)
- +  ; number_of_images_per_group : Images per group: Critical value defined by user. It suggests the number of groups to the program. (default 1000)
-\\  +  ; nxinit : Initial image size for sorting: Initial image size for sorting. (default 64)
-===== Input ===== +  ; smallest_group : Smallest group size: Minimum members for identified group. (default 500)
-=== Main Paramaters === +  ; chunk0 : Chunk file name for 1st halfset: Name of chunk file containing particle IDs of 1st halfset (chunk0) for computing margin of error. (default none)
-  ; refinement_method : Input 3D refinement method: Valid values are 'SPARX' and 'relion'. Currently, SPHIRE GUI (sxgui) supports only 'SPARX'. (default none) +  ; chunk1 : Chunk file name for 2nd halfset: Name of chunk file containing particle IDs of 2nd halfset (chunk1) for computing margin of error. (default none)
-  ; refinement_dir : Input 3D refinement directory: Usually the master output directory of sxmeridien. (default none) +
-  ; masterdir : Output directory: The master output directory for sorting. (default none) +
-  ; niter_for_sorting : 3D refinement iteration: Specify an iteration number of 3D refinement where the 3D alignment parameters should be extracted for this sorting. By default, it uses iteration achieved best resolution. (default -1) +
-  ; mask3D : 3D mask: File path of the global 3D mask for clustering. (default none) +
-  ; focus : Focus 3D mask: File path of binary 3D mask for focused clustering. (default none) +
-  ; radius : Outer radius for rotational correlation [Pixels]: Particle radius in pixel for rotational correlation. The value must be smaller than half the box size. (default -1) +
-  ; sym : Point-group symmetry: Point group symmetry of the structure. (default c1)  +
-  ; number_of_images_per_group : Images per group: The number of images per a group. This value is critical for successful 3D clustering. (default 1000)  +
-  ; smallest_group : Smallest group size: Minimum number of members for being identified as a group. This value must be smaller than the number of images per a group (number_of_images_per_group). (default 500)  +
-  ; nxinit : Initial image size for sorting [Pixels]: If it is necessary to speed up the processing time, set a non-zero positive integer to this option. Then, the program will reduce image size of original data by resampling to the specified size. By default, program determines the value from resolution. (default -1) +
  
 \\ \\
 === Advanced Parameters === === Advanced Parameters ===
-  ; low_pass_filter : Low-pass filter frequency [1/Pixel]: Absolute frequency cutoff of the low-pass filter used on the original image size for the 3D sorting. (default -1.0) +  ; ir : Inner radius for rotational correlation [Pixels]: Must be bigger than 1. (default 1)
-  ; Kmeans_lpf : Low-pass filter method for K-means: Low-pass filter method for K-means clustering. Valid values are 'adaptive', 'max', 'min', 'adhoc', and 'avg'. (default adaptive) +  ; maxit : Maximum iterations: Maximum number of iterations. (default 25)
-  ; nindependent : Independent runs: Number of independent runs for Equal Sized K-means clustering. The value must be an odd number larger than 2. (default 3)  +  ; rs : Step between rings in rotational correlation: Must be bigger than 0. (default 1)
-  ; noctf : No CTF correction: Use this option if full CTF correction should not be applied during the 3D clustering. By default, the program will do full CTF correction. (default False)  +  ; xr : X search range [Pixels]: The translational search range in the x direction will take place in -xr to +xr range. (default '1')
-  ; PWadjustment : Reference power spectrum file path: Path of text file containing 1D reference power spectrum of a PDB structure or EM map. The power spectrum will be used as reference to adjust the power spectra of clustered volumes. (default none)  +  ; yr : Y search range [Pixels]: The translational search range in the y direction will take place in -yr to +yr range. If omitted, it will be set as xr. (default '-1')
-  ; interpolation : 3D reconstruction interpolation method: Interpolation method for 3D reconstruction. Valid values are 'trl' and '4nn'. (default 4nn)  +  ; ts : Translational search step [Pixels]: The search will be performed in -xr, -xr+ts, 0, xr-ts, xr, can be fractional. (default '0.25')
-  ; comparison_method : Comparison method: Similarity measurement for the comparison between reprojected reference images and particle images. Valid values are 'cross' (cross-correlation coefficients) and 'eucd' (Euclidean distance). (default cross)  +  ; an : Local angular search width [Degrees]: This defines the neighbourhood where the local angular search will be performed. (default '-1')
-  ; instack : Input images stack: File path of particle stack for sorting. This option is not currently supported by SPHIRE GUI (sxgui). (default none) +  ; center : Centering method: 0 - if you do not want the volume to be centered, 1 - center the volume using the center of gravity. (default 0)
 +  ; nassign : Number of reassignment iterations: Performed for each angular step. (default 1)
 +  ; nrefine : Number of alignment iterations: Performed for each angular step. (default 0)
 +  ; stoprnct : Assignment convergence threshold [%]: Used to assess convergence of the run. It is the minimum percentage of assignment change required to stop the run. (default 3.0)
 +  ; function : Reference preparation function: Specify name of function used to prepare the reference volume. (default do_volume_mrk05)
 +  ; independent : Number of independent runs: Number of independent equal-Kmeans runs. (default 3)
 +  ; low_pass_filter : Low-pass filter frequency [1/Pixels]: Low-pass filter used for the 3D sorting on the original image size. Specify with absolute frequency. (default -1.0)
 +  ; unaccounted : Reconstruct unaccounted images: Reconstruct unaccounted images. (default False)
 +  ; seed : Random seed: Seed used for the initial random assignment for EQ Kmeans. The program generates a random integer by default. (default -1)
 +  ; sausage : Use sausage filter: A way of filtering volume. (default False)
 +  ; PWadjustment : Power spectrum reference: Text file containing 1D reference power spectrum used for EM density map power spectrum correction. Typically, compute 1D power spectrum from PDB file. (default none)
 +  ; protein_shape : Protein Shape: It defines protein preferred orientation angles. "g" is for globular proteins and "f" is for filament proteins. (default 'g')
 +  ; upscale : Power spectrum adjustment strength: This parameter adjusts how strongly the power spectrum of the volume should be modified to match the reference. A value of 1 brings the volume's power spectrum completely to the reference, while a value of 0 means no modification. (default 0.5)
 +  ; wn : Target image size [Pixels]: Specify optimal window size for data processing. If different than 0, then the images will be rescaled to fit this size. (default 0)
 +  ; interpolation : 3D interpolation method: Interpolation method in 3D. Options are trl or 4nn. (default '4nn')
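
The xr, yr and ts entries in the list above describe a grid of translational shifts. The following Python sketch only illustrates how such a grid could be enumerated; it is not taken from sxsort3d.py. The function names translation_offsets and translation_grid are hypothetical, and the treatment of a range that is not a multiple of ts (truncating to the last full step) is an assumption.

```python
import math


def translation_offsets(search_range, step):
    """Offsets -r, -r+step, ..., 0, ..., r-step, r along one axis (hypothetical helper)."""
    if search_range < 0 or step <= 0:
        raise ValueError("search_range must be >= 0 and step must be > 0")
    # The small epsilon guards against float round-off in the division.
    n_steps = math.floor(search_range / step + 1e-9)
    return [i * step for i in range(-n_steps, n_steps + 1)]


def translation_grid(xr, yr=-1, ts=0.25):
    """All (dx, dy) shifts; a negative yr falls back to xr, as described for yr."""
    if yr < 0:
        yr = xr
    return [(dx, dy)
            for dx in translation_offsets(xr, ts)
            for dy in translation_offsets(yr, ts)]


offsets = translation_offsets(1, 0.25)
grid = translation_grid(1)
print(offsets)
print(len(grid))
```

With the documented defaults xr=1 and ts=0.25, this yields nine offsets per axis and 81 shifts in total, which shows how quickly the search cost grows when ts is reduced.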
  
 \\ \\
 ===== Output ===== ===== Output =====
-Please use --masterdir option to specify the output directory. The results will be written here. This directory will be created automatically if it does not exist  Here, you can find a log.txt that describes the sequences of computations in the program.  
  
 \\ \\
 ===== Description ===== ===== Description =====
-sxrsort3d finds out stable members by carrying out two-way comparison of two independent sxsort3d runs. +The clustering algorithm in the program combines several computational techniques (equal-Kmeans clustering, K-means clustering, and reproducibility of clustering) so that it sorts out heterogeneity of cryo-EM images both effectively and efficiently. The command sxsort3d.py is protocol I (P1). In this protocol, the user defines the group size and thereby the number of groups K. The whole dataset is then randomly assigned to K groups and an equal-size K-means (size-restricted K-means) is carried out. N independent equal-Kmeans runs give N partitions of the data into K groups. A two-way comparison of these partitions then gives the number of reproducible particles.
- +
-For small tested datasets (real and simulated ribosome data around 10K particles), it gives 70%-90% reproducibility. However, this rate also depends on the choice of number of images per group and number of particles in the smallest group. +
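
The description above (group size fixes K, random equal-size initial assignment, two-way comparison of independent partitions) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code in sxsort3d.py: all function names are hypothetical, the K-means iterations themselves are omitted, and groups are matched greedily by overlap size, which is a simplification that may differ from the matching the program actually uses.

```python
import random


def random_equal_partition(n_images, images_per_group, seed=0):
    """Randomly split image IDs into K near-equal groups, K = n_images // images_per_group."""
    k = max(1, n_images // images_per_group)
    ids = list(range(n_images))
    random.Random(seed).shuffle(ids)
    # Dealing the shuffled IDs round-robin keeps group sizes within one of each other.
    return [sorted(ids[g::k]) for g in range(k)]


def two_way_comparison(partition_a, partition_b, smallest_group=1):
    """Match groups of two partitions by overlap and return the shared members.

    Greedy matching (largest overlap first) is an assumption made for this sketch.
    """
    sets_a = [set(g) for g in partition_a]
    sets_b = [set(g) for g in partition_b]
    overlaps = [(len(sa & sb), i, j)
                for i, sa in enumerate(sets_a)
                for j, sb in enumerate(sets_b)]
    overlaps.sort(key=lambda t: (-t[0], t[1], t[2]))
    used_a, used_b, stable = set(), set(), []
    for size, i, j in overlaps:
        # Skip groups already matched and intersections below the size threshold.
        if i in used_a or j in used_b or size < smallest_group:
            continue
        used_a.add(i)
        used_b.add(j)
        stable.append(sorted(sets_a[i] & sets_b[j]))
    return stable


def reproducibility(stable_groups, n_images):
    """Fraction of all images that ended up in a reproducible group."""
    return sum(len(g) for g in stable_groups) / n_images


run_a = [[0, 1, 2, 3], [4, 5, 6, 7]]
run_b = [[4, 5, 6, 0], [1, 2, 3, 7]]
stable = two_way_comparison(run_a, run_b)
print(stable, reproducibility(stable, 8))
```

In this toy example, images 0 and 7 change groups between the two runs, so they are treated as unaccounted and six of the eight images are reproducible. A larger smallest_group threshold discards small intersections in the same way.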
  
 \\ \\
 ==== Method ==== ==== Method ====
-K-means, equal K-means, reproducibility, two-way comparison. 
  
 \\ \\
 ==== Reference ==== ==== Reference ====
-Not published yet. +Described by A.Einstein in his first paper on spectrum of radiation from a house heater kept at room temperature. Journal of Irreproducible Results, 12, 1905, 12-1127.
 + 
 +\\ 
 +==== Developer Notes ====
  
 \\ \\
 ==== Author / Maintainer ==== ==== Author / Maintainer ====
 Zhong Huang Zhong Huang
 +
  
 \\ \\
Line 86: Line 89:
 \\ \\
 ==== Files ==== ==== Files ====
-sxrsort3d.py +sparx/bin/sxsort3d.py
  
 \\ \\
 ==== See also ==== ==== See also ====
-[[sxsort3d|sxsort3d.py]] +[[pipeline:utilities:sxheader|sxheader]], [[pipeline:sort3d:sx3dvariability|sx3dvariability]], [[pipeline:sort3d:sxsort3d_depth|sxsort3d_depth]] and [[pipeline:sort3d:sxrsort3d|sxrsort3d]]
  
 \\ \\
 ==== Maturity ==== ==== Maturity ====
-beta:: Under development. It has been tested, The test cases/examples are available upon request. Please let us know if there are any bugs. +Alpha:: Under development. Two programs (P1, P2) have been tested on both simulated and experimental ribosome data. For experimental ribosome data, P2 has a reproducible ratio of 70-90%. P2 can 100% separate two conformations from the simulated ribosome data that contains 5 conformations.
  
 \\ \\
-==== Known Bugs ==== +==== Bugs ==== 
-None so far. +There are no known bugs so far.
  
 \\ \\
  
pipeline/sort3d/sxrsort3d.txt · Last modified: 2018/06/20 13:12 (external edit)