pipeline:sort3d:sxsort3d_depth

Differences

This shows you the differences between two versions of the page.

pipeline:sort3d:sxsort3d_depth [2018/06/20 13:12]
127.0.0.1 external edit
pipeline:sort3d:sxsort3d_depth [2018/08/22 11:58]
fmerino
Line 2:
 
 ===== sxsort3d_depth =====
-3D Clustering - SORT3D_DEPTH: Reproducible 3D Clustering of heterogeneous dataset. Sort out 3D heterogeneity of 2D data whose 3D reconstruction parameters have been determined already.
+3D Clustering - SORT3D_DEPTH: Reproducible 3D clustering of a heterogeneous dataset. The 3D parameters of the data remain unchanged during the clustering.
 
 \\
Line 18:
 There are two ways of running this command.
 
-\\ __1. 3D sorting from meridien iteration__: Run from a fully finished iteration of meridien run and imports data from there. This mode uses all meridien information (i.e., smear, normalizations and such).
+\\ __1. 3D sorting from meridien iteration__: Clustering is initiated from a completed iteration of meridien refinement and imports data from there. This mode uses all meridien information (i.e., smear, normalizations and such).
   mpirun -np 48 sxsort3d_depth.py --refinement_dir='outdir_sxmeridien' --output_dir='outdir_sxsort3d_depth_iteration' --radius=52 --sym='c1' --memory_per_node=60.0 --img_per_grp=2000 --minimum_grp_size=1500 --stop_mgskmeans_percentage=10.0 --swap_ratio=5 --do_swap_au --shake=0.1
 
-\\ __2. 3D sorting from stack__: Run from user-provided orientation parameters stored in stack header.  This mode uses only orientation parameters, which is useful for sorting data refined, say with relion.
+\\ __2. 3D sorting from stack__: Clustering is initiated from user-provided orientation parameters stored in the stack header. This mode uses only orientation parameters, which is useful for sorting data refined with, say, relion.
   mpirun -np 48 sxsort3d_depth.py --instack='bdb:data' --output_dir='outdir_sxsort3d_depth_stack' --radius=52 --sym='c1' --img_per_grp=2000 --minimum_grp_size=1500 --stop_mgskmeans_percentage=10.0 --swap_ratio=5 --do_swap_au
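The two invocations above differ only in where the data comes from (''--refinement_dir'' versus ''--instack''). As a minimal sketch, the command line for either mode could be assembled from Python as shown here. The helper name ''build_sort3d_command'' is hypothetical and not part of the package; only flags that appear in the commands above are used, and the rule that exactly one data source must be given is an assumption made for this sketch.

```python
import shlex


def build_sort3d_command(nproc, output_dir, radius,
                         refinement_dir=None, instack=None, sym="c1", **options):
    """Assemble an mpirun argument list for one of the two sorting modes.

    Hypothetical helper for illustration only; it is not part of the package.
    """
    # Assumption of this sketch: exactly one data source selects the mode.
    if (refinement_dir is None) == (instack is None):
        raise ValueError("specify exactly one of refinement_dir or instack")

    args = ["mpirun", "-np", str(nproc), "sxsort3d_depth.py"]
    if refinement_dir is not None:
        args.append(f"--refinement_dir={refinement_dir}")  # meridien iteration mode
    else:
        args.append(f"--instack={instack}")                # stack mode
    args += [f"--output_dir={output_dir}", f"--radius={radius}", f"--sym={sym}"]

    # Remaining options: True becomes a bare flag, other values become --name=value.
    for name, value in options.items():
        if value is True:
            args.append(f"--{name}")
        elif value is not False and value is not None:
            args.append(f"--{name}={value}")
    return args


iteration_cmd = build_sort3d_command(
    48, "outdir_sxsort3d_depth_iteration", 52,
    refinement_dir="outdir_sxmeridien",
    img_per_grp=2000, minimum_grp_size=1500, swap_ratio=5, do_swap_au=True,
)
print(shlex.join(iteration_cmd))
```

The list form can be passed directly to ''subprocess.run''; ''shlex.join'' is used only to display a copy-and-paste version of the command.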
  
Line 27:
 ===== Input =====
 === Main Parameters ===
-  ; %%--%%refinement_dir : Meridien run directory: Specify the path to meridien 3D refinement directory. From here, data will be imported. Specific to iteration mode. (default none)
+  ; %%--%%refinement_dir : Meridien refinement directory: A string denoting the meridien 3D refinement directory. Sorting switches to meridien iteration mode when this option is specified. (default none)
-  ; %%--%%instack : Input images stack: File path of input particle stack for sorting. Specific to stack mode. (default none)
+  ; %%--%%instack : Input images stack: A string denoting the file path of the input particle stack for sorting. Sorting switches to stack mode when this option is specified. (default none)
-  ; %%--%%output_dir : Output directory: The master output directory for 3D sorting. The results will be written here. This directory will be created automatically if it does not exist. By default, the program uses sort3d_DATA_AND_TIME for the name. Here, you can find a log.txt that describes the sequences of computations in the program. (default none)
+  ; %%--%%output_dir : Output directory: A string denoting the output directory for 3D sorting. The directory may or may not already exist. By default, the program uses sort3d_DATA_AND_TIME for the name. Here, you can find a log.txt that describes the sequence of computations in the program. (default none)
 
-  ; %%--%%niter_for_sorting : 3D refinement iteration ID: Specify the iteration ID of 3D refinement for sorting. By default, it uses the iteration at which refinement achieved the best resolution. Specific to iteration mode. (default -1)
+  ; %%--%%niter_for_sorting : Iteration ID of 3D refinement for importing data: By default, the program uses the iteration at which refinement achieved the best resolution. This option is valid only in meridien iteration mode. (default -1)
-  ; %%--%%nxinit : Initial image size: Image size used for MGSKmeans in case of starting sorting from a data stack. By default, the program determines window size. Specific to stack mode. (default -1)
+  ; %%--%%nxinit : Initial image size: Image size used for MGSKmeans when sorting starts from a data stack. By default, the program determines the window size. This option is valid only in stack mode. (default -1)
-  ; %%--%%mask3D : 3D mask: File path of the global 3D mask for clustering. (default none)
+  ; %%--%%mask3D : 3D mask: A string denoting the file path of the global 3D mask for clustering. In meridien iteration mode, it is imported from the 3D refinement unless the user specifies a different one. (default none)
-  ; %%--%%focus : Focus 3D mask: File path of a binary 3D mask for focused clustering. (default none)
+  ; %%--%%focus : Focus 3D mask: A string denoting the file path of a binary 3D mask for focused clustering. (default none)
-  ; %%--%%radius : Estimated particle radius [Pixels]: The value must be smaller than half of the box size. (default -1)
+  ; %%--%%radius : Estimated particle radius [Pixels]: An integer value that must be smaller than half of the box size. In meridien iteration mode, it is imported from the refinement unless the user specifies a different one. (default -1)
-  ; %%--%%sym : Point-group symmetry: Point group symmetry of the macromolecular structure. It can be inherited from refinement. (default c1)
+  ; %%--%%sym : Point-group symmetry: A string denoting the point-group symmetry of the macromolecular structure. In meridien iteration mode, it is imported from the refinement unless the user specifies a different one. It must be specified in stack mode. (default c1)
-  ; %%--%%img_per_grp : Number of images per group: User expected group size. This value is critical for a successful 3D clustering. (default 1000)
+  ; %%--%%img_per_grp : Number of images per group: The user-expected group size, as an integer. (default 1000)
-  ; %%--%%img_per_grp_split_rate : Group splitting rate: Rate for splitting the number of images per group (%%--%%img_per_grp). (default 1)
+  ; %%--%%img_per_grp_split_rate : Group splitting rate: An integer value denoting the split rate of the group size (%%--%%img_per_grp). (default 1)
-  ; %%--%%minimum_grp_size : Minimum size of reproducible class: It serves as the minimum size of selected or accounted clusters as well as the minimum group size constraint in MGSKmeans. However this value must be smaller than the number of images per a group (img_per_grp). By default, the program uses half number of the images per group.  (default -1)
+  ; %%--%%minimum_grp_size : Minimum size of reproducible class: The minimum size of selected or accounted clusters as well as the minimum group size constraint in MGSKmeans. This value must be smaller than the number of images per group (img_per_grp). By default, the program uses half the number of images per group. (default -1)
-  ; %%--%%do_swap_au : Swap flag: Randomly swap a certain number of accounted elements per cluster with the unaccounted elements. If the processing with the default values are extremely slow or stalled, please use this --do_swap_au option and set --swap_ratio to a large value (15.0[%] is a good start point). (default False)
+  ; %%--%%do_swap_au : Swap flag: A boolean flag that controls random swapping of a certain number of accounted elements per cluster with the unaccounted elements. If the processing with the default values is extremely slow or stalled, please use this --do_swap_au option and set --swap_ratio to a large value (15.0[%] is a good starting point). (default False)
-  ; %%--%%swap_ratio : Swap percentage [%]: Specify a swap percentage between 0.0[%] and 50.0[%]. Effective only with --do_swap_au. Without --do_swap_au, the program automatically sets --swap_ratio to 0.0. If the processing with the default values are extremely slow or stalled, please use --do_swap_au and set this --swap_ratio option to a large value (15.0[%] is a good start point). (default 1.0)
+  ; %%--%%swap_ratio : Swap percentage [%]: The percentage of images to swap, between 0.0[%] and 50.0[%]. This option is valid only with --do_swap_au. Without --do_swap_au, the program automatically sets --swap_ratio to 0.0. If the processing with the default values is extremely slow or stalled, please use --do_swap_au and set this --swap_ratio option to a large value (15.0[%] is a good starting point). (default 1.0)
   ; %%--%%memory_per_node : Memory per node [GB]: User provided information about memory per node in GB (NOT per CPU). It will be used to evaluate the number of CPUs per node from user-provided MPI setting. By default, it uses 2GB * (number of CPUs per node). (default -1.0)
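Several of the main parameters interact: ''minimum_grp_size'' defaults to half of ''img_per_grp'' and must stay below it, ''swap_ratio'' is restricted to 0.0 to 50.0 and is reset to 0.0 without ''--do_swap_au'', and ''radius'' must be smaller than half of the box size. The sketch below restates these rules as a small check that can be run before submitting a job. The function name ''resolve_sorting_settings'' and the ''box_size'' argument are hypothetical, and the use of integer division for "half" is an assumption; the program's own handling may differ in detail.

```python
def resolve_sorting_settings(img_per_grp=1000, minimum_grp_size=-1,
                             do_swap_au=False, swap_ratio=1.0,
                             radius=-1, box_size=None):
    """Apply the documented defaults and constraints (illustrative sketch only)."""
    # Default: half the number of images per group (integer division is an assumption).
    if minimum_grp_size == -1:
        minimum_grp_size = img_per_grp // 2
    if not 0 < minimum_grp_size < img_per_grp:
        raise ValueError("minimum_grp_size must be smaller than img_per_grp")

    # The swap percentage is documented as lying between 0.0 and 50.0.
    if not 0.0 <= swap_ratio <= 50.0:
        raise ValueError("swap_ratio must be between 0.0 and 50.0")
    # Without --do_swap_au the swap ratio is set to 0.0.
    if not do_swap_au:
        swap_ratio = 0.0

    # The radius must be smaller than half of the box size (checked only if both are known).
    if box_size is not None and radius != -1 and not radius < box_size / 2:
        raise ValueError("radius must be smaller than half of the box size")

    return {
        "img_per_grp": img_per_grp,
        "minimum_grp_size": minimum_grp_size,
        "do_swap_au": do_swap_au,
        "swap_ratio": swap_ratio,
        "radius": radius,
    }


defaults = resolve_sorting_settings()
swapping = resolve_sorting_settings(img_per_grp=2000, minimum_grp_size=1500,
                                    do_swap_au=True, swap_ratio=15.0)
```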
  
 \\
 === Advanced Parameters ===
-  ; %%--%%depth_order : Depth order: The value defines the number of initial independent MGSKmeans runs (2^depth_order). (default 2)
+  ; %%--%%depth_order : Depth order: An integer value that defines the number of initial independent MGSKmeans runs (2^depth_order). (default 2)
-  ; %%--%%stop_mgskmeans_percentage : Stop MGSKmeans percentage [%]: Particle change percentage for stopping minimum group size K-means. (default 10.0)
+  ; %%--%%stop_mgskmeans_percentage : Image assignment percentage to stop MGSKmeans [%]: A floating-point number denoting the percentage of particles that change assignment, which serves as the convergence criterion of minimum group size K-means. (default 10.0)
   ; %%--%%nsmear : Number of smears for sorting: Fill it with 1 if user does not want to use all smears. (default -1)
-  ; %%--%%orientation_groups : Number of orientation groups: Number of orientation groups in the asymmetric unit. (default 100)
+  ; %%--%%orientation_groups : Number of orientation groups: Number of orientation groups in an asymmetric unit. (default 100)
   ; %%--%%not_include_unaccounted : Do unaccounted reconstruction: Do not reconstruct unaccounted elements in each generation. (default False question reversed in GUI)
   ; %%--%%notapplybckgnoise : Use background noise flag: Flag to turn off background noise. (default False question reversed in GUI)
-  ; %%--%%random_group_elimination_threshold : Random group elimination threshold: Specify the threshold as a factor of the random group reproducibility standard deviation for eliminating random groups. (default 2.0)
+  ; %%--%%random_group_elimination_threshold : Random group elimination threshold: A floating-point value denoting the threshold, as a factor of the random group reproducibility standard deviation, for eliminating random groups. (default 2.0)
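Two of the advanced parameters reduce to simple arithmetic: ''depth_order'' sets the number of initial independent MGSKmeans runs to 2^depth_order, and ''stop_mgskmeans_percentage'' is a threshold on the percentage of particles whose group assignment changes. The following sketch illustrates both. The function names are hypothetical, and comparing the change percentage with a strict "less than" is an assumption; the program's actual stopping test is not described here.

```python
def initial_run_count(depth_order=2):
    """Number of initial independent MGSKmeans runs, 2^depth_order."""
    return 2 ** depth_order


def assignment_change_percentage(previous, current):
    """Percentage of particles whose group label differs between two iterations."""
    changed = sum(1 for old, new in zip(previous, current) if old != new)
    return 100.0 * changed / len(current)


def has_converged(previous, current, stop_mgskmeans_percentage=10.0):
    """Stop once fewer than the given percentage of particles change group (assumed rule)."""
    return assignment_change_percentage(previous, current) < stop_mgskmeans_percentage


runs = initial_run_count(2)                    # default depth_order
before = [0, 0, 1, 1]
after = [0, 1, 1, 1]                           # one of four particles moved
change = assignment_change_percentage(before, after)
```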
  
 \\ \\
Line 63:
 \\
 ===== Description =====
-sxsort3d_depth finds out stable members by carrying out two-way comparison of two independent Kmeans clustering with minimum group size constraint.
+sxsort3d_depth performs 3D clustering on the data while keeping the 3D orientation parameters of the data unchanged. It finds stable group members by carrying out a two-way comparison of two independent Kmeans clustering runs. The Kmeans clustering imposes a minimum group size constraint on each cluster, and thus the clustering will not fail under any circumstances.
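The idea of a two-way comparison can be illustrated with plain sets: particles that end up grouped together in both independent runs form a stable group, and everything else is left unaccounted. The sketch below is not the program's implementation. The greedy matching by largest overlap, the function name ''two_way_comparison'', and the use of ''minimum_grp_size'' as the cutoff for keeping a matched group are all assumptions made for illustration.

```python
def two_way_comparison(partition_a, partition_b, minimum_grp_size):
    """Return (stable_groups, unaccounted) for two partitions of the same particle IDs.

    Each partition is a list of sets of particle IDs. Illustrative sketch only.
    """
    # Overlap size for every pair of groups, largest first.
    overlaps = sorted(
        ((len(a & b), i, j)
         for i, a in enumerate(partition_a)
         for j, b in enumerate(partition_b)),
        reverse=True,
    )

    used_a, used_b, stable_groups = set(), set(), []
    for size, i, j in overlaps:
        if size < minimum_grp_size:
            break                      # remaining overlaps are too small to keep
        if i in used_a or j in used_b:
            continue                   # each group is matched at most once
        used_a.add(i)
        used_b.add(j)
        stable_groups.append(partition_a[i] & partition_b[j])

    all_ids = set().union(*partition_a)
    accounted = set().union(*stable_groups)
    return stable_groups, all_ids - accounted


run_a = [{0, 1, 2, 3, 4}, {5, 6, 7, 8, 9}]
run_b = [{0, 1, 2, 3, 5}, {4, 6, 7, 8, 9}]
stable, unaccounted = two_way_comparison(run_a, run_b, minimum_grp_size=3)
```

In this toy input, particles 4 and 5 are assigned differently in the two runs, so they fall outside both matched groups.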
  
 \\ \\
Line 70:
 || %%--%%depth_order || The parameter resembles the previous option number of independent runs but it controls sorting in a different way. The default value of 2 is a good choice. ||
 || %%--%%minimum_grp_size || This parameter selects qualified clusters and controls Kmeans clustering stability. The suggested value would be between img_per_grp/2 and img_per_grp but should be less than img_per_grp. ||
-|| %%--%%stop_mgskmeans_percentage || Even though this option is not new, here the suggestion would be not to set it too small. 5.0 - 10.0  is a good choice. ||
+|| %%--%%stop_mgskmeans_percentage || The suggestion would be not to set it too small. 5.0 - 10.0 is a good choice. ||
 || %%--%%orientation_groups || It divides the asymmetric unit into the specified number of orientation groups and casts the data orientation parameters into them. It is meant to prevent sorting by angle, i.e., assigning certain angles to one group, for example top views to one group and side views to another. ||
 || %%--%%swap_ratio || The ratio of randomly replaced particles in a group; it is meant to prevent premature convergence. When the program obtains both stable groups and unaccounted elements, it reassigns unaccounted elements back to stable groups, and continues sorting. Before re-assignment of unaccounted elements, the program swaps some elements of stable groups with unaccounted ones using this specified swap_ratio. ||
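The swap step described for ''swap_ratio'' can be sketched as follows: a given percentage of each stable group is exchanged with randomly chosen unaccounted elements before the unaccounted elements are reassigned. This is an illustration under assumptions, not the program's code. The function name ''swap_with_unaccounted'', the ''seed'' argument, and the rounding of the swap count with ''int'' are hypothetical choices.

```python
import random


def swap_with_unaccounted(stable_groups, unaccounted, swap_ratio, seed=0):
    """Exchange swap_ratio percent of each stable group with unaccounted elements.

    Group sizes and the size of the unaccounted pool are preserved.
    Illustrative sketch only.
    """
    rng = random.Random(seed)            # fixed seed keeps the sketch reproducible
    groups = [list(group) for group in stable_groups]
    pool = list(unaccounted)

    for group in groups:
        # Number of elements to exchange, limited by the size of the pool.
        n_swap = min(int(len(group) * swap_ratio / 100.0), len(pool))
        group_positions = rng.sample(range(len(group)), n_swap)
        pool_positions = rng.sample(range(len(pool)), n_swap)
        for gi, pi in zip(group_positions, pool_positions):
            group[gi], pool[pi] = pool[pi], group[gi]
    return groups, pool


original_group = list(range(10))
original_pool = [100, 101, 102, 103, 104]
swapped_groups, swapped_pool = swap_with_unaccounted(
    [original_group], original_pool, swap_ratio=20.0)
```

With a group of 10 elements and a swap ratio of 20.0, two elements of the group are exchanged with two elements of the unaccounted pool.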
Line 142:
 
 \\
-
pipeline/sort3d/sxsort3d_depth.txt · Last modified: 2019/04/02 10:47 by lusnig