Differences

This shows you the differences between two versions of the page.

--- pipeline:window:cryolo [2019/09/14 23:59]
twagner [Configuration]
+++ pipeline:window:cryolo [2019/09/17 13:40]
twagner [Advanced parameters]
@@ Line 17: / Line 17: @@
 <note>
 You can find more technical details in our paper:
@@ Line 38: / Line 39: @@
 ===== Tutorials =====
-Depending on what you want to do, you can follow one of these tutorials:
+Depending on what you want to do, you can follow one of these self-contained tutorials:
   - I would like to train a model from scratch for picking my particles
@@ Line 49: / Line 50: @@
 ===== Picking particles - Using a model trained for your data =====
+This tutorial explains how to train a model specific to your dataset.
-==== Data preparation ====
 If you followed the installation instructions, you now have to activate the cryolo virtual environment with
@@ Line 57: / Line 57: @@
 source activate cryolo
 </code>
+==== Data preparation ====
-In the following I will assume that your image data is in the folder ''full_data''.
+{{page>pipeline:window:cryolo:data_preparation}}
-The next step is to create training data. To do so, we have to pick single particles manually in several micrographs. Ideally, the micrographs are picked to completion. [[:cryolo_picking_unlabeled|However, it is not necessary to pick all particles. crYOLO will still converge if you miss some (or even many).]]
-One may ask how many micrographs have to be picked? It depends! Typically 10 micrographs are a good start. However, that number may increase / decrease due to several factors:
-  * A very heterogeneous background could make it necessary to pick more micrographs.
-  * When you refine a general model, you might need to pick fewer micrographs.
-  * If your micrograph is only sparsely decorated, you may need to pick more micrographs.
-We recommend that you start with 10 micrographs, then autopick your data, check the results and finally decide whether to add more micrographs to your training set. If you refine a general model, even 5 micrographs might be enough.
-{{:pipeline:window:box_manager.png?direct&400 |}}
-To create your training data, crYOLO is shipped with a tool called "boxmanager". However, you can also use tools like e2boxer to create your training data.
-Start the box manager with the following command:
-<code>
-cryolo_boxmanager.py
-</code>
-Now press //File -> Open image folder// and then select the ''full_data'' directory. The first image should pop up. You can navigate in the directory tree through the images. Here is how to pick particles:
-  * LEFT MOUSE BUTTON: Place a box
-  * HOLD LEFT MOUSE BUTTON: Move a box
-  * CONTROL + LEFT MOUSE BUTTON: Remove a box
-You can change the box size in the main window by changing the number in the text field labeled //Box size://. Press //Set// to apply it to all picked particles. __For picking, you should use the smallest square that encloses your particle.__
-If you finished picking from your micrographs, you can export your box files with //Files -> Write box files//.
-Create a new directory called ''train_annotation'' and save it there. Close boxmanager.
-Now create a third folder with the name ''train_image''. For each box file, copy the corresponding image from ''full_data'' into ''train_image''((While it is nice to keep things organized, you don't have to copy your training images into a separate folder. In the configuration file (see below) you can also simply specify the full_data directory as "//train_image_folder//". crYOLO will find the correct images using the box files.)). crYOLO detects image / box file pairs by taking the name of each box file and searching for an image whose filename contains it.
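The filename-based pairing described above can be illustrated with a short Python sketch. This is not crYOLO's actual implementation; the function name ''pair_boxes_with_images'' and the example file names are hypothetical, and the sketch only assumes the rule stated in the text (an image matches when its filename contains the box file's name without extension).

```python
from pathlib import Path


def pair_boxes_with_images(box_names, image_names):
    """Map each box file to the first image whose stem contains the box file's stem.

    Hypothetical helper for illustration only; crYOLO's real matching may differ.
    """
    pairs = {}
    for box in box_names:
        box_stem = Path(box).stem  # file name without its extension
        for image in image_names:
            if box_stem in Path(image).stem:
                pairs[box] = image
                break  # keep the first matching image
    return pairs


# Made-up file names: two box files, three candidate images.
pairs = pair_boxes_with_images(
    ["mic_001.box", "mic_002.box"],
    ["mic_001.mrc", "mic_002.mrc", "mic_003.mrc"],
)
print(pairs)
```

Images without a matching box file (here ''mic_003.mrc'') are simply left out, which is why pointing the training image folder at the complete data directory can still work.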
 ==== Start crYOLO ====
@@ Line 94: / Line 65: @@
 ==== Configuration ====
 {{page>pipeline:window:cryolo:configuration}}
-==== Training ====
-Now you are ready to train the model. In case you have multiple GPUs, you should first select a free GPU. The following command will show the status of all GPUs:
+<html>
-<code>
+<div style="background-color: #cfc ; padding: 10px; border: 1px solid green;">
-nvidia-smi
+<b>You can now press the Start button to create your configuration file.</b>
-</code>
+</div>
-For this tutorial, we assume that you have either a single GPU or want to use GPU 0. Therefore we add '-g 0' after each command below. However, if you have multiple GPUs (e.g. GPU 0 and GPU 1), you could also use both by adding '-g 0 1' after each command.
+</html>
-Navigate to the folder with ''config.json'' file, ''train_image'' folder, etc.
-**Train your network with 3 warmup epochs:**
+{{page>pipeline:window:cryolo:configuration_cmdl_normal}}
-<code>
+==== Training ====
-cryolo_train.py -c config.json -w 3 -g 0
-</code>
-The final model will be called ''model.h5''
-The training stops when the "loss" metric on the validation data does not improve 10 times in a row. This is typically enough. If you want to give the training more time to find the best model, you can increase the "not changed in a row" parameter to, for example, 15 by adding the flag //-e 15//:
-<code>
-cryolo_train.py -c config.json -w 3 -g 0 -e 15
-</code>
-to the training command.
+{{page>pipeline:window:cryolo:training}}
 ==== Picking ====
 {{page>pipeline:window:cryolo:picking}}
@@ Line 125: / Line 84: @@
 ==== Visualize the results ====
 {{page>pipeline:window:cryolo:visualize}}
+==== Evaluate your results ====
+{{page>pipeline:window:cryolo:evaluate_results}}
 ===== Picking particles - Without training using a general model =====
 Here you can find how to apply the general models we trained for you. If you would like to train your own general model, please see our extra wiki page: [[:cryolo_train_general_model|How to train your own general model]]
@@ Line 130: / Line 92: @@
 Our general models can be found and downloaded here: [[howto:download_latest_cryolo|Download and Installation]].
+If you followed the installation instructions, you now have to activate the cryolo virtual environment with
+<code>
+source activate cryolo
+</code>
 ==== Start crYOLO ====
 {{page>pipeline:window:cryolo:start_cryolo}}
 ==== Configuration====
-The next step is to create a configuration file. Type:
+In the GUI choose the //config// action. Fill in your target box size and leave the //train_image_folder// and //train_annot_folder// fields empty.
+{{ :pipeline:window:cryolo_filter_options.png?300|}}
+[[:downloads:cryolo_1#general_phosaurusnet_models|There are three general models available]]. It is important that you choose the same filtering options in the //"Model/denoising options"// tab as we did when training the general models:
+  * General model trained for low-pass filtered images: Select //filter// "LOWPASS" and a //low_pass_cutoff// of 0.1.
+  * General model trained for JANNI-denoised images: Select //filter// "JANNI" and the [[:janni_tutorial#download|janni general model]] for //janni_model//. Keep the defaults for //janni_overlap// and //janni_batches//.
+  * General model for negative stain images: Select //filter// "NONE".
+<html>
+<div style="background-color: #cfc ; padding: 10px; border: 1px solid green;">
+<b>Press the Start button to write the configuration file to disk.  </b>
+</div>
+</html>
+<hidden **Create the configuration file using the command line**>
+In the following I assume that your target box size is 220. Please adapt if necessary.
+For the general **[[:cryolo_nets#network_3_phosaurusnet|Phosaurus network]]** trained for **low-pass filtered cryo images** run:
 <code>
-touch config.json
+cryoloo.py config config_cryolo_.json 220 --filter LOWPASS --low_pass_cutoff 0.1
 </code>
-Open the file with your preferred editor.
+For the general model trained with **neural-network denoised cryo images** (with [[:janni_tutorial#download|JANNI's general model]]) run:
+<code>
-There are two general **[[:cryolo_nets#network_3_phosaurusnet|Phosaurus networks]]** available. One for cryo em images and one for negative stain data.
+cryoloo.py config config_cryolo_.json 220 --filter JANNI --janni_model /path/to/janni_general_model.h5
-=== CryoEM images ===
-For the general **[[:cryolo_nets#network_3_phosaurusnet|Phosaurus network]]** trained for **low-pass filtered cryo images** enter the following inside:
-<hidden **config.json for low-pass filtered cryo-images**>
-<code json config.json>
-    {
-    "model" : {
-        "architecture":         "PhosaurusNet",
-        "input_size":           1024,
-        "anchors":              [205,205],
-        "max_box_per_image":    700,
-        "num_patches":          1,
-        "filter":               [0.1,"tmp_filtered"]
-      }
-    }
-</code>
-</hidden>
-<html><br></html>
-For the general model trained with **neural-network denoised cryo images** (with JANNI's general model) enter the following inside:
-<hidden **config.json for neural-network denoised cryo-images**>
-<code json config.json>
-    {
-    "model" : {
-        "architecture":         "PhosaurusNet",
-        "input_size":           1024,
-        "anchors":              [205,205],
-        "max_box_per_image":    700,
-        "num_patches":          1,
-        "filter":               ["gmodel_janni_20190703.h5",24,3,"tmp_filtered_nn"]
-      }
-    }
 </code>
-You can download the file ''gmodel_janni_20190703.h5'' [[https://github.com/MPI-Dortmund/sphire-janni/tree/master/janni_general_models|here]]
+For the general model for **negative stain data** please run:
-</hidden>
+<code>
-<html><br></html>
+cryoloo.py config config_cryolo_.json 220 --filter NONE
-In all cases please set the value in the //"anchors"// field to your desired box size. It should be the size, in pixels, of the smallest square that encloses the particle.
-=== Negative stain images ===
-For the general model for **negative stain data** please use:
-<hidden **config.json for negative stain images**>
-<code json config.json>
-    {
-    "model" : {
-        "architecture":         "PhosaurusNet",
-        "input_size":           1024,
-        "anchors":              [205,205],
-        "max_box_per_image":    700,
-        "num_patches":          1
-      }
-    }
 </code>
 </hidden>
-Please set the value in the //"anchors"// field to your desired box size. It should be the size, in pixels, of the smallest square that encloses the particle.
 ==== Picking ====
@@ Line 218: / Line 159: @@
 However, the fine tune mode is still somewhat experimental and we will update this section if we see more advantages or disadvantages.
-===== Start crYOLO =====
+If you followed the installation instructions, you now have to activate the cryolo virtual environment with
+<code>
+source activate cryolo
+</code>
+==== Data preparation ====
+{{page>pipeline:window:cryolo:data_preparation}}
+==== Start crYOLO ====
 {{page>pipeline:window:cryolo:start_cryolo}}
 ==== Configuration ====
+{{page>pipeline:window:cryolo:configuration}}
+{{ :pipeline:window:cryolo_pretrained_weights.png?300|}}
+Furthermore, you have to select the model you want to refine. Download the general model you want to refine and specify it in the //pretrained_weights// field in the //"Training options"// tab.
+<html>
+<div style="background-color: #cfc ; padding: 10px; border: 1px solid green;">
+<b>You can now press the Start button to create your configuration file.</b>
+</div>
+</html>
+<hidden **Create the configuration file using the command line:**>
-You can use almost the same configuration as used when  [[pipeline:window:cryolo#configuration|training from scratch]]. You just have to tell crYOLO to use the latest general model((You can download it [[http://sphire.mpg.de/wiki/doku.php?id=downloads:cryolo_1&redirect=1#general_phosaurusnet_models|here]])) by pointing to it with the //"pretrained_weights"// options:
+I assume your box files for training are in the folder ''train_annotation'' and the corresponding images in ''train_image''. I furthermore assume that the box size in your box files is 160 and the model you want to refine is ''gmodel_phosnet_20190516.h5''. To create the configuration file ''config_cryolo.json'', simply run:
 <code>
-"train": {
+cryoloo.py config config_cryolo.json 160 --train_image_folder train_image --train_annot_folder train_annotation --pretrained_weights gmodel_phosnet_20190516.h5
-    [...]
-    "pretrained_weights":              "LATEST_GENERAL_MODEL.h5",
-    [...]
-    "saved_weights_name":   "my_refined_model.h5",
-    [...]
-}
 </code>
+To get a full description of all available options type:
+<code>
+cryoloo.py config -h
+</code>
+If you want to specify separate validation folders you can use the %%--%%valid_image_folder and %%--%%valid_annot_folder options:
+<code>
+cryoloo.py config config_cryolo.json 160 --train_image_folder train_image --train_annot_folder train_annotation --pretrained_weights gmodel_phosnet_20190516.h5 --valid_image_folder valid_img --valid_annot_folder valid_annot
+</code>
+</hidden>
 ==== Training ====
-In comparison to the training from scratch, you can skip the warm up training ( -w 0 ). Moreover you have to add the //%%--%%fine_tune// flag to tell crYOLO that it should do fine tuning. You can also tell crYOLO how many layers it should fine tune (default is two layers with -lft 2 ):
+Now you are ready to train the model. In case you have multiple GPUs, you should first select a free GPU. The following command will show the status of all GPUs:
 <code>
-cryolo_train.py -c config.json -w 0 -g 0 --fine_tune -lft 2
+nvidia-smi
 </code>
+For this tutorial, we assume that you have either a single GPU or want to use GPU 0.
+In the GUI choose the action //train//. In the //"Required arguments"// tab select the configuration file we created in the previous step and set the number of warmup periods to zero.
+{{ :pipeline:window:cryolo_refine.png?600 |}}
+In the //"Optional arguments"// tab please check the fine_tune box.
+{{ :pipeline:window:cryolo_refine_02.png?300 |}}
+<note important>
+The number of layers to fine tune (specified by layers_fine_tune in the //"Optional arguments"// tab) is still experimental. The default value of 2 worked for us, but you might need more layers.
+</note>
 <note tip>
@@ Line 248: / Line 230: @@
 The fine tune mode is especially useful if you want to [[downloads:cryolo_1#run_it_on_the_cpu|train crYOLO on the CPU]]. On my local machine it reduced the time for training cryolo on 14 micrographs from 12-15 hours to 4-5 hours.
 </note>
+<hidden **Run training with the command line**>
+In comparison to the training from scratch, you can skip the warm up training ( -w 0 ). Moreover you have to add the //%%--%%fine_tune// flag to tell crYOLO that it should do fine tuning. You can also tell crYOLO how many layers it should fine tune (default is two layers with -lft 2 ):
+<code>
+cryolo_train.py -c config.json -w 0 -g 0 --fine_tune -lft 2
+</code>
+</hidden>
 ==== Picking ====
 {{page>pipeline:window:cryolo:picking}}
+==== Visualize the results ====
+{{page>pipeline:window:cryolo:visualize}}
+==== Evaluate your results ====
+{{page>pipeline:window:cryolo:evaluate_results}}
 ===== Picking filaments - Using a model trained for your data =====
 Since version 1.1.0 crYOLO supports picking filaments.
@@ Line 262: / Line 257: @@
 {{:pipeline:window:filament_tracing_02.png?300|}}  {{:pipeline:window:filament_tracing_03.png?300|}}
+If you followed the installation instructions, you now have to activate the cryolo virtual environment with
+<code>
+source activate cryolo
+</code>
 ==== Data preparation ====
-{{ :pipeline:window:settings_e2helixboxer.png?300|}} A
+{{ :pipeline:window:settings_e2helixboxer.png?300|}}
 The first step is to create the training data for your model. Right now, you have to use the e2helixboxer.py for this:
@@ Line 272: / Line 274: @@
 After tracing your training data in e2helixboxer, export them using //File -> Save//. Make sure that you export particle coordinates, as this is the only format supported right now (see screenshot). In the following example, it is expected that you exported into a folder called "train_annotation".
+For projects with roughly 20 filaments per image we successfully trained on 40 images (=> 800 filaments).
+==== Start crYOLO ====
+{{page>pipeline:window:cryolo:start_cryolo}}
 ==== Configuration ====
 {{page>pipeline:window:cryolo:configuration}}
-==== Training ====
-In principle, there is not much difference in training crYOLO for filament picking and particle picking. For projects with roughly 20 filaments per image we successfully trained on 40 images (=> 800 filaments). However, in our experience the warm-up phase and training need a little bit more time:
-**Train your network with 10 warm up epochs:**
+<html>
+<div style="background-color: #cfc ; padding: 10px; border: 1px solid green;">
+<b>You can now press the Start button to create your configuration file.</b>
+</div>
+</html>
-<code>
-cryolo_train.py -c config.json -w 10 -g 0 -e 10
-</code>
-The final model will be called ''model.h5''
+{{page>pipeline:window:cryolo:configuration_cmdl_normal}}
+==== Training ====
+{{page>pipeline:window:cryolo:training}}
 ==== Picking ====
+Select the //prediction// action and fill in all arguments in the "Required arguments" tab:
+{{ :pipeline:window:cryolo:cryolo_prediction.png?600 |}}
-The biggest difference in picking filaments with crYOLO is during prediction. However, there are just three additional parameters needed:
+Now select the "Filament options" tab and check "Activate filament mode", specify the filament width (e.g. 100) and define the box distance (e.g. 20 for 90% overlap when using a box size of 200):
-  * //- -filament//: Option that tells crYOLO that you want to predict filaments
+{{ :pipeline:window:cryolo_filament.png?700 |}}
-  * //-fw//: Filament width (pixels)
-  * //-bd//: Inter-Box distance (pixels).
+Press the start button to start the picking. The directory ''output_boxes'' will be created and all results are saved there. The format is the eman2 helix format with particle coordinates.
+You can find a detailed description [[:cryolo_filament_import_relion|how to import crYOLO filament coordinates into Relion here]].
+<hidden **Run prediction with the command line**>
 Let's assume you want to pick a filament with a width of 100 pixels (-fw 100). The box size is 200x200 and you want a 90% overlap (-bd 20). Moreover, you want each filament to have at least 6 boxes (-mn 6). The micrographs are in the ''full_data'' directory. Then the picking command would be:
 <code>
-cryolo_predict.py -c config.json -w model.h5 -i full_data --filament -fw 100 -bd 20 -o boxes/ -g 0 -mn 6
+cryolo_predict.py -c cryolo_config.json -w cryolo_model.h5 -i full_data --filament -fw 100 -bd 20 -o boxes/ -g 0 -mn 6
 </code>
+</hidden>
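The relation between box size, overlap and box distance used in the example above (box size 200, 90% overlap, box distance 20) can be checked with a short Python sketch. The helper name ''box_distance'' is hypothetical and not part of crYOLO; the sketch only assumes that the box distance equals the non-overlapping fraction of the box size.

```python
def box_distance(box_size, overlap):
    """Inter-box distance in pixels for a box size (pixels) and a fractional overlap.

    Hypothetical helper: assumes distance = box_size * (1 - overlap).
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in the range [0, 1)")
    # Round to whole pixels; this also absorbs floating-point noise.
    return round(box_size * (1 - overlap))


# The example from the text: 200 pixel boxes with 90% overlap.
print(box_distance(200, 0.9))
```

The returned value is what you would enter as the box distance in the "Filament options" tab or pass with ''-bd'' on the command line.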
-The directory ''boxes'' will be created and all results are saved there. The format is the eman2 helix format with particle coordinates. You can find a detailed description [[:cryolo_filament_import_relion|how to import crYOLO filament coordinates into Relion here]].
 ==== Visualize the results ====
 {{page>pipeline:window:cryolo:visualize}}
-===== Evaluate your results =====
-<note warning>
-Unfortunately, this script **does not work for filamental data**.
-</note>
-The evaluation tool allows you, based on your validation data, to get statistics about your training.
-If you followed the tutorial, the validation data are selected randomly. With crYOLO 1.1.0 a run file for each training is created and saved into the folder runfiles/ in your project directory. This run file contains which files were selected for validation, and you can run your evaluation as follows:
-<code>
-cryolo_evaluation.py -c config.json -w model.h5 -r runfiles/run_YearMonthDay-HourMinuteSecond.json -g 0
-</code>
-The result looks like this:
-{{:pipeline:window:eval_example.png?900 |}}
-The table contains several statistics:
-  * AUC: Area under curve of the precision-recall curve. Overall summary statistics. Perfect classifier = 1, Worst classifier = 0
-  * Topt: Optimal confidence threshold with respect to the F1 score. It might not be ideal for your picking, as the F1 score weighs recall and precision equally. However in SPA, recall is often more important than the precision.
-  * R (Topt): Recall using the optimal confidence threshold.
-  * R (0.3): Recall using a confidence threshold of 0.3.
-  * R (0.2): Recall using a confidence threshold of 0.2.
-  * P (Topt): Precision using the optimal confidence threshold.
-  * P (0.3): Precision using a confidence threshold of 0.3.
-  * P (0.2): Precision using a confidence threshold of 0.2.
-  * F1 (Topt): Harmonic mean of precision and recall using the optimal confidence threshold.
-  * F1 (0.3): Harmonic mean of precision and recall using a confidence threshold of 0.3.
-  * F1 (0.2): Harmonic mean of precision and recall using a confidence threshold of 0.2.
-  * IOU (Topt): Intersection over union of the auto-picked particles and the corresponding ground-truth boxes. The higher, the better -- evaluated with the optimal confidence threshold.
-  * IOU (0.3): Intersection over union of the auto-picked particles and the corresponding ground-truth boxes. The higher, the better -- evaluated with a confidence threshold of 0.3.
-  * IOU (0.2): Intersection over union of the auto-picked particles and the corresponding ground-truth boxes. The higher, the better -- evaluated with a confidence threshold of 0.2.
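As a rough illustration of the IOU statistic listed above, the following Python sketch computes the intersection over union of two axis-aligned boxes. It is not taken from crYOLO's evaluation code; the function name ''iou'' and the ''(x, y, width, height)'' box layout, with ''(x, y)'' as the lower-left corner, are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x, y, width, height).

    Illustrative sketch only; (x, y) is assumed to be the lower-left corner.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


# A picked box shifted by half a box width relative to the ground-truth box.
print(iou((0, 0, 10, 10), (5, 0, 10, 10)))
```

An IOU of 1 means the picked box coincides with the ground-truth box, while 0 means the two boxes do not overlap at all.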
-If the training data consists of multiple folders, then evaluation will be done for each folder separately.
-Furthermore, crYOLO estimates the optimal picking threshold with respect to the F1 score and the F2 score. Both are weighted harmonic means of recall and precision; the F2 score puts more weight on recall, which is often more important in cryo-EM.
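The F1 and F2 scores mentioned above are both instances of the general F-beta score, a weighted harmonic mean of precision and recall. The following Python sketch uses the standard textbook formula; it is not crYOLO's own code, and the function name ''f_beta'' as well as the example precision and recall values are made up for illustration.

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score: beta > 1 weights recall more heavily than precision."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing was picked correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Made-up example: every true particle is found, but half of the picks are false positives.
precision, recall = 0.5, 1.0
f1 = f_beta(precision, recall, beta=1.0)
f2 = f_beta(precision, recall, beta=2.0)
print(f1, f2)
```

With high recall and modest precision, the F2 score comes out higher than the F1 score, which reflects its stronger emphasis on recall.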
-===== Advanced parameters =====
-During **training** (//cryolo_train//), there are the following advanced parameters:
-  * //%%--%%warm_restarts//: With this option the learning rate is decreasing after each epoch and then reset after a couple of epochs.
-  * //%%--%%num_cpu NUMBER_OF_CPUS//: Number of CPU cores used during training
-  * //%%--%%gpu_fraction FRACTION//: Number between 0 - 1 quantifying the fraction of GPU memory that is reserved by crYOLO
-  * //%%--%%skip_augmentation//: Set this flag if crYOLO should skip the data augmentation (not recommended).
-  * //%%--%%fine_tune//: With this flag, crYOLO will only train the last layers (fine tune)
-  * //%%-%%lft NUM_LAYER_FINETUNE//: Numbers of layers to fine tune (default is 2).
-During **picking** (//cryolo_predict//), there are these advanced parameters:
-  * //-t CONFIDENCE_THRESHOLD//: With the -t parameter, you can make crYOLO pick more conservatively (e.g. by adding -t 0.4 to the picking command) or less conservatively (e.g. by adding -t 0.2 to the picking command). The valid parameter range is 0 to 1.
-  * //-d DISTANCE_IN_PIXEL//: With the -d parameter you can filter your picked particles. Boxes with a distance (pixel) less than this value will be removed.
-  * //-pbs PREDICTION_BATCH_SIZE//: With the -pbs parameter you can set the number of images picked as batch. Default is 3.
-  * //%%--%%otf//: Instead of saving the filtered images into a separate directory, crYOLO will filter them on-the-fly and will not write them to disk.
-  * //%%--%%num_cpu NUMBER_OF_CPUS//: Number of CPU cores used during prediction
-  * //%%--%%gpu_fraction FRACTION//: Number between 0 - 1 quantifying the fraction of GPU memory that is reserved by crYOLO.
-  * //%%--%%monitor//: With this flag, crYOLO will monitor your input directory and pick images as they appear in the directory. The monitor mode can be stopped by writing the empty file STOP.CRYOLO (( you can create it with <code>touch STOP.CRYOLO</code> )) into the input directory.
-  * //-sr SEARCH_RANGE_FACTOR//: (FILAMENT MODE) The search range for connecting boxes is the box size times this factor. Default is 1.41
 ===== Help =====
