pipeline:window:cryolo

Differences

This shows you the differences between two versions of the page.

pipeline:window:cryolo [2019/09/13 22:35]
twagner [Data preparation]
pipeline:window:cryolo [2019/09/15 09:57] (current)
twagner [Picking particles - Using the general model refined for your data]
Line 38: Line 38:
 ===== Tutorials ===== ===== Tutorials =====
  
-Depending on what you want to do, you can follow one of these Tutorials:+Depending on what you want to do, you can follow one of these self-contained Tutorials:
  
   - I would like to train a model from scratch for picking my particles   - I would like to train a model from scratch for picking my particles
Line 44: Line 44:
   - I would like to refine a general model for my particles.   - I would like to refine a general model for my particles.
  
-The **first and the second tutorial** are the most common use cases and are well tested. The **third tutorial** is still experimental but might give you better results in less time or with less training data. +The **first and the second tutorial** are the most common use cases and are well tested. The **third tutorial** is still experimental but might give you better results in less time and with less training data. 
  
  
Line 52: Line 52:
  
 ==== Data preparation ==== ==== Data preparation ====
-If you followed the installation instructions,​ you now have to activate the cryolo virtual environment with +{{page>pipeline:​window:​cryolo:​data_preparation}}
- +
-<​code>​ +
-source activate cryolo +
-</​code>​ +
- +
-In the following I will assume that your image data is in the folder ''​full_data''​. +
- +
-The next step is to create training data. To do so, we have to pick single particles manually in several micrographs. Ideally, the micrographs are picked to completion. [[:​cryolo_picking_unlabeled|However,​ it is not necessary to pick all particles. crYOLO will still converge if you miss some (or even many).]] +
-One may ask how many micrographs have to be picked. It depends! Typically, 10 micrographs are a good start. However, that number may increase or decrease due to several factors: +
-  * A very heterogeneous background could make it necessary to pick more micrographs. +
-  * When you refine a general model, you might need to pick fewer micrographs. +
-  * If your micrographs are only sparsely decorated, you may need to pick more micrographs. +
- +
-We recommend that you start with 10 micrographs,​ then autopick your data, check the results and finally decide whether to add more micrographs to your training set.  +
- +
-{{:pipeline:​window:​box_manager.png?​direct&​400 |}} +
-To create your training data, crYOLO is shipped with a tool called "​boxmanager"​. However, you can also use tools like e2boxer to create your training data. +
- +
-Start the box manager with the following command: +
-<​code>​ +
-cryolo_boxmanager.py +
-</​code>​ +
- +
-Now press //File -> Open image folder// and then select the ''full_data'' directory. The first image should pop up. You can navigate through the images in the directory tree. Here is how to pick particles:  +
- +
-  * LEFT MOUSE BUTTON: Place a box +
-  * HOLD LEFT MOUSE BUTTON: Move a box +
-  * CONTROL + LEFT MOUSE BUTTON: Remove a box +
- +
-You can change the box size in the main window by changing the number in the text field labeled //Box size://. Press //Set// to apply it to all picked particles. __For picking, you should use the smallest square that encloses your particle.__ +
- +
-Once you have finished picking from your micrographs, you can export your box files with //Files -> Write box files//. +
-Create a new directory called ''train_annotation'' and save them there. Close boxmanager. +
- +
-Now create a third folder with the name ''train_image''. For each box file, copy the corresponding image from ''full_data'' into ''train_image''((While it is nice to keep things organized, you don't have to copy your training images into a separate folder. In the configuration file (see below) you can also simply specify the full_data directory as "//train_image_folder//". crYOLO will find the correct images using the box files.)). crYOLO detects image / box file pairs by taking the box file name and searching for an image file name that contains it.+
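The copy step described above can be scripted. The following is only a minimal sketch, not part of crYOLO: the helper name ''copy_training_images'' is made up for illustration, and it assumes that the box files carry the ''.box'' extension. It applies the matching rule from the text (an image belongs to a box file if its file name contains the box file name).

```python
import shutil
from pathlib import Path


def copy_training_images(box_dir, image_dir, target_dir):
    """Copy every image whose file name contains the stem of a box file.

    Hypothetical helper for illustration; assumes box files end in ".box".
    """
    box_dir, image_dir, target_dir = Path(box_dir), Path(image_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for box_file in sorted(box_dir.glob("*.box")):
        for image in sorted(image_dir.iterdir()):
            # Matching rule from the text: the image name contains the box file name.
            if image.is_file() and box_file.stem in image.name:
                shutil.copy2(image, target_dir / image.name)
                copied.append(image.name)
    return copied
```

With the folder names used in this tutorial, the call would be ''copy_training_images("train_annotation", "full_data", "train_image")''.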
  
 ==== Start crYOLO ==== ==== Start crYOLO ====
Line 93: Line 58:
  
 ==== Configuration ==== ==== Configuration ====
-You now have to create a configuration file for your picking project. It contains all important constants and paths and helps you to reproduce your results later on.  +{{page>pipeline:window:cryolo:configuration}}
- +
-You can either use the commandline to create the configuration file or the GUI. +
- +
-**Using the command line:** +
- +
-To create an empty file do: +
-<​code>​ +
-touch config.json +
-</​code>​ +
- +
-To use the [[:​cryolo_nets#​network_3_phosaurusnet|Phosaurus network]] copy the following lines into that file: +
-<code json config.json>​ +
-{ +
-    "model" : { +
-        "architecture":         "PhosaurusNet", +
-        "input_size":           1024, +
-        "anchors":              [160,160], +
-        "max_box_per_image":    600, +
-        "num_patches":          1, +
-        "filter":               [0.1,"filtered"] +
-    }, +
- +
-    "train": { +
-        "train_image_folder":   "train_image/", +
-        "train_annot_folder":   "train_annotation/", +
-        "train_times":          10, +
-        "pretrained_weights":   "model.h5", +
-        "batch_size":           4, +
-        "learning_rate":        1e-4, +
-        "nb_epoch":             50, +
-        "warmup_epochs":        0, +
- +
-        "object_scale":         5.0 , +
-        "no_object_scale":      1.0, +
-        "coord_scale":          1.0, +
-        "class_scale":          1.0, +
-        "log_path":             "logs/", +
-        "saved_weights_name":   "model.h5", +
-        "debug":                true +
-    }, +
- +
-    "valid": { +
-        "valid_image_folder":   "", +
-        "valid_annot_folder":   "", +
- +
-        "valid_times":          1 +
-    } +
-} +
-</​code>​ +
-//​[[:​cryolo_config|Click here to get more information about the configuration file]]// +
- +
-Please set the value in the //"anchors"// field to your desired box size. It should be the same as in your training box files. Furthermore, check that the fields //"train_image_folder"// and //"train_annot_folder"// have the correct values. Typically, 20% of the training data are randomly chosen as validation data. If you want to use specific images as validation data, you can move the images and the corresponding box files to the folders specified in //"valid_image_folder"// and //"valid_annot_folder"//. Make sure that they are removed from the original training folder! With the line below, crYOLO automatically filters your images to an absolute frequency of 0.1 and writes them into a folder "filtered". +
-<​code>​ +
-"​filter": ​              ​[0.1,"​filtered"​]. +
-</​code>​ +
-crYOLO will automatically check if an image in full_data is available in the ''​filtered''​ directory. The filtering is done in parallel. If you don't want to use crYOLO'​s internal filtering, just remove the line and filter them manually. If you remove the line, don't forget to remove the comma at the end of the line above.  +
- +
-<note tip> +
-**Alternative:​ Using neural-network denoising with JANNI** +
- +
-Since crYOLO 1.4 you can also use neural network denoising with [[:janni|JANNI]]. The easiest way is to use JANNI's general model ([[:janni#janni_general_model|Download here]]), but you can also [[:janni_tutorial#training_a_model_for_your_data|train JANNI for your data]]. crYOLO uses an interface to JANNI to filter your data directly; you just have to specify the path to your JANNI model, the overlap of the patches (default 24), the batch size (default 3) and a path where the denoised images should be written.  +
- +
-To use JANNI's denoising, you have to use the following entry in your config.json: +
- +
-<​code>​ +
-"​filter": ​              ​["​path/​to/​janni_model.h5",​24,​3,"​filtered"​] +
-</​code>​  +
- +
-I recommend using denoising with JANNI only together with a GPU, as it is rather slow (about 1-2 seconds per micrograph on the GPU and 10 seconds per micrograph on the CPU). +
- +
-</​note>​ +
- +
-Please note the wiki entry about the [[:​cryolo_config|crYOLO configuration file]] if you want to know more details. +
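The configuration described above can also be generated with a small script instead of copying the JSON by hand. This is just a sketch under assumptions: the function name ''build_config'' is made up for illustration and is not part of crYOLO. All keys and values are taken from the example above. The ''filter'' entry can be swapped for the JANNI variant.

```python
import json


def build_config(box_size, filter_entry=(0.1, "filtered")):
    """Return the example configuration as a dict (hypothetical helper)."""
    return {
        "model": {
            "architecture": "PhosaurusNet",
            "input_size": 1024,
            "anchors": [box_size, box_size],  # should match your training box size
            "max_box_per_image": 600,
            "num_patches": 1,
            "filter": list(filter_entry),
        },
        "train": {
            "train_image_folder": "train_image/",
            "train_annot_folder": "train_annotation/",
            "train_times": 10,
            "pretrained_weights": "model.h5",
            "batch_size": 4,
            "learning_rate": 1e-4,
            "nb_epoch": 50,
            "warmup_epochs": 0,
            "object_scale": 5.0,
            "no_object_scale": 1.0,
            "coord_scale": 1.0,
            "class_scale": 1.0,
            "log_path": "logs/",
            "saved_weights_name": "model.h5",
            "debug": True,
        },
        "valid": {
            "valid_image_folder": "",
            "valid_annot_folder": "",
            "valid_times": 1,
        },
    }


# Low-pass filter variant, as in the example above.
config_text = json.dumps(build_config(box_size=160), indent=4)

# JANNI variant: model path, patch overlap, batch size, output folder.
janni_text = json.dumps(
    build_config(160, ("path/to/janni_model.h5", 24, 3, "filtered")), indent=4
)
```

Writing ''config_text'' to ''config.json'' gives a file that a JSON parser accepts, which avoids the missing-comma problem mentioned above when the ''filter'' line is edited by hand.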
- +
-**Using the GUI:** +
 ==== Training ==== ==== Training ====
  
Line 196: Line 85:
 to the training command. to the training command.
 ==== Picking ==== ==== Picking ====
-You can now use the model weights saved in ''​model.h5''​ (//if you come to this section from another point of the tutorial, this filename might be different like ''​gmodel_phosnet_X_Y.h5''//​) to pick all your images in the directory ''​full_data''​. To do this, run:  +{{page>pipeline:​window:​cryolo:​picking}}
-<​code>​ +
-cryolo_predict.py -c config.json -w model.h5 -i full_data/ -g 0 -o boxfiles/ +
-</code>+
  
-You will find the picked particles in the directory ''​boxfiles''​. 
- 
-If you want to pick less or more conservatively, you can change the selection threshold from its default of 0.3 to a less conservative value like 0.2 or a more conservative value like 0.4 using the //-t// parameter:
-<​code>​ 
-cryolo_predict.py -c config.json -w model.h5 -i full_data/ -g 0 -o boxfiles/ -t 0.2 
-</​code>​ 
-However, it is much easier to select the best threshold after picking, using the ''CBOX'' files written by crYOLO as described in the next section.
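If you call the picking from a script, the command shown above can be assembled as follows. This is a minimal sketch: the helper ''predict_command'' is hypothetical, and it uses only the options that appear on this page (''-c'', ''-w'', ''-i'', ''-g'', ''-o'', ''-t''). It also checks that the threshold lies in the valid range of 0 to 1.

```python
import shlex


def predict_command(config, weights, input_dir, output_dir, gpu=0, threshold=None):
    """Build the cryolo_predict.py argument list (hypothetical helper)."""
    command = [
        "cryolo_predict.py",
        "-c", config,
        "-w", weights,
        "-i", input_dir,
        "-g", str(gpu),
        "-o", output_dir,
    ]
    if threshold is not None:
        # The valid range for -t is 0 to 1; crYOLO's default is 0.3.
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be between 0 and 1")
        command += ["-t", str(threshold)]
    return command


command = predict_command("config.json", "model.h5", "full_data/", "boxfiles/", threshold=0.2)
print(shlex.join(command))
```

The returned list can be passed to ''subprocess.run'' inside the activated cryolo environment.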
  
 ==== Visualize the results ==== ==== Visualize the results ====
- +{{page>pipeline:​window:​cryolo:visualize}}
-To visualize your results you can use the box manager: +
-<​code>​ +
-cryolo_boxmanager.py +
-</​code>​ +
-Now press //File -> Open image folder// and then select the ''full_data'' directory. The first image should pop up. Then import the box files with //File -> Import box files// and select the ''EMAN'' directory in the ''boxfiles'' folder.  +
- +
-Since version 1.3.0 crYOLO writes cbox files in a separate ''​CBOX''​ folder. You can import them into the box manager, change the threshold easily using the live preview and write the new box selection into new box files. +
- +
-[{{ :pipeline:window:ezgif-1-3b966b0324d1.gif?400 |This example shows how to filter particle boxes using the cryolo boxmanager. It is an animated gif. Click on it to see it playing.}}] +
- +
-<note warning>​ +
-Right now, **this filtering does not yet work for filaments**. +
-</​note>​ +
- +
- +
 ===== Picking particles - Without training using a general model ===== ===== Picking particles - Without training using a general model =====
 Here you can find how to apply the general models we trained for you. If you would like to train your own general model, please see our extra wiki page: [[:​cryolo_train_general_model|How to train your own general model]] Here you can find how to apply the general models we trained for you. If you would like to train your own general model, please see our extra wiki page: [[:​cryolo_train_general_model|How to train your own general model]]
Line 300: Line 163:
  
 ==== Picking ==== ==== Picking ====
-Just follow the description given [[pipeline:​window:​cryolo#​Picking|above]] +{{page>pipeline:​window:​cryolo:picking}}
- +
-As with a directly trained model, you might want to play around with the confidence threshold, either by using the ''CBOX'' files after prediction or by directly setting a different confidence threshold with the -t parameter during prediction. +
  
 +==== Visualize the results ====
 +{{page>​pipeline:​window:​cryolo:​visualize}}
 ===== Picking particles - Using the general model refined for your data ===== ===== Picking particles - Using the general model refined for your data =====
  
Line 315: Line 177:
  
 Why should I //​fine-tune//​ my model instead of training from scratch? Why should I //​fine-tune//​ my model instead of training from scratch?
-  -  In theory, fine-tuning should reduce the risk of overfitting ((Overfitting means that the model works well on the training micrographs, but not on new unseen micrographs. The model just memorized what it saw instead of learning generic features.)). +  -  In theory, fine-tuning should reduce the risk of overfitting ((Overfitting means that the model works well on the training micrographs, but not on new unseen micrographs. The model just memorized what it saw instead of learning generic features.)) and the amount of training data needed.
   - The training is much faster, as not all layers have to be trained.   - The training is much faster, as not all layers have to be trained.
   - The training will need less GPU memory ((We are testing crYOLO with its default configuration on graphic cards with >= 8 GB memory. Using the fine tune mode, it should also work with GPUs with 4 GB memory)) and therefore is usable with NVIDIA cards with less memory. ​   - The training will need less GPU memory ((We are testing crYOLO with its default configuration on graphic cards with >= 8 GB memory. Using the fine tune mode, it should also work with GPUs with 4 GB memory)) and therefore is usable with NVIDIA cards with less memory. ​
Line 321: Line 183:
 However, the fine tune mode is still somewhat experimental and we will update this section if we see more advantages or disadvantages. However, the fine tune mode is still somewhat experimental and we will update this section if we see more advantages or disadvantages.
  
 +==== Data preparation ====
 +{{page>​pipeline:​window:​cryolo:​data_preparation}}
 +
 +==== Start crYOLO ====
 +
 +{{page>​pipeline:​window:​cryolo:​start_cryolo}}
 ==== Configuration ==== ==== Configuration ====
  
Line 336: Line 204:
  
 ==== Training ==== ==== Training ====
-In comparision ​to the training from scratch, you can skip the warm up training. Moreover you have to add the //​%%--%%fine_tune//​ flag:+In comparison ​to the training from scratch, you can skip the warm up training ​( -w 0 ). Moreover you have to add the //​%%--%%fine_tune//​ flag to tell crYOLO that it should do fine tuning. You can also tell crYOLO how many layers it should fine tune (default is two layers with -lft 2 ):
  
 <​code>​ <​code>​
-cryolo_train.py -c config.json -w 0 -g 0 --fine_tune+cryolo_train.py -c config.json -w 0 -g 0 --fine_tune ​-lft 2
 </​code>​ </​code>​
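The fine-tune call above can also be built in a script. This is only a sketch with a made-up helper name (''fine_tune_command''). It uses the options shown in the command above and fixes the warm up to ''-w 0'', since the warm up training is skipped in fine tune mode.

```python
import shlex


def fine_tune_command(config="config.json", gpu=0, layers_to_fine_tune=2):
    """Build the cryolo_train.py call for fine tuning (hypothetical helper)."""
    if layers_to_fine_tune < 1:
        raise ValueError("at least one layer has to be fine tuned")
    return [
        "cryolo_train.py",
        "-c", config,
        "-w", "0",  # no warm up epochs in fine tune mode
        "-g", str(gpu),
        "--fine_tune",
        "-lft", str(layers_to_fine_tune),  # default on this page is 2
    ]


print(shlex.join(fine_tune_command()))
```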
-==== Picking ==== 
-Picking is identical as with a model trained from scratch, so we will skip it here. Just follow the description given [[pipeline:​window:​cryolo#​Picking|above]] 
  
-==== Training on CPU ====+<note tip>
  
 +**Training on CPU** 
  
 The fine tune mode is especially useful if you want to [[downloads:​cryolo_1#​run_it_on_the_cpu|train crYOLO on the CPU]]. On my local machine it reduced the time for training cryolo on 14 micrographs from 12-15 hours to 4-5 hours. The fine tune mode is especially useful if you want to [[downloads:​cryolo_1#​run_it_on_the_cpu|train crYOLO on the CPU]]. On my local machine it reduced the time for training cryolo on 14 micrographs from 12-15 hours to 4-5 hours.
 +</​note>​
 +==== Picking ====
 +{{page>​pipeline:​window:​cryolo:​picking}}
 +
 +
 ===== Picking filaments - Using a model trained for your data ===== ===== Picking filaments - Using a model trained for your data =====
 Since version 1.1.0 crYOLO supports picking filaments. Since version 1.1.0 crYOLO supports picking filaments.
Line 360: Line 232:
  
 ==== Data preparation ==== ==== Data preparation ====
-{{ :​pipeline:​window:​settings_e2helixboxer.png?​300|}} ​As described [[pipeline:​window:​cryolo#​data_preparation|previously]],​ filtering your image using a low-pass filter is probably a good idea. +{{ :​pipeline:​window:​settings_e2helixboxer.png?​300|}} ​
  
-After this is done, you have to prepare ​training data for your model. +The first step is to create the training data for your model. Right now, you have to use the e2helixboxer.py ​for this:
- Right now, you have to use the e2helixboxer.py ​to generate the training data:+
 <​code>​ <​code>​
 e2helixboxer.py --gui my_images/​*.mrc e2helixboxer.py --gui my_images/​*.mrc
Line 369: Line 240:
  
 After tracing your training data in e2helixboxer, export them using //File -> Save//. Make sure that you export particle coordinates, as this is the only format supported right now (see screenshot). In the following example, it is expected that you exported into a folder called "train_annotation". After tracing your training data in e2helixboxer, export them using //File -> Save//. Make sure that you export particle coordinates, as this is the only format supported right now (see screenshot). In the following example, it is expected that you exported into a folder called "train_annotation".
- 
 ==== Configuration ==== ==== Configuration ====
-You can configure it the same way as for a "​normal"​ project. +{{page>​pipeline:window:cryolo:configuration}}
- +
-<code json config.json>​ +
-{ +
-    "model" : { +
-        "architecture":         "PhosaurusNet", +
-        "input_size":           1024, +
-        "anchors":              [160,160], +
-        "max_box_per_image":    600, +
-        "num_patches":          1, +
-        "filter":               [0.1,"filtered"] +
-    }, +
- +
-    "train": { +
-        "train_image_folder":   "train_image/", +
-        "train_annot_folder":   "train_annotation/", +
-        "train_times":          10, +
-        "pretrained_weights":   "model.h5", +
-        "batch_size":           4, +
-        "learning_rate":        1e-4, +
-        "nb_epoch":             50, +
-        "warmup_epochs":        0, +
- +
-        "object_scale":         5.0 , +
-        "no_object_scale":      1.0, +
-        "coord_scale":          1.0, +
-        "class_scale":          1.0, +
-        "log_path":             "logs/", +
-        "saved_weights_name":   "model.h5", +
-        "debug":                true +
-    }, +
- +
-    "valid": { +
-        "valid_image_folder":   "", +
-        "valid_annot_folder":   "", +
- +
-        "valid_times":          1 +
-    } +
-} +
-</​code>​ +
- +
-//​[[:​cryolo_config|Click here to get more information about the configuration file]]// +
- +
-Just adapt the anchors accordingly to your box size. +
 ==== Training ==== ==== Training ====
  
Line 443: Line 269:
  
 ==== Visualize the results ==== ==== Visualize the results ====
-You can use the boxmanager as described [[pipeline:​window:​cryolo#​visualize_the_results|previously]].+{{page>pipeline:​window:​cryolo:​visualize}}
  
 ===== Evaluate your results ===== ===== Evaluate your results =====
Line 484: Line 310:
   * //​%%--%%gpu_fraction FRACTION//: Number between 0 - 1 quantifying the fraction of GPU memory that is reserved by crYOLO   * //​%%--%%gpu_fraction FRACTION//: Number between 0 - 1 quantifying the fraction of GPU memory that is reserved by crYOLO
   * //%%--%%skip_augmentation//: Set this flag if crYOLO should skip the data augmentation (not recommended).   * //%%--%%skip_augmentation//: Set this flag if crYOLO should skip the data augmentation (not recommended).
 +  * //​%%--%%fine_tune//:​ With this flag, crYOLO will only train the last layers (fine tune)
 +  * //%%-%%lft NUM_LAYER_FINETUNE//: Number of layers to fine tune (default is 2). 
  
-During **picking** (//​cryolo_predict//​),​ there are five advanced parameters:+During **picking** (//​cryolo_predict//​),​ there are these advanced parameters:
   * //-t CONFIDENCE_THRESHOLD//: With the -t parameter, you can let crYOLO pick more conservatively (e.g. by adding -t 0.4 to the picking command) or less conservatively (e.g. by adding -t 0.2 to the picking command). The valid parameter range is 0 to 1.   * //-t CONFIDENCE_THRESHOLD//: With the -t parameter, you can let crYOLO pick more conservatively (e.g. by adding -t 0.4 to the picking command) or less conservatively (e.g. by adding -t 0.2 to the picking command). The valid parameter range is 0 to 1.
   * //-d DISTANCE_IN_PIXEL//:​ With the -d parameter you can filter your picked particles. Boxes with a distance (pixel) less than this value will be removed.   * //-d DISTANCE_IN_PIXEL//:​ With the -d parameter you can filter your picked particles. Boxes with a distance (pixel) less than this value will be removed.
Line 491: Line 319:
   * //%%--%%otf//: Instead of saving the filtered images into a separate directory, crYOLO will filter them on-the-fly and not write them to disk.   * //%%--%%otf//: Instead of saving the filtered images into a separate directory, crYOLO will filter them on-the-fly and not write them to disk.
   * //​%%--%%num_cpu NUMBER_OF_CPUS//:​ Number of CPU cores used during prediction   * //​%%--%%num_cpu NUMBER_OF_CPUS//:​ Number of CPU cores used during prediction
-  * //​%%--%%gpu_fraction FRACTION//: Number between 0 -1 quantifying the fraction of GPU memory that is reserved by crYOLO +  * //​%%--%%gpu_fraction FRACTION//: Number between 0 -1 quantifying the fraction of GPU memory that is reserved by crYOLO
-  * //-sr SEARCH_RANGE_FACTOR//: (FILAMENT MODE) The search range for connecting boxes is the box size times this factor. Default is 1.41+  * //--monitor//: With this flag, crYOLO will monitor your input directory and pick images as they appear in the directory. The monitor mode can be stopped by writing the empty file STOP.CRYOLO (( you can create it with <code>touch STOP.CRYOLO</code> )) into the input directory.  
 +  * //-sr SEARCH_RANGE_FACTOR//:​ (FILAMENT MODE) The search range for connecting boxes is the box size times this factor. Default is 1.41  
 +   
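The stop file for the monitor mode listed above can also be created from a script, for example at the end of a data collection session. This is a minimal sketch: the helper name ''request_monitor_stop'' is made up for illustration. Only the file name STOP.CRYOLO and the fact that it is an empty file in the input directory come from this page.

```python
from pathlib import Path


def request_monitor_stop(input_dir):
    """Create the empty STOP.CRYOLO file in the monitored input directory."""
    stop_file = Path(input_dir) / "STOP.CRYOLO"
    stop_file.touch()  # same effect as `touch STOP.CRYOLO` in that directory
    return stop_file
```

For the folder used in this tutorial, the call would be ''request_monitor_stop("full_data")''.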
  
 ===== Help ===== ===== Help =====
  • pipeline/window/cryolo.1568406901.txt.gz
  • Last modified: 2019/09/13 22:35
  • by twagner