pipeline:window:cryolo

Differences

This shows you the differences between two versions of the page.

pipeline:window:cryolo [2019/09/15 09:57]
twagner [Picking particles - Using the general model refined for your data]
pipeline:window:cryolo [2021/02/19 10:00]
twagner
Line 1: Line 1:
-{{ :downloads:cryolo_logo.jpg?600 |}}+{{  :downloads:cryolo_logo.jpg?600  }}
  
 ===== Overview =====
  
-CrYOLO is a fast and accurate particle picking procedure. It's based on convolutional neural networks and utilizes the popular [[https://arxiv.org/abs/1612.08242|You Only Look Once]] (YOLO) object detection system.  +<note warning> 
-  * crYOLO makes picking **fast** -- On a modern GPU it will pick your particles at up to 6 micrographs per second. + 
-  * crYOLO makes picking **smart** -- The network learns the context of particles (e.g. not to pick particles on carbon or within ice contamination) +**NEW DOCUMENTATION** 
-  * crYOLO makes training **easy** -- You might use a general network model and skip training completely. However, if the general model doesn't give you satisfactory results or if you would like to improve them, you might want to train a specialized model for your data set by selecting __particles__ (no selection of negative examples necessary) on a small number of micrographs. + 
-  * crYOLO makes training **tolerant** -- Don't worry if you miss quite a lot of particles during creation of your training set. [[:cryolo_picking_unlabeled|crYOLO will still do the job.]]+The documentation has moved to [[https://cryolo.readthedocs.io|https://cryolo.readthedocs.io]] 
 + 
 +</note> 
 + 
 +CrYOLO is a fast and accurate particle picking procedure. It's based on convolutional neural networks and utilizes the popular [[https://arxiv.org/abs/1612.08242|You Only Look Once]] (YOLO) object detection system. 
 + 
 +  * crYOLO makes picking **fast**  – On a modern GPU it will pick your particles at up to 6 micrographs per second. 
 +  * crYOLO makes picking **smart**  – The network learns the context of particles (e.g. not to pick particles on carbon or within ice contamination) 
 +  * crYOLO makes training **easy**  – You might use a general network model and skip training completely. However, if the general model doesn't give you satisfactory results or if you would like to improve them, you might want to train a specialized model for your data set by selecting __particles__  (no selection of negative examples necessary) on a small number of micrographs. 
 +  * crYOLO makes training **tolerant**  – Don't worry if you miss quite a lot of particles during creation of your training set. [[:cryolo_picking_unlabeled|crYOLO will still do the job.]]
  
 In this tutorial we explain our recommended configurations for single particle and filament projects. You can find more information on how to use crYOLO, the supported networks and the config file in the following articles:
 +
   * [[https://www.youtube.com/embed/JTgldM4wAAk|crYOLO talk at SBGrid]]
   * [[:cryolo_nets|crYOLO networks]]
   * [[:cryolo_config|crYOLO configuration file]]
  
 +<note>
  
- 
-<note> 
 You can find more technical details in our paper:
  
Line 25: Line 34:
 We are also proud that crYOLO was recommended by F1000:
  
-//"CrYOLO works amazingly well in identifying the true particles and distinguishing them from other high-contrast features. Thus, crYOLO provides a fast, automated tool, which gives similar reliable results as careful manual selection and outperforms template based selection procedures."// +//"CrYOLO works amazingly well in identifying the true particles and distinguishing them from other high-contrast features. Thus, crYOLO provides a fast, automated tool, which gives similar reliable results as careful manual selection and outperforms template based selection procedures."//  <html></html> <html> <a href="https://f1000.com/prime/733517098?bd=1" target="_blank"><img src="https://s3.amazonaws.com/cdn.f1000.com/images/badges/badgef1000.gif" alt="Access the recommendation on F1000Prime" id="bg" /> Bettina Böttcher, Biochemistry, University Würzburg</a> </html> </note>
-<html></html> +
-<html> +
-<a href="https://f1000.com/prime/733517098?bd=1" target="_blank"><img src="https://s3.amazonaws.com/cdn.f1000.com/images/badges/badgef1000.gif" alt="Access the recommendation on F1000Prime" id="bg" />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Bettina Böttcher, Biochemistry, University Würzburg</a> +
-</html> +
-</note>+
  
 ===== Installation =====
  
-You can find the download and installation instructions here: [[howto:download_latest_cryolo|Download and Installation]]+You can find the download and installation instructions here: [[:howto:download_latest_cryolo|Download and Installation]]
  
-===== Tutorials =====+{{page>pipeline:window:cryolo:issues}}
  
-Depending on what you want to do, you can follow one of these self-contained tutorials:+===== Release notes =====
  
-  - I would like to train a model from scratch for picking my particles +{{page>pipeline:window:cryolo:changelog}}
-  - I would like to train a model from scratch for picking filaments. +
-  - I would like to refine a general model for my particles.+
  
-The **first and the second tutorial** are the most common use cases and well tested. The **third tutorial** is still experimental but might give you better results in less time and less training data.  +===== Tutorials =====
- +
- +
- +
-===== Picking particles - Using a model trained for your data ===== +
- +
- +
-==== Data preparation ==== +
-{{page>pipeline:window:cryolo:data_preparation}} +
- +
-==== Start crYOLO ==== +
-{{page>pipeline:window:cryolo:start_cryolo}} +
- +
-==== Configuration ==== +
-{{page>pipeline:window:cryolo:configuration}} +
-==== Training ==== +
- +
-Now you are ready to train the model. In case you have multiple GPUs, you should first select a free GPU. The following command will show the status of all GPUs: +
-<code> +
-nvidia-smi +
-</code> +
-For this tutorial, we assume that you have either a single GPU or want to use GPU 0. Therefore we add '-g 0' after each command below. However, if you have multiple (e.g GPU 0 and GPU 1) you could also use both by adding '-g 0 1' after each command. +
- +
-Navigate to the folder with ''config.json'' file, ''train_image'' folder, etc. +
- +
-**Train your network with 3 warmup epochs:** +
- +
-<code> +
-cryolo_train.py -c config.json -w 3 -g 0 +
-</code> +
- +
-The final model will be called ''model.h5'' +
- +
-The training stops when the "loss" metric on the validation data does not improve 10 times in a row. This is typically enough. In case you want to give the training more time to find the best model, you might increase the "not changed in a row" parameter to, for example, 15 by adding the flag //-e 15//: +
- +
-<code> +
-cryolo_train.py -c config.json -w 3 -g 0 -e 15 +
-</code> +
- +
-to the training command. +
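
The stopping rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not crYOLO's actual implementation: the function name ''should_stop'' and the list of per-epoch validation losses are assumptions made for this example. Only the patience values (10 by default, 15 with //-e 15//) come from the text.

```python
def should_stop(val_losses, patience=10):
    """Return True once the validation loss has failed to improve
    `patience` epochs in a row (hypothetical helper, for illustration)."""
    best = float("inf")
    epochs_without_improvement = 0
    for loss in val_losses:
        if loss < best:
            # New best validation loss: reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return True
    return False


# Two improving epochs followed by ten epochs without improvement.
history = [1.0, 0.9] + [0.95] * 10
print(should_stop(history, patience=10))
print(should_stop(history, patience=15))
```

With a larger patience the same loss history no longer triggers a stop, which is why a higher //-e// value gives the training more time.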
-==== Picking ==== +
-{{page>pipeline:window:cryolo:picking}} +
- +
- +
-==== Visualize the results ==== +
-{{page>pipeline:window:cryolo:visualize}} +
-===== Picking particles - Without training using a general model ===== +
-Here you can find how to apply the general models we trained for you. If you would like to train your own general model, please see our extra wiki page: [[:cryolo_train_general_model|How to train your own general model]] +
- +
-Our general models can be found and downloaded here: [[howto:download_latest_cryolo|Download and Installation]].  +
- +
-==== Start crYOLO ==== +
-{{page>pipeline:window:cryolo:start_cryolo}} +
- +
-==== Configuration==== +
-The next step is to create a configuration file. Type: +
-<code> +
-touch config.json +
-</code> +
- +
-Open the file with your preferred editor. +
- +
-There are two general **[[:cryolo_nets#network_3_phosaurusnet|Phosaurus networks]]** available. One for cryo em images and one for negative stain data. +
-=== CryoEM images === +
-For the general **[[:cryolo_nets#network_3_phosaurusnet|Phosaurus network]]** trained for **low-pass filtered cryo images** enter the following inside: +
-<hidden **config.json for low-pass filtered cryo-images**> +
-<code json config.json> +
-    { +
-    "model" : { +
-        "architecture":         "PhosaurusNet", +
-        "input_size":           1024, +
-        "anchors":              [205,205], +
-        "max_box_per_image":    700, +
-        "num_patches":          1, +
-        "filter":               [0.1,"tmp_filtered"+
-      } +
-    } +
-</code> +
-</hidden> +
-<html><br></html> +
-For the general model trained with **neural-network denoised cryo images** (with JANNI's general model) enter the following inside: +
-<hidden **config.json for neural-network denoised cryo-images**> +
-<code json config.json> +
-    { +
-    "model" : { +
-        "architecture":         "PhosaurusNet", +
-        "input_size":           1024, +
-        "anchors":              [205,205], +
-        "max_box_per_image":    700, +
-        "num_patches":          1, +
-        "filter":               ["gmodel_janni_20190703.h5",24,3,"tmp_filtered_nn"+
-      } +
-    } +
-</code> +
- +
-You can download the file ''gmodel_janni_20190703.h5'' [[https://github.com/MPI-Dortmund/sphire-janni/tree/master/janni_general_models|here]] +
-</hidden> +
-<html><br></html> +
-In all cases please set the value in the //"anchors"// field to your desired box size. It should be the size of the smallest square that encloses the particle, in pixels.  +
- +
-=== Negative stain images === +
-For the general model for **negative stain data** please use: +
-<hidden **config.json for negative stain images**> +
-<code json config.json> +
-    { +
-    "model" : { +
-        "architecture":         "PhosaurusNet", +
-        "input_size":           1024, +
-        "anchors":              [205,205], +
-        "max_box_per_image":    700, +
-        "num_patches":          1 +
-      } +
-    } +
-</code> +
-</hidden> +
- +
-Please set the value in the //"anchors"// field to your desired box size. It should be the size of the smallest square that encloses the particle, in pixels.  +
- +
-==== Picking ==== +
-{{page>pipeline:window:cryolo:picking}} +
- +
-==== Visualize the results ==== +
-{{page>pipeline:window:cryolo:visualize}} +
-===== Picking particles - Using the general model refined for your data ===== +
- +
- +
-Since crYOLO 1.3 you can train a model for your data by //fine-tuning// the general model. +
- +
-What does //fine-tuning// mean? +
- +
-The general model was trained on a lot of particles with a variety of shapes and therefore learned a very good set of generic features. The last layers, however, learn a pretty abstract representation of the particles and it might be that they do not perfectly fit your particle at hand. Fine-tuning only trains the last two convolutional layers, but keeps the others fixed. This adjusts the more abstract representation for your specific problem.  +
- +
-Why should I //fine-tune// my model instead of training from scratch? +
-  -  From theory, using fine-tuning should reduce the risk of overfitting ((Overfitting means that the model works well on the training micrographs, but not on new unseen micrographs. The model just memorized what it saw instead of learning generic features.)) and the amount of training data needed.  +
-  - The training is much faster, as not all layers have to be trained. +
-  - The training will need less GPU memory ((We are testing crYOLO with its default configuration on graphic cards with >= 8 GB memory. Using the fine tune mode, it should also work with GPUs with 4 GB memory)) and therefore is usable with NVIDIA cards with less memory.  +
-  +
-However, the fine tune mode is still somewhat experimental and we will update this section if we see more advantages or disadvantages. +
- +
-==== Data preparation ==== +
-{{page>pipeline:window:cryolo:data_preparation}} +
- +
-==== Start crYOLO ==== +
- +
-{{page>pipeline:window:cryolo:start_cryolo}} +
-==== Configuration ==== +
- +
-You can use almost the same configuration as used when  [[pipeline:window:cryolo#configuration|training from scratch]]. You just have to tell crYOLO to use the latest general model((You can download it [[http://sphire.mpg.de/wiki/doku.php?id=downloads:cryolo_1&redirect=1#general_phosaurusnet_models|here]])) by pointing to it with the //"pretrained_weights"// options: +
- +
-<code> +
-"train":+
-    [...] +
-    "pretrained_weights":              "LATEST_GENERAL_MODEL.h5", +
-    [...] +
-    "saved_weights_name":   "my_refined_model.h5", +
-    [...] +
-+
-</code> +
- +
-==== Training ==== +
-In comparison to the training from scratch, you can skip the warm up training ( -w 0 ). Moreover you have to add the //%%--%%fine_tune// flag to tell crYOLO that it should do fine tuning. You can also tell crYOLO how many layers it should fine tune (default is two layers with -lft 2 ): +
- +
-<code> +
-cryolo_train.py -c config.json -w 0 -g 0 --fine_tune -lft 2 +
-</code> +
- +
-<note tip> +
- +
-**Training on CPU**  +
- +
-The fine tune mode is especially useful if you want to [[downloads:cryolo_1#run_it_on_the_cpu|train crYOLO on the CPU]]. On my local machine it reduced the time for training cryolo on 14 micrographs from 12-15 hours to 4-5 hours. +
-</note> +
-==== Picking ==== +
-{{page>pipeline:window:cryolo:picking}} +
- +
- +
-===== Picking filaments - Using a model trained for your data ===== +
-Since version 1.1.0 crYOLO supports picking filaments. +
- +
-Filament mode on Actin: +
- +
-{{:pipeline:window:action_tracing_2.png?300|}}  {{:pipeline:window:action_traceing_1.png?300|}} +
- +
-Filament mode on MAVS (EMPIAR-10031) : +
- +
-{{:pipeline:window:filament_tracing_02.png?300|}}  {{:pipeline:window:filament_tracing_03.png?300|}} +
- +
-==== Data preparation ==== +
-{{ :pipeline:window:settings_e2helixboxer.png?300|}}  +
- +
-The first step is to create the training data for your model. Right now, you have to use the e2helixboxer.py for this: +
-<code> +
-e2helixboxer.py --gui my_images/*.mrc +
-</code> +
- +
-After tracing your training data in e2helixboxer, export them using //File -> Save//. Make sure that you export particle coordinates as this is the only format supported right now (see screenshot). In the following example, it is expected that you exported into a folder called "train_annotation". +
-==== Configuration ==== +
-{{page>pipeline:window:cryolo:configuration}} +
-==== Training ==== +
- +
-In principle, there is not much difference in training crYOLO for filament picking and particle picking. For a project with roughly 20 filaments per image we successfully trained on 40 images (=> 800 filaments). However, in our experience the warm-up phase and training need a little bit more time: +
- +
-**Train your network with 10 warm up epochs:** +
- +
-<code> +
-cryolo_train.py -c config.json -w 10 -g 0 -e 10 +
-</code> +
- +
-The final model will be called ''model.h5'' +
-==== Picking ==== +
- +
-The biggest difference in picking filaments with crYOLO is during prediction. However, there are just three additional parameters needed: +
- +
-  * //- -filament//: Option that tells crYOLO that you want to predict filaments +
-  * //-fw//: Filament width (pixels) +
-  * //-bd//: Inter-Box distance (pixels). +
- +
-Let's assume you want to pick a filament with a width of 100 pixels (-fw 100). The box size is 200x200 and you want a 90% overlap (-bd 20). Moreover, you want each filament to have at least 6 boxes (-mn 6). The micrographs are in the ''full_data'' directory. Then the picking command would be: +
-<code> +
-cryolo_predict.py -c config.json -w model.h5 -i full_data --filament -fw 100 -bd 20 -o boxes/ -g 0 -mn 6 +
-</code> +
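
The inter-box distance in the example above follows directly from the box size and the desired overlap: 200 pixels with 90% overlap leaves 10% of the box size, i.e. 20 pixels, between neighbouring boxes. A minimal sketch of that arithmetic (the helper name ''inter_box_distance'' is an assumption for this example and not part of crYOLO):

```python
def inter_box_distance(box_size, overlap):
    """Distance in pixels between neighbouring boxes along a filament
    for a given box size and fractional overlap (hypothetical helper)."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in the range [0, 1)")
    # The part of the box that is NOT shared with the next box.
    return round(box_size * (1.0 - overlap))


print(inter_box_distance(200, 0.9))  # 20
```

The result is the value you would pass to //-bd//.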
- +
-The directory ''boxes'' will be created and all results are saved there. The format is the eman2 helix format with particle coordinates. You can find a detailed description [[:cryolo_filament_import_relion|how to import crYOLO filament coordinates into Relion here]]. +
- +
-==== Visualize the results ==== +
-{{page>pipeline:window:cryolo:visualize}} +
- +
-===== Evaluate your results ===== +
-<note warning> +
-Unfortunately, this script **does not work for filament data**. +
-</note> +
-The evaluation tool allows you, based on your validation data, to get statistics about your training. +
-If you followed the tutorial, the validation data are selected randomly. With crYOLO 1.1.0 a run file for each training is created and saved into the folder runfiles/ in your project directory. This run file contains which files were selected for validation, and you can run your evaluation as follows: +
-<code> +
-cryolo_evaluation.py -c config.json -w model.h5 -r runfiles/run_YearMonthDay-HourMinuteSecond.json -g 0 +
-</code> +
- +
-The result looks like this: +
-{{:pipeline:window:eval_example.png?900 |}} +
- +
-The table contains several statistics: +
-  * AUC: Area under curve of the precision-recall curve. Overall summary statistics. Perfect classifier = 1, Worst classifier = 0 +
-  * Topt: Optimal confidence threshold with respect to the F1 score. It might not be ideal for your picking, as the F1 score weighs recall and precision equally. However in SPA, recall is often more important than the precision.   +
-  * R (Topt): Recall using the optimal confidence threshold.  +
-  * R (0.3): Recall using a confidence threshold of 0.3. +
-  * R (0.2): Recall using a confidence threshold of 0.2. +
-  * P (Topt): Precision using the optimal confidence threshold.  +
-  * P (0.3): Precision using a confidence threshold of 0.3. +
-  * P (0.2): Precision using a confidence threshold of 0.2. +
-  * F1 (Topt): Harmonic mean of precision and recall using the optimal confidence threshold. +
-  * F1 (0.3): Harmonic mean of precision and recall using a confidence threshold of 0.3. +
-  * F1 (0.2): Harmonic mean of precision and recall using a confidence threshold of 0.2. +
-  * IOU (Topt): Intersection over union of the auto-picked particles and the corresponding ground-truth boxes. The higher, the better -- evaluated with the optimal confidence threshold. +
-  * IOU (0.3): Intersection over union of the auto-picked particles and the corresponding ground-truth boxes. The higher, the better -- evaluated with a confidence threshold of 0.3. +
-  * IOU (0.2): Intersection over union of the auto-picked particles and the corresponding ground-truth boxes. The higher, the better -- evaluated with a confidence threshold of 0.2. +
- +
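
To make the IOU entries in the table more concrete, here is a small sketch of how the intersection over union of two boxes can be computed. It assumes boxes given as ''(x, y, width, height)'' with ''(x, y)'' being the lower-left corner. This convention and the function name ''iou'' are assumptions for this example and are not taken from the crYOLO code.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x, y, width, height) with (x, y) the lower-left corner
    (an assumed convention for this illustration).
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping region (0 if the boxes are disjoint).
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


picked = (100, 0, 200, 200)
ground_truth = (0, 0, 200, 200)
print(iou(picked, ground_truth))
```

A picked box that coincides with its ground-truth box gives 1, and a box that misses it completely gives 0.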
-If the training data consists of multiple folders, then evaluation will be done for each folder separately.  +
-Furthermore, crYOLO estimates the optimal picking threshold with respect to the F1 score and the F2 score. Both are weighted harmonic means of recall and precision; the F2 score puts more weight on the recall, which is often more important in cryo-EM.+
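
The difference between the F1 and F2 score can be seen from the general F-beta formula, (1 + beta²) · P · R / (beta² · P + R). The following sketch uses this standard definition. The function name ''f_beta'' is chosen for this example and the precision/recall values are made up for illustration, not results from crYOLO.

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score: beta=1 gives F1, beta=2 gives F2."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Two hypothetical pickers: one favours recall, the other precision.
high_recall = {"precision": 0.5, "recall": 1.0}
high_precision = {"precision": 1.0, "recall": 0.5}

for name, m in (("high recall", high_recall), ("high precision", high_precision)):
    f1 = f_beta(m["precision"], m["recall"], beta=1.0)
    f2 = f_beta(m["precision"], m["recall"], beta=2.0)
    print(f"{name}: F1={f1:.3f} F2={f2:.3f}")
```

Both pickers get the same F1 score, but the F2 score ranks the high-recall picker higher, which matches the preference for recall described above.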
  
 +Depending on what you want to do, you can follow one of these self-contained tutorials:
  
-===== Advanced parameters ===== +  - [[:pipeline:window:cryolo:picking_general|I would like to pick particles without training using a general model]] 
-During **training** (//cryolo_train//), there are the following advanced parameters: +  - [[:pipeline:window:cryolo:picking_scratch|I would like to train a model from scratch for picking my particles]] 
-  * //%%--%%warm_restarts//: With this option the learning rate decreases after each epoch and is then reset after a couple of epochs. +  - [[:pipeline:window:cryolo:picking_filaments|I would like to train a model from scratch for picking filaments]] 
-  * //%%--%%num_cpu NUMBER_OF_CPUS//: Number of CPU cores used during training +  - [[:pipeline:window:cryolo:picking_general_refine|I would like to refine a general model for my particle]]
-  * //%%--%%gpu_fraction FRACTION//: Number between 0 and 1 quantifying the fraction of GPU memory that is reserved by crYOLO +
-  * //%%--%%skip_augmentation//: Set this flag if crYOLO should skip the data augmentation (not recommended). +
-  * //%%--%%fine_tune//: With this flag, crYOLO will only train the last layers (fine tune) +
-  * //%%-%%lft NUM_LAYER_FINETUNE//: Number of layers to fine tune (default is 2). +
  
-During **picking** (//cryolo_predict//), there are these advanced parameters: +The **first, second and third tutorial**  are the most common use cases and well tested. The **fourth tutorial**  is still experimental but might give you better results in less time and less training data.
-  * //-t CONFIDENCE_THRESHOLD//: With the -t parameter, you can let the crYOLO pick more conservative (e.g by adding -t 0.4 to the picking command) or less conservative (e.g by adding -t 0.2 to the picking command). The valid parameter range is 0 to 1. +
-  * //-d DISTANCE_IN_PIXEL//: With the -d parameter you can filter your picked particles. Boxes with a distance (pixel) less than this value will be removed. +
-  * //-pbs PREDICTION_BATCH_SIZE//: With the -pbs parameter you can set the number of images picked as batch. Default is 3. +
-  * //%%--%%otf//: Instead of saving the filtered images into a separate directory, crYOLO will filter them on-the-fly and not write them to disk. +
-  * //%%--%%num_cpu NUMBER_OF_CPUS//: Number of CPU cores used during prediction +
-  * //%%--%%gpu_fraction FRACTION//: Number between 0 and 1 quantifying the fraction of GPU memory that is reserved by crYOLO. +
-  * //--monitor//: With this flag, crYOLO will monitor your input directory and pick images as they appear in the directory. The monitor mode can be stopped by writing the empty file STOP.CRYOLO (( you can create it with <code>touch STOP.CRYOLO</code> )) into the input directory.  +
-  * //-sr SEARCH_RANGE_FACTOR//: (FILAMENT MODE) The search range for connecting boxes is the box size times this factor. Default is 1.41  +
-   +
  
 ===== Help =====
Line 329: Line 62:
  
 Find help at our [[https://listserv.gwdg.de/mailman/listinfo/sphire|mailing list]]!
 +
 +
pipeline/window/cryolo.txt · Last modified: 2021/02/19 10:00 by twagner