This version (2020/02/13 14:03) was approved by twagner. The previously approved version (2019/09/29 10:34) is available.

CrYOLO is a fast and accurate particle picking procedure. It's based on convolutional neural networks and utilizes the popular You Only Look Once (YOLO) object detection system.

  • crYOLO makes picking fast – On a modern GPU it will pick your particles at up to 6 micrographs per second.
  • crYOLO makes picking smart – The network learns the context of particles (e.g. not to pick particles on carbon or within ice contamination).
  • crYOLO makes training easy – You might use a general network model and skip training completely. However, if the general model doesn't give you satisfactory results or if you would like to improve them, you might want to train a specialized model specific for your data set by selecting particles (no selection of negative examples necessary) on a small number of micrographs.
  • crYOLO makes training tolerant – Don't worry if you miss quite a lot of particles while creating your training set. crYOLO will still do the job.

In this tutorial we explain our recommended configurations for single particle and filament projects. You can find more information on how to use crYOLO, on the supported networks and on the config file in the following articles:

You can find more technical details in our paper:

Wagner, T. et al. SPHIRE-crYOLO is a fast and accurate fully automated particle picker for cryo-EM. Communications Biology 2, (2019).


We are also proud that crYOLO was recommended by F1000:

“CrYOLO works amazingly well in identifying the true particles and distinguishing them from other high-contrast features. Thus, crYOLO provides a fast, automated tool, which gives similar reliable results as careful manual selection and outperforms template based selection procedures.”

Bettina Böttcher, Biochemistry, University Würzburg (access the recommendation on F1000Prime)

You can find the download and installation instructions here: Download and Installation

Known issues
  • Issue 0: Training on multiple GPUs sometimes leads to worse performance (higher loss). We currently recommend training on a single GPU.
  • Issue 17: On-the-fly filtering (--otf) is slower than not using it, as the filtering is not parallelized in this case.
  • Issue 27: Filament mode is not working with micrographs motion corrected by unblur. Will be fixed in the next release.

Closed issues

  • Issue 1: crYOLO sometimes does not exit properly after training has finished. It has to be killed manually.
  • Issue 2: If you use automatic filtering with .tif files, you get an error like “OSError: cannot identify image file 'filtered_folder/another_folder/my_image.tif'”. It will be fixed in the next release.
  • Issue 3: (Boxmanager) The visualization only shows the first filament when loading eman1 helical box files (start end coordinates). Will be fixed in the next release.
  • Issue 4: The filament mode will crash if crYOLO cannot identify a single particle in the image. Will be fixed in 1.2.2
  • Issue 5: If movies were aligned with cisTEM and picked with crYOLO, the box positions are vertically flipped. Will be fixed in 1.2.2
  • Issue 6: crYOLO overwrites the environment variable “CUDA_VISIBLE_DEVICES” with 0 if no GPU is specified by the -g parameter. As a result, crYOLO ignores previous settings in CUDA_VISIBLE_DEVICES. Will be fixed in 1.2.2
  • Issue 7: On K3 images crYOLO seems to add an offset toward the longer axis of the input image.
  • Issue 8: There is a logical error in filament tracing, which sometimes connects two parallel filaments.
  • Issue 9: Some people report an error when running cryolo prediction/training: “ImportError: numpy.core.multiarray failed to import”. It will be fixed in 1.2.3.
  • Issue 10: On machines with many cores (e.g. 64) an error during filtering might pop up: “[ERROR:0] 53: Can't spawn new thread”
  • Issue 11: If the -g parameter is not provided, crYOLO will use the memory of all GPUs. Will be fixed in 1.2.3.
  • Issue 12: The LineEnhancer dependency of crYOLO still depends on opencv. Workaround: In the crYOLO environment: conda install opencv
  • Issue 13: After picking it can happen that some of the boxes are not fully immersed in the image. Will be fixed in 1.2.4.
  • Issue 14: Parallelization in filament mode is broken. Will be fixed in 1.2.4.
  • Issue 15: If the --gpu_fraction is used, crYOLO always uses GPU 0. Will be fixed in 1.3.1.
  • Issue 16: --gpu_fraction only works for prediction, not for training. Will be fixed in 1.3.2.
  • Issue 18: Prediction is broken in 1.3.2. It removes all particles as it claims they are not fully immersed in the image.
  • Issue 19: Filtering does not work if target image directory is absolute path.
  • Issue 20: crYOLO 1.3.4 has a normalization bug. During training the images are normalized separately, but during prediction normalization is done batch-wise. Workaround: Use -pbs 1 during prediction. It will be fixed in 1.3.5.
  • Issue 21: The search range for filament tracing is too low for many datasets. To check if you are affected: Use your trained model and pick without the filament options. Check if your filaments are nicely picked (many consecutive boxes on a filament). In the next version, the search range will be increased and added as an optional parameter.
  • Issue 22: If absolute paths are used in the field “train_image” in your configuration file, filtering is skipped.
  • Issue 23: Since crYOLO 1.4.0 it sometimes takes a long time until it starts picking. The reason seems to be the tensorflow update.
  • Issue 24: Fine-tune mode does not start (cannot find layer model_3). Will be fixed in 1.4.1.
  • Issue 25: When using the GUI, prediction behaves differently than when using the command line. The reason is that it uses a different multiprocessing start method. Will be fixed in 1.5.1.
  • Issue 26: If you select filtering “None” crYOLO does not train properly.
2020/02/13 13:59 · twagner

crYOLO 1.5.6:

  • Fix installation issues

crYOLO 1.5.5:

  • Added the option --use_multithreading for training. If python multiprocessing leads to problems during training (e.g. freezing, dying workers), use multithreading instead of multiprocessing.
  • Boxmanager 1.2.8 integrates a low-pass filter to make training data creation easier.
  • Internal refactoring needed for the boxmanager 1.2.8 changes.

crYOLO 1.5.4:

  • Fixed systematic offset when picking K3 images using the general model.
  • Fixed crash when minimum distance filter is used (Thanks to Peter Van Blerkom)
  • Add “Print cmd” button to GUI. The command can then be copied to a submission script (Thanks to Wolfgang Lugmayr).
  • Fixed problem that crYOLO does not work if no filtering is used.

crYOLO 1.5.3:

  • Fixed another problem when downloading pretrained weights
  • Fixed a crash when opening the GUI on a computer without a GPU. This was especially problematic for clusters with GPU nodes.

crYOLO 1.5.1:

  • crYOLO checks if updates are available and informs you about it
  • Fixed a problem where crYOLO prediction crashed for some users when run from the GUI.
  • Fixed a problem where crYOLO crashed during training due to memory issues.
  • Fixed a problem where the pretrained weights could not be downloaded when the installation was done globally on a cluster. Now they are downloaded during setup of crYOLO.

crYOLO 1.5.0:

  • Added a GUI for crYOLO (start with cryolo_gui.py)
  • Optimized fast low-pass filtering pipeline: This speeds up on-the-fly filtering by 175%, training by 75% and picking by 60%.
  • New monitor mode for prediction: When this option is activated, crYOLO will monitor your input folder. This is especially useful for automation purposes. You can stop the monitor mode by writing an empty file with the name “stop.cryolo” in the input directory. Just add --monitor to the command line or check the monitor box in the GUI.
  • Training is now merged into one command instead of two
  • Add rotation as additional data augmentation
  • The number of layers for fine tuning is now configurable (-lft)
  • cryolo_evaluation.py will now output an HTML file with the results.
  • Set patch argument as deprecated
  • crYOLO will not allocate the complete GPU memory anymore.
  • Remove warmup as config file option. Please specify it with -w.
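
The stop signal for monitor mode is just an empty file, so an automation script can create it when no more micrographs are expected. A minimal sketch using only Python's standard library; the helper name `stop_cryolo_monitor` is hypothetical and not part of crYOLO:

```python
from pathlib import Path


def stop_cryolo_monitor(input_dir):
    """Create the empty 'stop.cryolo' file that ends monitor mode.

    `input_dir` is the folder crYOLO is monitoring. This is a
    hypothetical convenience helper, not part of crYOLO itself.
    """
    stop_file = Path(input_dir) / "stop.cryolo"
    stop_file.touch()  # creates an empty file if it does not exist yet
    return stop_file
```

Calling `stop_cryolo_monitor("path/to/input")` from the end of an acquisition script would then signal crYOLO to leave monitor mode.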

Old crYOLO change logs

crYOLO 1.4.1:

  • Downgrade the dependencies to tensorflow 1.10.1 and numpy 1.14.5 as some users reported long initialization times. (Thanks to Shaun Rawson)
  • The initialization weights are no longer shipped with the package and are downloaded on-the-fly (because they are big and pypi does not allow such big packages)
  • crYOLO is installed through pypi
  • crYOLO box manager is installed through pypi and automatically shipped with the crYOLO package
  • Fixed fine-tune mode (Thanks to Antoine Koehl)
  • Fixed normalization function for YOLO backend (Thanks to Wolfgang Lugmayr)

crYOLO 1.4.0:

  • Support Just Another Noise 2 Noise Implementation (JANNI)
  • Add --mask_width as optional parameter for the filament mode
  • Update tensorflow from 1.10.1 to 1.12.3 to make crYOLO compatible with JANNI
  • Update numpy from 1.14.5 to 1.15.4 to make crYOLO compatible with JANNI

crYOLO 1.3.6:

  • Changed filament search radius factor from 0.8 to 1.41 (this fixed issue 21)
  • Add search radius factor as advanced parameter (-sr) during prediction in filament mode
  • Improved error message in case of corrupted config file
  • Fixed issue 22: If absolute paths are used in the field “train_image” in your configuration file, filtering is skipped.

crYOLO 1.3.5:

  • Fixed issue 20: During training the images are normalized separately, but during prediction normalization was done batch-wise. This led to confusing results: some micrographs were picked perfectly, others totally unreasonably, even with the same defocus. This bug only affects picking; already trained models can still be used.
  • Remove unnecessary dependencies
  • Add __version__ to __init__.py for easy access to package version.
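
To see why the mismatch in issue 20 matters, the following toy sketch compares per-image normalization with batch-wise normalization. It assumes a plain zero-mean, unit-variance normalization on flattened pixel lists; this illustrates the effect and is not crYOLO's actual normalization code.

```python
from statistics import mean, pstdev


def normalize(values):
    """Shift to zero mean and scale to unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


# Two toy "micrographs" (flattened pixel lists) with different brightness.
dark = [1.0, 2.0, 3.0]
bright = [11.0, 12.0, 13.0]

# Training-style: each image is normalized with its own statistics.
separate = [normalize(dark), normalize(bright)]

# Batch-wise: statistics are pooled over all images in the batch.
pooled = normalize(dark + bright)
batchwise = [pooled[:3], pooled[3:]]

print(separate[0] == separate[1])    # True: both images look identical
print(batchwise[0] == batchwise[1])  # False: the brightness offset survives
```

With a batch size of one the pooled statistics equal the per-image statistics, which is consistent with the -pbs 1 workaround listed for issue 20.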

crYOLO 1.3.4:

  • Support for SPHIRE 1.2
  • Changed the minimum threshold for cbox files from 0.01 to 0.1. Much faster in many cases but still low enough. If -t is lower than 0.1, the new threshold is used as minimum.
  • Installation now checks if python 3 is used.
  • Fix issue 19: Filtering does not work if target image directory is absolute path.
  • Fix crash when --otf was specified but filtering was not specified in the config file

crYOLO 1.3.3:

  • Fix issue 18: Prediction is broken in 1.3.2. It removes all particles as it claims they are not fully immersed in the image.

crYOLO 1.3.2:

  • Speedup prediction: Vectorized some parts of the code and optimized the creation of the cbox files. 30% speed up picking / 15% faster training compared to 1.3.1/1.3.0.
  • Bug fix in merging of filaments, which sometimes threw “IndexError: list index out of range”. (Thanks to Alexander Belyy)
  • Fix in cryolo_evaluation: If the validation data is specified with -b instead of runfiles, all datasets with only one box file were ignored.
  • Change library requirement to PILLOW version 6.0.0
  • Fix issue 16: --gpu_fraction only works for prediction, not for training.

crYOLO 1.3.1:

  • Fix Issue 15: -g was ignored when --gpu_fraction was used.

crYOLO 1.3.0:

  • Fine tune the general network to your data using the new fine tune option with --fine_tune (https://1n.pm/x8rUH)
  • On-the-fly micrograph filtering during particle picking with --otf (don't double your dataset during picking) (https://1n.pm/goXAa)
  • Interactive threshold adjustment after prediction using the new cbox-files and the crYOLO boxmanager 1.2 (https://1n.pm/k7HoI)
  • Pick only fully immersed particles (Issue 13)
  • Improved filament mode
    • Rewrote tracing
    • Rewrote and sped up merging of filaments
    • Fixed parallelisation of the filament mode (Issue 14)
  • Add tifffile as dependency, as imageio throws a lot of warnings for some tif files.
  • Add conversion for uint16 images, as pillow cannot work with them.
  • Add option --skip_augmentation to deactivate augmentation during training (Thanks to Tijmen de Wolf). (https://1n.pm/goXAa)
  • Add option --num_cpu to specify the number of CPUs used during training and during prediction. (Thanks to Nikolaus Dietz) (https://1n.pm/goXAa)
  • Add option to limit the amount of GPU memory reserved by crYOLO with --gpu_fraction (Thanks to Nikolaus Dietz) (https://1n.pm/goXAa)
  • Save anchor size in model every time you write a new model during training (not only at the end)
  • In case of using --min_distance, only the particle with lower confidence is removed (Thanks to Yilai Li)
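
The behaviour of a minimum distance filter that removes only the less confident particle can be sketched as a greedy pass over the picks. This is an illustrative sketch under assumed data structures (tuples of centre coordinates and confidence); the function name is hypothetical and this is not crYOLO's implementation.

```python
import math


def filter_min_distance(boxes, min_distance):
    """Keep a box only if no more confident kept box is closer than min_distance.

    Each box is an (x, y, confidence) tuple, where x and y are the box
    centre in pixels. Illustrative greedy sketch, not crYOLO's code.
    """
    kept = []
    # Visit boxes from most to least confident, so a box can only be
    # rejected by a neighbour that is at least as confident.
    for box in sorted(boxes, key=lambda b: b[2], reverse=True):
        too_close = any(
            math.hypot(box[0] - k[0], box[1] - k[1]) < min_distance
            for k in kept
        )
        if not too_close:
            kept.append(box)
    return kept


picks = [(100, 100, 0.9), (105, 100, 0.4), (300, 300, 0.6)]
print(filter_min_distance(picks, min_distance=20))
# [(100, 100, 0.9), (300, 300, 0.6)]
```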

crYOLO Version 1.2.3:

  • crYOLO now saves the anchors which were used during training inside the .h5 file and takes care that the correct anchors are used during prediction.
  • LineEnhancer dependency is now installed via PyPI, as --process-dependency-links is removed in pip 19.
  • Fix Issue 9: Removed zignor dependency as it leads to problems for some users (Thanks to Jason Kaelber)
  • Attempt to fix Issue 10: Removed opencv dependency which was connected to this problem (Thanks to Shaun Rawson)
  • Fix issue 11: crYOLO now uses GPU 0 by default if not specified otherwise (e.g. by CUDA_VISIBLE_DEVICES)
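
The GPU selection order implied by the fixes for issues 6 and 11 (an explicit -g value first, then an existing CUDA_VISIBLE_DEVICES setting, then GPU 0) can be sketched as follows. The helper `resolve_visible_gpus` is hypothetical and only illustrates the priority; it is not taken from the crYOLO source.

```python
import os


def resolve_visible_gpus(cli_gpus=None, environ=None):
    """Return the value CUDA_VISIBLE_DEVICES should have.

    Priority: explicit -g value, then an existing CUDA_VISIBLE_DEVICES
    setting, then GPU 0. Hypothetical helper for illustration only.
    """
    environ = os.environ if environ is None else environ
    if cli_gpus:  # e.g. [0, 1] from "-g 0 1"
        return ",".join(str(g) for g in cli_gpus)
    if environ.get("CUDA_VISIBLE_DEVICES"):
        return environ["CUDA_VISIBLE_DEVICES"]  # respect the user's setting
    return "0"  # default to the first GPU


print(resolve_visible_gpus([0, 1], {}))                           # 0,1
print(resolve_visible_gpus(None, {"CUDA_VISIBLE_DEVICES": "2"}))  # 2
print(resolve_visible_gpus(None, {}))                             # 0
```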

crYOLO Version 1.2.2:

  • Added the PhosaurusNet to the crYOLO backend, which makes the patch mode needless for picking single particles.
  • crYOLO now outputs separate folders for EMAN box files and STAR files.
  • When picking filaments it will now additionally output EMAN Start-End and STAR Start-End coordinates (Thanks to Jesse M. Hansen).
  • Fix Issue 4: The filament mode will crash if crYOLO cannot identify a single particle in the image.
  • Fix Issue 5: If movies were aligned with cisTEM and picked with crYOLO, the box positions were vertically flipped. (Thanks to Wei-Chun Kao)
  • Fix Issue 6: crYOLO overwrote the CUDA_VISIBLE_DEVICES variable if the -g parameter is not passed. (Thanks to Shaun Rawson)
  • Fix Issue 7: crYOLO introduces a shift for non square images proportional to the aspect ratio. (Thanks to Shaun Rawson)
  • Fix Issue 8: crYOLO sometimes connects two parallel filaments. The filament tracing was optimized and now seems to work properly.
  • Fix a severe bug in filament tracing. Curved filaments are split by crYOLO into straighter sub-pieces. However, during the split, one half of the split filament was lost. (Thanks to Sabrina Pospich)
  • Added a wiki entry about the networks which are supported by crYOLO
  • Added a wiki entry about the crYOLO configuration file

crYOLO Version 1.2.1:

  • Fix Issue 2: Tiff files are now written as 32 bit when internal filtering is used.
  • cryolo_evaluation now additionally estimates the optimal threshold based on the F2 score, which puts more weight on recall than on precision
  • File ending of filament box files is now .box instead of .txt (Thanks to Jesse M. Hansen)
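
For reference, the F2 score is the F-beta score with beta = 2. The sketch below uses the standard F-beta definition to show how it favours recall; it is a generic illustration and does not reproduce how cryolo_evaluation computes or searches the threshold.

```python
def f_beta(precision, recall, beta=2.0):
    """F-beta score; beta > 1 weights recall more than precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Same two numbers, swapped: high recall is rewarded more by F2.
print(round(f_beta(0.5, 0.9), 3))  # precision 0.5, recall 0.9 -> 0.776
print(round(f_beta(0.9, 0.5), 3))  # precision 0.9, recall 0.5 -> 0.549
```

With beta = 1 both calls would give the same value, since the F1 score treats precision and recall symmetrically.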

crYOLO Version 1.2.0:

  • Switch to Python3 (Please use a fresh environment!)
  • (Hopefully) fixed that crYOLO sometimes freezes during/after training (hard to reproduce, so I'm not 100% sure if it is fixed.)
  • Fix that training with multiple GPUs did not speed up small datasets
  • Low-pass filtering is now integrated into crYOLO
  • Fix two bugs in cryolo_evaluation that led to an underestimation of the performance parameters
  • cryolo_evaluation is now multithreaded if your training data is organised in subfolders
  • cryolo_evaluation now contains a better method for optimal picking threshold estimation
  • Refactoring
  • Minor bug fixes

crYOLO Version 1.1.4:

  • Hot fix for filament mode when applied to non square images.

crYOLO Version 1.1.3:

  • Improved non-maximum-suppression brings 60% speedup during picking!
  • Multi GPU support for training and prediction (e.g. by adding -g 0 1 for GPU 0 and GPU 1 to the training/prediction command)
  • Fixed a bug which led to a crash if no particles were picked on the first micrograph (Thanks to Björn Klink).
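
For readers unfamiliar with non-maximum suppression, here is a textbook greedy version in plain Python. It only illustrates the idea of discarding overlapping, lower-scoring boxes; the box format and the 0.3 IoU threshold are assumptions, and this is not the optimized routine used in crYOLO.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def non_maximum_suppression(boxes, scores, iou_threshold=0.3):
    """Greedy NMS: returns the indices of the boxes that survive."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if it does not overlap a better box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (50, 50, 10, 10)]
scores = [0.9, 0.8, 0.7]
print(non_maximum_suppression(boxes, scores))  # [0, 2]
```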

crYOLO Version 1.1.2:

  • STAR files can now be used for training. However, as they don't contain size information, the size specified in the anchors in the config.json is used.
  • Slightly improved speed of the filament-mode
  • Fixed another bug running filament mode on non-square images (Thanks to Gregory Alushin)

crYOLO Version 1.1.1:

  • More efficient MRC reading and batch prediction leads to ~50% faster training and ~70% faster picking when crYOLO is used in patch-mode (compared with the patch-mode in 1.1.0).
  • 6x faster filament picking
  • Reading of annotation data is now super-fast :-) (The box filename has to be contained in the image filename)
  • Optimized filament picking parameters
  • Fixed bug which made training fail for some 16 bit images
  • Fixed bug which could lead to double picked filaments
  • Fixed bug running filament mode on non-square images (Thanks to Gregory Alushin)
  • Supports EMAN1 helix coordinates
  • Support for star file format. During prediction, both box and star files are written.
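
The naming requirement for annotation files (the box filename has to be contained in the image filename) can be illustrated with a small matching routine. This sketch assumes that "contained" means the box file name without its extension is a substring of the image file name; the file names are made up and the function is not part of crYOLO.

```python
from pathlib import Path


def match_boxes_to_images(image_names, box_names):
    """Pair each image with the first box file whose stem it contains."""
    pairs = {}
    for image in image_names:
        image_stem = Path(image).stem
        for box in box_names:
            if Path(box).stem in image_stem:
                pairs[image] = box
                break
    return pairs


images = ["mic_0001_DW.mrc", "mic_0002_DW.mrc"]
boxes = ["mic_0001.box", "mic_0002.box"]
print(match_boxes_to_images(images, boxes))
# {'mic_0001_DW.mrc': 'mic_0001.box', 'mic_0002_DW.mrc': 'mic_0002.box'}
```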

crYOLO Version 1.1.0:

  • crYOLO now supports filaments
  • New evaluation tool
  • Supports empty box files for training on particle-free images
  • Extended data augmentation: Horizontal flip and flip along both axes
  • Experimental support of periodic restarts during training (with --warm_restarts)
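
The two flip augmentations can be pictured on a tiny image stored as a list of rows. The sketch below also shows how a box would have to move with a horizontal flip; the (x, y, w, h) box layout with x as the left edge is an assumption, and this is not the augmentation code used in crYOLO.

```python
def flip_horizontal(image):
    """Mirror left-right: reverse every row."""
    return [row[::-1] for row in image]


def flip_both_axes(image):
    """Mirror left-right and top-bottom (a 180 degree rotation)."""
    return [row[::-1] for row in image[::-1]]


def flip_box_horizontal(box, image_width):
    """Move an (x, y, w, h) box, x = left edge, to its mirrored position."""
    x, y, w, h = box
    return (image_width - x - w, y, w, h)


image = [[1, 2, 3],
         [4, 5, 6]]
print(flip_horizontal(image))  # [[3, 2, 1], [6, 5, 4]]
print(flip_both_axes(image))   # [[6, 5, 4], [3, 2, 1]]
print(flip_box_horizontal((0, 0, 1, 1), image_width=3))  # (2, 0, 1, 1)
```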

crYOLO Version 1.0.4:

  • Fix a problem reading backend weights from read-only filesystem (Thanks to Michael Cianfrocco and Jason Key)
  • Make sure that tensorflow version is >= 1.5.0 and < 1.9.0
  • Add support for subfolders in training and validation directories
  • Clearer error message when the trained model does not fit the architecture specified in the config file.

crYOLO Version 1.0.3:

  • Ignore non-image files during training and prediction (Thanks to Kellie Woll)
  • Fixed misleading error when non existing folder is used as input for prediction (Thanks to Kellie Woll)
  • Add distance threshold during prediction by adding -d distanceInPixel parameter to prediction command (Thanks to Lifei Fu)
  • Add “--write_empty” parameter to prediction command if an empty box file should be written if no particle is picked.

crYOLO Version 1.0.2:

  • Fix problem when mrc image has dimensions (1,width,height) (Thanks to Reza Behrouzi)

crYOLO Version 1.0.1:

  • Normalization technique is now the same for 8-bit and 32 bit images.
  • Unify image augmentation
2019/09/20 10:51 · twagner

Depending on what you want to do, you can follow one of these self-contained tutorials:

The first, second and third tutorials cover the most common use cases and are well tested. The fourth tutorial is still experimental but might give you better results in less time and with less training data.

Any questions? Problems? Suggestions?

Find help at our mailing list!

  • Last modified: 2020/02/13 14:02
  • by twagner