License

  • Author: Thorsten Wagner
  • License: EULA
  • Last Update: 2018-06-20

If you are interested in using crYOLO in a commercial context, please contact stefan.raunser@mpi-dortmund.mpg.de

Paper

You can find more technical details in our paper:

SPHIRE-crYOLO: A fast and accurate fully automated particle picker for cryo-EM

Access the recommendation on F1000Prime

Download


Before downloading or using this product, make sure you understand and accept the terms of the license.


crYOLO

Version: 1.3.4

Uploaded: 23. April 2019

DOWNLOAD GPU VERSION

DOWNLOAD CPU VERSION

crYOLO boxmanager

Version: 1.2.1

Uploaded: 05. April 2019

DOWNLOAD

[Image: PhosaurusNet's eponym]

General PhosaurusNet models

For cryo images

Number of datasets: 32 real, 10 simulated, and 10 particle-free datasets on various grids with contamination

Uploaded: 14. March 2019

DOWNLOAD

Valid configuration file

Info: This model was trained on cryo images, so it will not work on negative stain images.

For negative stain images

Number of datasets: 10 real datasets

Uploaded: 26. February 2019

DOWNLOAD

Valid configuration file

ARCHIVE

Previous versions of crYOLO, the boxmanager and the general models can be found here: Archive.

Known issues

  • Issue 0: Training on multiple GPUs sometimes leads to worse performance (higher loss). We currently recommend training on a single GPU.
  • Issue 1: crYOLO sometimes does not exit properly after training has finished and has to be killed manually.
  • Issue 2: If you use automatic filtering with .tif files, you get an error like “OSError: cannot identify image file 'filtered_folder/another_folder/my_image.tif'”. It will be fixed in the next release.
  • Issue 3: (Boxmanager) The visualization only shows the first filament when loading eman1 helical box files (start end coordinates). Will be fixed in the next release.
  • Issue 4: The filament mode will crash if crYOLO does not identify any particle in the image. Will be fixed in 1.2.2.
  • Issue 5: If movies were aligned with cisTEM and picked with crYOLO, the box positions are vertically flipped. Will be fixed in 1.2.2.
  • Issue 6: crYOLO overwrites the environment variable “CUDA_VISIBLE_DEVICES” with 0 if no GPU is specified by the -g parameter. As a result, crYOLO ignores previous settings in CUDA_VISIBLE_DEVICES. Will be fixed in 1.2.2.
  • Issue 7: On K3 images crYOLO seems to add an offset toward the longer axis of the input image.
  • Issue 8: There is a logical error in filament tracing, which sometimes connects two parallel filaments.
  • Issue 9: Some people report an error when running cryolo prediction/training: “ImportError: numpy.core.multiarray failed to import”. It will be fixed in 1.2.3.
  • Issue 10: On machines with many cores (e.g. 64) an error might pop up during filtering: “[ERROR:0] 53: Can't spawn new thread”
  • Issue 11: If the -g parameter is not provided, crYOLO will use the memory of all GPUs. Will be fixed in 1.2.3.
  • Issue 12: The LineEnhancer dependency of crYOLO still depends on opencv. Workaround: In the crYOLO environment: conda install opencv
  • Issue 13: After picking it can happen that some of the boxes are not fully immersed in the image. Will be fixed in 1.2.4.
  • Issue 14: Parallelization in filament mode is broken. Will be fixed in 1.2.4.
  • Issue 15: If the --gpu_fraction is used, crYOLO always uses GPU 0. Will be fixed in 1.3.1.
  • Issue 16: --gpu_fraction only works for prediction, not for training. Will be fixed in 1.3.2.
  • Issue 17: On-the-fly filtering (--otf) is slower than not using it, as the filtering is not parallelized in this case.
  • Issue 18: Prediction is broken in 1.3.2. It removes all particles as it claims they are not fully immersed in the image.
  • Issue 19: Filtering does not work if the target image directory is an absolute path.

Installation

System requirements:

crYOLO was tested on Ubuntu 16.04.4 LTS and Ubuntu 18.04 with an NVIDIA Geforce 1080 / Geforce 1080Ti.

However, it should run on Windows as well.

As the GPU accelerated version of tensorflow does not support MacOS, crYOLO does not support it either.

crYOLO depends on CUDA Toolkit 9.0 and the cuDNN 7.1.2 library. Both are installed automatically during the crYOLO installation.

Install crYOLO!

The following instructions assume that pip and anaconda or miniconda are available. If you have an old cryolo environment installed, first remove it with:

conda env remove --name cryolo

After that, create a new virtual environment:

conda create -n cryolo -c anaconda python=3.6 pyqt=5 cudnn=7.1.2

Activate the environment:

source activate cryolo

Install crYOLO:

conda install numpy==1.14.5
pip install cryolo-X.Y.Z.tar.gz 
pip install cryoloBM-X.Y.Z.tar.gz

That's it!

You might want to check if everything is running as expected. Here is a reference example:

Reference example with TcdA1

Run it on the CPU

There is also a way to run crYOLO on the CPU. To use it, just install the CPU version provided in the download section. This is especially useful when you would like to apply the generalized model and don't have an NVIDIA GPU.

Picking with crYOLO is also quite fast on the CPU. On my local machine (Intel i9) it takes roughly 1 second per micrograph and on our low-performance notebooks (Intel i3) 4 seconds.

Training crYOLO is much more computationally expensive. Training a model with 14 micrographs from scratch on my local machine takes 34 minutes per epoch on the CPU. Given that you often need 25 epochs until convergence, it is a task to do overnight (~ 12 hours). However, you might want to try refining the general model, which takes 12 minutes per epoch (~ 5 hours).
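The estimates above are time per epoch multiplied by the number of epochs. A minimal sketch of that arithmetic, assuming exactly 25 epochs (the text only says this is often needed); `training_hours` is a hypothetical helper, not part of crYOLO:

```python
def training_hours(minutes_per_epoch: float, epochs: int = 25) -> float:
    """Total training time in hours for a fixed number of epochs."""
    return minutes_per_epoch * epochs / 60.0


# Per-epoch timings quoted in the text (CPU, 14 micrographs).
from_scratch = training_hours(34)  # training from scratch
refinement = training_hours(12)    # refining the general model

print(f"from scratch: {from_scratch:.1f} h, refinement: {refinement:.1f} h")
```

With these inputs the refinement estimate is 5 hours and the from-scratch estimate is about 14 hours, in the same overnight range as the rough figure quoted above.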

Start picking!

Use the step-by-step tutorial to get started!

Change log

crYOLO

crYOLO 1.3.4:

  • Support for SPHIRE 1.2
  • Changed the minimum threshold for cbox files from 0.01 to 0.1. Much faster in many cases but still low enough. If -t is lower than 0.1, the new threshold is used as minimum.
  • Installation now checks if python 3 is used.
  • Fix issue 19: Filtering does not work if the target image directory is an absolute path.
  • Fix crash when --otf was specified but filtering was not specified in the config file.
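The cbox threshold rule from the 1.3.4 notes can be read as follows. This is a sketch of one interpretation of the changelog entry, not crYOLO's actual code; `cbox_threshold` and `CBOX_MIN_THRESHOLD` are hypothetical names:

```python
CBOX_MIN_THRESHOLD = 0.1  # minimum confidence for cbox files since 1.3.4 (was 0.01)


def cbox_threshold(user_threshold: float) -> float:
    """Confidence cut-off used when writing cbox files (assumed behavior).

    Boxes down to 0.1 are kept so the threshold can still be adjusted
    afterwards; if -t is below 0.1, the value of -t becomes the minimum.
    """
    return min(CBOX_MIN_THRESHOLD, user_threshold)
```

For example, `cbox_threshold(0.3)` gives 0.1, while `cbox_threshold(0.05)` gives 0.05.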

crYOLO 1.3.3:

  • Fix issue 18: Prediction is broken in 1.3.2. It removes all particles as it claims they are not fully immersed in the image.

crYOLO 1.3.2:

  • Speed up prediction: Vectorized some parts of the code and optimized the creation of the cbox files. 30% faster picking and 15% faster training compared to 1.3.1/1.3.0.
  • Bug fix in merging of filaments, which sometimes threw “IndexError: list index out of range”. (Thanks to Alexander Belyy)
  • Fix in cryolo_evaluation: If the validation data is specified with -b instead of runfiles, all datasets with only one box file were ignored.
  • Change library requirement to PILLOW version 6.0.0
  • Fix issue 16: --gpu_fraction only works for prediction, not for training.

crYOLO 1.3.1:

  • Fix Issue 15: -g was ignored when --gpu_fraction was used.

crYOLO 1.3.0:

  • Fine tune the general network to your data using the new fine tune option with --fine_tune (https://1n.pm/x8rUH)
  • On-the-fly micrograph filtering during particle picking with --otf (don't double your dataset during picking) (https://1n.pm/goXAa)
  • Interactive threshold adjustment after prediction using the new cbox-files and the crYOLO boxmanager 1.2 (https://1n.pm/k7HoI)
  • Pick only fully immersed particles (Issue 13)
  • Improved filament mode
    • Rewrote tracing
    • Rewrote and sped up merging of filaments
    • Fixed parallelisation of the filament mode (Issue 14)
  • Add tifffile as dependency, as imageio throws a lot of warnings for some tif files.
  • Add conversion for uint16 images, as pillow cannot work with them.
  • Add option --skip_augmentation to deactivate augmentation during training (Thanks to Tijmen de Wolf). (https://1n.pm/goXAa)
  • Add option --num_cpu to specify the number of CPUs used during training and during prediction. (Thanks to Nikolaus Dietz) (https://1n.pm/goXAa)
  • Add option to limit the amount of GPU memory reserved by crYOLO with --gpu_fraction (Thanks to Nikolaus Dietz) (https://1n.pm/goXAa)
  • Save anchor size in model every time you write a new model during training (not only at the end)
  • In case of using --min_distance, only the particle with lower confidence is removed (Thanks to Yilai Li)
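The --min_distance behavior described in the last item can be sketched as a greedy filter. This is an illustration under assumptions, not crYOLO's implementation: the `(x, y, confidence)` tuple layout and the function name are hypothetical.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float]  # (x, y, confidence); layout is an assumption


def filter_by_min_distance(boxes: List[Box], min_distance: float) -> List[Box]:
    """Drop the lower-confidence box of every pair closer than min_distance."""
    kept: List[Box] = []
    # Visit boxes from most to least confident, so a box is only ever
    # rejected because a more confident neighbour was already kept.
    for box in sorted(boxes, key=lambda b: b[2], reverse=True):
        far_enough = all(
            math.hypot(box[0] - k[0], box[1] - k[1]) >= min_distance for k in kept
        )
        if far_enough:
            kept.append(box)
    return kept


picked = [(0.0, 0.0, 0.9), (3.0, 0.0, 0.5), (100.0, 100.0, 0.4)]
filtered = filter_by_min_distance(picked, min_distance=10.0)
```

Here the box at (3, 0) is removed because it lies within 10 pixels of the more confident box at (0, 0); the distant box is untouched.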

crYOLO Version 1.2.3:

  • crYOLO now saves the anchors which were used during training inside the .h5 file and takes care that the correct anchors are used during prediction.
  • LineEnhancer dependency is now installed via PyPI, as --process-dependency-links was removed in pip 19.
  • Fix Issue 9: Removed zignor dependency as it leads to problems for some users (Thanks to Jason Kaelber)
  • Attempt to fix Issue 10: Removed opencv dependency which was connected to this problem (Thanks to Shaun Rawson)
  • Fix issue 11: crYOLO now uses GPU 0 by default if not specified otherwise (e.g. by CUDA_VISIBLE_DEVICES)
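Taken together with the fix for Issue 6, the GPU selection order appears to be: the -g parameter first, then an existing CUDA_VISIBLE_DEVICES setting, then GPU 0. A minimal sketch of that precedence, assuming this reading of the change log is right; `resolve_visible_gpus` is a hypothetical helper, not a crYOLO function:

```python
from typing import Mapping, Optional, Sequence


def resolve_visible_gpus(
    gpu_args: Optional[Sequence[int]], environ: Mapping[str, str]
) -> str:
    """Return the value to use for CUDA_VISIBLE_DEVICES (assumed precedence)."""
    if gpu_args:
        # Explicit -g values win, e.g. "-g 0 1" becomes "0,1".
        return ",".join(str(gpu) for gpu in gpu_args)
    existing = environ.get("CUDA_VISIBLE_DEVICES")
    if existing is not None:
        # Respect a setting made before crYOLO was started.
        return existing
    # Nothing specified anywhere: fall back to GPU 0.
    return "0"
```

In a real script the result would be written back with `os.environ["CUDA_VISIBLE_DEVICES"] = resolve_visible_gpus(args, os.environ)` before the deep learning backend is imported.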

crYOLO Version 1.2.2:

  • Added the PhosaurusNet to the crYOLO backend, which makes the patch mode unnecessary for picking single particles.
  • crYOLO now outputs separate folders for EMAN box files and STAR files.
  • When picking filaments it will now additionally output EMAN Start-End and STAR Start-End coordinates (Thanks to Jesse M. Hansen).
  • Fix Issue 4: The filament mode will crash if crYOLO cannot identify a single particle in the image.
  • Fix Issue 5: If movies were aligned with cisTEM and picked with crYOLO, the box positions were vertically flipped. (Thanks to Wei-Chun Kao)
  • Fix Issue 6: crYOLO overwrote the CUDA_VISIBLE_DEVICES variable if the -g parameter is not passed. (Thanks to Shaun Rawson)
  • Fix Issue 7: crYOLO introduces a shift for non square images proportional to the aspect ratio. (Thanks to Shaun Rawson)
  • Fix Issue 8: crYOLO sometimes connects two parallel filaments. The filament tracing was optimized and now seems to work properly.
  • Fix a severe bug in filament tracing: crYOLO splits curved filaments into straighter sub-pieces, but during the split one half of the filament was lost. (Thanks to Sabrina Pospich)
  • Added a wiki entry about the networks which are supported by crYOLO
  • Added a wiki entry about the crYOLO configuration file

crYOLO Version 1.2.1:

  • Fix Issue 2: Tiff files are now written as 32 bit when internal filtering is used.
  • cryolo_evaluation now additionally estimates the optimal threshold based on the F2 score, which puts more weight on recall than on precision.
  • File ending of filament box files is now .box instead of .txt (Thanks to Jesse M. Hansen)
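The F2 score mentioned above is the F-beta score with beta = 2. A minimal sketch of how a threshold could be chosen with it, assuming precision and recall have already been measured per candidate threshold; the function names and the input format are hypothetical and not taken from cryolo_evaluation:

```python
from typing import Iterable, Tuple


def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def best_threshold(candidates: Iterable[Tuple[float, float, float]]) -> float:
    """Pick the threshold with the highest F2 score.

    candidates: (threshold, precision, recall) triples.
    """
    return max(candidates, key=lambda c: f_beta(c[1], c[2]))[0]
```

With beta = 2, a threshold giving precision 0.5 and recall 1.0 scores higher than one giving precision 1.0 and recall 0.5, so the lower, more permissive threshold is preferred.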

crYOLO Version 1.2.0:

  • Switch to Python3 (Please use a fresh environment!)
  • (Hopefully) fixed that crYOLO sometimes freezes during/after training (hard to reproduce, so I'm not 100% sure if it is fixed.)
  • Fix that training with multiple GPUs did not speed up small datasets
  • Low-pass filtering is now integrated into crYOLO
  • Fix two bugs in cryolo_evaluation that led to an underestimation of the performance parameters
  • cryolo_evaluation is now multithreaded if your training data is organised in subfolders
  • cryolo_evaluation now contains a better method for optimal picking threshold estimation
  • Refactoring
  • Minor bug fixes

crYOLO Version 1.1.4:

  • Hot fix for filament mode when applied to non square images.

crYOLO Version 1.1.3:

  • Improved non-maximum-suppression brings 60% speedup during picking!
  • Multi GPU support for training and prediction (e.g. by adding -g 0 1 for GPU 0 and GPU 1 to the training/prediction command)
  • Fixed a bug that led to a crash if no particles were picked on the first micrograph (Thanks to Björn Klink).
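For readers unfamiliar with non-maximum suppression, the following is a plain greedy version based on intersection over union (IoU). It only illustrates the general technique; it is not the improved variant shipped in 1.1.3, and the box layout and the 0.3 threshold are assumptions.

```python
from typing import List, Tuple

# (x, y, width, height, confidence), with (x, y) the lower-left corner (assumed).
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def non_maximum_suppression(boxes: List[Box], iou_threshold: float = 0.3) -> List[Box]:
    """Keep the most confident box of each group of strongly overlapping boxes."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) <= iou_threshold for k in kept):
            kept.append(box)
    return kept


candidates = [(0, 0, 10, 10, 0.9), (1, 1, 10, 10, 0.8), (50, 50, 10, 10, 0.7)]
survivors = non_maximum_suppression(candidates)
```

The second candidate overlaps the first with an IoU of about 0.68 and is suppressed; the third does not overlap and is kept.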

crYOLO Version 1.1.2:

  • STAR files can now be used for training. However, as they don't contain size information, the size specified in the anchors in the config.json is used.
  • Slightly improved speed of the filament-mode
  • Fixed another bug running filament mode on non-square images (Thanks to Gregory Alushin)

crYOLO Version 1.1.1:

  • More efficient MRC reading and batch prediction lead to ~50% faster training and ~70% faster picking when crYOLO is used in patch-mode (compared with the patch-mode in 1.1.0).
  • 6x faster filament picking
  • Reading of annotation data is now super-fast :-) (The box filename has to be contained in the image filename)
  • Optimized filament picking parameters
  • Fixed bug which made training fail for some 16 bit images
  • Fixed bug which could lead to double picked filaments
  • Fixed bug running filament mode on non-square images (Thanks to Gregory Alushin)
  • Supports EMAN1 helix coordinates
  • Support for star file format. During prediction, both box and star files are written.
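The naming requirement for annotation data (the box filename has to be contained in the image filename) can be illustrated like this. It is a sketch of the rule only, not crYOLO's reader; the helper names and the example paths are hypothetical, and preferring the longest matching name is an assumption made here to resolve ambiguous cases.

```python
import os
from typing import Dict, List


def stem(path: str) -> str:
    """Filename without directory and extension."""
    return os.path.splitext(os.path.basename(path))[0]


def match_boxes_to_images(image_paths: List[str], box_paths: List[str]) -> Dict[str, str]:
    """Map each image to the box file whose name is contained in the image name."""
    matches: Dict[str, str] = {}
    for image_path in image_paths:
        image_stem = stem(image_path)
        candidates = [b for b in box_paths if stem(b) in image_stem]
        if candidates:
            # Assumption: the longest contained name is the most specific match.
            matches[image_path] = max(candidates, key=lambda b: len(stem(b)))
    return matches


images = ["train_image/img_01_filtered.mrc", "train_image/img_02.mrc"]
boxes = ["train_annot/img_01.box"]
pairs = match_boxes_to_images(images, boxes)
```

In this example only `img_01_filtered.mrc` gets an annotation, because `img_01` is part of its name; `img_02.mrc` stays unmatched.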

crYOLO Version 1.1.0:

  • crYOLO now supports filaments
  • New evaluation tool
  • Supports empty box files for training on particle-free images
  • Extended data augmentation: Horizontal flip and flip along both axes
  • Experimental support of periodic restarts during training (with --warm_restarts)
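The two flip augmentations listed above can be sketched on a tiny image stored as nested lists. This is a generic illustration, not crYOLO's augmentation code; note that box coordinates have to be transformed together with the image, shown here for the horizontal case with an assumed `(x, y, width, height)` layout.

```python
from typing import List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (x, y, width, height); layout is an assumption


def flip_horizontal(image: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_both_axes(image: Image) -> Image:
    """Mirror the image left to right and top to bottom."""
    return [row[::-1] for row in image[::-1]]


def flip_box_horizontal(box: Box, image_width: int) -> Box:
    """Move a box to its mirrored position after a horizontal flip."""
    x, y, w, h = box
    return (image_width - x - w, y, w, h)


tiny = [[1, 2], [3, 4]]
mirrored = flip_horizontal(tiny)
rotated = flip_both_axes(tiny)
```

Flipping along both axes is equivalent to rotating the image by 180 degrees.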

crYOLO Version 1.0.4:

  • Fix a problem reading backend weights from read-only filesystem (Thanks to Michael Cianfrocco and Jason Key)
  • Make sure that tensorflow version is >= 1.5.0 and < 1.9.0
  • Add support for subfolders in training and validation directories
  • Clearer error message when the trained model does not fit the architecture specified in the config file.
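The tensorflow version constraint from this release (>= 1.5.0 and < 1.9.0) is easy to get wrong with plain string comparison, since "1.10.0" sorts before "1.9.0" as text. A minimal sketch of a numeric check; the function names are hypothetical and this is not the check crYOLO performs internally:

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.8.0' (or '1.8.0rc1') into a comparable tuple like (1, 8, 0)."""
    numbers = [int(n) for n in re.findall(r"\d+", version)][:3]
    numbers += [0] * (3 - len(numbers))  # pad '1.5' to (1, 5, 0)
    return tuple(numbers)


def is_supported_tensorflow(version: str) -> bool:
    """True if the version lies in the range required by crYOLO 1.0.4."""
    return (1, 5, 0) <= parse_version(version) < (1, 9, 0)
```

For example, `is_supported_tensorflow("1.8.0")` is true, while both `"1.4.1"` and `"1.10.0"` are rejected.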

crYOLO Version 1.0.3:

  • Ignore non-image files during training and prediction (Thanks to Kellie Woll)
  • Fixed misleading error when a non-existing folder is used as input for prediction (Thanks to Kellie Woll)
  • Add distance threshold during prediction by adding -d distanceInPixel parameter to prediction command (Thanks to Lifei Fu)
  • Add “--write_empty” parameter to the prediction command to write an empty box file if no particle is picked.

crYOLO Version 1.0.2:

  • Fix problem when mrc image has dimensions (1,width,height) (Thanks to Reza Behrouzi)

crYOLO Version 1.0.1:

  • Normalization technique is now the same for 8-bit and 32-bit images.
  • Unify image augmentation

crYOLO Boxmanager

crYOLO Boxmanager Version 1.2.1:

  • Press “h” for hiding the boxes
  • Fix for loading different box sets with different colors when one of the box sets consists of cbox files.

crYOLO Boxmanager Version 1.2:

  • Add interactive threshold selection using cbox files

crYOLO Boxmanager Version 1.1.1:

  • Fix Issue 3
  • Now supports STAR Start-End filament format

crYOLO Boxmanager Version 1.1.0:

  • Switch to Python3
  • Minor bug fixes

crYOLO Boxmanager Version 1.0.4:

  • Support of visualization of EMAN1 filament coordinates
  • Make compatible with crYOLO 1.1.3

crYOLO Boxmanager Version 1.0.3:

  • Support of visualization of EMAN2 helical coordinates (particle coordinates)
  • New boxes can now be loaded with a new color while keeping the old ones.
  • Fix problem which made loading images take very long.
  • Several bug fixes

crYOLO Boxmanager Version 1.0.2:

  • Fix problem with hidden files (names starting with .). They are now ignored.

crYOLO Boxmanager Version 1.0.1:

  • Fix crash when canceling the import of box files
  • Fix crash with qt4

General PhosaurusNet model

Version 20190315:

  • Added KLH
  • Added one inhouse dataset

Version 20190218:

  • Added K3 apoferritin (Thanks to Shaun Rawson)
  • Added two more inhouse datasets

Version 20181221:

Same datasets as the general YOLO network model version 20181120 but trained with PhosaurusNet.

General YOLO network model in patch mode

Version 20181120:

Added multiple simulated datasets, where each micrograph contains hundreds of particles with different defocus:

  • PDB 1SA0
  • PDB 5LNK
  • PDB 5XNL
  • PDB 6B7N
  • PDB 6BHU
  • PDB 6DMR
  • PDB 6DS5
  • PDB 6GDG
  • PDB 6H3N
  • PDB 6MPU

Besides these simulated datasets, we added handpicked datasets:

  • ATP Synthase
  • DNA Origami
  • Two more particle-free only-contamination datasets.

In total, 45 datasets are now included.

Version 20180823:

Increase the number of hand picked datasets to 25 by adding:

  • Add EMPIAR 10154 (Thanks to Daniel Prumbaum)
  • Add EMPIAR 10186 (Thanks to Sebastian Tacke)
  • Add EMPIAR 10097 Hemagglutinin (Thanks to Birte Siebolds)
  • Add EMPIAR 10081 HCN1 (Thanks to Pascel Lill)
  • Add internal dataset (Thanks to Daniel Roderer)
  • Furthermore we added 8 datasets of protein-free grids (Thanks to Tobias Raisch and Daniel Prumbaum)

Version 20180720:

Added micrographs of 7 new handpicked datasets:

  • EMPIAR 10181 (Thanks to Dennis Quentin)
  • EMPIAR 10017
  • EMPIAR 10028 (Thanks to Oleg Sitsel)
  • User contributed dataset (Thanks to Lifei Fu)
  • EMPIAR 10089
  • EMPIAR 10004 (Thanks to Daniel Roderer)
  • EMPIAR 10072 (Thanks to Tobias Raisch)

Furthermore I had to remove one internal dataset, as it turned out that it is unsuitable for training the general model.

Version 20180704:

Added three more handpicked datasets:

  • spliceosome (EMPIAR 10160)
  • picornavirus (EMPIAR 10033) and
  • an internal dataset.