crYOLO configuration file

The config file is organized into the sections model, training and validation. Below you find a description of each entry.

Model section:

  • #01 architecture: The network used in the backend of crYOLO. Right now we support “crYOLO”, “YOLO” and “PhosaurusNet”.
  • #02 input_size: This is the size to which the input is rescaled before it is passed through the network. In the example given here it would be 768×768. The input can be either the whole micrograph or a patch (in case you use patch mode).
  • #03 anchors: Anchors in YOLO are a kind of a priori knowledge. You should specify your box size here.
  • #04 max_box_per_image: Maximum number of particles in the image. It is only used for memory handling. Keep the default of 600 or 1000.
  • #05 overlap_patches: Optional. Only needed when using patch mode. Specifies how much the patches overlap. In our lab, we always keep the default value.
  • #06 num_patches: Optional. If specified, patch mode will be used. A value of “2” means that 2×2 patches will be used.
  • #07 filter: Optional. Specifies the absolute cut-off frequency for the low-pass filter and the corresponding output folder. CrYOLO will automatically filter the data in train_image_folder and valid_image_folder and save it into the output folder. It automatically checks whether an image provided in train_image_folder has already been filtered and, if so, uses the filtered version. Otherwise it filters the image.
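
As an illustration, the model section could be assembled as follows. This is a minimal sketch that assumes the config file is stored as JSON. The box size of 160, the filter cut-off of 0.1, the folder name `filtered/` and the list layout of `anchors` and `filter` are placeholders chosen for this example, and `patch_grid` is a hypothetical helper that is not part of crYOLO. `overlap_patches` is left out so that the default applies.

```python
import json

# Hypothetical values for illustration; adapt them to your dataset.
model_section = {
    "architecture": "PhosaurusNet",  # one of "crYOLO", "YOLO", "PhosaurusNet"
    "input_size": 768,               # input is rescaled to 768x768
    "anchors": [160, 160],           # assumed layout: box width and height in pixels
    "max_box_per_image": 600,
    "num_patches": 2,                # optional: enables patch mode with a 2x2 grid
    "filter": [0.1, "filtered/"],    # assumed layout: [cut-off frequency, output folder]
}


def patch_grid(num_patches):
    """Return the total number of patches for an n x n patch grid."""
    return num_patches * num_patches


if __name__ == "__main__":
    print(json.dumps({"model": model_section}, indent=4))
    print("patches per micrograph:", patch_grid(model_section["num_patches"]))
```

With `num_patches` set to 2, each micrograph is split into 2×2 = 4 patches, and each patch is then rescaled to `input_size`.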

Training section:

  • #08 train_image_folder: Path to the image folder containing the images to train on. This can either be a separate folder containing ONLY your training data, or simply the directory containing all of your images. CrYOLO will try to find the images based on the annotation data you provided in train_annot_folder.
  • #09 train_annot_folder: Path to the folder containing your annotation files, such as box or star files. Based on the filename, crYOLO will try to find the corresponding images in train_image_folder. It will search for image files whose names contain the box filename.
  • #10 train_times: How often each image is presented to the network during one epoch. Default is 10 and should be kept unless you have many training images.
  • #11 pretrained_weights: Path to the h5 file that is used for initialization. Unless you want to use weights from a previous dataset as initialization, the filename specified here should be the same as saved_weights_name.
  • #12 batch_size: Specifies the number of images crYOLO processes in parallel during training. This strongly depends on the memory of your graphics card. 6 should be fine for GPUs with 8GB memory. You can increase it if you have more memory, or decrease it if you run into memory problems. Bigger batches tend to improve convergence and even the final error.
  • #13 learning_rate: Defines the step size during training. Default should be kept.
  • #14 nb_epoch: Maximum number of epochs the network will train. I basically never reach this number, as crYOLO stops training once it recognizes that the validation loss is no longer improving.
  • #15 object_scale: Penalty scaling factor for missing particles that should have been picked.
  • #16 no_object_scale: Penalty scaling factor for picking background.
  • #17 coord_scale: Penalty scaling factor for errors in estimating the correct position.
  • #18 class_scale: Irrelevant, as crYOLO only has the “class” “particle”.
  • #19 log_path: Path to a folder. During training, crYOLO saves some logs there for visualization in TensorBoard. TensorBoard is used to visualize the curves for training and validation loss.
  • #20 saved_weights_name: Every time the network improves in terms of validation loss, it will save the model into the file specified here.
  • #21 debug: If true, the network will provide several statistics during training.
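
The training entries can be sketched in the same way. The example below is only an illustration: all paths and file names are hypothetical, and the values for `learning_rate`, `nb_epoch` and the scale factors are placeholders rather than documented defaults. Only `train_times` of 10 and `batch_size` of 6 come from the descriptions above. The two helper functions are hypothetical and merely mimic the rules stated for `pretrained_weights` and `train_annot_folder`; they are not crYOLO's own code.

```python
import os

# Hypothetical paths and placeholder values for illustration.
train_section = {
    "train_image_folder": "train_image/",
    "train_annot_folder": "train_annot/",
    "train_times": 10,
    "pretrained_weights": "model.h5",
    "batch_size": 6,
    "learning_rate": 1e-4,    # placeholder; keep the default of your version
    "nb_epoch": 50,           # placeholder upper limit
    "object_scale": 5.0,      # placeholder
    "no_object_scale": 1.0,   # placeholder
    "coord_scale": 1.0,       # placeholder
    "class_scale": 1.0,       # irrelevant: only the class "particle" exists
    "log_path": "logs/",
    "saved_weights_name": "model.h5",
    "debug": True,
}


def starts_from_previous_weights(section):
    """True if initialization uses a weights file other than the one being saved."""
    return section["pretrained_weights"] != section["saved_weights_name"]


def images_for_annotation(annot_file, image_files):
    """Return the image files whose name contains the annotation file's base name."""
    stem = os.path.splitext(os.path.basename(annot_file))[0]
    return [img for img in image_files if stem in os.path.basename(img)]


if __name__ == "__main__":
    print(starts_from_previous_weights(train_section))
    print(images_for_annotation("train_annot/mic_001.box",
                                ["mic_001.mrc", "mic_002.mrc"]))
```

Because the matching is a plain substring test, an annotation file named `mic_001.box` would also match an image named `mic_0010.mrc`, so unambiguous file names are advisable.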

Validation section:

  • #22 valid_image_folder: If not specified, crYOLO will simply select 20% of the training data for validation. However, it is possible to use specific images for validation. In that case, this should be the path to the folder containing these images.
  • #23 valid_annot_folder: If not specified, crYOLO will simply select 20% of the training data for validation. However, it is possible to use specific images for validation. In that case, this should be the path to the folder containing the validation box files.
  • #24 valid_times: How often each image is presented to the network during validation. The value of 1 should be kept.
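
To make the 20% rule concrete, the sketch below holds out a fifth of the training images for validation. It is an illustration only: how crYOLO actually selects the validation images is not described here, so the random shuffle, the `seed` parameter and the function `split_train_valid` are assumptions. Using empty strings to mean "not specified" in `valid_section` is likewise an assumption.

```python
import random


def split_train_valid(image_names, valid_fraction=0.2, seed=0):
    """Randomly hold out a fraction of the images for validation.

    Returns a (train, valid) tuple of lists.
    """
    names = sorted(image_names)
    random.Random(seed).shuffle(names)  # reproducible shuffle
    n_valid = int(round(len(names) * valid_fraction))
    return names[n_valid:], names[:n_valid]


# Assumed convention: empty strings mean "not specified".
valid_section = {
    "valid_image_folder": "",
    "valid_annot_folder": "",
    "valid_times": 1,
}

if __name__ == "__main__":
    images = ["mic_%03d.mrc" % i for i in range(10)]
    train, valid = split_train_valid(images)
    print(len(train), "training images,", len(valid), "validation images")
```

If you prefer a fixed validation set, for example to compare several training runs, point `valid_image_folder` and `valid_annot_folder` to dedicated folders instead.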