The evaluation tool computes statistics about the success of your training based on your validation micrographs.

To understand the outcome, you have to know what precision and recall are. Here is a good figure from Wikipedia:
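For reference, in terms of true positives (TP, correctly picked labeled particles), false positives (FP, picks without a matching label) and false negatives (FN, labeled particles that were missed), the two metrics are defined as:

$$
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}
$$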

Other important measures are the F1 (β=1) and F2 (β=2) scores:
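Both are special cases of the Fβ score, the weighted harmonic mean of precision and recall:

$$
F_\beta = (1 + \beta^2)\,\frac{\text{Precision}\cdot\text{Recall}}{\beta^2\cdot\text{Precision} + \text{Recall}}
$$

With β=1, precision and recall are weighted equally; with β=2, recall counts more.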

The precision metric can be misleading

If your validation micrographs are not labeled to completion, the precision value will be misleading: crYOLO will pick the remaining 'unlabeled' particles, but the statistics count them as false positives, because the software takes your labeled data as ground truth. For example, if crYOLO picks 90 particles that are all real but only 70 of them are labeled, the evaluation counts 70 true positives and 20 false positives, i.e. a precision of roughly 0.78, even though every pick was correct.

If you followed the tutorial, the validation data were selected randomly. A run file for each training is created and saved into the folder runfiles/ in your project directory. These runfiles are .json files that record which micrographs were selected for validation (see the sketch below). To calculate the evaluation metrics, select the evaluation action.
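If you want to check which micrographs were held out, you can inspect a runfile directly. The exact schema of the runfile is not described in this section, so the following sketch only loads the JSON and prints its top-level structure; the file name is the same placeholder used in the command-line example further down.

```python
import json

# Inspect a crYOLO runfile (replace the placeholder with your actual run_*.json).
# The runfile schema is not documented here, so we only print the top-level
# structure to see where the validation micrographs are listed.
with open("runfiles/run_YearMonthDay-HourMinuteSecond.json") as f:
    run_info = json.load(f)

if isinstance(run_info, dict):
    for key, value in run_info.items():
        print(key, "->", type(value).__name__)
else:
    print(type(run_info).__name__, run_info)
```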

Fill out the fields in the “Required arguments” tab:

► Press [Start] to calculate the evaluation results.

Alternative: Run evaluation from the command line


cryolo_evaluation.py -c config.json -w model.h5 -r runfiles/run_YearMonthDay-HourMinuteSecond.json -g 0

The HTML file you specified as output looks like this:

The table contains several statistics:

If the training data consist of multiple folders, the evaluation is done for each folder separately. Furthermore, crYOLO estimates the optimal picking threshold with respect to the F1 score and the F2 score. Both combine recall and precision into a single value (their weighted harmonic mean), whereas the F2 score puts more weight on the recall, which in cryo-EM is often more important.
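To illustrate what "optimal picking threshold regarding the F1/F2 score" means, here is a minimal sketch, not crYOLO's actual implementation: given precision and recall measured at several candidate confidence thresholds, pick the threshold that maximises the F-beta score. All names and numbers below are made up for the example.

```python
import numpy as np

# Hypothetical sketch of a picking-threshold search (not crYOLO's internal code):
# choose the confidence threshold that maximises the F-beta score
# (beta=1 -> F1, beta=2 -> F2, which weights recall more heavily).
def best_threshold(thresholds, precision, recall, beta=1.0):
    beta2 = beta ** 2
    denom = beta2 * precision + recall
    f_beta = np.where(denom > 0, (1 + beta2) * precision * recall / denom, 0.0)
    best = np.argmax(f_beta)
    return thresholds[best], f_beta[best]

# Illustrative placeholder numbers only: precision typically rises and recall
# falls as the picking threshold is increased.
thresholds = np.linspace(0.1, 0.6, 6)
precision  = np.array([0.55, 0.63, 0.71, 0.78, 0.84, 0.90])
recall     = np.array([0.95, 0.93, 0.90, 0.84, 0.75, 0.60])

print(best_threshold(thresholds, precision, recall, beta=2.0))
```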