How to use SPHIRE's Cinderella for micrograph selection

This tutorial describes how to use Cinderella to sort micrographs into good and bad ones. Unfortunately, we cannot provide a pretrained model yet. Therefore, you first have to train a model (see section Training) and then apply it to your micrographs (see section Classify).

Download & Install

You can find the download and installation instructions here: Download and Installation

Training

The first step is to train Cinderella with manually selected good and bad micrographs. Create two folders, one containing manually selected good micrographs (e.g. GOOD_MICS/) and one containing bad micrographs (e.g. BAD_MICS/). Both folders can contain subfolders.

How many micrographs do I need?

We typically start with 30 good and 30 bad micrographs.
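Before training, it can help to check how many micrographs each folder actually contains, including subfolders. The following is a minimal sketch and not part of Cinderella. The helper name count_micrographs and the list of file extensions are assumptions for illustration, so adjust them to your data.

```python
from pathlib import Path

# Assumed micrograph file extensions (not taken from Cinderella); adjust as needed.
MIC_EXTENSIONS = {".mrc", ".tif", ".tiff"}


def count_micrographs(folder):
    """Count micrograph files in `folder`, including all of its subfolders."""
    return sum(
        1
        for path in Path(folder).rglob("*")
        if path.is_file() and path.suffix.lower() in MIC_EXTENSIONS
    )


if __name__ == "__main__":
    # Folder names as used in this tutorial.
    for name in ("GOOD_MICS", "BAD_MICS"):
        print(name, count_micrographs(name))
```

If one of the two counts is far below the roughly 30 micrographs mentioned above, consider selecting more examples for that class first.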

Then specify the paths in a config file like this:

config.json
{
	"model": {
		"input_size": [512,512]
	},
 
	"train": {
		"batch_size": 6,
		"good_path": "GOOD_MICS/",
		"bad_path": "BAD_MICS/",
		"pretrained_weights": "",
		"saved_weights_name": "my_model.h5",
		"learning_rate": 1e-4,
		"nb_epoch": 100,
		"nb_early_stop": 15
	}
}
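If you prefer to generate the config file from a script, for example to train on several datasets, the same structure can be written with Python's json module. This is only a sketch. The helper names write_config and load_config are hypothetical, and the values are copied from the example above.

```python
import json

# Same content as the example config.json in this tutorial.
config = {
    "model": {"input_size": [512, 512]},
    "train": {
        "batch_size": 6,
        "good_path": "GOOD_MICS/",
        "bad_path": "BAD_MICS/",
        "pretrained_weights": "",
        "saved_weights_name": "my_model.h5",
        "learning_rate": 1e-4,
        "nb_epoch": 100,
        "nb_early_stop": 15,
    },
}


def write_config(path, cfg):
    """Write the configuration dictionary to `path` as JSON."""
    with open(path, "w") as handle:
        json.dump(cfg, handle, indent=4)


def load_config(path):
    """Read a JSON configuration file back into a dictionary."""
    with open(path) as handle:
        return json.load(handle)


if __name__ == "__main__":
    write_config("config.json", config)
```

Reading the file back with load_config is a quick way to confirm that it is valid JSON before starting a training run.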

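The fields in the section model have the following meaning:

input_size: Image size to which each micrograph is resized before it is passed to the network.

The fields in the section train have the following meaning:

batch_size: Number of micrographs in one mini-batch.
good_path: Path to the folder with good micrographs.
bad_path: Path to the folder with bad micrographs.
pretrained_weights: Path to weights used to initialize the network. It can be empty.
saved_weights_name: Filename of the trained model.
learning_rate: Learning rate used during training.
nb_epoch: Maximum number of training epochs.
nb_early_stop: Training stops early if the validation loss has not improved for this many epochs in a row.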

The next step is to run the training:

sp_cinderella_train.py -c config.json --gpu 1

This will train a classification network on the GPU with ID=1. After the training finishes, you get a my_model.h5 file. This file can then be used to classify micrographs into good and bad categories.
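If the training should be started from a Python script rather than from the shell, the command can be assembled and launched with subprocess. This is a sketch under the assumption that sp_cinderella_train.py is available on your PATH. The helper names build_train_command and run_training are hypothetical, and only the -c and --gpu options shown above are used.

```python
import shutil
import subprocess


def build_train_command(config_path, gpu_id):
    """Return the training command from this tutorial as an argument list."""
    return ["sp_cinderella_train.py", "-c", str(config_path), "--gpu", str(gpu_id)]


def run_training(config_path, gpu_id):
    """Run the training and raise an error if it cannot start or fails."""
    command = build_train_command(config_path, gpu_id)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} was not found on PATH")
    # check=True raises CalledProcessError if the training exits with an error.
    subprocess.run(command, check=True)
```

Calling run_training("config.json", 1) would then correspond to the command above, with the config path and GPU ID passed as arguments.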

Classify

Suppose you want to separate good and bad micrographs in the folder micrographs and save lists with the filenames of the good and bad micrographs into the folder output_folder. Furthermore, you want to use the model my_model.h5 and the GPU with ID=1. Micrographs with a confidence greater than 0.5 should be classified as good.

This is the command to run:

sp_cinderella_predict.py -i micrographs/ -w my_model.h5 -o output_folder/ -t 0.5 --gpu 1

You will find the files bad.txt and good.txt in your output_folder.
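To get a quick overview of the result, the two lists can be read back in Python. The sketch below assumes that good.txt and bad.txt contain one micrograph filename per line. That layout is an assumption, so check your files first. The helper names read_micrograph_list and summarize are hypothetical.

```python
from pathlib import Path


def read_micrograph_list(list_path):
    """Return the non-empty lines of a list file, stripped of whitespace.

    Assumes one micrograph filename per line.
    """
    text = Path(list_path).read_text()
    return [line.strip() for line in text.splitlines() if line.strip()]


def summarize(output_folder):
    """Count the entries in good.txt and bad.txt inside `output_folder`."""
    folder = Path(output_folder)
    return {
        name: len(read_micrograph_list(folder / f"{name}.txt"))
        for name in ("good", "bad")
    }
```

For example, summarize("output_folder") returns a dictionary with the number of micrographs in each category, which makes it easy to spot a threshold that rejects nearly everything or nearly nothing.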