Tutorial 2 Launch and train CDeep3M

This page provides instructions on how to perform augmented training of CDeep3M using training data generated in:

Tutorial 1: Generating training data with IMOD

Goals

  1. Launch CDeep3M instance in Amazon Web Services (AWS)
  2. Upload training data to CDeep3M instance
  3. Run CDeep3M train
  4. Check on CDeep3M train status

Requirements

  1. Training data from Tutorial 1 (or the sample data linked in Step 2 below)
  2. An AWS account with a key pair imported in the us-west-2 region (see Step 1)
  3. A terminal with ssh and scp available (Bash or Cygwin)

Step 1 Launch CDeep3M instance on AWS

WARNING: The instructions at the link below will launch a virtual machine on AWS, and charges will be incurred.

A. Create a key pair if one has not been made already.

Click here to login to AWS and import key pair in us-west-2 region
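
If you do not yet have a key to import, one way to generate one locally (a minimal sketch, assuming you want the key at the default ~/.ssh/id_rsa path used later in this tutorial) is:

ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa

The public half, ~/.ssh/id_rsa.pub, is the file to import on the AWS key pair page.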

B. Launch CDeep3M instance

Follow these instructions to launch a CDeep3M instance on AWS

Step 2 Download dataset two

From the previous tutorial you should have generated training data, stored in the train directory, and you should have a terminal open in which the train directory is visible, as seen here:

Terminal showing unzip of datasettwo.zip file

If not, click here to download training data that can be used for this tutorial.
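
If you downloaded the sample data, it arrives as a zip archive (datasettwo.zip, as in the screenshot above); unzip it so the train directory is available in your working directory:

unzip datasettwo.zip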

Step 3 Upload train directory to CDeep3M instance

A. Set $cdeep3mhost to the hostname of the CDeep3M instance

The remainder of this tutorial assumes the hostname of the CDeep3M instance is stored in the variable $cdeep3mhost.

To check whether it has been set, run the following from the terminal:

echo $cdeep3mhost

If set, the output will look like this:

$ echo $cdeep3mhost
ec2-34-217-165-23.us-west-2.compute.amazonaws.com
$

If nothing is output, as in the example below:

$ echo $cdeep3mhost

$

Then the value will need to be set.

NOTE: The $cdeep3mhost variable will need to be set again if the terminal is closed and re-opened.

If the value needs to be set, run the following command from the terminal, replacing <PublicDNS> with the value circled in red in Step 9 of Launching CDeep3M via AWS CloudFormation:

For Cygwin and Bash terminals, enter the following (to find out which shell you are using, click here):

export cdeep3mhost=<PublicDNS>

Example:

$ export cdeep3mhost=ec2-34-217-165-23.us-west-2.compute.amazonaws.com
$

Example screenshot with echo command invoked to show the variable is set.

Terminal showing cdeep3mhost variable being set
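
To avoid re-typing the export each time a new terminal is opened, one option (a sketch, assuming a Bash shell that sources ~/.bashrc on startup) is to append the export to your shell startup file, again replacing <PublicDNS> with the actual hostname:

echo 'export cdeep3mhost=<PublicDNS>' >> ~/.bashrc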

B. Upload train directory to CDeep3M instance created in Step 1

It is assumed the $cdeep3mhost variable has already been set to the value circled in red in Step 9 of Launching CDeep3M via AWS CloudFormation.

ls train
scp -i ~/.ssh/id_rsa -r train ubuntu@$cdeep3mhost:/home/ubuntu/.

Terminal showing upload of train directory to CDeep3M instance
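
The command above assumes the private key for the key pair imported in Step 1 is at ~/.ssh/id_rsa. If your key is stored elsewhere, point -i at that file instead, for example (hypothetical key path):

scp -i ~/keys/mykey.pem -r train ubuntu@$cdeep3mhost:/home/ubuntu/.   # mykey.pem is a placeholder; use your own key file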

C. Connect to CDeep3M instance

ssh -i ~/.ssh/id_rsa ubuntu@$cdeep3mhost

Terminal showing connection to CDeep3M instance

D. Verify train directory was uploaded

ls train

Terminal showing verification of upload with ls train command

Step 4 Preprocess training data

A. Preprocess the training data by running the PreprocessTrainingData.m command, which takes three arguments in order: the images directory, the labels directory, and the output directory:

PreprocessTrainingData.m ~/train/images ~/train/labels ~/augtrain

Click here for more information about PreprocessTrainingData.m

Output:

octave: X11 DISPLAY environment variable not set
octave: disabling GUI features
Starting Training data Preprocessing
Training Image Path:
/home/ubuntu/train/images
Training Label Path:
/home/ubuntu/train/labels/
Output Path:
/home/ubuntu/augtrain
Loading:
/home/ubuntu/train/images
Image importer loading ...
/home/ubuntu/train/images
Reading file: /home/ubuntu/train/images/x.000.png
.
.
Verifying labels
Checking image dimensions
Augmenting training data 1-8 and 9-16

Create variation 1 and 9
Saving: /home/ubuntu/augtrain/training_full_stacks_v1.h5
Saving: /home/ubuntu/augtrain/training_full_stacks_v9.h5
.
.
Create variation 8 and 16
Saving: /home/ubuntu/augtrain/training_full_stacks_v8.h5
Saving: /home/ubuntu/augtrain/training_full_stacks_v16.h5
Elapsed time is 9.76223 seconds.
-> Training data augmentation completed
Training data stored in /home/ubuntu/augtrain
For training your model please run runtraining.sh /home/ubuntu/augtrain <desired output directory>

Terminal showing run of PreprocessTrainingData.m
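
Before moving on, you can confirm that the augmented HDF5 stacks listed in the output above (training_full_stacks_v1.h5 through training_full_stacks_v16.h5) were written:

ls ~/augtrain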

Step 5 Run CDeep3M train job

A. Since training can take a while (hours to even days), we will be using the screen command so we can disconnect. Information about screen can be found here or by typing man screen in the terminal. Type screen as seen below.

screen

Terminal showing screen command message window

Hit enter/return key to continue
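
Optionally, the session can be given a name so it is easier to find later (a sketch; the name cdeep3m-train is arbitrary):

screen -S cdeep3m-train

A named session can later be re-attached with screen -r cdeep3m-train.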

B. CDeep3M training is done by invoking the runtraining.sh command. Since full training can take a couple of days, the instructions below retrain the following pretrained model, already preloaded on the CDeep3M instance, ~/sbem/mitochrondria/xy5.9nm40nmz/30000iterations_train_out, with the training data we generated in Tutorial 1. This is done by using the --retrain and --additerations flags as seen below. Type the following command:

runtraining.sh --additerations 20 --retrain ~/sbem/mitochrondria/xy5.9nm40nmz/30000iterations_train_out  ~/augtrain ~/model

Click here for more information about runtraining.sh

Click here for more information about how to retrain a pretrained model

Output:

octave: X11 DISPLAY environment variable not set
octave: disabling GUI features
Verifying input training data is valid ... success
Copying over model files and creating run scripts ... success

A new directory has been created: /home/ubuntu/model
In this directory are 3 directories 1fm,3fm,5fm which
correspond to 3 caffe models that need to be trained
Latest iteration found in 1fm from /home/ubuntu/sbem/mitochrondria/xy5.9nm40nmz/30000iterations_train_out is 30000
Adding 20 iterations so will now run to 30020 iterations
Copying over trained models
Copy of /home/ubuntu/sbem/mitochrondria/xy5.9nm40nmz/30000iterations_train_out/1fm/trainedmodel to /home/ubuntu/model/1fm/trainedmodel success
Copy of /home/ubuntu/sbem/mitochrondria/xy5.9nm40nmz/30000iterations_train_out/3fm/trainedmodel to /home/ubuntu/model/3fm/trainedmodel success
Copy of /home/ubuntu/sbem/mitochrondria/xy5.9nm40nmz/30000iterations_train_out/5fm/trainedmodel to /home/ubuntu/model/5fm/trainedmodel success

Single GPU detected.
Resuming run from snapshot file: /home/ubuntu/model/1fm/trainedmodel/1fm_classifer_iter_30100.solverstate
Resuming run from snapshot file: /home/ubuntu/model/3fm/trainedmodel/3fm_classifer_iter_30100.solverstate
Resuming run from snapshot file: /home/ubuntu/model/5fm/trainedmodel/5fm_classifer_iter_30100.solverstate
Resuming run from snapshot file: /home/ubuntu/model/1fm/trainedmodel/1fm_classifer_iter_30100.solverstate
Resuming run from snapshot file: /home/ubuntu/model/3fm/trainedmodel/3fm_classifer_iter_30100.solverstate
Resuming run from snapshot file: /home/ubuntu/model/5fm/trainedmodel/5fm_classifer_iter_30100.solverstate

Training has completed. Have a nice day!


Training has completed. Results are stored in /home/ubuntu/model
Have a nice day!

Terminal showing run of runtraining.sh
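
For reference, a full training run from scratch (rather than retraining) follows the usage printed by the preprocessing step above; a sketch, where ~/trainout is a hypothetical output directory:

runtraining.sh ~/augtrain ~/trainout   # ~/trainout is a placeholder output directory

Expect a from-scratch run to take considerably longer than the short retraining demonstrated here.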

C. Detach the screen from the CDeep3M training job. This can be done by hitting the key combination Control-a then the d key, or in shorter form: Ctrl-a d

D. Re-attach to the screen running the CDeep3M training job by typing screen -r as seen here:

screen -r

Once training has finished, go to the next step. A completed train job will have output text as seen above.
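
If you are unsure whether the job is still running after detaching, two quick checks (sketches, assuming the standard screen and NVIDIA tools present on the GPU instance) are to list screen sessions and to look at GPU activity:

screen -ls
nvidia-smi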

Step 6 Optionally download trained model

A. Exit from the CDeep3M instance by typing exit. The exit command will need to be entered twice since we are inside a screen virtual terminal.

exit
exit

Terminal showing exit from CDeep3M instance

B. Download the trained model by using the scp command as seen here:

scp -i ~/.ssh/id_rsa -r ubuntu@$cdeep3mhost:/home/ubuntu/model .
ls

Terminal showing download of trained model
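
As a quick check that the download completed, the model directory should contain the 1fm, 3fm, and 5fm subdirectories described in the runtraining.sh output above:

ls model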

Next

Congratulations on completing Tutorial 2

Click here to continue with Tutorial 3: Run CDeep3M

NOTE: If you are not continuing to the next tutorial, be sure to shut down your CDeep3M instance to avoid incurring additional EC2 charges.

Instructions for shutdown can be found here
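
If you prefer the command line over the AWS console, one possible approach (a sketch, assuming the AWS CLI is installed and configured, and that <StackName> is the name you gave the CloudFormation stack in Step 1) is to delete the stack, which terminates the instance:

aws cloudformation delete-stack --stack-name <StackName>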
