
Welcome to the caffe-mt wiki!

Features of this version of Caffe include:

1. MultiTaskDataLayer

This layer supports parsing labels for multi-task training (classification, regression, or any combination of the two) without the hassle of wiring up multiple DataLayers. The labels for the multiple tasks are stored in the float_data field of Datum, and during training this layer parses them from that field into the proper top blobs. For example, a simple config for 3 classification tasks looks like this:

layer {
  name: "traindata"
  type: "MultiTaskData"
  top: "data"
  top: "age_label"
  top: "gender_label"
  top: "pose_label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
    mirror: true
    crop_size: 60
  }
  data_param {
    source: "xxx_lmdb"
    batch_size: 60
    backend: LMDB
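    # caffe-mt specific fields: the number of tasks, plus one label_dimension entry per task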
    task_num: 3
    label_dimension: 1
    label_dimension: 1
    label_dimension: 1
  }
}
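For reference, here is a minimal Python sketch (not a tool shipped with caffe-mt; the dummy sample data and key format below are made up for illustration) of how the per-task labels can be packed into the float_data field of Datum when building the LMDB:

import lmdb
import numpy as np
from caffe.proto import caffe_pb2

# Dummy (image, age, gender, pose) samples; in practice these come from
# your annotated training set. Images are C x H x W uint8 arrays.
samples = [(np.zeros((3, 60, 60), dtype=np.uint8), 25, 1, 0)]

env = lmdb.open("xxx_lmdb", map_size=1 << 32)
with env.begin(write=True) as txn:
    for idx, (image, age, gender, pose) in enumerate(samples):
        datum = caffe_pb2.Datum()
        datum.channels, datum.height, datum.width = image.shape
        datum.data = image.tobytes()
        # One value per task, in the same order as the label_dimension
        # entries; MultiTaskDataLayer parses them into the top blobs.
        datum.float_data.extend([float(age), float(gender), float(pose)])
        txn.put("{:08d}".format(idx).encode("ascii"), datum.SerializeToString())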

Note that MultiTaskDataLayer does not break the multi-GPU training workflow, so it is safe to use this layer when training a CNN in the multi-GPU mode supported by Caffe.

2. ChannlWiseBNLayer / EltWiseBNLayer

a. ChannlWiseBNLayer performs Batch Normalization per channel of its input feature maps; that is, it computes one mean/variance/scale/shift scalar per channel.

b. EltWiseBNLayer performs Batch Normalization per element of its input feature maps; that is, it computes mean/variance/scale/shift tensors with the same shape as the input feature maps.
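To make the difference between the two variants concrete, here is an illustrative numpy sketch (not the layer code itself) of where each one computes its statistics for a blob of shape (N, C, H, W); the learned scale/shift parameters have the same shapes as the means:

import numpy as np

x = np.random.randn(8, 16, 7, 7).astype(np.float32)  # (N, C, H, W) blob
eps = 1e-5

# ChannlWiseBNLayer: scalar statistics per channel, reduced over the
# batch and spatial axes -> shape (C,)
mean_c = x.mean(axis=(0, 2, 3))
var_c = x.var(axis=(0, 2, 3))
y_c = (x - mean_c[None, :, None, None]) / np.sqrt(var_c[None, :, None, None] + eps)

# EltWiseBNLayer: statistics per element of the feature map, reduced
# over the batch axis only -> shape (C, H, W)
mean_e = x.mean(axis=0)
var_e = x.var(axis=0)
y_e = (x - mean_e[None]) / np.sqrt(var_e[None] + eps)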

3. Local Convolution Layer

This layer splits the input feature maps into an N*N grid (optionally with overlap), performs convolution on each cell of the grid with its own kernels, and then combines the results into a larger feature map. A related example on StackOverflow is here.
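The idea can be sketched in numpy as follows (a conceptual illustration with a 2*2 grid, one channel, and no overlap; the actual layer also handles overlap and multiple channels):

import numpy as np

def conv2d_valid(patch, kernel):
    # Plain 'valid' cross-correlation, as in Caffe's convolution.
    kh, kw = kernel.shape
    out = np.empty((patch.shape[0] - kh + 1, patch.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(patch[i:i + kh, j:j + kw] * kernel)
    return out

N = 2
x = np.random.randn(12, 12)              # one input channel for simplicity
kernels = np.random.randn(N, N, 3, 3)    # a different 3x3 kernel per grid cell
cell = x.shape[0] // N
pieces = [[conv2d_valid(x[r * cell:(r + 1) * cell, c * cell:(c + 1) * cell],
                        kernels[r, c])
           for c in range(N)]
          for r in range(N)]
y = np.block(pieces)                     # stitch per-cell outputs back together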

4. Dynamic Convolution Layer

Dynamic convolution means performing convolution with kernels taken from a bottom blob rather than from learned layer parameters. Details can be found here and here.
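Conceptually (a numpy sketch, not the layer implementation), the kernels come from a second bottom blob, so each sample in the batch is filtered with its own, possibly predicted, kernel:

import numpy as np

def conv2d_valid(img, kernel):
    kh, kw = kernel.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

batch = np.random.randn(4, 8, 8)     # bottom[0]: input feature maps
kernels = np.random.randn(4, 3, 3)   # bottom[1]: one kernel per sample
top = np.stack([conv2d_valid(img, k) for img, k in zip(batch, kernels)])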

5. Support for on-the-fly motion blur image augmentation
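A common way to implement this (an assumption about the details; the fork's actual kernel generation may differ) is to build a normalized line-shaped kernel at a random length and angle and convolve the image with it on the fly:

import numpy as np
from scipy.ndimage import convolve

def motion_blur_kernel(length, angle_deg):
    # Rasterize a line through the kernel center at the given angle.
    k = np.zeros((length, length), dtype=np.float32)
    c = length // 2
    theta = np.deg2rad(angle_deg)
    for t in np.linspace(-c, c, 4 * length):
        i = int(round(c + t * np.sin(theta)))
        j = int(round(c + t * np.cos(theta)))
        if 0 <= i < length and 0 <= j < length:
            k[i, j] = 1.0
    return k / k.sum()

image = np.random.rand(60, 60).astype(np.float32)   # stand-in grayscale image
kernel = motion_blur_kernel(length=np.random.randint(3, 9),
                            angle_deg=np.random.uniform(0, 180))
blurred = convolve(image, kernel, mode="reflect")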

6. MTCNNData Layer

This layer balances the number of positive and negative samples within each batch when training an MTCNN network.
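The balancing amounts to drawing a fixed ratio of positives and negatives for every batch, roughly like this conceptual Python sketch (the 1:1 ratio and sampling scheme here are assumptions, not the layer's exact behavior):

import random

def balanced_batch(positives, negatives, batch_size=16):
    # Draw half the batch from each pool, then shuffle their order.
    half = batch_size // 2
    batch = random.sample(positives, half) + \
            random.sample(negatives, batch_size - half)
    random.shuffle(batch)
    return batch

pos_samples = ["pos_%d" % i for i in range(100)]   # stand-ins for face crops
neg_samples = ["neg_%d" % i for i in range(900)]
batch = balanced_batch(pos_samples, neg_samples)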