Project author: unsky

Project description:
Deformable Convolutional Networks on caffe
Primary language: Jupyter Notebook
Project URL: git://github.com/unsky/Deformable-ConvNets-caffe.git
Created: 2017-07-23T09:31:01Z
Project community: https://github.com/unsky/Deformable-ConvNets-caffe


Caffe implementation of Deformable Convolutional Networks

Results:

The Faster R-CNN (ResNet-50) result was obtained without OHEM and without deformable RoI pooling.

Trained on PASCAL VOC 2007 + 2012, tested on VOC 2007.

  mAP@0.5  aeroplane  bicycle  bird    boat    bottle  bus     car     cat     chair   cow
  0.7811   0.7810     0.8560   0.7828  0.6910  0.5948  0.8430  0.8741  0.8878  0.6213  0.8347

  diningtable  dog     horse   motorbike  person  pottedplant  sheep   sofa    train   tv
  0.7324       0.8695  0.8893  0.8507     0.7986  0.5226       0.7791  0.7933  0.8528  0.7668
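
As a quick sanity check on the table, the reported mAP@0.5 should be the plain average of the 20 per-class AP values. The short sketch below recomputes it from the numbers listed above; the dictionary name is only illustrative.

```python
# Per-class AP values copied from the results table above (VOC 2007 test).
per_class_ap = {
    "aeroplane": 0.7810, "bicycle": 0.8560, "bird": 0.7828, "boat": 0.6910,
    "bottle": 0.5948, "bus": 0.8430, "car": 0.8741, "cat": 0.8878,
    "chair": 0.6213, "cow": 0.8347, "diningtable": 0.7324, "dog": 0.8695,
    "horse": 0.8893, "motorbike": 0.8507, "person": 0.7986,
    "pottedplant": 0.5226, "sheep": 0.7791, "sofa": 0.7933,
    "train": 0.8528, "tv": 0.7668,
}

# mAP is the unweighted mean over the 20 classes.
mean_ap = sum(per_class_ap.values()) / len(per_class_ap)
print(f"mAP@0.5 = {mean_ap:.4f}")
```

The average agrees with the 0.7811 reported in the first column to four decimal places.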

Usage

Use the modified Caffe

The MNIST example is in caffe/defor/

Compile:

  mkdir build
  cd build
  cmake ..
  make all

Train & test:

  cd caffe/defor/
  ./train_lenet.sh

The model definition is in caffe/defor/model_protxt/

Use Faster R-CNN

Download the VOC 2007 and VOC 2012 datasets and ResNet50.caffemodel, rename the model to ResNet50.v2.caffemodel, and copy it into data/pretrained_model/:

  cp ResNet50.v2.caffemodel data/pretrained_model/
  • OneDrive download: link

Train & test:

  ./experiments/scripts/faster_rcnn_end2end.sh 0 ResNet50 pascal_voc
  ./test.sh 0 ResNet50 pascal_voc

Use the code in your own Caffe

All the code is in deformable_conv_cxx/

1. Add layer definition to caffe.proto:

  // Add this field to the existing LayerParameter message
  // (the field number must not already be used there).
  optional DeformableConvolutionParameter deformable_convolution_param = 999;

  message DeformableConvolutionParameter {
    optional uint32 num_output = 1;
    optional bool bias_term = 2 [default = true];
    repeated uint32 pad = 3; // The padding size; defaults to 0
    repeated uint32 kernel_size = 4; // The kernel size
    repeated uint32 stride = 6; // The stride; defaults to 1
    repeated uint32 dilation = 18; // The dilation; defaults to 1
    optional uint32 pad_h = 9 [default = 0]; // The padding height (2D only)
    optional uint32 pad_w = 10 [default = 0]; // The padding width (2D only)
    optional uint32 kernel_h = 11; // The kernel height (2D only)
    optional uint32 kernel_w = 12; // The kernel width (2D only)
    optional uint32 stride_h = 13; // The stride height (2D only)
    optional uint32 stride_w = 14; // The stride width (2D only)
    optional uint32 group = 5 [default = 4];
    optional uint32 deformable_group = 25 [default = 4];
    optional FillerParameter weight_filler = 7; // The filler for the weight
    optional FillerParameter bias_filler = 8; // The filler for the bias
    enum Engine {
      DEFAULT = 0;
      CAFFE = 1;
      CUDNN = 2;
    }
    optional Engine engine = 15 [default = DEFAULT];
    optional int32 axis = 16 [default = 1];
    optional bool force_nd_im2col = 17 [default = false];
  }

A template is available in deformable_conv_cxx/caffe.proto

2. Move the code into your Caffe tree

  • move deformable_conv_layer.cpp and deformable_conv_layer.cu to yourcaffepath/src/caffe/layers/
  • move deformable_conv_layer.hpp to yourcaffepath/include/caffe/layers/
  • move deformable_im2col.cu to yourcaffepath/src/caffe/util/
  • move deformable_im2col.hpp to yourcaffepath/include/caffe/util/
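
The file placement above can also be scripted. The following is a minimal sketch, not part of the repository: the function name `install_layer_sources` is hypothetical, it assumes the five files sit directly inside the source directory, and it copies rather than moves them so the originals stay in place.

```python
import shutil
from pathlib import Path

# Destination directory (relative to the Caffe root) for each file
# named in the steps above.
FILE_DESTINATIONS = {
    "deformable_conv_layer.cpp": "src/caffe/layers",
    "deformable_conv_layer.cu": "src/caffe/layers",
    "deformable_conv_layer.hpp": "include/caffe/layers",
    "deformable_im2col.cu": "src/caffe/util",
    "deformable_im2col.hpp": "include/caffe/util",
}


def install_layer_sources(source_dir, caffe_root):
    """Copy the deformable-convolution sources into a Caffe tree.

    Hypothetical helper: assumes every file listed in FILE_DESTINATIONS
    lives directly in source_dir. Returns the destination paths written.
    """
    source_dir, caffe_root = Path(source_dir), Path(caffe_root)
    missing = [n for n in FILE_DESTINATIONS if not (source_dir / n).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {source_dir}: {missing}")

    written = []
    for name, rel_dir in FILE_DESTINATIONS.items():
        dest_dir = caffe_root / rel_dir
        dest_dir.mkdir(parents=True, exist_ok=True)
        written.append(Path(shutil.copy2(source_dir / name, dest_dir / name)))
    return written
```

A call such as `install_layer_sources("deformable_conv_cxx", "yourcaffepath")` would then place all five files; the helper fails before copying anything if one of them is missing.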

3. Compile in your Caffe root path

  mkdir build
  cd build
  cmake ..
  make all

About the deformable conv layer

The inputs (bottom blobs) of the DeformableConvolution layer:

  bottom[0] (data): (batch_size, channel, height, width)
  bottom[1] (offset): (batch_size, deformable_group * kernel[0] * kernel[1] * 2, height, width)

Define:

  f(x, k, p, s, d) = floor((x + 2*p - d*(k-1) - 1) / s) + 1

The output size of the DeformableConvolution layer:

  out_height = f(height, kernel[0], pad[0], stride[0], dilate[0])
  out_width = f(width, kernel[1], pad[1], stride[1], dilate[1])
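
The size formula is easy to check numerically. The sketch below transcribes f directly into Python; the function names and the 28x28 input size are illustrative assumptions, while the kernel, pad, stride and dilation values match the layer examples in this README.

```python
def conv_output_size(x, k, p, s, d):
    """f(x, k, p, s, d) = floor((x + 2*p - d*(k-1) - 1) / s) + 1."""
    return (x + 2 * p - d * (k - 1) - 1) // s + 1


def deformable_conv_output_hw(height, width, kernel, pad, stride, dilate):
    """Apply f independently along the height and width axes."""
    out_height = conv_output_size(height, kernel[0], pad[0], stride[0], dilate[0])
    out_width = conv_output_size(width, kernel[1], pad[1], stride[1], dilate[1])
    return out_height, out_width


# 3x3 kernel, pad 2, stride 1, dilation 2 on a hypothetical 28x28 input.
print(deformable_conv_output_hw(28, 28, (3, 3), (2, 2), (1, 1), (2, 2)))  # (28, 28)
```

With stride 1 and pad equal to dilation * (kernel - 1) / 2, the spatial size is unchanged, which is the setting used by both example layers.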

Offset layer:

  layer {
    name: "offset"
    type: "Convolution"
    bottom: "pool1"
    top: "offset"
    param {
      lr_mult: 1
    }
    param {
      lr_mult: 2
    }
    convolution_param {
      num_output: 72
      kernel_size: 3
      stride: 1
      dilation: 2
      pad: 2
      weight_filler {
        type: "xavier"
      }
      bias_filler {
        type: "constant"
      }
    }
  }

DeformableConvolution layer:

  layer {
    name: "dec"
    type: "DeformableConvolution"
    bottom: "conv1"
    bottom: "offset"
    top: "dec"
    param {
      lr_mult: 1
    }
    param {
      lr_mult: 2
    }
    deformable_convolution_param {
      num_output: 512
      kernel_size: 3
      stride: 1
      pad: 2
      engine: 1  # 1 = CAFFE in the Engine enum
      dilation: 2
      deformable_group: 4
      weight_filler {
        type: "xavier"
      }
      bias_filler {
        type: "constant"
      }
    }
  }
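
The two layer definitions have to agree with each other: the offset layer's num_output must equal the channel count expected for bottom[1], and both layers must produce the same spatial size. The sketch below checks this for the values used above; the helper names are hypothetical.

```python
def offset_channels(deformable_group, kernel_h, kernel_w):
    """Channels of the offset blob: two displacement values per kernel
    tap, for every deformable group."""
    return deformable_group * kernel_h * kernel_w * 2


def same_size_pad(kernel, dilation):
    """Padding that keeps the spatial size at stride 1 (odd kernel sizes)."""
    return dilation * (kernel - 1) // 2


# deformable_group: 4 with a 3x3 kernel gives the offset layer's num_output.
print(offset_channels(4, 3, 3))  # 72
# kernel_size: 3 with dilation: 2 gives the pad used by both layers.
print(same_size_pad(3, 2))       # 2
```

If deformable_group or the kernel size changes in the DeformableConvolution layer, the offset layer's num_output has to be updated to match.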

The prototxt model should look like this:

The following animation was generated by Felix Lau (with his TensorFlow implementation): https://github.com/felixlaumon/deform-conv/
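
To make the role of the offset bottom concrete, here is a minimal NumPy sketch of the sampling idea for a single image, a single channel and a single deformable group. It is not the repository's CUDA kernel. The channel layout (vertical offset at channel 2*t, horizontal at 2*t+1 for kernel tap t) and the zero-outside border handling are assumptions of this sketch and may differ from the actual implementation.

```python
import numpy as np


def bilinear_sample(img, y, x):
    """Bilinearly interpolate a 2D array at fractional (y, x).

    Locations outside the array contribute zero (assumed border rule).
    """
    h, w = img.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    value = 0.0
    for yy, wy in ((y0, 1.0 - (y - y0)), (y0 + 1, y - y0)):
        for xx, wx in ((x0, 1.0 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= yy < h and 0 <= xx < w:
                value += wy * wx * img[yy, xx]
    return value


def deformable_conv2d(data, offset, weight, pad=0, stride=1, dilation=1):
    """Single-channel deformable convolution sketch.

    data:   (H, W)
    offset: (2 * kh * kw, out_h, out_w); channel 2*t holds dy and
            channel 2*t + 1 holds dx for kernel tap t (ASSUMED layout)
    weight: (kh, kw)
    """
    kh, kw = weight.shape
    h, w = data.shape
    out_h = (h + 2 * pad - dilation * (kh - 1) - 1) // stride + 1
    out_w = (w + 2 * pad - dilation * (kw - 1) - 1) // stride + 1
    assert offset.shape == (2 * kh * kw, out_h, out_w)

    out = np.zeros((out_h, out_w))
    for oy in range(out_h):
        for ox in range(out_w):
            for i in range(kh):
                for j in range(kw):
                    tap = i * kw + j
                    # Regular grid position plus the learned displacement.
                    y = oy * stride - pad + i * dilation + offset[2 * tap, oy, ox]
                    x = ox * stride - pad + j * dilation + offset[2 * tap + 1, oy, ox]
                    out[oy, ox] += weight[i, j] * bilinear_sample(data, y, x)
    return out
```

With an all-zero offset blob the function reduces to an ordinary dilated convolution, which is a convenient way to check any implementation of the layer.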

TODO List

  • all tests passed
  • evaluate performance on Regular MNIST
  • evaluate object detection performance on voc

Deformable Convolutional Networks

Dai, Jifeng, Haozhi Qi, Yuwen Xiong, Yi Li, Guodong Zhang, Han Hu, and Yichen Wei. 2017. “Deformable Convolutional Networks.” arXiv:1703.06211 [cs.CV]. http://arxiv.org/abs/1703.06211