Paper summaryleopaillierImprove on [R-CNN](https://arxiv.org/abs/1311.2524) and [SPPnet](https://arxiv.org/abs/1406.4729) with easier and faster training.
Region-based Convolutional Neural Network (R-CNN), basically takes as input and image and several possibles objects (corresponding to Region of Interest) and score each of them.
The feature map is computed for the whole image and then for each region of interest a new fixed-length feature vector is computed using max-pooling. From it two predictions are made for classification and bounding-box offsets.
[![screen shot 2017-04-14 at 12 46 38 pm](https://cloud.githubusercontent.com/assets/17261080/25041460/6e7cba40-2110-11e7-8650-faae2a6b0a92.png)](https://cloud.githubusercontent.com/assets/17261080/25041460/6e7cba40-2110-11e7-8650-faae2a6b0a92.png)
By sharing computation for RoIs of the same image and allowing simple SGD training it really improves performance training although at testing it's still not as fast as YOLO9000.