Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Ren, Shaoqing and He, Kaiming and Girshick, Ross B. and Sun, Jian, 2015
Paper summary by leopaillier
_Objective:_ Improve on Fast R-CNN and [SPPnet](https://arxiv.org/abs/1406.4729) by incorporating the region proposal network directly into the detection network.
_Dataset:_ [PASCAL VOC](http://host.robots.ox.ac.uk/pascal/VOC/) and [COCO](http://mscoco.org/).
Both Fast R-CNN and SPPnet take as input an image and several candidate objects (regions of interest) and score each of them. A complete detection pipeline therefore consists of two separate entities:
1. A region proposal network.
2. A classification/detection network (Fast R-CNN/SPPnet).
In Faster R-CNN, image features are first extracted using a state-of-the-art ConvNet; they are then used for both region proposal and the actual detection/classification on those regions.
[![screen shot 2017-04-14 at 2 59 28 pm](https://cloud.githubusercontent.com/assets/17261080/25043807/01a287b6-2123-11e7-944c-01493371df29.png)](https://cloud.githubusercontent.com/assets/17261080/25043807/01a287b6-2123-11e7-944c-01493371df29.png)
Because the region proposal network sits right after the feature ConvNet, its computational cost becomes nearly free. This leads to an elegant solution (only one network) and, more importantly, greatly improves speed at test time.
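The sharing argument can be pictured with a toy sketch. Everything here is hypothetical and chosen only for illustration: `CountingBackbone` stands in for the feature ConvNet (it merely averages pixel blocks), and `propose_regions` and `score_regions` are stand-ins for the two heads, not the networks from the paper. The point is only that the expensive feature step runs once and both stages read its output.

```python
import numpy as np

class CountingBackbone:
    """Hypothetical stand-in for the feature ConvNet; counts how often it runs."""

    def __init__(self, stride=16):
        self.stride = stride
        self.calls = 0

    def __call__(self, image):
        self.calls += 1
        h, w = image.shape[:2]
        hs, ws = h // self.stride, w // self.stride
        # Toy "features": average each stride x stride block of pixels.
        cropped = image[:hs * self.stride, :ws * self.stride]
        return cropped.reshape(hs, self.stride, ws, self.stride).mean(axis=(1, 3))

def propose_regions(features, top_k=3):
    """Toy proposal head: pick the top_k strongest feature cells as (row, col)."""
    order = np.argsort(features, axis=None)[::-1][:top_k]
    return [tuple(int(v) for v in np.unravel_index(i, features.shape)) for i in order]

def score_regions(features, regions):
    """Toy detection head: score each proposed cell from the same features."""
    return [float(features[r, c]) for r, c in regions]

def detect(image, backbone):
    features = backbone(image)                  # expensive step, computed once
    regions = propose_regions(features)         # stage 1 reuses the features
    scores = score_regions(features, regions)   # stage 2 reuses them again
    return regions, scores

# Usage: a 64x64 grayscale image with one bright 16x16 block.
image = np.zeros((64, 64))
image[16:32, 32:48] = 1.0
backbone = CountingBackbone()
regions, scores = detect(image, backbone)
```

After the call, `backbone.calls` is 1 even though both stages consumed the features, which is the sense in which the proposal step adds almost no cost.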
This work proposes a two-stage object detection algorithm based on convolutional neural networks (CNNs). The first stage is region proposal by a region proposal network (RPN), which follows the traditional sliding-window method but operates on the top-layer feature map of the CNN. In the second stage, Fast R-CNN is applied to the proposed regions. Since the convolutional layers are shared between the RPN and Fast R-CNN, and the computation is accelerated on a GPU, the algorithm achieves near real-time speed (5 fps).
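A minimal sketch of the sliding-window idea on a feature map, assuming NumPy: at every feature-map cell, a fixed set of reference boxes (called anchors in the Faster R-CNN paper) is laid out in image coordinates. The function name `generate_anchors` is hypothetical, and the defaults (stride 16, three scales, three aspect ratios) are assumptions taken from the paper's commonly cited configuration rather than from this summary. The network that scores and refines these boxes is not shown.

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return boxes (x1, y1, x2, y2) of shape (feat_h * feat_w * k, 4).

    k = len(scales) * len(ratios); stride maps feature cells to image pixels.
    """
    # Base box shapes: each has area scale**2 and height / width == ratio.
    shapes = []
    for scale in scales:
        for ratio in ratios:
            w = scale / np.sqrt(ratio)
            h = scale * np.sqrt(ratio)
            shapes.append((w, h))
    shapes = np.array(shapes)                              # (k, 2)

    # Centre of each feature-map cell, expressed in image coordinates.
    xs = (np.arange(feat_w) + 0.5) * stride
    ys = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(xs, ys)                           # each (feat_h, feat_w)
    centres = np.stack([cx.ravel(), cy.ravel()], axis=1)   # (H*W, 2)

    # Slide the window: pair every centre with every base shape.
    half = shapes / 2.0
    top_left = centres[:, None, :] - half[None, :, :]      # (H*W, k, 2)
    bottom_right = centres[:, None, :] + half[None, :, :]  # (H*W, k, 2)
    return np.concatenate([top_left, bottom_right], axis=2).reshape(-1, 4)

anchors = generate_anchors(2, 3)
```

For a 2 x 3 feature map this yields 54 boxes (6 cells times 9 shapes). Because the boxes depend only on the feature-map size, they can be computed once per input resolution and reused.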