Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs
## Segment-CNN (S-CNN)

**Summary**: This paper uses three 3D CNN stages to identify candidate proposals, recognize actions, and localize temporal boundaries.

**Models**: The pipeline has three main parts: generating proposals, selecting proposals and refining their temporal boundaries, and applying NMS to remove redundant proposals.

1. Generate multiscale segments (16, 32, 64, 128, 256, 512 frames) using a sliding window with 75% overlap. This has high computational cost!
2. Network: each of the three stages is a 3D ConvNet followed by 3 FC layers.
    * The proposal network is basically a binary classifier that judges whether each segment contains an action.
    * The classification network classifies each proposal that the proposal network considers valid into background and K action categories.
    * The localization network functions as a scoring system: it raises the scores of proposals that have high overlap with the corresponding ground truth and lowers the scores of the others.
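The segment generation and NMS steps can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names, the frame-index convention (end exclusive), and the NMS threshold of 0.5 are assumptions made here; only the segment lengths and the 75% overlap come from the summary.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[int, int]  # (start_frame, end_frame), end exclusive (assumed convention)


def sliding_window_segments(num_frames: int,
                            lengths: Sequence[int] = (16, 32, 64, 128, 256, 512),
                            overlap: float = 0.75) -> List[Segment]:
    """Enumerate multiscale candidate segments with a fixed overlap ratio."""
    segments = []
    for length in lengths:
        if length > num_frames:
            continue  # the video is too short for this scale
        # 75% overlap means consecutive windows are shifted by 25% of their length.
        stride = max(1, int(length * (1 - overlap)))
        for start in range(0, num_frames - length + 1, stride):
            segments.append((start, start + length))
    return segments


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two temporal segments."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def temporal_nms(segments: Sequence[Segment],
                 scores: Sequence[float],
                 iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: keep the best-scoring segments, drop heavily overlapping ones.

    The threshold value is a placeholder, not taken from the paper.
    Returns the indices of the kept segments, highest score first.
    """
    order = sorted(range(len(segments)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(temporal_iou(segments[i], segments[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

Calling `len(sliding_window_segments(n))` for a realistic frame count `n` gives a feel for why the summary flags the computational cost: every scale contributes roughly `4 * n / length` windows, and each window has to pass through a 3D ConvNet.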
Zheng Shou and Dongang Wang and Shih-Fu Chang
arXiv e-Print archive - 2016 via Local arXiv
Keywords: cs.CV

Summary by shiyu 11 months ago