The aim of this project is to dive into the field of action recognition and explore various techniques.
So far, two models have been implemented:
1 - model.py is a PyTorch implementation of the paper "A Closer Look at Spatiotemporal Convolutions for Action Recognition".
Link to the paper: https://arxiv.org/abs/1711.11248v3
2 - A PyTorch implementation of MobileNets, for computationally lighter models. See the paper "MobileNetV2: Inverted Residuals and Linear Bottlenecks": https://arxiv.org/abs/1801.04381v4
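The first paper factorizes each 3D convolution (kernel t x d x d) into a 2D spatial convolution (1 x d x d) followed by a 1D temporal convolution (t x 1 x 1). It picks the number of intermediate channels M so that the factorized block has roughly the same number of parameters as the full 3D convolution. The sketch below only illustrates that formula in plain Python. The function names are hypothetical, bias terms are ignored, and it is an assumption that model.py in this repository chooses M in the same way.

```python
def intermediate_channels(n_in, n_out, t=3, d=3):
    """Intermediate channel count M for a (2+1)D block.

    M = floor(t * d^2 * n_in * n_out / (d^2 * n_in + t * n_out)),
    where t is the temporal kernel size and d the spatial kernel size.
    """
    return (t * d * d * n_in * n_out) // (d * d * n_in + t * n_out)


def conv3d_params(n_in, n_out, t=3, d=3):
    """Weights in a full 3D convolution with a t x d x d kernel (no bias)."""
    return n_out * n_in * t * d * d


def conv2plus1d_params(n_in, n_out, t=3, d=3):
    """Weights in the factorized block: spatial conv, then temporal conv (no bias)."""
    m = intermediate_channels(n_in, n_out, t, d)
    spatial = m * n_in * d * d   # 1 x d x d kernels, n_in -> m channels
    temporal = n_out * m * t     # t x 1 x 1 kernels, m -> n_out channels
    return spatial + temporal


if __name__ == "__main__":
    print(intermediate_channels(64, 64))   # 144
    print(conv3d_params(64, 64))           # 110592
    print(conv2plus1d_params(64, 64))      # 110592
```

Because M is rounded down, the factorized block never has more weights than the 3D convolution it replaces. The extra nonlinearity between the spatial and temporal convolutions comes at no additional parameter cost.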
1 - OpenCV, to load and resize videos.
2 - The Apex library, for mixed precision training and fp16 conversion. For installation and more information, see https://github.com/NVIDIA/apex.
1 - Implement attention and compare the performance.
2 - Visualize where the model focuses.