The model uses ResNet50 for image features and RNN/LSTM layers for caption generation.
The model was trained on the Flickr8K image dataset.
A custom data generator was used during training to keep RAM usage under control.
All reported results were obtained on an NVIDIA 1060 3GB GPU.
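The idea behind a memory-friendly data generator can be sketched as follows. This is a minimal illustration, not the project's actual generator: the names `batch_generator`, `samples`, and `load` are hypothetical, and the "loader" in the usage lines is a stand-in for reading image features and captions from disk. The point is that only the examples for the current batch are loaded, so memory use depends on the batch size rather than on the dataset size.

```python
import random


def batch_generator(samples, batch_size, load=lambda s: s, shuffle=True, seed=None):
    """Yield batches of loaded samples forever, one batch in memory at a time.

    `samples` is a list of lightweight keys (for example file names);
    `load` (hypothetical) turns one key into the actual training example.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if not samples:
        raise ValueError("samples must not be empty")
    rng = random.Random(seed)
    order = list(range(len(samples)))
    while True:  # training generators are usually expected to loop indefinitely
        if shuffle:
            rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch_ids = order[start:start + batch_size]
            # Load only the examples needed for this batch.
            yield [load(samples[i]) for i in batch_ids]


# Toy usage: the "loader" just upper-cases a file name.
gen = batch_generator(["a.jpg", "b.jpg", "c.jpg"], batch_size=2,
                      load=str.upper, shuffle=False)
first = next(gen)   # ['A.JPG', 'B.JPG']
second = next(gen)  # ['C.JPG']
```

In a real training run the `load` callable would return arrays rather than strings, and the batch would be stacked into the input and target tensors the model expects.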
Recommended system requirements to train the model:
- A good CPU and a GPU
- At least 16 GB of RAM
- An active internet connection so that Keras can download the model weights
Python libraries used while building and testing this project:
- Python - 3.6.7
- Numpy
- TensorFlow (GPU) - 2.1.0
- Keras - 2.3.1
- Matplotlib
Model & Config | Argmax |
---|---|
ResNet50 | (Lower the better) |
- After training, 15 random images were selected from the test set and captions were generated for them.
- The images and their generated captions are saved in the main.ipynb file.
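
Assuming "Argmax" in the results table refers to greedy decoding, caption generation can be sketched as below: at each step the most probable next word is appended until an end token appears or a length limit is reached. Everything here is illustrative; `greedy_decode`, `toy_predict_next`, and the four-word vocabulary are hypothetical stand-ins for the trained model and its tokenizer.

```python
def greedy_decode(predict_next, start_id, end_id, max_len):
    """Build a token sequence by always taking the most probable next token.

    `predict_next(sequence)` must return a list of probabilities, one per
    vocabulary index, for the token that follows `sequence`.
    """
    sequence = [start_id]
    while len(sequence) < max_len:
        probs = predict_next(sequence)
        # Argmax over the vocabulary: index of the highest probability.
        next_id = max(range(len(probs)), key=probs.__getitem__)
        sequence.append(next_id)
        if next_id == end_id:
            break
    return sequence


# Toy stand-in for a trained model: the next-word distribution depends
# only on the previous word.
VOCAB = ["<start>", "a", "dog", "<end>"]
TRANSITIONS = {
    0: [0.0, 0.9, 0.05, 0.05],  # after <start>: most likely "a"
    1: [0.0, 0.1, 0.8, 0.1],    # after "a": most likely "dog"
    2: [0.0, 0.0, 0.1, 0.9],    # after "dog": most likely <end>
}


def toy_predict_next(sequence):
    return TRANSITIONS[sequence[-1]]


ids = greedy_decode(toy_predict_next, start_id=0, end_id=3, max_len=10)
caption = " ".join(VOCAB[i] for i in ids[1:-1])  # "a dog"
```

With a real captioning model, `predict_next` would also take the image features as input; the decoding loop itself stays the same.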