
The model is built on ResNet50 (https://iq.opengenus.org/resnet50-architecture/) and trained together with RNN/LSTM layers. It was trained on the Flickr8K image dataset. A custom data generator was used during training to keep RAM usage in check. All results were obtained on an NVIDIA 1060 3GB GPU. Major libraries used(…


AkshayPS12/Image-Caption-Generator-Data-Science-


Image-Caption-Generator-Data-Science-

The model is built on ResNet50 and trained together with RNN/LSTM layers.

The model was trained on the Flickr8K image dataset.

A custom data generator was used during training to keep RAM usage in check.

All results were obtained on an NVIDIA 1060 3GB GPU.
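
The repository's own data generator lives in main.ipynb and is not reproduced here. As a rough illustration of why a generator helps with RAM, the sketch below assumes the common captioning setup in which every caption is expanded into (image features, caption prefix) -> next word samples and yielded in small batches, so the fully expanded dataset never has to sit in memory. The function name `caption_batches`, the dictionary layouts, and the toy data are all hypothetical. A generator fed to Keras would normally loop forever and yield NumPy arrays; this version uses plain lists and a single pass to stay self-contained.

```python
def caption_batches(captions, features, max_len, batch_size=3):
    """Lazily yield ([image_feats, padded_prefixes], next_words) batches.

    captions: {image_id: [token-id list, ...]}   (hypothetical layout)
    features: {image_id: feature vector}         (e.g. precomputed CNN output)
    """
    feats, prefixes, targets = [], [], []
    for image_id, token_lists in captions.items():
        for tokens in token_lists:
            # A caption of n tokens produces n-1 (prefix -> next word) samples.
            for i in range(1, len(tokens)):
                prefix = tokens[:i][-max_len:]
                padded = [0] * (max_len - len(prefix)) + prefix  # left-pad with 0
                feats.append(features[image_id])
                prefixes.append(padded)
                targets.append(tokens[i])
                if len(targets) == batch_size:
                    yield [feats, prefixes], targets
                    # Start fresh lists so the yielded batch is not mutated.
                    feats, prefixes, targets = [], [], []
    if targets:  # final, possibly smaller batch
        yield [feats, prefixes], targets


# Toy data: two images, one caption each, token ids invented for illustration.
captions = {"img1": [[1, 5, 7, 2]], "img2": [[1, 9, 2]]}
features = {"img1": [0.1, 0.2], "img2": [0.3, 0.4]}
batches = list(caption_batches(captions, features, max_len=4))
```

Only one batch of samples exists at a time, which is the property that matters on a machine with limited RAM and a 3GB GPU.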

Table of Contents

  1. Requirements
  2. Training parameters and results
  3. Generated Captions on Test Images

1. Requirements

Recommended system requirements for training the model:

  • A reasonably fast CPU and a GPU (the results here were obtained on an NVIDIA 1060 3GB)
  • At least 16 GB of RAM
  • An active internet connection so that Keras can download the model weights

Python version and libraries used while building and testing this project:

  • Python - 3.6.7
  • NumPy
  • TensorFlow (GPU) - 2.1.0
  • Keras - 2.3.1
  • Matplotlib
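
Before training, it can help to confirm that the installed packages match the versions listed above. The helper below is a minimal sketch, not part of the repository: `check_environment` and the `EXPECTED` table are hypothetical names, and packages without a pinned version in this README are only checked for presence.

```python
import importlib

# Versions taken from the list above; None means the README does not pin one.
EXPECTED = {
    "numpy": None,
    "tensorflow": "2.1.0",
    "keras": "2.3.1",
    "matplotlib": None,
}


def check_environment(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is not installed at all.
            report[name] = (None, False)
            continue
        found = getattr(module, "__version__", "unknown")
        report[name] = (found, wanted is None or found == wanted)
    return report
```

Calling `check_environment()` returns one entry per package; any entry whose second element is `False` is either missing or at a different version than the one used for this project.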

2. Training parameters and results

NOTE

Model & Config: ResNet50
  • Epochs = 20
  • Batch Size = 3
  • Optimizer = Adam

Argmax: cross-entropy loss (lower is better)
  • Loss (first epoch): 3.1794
  • Loss (second epoch): 2.9504
  • ... and so on; the losses for all epochs are listed in main.ipynb
  • Loss (20th epoch): 2.2446
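
To give the loss values some intuition: assuming they are mean per-word categorical cross-entropy values in natural-log units (which is what Keras reports for this loss), exponentiating them gives the perplexity, roughly the number of equally likely words the model is choosing between at each step. The sketch below only reuses the three losses listed above; the function names are illustrative.

```python
import math


def cross_entropy(probs, target_index):
    """Cross-entropy for one prediction: -ln(probability given to the true word)."""
    return -math.log(probs[target_index])


def perplexity(mean_loss):
    """Perplexity for a mean cross-entropy loss measured in nats."""
    return math.exp(mean_loss)


# Loss values copied from the results above.
reported = {"epoch 1": 3.1794, "epoch 2": 2.9504, "epoch 20": 2.2446}
for label, loss in reported.items():
    print(f"{label}: loss={loss:.4f}  perplexity~{perplexity(loss):.1f}")
```

Under that assumption, the drop from 3.1794 to 2.2446 corresponds to perplexity falling from about 24 to about 9.4.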

3. Generated Captions on Test Images

  • After training, 15 images were selected at random from the test set and captions were generated for them.
  • The images and their generated captions are saved in main.ipynb.
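
The "Argmax" label in the results section suggests that captions are produced by greedy decoding, that is, picking the most probable word at each step; this is an assumption, since the decoding code itself is in main.ipynb. A minimal sketch of that technique follows. `greedy_caption`, `predict_next`, and the toy model are hypothetical stand-ins, not the repository's code.

```python
def greedy_caption(predict_next, start_id, end_id, max_len=20):
    """Build a caption by taking the argmax word at every step.

    predict_next: callable taking the token ids so far and returning a list
    of probabilities over the vocabulary (stands in for the trained model).
    """
    tokens = [start_id]
    for _ in range(max_len):
        probs = predict_next(tokens)
        next_id = max(range(len(probs)), key=probs.__getitem__)  # argmax
        tokens.append(next_id)
        if next_id == end_id:  # stop once the end-of-caption token appears
            break
    return tokens


def toy_model(tokens):
    """Toy stand-in: always favours (last token + 1) in a 5-word vocabulary."""
    probs = [0.1] * 5
    probs[(tokens[-1] + 1) % 5] = 0.6
    return probs


caption_ids = greedy_caption(toy_model, start_id=0, end_id=4)
```

With a real model, `predict_next` would also take the image features, and the resulting ids would be mapped back to words through the tokenizer's vocabulary.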
