The task of this project is to recognize whether an image-text pair matches. To achieve this we built two RNNs and a classifier, using GloVe pre-trained embeddings for the captions and the output of an object-detection step, stored in PyTorch format (.pt), for the images. Precision and accuracy both reach roughly 88% after fifty epochs of training.
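A minimal sketch of how such a two-branch model can be wired together in PyTorch is shown below. The GRU cells, hidden sizes, fusion by concatenation, and all module names are illustrative assumptions, not the exact architecture used in this project.

```python
import torch
import torch.nn as nn

class ImageTextMatcher(nn.Module):
    """Two RNN encoders (one per modality) followed by a binary match classifier.

    Layer sizes and the GRU cell type are assumptions for illustration only.
    """

    def __init__(self, text_emb_dim=300, img_feat_dim=2048, hidden_dim=256):
        super().__init__()
        # RNN over the sequence of GloVe word embeddings of the caption
        self.text_rnn = nn.GRU(text_emb_dim, hidden_dim, batch_first=True)
        # RNN over the sequence of detected-object feature vectors of the image
        self.image_rnn = nn.GRU(img_feat_dim, hidden_dim, batch_first=True)
        # Classifier that decides whether the caption matches the image
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, caption_embeddings, image_features):
        # caption_embeddings: (batch, n_words, text_emb_dim)
        # image_features:     (batch, n_objects, img_feat_dim)
        _, text_h = self.text_rnn(caption_embeddings)
        _, img_h = self.image_rnn(image_features)
        fused = torch.cat([text_h[-1], img_h[-1]], dim=1)
        return self.classifier(fused).squeeze(1)  # logit: match / no match
```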
- Google Colaboratory (Colab)
- Google Drive
- “2014 Train/Val annotations”
- GloVe 6B pre-trained embeddings (glove.6B.zip)
- Image features and classes
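For reference, a hedged sketch of how two of these resources might be loaded: the GloVe text file into a word-to-vector dictionary, and the precomputed image features from a .pt file. The file paths are placeholders for wherever the files sit on Google Drive, not the paths used in the project.

```python
import numpy as np
import torch

def load_glove(path="glove.6B.300d.txt"):
    """Parse a GloVe text file into a word -> vector dictionary."""
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, *values = line.rstrip().split(" ")
            embeddings[word] = np.asarray(values, dtype=np.float32)
    return embeddings

# Image features saved in PyTorch format (.pt); path is a placeholder
image_features = torch.load("image_features.pt")
glove = load_glove()
```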
- speed up processing by using two dictionaries, one mapping image ids to feature vectors and the other mapping caption ids to embeddings (see the caching sketch after this list)
- use flexible padding, for example padding each batch only to its longest sequence instead of a fixed global length (see the padding sketch after this list)
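A sketch of the caching idea from the first point: two plain dictionaries built once, before training, so every epoch becomes a lookup by id instead of a recomputation. The function name, argument layout and id formats are assumptions for illustration.

```python
import numpy as np
import torch

def build_caches(detections, captions, glove):
    """Build the two lookup dictionaries once, before training.

    detections: dict mapping image id -> tensor of detected-object features
    captions:   dict mapping caption id -> caption string
    glove:      dict mapping word -> GloVe vector (numpy array)
    """
    # Image features are already id -> tensor, so a shallow copy is enough.
    image_vectors = dict(detections)

    # Caption embeddings: one (n_words, emb_dim) tensor per caption id.
    caption_embeddings = {}
    for caption_id, text in captions.items():
        words = text.lower().split()
        vectors = [glove[w] for w in words if w in glove]
        caption_embeddings[caption_id] = torch.tensor(np.array(vectors))

    return image_vectors, caption_embeddings
```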
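One common way to realize the flexible-padding point is a DataLoader collate function that pads each batch only to the length of its own longest sequence, rather than to a fixed global length. This is a sketch under that assumption; the tuple layout of the dataset samples is hypothetical.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def collate_batch(batch):
    """Zero-pad captions and image features to the longest item in this batch only."""
    captions, image_feats, labels = zip(*batch)
    caption_lengths = torch.tensor([c.shape[0] for c in captions])
    padded_captions = pad_sequence(captions, batch_first=True)   # pad to batch max
    padded_images = pad_sequence(image_feats, batch_first=True)  # object counts may vary too
    return padded_captions, caption_lengths, padded_images, torch.tensor(labels)

# Usage, assuming a dataset that yields (caption_embeddings, image_features, label) tuples:
# loader = torch.utils.data.DataLoader(dataset, batch_size=32, collate_fn=collate_batch)
```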