AIRD-Datasets

This repository provides access to preprocessed datasets used in the paper A. Jaiswal, Y. Wu, W. AbdAlmageed, I. Masi, and P. Natarajan, "AIRD: Adversarial Learning Framework for Image Repurposing Detection" (Proceedings of CVPR, 2019). The paper presents an adversarial framework for image repurposing detection and offers the datasets used as a contribution to the community.

Image samples at a glance

Figure: Examples of evidences in the AIRD-Datasets. Samples show supporting evidences under three different domains: (a) Places (Google Landmarks), (b) Faces (IJBC-IRD), and (c) Paintings (Painter by Numbers).

Figure: Examples of fake candidates in the AIRD-Datasets. Samples show confusing fake candidates useful in training and assessing image repurposing detection models under three different domains: (a) Places (Google Landmarks), (b) Faces (IJBC-IRD), and (c) Paintings (Painter by Numbers).

Summary of Datasets

Name	Source	Image Content	Label (# unique)	Training Size	Testing Size	Encoding
Google Landmarks	Kaggle	Indoor/Outdoor Scenes	Landmark ID (13,885)	977,624	238,965	NetVLAD [1] + PCA + L₂-norm
IJBC-IRD	NIST	Cropped & Aligned Faces	Subject ID (1,649)	13,748	2,629	Face-ResNet [2] + PCA + Signed-Square Rooting
Painter by Numbers	Kaggle	Paintings	Artist ID (1,000)	58,701	14,162	ConvNet + L₂-norm
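The Encoding column names two normalization steps, L₂-norm and signed-square rooting. The sketch below only illustrates those two operations in NumPy on a toy array; it is not the authors' pipeline. The function names, the `eps` guard, and the example values are assumptions made for illustration, and the PCA step is omitted because its settings are not given here.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit Euclidean length (eps avoids division by zero)."""
    x = np.asarray(x, dtype=np.float64)
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def signed_sqrt(x):
    """Signed-square rooting: keep each value's sign, take the root of its magnitude."""
    x = np.asarray(x, dtype=np.float64)
    return np.sign(x) * np.sqrt(np.abs(x))


# Toy feature vectors (hypothetical values, not taken from the datasets).
features = np.array([[3.0, -4.0], [0.0, 9.0]])
unit = l2_normalize(features)   # rows now have length 1
rooted = signed_sqrt(features)  # e.g. -4.0 becomes -2.0
```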

Download Links

Download Google Landmarks
Download IJBC-IRD
Download Painter by Numbers

Instructions

The download links provide access to compressed archives (.tar.gz files). Each of these can be uncompressed using:

$ tar xvzf <filename>.tar.gz

This creates a directory containing files for the encodings, the labels, and the precomputed similarity-based retrievals:

Filename	Description	File-type	Loading in Python
encoding_<split>.h5	Image Encodings	HDF5	h5py
metadata_<split>.csv	Labels	CSV	pandas
precomputed_retrievals_<split>.npy	Retrieval Indices	NumPy Binary	NumPy

where <split> is either train or test.
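As a rough sketch of how the three files for one split might be loaded with the libraries listed in the table: the helper name `load_split` is hypothetical, and since the dataset names inside the HDF5 file are not documented here, the code reads every top-level dataset rather than assuming a key.

```python
import os

import h5py
import numpy as np
import pandas as pd


def load_split(data_dir, split):
    """Load encodings, labels, and precomputed retrievals for one split.

    `load_split` is an illustrative helper, not part of the released data.
    """
    encodings = {}
    with h5py.File(os.path.join(data_dir, f"encoding_{split}.h5"), "r") as f:
        # The HDF5 key names are an unknown here, so collect every
        # top-level dataset instead of assuming a particular name.
        for key in f.keys():
            if isinstance(f[key], h5py.Dataset):
                encodings[key] = f[key][()]  # read the full array into memory
    metadata = pd.read_csv(os.path.join(data_dir, f"metadata_{split}.csv"))
    retrievals = np.load(
        os.path.join(data_dir, f"precomputed_retrievals_{split}.npy")
    )
    return encodings, metadata, retrievals
```

A call such as `load_split("<extracted directory>", "train")` would return a dict of arrays, a DataFrame, and a NumPy array. For the larger datasets, reading slices from the open HDF5 file may be preferable to loading everything into memory.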

Precomputed Retrievals: In our experiments, we treated the training data as the reference world dataset as well. Hence, all retrieval entries are indices into the training encoding and metadata files.
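To make the indexing convention concrete, here is a minimal sketch on synthetic arrays. It assumes the retrieval file holds a 2-D integer array with one row of zero-based training indices per query; that layout is not confirmed by this README and should be checked against the actual files. The function name `gather_retrieved` is hypothetical.

```python
import numpy as np


def gather_retrieved(train_encodings, retrievals, query_index):
    """Return the training indices retrieved for one query and their encodings.

    Assumes `retrievals` has shape (num_queries, k) and contains zero-based
    row indices into `train_encodings` (an assumption, not documented above).
    """
    idx = np.asarray(retrievals[query_index], dtype=np.int64)
    return idx, train_encodings[idx]


# Synthetic stand-ins for the training encodings and the test-split retrievals.
train_encodings = np.arange(12, dtype=np.float32).reshape(4, 3)
test_retrievals = np.array([[2, 0], [3, 1]])

idx, retrieved = gather_retrieved(train_encodings, test_retrievals, query_index=1)
```

The same indices can be used to select rows of the training metadata (for example with `DataFrame.iloc`), since both the test and train retrieval files point into the training split.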

Citation

Please cite our paper using the following BibTeX entry if you use any of these datasets:

@InProceedings{jaiswal2019aird,
    author = {Jaiswal, Ayush and Wu, Yue and AbdAlmageed, Wael and Masi, Iacopo and Natarajan, Premkumar},
    title = {{AIRD: Adversarial Learning Framework for Image Repurposing Detection}},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2019}
}

References

[1] R. Arandjelovic, P. Gronat, A. Torii, T. Pajdla, and J. Sivic. NetVLAD: CNN architecture for weakly supervised place recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016.

[2] I. Masi, A. T. Tran, T. Hassner, G. Sahin, and G. Medioni. Face-Specific Data Augmentation for Unconstrained Face Recognition. International Journal of Computer Vision (IJCV). 2019.

Disclaimer

We do not claim ownership of the original source data. We only provide encodings of the images and relevant labels to further research in image repurposing detection and semantic integrity assessment of multimedia data.

Contact

If you have any questions, drop an email to [email protected] or [email protected].
