Ever wanted to censor images on the web using deep neural networks?
A deep learning project by Artur Puzio and Grzegorz Uriasz, made as part of an internship at deepsense.ai, sponsored by The Polish Children's Fund and supervised by Piotr Migdał.
A browser extension is available for Mozilla Firefox. Try it here.
- Create a deep learning model for detecting trypophobia triggers suitable for running on CPU
- Create a plug-and-play browser plugin, running entirely client-side, for censoring trypophobic images on the fly while browsing the internet
- Prepare a high-quality data set for training trypophobia classifiers, combining several data sources
- Create a utility for scraping images from Google Images
- Create utilities for quick image sorting and image normalization
- Create a browser plugin using the WebExtension API capable of censoring images on the fly
- Create neural networks suitable for running on a CPU in JavaScript
- One global browser-wide Keras.js instance in the browser plugin, cache predictions based on image fingerprints, create a settings page
- Polish the browser plugin and publish it in the plugin store(s)
The utilities contained in the utils folder are small programs and scripts useful for generating the data set and easing the usage of the deep learning lab Neptune.
Note: The provided images may or may not be subject to copyright. By downloading the dataset you agree to use it only for research purposes.
- 6.5k trypophobia-triggering images obtained from:
  - 6k Reddit (/r/trypophobia) - using Prawtimestamps by voussoir, wget and Ripme by 4pr0n
  - 546 Google Images (keyword: trypophobia) - using our own scraper
- 10.5k neutral images obtained from Google Images using our own scraper:
  - 10k by supplying it with 5k randomly chosen words from this English dictionary and downloading 2 images per word
  - 192 with bushes keyword (introduced in v2 to eliminate false positives for greenery)
  - 181 with grass keyword (since v2)
  - 98 with forest keyword (since v2)
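The keyword selection for the neutral images can be sketched roughly as follows. This is only an illustration, not the project's scraper: the function names `sample_keywords` and `download_plan`, the seed, and the stand-in word list are all hypothetical, and the actual download step is left out.

```python
import random


def sample_keywords(words, k, seed=None):
    """Pick k distinct search keywords uniformly at random (hypothetical helper)."""
    rng = random.Random(seed)
    # Normalize and de-duplicate so the same word is never queried twice.
    unique = sorted({w.strip().lower() for w in words if w.strip()})
    return rng.sample(unique, k)


def download_plan(keywords, images_per_word):
    """Pair every keyword with the number of images to fetch for it."""
    return [(word, images_per_word) for word in keywords]


# Stand-in for a real dictionary file (assumption: one word per entry).
dictionary = [f"word{i}" for i in range(20000)]

keywords = sample_keywords(dictionary, k=5000, seed=0)
plan = download_plan(keywords, images_per_word=2)
total_images = sum(count for _, count in plan)
print(len(keywords), total_images)
```

With 5k keywords and 2 images per word the plan covers 10k images, matching the figure quoted above.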
Images have been divided into 4 folders:
- /valid/trypo: 500 random trypophobia-triggering images
- /valid/norm: 500 random neutral images
- /train/trypo: rest of the trypophobia-triggering images
- /train/norm: rest of the neutral images
- Non-image files and animated images have been removed using two of our tools.
- Downloaded images have been de-duplicated using md5 hashes and a fuzzy deduplication tool available in digiKam.
- Trypophobia triggers were manually checked and isolated from random spam using our tool.
- Images have been rescaled (maintaining aspect ratio) and cropped to 256x256 using our tool.
- Images have been split into train and validation sets using our tool.
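The steps above can be illustrated with a minimal sketch. It is not the project's tooling: the function names are hypothetical, and it assumes that "rescaled and cropped" means scaling the shorter side to 256 pixels and then taking a center crop, which the text does not state. Fuzzy deduplication and manual checking are not covered.

```python
import hashlib
import random


def dedupe_by_md5(files):
    """files maps name -> raw bytes; keep the first file seen for each md5 digest."""
    seen = set()
    kept = []
    for name, data in files.items():
        digest = hashlib.md5(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(name)
    return kept


def resize_and_crop_box(width, height, target=256):
    """Scale the shorter side to `target` (keeping aspect ratio), then center-crop.

    Returns (new_width, new_height, (left, top, right, bottom)).
    The center crop is an assumption, not something the source specifies.
    """
    scale = target / min(width, height)
    new_w = max(target, round(width * scale))
    new_h = max(target, round(height * scale))
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    return new_w, new_h, (left, top, left + target, top + target)


def split_train_valid(names, n_valid=500, seed=0):
    """Shuffle and set aside n_valid items for validation; the rest is for training."""
    rng = random.Random(seed)
    shuffled = list(names)
    rng.shuffle(shuffled)
    return shuffled[n_valid:], shuffled[:n_valid]


kept = dedupe_by_md5({"a.jpg": b"same", "b.jpg": b"same", "c.jpg": b"other"})
print(kept)                            # ['a.jpg', 'c.jpg']
print(resize_and_crop_box(512, 1024))  # (256, 512, (0, 128, 256, 384))
train, valid = split_train_valid([f"img{i}.jpg" for i in range(6500)])
print(len(train), len(valid))          # 6000 500
```

The returned crop box uses the (left, top, right, bottom) convention, so it could be passed to an image library's crop call after resizing.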
Anyone interested in the "raw" unprocessed data please send us an email.
The models were made in the Keras machine learning framework and are compatible with the Keras.js JavaScript library. The models were trained on the Google Cloud Platform using Neptune. This repository contains some of the models together with the training results. We examined the performance of models of different sizes and decided to aim for one with fewer than 20k parameters. We achieved up to 90% accuracy and 0.27 log-loss on the validation set. Additionally, some models with <10k parameters came close to achieving these results.
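To give a feel for what a budget of fewer than 20k parameters allows, here is a small sketch that counts trainable parameters using the standard formulas for convolutional and dense layers. The layer stack is purely hypothetical and is not one of the models in this repository.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square kernel x kernel convolution."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit for a fully connected layer."""
    return (inputs + 1) * units


# Hypothetical small CNN for RGB input (assumption, for illustration only).
layers = [
    ("conv 3x3, 3 -> 16", conv2d_params(3, 3, 16)),
    ("conv 3x3, 16 -> 16", conv2d_params(3, 16, 16)),
    ("conv 3x3, 16 -> 32", conv2d_params(3, 16, 32)),
    ("conv 3x3, 32 -> 32", conv2d_params(3, 32, 32)),
    ("global average pooling", 0),  # pooling has no trainable parameters
    ("dense 32 -> 1 (sigmoid)", dense_params(32, 1)),
]

total = sum(count for _, count in layers)
for name, count in layers:
    print(f"{name:28s} {count:6d}")
print(f"{'total':28s} {total:6d}")  # 16689, under the 20k budget
```

Using global average pooling instead of flattening keeps the final dense layer tiny, which is one common way to stay within such a budget.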
The browser plugin censors images encountered while browsing the web. It uses a supplied trained model to decide which images are safe to reveal and which should be hidden behind a warning. The extension is a WebExtension and was tested on Mozilla Firefox. Currently the extension works on most sites. You can try it here.
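The decision logic, combined with the idea of caching predictions by image fingerprint, could look roughly like the sketch below. It is written in Python for brevity, while the extension itself is a JavaScript WebExtension, so it only illustrates the logic. The class name `PredictionCache`, the use of md5 as the fingerprint, the 0.5 threshold, and the fake model are all assumptions.

```python
import hashlib


class PredictionCache:
    """Caches model scores by image fingerprint (md5 here is an assumption)."""

    def __init__(self, predict, threshold=0.5):
        self._predict = predict      # callable: image bytes -> score in [0, 1]
        self._threshold = threshold  # hypothetical cut-off for censoring
        self._scores = {}
        self.model_calls = 0

    def should_censor(self, image_bytes):
        key = hashlib.md5(image_bytes).hexdigest()
        if key not in self._scores:
            # Only run the (expensive) model for images not seen before.
            self._scores[key] = self._predict(image_bytes)
            self.model_calls += 1
        return self._scores[key] >= self._threshold


def fake_model(image_bytes):
    """Stand-in for the trained network; returns a made-up trigger score."""
    return 0.9 if image_bytes.startswith(b"holes") else 0.1


cache = PredictionCache(fake_model)
print(cache.should_censor(b"holes-1"))  # True, model is run
print(cache.should_censor(b"holes-1"))  # True, served from the cache
print(cache.should_censor(b"kitten"))   # False, model is run
print(cache.model_calls)                # 2
```

Caching matters because the same image often appears many times while browsing, and inference on a CPU is the costly step.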