Image ATM (Automated Tagging Machine)


Image ATM is a one-click tool that automates the workflow of a typical image classification pipeline in an opinionated way. This includes:

  • Preprocessing and validating input images and labels
  • Starting/terminating a cloud instance with GPU support
  • Training
  • Model evaluation

Read the documentation at: https://idealo.github.io/imageatm/

Image ATM is compatible with Python 3.6 and is distributed under the Apache 2.0 license.

Installation

There are two ways to install Image ATM:

  • Install Image ATM from PyPI (recommended):
pip install imageatm
  • Install Image ATM from the GitHub source:
git clone https://github.com/idealo/imageatm.git
cd imageatm
python setup.py install

Usage

Train with CLI

Run this in your terminal:

imageatm pipeline config/config_file.yml

Train without CLI

Run the data preparation:

from imageatm.components import DataPrep

dp = DataPrep(
    samples_file='sample_configfile.json',
    image_dir='sample_dataset/',
    job_dir='sample_jobdir/'
)
dp.run(resize=True)
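
The samples file passed to DataPrep is a JSON file listing the images and their labels. As a rough, hypothetical sketch (assuming entries of the form {"image_id": ..., "label": ...}; check the documentation for the exact schema), such a file could be generated like this:

import json

# Hypothetical sketch: each entry pairs an image filename (relative to
# image_dir) with its label. The key names are an assumption; consult the
# documentation for the exact schema your Image ATM version expects.
samples = [
    {'image_id': 'image_1.jpg', 'label': 'helmet'},
    {'image_id': 'image_2.jpg', 'label': 'no_helmet'},
]

with open('sample_configfile.json', 'w') as f:
    json.dump(samples, f, indent=2)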

Run the training:

from imageatm.components import Training

trainer = Training(image_dir=dp.image_dir, job_dir=dp.job_dir)
trainer.run()

Run the evaluation:

from imageatm.components import Evaluation

evaluator = Evaluation(image_dir=dp.image_dir, job_dir=dp.job_dir)
evaluator.run()

Test

Test execution is triggered by these commands:

pip install -e ".[tests, docs]"
pytest -vs --cov=imageatm --show-capture=no --disable-pytest-warnings tests/

Transfer learning

The following pretrained CNNs from Keras can be used for transfer learning in Image ATM (a selection sketch follows the list):

  • Xception
  • VGG16
  • VGG19
  • ResNet50, ResNet101, ResNet152
  • ResNet50V2, ResNet101V2, ResNet152V2
  • ResNeXt50, ResNeXt101
  • InceptionV3
  • InceptionResNetV2
  • MobileNet
  • MobileNetV2
  • DenseNet121, DenseNet169, DenseNet201
  • NASNetLarge, NASNetMobile
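
The backbone is selected through the training configuration. As a non-authoritative sketch, assuming the Training component accepts a base_model_name argument (check the documentation for the exact parameter name), switching backbones might look like this:

from imageatm.components import Training

# Sketch only: base_model_name is assumed to be the relevant argument;
# any of the Keras models listed above could be passed here.
trainer = Training(
    image_dir='sample_dataset/',
    job_dir='sample_jobdir/',
    base_model_name='MobileNet',
)
trainer.run()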

Training is split into two phases: first only the last dense layer is trained, then all layers are trained.

For each phase the learning rate is reduced after a patience period if no improvement in validation accuracy has been observed. The patience period depends on the average number of samples per class (n_per_class):

  • if n_per_class < 200: patience = 5 epochs
  • if 200 <= n_per_class < 500: patience = 4 epochs
  • if n_per_class >= 500: patience = 2 epochs

Training is stopped early after a patience period of three times the learning-rate patience, which allows for two learning-rate reductions before training ends.
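
The schedule above can be restated as a small helper (a plain summary of the rules described here, not Image ATM's internal code):

def patience_schedule(n_per_class: int) -> tuple:
    """Return (learning-rate patience, early-stopping patience) in epochs."""
    if n_per_class < 200:
        lr_patience = 5
    elif n_per_class < 500:
        lr_patience = 4
    else:
        lr_patience = 2
    # Early stopping waits three times as long as the learning-rate patience,
    # which leaves room for two learning-rate reductions before training ends.
    return lr_patience, 3 * lr_patience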

Contribute

We welcome all kinds of contributions. See the Contribution guide for more details.

Bump version

To bump the version, run

bumpversion {part} setup.py

where {part} is typically major, minor, or patch.

Cite this work

Please cite Image ATM in your publications if it is useful for your research. Here is an example BibTeX entry:

@misc{idealods2019imageatm,
  title={Image ATM},
  author={Christopher Lennan and Malgorzata Adamczyk and Gunar Maiwald and Dat Tran},
  year={2019},
  howpublished={\url{https://github.com/idealo/imageatm}},
}

Maintainers

Copyright

See LICENSE for details.

TO-DOs:

  • We are currently using Keras 2.2. The plan is to switch to tf.keras once TF 2.0 is out; currently tf.keras is buggy, especially with model saving/loading (tensorflow/tensorflow#22697).