- Required Python version: >= 3.7
- Install the necessary packages by running the command below:
pip install -r requirements.txt
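- Before installing, a quick sanity check that the interpreter meets the stated version requirement (this snippet is illustrative, not part of the repository):

```python
# Sanity check against the stated requirement of Python >= 3.7.
import sys

assert sys.version_info >= (3, 7), "This project requires Python 3.7 or newer"
```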
- After completing the installation, you can run inference on an image by executing:
python infer.py --img_path /path/to/your/image --model_path /path/to/the/model --classes number_of_classes
- The original model, trained on the 81 COCO classes, is included in the repository as `mvit_og.pt`. To run it, simply omit the `model_path` and `classes` arguments, since default values are supplied for them. The command then becomes:
python infer.py --img_path /path/to/your/image
- The result is saved as `detected.jpg` in the current folder.
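- For orientation, the sketch below shows what a single-image detection pass generally looks like in PyTorch. It is illustrative only: the checkpoint-loading call, the 320x320 input size, and the raw output format are assumptions, and `infer.py` itself may differ.

```python
# Illustrative single-image pass; infer.py's actual preprocessing and
# post-processing may differ. The input size and loading call are assumptions.
import torch
from PIL import Image
from torchvision import transforms

model = torch.load("mvit_og.pt", map_location="cpu")  # assumes a pickled model object
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((320, 320)),  # assumed input resolution
    transforms.ToTensor(),
])

image = Image.open("/path/to/your/image").convert("RGB")
inputs = preprocess(image).unsqueeze(0)  # shape (1, 3, 320, 320)

with torch.no_grad():
    detections = model(inputs)  # raw head outputs; format depends on the detector
print(detections)
```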
- To run inference on a batch of images, use the `batch` argument, for example:
python infer.py --batch /path/to/your/image_folder
- All detected images are saved in a folder named `dets_full` in the same directory; a hand-rolled version of this loop is sketched below.
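- Purely as an illustration of the batch mode, iterating over a folder and collecting per-image detections could look like this; the glob pattern and model-loading call are assumptions, not `infer.py`'s actual code.

```python
# Illustrative batch loop; infer.py's real implementation may differ.
import glob
import os

import torch
from PIL import Image
from torchvision import transforms

model = torch.load("mvit_og.pt", map_location="cpu")  # assumes a pickled model object
model.eval()
preprocess = transforms.Compose([transforms.Resize((320, 320)), transforms.ToTensor()])

os.makedirs("dets_full", exist_ok=True)  # matches the documented output folder
for path in sorted(glob.glob("/path/to/your/image_folder/*.jpg")):
    image = Image.open(path).convert("RGB")
    with torch.no_grad():
        detections = model(preprocess(image).unsqueeze(0))
    # Rendering boxes is model-specific; a real script would save an
    # annotated copy into dets_full/ at this point.
    print(os.path.basename(path), detections)
```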
- Some detected results:
- To fine-tune the MobileViT detector on a custom dataset, the dataset must be in COCO format.
- The dataset structure should be in the format shown below (an example annotation skeleton follows the tree):
--Dataset
|---> train
|---> test
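- For reference, a COCO-format `annotations.json` follows the standard detection layout sketched below; all ids, file names, and coordinates are placeholders.

```python
# Minimal COCO-format annotation skeleton (standard COCO detection fields).
# Every id, name, and coordinate here is a placeholder.
import json

coco = {
    "images": [
        {"id": 1, "file_name": "img_0001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [100.0, 120.0, 50.0, 80.0],  # [x, y, width, height] in pixels
            "area": 4000.0,
            "iscrowd": 0,
        },
    ],
    "categories": [
        {"id": 1, "name": "person"},
    ],
}

with open("annotations.json", "w") as f:
    json.dump(coco, f, indent=2)
```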
- To train on your dataset, run the command below:
python train.py --path_to_images path/to/dataset --lr 0.01 --epochs 10 --classes no_of_classes_present --batch_size 32 --path_test_annotations path/to/test/annotations.json --path_train_annotations path/to/train/annotations.json --model_path mvit_og.pt
- After training, the script outputs two files:
  a. `loss_trends.jpg`: a plot of the training loss (a sketch of how such a plot is produced follows below).
  b. `mvit.pt`: the learned model weights.
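- As an aside, a training-loss curve like `loss_trends.jpg` is typically produced along these lines; the loss values are placeholders, and this is not `train.py`'s actual code.

```python
# Illustrative loss-curve plot; train.py's actual plotting may differ.
import matplotlib.pyplot as plt

epoch_losses = [2.3, 1.7, 1.3, 1.0, 0.8]  # placeholder per-epoch losses
plt.plot(range(1, len(epoch_losses) + 1), epoch_losses, marker="o")
plt.xlabel("epoch")
plt.ylabel("training loss")
plt.savefig("loss_trends.jpg")
```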
- The `mvit.pt` file can then be used with `infer.py`; just pass the model path, the image (or batch) path, and the number of classes it detects:
python infer.py --model_path path/to/model --classes no_of_classes --img_path /path/to/image/
Citation:
@article{mehta2021mobilevit,
  title={MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer},
  author={Mehta, Sachin and Rastegari, Mohammad},
  journal={arXiv preprint arXiv:2110.02178},
  year={2021}
}