This project is part of the Fundamentals of Computer Vision course at KNTU, Spring 2024. The goal is to develop a Convolutional Neural Network (CNN)-based system for detecting and extracting data from Iranian license plates.
- Introduction
- Dataset Preparation
- Data Augmentation
- Model Architecture
- Training the Model
- Evaluation
- Dependencies
- Acknowledgments
- License
Automatic License Plate Recognition (ALPR) is essential for applications such as traffic management, law enforcement, electronic toll collection, and security. This project develops a CNN-based system for Iranian license plates in two stages: a regression model that locates the four corners of the plate, and a classification model that reads the characters from the extracted plate image.
- Labeling the Dataset: The dataset was labeled using Label Studio with the "Semantic Segmentation with Polygons" interface to annotate the four corners of the license plates.
- Accessing the Dataset: The dataset used in this project is not publicly available yet. For access, please contact the course's teacher, B. Nasihatkon, via the course website or personal page.
- Loading the Dataset: The images and labels are loaded from the specified directory and preprocessed to be used in training the model.
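The preprocessing step could look roughly like the sketch below. It assumes the corner annotations have already been read as four (x, y) points in pixel coordinates; the function names `preprocess_image` and `corners_to_target` are hypothetical, and the project's actual loading code may differ.

```python
import numpy as np

def preprocess_image(image):
    """Scale a uint8 HxWx3 image to float32 values in [0, 1]."""
    return image.astype(np.float32) / 255.0

def corners_to_target(corners_px, width, height):
    """Turn four (x, y) pixel corners into a flat, normalized 8-vector.

    Assumption: corners are given in pixels; dividing by the image size
    yields coordinates in [0, 1] that do not depend on image resolution.
    """
    corners = np.asarray(corners_px, dtype=np.float32).reshape(4, 2)
    scale = np.array([width, height], dtype=np.float32)
    return (corners / scale).reshape(8)
```

Normalizing the coordinates keeps the regression targets in a fixed range, which matches the 8 normalized values the regression model outputs.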
Data augmentation is applied to increase the diversity of the training set and help reduce overfitting. The augmentation techniques used include:
- Slight shifts
- Blurs
- Noise injection
- Crops
- Rotations
- Contrast adjustments
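Three of these augmentations can be sketched with NumPy alone, as below. This is an illustration rather than the project's augmentation code: the function names and parameter values are assumptions. The point worth noting is that a geometric augmentation such as a shift must also move the corner labels, while photometric ones (noise, contrast) leave them unchanged.

```python
import numpy as np

def add_gaussian_noise(image, sigma=8.0, rng=None):
    """Add zero-mean Gaussian noise to a uint8 image."""
    if rng is None:
        rng = np.random.default_rng()
    noisy = image.astype(np.float32) + rng.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def adjust_contrast(image, factor):
    """Scale each pixel's deviation from the mean intensity by `factor`."""
    pixels = image.astype(np.float32)
    mean = pixels.mean()
    return np.clip((pixels - mean) * factor + mean, 0, 255).astype(np.uint8)

def shift(image, corners, dx, dy):
    """Shift the image by (dx, dy) pixels and move the corners with it.

    `corners` is a (4, 2) array of normalized (x, y) coordinates.
    Pixels shifted in from outside the frame are filled with zeros.
    """
    h, w = image.shape[:2]
    shifted = np.zeros_like(image)
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    offset = np.array([dx / w, dy / h], dtype=np.float32)
    return shifted, np.asarray(corners, dtype=np.float32) + offset
```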
The regression model is designed to take color images of cars and output the coordinates of the four corners of the license plate. The architecture includes:
- Convolutional Layers: Several convolutional layers with ReLU activation to extract features from the input images.
- Max Pooling Layers: Applied after some convolutional layers to reduce the spatial dimensions and retain important features.
- Fully Connected Layers: Dense layers to process the extracted features and make predictions. Dropout layers are included for regularization.
- Output Layer: A final dense layer outputs 8 values, representing the normalized coordinates of the four corners of the license plate.
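A minimal Keras sketch of a network with this shape follows. The input size, filter counts, dense width, dropout rate, and the sigmoid output activation are illustrative assumptions, not the values used in the project; `build_corner_regressor` is a hypothetical name.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_corner_regressor(input_shape=(256, 256, 3)):
    """Conv/pool feature extractor followed by dense layers and 8 outputs."""
    inputs = tf.keras.Input(shape=input_shape)
    x = inputs
    # Each stage extracts features, then halves the spatial resolution.
    for filters in (16, 32, 64, 128):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D(2)(x)
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="relu")(x)
    x = layers.Dropout(0.3)(x)
    # Assumption: sigmoid keeps the 8 normalized coordinates inside [0, 1].
    outputs = layers.Dense(8, activation="sigmoid")(x)
    return tf.keras.Model(inputs, outputs, name="corner_regressor")
```

The 8 outputs can be reshaped to (4, 2) to recover the (x, y) pair of each corner.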
The classification model reads the extracted license plate image and identifies the characters. Its architecture includes:
- Convolutional Layers: Multiple convolutional layers with ReLU activation to extract detailed features from the license plate images.
- Max Pooling Layers: Applied after convolutional layers to reduce the dimensions while keeping essential features.
- Reshape Layer: Reshapes the output to obtain a separate feature vector for each character in the license plate.
- Fully Connected Layers for Each Character: Each character's feature vector is processed through individual dense layers.
- Output Layers: Each dense layer is connected to an output layer with softmax activation to classify the characters. The third character is treated as a Persian letter, and the rest as digits.
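The per-character design described above can be sketched in Keras as follows. Several details are assumptions made for illustration: the input size, the filter counts, 8 characters per plate, 32 letter classes, and the `Permute` step, which is added here so that the reshape groups neighboring image columns into one feature vector per character. `build_plate_reader` is a hypothetical name.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_plate_reader(input_shape=(32, 128, 3), num_chars=8, num_letters=32):
    """Shared conv features, then one dense head per character position."""
    inputs = tf.keras.Input(shape=input_shape)
    x = inputs
    for filters in (32, 64, 128):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D(2)(x)
    h, w, c = x.shape[1], x.shape[2], x.shape[3]
    assert w % num_chars == 0, "feature map width must split evenly"
    # Put width first so each character gets a contiguous block of columns.
    x = layers.Permute((2, 1, 3))(x)
    x = layers.Reshape((num_chars, (w // num_chars) * h * c))(x)
    outputs = []
    for i in range(num_chars):
        # The third character (index 2) is a letter; the rest are digits.
        n_classes = num_letters if i == 2 else 10
        feat = layers.Dense(64, activation="relu")(x[:, i, :])
        outputs.append(
            layers.Dense(n_classes, activation="softmax", name=f"char_{i}")(feat)
        )
    return tf.keras.Model(inputs, outputs, name="plate_reader")
```

With the default arguments, the model returns a list of 8 probability vectors, one per character position.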
The models are trained using the following steps:
- Splitting the Data: The dataset is split into training, validation, and test sets.
- Compiling the Model: Both models are compiled with the Adam optimizer. The regression model uses mean squared error loss, and the classification model uses categorical cross-entropy.
- Training: The models are trained with early stopping and model checkpoint callbacks to prevent overfitting and save the best models.
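The three steps above can be sketched as below. The split fractions, patience, epoch count, and checkpoint path are assumed values chosen for illustration, and `split_indices` and `compile_and_fit` are hypothetical helper names, not functions from the project.

```python
import numpy as np
import tensorflow as tf

def split_indices(n, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle sample indices and split them into train/val/test arrays."""
    idx = np.random.default_rng(seed).permutation(n)
    n_test = round(n * test_frac)
    n_val = round(n * val_frac)
    return idx[n_test + n_val:], idx[n_test:n_test + n_val], idx[:n_test]

def compile_and_fit(model, loss, train_data, val_data,
                    checkpoint_path="best_model.keras", epochs=100):
    """Compile with Adam, then train with early stopping and checkpointing.

    `loss` would be "mse" for the regression model and
    "categorical_crossentropy" for the classification model.
    `train_data` and `val_data` are (inputs, targets) tuples.
    """
    model.compile(optimizer=tf.keras.optimizers.Adam(), loss=loss)
    callbacks = [
        # Stop when validation loss stops improving; keep the best weights.
        tf.keras.callbacks.EarlyStopping(
            monitor="val_loss", patience=10, restore_best_weights=True),
        # Save the model only when validation loss improves.
        tf.keras.callbacks.ModelCheckpoint(
            checkpoint_path, monitor="val_loss", save_best_only=True),
    ]
    return model.fit(*train_data, validation_data=val_data,
                     epochs=epochs, callbacks=callbacks)
```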
The models are evaluated on the test set using Mean Absolute Error (MAE) for the regression model and accuracy for the classification model. Visualization functions are used to display the predictions and extracted license plates.
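The two metrics can be computed with NumPy as in the sketch below. The text does not say whether accuracy is measured per character or per plate, so this example returns both as an assumption; the function names are hypothetical.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted coordinates."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def character_accuracy(true_labels, pred_probs):
    """Return (per-character accuracy, whole-plate accuracy).

    true_labels: (n_plates, n_chars) integer class indices.
    pred_probs:  list with one (n_plates, n_classes) array per character.
    """
    pred = np.stack([p.argmax(axis=1) for p in pred_probs], axis=1)
    correct = pred == np.asarray(true_labels)
    # A plate counts as correct only if every character is correct.
    return float(correct.mean()), float(correct.all(axis=1).mean())
```

Because the corner coordinates are normalized, the MAE is expressed as a fraction of the image size rather than in pixels.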
- Python 3.x
- TensorFlow 2.x
- NumPy
- OpenCV
- Matplotlib
- WandB
- Hugging Face Hub
Install the dependencies using:
pip install tensorflow numpy opencv-python matplotlib wandb huggingface_hub
This project was directed by Mahdi Lotfi and guided by the course instructor, B. Nasihatkon. Special thanks to the contributors Alireza Honardoost, Morteza Hajiabadi, and Kasra Davoodi for their efforts.
This project is licensed under the MIT License. See the LICENSE file for details.
The trained regression model is available on Hugging Face: KNTU-VC-4022-License-Plate-Recognition