This repository contains the code and resources for the project titled "Designing Classic and Deep Models for Anti-Spoofing Algorithm", supervised by Dr. Mohammadi as part of the Computer Vision course. The project involves building and evaluating models to detect face spoofing using both classical and deep learning approaches.
The dataset used in this project is a combination of the FASD-CASIA dataset and additional anti-spoofing-face-fake images. The combined dataset includes images and videos labeled as either `real` or `fake`.
- Uploading and Loading Data: The dataset is uploaded to the Hugging Face platform and loaded into Google Colab for processing.
- Labeling: Image files are labeled based on their filenames: `***_fake.jpg` → Label: 0, `***_real.jpg` → Label: 1.
- Frame Extraction: For the testing phase, frames are extracted from videos. A random frame from each video is saved with a label indicating whether it is `real` or `fake` (a sketch of the labeling and frame-extraction steps follows this list).
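A minimal sketch of the labeling and frame-extraction steps described above. It assumes the videos follow the same `*_real` / `*_fake` naming convention as the images; paths and file names are illustrative, not the repository's actual layout.

```python
import os
import random
import cv2  # opencv-python

def label_from_filename(path):
    """Return 0 for *_fake files and 1 for *_real files."""
    stem = os.path.splitext(os.path.basename(path))[0]
    if stem.endswith("_fake"):
        return 0
    if stem.endswith("_real"):
        return 1
    raise ValueError(f"Unexpected file name: {path}")

def save_random_frame(video_path, out_dir):
    """Grab one random frame from a video and save it together with its label."""
    cap = cv2.VideoCapture(video_path)
    n_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.set(cv2.CAP_PROP_POS_FRAMES, random.randrange(max(n_frames, 1)))
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError(f"Could not read a frame from {video_path}")
    label = label_from_filename(video_path)
    stem = os.path.splitext(os.path.basename(video_path))[0]
    cv2.imwrite(os.path.join(out_dir, f"{stem}_label{label}.jpg"), frame)
    return label
```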
- ResNet-50:
  - Pretrained on ImageNet.
  - Modified by replacing the fully connected (FC) layer with a Global Average Pooling (GAP) layer, followed by a dense layer with 1024 neurons and a final FC layer with 2 neurons for classification (a sketch of this modified head appears below).
- ViT-Base-16/224 (Google):
  - Pretrained Vision Transformer model.
  - Fine-tuned using the project dataset.
  - Data preprocessing includes resizing images to `224x224` and normalizing pixel values.
  - Labels are converted to one-hot encoded format (see the preprocessing sketch below).
  - The models are trained on the training set and evaluated on the test set, including both random and cropped frames.
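The ResNet-50 head modification described above could look roughly like the following Keras sketch. The input size, optimizer, and loss are assumptions for illustration; the notebooks define the exact settings.

```python
import tensorflow as tf

# ResNet-50 backbone pretrained on ImageNet, without its original FC head
# (the 224x224 input size is an assumption for illustration)
backbone = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3)
)

# Replacement head: Global Average Pooling -> Dense(1024) -> 2-way classifier
model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1024, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),  # fake vs. real
])

# Optimizer and loss are placeholders; labels are assumed one-hot encoded
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```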
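The ViT preprocessing (resize to 224x224, pixel normalization, one-hot labels) might be sketched as below, assuming the Hugging Face `google/vit-base-patch16-224` checkpoint; the helper function is illustrative, not the notebooks' exact code.

```python
import numpy as np
from PIL import Image
from transformers import ViTImageProcessor

# The processor bundles the resize-to-224x224 and pixel-normalization steps
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")

def preprocess(image_path, label):
    """Prepare one image and its one-hot label for ViT fine-tuning."""
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="np")["pixel_values"][0]
    one_hot = np.eye(2)[label]  # label 0 (fake) or 1 (real) -> one-hot vector
    return pixel_values, one_hot
```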
- CNN:
  - A simple CNN model with several convolutional layers followed by fully connected layers.
  - Extracts features such as frequency, Local Binary Patterns (LBP), depth, and statistical features (an LBP feature example is sketched below).
- InceptionV3:
  - Another model tested, with an input size of `75x75`.
  - Evaluated similarly to the other models.
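For the hand-crafted features mentioned in the CNN item, LBP and simple statistical features can be computed along these lines. This is an illustrative sketch using scikit-image, not the notebooks' exact pipeline.

```python
import numpy as np
import cv2
from skimage.feature import local_binary_pattern

def lbp_and_stats(image_bgr, points=8, radius=1):
    """Compute an LBP histogram plus simple statistical features for one image."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)

    # Local Binary Pattern histogram (uniform patterns)
    lbp = local_binary_pattern(gray, points, radius, method="uniform")
    hist, _ = np.histogram(lbp, bins=points + 2, range=(0, points + 2), density=True)

    # Simple statistical descriptors of the grayscale image
    stats = np.array([gray.mean(), gray.std(), np.median(gray)])

    return np.concatenate([hist, stats])
```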
- Evaluation Metrics (ResNet-50 and ViT):
  - Accuracy on random frames.
  - Accuracy on cropped frames, using MTCNN for face detection and cropping (see the face-cropping sketch below).
- Evaluation Metrics (CNN and InceptionV3):
  - Feature extraction on a subset of the dataset.
  - Accuracy on raw and cropped frames.
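Cropped-frame evaluation relies on MTCNN to detect and crop the face before classification. A minimal sketch with the `mtcnn` package; the margin and fallback behavior are assumptions.

```python
import cv2
from mtcnn import MTCNN

detector = MTCNN()

def crop_face(frame_bgr, margin=10):
    """Detect the most confident face in a frame and return the cropped region."""
    faces = detector.detect_faces(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if not faces:
        return None  # no face found; the raw frame could be used instead
    best = max(faces, key=lambda f: f["confidence"])
    x, y, w, h = best["box"]
    x, y = max(x - margin, 0), max(y - margin, 0)
    return frame_bgr[y:y + h + 2 * margin, x:x + w + 2 * margin]
```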
- The ResNet-50 model achieved an accuracy of approximately 72% on cropped test images.
- The ViT model underwent fine-tuning for improved performance.
- The CNN model reached an accuracy of 65% on raw test frames.
- InceptionV3 was also evaluated; its results are detailed in the CSV files in `/results/`.
- `/data/`: Contains the dataset images and videos.
- `/documents/`: Project report (Farsi).
- `/notebooks/`: Jupyter notebooks for data preprocessing, model training, and evaluation.
- `/results/`: CSV files with predictions and accuracy metrics for the models.
- Clone the repository: `git clone https://github.com/Bahareh0281/Video-Based-Face-Liveness-Detection.git`, then `cd Video-Based-Face-Liveness-Detection`.
- Install dependencies: Ensure you have Python 3 and the required libraries installed.
- Run the notebooks: Navigate to the `/notebooks/` directory and open the relevant notebook in Google Colab or Jupyter.
- Evaluate the models: Follow the instructions in the notebooks to train and evaluate the models using your dataset.
This project was conducted under the supervision of Dr. Mohammadi as part of the Computer Vision course.
This project is licensed under the MIT License. See the LICENSE file for details.