This repository provides tools for preprocessing, splitting, augmenting, and visualizing image datasets for machine learning tasks.
- Dataset Splitting: Automatically splits datasets into training, validation, and test sets.
- Image Augmentation: Applies transformations like rotation, flipping, cropping, color jitter, and noise addition.
- Visualization: Generates bar charts, heatmaps, and sample image displays for dataset analysis.
- `augmentation_to_image.py`: Script for augmenting images in the training set and saving the results.
- `dataset_visualization.ipynb`: Notebook for splitting datasets, generating visualizations, and displaying sample images.
- `.gitignore`: Specifies files and directories to exclude from version control.
To prepare the dataset with the `dataset_visualization.ipynb` notebook:
- Download the dataset.
- Place your dataset in the `dataset/` folder, structured with subfolders for each class.
- Run `dataset_visualization.ipynb` to split the dataset into `train`, `validation`, and `test` sets.
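The splitting step can be sketched roughly as follows. This is not the notebook's actual code: the function name `split_dataset`, the 70/15/15 default ratios, and the fixed seed are assumptions made for illustration, and only the one-subfolder-per-class layout comes from this README.

```python
import random
import shutil
from pathlib import Path


def split_dataset(source_dir, output_dir, ratios=(0.7, 0.15, 0.15), seed=42):
    """Copy each class folder's files into train/validation/test subfolders.

    Hypothetical helper: the ratios and seed are illustrative defaults.
    """
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    split_names = ("train", "validation", "test")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    counts = {name: 0 for name in split_names}

    for class_dir in sorted(p for p in source_dir.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)

        # Sizes of the train and validation slices; the remainder is test.
        n_train = int(len(files) * ratios[0])
        n_val = int(len(files) * ratios[1])
        parts = {
            "train": files[:n_train],
            "validation": files[n_train:n_train + n_val],
            "test": files[n_train + n_val:],
        }

        for name, members in parts.items():
            target = output_dir / name / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            for f in members:
                shutil.copy2(f, target / f.name)
            counts[name] += len(members)

    return counts
```

Calling `split_dataset("dataset", "split_data")` would then produce `split_data/train`, `split_data/validation`, and `split_data/test`, each with one subfolder per class. Splitting per class rather than over the whole file list keeps the class proportions similar across the three sets.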
Run the `augmentation_to_image.py` script:
python augmentation_to_image.py
Augmented images will be saved in `split_data/augmented_train`.
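To make the listed transformations concrete, here is a dependency-free sketch of flipping, rotation, cropping, and noise addition. It treats a grayscale image as a list of rows of 0..255 integers, which is an assumption for illustration only. `augmentation_to_image.py` presumably works on real image files through an image library, and all function names below are hypothetical.

```python
import random


def flip_horizontal(image):
    """Mirror the image left to right by reversing every row."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate a rectangular image a quarter turn clockwise."""
    # Reversing the row order and then transposing gives a clockwise turn.
    return [list(col) for col in zip(*image[::-1])]


def center_crop(image, height, width):
    """Cut a height x width window out of the middle of the image."""
    top = (len(image) - height) // 2
    left = (len(image[0]) - width) // 2
    return [row[left:left + width] for row in image[top:top + height]]


def add_noise(image, amount=10, seed=0):
    """Add uniform integer noise in [-amount, amount], clamped to 0..255."""
    rng = random.Random(seed)
    return [
        [min(255, max(0, px + rng.randint(-amount, amount))) for px in row]
        for row in image
    ]


if __name__ == "__main__":
    sample = [[0, 50, 100], [150, 200, 250]]
    print(flip_horizontal(sample))
    print(rotate_90_clockwise(sample))
    print(add_noise(sample))
```

Each function returns a new image and leaves its input unchanged, so the transformations can be chained or applied independently to the same source image.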
Use the `dataset_visualization.ipynb` notebook to:
- Generate bar charts and heatmaps for dataset distribution.
- Display random sample images from each dataset split.
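The data behind a distribution chart is just a count of images per class in each split. The sketch below gathers those counts and prints a plain-text bar chart. It is an illustration rather than the notebook's code: `count_images`, `print_bar_chart`, and the extension list are assumed names and values, and the notebook itself presumably draws its charts with a plotting library.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images(split_root, splits=("train", "validation", "test")):
    """Return {split: {class_name: image_count}} for the folders that exist."""
    split_root = Path(split_root)
    table = {}
    for split in splits:
        split_dir = split_root / split
        if not split_dir.is_dir():
            continue  # skip splits that have not been created yet
        table[split] = {
            class_dir.name: sum(
                1 for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
            for class_dir in sorted(split_dir.iterdir())
            if class_dir.is_dir()
        }
    return table


def print_bar_chart(table, width=40):
    """Print one text bar per class, scaled to the largest count."""
    largest = max(
        (n for counts in table.values() for n in counts.values()), default=0
    )
    for split, counts in table.items():
        print(split)
        for name, n in counts.items():
            bar = "#" * (n * width // largest if largest else 0)
            print(f"  {name:<20} {bar} {n}")


if __name__ == "__main__":
    print_bar_chart(count_images("split_data"))
```

The nested dictionary returned by `count_images` can also be passed to a plotting library to draw bar charts or a split-by-class heatmap.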
For folders containing `main.py`, use that script to train models and generate outputs.
- Note: Adjust the dataset paths in `main.py` to match your directory structure before running.
To evaluate the results:
- Run the `load_and_test.py` script in the respective folders.
- This script loads the trained model and evaluates its performance on the test dataset.
Install the required Python libraries with:
pip install -r requirements.txt
code_repo/
├── augmentation_to_image.py
├── dataset_visualization.ipynb
├── .gitignore
├── requirements.txt
├── split_data/
│   ├── train/
│   ├── validation/
│   ├── test/
│   └── augmented_train/
└── dataset/
    └── Labeled Data/
- Modify directory paths in scripts as needed for your environment.
- Ensure the dataset folder is organized with subfolders for each class.
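A quick way to confirm the expected layout before running anything is a small check like the one below. It is a hypothetical helper, not part of this repository: the name `check_dataset_layout` and the specific rules it applies (no loose files at the top level, no empty class folders) are assumptions.

```python
from pathlib import Path


def check_dataset_layout(dataset_dir):
    """Return a list of human-readable problems; an empty list means OK."""
    dataset_dir = Path(dataset_dir)
    if not dataset_dir.is_dir():
        return [f"{dataset_dir} is not a directory"]

    problems = []
    entries = sorted(dataset_dir.iterdir())
    class_dirs = [p for p in entries if p.is_dir()]

    if not class_dirs:
        problems.append("no class subfolders found")
    for p in entries:
        # Hidden files such as .DS_Store are ignored.
        if p.is_file() and not p.name.startswith("."):
            problems.append(f"file outside any class folder: {p.name}")
    for class_dir in class_dirs:
        if not any(f.is_file() for f in class_dir.iterdir()):
            problems.append(f"class folder is empty: {class_dir.name}")
    return problems


if __name__ == "__main__":
    for problem in check_dataset_layout("dataset"):
        print(problem)
```

Returning a list instead of raising on the first issue lets you see every layout problem in a single run.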
This project is licensed under the MIT License.