This repository presents a study on applying machine learning to lung auscultation for enhanced diagnostic accuracy. Utilizing a Kaggle dataset, CNNs, ANNs, SVMs, and RFs were implemented and compared for classifying pulmonary disorders, demonstrating ML's potential in advancing precision medicine.
- This repository provides Python and R scripts designed for the classification of lung sounds using various machine learning algorithms.
- The project uses a comprehensive dataset of lung auscultation sounds, analyzing them to detect patterns associated with different respiratory conditions.
- Important: Please follow the provided instructions carefully to ensure proper execution of the scripts, particularly the handling of audio files and model training phases.
Lung auscultation remains a critical diagnostic tool in healthcare, yet the interpretation of lung sounds is prone to variability between clinicians. This project aims to mitigate this issue by leveraging machine learning to enhance diagnostic precision, ensuring consistent and reliable assessments.
- Deep Learning (DL) Algorithms: Implementations of Convolutional Neural Networks (CNNs) and Artificial Neural Networks (ANNs) for lung sound classification.
- Traditional ML Algorithms: Support Vector Machines (SVMs) and Random Forests (RFs) are used for comparison.
- Data Preprocessing: Audio preprocessing techniques including noise reduction, normalization, and feature extraction using Mel-frequency cepstral coefficients (MFCCs).
- Visualization Tools: ROC curves, confusion matrices, and precision-recall curves to evaluate model performance.
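The evaluation tools listed above all derive from counts of true versus predicted labels. The following is a minimal, standard-library sketch of how a confusion matrix and per-class precision and recall can be computed. It is not the repository's own code; the function names and the class labels (`healthy`, `copd`, `pneumonia`) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a nested dict: matrix[true_label][predicted_label] -> count."""
    counts = Counter(zip(y_true, y_pred))
    return {t: {p: counts[(t, p)] for p in labels} for t in labels}


def precision_recall(matrix, label):
    """Compute precision and recall for one class from the matrix."""
    true_positives = matrix[label][label]
    predicted_as_label = sum(matrix[t][label] for t in matrix)  # column sum
    actually_label = sum(matrix[label].values())                # row sum
    precision = true_positives / predicted_as_label if predicted_as_label else 0.0
    recall = true_positives / actually_label if actually_label else 0.0
    return precision, recall


def accuracy(matrix):
    """Fraction of samples on the diagonal of the matrix."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total if total else 0.0


# Hypothetical labels and predictions, for illustration only.
labels = ["healthy", "copd", "pneumonia"]
y_true = ["healthy", "copd", "copd", "pneumonia", "healthy", "copd"]
y_pred = ["healthy", "copd", "pneumonia", "pneumonia", "copd", "copd"]

cm = confusion_matrix(y_true, y_pred, labels)
copd_precision, copd_recall = precision_recall(cm, "copd")
overall_accuracy = accuracy(cm)
```

In practice, scikit-learn (already among the project's Python libraries) offers equivalent functionality; the sketch only makes the underlying arithmetic explicit.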
- Ensure Python 3.x is installed.
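- To classify lung sounds using the CNN or ANN, open the notebook in Jupyter and run its cells (the filename must be quoted because it contains spaces and `&`):
jupyter notebook "CNN & ANN.ipynb"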
- View the output results and confusion matrices for accuracy and performance analysis.
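- Load the dataset and render the R Markdown file (this requires the rmarkdown package; the filename must be quoted because it contains spaces):
Rscript -e 'rmarkdown::render("SVM and RF.Rmd")'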
- Python 3.x for deep learning models
- R 4.4.0 for SVM and Random Forest models
- Libraries: TensorFlow, Keras, scikit-learn, pandas, librosa (for Python) and tidyverse, randomForest, e1071 (for R).
Clone the repository:
git clone https://github.com/AFLucas-UOM/Classification-of-Lung-Respiratory-Sounds.git
- Project Report (PDF)
- Kaggle Dataset for respiratory sound classification.
- CNN: Achieved a classification accuracy of 90% after training for 74 epochs.
- ANN: Demonstrated the highest accuracy of 92% with a batch size of 5.
- SVM: Performed moderately, with an accuracy of 70%.
- RF: Outperformed SVM, achieving a classification accuracy of 74%.
- CNN: 90% accuracy
- ANN: 92% accuracy
- SVM: 70% accuracy
- RF: 74% accuracy
Contributions to improve the code, add new features, or optimize model performance are welcome! Fork the repository, make your changes, and submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
This project was developed as part of the ARI2201 course at the University of Malta, under the supervision of Dr. Kristian Guillaumier.
For inquiries or feedback, please contact Andrea Filiberto Lucas.