This project is being carried out as part of the course Fundamentals of Data Processing.
Note: The project itself is written in Polish.
The purpose of the project is to classify a patient as healthy or sick based on patient medical data, including morphological test results. The work falls into two parts: analysis of the data, and prediction on the processed data, guided by the findings of that analysis.
A similar project was previously carried out in SAS, so an additional goal is to reproduce those analyses in Python and compare selected results between the two. The repository also includes a report on the SAS version of the project.
The data on patients screened for thyroid disease come from the UCI Machine Learning Repository and were provided in 1987 by the Garavan Institute and J. Ross Quinlan of the New South Wales Institute in Sydney, Australia. The project uses the sick dataset, which contains 3772 records (patients).
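As a rough illustration of the first step of the analysis, the sketch below parses a few rows and counts classes and missing values. It is not the project's actual code. The row layout is an assumption (comma-separated fields, `?` for a missing value, and a last field of the form `<class>.|<record id>`), the sample rows are made up rather than taken from the dataset, and `parse_rows` is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only; they are NOT real records.
# Assumed layout: comma-separated fields, "?" marks a missing value,
# and the final field looks like "<class>.|<record id>".
SAMPLE = """\
41,F,1.3,?,negative.|3733
70,M,0.4,1.1,sick.|1442
23,F,?,2.0,negative.|2965
"""


def parse_rows(text):
    """Return a list of (features, label) pairs; "?" becomes None."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        *features, target = row
        # Drop the record id after "|" and the trailing "." of the class name.
        label = target.split("|")[0].rstrip(".")
        features = [None if value == "?" else value for value in features]
        records.append((features, label))
    return records


records = parse_rows(SAMPLE)

# Class balance and per-row missing-value counts are typical first checks
# before any model is trained.
class_counts = Counter(label for _, label in records)
missing_per_row = [sum(v is None for v in feats) for feats, _ in records]

print(class_counts)
print(missing_per_row)
```

To try this on the real file, the `SAMPLE` string would be replaced by the file contents, after checking that the file actually follows the assumed layout.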