Weather Data Analysis in Ithaca, NY from 2021.01 to 2022.04

Data Analysis & Model Building Report for the INFO 1998 Introduction to Machine Learning Final Project

Authors:
Zhongyi (James) Guo
Zixian (Maggie) Huang
Authors are in no particular order.
Github Repository: https://github.com/JG1ANDONLY/Weather_Data_Analysis_in_Ithaca_NY
Date: 04/27/2022


Introduction & Data Cleaning

In this report, we start from a question: can the daily temperature range support predictions of snowfall? We cleaned the raw dataset to make it easier to work with later and saved the cleaned dataset as final_data.csv.
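As an illustration of what such a cleaning step might look like, here is a minimal sketch using only the Python standard library. The column names (date, tmin, tmax, snow), the sample rows, and the binary snowed label are assumptions made for this example; the report does not list the actual columns or cleaning rules.

```python
import csv
import io

# Hypothetical raw export; the real column names in the project may differ.
RAW_CSV = """date,tmin,tmax,snow
2021-01-01,-5.0,1.0,2.5
2021-01-02,,3.0,0.0
2021-07-04,18.0,29.5,0.0
2021-12-20,-8.0,-2.0,
"""

REQUIRED = ("tmin", "tmax", "snow")


def clean_rows(text):
    """Drop rows with missing values and add derived columns."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        # Skip any row where a required field is empty.
        if any(row[key].strip() == "" for key in REQUIRED):
            continue
        tmin, tmax, snow = (float(row[key]) for key in REQUIRED)
        cleaned.append({
            "date": row["date"],
            "tmin": tmin,
            "tmax": tmax,
            "temp_range": tmax - tmin,  # daily temperature range
            "snowed": int(snow > 0),    # assumed binary target: 1 if any snow fell
        })
    return cleaned


def to_csv_text(rows):
    """Serialize cleaned rows back to CSV text (e.g. to save as final_data.csv)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


cleaned = clean_rows(RAW_CSV)
print(to_csv_text(cleaned))
```

Turning the snow amount into a 0/1 label is one plausible way to set up a classification target; the project may have defined it differently.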

Exploratory Data Analysis (EDA) & Model Building

Then, we performed Exploratory Data Analysis (EDA) to detect patterns among the variables we were interested in, and found potential relationships between the daily temperature range (minimum and maximum temperature) and the amount of snow. Next, we built two models, one using Logistic Regression and one using the K-Nearest Neighbors algorithm.
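To make the two algorithms concrete, the following is a rough from-scratch sketch of both classifiers in plain Python. It is not the project's code, which may well rely on a machine learning library instead. The sample temperatures, the function names, the divide-by-10 scaling, and the learning rate and epoch count are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Invented (tmin, tmax) pairs in degrees Celsius; label 1 = snow, 0 = no snow.
X = [(-8.0, -2.0), (-5.0, 1.0), (-3.0, 2.0), (10.0, 20.0), (15.0, 27.0), (18.0, 29.5)]
y = [1, 1, 1, 0, 0, 0]


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def scale(point):
    """Divide by 10 so gradient descent takes well-sized steps (a simplification)."""
    return tuple(value / 10.0 for value in point)


def fit_logistic(train_X, train_y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    rows = [scale(point) for point in train_X]
    weights, bias = [0.0, 0.0], 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for features, label in zip(rows, train_y):
            z = bias + sum(w * f for w, f in zip(weights, features))
            error = 1.0 / (1.0 + math.exp(-z)) - label  # predicted probability minus label
            grad_w = [g + error * f for g, f in zip(grad_w, features)]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def logistic_predict(model, query):
    """Predict 1 when the linear score is positive (probability above 0.5)."""
    weights, bias = model
    z = bias + sum(w * f for w, f in zip(weights, scale(query)))
    return int(z > 0)


model = fit_logistic(X, y)
cold_day, warm_day = (-6.0, 0.0), (16.0, 28.0)
print(knn_predict(X, y, cold_day), logistic_predict(model, cold_day))
print(knn_predict(X, y, warm_day), logistic_predict(model, warm_day))
```

K-Nearest Neighbors votes among the closest training days, while Logistic Regression learns a single linear boundary in the (tmin, tmax) plane, which is why the two models can disagree on days near that boundary.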

Model Training & Accuracy Scores

We performed a train-test split and measured the accuracy of both models. The Logistic Regression model reached an accuracy of 0.809, and the K-Nearest Neighbors model with k = 10 reached 0.786. Afterwards, we plotted confusion matrices for model tuning, validation, and error analysis.
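The evaluation steps can be sketched in a few lines of plain Python. This is a hedged illustration only: the helper names, the 20% test fraction, the seed, and the example labels are assumptions, since the report does not state the split ratio or show the predictions.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred):
    """2x2 counts laid out as [[TN, FP], [FN, TP]] for labels 0 and 1."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1  # row = true label, column = predicted label
    return matrix


# Invented labels for ten test days; 1 = snow, 0 = no snow.
y_true = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]

train, test = split_train_test(range(10))
print(len(train), len(test))
print(accuracy(y_true, y_pred))
print(confusion_matrix(y_true, y_pred))
```

Accuracy alone hides which kind of mistake a model makes. The off-diagonal cells of the confusion matrix separate false alarms (snow predicted, none fell) from missed snow days, which is the information error analysis needs.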

Conclusion

Finally, we concluded that the daily temperature range can forecast snowfall in Ithaca, NY, with both models reaching roughly 80% accuracy.
