jipy0222/CS150A-Final-Project

data_exploration_cleaning.py

Explores the original train.csv and test.csv.

Performs naive data cleaning.

exploration&cleaning.ipynb is the Jupyter notebook version of this script.
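
As a rough illustration of what an exploration and naive cleaning pass might look like, here is a minimal pandas sketch. It is not taken from data_exploration_cleaning.py: the helper names `summarize` and `naive_clean`, the toy columns (`student`, `duration`, `correct`), and the fill rules (median for numeric columns, "unknown" for the rest) are all assumptions made for the example.

```python
import pandas as pd


def summarize(df):
    """Return per-column dtype, missing-value count, and distinct-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "n_unique": df.nunique(dropna=True),
    })


def naive_clean(df, target):
    """Drop duplicate rows and rows without a target, then fill remaining gaps."""
    out = df.drop_duplicates().dropna(subset=[target]).copy()
    for col in out.columns:
        if col == target:
            continue
        if pd.api.types.is_numeric_dtype(out[col]):
            # Assumption: the median is a reasonable stand-in for a missing number.
            out[col] = out[col].fillna(out[col].median())
        else:
            # Assumption: missing categories are grouped under one placeholder label.
            out[col] = out[col].fillna("unknown")
    return out.reset_index(drop=True)


# Hypothetical toy frame standing in for train.csv; the real columns differ.
toy = pd.DataFrame({
    "student": ["s1", "s1", "s2", None, "s3"],
    "duration": [3.0, 3.0, None, 5.0, 7.0],
    "correct": [1, 1, 0, 1, None],
})

print(summarize(toy))
cleaned = naive_clean(toy, target="correct")
print(cleaned)
```

On real data, `summarize` would typically be run first on both train.csv and test.csv to decide which columns need cleaning at all.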

preprocess.py

Performs feature engineering with PySpark and generates train_pyspark.csv and test_pyspark.csv.

model_train_output.py

Selects models, applies PCA and similar preprocessing techniques, tunes hyperparameters, compares performance, and writes the predictions to output.csv.
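
The combination of PCA and hyperparameter tuning can be sketched with a scikit-learn pipeline and a grid search. This is only an illustration under stated assumptions: the source does not say which models, grids, or metric model_train_output.py uses, so the logistic regression, the parameter grid, the accuracy scoring, and the synthetic data below are all placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the engineered features (assumption, not project data).
X, y = make_classification(n_samples=300, n_features=12, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Scaling and PCA live inside the pipeline so each CV fold fits them on its own training part.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical grid: number of principal components and regularization strength.
param_grid = {
    "pca__n_components": [3, 6, 9],
    "clf__C": [0.1, 1.0, 10.0],
}

search = GridSearchCV(pipe, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

# Print every configuration with its mean cross-validated score, like a tuning log.
for params, score in zip(search.cv_results_["params"], search.cv_results_["mean_test_score"]):
    print(params, round(float(score), 4))

held_out = search.score(X_test, y_test)
print("best:", search.best_params_, "held-out accuracy:", round(held_out, 4))
```

Printing each configuration with its score gives the kind of per-run record that a tuning log file would hold, and comparing several model families amounts to repeating the search with a different final pipeline step.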

result.txt

A log of the temporary results produced during hyperparameter tuning.

test.csv

Attention: the test.csv in the root directory is our output file, NOT the test.csv in the data folder.

About

Final Project: KDD CUP 2010
