Code and notes from the Udacity Intro to Machine Learning course.
To run the sample code, you'll need Python and the following packages: numpy, scipy, matplotlib, scikit-learn, and nltk.
If you don't have Python installed, head over to the Python Download Page. All of the code was tested with Python 2.7.
Run the following command to install all of the required dependencies:
$ pip install -U numpy scipy matplotlib scikit-learn nltk
This project also uses the Enron dataset. To download it, run the following commands:
# Go to the project directory.
$ cd /path/to/intro-to-machine-learning
# Run the startup script.
$ python tools/startup.py
The startup script also checks that all of the required modules are installed. The Enron dataset is around 400 MB, so the download may take a while to complete. You should see output similar to the following in your terminal:
✅ nltk is installed.
✅ numpy is installed.
✅ scipy is installed.
✅ sklearn is installed.
✅ matplotlib is installed.
⏳ Downloading the Enron dataset, this may take a while...
✅ Enron dataset is downloaded: /path/to/intro-to-machine-learning/data/enron_mail_20150507.tar.gz
⏳ Unzipping Enron dataset, this may take a while...
✅ Enron dataset is extracted to: /path/to/intro-to-machine-learning/data
🎉 You're ready to go!
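If you only want to confirm that the dependencies are importable without downloading the dataset, a check along these lines may help. This is a minimal sketch and not the actual contents of tools/startup.py: the function names `check_modules` and `report` and the wording of the status lines are assumptions made for illustration. The module names are taken from the output shown above.

```python
import importlib

# Module names as they appear in the startup script's output above.
REQUIRED_MODULES = ["nltk", "numpy", "scipy", "sklearn", "matplotlib"]


def check_modules(module_names):
    """Return a dict mapping each module name to True if it can be imported.

    Hypothetical helper, not part of the repository.
    """
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


def report(results):
    """Build one status line per module, sorted by name (assumed wording)."""
    lines = []
    for name in sorted(results):
        status = "is installed" if results[name] else "is NOT installed"
        lines.append("{0} {1}.".format(name, status))
    return lines


if __name__ == "__main__":
    for line in report(check_modules(REQUIRED_MODULES)):
        print(line)
```

Note that scikit-learn is imported as `sklearn`, which is why the pip package name and the module name differ. Any module reported as not installed can be added with the pip command given earlier.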
- Naive Bayes
- Support Vector Machine
- Decision Tree
- Choose Your Own Algorithm
- Datasets and Questions
- Regression
CC BY-NC-ND 4.0 · Risan Bagja Pradana
This repository is in no way affiliated with, authorized, maintained, sponsored or endorsed by Udacity or any of its affiliates or subsidiaries. This is an independent and unofficial library.