Disaster Response Pipeline Project

Description

The goal of this project is to build a Natural Language Processing (NLP) model that categorizes messages in real time. This project is part of the Data Science Nanodegree Program by Udacity, in collaboration with Figure Eight. The dataset is provided by Figure Eight and contains pre-labelled tweets and messages from real-life disaster events.

This project has three main sections:

Building an ETL pipeline that extracts data from the source, cleans it, and loads it into a SQLite database
Building a machine learning pipeline that trains a classifier to assign text messages to various categories
Running a web app that shows the model results in real time

Requirements

Python 3
Machine Learning Libraries: NumPy, SciPy, Pandas, Scikit-Learn
Natural Language Processing Library: NLTK
SQLite Database Library: SQLAlchemy
Model Loading and Saving Library: Pickle
Web App and Data Visualization: Flask, Plotly

Installing

To clone the git repository:

git clone https://github.com/OmoyeniO/Disaster-Response-Pipeline.git

Execution

Run the following commands in the project's directory to set up the database, train the model, and save the model.

To run the ETL pipeline that cleans the data and stores the processed data in the database:

python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/disaster_response_db.db

To run the ML pipeline that loads data from the database, trains the classifier, and saves the classifier as a pickle file:

python models/train_classifier.py data/disaster_response_db.db models/classifier.pkl

To run the web app, go to the app directory and start it:

cd app
python run.py

Then go to http://0.0.0.0:3000/
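
The ETL and training steps have to run in that order, so they can also be chained from one script. The following is a minimal sketch and is not part of the repository: the function names pipeline_commands and run_pipeline are hypothetical, while the script and file paths are the ones used in the commands above.

```python
import subprocess
import sys

# Paths are copied from the commands in the Execution section.
MESSAGES_CSV = "data/disaster_messages.csv"
CATEGORIES_CSV = "data/disaster_categories.csv"
DATABASE = "data/disaster_response_db.db"
MODEL_PICKLE = "models/classifier.pkl"


def pipeline_commands(python=sys.executable):
    """Return the ETL and training commands in the order they must run."""
    return [
        [python, "data/process_data.py", MESSAGES_CSV, CATEGORIES_CSV, DATABASE],
        [python, "models/train_classifier.py", DATABASE, MODEL_PICKLE],
    ]


def run_pipeline():
    """Run each step from the project's directory, stopping at the first failure."""
    for command in pipeline_commands():
        # check=True raises CalledProcessError if a step exits with a non-zero status.
        subprocess.run(command, check=True)
```

Calling run_pipeline() from the project's directory would run both steps. Stopping at the first failure keeps the training step from reading a missing or half-written database.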

Important Files

app/templates/*: templates/html files for web app

data/process_data.py: Extract, Transform, Load (ETL) pipeline used for data cleaning, feature extraction, and storing data in a SQLite database

models/train_classifier.py: A machine learning pipeline that loads data, trains a model, and saves the trained model as a .pkl file for later use

app/run.py: Launches the Flask web app used to classify disaster messages
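
To make the ETL step in data/process_data.py more concrete, here is a minimal sketch of the extract, clean, and load sequence. It is not the project's code. It assumes each categories entry is a single string such as "related-1;water-0", that both inputs share an id column, and that the output table is named messages. All three are assumptions, and the sample rows are made up. The project lists Pandas and SQLAlchemy for this step; the sketch uses only the standard library so that it runs on its own.

```python
import sqlite3


def parse_categories(raw):
    """Turn 'related-1;request-0' into {'related': 1, 'request': 0}."""
    parsed = {}
    for item in raw.split(";"):
        name, _, value = item.rpartition("-")
        parsed[name] = int(value)
    return parsed


def run_etl(messages, categories, connection, table="messages"):
    """Merge rows on id, expand category flags into columns, load into SQLite."""
    flags_by_id = {row["id"]: parse_categories(row["categories"]) for row in categories}
    # Keep one row per id present in both inputs (inner join plus de-duplication).
    merged = {}
    for row in messages:
        if row["id"] in flags_by_id:
            merged[row["id"]] = {"id": row["id"], "message": row["message"],
                                 **flags_by_id[row["id"]]}
    rows = list(merged.values())
    if not rows:
        return 0
    # Column names come from the first row; missing flags are stored as NULL.
    columns = list(rows[0])
    column_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    connection.execute(f'DROP TABLE IF EXISTS "{table}"')
    connection.execute(f'CREATE TABLE "{table}" ({column_sql})')
    connection.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})',
        [tuple(row.get(c) for c in columns) for row in rows],
    )
    connection.commit()
    return len(rows)


# Made-up sample data, including one duplicate message.
messages = [
    {"id": 1, "message": "We need water and food"},
    {"id": 2, "message": "Road is blocked"},
    {"id": 2, "message": "Road is blocked"},
]
categories = [
    {"id": 1, "categories": "related-1;water-1;food-1"},
    {"id": 2, "categories": "related-1;water-0;food-0"},
]
connection = sqlite3.connect(":memory:")
loaded = run_etl(messages, categories, connection)
```

Expanding each category into its own 0/1 column gives the training step one target column per category.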

Additional resources

The repository contains two Jupyter notebooks, ML Pipeline Preparation and ETL Pipeline Preparation, that help in understanding in detail how the model works:

ETL Pipeline Preparation Notebook: learn everything about the implemented ETL pipeline
ML Pipeline Preparation Notebook: look at the Machine Learning Pipeline developed with NLTK and Scikit-Learn

You can use the ML Pipeline Preparation Notebook to re-train the model or tune it through its dedicated Grid Search section.
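
After re-training, the model has to be written back to a pickle file so it can be loaded later. The following is a minimal sketch of that save and load round trip. The "model" here is a stand-in dictionary of trigger words, not the Scikit-Learn classifier the project trains, and the names save_model, load_model, and predict are hypothetical.

```python
import os
import pickle
import tempfile


def save_model(model, path):
    """Serialize the model to a .pkl file."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def load_model(path):
    """Load a model from a .pkl file. Only unpickle files you trust."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def predict(model, message):
    """Flag each category whose trigger words appear in the message."""
    words = set(message.lower().split())
    return {category: int(bool(words & triggers))
            for category, triggers in model.items()}


# Stand-in model: category name mapped to a set of trigger words.
model = {"water": {"water", "thirsty"}, "food": {"food", "hungry"}}

# A temporary directory keeps the sketch from overwriting models/classifier.pkl.
path = os.path.join(tempfile.mkdtemp(), "classifier.pkl")
save_model(model, path)
restored = load_model(path)
prediction = predict(restored, "We need water")
```

The same two functions would work with a fitted Scikit-Learn estimator, since pickle.dump and pickle.load do not depend on the type of the object being stored.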

Authors

Omoyeni Ogundipe

Acknowledgements

Udacity for providing an amazing Data Science Nanodegree Program
Figure Eight for providing the relevant dataset to train the model
