This repository contains my work during the Himmah data science Bootcamp.

ZarahShibli/Data-Science-Bootcamp

Data Science Bootcamp


  1. Overview of the Bootcamp
  2. Bootcamp Plan

1. Overview of the Bootcamp

This repository contains my work during the Himmah data science Bootcamp. The bootcamp is presented by SDA and Coding Dojo.

2. Bootcamp Plan

The bootcamp is divided into 4 main stacks:

Business Intelligence

  • Week-1
    • Analyze data using Excel.
  • Week-2
    • Understand SQL and NoSQL.
  • Week-3
    • Business Intelligence overview.
    • Tableau.

The Statistical Programming Language R

  • Week-1
    • Introduction to R.
    • Load data and packages.
    • Conditional statements and data vectorization.
  • Week-2
    • Data visualization using ggplot2.
    • Understand Exploratory Data Analysis (EDA).
    • Understand Tidy Data.
    • Data types.
  • Week-3
    • Probability and decision analysis.
    • Understand training, validation, and testing sets.
    • Build a Shiny app (dashboard).

Introduction to Python

  • Week-1
    • Intro to Anaconda and Jupyter Notebook.
    • Intro to NumPy and linear equations.
    • Loops, conditional statements, and functions.
  • Week-2
    • Extract data from an API.
    • Data manipulation and statistical analysis.
    • Data visualization using Matplotlib and Seaborn.
    • EDA with insightful visualization.
  • Week-3
    • Deal with missing values using imputation techniques.
    • Build a Dash application.
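
As a small illustration of the missing-value topic from Week-3, here is a minimal sketch of mean imputation. It uses only the Python standard library so it runs anywhere; the function name `impute_mean` and the `ages` column are hypothetical and not taken from the bootcamp material. In practice a library such as pandas would typically handle this step.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical column with two missing entries.
ages = [20, None, 30, 40, None]
print(impute_mean(ages))
```

Mean imputation is only one option; the median or a model-based estimate may be more suitable when the data is skewed.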

Introduction to Machine Learning

  • Week-1
    • Intro to Machine Learning.
    • Build a heuristic model.
    • Understand cost functions.
    • Build a linear regression model.
  • Week-2
    • Understand the data pipeline.
    • Understand the scikit-learn API.
    • Build a logistic regression model.
    • Apply feature engineering techniques.
    • Improve the model using GridSearch.
  • Week-3
    • Overview of ensemble modeling, including boosting and bagging techniques.
    • Understand Principal Component Analysis (PCA).
    • Understand decision trees and random forests.
    • Clustering techniques.
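
To make the Week-1 topics of cost functions and linear regression concrete, here is a minimal sketch that fits a line by gradient descent on the mean squared error. It is an illustration under stated assumptions, not the bootcamp's own code: the function names, the learning rate, the step count, and the toy data are all hypothetical.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the line y = w * x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_linear(xs, ys, lr=0.05, steps=2000):
    """Fit slope w and intercept b by plain gradient descent on the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, noise-free).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_linear(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, cost={mse(w, b, xs, ys):.6f}")
```

On real data the learning rate usually needs tuning, and a library implementation would normally be preferred over a hand-written loop.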

Capstone Project

The aim of this project is to build customer segmentation models for a delivery app using K-means and DBSCAN. For more details about the capstone project, see the project repository.
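
To illustrate the clustering idea behind the capstone, here is a minimal K-means sketch in plain Python. It is not the project's code: the `kmeans` function, the naive initialization from the first `k` points, and the two customer features (orders per month, average basket value) are hypothetical choices made for the example. DBSCAN is not shown here.

```python
import math


def kmeans(points, k, iterations=100):
    """Cluster points (tuples of floats) into k groups with Lloyd's algorithm."""
    # Naive initialization (an assumption for this sketch): first k points.
    centroids = [points[i] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Converged: centroids no longer move.
        centroids = new_centroids
    return labels, centroids


# Hypothetical customers: (orders per month, average basket value).
customers = [(1.0, 2.0), (8.0, 9.0), (1.5, 1.8), (9.0, 8.5), (1.2, 2.2), (8.5, 9.5)]
labels, centroids = kmeans(customers, k=2)
print(labels)
print(centroids)
```

With real customer data, features are usually scaled first, and a library implementation with a better initialization strategy is generally a safer choice than this sketch.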