Advanced Data Transformation in SQL Workshop

If you like this workshop, you'll love my Practical Hands-on Data Engineering Workshop.

Live virtual workshop

The workshop will be streamed live on YouTube: Advanced Data Processing in SQL YouTube Live. After the stream, the recording will be available to watch and follow at your own pace.

How to use nested data types in SQL, YouTube Link

Prerequisites

Sign up for a GitHub account.
Go through the Setup process and complete the 0-basics notebook exercises.

Setup

You have two options to run the exercises in this repo:

Option 1: GitHub Codespaces (Recommended)

Steps:

Create a GitHub codespace with this link.
Wait for GitHub to install the packages listed in requirements.txt. This step can take about 5 minutes.
In the terminal, run python setup.py to create the tables and data necessary for the exercises.
Open 0-basics.ipynb (or any other ipynb file); it opens in a Jupyter notebook interface. When asked to choose a kernel, select Python Environments and then python3.10.13 Global.
Complete the 0-basics notebook as a prerequisite.

Option 2: Run locally

Steps:

Clone this repo and cd into the cloned repo.
Create and activate a virtual env, then install the requirements.
In the terminal, run python setup.py to create the tables and data necessary for the exercises.
Start Jupyter lab and run the ipynb notebooks.
Complete the 0-basics notebook as a prerequisite.

git clone https://github.com/josephmachado/adv_data_transformation_in_sql.git
cd adv_data_transformation_in_sql
python -m venv ./env # create a virtual env
source env/bin/activate # use virtual environment
pip install -r requirements.txt
python setup.py
jupyter lab
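
If the local install misbehaves, one quick thing to rule out is that the virtual environment was not activated before running pip install. The snippet below is a minimal sketch of such a check, not part of the repo; the helper name in_virtualenv is hypothetical, and it relies only on the standard library.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv.

    Inside a venv, sys.prefix points at the env directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    # Hypothetical sanity check; run it after `source env/bin/activate`.
    print("virtual env active:", in_virtualenv())
    print("python version:", sys.version.split()[0])
```

If the first line prints False, activate the environment again and re-run the install step.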

Data Model

The TPC-H data represents a car parts seller’s data warehouse. It records orders, the items that make up each order (lineitem), supplier, customer, part (parts sold), region, nation, and partsupp (parts supplier).

Note: Have a copy of the data model as you follow along; this will help in understanding the examples provided and in answering exercise questions.
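
To make the relationship between orders and lineitem concrete, here is a small sketch that builds toy versions of the two tables and totals the items per order. It rests on several assumptions: it uses Python's built-in sqlite3 with an in-memory database (which may differ from the engine the workshop notebooks use), only a small subset of the TPC-H columns is included, and the rows are made up for illustration.

```python
import sqlite3

# In-memory database; an assumption for illustration only.
conn = sqlite3.connect(":memory:")

# Simplified subset of the TPC-H orders and lineitem columns.
conn.executescript(
    """
    CREATE TABLE orders (
        o_orderkey  INTEGER PRIMARY KEY,
        o_custkey   INTEGER,
        o_orderdate TEXT
    );
    CREATE TABLE lineitem (
        l_orderkey      INTEGER REFERENCES orders (o_orderkey),
        l_linenumber    INTEGER,
        l_quantity      INTEGER,
        l_extendedprice REAL,
        PRIMARY KEY (l_orderkey, l_linenumber)
    );
    -- Made-up sample rows: order 1 has two items, order 2 has one.
    INSERT INTO orders VALUES (1, 100, '1995-01-01'), (2, 101, '1995-01-02');
    INSERT INTO lineitem VALUES (1, 1, 5, 50.0), (1, 2, 2, 30.0), (2, 1, 1, 10.0);
    """
)

# One order has many line items, so join on the order key and aggregate.
rows = conn.execute(
    """
    SELECT o.o_orderkey,
           COUNT(*)               AS num_items,
           SUM(l.l_extendedprice) AS order_total
    FROM orders AS o
    JOIN lineitem AS l ON l.l_orderkey = o.o_orderkey
    GROUP BY o.o_orderkey
    ORDER BY o.o_orderkey
    """
).fetchall()

print(rows)  # [(1, 2, 80.0), (2, 1, 10.0)]
```

The same one-to-many pattern applies to the other tables in the model; for example, partsupp links each part to the suppliers that provide it.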

Topics covered in the workshop

Feedback

I'd love to hear your feedback. Please send it by clicking here.
