pySpark for Data Science Workshop

Local setup

NOTE: If you are using the Jupyter instance available on the cluster, you can skip this setup. It is intended for people who want to run the workshop exercises locally.

  1. Install Anaconda: https://conda.io/projects/conda/en/latest/user-guide/install/index.html#id2

  2. Create a conda environment with the packages from the requirements file

> conda create -y --name pyspark_env --file environment/requirements.txt

  3. Activate the newly created conda environment

> source activate pyspark_env

  4. Run Jupyter Notebook

> jupyter notebook

  5. Open the notebook with the exercises: pySpark SQL exercises.ipynb
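
After completing the steps above, you can confirm that Spark runs locally with a short smoke test. This is a minimal sketch, not part of the workshop materials; it assumes that pyspark is among the packages listed in environment/requirements.txt, and the app name and sample data are illustrative only.

```python
# Minimal smoke test for the local setup: run it in a notebook cell.
# Assumes pyspark is installed via environment/requirements.txt.
from pyspark.sql import SparkSession

# Start a local Spark session that uses all available CPU cores.
spark = (
    SparkSession.builder
    .master("local[*]")
    .appName("pyspark-workshop-smoke-test")  # hypothetical app name
    .getOrCreate()
)

# Build a tiny DataFrame and run a simple aggregation to check that Spark SQL works.
df = spark.createDataFrame(
    [("Alice", 1), ("Bob", 2), ("Bob", 3)],
    ["name", "value"],
)
df.groupBy("name").sum("value").show()

spark.stop()
```

If the aggregated table prints without errors, the environment is ready for the exercises in pySpark SQL exercises.ipynb.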

Authors

Mikołaj Kromka, Grzegorz Gawron
