Fun data science with Zeppelin and Docker
Built with ❤︎ by Anderson Santos and contributors
DockerHub repository: https://hub.docker.com/r/supergarotinho/zeppelin/
- Spark - 2.1.1
- Zeppelin - 0.7.1
- spark
- shell
- angular
- markdown
- postgresql
- jdbc
- python
- hbase
- elasticsearch
- Python libs:
  - Python 3.5
  - Data
    - NumPy
    - pandas
    - PandaSQL
  - ML and Math
    - sklearn
    - SciPy
  - Visualization
    - matplotlib
    - seaborn
    - folium (geovisualization)
    - wordcloud
  - Util
    - ijson
    - datetime
    - tweepy
  - NLP
    - nltk
      - punkt - sentence segmentation
      - stopwords
      - rslp - stemmer by Viviane Orengo
      - floresta - Floresta Sintática corpus for PT_BR
    - gensim (topic and language modelling)
  - Graphs
    - networkx
    - igraph
Change to the directory where you want to save your notebooks and run:
docker run --rm -d -p 8080:8080 -v "$PWD":/notebook -e ZEPPELIN_NOTEBOOK_DIR='/notebook' supergarotinho/zeppelin
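The command above publishes container port 8080 on the host, mounts the current directory at /notebook, and points Zeppelin at that mount through ZEPPELIN_NOTEBOOK_DIR. As a minimal sketch, the same invocation can be wrapped in a small shell function so the host port and notebook directory are configurable. The function name `zeppelin_cmd` and its two arguments are illustrative assumptions, not something the image provides. The function only prints the command so it can be reviewed before running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of the image): prints the docker run
# command with a configurable host port and notebook directory.
zeppelin_cmd() {
    host_port="${1:-8080}"       # host port to publish Zeppelin on
    notebook_dir="${2:-$PWD}"    # host directory where notebooks are saved
    printf 'docker run --rm -d -p %s:8080 -v "%s":/notebook -e ZEPPELIN_NOTEBOOK_DIR=/notebook supergarotinho/zeppelin\n' \
        "$host_port" "$notebook_dir"
}

# Print the command for host port 9090 and the current directory.
zeppelin_cmd 9090 "$PWD"
```

With the defaults (no arguments) the printed command matches the one above, and Zeppelin should then be reachable at http://localhost:8080.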
Build and push:
docker build -t supergarotinho/zeppelin .
docker push supergarotinho/zeppelin
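Without an explicit tag, both commands above operate on the `latest` tag. A minimal sketch of building and pushing under a specific tag follows. The helper name `release_cmds` and the example tag `0.7.1` are assumptions for illustration, since the repository does not state a tagging scheme. The helper prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: prints the build and push commands for a given tag.
# Defaults to "latest", which is what docker uses when no tag is given.
release_cmds() {
    image="supergarotinho/zeppelin"
    tag="${1:-latest}"
    printf 'docker build -t %s:%s .\n' "$image" "$tag"
    printf 'docker push %s:%s\n' "$image" "$tag"
}

# Example tag chosen for illustration only.
release_cmds 0.7.1
```

Pushing requires being logged in with `docker login` using an account that has write access to the repository.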
- Anderson Santos - Initial work - supergarotinho
See also the list of contributors who participated in this project.
This project is licensed under the BSD 3-Clause License - see the LICENSE.md file for details.
- Based on the official Zeppelin image: apache/zeppelin:0.7.2