aws-s3-to-pandas

Playbook for AWS Lambda, AWS S3 (e.g. CSV files), and pandas.

Pre-processing code for a longitudinal study of options data by ticker. Uses Python and AWS Lambda.

AWS Lambda code to split and shuffle end-of-day options data by ticker. Lesson learned: a pandas groupby object can be converted into a list of (key, dataframe) tuples.
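A minimal sketch of the groupby-to-list idea, assuming a `ticker` column. The sample rows and the other column names are made up for illustration and are not taken from the actual data files.

```python
import pandas as pd

# Hypothetical end-of-day options rows; column names are assumptions.
df = pd.DataFrame({
    "ticker": ["AAPL", "MSFT", "AAPL", "MSFT"],
    "strike": [150.0, 300.0, 155.0, 310.0],
    "last": [2.10, 4.50, 1.35, 3.20],
})

# Iterating over a groupby yields (key, sub-dataframe) pairs,
# so list() materializes them as a list of tuples.
pairs = list(df.groupby("ticker"))

for str_ticker, df_by_ticker in pairs:
    print(str_ticker, len(df_by_ticker))
```

Each tuple holds the group key and the rows belonging to that key, which is what makes the later per-ticker upload straightforward.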

This is a similar concept to the split-map-shuffle-reduce paradigm used in "big data" analysis.

Steps:

Download CSV files from AWS S3
Read CSV into pandas dataframe
Split the dataframe into a list of tuples (str_ticker, df_by_ticker)
Upload each per-ticker dataframe to AWS S3, keyed by ticker
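The steps above can be sketched roughly as follows. This is not the code in lambda-s3-pandas-split.py; it is an illustrative outline that assumes a `ticker` column, an S3 "object created" trigger, and a hypothetical output bucket and key prefix. The S3 client is passed in as a parameter so the helper can be exercised without AWS access.

```python
import io
from urllib.parse import unquote_plus

import pandas as pd


def split_csv_by_ticker(s3_client, src_bucket, src_key, dst_bucket,
                        dst_prefix="by-ticker/"):
    """Download one CSV from S3, split it by ticker, upload one CSV per ticker."""
    # Steps 1-2: download the object and read its bytes into a dataframe.
    body = s3_client.get_object(Bucket=src_bucket, Key=src_key)["Body"].read()
    df = pd.read_csv(io.BytesIO(body))

    # Step 3: split into a list of (str_ticker, df_by_ticker) tuples.
    pairs = list(df.groupby("ticker"))

    # Step 4: upload each per-ticker dataframe under a key containing the ticker.
    file_name = src_key.rsplit("/", 1)[-1]
    uploaded = []
    for str_ticker, df_by_ticker in pairs:
        dst_key = f"{dst_prefix}{str_ticker}/{file_name}"
        csv_bytes = df_by_ticker.to_csv(index=False).encode("utf-8")
        s3_client.put_object(Bucket=dst_bucket, Key=dst_key, Body=csv_bytes)
        uploaded.append(dst_key)
    return uploaded


def lambda_handler(event, context):
    import boto3  # imported here so the helper above runs without boto3 installed

    # Assumes the function is triggered by an S3 "object created" event.
    record = event["Records"][0]["s3"]
    return split_csv_by_ticker(
        boto3.client("s3"),
        src_bucket=record["bucket"]["name"],
        src_key=unquote_plus(record["object"]["key"]),  # event keys are URL-encoded
        dst_bucket="my-output-bucket",  # hypothetical bucket name
    )
```

Writing the output under a different bucket or prefix than the input avoids the Lambda re-triggering itself on its own uploads.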

Further research:

Extend groupby to Type (call/put)
Extend groupby to Expiration Date
Extend groupby to Strike Price
Look for opportunities to "map" after "split"
Look for opportunities to "reduce" after "shuffle"
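As a possible starting point for these ideas, the sketch below groups by more than one column and adds a simple "map" and "reduce" step. The column names (`type`, `expiration`, `strike`, `volume`) and the choice of summing volume are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical columns; the real file layout may differ.
df = pd.DataFrame({
    "ticker": ["AAPL", "AAPL", "AAPL", "MSFT"],
    "type": ["call", "put", "call", "call"],
    "expiration": ["2020-01-17", "2020-01-17", "2020-02-21", "2020-01-17"],
    "strike": [150.0, 150.0, 155.0, 300.0],
    "volume": [10, 20, 30, 40],
})

# Split: with a list of columns, each group key is a tuple of values.
pairs = list(df.groupby(["ticker", "type"]))

# Map: compute a small per-group result right after the split.
mapped = [(key, int(group["volume"].sum())) for key, group in pairs]

# Reduce: combine the per-group results into one summary table.
summary = pd.DataFrame(
    [(ticker, opt_type, total) for (ticker, opt_type), total in mapped],
    columns=["ticker", "type", "total_volume"],
)
print(summary)
```

Adding `"expiration"` or `"strike"` to the groupby list extends the key tuple in the same way.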

