A distributed system with advanced data flows.
As the scale of data continues to grow and developments in AI promise to extract even greater value from it, applications must increasingly scale horizontally to keep up. The skills to build and reason about data-intensive distributed systems will be critical, and I aim to improve some of them through this project.
At a high level, my objectives for this project are:
- Build a distributed system consisting of multiple microservices
- Communicate, store, and process large quantities of data
- Use a variety of data engineering techniques to solve unique data challenges for different use cases
To accomplish these objectives, I'll need to decide what to build, why, and how. Answering these questions will take some research.
To run the dev stack:
cp .env.template .env
# fill out the .env file
docker compose -f deploy/dev.yaml --env-file .env up
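
Since the "fill out the .env file" step is easy to leave half done, a small pre-flight check can help. The sketch below is only an illustration: `empty_env_keys` is a hypothetical helper that is not part of this repository, and it assumes the template leaves unfilled entries as `KEY=` with nothing after the equals sign.

```shell
# Hypothetical helper (not part of this repo): print every key in an env file
# whose value is empty, i.e. lines of the form KEY= with only optional
# trailing spaces or tabs. Comments and filled-in entries are skipped.
empty_env_keys() {
  awk -F= '/^[A-Za-z_][A-Za-z0-9_]*=[ \t]*$/ { print $1 }' "$1"
}
```

Running `empty_env_keys .env` before the `docker compose` command above prints one key per line for each entry that still needs a value, and prints nothing when every entry is filled in.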