A scalable, distributed, and Dockerized pipeline for weather prediction using Kafka, Cassandra, and Spark.
Note: To customize port numbers, host names, the replication factor, and other settings, edit the config file.
- Start the Weather App
Note: Running the app uses roughly 20 GB of memory!
make start
- Train the model with Spark ML
make trainer
- Run inference on streaming data and save the results to Cassandra
make predictor
- View the data stored in Cassandra
make viewer
- Clean up
make clean
- Generate the gRPC Python code from report.proto
python3 -m grpc_tools.protoc -I=. --python_out=. --grpc_python_out=. report.proto
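The protoc step above can also be scripted, which may help when the stubs need regenerating as part of a build. The following is a minimal sketch, not part of this repository: the helper names `build_protoc_command` and `expected_outputs` are hypothetical, and it assumes the `grpcio-tools` package is installed and that `report.proto` sits in the current directory. It builds the same argument list as the command above and checks for the two files that `grpc_tools.protoc` writes for a file named `report.proto`.

```python
import subprocess
import sys
from pathlib import Path


def build_protoc_command(proto_file="report.proto", include_dir=".", out_dir="."):
    """Return an argv list equivalent to the protoc command in this README.

    Hypothetical helper: the defaults mirror the README command.
    """
    return [
        sys.executable, "-m", "grpc_tools.protoc",
        f"-I={include_dir}",
        f"--python_out={out_dir}",        # message classes (*_pb2.py)
        f"--grpc_python_out={out_dir}",   # client/server stubs (*_pb2_grpc.py)
        proto_file,
    ]


def expected_outputs(proto_file="report.proto", out_dir="."):
    """Return the paths of the two files protoc generates for proto_file."""
    stem = Path(proto_file).stem
    return [
        Path(out_dir) / f"{stem}_pb2.py",
        Path(out_dir) / f"{stem}_pb2_grpc.py",
    ]


if __name__ == "__main__":
    # Requires grpcio-tools; raises CalledProcessError if protoc fails.
    subprocess.run(build_protoc_command(), check=True)
    missing = [p for p in expected_outputs() if not p.exists()]
    if missing:
        sys.exit(f"protoc finished but these files are missing: {missing}")
    print("Generated:", ", ".join(str(p) for p in expected_outputs()))
```

Using `sys.executable` instead of a hard-coded `python3` keeps the call inside whichever interpreter (for example, a virtual environment) has `grpcio-tools` installed.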