Philosophy
Sossity makes it trivial to do large-scale, collaborative data processing for streaming and batch workloads.
- Pragmatic: Sossity is being written to solve a specific problem: ingesting streaming data, transforming it, and outputting it to tools and external platforms.
  - We will use components that balance engineering time efficiency with human understandability.
  - We will make it easy for anyone to choose their data pipelines, push them to GitHub, and subscribe to other pipelines.
  - This is not a general-purpose facade on top of all streaming systems.
  - We will rely on the underlying substrate to do as much of the work as possible.
- Convention-Driven: We will reduce configuration and mistakes by being highly opinionated and internally consistent. This prevents the myriad issues that arise from mapping domain logic to infrastructure.
- Composable: Composability will be enforced on pipelines. A junior engineer should be able to choose operations from a library and compose them into a pipeline without having to modify the operations.
  - Any streaming pipeline should be able to read the data from another pipeline.
  - “One-off” batch repair jobs should be easy to deploy.
  - Individual computations in a pipeline can be built with libraries, following typical Java engineering best practices. The orchestrator’s finest level of granularity is the pipeline.
- Convergent: Sossity is the sole controller of resources within a project.
  - Deployments are not a matter of adding or removing resources; instead, they converge on the state described by the Sossity config file. When the descriptor file is pushed, Sossity + Terraform will examine which components need to be stood up or torn down, and do so.
- Cloud-y: Whenever possible, we should use the cloud services available to us instead of managing clusters and spinning up VMs.
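
The "Convergent" principle above can be illustrated with a small sketch. This is not how Sossity or Terraform actually compute their plans; it only shows the general idea of comparing a desired state against the current one and deriving what to stand up and what to tear down. The `Plan` class, the `converge` function, and the resource names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Resources to create and destroy in order to reach the desired state."""
    to_create: frozenset
    to_destroy: frozenset


def converge(desired: set, current: set) -> Plan:
    """Compute the changes needed to make `current` match `desired`.

    Assumption: a resource is fully identified by its name, so a plain
    set difference is enough. Real tools also diff resource attributes.
    """
    return Plan(
        to_create=frozenset(desired - current),
        to_destroy=frozenset(current - desired),
    )


# Hypothetical resource names, for illustration only.
desired = {"pubsub:orders", "pipeline:enrich", "sink:bigquery"}
current = {"pubsub:orders", "pipeline:legacy"}

plan = converge(desired, current)
print(sorted(plan.to_create))   # ['pipeline:enrich', 'sink:bigquery']
print(sorted(plan.to_destroy))  # ['pipeline:legacy']
```

Because the plan is derived from the full desired state rather than from a list of incremental commands, applying it twice against an already converged project yields an empty plan.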