Bradford Stephens edited this page Aug 22, 2018 · 14 revisions

Sossity Overview

Sossity is a system for composing and collaborating on data pipelines. Sossity makes it easy for minimally technical people to do data processing at scale, so they can use familiar tools on their data.

It is a unified platform that brings together several years' worth of open-source dependency management and automation tooling, running on Google Cloud.

Read the Philosophy wiki for the founding philosophy of the project.

There are several major components:

  1. angled-dream, a simplified Cloud Dataflow SDK for building Pipelines
  2. Sossity, for pipeline linking and dependency management
  3. Sources (App Engine REST Endpoints)
  4. Sinks (an SDK for data outputs), including File and BigQuery outputs, usually running in Docker containers
  5. Containers (microservices), including responsys-resource
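
To make the linking idea concrete, here is a minimal Python sketch of how sources, pipelines, and sinks might be modeled as a dependency graph and put in an order where every node comes after the nodes upstream of it. This is an illustration only, not Sossity's actual configuration format or implementation: the node names, the `EDGES` list, and the `deployment_order` function are all hypothetical.

```python
from graphlib import CycleError, TopologicalSorter  # standard library, Python 3.9+

# Hypothetical topology. Every name below is an assumption made for
# illustration; none of them comes from a real Sossity configuration.
EDGES = [
    ("orders-source", "clean-orders"),     # source   -> pipeline
    ("clean-orders", "enrich-orders"),     # pipeline -> pipeline
    ("enrich-orders", "orders-bigquery"),  # pipeline -> sink
    ("enrich-orders", "orders-file"),      # pipeline -> sink
]


def deployment_order(edges):
    """Return node names so that each node appears after all of its upstream nodes.

    Raises graphlib.CycleError if the edges contain a cycle, since a cyclic
    topology has no valid ordering.
    """
    sorter = TopologicalSorter()
    for upstream, downstream in edges:
        # add(node, *predecessors): downstream depends on upstream.
        sorter.add(downstream, upstream)
    return list(sorter.static_order())


if __name__ == "__main__":
    for name in deployment_order(EDGES):
        print(name)
```

Treating the topology as plain data keeps the ordering logic separate from the business code in each pipeline, and a cycle is reported as an error before anything is deployed.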

Sossity also relies on several external platforms and services:

  1. Google Cloud, including Cloud Dataflow (stream processing), Kubernetes (Docker management), Pub/Sub (queues), and App Engine (REST endpoints).
  2. GitHub, where business code is stored.
  3. CircleCI, where all GitHub code is built and then deployed using Sossity and Terraform. (This is preferred over Jenkins.)
  4. Terraform, for cloud resource management.

Lifecycle

Overview Workflow

Chart Link