django-cqrs
is an Django application, that implements CQRS data synchronisation between several Django microservices.
In Connect we have a rather complex Domain Model. There are many microservices, that are decomposed by subdomain and which follow database-per-service pattern. These microservices have rich and consistent APIs. They are deployed in cloud k8s cluster and scale automatically under load. Many of these services aggregate data from other ones and usually API Composition is totally enough. But, some services are working too slowly with API JOINS, so another pattern needs to be applied.
The pattern, that solves this issue is called CQRS - Command Query Responsibility Segregation. Core idea behind this pattern is that view databases (replicas) are defined for efficient querying and DB joins. Applications keep their replicas up to data by subscribing to Domain events published by the service that owns the data. Data is eventually consistent and that's okay for non-critical business transactions.
Full documentation is available at https://django-cqrs.readthedocs.org.
You can find an example project here
- Setup
RabbitMQ
- Install
django-cqrs
- Apply changes to master service, according to RabbitMQ settings
# models.py
from django.db import models
from dj_cqrs.mixins import MasterMixin, RawMasterMixin
class Account(MasterMixin, models.Model):
CQRS_ID = 'account'
CQRS_PRODUCE = True # set this to False to prevent sending instances to Transport
class Author(MasterMixin, models.Model):
CQRS_ID = 'author'
CQRS_SERIALIZER = 'app.api.AuthorSerializer'
# For cases of Diamond Multiinheritance the following approach could be used:
from mptt.models import MPTTModel
from dj_cqrs.metas import MasterMeta
class ComplexInheritanceModel(MPTTModel, RawMasterMixin):
pass
MasterMeta.register(ComplexInheritanceModel)
# settings.py
CQRS = {
'transport': 'dj_cqrs.transport.rabbit_mq.RabbitMQTransport',
'host': RABBITMQ_HOST,
'port': RABBITMQ_PORT,
'user': RABBITMQ_USERNAME,
'password': RABBITMQ_PASSWORD,
}
- Apply changes to replica service, according to RabbitMQ settings
from django.db import models
from dj_cqrs.mixins import ReplicaMixin
class AccountRef(ReplicaMixin, models.Model):
CQRS_ID = 'account'
id = models.IntegerField(primary_key=True)
class AuthorRef(ReplicaMixin, models.Model):
CQRS_ID = 'author'
CQRS_CUSTOM_SERIALIZATION = True
@classmethod
def cqrs_create(cls, sync, **mapped_data):
# Override here
pass
def cqrs_update(self, sync, **mapped_data):
# Override here
pass
# settings.py
CQRS = {
'transport': 'dj_cqrs.transport.RabbitMQTransport',
'queue': 'account_replica',
'host': RABBITMQ_HOST,
'port': RABBITMQ_PORT,
'user': RABBITMQ_USERNAME,
'password': RABBITMQ_PASSWORD,
}
- Apply migrations on both services
- Run consumer worker on replica service. Management command:
python manage.py cqrs_consume -w 2
- When there are master models with related entities in CQRS_SERIALIZER, it's important to have operations within atomic transactions. CQRS sync will happen on transaction commit.
- Please, avoid saving different instances of the same entity within transaction to reduce syncing and potential racing on replica side.
- Updating of related model won't trigger CQRS automatic synchronization for master model. This needs to be done manually.
- By default
update_fields
doesn't trigger CQRS logic, but it can be overridden for the whole application in settings:
settings.CQRS = {
...
'master': {
'CQRS_AUTO_UPDATE_FIELDS': True,
},
...
}
or a special flag can be used in each place, where it's required to trigger CQRS flow:
instance.save(update_fields=['name'], update_cqrs_fields=True)
- When only needed instances need to be synchronized, there is a method
is_sync_instance
to set filtering rule. It's important to understand, that CQRS counting works even without syncing and rule is applied every time model is updated.
Example:
class FilteredSimplestModel(MasterMixin, models.Model):
CQRS_ID = 'filter'
name = models.CharField(max_length=200)
def is_sync_instance(self):
return len(str(self.name)) > 2
Bulk synchronizer without transport (usage example: it may be used for initial configuration). May be used at planned downtime.
- On master service:
python manage.py cqrs_bulk_dump --cqrs-id=author
->author.dump
- On replica service:
python manage.py cqrs_bulk_load -i=author.dump
Filter synchronizer over transport (usage example: sync some specific records to a given replica). Can be used dynamically.
- To sync all replicas:
python manage.py cqrs_sync --cqrs-id=author -f={"id__in": [1, 2]}
- To sync all instances only with one replica:
python manage.py cqrs_sync --cqrs-id=author -f={} -q=replica
Set of diff synchronization tools ()
- To get diff and synchronize master service with replica service in K8S:
kubectl exec -i MASTER_CONTAINER -- python manage.py cqrs_diff_master --cqrs-id=author |
kubectl exec -i REPLICA_CONTAINER -- python manage.py cqrs_diff_replica |
kubectl exec -i MASTER_CONTAINER -- python manage.py cqrs_diff_sync
- If it's important to check sync and clean up deleted objects within replica service in K8S:
kubectl exec -i REPLICA_CONTAINER -- python manage.py cqrs_deleted_diff_replica --cqrs-id=author |
kubectl exec -i MASTER_CONTAINER -- python manage.py cqrs_deleted_diff_master |
kubectl exec -i REPLICA_CONTAINER -- python manage.py cqrs_deleted_sync_replica
- Python 3.6+
- Install dependencies
requirements/dev.txt
- Python 3.6+
- Install dependencies
requirements/test.txt
export PYTHONPATH=/your/path/to/django-cqrs/
Check code style: flake8
Run tests: pytest
Tests reports are generated in tests/reports
.
out.xml
- JUnit test resultscoverage.xml
- Coverage xml results
To generate HTML coverage reports use:
--cov-report html:tests/reports/cov_html
- docker-compose
cd integration_tests
docker-compose run master