Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

helm chart for cht sync #90

Merged
merged 22 commits into from
Jul 19, 2024
Merged

helm chart for cht sync #90

merged 22 commits into from
Jul 19, 2024

Conversation

witash
Copy link
Contributor

@witash witash commented Apr 19, 2024

this branch contains

  1. a helm chart for deploying cht sync on kubernetes.
    including logstash, postgrest, and dbt containers for all deployments
    optionally a postgres db and service
    and experimentally a redis instance and the redis worker to copy data from it

  2. change to the github task which was building and pushing images with the :latest tag on every push on every branch, which causes problems for deploying on kubernetes; for now, it just tags images with the branch name, we can make additional changes in its own branch after this is merged.

  3. creates the couchdb database in the dbt worker.

Closes #75

@github-advanced-security
Copy link

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

# Add environment variables and volume mounts as needed
env:
- name: PGRST_DB_URI
value: "postgres://postgres:postgres@postgres:5432/data"

Check failure

Code scanning / SonarCloud

PostgreSQL database passwords should not be disclosed High

Make sure this PostgreSQL database password gets changed and removed from the code. See more on SonarCloud
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that sonar really wants to not share dummy passwords. Can we try something like:

postgres://<pguser>:<pgpassword>@postgres:5432/data

inner.service: postgrest
spec:
containers:
- name: postgrest

Check warning

Code scanning / SonarCloud

CPU limits should be enforced Medium

Specify a CPU limit for this container. See more on SonarCloud
inner.service: postgrest
spec:
containers:
- name: postgrest

Check warning

Code scanning / SonarCloud

Memory limits should be enforced Medium

Specify a memory limit for this container. See more on SonarCloud
labels:
app: cht-sync
inner.service: postgrest
spec:

Check warning

Code scanning / SonarCloud

Service account tokens should not be mounted in pods Medium

Set automountServiceAccountToken to false for this specification of kind Deployment. See more on SonarCloud
app: cht-sync
spec:
containers:
- name: logstash

Check warning

Code scanning / SonarCloud

CPU limits should be enforced Medium

Specify a CPU limit for this container. See more on SonarCloud
app: cht-sync
spec:
containers:
- name: logstash

Check warning

Code scanning / SonarCloud

Memory limits should be enforced Medium

Specify a memory limit for this container. See more on SonarCloud
metadata:
labels:
app: cht-sync
spec:

Check warning

Code scanning / SonarCloud

Service account tokens should not be mounted in pods Medium

Set automountServiceAccountToken to false for this specification of kind Deployment. See more on SonarCloud
metadata:
labels:
app: cht-sync
spec:

Check warning

Code scanning / SonarCloud

Service account tokens should not be mounted in pods Medium

Set automountServiceAccountToken to false for this specification of kind Deployment. See more on SonarCloud
app: cht-sync
spec:
containers:
- name: dbt

Check warning

Code scanning / SonarCloud

CPU limits should be enforced Medium

Specify a CPU limit for this container. See more on SonarCloud
app: cht-sync
spec:
containers:
- name: dbt

Check warning

Code scanning / SonarCloud

Memory limits should be enforced Medium

Specify a memory limit for this container. See more on SonarCloud
@witash witash marked this pull request as ready for review May 3, 2024 13:12
@witash
Copy link
Contributor Author

witash commented May 3, 2024

this is not totally final because there isn't a good way to access the postgres db in the cluster (see #87). but I want to merge this to main since #79 depends on it and the github build task in main keeps overwriting the latest images with feature branches.
in the meantime, you can access the database for testing and development if you have kubectl access kubectl exec -it -tty postgres-xxxx -- psql

@witash witash changed the title helm chart helm chart for cht sync May 3, 2024
@witash witash requested review from Hareet and dianabarsan May 3, 2024 14:31
@witash witash requested review from nydr and removed request for Hareet May 13, 2024 12:30
Copy link
Member

@dianabarsan dianabarsan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaving the ymls to be reviewed by the experts, I added some small comments about the values template.

deploy/cht_sync/values.yaml.template Outdated Show resolved Hide resolved
deploy/cht_sync/values.yaml.template Outdated Show resolved Hide resolved
deploy/cht_sync/values.yaml.template Show resolved Hide resolved
@andrablaj
Copy link
Member

@witash, is this PR ready for re-review? If yes, re-request review from @dianabarsan or any other people who should have a look at this.

@andrablaj andrablaj requested a review from dianabarsan May 20, 2024 13:48
memory: {{ (.Values.postgrest).memory_limit | default "500Mi" }}
env:
- name: PGRST_DB_URI
value: |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This causes the error below. Setting the value to {{ printf "postgres://%s:%s@%s:5432/%s" .Values.postgres.user .Values.postgres.password .Values.postgres.host .Values.postgres.db }} works though.

{"code":"PGRST000","details":"connection to server at \"postgres\" (172.20.202.161), port 5432 failed: FATAL:  database \"data\n\" does not exist\n","hint":null,"message":"Database connection error. Retrying the connection."}

Copy link
Member

@dianabarsan dianabarsan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I have an overarching comment about dbt-run.py, this is not entirely related to this work of adding helm charts.

For the helm chart work, I believe we should add an e2e test that deploys cht-sync over k3d to validate these charts.

dbt/dbt-run.py Outdated
{os.getenv('POSTGRES_SCHEMA')}
{os.getenv('POSTGRES_SCHEMA')};

CREATE TABLE IF NOT EXISTS {os.getenv('POSTGRES_SCHEMA')}.{os.getenv('POSTGRES_TABLE')} (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should totally aim not not create tables and such in code. I believe the whole allure of dbt is that you can have these versioned clean schema files, but here we're just inlining the table create in python?
Is there a way we can use dbt to create these and keep the schema in its own file?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, how is this connected to the helm chart?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not related to the the helm chart and was already merged to main separately.
dbt mainly creates tables by wrapping select from existing tables in create table statements, the assumption is that there's some source db that dbt is not managing.

but it is possible to just run raw sql, including ddl and yea i agree in this case probably makes more sense to do that instead of creating the table here. Also then could just have one "root" table instead of two.

related to medic/cht-pipeline#84

@andrablaj andrablaj removed the request for review from nydr June 17, 2024 09:33
@witash witash requested a review from dianabarsan July 16, 2024 08:15
Copy link
Member

@dianabarsan dianabarsan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool stuff!
I left a comment about how couchdb instances affect values in config,

- name: POSTGRES_SCHEMA
value: {{ $.Values.postgres.schema }}
- name: COUCHDB_USER
value: {{ $.Values.couchdb.user }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user can vary between instances

- name: COUCHDB_HOST
value: {{ $service.host }}
- name: COUCHDB_DBS
value: {{ $.Values.couchdb.dbs }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also vary between which databases we sync from every instance.

- name: COUCHDB_DBS
value: {{ $.Values.couchdb.dbs }}
- name: COUCHDB_PORT
value: {{ $.Values.couchdb.port | quote }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for port and secure.
We can't assume all instances will be the same.

@witash witash requested a review from dianabarsan July 18, 2024 10:39
# values for each couchdb instance
# host and password are required
# other values can be ommitted if common to all couchdb instances and specified above
couchdbs: # values for in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't really exemplify of how I would add multiple couchdb servers to sync from.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you elaborate on this? what specifically does not exemplify how would you add multiple couchdb servers to sync from?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a reader I'm not imagining how this would look like.

Is it like:

couchdbs: # values for in
  - host: "host1"
    password: ""
    user: ""
    dbs: "medic medic-sentinel"
    port: "5984"
    secure: "true"

   - host: "host2"
    password: ""
    user: ""
    dbs: "medic medic-users"
    port: "5984"
    secure: "true"

I think it would be helpful if we add an example, because these iterator types are weird in yml

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that's what its supposed to be.
There is a section couchdb for shared values, and couchdbs for a list of values which are not shared.
which makes it a little more complicated, but otherwise when user,dbs,port and secure are NOT different, like for MoH Kenya, have to copy them 47 times, and don't like copy/pasting that much.

added an example to values.yaml.template

couchdbs:
  - host: "host1" # required for all couchdb instances
    password: "" # required for all couchdb instances
  - host: "host2"
    password: ""
  - host: "host3"
    password: ""
    user: "user2" # required if different than above
    dbs: "medic medic_sentinel" # required if different than above
    port: "5984" # required if different than above
    secure: "true" # required if different than above

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks a lot, I think this example is super helpful.

Copy link
Member

@dianabarsan dianabarsan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it! Thanks for all the updates!

@witash witash merged commit 63cc236 into main Jul 19, 2024
5 checks passed
@witash witash deleted the helm branch July 19, 2024 07:57
lorerod pushed a commit that referenced this pull request Jul 24, 2024
* add helm chart

* fix typos

* add volumeClaim and remove serviceName

* changing postgres service to load balancer for access

* add redis and redis-worker

* fix sonar issues and resource limits

* fix defaults and allow configurable limts

* tag images with branch and allow use in templates

* initialize table in dbt container

* remove external load balancer

* remove defaults in values template

* change redis port

* remove newline from env var

* remove old templates and add couch2pg

* multiple instances for couch2pg

* allow user,dbs,port, and secure to vary

* adding multi instance example
@medic-ci
Copy link

🎉 This PR is included in version 1.0.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Docker Compose to Kubernetes migration
5 participants