M1-12 Maintenance and DevOps #675

Labels: Epic (Zenhub label, please do not modify), PO issue (created by Product owners)

elisabettai (Contributor) commented Sep 7, 2022

(This is a task, not a milestone)
Continuously maintain and harden the o²S²PARC platform, increasing its stability and scalability to accommodate increased use.

For example:

  • enable scaling of selected core services through the recently introduced modularization and plugin concept
  • boost file-system performance, especially for large files

Deliverable: monthly update slides as part of the NIH SIM-Core update meeting
Deadline: Month 1–12 (continuous task)

Details: https://github.com/orgs/ITISFoundation/projects/9/views/34

Watermelon

  1. t:bug
    mrnicegyu11
  2. observability
    mrnicegyu11
  3. 3 of 3
    EPIC t:enhancement
    mrnicegyu11

Sundae

  1. 3 of 3
    EPIC t:enhancement
    mrnicegyu11
  2. FAST p:high-prio
    mrnicegyu11
  3. 14 of 24
    blocked / paused during maintenance p:high-prio t:infra-ops
  4. 5 of 6
    blocked / paused during maintenance p:high-prio
    mrnicegyu11
  5. FAST p:mid-prio t:bug
    mrnicegyu11
  6. FAST p:mid-prio t:bug
    mrnicegyu11
  7. p:high-prio t:infra-ops
    mrnicegyu11
  8. FAST p:high-prio t:bug
    YuryHrytsuk
  9. 5 of 5
    p:high-prio t:bug
    YuryHrytsuk
  10. SECURITY p:low-prio
    mrnicegyu11
  11. p:high-prio t:bug
    YuryHrytsuk
  12. FAST p:high-prio t:bug
    YuryHrytsuk
  13. FAST p:mid-prio t:bug

Baklava (ops)

  1. FAST p:low-prio t:bug
    YuryHrytsuk
  2. p:high-prio
    mrnicegyu11
  3. FAST t:bug
    mrnicegyu11
  4. t:enhancement
    mrnicegyu11
  5. p:high-prio t:bug
    YuryHrytsuk
  6. observability p:high-prio t:bug t:infra-ops
    mrnicegyu11
  7. p:high-prio t:bug
    YuryHrytsuk mrnicegyu11
    sanderegg
  8. p:mid-prio t:enhancement t:infra-ops
    YuryHrytsuk
  9. 0 of 2
    FAST p:high-prio
    mrnicegyu11
  10. FAST p:high-prio t:bug
    mrnicegyu11
  11. FAST observability
    mrnicegyu11
  12. FAST p:low-prio
    YuryHrytsuk
  13. FAST p:high-prio
    YuryHrytsuk
elisabettai added the "PO issue" label (created by Product owners) on Sep 7, 2022
elisabettai (Contributor, Author) commented:

This is a continuation of ITISFoundation/osparc-simcore#428, as a fresh start into Y6. It could also be used to write updates from the DevOps side.

pcrespov (Member) commented Oct 5, 2022

Update on sprint Vaporwave

Done

Ongoing

Open

mrnicegyu11 (Member) commented Nov 3, 2022

Update on sprint Katherine Switzer

User support

Security

Bugfixes / Technical debt

CI/CD

e2e testing

Pre/releases during this sprint

Dependencies

DevOps

  • Traefik Dashboard /api handling, so that spurious errors are no longer shown if a jupyter-service container dies (thx @elisabettai)
  • Add log-rotation to docker daemon
  • Don't run docker daemon on loglevel DEBUG
  • Harmonize /docker mounts
  • Added 2 grafana visualisations
  • Add e2e test for oSparc products, APIs, and OpenAPI Swagger Doc Pages
  • Allow for CPU/RAM limits in simcore & ops services, mitigating docker/compose#7771 ("config output strips quote marks and loses version number")
  • Add pre-commit hooks and linting to osparc-ops repo (thx @pcrespov)
  • Add graylog alert for dy-volume-removal finding zombies
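
The log-rotation and log-level items above can be illustrated with a minimal sketch. It is not taken from the ops repo: the helper name `build_daemon_config` and the values (`10m`, 3 files, level `info`) are assumptions chosen for illustration. Only the `daemon.json` keys (`log-driver`, `log-level`, `log-opts` with `max-size` and `max-file`) are standard Docker daemon settings.

```python
import json


def build_daemon_config(max_size="10m", max_file=3, log_level="info"):
    """Build a dict suitable for /etc/docker/daemon.json.

    Hypothetical helper: the defaults are illustrative assumptions,
    not the values used on the actual machines.
    """
    return {
        # json-file is Docker's default logging driver; it supports rotation.
        "log-driver": "json-file",
        # Anything other than "debug" avoids running the daemon at DEBUG.
        "log-level": log_level,
        # log-opts values must be strings in daemon.json, hence str().
        "log-opts": {"max-size": max_size, "max-file": str(max_file)},
    }


if __name__ == "__main__":
    # Print the JSON that would be written to the daemon configuration file.
    print(json.dumps(build_daemon_config(), indent=2))
```

With settings like these, each container keeps at most `max-file` log files of up to `max-size` each. The daemon has to be restarted for the change to apply, and the new log options only affect containers created afterwards.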

sanderegg (Member) commented Dec 1, 2022

Update on sprint Athena

User support

Security

Bugfixes / Technical debt

CI/CD

e2e testing

Pre/releases during this sprint

Dependencies

DevOps

  • Bugfix release for deployment-agent v.0.10.4: 🐛 Fix bug: Fix pulling only files with tags (osparc-deployment-agent#121)
  • Remove burstable AWS instances
  • PG migration v10 → v14
  • Refactor ops-repo
  • Licensing server adjustments
  • Add pyenv to all machines
  • Add secure sudo password to all machines
  • Increase machine memory on osparc.io managers
  • Terraform: Prototype for EC2, VPC, Subnets, Elastic IPs, SSH Keys, LoadBalancer
  • Admin-Panel: Add docker socket access, script by @GitHK

sanderegg (Member) commented Jan 10, 2023

Update on sprint Zefram Cochrane

User support

Security

Bugfixes / Technical debt

CI/CD

e2e testing

Pre/releases during this sprint

Dependencies

DevOps

  • Re-establish Postgres database backups
  • Create AWS Machine Image (AMI) with S4L-lite already on it
  • Enable dynamic services in local deployment
  • Switch staging.osparc.io to new github repository
  • Proof of concept for limiting the maximum file size in a docker container
  • Add s4l-lite domains and products everywhere
  • Enable E2E tests when maintenance page is still up
  • Fix: graylog dashboard provisioning
  • Service deprecation on osparc.io (Date: 30th of Jan 2023)

sanderegg (Member) commented Feb 16, 2023

Update on sprint Resistance is Futile

User support

Security

Bugfixes / Technical debt

CI/CD

e2e testing

Pre/releases during this sprint

DevOps

sanderegg (Member) commented Aug 10, 2023

Update Sundae

Highlights:

  • completed mypy type-checking on the osparc-simcore repository; CI steps now check all Python-based packages
  • improved operations tooling for releases

In detail:

sanderegg (Member) commented Sep 7, 2023

Update Baklava

In detail:

elisabettai (Contributor, Author) commented Sep 11, 2023

@sanderegg, @pcrespov, @matusdrobuliak66: I am closing this one. For Y7 (simcore)-related maintenance, please use the new issue #1108.

@mrnicegyu11, @YuryHrytsuk: for operations, we have a new issue in Y7 as well: #1109. I guess we can move the tasks that are still open here into that new one. I can do that, if that's useful.

@sanderegg, @pcrespov, @matusdrobuliak66, @mrnicegyu11, @YuryHrytsuk: in Y7, we also have a series of new cases, which are a bit of a mixture of comp. backend, operations, monitoring, etc. Those should cover the work we're doing for the release. The new cases are:
#1100
#1101
#1102
Maybe I will try to link the new ones with the work packages we already have. The goal is not to add more work or useless issues, only to make reporting on those new milestones easier. 😉
