EPIC - Add the possibility to integrate the Operator with Prometheus and Grafana #461

Open
2 of 3 tasks
ricardozanini opened this issue May 9, 2024 · 5 comments
Labels
enhancement New feature or request epic

Comments

@ricardozanini
Member

ricardozanini commented May 9, 2024

Description

Placeholder EPIC for issues related to adding monitoring support to the SonataFlow Operator.

Relates to:

Issues

@JudeNiroshan

JudeNiroshan commented May 11, 2024

As a SonataFlow Operator user, I should be able to enable a monitoring flag via the SonataFlowPlatform CR, which results in the operator creating the required software components.

apiVersion: sonataflow.org/v1alpha08
kind: SonataFlowPlatform
spec:
  services:
    ...
    monitoring:
      enabled: true

1. The SonataFlow operator should install the Prometheus & Grafana operators in the OCP/K8s cluster.
2. Once Prometheus and Grafana are in place, the operator should create a connection between Prometheus & Grafana via the GrafanaDataSource CR in Grafana. https://grafana.com/docs/grafana-cloud/developer-resources/infrastructure-as-code/grafana-operator/operator-dashboards-folders-datasources/#add-a-data-source
3. Then create a ServiceMonitor object that can capture/collect metrics from all the deployed Serverless Workflows.
4. Finally, as the operator user, I expect to see a default Grafana Dashboard.
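
For illustration, the ServiceMonitor from step 3 might look roughly like the sketch below. It is not taken from the operator: the resource name, namespace, label key, and port name are assumptions, and the metrics path assumes the workflows expose Quarkus Micrometer metrics at the default location. A ServiceMonitor selects Services rather than Pods, so the label would have to be present on each workflow's Service.

```yaml
# Sketch only: name, namespace, label key, and port name are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: sonataflow-workflows          # hypothetical name
  namespace: my-workflows             # hypothetical namespace
spec:
  selector:
    matchExpressions:
      # Match every Service that carries the (assumed) workflow label,
      # regardless of its value, so all deployed workflows are scraped.
      - key: sonataflow.org/workflow-app
        operator: Exists
  endpoints:
    - port: web                       # assumed name of the Service port
      path: /q/metrics                # assumed Quarkus Micrometer path
      interval: 30s
```

Without a namespaceSelector, only Services in the ServiceMonitor's own namespace are matched.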

@ricardozanini
Member Author

@JudeNiroshan can you elaborate on this request?

> The Sonataflow operator should install the Prometheus & Grafana operators in the OCP/K8s cluster.

The operator won't be responsible for installing third-party operators in a cluster. The reason is that we won't grant the operator administrative permissions, such as the ability to install CRDs. Also, installing an operator comes with many configuration options, so it would be highly complex to add an interface and wrappers around installation procedures that can change whenever a new operator version is released.

> Once the Prometheus and Grafana are in place, the operator should create a connection between Prometheus & Grafana via the GrafanaDataSource CR in Grafana

This is fine. We can certainly try to check if these CRDs are available in the cluster and create CRs to bind Prometheus and Grafana to deployed workflows.

> Then create a ServiceMonitor object that can capture/collect metrics from all the deployed Serverless Workflows.

I'll break it down into Grafana and Prometheus integration, so we can verify those PRs separately.

> Finally as the operator user, I expect to see a default Grafana Dashboard.

Can you create this dashboard and share it with me, so we can maintain and keep it in this repo? Feel free to send a follow-up PR to the implemented feature, updating the one I'll use as a placeholder.

@ricardozanini
Member Author

@JudeNiroshan one more thing regarding Grafana Data Sources. Please see: https://grafana.github.io/grafana-operator/docs/api/#grafanadatasourcespecdatasource

Looks like a data source requires credentials to access Prometheus. We can deploy the DS using the well-known credentials for a simple Prometheus installation, but in production environments, I don't think we can rely on this.

In this case, we can accept a secret containing the Prometheus credentials, or fall back to the well-known ones if no secret is provided.
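
As a rough sketch of the secret-based variant, a GrafanaDatasource in the Grafana Operator's v1beta1 API can pull basic-auth credentials from a Secret through `valuesFrom`. Everything specific here is an assumption for illustration: the Secret name and keys, the Grafana instance label, and the Prometheus URL are hypothetical and would depend on how Prometheus and Grafana are installed in the cluster.

```yaml
# Sketch only: Secret name/keys, instance label, and URL are hypothetical.
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDatasource
metadata:
  name: sonataflow-prometheus             # hypothetical name
spec:
  instanceSelector:
    matchLabels:
      dashboards: grafana                 # assumed label on the Grafana CR
  valuesFrom:
    # Each entry fills the given datasource field from a Secret key.
    - targetPath: basicAuthUser
      valueFrom:
        secretKeyRef:
          name: prometheus-credentials    # hypothetical user-provided Secret
          key: PROMETHEUS_USERNAME
    - targetPath: secureJsonData.basicAuthPassword
      valueFrom:
        secretKeyRef:
          name: prometheus-credentials
          key: PROMETHEUS_PASSWORD
  datasource:
    name: prometheus
    type: prometheus
    access: proxy
    url: http://prometheus-operated.monitoring.svc:9090   # assumed URL
    basicAuth: true
    basicAuthUser: ${PROMETHEUS_USERNAME}
    secureJsonData:
      basicAuthPassword: ${PROMETHEUS_PASSWORD}
```

When no Secret is supplied, the operator could emit the same resource without the `valuesFrom` block and with the default credentials inlined, which matches the fallback described above.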

@JudeNiroshan

> The reason is that we won't add administrative permissions to the operator such as installing CRDs. Also, installing an operator comes with many configuration options

Understood. Let's keep the installation outside the SonataFlow operator (e.g., in a Helm chart).

> Can you create this dashboard and share it with me? So we can maintain and keep it in this repo.

Sure.

> we can accept a secret containing the Prometheus credentials or use the well-known if empty.

Agreed.

@ricardozanini Will this feature be considered for the next SonataFlow release (13th June 2024)?

@ricardozanini
Member Author

@JudeNiroshan I'm afraid not. Also, we've already cut upstream for the operator. This one should be on Apache KIE 11.x.

Projects
Status: 📋 Backlog

2 participants