feat: unify metrics ( cleanup and add missing metrics ) #2207
Signed-off-by: adarsh0728 <[email protected]>
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

Coverage Diff

| | main | #2207 | +/- |
|---|---|---|---|
| Coverage | 63.91% | 64.06% | +0.15% |
| Files | 338 | 338 | |
| Lines | 41085 | 41002 | -83 |
| Hits | 26259 | 26269 | +10 |
| Misses | 13756 | 13671 | -85 |
| Partials | 1070 | 1062 | -8 |
How have you tested it?
For the metrics that I removed: I ran pipelines locally with the HTTP, generator, and Kafka sources and compared the source-specific metrics with the forwarder metrics (see the testing doc).
…unter) metrics Signed-off-by: adarsh0728 <[email protected]>
@KeranYang @kohlisid @yhl25 - please review.
You probably noticed that one of the e2e tests is failing. That's because the vertex processing rate calculation stopped working: in Numaflow, for each vertex, we use the `forwarder_data_read_total` metric to calculate the processing rate. See here. Please make sure that we don't change this name for ANY type of vertex. @adarsh0728
@adarsh0728 please take a look at https://github.com/numaproj/numaflow/blob/main/pkg/daemon/server/service/rater/rater_test.go#L63 to understand how the metric is used to calculate the data processing rate, and check whether your PR somehow breaks that contract.
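To illustrate why the metric name is part of a contract, here is a minimal sketch of how a processing rate can be derived from successive readings of a counter such as `forwarder_data_read_total`. It is not the actual rater implementation: the `sample` type and the `ratePerSecond` function are hypothetical names introduced for illustration, and the real logic in the rater package may differ (for example in how it handles restarts or multiple pods).

```go
package main

import "fmt"

// sample is a hypothetical timestamped reading of a monotonically
// increasing counter (for example, a read-total counter).
type sample struct {
	unixSec int64   // time of the reading, in seconds
	total   float64 // counter value at that time
}

// ratePerSecond returns the average increase per second between the
// first and last samples. It returns 0 when there are fewer than two
// samples, when no time has elapsed, or when the counter went backwards
// (which this sketch treats as a restart and does not try to correct).
func ratePerSecond(samples []sample) float64 {
	if len(samples) < 2 {
		return 0
	}
	first, last := samples[0], samples[len(samples)-1]
	elapsed := last.unixSec - first.unixSec
	delta := last.total - first.total
	if elapsed <= 0 || delta < 0 {
		return 0
	}
	return delta / float64(elapsed)
}

func main() {
	// Two readings taken 10 seconds apart.
	readings := []sample{{unixSec: 0, total: 100}, {unixSec: 10, total: 600}}
	fmt.Println(ratePerSecond(readings))
}
```

If the counter is renamed or its labels change, a consumer like this simply finds no samples and reports a rate of zero, which is consistent with the failing e2e test described above.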
Thanks, checking.
@KeranYang @whynowy: I think I got the issue.
What would be the best way forward here? Should I introduce a new label for the partition index (option 1)? @KeranYang
@adarsh0728 thank you for diving deep into the issue. The problem was introduced because we moved the Kafka per-partition metrics to the source forwarder. I think we can simplify it by keeping the Kafka per-partition metric in the Kafka source and continuing to use the vertex name as the source partition name. cc: @whynowy
Sounds good to me.
Thanks @whynowy and @KeranYang. Will make the necessary changes and update.
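The approach agreed on in this thread can be sketched roughly as follows. This is an assumption-laden illustration using only the standard library, not Numaflow's code: the `registry` type, the `recordKafkaRead` helper, and the Kafka metric name `hypothetical_kafka_partition_read_total` are all made up for the example. The only name taken from the discussion is `forwarder_data_read_total`.

```go
package main

import "fmt"

// key identifies one counter series by metric name and partition label.
type key struct {
	metric    string
	partition string
}

// registry is a toy stand-in for a metrics registry (hypothetical).
type registry map[key]float64

// add increments the counter series identified by metric and partition.
func (r registry) add(metric, partition string, n float64) {
	r[key{metric, partition}] += n
}

const (
	// Name taken from the discussion; the rater relies on it.
	forwarderRead = "forwarder_data_read_total"
	// Hypothetical name for the Kafka-source-owned per-partition counter.
	kafkaRead = "hypothetical_kafka_partition_read_total"
)

// recordKafkaRead records per-partition counts under the Kafka source's
// own metric, and records the combined count under the forwarder metric
// using the vertex name as the partition label.
func recordKafkaRead(r registry, vertex string, perPartition map[string]float64) {
	var total float64
	for partition, n := range perPartition {
		r.add(kafkaRead, partition, n)
		total += n
	}
	r.add(forwarderRead, vertex, total)
}

func main() {
	r := registry{}
	recordKafkaRead(r, "in", map[string]float64{"topic-0": 3, "topic-1": 5})
	fmt.Println(r[key{forwarderRead, "in"}])
}
```

Keeping the per-partition detail in a separate series means a consumer that looks up the forwarder metric by vertex name keeps working unchanged.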
LGTM!
LGTM. Please double-check that metrics.md is in sync with your changes.
Resolves #2218
This PR aims to unify metrics.
Currently we expose metrics for individual sources and sinks (Kafka, HTTP, etc.) in addition to the forwarder metrics.
This PR cleans up those metrics and adds any that are missing.