-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Record metrics for invocations when they start #1411
Conversation
gateway/handlers/notifiers.go
Outdated
Inc() | ||
} else if event == "started" { | ||
serviceName := getServiceName(originalURL) | ||
p.Metrics.StartedCounter.WithLabelValues(serviceName).Inc() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice if the metric name was symmetric with the metric that capture the end of the request GatewayFunctionInvocation
and GatewayFunctionInvocationStarted
or using FunctionInvocationCompleted
and FunctionInvocationStarted
to be even more precise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already capture the end of the request using the existing metric.
What do you suggest? Renaming would be a breaking change, recording double the information would be hideous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am specifically talking about the variable name, Metrics.StartedCounter
isn't very descriptive. It is entirely possible we will want other started counters later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Give me a suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gateway_function_invocation_started
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean the name of the Go variable and my suggestion was/is GatewayFunctionInvocationStarted
or FunctionInvocationStarted
, so that this code would look like
p.Metrics.FunctionInvocationStarted.WithLabelValues(serviceName).Inc()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated as per that suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I see what you mean now. I haven't changed that internal variable, I'll do it next.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated and retrofitted into the other counter
func (p PrometheusFunctionNotifier) Notify(method string, URL string, originalURL string, statusCode int, duration time.Duration) { | ||
seconds := duration.Seconds() | ||
serviceName := getServiceName(originalURL) | ||
func (p PrometheusFunctionNotifier) Notify(method string, URL string, originalURL string, statusCode int, event string, duration time.Duration) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use an enum for the event
instead of an arbitrary string? it will be more type safe, easier to read in the docs and IDEs, and we can avoid bugs to do typos. Especially if we pair it with switch below on the enum constants.
type FunctionEvent int
const (
FunctionStarted = iota
FunctionEnded
)
func (d FunctionEvent) String() string {
return [...]string{"started", "ended"}[d]
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't disagree. This is an experimental PR to see if we can solve at least one of the issues for the user who is in production experiencing this problem today
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
¯_(ツ)_/¯ if you are happy with it, then i guess we can leave it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a reasonable enhancement
Thanks for looking at the metrics. Happy to do polishing after we get to something that works. This goes with https://github.com/openfaas-incubator/faas-idler/pull/35 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments
@stefanprodan @LucasRoesler I've added comments for this, also there was interest from a large enterprise user in this feature to know when long-running requests are started. |
@alexellis @stefanprodan @LucasRoesler |
Hi @StorkZ, Can you introduce who you are, what company you would be using this for and why you feel it's important? Why do you think that "progress has stalled"? Perhaps this isn't a priority for our roadmap? Alex |
I work at a saas startup called Kadromierz. We use this project to run exports for clients, that often are required only once a month. We use faas-idler to salce them down to 0 instances. The only problem is that there is a chance, that instance will be scaled to 0 before the function even ran. Currently we handle this by retrying given function call, but it is not a final solution. I didn't mean it to sound like I'm hastening you. I'm very grateful for this project and if there is something I can do to push this forward, then just let me know. |
I'm glad that OpenFaaS is useful to your business, thank you for giving some context. Perhaps you could start off with sending your use-case to the ADOPTERS file? #776 OpenFaaS is provided for free, features can be prioritized for clients who have a support relationship, otherwise it depends on the roadmap and what little time I can donate to the project. I'd suggest reading up on how the project works here: Setting expectations, support and SLAs In the meantime, I'll note that you are interested in this feature. Would you be willing to help with testing? |
Thanks for being patient with me. |
Can you outline a scenario that you are running / running into that we can use to reproduce the behaviour and then test a fix, if we proceed with this patch? |
c0d9233
to
6f2edb5
Compare
* This experimental patch records metrics as invocations start so that the metrics can be used to make better scale to zero decisions in faas-idler. Tested with Kubernetes on a single-node cluster, metrics reported as expected. Existing metrics still report. Signed-off-by: Alex Ellis (OpenFaaS Ltd) <[email protected]>
6f2edb5
to
b7f09c3
Compare
Squashing changes from review into a single commit. |
Signed-off-by: Alex Ellis (OpenFaaS Ltd) <[email protected]>
This should be a "no-harm" change to use the namespace/sub- system definition for CounterOpts. Signed-off-by: Alex Ellis (OpenFaaS Ltd) <[email protected]>
@StorkZ there are testing instructions and images for this and faasIdler shared in #contributors on Slack. |
Signed-off-by: Alex Ellis (OpenFaaS Ltd) [email protected]
Description
Record metrics for invocations when they start
Motivation and Context
How Has This Been Tested?
Tested with Kubernetes on a single-node cluster, metrics
reported as expected. Existing metrics still report.
Types of changes