Function invocation incorrect for multiple namespaces #1413
Comments
@LucasRoesler @Waterdrips PTAL |
/set title: Function invocation incorrect for multiple namespaces |
Hi @karuppiah7890, I also noticed this the other day and was thinking about the correct fix. I think the most direct way to fix this can be found in the scaling handler: faas/gateway/handlers/scaling.go Line 33 in b9a3476
We should pass the default namespace to the PrometheusNotifier struct Lines 75 to 77 in 238ce1b
Line 94 in 238ce1b
And then using the same methods as in the scaling handler, we can then set the metric correctly. Additionally, the faas/gateway/metrics/add_metrics.go Lines 53 to 55 in df97efa
faas/gateway/metrics/add_metrics.go Line 78 in df97efa
The namespace would need to be selected from the GET parameters and looks like this https://github.com/openfaas/faas-netes/blob/7a645a75749a130da130fc8fc77884e712fbac5e/handlers/reader.go#L25-L31 This of course requires constructing the handler so that it knows the default namespace, passing it here faas/gateway/metrics/add_metrics.go Line 17 in b9a3476
The two changes should fix the issue of saving and selecting the correct metric per namespace, including the default namespace. I think this is the shortest and easiest way to fix the behavior. Long term, we might also want to consider adding the namespace to the Prometheus labels. In summary:
|
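The namespace selection described in the comment above (read it from the GET parameters, fall back to a default) might look roughly like the sketch below. The helper names `lookupNamespace` and `namespaceFromRequest` and the host `gateway.example` are hypothetical and chosen only for illustration; this is not the code from the linked reader.go or from the gateway.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// lookupNamespace returns the "namespace" query parameter when it is set,
// and defaultNamespace otherwise. Hypothetical helper, not gateway code.
func lookupNamespace(query url.Values, defaultNamespace string) string {
	if namespace := query.Get("namespace"); len(namespace) > 0 {
		return namespace
	}
	return defaultNamespace
}

// namespaceFromRequest applies lookupNamespace to an incoming request.
// A handler would need to be constructed with the default namespace
// so that it can pass it in here.
func namespaceFromRequest(r *http.Request, defaultNamespace string) string {
	return lookupNamespace(r.URL.Query(), defaultNamespace)
}

func main() {
	const defaultNamespace = "openfaas-fn"

	for _, target := range []string{
		"http://gateway.example/system/functions?namespace=another-ns",
		"http://gateway.example/system/functions",
	} {
		req, err := http.NewRequest(http.MethodGet, target, nil)
		if err != nil {
			panic(err)
		}
		// Prints "another-ns" for the first URL and "openfaas-fn" for the second.
		fmt.Println(namespaceFromRequest(req, defaultNamespace))
	}
}
```

Keeping the lookup separate from the `*http.Request` makes the fallback easy to unit test without starting a server.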
A fix is being worked on by @viveksyngh in #1488 - I'm not sure if it follows the above 1:1, but we are reviewing it. |
Fixes openfaas#1413 Fixes openfaas/faas-netes#707 This adds both namespace and non-namespace scoped counts of invocation to metric aggregation and fixes the Gateway UI not updating the invocation count automatically without a page reload. Signed-off-by: Alistair Hey <[email protected]>
Fixes openfaas#1413 Fixes openfaas/faas-netes#707 This fixes the Gateway UI not updating the invocation count automatically without a page reload. Tested by deploying on a local cluster and making sure invocations go up with and without namespace suffix Signed-off-by: Alistair Hey <[email protected]>
Fixes #1413 Fixes openfaas/faas-netes#707 This fixes the Gateway UI not updating the invocation count automatically without a page reload. Tested by deploying on a local cluster and making sure invocations go up with and without namespace suffix Signed-off-by: Alistair Hey <[email protected]>
My actions before raising this issue
Expected Behaviour
I have a wordcount function in the openfaas-fn and another-ns namespaces.
I invoke the wordcount function multiple times in both namespaces from the web UI portal. The invocation count in the web UI shows the correct number of invocations in both namespaces, and the CLI shows the same correct count as the web UI portal when the function is invoked from the web UI portal. When the CLI invokes the function using $ faas invoke wordcount or $ faas invoke wordcount -n <namespace>, the count is shown correctly in both the CLI and the web UI portal.
Current Behaviour
I invoked the wordcount function multiple times in both namespaces from the web UI portal, but the invocation count in the web UI doesn't increase: it shows the same count in both namespaces. The CLI also shows the same count as the web UI portal when the function is invoked from the web UI portal. But when the CLI invokes the function using $ faas invoke wordcount, the count increases in both the CLI and the web UI portal.
Steps to Reproduce (for bugs)
Install openfaas in k8s with k3sup. This installs with openfaas-fn as the default namespace for functions. Log in to the gateway in the CLI.
Deploy functions in openfaas-fn using this
openfaas-fn namespace and in another-ns namespace
You will notice that the count doesn't increase in the UI, and it doesn't increase in the CLI either, when doing $ faas list or $ faas list -n openfaas-fn or $ faas list -n another-ns
Also check the web UI portal. The count shows up now and increases, but it is the number of CLI invocations.
Context
I was just trying out openfaas. Noticed that my invocation count shows up wrong
Your Environment
faas-cli version:
docker version (e.g. Docker 17.0.05):
Are you using Docker Swarm or Kubernetes (FaaS-netes)?
Kubernetes
Operating System and version (e.g. Linux, Windows, MacOS):
MacOS Catalina. v 10.15.2
Code example or link to GitHub repo or gist to reproduce problem:
Provided it all above 😄
Other diagnostic information / logs from troubleshooting guide
Initially I assumed that the invocation count comes from Prometheus, and later that turned out to be true. So I looked at Grafana and Prometheus to see the metrics for the invocation count. This is what Prometheus shows when I try what I have described above:
You can see below how the CLI shows the number as 6 for the wordcount function, in both namespaces. My default is openfaas-fn when nothing is provided.
In Prometheus, the metric which has the value 6 is this:
If you notice the function_name label value, it's wordcount. This is the count of invocations from the CLI with $ faas invoke wordcount
And there are two other metrics with different values, which relate to the wordcount function.
Notice the function_name label values: they are wordcount.another-ns and wordcount.openfaas-fn. These are the counts of invocations that happened when I invoked in the web UI. But the count doesn't show up as that; it shows up as 6.
After checking a bit of the code for how the metrics come about, and with some assumptions and intuitions based on input and output and connecting the dots, this is what I can say:
The key difference is how the request goes to the gateway. When I do a CLI invocation
the gateway log is like
and for the following invocations
or invoking in the web UI portal in the openfaas-fn namespace
the gateway log is like
for another-ns
or invoking in the web UI portal in the another-ns namespace
the gateway log is like
The invocation count is shown using the response from the gateway for list functions API. Checking the code, gateway uses the following code to find the invocation count using the prometheus metrics data
faas/gateway/server.go
Line 155 in 03dc882
faas/gateway/metrics/add_metrics.go
Lines 53 to 55 in df97efa
faas/gateway/metrics/add_metrics.go
Line 64 in df97efa
faas/gateway/metrics/add_metrics.go
Line 91 in df97efa
You can see how the metric label's function name and the function name from the provider (? not sure about the term 😅) are matched, without considering the namespace. So, given the above Prometheus metrics data, naturally the value 6 will come, no matter what namespace the user is looking at in the CLI or web UI portal.
So, that's one issue: reading of the invocation count data. I think to fix it, just adding the namespace along with the name, like <name>.<namespace>, should work. And a test for it too!
Next issue is, how did the wrong data even get into Prometheus in the first place? There are three sets of invocation counts, but only two namespaces. The metric with label function_name=wordcount is not a correct one; there should always be a namespace to be specific about which namespace the count refers to, even if there's just one. I guess I'm right about this: considering every function must be in a namespace and it's very clear that multiple namespaces are supported, the namespace must be part of the label value. Do correct me if I'm missing something 😅
And checking the code, the gateway is what exposes the metrics at port 8082, and Prometheus scrapes these metrics.
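The read-side fix suggested above (matching the label against <name>.<namespace> instead of the bare name) can be sketched as follows. The `sample` type and `invocationCount` function are hypothetical simplifications, not the gateway's actual types, and the counts 3 and 2 are made-up illustration values; only the value 6 for the bare wordcount series comes from the report above.

```go
package main

import "fmt"

// sample is a simplified, hypothetical stand-in for one Prometheus series of
// the invocation counter: the function_name label value and its count.
type sample struct {
	functionName string
	count        float64
}

// invocationCount sums the series whose function_name label equals
// "<name>.<namespace>", so counts from other namespaces are not mixed in.
func invocationCount(samples []sample, name, namespace string) float64 {
	key := fmt.Sprintf("%s.%s", name, namespace)

	var total float64
	for _, s := range samples {
		if s.functionName == key {
			total += s.count
		}
	}
	return total
}

func main() {
	samples := []sample{
		{functionName: "wordcount", count: 6},             // from the report
		{functionName: "wordcount.openfaas-fn", count: 3}, // illustrative value
		{functionName: "wordcount.another-ns", count: 2},  // illustrative value
	}

	fmt.Println(invocationCount(samples, "wordcount", "openfaas-fn")) // 3
	fmt.Println(invocationCount(samples, "wordcount", "another-ns"))  // 2
}
```

Note that in this sketch the series labelled with the bare name wordcount is not attributed to either namespace, which is why the write side (how the label is set in the first place) matters as well.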
Looking at the metrics data, the metrics seem to be right for web UI portal invocations and for CLI invocations with the namespace flag, except for the one with the function_name=wordcount label, which got created from CLI invocations without the namespace flag. This is how the gateway logs look for such a case
Now why does this log matter? The URL path in it matters, which is /function/wordcount
My guess based on the code - when requests are made, they go through these parts of the code
faas/gateway/server.go
Lines 204 to 206 in 03dc882
faas/gateway/server.go
Line 110 in 03dc882
functionNotifiers is here and has prometheus in it
faas/gateway/server.go
Line 83 in 03dc882
And the notify call is made here
faas/gateway/handlers/forwarding_proxy.go
Line 69 in 238ce1b
And for prometheus, the implementation is here
faas/gateway/handlers/notifiers.go
Line 49 in 238ce1b
and this is where the service name is obtained
faas/gateway/handlers/notifiers.go
Line 51 in 238ce1b
faas/gateway/handlers/notifiers.go
Line 76 in 238ce1b
And here is where the metric is created for prometheus to scrape
faas/gateway/handlers/notifiers.go
Lines 59 to 61 in 238ce1b
This is all good, but I think the service name will be wordcount when the URL is /function/wordcount, but it will be wordcount.openfaas-fn when the URL is /function/wordcount.openfaas-fn, and so the metric will also be wrong.
The following is speculation - I have to check the CLI code for this.
To fix this, I think the CLI has to make calls with the namespace in the URL path, like /function/wordcount.openfaas-fn if openfaas-fn is the default namespace. I think this is not happening, but it still works because behind the scenes, even with /function/wordcount, the default namespace function is picked up somehow. To make the invocation count work, though, we might have to pull in namespaces and use the first one as the default according to this idea and then use that for the request.
I'll check the CLI code next to understand better 😄 and also check the web UI portal code, and then post more here about my findings.