failed to list alerts... context canceled #151
Replies: 30 comments
-
Could you please give some more details?
-
Maybe it's a problem with the number of alerts?
-
Maybe, yes. We should see that in the logs.
-
That's strange, the bot is not logging any command.
-
I'm getting:
Is this related?
-
Are you sure your user is subscribed? If not, it cannot be found in the data store.
-
Yes, the user is definitely subscribed. I'm getting alerts, but I can't reliably use the /alerts command. It only works some of the time.
-
+1, but only some of the time.
-
Same problem here when calling
-
Something needs to be in the logs. Can you try setting
-
The third-to-last line was when I got the status correctly.
-
Hmm. Looks like you're running into timeouts? Sadly I don't log anything more verbose than
-
But the rest of it works just fine. I downloaded the linux-arm release and started it with the following arguments:
Maybe this will help you.
-
I am getting the same random behaviour: /status works something like 1 out of 3 or 4 times. I have increased the initialInterval to 2000 * time.Millisecond, but the problem is still the same. Any hint on where I should be looking?
-
After upgrading alertmanager and prometheus, the problem seems to be gone, both with the official image and with the one I built with an increased timeout.
-
Same bug for me :(
-
In the logs I get
but it worked today. I use the same code on a different machine.
-
Edit: I am back to msg="failed to get status" err="context canceled". It only worked one time. However, I have never managed to get it working without the workaround below.
I had the same issue. I think I found the problem. I am also using docker-compose. When trying a configuration similar to @optimistic5's, I get the issue. But if I switch to:
then /status works fine, which it didn't before. I think the issue is how to set up networking so that alertmanager-bot can reach the alertmanager inside the cluster. Any suggestions?
-
Yes, exactly. For these commands to work, the bot needs to be able to successfully make HTTP requests against alertmanager.
-
+1. The same problems occur periodically. For example, right now /silences and other commands work fine, but /alerts says: "failed to list alerts... context canceled". And it worked just recently. Exec'ing into the container and checking connectivity works fine:
-
Is that a continuous problem? If just one of these fails then that's totally normal. The network is not always reliable.
-
Only 2-3 requests out of 20 attempts succeed; the rest fail with the error "failed to list alerts... context canceled". /silences also produces errors at times, but just now all of its requests were successful.
-
What do you see in the logs of the bot and alertmanager while that happens?
-
Practically nothing...
-
I don't know if this information will help you, but I noticed the following: looking at the traffic to alertmanager, the bot always gets the correct JSON with alerts. Here are 2 screenshots of the traffic when requesting /alerts. The first is from when the error "failed to list alerts... context canceled" occurs. The second is from when the bot gives a correct response.
-
Same bug for me :(
-
Same bug. No solution yet? Why was the issue closed?
-
The problem occurs with bot version 0.4.0.