"kwatch detected a crash in pod" when pod is deleted normally #50

Closed
dyasny opened this issue Jan 20, 2022 · 3 comments · Fixed by #58

dyasny commented Jan 20, 2022

Describe the bug
When I delete a pod (or an object that contains pods, e.g. a Deployment) normally (also tested via Argo CD), I receive an alert about a pod crash:

 kwatch detected a crash in pod
There is an issue with container in a pod!
Name
rex2-c8f75f688-nxczd
Container
tools
Namespace
myns
Reason
Error
Events
...
[2022-01-20 20:36:45 +0000 UTC] Created Created container rex2
[2022-01-20 20:36:45 +0000 UTC] Started Started container rex2
...
[2022-01-20 20:38:34 +0000 UTC] Killing Stopping container rex2
...

 Logs
unable to retrieve container logs for docker://c733494c9012466a9d98cf6bacf629127f3741d6d23c18e48c71d3237466965d
kwatch

This isn't a crash but a normal container termination, so I should not be getting alerts.

To Reproduce
Steps to reproduce the behavior:

  • run kwatch
  • delete some pod

Expected behavior

I should not receive alerts on cleanly terminated pods, only on crashes

Actual behavior
Getting a crash alert

Version/Commit
v0.3.0

Collaborator

simonfrey commented Jan 24, 2022

Same here. Pods are stopped normally, but I still get an alert for them.
The logs in that alert follow one of two patterns.
Either the same as above:

unable to retrieve container logs for docker://XXX

or

failed to try resolving symlinks in path "/var/log/pods/XXX/YYY/0.log": lstat /var/log/pods/XXX/YYY/0.log: no such file or directory

Collaborator

simonfrey commented Jan 24, 2022

I guess this is related to #44.

@simonfrey
Collaborator

I think the following process is the problem:

  1. The container gets shut down by k8s
  2. The container has no graceful shutdown implemented
  3. k8s forcefully kills the container, which results in the unexpected alert above

I think a solution could be to parse the events and, if they contain "Killing Stopping container XXX", not treat this as a crash. But this definitely has some potential to silence real crashes as well.
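
To make the idea concrete, here is a rough sketch in Go of such an event check. It is not kwatch's actual code: the function name killedOnRequest is made up for illustration, and it assumes the events arrive as plain text lines in the same "[timestamp] Reason Message" shape as the alert quoted in this issue.

```go
package main

import (
	"fmt"
	"strings"
)

// killedOnRequest is a hypothetical helper (not part of kwatch).
// It reports whether any event line records the named container being
// stopped, i.e. reason "Killing" with message "Stopping container <name>".
// Assumes lines look like "[timestamp] Reason Message", as in the alert above.
func killedOnRequest(events []string, container string) bool {
	want := "Killing Stopping container " + container
	for _, line := range events {
		// Drop the leading "[timestamp] " prefix if present.
		if i := strings.Index(line, "] "); i >= 0 {
			line = line[i+2:]
		}
		// Exact match, so "rex2" does not also match "rex20".
		if strings.TrimSpace(line) == want {
			return true
		}
	}
	return false
}

func main() {
	events := []string{
		"[2022-01-20 20:36:45 +0000 UTC] Created Created container rex2",
		"[2022-01-20 20:36:45 +0000 UTC] Started Started container rex2",
		"[2022-01-20 20:38:34 +0000 UTC] Killing Stopping container rex2",
	}
	if killedOnRequest(events, "rex2") {
		fmt.Println("container was stopped on request, skipping alert")
	}
}
```

The caveat above still applies: a check like this only looks at the event text, so a container that really crashed and was then stopped would be silenced too.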
