-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add checks config fields and get_checks API #668
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, thanks!
Conflicts: - ops/testing.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of type hints for consistency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a quick skim, this is looking nice, thanks!
Let's please hold until the server is merged as there will be minor differences.
I assume we're waiting on this: canonical/pebble#86? |
@rwcarlsen Yep, that's right. I'll note that in the PR description. |
Do we have some discourse docs ready explaining checks to users/charm-devs? |
@rwcarlsen Yep, here they are: benhoyt/juju-docs#1 ... the plan is to add them (and Dora will review/edit) once canonical/pebble#86 and then this PR is merged. |
* remove Service.restarts (for now at least) * rename Check.failures to .threshold * change CheckInfo.healthy bool to .status str * remove CheckInfo.last_error and .error_details (for now at least) * add CheckInfo.threshold
Also add CheckStatus enum
@benhoyt Looking at activity today, looks like we're pretty darn close to getting this merged? :) |
This reverts commit 15d96a8, now that canonical/grafana-k8s-operator#68 is merged.
@benhoyt - have things settled enough here for another review? |
@rwcarlsen Yeah, definitely. The Pebble changes are now merged - go ahead and review. Also I think we're waiting on @niemeyer's review here, as this is an API change (though that probably won't be before next week, given the sprint). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Looks like we'll need to make an ops framework guide doc for health checks. Unless you have something in the works already - let's create an issue to track that.
ops/pebble.py
Outdated
@@ -775,6 +873,52 @@ def __repr__(self): | |||
).format(self=self) | |||
|
|||
|
|||
class CheckInfo: | |||
"""Check status information.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to get some more detail in this docstring since this is one of the main classes users will interact with for this - or at least a link to a discourse doc that discusses it. Particularly - what is status (i.e. the main thing users should be checking). I looked in the docs you made and couldn't find any mention there of status, UP, DOWN, etc. either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main doc is the Pebble SDK doc with a section on health checks: https://juju.is/docs/sdk/pebble#heading--health-checks. I have added a new model.Container.get_checks
and model.Container.get_check
method (to match get_services
and get_service
) and added a brief doc section on that here -- I'll add that to the Discourse doc as soon as this is merged.
I've also added much better docstrings on CheckInfo
, which will appear in the API documentation (and is linked to from the above).
'period': '50ms', | ||
'threshold': 2, | ||
'exec': { | ||
'command': 'sleep x', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is a dumb question - but sleep x
should exit nonzero right - so why does your assert below say the status is UP
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's only marked as DOWN
after two checks have taken place (and the number of failures has hit the threshold). Which happens after 2 times the 50ms period. So initially the check will be UP
, and only down after two tries, which is tested in the loop below. I've added comments here to clarify -- thanks.
class Check: | ||
"""Represents a check configuration in a Pebble configuration layer.""" | ||
|
||
def __init__(self, name: str, raw: typing.Dict = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the choice to make raw an untyped dict instead of unpacking it?
I would rather have Check('name', override:str='', level:str='', period:str='', ...)
, makes for a way easier to read initializer, and easier usage of the class.
If you really dislike doing check=Check('name', **my_raw_dict)
, then we should expose an alternative classmethod
Check.from_dict(name:str, dict_:dict)
.
If we were on more modern python I could also go for a dataclass or a TypedDict, but until then, this feels like better code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the whole Check class could very well be a dataclass, and save some eq, repr and to_dict boilerplate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this should arguably be done with a from_dict
. However, I've made this consistent with the other layer-configuration classes, which all do it the same way. If doing it again maybe we could have considered dataclasses or some such (then again, that's 3.7+ and ops
still needs to support 3.5 for now). So I'm going to leave as is to be consistent.
class Check: | ||
"""Represents a check configuration in a Pebble configuration layer.""" | ||
|
||
def __init__(self, name: str, raw: typing.Dict = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the whole Check class could very well be a dataclass, and save some eq, repr and to_dict boilerplate
Co-authored-by: Jon Seager <[email protected]>
@benhoyt would appreciate if you could add the docs you drafted at some point! Thanks for all your work on this! :D |
Will do on Monday, thanks @jnsgruk. |
Added brief docs here. I'll ask Dora to review. |
Add support for the new backoff and checks-related fields in the plan configuration types, as well as support for the new
/v1/checks
API (Client.get_checks
andContainer.get_checks
andContainer.get_check
).See also the brief docs I'm adding in benhoyt/juju-docs#3 (in addition to the API reference docs in this PR).