-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Added environment specific labels to client library when running in Cloud Run Jobs #877
Conversation
|
||
def set_item(key, val): | ||
if val: | ||
additional_labels[key] = val |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function seems a bit unoptimized for something that has to be run on each log
Can we remove this inner function?
I assume the env vars aren't expected to change during runtime, right? If we can rely on that, maybe we can build the additional_labels dict once at init time, and attach it to each cloud_run_jobs
log?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at it, we can build part of the additional_labels
dict once at init time, but something like setting the trace_id
label will have to be done per log record, as that label depends on what the value of _trace
is in each LogRecord
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The env vars aren't expected to change during runtime, but I've also refactored the GAE trace_id
label code into this change as well. Does that change during runtime?
if resource.type == _GAE_RESOURCE_TYPE and record._trace is not None: | ||
# add GAE-specific label | ||
labels = {_GAE_TRACE_ID_LABEL: record._trace, **(labels or {})} | ||
labels = {**add_environment_labels(resource, record), **(labels or {})} or None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to double-check: That this is only for the CloudLoggingHandler, not the StructuredLogHandler. That's intended, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think the Stackdriver daemon adds the proper labels.
def add_environment_labels_test_flow(self, resource_type, expected_labels): | ||
resource = Resource(type=resource_type, labels={}) | ||
actual_labels = add_environment_labels(resource, self.record) | ||
self.assertDictEqual(actual_labels, expected_labels) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this is just me, but I don't like this kind of indirection for assertions. I think this is one place where DRY doesn't apply. It feels to easy to accidently make all tests into no-ops and not notice, because the assert is separate from the test code
I prefer to have asserts in each test function (and not behind any conditionals if possible).
_monitored_resources._CLOUD_RUN_JOBS_TASK_INDEX_LABEL: test_task_index, | ||
_monitored_resources._CLOUD_RUN_JOBS_TASK_ATTEMPT_LABEL: test_task_attempt, | ||
}, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe also a test for other resource types?
Suggestion: This seems like it could be a good place for pytest.mark.paramaterize
to really scale up different test cases:
@pytest.mark.paramaterize("env,record_fields,resource,expected_labels", [
{}, {"_trace":"trace"}, "gae_app", {"appengine.googleapis.com/trace_id": "trace"},
{"CLOUD_RUN_EXECUTION": "test_name"}, {}, "cloud_run_job", {"run.googleapis.com/execution_name": "test_name"}
...
])
def test_add_environment_labels(env, resource, expected_labels):
...
…ons of add_resource_labels
"""Builds a dictionary of labels to be inserted into a LogRecord of the given resource type. | ||
This function should only build a dict of items that are consistent across multiple logging statements | ||
of the same resource type, such as environment variables. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe add a comment about cache?
_CLOUD_RUN_EXECUTION_ID, | ||
) | ||
set_item(_CLOUD_RUN_JOBS_TASK_INDEX_LABEL, _CLOUD_RUN_TASK_INDEX) | ||
set_item(_CLOUD_RUN_JOBS_TASK_ATTEMPT_LABEL, _CLOUD_RUN_TASK_ATTEMPT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this works, but I find the indirection a bit harder to read than it probably needs to be. Maybe something like:
var_map = ((_CLOUD_RUN_EXECUTION_ID, _CLOUD_RUN_JOBS_EXECUTION_NAME_LABEL), (_CLOUD_RUN_TASK_INDEX, _CLOUD_RUN_JOBS_TASK_INDEX_LABEL), (_CLOUD_RUN_TASK_ATTEMPT, _CLOUD_RUN_JOBS_TASK_ATTEMPT_LABEL))
for envvar, label_name in var_map:
value = os.environ.get(envvar, None):
if value:
labels[label_name] = value
Refactored code that adds
trace_id
label in App Engine to also add Cloud Run Jobs labels in the Cloud Run Jobs environment.Fixes #867