-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
⚗️ Add tracing spans to simcore services #6792
base: master
Are you sure you want to change the base?
⚗️ Add tracing spans to simcore services #6792
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6792 +/- ##
==========================================
+ Coverage 85.19% 88.51% +3.31%
==========================================
Files 1545 1201 -344
Lines 61634 52312 -9322
Branches 2155 1045 -1110
==========================================
- Hits 52508 46302 -6206
+ Misses 8799 5823 -2976
+ Partials 327 187 -140
Continue to review full report in Codecov by Sentry.
|
Quality Gate passedIssues Measures |
Alas, hotfix to prod ASAP, I say! |
from servicelib.logging_utils import config_all_loggers | ||
from simcore_service_api_server import exceptions | ||
|
||
setup_tracing_spans_for_simcore_service_functions() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really do not like that the setup has to be called depending on the imports
if it is experimental, why not to use DEV_FEATURE to enable/disable it?
I assume you will be on top of it right?
from servicelib.logging_utils import config_all_loggers | ||
from simcore_service_api_server import exceptions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def _opentelemetry_method_span(cls): | ||
for name, value in cls.__dict__.items(): | ||
if callable(value) and not name.startswith("_"): | ||
setattr(cls, name, _opentelemetry_function_span(value)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so here value
is a callable
and you basically decorate the member functions.
MINOR
setattr(cls, name, _opentelemetry_function_span(func=value))
|
||
def _opentelemetry_method_span(cls): | ||
for name, value in cls.__dict__.items(): | ||
if callable(value) and not name.startswith("_"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about properties, or (in general) descriptors , do you decorate them?
return None | ||
|
||
|
||
def setup_tracing_spans_for_simcore_service_functions(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why dont you pass the name of the module instead of relying on things that were imported?
If i undestand it right, yoy want to decorate all members and functions of a given module so the steps would be
- import all modules in a package
- inspect all members and functions
- apply decorators above
for 1 and 2 see what we do with walk_model_examples_in_package
. I guess is similar to the MetaPathFinder but higher level since uses built-in pkgutil.walk_packages
setattr(module, name, _opentelemetry_method_span(cls)) | ||
|
||
|
||
class _AddTracingSpansFinder(MetaPathFinder): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MetaPathFinder ... ?? this sounds like a wonder name for a spaceship 🚀 :-D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is nice to start a trial, but I would be very wary of:
- induce performance hits,
- every single function is now wrapped by network calls,
- what happens when the collector is down?
- how much overhead?
- how can I exclude some functions?
Actually I thought that you wanted to initialy allow to decorate some specific functions to be in the telemetry. I find instrumenting everything is too much with too many unknown on the effects.
If you wish we could try the benchmark test that lists all the services in the registry, I have it ready at the moment.
@@ -2,7 +2,16 @@ | |||
|
|||
""" | |||
|
|||
import importlib | |||
import importlib.machinery |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very cool name...
return wrapper | ||
|
||
|
||
def _opentelemetry_method_span(cls): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here I have some questions:
- how heavy is using this? If I get it right, using this wraps every single function with a network call to the telemetry collector right? That means any call is network-limited is that correct?
- if yes, then what happens if the collector is broken or the network is shaky?
- does this break the function?
- if there are network issues, does it retry? how many times? what is the timeout?
- if you telemetrize a ping function for example, that might be quite bad no? these function typically should run with no "brakes", can you exclude functions?
from servicelib.logging_utils import config_all_loggers | ||
from simcore_service_api_server import exceptions | ||
|
||
setup_tracing_spans_for_simcore_service_functions() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this not be called in your import directly? I have the feeling this is very fragile as this will for sure move with some of our tools.
also, this instruments everything? I guess my questions above then remain more than ever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
What do these changes do?
Related issue/s
How to test
Dev-ops checklist