# Record logger name as the instrumentation scope name #3810

Status: Closed
CHANGELOG.md (2 additions, 0 deletions)

@@ -42,6 +42,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
([#3778](https://github.com/open-telemetry/opentelemetry-python/pull/3778))
- Fix license field in pyproject.toml files
([#3803](https://github.com/open-telemetry/opentelemetry-python/pull/3803))
+- Record logger name as the instrumentation scope name
+  ([#3810](https://github.com/open-telemetry/opentelemetry-python/pull/3810))

## Version 1.23.0/0.44b0 (2024-02-23)

@@ -19,6 +19,7 @@
import logging
import threading
import traceback
+from functools import lru_cache
from os import environ
from time import time_ns
from typing import Any, Callable, Optional, Tuple, Union # noqa
@@ -452,9 +453,6 @@ def __init__(
) -> None:
super().__init__(level=level)
self._logger_provider = logger_provider or get_logger_provider()
-self._logger = get_logger(
-    __name__, logger_provider=self._logger_provider
-)
**Contributor:** This seems like a very significant change. But I am also confused about why the logging handler was storing a logger for its own source namespace, and not even the namespace of the logger it's added to. Can anyone explain the purpose of this?

**Contributor:** To be clear, this seems like a good change. I'm just surprised by how strange the existing code's functionality is, so I wanted to make sure there was not a good reason for it to be that way.

**Author:** It seems a single logger instance stored in `_logger` was chosen based on a perceived need for performance, per the discussion in the original issue #2485.

Based on the discussion yesterday, I'm planning to use pytest-benchmark to test whether there is a performance regression from creating one logger instance per logger name.
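The kind of comparison being proposed can be sketched without pytest-benchmark using only the standard library. This is a hypothetical illustration: `FakeLogger`, `get_logger_uncached`, and `get_logger_cached` are stand-in names, not the SDK's API.

```python
from functools import lru_cache
from timeit import timeit


class FakeLogger:
    """Hypothetical stand-in for the SDK Logger; not the real class."""

    def __init__(self, name: str) -> None:
        self.name = name


def get_logger_uncached(name: str) -> FakeLogger:
    # A fresh instance on every call, as in a naive per-record lookup.
    return FakeLogger(name)


@lru_cache(maxsize=None)
def get_logger_cached(name: str) -> FakeLogger:
    # One instance per distinct logger name, reused on later calls.
    return FakeLogger(name)


if __name__ == "__main__":
    n = 50_000
    t_uncached = timeit(lambda: get_logger_uncached("my.module"), number=n)
    t_cached = timeit(lambda: get_logger_cached("my.module"), number=n)
    print(f"uncached: {t_uncached:.4f}s  cached: {t_cached:.4f}s")
```

A real benchmark in the repo would exercise `LoggingHandler.emit` end to end, but the same cached-vs-uncached factory shape applies.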

**Contributor:** @tm0nk Are you planning on including these tests as part of this PR?


@staticmethod
def _get_attributes(record: logging.LogRecord) -> Attributes:
@@ -534,6 +532,7 @@ def _translate(self, record: logging.LogRecord) -> LogRecord:
"WARN" if record.levelname == "WARNING" else record.levelname
)

+logger = get_logger(record.name, logger_provider=self._logger_provider)
return LogRecord(
timestamp=timestamp,
observed_timestamp=observered_timestamp,
@@ -543,7 +542,7 @@ def _translate(self, record: logging.LogRecord) -> LogRecord:
severity_text=level_name,
severity_number=severity_number,
body=body,
-resource=self._logger.resource,
+resource=logger.resource,
attributes=attributes,
)

@@ -553,15 +552,15 @@ def emit(self, record: logging.LogRecord) -> None:

The record is translated to OTel format, and then sent across the pipeline.
"""
-if not isinstance(self._logger, NoOpLogger):
-    self._logger.emit(self._translate(record))
+logger = get_logger(record.name, logger_provider=self._logger_provider)
+if not isinstance(logger, NoOpLogger):
+    logger.emit(self._translate(record))

def flush(self) -> None:
"""
Flushes the logging output. Skip flushing if logger is NoOp.
"""
-if not isinstance(self._logger, NoOpLogger):
-    self._logger_provider.force_flush()
+self._logger_provider.force_flush()

**Member** (on the docstring above): Should we update this comment?
**Contributor:** Why remove the if statement here?

**Author:** There is no longer a single `self._logger` instance to check against. Perhaps we need to keep track of all logger instances (one for each logger name encountered) and call `force_flush()` if even a single one is not a `NoOpLogger`.
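The bookkeeping described above could look something like the following. This is a hypothetical sketch, not the PR's code: `HandlerSketch`, `NoOpLogger`, and `RealLogger` are stand-in names, and the real handler would delegate to the provider's `get_logger`.

```python
from typing import Callable, Dict


class NoOpLogger:
    """Stand-in for the API's no-op logger (illustrative only)."""


class RealLogger:
    """Stand-in for an SDK logger that actually records data."""


class HandlerSketch:
    """Hypothetical handler that remembers every logger it hands out."""

    def __init__(
        self,
        factory: Callable[[str], object],
        provider_flush: Callable[[], None],
    ) -> None:
        self._factory = factory
        self._provider_flush = provider_flush
        self._loggers: Dict[str, object] = {}  # logger name -> instance

    def get_logger(self, name: str) -> object:
        # Explicit registry (instead of lru_cache) so every instance
        # stays retrievable later.
        if name not in self._loggers:
            self._loggers[name] = self._factory(name)
        return self._loggers[name]

    def flush(self) -> None:
        # Flush only if at least one handed-out logger is not a no-op.
        if any(
            not isinstance(lg, NoOpLogger) for lg in self._loggers.values()
        ):
            self._provider_flush()
```

As the thread concludes below, unconditionally flushing is also harmless, so this extra tracking may not be worth the complexity.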

**Contributor:** Got it. It doesn't seem like there would be an issue with flushing a no-op either, so I think this works fine. Resolved.

**Member:** Is there an issue here with typing? The `LoggerProvider` is of the API variety, since it comes from the API's `get_logger_provider()` function, and the API class doesn't specify a `force_flush()` method.
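If a handler had to tolerate a provider typed against the API, which may not declare `force_flush()`, one defensive pattern is a `getattr` guard. This is a sketch of that pattern, not what the PR does; `flush_provider` and the two provider classes are invented for illustration.

```python
from typing import Any


def flush_provider(provider: Any) -> bool:
    """Call force_flush() only when the concrete provider implements it.

    Returns the provider's result when force_flush() exists, and True
    (nothing to flush) otherwise. Purely illustrative.
    """
    force_flush = getattr(provider, "force_flush", None)
    if callable(force_flush):
        return bool(force_flush())
    return True


class ApiOnlyProvider:
    """Hypothetical provider without force_flush(), like the API class."""


class SdkLikeProvider:
    """Hypothetical provider with force_flush(), like the SDK class."""

    def force_flush(self) -> bool:
        self.flushed = True
        return True
```

In practice the SDK provider does implement `force_flush()`, so the question above is about static typing rather than runtime behavior.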



class Logger(APILogger):
@@ -622,6 +621,7 @@ def __init__(
def resource(self):
return self._resource

+@lru_cache(maxsize=None)
**Contributor:** Is the point of this only to speed up the multiple calls to `get_logger`, or is there a functionality benefit? Given that the logs will be coming from multiple loggers, I am not sure this cache is necessary.

**Author:** Yes, this is only to speed up the multiple calls to `get_logger`. We could be more explicit in keeping track of one logger instance per logger name encountered; `@lru_cache` may not be the right approach here if we want to be able to retrieve all logger instances after they've been placed in the cache.
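The "retrieve all logger instances" concern can be shown with throwaway names (`make_logger` and `registry` are invented for this sketch): `lru_cache` only exposes hit/miss statistics, while an explicit dict keeps the cached objects enumerable.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def make_logger(name: str) -> object:
    return object()


make_logger("a")
make_logger("b")
# lru_cache reports counts, but offers no way to iterate the cached values:
stats = make_logger.cache_info()

# An explicit registry keeps every instance retrievable by name:
registry: dict = {}


def make_logger_tracked(name: str) -> object:
    if name not in registry:
        registry[name] = object()
    return registry[name]


make_logger_tracked("a")
make_logger_tracked("b")
```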

**Contributor:** "@lru_cache may not be the right approach here if we want to be able to retrieve all logger instances after they've been placed in the cache" — not sure I understood this. Could you explain further?

Please add cache tests and the pytest benchmarking you mentioned. Otherwise, looks great to me!

**Member:** Since we are not defining `maxsize`, this is the same as just `@cache`, right?
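For reference, `functools.cache` (Python 3.9+) is documented as equivalent to `lru_cache(maxsize=None)`, so the two spellings behave identically. A quick check with throwaway functions (invented for this sketch):

```python
from functools import cache, lru_cache


@lru_cache(maxsize=None)
def square_lru(n: int) -> int:
    return n * n


@cache  # Python 3.9+; documented shorthand for lru_cache(maxsize=None)
def square_cache(n: int) -> int:
    return n * n


square_lru(3)
square_lru(3)
square_cache(3)
square_cache(3)
# Both report one hit, one miss, and an unbounded cache (maxsize=None).
print(square_lru.cache_info())
print(square_cache.cache_info())
```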

def get_logger(
self,
name: str,
opentelemetry-sdk/tests/logs/test_export.py (32 additions, 0 deletions)

@@ -71,6 +71,9 @@ def test_simple_log_record_processor_default_level(self):
self.assertEqual(
warning_log_record.severity_number, SeverityNumber.WARN
)
+self.assertEqual(
+    finished_logs[0].instrumentation_scope.name, "default_level"
+)

def test_simple_log_record_processor_custom_level(self):
exporter = InMemoryLogExporter()
@@ -104,6 +107,12 @@ def test_simple_log_record_processor_custom_level(self):
self.assertEqual(
fatal_log_record.severity_number, SeverityNumber.FATAL
)
+self.assertEqual(
+    finished_logs[0].instrumentation_scope.name, "custom_level"
+)
+self.assertEqual(
+    finished_logs[1].instrumentation_scope.name, "custom_level"
+)

def test_simple_log_record_processor_trace_correlation(self):
exporter = InMemoryLogExporter()
@@ -129,6 +138,9 @@ def test_simple_log_record_processor_trace_correlation(self):
self.assertEqual(
log_record.trace_flags, INVALID_SPAN_CONTEXT.trace_flags
)
+self.assertEqual(
+    finished_logs[0].instrumentation_scope.name, "trace_correlation"
+)
exporter.clear()

tracer = trace.TracerProvider().get_tracer(__name__)
@@ -140,6 +152,10 @@ def test_simple_log_record_processor_trace_correlation(self):
self.assertEqual(log_record.body, "Critical message within span")
self.assertEqual(log_record.severity_text, "CRITICAL")
self.assertEqual(log_record.severity_number, SeverityNumber.FATAL)
+self.assertEqual(
+    finished_logs[0].instrumentation_scope.name,
+    "trace_correlation",
+)
span_context = span.get_span_context()
self.assertEqual(log_record.trace_id, span_context.trace_id)
self.assertEqual(log_record.span_id, span_context.span_id)
@@ -166,6 +182,9 @@ def test_simple_log_record_processor_shutdown(self):
self.assertEqual(
warning_log_record.severity_number, SeverityNumber.WARN
)
+self.assertEqual(
+    finished_logs[0].instrumentation_scope.name, "shutdown"
+)
exporter.clear()
logger_provider.shutdown()
with self.assertLogs(level=logging.WARNING):
@@ -206,6 +225,10 @@ def test_simple_log_record_processor_different_msg_types(self):
for item in finished_logs
]
self.assertEqual(expected, emitted)
+for item in finished_logs:
+    self.assertEqual(
+        item.instrumentation_scope.name, "different_msg_types"
+    )


class TestBatchLogRecordProcessor(ConcurrencyTestBase):
@@ -379,6 +402,8 @@ def test_shutdown(self):
for item in finished_logs
]
self.assertEqual(expected, emitted)
+for item in finished_logs:
+    self.assertEqual(item.instrumentation_scope.name, "shutdown")

def test_force_flush(self):
exporter = InMemoryLogExporter()
@@ -398,6 +423,9 @@ def test_force_flush(self):
log_record = finished_logs[0].log_record
self.assertEqual(log_record.body, "Earth is burning")
self.assertEqual(log_record.severity_number, SeverityNumber.FATAL)
+self.assertEqual(
+    finished_logs[0].instrumentation_scope.name, "force_flush"
+)

def test_log_record_processor_too_many_logs(self):
exporter = InMemoryLogExporter()
@@ -416,6 +444,8 @@
self.assertTrue(log_record_processor.force_flush())
finised_logs = exporter.get_finished_logs()
self.assertEqual(len(finised_logs), 1000)
+for item in finised_logs:
+    self.assertEqual(item.instrumentation_scope.name, "many_logs")

def test_with_multiple_threads(self):
exporter = InMemoryLogExporter()
@@ -443,6 +473,8 @@ def bulk_log_and_flush(num_logs):

finished_logs = exporter.get_finished_logs()
self.assertEqual(len(finished_logs), 2415)
+for item in finished_logs:
+    self.assertEqual(item.instrumentation_scope.name, "threads")

@unittest.skipUnless(
hasattr(os, "fork"),