-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: improved logging #89
base: development
Are you sure you want to change the base?
Conversation
Why do we want |
text = text.strip() | ||
if not text: | ||
return None | ||
|
||
topics, _ = self.model.transform([text]) | ||
topic = self.model.get_topic_info(topics[0]) | ||
with Timer(with_prefix(logger, "[topic]").debug): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be more convenient to have a timer as a decorator for the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm going to use a function decorator here, I will have to introduce a new function instead of using the code block.
How is it more convenient?
# Setting up log levels | ||
logger.setLevel(logging.DEBUG) | ||
# Setting log levels for the analytics application | ||
app_logger.setLevel(LOG_LEVEL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the LOG_LEVEL set for the app logger and not for the root logger here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting it for the root logger enables logging in the depedency packages (like urllib3
and sentence_transformers
). I'm not sure that's what we want. Typically LOG_LEVEL
is understood as a log level of the application itself.
Moreover, we could not be sure that LOG_LEVEL=INFO
in the dependency packages do not expose sensitive information. So I'm actually disinclined to enable any LOG_LEVEL for the root logger, even INFO.
async def influx_writer_impl(record: Point): | ||
await influx_write_api.write(bucket=influx_bucket, record=record) | ||
async def influx_writer_impl(logger: Logger, record: Point): | ||
with Timer(with_prefix(logger, "[influx]").debug): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reply to: #89 (comment)
langid
, influx
and topic
are the names of the tasks.
They do not map to the module names straightforwardly.
In which module a task is defined is random.
A simple module renaming breaks the format of logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed - used individual loggers
@@ -54,12 +51,18 @@ async def detect_lang_by_text(text: str) -> str | None: | |||
if not text: | |||
return None | |||
|
|||
logger = logging.getLogger("app.langid") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move it outside of the function to a module level?
except Exception: | ||
pass | ||
except Exception as e: | ||
logger.error(f"error: {str(e)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not logger.exception(e)
to have stack trace included?
async def _task(i: int, message_str: str) -> dict: | ||
add_logger_prefix(f"[{i}/{n}]") | ||
|
||
async with Timer(logger.debug): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This timer results will likely to be skewed, because you will have a lot of tasks run concurrently due to asyncio.gather
.
I'm not sure that this is what was expected here.
logger.info("success") | ||
return {"status": "success"} | ||
except starlette.requests.ClientDisconnect: | ||
return _error("client disconnect") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just use starlette.requests.ClientDisconnect
here?
I'm not sure that all this extra code (including defining a function within a function) worth it since the error is not returned to the response anyway.
Batches: 100%|███████████████| 1/1 [00:00<00:00, 1.44it/s]
tqdm progress bars printed on each requestChat completion response length {int}
info logs printed on each requestBefore:
After:
After V2: