Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: improved logging #89

Open
wants to merge 10 commits into
base: development
Choose a base branch
from
Open

feat: improved logging #89

wants to merge 10 commits into from

Conversation

adubovik
Copy link
Collaborator

@adubovik adubovik commented Dec 27, 2024

  1. Removed Batches: 100%|███████████████| 1/1 [00:00<00:00, 1.44it/s] tqdm progress bars printed on each request
  2. Removed Chat completion response length {int} info logs printed on each request
  3. Added logger and timings to the individual stages of message processing: topic, lang_id, influx
  4. Used the same log format that is used in adapters
  5. Added LOG_LEVEL env var that resolves Log level management #54

Before:

INFO:     Loading environment from '.env'
2024-12-27 14:20:29,709 [INFO] - Started server process [83620]
2024-12-27 14:20:29,709 [INFO] - Waiting for application startup.
2024-12-27 14:20:30,255 [INFO] - Load pretrained SentenceTransformer: all-mpnet-base-v2
2024-12-27 14:20:31,684 [INFO] - Use pytorch device_name: mps
Batches[encode]: 100%|███████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.74it/s]
2024-12-27 14:20:32,640 [INFO] - Application startup complete.
2024-12-27 14:20:32,640 [INFO] - Uvicorn running on http://127.0.0.1:5006 (Press CTRL+C to quit)
...
2024-12-27 14:20:58,702 [INFO] - Chat completion response length 4
Batches[encode]: 100%|███████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  3.35it/s]
2024-12-27 14:20:59,028 [INFO] - Chat completion response length 4
Batches[encode]: 100%|███████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  4.93it/s]
2024-12-27 14:20:59,244 [INFO] - Chat completion response length 4
Batches[encode]: 100%|███████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  3.33it/s]
2024-12-27 14:20:59,552 [INFO] - Chat completion response length 4
Batches[encode]: 100%|███████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.89it/s]
2024-12-27 14:20:59,943 [INFO] - Chat completion response length 4
Batches[encode]: 100%|███████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  6.47it/s]
2024-12-27 14:21:00,116 [INFO] - 127.0.0.1:62705 - "POST /data HTTP/1.1" 200
2024-12-27 14:21:00,117 [INFO] - 127.0.0.1:62708 - "POST /data HTTP/1.1" 200
2024-12-27 14:21:00,117 [INFO] - 127.0.0.1:62699 - "POST /data HTTP/1.1" 200
2024-12-27 14:21:00,117 [INFO] - 127.0.0.1:62701 - "POST /data HTTP/1.1" 200
2024-12-27 14:21:00,130 [INFO] - 127.0.0.1:62704 - "POST /data HTTP/1.1" 200
2024-12-27 14:21:00,130 [INFO] - 127.0.0.1:62703 - "POST /data HTTP/1.1" 200
2024-12-27 14:21:00,131 [INFO] - 127.0.0.1:62700 - "POST /data HTTP/1.1" 200
2024-12-27 14:21:00,131 [INFO] - 127.0.0.1:62709 - "POST /data HTTP/1.1" 200
2024-12-27 14:21:00,133 [INFO] - 127.0.0.1:62702 - "POST /data HTTP/1.1" 200
2024-12-27 14:21:00,134 [INFO] - 127.0.0.1:62710 - "POST /data HTTP/1.1" 200

After:

INFO:     Loading environment from '.env'
INFO:     | 2024-12-27 14:15:27,123 | 40334 | uvicorn.error | Started server process [40334]
INFO:     | 2024-12-27 14:15:27,124 | 40334 | uvicorn.error | Waiting for application startup.
INFO:     | 2024-12-27 14:15:28,204 | 40334 | sentence_transformers.SentenceTransformer | Load pretrained SentenceTransformer: all-mpnet-base-v2
INFO:     | 2024-12-27 14:15:29,233 | 40334 | sentence_transformers.SentenceTransformer | Use pytorch device_name: mps
INFO:     | 2024-12-27 14:15:30,463 | 40334 | uvicorn.error | Application startup complete.
INFO:     | 2024-12-27 14:15:30,480 | 40334 | uvicorn.error | Uvicorn running on http://127.0.0.1:5006 (Press CTRL+C to quit)
...
INFO:     | 2024-12-27 14:18:42,464 | 40334 | app | number of messages: 2
DEBUG:    | 2024-12-27 14:18:42,567 | 40334 | app | [2/2][topic] 0.151s
DEBUG:    | 2024-12-27 14:18:42,568 | 40334 | app | [2/2][langid] 0.004s
DEBUG:    | 2024-12-27 14:18:42,647 | 40334 | app | [1/2][influx] 2.377s
INFO:     | 2024-12-27 14:18:42,879 | 40334 | app | [1/2] success
DEBUG:    | 2024-12-27 14:18:42,934 | 40334 | app | [1/2] 2.721s
...
DEBUG:    | 2024-12-27 14:18:43,242 | 40334 | app | [2/2][influx] 0.558s
INFO:     | 2024-12-27 14:18:43,256 | 40334 | app | [2/2] success
DEBUG:    | 2024-12-27 14:18:43,345 | 40334 | app | [2/2] 0.758s
DEBUG:    | 2024-12-27 14:18:43,345 | 40334 | app | 2.676s
DEBUG:    | 2024-12-27 14:18:43,566 | 40334 | app | response: [{"status": "success"}, {"status": "success"}]
INFO:     | 2024-12-27 14:18:43,678 | 40334 | uvicorn.access | 127.0.0.1:62587 - "POST /data HTTP/1.1" 200

After V2:

INFO:     Loading environment from '.env'
INFO:     | 2024-12-27 14:15:27,123 | 40334 | uvicorn.error | Started server process [40334]
INFO:     | 2024-12-27 14:15:27,124 | 40334 | uvicorn.error | Waiting for application startup.
INFO:     | 2024-12-27 14:15:30,463 | 40334 | uvicorn.error | Application startup complete.
INFO:     | 2024-12-27 14:15:30,480 | 40334 | uvicorn.error | Uvicorn running on http://127.0.0.1:5006 (Press CTRL+C to quit)
...
INFO:     | 2024-12-27 14:18:42,464 | 40334 | app | number of messages: 2
DEBUG:    | 2024-12-27 14:18:42,567 | 40334 | app.topic | [2/2] 0.151s
DEBUG:    | 2024-12-27 14:18:42,568 | 40334 | app.langid | [2/2] 0.004s
DEBUG:    | 2024-12-27 14:18:42,647 | 40334 | app.influx | [1/2] 2.377s
INFO:     | 2024-12-27 14:18:42,879 | 40334 | app | [1/2] success
DEBUG:    | 2024-12-27 14:18:42,934 | 40334 | app | [1/2] 2.721s
...
DEBUG:    | 2024-12-27 14:18:43,242 | 40334 | app.influx | [2/2] 0.558s
INFO:     | 2024-12-27 14:18:43,256 | 40334 | app | [2/2] success
DEBUG:    | 2024-12-27 14:18:43,345 | 40334 | app | [2/2] 0.758s
DEBUG:    | 2024-12-27 14:18:43,345 | 40334 | app | 2.676s
DEBUG:    | 2024-12-27 14:18:43,566 | 40334 | app | response: [{"status": "success"}, {"status": "success"}]
INFO:     | 2024-12-27 14:18:43,678 | 40334 | uvicorn.access | 127.0.0.1:62587 - "POST /data HTTP/1.1" 200

@adubovik adubovik self-assigned this Dec 27, 2024
@adubovik adubovik requested a review from Allob as a code owner December 27, 2024 14:22
@Allob
Copy link
Collaborator

Allob commented Jan 17, 2025

INFO:     | 2024-12-27 14:18:42,464 | 40334 | app | number of messages: 2
DEBUG:    | 2024-12-27 14:18:42,567 | 40334 | app | [2/2][topic] 0.151s
DEBUG:    | 2024-12-27 14:18:42,568 | 40334 | app | [2/2][langid] 0.004s
DEBUG:    | 2024-12-27 14:18:42,647 | 40334 | app | [1/2][influx] 2.377s
INFO:     | 2024-12-27 14:18:42,879 | 40334 | app | [1/2] success
DEBUG:    | 2024-12-27 14:18:42,934 | 40334 | app | [1/2] 2.721s

Why do we want [langid] and others to be a message prefix instead of the name of the logger?
I think standard approach like logger = logging.getLogger("langid") or even logger = logging.getLogger(__name__) should work well enough here.

text = text.strip()
if not text:
return None

topics, _ = self.model.transform([text])
topic = self.model.get_topic_info(topics[0])
with Timer(with_prefix(logger, "[topic]").debug):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be more convenient to have a timer as a decorator for the function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm going to use a function decorator here, I will have to introduce a new function instead of using the code block.

How is it more convenient?

# Setting up log levels
logger.setLevel(logging.DEBUG)
# Setting log levels for the analytics application
app_logger.setLevel(LOG_LEVEL)
Copy link
Collaborator

@Allob Allob Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the LOG_LEVEL set for the app logger and not for the root logger here?

Copy link
Collaborator Author

@adubovik adubovik Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting it for the root logger enables logging in the depedency packages (like urllib3 and sentence_transformers). I'm not sure that's what we want. Typically LOG_LEVEL is understood as a log level of the application itself.

Moreover, we could not be sure that LOG_LEVEL=INFO in the dependency packages do not expose sensitive information. So I'm actually disinclined to enable any LOG_LEVEL for the root logger, even INFO.

async def influx_writer_impl(record: Point):
await influx_write_api.write(bucket=influx_bucket, record=record)
async def influx_writer_impl(logger: Logger, record: Point):
with Timer(with_prefix(logger, "[influx]").debug):
Copy link
Collaborator Author

@adubovik adubovik Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reply to: #89 (comment)

langid, influx and topic are the names of the tasks.
They do not map to the module names straightforwardly.
In which module a task is defined is random.
A simple module renaming breaks the format of logs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed - used individual loggers

@@ -54,12 +51,18 @@ async def detect_lang_by_text(text: str) -> str | None:
if not text:
return None

logger = logging.getLogger("app.langid")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move it outside of the function to a module level?

except Exception:
pass
except Exception as e:
logger.error(f"error: {str(e)}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not logger.exception(e) to have stack trace included?

async def _task(i: int, message_str: str) -> dict:
add_logger_prefix(f"[{i}/{n}]")

async with Timer(logger.debug):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This timer results will likely to be skewed, because you will have a lot of tasks run concurrently due to asyncio.gather.
I'm not sure that this is what was expected here.

logger.info("success")
return {"status": "success"}
except starlette.requests.ClientDisconnect:
return _error("client disconnect")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use starlette.requests.ClientDisconnect here?
I'm not sure that all this extra code (including defining a function within a function) worth it since the error is not returned to the response anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Log level management
2 participants