Replies: 1 comment
-
I'm also looking for this kind of profiler. Is there any update on this?
-
vLLM currently has three schemes for system monitoring and performance analysis: OTel, Prometheus, and the torch profiler.
How about providing a service profiler that records only the detailed events inside the vLLM framework (excluding torch)?
For example, the service profiler could record each request's arrival and finish time (not the life cycle of the vLLM service itself; the user would control when the service profiler starts and stops), along with the important events during the request's life cycle, such as queue switching, prefill forward, decode forward, token sampling, and block swap in/out, together with the relevant request id.
Just like the torch profiler, the service profiler would export performance data in trace event format; viewing it in perfetto/chrome trace would make the inference system no longer a black box to users and make it easier to analyze performance issues in the vLLM framework.
The service profiler should be a lightweight offline tool for performance analysis. An example:
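To make the idea concrete, here is a minimal sketch of what such a service profiler could look like. All names (`ServiceProfiler`, `record`, `dump`) are hypothetical, not existing vLLM APIs; the only fixed part is the Chrome trace event format ("X" complete events with microsecond `ts`/`dur`), which Perfetto and chrome://tracing load directly.

```python
import json
import time


class ServiceProfiler:
    """Hypothetical offline service profiler: collects request-level
    events and dumps them in Chrome trace event format."""

    def __init__(self):
        self._events = []
        self._t0 = time.perf_counter()

    def now_us(self):
        # Trace event timestamps are expressed in microseconds.
        return int((time.perf_counter() - self._t0) * 1e6)

    def record(self, name, request_id, start_us, end_us):
        # "ph": "X" is a complete event: a start timestamp plus a duration.
        self._events.append({
            "name": name,            # e.g. prefill_forward, decode_forward
            "ph": "X",
            "ts": start_us,
            "dur": end_us - start_us,
            "pid": 0,                # one "process" for the whole service
            "tid": request_id,       # one row per request in the viewer
            "args": {"request_id": request_id},
        })

    def dump(self, path):
        # The resulting JSON opens directly in Perfetto / chrome://tracing.
        with open(path, "w") as f:
            json.dump({"traceEvents": self._events}, f)


# Illustrative usage with fabricated timings for one request.
profiler = ServiceProfiler()
t = profiler.now_us()
profiler.record("prefill_forward", request_id=1, start_us=t, end_us=t + 1200)
profiler.record("decode_forward", request_id=1, start_us=t + 1200, end_us=t + 1500)
profiler.dump("service_trace.json")
```

Because the output is plain trace event JSON, no custom viewer is needed; the same tooling users already apply to torch profiler traces works here.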