Add decorator to profile unit subpub methods, added to GenAxisArray #67

Draft
wants to merge 3 commits into
base: dev

cboulay
Contributor

@cboulay cboulay commented Dec 26, 2024

This adds a new .util.profile submodule with a decorator that can be used on a Unit's sub-pub methods.

I'm not very good at decorators, especially when using generators and asyncio together, so this will definitely need another look.

@cboulay
Contributor Author

cboulay commented Dec 26, 2024

Adding some details (docs will need to be added after we settle on implementation details):

  • Only active when environment variable EZMSG_LOGLEVEL is set to DEBUG.
  • Will log to ~/.ezmsg/profile/ezprofiler.log unless a filename is specified in environment variable EZMSG_PROFILE.
  • The log file is a CSV with a header line for column titles: Source,Topic,SampleTime,PerfCounter,Elapsed
    • Source -- the module name
    • Topic -- the topic name WITHOUT the stream name (i.e., no OUTPUT_SIGNAL or INPUT_SIGNAL)
    • SampleTime -- the timestamp associated with the first (default) or the last sample in a chunk.
    • PerfCounter -- time.perf_counter() immediately after the node yielded its result.
    • Elapsed -- the difference between time.perf_counter() before the node received its data and after it yielded its result.
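For illustration, here is a minimal sketch of how a decorator with the behaviour listed above could be structured. It is not the code in this PR: `profile_subpub_sketch`, `append_row`, and the other helper names are hypothetical, the method is assumed to be an async generator taking `(self, msg)`, `EZMSG_PROFILE` is assumed to hold a path that is used as-is, and SampleTime is always written as None because the sketch knows nothing about the message type. When a method yields several results, Elapsed is measured from the moment the input was received, which is one possible reading of the description.

```python
import csv
import functools
import os
import time
from pathlib import Path

HEADER = ["Source", "Topic", "SampleTime", "PerfCounter", "Elapsed"]


def profiling_enabled() -> bool:
    # Per the description: only active when EZMSG_LOGLEVEL is DEBUG.
    return os.environ.get("EZMSG_LOGLEVEL", "").upper() == "DEBUG"


def profile_log_path() -> Path:
    # Default location unless EZMSG_PROFILE names another file.
    # Assumption: the variable holds a path that is used unchanged.
    default = Path.home() / ".ezmsg" / "profile" / "ezprofiler.log"
    return Path(os.environ.get("EZMSG_PROFILE") or default)


def append_row(row) -> None:
    # Hypothetical helper: append one CSV row, writing the header first
    # if the file is new or empty.
    path = profile_log_path()
    path.parent.mkdir(parents=True, exist_ok=True)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(HEADER)
        writer.writerow(row)


def profile_subpub_sketch(topic: str):
    """Hypothetical decorator for an async-generator method (not the PR's code)."""

    def decorator(func):
        if not profiling_enabled():
            return func  # checked once, at decoration time; no wrapper when off

        @functools.wraps(func)
        async def wrapper(self, msg):
            start = time.perf_counter()  # before the node receives its data
            async for result in func(self, msg):
                stop = time.perf_counter()  # right after the node yielded
                # SampleTime is None here; csv.writer emits None as an empty field.
                append_row([type(self).__module__, topic, None, stop, stop - start])
                yield result

        return wrapper

    return decorator
```

Opening the file on every result keeps the sketch short but would be too slow for real use; buffering rows or going through the `logging` module would be the obvious alternatives.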

Some notes:

The reason for tracking SampleTime and PerfCounter (rather than just Elapsed) is so we can follow a specific sample of data through a pipeline and see its accumulated processing time. Looking only at the Elapsed values is not enough because some units yield more than one result per input (e.g., Window on a particularly large chunk of input data). However, I've encountered some anomalies tracking samples this way, especially with units that modify the number of samples along the time axis (e.g., Downsample and Window). I'll need to sort that out before merging.
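As a rough sketch of that kind of tracking, the log rows can be grouped by SampleTime and the end-to-end time taken from the first node receiving the data to the last node yielding its result. The function name `pipeline_latency` is hypothetical, and the sketch assumes every unit reports the same SampleTime for the same data, which is exactly the assumption that breaks for units such as Downsample and Window.

```python
import csv
import io
from collections import defaultdict


def pipeline_latency(csv_text: str) -> dict:
    """Map each SampleTime to the time between the first node receiving
    the data and the last node yielding its result (hypothetical helper)."""
    by_sample = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["SampleTime"] in ("", "None"):
            continue  # rows without a known SampleTime cannot be tracked
        by_sample[float(row["SampleTime"])].append(
            (float(row["PerfCounter"]), float(row["Elapsed"]))
        )

    latency = {}
    for sample_time, entries in by_sample.items():
        entries.sort()  # order by PerfCounter, i.e. by time of yield
        first_counter, first_elapsed = entries[0]
        received = first_counter - first_elapsed  # when the first node got its input
        latency[sample_time] = entries[-1][0] - received
    return latency
```

Summing the Elapsed column per SampleTime instead would give the accumulated processing time without the time spent between nodes.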

When the SampleTime cannot be determined, i.e., because the method does not yield AxisArray messages, it is recorded as None.
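One way to picture the SampleTime lookup, including the None fallback and the first-or-last option, is sketched here. `FakeTimeChunk` is a hypothetical stand-in with a linear time axis (an offset plus a fixed sample spacing); it is not ezmsg's AxisArray, and the actual lookup in the PR may differ.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeTimeChunk:
    """Hypothetical stand-in for a chunk with a linear time axis."""

    offset: float  # timestamp of the first sample in the chunk
    gain: float  # spacing between consecutive samples, in seconds
    n_samples: int  # number of samples along the time axis


def sample_time(msg: Any, use_last: bool = False) -> Optional[float]:
    # Unknown message types have no recoverable timestamp.
    if not isinstance(msg, FakeTimeChunk):
        return None
    if use_last:
        # Timestamp of the last sample in the chunk.
        return msg.offset + (msg.n_samples - 1) * msg.gain
    # Default: timestamp of the first sample.
    return msg.offset
```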

I think I should add a datetime to each line.

BTW, I'm working on a live dashboard to monitor performance of a running pipeline, at least for nodes that are decorated.
