Support for L0 Tracer Metrics #285

Open
matcabral opened this issue Feb 27, 2024 · 0 comments
Labels
API: Tools enhancement New feature or request

Summary

Extend L0 metrics support with a new collection paradigm that allows retrieving asynchronous events. The proposed sampling type name is "Tracer based metrics".

Details

Motivation

Existing L0 metrics collection modes (streamer and query) are limited to events produced at a fixed occurrence rate that is defined during configuration. The proposal is therefore to add extension APIs that allow events of a different nature, for example asynchronous events.

Interoperability with Other APIs

A new sampling type will be added to https://spec.oneapi.io/level-zero/latest/tools/api.html#zet-metric-group-sampling-type-flags-t to identify which metric groups can be used with the new set of APIs. APIs that are independent of the collection mode (e.g. zetMetricGroupGet(), zetMetricGroupGetProperties(), zetMetricGet()) will work with all metric groups.
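As a minimal sketch of how a tool might pick out metric groups usable with the new APIs, the snippet below filters on the sampling-type bitfield. `GroupInfo` is a stand-in for the `samplingType` member that zetMetricGroupGetProperties() reports, and `kTracerBasedBit` is a placeholder: the proposal does not assign a numeric value to ZET_METRIC_SAMPLING_TYPE_EXP_FLAG_TRACER_BASED. Both names are hypothetical and used only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Placeholder bit for ZET_METRIC_SAMPLING_TYPE_EXP_FLAG_TRACER_BASED.
// The proposal does not state a value; this one is assumed.
constexpr uint32_t kTracerBasedBit = 1u << 2;

// Hypothetical stand-in for the sampling type bitfield of a metric group.
struct GroupInfo {
    uint32_t samplingType;
};

// Returns the indices of the groups whose sampling type includes the tracer bit.
std::vector<size_t> selectTracerGroups(const std::vector<GroupInfo>& groups) {
    std::vector<size_t> selected;
    for (size_t i = 0; i < groups.size(); ++i) {
        if ((groups[i].samplingType & kTracerBasedBit) != 0) {
            selected.push_back(i);
        }
    }
    return selected;
}
```

Because the sampling type is a bitfield, a group that supports several collection modes still matches as long as the tracer bit is set.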

Proposed APIs

New Enumerations

ZET_METRIC_SAMPLING_TYPE_EXP_FLAG_TRACER_BASED

Extends the sampling types:

zet_metric_group_sampling_type_flags_t {
...
ZET_METRIC_SAMPLING_TYPE_EXP_FLAG_TRACER_BASED
}

New Stypes

ZET_STRUCTURE_TYPE_METRIC_TRACER_DESC_EXP

Extends the tools stypes:

zet_structure_type_t {
...
ZET_STRUCTURE_TYPE_METRIC_TRACER_DESC_EXP
}

New Handles

zet_metric_tracer_exp_handle_t

Handle of the metric tracer object.

New Structures

zet_metric_tracer_exp_desc_t

zet_metric_tracer_exp_desc_t {
        zet_structure_type_t stype;
        const void *pNext;
        uint32_t notifyEveryNBytes;
}
Attribute Description
stype [in] expected to be set to ZET_STRUCTURE_TYPE_METRIC_TRACER_DESC_EXP
pNext [in,out][optional] must be null or a pointer to an extension-specific structure (i.e. contains stype and pNext)
notifyEveryNBytes [in,out] number of collected bytes after which the notification event will be signaled. If the requested value is not supported exactly, the driver may use the closest supported approximation and shall update this member during zetMetricTracerCreateExp()
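To illustrate what "closest supported approximation" could mean for notifyEveryNBytes, here is a minimal sketch that assumes a driver only supports multiples of some buffer granularity. The function name, the granularity parameter and the tie-breaking rule are all assumptions made for this example. The proposal does not specify how a driver picks the value, so applications should simply read the member back after creation.

```cpp
#include <cstdint>

// Hypothetical helper: rounds the requested notification threshold to the
// nearest non-zero multiple of an assumed hardware granularity.
uint32_t closestSupportedNotifySize(uint32_t requested, uint32_t granularity) {
    if (granularity == 0) {
        return requested;  // no constraint assumed, keep the request as is
    }
    uint64_t down = (static_cast<uint64_t>(requested) / granularity) * granularity;
    uint64_t up = down + granularity;
    // Pick the nearer multiple; ties round up (an arbitrary choice here).
    uint64_t nearest = (requested - down < up - requested) ? down : up;
    if (nearest == 0) {
        nearest = granularity;  // never notify every 0 bytes
    }
    if (nearest > UINT32_MAX) {
        nearest = down;  // stay within the range of the descriptor member
    }
    return static_cast<uint32_t>(nearest);
}
```

With an assumed granularity of 64 bytes, a request of 1000 would be adjusted to 1024, while a request of 1024 would be left unchanged.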

New Functions

zetMetricTracerCreateExp

zetMetricTracerCreateExp(zet_context_handle_t hContext, zet_device_handle_t hDevice, uint32_t metricGroupCount, zet_metric_group_handle_t *phMetricGroups, zet_metric_tracer_exp_desc_t *desc, ze_event_handle_t hNotificationEvent,
zet_metric_tracer_exp_handle_t *phMetricTracer);

Open a metric tracer on a device.

  • The notification event must have been created from an event pool that was created using the [ZE_EVENT_POOL_FLAG_HOST_VISIBLE] flag.
  • The duration of the signal event created from an event pool that was created using the [ZE_EVENT_POOL_FLAG_KERNEL_TIMESTAMP] flag is undefined.
  • The application must not call this function from simultaneous threads with the same device handle.
  • The tracer is created in the "disabled" state.
  • Metric groups must be of the sampling type ZET_METRIC_SAMPLING_TYPE_EXP_FLAG_TRACER_BASED.
  • All metric groups must first be activated.
Parameter Description
hContext [in] handle of the context object
hDevice [in] handle of the device
metricGroupCount [in] metric group count
phMetricGroups [in][range(0, metricGroupCount)] handles of the metric groups to trace
desc [in,out] metric tracer descriptor
hNotificationEvent [in][optional] event used for report availability notification. Note: if the buffer is not drained when the event is signaled, there is a risk of the HW event buffer being overrun
phMetricTracer [out] handle of the metric tracer

zetMetricTracerDestroyExp

zetMetricTracerDestroyExp( zet_metric_tracer_exp_handle_t hMetricTracer);

Deletes the metric tracer object

  • The application must not call this function from simultaneous threads with the same tracer handle.
Parameter Description
hMetricTracer [in] handle of the metric tracer

zetMetricTracerEnableExp

zetMetricTracerEnableExp(zet_metric_tracer_exp_handle_t hMetricTracer,  bool synchronous);

Lightweight call that starts event collection.

  • When the call is asynchronous, successful completion is confirmed by calling zetMetricTracerReadDataExp().
Parameter Description
hMetricTracer [in] handle of the metric tracer
synchronous [in] request synchronous behavior

zetMetricTracerDisableExp

zetMetricTracerDisableExp( zet_metric_tracer_exp_handle_t hMetricTracer,  bool synchronous);

Lightweight call that stops event collection.

  • When the call is asynchronous, successful completion is confirmed by calling zetMetricTracerReadDataExp().
Parameter Description
hMetricTracer [in] handle of the metric tracer
synchronous [in] request synchronous behavior

zetMetricTracerReadDataExp

zetMetricTracerReadDataExp(zet_metric_tracer_exp_handle_t hMetricTracer, size_t *pRawDataSize, uint8_t *pRawData);

Reads data from the metric tracer.

  • Data can be retrieved even if the tracer is in the disabled state.
Parameter Description
hMetricTracer [in] handle of the metric tracer
pRawDataSize [in,out] pointer to the size in bytes of the raw data requested to read. If the size is zero, the driver will update the value with the total size in bytes needed for all available data. If the size is non-zero, the driver will only retrieve the amount of data that fits into the buffer. If the size is larger than the size needed for all data, the driver will update the value with the actual size needed
pRawData [in,out][optional][range(0, *pRawDataSize)] buffer containing tracer events in raw format
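The pRawDataSize contract above can be sketched with a small mock. `MockTracer` and `mockReadData` are hypothetical stand-ins, not part of the proposal. The mock assumes that data which has been read is consumed, and it treats a null buffer like a size query. Neither behavior is stated in the proposal.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the data a tracer has collected so far.
struct MockTracer {
    std::vector<uint8_t> pending;
};

// Mimics the described size handling of zetMetricTracerReadDataExp().
void mockReadData(MockTracer& tracer, size_t* pRawDataSize, uint8_t* pRawData) {
    if (*pRawDataSize == 0 || pRawData == nullptr) {
        // Size query: report how many bytes are available.
        *pRawDataSize = tracer.pending.size();
        return;
    }
    // Copy only what fits, or everything if the buffer is larger than needed.
    size_t n = std::min(*pRawDataSize, tracer.pending.size());
    std::copy_n(tracer.pending.begin(), n, pRawData);
    tracer.pending.erase(tracer.pending.begin(), tracer.pending.begin() + n);
    *pRawDataSize = n;  // actual number of bytes written
}

// Two-call pattern: query the size first, then read into a sized buffer.
std::vector<uint8_t> readAll(MockTracer& tracer) {
    size_t size = 0;
    mockReadData(tracer, &size, nullptr);
    std::vector<uint8_t> buffer(size);
    if (size > 0) {
        mockReadData(tracer, &size, buffer.data());
    }
    buffer.resize(size);
    return buffer;
}
```

A caller with a fixed-size buffer can skip the size query and call the read function in a loop, using the updated size to learn how many bytes each call returned.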

Usage Example

    zet_metric_group_handle_t     hMetricGroup           = nullptr;
    ze_event_handle_t            hNotificationEvent     = nullptr;
    ze_event_pool_handle_t       hEventPool             = nullptr;
    // The notification event must come from a host-visible event pool
    ze_event_pool_desc_t         eventPoolDesc          = {ZE_STRUCTURE_TYPE_EVENT_POOL_DESC, nullptr, ZE_EVENT_POOL_FLAG_HOST_VISIBLE, 1};
    ze_event_desc_t              eventDesc              = {ZE_STRUCTURE_TYPE_EVENT_DESC};
    zet_metric_tracer_exp_handle_t hMetricTracer;

    // Find a metric group suitable for Tracer Based collection

    FindMetricGroup( hDevice,  ZET_METRIC_SAMPLING_TYPE_EXP_FLAG_TRACER_BASED, &hMetricGroup );
    
    // Configure the HW

    zetContextActivateMetricGroups( hContext, hDevice, /* count= */ 1, &hMetricGroup );

    // Create notification event

    zeEventPoolCreate( hContext, &eventPoolDesc, 1, &hDevice, &hEventPool );
    eventDesc.index  = 0;
    eventDesc.signal = ZE_EVENT_SCOPE_FLAG_HOST;
    eventDesc.wait   = ZE_EVENT_SCOPE_FLAG_HOST;
    zeEventCreate( hEventPool, &eventDesc, &hNotificationEvent );

    // Create tracer; request a notification every 1024 collected bytes

    zet_metric_tracer_exp_desc_t tracerDescriptor = {
        ZET_STRUCTURE_TYPE_METRIC_TRACER_DESC_EXP,
        nullptr, 1024};

    zetMetricTracerCreateExp(hContext, hDevice, 1, &hMetricGroup, &tracerDescriptor, hNotificationEvent, &hMetricTracer);
      
    // Enable the tracer
    
    zetMetricTracerEnableExp(hMetricTracer, true);

    // Run workload 
    
    workload(hDevice);

    // Wait for data, optional

    zeEventHostSynchronize( hNotificationEvent, 1000 /*timeout*/ );

    // First call queries the available size, second call reads the data

    size_t rawDataSize = 0;
    zetMetricTracerReadDataExp(hMetricTracer, &rawDataSize, nullptr);
    uint8_t* rawData = static_cast<uint8_t*>(malloc(rawDataSize));
    zetMetricTracerReadDataExp(hMetricTracer, &rawDataSize, rawData);

    // Close metric tracer

    zetMetricTracerDisableExp(hMetricTracer, true);
    zetMetricTracerDestroyExp(hMetricTracer);
    zeEventDestroy( hNotificationEvent );
    zeEventPoolDestroy( hEventPool );

    // Clean device configuration

    zetContextActivateMetricGroups( hContext, hDevice, 0, nullptr );
    free(rawData);
Projects
Status: In Progress