From 9ad04fbdc84fe9befdde97169de18debb30945ba Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Pekka=20J=C3=A4=C3=A4skel=C3=A4inen?= Date: Fri, 6 Dec 2024 16:21:11 +0200 Subject: [PATCH] Integrated to the main unified specification and other updates. * Moved the functionality to clCreateBufferWithProperties, thus now requiring 3.0+. * Single memobj query for fetching the address(es). * Also other smaller improvements pointed by Kevin. * Candidate for 1.0.0. --- api/cl_ext_buffer_device_address.asciidoc | 96 ++++++ api/opencl_runtime_layer.asciidoc | 126 ++++++- .../cl_ext_buffer_device_address.asciidoc | 320 ------------------ extensions/extensions.txt | 4 +- xml/cl.xml | 23 +- 5 files changed, 232 insertions(+), 337 deletions(-) create mode 100644 api/cl_ext_buffer_device_address.asciidoc delete mode 100644 extensions/cl_ext_buffer_device_address.asciidoc diff --git a/api/cl_ext_buffer_device_address.asciidoc b/api/cl_ext_buffer_device_address.asciidoc new file mode 100644 index 000000000..26f83e1f9 --- /dev/null +++ b/api/cl_ext_buffer_device_address.asciidoc @@ -0,0 +1,96 @@ +// Copyright 2024 The Khronos Group Inc. +// SPDX-License-Identifier: CC-BY-4.0 + +include::{generated}/meta/{refprefix}cl_ext_buffer_device_address.txt[] + +=== Other Extension Metadata + +*Last Modified Date*:: + 2024-12-06 +*IP Status*:: + No known IP claims. +*Contributors*:: + - Pekka Jääskeläinen, Intel + + - Karol Herbst, Red Hat + + - Henry Linjamäki, Intel + + - Kevin Petit, Arm + + +=== Description + +This extension provides access to raw device pointers for cl_mem buffers +without requiring a shared virtual address space between the host and +the device. + +==== Background + +Shared Virtual Memory (SVM) introduced in OpenCL 2.0 is the first feature +that enables raw pointers in the OpenCL standard. Its coarse-grain +variant is relatively simple to implement on various platforms in terms of +coherency requirements, but it requires mapping the buffer's address range +to the host virtual address space. +However, various higher-level heterogeneous APIs present a memory allocation +routine which can allocate device-only memory and provide raw addresses to +it without guarentees of system-wide uniqueness. For example, minimal +implementations of OpenMP's omp_target_alloc() and CUDA/HIP's +cudaMalloc()/hipMalloc() do not require a shared address space between the host and the device. + +Host-device unified addressing might not be a major implementation issue in +systems which can provide virtual memory across the platform, but might +bring challenges in cases where the device presents a global memory with +a disjoint address space (that can also be a physical memory address space) or, +for example, when a barebone embedded system lacks virtual memory support altogether. +This extension is targeted to complement the OpenCL SVM extension by providing +an additional lower-end step in the spectrum of type of pointers/buffers OpenCL +can allocate. + +=== New Command + + * {clSetKernelArgDevicePointerEXT} + +=== New Types + + * {cl_mem_device_address_EXT} + +=== New Enums + + * {cl_mem_properties_TYPE} + ** {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} + ** {CL_MEM_DEVICE_SHARED_ADDRESS_EXT} + * {cl_mem_info_TYPE} + ** {CL_MEM_DEVICE_ADDRESS_EXT} + * {cl_kernel_exec_info_TYPE} + ** {CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT} + +=== Version History + +[cols="5,15,15,70"] +[grid="rows"] +[options="header"] +|==== +| *Version* | *Date* | *Author* | *Changes* +| 0.9.0 | 2024-12-06 | Pekka Jääskeläinen, Kevin Petit | + Integrated to the main unified specification. + Moved the functionality to clCreateBufferWithProperties, + thus requiring 3.0+. Single memobj query for fetching the + address(es). Also other smaller improvements pointed by Kevin. + Candidate for final 1.0.0. +| 0.3.0 | 2024-09-24 | Pekka Jääskeläinen, Karol Herbst | + Made the allocation flags independent from each other and + renamed them to CL_MEM_DEVICE_SHARED_ADDRESS_EXT and + CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT. The first one guarantees the + same address across all devices in the context, whereas the latter + allows per-device addresses. +| 0.2.0 | 2024-09-09 | Pekka Jääskeläinen, Karol Herbst | + Changed the CL_MEM_DEVICE_ADDRESS_EXT wording for multi-device + cases "all", not "any", covering a case where not all devices + can ensure the same address across the context. In that case + CL_INVALID_VALUE can be returned. Defined sub-buffer address + computation to be 'base_addr + origin'. Added error conditions + for clSetKernelExecInfo when the device doesn't support + device pointers. +| 0.1.0 | 2024-05-07 | Pekka Jääskeläinen | First draft text for feedback. + This version describes the first API version that was prototyped + in PoCL and RustiCL using temporary placeholder flag/enum values. + The PoCL implementation and initial discussion on the extension + can be found https://github.com/pocl/pocl/pull/1441[in this PR]. +|==== diff --git a/api/opencl_runtime_layer.asciidoc b/api/opencl_runtime_layer.asciidoc index a10d5fb92..fbd601d32 100644 --- a/api/opencl_runtime_layer.asciidoc +++ b/api/opencl_runtime_layer.asciidoc @@ -594,6 +594,39 @@ include::{generated}/api/version-notes/CL_MEM_DEVICE_HANDLE_LIST_KHR.asciidoc[] {CL_MEM_DEVICE_HANDLE_LIST_END_KHR_anchor}) to associate with the external memory handle. endif::cl_khr_external_memory[] + +ifdef::cl_ext_buffer_device_address[] + +| {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT_anchor} + +include::{generated}/api/version-notes/CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT.asciidoc[] + | {cl_bool_TYPE} + | When set to CL_TRUE, specifies that the buffer must have a single fixed + device-side address for its lifetime, and the address can be queried via {clGetMemObjectInfo}. + + Each device in the context can have their own (fixed) device-side address and + a copy of the created buffer which are synchronized + implicitly by the runtime. + + The flag might imply that the buffer will be "pinned" permanently to + a device's memory, but might not be necessarily so, as long as the address + range of the buffer remains constant. + + The device addresses of sub-buffers derived from {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} + allocated buffers can be computed by adding the sub-buffer origin to the + device-specific start address. + +| {CL_MEM_DEVICE_SHARED_ADDRESS_EXT_anchor} + +include::{generated}/api/version-notes/CL_MEM_DEVICE_SHARED_ADDRESS_EXT.asciidoc[] + | {cl_bool_TYPE} + | When set to CL_TRUE, the buffer has otherwise the same properties as + when allocated using the {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT_anchor} flag, + but with an additional property that the buffer's address is the same across + all the devices in the context. + +endif::cl_ext_buffer_device_address[] + |==== ifdef::cl_khr_external_memory[] @@ -660,6 +693,15 @@ ifdef::cl_khr_external_memory[] ** if _properties_ does not include a supported external memory handle and {CL_MEM_DEVICE_HANDLE_LIST_KHR} is specified as part of _properties_. endif::cl_khr_external_memory[] +ifdef::cl_ext_buffer_device_address[] + * {CL_INVALID_DEVICE} + ** If _properties_ includes either {CL_MEM_DEVICE_SHARED_ADDRESS_EXT} or + {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} and there is at least one device in + the context that doesn't support such allocation. + * {CL_INVALID_VALUE} + ** If _properties_ includes both {CL_MEM_DEVICE_SHARED_ADDRESS_EXT} and + {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} at the same time. +endif::cl_ext_buffer_device_address[] [[memory-flags-table]] .List of supported memory flag values @@ -6149,6 +6191,21 @@ include::{generated}/api/version-notes/CL_MEM_D3D11_RESOURCE_KHR.asciidoc[] returns the _resource_ argument specified when _memobj_ was created. endif::cl_khr_d3d11_sharing[] +ifdef::cl_ext_buffer_device_address[] +| {CL_MEM_DEVICE_ADDRESS_EXT_anchor} + +include::{generated}/api/version-notes/CL_MEM_DEVICE_ADDRESS_EXT.asciidoc[] + | {cl_mem_device_address_EXT_TYPE}[] + | If _memobj_ was created using {clCreateBufferWithProperties} with + the {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} property set to CL_TRUE, + returns a list of device addresses for the buffer, one for each + device in the context. If the buffer was allocated + with the {CL_MEM_DEVICE_SHARED_ADDRESS_EXT} property, + only one device address is returned. + +endif::cl_ext_buffer_device_address[] + + |==== // refError @@ -6158,6 +6215,12 @@ successfully. Otherwise, it returns one of the following errors: * {CL_INVALID_MEM_OBJECT} if _memobj_ is a not a valid memory object. +ifdef::cl_ext_buffer_device_address[] + ** Returned for the {CL_MEM_DEVICE_ADDRESS_EXT} query if + the {cl_ext_buffer_device_address_EXT} is not supported or if the + buffer was not allocated with neither {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} or + {CL_MEM_DEVICE_SHARED_ADDRESS_EXT}. +endif::cl_ext_buffer_device_address[] * {CL_INVALID_VALUE} if _param_name_ is not valid, or if size in bytes specified by _param_value_size_ is < size of return type as described in the <> table and _param_value_ is not @@ -10454,6 +10517,48 @@ Otherwise, it returns one of the following errors: required by the OpenCL implementation on the host. -- +ifdef::cl_ext_buffer_device_address[] +[open,refpage='clSetKernelArgDevicePointerEXT',desc='Set a device pointer as the argument value for a specific argument of a kernel.',type='protos'] +-- +To set a device pointer as the argument value for a specific argument of a +kernel, call the function + +include::{generated}/api/protos/clSetKernelArgDevicePointerEXT.txt[] +include::{generated}/api/version-notes/clSetKernelArgDevicePointerEXT.asciidoc[] + + * _kernel_ is a valid kernel object. + * _arg_index_ is the argument index. + Arguments to the kernel are referred by indices that go from 0 for the + leftmost argument to _n_ - 1, where _n_ is the total number of arguments + declared by a kernel. + * _arg_value_ is the device pointer that should be used as the argument value for + argument specified by _arg_index_. + The device pointer specified is the value used by all API calls that enqueue + _kernel_ ({clEnqueueNDRangeKernel} and {clEnqueueTask}) until the argument + value is changed by a call to {clSetKernelArgDevicePointerEXT} for _kernel_. + The device pointer can only be used for arguments that are declared to be a + pointer to `global` memory allocated with {clCreateBufferWithProperties} with + either the {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT} or {CL_MEM_DEVICE_SHARED_ADDRESS_EXT} + property. The pointer value specified as the argument value + can be the pointer to the beginning of the buffer or any offset into + the buffer region. The device pointer value must be naturally aligned according to + the argument's type. + +{clSetKernelArgDevicePointerEXT} returns {CL_SUCCESS} if the argument was set +successfully. Otherwise, it returns one of the following errors: + + * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object. + * {CL_INVALID_OPERATION} if no devices in the context associated with _kernel_ support + the device pointer. + * {CL_INVALID_ARG_INDEX} if _arg_index_ is not a valid argument index. + * {CL_INVALID_ARG_VALUE} if _arg_value_ specified is not a valid value. + * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required + by the OpenCL implementation on the device. + * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources + required by the OpenCL implementation on the host. +-- +endif::cl_ext_buffer_device_address[] + [open,refpage='clSetKernelExecInfo',desc='Pass additional information other than argument values to a kernel.',type='protos'] -- To pass additional information other than argument values to a kernel, call @@ -10497,6 +10602,19 @@ include::{generated}/api/version-notes/CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM grain system SVM allocations. These fine grain system SVM pointers may be passed as arguments or defined in SVM buffers that are passed as arguments to _kernel_. + +ifdef::cl_ext_buffer_device_address[] +| {CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT_anchor} + +include::{generated}/api/version-notes/CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT.asciidoc[] + | {cl_mem_device_address_EXT_TYPE}[] + | Device pointers must reference locations contained entirely within + buffers that are passed to kernel as arguments, or that are passed + through the execution information. Non-argument device pointers accessed + by the kernel must be specified by passing pointers to those buffers + via this {clSetKernelExecInfo} option. +endif::cl_ext_buffer_device_address[] + |==== // refError @@ -10506,7 +10624,13 @@ successfully. Otherwise, it returns one of the following errors: * {CL_INVALID_KERNEL} if _kernel_ is a not a valid kernel object. - * {CL_INVALID_OPERATION} if no devices in the context associated with _kernel_ support SVM. + * {CL_INVALID_OPERATION} for {CL_KERNEL_EXEC_INFO_SVM_PTRS} if no devices in + the context associated with _kernel_ support SVM. +ifdef::cl_ext_buffer_device_address[] + * {CL_INVALID_OPERATION} for {CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT} if no + device in the context associated with _kernel_ support device pointers. +endif::cl_ext_buffer_device_address[] + * {CL_INVALID_VALUE} if _param_name_ is not valid, if _param_value_ is `NULL` or if the size specified by _param_value_size_ is not valid. * {CL_INVALID_OPERATION} if _param_name_ is diff --git a/extensions/cl_ext_buffer_device_address.asciidoc b/extensions/cl_ext_buffer_device_address.asciidoc deleted file mode 100644 index fdf4eb44e..000000000 --- a/extensions/cl_ext_buffer_device_address.asciidoc +++ /dev/null @@ -1,320 +0,0 @@ -= cl_ext_buffer_device_address - -// This section needs to be after the document title. -:doctype: book -:toc2: -:toc: left -:encoding: utf-8 -:lang: en - -:blank: pass:[ +] - -// Set the default source code type in this document to C, -// for syntax highlighting purposes. -:language: c - -// This is what is needed for C++, since docbook uses c++ -// and everything else uses cpp. This doesn't work when -// source blocks are in table cells, though, so don't use -// C++ unless it is required. -//:language: {basebackend@docbook:c++:cpp} - -== Name Strings - -`cl_ext_buffer_device_address` - -== Contact - -Pekka Jääskeläinen, Intel (pekka 'dot' jaaskelainen 'at' intel 'dot' com) - -== Contributors - -// spell-checker: disable -Pekka Jääskeläinen, Intel + -Karol Herbst, Red Hat + -Henry Linjamäki, Intel + -// spell-checker: enable - -== Notice - -Copyright (c) 2024 Intel Corporation. All rights reserved. - -== Status - -Draft. - -== Version - -Built On: {docdate} + -Revision: 0.3.0 - -== Dependencies - -This extension is written against the OpenCL Specification version 3.0.16. - -This extension requires OpenCL 1.0 or later. - -== Overview - -The basic cl_mem buffer API doesn't enable access to the underlying raw -pointers in the device memory, preventing its use in host side -data structures that need pointer references to objects. - -Shared Virtual Memory (SVM) introduced in OpenCL 2.0 is the first feature -that enables raw device side pointers in the OpenCL standard. Its coarse-grain -variant is relatively simple to implement on various platforms in terms of -coherency requirements, but it requires mapping the buffer's address range -to the host virtual address space although it might not be needed by the -application. This is not an issue in systems which can provide virtual memory -across the platform, but might provide implementation challenges in cases -where the device presents a global memory with a disjoint address space -(that can also be a physical memory address space) or, for example, when -a barebone embedded system lacks virtual memory support altogether. - -Various higher-level APIs present a memory allocation routine which can -allocate device-only memory and provide raw pointers to it without guarentees -of system-wide uniqueness: For example, minimal implementations of OpenMP's -omp_target_alloc() and CUDA/HIP's cudaMalloc()/hipMalloc() do not require a shared -address space between the host and the device. This extension is meant to -provide a minimal set of features to implement such APIs using the cl_mem -buffers without requiring a shared virtual address space between the host and -the device. - -=== New API Function - -include::{generated}/api/protos/clSetKernelArgDevicePointerEXT.txt[] - -=== New API Enums - -Enums for enabling device pointer properties when creating a buffer -{clCreateBuffer}, see <>: - -[source] ----- -#define CL_MEM_DEVICE_SHARED_ADDRESS_EXT (1ul << 31) -#define CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT (1ul << 30) ----- - -Enums for querying the device pointer from the cl_mem <>: - -[source] ----- -#define CL_MEM_DEVICE_PTR_EXT 0xff01 ----- - -Enums for setting information of indirect device pointer accesses to kernels <>. This is for OpenCL 2.0 and above. When implementing the -extension on an older OpenCL version, indirect device pointer access is not supported. - -[source] ----- -#define CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT 0x11B8 ----- - -== New API Types - -Returned as the query result value *clGetMemObjectInfo* with `CL_DEVICE_PTR_EXT`. - -[source] ----- -typedef cl_ulong cl_mem_device_address_EXT; ----- - -Returned as the query result value *clGetMemObjectInfo* with `CL_DEVICE_PTRS_EXT`. - -[source] ----- -typedef struct _cl_mem_device_address_pair_EXT -{ - cl_device_id device; - cl_mem_device_address_EXT address; -} cl_mem_device_address_pair_EXT; ----- - -== Modifications to the OpenCL API Specification - -=== Section 5.2.1 - Creating Buffer Objects: - -Add new allocation flags <>: - -[[list-of-supported-memory-flag-values-adds]] -.List of supported memory flags by {clCreateBuffer} -[width="100%",cols="<50%,<50%",options="header"] -|==== -| Memory Flags | Description -| {CL_MEM_DEVICE_SHARED_ADDRESS_EXT_anchor} - -include::{generated}/api/version-notes/CL_MEM_DEVICE_SHARED_ADDRESS_EXT.asciidoc[] - | This flag specifies that the buffer must have a single fixed address - for its lifetime and the address should be unique at least across the devices - of the context, but not necessarily within the host (virtual) memory. - - The flag might imply that the buffer will be "pinned" permanently to - a device's memory, but might not be necessarily so, as long as the address - range of the buffer remains constant. - - The address is guaranteed to remain the same until the buffer is freed, and - the address can be queried via {clGetMemObjectInfo}. - - The device-specific buffer content updates are still performed by - implicit or explicit buffer migrations performed by the runtime or the - client code. If all of the devices in the context do not support - this type of allocations, an error (CL_INVALID_VALUE) is returned. - - The device addresses of sub-buffers derived from CL_MEM_DEVICE_SHARED_ADDRESS_EXT - allocated buffers can be computed by adding the sub-buffer origin to the - start address. - -| {CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT_anchor} - -include::{generated}/api/version-notes/CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT.asciidoc[] - | This flag specifies that the buffer must have a single fixed address - for its lifetime. Each device in the context can have their own (fixed) - device-side address and a copy of the created buffer which are synchronized - implicitly by the runtime. The main difference to a default cl_mem allocation - in that case is that the addresses are queriable with CL_MEM_DEVICE_PTRS_EXT - and the per-device address is guaranteed to be the same for the entire lifetime - of the cl_mem. - - The device addresses of sub-buffers derived from CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT - allocated buffers can be computed by adding the sub-buffer origin to the - device-specific start address. - -|==== - -// refError - -=== Section 5.5.6 - Memory Object Queries - -Add a new information type <>: - -[width="100%",cols="<33%,<17%,<50%",options="header"] -|==== -| Memory Object Info | Return type | Description -| {CL_MEM_DEVICE_PTR_EXT_anchor} - -include::{generated}/api/version-notes/CL_MEM_DEVICE_PTR_EXT.asciidoc[] - | {cl_mem_device_address_EXT_TYPE} - | Returns the device address for a buffer allocated with - CL_MEM_DEVICE_SHARED_ADDRESS_EXT. If the buffer was not created with the flag - or there are multiple devices in the context and the buffer address is - not the same for all of them, it returns CL_INVALID_MEM_OBJECT. - -| {CL_MEM_DEVICE_PTRS_EXT_anchor} -include::{generated}/api/version-notes/CL_MEM_DEVICE_PTRS_EXT.asciidoc[] - | {cl_mem_device_address_pair_EXT_TYPE} - | Returns the device-address pairs for all devices in the context. - The per-device addresses might differ when the buffer was allocated - with the CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT enabled. -|==== - - -=== Section 5.9.2 - Setting Kernel Arguments - -Add a new kernel argument setter for device pointers <>: - -To set a device pointer as the argument value for a specific argument of a -kernel, call the function - -include::{generated}/api/protos/clSetKernelArgDevicePointerEXT.txt[] -include::{generated}/api/version-notes/clSetKernelArgDevicePointerEXT.asciidoc[] - - * _kernel_ is a valid kernel object. - * _arg_index_ is the argument index. - Arguments to the kernel are referred by indices that go from 0 for the - leftmost argument to _n_ - 1, where _n_ is the total number of arguments - declared by a kernel. - * _arg_value_ is the device pointer that should be used as the argument value for - argument specified by _arg_index_. - The device pointer specified is the value used by all API calls that enqueue - _kernel_ ({clEnqueueNDRangeKernel} and {clEnqueueTask}) until the argument - value is changed by a call to {clSetKernelArgDevicePointer} for _kernel_. - The device pointer can only be used for arguments that are declared to be a - pointer to `global` memory allocated with clCreateBuffer() with the - CL_MEM_DEVICE_SHARED_ADDRESS_EXT flag. The pointer value specified as the argument value - can be the pointer to the beginning of the buffer or be a pointer offset into - the buffer region. The device pointer value must be naturally aligned according to - the argument's type. - -// refError - -{clSetKernelArgDevicePointerEXT} returns {CL_SUCCESS} if the function was executed -successfully. Otherwise, it returns one of the following errors: - - * {CL_INVALID_KERNEL} if _kernel_ is not a valid kernel object. - * {CL_INVALID_OPERATION} if no devices in the context associated with _kernel_ support - the device pointer. - * {CL_INVALID_ARG_INDEX} if _arg_index_ is not a valid argument index. - * {CL_INVALID_ARG_VALUE} if _arg_value_ specified is not a valid value. - * {CL_OUT_OF_RESOURCES} if there is a failure to allocate resources required - by the OpenCL implementation on the device. - * {CL_OUT_OF_HOST_MEMORY} if there is a failure to allocate resources - required by the OpenCL implementation on the host. - -Add a new flag to clSetKernelExecInfo for setting indirect device pointer access info <>: - -[width="100%",cols="<33%,<17%,<50%",options="header"] -|==== -| Kernel Exec Info | Type | Description -| {CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT_anchor} - -include::{generated}/api/version-notes/CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT.asciidoc[] - | {cl_mem_device_address_EXT_TYPE} - | Device pointers must reference locations contained entirely within - buffers that are passed to kernel as arguments, or that are passed - through the execution information. - - Non-argument device pointers accessed by the kernel must be specified - by passing pointers to those buffers via {clSetKernelExecInfo}. -|==== - -// refError - -Change the descriptions for when returning CL_INVALID_OPERATION from {clSetKernelExecInfo} -as follows: - - * {CL_INVALID_OPERATION} if passing {CL_KERNEL_EXEC_INFO_SVM_PTRS} or - {CL_KERNEL_EXEC_INFO_SVM_FINE_GRAIN_SYSTEM} with _param_value_ set to CL_TRUE - and no device in the context associated with _kernel_ support SVM. - * {CL_INVALID_OPERATION} if passing {CL_KERNEL_EXEC_INFO_DEVICE_PTRS_EXT} and no - device in the context associated with _kernel_ support device pointers. - -== Interactions with Other Extensions - -This extension is targeted to complement the OpenCL SVM extension and/or the -Intel Unified Shared Memory extension by providing an additional lower-end -step in the spectrum of type of pointers/buffers OpenCL can allocate. The -extension can be seen as a simplification of the USM Device allocation type -which drops the need to map the device buffer's address range to the same -position in the host memory or to implement platform-wide VM. - -== Issues - -None. - -== Version History - -[cols="5,15,15,70"] -[grid="rows"] -[options="header"] -|==== -| *Version* | *Date* | *Author* | *Changes* -| 0.3.0 | 2024-09-24 | Pekka Jääskeläinen, Karol Herbst | - Made the allocation flags independent from each other and - renamed them to CL_MEM_DEVICE_SHARED_ADDRESS_EXT and - CL_MEM_DEVICE_PRIVATE_ADDRESS_EXT. The first one guarantees the - same address across all devices in the context, whereas the latter - allows per-device addresses. -| 0.2.0 | 2024-09-09 | Pekka Jääskeläinen, Karol Herbst | - Changed the CL_MEM_DEVICE_ADDRESS_EXT wording for multi-device - cases "all", not "any", covering a case where not all devices - can ensure the same address across the context. In that case - CL_INVALID_VALUE can be returned. Defined sub-buffer address - computation to be 'base_addr + origin'. Added error conditions - for clSetKernelExecInfo when the device doesn't support - device pointers. -| 0.1.0 | 2024-05-07 | Pekka Jääskeläinen | First draft text for feedback. - This version describes the first API version that was prototyped - in PoCL and RustiCL using temporary placeholder flag/enum values. - The PoCL implementation and initial discussion on the extension - can be found https://github.com/pocl/pocl/pull/1441[in this PR]. -|==== diff --git a/extensions/extensions.txt b/extensions/extensions.txt index 52b05450b..e942063a7 100644 --- a/extensions/extensions.txt +++ b/extensions/extensions.txt @@ -41,8 +41,8 @@ include::cl_ext_image_from_buffer.asciidoc[] include::cl_ext_image_raw10_raw12.asciidoc[] <<< include::cl_ext_image_requirements_info.asciidoc[] -<<< -include::cl_ext_buffer_device_address.asciidoc[] +//<<< +//include::cl_ext_buffer_device_address.asciidoc[] // Vendor Extensions :leveloffset: 0 diff --git a/xml/cl.xml b/xml/cl.xml index 73e4d80df..c71f8b0a0 100644 --- a/xml/cl.xml +++ b/xml/cl.xml @@ -255,7 +255,7 @@ server's OpenCL/api-docs repository. typedef cl_bitfield cl_platform_command_buffer_capabilities_khr; typedef cl_bitfield cl_mutable_dispatch_asserts_khr typedef cl_bitfield cl_device_kernel_clock_capabilities_khr; - typedef cl_ulong cl_mem_device_address_EXT; + typedef cl_ulong cl_mem_device_address_ext; Structure types @@ -305,10 +305,6 @@ server's OpenCL/api-docs repository. size_t origin size_t size - - cl_device_id device - cl_mem_device_address_EXT address - cl_version version char name[CL_NAME_VERSION_MAX_NAME_SIZE] @@ -724,6 +720,8 @@ server's OpenCL/api-docs repository. + + @@ -1637,9 +1635,8 @@ server's OpenCL/api-docs repository. - - - + + @@ -3735,7 +3732,7 @@ server's OpenCL/api-docs repository. cl_uint arg_index const void* arg_value - + cl_int clSetKernelArgDevicePointerEXT cl_kernel kernel cl_uint arg_index @@ -7202,7 +7199,7 @@ server's OpenCL/api-docs repository. - + @@ -7211,15 +7208,13 @@ server's OpenCL/api-docs repository. - - + - - +