Version 1.0
- Background
- Optimization Goals
- Goals
- Allow applications to avoid unnecessary loads/stores of resources from/to main memory on TBDR architectures
- Allow TBDR architectures to opportunistically persistent resources in on-chip cache across Render Passes (even in separate Command Lists)
- Allow the new APIs to be used with existing drivers
- Design allows UMD to choose optimal rendering path without heavy CPU penalty
- Allow ISVs to verify proper use of the feature even on drivers that aren't necessarily making behavior changes based on feature use
- Non-goals
- Goals
- Summary of API changes
- Render Pass output bindings
- Render Pass workloads
- Resource Access declaration
- D3D12_RENDER_PASS_BEGINNING_ACCESS_DISCARD
- D3D12_RENDER_PASS_BEGINNING_ACCESS_PRESERVE
- D3D12_RENDER_PASS_BEGINNING_ACCESS_CLEAR
- D3D12_RENDER_PASS_BEGINNING_ACCESS_NO_ACCESS
D3D12_RENDER_PASS_BEGINNING_ACCESS_RESUME_WRITINGD3D12_RENDER_PASS_BEGINNING_ACCESS_RESUME_LOCAL_READ- D3D12_RENDER_PASS_ENDING_ACCESS_DISCARD
- D3D12_RENDER_PASS_ENDING_ACCESS_PRESERVE
- D3D12_RENDER_PASS_ENDING_ACCESS_RESOLVE
- D3D12_RENDER_PASS_ENDING_ACCESS_NO_ACCESS
D3D12_RENDER_PASS_ENDING_ACCESS_SUSPEND_WRITINGD3D12_RENDER_PASS_ENDING_ACCESS_SUSPEND_LOCAL_READ
- Render Pass Flags
- Render Pass tiers/CheckFeatureSupport
- Open Issues
- API Headers
- Runtime validation
- HLK Test Plan
- History of changes
Render Passes are intended to help TBDR-based (and other) renderers improve GPU efficiency by reducing memory traffic to/from off-chip memory, by enabling applications to better identify resource rendering ordering requirements/data dependencies.
The name "Render Passes" may be changed in a future spec.
Allow applications to avoid unnecessary loads/stores of resources from/to main memory on TBDR architectures
This feature introduces the concept of Render Passes, which are intended to provide a central location for applications to indicate their data dependencies for a set of rendering operations.
These data dependencies are intended to allow drivers to inspect this data at bind/barrier time, and issue instructions that minimize resource loads/stores from/to main memory.
Allow TBDR architectures to opportunistically persistent resources in on-chip cache across Render Passes (even in separate Command Lists)
A common rendering pattern is for an application to render to RTV A, then turn around and texture from that resource as SRV A at some time in future (while rendering to RTV B). For cases in which the writes to RTV B are reading from pixels 'one-to-one' (mapped to the identical location in SRV A), some architectures may be able to continue the current binning pass during the writes to RTV A, and avoid a flush to main memory (since the SRV A reads only have a dependency on the current tile).
A design goal is to enable these two passes to be coalesced, without a intervening flush to main memory.
[v0.08] Postponed to a future release, were not able to identify common ISV scenarios for this functionality.
Another common rendering pattern is for the application to render to the same render target(s) across multiple command lists serially, even though the rendering commands are generated in parallel. This design seeks to allow drivers to avoid a flush to main memory on Command List boundaries, when the application knows it will resume rendering on the immediate succeeding Command List.
A design goal is to allow these two passes to be combined in a way that avoids an intervening flush to main memory.
The design seeks to allow new APIs to be run on existing drivers (not necessarily with any performance improvements), to ensure as wide-as-possible an install base for the APIs.
This feature should be aligned with the low CPU-overhead goals of D3D12, and should be designed in such a way to not significantly impact CPU usage for common rendering workloads (no more than ~20%).
Allow ISVs to verify proper use of the feature even on drivers that aren't necessarily making behavior changes based on feature use
The debug layer should be able to help identify incorrect use of the feature even when running on a non-supporting driver. (e.g. how the DX11 debug layer clears a resource to a random color in response to a Discard call).
-
This design relies on applications to properly identify data/ordering dependencies for its operations, it does not intend to allow the runtime or driver to deduce opportunities to re-order/avoid loads and stores.
-
The design does not seek to remove/reduce the need for resource barriers, or resource state tracking by applications (there would obviously be value in that, but that is orthogonal from this feature).
-
Automatic fixing of 'bad app behavior' (e.g. unneeded copies/clears) -- our preferred solution is the automated performance warnings already in PIX (which continue to be added).
-
Allow re-ordering of workloads based on app-indicated flexible ordering dependencies. This design optimizes for apps ordering their own submissions in a way that reduces flushes (e.g. maximizing use of a _RESUME flags, and minimizing use of _PRESERVE).
A Render Pass is defined by:
-
A set of output bindings (RTVs, DSV) fixed for the duration of the Render Pass, and
-
A list of GPU operations that target that set of output bindings, and
-
Metadata that describes the load/store dependencies for all output bindings targeted by the Pass, and
-
(optionally) an object that identifies the Render Pass as a specific, persistent workload where the cost of any upfront PGO-style training is likely worth future gains.
-
This last area is not currently in the spec, but may be addressed in a future iteration.
-
[v0.08] Will not be addressed in RS5
-
Render Target and Depth Buffer bindings are declared at the start of a Render Pass, directly in the ID3D12GraphicsCommandList::BeginRenderPass call.
The BeginRenderPass call accepts the lists of RTVs/DSV -- either are optional, though at least one must be specified. All bindings are via CPU descriptors, and like OMSetRenderTargets they are 'snapped' at BeginRenderPass time from their respective (CPU) descriptor heaps.
RTVs/the DSV are not inherited into the Render Pass, they must be set. The RTVs/DSV set in BeginRenderPass the Render Pass are not propagated out to the Command List; they are in an undefined state following the Render Pass (the driver does not need to manually clear these to something well-defined).
A Render Pass is a subset of commands within a Command List, a Command List may contain multiple Render Passes. The boundaries of a Render Pass are declared explicitly on the Command List via the BeginRenderPass/EndRenderPass Command List APIs.
Render Passes may not be nested, and they must be ended within the current Command List (they cannot straddle Command Lists, see below for optimizations designed to enable efficient multi-threaded Render Pass generation).
Writes from within a Render Pass are not 'valid' to be read until the
end of the Render Pass. This precludes some types of barriers from
within the Render Pass (e.g. barriering from RENDER_TARGET
to
SHADER_RESOURCE
on the currently-bound render target), but some other
barriers may be allowed. See sub-section below for more details.
The sole exception to the 'no reads on writes that occurred within the render pass' are the implicit reads that occur as part of depth-testing and render target blending.
Thus, the list of disallowed APIs within a Render Pass is:
-
AtomicCopyBuffer*
-
BeginRenderPass
-
ClearState
-
ClearRenderTargetView
-
ClearDepthStencilView
-
ClearUnorderedAccessViewUint
-
ClearUnorderedAccessViewFloat
-
CopyBufferRegion
-
CopyTextureRegion
-
CopyResource
-
CopyTiles
-
DiscardResource
-
Dispatch [added in v0.08]
-
OMSetRenderTargets
-
ResolveQueryData
-
ResolveSubresource(Region)
-
SetProtectedResourceSession
-
SetDescriptorHeaps [added in v0.08, removed in v1.0] -
Else?
The core runtime will remove the Command List if any of the above APIs are called during Command List recording (the UMD can assume those DDIs will never be called during a render pass).
As stated earlier, an application may not read from or consume writes
that occurred within the same render pass. This disallows certain
barriers, for example from RENDER_TARGET
to SHADER_RESOURCE
on the
currently-bound render target (and the debug layer will error to that
effect). But, that same barrier on a render target that was written
outside the current Render Pass is completely valid, as the writes
should have all completed ahead of the Render Pass even starting.
So as a (crucial) optimization, given a conformant workload, the UMD is free to assume that the application is not depending on any writes within a Render Pass, and thus it is free to automatically move any barriers encountered in a Render Pass to the beginning of the Render Pass, where they can be coalesced (and not interfere with any tiling/binning operations). This should be valid, as all writes should have been finished by the time the Render Pass starts.
To give a more complete example, if a rendering engine has a ~DX11
resource binding design, and does barriers 'on demand' based on how
resources are bound, when writing into a UAV near the end of the frame
(and will be consumed in the next frame), it may happen to leave the
resource in the UNORDERED_ACCESS
state at the conclusion of the frame.
In the next frame, when the engine goes to bind the resource as an SRV,
it will then find that the resource is not in the correct state, and
issue a UNORDERED_ACCESS
-> PIXEL_SHADER_RESOURCE
barrier. If this
barrier occurs within the Render Pass, the UMD is free to assume all
writes occurred outside the Render Pass, and thus 'move' the barrier
up to the start of the Render Pass -- even if the UMD can't prove that
(given how UAV writes can't be tracked with the bindless design we
have).
Example non-conformant barriers:
-
RENDER_TARGET
/DEPTH_WRITE
to any read state on currently-bound RTVs/DSVs -
Any aliasing barrier
-
UAV barriers
Example conformant barriers:
-
UNORDERED_ACCESS
toINDIRECT_ARGUMENT
(the UMD can assume the UAV writes occurred before the Render Pass) -
COPY_DEST
toPIXEL_SHADER_RESOURCE
(the UMD can assume that the copy occurred before the Render Pass started).
At BeginRenderPass time, the user must declare all Resources that are serving as RTVs/DSVs within that Render Pass, and specify their beginning/engine 'access' characteristics.
D3D12_RENDER_PASS_BEGINNING_ACCESS_TYPE_DISCARD , |
D3D12_RENDER_PASS_ENDING_ACCESS_TYPE_DISCARD , |
D3D12_RENDER_PASS_BEGINNING_ACCESS_TYPE_PRESERVE , |
D3D12_RENDER_PASS_ENDING_ACCESS_TYPE_PRESERVE , |
D3D12_RENDER_PASS_BEGINNING_ACCESS_TYPE_CLEAR , |
D3D12_RENDER_PASS_ENDING_ACCESS_TYPE_RESOLVE , |
D3D12_RENDER_PASS_BEGINNING_ACCESS_TYPE_NO_ACCESS , |
D3D12_RENDER_PASS_ENDING_ACCESS_TYPE_NO_ACCESS |
The "BEGINING
" and "ENDING
" enums must both be provided for all
resources, and both are provided at BeginRenderPass time.
BEGINNING_DISCARD
signifies that the application does not have any
dependency on the prior contents of the resource. A given implementation
may return the previously-written contents, or it may return
uninitialized data. However, reading from the resource must not produce
a GPU hang, the read may 'at worst' only return undefined data.
A read is defined as a traditional UAV/SRV/CBV/VBV/IBV/IndirectArg binding/read, or as a blend/depth-testing-induced read.
BEGIN_PRESERVE
signifies the application has a dependency on the prior
contents of the resource, and the contents must be loaded from main
memory.
BEGIN_CLEAR
signifies the application has a dependency on the resource
being cleared to a specific (app-provided) color. This clear occurs
whether or not the resource is interacted with any further in the Render
Pass.
The API will allow the application to specify the clear values at
BeginRenderPass time, via a D3D12_CLEAR_VALUE
struct.
NO_ACCESS
signifies the resource will not be read from or written to
during the Render Pass. It is most expected to used to denote whether
the depth/stencil plane for a DSV is not accessed.
Must be paired with ENDING_ACCESS_NO_ACCESS
.
BEGIN_RESUME_WRITING
signifies that the application is resuming
writing to a surface that was previously written to in the previous
Render Pass and had the ENDING_ACCESS_SUSPEND_WRITING
flag, and no
intervening GPU work occurred between these two Render Passes.
It is spec'd that the writes in the 'resuming' Render Pass occur after
the writes in the 'suspending' Render Pass.
The intent of this flag is to allow writes to the same Render Target
to span multiple command lists, without flushing any on-chip caches.
RESUME_WRITING
may resume from a Render Pass in a separate Command
List, as long as the suspending/resuming Command Lists are executed
(back to back) in the same ExecuteCommandLists group.
[v0.08] -- In v0.08, suspend/resume was moved from being a per-view flag and moved to be a general command list flag. Same semantics (suspend/resumes must not have any intervening GPU work, and if they span command lists those command lists must be in the same ExecuteCommandLists call.)
BEGINNING_ACCESS_RESUME_LOCAL_READ
signifies that the application
wants to read from a resource that was previously immediately written
to by the GPU (i.e. still potentially in tile cache), and will be read
from in a pixel-local fashion (i.e. reads will always be on the exact
output pixel, or adjacent pixels).
The goal of this enum (paired with
ENDING_ACCESS_SUSPEND_LOCAL_READ
) is to allow resources to be read
from (when possible) without issuing a flush between two subsequent
Render Pass operations.
Specifically, the application warrants:
-
The resource has previously been written to in a Render Pass that had the 'write' characteristics ofD3D12_RENDER_PASS_ENDING_ACCESS_SUSPEND_LOCAL_READ
. -
No GPU operations have occurred between the previous Render Pass'SUSPEND_LOCAL
and the current Render Pass.ARESUME_LOCAL_READ
may be chained from within the same Command List -or- between Command Lists in the same ExecuteCommandLists call, but not across separate ExecuteCommandLists calls.
-
The resource to be read from matches the size and format of the newly-bound render target -
All reads/writes are '1-to-1' (plus optional gutter read pixels specified by the app), the reads from the source texture will be written to the same location on the target texture, and the render target texture is the same size as the source texture.
The kernel size of the read (how many surrounding pixels are needed)
are specified at BeginRenderPass time, through the
AdditionalWidth/AdditionalHeight fields on
D3D12_RENDER_PASS_BEGINNING_ACCESS_RESUME_LOCAL_READ_PARAMETERS
.
The AdditionalWidth/AdditionalHeight parameters must match those on the
previous ENDING_ACCESS_SUSPEND_LOCAL_READ
.
[v0.08] Removed from API/DDI, may be added in future release in response to ISV usage scenarios.
ENDING_ACCESS_DISCARD
signifies the application will have no future
dependency on the data written to the resource during this Render Pass
(may be appropriate for a depth buffer where the depth buffer will never
be textured from prior to future writes).
ENDING_ACCESS_PRESERVE
signifies the application will have a
dependency on the written contents of this resource in the future (and
they must be preserved).
ENDING_ACCESS_RESOLVE
allows the application to directly resolve a
MSAA surface to a separate resource at the conclusion of the Render Pass
(ideally while the MSAA contents are currently in the tile cache, for
TBDRs).
The resolve dest is expected to be in the RESOLVE_DEST
resource state
at the time the Render Pass ends.
The resolve source will be left in its initial resource state at the time the Render Pass ends. Resolve operations sumbmitted by a render pass will not implicitly change the state of any resources.
ENDING_ACCESS_NO_ACCESS
signifies the resource will not be read from
or written to during the Render Pass. It is most expected to used to
denote whether the depth/stencil plane for a DSV is not accessed.
Must be paired with BEGINNING_ACCESS_NO_ACCESS
.
See D3D12_RENDER_PASS_BEGINNING_ACCESS_RESUME_WRITING.
ENDING_ACCESS_SUSPEND_WRITING
signifies that the application will
continue writing to the resource in an immediately succeeding Render
Pass (with no intervening GPU work).
See D3D12_RENDER_PASS_BEGINNING_ACCESS_RESUME_LOCAL_READ.
Signifies the application has written to the resource, and it will
read from the resource in the future in a one-to-one (plus optional
gutter pixels) fashion (in the pixel shader, only the current pixel plus
an optional number of surrounding pixels will be read from). Most
importantly, this enum signifies that no GPU operations will occur
between the SUSPEND and RESUME.
typedef enum D3D12_RENDER_PASS_FLAGS
{
D3D12_RENDER_PASS_FLAG_NONE = 0,
D3D12_RENDER_PASS_FLAG_ALLOW_UAV_WRITES = 0x1,
D3D12_RENDER_PASS_FLAG_SUSPENDING_PASS = 0x2,
D3D12_RENDER_PASS_FLAG_RESUMING_PASS = 0x4
} D3D12_RENDER_PASS_FLAGS;
UAV writes are permitted within a Render Pass, but the user must
specifically indicate that they will be issuing UAV writes within the
Render Pass with the ALLOW_UAV_WRITES
flag, to let UMDs opt out of
tiling if necessary.
UAV accesses must follow the earlier "writes are not valid to read until the end of the Render Pass". Relatedly, UAV barriers are not permitted within a Render Pass.
UAV bindings (via root tables or root descriptors) are inherited into Render Passes, and are propagated out of Render Passes.
In v0.08, applications now suspend/resume an entire render pass. Suspending/Resuming render passes must have identical views/access flags between the passes, and may not have any intervening GPU ops (draws, dispatches, discards, clears, copies, updatetilemappings, writebufferimmediates, queries, query resolves, ..) between the suspending/resuming render passes.
The intended use case is multi-threaded rendering, where say four CLs (each with their own render passes) could target the same render targets. When render passes are suspended/resumed across command lists, the command lists must be executed in the same ExecuteCommandLists call.
A render pass can be both resuming and suspending -- in the multi-threaded example, CLs 2 and 3 would be resuming from 1 and 2 respectively, and suspending to 3 and 4 respectively.
Applications will be able to query the extent to which a UMD/HW efficiently supports Render Passes.
Though Render Passes will always function given the runtime's mapping logic, this allows applicatinos (notably 11on12) to determine when it is (possibly) worth their while to issue their commands as Render Passes, and when it is definitely not a benefit (when the runtime is just mapping to the existing API surface).
There will be three tiers of support:
-
D3D12_RENDER_PASS_TIER_0
-- UMD has not implemented DDI Table, supported via SW emulation -
D3D12_RENDER_PASS_TIER_1
-- Render Passes implemented by UMD, RT/DB writes may be accelerated. UAV writes not efficiently supported within the Render Pass. -
D3D12_RENDER_PASS_TIER_2
-- Tier 1, plus that UAV writes (pursuant to the read-after-write prohibition) are likely to be efficiently supported (more efficient than issuing the same work without a Render Pass).
The UMD will report back these tiers to the runtime. The runtime will validate that UMDs which fill out the DDI table at least report back TIER_1, and at the same time will validate that UMDs which do not fill out the DDI table do not claim anything other than TIER_0 support (the runtime will fail device creation in this case).
This requirement will only be present for UMDs supporting a the DDI build version in which this change is made.
-
Should the
BEGINNING/ENDING_ACCESS
flags be at the sub-resource level for RTVs? They are already specific to the depth/stencil planes.- [v0.08] Not for RS5
-
Should the user specify at the Render Pass level the write/read-regions of a resource? Can that be inferred through viewport/scissor information mostly-well-enough?
- [v0.08] Not for RS5
-
Do PSOs, VBs/IBs need to be incorporated into the Render Pass structure?
- [v0.08] No indication this is needed.
-
Better name than
_DISCARD
? E.g. "DON'T_CARE
"? -
Is it ok to not support SO within a Render Pass (spec currently does not support this)
- [v0.08] SO is supported within render passes
-
Should we let users call
OMSetRenderTargets
within a Render Pass, as long as they still to RTVs specified?-
Need to determine if there's a scenario for this that's worth the extra complexity
-
[v0.08] No
-
-
Should we add back in resource barriers associated with BeginRenderPass? Is that buying UMDs anything?
-
Can ROV writes be supported?
-
Else?
// Beginning Access
typedef enum D3D12_RENDER_PASS_BEGINNING_ACCESS_TYPE
{
D3D12_RENDER_PASS_BEGINNING_ACCESS_TYPE_DISCARD,
D3D12_RENDER_PASS_BEGINNING_ACCESS_TYPE_PRESERVE,
D3D12_RENDER_PASS_BEGINNING_ACCESS_TYPE_CLEAR,
D3D12_RENDER_PASS_BEGINNING_ACCESS_TYPE_NO_ACCESS
} D3D12_RENDER_PASS_BEGINNING_ACCESS_TYPE;
typedef struct D3D12_RENDER_PASS_BEGINNING_ACCESS_CLEAR_PARAMETERS
{
D3D12_CLEAR_VALUE ClearValue;
} D3D12_RENDER_PASS_BEGINNING_ACCESS_CLEAR_PARAMETERS;
typedef struct D3D12_RENDER_PASS_BEGINNING_ACCESS
{
D3D12_RENDER_PASS_BEGINNING_ACCESS_TYPE Type;
union
{
D3D12_RENDER_PASS_BEGINNING_ACCESS_CLEAR_PARAMETERS Clear;
};
} D3D12_RENDER_PASS_BEGINNING_ACCESS;
// Ending Access
typedef enum D3D12_RENDER_PASS_ENDING_ACCESS_TYPE
{
D3D12_RENDER_PASS_ENDING_ACCESS_TYPE_DISCARD,
D3D12_RENDER_PASS_ENDING_ACCESS_TYPE_PRESERVE,
D3D12_RENDER_PASS_ENDING_ACCESS_TYPE_RESOLVE,
D3D12_RENDER_PASS_ENDING_ACCESS_TYPE_NO_ACCESS
} D3D12_RENDER_PASS_ENDING_ACCESS_TYPE;
typedef struct D3D12_RENDER_PASS_ENDING_ACCESS_RESOLVE_SUBRESOURCE_PARAMETERS
{
UINT SrcSubresource;
UINT DstSubresource;
UINT DstX;
UINT DstY;
D3D12_RECT* pSrcRect;
} D3D12_RENDER_PASS_ENDING_ACCESS_RESOLVE_SUBRESOURCE_PARAMETERS;
typedef struct D3D12_RENDER_PASS_ENDING_ACCESS_RESOLVE_PARAMETERS
{
ID3D12Resource* pSrcResource;
ID3D12Resource* pDstResource;
// Can be a subset of RT's array slices, but can't target
// subresources that weren't part of RTV/DSV.
UINT SubresourceCount;
const D3D12_RENDER_PASS_ENDING_ACCESS_RESOLVE_SUBRESOURCE_PARAMETERS* pSubresourceParameters;
DXGI_FORMAT Format;
D3D12_RESOLVE_MODE ResolveMode;
BOOL PreserveResolveSource; // New to v0.08
} D3D12_RENDER_PASS_ENDING_ACCESS_RESOLVE_PARAMETERS;
typedef struct D3D12_RENDER_PASS_ENDING_ACCESS
{
D3D12_RENDER_PASS_ENDING_ACCESS_TYPE Type;
union
{
D3D12_RENDER_PASS_ENDING_ACCESS_RESOLVE_PARAMETERS Resolve;
};
} D3D12_RENDER_PASS_ENDING_ACCESS;
// Render Target Desc
typedef struct D3D12_RENDER_PASS_RENDER_TARGET_DESC
{
D3D12_CPU_DESCRIPTOR_HANDLE cpuDescriptor;
D3D12_RENDER_PASS_BEGINNING_ACCESS BeginningAccess;
D3D12_RENDER_PASS_ENDING_ACCESS EndingAccess;
} D3D12_RENDER_PASS_RENDER_TARGET_DESC;
// Depth-Stencil Desc
typedef struct D3D12_RENDER_PASS_DEPTH_STENCIL_DESC
{
D3D12_CPU_DESCRIPTOR_HANDLE cpuDescriptor;
D3D12_RENDER_PASS_BEGINNING_ACCESS DepthBeginningAccess;
D3D12_RENDER_PASS_BEGINNING_ACCESS StencilBeginningAccess;
D3D12_RENDER_PASS_ENDING_ACCESS DepthEndingAccess;
D3D12_RENDER_PASS_ENDING_ACCESS StencilEndingAccess;
} D3D12_RENDER_PASS_DEPTH_STENCIL_DESC;
// UAV Access Mode
typedef enum D3D12_RENDER_PASS_FLAGS
{
D3D12_RENDER_PASS_FLAG_NONE = 0x0,
D3D12_RENDER_PASS_FLAG_ALLOW_UAV_WRITES = 0x1,
D3D12_RENDER_PASS_FLAG_SUSPENDING_PASS = 0x2,
D3D12_RENDER_PASS_FLAG_RESUMING_PASS = 0x4
} D3D12_RENDER_PASS_UAV_ACCESS_FLAGS;
DEFINE_ENUM_FLAG_OPERATORS( D3D12_RENDER_PASS_FLAGS );
[uuid(9742FB99-3D7D-4572-977C-CED7D15EB709), object, local, pointer_default(unique)]
interface ID3D12GraphicsCommandList4 : ID3D12GraphicsCommandList3
{
void BeginRenderPass(
[annotation("_In_")] UINT NumRenderTargets,
[annotation("_In_reads_opt_(NumRenderTargets)")] const D3D12_RENDER_PASS_RENDER_TARGET_DESC* pRenderTargets,
[annotation("_In_opt_")] const D3D12_RENDER_PASS_DEPTH_STENCIL_DESC* pDepthStencil,
D3D12_RENDER_PASS_FLAGS Flags
);
void EndRenderPass();
}
The following validation will be added to the core runtime:
-
[checked in 3/15/18] Render Passes may not be started within non-DIRECT command lists
-
[checked in 3/20/18] Render Passes must be ended before calling ID3D12CommandList::Close
-
[checked in 3/20/18] Render Passes may not be nested (two begins in a row)
-
[checked in 3/20/18] A Render Pass may not be ended twice in a row
-
[checked in 3/22/18] Disallowed APIs may not be called during a Render Pass (command list will be removed)
The following will be added to SDK Layers:
-
[checked in 3/20/18] Translate
DISCARD
(BEGINNING_ACCCESS
orENDING_ACCESS
) to "clear-to-a-random-value", since that shouldn't impact app at all. -
No GPU work occurs within a command list between a
ENDING_SUSPEND
andBEGINNING_RESUME
. -
If a Command List ends with a
ENDING_SUSPEND
, the next command list within that ExecuteCommandLists group must exist, and must begin with aBEGINNING_RESUME
(with matching RT/DBs). -
A ResourceBarrier is not issued on a resource that was actually written within the current Render Pass (e.g. a bound UAV, or currently-bound RTs/DBs).
11on12 will replace OMSetRenderTargets calls with BeginRenderPass/EndRenderPass as appropriate to test general rendering functionality with Render Passes.
For testing more specific functionality, the following will be exercised via the HLK:
-
A BEGINNING clear, with no rendering in Render Pass, and an ending PRESERVE results in a clear, for various resource formats and types (and RT counts).
-
A BEGINNING clear, with rendering in the Render Pass, and an ending PRESERVE results in a clear (with drawing properly ordered after the clear), for various resource formats and types (and RT counts).
-
A BEGINNING discard, that uses a blending or depth operations that have dependencies on existing contents does not result in a GPU hang (undefined rendering values is fine).
-
A ENDING resolve correctly resolves resources in a variety of configurations (including using the new MIN/MAX capabilities for depth/stencil that were added in ResolveSubresourceRegion).
-
Utilizing SUSPEND/RESUME results in no rendering difference (versus
ENDING_PRESERVE
/BEGINNING_PRESERVE
), for various resource formats and types. -
For a BEGINNING/ENDING preserve, when no work occurs in the Render Pass, the values are still present in the Render Target outside of the Render Pass
Date | Version | Description |
---|---|---|
8/7/2019 | 1.0 | Updated public spec to match implementation regarding SetDescriptorHeaps |
10/30/2018 | 0.09 | Clarified inconsistency about what resource state render targets and depth buffers are left in after a render pass that ends with a resolve |
5/15/2018 | 0.08 | General API/DDI refactoring – runtime support for previous DDI version remains, but will be removed prior to feature complete. Suspend/Resume has been refactored to be at the pass level, rather than per-view. Suspend/Resume_LOCAL has been removed from the API/DDI, will be re-added if more ISV scenarios are identified in a future release. SetDescriptorHeaps no longer allowed during a render pass. Applications can now explicitly specify whether during a resolve the resolve source (e.g. the bound RT/DB) should be preserved or not. Minor API/DDI enum renaming to better follow D3D12 conventions |
4/10/2018 | 0.07 | Allow Resource Barriers that obey the “no read dependencies may be taken on a write that occurred during the Render Pass” rule. Add new API/DDI CAP for applications to know whether Render Passes are potentially more optimal on the hardware (e.g. whether the DDI table is even implemented), and for the driver to reflect whether UAV writes within the Render Pass can be implemented efficiently. Remove SOSetTargets from the list of disallowed APIs within a Render Pass, fill in some APIs that were missing from the list (notably copies). One new HLK test case |
3/7/2018 | 0.06 | Allow the Resolve-at-EndRenderPass operation to (optionally) target multiple subresources, earlier design assumed just a single subresource per RTV. Changes to D3D12_RENDER_PASS_ENDING_ACCESS_RESOLVE_SUBRESOURCE_PARAMETERS and D3D12_RENDER_PASS_ENDING_ACCESS_RESOLVE_PARAMETERS |
2/16/2018 | 0.05 | Address feedback: Remove Resource Barrier integration from BeginRenderPass, feedback is that it is likely not a significant win, and this simplifies the design. Can add back in if potential wins identified once up and running. Change ONE_TO_ONE_READ to LOCAL_READ , where user specifies kernel size for reads. |
2/14/2018 | 0.04 | Address feedback: Clear*View APIs are not allowed within a Render Pass. Clarified that Render Targets/Depth Buffer are in an undefined state following a Render Pass, don’t necessarily need to be cleanly unbound by UMD. Open issue if runtime enforces this, similar to how it unbinds RTs at start of Command List. Added following to list of open issues: Should SO be allowed within a Render Pass? |
2/13/2018 | 0.03 | Add API headers. Move access specification to sub-resource granularity.Reduce verbosity of UAV bindings.Add “ISV can validate proper usage of feature even if UMD does not support feature” design goal. |
01/23/2018 | 0.02 | Design simplification to only allow a single set of writable surfaces per Render Pass. |
01/09/2018 | 0.01 | Initial version |