Scene Management in Compute #62

Closed
devshgraphicsprogramming opened this issue Feb 6, 2021 · 4 comments
Labels
enhancement New feature or request

Comments

devshgraphicsprogramming commented Feb 6, 2021

Description

Description of the related problem

Solution proposal

Compute Shader for position update decoupled from framerate [REWRITE the WRITE UP]

Position Update Dispatch
Takes the relative transform updates requested by the user and the animation blends scheduled for this frame (the animation blending rate differs per node and is supplied from a buffer).

User-supplied relative matrices overwrite the target nodes' relative transforms: the blend manager's relative transform modifications are pushed to TransformTreeManager before the user-sourced CPU relative transform modifications are.

The first blend for a node overwrites the target node's relative transform; the following blends accumulate onto it.
We map 1 invocation : 1 node, not 1 invocation : 1 blend.

Therefore the blendlists need to be kept together by nodeID via
https://github.com/Devsh-Graphics-Programming/Nabla/blob/scene_manager/include/nbl/builtin/glsl/transform_tree/modification_request_range.glsl
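The 1 invocation : 1 node mapping can be illustrated with a CPU-side sketch. Everything here is hypothetical and simplified: `Blend`, `BlendRange`, `updateNode` and `dispatchPositionUpdate` are illustrative names rather than the contents of the linked header, a transform is reduced to a translation, each blend is assumed to carry an already-sampled keyframe value, and the weighting scheme (weighted overwrite followed by weighted adds) is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a relative transform.
struct Vec3 { float x, y, z; };

struct Blend {
    uint32_t nodeID; // node this blend targets
    Vec3 value;      // already-sampled keyframe value (assumption)
    float weight;
};

// Half-open range [begin, end) of blends belonging to one node.
// Blends must be grouped by nodeID for this to work.
struct BlendRange { uint32_t nodeID, begin, end; };

// Emulates one invocation: walks every blend of a single node.
inline void updateNode(const BlendRange& range, const std::vector<Blend>& blends,
                       std::vector<Vec3>& relativeTransforms)
{
    assert(range.begin <= range.end && range.end <= blends.size());
    Vec3& dst = relativeTransforms[range.nodeID];
    for (uint32_t i = range.begin; i < range.end; ++i) {
        const Blend& b = blends[i];
        assert(b.nodeID == range.nodeID); // grouping invariant
        if (i == range.begin) {
            // The first blend overwrites the relative transform.
            dst = {b.weight * b.value.x, b.weight * b.value.y, b.weight * b.value.z};
        } else {
            // The following blends accumulate.
            dst.x += b.weight * b.value.x;
            dst.y += b.weight * b.value.y;
            dst.z += b.weight * b.value.z;
        }
    }
}

// Emulates the dispatch: one "invocation" per node range.
inline void dispatchPositionUpdate(const std::vector<BlendRange>& ranges,
                                   const std::vector<Blend>& blends,
                                   std::vector<Vec3>& relativeTransforms)
{
    for (const BlendRange& r : ranges)
        updateNode(r, blends, relativeTransforms);
}
```

Because each invocation owns its node's output exclusively, no atomics are needed on the relative transform, which is the main appeal of grouping by nodeID.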

Nodes without parents should write to their global matrices directly as an optimization if possible.

Input:

  • Animation Blends <nodeID, keyframeRangeID (or the range itself), start timestamp, weight, flags (e.g. whether to loop)>
  • Animation Keyframes
  • Node attributes (update timestamp, relative transform, etc.)

Output:

  • Changed node attributes (modification timestamp, relative transform, etc.)
  • Root node global transforms

Improvement: reduce blend rates for nodes that have small projections on the screen (far away), lie outside all frustums, or are hidden according to occlusion culling.

At this point a question arises: do we control the animation rate by testing current_timestamp - last_modified_timestamp > animation_rate_delta (which requires launching an invocation for every single animated node), or do we somehow bin/sort the nodes by update frequency (how!?) and dispatch only what we need (which would require a prefix sum dispatch before the main one)?
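A minimal CPU-side sketch of how the two options could combine: the timestamp test flags the nodes that are due, and an exclusive prefix sum compacts them into a dense list that an indirect dispatch could then consume. `NodeTiming` and `compactDueNodes` are hypothetical names, timestamps are assumed to be monotonically increasing unsigned integers, and the serial loops stand in for what would be parallel passes on the GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-node timing attributes.
struct NodeTiming {
    uint32_t lastModifiedTimestamp;
    uint32_t animationRateDelta; // minimum time between updates
};

// Returns the indices of nodes whose update is due, in ascending order.
inline std::vector<uint32_t> compactDueNodes(const std::vector<NodeTiming>& nodes,
                                             uint32_t currentTimestamp)
{
    const std::size_t n = nodes.size();

    // Flag + exclusive prefix sum: offsets[i] is the output slot of node i,
    // offsets[n] is the total number of due nodes (the dispatch size).
    std::vector<uint32_t> offsets(n + 1, 0u);
    for (std::size_t i = 0; i < n; ++i) {
        assert(currentTimestamp >= nodes[i].lastModifiedTimestamp);
        const bool due = currentTimestamp - nodes[i].lastModifiedTimestamp
                         > nodes[i].animationRateDelta;
        offsets[i + 1] = offsets[i] + (due ? 1u : 0u);
    }

    // Scatter: every due node writes its index to its own slot.
    std::vector<uint32_t> dueNodes(offsets[n]);
    for (std::size_t i = 0; i < n; ++i)
        if (offsets[i + 1] != offsets[i])
            dueNodes[offsets[i]] = static_cast<uint32_t>(i);
    return dueNodes;
}
```

Note that this still touches every animated node once in the flagging pass; it only saves work in the (presumably heavier) main update dispatch.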

The best option would be to treat this like a particle system (have animation records jump between buckets), but that makes it pretty hard to remove/pause/suspend/end already added animations (it requires a parallel search through the deletion list).

Manual control of skeleton nodes is achieved via not having any animation blends.
Global matrix override is achieved via not having, or detaching from, a parent.
Bind pose matrices are handled via adding pseudo-children.

Compute Shader for bone translation

devshgraphicsprogramming commented Sep 14, 2021

GPU Culling Compute Shader

Cull, per view (frustum and possibly occlusion), the meshbuffers of the chosen LoD meshes of all the instances.

Input:

  • Instances (UIDs and their root nodes and Meshes, bounding box overrides for skinned instances)
  • Meshes (their Meshbuffers)
  • Meshbuffers (AABBs, mappings to MDI offsets)

Optimization opportunities:

  • Dispatch Indirect

Output:

  • Correct instance counts in MDI structs
  • Visible instance-meshbuffer lists per camera

Forward Compatibility:

  • ?
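As a rough CPU-side illustration of the frustum part of this pass, the sketch below tests each instance-meshbuffer AABB against a set of planes and, for survivors, bumps the instance count of the matching draw and appends to the visible list. All names are hypothetical; `DrawIndexedIndirectCommand` merely mirrors the field layout of Vulkan's `VkDrawIndexedIndirectCommand`, bounding boxes are assumed to already be in world space, and the increment would be an atomic add on the GPU.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Plane { float n[3]; float d; };              // inside when dot(n, p) + d >= 0
struct AABB  { float minCorner[3]; float maxCorner[3]; };

// Same field layout as VkDrawIndexedIndirectCommand.
struct DrawIndexedIndirectCommand {
    uint32_t indexCount, instanceCount, firstIndex;
    int32_t  vertexOffset;
    uint32_t firstInstance;
};

// Hypothetical culling candidate: one meshbuffer of one instance.
struct Candidate { uint32_t instanceID; uint32_t meshbufferID; AABB worldBox; };

// Conservative test: the box is rejected only if it lies fully behind a plane.
inline bool intersectsFrustum(const AABB& box, const std::vector<Plane>& planes)
{
    for (const Plane& p : planes) {
        float dist = p.d;
        // Pick the box corner farthest along the plane normal.
        for (int axis = 0; axis < 3; ++axis)
            dist += p.n[axis] * (p.n[axis] >= 0.0f ? box.maxCorner[axis]
                                                   : box.minCorner[axis]);
        if (dist < 0.0f) return false;
    }
    return true;
}

// Culls for one view; returns the (unordered) visible list.
inline std::vector<Candidate> cullView(const std::vector<Candidate>& candidates,
                                       const std::vector<Plane>& frustum,
                                       std::vector<DrawIndexedIndirectCommand>& draws)
{
    for (DrawIndexedIndirectCommand& d : draws) d.instanceCount = 0u;
    std::vector<Candidate> visible;
    for (const Candidate& c : candidates) {
        assert(c.meshbufferID < draws.size());
        if (!intersectsFrustum(c.worldBox, frustum)) continue;
        ++draws[c.meshbufferID].instanceCount; // atomicAdd on the GPU
        visible.push_back(c);
    }
    return visible;
}
```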

MDI Compaction Compute Shader

Input:

  • unordered lists of visible instance-meshbuffer per camera

Output:

  • ordered lists of visible instance IDs for MDI calls
  • MDI call parameter structs with the correct base instance
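One way to picture this pass is as a counting sort keyed on meshbuffer: count visible instances per draw, take an exclusive prefix sum to get each draw's base instance, then scatter instance IDs into their draw's slice. The sketch below is a CPU-side illustration under those assumptions; `VisiblePair` and `compactForMDI` are hypothetical names, one draw per meshbuffer is assumed, and `DrawIndexedIndirectCommand` only mirrors the layout of Vulkan's `VkDrawIndexedIndirectCommand`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Same field layout as VkDrawIndexedIndirectCommand.
struct DrawIndexedIndirectCommand {
    uint32_t indexCount, instanceCount, firstIndex;
    int32_t  vertexOffset;
    uint32_t firstInstance;
};

// Hypothetical entry of the unordered visible list for one camera.
struct VisiblePair { uint32_t instanceID; uint32_t meshbufferID; };

// Fills instanceCount/firstInstance and returns instance IDs ordered by draw.
inline std::vector<uint32_t> compactForMDI(const std::vector<VisiblePair>& visible,
                                           std::vector<DrawIndexedIndirectCommand>& draws)
{
    // 1. Count visible instances per draw.
    for (DrawIndexedIndirectCommand& d : draws) d.instanceCount = 0u;
    for (const VisiblePair& v : visible) {
        assert(v.meshbufferID < draws.size());
        ++draws[v.meshbufferID].instanceCount;
    }

    // 2. Exclusive prefix sum over the counts gives each draw's base instance.
    uint32_t running = 0u;
    for (DrawIndexedIndirectCommand& d : draws) {
        d.firstInstance = running;
        running += d.instanceCount;
    }

    // 3. Scatter each instance ID into its draw's slice of the ordered list.
    std::vector<uint32_t> cursor(draws.size(), 0u);
    std::vector<uint32_t> ordered(visible.size());
    for (const VisiblePair& v : visible) {
        const uint32_t slot = draws[v.meshbufferID].firstInstance
                              + cursor[v.meshbufferID]++;
        ordered[slot] = v.instanceID;
    }
    return ordered;
}
```

In a shader, the instance ID for a given draw would then be fetched from the ordered list at `firstInstance + gl_InstanceIndex - firstInstance`-style indexing; the exact lookup depends on how the base instance is exposed, which this sketch does not cover.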

devshgraphicsprogramming commented Sep 14, 2021

Bone MVP*BindPose Computation

TODO

Compute Skinning (Optional)

We need to allocate a full copy of all vertices of a skinned meshbuffer from a dedicated buffer range meant for vertex positions and normals.
This needs to happen for every instance.

To save space we can alias all LoDs to the same allocation (only one LoD is rendered at a time, unless you've got two viewing cameras); therefore we should allow for a flag specifying whether all views must use the same LoD (aliased) or not.

Compute skinning recomputes the vertex positions and normals (fresh position and normal buffer per instance).
This means that skinned meshes must use programmable pulling for vertex positions and normals.

The skinning must always be an indirect dispatch; we should only skin potentially visible instances (at least frustum cull them).

Input:

  • TODO
  • Global Table of Nodes

Output:

  • Transformed positions and normals of meshbuffer vertices
  • New and accurate bounding boxes of the instances to be used for GPU culling
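The two outputs above can be produced in a single pass, as in the CPU-side sketch below. It assumes standard linear blend skinning with four influences per vertex and 3x4 bone matrices that already include the bind pose; normals are omitted for brevity, and all names (`SkinnedVertex`, `skinPositions`, and so on) are hypothetical. On the GPU the bounding box reduction would need atomics or a separate reduction step.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

struct Vec3   { float x, y, z; };
struct Mat3x4 { float m[3][4]; };  // row-major, last column is the translation
struct AABB   { Vec3 minCorner, maxCorner; };

// Hypothetical vertex layout: position plus four joint influences.
struct SkinnedVertex { Vec3 position; uint32_t joints[4]; float weights[4]; };

inline Vec3 transformPoint(const Mat3x4& M, const Vec3& p)
{
    return {M.m[0][0] * p.x + M.m[0][1] * p.y + M.m[0][2] * p.z + M.m[0][3],
            M.m[1][0] * p.x + M.m[1][1] * p.y + M.m[1][2] * p.z + M.m[1][3],
            M.m[2][0] * p.x + M.m[2][1] * p.y + M.m[2][2] * p.z + M.m[2][3]};
}

// Writes skinned positions into `skinned` and returns their bounding box.
inline AABB skinPositions(const std::vector<SkinnedVertex>& vertices,
                          const std::vector<Mat3x4>& boneMatrices,
                          std::vector<Vec3>& skinned)
{
    const float inf = std::numeric_limits<float>::infinity();
    AABB box{{inf, inf, inf}, {-inf, -inf, -inf}};
    skinned.resize(vertices.size());
    for (std::size_t v = 0; v < vertices.size(); ++v) {
        Vec3 acc{0.0f, 0.0f, 0.0f};
        for (int i = 0; i < 4; ++i) {
            const uint32_t joint = vertices[v].joints[i];
            assert(joint < boneMatrices.size());
            // Weighted sum of the position transformed by each influencing bone.
            const Vec3 p = transformPoint(boneMatrices[joint], vertices[v].position);
            const float w = vertices[v].weights[i];
            acc.x += w * p.x; acc.y += w * p.y; acc.z += w * p.z;
        }
        skinned[v] = acc;
        box.minCorner = {std::min(box.minCorner.x, acc.x),
                         std::min(box.minCorner.y, acc.y),
                         std::min(box.minCorner.z, acc.z)};
        box.maxCorner = {std::max(box.maxCorner.x, acc.x),
                         std::max(box.maxCorner.y, acc.y),
                         std::max(box.maxCorner.z, acc.z)};
    }
    return box;
}
```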

Optimization opportunities (in order of ease):

  • Skin only meshbuffers with moved armature (Dispatch Indirect from the transform hierarchy / animation update stage)
  • Skin only possibly visible meshbuffers
  • Output only potentially visible triangles' vertices / do triangle culling, so the vertex data gets compacted
  • Do the whole allocation of vertex data from a range on the GPU

Forward Compatibility:

  • Proper Smooth Normal Computation (secondary index list of where to accumulate triangle normals to and with what weight)

almost done
