Allow indirect rendering, allow multimesh without transform for each instance #8647
Replies: 4 comments 5 replies
-
I had a similar discussion and looked into it a bit more; apparently, the way to do this kind of thing in Godot for now is with particles.
This comment was marked as off-topic.
-
Currently, GPU instancing in Godot is wrapped in MultiMesh, and most of its functionality is implemented at the core level, which means users can't use it for advanced purposes, although it does support 4 floats of custom data per instance.
-
Like the first post said, I think indirect rendering could be a very useful addition for grass and vegetation, and the current MultiMesh solution is somewhat inadequate. Let me elaborate.

**The problem**

MultiMesh works great for rendering the same object many times in a single draw call. However, generating a MultiMesh primarily requires two pieces of information: the number of instances to draw, and the transform of each instance. You have to calculate these transforms on the CPU, which is not as well suited to large parallel computing tasks as the GPU. Calculating the position/transform of each grass instance is also independent of every other instance, making the work perfectly suited to the GPU.

You *can* calculate the transforms in parallel on the GPU using a compute shader, but you still have to read that huge buffer of transforms back to the CPU, only for it to be sent to the GPU again for drawing, which seems unnecessary. The CPU isn't doing anything with the data once it receives it: it simply builds the MultiMesh to be sent back to the GPU without modifying anything. This readback from the compute shader, plus regenerating the MultiMesh, is also rather expensive.

**Is this a big problem?**

If you're alright with culling grass per MultiMesh chunk, you won't have to generate a new MultiMesh very often: only when the player moves into a new chunk do you have to update, and maybe generate, MultiMeshes. However, if you want per-instance visibility culling, then as soon as even a small part of a large grass chunk's bounding box enters the view frustum, the whole MultiMesh is rendered. To mitigate this you would have to work out which grass positions are on screen every frame, which means regenerating the MultiMesh every frame...
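The wasteful round trip described above can be sketched in a few lines of plain Python; the `gpu_*` functions are only stand-ins for a compute shader and the renderer, but they make the point that the CPU step merely repackages data it never modifies.

```python
# Minimal sketch of the GPU -> CPU -> GPU round trip. The "gpu_*" functions
# stand in for a compute shader and the renderer; nothing here is a Godot API.

def gpu_compute_transforms(count):
    """Stand-in for a compute shader generating one transform per instance."""
    return [(float(i), 0.0, 0.0) for i in range(count)]  # origins only, for brevity

def cpu_build_multimesh(transforms):
    """Stand-in for building a MultiMesh on the CPU: a pure pass-through."""
    return list(transforms)  # copied and re-uploaded... but never changed

def gpu_draw(buffer):
    return len(buffer)  # pretend this is the number of instances drawn

transforms = gpu_compute_transforms(100_000)  # GPU -> (readback) -> CPU
buffer = cpu_build_multimesh(transforms)      # CPU repackages the data
drawn = gpu_draw(buffer)                      # CPU -> (upload) -> GPU
assert buffer == transforms  # the CPU step added no information at all
print(drawn)
```

With indirect rendering, the middle step would disappear entirely: the compute output would stay on the GPU and be consumed by the draw directly.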
**Potential solution**

Indirect rendering would solve this: if developers want to move these transform calculations to the GPU and perform frustum culling each frame in a compute shader, they would no longer be bottlenecked by the current GPU → CPU → GPU round trip. Instead, they could point the GPU at the draw data it needs indirectly.
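To make the culling idea concrete, here is a plain-Python sketch of what the compute pass would produce for an indirect draw. For simplicity it culls against an axis-aligned view box rather than real frustum planes; the `indirect_args` dict mimics the role of an indirect draw argument buffer (like Vulkan's `VkDrawIndirectCommand`), which is the whole point — the instance count never has to travel through the CPU.

```python
# Illustrative stand-in for a compute-shader culling pass feeding an indirect
# draw. Culling against an AABB instead of frustum planes, for brevity.

def cull_instances(positions, view_min, view_max):
    """Compact visible instances and count them; the count is what an
    indirect draw argument buffer would hold on the GPU."""
    visible = [p for p in positions
               if all(lo <= c <= hi for c, lo, hi in zip(p, view_min, view_max))]
    indirect_args = {"instance_count": len(visible)}  # cf. VkDrawIndirectCommand
    return visible, indirect_args

# A 100x100 grid of grass positions, 1 unit apart.
positions = [(float(x), 0.0, float(z)) for x in range(100) for z in range(100)]
visible, args = cull_instances(positions, (0, -1, 0), (9, 1, 9))
print(args["instance_count"])  # only the 10x10 region inside the view box
```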
-
Hey, I'm writing a game that needs to spawn a lot of vegetation, and I really want it to be as efficient as possible so it works well on mobile.
Currently, a MultiMesh holds a Transform3D/2D for each instance it has, and the engine makes this mandatory (if you set its buffer to a size smaller than instance_count * sizeof(Transform3D/2D), it throws an error).
That takes a lot of memory and reduces performance when you want very high instance counts of very simple, predictable geometry.
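Some back-of-the-envelope arithmetic shows why this hurts. Assuming (I haven't verified this against the engine source) that a 3D transform is stored as 12 single-precision floats (a 3×4 matrix), the mandatory transform data alone costs:

```python
# Rough cost of the mandatory per-instance transform data.
# Assumption (not verified against Godot's source): one 3D transform is
# stored as 12 single-precision floats (a 3x4 matrix).
FLOATS_PER_TRANSFORM_3D = 12
BYTES_PER_FLOAT = 4

def transform_buffer_bytes(instance_count: int) -> int:
    """Bytes the mandatory transform data occupies for instance_count instances."""
    return instance_count * FLOATS_PER_TRANSFORM_3D * BYTES_PER_FLOAT

# One million grass blades would need ~48 MB just for transforms, even if
# each blade's placement could be derived from a couple of bytes (or nothing).
print(transform_buffer_bytes(1_000_000) / 1e6, "MB")
```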
I recently came across this video where he demonstrates optimizations for rendering a whole lot of triangles
One of them is sending much less data in the buffer, by omitting unneeded things like UVs, vertex positions, and more, because they can all be computed on the GPU from a much smaller amount of data.
Or, if I understood correctly, he is drawing the mesh indirectly on the GPU (I'm not proficient in graphics programming, so I hope I'm using the correct term).
Currently in Godot, the shader side of this approach is quite simple to achieve: each vertex gets a VERTEX_ID, which would let us position it accordingly without using any existing transform data.
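The idea of deriving placement from an index alone can be sketched as follows. This is plain Python standing in for what a vertex shader could compute from VERTEX_ID / INSTANCE_ID; the grid-plus-hash scheme is made up purely for illustration.

```python
# Deriving an instance's position purely from its index, the way a shader
# could derive it from VERTEX_ID / INSTANCE_ID. The hashing scheme here is
# illustrative, not taken from any engine.

def grass_position(index: int, row_width: int = 256, spacing: float = 0.5):
    """Deterministically place instance `index` on a jittered grid,
    using no stored per-instance data at all."""
    col = index % row_width
    row = index // row_width
    # Cheap deterministic "random" jitter derived from the index.
    h = (index * 2654435761) & 0xFFFFFFFF
    jitter_x = ((h & 0xFFFF) / 0xFFFF - 0.5) * spacing
    jitter_z = ((h >> 16) / 0xFFFF - 0.5) * spacing
    return (col * spacing + jitter_x, 0.0, row * spacing + jitter_z)

# The same position is recomputable anywhere (CPU or GPU) from the index,
# so no transform buffer needs to exist at all.
print(grass_position(0))
print(grass_position(12345))
```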
But there is no option to opt out of sending the whole transform in the buffer, or to stop the engine from creating it automatically when you increase the instance count.
And of course, ideally, the best capability would be being able to send any user-defined information in the buffer and then position vertices accordingly in a shader, conserving both memory and runtime.
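As a sketch of what a user-defined per-instance record could look like if the engine let us choose the buffer layout (a hypothetical format, not any Godot API), compare 8 packed bytes per blade against the roughly 48 bytes a full 3×4 float transform would take:

```python
# Hypothetical compact per-instance record, packed with the stdlib.
import struct

# 8 bytes per blade: 16-bit grid x/z, 8-bit height scale, 8-bit rotation,
# 16-bit species id -- versus ~48 bytes for a full 3x4 float transform.
RECORD = struct.Struct("<HHBBH")  # little-endian, 8 bytes total

def pack_blade(gx, gz, height, rot, species):
    return RECORD.pack(gx, gz, height, rot, species)

blob = b"".join(pack_blade(i % 256, i // 256, 200, i % 256, 3)
                for i in range(1000))
print(len(blob), "bytes for 1000 instances")  # vs ~48000 for full transforms
```

A shader reading such a buffer would then reconstruct the full transform per vertex, which is exactly the "position vertices accordingly in a shader" part above.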
With the right optimizations on the user's side, that would allow for creating very large or detailed worlds, be it terrain with a lot of vertices, as seen in the video, or a lot of repeating, mostly uniform instances, as in my vegetation case.
If there is any existing way or workaround to achieve a similar result in the current state of the engine, I would also really appreciate hearing about it!