For now, the Combiner generates one shader per set of options. It then searches every generated shader to see whether the given set of options already has a shader to use, and generates the shader if not.
While not hugely expensive, this lookup seems to be a big part of the CPU work, not to mention the string concatenation and everything else needed to generate and compile many GLSL shaders. We could optimize all of this, but I think we could do GLSL branching on uniforms instead.
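To make the cost being discussed concrete, here is a minimal sketch of a find-or-generate cache of this kind. It is not the plugin's actual code: `CombinerOptions`, `ShaderCache`, and their fields are hypothetical names, and the compile step is a stand-in counter rather than real GL calls.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical set of combiner options; the real option set is not shown here.
struct CombinerOptions {
    std::uint32_t mux0;
    std::uint32_t mux1;
    bool fog;
    bool operator==(const CombinerOptions& o) const {
        return mux0 == o.mux0 && mux1 == o.mux1 && fog == o.fog;
    }
};

struct CachedShader {
    CombinerOptions options;
    unsigned program;  // stand-in for a GL program handle
};

class ShaderCache {
public:
    // Returns the program matching `opts`, generating one on a miss.
    unsigned findOrCreate(const CombinerOptions& opts) {
        // Linear scan over every shader generated so far.
        for (const CachedShader& s : shaders_) {
            if (s.options == opts) return s.program;
        }
        unsigned program = compile(generateSource(opts));
        shaders_.push_back({opts, program});
        return program;
    }

    std::size_t size() const { return shaders_.size(); }
    unsigned compileCount() const { return compiles_; }

private:
    // Placeholder for the string concatenation that builds the GLSL source.
    static std::string generateSource(const CombinerOptions& opts) {
        return "// mux0=" + std::to_string(opts.mux0) +
               " mux1=" + std::to_string(opts.mux1) +
               " fog=" + std::to_string(opts.fog ? 1 : 0) + "\n";
    }

    // Placeholder for the real compile and link step; it only counts calls.
    unsigned compile(const std::string& /*source*/) { return ++compiles_; }

    std::vector<CachedShader> shaders_;
    unsigned compiles_ = 0;
};
```

Every lookup walks the list and every miss pays for source generation plus compilation, which is the work the uniform-branching idea tries to remove.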
Branching on the GPU is not best practice, but:
The N64 combiner computation is very simple: (a-b)*c+d. No complex effects here.
The idea is to branch only on uniforms, not varyings (no per-vertex or per-pixel branching).
On top of that, even a cheap GPU spends its time waiting for the CPU.
What I suggest is to branch on uniforms directly in the GLSL code and avoid the CPU lookup. This would load the GPU more (not by much, actually) but free up precious CPU resources.
I want to make Rice run on GLES 2 devices first before jumping on this, but it should improve performance.
In general, decreasing CPU load is a good thing, even if it means putting more work on the GPU. In my experience the GPU is rarely the bottleneck here anyway.
The N64 did the switch in hardware and only dealt with single colors (textures were sampled earlier in the pipeline). That does not translate directly to GLSL, as you have to deal with texture sampling in the GLSL code.
Any feedback is welcome.
It looks to me like your shader lookup code shouldn't really be that expensive unless the N64 ROM creates a lot of different shaders (hundreds or thousands). However, compiling a shader is an expensive operation, so you may see some dropped frames when shaders are first used. As far as moving all the branching/switch code into the shader, I think that's a reasonable thing to do. It will simplify your host-side code a lot and avoid compiling shaders at run time. The GPU overhead from branching on the uniform variables should be very low.