Monday, 15 April 2013

opengl - Cost of branching on uniforms on modern GPUs


When using GLSL on a modern (GL 3.3+) GPU, how expensive is it to branch on a uniform?

In my engine I'm getting to the point where I have a lot of shaders, and I have many different quality presets for many of them. As it stands, I use uniforms with if() in the shaders to choose between quality presets. I'm worried, though, that I could get better performance by recompiling the shaders and using #ifdef instead.
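The two approaches being compared can be sketched roughly as follows. This is only an illustration under assumptions: the shader body, the names QUALITY and u_quality, and the helper withDefine are hypothetical and not taken from the engine in question. The same GLSL source either reads the preset from a uniform at run time or, when a #define is injected before compilation, sees it as a compile-time constant.

```cpp
#include <cstddef>
#include <string>

// Illustrative GLSL 3.30 fragment shader. QUALITY, u_quality and the shading
// code are hypothetical names chosen for this sketch.
const std::string kFragmentSource =
    "#version 330 core\n"
    "uniform int u_quality;\n"
    "out vec4 fragColor;\n"
    "void main() {\n"
    "#ifdef QUALITY\n"
    "    const int quality = QUALITY; // compile-time constant\n"
    "#else\n"
    "    int quality = u_quality;     // runtime value, same for every invocation\n"
    "#endif\n"
    "    if (quality >= 2) {\n"
    "        fragColor = vec4(1.0);   // stand-in for the expensive path\n"
    "    } else {\n"
    "        fragColor = vec4(0.5);   // stand-in for the cheap path\n"
    "    }\n"
    "}\n";

// Returns a copy of `source` with "#define <name> <value>" inserted directly
// after the #version line, because GLSL requires #version to come first.
// If the source has no leading #version line, the define is simply prepended.
std::string withDefine(const std::string& source, const std::string& name, int value) {
    const std::string define = "#define " + name + " " + std::to_string(value) + "\n";
    if (source.compare(0, 8, "#version") == 0) {
        const std::size_t eol = source.find('\n');
        if (eol == std::string::npos) {
            return source + "\n" + define;
        }
        return source.substr(0, eol + 1) + define + source.substr(eol + 1);
    }
    return define + source;
}
```

Calling withDefine(kFragmentSource, "QUALITY", 2) yields one specialized variant per preset, which would then be handed to glShaderSource and compiled separately. Passing kFragmentSource unchanged keeps a single program and selects the preset through the uniform.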

So what I really want to know is whether my fear is unfounded. Are branches on a uniform cheap on modern GPUs? I have done a few tests myself and found very little difference, but I have only tested on an NVIDIA 680.

I'll admit I am not an expert, but perhaps my speculation is better than nothing.

I would think that branches on uniforms are indeed cheap. This is very different from branching on texture or attribute data, because all ALUs in the SIMD group follow the same code path through the shader, so it is a "real" branch rather than an execution mask. I am also not sure how much shader processors suffer from branch bubbles in their pipelines, but the pipeline is bound to be shallower than in general-purpose CPUs (especially given the much lower clock speeds they usually run at).
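The difference between a "real" branch and an execution mask can be illustrated with a toy cost model. This is a deliberately simplified sketch, not a description of any particular GPU. The group width of 32, the function names, and the idea of counting instruction slots are all assumptions made for illustration, and the model ignores the cost of the branch instruction itself.

```cpp
#include <array>
#include <cstddef>

// Assumed width of a lockstep SIMD group; real hardware varies.
constexpr std::size_t kLanes = 32;

// Builds a per-lane condition where every lane holds the same value, which is
// what a condition derived only from a uniform looks like.
std::array<bool, kLanes> uniformCondition(bool value) {
    std::array<bool, kLanes> cond{};
    cond.fill(value);
    return cond;
}

// Toy model: number of instruction slots the whole group spends on
//   if (cond) { thenCost instructions } else { elseCost instructions }
// A side of the branch is executed (with inactive lanes masked off) only if
// at least one lane needs it.
int branchCost(const std::array<bool, kLanes>& cond, int thenCost, int elseCost) {
    bool anyTrue = false;
    bool anyFalse = false;
    for (bool c : cond) {
        if (c) {
            anyTrue = true;
        } else {
            anyFalse = true;
        }
    }
    int cost = 0;
    if (anyTrue) {
        cost += thenCost;   // some lane takes the "then" side
    }
    if (anyFalse) {
        cost += elseCost;   // some lane takes the "else" side
    }
    return cost;
}
```

In this model a uniform condition always pays for exactly one side of the branch, while a per-fragment condition that differs in even a single lane pays for both.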

I wish I could be more helpful, and I would appreciate it if someone could give a more authoritative answer. I, for one, would not worry much about branches on uniforms, but as always, if you have the opportunity, profile your shaders and see whether it makes a noticeable difference.

