How are GLSL shader programs executed on the graphics hardware pipeline?
    <p>As I toy with OpenGL ES 2.0 and GLSL more and more, I'm questioning exactly how shader programs are executed on the hardware. I understand the concepts behind vertex and fragment shader programs just fine, but how they work on the metal is still very unclear. All too often when reading about GPUs I've come across the term "pipeline", and the claim that a GPU has a given number of pipelines.</p>
    <p>I understand what the pipeline does: it's fed a set of vertices (representing a geometric primitive), executes the vertex shader with the given parameters, and sends the outputs of the vertex shader through some fixed-function hardware that performs operations based on those outputs. The vertex shader also outputs values that are interpolated across the fragments of the primitive and fed into the fragment shader, making it easy to perform a lot of complex rendering using a common algorithm.</p>
    <p>But does this mean that if a GPU has n pipelines, then at any given moment each of the n pipelines can be executing an instance of a shader program for a single geometric primitive?</p>
    <p>I've been reading the OpenGL ES 2.0 Programming Guide (about 60% through it according to my Kindle), but perhaps my still-developing understanding has caused me to miss the answer to this very question.</p>
    <p>One practical reason I ask is to decide what work should or shouldn't be done on the CPU instead of the GPU. For instance, if I'm running a single update-and-render thread, is it smart to do the matrix-vector multiplication on the CPU, where it has to be done sequentially for all objects? Or would it be better to hand it to the GPU, where the shader programs that draw multiple geometric primitives could run concurrently on different pipelines?</p>
    <p>I'm working on optimizing some code to draw many quads on the screen using VBOs instead of a separate draw call for each. But since this is array rendering, I would need to send a matrix to the GPU for every vertex, even though the MVP matrix is the same for all 4 vertices of a quad, which could be a hit on bandwidth. But if the shader programs execute concurrently, as opposed to one after the other on my rendering thread on the CPU, perhaps it's a worthy trade-off. I don't quite have the expertise to say one way or the other.</p>
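    <p>To make the CPU-side option concrete, here is a minimal sketch in C of pre-transforming each quad's four corners by that quad's MVP matrix, so the results could be packed into a single vertex array for one VBO upload. The names <code>mat4_mul_vec4</code> and <code>transform_quads</code> are hypothetical helpers invented for this illustration, and the column-major matrix layout is an assumption (it matches the usual OpenGL convention). It only shows what the sequential CPU work looks like; it says nothing about which approach is faster.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Multiply a column-major 4x4 matrix (assumed OpenGL-style layout) by a vec4. */
static void mat4_mul_vec4(const float m[16], const float v[4], float out[4])
{
    for (int row = 0; row < 4; ++row) {
        out[row] = m[0 * 4 + row] * v[0]
                 + m[1 * 4 + row] * v[1]
                 + m[2 * 4 + row] * v[2]
                 + m[3 * 4 + row] * v[3];
    }
}

/*
 * Hypothetical helper: transform every quad on the CPU, one after another.
 *   mvps    - quad_count matrices, 16 floats each (one MVP per quad)
 *   corners - quad_count * 4 corners, 4 floats each (x, y, z, w)
 *   out     - same layout as corners; receives the transformed positions
 * The out array is what would then be uploaded as a single VBO.
 * out must not alias corners.
 */
static void transform_quads(const float *mvps, const float *corners,
                            float *out, size_t quad_count)
{
    assert(mvps != NULL && corners != NULL && out != NULL);

    for (size_t q = 0; q < quad_count; ++q) {
        const float *mvp = mvps + q * 16;
        for (size_t c = 0; c < 4; ++c) {
            size_t offset = (q * 4 + c) * 4;
            mat4_mul_vec4(mvp, corners + offset, out + offset);
        }
    }
}
```

    <p>With this layout the matrix is stored once per quad on the CPU and never sent per vertex; the cost is that every corner is transformed sequentially on the rendering thread, which is the trade-off I'm asking about.</p>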