Performance hit from blending large quad
<p>I have a game which runs pretty well (55-60 FPS) on a retina display. I want to add a fullscreen overlay that blends with the existing scene. However, even when using a small texture, the performance hit is huge. Is there an optimization I can perform to make this usable?</p>
<p>If I use an 80x120 texture (the texture is rendered on the fly, which is why it's not square), I get 25-30 FPS. If I make the texture smaller, performance increases, but the quality is not acceptable. In general, though, the quality of the overlay is not very important (it's just lighting).</p>
<p>Renderer utilization is at 99%.</p>
<p>Even if I use a square texture from a file (.png), performance is bad.</p>
<p>This is how I create the texture:</p>
<pre><code>[EAGLContext setCurrentContext:context];

// Create default framebuffer object.
glGenFramebuffers(1, &amp;lightFramebuffer);
glBindFramebuffer(GL_FRAMEBUFFER, lightFramebuffer);

// Create color render buffer and allocate backing store.
glGenRenderbuffers(1, &amp;lightRenderbuffer);
glBindRenderbuffer(GL_RENDERBUFFER, lightRenderbuffer);
glRenderbufferStorage(GL_RENDERBUFFER, GL_RGBA8_OES, LIGHT_WIDTH, LIGHT_HEIGHT);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER, lightRenderbuffer);

glGenTextures(1, &amp;lightImage);
glBindTexture(GL_TEXTURE_2D, lightImage);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, LIGHT_WIDTH, LIGHT_HEIGHT, 0, GL_RGBA, GL_UNSIGNED_BYTE, NULL);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, lightImage, 0);
</code></pre>
<p>And here is the rendering:</p>
<pre><code>/* Draw scene... */

glBlendFunc(GL_ONE, GL_ONE);

// Switch to offscreen texture buffer
glBindFramebuffer(GL_FRAMEBUFFER, lightFramebuffer);
glBindRenderbuffer(GL_RENDERBUFFER, lightRenderbuffer);
glViewport(0, 0, LIGHT_WIDTH, LIGHT_HEIGHT);
glClearColor(ambientLight, ambientLight, ambientLight, ambientLight);
glClear(GL_COLOR_BUFFER_BIT);

/* Draw lights to texture... */

// Switch back to main frame buffer
glBindFramebuffer(GL_FRAMEBUFFER, defaultFramebuffer);
glBindRenderbuffer(GL_RENDERBUFFER, colorRenderbuffer);
glViewport(0, 0, framebufferWidth, framebufferHeight);
glBlendFunc(GL_DST_COLOR, GL_ZERO);
glBindTexture(GL_TEXTURE_2D, glview.lightImage);

/* Set up drawing... */

glDrawElements(GL_TRIANGLE_FAN, 4, GL_UNSIGNED_SHORT, 0);
</code></pre>
<p>Here are some benchmarks I took when trying to narrow down the problem. 'No blend' means I call glDisable(GL_BLEND) before I draw the quad. 'No buffer switching' means I don't switch back and forth from the offscreen buffer before drawing.</p>
<pre><code>(Tests using a static 256x256 .png)
No blend, No buffer switching: 52FPS
Yes blend, No buffer switching: 29FPS //disabled the glClear, which would artificially speed up the rendering
No blend, Yes buffer switching: 29FPS
Yes blend, Yes buffer switching: 27FPS
Yes buffer switching, No drawing: 46FPS
</code></pre>
<p>Any help is appreciated. Thanks!</p>
<p><strong>UPDATE</strong></p>
<p>Instead of blending the whole lightmap afterward, I ended up writing a shader to do the work on the fly. Each fragment samples and blends from the lightmap (kind of like multitexturing). At first, the performance gain was minimal, but once I used a lowp sampler2D for the light map, I got around 45 FPS.</p>
<p>Here's the fragment shader:</p>
<pre><code>lowp vec4 texColor = texture2D(tex, texCoordsVarying);
lowp vec4 lightColor = texture2D(lightMap, worldPosVarying);
lightColor.rgb *= lightColor.a;
lightColor.a = 1.0;
gl_FragColor = texColor * color * lightColor;
</code></pre>
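<p>To make the arithmetic in the update concrete, here is a minimal CPU-side sketch in C of what the fragment shader computes per pixel, next to the multiplicative blend that glBlendFunc(GL_DST_COLOR, GL_ZERO) applies. The names Color, mul, shade and blend_dst_color_zero are hypothetical helpers introduced only for illustration; they are not part of OpenGL ES or of the game's code. The sketch assumes the light color passed to the blend has already been scaled by its alpha, as the shader does.</p>

```c
#include <stdio.h>

/* Hypothetical RGBA color type, components in [0, 1]. */
typedef struct { float r, g, b, a; } Color;

/* Component-wise product, the operation GLSL performs for vec4 * vec4. */
static Color mul(Color x, Color y) {
    Color out = { x.r * y.r, x.g * y.g, x.b * y.b, x.a * y.a };
    return out;
}

/* Mirrors the fragment shader from the update:
   scale the light's rgb by its alpha, force alpha to 1,
   then multiply texture color, vertex color and light. */
static Color shade(Color texColor, Color color, Color lightColor) {
    lightColor.r *= lightColor.a;
    lightColor.g *= lightColor.a;
    lightColor.b *= lightColor.a;
    lightColor.a = 1.0f;
    return mul(mul(texColor, color), lightColor);
}

/* glBlendFunc(GL_DST_COLOR, GL_ZERO):
   result = src * dst + dst * 0, i.e. a plain multiply of the
   incoming fragment (src) with what is already in the framebuffer (dst). */
static Color blend_dst_color_zero(Color src, Color dst) {
    return mul(src, dst);
}

/* Small helper to print a color for inspection. */
static void print_color(const char *label, Color c) {
    printf("%s: %.4f %.4f %.4f %.4f\n", label, c.r, c.g, c.b, c.a);
}
```

<p>Under that assumption, calling shade(tex, color, light) gives the same color as first writing tex * color to the framebuffer and then blending the alpha-scaled light over it with blend_dst_color_zero. The difference is only in where the multiply happens: the shader version does it while the scene fragment is being drawn, so the framebuffer is not read back and rewritten by a second fullscreen pass.</p>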
 
