  1. PODirectX - GetSurfaceLevel Performance Issue
    <p>I'm implementing deferred shading in a DirectX 9 application. My method of deferred shading requires 3 render targets (color, position, and normal). It is necessary to:</p>
    <ol>
    <li>set the render targets in the device at the beginning of the 'render' function</li>
    <li>draw the data to them in the 'rt pass'</li>
    <li>remove the render targets from the device (so as not to draw over them during subsequent passes)</li>
    <li>set the render targets as textures for subsequent passes so that the effect can read back the data drawn to the RTs in the 'rt pass'</li>
    </ol>
    <p>This method works fine; however, I am experiencing performance issues. I've narrowed them down to two function calls:</p>
<pre><code>IDirect3DTexture9::GetSurfaceLevel()
IDirect3DDevice9::SetRenderTarget()
</code></pre>
    <p>Here is the code that sets a render target:</p>
<pre><code>IDirect3DDevice9 *pd3dDevice = CEffectManager::GetDevice();
IDirect3DTexture9 *pRT = CEffectManager::GetColorRT();
IDirect3DSurface9 *pSrf = NULL;
pRT-&gt;GetSurfaceLevel( 0, &amp;pSrf );
pd3dDevice-&gt;SetRenderTarget( 0, pSrf );
</code></pre>
    <p>PIX reports a very high duration (in cycles) for the call to GetSurfaceLevel(): roughly 0.5 ms per call, computed as Duration / Total Duration * 1 / FrameRate. Because three surfaces are needed, the combined duration is too high! It's more than 4 times that of all the draw calls combined.</p>
    <p>I tried to eliminate the call to GetSurfaceLevel() by storing a pointer to the surface during render target creation. Oddly enough, SetRenderTarget() then took on the same duration, whereas before its duration was negligible. Here is the altered code:</p>
<pre><code>IDirect3DDevice9 *pd3dDevice = CEffectManager::GetDevice();
IDirect3DSurface9 *pSrf = CEffectManager::GetColorSurface();
pd3dDevice-&gt;SetRenderTarget( 0, pSrf );
</code></pre>
    <p>Is there a way around this performance issue? Why does the second method take as long as the first? It seems as though the work inside IDirect3DDevice9::SetRenderTarget() simply takes time. Is there a device state that I can set to help performance?</p>
    <p>Update: I've implemented the following code in order to better test performance:</p>
<pre><code>IDirect3DDevice9 *pd3dDevice = CEffectManager::GetDevice();
IDirect3DTexture9 *pRT = CEffectManager::GetColorRT();
IDirect3DSurface9 *pSRF = NULL;
IDirect3DQuery9 *pEvent = NULL;
LARGE_INTEGER lnStart, lnStop, lnFreq;

// create query
pd3dDevice-&gt;CreateQuery( D3DQUERYTYPE_EVENT, &amp;pEvent );

// insert 'end' marker
pEvent-&gt;Issue( D3DISSUE_END );

// flush command buffer
while( S_FALSE == pEvent-&gt;GetData( NULL, 0, D3DGETDATA_FLUSH ) );

// get start time
QueryPerformanceCounter( &amp;lnStart );

// api call
pRT-&gt;GetSurfaceLevel( 0, &amp;pSRF );

// insert 'end' marker
pEvent-&gt;Issue( D3DISSUE_END );

// flush the command buffer
while( S_FALSE == pEvent-&gt;GetData( NULL, 0, D3DGETDATA_FLUSH ) );

// get stop time
QueryPerformanceCounter( &amp;lnStop );
QueryPerformanceFrequency( &amp;lnFreq );

// elapsed time in seconds
lnStop.QuadPart -= lnStart.QuadPart;
float fElapsedTime = ( float )lnStop.QuadPart / ( float )lnFreq.QuadPart;
</code></pre>
    <p>On average, fElapsedTime measured 10 to 50 microseconds. I performed the same test on IDirect3DDevice9::SetRenderTarget(), and the results averaged 5 to 30 microseconds.</p>
    <p>These numbers are much better than what I got from PIX and suggest that the delay is smaller than I thought. However, the frame rate is still drastically reduced when using deferred shading, and these calls seem to be the most likely source of the lost performance. Did I query the device correctly?</p>
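For reference, the two conversions used in the question (the PIX formula Duration / Total Duration * 1 / FrameRate, and the tick arithmetic around QueryPerformanceCounter) can be sketched in portable C++. This is only a minimal sketch: the function names `callTimeMs` and `ticksToMicroseconds` are hypothetical helpers invented for illustration, not part of PIX or the Direct3D API, and the inputs are assumed to be plain integer cycle or tick counts.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: time spent in one API call, in milliseconds, from
// PIX-style cycle counts. Implements Duration / Total Duration * 1 / FrameRate,
// with the frame time (1 / FrameRate) expressed in milliseconds.
double callTimeMs(std::uint64_t callCycles, std::uint64_t frameCycles,
                  double framesPerSecond) {
    assert(frameCycles > 0 && framesPerSecond > 0.0);
    const double frameTimeMs = 1000.0 / framesPerSecond;
    return static_cast<double>(callCycles) /
           static_cast<double>(frameCycles) * frameTimeMs;
}

// Hypothetical helper: converts a start/stop pair of performance-counter
// ticks into microseconds, given the counter frequency in ticks per second.
double ticksToMicroseconds(std::int64_t startTicks, std::int64_t stopTicks,
                           std::int64_t ticksPerSecond) {
    assert(ticksPerSecond > 0);
    return static_cast<double>(stopTicks - startTicks) * 1.0e6 /
           static_cast<double>(ticksPerSecond);
}
```

With made-up numbers, a call that accounts for 3% of a frame's cycles at 60 frames per second comes out at about 0.5 ms via `callTimeMs(3, 100, 60.0)`, while 100 ticks of a 10 MHz counter is 10 microseconds via `ticksToMicroseconds(0, 100, 10000000)`. Keeping both results in explicit units makes it easier to compare the PIX figure against the timer figure.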