XNA Creators Club Online

Performance analysis through StopWatch

Last post 6/12/2009 2:23 AM by Shivoa. 7 replies.
  • 6/11/2009 7:21 AM

    Performance analysis through StopWatch

    Hi,

    I'm having a bit of a problem working out where I'm losing performance in a test app on the Xbox 360.  My testing revolves around doing some Gaussian blurs on ResolveTexture2Ds, with quite a few calls to ResolveBackBuffer to grab the images back to texture, and seeing how far I can push this before hitting the limits of the 360.

    At the moment I'm logging the time taken for the Draw and Update functions by using a Stopwatch (from System.Diagnostics) to see how long the code takes but am running into problems.  My output is coming from
    stopwatch.Elapsed.TotalMilliseconds.ToString() 
    but when I push up the number of blurs I'm doing each Draw, I quickly hit a drop from 60 to 30fps, and my stopwatch values don't seem to show why.  When I drop to 30fps I get a double return from my Update counter (I understand why: the 360 is catching up to the 60Hz Update schedule, so it has to run Update twice), and both values are under 0.015ms (I'm doing very little there).  The Draw stopwatch, however, is only returning 1.8ms, which doesn't seem to fit with a drop to 30fps caused by a long draw time.

    The on-screen output is as I expect, and from a remote Perf mon session I find that the GC is running about once every 10-15 secs and recovering about a meg of assorted objects.  Looking through the heap viewer, nothing much seems to be jumping out at me (then again I'm still quite new to all this), and if it was a GC issue I'd expect a framerate drop coinciding with each collection rather than this constant drop to half-framerate.

    Could any of you experienced chaps give me some suggestions as to what might be going wrong, or what tools I should use to find the issue?  From what I've read the stopwatch should be giving correct values, but that doesn't fit with the output I'm getting.
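
    The jump from 60 straight to 30fps, rather than to something like 50, is what you would expect whenever presentation waits for the vertical retrace: a frame that misses one refresh is held until the next.  A minimal sketch of that arithmetic, assuming a 60Hz display with vsync enabled; `VsyncBudget` and its methods are hypothetical names for illustration, not part of XNA:

```csharp
using System;

public static class VsyncBudget
{
    // Milliseconds between refreshes at the given display rate.
    public static double RefreshIntervalMs(double refreshHz)
    {
        return 1000.0 / refreshHz;
    }

    // Whole refresh intervals a frame occupies when it must wait for vsync.
    public static int IntervalsUsed(double frameMs, double refreshHz)
    {
        double intervals = Math.Ceiling(frameMs / RefreshIntervalMs(refreshHz));
        return Math.Max(1, (int)intervals);
    }

    // Frame rate actually seen on screen for a given total frame cost.
    public static double DisplayedFps(double frameMs, double refreshHz)
    {
        return refreshHz / IntervalsUsed(frameMs, refreshHz);
    }
}
```

    Under these assumptions a frame costing 1.8ms in total would still show at 60fps, while `VsyncBudget.DisplayedFps(20.0, 60.0)` gives 30.  So a steady 30fps suggests something in the frame is taking more than about 16.7ms, even though the timed Draw body is not.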
  • 6/11/2009 9:06 AM In reply to

    Re: Performance analysis through StopWatch

    I believe timing the draw function will only give you the amount of time the CPU spends there shuffling things in & out of the GPU & setting states & such, it won't tell you the actual time the GPU had to spend drawing.  A screen capture from PIX for Windows will show you how many nanoseconds (I believe that is its unit of measure) each of the various things done by the GPU takes, as well as giving an estimate of the total time required to draw the scene.

    best,
    Byron
    ..shaders make you feel... powerful, or very very stupid.
    http://drjbn.spaces.live.com/
  • 6/11/2009 6:27 PM In reply to

    Re: Performance analysis through StopWatch

    Byron Nelson:
    I believe timing the draw function will only give you the amount of time the CPU spends there shuffling things in & out of the GPU & setting states & such, it won't tell you the actual time the GPU had to spend drawing. 


    Exactly. You're almost certainly GPU bound with this type of workload, so timing your CPU draw code isn't going to tell you much.
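
    One thing a Stopwatch can still show is the interval between the start of one Draw call and the start of the next, which includes any time the CPU spends waiting on the GPU or on vsync.  A minimal sketch, assuming `Tick()` is called once per frame at the same point (for example the first line of Draw); `FrameIntervalTimer` is a hypothetical helper, not an XNA type, and it only confirms that the frame is long, not where the GPU time goes:

```csharp
using System.Diagnostics;

public sealed class FrameIntervalTimer
{
    private readonly Stopwatch watch = new Stopwatch();
    private double lastTickMs;
    private bool hasTicked;

    // Time between the two most recent Tick() calls, in milliseconds.
    public double LastIntervalMs { get; private set; }

    // Longest interval seen so far, in milliseconds.
    public double WorstIntervalMs { get; private set; }

    public FrameIntervalTimer()
    {
        watch.Start();
    }

    // Call once per frame, always at the same place in the frame.
    public void Tick()
    {
        double nowMs = watch.Elapsed.TotalMilliseconds;
        if (hasTicked)
        {
            LastIntervalMs = nowMs - lastTickMs;
            if (LastIntervalMs > WorstIntervalMs)
            {
                WorstIntervalMs = LastIntervalMs;
            }
        }
        lastTickMs = nowMs;
        hasTicked = true;
    }
}
```

    If the body of Draw measures around 2ms but `LastIntervalMs` sits near 33ms, the missing time is being spent outside the CPU code you timed, which is consistent with being GPU bound.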

    These articles might help to understand how the CPU and GPU work in parallel, and they have some tips on how you can figure out where the GPU is spending its time.
    XNA Framework Developer - blog - homepage
  • 6/11/2009 6:46 PM In reply to

    Re: Performance analysis through StopWatch

    Thanks, I was worried the answer might be something like that.  As I understand it I can only use PIX to analyse the 360 if I have access to the XDK, so as an indie I'll end up getting the results for whatever PC I run my project on.  I'm not sure the ATi 2400 or nVidia 8800GT are close enough to the 360 GPU and memory architecture to provide useful results (other than finding bad code that takes far too long to execute).
  • 6/11/2009 7:52 PM In reply to

    Re: Performance analysis through StopWatch

    Shivoa:
    Thanks, I was worried the answer might be something like that.  As I understand it I can only use PIX to analyse the 360 if I have access to the XDK, so as an indie I'll end up getting the results for whatever PC I run my project on.  I'm not sure the ATi 2400 or nVidia 8800GT are close enough to the 360 GPU and memory architecture to provide useful results (other than finding bad code that takes far too long to execute).


    Well, if we're only talking about doing Gaussian blurs here, then you're probably going to be primarily limited by texturing.  This means you'd want something that comes close to the 360's GPU in terms of bandwidth, texture units, and texture cache.  Unfortunately the ATI 2400 is well below the 360 GPU in terms of those specs, and the 8800GT exceeds those specs by quite a bit.
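
    To get a feel for why texturing dominates, it helps to count texture fetches.  A rough sketch, assuming a 1280x720 target and a hypothetical 15-tap kernel per pass (the thread does not say how many taps the shader uses), and ignoring tricks such as using bilinear filtering to cover two taps with one fetch; `BlurFetchCount` is an illustrative name:

```csharp
public static class BlurFetchCount
{
    // Fetches for one 1D pass: one fetch per tap per output pixel.
    public static long PerPass(int width, int height, int taps)
    {
        return (long)width * height * taps;
    }

    // A separable blur is an X pass followed by a Y pass.
    public static long PerSeparableBlur(int width, int height, int taps)
    {
        return 2 * PerPass(width, height, taps);
    }

    // Fetches per second when the blur runs every frame.
    public static long PerSecond(int width, int height, int taps,
                                 int blursPerFrame, int framesPerSecond)
    {
        return PerSeparableBlur(width, height, taps) * blursPerFrame * framesPerSecond;
    }
}
```

    With those assumed numbers a single blur is 27,648,000 fetches, and one blur per frame at 60fps is about 1.66 billion fetches per second, which is why bandwidth, texture units and texture cache are the specs that matter here.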

    Also be aware that using a ResolveTexture2D will slow you down on the PC, since it will cause data to be copied from the backbuffer to another surface.  If you're rendering data that you want used as a texture, just render it to a RenderTarget2D.
    Matt Pettineo | DirectX/XNA MVP


    Ride into The Danger Zone | PIX With XNA Tutorial
  • 6/11/2009 8:11 PM In reply to

    Re: Performance analysis through StopWatch

    Is the performance difference a PC only thing?  What little I've read on the subject doesn't really give a clear advantage to either ResolveTexture2D/ResolveBackBuffer or RenderTarget2D/GetTexture.

    I've been using ResolveTexture2D because that means I'm using the back buffer as a temp surface to render to, rather than creating a new persistent RenderTarget2D in the two-step blur operation.  I.e. I can blur along x to the back buffer and resolve to the original ResolveTexture2D of the scene, and then blur that along y, again to the back buffer, and resolve.  The temp target I render to is the back buffer, which I have to have anyway and which will be cleared when I later render the complete scene.  If I was to use RenderTarget2Ds I would have to write the blur along x to a new RenderTarget2D and then flip back to the original for the y, thus creating wasted memory.
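
    The two-step scheme described here (blur along x into a scratch surface, then along y back into the original) can be sketched on the CPU with plain arrays.  This is only an illustration of the ping-pong, not the actual shader: `SeparableBlur`, the clamp addressing and the kernel parameters are all assumptions, and on the GPU `temp` would be the back buffer or a second RenderTarget2D.

```csharp
using System;

public static class SeparableBlur
{
    // Normalised 1D Gaussian weights for taps at -radius..+radius.
    public static float[] Kernel(int radius, float sigma)
    {
        float[] weights = new float[2 * radius + 1];
        float sum = 0f;
        for (int i = -radius; i <= radius; i++)
        {
            weights[i + radius] = (float)Math.Exp(-(i * i) / (2.0 * sigma * sigma));
            sum += weights[i + radius];
        }
        for (int i = 0; i < weights.Length; i++)
        {
            weights[i] /= sum;
        }
        return weights;
    }

    // One 1D pass reading src and writing dst.
    // (dx, dy) is (1, 0) for the X pass and (0, 1) for the Y pass.
    public static void Pass(float[] src, float[] dst, int width, int height,
                            float[] kernel, int dx, int dy)
    {
        int radius = kernel.Length / 2;
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                float acc = 0f;
                for (int k = -radius; k <= radius; k++)
                {
                    // Clamp sample coordinates to the image edges.
                    int sx = Math.Min(width - 1, Math.Max(0, x + k * dx));
                    int sy = Math.Min(height - 1, Math.Max(0, y + k * dy));
                    acc += kernel[k + radius] * src[sy * width + sx];
                }
                dst[y * width + x] = acc;
            }
        }
    }

    // X pass into the scratch buffer, then Y pass back into the scene buffer.
    public static void Blur(float[] scene, float[] temp, int width, int height, float[] kernel)
    {
        Pass(scene, temp, width, height, kernel, 1, 0);
        Pass(temp, scene, width, height, kernel, 0, 1);
    }
}
```

    Whichever surface plays the role of `temp`, some scratch surface is needed for the intermediate result; the question in the thread is only whether the back buffer can stand in for it.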

    Hope that makes sense.  If I'm doing it the wrong way and additional RenderTarget2Ds are the right way, then please correct me.
  • 6/11/2009 9:09 PM In reply to

    Re: Performance analysis through StopWatch

    Yeah, it's only the PC.  On the 360 anything you render into eDRAM has to be resolved out into main memory before you can use it as a texture, so it doesn't matter whether you use a ResolveTexture2D or a RenderTarget2D (since either one will result in eDRAM being copied out).  However, on the PC the backbuffer is a separate surface in memory that you can't use as a texture, so when you call ResolveBackBuffer, data from that surface has to be copied to another surface that's part of a texture.  So if you use a RenderTarget2D, you avoid the copy.  But if in your case you're really worried about saving that bit of memory, then you can stick with your approach.
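
    For a sense of scale of "that bit of memory", here is a small sketch.  It assumes a 1280x720 target at 4 bytes per pixel (as with SurfaceFormat.Color) and ignores any alignment or padding the hardware may add; `SurfaceMemory` is a hypothetical helper name:

```csharp
public static class SurfaceMemory
{
    // Raw size of a surface, ignoring alignment, padding and mip levels.
    public static long Bytes(int width, int height, int bytesPerPixel)
    {
        return (long)width * height * bytesPerPixel;
    }

    // Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes).
    public static double Mebibytes(long bytes)
    {
        return bytes / (1024.0 * 1024.0);
    }
}
```

    Under those assumptions one extra render target costs 3,686,400 bytes, or roughly 3.5MB.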
    Matt Pettineo | DirectX/XNA MVP


    Ride into The Danger Zone | PIX With XNA Tutorial
  • 6/12/2009 2:23 AM In reply to

    Re: Performance analysis through StopWatch

    Strange, I'm not seeing that reflected in the performance on my dev box (E7200 C2D, 2GB RAM, ATi 2400XT 256MB) with PIX.

    PIX is giving between 9-12ms for each render of a 1280x720 texture through the blur (so around 20ms for x and y blurs combined), 1.5-2.5ms for full-screen draws without the Gaussian pixel shader involved (whether from Texture2D, Resolved or RenderTargets), 0.1-0.3ms for a DrawString here and there, and about 1ms for clear operations whenever I need to use them.
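
    Those per-pass figures can be turned into a rough count of how many full blurs fit in one frame.  A minimal sketch, assuming each blur is two passes of equal cost and that everything else in the frame is lumped into a single overhead value; `BlurBudget` is a hypothetical name, and the inputs are PC timings, so the result says nothing definite about the 360:

```csharp
using System;

public static class BlurBudget
{
    // Number of complete two-pass blurs that fit in a frame budget
    // after subtracting a fixed overhead for clears, draws and text.
    public static int BlursThatFit(double budgetMs, double overheadMs, double passMs)
    {
        double perBlurMs = 2.0 * passMs;
        double availableMs = budgetMs - overheadMs;
        if (availableMs <= 0.0 || perBlurMs <= 0.0)
        {
            return 0;
        }
        return (int)Math.Floor(availableMs / perBlurMs);
    }
}
```

    With the 9-12ms per pass quoted above, not even one full blur fits in a 60Hz frame (about 16.7ms) on this GPU, and only one fits in a 30Hz frame (about 33.3ms).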

    The only real difference I can see in PIX is that ResolveTexture2D has a double clear (one to purple when it de-allocates in the ResolveBackBuffer, and then one to black when I manually call the clear to get to a clean starting point), which adds a ms here and there.  Of course this is with a low-end GPU, so I'm not sure how applicable this is to getting a performance boost on the 360.