XNA Creators Club Online

GPU to CPU communication in XNA

Last post 6/1/2009 6:57 PM by Shawn Hargreaves. 7 replies.
  • 5/25/2009 11:36 PM

    GPU to CPU communication in XNA

    What would be a good way to accomplish something like this using the pixel shader:

    On a per-frame basis, count up the number of output pixels in a specified rectangle
    on the screen that fall into some range, for example, with the red component between 0
    and 0.5.  Then, report this count back to the CPU once per frame.

    Any comments, ideas etc. would be appreciated...
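
    For reference, the quantity being asked for can be pinned down with a CPU-side sketch. This is plain Python rather than XNA code, and `count_in_range`, the nested-list frame layout, and the rectangle convention are all assumptions made for illustration. It only defines the count that a GPU-based approach would need to reproduce.

```python
def count_in_range(frame, rect, lo=0.0, hi=0.5):
    """Count pixels inside rect whose red component lies in [lo, hi].

    frame: list of rows, each row a list of (r, g, b) tuples in 0..1.
    rect:  (x, y, width, height) in pixels; it is clipped to the frame.
    """
    x, y, w, h = rect
    height = len(frame)
    width = len(frame[0]) if height else 0
    # Clip the rectangle so out-of-range coordinates are ignored.
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, width), min(y + h, height)
    count = 0
    for row in frame[y0:y1]:
        for r, _g, _b in row[x0:x1]:
            if lo <= r <= hi:
                count += 1
    return count


# A 4x3 test frame: red ramps 0.0, 0.25, 0.5, 0.75 across each row.
frame = [[(col / 4.0, 0.0, 0.0) for col in range(4)] for _ in range(3)]
print(count_in_range(frame, (0, 0, 4, 3)))  # prints 9
```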
  • 5/26/2009 4:22 AM In reply to

    Re: GPU to CPU communication in XNA

    The easiest way to read data like that back from the GPU is to draw a quad with states set so that it only renders the pixels you are interested in, and to use an occlusion query object to read back how many pixels were actually drawn.

    Beware, though: this may ruin your performance unless you deal with the asynchronous nature of queries, so that you can cope with the data being 1 or 2 frames late in arriving back on the CPU! It is not possible to read back GPU data to the CPU without causing a stall of at least one frame, if not more. See here.
    XNA Framework Developer - blog - homepage
  • 5/26/2009 5:17 AM In reply to

    Re: GPU to CPU communication in XNA

    Thanks Shawn, your blog has some very helpful info. 

    If the CPU does something else for a while (say, a couple of frames' worth of time) before doing the GetData, would the GetData return immediately (i.e. no stall)?
  • 5/26/2009 6:28 AM In reply to

    Re: GPU to CPU communication in XNA

    Generally, you will only be able to read back an occlusion query result about 1-3 CPU frames later (i.e., after the CPU has processed that many full Draw() loops). This is because when you make a Draw call (of any kind), the request is stored in a queue and the GPU will eventually get round to processing it (to reduce cases of the GPU waiting on the CPU). So there is an expected lag in the system. This is why you cannot make a draw call and then immediately expect to get a value back without waiting a *long* time, even if you process a bunch of data in between. It physically takes the GPU time to catch up.

    Xen: Graphics API for XNA
    www.codeplex.com/xen
  • 5/27/2009 7:15 PM In reply to

    Re: GPU to CPU communication in XNA

    Thanks.

    Finally, is an occlusion query the best way to do this? In other words, are there other reasonable ways to communicate GPU data to the CPU?
  • 5/27/2009 7:51 PM In reply to

    Re: GPU to CPU communication in XNA

    The only ways to read back data are via an occlusion query or GetData on a rendertarget. Which is better depends entirely on what you are doing.

    Occlusion queries have the advantage that you can poll them to see if the data is available yet, so you can detect when to read the value and avoid causing a stall. GetData has no way to query whether the GPU data is ready, so if you call it too soon, the pipeline will stall and there is no way to detect that.
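
    The polling pattern can be sketched with a toy model. This is plain Python, not XNA code: `FakeQuery`, `FakeGpu`, `run`, and the fixed two-frame latency are hypothetical stand-ins for the real command queue. In XNA the corresponding members would be `OcclusionQuery.IsComplete` and `OcclusionQuery.PixelCount`. The sketch only illustrates that the CPU checks for completion each frame and reads a result once it is ready, rather than waiting for it.

```python
from collections import deque


class FakeQuery:
    """Hypothetical stand-in for an occlusion query object."""

    def __init__(self, frame_issued, pixel_count):
        self.frame_issued = frame_issued
        self.pixel_count = pixel_count  # value the toy GPU will report
        self.is_complete = False


class FakeGpu:
    """Toy GPU that finishes each query a fixed number of frames late."""

    def __init__(self, latency_frames=2):
        self.latency_frames = latency_frames
        self.in_flight = deque()

    def submit(self, query):
        self.in_flight.append(query)

    def advance_to(self, frame):
        # Complete, in submission order, every query that is old enough.
        while (self.in_flight and
               frame - self.in_flight[0].frame_issued >= self.latency_frames):
            self.in_flight.popleft().is_complete = True


def run(frames, latency_frames=2):
    """Issue one query per frame and collect results without blocking."""
    gpu = FakeGpu(latency_frames)
    pending = deque()
    results = []  # tuples of (frame_read, frame_issued, pixel_count)
    for frame in range(frames):
        gpu.advance_to(frame)
        # Poll the oldest queries first; stop at the first unfinished one.
        while pending and pending[0].is_complete:
            done = pending.popleft()
            results.append((frame, done.frame_issued, done.pixel_count))
        query = FakeQuery(frame, pixel_count=100 + frame)
        gpu.submit(query)
        pending.append(query)
    return results


for frame_read, frame_issued, count in run(5):
    print(f"frame {frame_read}: result from frame {frame_issued} = {count}")
```

    In this model every result is consumed two frames after its query was issued, and no frame ever waits on the GPU. Real latency varies, which is why the check is a poll rather than a fixed delay.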

    Occlusion queries return a single integer, which can be good if you are reading back a summed result computed on the GPU. Rendertargets can return as much or as little data as you like, depending on how big they are. Reading large amounts of rendertarget data can be very slow on many Windows graphics cards, depending on what sort of bus they are connected over and how much effort the driver vendor has put into optimizing readback operations (typically not very much).
    XNA Framework Developer - blog - homepage
  • 6/1/2009 6:28 PM In reply to

    Re: GPU to CPU communication in XNA

    Hmmm, if a poll on the occlusion query says the data is available, wouldn't that also serve as a flag that GetData could be done on the same rendertarget without causing a stall?
  • 6/1/2009 6:57 PM In reply to

    Re: GPU to CPU communication in XNA

    Not necessarily. The GPU doesn't necessarily do things in the same order you issued your drawing calls!

    In practice, most likely, yes.

    But you never know what the driver is going to do. Some drivers do crazy things like reordering your rendertarget operations to make them more efficient for the hardware (check out the presentations from Intel about the Larrabee renderer for some examples) which could make that logic spectacularly invalid.
    XNA Framework Developer - blog - homepage