XNA Creators Club Online
Page 1 of 1 (23 items)
Sort Posts: Previous Next

Array access speed

Last post 1/26/2008 8:50 PM by DivideByZero. 22 replies.
  • 1/25/2008 8:52 PM

    Array access speed

    I'm having a bit of a nightmare, I have a 600x600 array for AI purposes and all I'm doing is reading and writing 50 32x32 patches to it in a simple loop. The problem is it's slowing my application to a crawl and I can't figure out why, even clearing the arrays causes the frame rate to become extremely unstable, even if the array is half that size. Here are the two lines in question...

    private static uint[,] aRotation;

    Array.Clear(aRotation, 0, nGridWidth * nGridHeight);


    I'm really lost here.. Little help?
  • 1/25/2008 9:00 PM In reply to

    Re: Array access speed

    What you should first do is use a tool like NProf to figure out where your CPU time is being spent for sure, rather than guessing and trying things. Check out this site: http://swampthingtom.blogspot.com/2007/04/software-efficiency-and-optimization.html. Once you know where you are spending time (and how much) you can start to figure out solutions to it.

    Also keep in mind the size of those arrays. 300x300 means you have 90,000 numbers to reset. And a 600x600 is 360,000. So it's definitely going to take a lot of time to work with if you are editing all the entries in the array.
  • 1/25/2008 10:19 PM In reply to

    Re: Array access speed

    Thanks, that helped me clear up some weired issues. All in all the loop is accessing about 500x500 elements, which is really hardly any at all, I've written a software rasterizer in Flash which does about the same speed...

    Firstly I had a couple of casts in the loop, 4 more of them brought me down to 1fps, so I optimized them out. then there were 8 if statements, removing them tripled the speed, I expect I can do some tricky maths to compensate for those.

    This brought the execution time way down to 26.38, better but still a very heavy function. All that was left were the array writes, comment those out and we're down to 6.71. So my guess would be I should change to using linear arrays instead of 2D, I don't have time to do that right now though and we're back to more stable framerates. Cheers.

  • 1/26/2008 6:28 PM In reply to

    Re: Array access speed

    Hazy:
    So my guess would be I should change to using linear arrays instead of 2D.

    I strongly disadvise that.  Switching from a 2D array to a 1D array only brings the hassle of manually converting 2D indicies to a 1D index.

    Example:

    int x = ..., y = ...;
    int i = x + y * width;
    If the array is logically 2D, implement it as 2D.
  • 1/26/2008 7:08 PM In reply to

    Re: Array access speed

    Actually you should not use multidimensional arrays because they do not perform well in .Net. Regular arrays have IL instructions (such as newarr, ldelem, ldelema, ldlen, and stelem) whereas multidimensional require method calls. Not only that, I don't think the JIT compiler optimizes the range check for multidimensional arrays.

    That math would be far faster than method calls and range checking.

  • 1/26/2008 8:25 PM In reply to

    Re: Array access speed

    Agreed, use a single dimension array whenever performance is critical.

    X + Y * Width is not very difficult.  I recommend just creating a wrapper that will give you a 2D array element accessor if you want convenience, but then you can grab the underlying array for quick operations across the entire dataset.
  • 1/26/2008 8:27 PM In reply to

    Re: Array access speed

    DivideByZero:

    Actually you should not use multidimensional arrays because they do not perform well in .Net. Regular arrays have IL instructions (such as newarr, ldelem, ldelema, ldlen, and stelem) whereas multidimensional require method calls. Not only that, I don't think the JIT compiler optimizes the range check for multidimensional arrays.

    That math would be far faster than method calls and range checking.



    Are you talking about "jagged" arrays (arrays of arrays)? Or true multi-dimentional arrays?

    I'd be mildly surprised if what you said is true for the latter as the internal representation is linear and it should be trivial for the compiler to reduce everything to linear operations. Also, this probably depends on the platform.
    Brandon Bloom
    Software Design Engineer
    XNA Platform and Tools
  • 1/26/2008 8:50 PM In reply to

    Re: Array access speed

    I'm not talking about jagged arrays (which actually in .Net are faster), I'm talking about multidimensional arrays.

    Take a look at the IL that is generated. I know, it is surprising but I'm not making it up. This isn't even counting what the JIT will do or how it is handled on the compact framework.

    new byte[500,500]

    ldc.i4 0x1f4
    ldc.i4 0x1f4
    newobj instance void uint8[0...,0...]::.ctor(int32, int32)

    new byte[250000]

    ldc.i4 0x3d090
    newarr uint8

    mdarray.GetLength(0)

    ldc.i4.0 
    callvirt instance int32 [mscorlib]System.Array::GetLength(int32)

    szarray.Length

    ldlen

    mdarray[x,y] = 8;

    ldloc.1 
    ldloc.2 
    ldloc.3 
    ldc.i4.8 
    call instance void uint8[0...,0...]::Set(int32, int32, uint8)

    int i = x + (y*500);
    szarray[i] = 8;

    ldloc.s x
    ldloc.s y
    ldc.i4 0x1f4
    mul
    add
    stloc.s V_8
    ldloc.s szarray
    ldloc.s V_8
    ldc.i4.8
    stelem.i1

  • 1/26/2008 8:53 PM In reply to

    Re: Array access speed

    On page 206 of Accelerated C# 2005 is say pretty much what DivideByZero said and gives and example, I like that book a lot.
  • 1/26/2008 8:57 PM In reply to

    Re: Array access speed

    I am talking about multidimensional arrays not jagged. What the JIT compiler does aside, just look at the IL that is generated:

    byte[,] construction:

    newobj instance void uint8[0...,0...]::.ctor(int32, int32)

    byte[] construction:

    newarr uint8

    byte[,] setting:

    call instance void uint8[0...,0...]::Set(int32, int32, uint8)

    byte[] setting:

    stelem.i1

  • 1/26/2008 9:50 PM In reply to

    Re: Array access speed

    I don't think the method call vs. IL instruction really means much, at least on the current desktop CLR.  I looked at the generated machine code, and both resolve to a call instruction to some internal routine.  The internal routines for multi-dimensional array creation do probably involve some extra logic, though.  Writing/reading to the arrays is also done directly on the memory location, so no method call overhead there.

    For reads/writes, the JIT compiler seems to output 1 range check per dimension (as observed with cordbg with JIT optimizations enabled and c# compiler optimizations during compilation).  So, you're making the runtime code do twice the number of range checks for every array access when you use a 2D array vs a 1D array.  This is most likely where you're losing your time.

    Interestingly, these bounds checks remain when doing this within an unsafe code block.  I thought the runtime was supposed to bypass these checks when in unsafe code.



    Microsoft DirectX/XNA MVP
  • 1/26/2008 10:29 PM In reply to

    Re: Array access speed

    I read on an MSDN Magazine article (Feb 2002) that md arrays aren't as performant and also in an article by Charles Petzold in Beautiful Code which both mentioned the method call vs instruction as being an issue. Granted, the msdn article is old.

    Range checks are optimized away from the indexer if the JIT compiler can determine you have done the range check in the loop without having to use unsafe code.

    for (int i=0; i<array.Length; i++) array[i] = 0; //because the range check is in the loop the JIT removes it on the index

    For unsafe code you have to pin the array to a pointer to bypass the range check.

    Using a fixed block you can get a pointer to a managed array. If you do this with a multidimensional array, the pointer works like a linear array.

    static void Main() {
          int[,,] a = new int[2,3,4];
          unsafe {
             fixed (int* p = a) {
                for (int i = 0; i < a.Length; ++i)   // treat as linear
                   p[i] = i;
             }
          }
          for (int i = 0; i < 2; ++i)
             for (int j = 0; j < 3; ++j) {
                for (int k = 0; k < 4; ++k)
                   Console.Write("[{0},{1},{2}] = {3,2} ", i, j, k, a[i,j,k]);
                Console.WriteLine();
             }
       }

    ouputs:

    [0,0,0] =  0 [0,0,1] =  1 [0,0,2] =  2 [0,0,3] =  3
    [0,1,0] =  4 [0,1,1] =  5 [0,1,2] =  6 [0,1,3] =  7
    [0,2,0] =  8 [0,2,1] =  9 [0,2,2] = 10 [0,2,3] = 11
    [1,0,0] = 12 [1,0,1] = 13 [1,0,2] = 14 [1,0,3] = 15
    [1,1,0] = 16 [1,1,1] = 17 [1,1,2] = 18 [1,1,3] = 19
    [1,2,0] = 20 [1,2,1] = 21 [1,2,2] = 22 [1,2,3] = 23
  • 1/26/2008 11:12 PM In reply to

    Re: Array access speed

    Responding to your original post, Hazy; is your AI algorithm something that can be reduced to a GP-GPU type algorithm? I can't read into what you're storing in that 600x600 array, but if you're just looking to process the data as fast as possible I'd bet there is something you can do on the graphics card...

  • 1/27/2008 9:19 PM In reply to

    Re: Array access speed

    You guys just dumped object oriented programming for little inline optimizations.  I stand firmly by my initial statement: if the array is logically 2D, implement it as 2D.
  • 1/27/2008 9:51 PM In reply to

    Re: Array access speed

    How does this go against OO programming?
    Microsoft DirectX/XNA MVP
  • 1/28/2008 2:53 AM In reply to

    Re: Array access speed

    Arrays aren't particularly "object oriented" in the first place.  People generally only need to use them when they desire raw speed, otherwise most people use a List<>, which isn't too much slower, but sometimes every bit counts.

    Here's the results of a quick test of various relative array access speeds:

    Test1a    5.390625      //   for (int i = 0; i < one.Length; i++) one[i] = 0;
    Test1b 10.21875 // for (int y = 0; y < height; y++) for (int x = 0; x < width; x++) one[x + y * height] = 0;
    Test2 23.265625 // for (int y = 0; y < height; y++) for (int x = 0; x < width; x++) two[x,y] = 0;

    So accessing a single dimensional array using two dimensional components is twice as slow as linear access, and accessing a two dimensional array is twice as slow as if you declared the array as single dimensional.

    In most real usage scenarios, the access time will be negligible compared to the operations you may perform on each element, but its worth knowing just the same.

    I don't bother ever allocating two dimensional arrays, it's really not that hard to access a single dimensional array as if it was two dimensional AND you have the added benefit of being able to treat it as a single dimensional array which is pretty handy.

    There's a reason the XNA Texture2D.GetData<> commands use a single dimensional array.

    As I said above though, just wrap your 2d arrays in an object with like so, and then you have convenience, AND direct access to a single dimensional array when desired:

    public class Map<T>
    {
        public T[] Array { get; }
        public int Width { get; }
        public int Height { get; }
        public T this[int x, int y]
        {
           get { return array[x + y * height]; }
           set { array[x + y * height]; }
        }
    }
  • 1/28/2008 8:26 AM In reply to

    Re: Array access speed

    It'd be interesting to know what it is you're doing exactly, maybe there are higher level ways to optimize the algorithm.

    Maybe it's possible to spread the AI work over multiple gameticks for instance?

  • 1/28/2008 9:03 AM In reply to

    Re: Array access speed

    Even doing this

    long j = 0;

    for (long i = 0; i < 360000; i++)

    {

    j = j + i / 10000;

    }

     

    on my computer takes 12 milliseconds, so maybe you're expecting too much.

  • 1/28/2008 9:48 AM In reply to

    Re: Array access speed

    John Wells:

    Responding to your original post, Hazy; is your AI algorithm something that can be reduced to a GP-GPU type algorithm? I can't read into what you're storing in that 600x600 array, but if you're just looking to process the data as fast as possible I'd bet there is something you can do on the graphics card...



    That would have been nice, but I would then have to either get the data back to the CPU, which famously causes the pipeline to hickup... Or find some very clever way of keeping the data there, as I had only a very short timespan for the while project I didn't investigate that. I had a similar problem with doing fluid simulation on the GPU recently.

    Harald Maassen:

    It'd be interesting to know what it is you're doing exactly, maybe there are higher level ways to optimize the algorithm.

    Maybe it's possible to spread the AI work over multiple gameticks for instance?



    I'm drawing influence fields for flocking AI, though it would have been better to just have done the O(n*log(n)) comparison because those arrays really did hurt. I never did get time to comvert them to linear arrays, I don't know much about compilers but I find it a little strange .NET does not inline that for you, there must be a good reason though.

    What I did instead was to spread the load across my two cores and do the AI the same time as the draw loop, it worked out well I think. I was doing this for the DBP warm up competition becuase my team desperately need to get to GDC '08 to find a producer and can't afford to pay our own way. Here's what I came up with, all was done in a little over 48 hours (without sleep) except for an MD2 loader class I used from Ziggyware I believe, althout I did re-write most of it. We're hoping Microsoft will want to showcase what can be done in such a short time with XNA.

    For some reason I spent 6 hours writing a fur shader for the grass, which isn't fun when you're hallucinating I tell ya...The idea is to be able to control a large battle by only manually influencing a small number of troups...

    (game video)
    http://www.youtube.com/watch?v=Y6yLHDzyVgQ
  • 1/28/2008 9:56 AM In reply to

    Re: Array access speed

    Wow! That's really cool! So using multiple cores worked out for you? Any tips to share? Did you have to lock your data or did you just assign one core to process half of the data?

    I'm playing with some multi-core programming and a starvation algorithm atm.

    Thanks!

  • 1/28/2008 11:47 AM In reply to

    Re: Array access speed

    While I don't know the details of the flocking algorithm I'm guessing spreading the load over multiple gameticks is possible. If it's possible to use the same influence array for, say, N ticks, you can use that time to calculate a new one. Just do the math for 50/N objects per tick. The data won't be 100% correct because during those ticks the objects will be moving around, but there's bound to be a buffer where you won't notice any strange behaviour.

  • 1/28/2008 12:08 PM In reply to

    Re: Array access speed

    John Wells:

    Wow! That's really cool! So using multiple cores worked out for you? Any tips to share? Did you have to lock your data or did you just assign one core to process half of the data?

    I'm playing with some multi-core programming and a starvation algorithm atm.

    Thanks!



    It's nothing that interesting, all I did was start a thread at the beginning of the draw loop for the AI call and join it at the end. The only thing is I had to look out for was enteties beaing created or deleted and leaving null values or modified collections.

    Harald Maassen:

    While I don't know the details of the flocking algorithm I'm guessing spreading the load over multiple gameticks is possible. If it's possible to use the same influence array for, say, N ticks, you can use that time to calculate a new one. Just do the math for 50/N objects per tick. The data won't be 100% correct because during those ticks the objects will be moving around, but there's bound to be a buffer where you won't notice any strange behaviour.



    Yes, that's enterily feasable, there's already an asyncronicity because of the threading and certianly nobody is going to notice, I think it helps to have some fuzzy things like that happen with AI. Also you could have some frame based coherancy where the last frame is used to smooth over any sigificant differences like c = a + (b * dt).
  • 2/2/2008 11:44 AM In reply to

    Re: Array access speed

    Swordshock:

    Hazy:
    So my guess would be I should change to using linear arrays instead of 2D.

    I strongly disadvise that.  Switching from a 2D array to a 1D array only brings the hassle of manually converting 2D indicies to a 1D index.

    Example:

    int x = ..., y = ...;
    int i = x + y * width;
    If the array is logically 2D, implement it as 2D.


    As mentioned before arrays with any number of dimentions are the same, the problem with the speed appears to be in the interpreter checking that each dimention is in range and being slow to do so. Us more interested in speed are only concerend if we've spilled over the end.

    The multiply can also be eliminated with a bit shift and use a ^2 width array, or like similar to the following for a 640 with to give more options for array sizes...
    i =  x + (y << 8) + (y << 7);

Page 1 of 1 (23 items) Previous Next