Compressing byte arays

Last post 03-10-2008, 3:03 AM by Spodi. 6 replies.
Sort Posts: Previous Next
  •  03-05-2008, 10:29 AM

    Compressing byte arays

    Hi
    I am using System.Net for networking. When I send byte arrays over a socket, I would like to compress them to save bandwidth. I found this method that uses the GZipStream from .NET to compress:

            public static byte[] Compress(byte[] buffer)

            {

                MemoryStream ms
    = new MemoryStream();

                GZipStream zip
    = new GZipStream(ms, CompressionMode.Compress, true);

                zip.Write(buffer, 0, buffer.Length);

                zip.Close();

                ms.Position
    = 0;



                MemoryStream outStream
    = new MemoryStream();



               
    byte[] compressed = new byte[ms.Length];

                ms.Read(compressed, 0, compressed.Length);



               
    byte[] gzBuffer = new byte[compressed.Length + 4];

                Buffer.BlockCopy(compressed, 0, gzBuffer, 4, compressed.Length);

                Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gzBuffer, 0, 4);

               
    return gzBuffer;

            }


    However, this method usually increases the size of my byte arrays, so it is not very good. Has anyone build a good compress method they would like to share?


    My XNA MMORPG
    My blog
  •  03-05-2008, 10:51 AM

    Re: Compressing byte arays

    How big are your arrays? Small ones won't compress well and often as you are seeing the extra lookup table overhead makes them bigger. There are just not enough regular patterns in your stream. Try compressing an array full of identical values just to make sure you have not got a mistake in your code.

    For small stuff youy are probably better of looking at encoding rather than compressing - or accepting that it can't be made smaller. See Shawns posts from December http://blogs.msdn.com/shawnhar/archive/2007/12.aspx



    The ZBuffer - News and information for XNA and Managed DirectX
  •  03-05-2008, 11:21 AM

    Re: Compressing byte arays

    Usually the arrays are around 20 bytes (i don't want to compress these). A few byte arrays are 2000 bytes (these i want to compress). Compressing a 100 byte array, with only 0 values will result in a 130 compressed array. I will try to look at Shawn’s blog post.


    My XNA MMORPG
    My blog
  •  03-05-2008, 11:25 AM

    Re: Compressing byte arays

    steffanp:

    Compressing a 100 byte array, with only 0 values will result in a 130 compressed array.

    Now that surprises me.. maybe the gz algorithm doesn't notice long streams of identical values. If you know your data stream has lots of identical values next to each other you could look at implementing a RLE compression yourself. its pretty simple. Of course this will ony help you if your data often has this kind of pattern.



    The ZBuffer - News and information for XNA and Managed DirectX
  •  03-05-2008, 11:58 AM

    Re: Compressing byte arays

    The ZMan [MVP/Moderator]:
    steffanp:

    Compressing a 100 byte array, with only 0 values will result in a 130 compressed array.

    Now that surprises me.. maybe the gz algorithm doesn't notice long streams of identical values. If you know your data stream has lots of identical values next to each other you could look at implementing a RLE compression yourself. its pretty simple. Of course this will ony help you if your data often has this kind of pattern.

    I expect it does but as you said there is some overhead to compression that makes it pointless for smaller data. 100 bytes (all zeroed) compresses to more than 100 bytes but 10,000 bytes compresses to less than 200.


    XNA UK User Group - XNA Goodies
  •  03-05-2008, 3:50 PM

    Re: Compressing byte arays

    100 bytes is probably too small to get much useful compression, although it really depends on the nature of your data and how well that fits whatever algorithm you choose. Generalized algorithms like zip tend to do ok on a wide range of data, but take a while to get warmed up, while more specialized algorithms might do better with less overhead if you can find the right one for your particular data.

    I'm surprised you don't get at least ok results with a 2000 byte data block, though.

    What kind of information do you have in there?

    --
    XNA Framework Developer
    blog - homepage
  •  03-10-2008, 3:03 AM

    Re: Compressing byte arays

    Generalized compression algorithms are probably going to be one of the last resorts or used only for a lazy attempt to try and compress unless you are sending pre-constructed data (file transfers, images, etc). If you have control over the data being sent, I reccomend you check out using a bit stream combined with prediction, deltas and value ranges.

    Prediction: Lets say you have a common default value - this can be just about anything. Lets say your characters can change the color of their skin - R, G and B all on a 0-255 range. This takes 3 bytes. But a good chunk of players, maybe about 25%, still use the default color. You can have a bit for a boolean stating that "if true, use the default color, else read the next 3 bytes for the color". You can extend this to add more values, even weigh them by the rate of occurance using a Huffman or Algorithmic style weight distribution.

    Deltas: This one is a bit harder to make use of and also will have to go under the assumption that the server knows what packets the client has received and vise versa (ie a reliable and ordered message system). It is common to have a lot of NPCs of the same type in one area. If you store the values last received from this packet - at least the ones that would be of use like their name and apperance - you can use booleans like above to state "if true, use the last received value, else read the new value".

    Value Ranges: Using a Truncated Binary Encoding algorithm, you can make use of those values between powers of 2. If you have 0 to 10, setting it on the range of 0 to 15 is still going to be a waste of those 5 sequences (11 to 15). They will never occur. If you define the min and max values on both the reading and writing (in opposed to the usual "how many bits to use"), you can slip some of those values in the 0 to 10 range into 3 bytes and the rest into 4 instead of 4 for them all.

    vbGORE - open-source VB6 multiplayer ORPG engine
View as RSS news feed in XML
©2007 Microsoft Corporation. All rights reserved. Privacy Statement Terms of Use Code of Conduct Feedback