XNA Creators Club Online

Drawing 15,000 objects performance issues

Last post 5/25/2009 9:33 AM by gooo. 8 replies.
  • 5/10/2009 4:09 PM

    Drawing 15,000 objects performance issues

    I am working on a virtual world client for Second Life and OpenSim.  In case you aren't familiar with these environments, they are client/server applications where the client retrieves the world from the server and renders it.  The virtual world is broken up into 256m x 256m squares called regions.  Each region can contain up to 15,000 primitives.  The primitives include cubes, cylinders, spheres, and tori.  When the client retrieves a region from the server, it gets a list of up to 15,000 primitives, each with an identifier for one of those types as well as vectors for position, scale, and rotation.  These primitives are basically extruded profiles, so they also include path cut, hollow, twist, and shear modifiers.  Additionally, each face of the geometry can have its own material, which can include color, texture, transparency, and UV modifications for tiling or other texture placement operations.

    So what I have is a realtime geometry pipeline which takes the primitive parameters and creates a vertex and index array.  I then create an effect for each surface.

    I have got some basic rendering working, but I am running into performance issues.  I am still learning XNA and DirectX and have some questions about the best approach to rendering this.

    1.  Since cubes can have 6 faces and each face can have its own texture, I am looping through the primitives and doing a draw call for each face.  That is 15,000 x 6 = 90,000 draw calls.  I loaded up PIX and I don't see any particular place where there is a time duration problem, so it seems this many draw calls is simply too much.  I have the following ideas on how to reduce the number of calls, but I am curious whether you guys know of some better ones.

    A. Implement some kind of culling by tracking the face normals and skipping ones that are facing away from the camera.  That should reduce the draw calls by at least 50% since I can only see 3 of the 6 sides of a cube.  But is it faster to do this normal check on CPU time or better to do the draw call and let the GPU decide to draw it?

    B. Although the cube can have 6 different textures, a lot of cubes use a default material for all sides, so if I change my index buffer to be grouped by material rather than by face, I should be able to reduce the number of draw calls per effect.  But my question here is this: a lot of the primitives use the same geometry, so I can use hardware instancing, which will cut down on vertex buffer size but not really on the number of draw calls per effect.  Is there a way I could set up the effect to contain the information for all sides of the primitive?  For instance, have an array of textures, an array of side colors, and an array of transparency values, and in the vertex shader know which of the array entries to use to pull the diffuse color from the texture, etc.  This way I could reduce my draw calls down to 15,000.  Can I pass an array of textures to the shader?

    C. Set up an algorithm to remove hidden objects, since some objects are behind others and can't be seen, and also decide whether a primitive is in the view at all (although it is possible to have all 15,000 items in view).  Along these lines, some items are small and far enough away that you can't see them.  But to do all of this I would need to do a lot of math on CPU time.  Is it better to just pass all these objects to the GPU and let it decide?
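
    To make idea B concrete, here is a minimal sketch in Python (not XNA code) of grouping faces by material and comparing the resulting draw call counts.  The primitive layout (a dict with an "id" and a list of per-face material names) and the function name are assumptions made up for illustration.  It also assumes that all faces sharing a material can live in one vertex/index buffer, which is what lets one material map to one draw call.

```python
from collections import defaultdict

def batch_faces_by_material(primitives):
    """Group (primitive id, face index) pairs by the material they use.

    `primitives` is a hypothetical list of dicts:
    {"id": int, "faces": [material_name, ...]}.
    """
    batches = defaultdict(list)
    for prim in primitives:
        for face_index, material in enumerate(prim["faces"]):
            batches[material].append((prim["id"], face_index))
    return dict(batches)

# Three cubes: one fully default, two with a few custom faces.
prims = [
    {"id": 0, "faces": ["default"] * 6},
    {"id": 1, "faces": ["default"] * 5 + ["brick"]},
    {"id": 2, "faces": ["brick", "glass"] + ["default"] * 4},
]

batches = batch_faces_by_material(prims)
naive_calls = sum(len(p["faces"]) for p in prims)  # one draw call per face
batched_calls = len(batches)                       # one draw call per material
```

    With these three cubes the per-face approach issues 18 draw calls while the per-material approach issues 3.  The real saving depends on how many distinct materials a region actually uses.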


    2. For textures, what kind of optimizations can I do?  I am pulling arbitrary size images from the server and converting them into Texture2D objects.  Would it help if I make smaller images for objects that are far away and only use the full size texture for close objects?  For instance, an object that is off in the distance may only be using a 10x10 pixel area on the screen.  Would it be better to make 16x16, 64x64, 128x128, and 512x512 versions of the original and use one based on the distance away, or does it not really matter?
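
    The distance-based selection described here could be sketched as follows (Python, for illustration only).  The function names, the field-of-view parameter, and the power-of-two size chain from 16 to 512 are assumptions, not anything from XNA.  Note that a texture created with a mip chain already gets this kind of selection from the GPU when it is sampled, so a manual scheme like this mainly matters for deciding what to download or keep in memory.

```python
import math

def projected_size_px(object_size, distance, fov_y_deg, viewport_height_px):
    """Estimate how many pixels tall an object appears under a perspective camera."""
    # Height of the view volume at the given distance, in world units.
    visible_height = 2.0 * distance * math.tan(math.radians(fov_y_deg) / 2.0)
    return object_size / visible_height * viewport_height_px

def pick_texture_size(pixels, min_size=16, max_size=512):
    """Return the smallest power-of-two size covering `pixels`, clamped to the range."""
    size = min_size
    while size < pixels and size < max_size:
        size *= 2
    return size

# A 1 m object seen from 10 m with a 90 degree vertical FOV on a 1024 px tall view.
pixels = projected_size_px(1.0, 10.0, 90.0, 1024)
chosen = pick_texture_size(pixels)
```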

    3.  How does the number of triangles relate to rendering time?  I have a default terrain that is a 256 x 256 square made up of two triangles.  When close, these two triangles can fill the complete view, which could be 1024x1024 pixels.  Is it OK to use two triangles when up that close, or is it better to break that surface into more triangles?  And if so, what is a good rule of thumb for level of detail as far as how many pixels one triangle should cover?

    4. One idea I am playing with is to pass the shear, twist, hollow, and profile cut values into the effect and let the GPU perform those operations.  I don't really need to make new geometry, just do transformations on it.  If I do these operations in memory then I have to make a new copy of the vertices; if I can do them on the GPU then I can have a reused set of vertices and basically do hardware instancing on them.

    5. For vertex and index buffers, is it OK to have a separate vertex and index buffer for each of the 15,000 objects and set it at each primitive draw, which could be 15,000 times per frame, or is that too many?  I am thinking I want to make a scene buffer with all the vertices and change the vertex buffer as little as possible.  What might be some good strategies for this?

    Any ideas or thoughts would be appreciated.  I am thinking I want to target around 24 to 30 fps.  How many draw commands, triangles, and vertices do I need to stay within?  What are some of the metrics people are targeting for their games?

  • 5/10/2009 5:20 PM

    Re: Drawing 15,000 objects performance issues

    Draw calls are basically the most time-consuming part of a program.  Therefore using any of those algorithms (unless they require massive for-next loops) will probably save processor time.  Below are the fastest and most efficient algorithms to use:
    1) Don't draw hidden objects (like the 3 faces of a cube you mentioned).
    2) Don't draw anything behind the user.
    3) If the above don't help enough, you may need to put in a z-barrier (essentially an if-statement that prevents drawing of anything too far away).  This will also cut down on the program's functionality, so save it for last.

    As for using different size textures, you may want to hold off.  The number of draw calls will ultimately be unchanged, so it will not matter (that much).  The most important thing is to keep the number of times you issue a draw call to a minimum.
    Also, it could be beneficial to combine triangles into squares.  Even though the number of pixels remains the same, the number of draw calls will decrease, and the O.S. likes that better.

    I hope this helped
  • 5/11/2009 6:33 PM

    Re: Drawing 15,000 objects performance issues

    As soon as you say "15,000 objects", I think you have a problem!

    3D graphics cards are very good at drawing huge numbers of triangles, but they like to have just a few draw calls, with many triangles in each. The usual advice is that for great performance, you should have no more than 1,000 draw calls per frame.

    To render a scene like this, I think you will need to merge your objects. Find things that don't need to animate or move around independently, and flatten them into a single vertex buffer and index buffer, so you can draw them all in a single call. This also requires your objects to share the same texture: if that is not the case, you might want to look into texture pages. If you have objects that are identical except they can move or are animated differently, look into hardware instancing techniques.
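
    As a rough illustration of the flattening step (a Python sketch, not XNA code): each mesh's indices are shifted by the number of vertices already in the merged buffer.  The function name and the (vertices, indices) tuple layout are assumptions for illustration, and the vertices are assumed to be already transformed into world space, since merged geometry shares one world transform.

```python
def merge_meshes(meshes):
    """Flatten several (vertices, indices) meshes into one vertex and one index list."""
    vertices, indices = [], []
    for mesh_vertices, mesh_indices in meshes:
        base = len(vertices)  # index offset for this mesh inside the merged buffer
        vertices.extend(mesh_vertices)
        indices.extend(base + i for i in mesh_indices)
    return vertices, indices

# Two quads, each made of two triangles, already placed in world space.
quad_a = ([(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)], [0, 1, 2, 0, 2, 3])
quad_b = ([(2, 0, 0), (3, 0, 0), (3, 1, 0), (2, 1, 0)], [0, 1, 2, 0, 2, 3])

merged_vertices, merged_indices = merge_meshes([quad_a, quad_b])
```

    The two quads end up as 8 vertices and 12 indices that can be submitted in one draw call, provided they share the same texture and shader settings.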

    Most games do this kind of data merging and optimization work ahead of time, either manually inside their modelling program or as an automated process while the content is built (for instance using a custom content processor). The unique challenge for games that download their content on the fly in pieces is figuring out ways to optimize this into a good form for rendering on the fly while the game is running.
    XNA Framework Developer
  • 5/11/2009 10:11 PM

    Re: Drawing 15,000 objects performance issues

    Hi, thanks, yeah, I am looking into how to use hardware instancing.  I can package up multiple geometries into one call using hardware or software instancing.  The thing is, for my geometry every object could have 1 or more faces.  I have 15,000 objects, but what I really have is 15,000 x 6 faces to draw.  Each face could have its own material, which can include a texture.  The good thing is that a lot of the objects just have one material for all faces.

    So here is my question, if I use hardware instancing how can I apply different textures to each face.  I am thinking I can expand stream 1 to include a material # or other details besides just the matrix transformation but how can I pass a list of textures to the shader.  I saw I can do an array of textures but I haven't been able to find how many I can pass at once.  If I can draw 100 faces at a time that is 150 draw calls.

    So I really like the hardware instancing idea; I just have to figure out how to have different materials on different faces.
  • 5/11/2009 10:22 PM

    Re: Drawing 15,000 objects performance issues

    ktweedy1:
    if I use hardware instancing how can I apply different textures to each face.


    You can't, at least not with good performance on a 3D graphics card. Thousands of individual faces, each one with a different texture, is pretty much the worst case for hardware accelerated rendering performance.

    To get good rendering speed, you need to figure out how to change this data so that it consists of just a small number of draw calls, each of which includes many thousands of triangles that all use the same shader, texture, and renderstate settings. Sorting by texture is probably a good place to start. If that doesn't get you enough performance, you could look into packing multiple textures into a texture atlas (aka. sprite sheet).
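
    For the texture atlas idea, the core bookkeeping is remapping each face's local UVs into its tile of the atlas.  The following Python sketch assumes a uniform grid of equally sized tiles with no padding; the function name and parameters are made up for illustration.

```python
def atlas_uv(u, v, tile_index, tiles_per_row, tiles_per_col):
    """Map a local (u, v) in [0, 1] to atlas coordinates for a tile in a uniform grid."""
    col = tile_index % tiles_per_row
    row = tile_index // tiles_per_row
    return ((col + u) / tiles_per_row, (row + v) / tiles_per_col)

# 2 x 2 atlas: tile 0 is top-left, tile 3 is bottom-right.
center_of_first = atlas_uv(0.5, 0.5, 0, 2, 2)
corner_of_last = atlas_uv(0.0, 0.0, 3, 2, 2)
```

    One caveat: texture wrap addressing applies to the whole atlas, so UVs that tile a texture by going outside the 0 to 1 range cannot simply be remapped this way, and tiles usually need some padding to avoid neighbors bleeding in under filtering.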
    XNA Framework Developer
  • 5/13/2009 7:31 AM

    Re: Drawing 15,000 objects performance issues

    Yeah, packing into one texture seems like an interesting idea, mixed with some kind of mipmapping technique: based on the distance to the item, estimate the actual pixel space the shape will use on the screen, and make a smaller set of textures all packed into one texture per call.

    What are some of the metrics around texture size?  As a general rule, textures will come down from the server at 512x512 or smaller, with a few exceptions of bigger ones.  What size textures can I pass to the shader before it will start gagging on them?

    Since I will be reusing vertex geometry, I guess I will need to pass the vertex transformation matrix and a UV transformation matrix in stream 1 of the hardware instancing setup.
  • 5/13/2009 9:36 AM

    Re: Drawing 15,000 objects performance issues

    I can't give you an exact metric, though bigger than 1024 x 1024 will probably give you poor texture caching. I use 2048 x 2048 only for very special cases. The question you should be asking is whether you really need 512 x 512 textures when you are drawing 15,000 objects. Maybe you can trim this down a bit, since 512x512 is approximately a quarter of the entire screen, depending on your resolution of course.

    Regards
  • 5/13/2009 12:42 PM

    Re: Drawing 15,000 objects performance issues

    Yeah, that is what I was trying to say, but didn't say so well.  Based on distance and object size I can probably estimate the actual pixel size and make smaller versions of the textures that I can pack all together.  On a 1024x1024 texture I should be able to fit 64 16x16 images per row (64 x 64 = 4,096 in total).
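
    The packing arithmetic can be checked with a few lines of Python.  This assumes a square atlas divided into a uniform grid with no padding between tiles; the function name is made up for illustration.

```python
def atlas_capacity(atlas_size, tile_size):
    """Return (tiles per row, total tiles) for a square atlas of square tiles."""
    tiles_per_row = atlas_size // tile_size
    return tiles_per_row, tiles_per_row * tiles_per_row

per_row_16, total_16 = atlas_capacity(1024, 16)     # smallest distant-object tiles
per_row_512, total_512 = atlas_capacity(1024, 512)  # full-size server textures
```

    So a 1024x1024 atlas holds 64 tiles along each edge at 16x16, but only 4 tiles in total at 512x512, which is why shrinking distant textures makes such a difference to how much fits in one draw call.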
  • 5/25/2009 9:33 AM

    Re: Drawing 15,000 objects performance issues

    Hi, I have a very similar problem ... the only difference is that I know that every scene object I get from the server uses one of about 15 models, and there is just one possible texture per model, but this texture has to be combined with the object's color (which can change during runtime)... additionally, not every object has a texture .. some use solid colors only

    I'm trying to implement hardware instancing to render the complete scene using as few draw calls as possible ... but I'm wondering which way gives the best performance: creating an effect for each texture-color combination and moving my object from one "appearance collection" to another whenever it changes its color? or is there a way to pass the shader color information together with my instance data?

    oh and one more general question about instancing: how do you structure your scene? so far I've organized my scene into scene objects, each owning an instance referring to a model ... but I'm starting to think that the opposite way would be better for creating my render batches... what's your opinion?
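
    One way to picture the "opposite way" is a structure keyed by model, where each entry holds the per-instance data for one batch.  The Python sketch below is only an illustration: the scene object layout and function name are assumptions, and the transform is reduced to a position tuple for brevity.  It also assumes the per-instance stream can carry a color next to the transform, so a color change only rewrites instance data instead of moving the object between collections.

```python
from collections import defaultdict

def build_instance_batches(scene_objects):
    """Group per-instance data (transform, color) by the model each object uses."""
    batches = defaultdict(list)
    for obj in scene_objects:
        batches[obj["model"]].append((obj["transform"], obj["color"]))
    return dict(batches)

# Hypothetical scene: two cubes with different colors and one sphere.
scene = [
    {"model": "cube", "transform": (0, 0, 0), "color": (255, 0, 0, 255)},
    {"model": "sphere", "transform": (5, 0, 0), "color": (0, 255, 0, 255)},
    {"model": "cube", "transform": (2, 0, 1), "color": (0, 0, 255, 255)},
]

instance_batches = build_instance_batches(scene)
draw_calls = len(instance_batches)  # one instanced draw per model
```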

    thanks!