There are a few reasons why your code may run slower on the Xbox. This FAQ covers the background and possible workarounds. You should ALWAYS profile your code before you try to diagnose, or even worry about, performance issues.
The blog posts http://blogs.msdn.com/netcfteam/archive/2006/12/22/managed-code-performance-on-xbox-360-for-the-xna-framework-1-0.aspx and http://blogs.msdn.com/netcfteam/archive/2006/12/22/managed-code-performance-on-xbox-360-for-xna-part-2-gc-and-tools.aspx give a good summary.
Floating Point is slower
The XNA Framework on Xbox uses the Compact CLR (common language runtime) to run your compiled IL code. The Compact CLR is optimized for devices like PDAs and cell phones, where small size is more important than high performance. As such, the implementation of the Compact CLR on the Xbox does not optimize floating point code at all. It also doesn't inline small functions (such as Vector3.Cross or even just Vector2.operator+). Even worse, it passes all arguments on the stack, not in registers, which means that even though the PPC has lots of FP registers, you get stack spill for each function you call (if you pass by ref, the argument is still on the stack; it just doesn't make an unnecessary trip through registers first).
To add insult to injury, the PowerPC cores in the Xbox are rather unsophisticated from a microarchitecture point of view. They are in-order, which means that one instruction cannot retire (or even issue, to some extent) until the previous instruction has retired. This means that the entire pipeline stalls when you miss the cache. This is, by the way, why they made the CPUs hyper-threaded: when one thread is stalled waiting for memory, the other thread gets scheduled, and hopefully the memory it was waiting for has arrived by then. The CLR JIT (just-in-time) compiler just emits "naive" code that performs the operations of the IL in the managed assemblies, but does pretty much nothing to optimize the code stream, such as reordering instructions around data hazards, so it's nowhere near as efficient as highly optimized C++ code would be even under the same restrictions.
Finally, the Xbox has a very nice, very wide SIMD floating point unit (think AltiVec or SSE). Unfortunately, the Compact CLR doesn't know anything about how to use that. It may be that some functions (like Vector3.Transform()) are coded by the XNA team to be implemented using all the native niceness, but anything you write on your own won't.
Workarounds:
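One possible workaround for the argument-passing cost described above is to pass vectors by reference instead of by value, so each call pushes references rather than whole structs. The sketch below is only an illustration: Vec3 and VectorSum are hypothetical stand-ins written so the code compiles on its own, not XNA types, and whether the by-ref form is actually faster in your game is something only profiling on the Xbox can tell you.

```csharp
using System;

// Hypothetical stand-in for an XNA-style vector struct (an assumption, not the real Vector3).
public struct Vec3
{
    public float X, Y, Z;

    public Vec3(float x, float y, float z) { X = x; Y = y; Z = z; }

    // Convenient form: each call copies two structs in and one struct out.
    public static Vec3 operator +(Vec3 a, Vec3 b)
    {
        return new Vec3(a.X + b.X, a.Y + b.Y, a.Z + b.Z);
    }

    // By-reference form: only references are passed and the result is written in place.
    public static void Add(ref Vec3 a, ref Vec3 b, out Vec3 result)
    {
        result.X = a.X + b.X;
        result.Y = a.Y + b.Y;
        result.Z = a.Z + b.Z;
    }
}

public static class VectorSum
{
    // Readable version using the operator.
    public static Vec3 SumByValue(Vec3[] items)
    {
        Vec3 total = new Vec3(0f, 0f, 0f);
        for (int i = 0; i < items.Length; i++)
        {
            total = total + items[i];
        }
        return total;
    }

    // Same result, but no struct copies across the call boundary.
    public static Vec3 SumByRef(Vec3[] items)
    {
        Vec3 total = new Vec3(0f, 0f, 0f);
        for (int i = 0; i < items.Length; i++)
        {
            Vec3.Add(ref total, ref items[i], out total);
        }
        return total;
    }
}
```

Both methods return the same sum; the by-ref version trades readability for fewer copies, so it is probably only worth it in code the profiler has flagged.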
Garbage Collection is different
The GC on the Xbox is not generational. So when a GC occurs, ALL objects on the heap are scanned to determine whether they are live.
Workarounds:
- Avoid creating garbage
- Keep your heap small and simple so garbage collections will be fast
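As an illustration of "avoid creating garbage", here is a minimal sketch of a fixed-size object pool: every object is allocated once up front and then recycled, so the game loop itself never calls new. Bullet and BulletPool are hypothetical names made up for this example, and the design (returning null when the pool is empty) is just one assumption among several reasonable choices.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical game object, used only for illustration.
public sealed class Bullet
{
    public float X, Y;
    public bool Active;
}

// Fixed-size pool: all Bullets are created in the constructor and reused afterwards.
public sealed class BulletPool
{
    private readonly Stack<Bullet> free;

    public BulletPool(int capacity)
    {
        // Pre-sizing the stack means Push never has to grow its backing array later.
        free = new Stack<Bullet>(capacity);
        for (int i = 0; i < capacity; i++)
        {
            free.Push(new Bullet());
        }
    }

    public int FreeCount
    {
        get { return free.Count; }
    }

    // Returns null when the pool is exhausted instead of allocating a new object.
    public Bullet Spawn(float x, float y)
    {
        if (free.Count == 0)
        {
            return null;
        }
        Bullet b = free.Pop();
        b.X = x;
        b.Y = y;
        b.Active = true;
        return b;
    }

    // Hands the object back to the pool; a second release of the same object is ignored.
    public void Release(Bullet b)
    {
        if (b == null || !b.Active)
        {
            return;
        }
        b.Active = false;
        free.Push(b);
    }
}
```

Create the pool during loading, call Spawn and Release during gameplay, and the heap stays the same size from frame to frame.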
More detailed info here.
The JIT Compiler does not optimize code as well
Small routines are not inlined.
Workarounds:
- Inline by hand IF profiling shows that this is on the critical path. Inlining prematurely will make your code a maintenance nightmare
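To show what inlining by hand looks like, the sketch below has the same loop written twice: once calling a small helper, and once with the helper's body pasted into the loop. The Distance class and its method names are hypothetical, invented for this example. Keep the readable version unless the profiler says the call overhead matters.

```csharp
using System;

public static class Distance
{
    // Small helper: clear to read, but on a JIT that never inlines,
    // every call pays the full call overhead.
    public static float LengthSquared(float x, float y)
    {
        return x * x + y * y;
    }

    // Readable version: one helper call per element.
    public static int CountWithinRadius(float[] xs, float[] ys, float radius)
    {
        int count = 0;
        float r2 = radius * radius;
        for (int i = 0; i < xs.Length; i++)
        {
            if (LengthSquared(xs[i], ys[i]) <= r2)
            {
                count++;
            }
        }
        return count;
    }

    // Hand-inlined version: the helper's body is written directly in the loop,
    // so there are no calls inside the hot path.
    public static int CountWithinRadiusInlined(float[] xs, float[] ys, float radius)
    {
        int count = 0;
        float r2 = radius * radius;
        for (int i = 0; i < xs.Length; i++)
        {
            float x = xs[i];
            float y = ys[i];
            if (x * x + y * y <= r2)
            {
                count++;
            }
        }
        return count;
    }
}
```

The two methods must always agree, so it is worth keeping the original around as a reference to test the inlined one against.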