Hi there,
There are a couple of ways to do this: using a single cue (for instance, named "MyAmbience") or multiple cues ("BackgroundAmbience", "Birds", "Flies", and whatever other cue names you want triggered). For the former, the programmer would trigger the cue once; for the latter, the programmer would fire off the base ambience once and then trigger each of the foreground elements separately and repeatedly.
As for the structure of the cue(s), each would likely have a single sound. If you're using a single cue, you'd have one track with a looping play wave event firing off your background ambience (the loop property is available globally on the sound, by selecting it in the sound bank, or per play wave event, by selecting the event). You'd then use other tracks for the other elements (birds, flies, etc.). Since we want randomized timing for each of these, a pair of properties will be of interest: "Time Stamp" and "Random Offset". The first sets the earliest time at which the event can be triggered, and the second sets how far past that time the trigger may be delayed. For instance, if you wanted to wait at least 2 minutes but no more than 5 minutes before a trigger, you might set Time Stamp to 120 sec (2 min) and Random Offset to 180 sec (3 min). The wave will then fire between 2 and 5 minutes after the track starts playing.
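To make the arithmetic concrete, here is a small Python sketch of how the two values combine. It is only an illustration: the function names are made up for this post, and it assumes the offset is picked uniformly at random, which I haven't confirmed is what the engine actually does.

```python
import random


def settings_for_window(min_sec, max_sec):
    """Convert a desired (min, max) wait into (Time Stamp, Random Offset).

    Hypothetical helper, not part of any audio tool.
    """
    if not 0 <= min_sec <= max_sec:
        raise ValueError("need 0 <= min_sec <= max_sec")
    # Time Stamp is the earliest start; Random Offset is the extra slack.
    return min_sec, max_sec - min_sec


def trigger_delay(time_stamp_sec, random_offset_sec, rng=random):
    """Pick one delay in [time_stamp, time_stamp + random_offset].

    Assumption: the offset is drawn uniformly; the real distribution
    used by the engine may differ.
    """
    if time_stamp_sec < 0 or random_offset_sec < 0:
        raise ValueError("times must be non-negative")
    return time_stamp_sec + rng.uniform(0.0, random_offset_sec)


# A 2-to-5-minute window maps to Time Stamp = 120, Random Offset = 180.
time_stamp, random_offset = settings_for_window(120, 300)
delay = trigger_delay(time_stamp, random_offset)
```

Whatever value comes back for `delay`, it always lands between 120 and 300 seconds, which matches the 2-to-5-minute example above.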
Here's the rub with this cool randomized timing functionality: currently it only applies to the play wave event itself, not to any subsequent loops. When the sound (or specific track) starts to play, it'll wait as defined above, but if you then loop the track, the next variation will be chosen immediately after the current one completes, with no delay or silence in between. One workaround is to include silent waves of varying lengths among the variations [XMA compression keeps silent waves tiny on Xbox 360; on Windows, you can keep them similarly tiny by generating them at very low sampling rates], though this of course introduces its own potential issues (the engine might randomly choose silent variations too frequently, or not frequently enough, etc.). Another alternative is to use the multiple cues described above: the programmer listens for the notifications relating to the cue, and when the chosen wave completes, the programmer fires off another instance of the cue (which again waits for the Time Stamp plus a random amount of time up to the Random Offset before playing).
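For the multiple-cue approach, the programmer-side logic might look roughly like the Python sketch below. Everything in it is a stand-in: `ForegroundRetrigger`, `play_cue`, and `on_cue_finished` are hypothetical names for this post, not real engine calls, and you'd wire them up to whatever play call and "cue stopped" notification your engine provides.

```python
class ForegroundRetrigger:
    """Re-fires a one-shot cue each time the engine reports it finished.

    play_cue is a hypothetical callback standing in for the engine's
    "start this cue" call; it is not a real XACT function.
    """

    def __init__(self, cue_name, play_cue):
        self.cue_name = cue_name
        self.play_cue = play_cue
        self.active = False
        self.fire_count = 0

    def start(self):
        self.active = True
        self._fire()

    def stop(self):
        # After this, completion notifications no longer re-trigger the cue.
        self.active = False

    def on_cue_finished(self, cue_name):
        # Call this from the engine's "cue stopped" notification handler.
        if self.active and cue_name == self.cue_name:
            self._fire()

    def _fire(self):
        # Each new instance of the cue waits for its own
        # Time Stamp + Random Offset delay before the wave plays.
        self.fire_count += 1
        self.play_cue(self.cue_name)


# Usage with a stand-in "engine" that just records which cues were played.
played = []
birds = ForegroundRetrigger("Birds", played.append)
birds.start()
birds.on_cue_finished("Birds")  # wave completed, so fire the cue again
birds.on_cue_finished("Flies")  # a different cue, ignored here
birds.stop()
birds.on_cue_finished("Birds")  # ignored after stop()
```

The nice part of this setup is that the delay stays in the designer's hands (it lives on the cue), and the code only has to say "play it again" whenever an instance ends.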
Hope this helps a bit!
-Scott