XNA Creators Club Online

Understanding VisualizationData

Last post 1/13/2010 11:32 PM by SRH. 22 replies.
  • 12/23/2008 3:26 AM

    Understanding VisualizationData

    XNA’s new VisualizationData class opens up many possibilities in terms of authoring visualizations and (what I’m personally more interested in) creating music-generated gameplay. However, aside from the bundled XNA documentation, there isn’t a lot of information out there on how to use the VisualizationData class in a useful way. I will attempt to address this issue through this thread. I hope to document what I’ve learned about visualization data in a way that makes sense to the musically inclined and otherwise. If time allows, I’ll supplement my findings with code samples, links to information I’ve found around the web, and some video demos on YouTube. In the tradition of NeHe’s OpenGL tutorials that I’m sure many of us grew up on, I’ll try to keep my code as simple and clean as possible so that even novice XNA developers can understand what I’m doing. Also in that tradition, we’ll start with the very basics of getting access to a song’s visualization data and steadily progress to doing some cooler stuff with it. So let’s jump right in!

     

    Getting Started

    The VisualizationData class lives in the Microsoft.Xna.Framework.Media namespace. The easiest way to get started playing with visualization data is to play some music from your Windows Media Player library through the MediaPlayer, which we’ll do now. If you want to follow along, this first project takes you from an empty project to your first music visualization. So create a new Windows Game (extending this to the Xbox 360 requires a little more footwork, which we’ll talk about in future posts), and insert the following code where appropriate (remember, this stuff requires XNA 3.0):

    // Requires: using Microsoft.Xna.Framework.Media;
    MediaLibrary mediaLibrary;
    SongCollection songCollection;
    VisualizationData visualizationData;

    public VisualizationDataGame()
    {
        mediaLibrary = new MediaLibrary();
        visualizationData = new VisualizationData();
        songCollection = mediaLibrary.Songs;
    }

    The MediaLibrary object represents the music that’s been added to your Windows Media Player library. So before this will work, you’ll need to make sure Media Player has detected some songs. These songs can be MP3s, WMAs, or WAVs, as long as they aren’t DRM’d. (Side note: for some reason, I can’t get MP3s to play through XNA’s MediaPlayer on my desktop, but they work fine on my laptop. They play fine through actual media players (Windows Media Player, Zune, iTunes, WinAmp, etc.). I’ve talked to the XNA development team, and we haven’t arrived at an explanation for this behavior. So far, my desktop is the only instance of this happening, so I’d be interested in hearing from anyone else who has this problem.) Anyway, back to the code.

    The SongCollection is a list of all of the songs contained in your media library, obtained through the MediaLibrary.Songs property. We’ll use this collection to tell the MediaPlayer what to play soon. VisualizationData is the object that will contain the frequency and sample data of the currently playing song.

    protected override void Initialize()
    {
        MediaPlayer.Play(songCollection); // keep the template's base.Initialize() call below this line
    }

    In our Initialize function, we simply tell the MediaPlayer (which is a static class, no need to instantiate it) to play all of the songs in our media library from start to finish. All that’s left is to get the visualization data:

    protected override void Update(GameTime gameTime)
    {
        MediaPlayer.GetVisualizationData(visualizationData); // keep the template's base.Update(gameTime) call below this line
    }

    Our Update function fills the visualizationData object with the currently playing song’s frequency and sample information each frame. We need to call this each frame to keep the data up to date.

    Easy stuff, right? If you were to insert this code into a new Windows game and hit the Run button, you’d see the customary cornflower blue screen and the first song in your media library playing. But this thread’s supposed to be about visualizing music, not just listening to it. So let’s get to the fun stuff.

     

    Seeing Sounds

    We can go about this one of two ways: I can try my best to explain what the frequency and sample data mean in words and hope you get it. Or we can get something on the screen for us to look at, which should make our discussion of the frequency and sample data much clearer. So let’s get through a little more code, and then we’ll talk about what we’re seeing.

    The first thing you’ll want to do is open up your favorite graphics editor (mine’s Paint.NET), and create a 1x1 blank image. I called mine Blank.png. I then created a Textures subdirectory in my Content folder in my project. Place your blank image in this new subdirectory. Now we’re going to declare a new texture in our project:

    Texture2D texture; 

    Simple enough. Next, I like to stick to the common screen resolution of 1280x720. This creates a widescreen viewport on which to display our visualization data. So add the following to the game’s constructor:

    public VisualizationDataGame()
    {
        graphics = new GraphicsDeviceManager(this);
        graphics.PreferredBackBufferWidth = 1280;
        graphics.PreferredBackBufferHeight = 720;
        graphics.ApplyChanges();

        // ... (rest of the constructor as before)
    }

    We’ll want to initialize the texture object in the LoadContent function, so it should look something like this:

    protected override void LoadContent()
    {
        // Create a new SpriteBatch, which can be used to draw textures
        spriteBatch = new SpriteBatch(GraphicsDevice);
        // Load the 1x1 blank image; it is stretched into lines in Draw
        texture = Content.Load<Texture2D>("Textures/Blank");
    }

    Finally, our Draw function is where most of the work is being done. What we’re going to do is graph our frequency and sample data on the top and bottom halves of the screen, respectively.

    protected override void Draw(GameTime gameTime)
    {
        GraphicsDevice.Clear(Color.Black);
        Viewport viewport = graphics.GraphicsDevice.Viewport;
        int x, y, width, height;

        spriteBatch.Begin();

     

    We start this function off by clearing the screen to black and declaring a few variables to make things more readable later. We then tell our sprite batch to begin drawing.  

        for(int f = 0; f < visualizationData.Frequencies.Count; f++)  
        { 
            x = viewport.Width * f / visualizationData.Frequencies.Count; 
            y = (int)(viewport.Height / 2 - visualizationData.Frequencies[f] * viewport.Height / 2); 
            width = 1; 
            height = (int)(visualizationData.Frequencies[f] * viewport.Height / 2); 
            spriteBatch.Draw(texture, new Rectangle(x, y, width, height), Color.White); 
        } 

    In this first loop, we iterate through each element in our frequency collection. For each element, we draw a one-pixel-wide line; the lines sit side by side, left to right across the screen, and each runs from halfway down the screen up to that frequency band’s power level (again, I’ll explain all the technical terms soon). (Technically, we’re really drawing each line from its band’s power level down to the middle of the screen, because of the way 2D screen coordinates are specified in SpriteBatch’s Draw method… semantics :))

        for (int s = 0; s < visualizationData.Samples.Count; s++) 
        { 
            x = viewport.Width * s / visualizationData.Samples.Count; 
            width = 1; 
            if (visualizationData.Samples[s] > 0.0f) 
            { 
                y = (int)(0.75f * viewport.Height - visualizationData.Samples[s] * viewport.Height / 4); 
                height = (int)(visualizationData.Samples[s] * viewport.Height / 4); 
            } 
            else 
            { 
                y = (int)(0.75f * viewport.Height);  
                height = (int)(-1.0f * visualizationData.Samples[s] * viewport.Height / 4); 
            } 
            spriteBatch.Draw(texture, new Rectangle(x, y, width, height), Color.White); 
        } 
        spriteBatch.End(); 
         
        base.Draw(gameTime);
    }

    In this second loop, we iterate through each element in the sample collection. In this case, however, sample values can be negative, so we have to do a little extra footwork to draw their lines, since you can’t specify a rectangle with a negative height in SpriteBatch’s Draw function. (Later, when we convert this to 3D, this will be a lot more elegant.)
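
    The geometry in the two loops can also be pulled out into small helpers, which makes it easier to check by hand. This is only a sketch: BarLayout, FrequencyBar, and SampleBar are hypothetical names that are not part of XNA or the project above, and the rounding can differ by a pixel from the inline version.

```csharp
using System;

// Hypothetical helpers (not part of XNA): plain integer math for the bars drawn above.
public static class BarLayout
{
    // Bar for one frequency band, growing upward from the horizontal midline.
    // power is expected in [0, 1]; y is the top edge of the bar in screen coordinates.
    public static void FrequencyBar(float power, int viewportHeight, out int y, out int height)
    {
        height = (int)(power * viewportHeight / 2);
        y = viewportHeight / 2 - height;
    }

    // Bar for one sample, centered on a baseline three quarters of the way down the screen.
    // sample is expected in [-1, 1]; negative samples hang below the baseline, so the
    // rectangle height is always non-negative.
    public static void SampleBar(float sample, int viewportHeight, out int y, out int height)
    {
        int baseline = (int)(0.75f * viewportHeight);
        height = (int)(Math.Abs(sample) * viewportHeight / 4);
        y = sample > 0.0f ? baseline - height : baseline;
    }
}
```

    On a 720-pixel-high viewport, a sample of 0.5 gives a bar from y = 450 that is 90 pixels tall, and a sample of -0.5 gives a bar of the same height starting at the baseline, y = 540.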

    And there you have it, the Hello World of music visualization! If you’ve been following along, you can build your solution and run it to watch little white lines dance along to your music. Once you get over this first sense of accomplishment, you’ll undoubtedly ask yourself what these lines mean. And we’ll explore that next. But for now, take some time to observe the visualization for yourself so that our next discussion makes sense.

    If you’re lazy like me, you can download the project here (coming soon). And for the truly lethargic, you can see the results here (also coming soon).

     

    The Science of Sound

    So now that we’ve got something to look at, let’s make some sense of this visualization data. But first, a disclaimer: what you are about to read is based largely on observation and speculation. I am not on the XNA development team. I do not claim to know exactly how all this stuff works. If others know more about this stuff than I do, this is where I invite them to chime in. Now, on with the show.

    If you stayed awake during Physics class, you’ll remember that sound, in its most basic form, is vibration composed of many frequencies that the human ear can detect. Sound waves are distinguished by properties such as frequency, wavelength, and amplitude. A change in a sound wave’s frequency is heard as a change in pitch, and a change in its amplitude is heard as a change in loudness.

    Music visualization is driven largely by these changes in pitch and loudness. VisualizationData’s Frequencies and Samples properties give us access to this data. Each of these properties is a collection of 256 floats (so in the example above, we could have done all of the drawing with one for loop, but for readability’s sake, I kept them separated).

    Each collection of sample data gives you a snapshot (at a very low resolution) of the waveform of the currently playing song at that instant in time. The values of these elements range from -1.0 to 1.0. Unfortunately, I haven’t discovered a whole lot to do with this raw data. The real magic comes from applying a Fast Fourier Transform (FFT) to compute the frequency components that make up the sound you’re hearing. Computing the FFT involves a lot of math that I have to admit I slept through my senior year in college, but the good news is that we get this for free in the frequency data!

    Each element of the frequency data represents a frequency band in the range from 20 Hz to 20 kHz (the range of sound audible to the human ear). For the less musically inclined, the sounds near the 20 Hz end would be low, like a bass drum, a tuba, or the lower notes of a piano (the ones on the left side). The sounds near the 20 kHz end would be high, like a flute, a violin, or the high notes of a piano (the ones on the right side). What complicates things a bit is that the distribution of these bands is logarithmic, which means that elements at the higher end of the spectrum cover a wider range of frequencies than those at the lower end.

    The value of each of these elements (from 0.0 to 1.0) represents the power level of that frequency band. Take a look at this video (coming soon), which is our basic visualizer playing another song. Notice how the lower frequencies (on the left side of the screen) bounce up and down in sync with the bass drum of the beat at the beginning of the song. Similarly, the upper frequencies (on the right side of the screen) bounce up and down in sync with the snare drum of the beat. Cool!
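
    One simple way to put a number on that bouncing is to average the power over a range of bands. The sketch below is only an illustration: BandMath and AveragePower are hypothetical names, and which bands count as "bass" is an assumption you would have to tune by watching the visualizer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of XNA): averages a contiguous range of band power levels.
public static class BandMath
{
    // bands: power levels in [0, 1], e.g. a copy of or reference to the Frequencies collection.
    // start/count: the range of elements to average.
    public static float AveragePower(IList<float> bands, int start, int count)
    {
        if (bands == null) throw new ArgumentNullException("bands");
        if (start < 0 || count <= 0 || start + count > bands.Count)
            throw new ArgumentOutOfRangeException("start");

        float sum = 0f;
        for (int i = start; i < start + count; i++)
        {
            sum += bands[i];
        }
        return sum / count;
    }
}
```

    With the project above, a call such as BandMath.AveragePower(visualizationData.Frequencies, 0, 16) would give one value per frame for the 16 lowest bands; the choice of 16 is arbitrary.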

    Even if you’ve never heard the Mario Bros. theme song in your entire life (in which case you must not be from around here), you can probably tap your foot to the song in the above video if you have the slightest semblance of rhythm. So if hearing and reacting to a beat is so simple for humans, we should be able to teach our games to hear and react to beats in music as well. This is one of the topics we’ll explore in future posts. Some other topics I hope to discuss include:

    • Porting our example to the Xbox 360 (only involves a few extra steps)
    • Filtering out some audio data to more closely align what we hear with what we see
    • On the same note, emphasizing some audio data for some purpose
    • Some more applications of visualization data as visualizers in games
    • Some experimental uses of visualization data to drive gameplay

    Hopefully this post has been helpful in generating more thought around XNA’s new VisualizationData class. If you’ve been playing around with music visualization, add your experiences to this thread. Stay tuned for more!


    Happy Holidays!

    Ron

  • 12/23/2008 4:01 AM In reply to

    Re: Understanding VisualizationData

    Nice article, but the forums are really not the place for it... you should make a blog, send it to Ziggy to host (http://www.ziggyware.com/), or post it on http://www.xnawiki.com/
  • 12/23/2008 6:15 PM In reply to

    Re: Understanding VisualizationData

    You're probably right. I'll probably host the rest of these on my site... Still, I'd love to hear from people that have been experimenting with the new API.

    Cheers!
    Ron
  • 2/14/2009 10:27 PM In reply to

    Re: Understanding VisualizationData

    Thanks very much for your tutorial!!!
    Is it possible to get the current sound of the main audio device, like Winamp or a game, and put it on the spectrum?
    And how can we make this faster? I think there is some delay! But it's very nice and helped me a lot!
    thankssssssssssssss
  • 3/2/2009 3:46 PM In reply to

    Re: Understanding VisualizationData

    Is there a site we can go to that has the finished article, including code and videos?
  • 3/3/2009 8:50 PM In reply to

    Re: Understanding VisualizationData

    I'll get this and hopefully part two of the tutorial on my site this weekend. I'll post a link to it as soon as it's ready!
  • 4/22/2009 12:59 AM In reply to

    Re: Understanding VisualizationData

    I have a question about determining the ranges in the Frequency collection.

    Is there a way to determine which frequency each element of the 256-element collection corresponds to?
    I know that
    visualizationData.Frequencies[0] = 20 Hz
    and visualizationData.Frequencies[255] = 20 kHz

    but what does visualizationData.Frequencies[100] correspond to?



  • 4/23/2009 12:23 PM In reply to

    Re: Understanding VisualizationData

    Did you ever finish that tutorial? I would be very interested in looking at that...
    Released: Painting Party | LDAPT! | Ladybird Galaxy | Spacebrix (RIP)
  • 5/1/2009 2:53 AM In reply to

    Re: Understanding VisualizationData

    YorubaX:
    I have a question about determining the ranges in the Frequency collection.

    Is there a way to determine which frequency each element of the 256-element collection corresponds to?
    I know that
    visualizationData.Frequencies[0] = 20 Hz
    and visualizationData.Frequencies[255] = 20 kHz

    but what does visualizationData.Frequencies[100] correspond to?



    I'm wondering the same thing.
    If I have to I might go through the process of generating some pure sine waves and playing them at different frequencies... but if someone else knows it would save me a lot of time.
    Serious as a heart attack...
    ... and twice as deadly.
  • 5/9/2009 6:57 PM In reply to

    Re: Understanding VisualizationData

    I have been experimenting with this same thing. It looks to me like the data is linear, in which case there are about 80 Hz between bands. I can tell you it is not an FFT of the signal data: I ran an FFT on the sample data, got a logarithmic output of frequency data, and it differs greatly. Short of hearing from someone at Microsoft who knows, the only way we are going to make heads or tails of this data is to feed different frequencies into it and graph the results.

    I would think that what this data is and how it's derived would be documented somewhere. Is this in the Xbox SDK or the DirectX SDK?

    Actually found some answers almost immediately. http://msdn.microsoft.com/en-us/library/microsoft.xna.framework.media.mediaplayer.getvisualizationdata.aspx

    Seek and ye shall find, no?


    Henry
    My wife says most of my posts should finish with "Get off my lawn"

    smokinskull.com
    My Twitter
  • 5/9/2009 7:40 PM In reply to

    Re: Understanding VisualizationData

    Big Daddio:
    I have been experimenting with this same thing. It looks to me like the data is linear, in which case there are about 80 Hz between bands. I can tell you it is not an FFT of the signal data: I ran an FFT on the sample data, got a logarithmic output of frequency data, and it differs greatly. Short of hearing from someone at Microsoft who knows, the only way we are going to make heads or tails of this data is to feed different frequencies into it and graph the results.

    I would think that what this data is and how it's derived would be documented somewhere. Is this in the Xbox SDK or the DirectX SDK?

    Actually found some answers almost immediately. http://msdn.microsoft.com/en-us/library/microsoft.xna.framework.media.mediaplayer.getvisualizationdata.aspx

    Seek and ye shall find, no?





    For my purposes that information is insufficient.
    I generated sine waves for notes between E1 and A5, then recorded all frequencies excited above 0.1f.

    The results should be usable in my game... though it isn't as clean as I expected it to be.  I honestly do not know much about audio analysis, but since I was using perfect sinusoids I expected the band of active frequencies to be extremely narrow for each note.  For example, with a frequency of 41.2 Hz (E1) I excited frequency bands 0-82 in the visualization data... and it wasn't highly concentrated in just a few of them either.  It wasn't until I got to much higher-frequency sine waves that I would see the response focused on just a few of the bands.
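
    For anyone who wants to repeat this kind of experiment, a pure test tone can be generated along the following lines. This is only a sketch and not the code used for the measurements above: ToneGenerator and Sine are hypothetical names, and writing the samples to a WAV file so that they can be added to the media library is left out.

```csharp
using System;

// Hypothetical helper: generates a pure sine tone as floats in [-1, 1].
public static class ToneGenerator
{
    // frequencyHz: tone frequency, e.g. 41.2 for E1.
    // sampleRate:  samples per second, e.g. 44100.
    // seconds:     length of the tone.
    public static float[] Sine(double frequencyHz, int sampleRate, double seconds)
    {
        int count = (int)Math.Round(sampleRate * seconds);
        float[] samples = new float[count];
        for (int i = 0; i < count; i++)
        {
            // Phase advances by 2 * pi * f / sampleRate radians per sample.
            samples[i] = (float)Math.Sin(2.0 * Math.PI * frequencyHz * i / sampleRate);
        }
        return samples;
    }
}
```

    For example, ToneGenerator.Sine(41.2, 44100, 1.0) produces one second of the E1 tone mentioned above as 44,100 samples.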
    Serious as a heart attack...
    ... and twice as deadly.
  • 6/23/2009 10:31 AM In reply to

    Re: Understanding VisualizationData

    I have been looking for a good bit of code ever since I heard about visualization data in XNA 3.1.

    This is great. I am going to start cooking my new game now.

    Thanks for the code sample :)
  • 9/20/2009 10:46 PM In reply to

    Re: Understanding VisualizationData

    Has anyone gotten this to work on the Zune HD? I get a System.NotImplementedException when I try to set the MediaPlayer.IsVisualizationDataEnabled to true (no visualization occurs otherwise), and also when I try to make a call to MediaPlayer.GetVisualizationData(). Any thoughts?
  • 10/6/2009 6:39 PM In reply to

    Re: Understanding VisualizationData

    OMG thank you! I was looking for something like this for a project I am doing. Thanks a lot!
    ^_^
  • 10/9/2009 3:40 PM In reply to

    Re: Understanding VisualizationData

    Wow, totally dropped the ball on keeping this tutorial going. But I swear, if anyone's interested, I'm putting some work in this week. It might just be re-writing part one for a talk I'm doing. But hopefully, I can create some new content as well. Expect an update next weekend!

    -Ron
  • 10/12/2009 2:13 AM In reply to

    Re: Understanding VisualizationData

    Looking forward to it.  I'm trying to focus on one project at a time, but I've been wanting to do a game featuring audio triggers for years.
  • 11/10/2009 4:07 PM In reply to

    Re: Understanding VisualizationData

    Gyst:
    Has anyone gotten this to work on the Zune HD? I get a System.NotImplementedException when I try to set the MediaPlayer.IsVisualizationDataEnabled to true (no visualization occurs otherwise), and also when I try to make a call to MediaPlayer.GetVisualizationData(). Any thoughts?


    Did you get an answer? I am having the same problem.
  • 11/14/2009 10:58 PM In reply to

    Re: Understanding VisualizationData

    zacW:
    Gyst:
    Has anyone gotten this to work on the Zune HD? I get a System.NotImplementedException when I try to set the MediaPlayer.IsVisualizationDataEnabled to true (no visualization occurs otherwise), and also when I try to make a call to MediaPlayer.GetVisualizationData(). Any thoughts?


    Did you get an answer? I am having the same problem.

    I'm having the same problem on the Zune HD with latest firmware 4.3 (191).
    Most of the other MediaPlayer controls work, just nothing to do with visualization data...
  • 11/15/2009 8:56 AM In reply to

    Re: Understanding VisualizationData

    I for one am still very interested in this.

    Henry
    My wife says most of my posts should finish with "Get off my lawn"

    smokinskull.com
    My Twitter
  • 11/30/2009 6:24 AM In reply to

    Re: Understanding VisualizationData

    you man are awesome
  • 11/30/2009 7:52 PM In reply to

    Re: Understanding VisualizationData

    Great tutorial and I for one don't mind it being in the forums.  Would like to see more forum posts like it.

    Games: ZenHak

    Site:Zenfar ZenHak, Zenfar Battle Grounds, WiiPunch...
  • 1/13/2010 3:21 PM In reply to

    Re: Understanding VisualizationData

    Calculating the frequency values returned in the 256 floats...

    position 255 = 20,000 Hz
    position 0 = 20 Hz

    log10(20000) = 4.30103
    log10(20) = 1.30103

    So this means that for each item in the 256 floats we actually
    calculate 10^(POSITION * 0.01171875),
    0.01171875 being (4.30103 - 1.30103) / 256, that is, the difference between our high and low values
    divided by 256 items...

    If we want to know the hertz value for position 100, we do this...
    value[100] = 10^(100 * 0.01171875 + 1.30103) (1.30103 is the value for our 20 Hz; remember that we start
    from 20 Hz, not 0, so we must add that value too...)
    This gives us the value for position [100] as approximately 297.10 Hz.

    Let's try other values with this formula...
    value[70] = 10^(70 * 0.01171875 + 1.30103) => ~132.23 Hz [71: 135.85 Hz, 72: 139.56 Hz, 73: 143.38 Hz]
    value[150] = 10^(150 * 0.01171875 + 1.30103) => ~1145.09 Hz
    value[30] = 10^(30 * 0.01171875 + 1.30103) => ~44.93 Hz
    value[220] = same formula => ~7571.03 Hz
    value[221] = same formula => ~7778 Hz
    value[250] = same formula => ~17010 Hz

    Of course these values are approximate. I didn't keep all the digits after the decimal point, but
    it's more than you'll ever need, I guess...

    So the formula would be...
    10^(POSITION * 0.01171875 + 1.30103) where POSITION = [0..255]
    ^ means power
    I've used the base-10 logarithm here...
    I am no math genius, but I think I got it right. If you have better ideas, let me know...
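
    The formula above can be written as a small function, which makes it easy to try other positions. This is only a sketch of the calculation in this post, not a description of how XNA actually assigns bands: BandFrequency, LogHertz, and LinearHertz are hypothetical names, and the divisions parameter is an addition. With 256 divisions it reproduces the step of 0.01171875 used above, which puts position 255 at roughly 19,470 Hz; with 255 divisions position 255 lands on 20,000 Hz. LinearHertz shows the evenly spaced alternative suggested earlier in the thread.

```csharp
using System;

// Hypothetical helpers: two guesses at how a band index could map to a frequency in hertz.
public static class BandFrequency
{
    private const double LowHz = 20.0;
    private const double HighHz = 20000.0;

    // Logarithmic spacing: 10^(position * step + log10(20)),
    // where step = (log10(20000) - log10(20)) / divisions.
    public static double LogHertz(int position, int divisions)
    {
        double step = (Math.Log10(HighHz) - Math.Log10(LowHz)) / divisions;
        return Math.Pow(10.0, position * step + Math.Log10(LowHz));
    }

    // Linear spacing: equal-width bands between 20 Hz and 20 kHz.
    public static double LinearHertz(int position, int divisions)
    {
        return LowHz + position * (HighHz - LowHz) / divisions;
    }
}
```

    BandFrequency.LogHertz(100, 256) gives about 297.1 Hz, matching the value worked out above, while the linear guess with 255 divisions gives bands about 78 Hz apart.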
  • 1/13/2010 11:32 PM In reply to

    Re: Understanding VisualizationData

    Things definitely did not turn out that clean in my experimental analysis.  You can look at my post above, but basically I used MATLAB to generate pure sinusoids for a given frequency, and checked which values were excited when it was played.

    At low frequencies huge bands were excited, with a barely discernible peak at the appropriate value.  You definitely couldn't tell that most of the power was at the appropriate frequency.
    Serious as a heart attack...
    ... and twice as deadly.