Tim Sneath

Musings of a Client Platform Guy

March, 2004

  • Tim Sneath

    Digging inside Windows Media Files

    I've been scratching an itch - of the coding variety.

    For some while now, I've wanted to catalogue all the Windows Media files that I've (legally) ripped over the last year or so, but documentation on MSDN or the broader Internet was scant. So I've hacked together a rather nice little managed wrapper that makes it really easy to get at the metadata in any WMA files. You can use the wrapper to pull a single attribute out of a file, or to recursively trawl through a directory structure and build a strongly-typed dataset containing all the populated attributes from every file.

    I think it's rather cool, even if I say so myself! Rather than drone on at greater length here, I've put an article online which describes the code and provides a download location for the sample code. Have fun with it and let me know if you put it to any interesting uses.

  • Tim Sneath

    Ten Reasons for Developers to Attend TechEd Europe 2004

    The one where Tim sells out to the marketing droids...

    OK, so this is more of a sales pitch than I would normally include in my blog, but I've heard a couple of people over the last week say that they thought TechEd was primarily aimed at IT Pros rather than developers, and I wanted to correct that perception. In Europe, at least, TechEd actually has a slight bias towards developers (mostly because there's another, slightly smaller event called IT Forum that's pure IT Pro territory).

    I'm responsible for the content of TechEd Europe this year - perhaps the most exciting and daunting challenge I've had since joining Microsoft, and I'm hugely keen to ensure that it's the best event we can put on for developers, architects and IT Pros. I'd love to hear your ideas about how we could make the event more worthwhile. Whether you've attended in previous years and think we're missing something big, or whether you've never attended because it didn't seem to meet your needs, I'd be very interested to read your feedback.

    Anyway, having declared my vested interest, here's my non-marketing marketing pitch :-)

    1. More than 400 breakout sessions, Q&A panels, chalk and talk discussions, and birds of a feather meetings covering every aspect of Microsoft software in the enterprise;
    2. Several (currently secret) new product announcements and the first public unveiling of several big new features in Whidbey and Yukon;
    3. No dull keynote monologues - I've been fighting really hard to do away with the traditional event approach of getting some random VP to waffle on about their pet subject. We've instead been working on something rather different that I hope will go down really well. Don't expect to see marketecture slides, just cool technology.
    4. In-depth training on SQL Server 2005 and Visual Studio 2005; we're currently standing at 38 sessions purely on the next generation of our developer technologies. You could attend the conference twice over and only see sessions covering Yukon and Whidbey! (Oh, and we hope to provide every attendee with a copy of Yukon Beta 2 / Whidbey Beta 1.)
    5. New this year: a dedicated architecture track with top-rated speakers such as Pat Helland planning to participate. We're trying to make TechEd Europe far more attractive to enterprise architects, with plenty of sessions for those who design rather than code.
    6. Seven pre-conference sessions on topics including Writing Secure Code, Patterns and Practices, BizTalk 2004 and SQL Server;
    7. Some great confirmed speakers including David Chappell, Fernando Guerrero, Juval Lowy, Rafal Lukawiecki, Ingo Rammer, Mark Russinovich and Clemens Vasters;
    8. Hands-on labs where you can get to grips with the latest builds of SQL Server 2005, Visual Studio 2005 and 64-bit technologies;
    9. A mobile development track that's twice as big as any from previous years at TechEd. We've integrated the Mobile DevCon to ensure that there's the best possible range of sessions on development for Windows Mobile platforms, including coverage of .NET Compact Framework and SQL Server CE.
    10. A really big party - of course!

    Hope to see some of you there...

  • Tim Sneath

    Extracting Metadata from Windows Media files

    In this article, I'll describe how to use the Windows Media Format SDK to access the metadata embedded in Windows Media files for cataloguing purposes. Also included are two managed classes written in C# that vastly simplify the use of this SDK.

    Download MediaCatalog 1.0 (35KB)

    Introduction

    Over the last year, I've been gradually filling a spare hard disk with rips of all my CDs. It's fantastic to be able to play any CD from my catalogue so easily, and it means I can hide the CDs themselves away from my young daughter's sticky fingers! The problem is that as my digital collection has accumulated, it's getting harder to see what I've got. I've painstakingly tagged all my CDs with metadata, but Windows doesn't currently provide any easy mechanism to sort or manipulate that metadata. So I thought I'd follow Duncan Mackenzie's example and hack together a media cataloguing application.

    The trouble is that it's difficult to extract the metadata from a Windows Media file. The Windows Media Player SDK provides a nice interop library you can use to embed Windows Media Player in your managed application and drive it programmatically, but I definitely wanted to avoid driving GUI controls, given the number of files I want to catalogue. Instead, I fired up MSDN Library and discovered the Windows Media Format SDK, a low-level API into the file format itself.

    This SDK isn't easy to program against from managed code, however - it's pretty grungy COM interop. Fortunately, with the aid of MSDN, Adam Nathan's .NET and COM book and a quick look at some pretty dodgy samples, I was able to build a fairly clean managed wrapper that provides a straightforward interface into the SDK. Ironically, I haven't finished writing the graphical front-end catalogue application that generated the itch in the first place, but I thought the managed library was interesting enough in its own right to share.

    Using MediaCatalog

    I've divided the managed library into a high-level API and a low-level API. The low-level API is a class that allows you to open a media file, examine the attributes by index or name, and enumerate through them using a foreach loop. The high-level API builds on that class and provides methods for recursive or non-recursive iteration through a directory structure, creating a strongly-typed DataSet object that contains the most common attributes of the audio files it finds. You could bind the output to a Windows Forms DataGrid, for example, and indeed the sample test harness included with the code does exactly that.

    Low-Level API

    To access the low-level API, you instantiate an object of type MetadataEditor, passing the constructor the filename of the media file you're interested in. You can then either enumerate through the object using a foreach statement, or query it by name or using an indexer. The object supports an int-based indexer or alternatively an enum-based indexer that simplifies access using common attributes. The following C# code sample demonstrates each of these choices.

       using (MetadataEditor md = new MetadataEditor("britney.wma"))
       {
          // Enumerate through each of the attributes in the file
          foreach(Attribute attr in md)
          {
             Console.Write(attr.Name);
             Console.Write(": ");
             Console.WriteLine(attr.Value.ToString());
          }
          
          // Set author to the value of the ID3/TPE1 (lead performer) attribute
          string author = md.GetAttributeByName("ID3/TPE1");
          // Set d to the duration of the media file (e.g. 3m 45s)
          TimeSpan d = md[MediaMetadata.Duration];
       }
    

    Remember that since this object uses unmanaged resources, it's important to call the MetadataEditor.Dispose() method when you've finished using it in order to release the underlying resources. Alternatively, wrap it inside a using statement as demonstrated above.
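
    To make the equivalence concrete, here is a minimal sketch of what the using statement saves you from writing. FakeEditor is a hypothetical stand-in for MetadataEditor (it holds no real resources and simply records that Dispose ran), so treat it as an illustration of the pattern rather than the wrapper's actual code.

```csharp
using System;

// Hypothetical stand-in for MetadataEditor: it only records whether Dispose ran.
sealed class FakeEditor : IDisposable
{
    public bool Disposed { get; private set; }

    public void Dispose()
    {
        Disposed = true;
    }
}

static class DisposeDemo
{
    // Explicit form: roughly what the compiler generates for a using statement.
    public static FakeEditor UseWithTryFinally()
    {
        FakeEditor editor = new FakeEditor();
        try
        {
            // ... read attributes here ...
        }
        finally
        {
            editor.Dispose();   // runs even if the body above throws
        }
        return editor;
    }

    // Compact form: Dispose is called automatically when the block exits.
    public static FakeEditor UseWithUsing()
    {
        FakeEditor editor = new FakeEditor();
        using (editor)
        {
            // ... read attributes here ...
        }
        return editor;
    }

    static void Main()
    {
        Console.WriteLine(UseWithTryFinally().Disposed);
        Console.WriteLine(UseWithUsing().Disposed);
    }
}
```

    Both methods leave the object disposed; the using form is simply harder to get wrong.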

    High-Level API

    This API contains three main methods that can be used to extract album information across multiple directories if necessary:

    Method                          Description
    RetrieveTrackInfo               Retrieves structured property information for the given media file. Returns a TrackInfo object containing commonly-used fields.
    RetrieveSingleDirectoryInfo     Retrieves media information for a single directory. Returns a MediaData object (a strongly-typed DataSet).
    RetrieveRecursiveDirectoryInfo  Recursively trawls through a directory structure for media files, using them to build a DataSet of media metadata. Returns a MediaData object.

    As a quick example, the following C# code snippet binds a Windows Forms DataGrid to the output of RetrieveRecursiveDirectoryInfo:

       MediaDataManager mdm = new MediaDataManager();
       musicData = mdm.RetrieveRecursiveDirectoryInfo(@"\\timserver\music");
       mediaInfo.DataSource = musicData;
       mediaInfo.DataMember = "Track";

    The MediaDataManager object also exposes an event that can be used to track progress (particularly useful during a long recursive directory search). Use the following syntax to enable it:

       mdm.TrackAdded += new MediaDataManager.TrackAddedEventHandler(mdm_TrackAdded);
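
    If you're curious how a recursive search with per-file progress notification hangs together, the following is a rough, self-contained sketch of the general technique. It is not the MediaCatalog source: DirectoryTrawler, its FileFound event and the Action<string> signature are all names I've made up for illustration, and the real TrackAdded delegate may well carry different arguments.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch of a recursive trawl with a progress event.
class DirectoryTrawler
{
    // Raised once per media file found; the argument is the full file path.
    public event Action<string> FileFound;

    // Walks root and all of its subdirectories, returning every *.wma file.
    public List<string> Trawl(string root)
    {
        List<string> found = new List<string>();
        foreach (string file in Directory.GetFiles(root, "*.wma", SearchOption.AllDirectories))
        {
            found.Add(file);

            // Copy the delegate first so a late unsubscribe can't null it mid-call.
            Action<string> handler = FileFound;
            if (handler != null)
            {
                handler(file);
            }
        }
        return found;
    }
}

static class TrawlDemo
{
    static void Main(string[] args)
    {
        string root = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();

        DirectoryTrawler trawler = new DirectoryTrawler();
        trawler.FileFound += delegate(string path) { Console.WriteLine("Found: " + path); };

        Console.WriteLine(trawler.Trawl(root).Count + " file(s)");
    }
}
```

    In a Windows Forms application the handler would typically update a status bar or progress indicator rather than write to the console.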

    Things To Do

    The wrappers aren't complete by any means, and I'd love to hear your suggestions of how they might be improved (or even some code!). Several things on my own personal list:

    • Improve the intuitiveness of some of the class names
    • Add setters for the attributes to allow metadata to be modified
    • Add greater flexibility to the recursive searches to allow them to execute on a background thread
    • Write a decent cataloguing engine that takes advantage of the tools!
  • Tim Sneath

    Passing Strings to Unmanaged Code

    I've just come across a nasty bug in some sample code (from us, I'm ashamed to say), that highlights the pitfalls of passing string buffers between managed and unmanaged code.

    To go back a step or two, I've been trying to create a small application to pull metadata out of Windows Media files so that I can catalogue my music collection. (Incidentally, there are several supported ways to achieve this, including the Windows Media Player SDK and the Windows Media Format SDK.) I'd come across this little function that iterated through all the metadata attributes in a file and dumped them to the console. But for some reason, the function only seemed to be printing the attribute names and not the associated values. The statement looked something like this:

       Console.WriteLine("* {0, 3}  {1, 25} {2, 3}  {3, 3}  {4, 7}  {5}", 
          wIndex, pwszName, wStream, wLangID, pwszType, pwszValue);

    According to the debugger, I was seeing the contents of wIndex and pwszName, but none of the other parameters. Stranger still, when I preceded the Console.WriteLine call with a similar call to MessageBox.Show, the function printed all the parameters. Needless to say, when you get into the kind of debugging situation where you're seeing truly unexpected results, you often disappear down a blind alley trying to solve a problem that doesn't exist. In my case, I started testing the hypothesis that it was a timing issue that the message box display eradicated; I wasted several hours experimenting with wait loops and searching through the documentation for references to file status that with hindsight couldn't have fixed the problem.

    Suddenly it came to me in a flash: the debugger was showing the value of pwszName as "Duration\0". Of course! There was a null-termination character at the end of the string that shouldn't have been there. It wasn't that the call to Console.WriteLine didn't contain the right parameters - it was simply seeing the \0 and terminating the string at that point. MessageBox.Show obviously deals with this differently.
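
    The underlying behaviour is easy to reproduce in isolation. The sketch below (the class and helper names are mine, purely for illustration) shows that a managed string will happily carry an embedded null, and that the null counts towards both Length and equality comparisons until it is stripped.

```csharp
using System;

static class NullTerminatorDemo
{
    // Removes any trailing '\0' characters left behind by a native call.
    public static string StripTrailingNulls(string s)
    {
        return s.TrimEnd('\0');
    }

    static void Main()
    {
        string name = "Duration\0";               // what the debugger showed

        Console.WriteLine(name.Length);           // 9: the terminator is a real character
        Console.WriteLine(name == "Duration");    // False

        string clean = StripTrailingNulls(name);
        Console.WriteLine(clean.Length);          // 8
        Console.WriteLine(clean == "Duration");   // True
    }
}
```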

    So how had pwszName got created like this? Looking back at the sample code that generated the values, I saw something like the following:

       string pwszName = null;
       ushort wNameLen = 0;
       HeaderInfo3.GetAttributeByIndex( wAttribIndex,
                                        ref wStreamNum,
                                        pwszName,
                                        ref wNameLen,
                                        out wAttribType,
                                        pbAttribValue,
                                        ref wAttribValueLen );
       pwszName = new String( (char)0, wNameLen );
       HeaderInfo3.GetAttributeByIndex( wAttribIndex,
                                        ref wStreamNum,
                                        pwszName,
                                        ref wNameLen,
                                        out wAttribType,
                                        pbAttribValue,
                                        ref wAttribValueLen );

    It's pretty clear from this piece of code what's wrong: the creator (presumably a C++ programmer judging by the code style) has called the function once to determine the length of the retrieved string and then called it a second time to fill a pre-populated string. They forgot to trim the final null value(s), with a statement such as the following:

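       // The string was allocated with wNameLen characters, so Substring(0, wNameLen)
       // would change nothing; strip the trailing null(s) explicitly instead.
       pwszName = pwszName.TrimEnd('\0');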

    Even this is not a great way of handling string buffers. A far better approach would have been to use the System.Text.StringBuilder class - a mutable string type that can be passed wherever an API function expects a writable string buffer. Rather than trimming the returned string, I rewrote the API declaration to use a StringBuilder rather than a fixed-length string and changed the sample code accordingly:

       StringBuilder pwszName = null;
       ushort wNameLen = 0;
       HeaderInfo3.GetAttributeByIndex( wAttribIndex,
                                        ref wStreamNum,
                                        pwszName,
                                        ref wNameLen,
                                        out wAttribType,
                                        pbAttribValue,
                                        ref wAttribValueLen );
       pwszName = new StringBuilder(wNameLen);
       HeaderInfo3.GetAttributeByIndex( wAttribIndex,
                                        ref wStreamNum,
                                        pwszName,
                                        ref wNameLen,
                                        out wAttribType,
                                        pbAttribValue,
                                        ref wAttribValueLen );

    The moral of the story: whenever you need to pass a string buffer to a Windows API call, use StringBuilder. (Of course, string is just fine if the unmanaged function doesn't modify its contents.) And if you're wondering why a string is being prematurely truncated, make sure you check for rogue null-termination characters!
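
    If you want to try the same two-call pattern without the Windows Media Format SDK to hand, here is a small sketch against an API that ships with Windows itself. It uses the Win32 GetCurrentDirectory function from kernel32.dll as a stand-in for the media call; the class and method names are my own, and the code will only run on Windows.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Text;

static class StringBufferDemo
{
    // Win32 GetCurrentDirectoryW. Returns the number of characters written
    // (excluding the null), or the required buffer size (including the null)
    // if the supplied buffer is too small. Returns 0 on failure.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern uint GetCurrentDirectory(uint nBufferLength, StringBuilder lpBuffer);

    public static string GetCurrentDirectoryViaWin32()
    {
        // First call: no buffer, so the return value is the size we need.
        uint required = GetCurrentDirectory(0, null);
        if (required == 0)
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }

        // Second call: the marshaller copies the native buffer back into the
        // StringBuilder up to the terminator, so no '\0' ends up in the result.
        StringBuilder buffer = new StringBuilder((int)required);
        uint written = GetCurrentDirectory(required, buffer);
        if (written == 0)
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        if (written >= required)
        {
            throw new InvalidOperationException("Current directory changed between calls.");
        }

        return buffer.ToString();
    }

    static void Main()
    {
        string dir = GetCurrentDirectoryViaWin32();
        Console.WriteLine(dir);
        Console.WriteLine(dir.IndexOf('\0') < 0 ? "No embedded null" : "Embedded null found");
    }
}
```

    Because the StringBuilder is sized by the first call and filled by the second, there is no manual trimming step to forget.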

  • Tim Sneath

    Around the World in 30 Years

    I love this site - I've been wanting to create something similar for a couple of years:

    Which countries in the world have you visited? Here's the link you can use to create your own "visited country" map. (Thanks to Paul Bartlett for the link.)
