
The sample code for the previous post is available here.

If you're looking into developing with WIC or image processing on Vista, check out the WPF Imaging video on channel 9.

Vista includes support for a new image format named Windows Media Photo.  This image format consists of a codec (offering features such as lossless encoding, high dynamic range encoding, highly efficient operation…) and a new container format.  In other words, lots of features to make you want to try it out.  My aim for this post was to write some code to create my first Windows Media Photo file.  The tools at our disposal to write this application are the Windows Media Photo codec itself (I’d like to abbreviate it to WMP but that’s taken, so I’ll go with WMPhoto) and the Windows Imaging Component (WIC) that will allow us to access that codec.  WIC is another new feature for Vista (documented in the Windows SDK), so we get to play with two new technologies for the price of one!

WIC abstracts away the specifics of discovering and working with still image codecs.  For this application it will allow us to locate a codec for our source image, decode it, locate the WMPhoto codec, pass the decoded source image for encoding and save to a file.  The bulk of the application is with WIC and the steps it takes are as follows: 

WIC defines a factory object (IWICImagingFactory) that allows an application to create the specific objects needed to do its work.  For decoding an image we can ask the factory to create the required decoder object (IWICBitmapDecoder) from an existing file by passing a filename.  Other than looking at the file extension, each codec registered with WIC can include one or more entries that contain the first few bytes that uniquely identify an image format (JPEG would be FF D8).  Using this, WIC can ‘sniff’ for the correct codec.  Once we have the decoder object we can ask it for a frame object (IWICBitmapFrameDecode) that will allow access to the decoded image.  The concept of frames exists because an image container can contain more than one image (TIFF being the prime example).  The frame object inherits from the most fundamental image object in WIC (IWICBitmapSource), which can be thought of as corresponding to an image as it would be understood in GDI+.
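To make the ‘sniffing’ idea concrete, here is a minimal, portable sketch of matching a file header against a table of signatures.  It is not how WIC itself is implemented: the Signature struct, the knownSignatures table and the sniffFormat function are names I have made up for illustration, and the table simply hard-codes a few well-known magic numbers, whereas WIC reads the patterns that each codec registers.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical signature entry: a format name plus the leading bytes that
// identify it.  WIC stores comparable patterns per registered codec; this
// struct is only an illustration.
struct Signature {
    std::string format;
    std::vector<std::uint8_t> magic;
};

// A small hard-coded table of well-known magic numbers (assumption: a real
// system would load these from codec registration data instead).
inline const std::vector<Signature>& knownSignatures() {
    static const std::vector<Signature> table = {
        {"JPEG", {0xFF, 0xD8}},
        {"PNG",  {0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A}},
        {"GIF",  {0x47, 0x49, 0x46, 0x38}},
        {"BMP",  {0x42, 0x4D}},
    };
    return table;
}

// Returns the name of the first format whose magic bytes are a prefix of
// `header`, or an empty string when nothing matches.
inline std::string sniffFormat(const std::vector<std::uint8_t>& header) {
    for (const auto& sig : knownSignatures()) {
        if (header.size() >= sig.magic.size() &&
            std::equal(sig.magic.begin(), sig.magic.end(), header.begin())) {
            return sig.format;
        }
    }
    return "";
}
```

Reading the first few bytes of a file into a vector and passing them to sniffFormat is enough to pick a decoder without trusting the file extension, which is the behaviour the factory gives us when we hand it a filename.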

That’s decoding covered, but we still need to get things into the WMPhoto format.  To do this we first use the factory object to create a stream object (IWICStream) and initialise it to a file on the disk.  We then use the factory again to ask it to create the WMPhoto encoding object (IWICBitmapEncoder), specifying the WMPhoto container identifier.  Just as we asked the decoder for a frame in the last paragraph, we will now ask the encoding object for a frame (IWICBitmapFrameEncode) that will store the encoded image.  Because we specified the WMPhoto container, the encoding frame will use the WMPhoto codec and it will pass us an object to allow us to configure the codec with the options we choose.

I’ll break from WIC briefly to discuss the options passed to the codec.  The options are configured through named properties.  Some are defined by WIC and some are specific to the codec being used.  For this application I’ll configure ‘ImageQuality’ (defined by WIC) to govern the level of lossy compression, a value between 0 and 1.0.  Alternatively you could specify ‘Lossless’ (defined by WIC), but that’s probably only worth doing if you have immaculate source images (RAW from a digital camera for example).  I’ll also specify ‘UseCodecOptions’ (defined by WMPhoto) and set that to false.  This tells the WMPhoto codec to pick default values for the other properties we could configure, based on the ImageQuality value.

Continuing with the encoding, we can now set the pixel format and dimensions of the source frame on the encoding frame.  We ask the encoding frame to encode from the source frame (a chaining feature of WIC) and the image data is compressed.  To finish off, we call commit on the frame and then on the encoder object, to flush everything out to the stream and close it.

WIC is a COM based API, so to code this application I followed the same pattern as the WebCam code.  That is to create a C++ class to do the COM work, wrap that in a C++/CLI class to expose it to the managed world (forward calls, convert parameters, handle errors) and then write a C# application to make use of it.  I created a simple command line app that takes an input file and an output file and successfully created my first WMPhoto image using the highest quality encoding.

There is a lot more to WMPhoto.  I’ve only used JPEG images as sources, which is a bit pointless as quality has already been lost.  Converting from a RAW source would be interesting, as would looking at the functionality of the container format.  Perhaps the most interesting area is the set of operations you can perform in the compressed domain.  For example, to create a JPEG thumbnail in GDI+ you would typically have to decode the JPEG, resize the decoded image and then re-encode it.  With WMPhoto there is the prospect of performing this and other operations without decompressing, saving memory and CPU time.  For more information about WMPhoto check out Bill Crow’s blog.

The sample code for the previous post is available here.

What would I change about this sample?  The main thing is that the DirectShow initialisation occurs inside BuildWindowCore, which occurs very early in the creation of the UI.  This initialisation could fail looking for a WebCam, or having found one fail to build a suitable graph.  As the sample stands, the test application will not fail nicely.  To handle these conditions, it would be better to have it behind an initialisation method so that a consumer of the control can control it and design a more graceful failure path.

 

A colleague asked how to capture video from a webcam in WPF.  I find it’s a lot more enjoyable to learn about a technology by having something to aim for and this seemed an ideal project.

So what tools are at our disposal to write a WPF application that will show the live output from a web cam attached locally to the machine?  My first thought was the Media Foundation API new in Vista.  This is a replacement for the venerable DirectShow which is powerful but very closely tied to COM (if you develop in managed code and wanted to use DirectShow, you’re no doubt aware that there is no official managed wrapper).  Media Foundation looked ideal until I found that it doesn’t support hardware devices as yet.  It supports creating your own sources and there is a WAV source sample in the SDK, but creating a source filter for a hardware device will take you to the kernel and that’s a step too far for me.

OK, it needs to be DirectShow.  This will allow code to enumerate the system for hardware and connect all the necessary components to show the camera output on the screen (termed a graph - if you’ve not used DirectShow, play around with the GraphEdit tool in the SDK bin folder, all will be revealed).  As it’s DirectShow, it will also mean C++.  Using interop is possible but it’s a dark tunnel to go down before you see light, so I chose managed C++ to bridge WPF to DirectShow.

DirectShow will render the images from the camera using a component (called a filter in DirectShow) called the Video Mixing Renderer.  This is a DirectX based component that will push the pixels around (see last post!) and in its default form likes to work with window handles and other non WPF concepts.  This pointed me to the HwndHost control which gives a window handle to host things in.  We can subclass this control and override a few methods on the control to hook things up.  Enough background, here’s how I put things together:

I’ve got two classes: a managed class that will subclass HwndHost, and an unmanaged class that will do all the DirectShow magic.  The managed class will instantiate and call methods on the unmanaged class.  First the managed class:  I override BuildWindowCore to create my own window using CreateWindowEx(), instantiate the unmanaged class and ask it to use DirectShow and hook it to our newly created window.  I override DestroyWindowCore to clean up the unmanaged class and our created window.  The ArrangeOverride method also needs to be overridden (terminology meltdown!) so that we know the final size of the control when it is rendered to the screen, as it’s not known on the call to BuildWindowCore.  That, plus methods to start and stop the preview, is all that’s in the managed class.

The unmanaged class exposes a setup method that will (DirectShow terminology warning) enumerate for video input class devices, create an instance of a device filter, add it to a graph and render its output pin.  It then queries the graph for its COM interface to configure the video output and it’s at this point we can return to hooking in with WPF.  This method was passed the window handle of our created window and we can configure DirectShow to render within the window area (i.e. our new WPF control).  This class also exposes a method to do the final sizing of the video output and this is called from the ArrangeOverride method mentioned above.  Some further methods to start, stop and cleanup and this class is done.

But does it work?  Yes, it seems to work well.  The end result is a control we can place in our WPF application that will show the webcam output, nice!  One thing to watch: the order in which you configure window properties as part of the creation process has an effect on Aero Glass.  The Desktop Window Manager service controlling Aero Glass will fall over (with event log details to tell you that it did) if the order is incorrect.

That’s all well and good, I hear you all cry, but there’s no code!  You could have made all this up!  A fair point and my answer is that including code snippets would have made writing this up much more complicated and even longer.  When I can find a spot to host the sample code I’ll make it available if people are interested and you can pick over it at your leisure.

 

I've set up the blog account and I'm sitting here trying to create this first post.  As someone working in technology and software development I had high expectations that every post would be a beautifully crafted, innovative example of applying best practice to a ground breaking technology.  The fact that the first post is going to be like a thousand other ‘Hello World’ first posts is a bit disappointing.

So what is this blog going to be about?  I’m interested in technologies that move pixels around the screen and get sound to come out of the speakers.  This blog aims to expose my wanderings through the technologies that relate to this area, to anyone who’s interested.

 