
Why are the Multimedia Timer APIs (timeSetEvent) not as accurate as I would expect?

The Multimedia Timer APIs (MM Timer APIs) get their high accuracy by using the Programmable Interrupt Controller (PIC) built into the hardware on the machine. By default Windows specifies a PIC duration of about 10 – 16 milliseconds. Every time the PIC fires, the operating system kernel “wakes up”. Any executing user-mode threads are suspended and the kernel scheduler kicks in to determine which thread should be executed next. You can use the MM Timer APIs (timeBeginPeriod) to change the resolution of the PIC. That’s right: an unassuming user-mode function can radically alter how the OS goes about its business.

 

As I mentioned earlier, the default PIC resolution for the OS when it starts up is around 16 milliseconds. Let’s say that you set a periodic timer to fire every 5 milliseconds. With the PIC set at 16 milliseconds, you will only be alerted every 16 milliseconds (at best). This level of accuracy is good enough for most applications. However, for time-critical applications such as audio and video playback, this resolution just is not good enough.
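The effect of the tick interval on a short timer can be sketched with a little arithmetic. The following is a toy model, not anything Windows actually runs: it assumes the OS only notices an expired timer on the next tick boundary and uses whole milliseconds (16 rather than the real, slightly smaller default). The function names are made up for illustration.

```python
def next_tick_ms(due_ms: int, tick_ms: int) -> int:
    """Return the first tick boundary at or after due_ms (integer ceiling)."""
    return -(-due_ms // tick_ms) * tick_ms


def noticed_times_ms(period_ms: int, tick_ms: int, count: int) -> list:
    """Times at which the first `count` expirations of a periodic timer
    would be noticed if the OS only looks at timers on tick boundaries."""
    return [next_tick_ms(k * period_ms, tick_ms) for k in range(1, count + 1)]


if __name__ == "__main__":
    # 5 ms timer with a 16 ms tick: several expirations collapse onto one tick.
    print(noticed_times_ms(5, 16, 6))
    # Same timer with a 1 ms tick: every expiration lands on its own tick.
    print(noticed_times_ms(5, 1, 6))
```

In this model the 5 ms timer is only ever serviced at multiples of 16 ms, which is the "every 16 milliseconds (at best)" behavior described above.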

 

The MM Timer APIs allow the developer to reprogram the Programmable Interrupt Controller (PIC) on the machine. You can specify the new timer resolution. Typically, we will set this to 1 millisecond. This is the maximum resolution of the timer. We can’t get sub-millisecond accuracy. The effect of this reprogramming of the PIC is to cause the OS to wake up more often. This increases the chances that our application will be notified by the operating system at the time we specified. I say “increases the chances” because we still can’t guarantee that we will actually receive the notification, even though the OS woke up when we told it to.

 

Remember that the PIC is used to wake up the OS so that it can decide what thread should be run next. The OS uses some very complex rules to determine what thread gets to occupy the processor next. Two of the things that the OS looks at to determine if it should run a thread or not are thread priority and thread quantum. Thread priority is easy. The higher the thread’s priority the more likely the OS will be to schedule it next. The thread quantum is a bit more complex. The thread’s quantum is the maximum amount of time the thread can run before another thread has a chance to occupy the processor. By default the thread quantum on Windows NT based systems is about 100 milliseconds. This means that a thread can “hog” the CPU for up to 100 milliseconds before another thread has a chance to be scheduled and actually execute.  

 

Here is an example. Let’s say that we reprogrammed the PIC to fire every 1 millisecond (timeBeginPeriod). We also set up a periodic timer to fire every 1 millisecond (timeSetEvent). We expect that exactly every one millisecond the OS will alert our application and allow us to do some processing. If all of the virtual stars are aligned just right, then we get our callback once every millisecond as we expect. In reality we get called after 100 milliseconds and receive 10 timer messages in rapid succession. Why is this? This is what probably happened. The OS decided that there was a thread with a higher priority than the MM Timer thread, and that thread was scheduled ahead of us. This thread must have been really busy. The OS decided to continue to schedule it until its entire quantum was used up. Once the thread’s quantum was used up, the OS was forced to schedule a different thread. We were lucky enough that the next thread that got scheduled was our timer thread.

 

Even though we reprogrammed the PIC the OS will make decisions behind our back and can cause our timer callback to get delayed. We have no way to change the thread quantum that the OS assigns at startup (that I know of). This is just the way the OS works. There are no workarounds for this issue. Luckily, this is the worst-case scenario. It rarely happens in the real world. The MM Timer callbacks tend to occur about when we expect. Typically, we can see the timer delayed between 1 – 20 milliseconds. The actual delay will depend on what else the OS is trying to do during the timer callback. It’s rare to see the timer delayed more than about 20 milliseconds but it can certainly happen.

 

I just want to point out that there are side effects to using the timeBeginPeriod API. To quote from Larry Osterman’s blog: “[Calling timeBeginPeriod and timeEndPeriod] …has a number of side effects - it increases the responsiveness of the system to periodic events (when event timeouts occur at a higher resolution, they expire closer to their intended time).  But that increased responsiveness comes at a cost - since the system scheduler is running more often, the system spends more time scheduling tasks, context switching, etc.  This can ultimately reduce overall system performance, since every clock cycle the system is processing "system stuff" is a clock cycle that isn't being spent running your application.”

 

I hope this helps to explain the inherent problems with using the Multimedia Timer APIs and the reason why you aren’t getting the accuracy that you would expect.

Here is a consolidated list of changes that we can expect for DirectShow in Windows 7.

Potentially Breaking Change (*IMPORTANT*):

Intelligent Connect adds a level of indirection on Win 7 that may have a major impact on DirectShow developers. Starting in Windows 7, DirectShow has a list of preferred filters for certain media subtypes. If there is a preferred filter for the media type that is being rendered, the Filter Graph Manager tries that filter first. An application can modify the list of preferred filters by using the IAMPluginControl interface. Changes to the list affect the application's current process, and are discarded after the process ends. Also in Win 7, DirectShow has a list of blocked filters for certain media subtypes. The Filter Graph Manager skips filters on this list. An application can modify the list of blocked filters by using the IAMPluginControl interface. Changes to this list affect the application's current process, and are discarded after the process ends. See the IAMPluginControl interface entry below for more detailed information.

 

Updated filters:

Microsoft MPEG-1/DD/AAC Audio Decoder (Microsoft DTV-DVD Audio Decoder) – This filter is compatible with almost all MPEG audio formats available today including MPEG-1 and MPEG-2 Layer 1 and 2, AAC, DTS and Dolby Digital. The Microsoft implementation of the Dolby Digital technology is restricted under terms of the Dolby Digital licensing program to use by Microsoft applications. This means that, just like on Windows Vista, 3rd party developers still can’t use the filter to decode Dolby Digital content. This makes it very difficult to develop a DVD playback application using MS-only components. Also keep in mind that this filter is not available on IA-64 platforms.

 

Microsoft MPEG-2 Video Decoder (Microsoft DTV-DVD Video Decoder) – This filter is compatible with MPEG-1, 2 and 4 video playback. MPEG 4 is new for Win 7. This decoder supports decoding H.264 (MPEG-4 AVC) content, including content produced by the x264 encoder. MS MPEG 4 variants are not supported by this filter. All of the decoder’s capabilities should be available for use by 3rd party developers. DVD decryption is not currently part of Microsoft’s MPEG-2 decoder implementation. A 3rd party decrypter can plug into the Microsoft MPEG-2 decoder using the “DVD Copy Protection Property Set”.

 

Added Interfaces:

IAMAsyncReaderTimestampScaling – This interface is implemented by pull source filters to enable large file support. Large file support is any media file over 860 GB. This interface is used by filters that implement the IAsyncReader interface. This limits the use of this interface to pull-mode source filters only. The in-box “File Source (Async)” and “File Source (URL)” filters implement this interface by default. Parser filters that connect to the pull-mode source can use this interface to enable large file support. Luckily, if you are just writing a simple file playback application you don’t need to worry about this interface. You only need to use this if you are writing a pull-mode parser / demux filter. I would HIGHLY discourage anyone from trying to write their own pull-mode parser filter. I’ve been doing DirectShow for years and the last time I tried to implement a pull-mode parser I failed miserably. I ran into so many complex threading issues that it just wasn’t worth the effort. Just write your own push source filter and incorporate the parser / demux functionality in the source filter itself.
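The 860 GB figure looks odd until you do the arithmetic. As I understand the IAsyncReader documentation, a pull-mode parser requests data by putting byte offsets into the sample's time stamps, scaled by 10,000,000, and those time stamps are signed 64-bit values. The sketch below only reproduces that arithmetic under this assumption; the function names are hypothetical and it does not call DirectShow.

```python
UNITS = 10_000_000        # assumed scale factor applied to byte offsets
INT64_MAX = 2**63 - 1     # largest value a signed 64-bit time stamp can hold


def max_addressable_bytes(scale: int) -> int:
    """Largest byte offset whose scaled value still fits in a signed 64-bit int."""
    return INT64_MAX // scale


def to_scaled_timestamp(byte_offset: int, scale: int = UNITS) -> int:
    """Scale a byte offset the way a pull-mode request would, refusing overflow."""
    stamp = byte_offset * scale
    if stamp > INT64_MAX:
        raise OverflowError("byte offset does not fit in a scaled 64-bit time stamp")
    return stamp


if __name__ == "__main__":
    limit = max_addressable_bytes(UNITS)
    print(limit, "bytes, about", round(limit / 2**30, 2), "GiB")
```

With the scaling removed (a scale of 1), the same 64-bit field can address vastly larger offsets, which is presumably what the new interface makes possible.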

 

IAMPluginControl – This interface controls the preferred and blocked filter lists for the new intelligent connect functionality added in Win 7 (see “important” above). You can use this interface to add or remove GUIDs from the preferred and blocked lists. This interface allows your application to get around the requirements of the preferred and blocked lists. If you have a specific filter arrangement that your application requires and you see that other filters are being added to the graph unexpectedly, then you will probably need to use this interface and make sure that you update the lists accordingly. The major problem with using this interface is that you CANNOT use it to modify the preferred and blocked lists on a system-wide basis. This interface only allows you to modify the lists for the current process (your process). You can’t modify the lists for other processes. The changes are discarded when your process ends. This can be a real problem if you expect your filter to be loaded into a graph within Windows Media Player (WMP). Your filter will not be on the preferred list and probably won’t get loaded into the WMP graph as you expect. At this time there is no way to change this behavior. There is currently no way to force your filter to be loaded into a graph in an application that you do not control if the graph is satisfied by filters on the preferred list. There is no supported way to modify the preferred list on a system-wide level, since the list is under system file protection. If you attempt to hack the preferred list, you may cause the installer / application that changes the values in the list to be detected as malware and quarantined.
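To make the per-process behavior concrete, here is a toy model of the lists. It is not how the Filter Graph Manager is implemented, and it does not use the real IAMPluginControl methods. It only assumes what is described above: a preferred filter is tried first, blocked filters are skipped, and each process works on its own copy of the system defaults. The subtype key, the filter names and the sample default entry are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessFilterLists:
    """Per-process copy of the preferred and blocked lists (toy model)."""
    preferred: dict = field(default_factory=dict)  # media subtype -> filter name
    blocked: set = field(default_factory=set)      # filter names to skip


# Hypothetical system-wide defaults; a process cannot change these.
SYSTEM_DEFAULTS = ProcessFilterLists(
    preferred={"H264": "Microsoft DTV-DVD Video Decoder"}
)


def new_process_lists() -> ProcessFilterLists:
    """Every process starts from a private copy of the system defaults."""
    return ProcessFilterLists(dict(SYSTEM_DEFAULTS.preferred),
                              set(SYSTEM_DEFAULTS.blocked))


def candidate_order(subtype: str, lists: ProcessFilterLists, registered: list) -> list:
    """Order in which filters would be tried: preferred first, blocked skipped."""
    order = []
    preferred = lists.preferred.get(subtype)
    if preferred is not None:
        order.append(preferred)
    order += [name for name in registered if name != preferred]
    return [name for name in order if name not in lists.blocked]
```

In this model, setting `mine.preferred["H264"]` in one process changes the order only for that process; a second call to `new_process_lists()` (standing in for WMP or any other application) still sees the system default first.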

 

Intelligent Connect in Windows 7

http://msdn.microsoft.com/en-us/library/dd390342(VS.85).aspx

 

Microsoft MPEG-1/DD/AAC Audio Decoder

http://msdn.microsoft.com/en-us/library/dd390676(VS.85).aspx

 

Microsoft MPEG-2 Video Decoder

http://msdn.microsoft.com/en-us/library/dd390679(VS.85).aspx

 

H.264 Video Types

http://msdn.microsoft.com/en-us/library/dd757808(VS.85).aspx

 

DVD Copy Protection Property Set

http://msdn.microsoft.com/en-us/library/dd388585(VS.85).aspx

 

IAMAsyncReaderTimestampScaling Interface

http://msdn.microsoft.com/en-us/library/dd389120(VS.85).aspx

 

IAMPluginControl Interface

http://msdn.microsoft.com/en-us/library/dd319756(VS.85).aspx

Activating an audio endpoint when not connected causes the device to report incorrect capabilities

The other day we had a major OEM report an issue with WASAPI. In particular, they found that when they called “IAudioEndpointVolume::QueryHardwareSupport” it would occasionally return unexpected results for USB webcam devices. Most of the time calling this method would return a full list of capabilities for the device. Occasionally this method reported that the device did not have any capabilities.

 

Initially I wasn’t able to reproduce the problem. I tried everything I could think of including re-writing their test application. Finally, they were able to identify a set of steps that reproduced the issue every time. They sent me two separate applications. The first one enumerated all of the endpoints on the system and called “IMMDevice::Activate” on each one. The second application called the “IAudioEndpointVolume::QueryHardwareSupport” method.

 

Here are the steps:

1)  Unplug the webcam device and reboot the machine.

2)  Run the first application, calling “IMMDevice::Activate” for all available endpoints / devices.

3)  Plug in the web cam device.

4)  Run the second application calling “IAudioEndpointVolume::QueryHardwareSupport”.

 

I would expect that “QueryHardwareSupport” would return “3”. In this very specific case I saw a return of “0”.
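The value that “QueryHardwareSupport” hands back is a bit mask, so “3” and “0” are easier to read once decoded. The sketch below uses the flag values as I remember them from endpointvolume.h; treat them as an assumption and check them against your SDK headers. The helper function is hypothetical.

```python
# Flag values assumed from endpointvolume.h; verify against the SDK header.
ENDPOINT_HARDWARE_SUPPORT_VOLUME = 0x1
ENDPOINT_HARDWARE_SUPPORT_MUTE = 0x2
ENDPOINT_HARDWARE_SUPPORT_METER = 0x4

_FLAG_NAMES = [
    (ENDPOINT_HARDWARE_SUPPORT_VOLUME, "volume"),
    (ENDPOINT_HARDWARE_SUPPORT_MUTE, "mute"),
    (ENDPOINT_HARDWARE_SUPPORT_METER, "meter"),
]


def describe_hardware_support(mask: int) -> list:
    """Return the names of the capabilities whose bits are set in the mask."""
    return [name for bit, name in _FLAG_NAMES if mask & bit]


if __name__ == "__main__":
    print(describe_hardware_support(3))  # the expected result in this case
    print(describe_hardware_support(0))  # the unexpected result
```

Under these values a result of 3 means hardware volume and mute controls, while 0 means the endpoint claims no hardware controls at all.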

 

I looked very closely at the application that called “Activate”. I noticed that “DEVICE_STATEMASK_ALL” was specified when the application was enumerating the devices / endpoints. The application then activated each of the devices / endpoints returned in the enumeration. Every activation succeeded without error.

 

The problem suddenly became clear; you should not attempt to activate a device / endpoint that is not attached to the system. Attempting to do so will invalidate the endpoint. The underlying WASAPI code appears to query the HW for its capabilities when you call “IMMDevice::Activate”. Since the HW is not present, the driver returns that the device has no capabilities. This has the effect of invalidating the device. This invalidated state is cached by the subsystem. When you plug in a device, “IAudioEndpointVolume::QueryHardwareSupport” returns the cached capabilities.

 

Bottom line, don’t use “DEVICE_STATEMASK_ALL” if you are going to activate the enumerated endpoints. You can get just the active endpoints by using “DEVICE_STATE_ACTIVE”. If for some reason you want to activate a jack that is unplugged, you can OR “DEVICE_STATE_ACTIVE” with “DEVICE_STATE_UNPLUGGED”. Try to avoid using “DEVICE_STATE_DISABLED”, “DEVICE_STATE_NOTPRESENT” and “DEVICE_STATEMASK_ALL” if you are going to call “IMMDevice::Activate” on the enumerated endpoints.
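The state mask is plain bit arithmetic, which the following sketch models without touching the real audio APIs. The constant values are the ones I recall from mmdeviceapi.h (an assumption worth verifying), and the endpoint list and helper function are invented for the example.

```python
# State flags assumed from mmdeviceapi.h; verify against the SDK header.
DEVICE_STATE_ACTIVE = 0x1
DEVICE_STATE_DISABLED = 0x2
DEVICE_STATE_NOTPRESENT = 0x4
DEVICE_STATE_UNPLUGGED = 0x8
DEVICE_STATEMASK_ALL = 0xF


def endpoints_matching(endpoints, state_mask=DEVICE_STATE_ACTIVE):
    """Model of enumeration: keep the endpoints whose state is in the mask.

    `endpoints` is a list of (name, state) pairs; both are hypothetical data.
    """
    return [name for name, state in endpoints if state & state_mask]


# Hypothetical machine: the webcam microphone is not attached.
ENDPOINTS = [
    ("Speakers", DEVICE_STATE_ACTIVE),
    ("Webcam microphone", DEVICE_STATE_NOTPRESENT),
    ("Line in", DEVICE_STATE_UNPLUGGED),
    ("Old sound card", DEVICE_STATE_DISABLED),
]

if __name__ == "__main__":
    print(endpoints_matching(ENDPOINTS))
    print(endpoints_matching(ENDPOINTS, DEVICE_STATE_ACTIVE | DEVICE_STATE_UNPLUGGED))
    print(endpoints_matching(ENDPOINTS, DEVICE_STATEMASK_ALL))
```

Only the last call returns the absent webcam microphone, which is exactly the endpoint that should not be passed to “IMMDevice::Activate”.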

Exception playing .mp3 & .wma files using embedded WMP from .hta files

We recently had an engineer report a strange exception when displaying .hta files via mshta.exe. If you embed the Windows Media Player OCX control (WMP) in a web page and then start playback of a file, you will get an exception like the following:

Windows - Bad Image

Exception Processing Message 0xc000007b Parameter 0x760892A0

0x760892A0 0x760892A0 0x760892A0

This issue always occurs on the 64 bit version of Windows Vista. It can also occur on the 32 bit version of Vista when certain older versions of firewall / anti-virus software are installed or if the machine is infected with certain viruses.

It took us a while to track this one down. Eventually we found that the exception is being caused by the MFPMP.exe process. This process is the secure audio and video playback process for WMP on Vista. The interesting thing is that WMP always uses MFPMP for all WMA, WMV, ASF and MP3 content, regardless of whether it is protected with DRM. I talked to the guys on the WMP team about this. They said that they wanted to have a different playback path for non-DRM encoded files but decided that it was just much easier to have everything go through MFPMP.

The biggest challenge in keeping DRM content safe is finding ways to keep attackers from reading the decryption keys directly out of memory. Every DRM system has suffered from this attack vector, from Microsoft to Apple and everyone in-between. MFPMP uses the “Protected Environment” or “Secure Process” (PE) APIs that were added in Vista. These APIs allow the MFPMP process to be isolated. A process that is protected in this way cannot be debugged. You can’t connect a debugger to the process and peer into its inner workings.

Another attack that is used to read data out of a process is known as “DLL injection”. In this type of attack, rather than attaching a debugger, the attacker finds a way to get their malicious DLL to run in the context of the process to be attacked. This attack takes many forms. MFPMP, via the PE, protects against this attack by requiring that all DLLs that run within the process be “signed”. Signing is a way to watermark a DLL. When the MFPMP process reads in a new DLL it checks for this digital watermark or signature. If the signature is determined to be valid, the DLL is loaded into the process. If the signature is not valid, then MFPMP doesn’t allow the DLL to be loaded.

MFPMP’s rejection of un-trusted DLLs causes the error. In this case mshta.exe loads an .hta file. This file contains an embedded WMP OCX control. This control loads a standard ASF file. The WMP OCX control then attempts to play this file by starting up an instance of MFPMP.exe. While MFPMP.exe is starting up it attempts to load a DLL that is not properly signed. MFPMP.exe raises a “critical error” (Bad Image). This error is handled by the default error handler in the operating system (OS). This exception handler displays a dialog box that shows the “bad image” message.

The interesting thing is that if you use the same .hta file in Internet Explorer (IE) you don’t see the exception. The exact same order of operations holds true for IE as it does for MSHTA.exe. The main difference is that IE has code that explicitly suppresses the default error handler in the OS (SetErrorMode(SEM_FAILCRITICALERRORS)). MSHTA.exe does not contain this code and there are no plans to add it.

You can confirm that you are experiencing this issue by going to Event Viewer > Windows Logs > Security. Search the list for an entry labeled "Audit Failure". This entry should have the following message text:

Code integrity determined that the image hash of a file is not valid.  The file could be corrupt due to unauthorized modification or the invalid hash could indicate a potential disk device error.

File Name: xxxx

The “file name” section contains the name of the DLL that is being rejected by the MFPMP process. On Vista 64 this file might be named something like “wow64.dll”. This is an operating system file. On the 32 bit Vista OS this file will have the name of a 3rd party DLL. A number of older anti-virus and firewall applications use undocumented APIs to inject their DLLs into every process on the system. Since these 3rd party DLLs are not properly signed, MFPMP rejects them.

To resolve this issue on 32 bit versions of Vista you need to get the latest version of your anti-virus or firewall application. If upgrading does not fix the problem you may need to uninstall the 3rd party application. Or you can call your application’s vendor and ask them to stop using undocumented and potentially dangerous APIs. On Vista 64 there are no plans to properly sign wow64.dll to prevent this issue. We hope (although it has not been confirmed) that this issue will be addressed before Windows 7 ships.

Playing an MPEG1 file in WMP via HTTP returns the wrong duration

A colleague of mine in Japan recently found this very interesting issue in Windows Media Player (WMP) on Windows XP. From what he found WMP returns the wrong duration value when playing back MPEG1 files via HTTP. This only appears to occur if the file has not been cached locally by the player. This issue can cause WMP to stop playing content before it has reached the end.

 

For example:

  - Expected duration: 1 min 20 sec

  - Actual duration: 50 sec

 

On XP, when WMP tries to play an MPEG1 file using HTTP and progressive download, it uses the old DirectShow “URL Async File Source” filter. To determine the length of the file WMP queries DShow for the duration. The “MPEG1 Splitter” filter communicates with the “URL Async File Source” filter, requesting the duration. The “URL Async File Source” filter calculates the duration based on the amount of data that it has currently downloaded. As you can imagine, the actual amount of the file that has been downloaded when the source filter is queried could vary widely. The “MPEG1 Splitter” filter recognizes that the duration being returned isn’t quite right. The filter then tries to calculate the duration based on information in the System Pack Header. This calculation will always return a duration that is shorter than the full length of the file.

On Vista we use a totally different way of calculating the duration in this scenario. Since the HTTP source filter on Vista supports “seeking” over HTTP, it is possible to calculate the total duration. The “MPEG1 Splitter” filter sends a command to the source filter to seek to the end of the file. We then read the time stamp information in the last few PES packets at the end of the file. This yields the actual duration of the file.
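The time-stamp approach boils down to a subtraction. The sketch below is not the splitter's code; it assumes the standard MPEG system-layer presentation time stamp (a 33-bit counter running at 90 kHz) and a hypothetical helper that is handed the first and last PTS values already parsed from the stream.

```python
PTS_CLOCK_HZ = 90_000   # MPEG presentation time stamps tick at 90 kHz
PTS_WRAP = 1 << 33      # PTS is a 33-bit counter, so it wraps around


def duration_seconds(first_pts: int, last_pts: int) -> float:
    """Duration between two PTS values, tolerating a single 33-bit wrap.

    Ignores the display time of the final frame, so it slightly
    underestimates the true duration.
    """
    return ((last_pts - first_pts) % PTS_WRAP) / PTS_CLOCK_HZ


if __name__ == "__main__":
    # A stream whose last time stamp is 80 seconds after its first one.
    print(duration_seconds(0, 80 * PTS_CLOCK_HZ))
```

The catch, as described above, is that reading the last time stamp requires seeking to the end of the file, which the XP source filter cannot do over HTTP.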

The main reason for all this strangeness is that MPEG1 and MPEG2 are streaming formats (remember, progressive download is not the same as streaming). They are just not designed to be file archive containers. That’s why things like Video CD and DVD wrap the MPEG streaming format and add things like indexes to enable seeking. Of course, that doesn’t stop people trying to use them as file formats. The only way to be sure of getting the proper duration is to pre-scan the entire file before we start playback. Without an index this is a very expensive operation (O(n) in the size of the file). This adds a major delay before we start playing the file. Because of this, a design decision was made that we would just have to deal with incorrect duration values.

Calling the Format SDK, DirectShow, Media Foundation or the WASAPI from managed code (C#, VB.net)

I get this series of questions from different developers from around the world at least once a week.

Q. I want to use DirectShow (Windows Media Format SDK, WASAPIs) from my C# (VB.net, managed) code. Why doesn’t Microsoft have a COM interop library that I can use? Why do I have to rely on a 3rd party library to be able to do this?

A. The answer is “nondeterministic finalization”. Just to be clear, calling any time-dependent method in the Format SDK, DirectShow or the WASAPI is not supported from managed code. In other words, if you try to integrate any of these technologies into your managed application via one of the many 3rd party libraries out there, and you run into problems, you are on your own. Microsoft will not be able to help you unless you can reproduce the problem in unmanaged (standard C++, not C++ CLI) code.

 

Q. OK, so why exactly?

A. Again, the answer is “nondeterministic finalization”, or the lack of deterministic finalization in the CLR. As we all know, the best feature of any managed code environment from .net to Java is the fact that we never have to worry about cleaning up after ourselves. We are given a “maid” that comes around after us and cleans up our mess. In terms of managed code or the CLR, this “maid” is called the “garbage collector” (GC). The GC checks to see if we have any objects that we are no longer using. If it finds an object or two that are no longer in use, it releases them and their associated memory. Because the GC determines when we are finally done with an object, rather than our code explicitly releasing the object, we say that any environment that uses a GC is “nondeterministic”. We never know when the objects we are no longer using will actually be deleted. Unmanaged C++, on the other hand, is “deterministic”. In order for us to avoid memory leaks we have to carefully manage the creation and deletion of each object that we instantiate on the heap.

 

Q. OK, so now I understand “nondeterministic finalization” how does that relate to the Format SDK, DirectShow or the WASAPIs?

A. What do all of these APIs have in common? They all take data as input, process it and schedule it for output at a specific time. In the case of the Format SDK, the IWMWriter takes audio and video data as input and then sends the data over the network to a client (or WMS) for playback. In order for playback to continue smoothly on the client, the data must be sent at a steady rate. In order for us to play back media at, say, 30 frames per second, we have to deliver our data at this rate. If we fall behind this rate, we run the risk of a “buffer under run”. If we go over this rate, we run the risk of a “buffer over run”. We need to do whatever we can to avoid these. So we have a large buffer on the client and boost the thread priority of the “playback” thread to make sure nothing interrupts us while we are trying to output our data. I’m sure you have experienced glitching HD audio and video, on a slow machine, when you launch a big application (like MS Word) during playback. Launching a large application takes a lot of the CPU’s time. Since we are stealing the CPU away from our playback application, the application just can’t deliver the data on time and so we under run the buffer. We get clicks and pops in the audio and dropped frames in the video.


Q. It’s obvious why low CPU resources can cause audio and video glitching, what does this have to do with “nondeterministic finalization”, the GC and calling multimedia APIs from managed code?

A. We don’t support multimedia APIs from managed code for the same reason that starting or running CPU hungry applications can cause glitching in unmanaged applications. What happens when the GC runs in your managed application? The GC code goes through the objects within the generation(s) that it is collecting. Objects that no longer have an outstanding reference get released. To keep the GC from getting confused while it is looking for objects to release, we have to freeze all of the threads within the managed portion of the application. In other words, when the GC runs, your entire application is put on pause. This is just like what happens when we have a low CPU resource problem. We can’t process data fast enough to be able to play it back on schedule. If the GC runs for too long, we can completely under run our output buffer. When this happens we get glitching audio and video. If we get too far behind, the multimedia API will give us an error.

 

Q. So if I understand this correctly the problem only happens when the GC runs. If the GC doesn’t take very long to run then we should be OK, right?

A. Correct, if we can guarantee that the GC will run for less than 1/30th of a second (minus our processing time, so more like 1/60th of a second) then everything should work as expected. However, we can’t determine when the GC is going to run or how long it is going to take to run. If we are collecting one of the later generations and we have lots of objects per generation it could take a very long time for the GC to complete its collection. Keep in mind that during its collection the GC may be interrupted by other applications (CPU contention) causing the GC to take even longer to complete its collection. If the GC takes more than 1/60th of a second we will get a dropped frame or audible clicks and pops in the audio.
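The frame-budget arithmetic above can be written down directly. This is a deliberately simple model with hypothetical function names: it assumes no output buffering at all (real players buffer, so they tolerate more), a fixed per-frame processing time, and a single pause that begins at the start of a frame slot.

```python
import math
from fractions import Fraction


def pause_slack(fps: int, processing: Fraction) -> Fraction:
    """Longest pause (in seconds) that still lets one frame meet its deadline."""
    return Fraction(1, fps) - processing


def missed_deadlines(pause: Fraction, fps: int, processing: Fraction) -> int:
    """Number of frame deadlines missed because of a single pause."""
    frame = Fraction(1, fps)
    overrun = pause - pause_slack(fps, processing)
    if overrun <= 0:
        return 0
    return math.ceil(overrun / frame)


if __name__ == "__main__":
    processing = Fraction(1, 60)          # assume half the frame time is our own work
    print(pause_slack(30, processing))    # the slack left over for a GC pause
    for pause in (Fraction(1, 100), Fraction(1, 20), Fraction(1, 5)):
        print(float(pause), "s pause ->", missed_deadlines(pause, 30, processing), "missed")
```

At 30 frames per second with half of each frame spent on processing, the slack is 1/60th of a second, matching the figure in the answer above; anything longer starts costing frames.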

 

Q. So can I keep the GC from running and causing this problem?

A. Not really. Currently there is no good way to control when the GC will run or how long it will take to run. If we could do this consistently and effectively, we could minimize the effect of “nondeterministic finalization” on the system. Unfortunately this functionality is not built into the CLR. We can force the GC to run, but we can’t keep it from running and we can’t readily predict when it is going to run. There is a known technique for micro-managing the managed heap that can allow the GC to be controlled with some accuracy. However, just because you can do something doesn’t mean that you should.

 

Q. But… Microsoft shipped Managed DirectSound, how is that any different than the WASAPIs?

A. Managed DirectSound has been deprecated. It has basically been abandoned. It was determined early on that while the intention was good, the performance was not, depending on the complexity of your application.

 

Q. How about the new media stack in Silverlight 3?

A. Don’t get me started. I have plans to blog about this new functionality as soon as SL3 is officially released.

 

Q. How about the managed code sample in the Windows Media Format SDK? Doesn’t this indicate that Microsoft is willing to support the Format SDK from managed code?

A. Yes and no. Remember, the real problem here is with timely delivery of time-sensitive data. Some of the Format SDK APIs are not time sensitive, such as the APIs that query ASF file headers. These APIs can safely be called from managed code. Because of this, a decision was made to ship a managed sample to demonstrate which APIs could safely be called from managed code. That is why we don’t see any interop code for the IWMReader or IWMWriter. They have very time-sensitive APIs associated with them.

 

Q. So does Microsoft have any plans to make these multimedia APIs available from managed code?

A. There are currently no plans to port any of these multimedia APIs to managed code. Keep in mind that nondeterministic finalization is not a bug, it’s a feature. This is not a limitation in the multimedia APIs in question but rather a fundamental design feature of all managed languages. The best we can hope for is some way to control the inner workings of the GC. Maybe we can get some additional extensions that will allow us to more closely mimic the deterministic nature of unmanaged C++.  Also remember that there is a big performance difference between .net and unmanaged C++. The current overall performance of the CLR doesn’t really allow complex codec creation or DSP implementation in managed code. As clock speeds get faster and multiprocessor techniques get better, we might get there but I don’t expect it to be any time soon.

Using the Format SDK to push to multiple publishing points on the same server

I have been working with an engineer for weeks trying to figure out why we were not able to push to multiple publishing points on the same server using the Windows Media Format SDK. I knew that there had been a bug in the Format SDK that prevented multiple push sinks from working. That bug was fixed a couple of years back (KB 943335) and the fix is included in the latest version of the Format Runtime.

If you enabled the registry setting described in (943335), you could push to two publishing points at the same time. If you added a third push sink, the Format SDK application appeared to hang. One of the three publishing points started but then they never received data from the application. The server was just sitting there waiting for data, and the client was completely hung up.

While debugging into the Windows sources I could confirm that the fix was working as expected. The fix basically allowed WinInet sessions to be spawned as needed for additional push sinks, giving each session a unique ID via an HTTP cookie. Typically you would expect one session per push sink. The Format SDK doesn’t spawn a new WinInet session for each sink right away. Complex logic is used to determine when to create a new WinInet session. The code was modified so that each push sink session got its own cookie to identify the push sink within the newly spawned session. Again, I could confirm that this code was functioning correctly.

Additional debugging showed that the Format SDK code was doing the right thing. However, it was just sitting there waiting for WinInet to confirm a “write” operation. This confirmation was never received. Since this section of the Format SDK code was a single threaded state machine, confirmation from WinInet that the data was written was required before additional data could be sent.

Because of this we were able to narrow the problem down to WinInet not playing well with others. But why would this component allow a third session to be created but not allow data to be written to this session? After talking to some of the resident WinInet experts, I learned that WinInet has a fundamental limitation of two simultaneous sessions. At this point the light went on. We knew that IE had a limitation of two download sessions but had no idea that this echoed into the world of WinInet.

We found that WinInet has a registry key that can be set to the maximum number of per process sessions (KB 282402). After modifying this registry key to equal the number of push sinks configured in our application, lo-and-behold everything started working as expected. No more hang and all publishing points started getting the data.

The question that I asked was why the original KB 943335, which describes the multiple push sink fix, did not include the information about the WinInet limitation. The answer that I got was, “please update the KB”. So I’m going to go ahead and put in a request for the change. It probably won’t get updated right away, so I thought I should get a blog post out about this issue while we wait for the update.

 

To allow more than two push sinks to be enabled within a single process you need to set the following registry values:

HKCU\Software\Microsoft\Windows Media\WMSDK\General\MultiplePushToSameServer (DWORD)

Set this value to “1”.

 

HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings\MaxConnectionsPerServer (DWORD)

Set this value to the maximum number of publishing points that you will be pushing to, for example “7”.
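As a rough illustration, the two values above could be captured in a .reg file. The Python sketch below only builds the text of such a file; the helper name `build_push_sink_reg` is made up for this example, and it assumes the standard .reg text format in which DWORD values are written as eight hex digits. Review the output before importing it with regedit.

```python
def build_push_sink_reg(max_push_sinks: int) -> str:
    """Return .reg file text for the two values described above.

    max_push_sinks is the number of publishing points the process
    will push to (for example 7). Hypothetical helper, not part of any SDK.
    """
    if max_push_sinks < 1:
        raise ValueError("max_push_sinks must be at least 1")

    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        # Enables pushing to multiple publishing points (KB 943335).
        r"[HKEY_CURRENT_USER\Software\Microsoft\Windows Media\WMSDK\General]",
        '"MultiplePushToSameServer"=dword:00000001',
        "",
        # Raises the WinInet connection limit (KB 282402).
        r"[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings]",
        f'"MaxConnectionsPerServer"=dword:{max_push_sinks:08x}',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_push_sink_reg(7))
```

Note that the DWORD is written in hexadecimal, so a limit of 16 push sinks shows up as `dword:00000010`.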

 

Reference:

FIX: Encoder objects try to create multiple push publishing points on a Windows Media server from a single process

http://support.microsoft.com/kb/943335

How do I configure Internet Explorer to download more than two files at one time?

http://support.microsoft.com/kb/282402

 

 

Where the heck did Microsoft hide the WM Format SDK x64 libraries for Vista? How about XP?

It looks like the 64-bit versions of the Windows Media Format SDK .lib static library files are not where you would expect to find them. If you go to the Windows Media SDK downloads page, there is a happy little link for the Windows Media Format 11 SDK (link below). This link does not say which platforms the download supports. Because of this, the natural assumption is that the link will let you download the SDK for all platforms. I got caught in this trap too. Unfortunately this is not the case.

The link on the Windows Media download page is for XP SP2 / SP3 only. This download is not intended to be used on Windows Vista. Luckily, if you accidentally download this version of the SDK, install it on Vista and use it to build your application, you are safe as long as you are building against x86. If you want to target x64 on Vista, this download is not going to work for you. In other words, if you are building against the WM Format SDK on Vista, don’t use the Windows Media Format 11 SDK link on the Windows Media download page. This download is intended for XP SP2 / SP3 only.

Question: So what do I use if I want to build against the Windows Media Format 11 SDK on Vista? Answer: The Windows SDK for Windows Server 2008 and .NET Framework 3.5 (link below).

It looks like we have bundled all of the WM Format SDK libraries into the latest version of the Windows SDK (formerly Platform SDK). For whatever reason the information on the Windows Media download page does not mention this. According to the documentation on this download page the WM Format components are not included in the Windows SDK. The good news is that the Windows SDK is now a “one stop shop” for all your Windows Media needs.

 

Windows Media Format 11 SDK for XP x86 & x64

http://download.microsoft.com/download/a/c/3/ac367925-39e7-4451-a175-a224f94fbdce/wmformat11sdk.exe

Windows SDK for Windows Server 2008 and .NET Framework 3.5

http://www.microsoft.com/downloads/details.aspx?familyid=F26B1AA4-741A-433A-9BE5-FA919850BDBF&displaylang=en

DirectShow Filter vs. DMO - Ready... Fight!

There are some rather interesting misunderstandings surrounding DirectX Media Objects (DMOs). I talked to one engineer the other day who called DMOs inferior to DirectShow filters. While I personally think that DirectShow is one of the most amazing technologies of all time, times are changing.

 

From what I understand, the DMO actually got its start with the DirectSound team. They wanted to integrate digital signal processing (DSP) plug-ins into their framework but didn’t want the overhead of using the full DirectShow implementation. The DMO was born.

 

You can think of the DMO as a lightweight version of the DirectShow filter. If you write a DirectShow filter you can pretty much only use it within the DirectShow framework. It’s extremely difficult to wrap a DirectShow filter to get it to work outside of the framework. DMOs on the other hand are designed to be used wherever needed. You can simply chain them together, feed them data and watch them do their thing. No framework, no overhead, just pure DSP. Because of this they have become the DSP architecture of choice here at Microsoft.

 

The DMO API is now considered the replacement for the very restrictive DirectShow filter paradigm. In fact, the new Media Foundation Transform (MFT) API is just an extension of the standard DMO APIs.

 

The great thing about DMOs is that if you write a simple DMO rather than a DirectShow filter, you are going to have a DSP that will work, with minimal effort, on all three media platforms on Windows: DirectShow, Windows Media Format and Media Foundation.

 

So the next time you need to write a DSP, take a serious look at the DMO.

Rendering MMS, RTSP and HTTP with DirectShow in Vista and Windows 7

I was talking to a DirectShow engineer the other day who was having issues rendering URLs using DirectShow on XP and Vista. He noticed that the behavior was very different between the two operating systems. Luckily I was privy to a very interesting conversation between the original developer of the WM ASF Writer filter and a number of PMs who were trying to figure out why rendering URLs appeared to be broken on Vista.

 

From what I learned from the conversation, when you render a URL in DirectShow on XP, the source filter that is added to the graph is from WMP 6.4. This filter is very old and has not been updated in a long while. In fact this filter no longer ships on Vista and will not ship on Win 7.

 

There are some interesting distinctions between different URL sources. We basically have four different URL protocols supported by the Windows Media subsystem: MMS, RTSP and two flavors of HTTP, namely standard HTTP (from a web server) and HTTP streaming. On Vista, if you try to render a stream in DirectShow using MMS or RTSP you will get an error stating that no source filter could be found. This is due to the fact that these two protocols are not registered by default. HTTP (both flavors), however, will work as expected since this protocol is registered.

 

At this time there is no good solution for rendering URLs on Vista or Win 7 in DirectShow. Since the WMP 6.4 components are missing, the only filter that could possibly handle MMS, RTSP or HTTP for ASF formatted files would be the WM ASF Reader. Attempts to get the WM ASF Reader to render these protocols on Vista and Win 7 by default have met with resistance here at Microsoft. The WM ASF Reader was not designed to work with a URL as a source (although it is possible to force it to load from a URL). It was determined early on that getting the WM ASF Reader filter to work properly as a URL source would be too costly.

 

All current development effort here at Microsoft in the media space has been put into the new Media Foundation (MF) platform. This new platform is the planned replacement for the aging DirectShow. Unfortunately MF doesn't yet have full parity with DirectShow. Windows 7 improves the MF platform greatly and it is my hope that we will finally have a ready replacement for DirectShow in Windows 7.

 

“MediaElement.SetSource” throws “AE_INVALID_FORMAT_ERROR” when setting “Stream” sources

When you read a stream from a location such as a media resource object and then pass it to “MediaElement.SetSource”, the call may return “AE_INVALID_FORMAT_ERROR”. If you construct a “MemoryStream” from your “Stream” object (reading all bytes and writing them to a “MemoryStream”), then “SetSource” works fine.

 

This issue may occur when the “Stream” object is not seekable. A “MemoryStream” will always be seekable and will always work with “SetSource”. “SetSource” always expects a seekable “Stream”.
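The workaround boils down to buffering a forward-only stream into memory before handing it off. The sketch below shows that idea in Python rather than in Silverlight code; `ForwardOnlyStream` and `ensure_seekable` are hypothetical names used only for illustration, and `io.BytesIO` plays the role of “MemoryStream”.

```python
import io


class ForwardOnlyStream(io.RawIOBase):
    """Hypothetical stand-in for a source that can be read but not seeked."""

    def __init__(self, data: bytes):
        super().__init__()
        self._inner = io.BytesIO(data)

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return False

    def readinto(self, buffer) -> int:
        chunk = self._inner.read(len(buffer))
        buffer[:len(chunk)] = chunk
        return len(chunk)


def ensure_seekable(stream, chunk_size: int = 64 * 1024):
    """Return the stream itself if seekable, else an in-memory copy of it."""
    if stream.seekable():
        return stream

    memory = io.BytesIO()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes means end of stream
            break
        memory.write(chunk)
    memory.seek(0)  # rewind so the consumer starts at the beginning
    return memory
```

The trade-off is memory: the whole stream is held in RAM, which is fine for short clips but may not be for large media files.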

Posted by MediaSDKStuff | 1 Comments

“Leave audio at remote computer” Doesn't Work in Windows Server 2008

When connecting to Windows Server 2008 through Remote Desktop, choosing to leave the audio on the remote machine does not work as expected. Although the option shows up in the Remote Desktop application, if you select it and then connect, the audio is simply disabled as if you had selected the “do not play” option in the application.

 

This behavior is by design. A server can have more than one active session, so using the “leave at remote computer” option can lead to problems (including privacy problems). Because of this, the server will automatically fall back to the “do not play” option when the “leave at remote computer” option is selected. There are no workarounds for this behavior. The “leave at remote computer” option is available on client SKUs.


Hotfix 929182 not updating Windows Media Encoder 9 components as expected on Vista

I was in training a few weeks ago (WMS 2008 new features) and found that no matter what I did I was not able to get hotfix KB929182 installed on my Vista lab machine. The hotfix would act as if it had installed correctly, but the affected DLLs were not updated. The title of the KB in question is: “FIX: You may experience issues when you use Windows Media Encoder 9 Series on a computer that is running Windows Vista”. After a number of CDNs reported this issue, we investigated further and another engineer on my team was able to figure out the steps necessary to get this KB to actually install as expected. We think we know why this is occurring but we are still investigating. The steps below should help you get around the issue until we can get the patch updated (assuming our requested changes are accepted).

 

From the Control Panel, open Programs and Features:

 

1. Remove Security Update

        Security Update for Windows Media Encoder (KB954156)

 

2. Remove Hotfix

        Hotfix for Windows Media Encoder KB929182

 

3. Remove current installation of the Windows Media Encoder 9 Series

 

4. Restart the system (DO NOT install updates)

 

5. Reinstall the Windows Media Encoder

http://www.microsoft.com/downloads/details.aspx?FamilyID=5691ba02-e496-465a-bba9-b2f1182cdf24&displaylang=en&Hash=Q7YOichtn5uaPziFse1ENeP0pr3Zm3uDvElbQn2p9UGArhz5hMZIta1wqWhKm7czKB%2bzqI%2fM7a20p1YBh%2fuqIg%3d%3d

 

6. Verify versions of the following files:

        wmenc          9.0.0.2980

        WMEncEng.dll   9.0.0.2980

 

7. Install Windows Media Encoder Hotfix

        Hotfix for Windows Media Encoder (KB929182)

 

8. Verify versions of the following files:

        wmenc          9.0.0.3352

        WMEncEng.dll   9.0.0.3352

 

9. Perform Windows Update to reinstall security updates

 

10. Verify encoder files remain at current version 9.0.0.3352
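The version checks in steps 6, 8 and 10 are easy to get wrong by eye, so here is a minimal sketch of comparing dotted version strings numerically. It assumes you have already read each file's version (for example from the file's Properties dialog); the function names are made up for this example, and only the two version numbers come from the steps above.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '9.0.0.3352' into (9, 0, 0, 3352)."""
    return tuple(int(part) for part in text.strip().split("."))


# Versions taken from the steps above.
BASELINE_VERSION = parse_version("9.0.0.2980")  # fresh encoder install
HOTFIX_VERSION = parse_version("9.0.0.3352")    # after KB929182


def hotfix_applied(file_versions: dict) -> bool:
    """True only if every listed file is at or above the hotfix version."""
    return all(
        parse_version(version) >= HOTFIX_VERSION
        for version in file_versions.values()
    )


if __name__ == "__main__":
    observed = {"wmenc": "9.0.0.3352", "WMEncEng.dll": "9.0.0.3352"}
    print("hotfix applied:", hotfix_applied(observed))
```

Comparing tuples of integers avoids the string-comparison trap where "9.0.0.10000" would sort before "9.0.0.3352".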

 

 

Reference:

FIX: You may experience issues when you use Windows Media Encoder 9 Series on a computer that is running Windows Vista

http://support.microsoft.com/kb/929182/en-us

Windows Media Player does not negotiate media types for DMO DSP plug-ins on Vista

Here is an interesting issue I ran across while writing custom DSP plug-ins for Windows Media Player (WMP). When I am prototyping a new filter or plug-in for DirectShow or WMP, I will usually write my transform’s algorithm using a single format / media type. Once I have the algorithm working with this single media type I’ll look into supporting other media types. In the past I’ve usually only supported a few “compatible” types.

 

What I found is that when writing a custom WMP DSP plug-in (using the plug-in wizard) and hard-coding support for only a single media type (WAVE_FORMAT_PCM, MEDIASUBTYPE_YV12), WMP would never load the plug-in. I debugged the issue and found that WMP was always trying to connect with a certain media type. I found this to be the case for both audio and video. For audio WMP was always trying to use the IEEE float subtype (WAVE_FORMAT_IEEE_FLOAT) and for video it was always using NV-12 (MEDIASUBTYPE_NV12) when playing back WMA or WMV formatted content.

 

This is certainly not the behavior that I expected. In fact on XP, WMP has no problem negotiating different media types for a custom plug-in. It took a very long time but finally I was able to get an answer from a guy on the WMP team. In short this behavior is “by design”. From what I understand the developers made this change after seeing numerous reports of performance problems with custom DMO plug-ins.

 

This is what I found: WMP builds the graph first, and then tries to insert the plug-in(s) into the filter chain. After negotiation, WMP will “lock” into using the negotiated media type. In other words WMP will only use the media type that was previously negotiated between the decoder and renderer. In most cases (but not all), this will be NV-12 (video) or IEEE (audio). If the plug-in wants to be inserted into the chain, it needs to support the format that’s already been negotiated. When WMP tries to add the plug-in to the graph, the WMP graph manager uses “connect direct”. This has the effect of bypassing intelligent connect. Because of this, additional filters will not be inserted into the graph. This is why we don’t see a color-space converter being added to the graph to try and facilitate the negotiation between the filters.
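To make the consequence of “connect direct” concrete, here is a toy model in Python. It is not WMP or DirectShow code; `insert_plugins` and the plug-in names are hypothetical. It only mirrors the rule described above: the media type is already locked, no converter filters are added, so a plug-in is inserted only if it supports the negotiated type as-is.

```python
def insert_plugins(negotiated_type: str, plugins: dict) -> list:
    """Return the names of plug-ins that would be inserted into the chain.

    negotiated_type: the type already agreed between decoder and renderer.
    plugins: maps a plug-in name to the set of media types it accepts.
    There is no fallback: an unsupported type means the plug-in is skipped,
    because no color-space converter is added on its behalf.
    """
    loaded = []
    for name, supported_types in plugins.items():
        if negotiated_type in supported_types:
            loaded.append(name)
    return loaded


if __name__ == "__main__":
    candidates = {
        "yv12_only_plugin": {"MEDIASUBTYPE_YV12"},
        "flexible_plugin": {"MEDIASUBTYPE_NV12", "MEDIASUBTYPE_YV12"},
    }
    print(insert_plugins("MEDIASUBTYPE_NV12", candidates))
```

In this model the YV12-only plug-in is silently left out, which matches the symptom of a plug-in that never loads.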

 

The key design decision behind this behavior is performance. If the inserted plug-in requires a format that the upstream and downstream filters don’t support and the color-space converter is added to try and facilitate this connection, additional processing power will be required. It is very likely that a color-space converter will need to be added both before and after the plug-in. If each plug-in in the chain requires a color-space converter in order to function there could be dozens of color-space converters in the graph (depending on the number of plug-ins). Since color-space conversion itself cannot be accelerated, this convoluted topology would cause extremely high CPU usage. Because of this, the decision was made to require the plug-ins to support the negotiated format type.

 

Arguably the plug-in may now need to do the color-space conversion itself, which can still cause high CPU usage. That is why it is recommended that the plug-in’s algorithm be optimized to use the new NV12 or IEEE formats. These are very efficient standardized formats closely related to the existing YUV and PCM formats. Only minor changes to the overall algorithm of the plug-in should be necessary to support these formats. However, to be on the safe side and guarantee that your plug-in will always get loaded, you should support all of the formats that WMP may try to use. Here is a list:

 

Audio:

WAVE_FORMAT_IEEE_FLOAT

WAVE_FORMAT_EXTENSIBLE

WAVE_FORMAT_PCM

Video:

MEDIASUBTYPE_NV12

MEDIASUBTYPE_YV12

MEDIASUBTYPE_YUY2

MEDIASUBTYPE_UYVY

MEDIASUBTYPE_RGB32

MEDIASUBTYPE_RGB24

MEDIASUBTYPE_RGB555

MEDIASUBTYPE_RGB565
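To show how close NV12 and IEEE float are to the older YV12 and PCM formats, here is a small Python sketch of both repacking steps. It is an illustration under stated assumptions, not plug-in code: frames are assumed to be tightly packed (stride equal to width) with even dimensions, audio is assumed to be 16-bit little-endian PCM, and the function names are made up. YV12 stores a Y plane followed by separate V and U planes; NV12 stores the same Y plane followed by one plane of interleaved U and V bytes.

```python
import struct


def yv12_to_nv12(frame: bytes, width: int, height: int) -> bytes:
    """Repack a tightly packed YV12 frame (Y, V, U planes) as NV12 (Y, UV)."""
    if width % 2 or height % 2:
        raise ValueError("width and height must be even")
    y_size = width * height
    chroma_size = (width // 2) * (height // 2)
    if len(frame) != y_size + 2 * chroma_size:
        raise ValueError("frame size does not match width and height")

    y_plane = frame[:y_size]
    v_plane = frame[y_size:y_size + chroma_size]  # YV12: V comes first
    u_plane = frame[y_size + chroma_size:]

    uv_plane = bytearray(2 * chroma_size)
    uv_plane[0::2] = u_plane  # NV12: U at even offsets
    uv_plane[1::2] = v_plane  # NV12: V at odd offsets
    return bytes(y_plane) + bytes(uv_plane)


def pcm16_to_float(samples: bytes) -> list:
    """Convert 16-bit little-endian PCM to floats in the range [-1.0, 1.0)."""
    count = len(samples) // 2
    values = struct.unpack("<%dh" % count, samples[:count * 2])
    return [value / 32768.0 for value in values]
```

In both cases the sample data is unchanged; only the layout or scale differs, which is why an algorithm written for YV12 or PCM usually needs only modest changes.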

Using the WM Encoder to Protect Existing Content with WM DRM

One of the things that a lot of DRM-savvy CDNs are doing is using WM Encoder to add WM DRM to their content without actually re-encoding. This is basically a copy operation from the input file (WMV) to an output file (WMV). The samples are never decompressed but rather passed directly through the Encoder. Along the way the Encoder can add WM DRM protection. This gives you a very quick way to add WM DRM protection to a file without needing to re-encode, without any quality loss and without needing to protect the video on the server side with the WMRM SDK components. It’s really a great way to get WM DRM protection enabled on the client.

The trick is to make sure that the input profile (the header information of the input file) matches the output profile (the profile you configure in the WM Encoder). This isn’t difficult to do but you do need to write a bit of code to do it. You can use the WMEncBasicEdit interface to generate a profile object that you can then use to feed back into the Encoder as the output profile. Using the WMEncBasicEdit object you can just grab the profile from the input file and tell the Encoder to use it to encode. The WM Encoder is smart enough to know that if the input and output profiles match that there is no reason to do any work and it doesn’t re-encode.

Unfortunately this method falls apart when using certain types of input files. If your input content happens to be interlaced, when you run your input file through this process you get corrupted output. The problem is that the profile that is extracted from the source file using WMEncBasicEdit doesn’t contain information about the interlaced nature of the source file. WM Encoder only checks that the basic input and output profile information matches. It doesn’t check to see if the input file was interlaced. Because of this, WM Encoder tries to process the input content as if it were encoded using progressive frames. Since they are actually interlaced frames, the video that ends up in the output file plays back garbled, distorted and corrupted. This is simply due to the fact that the WM Encoder is not setting the interlaced flag in the copied output. Theoretically, if you could get the interlaced flag to be set correctly in the output file then the video would play back as expected. Unfortunately I haven’t found a good way of doing this.

I was able to confirm with the developers who currently own the WM Encoder 9 bits that they are not interested in fixing this issue at this time. This is due to the fact that the WM Encoder 9 is at the end of its life and is being replaced with the Expression Encoder application.

The following link points to some VB.NET WinForms / ASP sample code that I wrote years ago to demonstrate how to use this technique. I haven’t tried running this code since it was first written, so it might not work out of the box. However, it should give you a starting point for creating your own solution. The code can be found here: http://code.msdn.microsoft.com/DRMEncoderProfile

 
