Understanding High-End Video Performance Issues with Hyper-V

A while ago I wrote a relatively short blog post highlighting the fact that there are performance issues with Hyper-V when used with a high-end graphics adapter.  Since then I have been inundated with people asking questions and trying to get their heads around this issue.  Today I would like to take the chance to drill in on this:

What is the cause of the problem?

Okay – let’s grab the pertinent text from the original KB article:

This issue occurs when a device driver or other kernel mode component makes frequent memory allocations by using the PAGE_WRITECOMBINE protection flag set while the hypervisor is running. When the kernel memory manager allocates memory by using the WRITECOMBINE attribute, the kernel memory manager must flush the Translation Lookaside Buffer (TLB) and the cache for the specific page. However, when the Hyper-V role is enabled, the TLB is virtualized by the hypervisor. Therefore, every TLB flush sends an intercept into the hypervisor. This intercept instructs the hypervisor to flush the virtual TLB. This is an expensive operation that introduces a fixed overhead cost to virtualization. Usually, this is an infrequent event in supported virtualization scenarios. However, some video graphics drivers may cause this operation to occur very frequently during certain operations. This significantly magnifies the overhead in the hypervisor.

Usually when I talk to people about this – their eyes start to glaze over – so let’s dig in a little here.  With the help of Wikipedia we can get some definitions:

  • Write-combining (http://en.wikipedia.org/wiki/Write-combining):

    Write combining (WC) is a computer bus technique for allowing data to be combined and temporarily stored in a buffer -- the write combine buffer (WCB) -- to be released together later in burst mode instead of writing (immediately) as single bits or small chunks.

    Write combining cannot be used for general memory access (data or code regions) due to the 'weak ordering'. Write-combining does not guarantee that the combination of writes and reads is done in the correct order. For example, a Write/Read/Write combination to a specific address would lead to the write combining order of Read/Write/Write which can lead to obtaining wrong values with the first read (which potentially relies on the write before).

    In order to avoid the problem of read/write order described above, the write buffer can be treated as a fully-associative cache and added into the memory hierarchy of the device in which it is implemented. Adding complexity slows down the memory hierarchy so this technique is often only used for memory which does not need 'strong ordering' (always correct) like the frame buffers of video cards.

    In summary, write-combining is a method of accessing memory that is typically only used by video cards.

  • Translation Lookaside Buffer (TLB) (http://en.wikipedia.org/wiki/Translation_Lookaside_Buffer)

    A Translation lookaside buffer (TLB) is a CPU cache that memory management hardware uses to improve virtual address translation speed. It was the first cache introduced in processors. All current desktop and server processors (such as x86) use a TLB. A TLB has a fixed number of slots that contain page table entries, which map virtual addresses to physical addresses. It is typically a content-addressable memory (CAM), in which the search key is the virtual address and the search result is a physical address. If the requested address is present in the TLB, the CAM search yields a match quickly, after which the physical address can be used to access memory. This is called a TLB hit. If the requested address is not in the TLB, the translation proceeds by looking up the page table in a process called a page walk. The page walk is a high latency process, as it involves reading the contents of multiple memory locations and using them to compute the physical address. Furthermore, the page walk takes significantly longer if the translation tables are swapped out into secondary storage, which a few systems allow. After the physical address is determined, the virtual address to physical address mapping and the protection bits are entered in the TLB.

    So the TLB is a CPU cache that helps with translation between virtual addresses and physical addresses.  Note that these virtual address spaces have nothing to do with virtual machines – they are used to allow multiple applications on an operating system to be isolated from each other.

Summarizing all of this – video card drivers tend to use memory access methods that force Hyper-V to flush the CPU's cache of page table mappings (the TLB) very frequently.  This is an expensive thing to do in Hyper-V at the best of times.  In fact – the above TLB article on Wikipedia even has a section on the problems of virtualization and the TLB.
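To make the cost of frequent flushing more concrete, here is a minimal Python sketch of a toy TLB. It is an illustration only: the class name ToyTLB, the four-slot size, the LRU eviction and the access pattern are all assumptions made for this example, and it models neither real hardware nor the hypervisor's virtual TLB. It simply counts how many lookups have to fall back to the page table when the cache is flushed often.

```python
from collections import OrderedDict


class ToyTLB:
    """A tiny fully-associative TLB with LRU eviction (illustrative only)."""

    def __init__(self, page_table, slots=4):
        self.page_table = page_table   # virtual page -> physical frame
        self.slots = slots             # assumed size, chosen for the example
        self.entries = OrderedDict()   # cached translations, oldest first
        self.hits = 0
        self.misses = 0

    def translate(self, vpage):
        if vpage in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpage)   # mark as most recently used
            return self.entries[vpage]
        # Miss: stand-in for the slow page walk described above.
        self.misses += 1
        frame = self.page_table[vpage]
        self.entries[vpage] = frame
        if len(self.entries) > self.slots:
            self.entries.popitem(last=False)  # evict least recently used
        return frame

    def flush(self):
        """Throw away every cached translation."""
        self.entries.clear()


def run(flush_every):
    """Do 1000 lookups over 4 pages; flush every `flush_every` lookups (0 = never)."""
    page_table = {vpage: vpage + 100 for vpage in range(4)}
    tlb = ToyTLB(page_table)
    for i in range(1000):
        if flush_every and i % flush_every == 0:
            tlb.flush()
        tlb.translate(i % 4)
    return tlb.hits, tlb.misses


if __name__ == "__main__":
    print("no flushing:    hits=%d misses=%d" % run(0))
    print("flush every 10: hits=%d misses=%d" % run(10))
```

In this toy run, flushing every 10 lookups raises the miss count from 4 to 400 over the same 1000 lookups. Each miss stands in for a page walk, and under Hyper-V each flush additionally costs an intercept into the hypervisor, as the KB text above describes.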

Now that we have the ground rules in place – let’s head on to some of the other questions.

How could you possibly ship Hyper-V with this issue?  Did you not test this product?

To answer the second question first – I actually was the first person (in the world) to hit this issue.  Early on in development I tried to use Hyper-V as my desktop OS on my home system with a GeForce 8800 video card.  Everything seemed to work okay (though some things were oddly sluggish) until I tried to play Age of Empires III.  I had never played this game before, and the first time I tried to play it was on top of Hyper-V.  In short, it sucked.  Unfortunately I spent most of the weekend trying to tweak my rig and looking for patches to Age of Empires III before I thought to try disabling Hyper-V.

As soon as I realized what was happening I filed a bug and the issue was investigated.

When the issue was determined to be a specific result of the combination of the Hyper-V hypervisor and the Nvidia driver – we decided to leave things as they were for a couple of reasons:

  • Windows Server does not include any video drivers other than the SVGA driver by default
  • Windows Server will not install a high-end video driver automatically at any stage – you need to manually install the Windows 7 drivers

Also, Hyper-V was being developed solely for server virtualization and:

  • We have always recommended that nothing be run in the management operating system, other than basic management tools
  • No server workload that we tested generated anywhere near the rate of TLB flushing that these video drivers cause

Finally, this is a really hard issue to address.  In fact, there are no hypervisor-based virtualization platforms that address this issue today – and while there are several under development I suspect that they will either have specific hardware requirements (I will get to this later) or will have simplifications / limitations to help them mitigate this issue (like only having one virtual machine).

Why does this affect Hyper-V and not Virtual PC?

Here we are seeing the difference between a hypervisor and a hosted VMM solution.  With a hypervisor-based platform (like Hyper-V) everything runs on top of the hypervisor – even the management operating system.  With a hosted VMM platform (like Virtual PC), by contrast, the host operating system still has direct access to the hardware.  To explain this better – here is a diagram:

[Diagram: hypervisor-based platform (Hyper-V) compared with hosted VMM platform (Virtual PC)]

Hopefully you can see the difference here.  It should also be noted that all desktop virtualization products available today use an architecture similar to that of Virtual PC.

How do I know if this is affecting my computer?

To check if this is affecting your system – what you need to do is open Performance Monitor (you can do this by running “perfmon” from the start menu).  Select the Performance Monitor node and click on the plus symbol to add a new counter.  Then find the Hyper-V Hypervisor Root Partition entry, expand it, select Virtual TLB Flush Entries/sec and add the Root counter.  This will allow you to keep an eye on the rate of TLB flushing in the management operating system:

[Screenshots: adding the Virtual TLB Flush Entries/sec counter in Performance Monitor]

So what do you look for now?  On my system – the only time I see a significant rate of TLB flushing (>10) is when I start a virtual machine.  A system that has this problem will either generate a continuous rate of TLB flushing above 100 or will generate spikes in the thousands.
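The rule of thumb above can be written down as a small Python sketch. The function name, the labels it returns and the exact cut-offs (100 for a continuous rate, 1000 for "spikes in the thousands") are assumptions made for illustration; the samples are simply whatever per-second values you read off the counter. Remember that a burst while a virtual machine is starting is expected, so sample while no virtual machine is booting.

```python
def classify_flush_rate(samples, steady_threshold=100, spike_threshold=1000):
    """Classify a list of 'Virtual TLB Flush Entries/sec' readings.

    Thresholds follow the rule of thumb in the text and are not official limits.
    """
    if not samples:
        return "no data"
    # Every reading above the steady threshold: a continuous high flush rate.
    if all(value > steady_threshold for value in samples):
        return "continuous"
    # Otherwise, look for isolated readings in the thousands.
    if any(value >= spike_threshold for value in samples):
        return "spikes"
    return "ok"


if __name__ == "__main__":
    print(classify_flush_rate([0, 2, 5, 0]))       # quiet system
    print(classify_flush_rate([150, 230, 400]))    # constantly above 100
    print(classify_flush_rate([0, 3, 2500, 1]))    # occasional large spike
```

Either of the last two results suggests the video driver is triggering the problem described here.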

What can I do to stop this / work around it?

There are a couple of options here:

  1. Use the default video driver (SVGA).

    Yes, I know it is not sexy or fun – but if you are planning to just use Hyper-V as a server virtualization platform this is your easiest and simplest option.  It is the way we intended Hyper-V to be used, and it will always give the best performance.

  2. Tone down the use of 3D graphics.

    Some video cards (like the Nvidia Quadro FX 1700M) seem to work fine as long as Aero is not enabled and no 3D applications are running.  If I enable Aero I start to see a fairly frequent rate of spikes in my TLB flush count (which causes annoying lurches in the window animation).  Running a 3D game (like Halo 2) is just terrible.

    This means that for those of you who do not want the high-end driver for 3D graphics, but instead need it for multi-monitor support or for the ability to connect a projector to your laptop (like me) this may work.

  3. Choose your video card carefully.

    As a general rule of thumb – the less capable the video card, the less likely this is to be an issue.  My previous laptop had an integrated graphics controller – which was terrible for gaming – but worked great for Hyper-V.  When I wanted to get my new laptop and found that there was no Intel option – I tracked down a coworker with a similar graphics card in their laptop and tried out Hyper-V on it before going ahead and buying it.

  4. Get a system with Second Level Address Translation (SLAT).

    SLAT is a technology that goes by different names depending on whether you get Intel (where it is called “Extended Page Tables” (EPT)) or AMD (where it is called “Nested Page Tables” (NPT) or “Rapid Virtualization Indexing” (RVI)).  These technologies extend the traditional TLB so that the hardware can handle multiple TLBs – one for each virtual machine.  We added support for this hardware in Windows Server 2008 R2.  If you run Windows Server 2008 R2 on a system with SLAT capabilities – you will not have any problems running 3D graphics at all.

    Intel started shipping this technology in the Nehalem (or Core i7) processor line.  AMD has been shipping this for a while now – ever since generation 3 of the AMD Quad-core family.  Unfortunately neither has shipped this technology in laptop processors yet – though Intel has indicated that they are planning to soon.
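The general idea behind SLAT can be sketched as two lookups in sequence: the guest's own page table maps a guest virtual address to a guest physical address, and a second table owned by the hypervisor maps that to a real (host) physical address. The Python below is a simplified illustration under that assumption, not a description of how Hyper-V or the processors implement it; the page size, table contents and function names are all made up for the example.

```python
PAGE_SIZE = 4096  # assumed page size for the example


def translate(address, table):
    """Map an address through one table of page number -> frame number."""
    page, offset = divmod(address, PAGE_SIZE)
    return table[page] * PAGE_SIZE + offset


# Hypothetical tables for a single virtual machine.
guest_page_table = {0: 5, 1: 7}   # guest virtual page  -> guest physical page
second_level_table = {5: 42, 7: 13}  # guest physical page -> host physical page


def guest_to_host(guest_virtual):
    """Resolve a guest virtual address with two chained lookups."""
    guest_physical = translate(guest_virtual, guest_page_table)
    return translate(guest_physical, second_level_table)


if __name__ == "__main__":
    # Guest virtual page 1, offset 16 -> guest physical page 7 -> host page 13.
    print(guest_to_host(PAGE_SIZE + 16))
```

With SLAT the second lookup is done by the hardware, which is why the hypervisor no longer has to maintain a virtual TLB in software and intercept every flush.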

Hopefully this has answered all of your questions satisfactorily.  If you have any further questions – please feel free to ask away.  I would also encourage you, if you have a video card that appears to work well with Hyper-V and 3D graphics, to post the details in the comments so that others can benefit from your good fortune!

Cheers,
Ben

Comments
  • We discovered this issue on a tester's workstation. They had been given a machine kitted out with enough CPU and RAM to run a couple of VMs in Hyper-V. The management operating system wasn't going to be used for anything hard - just a web browser, mail client, etc.

    The whole system ran so sluggishly when Hyper-V was enabled that we tried going back to the SVGA driver. Unfortunately this meant that the tester could only use one monitor, which was ridiculous.

    So we had to disable Hyper-V and move his VMs to a separate server machine.

    It's great to hear that it's fixed on newer processors. Is this something which we need to check the CPU's specs for, or can we safely assume that all Core i* CPUs have it, for example?

  • Hi Ben,

    Thanks for this post. It can be a struggle to convey all of this to developers who are relying on Hyper-V for SharePoint 2010 development (the only other x64 solution would be VMware Workstation).

    Do you think we might ever see multi-monitor support in the SVGA driver? For some reason I think that won't be possible but it would be a huge help if it could happen.

    Cheers,

    Tristan

  • I didn't realize it at the time, but I had this issue. I bought an NVIDIA GeForce 9500 GT card, and it could barely play video. I thought it was a video card driver issue, because I was using Vista drivers on Windows 2008. I swapped out the card with my kids' Radeon X1300 card, which does support 2 monitors (although 1 digital and 1 analog), and it works fine. The Radeon card is 4 years old now, but at the time was a beast. Also, the kids think I am the best for taking the crappy card. hehe

  • Not seeing this issue on a Dell Latitude E6500 running 2k8r2 x64 Datacenter, hyper-v enabled.  It's got an Nvidia Quadro NVS 160M, running driver version 186.21 - TLB flushing is spikey at VM boot, but then settles down to 0 pretty quickly.  FYI...

  • Rik Hemsley -

    Unfortunately, Intel's documentation is rather vague here.  What I have heard is that it has to be Core i7 (not i5) but it might be worthwhile to ask on their support forums for clarification.

    Tristan Watkins -

    Good question - I will see if I can find an answer.

    John Sinclair -

    Whoops!  I meant to mention this in the post - seeing a spike of TLB flushing during virtual machine start is completely normal.

    Cheers,

    Ben

  • Is a spike during "Ctrl+Alt+Del" normal? I see it on a Dell D630 with Mobile Intel(R) 965 Express Chipset Family graphics, still running 2008 R1.

    Also I think I noticed that until "C:\Windows\System32\net.exe start hvboot" is started it does not seem to spike at all. During some periods I see longer stretches of load between 20 and 100+. I'm using the newest Intel graphics driver. For some reason I cannot run Aero anymore (using Vista Basic). Also I noticed that disabling Aero improves the performance a lot. I also use only 16-bit color depth, which I think improves performance a bit for me.

  • Given the mess that Intel has made of VT in their product matrix (and now of SLAT), looks like I'm sticking with AMD chips.

  • Hi Ben!

    great post!

    I have this issue too on my WS2008 R1. In particular there is a big spike (it is constantly over 90 for seconds) after NOD32 updates its database, and it lasts for about 10 seconds, while Windows is very, very slow :(

    I don't think this NOD32 issue is connected to the video card driver; it must be some other problem. But I noticed it by using perfmon in the way you described.

    What do you think, should I write a mail to Eset support, or is it a Windows issue?

    Thanks!

  • @ Tom

    be careful with AMD, there's a lot of virtualization issues with certain BIOSes, mostly affecting the client side, laptops with AMD CPUs are especially affected.

  • I just bought a Core i7 laptop with a GeForce 230M: HP Pavilion DV7-3050ec. Installed Windows 2008 R2 x64 and installed the Win7 drivers. All working perfectly, even when virtualization is enabled in the BIOS. Add the Hyper-V role and I get a blue screen in the Nvidia driver on boot. I tried versions 186.44, 186.81 and even beta 195.62 with no luck. This is not a performance issue, I only wish it was that, but I cannot even boot the system with Hyper-V. If I revert to standard VGA, it's all fine, but then I would rather sell this laptop than look at a VGA screen.

    I am looking for anything to try, please, any ideas so I don't have to go back to RDP to my server or go back to VMware, ouch. I will post back results if anything works.

      By the way I don't buy the argument this is for servers, because virtualization belongs to workstations as well. Thanks

  • Windows 7 Ultimate 8 gig of memory Nvidia GTX 285 video card.

    XP mode the best color I can get is medium 16 Bit color. all my graphics are washed out I need 32 Bit.

    Any Ideas?

    I have tried changing the video card in the virtual xp session and the s3 driver still comes back.

  • Biztalker,

    Any luck?    I have the same scenario w/ two different Core i7 laptops..  one ATI, the other Nvidia graphics.   Whenever the Hyper-V role is enabled w/ the graphics driver installed: blue screen.

    I do have another mobile server laptop w/ W5590 Xeon and a NVidia GTX 260M and I have Hyper-V role and Aero running like a champ.  Curious why that one is working?

  • No luck or answer yet I will keep on waiting

  • @sintak/biztalk

    Hi, I have the same problem w/ my core i7 laptop (720QM / HD4670).   Hyper-V + GFX Driver = BSoD

    I don't need 3D really, just must have the full screen resolution (1920x1080).

    Tried ATI Catalyst 9.5 up to 9.11 (Desktop modified) / Win7 Standard-VGA-Driver

    No luck.

    Do the Core i7 Mobile have S.L.A.T.? Can you enable/disable?

  • @sintak, biztalk, resil

    If you want support for HyperV/WinVPC, go to the technet forums.

    http://social.technet.microsoft.com/Forums/en-US/w7itprovirt/threads

    http://social.technet.microsoft.com/forums/en-US/winserverhyperv/threads/

    WinVPC only (always) emulates the S3 Trio, you can get 32bit color by disabling IC, or 24bit in seamless mode. http://smudj.wordpress.com/2009/10/08/xpmode-and-24bit-color/
