VMworld: Is it a Scalability Issue to run Drivers in the Hyper-V Parent Partition? (Answer: No)

I am sitting in the VMworld session “TA3880 - Head-To-Head Comparison: VMware vSphere and ESX vs. Hyper-V and XenServer”.  It is interesting to listen to VMware’s perspective on this.

One point that they have raised – which I have heard before – is that VMware ESX has better scalability than Hyper-V because they run their drivers in the hypervisor, while we run our drivers in the parent partition.  VMware usually then continues to say that they tried this approach (drivers in the parent partition, or the “service console” to use VMware nomenclature) with older versions of ESX and it caused scalability issues that were resolved by moving the drivers and emulated devices into the hypervisor.

Now, I remember when VMware announced that they were moving all of their drivers and emulated devices into the hypervisor.  At this time they were proudly talking about how they were doing this and how it helped so much with scalability, and I was thinking “that’s insane – why would they do that! I would never put code that complicated into the hypervisor”.  So I decided to do some research; and I found the simple answer for this:

The ESX service console (in what they now call “ESX classic”) is a uniprocessor partition.

Compared to this, the Hyper-V parent partition has access to all of the processors in the physical computer, and runs an operating system with great scalability (Windows Server).

So yes, running your drivers and emulated devices in a uniprocessor partition would be a scalability bottleneck.  But running your drivers and emulated devices in a multi-processor, highly scalable parent partition does not cause any scalability issues.
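If you want to see the difference for yourself, here is a minimal sketch (in Python, purely illustrative - it is not anything Hyper-V or ESX actually runs) that simply reports how many logical processors the partition it runs in can schedule work on.  Inside a uniprocessor service console it would report 1; inside the Hyper-V parent partition it reports every logical processor in the machine.

    # Purely illustrative: report how many logical processors this partition can use.
    # A uniprocessor service console would report 1; the Hyper-V parent partition
    # reports every logical processor in the physical computer.
    import os

    logical_processors = os.cpu_count() or 1
    print(f"Logical processors visible to this partition: {logical_processors}")

    if logical_processors == 1:
        print("All driver and device-emulation work is serialized on a single processor.")
    else:
        print(f"Driver and device-emulation work can be spread across {logical_processors} processors.")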

Cheers,
Ben

  • "and runs an operating system with great scalability (Windows Server)."

    Most people would argue that Linux is as scalable as, if not more scalable than, Windows Server.

  • It's obvious that Microsoft and VMware have different philosophies and methods for employing drivers in their 'layers'.  When I hear anyone throw around the term "scalability", I look for numbers or baselines to support the claim.

    Scalability is an often abused term.  I have an idea of how scalable VMware and Citrix hosts are, but have seen no data about Hyper-V.  All things being equal (uniform VMs), what kind of density can I expect on a Hyper-V host with 8 cores, E7x Intel CPUs, and 64 GB of RAM?  And what would be the delta (evidence of scalability) if I used a host with 4 cores and 32 GB of RAM, and another with 16 cores and 128 GB of RAM?  I'm not looking for an exact number, just an idea.

    My company is starting to take a hard look at Hyper-V, and real scalability (not theoretical) is something our data centers require.  And if the scalability is there, we are willing to pay for it.

  • Ben, you have missed one important fact about the service console: it is a management partition, not where the drivers live.  ESX Classic's service console does not contain drivers; it uses the same driver architecture as ESXi.  If the service console fails for whatever reason, the VMs are unaffected, unlike Hyper-V, where the whole host is rendered useless.

  • Show me a VM running on Hyper-V doing more than 100,000 IOPS and I'll stop caring where you put your drivers.

  • Horsie -

    Either way, MP is more scalable than UP, and that is my point.

    MrV -

    There are so many factors here that it is hard to give a simple answer.  For server workloads we support having 8 virtual processors per logical processor in the computer.  The number of VMs will then depend on what sort of I/O infrastructure you have for those systems, how many virtual processors you give to each VM, how much memory you give to each VM, and so on (there is a rough back-of-the-envelope sketch after this comment).

    Steve -

    That is true today - but it was not true with ESX 2.0.  They moved all this code into their hypervisor in 3.0 to address this scalability issue.

    RonTom -

    How about 180,000 IOPS? http://ir.qlogic.com/phoenix.zhtml?c=85695&p=irol-newsArticle&ID=1169854&highlight=

    Cheers,

    Ben
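    A rough back-of-the-envelope sketch of that 8:1 guidance, using MrV's example hosts (the per-VM sizes and the host memory reserve below are illustrative assumptions, not official sizing numbers):

        # Back-of-the-envelope VM density estimate based on the "8 virtual
        # processors per logical processor" guidance mentioned above.
        # Per-VM sizes and the host memory reserve are assumptions.

        def estimate_max_vms(logical_processors, host_memory_gb,
                             vcpus_per_vm=2, memory_per_vm_gb=4,
                             vp_per_lp_ratio=8, host_reserved_gb=4):
            """Return the smaller of the vCPU-bound and memory-bound VM counts."""
            vcpu_bound = (logical_processors * vp_per_lp_ratio) // vcpus_per_vm
            memory_bound = (host_memory_gb - host_reserved_gb) // memory_per_vm_gb
            return min(vcpu_bound, memory_bound)

        # MrV's three hosts, assuming uniform 2 vCPU / 4 GB virtual machines:
        for cores, ram_gb in [(4, 32), (8, 64), (16, 128)]:
            print(f"{cores} cores / {ram_gb} GB RAM -> roughly {estimate_max_vms(cores, ram_gb)} VMs")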

  • That's just dirty marketing from you!

    My post said "a VM", one Virtual Machine!

    But if you want to compare hypervisors instead, then:

    Show me ONE Hyper-V server doing 350,000 IOPS and I'll stop caring where you put your drivers.

  • Hi, Ben,

    Either I misunderstood you or you misunderstood the speaker:

    When did we start comparing the scalability of the parent partition / Linux console?

    Those are management overhead, which is why we try to shrink full-blown Windows Server 2008 with Hyper-V down to Server Core with Hyper-V, and why ESX tried to get rid of its console and became ESXi.

    Does a scalability comparison in the virtualization space measure how many VMs we can stuff inside one machine, or how fast (CPU/network/IOPS/...) a VM can perform when we double the interfaces/CPUs/HBAs/... that kind of thing?  Who cares if your parent partition has more IOPS than the ESX Linux console?  ESX 3/4 did not route that I/O through the console.

    Can you offer any use case where an MP parent partition is much better than an SP Linux console?  I think I have one: you can run SQL/Exchange/IIS on an MP parent partition, but an SP Linux console probably can't, or it cannot serve as many connections as an MP Hyper-V parent partition.

    Usually your posts are technical and precise, but this one... what is your point?

  • Spencer, I could be wrong, but I think the discussion is about where the appropriate place for drivers to live is.  VMware is making the claim that because Hyper-V has drivers in the parent partition it is unscalable.  VMware's justification for making this accusation is that when they put the drivers in their own parent partition it wasn't scalable.  I take Ben's post as pointing out that this isn't a fair basis for evaluating Hyper-V's placement of drivers in the parent partition.  VMware's parent partition was uniprocessor and of course couldn't handle the workload at scale.  Because the parent partition in Hyper-V is Windows Server, it can handle the workload of running the drivers.  Basically, it is deflating VMware FUD that is based on their lack of knowledge about Hyper-V (or blatant disregard for the truth).

    Once you understand that Hyper-V has a more scalable parent partition that can handle the workload of the drivers, you have to ask yourself where the best place for drivers is.  I think MS has it right.  Why would you put that complex code into the hypervisor?

  • RonTom -

    First, let me say that I have never been a fan of the whole IOPS benchmark, as I believe that it is not relevant for most users, who would never have storage hardware that comes anywhere near this sort of capability.  The more important measure is the difference between native and virtual performance.

    That said - since you have brought this up - if you go here: http://blogs.vmware.com/performance/2009/05/350000-io-operations-per-second-one-vsphere-host-with-30-efds.html

    You will see that VMware did not achieve 350,000 IOPS with a single virtual machine, but rather with three virtual machines.  Their first virtual machine achieved 119,094 IOPS.

    Now, if you go here: http://i.zdnet.com/blogs/qlogic-hyperv.jpg you will see that the QLogic report was indeed with a single virtual machine.  Furthermore, if you look at the third column (which uses an 8KB block size, just like the above VMware benchmark) you will see that a single virtual machine gets 116,720 IOPS.

    Pretty comparable, if you ask me - there is a quick side-by-side sketch after this comment - and note that this was on Windows Server 2008, not Windows Server 2008 R2.

    Cheers,

    Ben
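    For reference, a quick side-by-side of the two 8KB-block numbers quoted above (IOPS multiplied by block size gives the approximate throughput; the figures are only the ones cited in the comment):

        # Compare the two single-VM, 8 KB-block results quoted above.
        # IOPS x block size gives the approximate throughput.

        BLOCK_SIZE_KB = 8

        results = {
            "vSphere, first VM (8 KB blocks)": 119_094,
            "Hyper-V, single VM (8 KB blocks)": 116_720,
        }

        for name, iops in results.items():
            throughput_mb_s = iops * BLOCK_SIZE_KB / 1024  # KB/s -> MB/s
            print(f"{name}: {iops:,} IOPS, roughly {throughput_mb_s:,.0f} MB/s")

        difference = abs(119_094 - 116_720) / 119_094 * 100
        print(f"Difference between the two results: about {difference:.1f} percent")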

  • Spencer -

    Nate is correct.  Sorry if this was not clear in my post - and please let me know if you need further clarification.

    Cheers,

    Ben

  • "Why would you put that complex code into the hypervisor?" - Nate, Friday, September 04, 2009 1:07 PM

    Hi Nate,

    Let me answer that question for you - you would put them in the hypervisor so that you don't need a ~9GB operating system as a dependency to run your system.

    Regards

    On the 350K IOPS test, since it was against a mid-range array, we needed multiple arrays (we were storage-processor bound), hence multiple datastores, hence the 3 VMs.  I would expect, based on what we saw, that one VM (either with three VMDKs in 3 separate datastores, or with an even faster backend config) could have done it.

    But - I have to say (as much as I'm a VMware fan) that this as a proof point one way or another seems silly to me.

    Those are LUDICROUS workloads (at least in today's virtualized world).  350K IOPS is roughly equivalent to 650K E2K7 mailboxes, assuming a heavy mailbox profile (the arithmetic is sketched after this comment).

    This debate on hypervisor scalability will have some merit as we move all x86 workloads (including things that have historically been physical, like the storage processors in the example above) onto the hypervisor.

    But - for 99.9% of use cases today - it's more about "the basics" and how well you can do them before we move on to the pure performance realm.

    Looking forward to playing with R2, but catching up with VMotion is not "getting the basics" (IMHO) - today, Storage VMotion and things like Site Recovery Manager are as important to me as the driver code path.

    Just one person's 2 cents (and likely a tainted world-view).
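    The arithmetic behind the "350K IOPS is roughly 650K heavy E2K7 mailboxes" comparison above is easy to sanity-check (the per-mailbox IOPS value below is an assumed heavy-profile figure, not an official sizing number):

        # Sanity check of the mailbox comparison quoted above.
        # The per-mailbox IOPS figure is an assumption for a "heavy" profile.

        total_iops = 350_000
        iops_per_heavy_mailbox = 0.54  # assumed heavy Exchange 2007 profile

        mailboxes = total_iops / iops_per_heavy_mailbox
        print(f"{total_iops:,} IOPS / {iops_per_heavy_mailbox} IOPS per mailbox "
              f"is roughly {mailboxes:,.0f} mailboxes")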

  • Protegimus, Server Core has a footprint more like 1GB, not 9GB.  The VMware service console is 2GB minimum, isn't it?  Even VMware ESXi ends up being around 1GB when drivers etc. are included.  In other words, your answer isn't valid.  The drivers will take space regardless of where they are installed, so again I ask: why add the COMPLEXITY to the hypervisor?

  • Nate:

    Not sure where you get 1GB for Server Core footprint, but it is at least 3-5 times that.

    VMware ESXi does not consume anywhere near 1GB -- I recently wrote about it here:  http://www.vcritical.com/2009/08/the-vmware-esxi-4-64mb-hypervisor-challenge/

    Regards,

    Eric Gray

  • Eric, the 1GB footprint comes from sites like this: http://itknowledgeexchange.techtarget.com/network-administrator/managing-windows-2008-server-core-local-settings/.  It was an old article, though, so I just went ahead and laid down a fresh VM of 2008 to check the VHD size.  It was 2.3GB - so bigger than 1, but still not 9 or even 5 GB.  If you really wanted to, you could shrink it even more by tinkering with it, which is what you are doing with ESXi in your post.  You even state that it is an unsupported thing to do, and in the previous post you admit that installing ESXi in a supported manner takes 1GB.  I think it's fair to say I was accurate, then, in saying ESXi takes 1GB.  In any event, we are really bickering over semantics, and it is pointless.  The disk footprint differences are not anything that I consider to be a valid competing factor.  I know VMware really pushes their footprint thing like it actually means something, but it really doesn't.  Unless I'm trying to run my systems on old thumb drives out of the drawer, who cares?  Either option is sufficiently small to be placed on affordable flash media at this point.  If you are buying a new server that is designed to flash boot, you might just be buying some new flash media with it.
