
SQL Server 2008 on Windows Server 2008 Hyper-V Performance Guidance

Wow, this is a great day.  The SQL team has worked really hard to create a performance guidance document with solid suggestions and performance numbers for running on WS08 Hyper-V.

They looked at OLAP and OLTP workloads.  Even if you are not running SQL it will give you an idea of how other database-like workloads are likely to perform.

Check it out...
http://sqlcat.com/whitepapers/archive/2008/10/03/running-sql-server-2008-in-a-hyper-v-environment-best-practices-and-performance-recommendations.aspx

 

Posted by tvoellm | 2 Comments

Hyper-V Performance Counters - Part five of many - "Hyper-V VM Vid Numa Node"

There are a couple of performance counter sets in Hyper-V that show memory usage.  These counters tend to show memory used by one component of the system and the "Hyper-V VM Vid Numa Node" counter set is no different. 

This counter set is intended to show memory used by the Virtual Infrastructure Driver (Vid).  The good news is this counter set shows "guest physical memory".  This memory includes the allocation you specify when creating the VM and some other support memory like video memory.  It is the majority of the memory used by the Virtual Machine (VM).

Hyper-V VM Vid Numa Node shows memory on a per node basis.  If you have an AMD machine you will likely see a Non-uniform Memory Access (NUMA) node for each processor.  If you have an Intel based machine you will see anywhere from one to many nodes depending on the architecture.

The following are the performance counters in the counter set:

  • PageCount                   - Number of 4K pages that the node contains.  To find the total physical memory on the system, add PageCount from each NUMA node and multiply by 4 KB.
  • PagesInUse                  - Number of pages in use by the Vid on this node.  Keep in mind this is most but not all of the memory supporting a VM.
  • ProcessorCount           - Number of logical processors this node contains.  The number of logical processors equals the number of processors (sockets) * number of cores per processor * 2 if you have HyperThreading (HT), now called Simultaneous Multithreading (SMT).
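The PageCount arithmetic above can be sketched in a few lines of Python.  This is only an illustration: the per-node page counts are made-up values, not readings from a real machine, and the names `page_count_by_node` and `total_physical_bytes` are hypothetical.

```python
PAGE_SIZE_BYTES = 4 * 1024  # PageCount is reported in 4K pages

# Hypothetical PageCount readings, one per NUMA node (not real measurements).
page_count_by_node = {0: 1_048_576, 1: 1_048_576}


def total_physical_bytes(page_counts):
    """Sum PageCount over all nodes and convert pages to bytes."""
    return sum(page_counts.values()) * PAGE_SIZE_BYTES


total_bytes = total_physical_bytes(page_count_by_node)
print(f"Total physical memory: {total_bytes / 2**30:.1f} GB")
```

The same sum over PagesInUse would give an approximate (most but not all) figure for the memory the Vid is using for VMs.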

  Enjoy,

  Anthony F. Voellm (aka Tony)

Looking for that last ounce of performance? Then try affinitizing your VM to a NUMA node

There are not many performance knobs in Hyper-V, which is by design.  We really seek out-of-the-box performance.  However if you are looking for that last bit of performance from your Virtual Machines (VM’s) and have already made a good selection for networking and storage you might consider setting the Non-Uniform Memory Access (NUMA) node.

On AMD and some Intel based architectures the machine might be defined as NUMA.  You can find out if the machine is NUMA by looking at the "Hyper-V VM Vid Numa Node" performance counters.  This counter set will report a counter set instance for each NUMA node unless the machine is a single node. 

NUMA means each CPU has a different path to memory and those paths can have various lengths.  For example CPU 0 on Node 0 when accessing CPU M’s memory on Node X might take 10ns whereas CPU 0 on Node 0 accessing CPU N’s memory on Node Y will take 20ns.  It’s this difference in memory access times that can impact overall VM performance.  The worst case would be for a VM’s Virtual Processor (VP) to be running on the node furthest from where the memory for the VM is allocated.

In order to improve performance you can place VM’s on different nodes.  In addition to placing the memory, the Hypervisor scheduler will attempt to run the VM’s VP’s near where the VM’s memory is allocated.  This creates a dual affinity which can be very beneficial.

If you would like to try placing VM’s on different nodes to improve performance you can use the code sample below to place the VM.

Example 1 – List currently configured VM’s

            C:\> .\numa.ps1 /list

Example 2 – Set NUMA affinity to node 1.  Node numbering starts at node 0

            C:\> .\numa.ps1 /set tonyvm 1

Example 3 – Clear NUMA affinity

            C:\> .\numa.ps1 /clear tonyvm

  Enjoy,

  Anthony F. Voellm (aka Tony)

Save the following PowerShell script into a file named numa.ps1.  WS08 includes PowerShell as part of the release.  If you are running management from a remote machine that is XP / Vista you can download PowerShell here http://www.microsoft.com/windowsserver2003/technologies/management/powershell/default.mspx

 

################################################
# Developer: Anthony F. Voellm
#          : Taylor Brown
# Copyright (c) 2008 by Microsoft Corporation
# All rights reserved
#
# This is "demonstration" code and there are no
# warranties expressed or implied
################################################

 

# This script will set the Virtual Machine to run
# on a specific NUMA node

 

# Check command line arguments

if (($args.length -lt 1) -or
    (($args[0] -ne "/list") -and
     ($args[0] -ne "/set") -and
     ($args[0] -ne "/clear")) -or
     (($args[0] -eq "/set") -and ($args.length -lt 3)) -or
     (($args[0] -eq "/clear") -and ($args.length -lt 2))) {
     Write-Host "numa.ps1 /list [<Hyper-V host>]"
     Write-Host "numa.ps1 /set <vm machine name> <required node> [<Hyper-V host>]"
     Write-Host "numa.ps1 /clear <vm machine name> [<Hyper-V host>]`n"
     Write-Host "Options:"
     Write-Host "`t/list - show configured VM's"
     Write-Host "`t/set <vm machine name> <required node> - set the NUMA node for the VM"
     Write-Host "`t/clear <vm machine name> - clear NUMA node setting for the VM"
     exit;
  }

 

# just display VM's (on the local host unless a Hyper-V host is given)
if ($args[0] -eq "/list") {
  $HyperVHost = '.';
  if ($args.length -gt 1) {
    $HyperVHost = $args[1];
  }
  Get-WmiObject -Namespace 'root\virtualization' -Query "Select * From Msvm_ComputerSystem" -ComputerName $HyperVHost | select ElementName
  exit;
}

 

# Set or clear

$HyperVHost = '.';
if ($args[0] -eq "/set") {
  if ($args.length -gt 3) {
    $HyperVHost = $args[3];
  }
  $VMName = $args[1];
  $RequiredNode = $args[2];
} elseif ($args[0] -eq "/clear") {
  if ($args.length -gt 2) {
    $HyperVHost = $args[2];
  }
  $VMName = $args[1];
}

  
#Main Script Body
$VMManagementService = Get-WmiObject -Namespace root\virtualization -Class Msvm_VirtualSystemManagementService -ComputerName $HyperVHost


$Query = "Select * From Msvm_ComputerSystem Where ElementName='" + $VMName + "'"
 

$SourceVm = Get-WmiObject -Namespace root\virtualization -Query $Query -ComputerName $HyperVHost

 

$VMSettingData = Get-WmiObject -Namespace root\virtualization -Query "Associators of {$SourceVm} Where ResultClass=Msvm_VirtualSystemSettingData AssocClass=Msvm_SettingsDefineState" -ComputerName $HyperVHost

 

if ($args[0] -eq "/set") {
  $VMSettingData.NumaNodesAreRequired = 1
  $VMSettingData.NumaNodeList = @($RequiredNode)
} else {
  $VMSettingData.NumaNodesAreRequired = 0
}

 

$VMManagementService.ModifyVirtualSystem($SourceVm, $VMSettingData.PSBase.GetText(1))

  

 

Posted by tvoellm | 1 Comments

Hyper-V Storage Analysis

Hyper-V Perfies,

In a previous blog entry I explained the different types of storage choices you have with Hyper-V.  However we had not released yet so no numbers could be published.  You can find the original post below...

http://blogs.msdn.com/tvoellm/archive/2007/10/13/what-windows-server-virtualization-aka-viridian-storage-is-best-for-you.aspx

Now that we have released I wanted to share some numbers from a mid-range storage device. 

For the storage performance results below we used:

  • A dual core 2GHz system
  • 8 SAS disk system configured to run RAID0
  • Off the shelf SCSI RAID controller
  • We pre-populated the Virtual Hard Disks with non-zero data to avoid testing an optimization we have where zero blocks are not actually written to Dynamic / Diff VHDs if they have never been written before. 
  • IOMeter was used to generate the load and gather results.
  • Root results were tested using VHD loopback.  This means we used the WMI interfaces in Windows Server 2008 to mount the VHD into the root without any virtual machine.  This was pretty close (<1% diff) to bare metal performance (no Hyper-V enabled).  Check out this link on how to do it http://blogs.msdn.com/virtual_pc_guy/archive/2008/02/01/mounting-a-virtual-hard-disk-with-hyper-v.aspx 
  • SCSI and IDE results were collected from a Windows Server 2008 64-bit guest with one virtual processor.
  • Only read performance was collected for fixed Root VHD + 1 Diff Disk.  This is why some parts of the graphs are labeled "NOT TESTED".  Reads have to traverse the differencing chain, whereas writes go to the outermost disk, so writes look just like dynamic disk write performance and were not collected.

We chose this configuration because it was pretty low to mid range, would be very low cost, and therefore within reach of all customers running Hyper-V.  If you are running Hyper-V with big servers and storage containing 100's of disks you will find it handles it very well with very little overhead.  That might be the subject of a future post.

The following graphs show sequential I/O performance of 512 Byte reads and writes.  We chose to measure this because it would approximate the behavior of loading applications, editing small files, and other types of workloads doing sequential I/O.  We saw big improvements with larger I/O's (>4x) but for many workloads those are not that likely to be encountered.

The following graphs show 8 KByte random read and write performance.  We chose 8 KByte reads and writes because many servers such as SQL Server 2008 use this sized I/O for most operations. 

From the graphs you can see that Emulated IDE (aka you are not running with Integration Components installed) is never a good choice.

Passthrough followed by fixed disks have the overall best performance.  Dynamic and differencing disks are good choices when you need flexibility.

   Enjoy,

   Tony Voellm

   Liang Yang

 

 

Posted by tvoellm | 0 Comments

WS08 Hyper-V now supports 24LP

Hyper-V Perfies,

 

Now available to the general public is Windows Server 2008 (WS08) Hyper-V 24 Logical Processor (LP) Support.  This revises our RTM support limit of 16 Logical Processors.  We also increased the number of supported running virtual machines from 128 to 192.  This update also has the side benefit of improving workloads that start large numbers of processes by increasing the process address space cache limit from 192 to 384.  All in all this update will allow you to get even more mileage from WS08 Hyper-V.

 

You can find the needed update here http://www.microsoft.com/downloads/details.aspx?FamilyID=fe36823a-7e5a-4262-9bf5-d6b3ae3ad375&DisplayLang=en

 

Here is some Q & A to help you figure out how and when to install this update:

Q: Should everyone install this update?

A: No.  Install this update only on machines needing 24LP support or if you are running more than 192 processes in the root (not recommended) / guest.  Keep in mind there are very few workloads that create this many processes and there are only two that come to mind: Terminal Services and Hyper-V running > 192 virtual machines (VM).

 

Q: Does this mean I can now run 192 VM's on my 16 logical processor system?

A: No.  We only support an 8 virtual processor (VP) to 1 logical processor (LP) limit.

 

Q: What does 8 VP to 1 LP mean?

A: This means that if you have 8 logical processors you can run up to 64 virtual processors in any combination.  For example 16 Windows Server 2008 (WS08) VM's with 4 VP each, or 64 Windows Server 2003 VM's with 1 VP each, or 8 WS08 4VP VM's with 32 W2k3 1VP VM's.  You get the picture.

 

Q: What is a logical processor aka LP?

A: A logical processor is what we call anything that shows up in Windows as a processor in Task Manager.  More precisely it is the "number of processors * number of cores * 2 if you are running Hyper Threading (HT), now called Simultaneous Multithreading (SMT)."  For example a dual proc quad core system with SMT enabled has 2 * 4 * 2 = 16 LP.  A quad proc dual core system has 4 * 2 = 8 LP.
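The LP formula and the 8 VP to 1 LP limit from the answers above can be checked with a small Python sketch.  The function names `logical_processors` and `fits_vp_limit` are hypothetical helpers written for this illustration; they are not part of any Hyper-V interface.

```python
VP_PER_LP_LIMIT = 8  # supported virtual-to-logical processor ratio from the Q&A


def logical_processors(sockets, cores_per_socket, smt_enabled=False):
    """LP = processors * cores per processor * 2 if HT/SMT is enabled."""
    return sockets * cores_per_socket * (2 if smt_enabled else 1)


def fits_vp_limit(lp_count, vm_vp_counts):
    """vm_vp_counts holds one VP count per VM; True if within the 8:1 limit."""
    return sum(vm_vp_counts) <= VP_PER_LP_LIMIT * lp_count


dual_quad_smt = logical_processors(2, 4, smt_enabled=True)  # dual proc, quad core, SMT
quad_dual = logical_processors(4, 2)                        # quad proc, dual core
mixed_ok = fits_vp_limit(8, [4] * 8 + [1] * 32)             # 8 WS08 4VP + 32 W2k3 1VP on 8 LP
too_many = fits_vp_limit(16, [1] * 192)                     # 192 single-VP VMs on 16 LP

print(dual_quad_smt, quad_dual, mixed_ok, too_many)
```

The last case shows why 192 VM's do not fit on a 16 LP system: 16 LP allows at most 128 VP, and every VM needs at least one.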

 

Leading the 24LP effort was a blast and it is also why my blog has been pretty quiet recently.

 

  Enjoy,

  Tony Voellm

 

Posted by tvoellm | 1 Comments

BizTalk team releases best practice doc for running on WS08 Hyper-V

Below is a great link on how to get BizTalk 2006 working on top of Hyper-V.  It includes practical advice that applies to other servers and roles as well.

Check it out... http://msdn.microsoft.com/en-us/library/cc768518.aspx

   Tony

Posted by tvoellm | 1 Comments

How to get Processor Utilization for Hyper-V via WMI

There are a number of groups building management software (OEMs, Microsoft, …) for Hyper-V which is cool to see.  A common ask from these teams has been around reading and computing VM CPU usage.

The following is an example of how to compute Hyper-V guest processor usage.  You can use the same formula for “% Total Run Time”, “% Hypervisor Run Time” and “% Idle Time”.   The counters show up in the Win32_PerfRawData_HvStats_HyperVHypervisorLogicalProcessor WMI object as “PercentGuestRunTime”, “PercentTotalRunTime”, “PercentHypervisorRunTime”, and “PercentIdleTime”.

To make the formula easier to read let's use:

                GN = Percent Guest Run Time (substitute other usage values here)
                PN = Timestamp_PerfTime
                FN = Frequency_PerfTime
                LP = Number of logical processors (get this from the “Hyper-V Hypervisor” counter set)

                                  F1 * (G2 - G1)
                Utilization =  -------------------------
                                100000 * LP * (P2 - P1)

 

G2 and P2 are the second values read and G1 and P1 are the first values read.

To test the formula let's read the “Hyper-V Hypervisor Logical Processor” counter set twice via the Win32_PerfRawData_HvStats_HyperVHypervisorLogicalProcessor WMI object about 10 seconds apart with a single VM running at 100% Guest CPU.  Since my test machine has two CPU’s (2 LP) this means we should see about 50% overall utilization.

V:\backup>winrm enum wmi/root/cimv2/* -filter:"select * from Win32_PerfRawData_HvStats_HyperVHypervisorLogicalProcessor where name='_Total'"

Win32_PerfRawData_HvStats_HyperVHypervisorLogicalProcessor
    C1TransitionsPersec = 409197889
    C2TransitionsPersec = 0
    C3TransitionsPersec = 0
    Caption = null
    ContextSwitchesPersec = 889911109
    Description = null
    Frequency_Object = 0
    Frequency_PerfTime = 14318180
    Frequency_Sys100NS = 10000000
    HardwareInterruptsPersec = 92282462
    InterProcessorInterruptsPersec = 8174254
    InterProcessorInterruptsSentPersec = 8174254
    MonitorTransitionCost = 16
    Name = _Total
    PercentC1Time = 4193635539355
    PercentC2Time = 0
    PercentC3Time = 0
    PercentGuestRunTime = 314976793671
    PercentHypervisorRunTime = 53745475789
    PercentIdleTime = 8385447570540
    PercentTotalRunTime = 368722269460
    SchedulerInterruptsPersec = 384836664
    TimerInterruptsPersec = 33425466
    Timestamp_Object = 0
    Timestamp_PerfTime = 6268633722843
    Timestamp_Sys100NS = 4199325031975
    TotalInterruptsPersec = 518718846

 

V:\backup>winrm enum wmi/root/cimv2/* -filter:"select * from Win32_PerfRawData_HvStats_HyperVHypervisorLogicalProcessor where name='_Total'"

Win32_PerfRawData_HvStats_HyperVHypervisorLogicalProcessor
    C1TransitionsPersec = 409201218
    C2TransitionsPersec = 0
    C3TransitionsPersec = 0
    Caption = null
    ContextSwitchesPersec = 889922035
    Description = null
    Frequency_Object = 0
    Frequency_PerfTime = 14318180
    Frequency_Sys100NS = 10000000
    HardwareInterruptsPersec = 92283571
    InterProcessorInterruptsPersec = 8174425
    InterProcessorInterruptsSentPersec = 8174425
    MonitorTransitionCost = 16
    Name = _Total
    PercentC1Time = 4193667417779
    PercentC2Time = 0
    PercentC3Time = 0
    PercentGuestRunTime = 315044817737
    PercentHypervisorRunTime = 53746578312
    PercentIdleTime = 8385511363951
    PercentTotalRunTime = 368791396049
    SchedulerInterruptsPersec = 384840537
    TimerInterruptsPersec = 33426627
    Timestamp_Object = 0
    Timestamp_PerfTime = 6268728855292
    Timestamp_Sys100NS = 4199364353043
    TotalInterruptsPersec = 518725160

 

Based on the formula and the two samples above we get about 51%, which is spot on given the expected 50%.
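As a cross-check, here is a minimal Python sketch that applies the formula to the relevant fields of the two samples above.  The function name `utilization_percent` and the dictionary layout are assumptions made for this illustration; only the field names and numbers come from the winrm output.

```python
def utilization_percent(first, second, lp_count, counter="PercentGuestRunTime"):
    """Apply Utilization = F1 * (G2 - G1) / (100000 * LP * (P2 - P1))."""
    delta_counter = second[counter] - first[counter]
    delta_ticks = second["Timestamp_PerfTime"] - first["Timestamp_PerfTime"]
    return (first["Frequency_PerfTime"] * delta_counter) / (100000 * lp_count * delta_ticks)


# Fields copied from the first and second winrm reads shown above.
sample1 = {
    "PercentGuestRunTime": 314976793671,
    "Timestamp_PerfTime": 6268633722843,
    "Frequency_PerfTime": 14318180,
}
sample2 = {
    "PercentGuestRunTime": 315044817737,
    "Timestamp_PerfTime": 6268728855292,
    "Frequency_PerfTime": 14318180,
}

guest_util = utilization_percent(sample1, sample2, lp_count=2)
print(f"Guest utilization: {guest_util:.1f}%")
```

Passing a different counter name (for example "PercentIdleTime") would compute the other usage values the same way, provided that field is added to both sample dictionaries.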

  Enjoy,

    Tony Voellm

 

 

 

WS08 Hyper-V has RTMed!!!!

Wow, it's been a great ride helping get Hyper-V to perform the way it does.  So much so I've not posted in a while.  You will see some upcoming posts on performance counters, WMI perf interfaces, and real data (yes now with RTM I can post data).

In the meantime check out my co-worker's post on where to get the final (aka RTM) WS08 Hyper-V bits...

http://blogs.technet.com/jhoward/archive/2008/06/26/hyper-v-rtm-announcement-available-today-from-the-microsoft-download-centre.aspx 

   Tony

 

Posted by tvoellm | 1 Comments

Windows Server Performance 2008 Tuning guide now includes Hyper-V

The Windows Server Performance team has updated the Windows Server 2008 Tuning guide to include Hyper-V.  The Hyper-V section provides a lot of recommendations you will find useful.

Check it out below...

http://blogs.technet.com/winserverperformance/archive/2008/06/17/power-and-hyper-v-are-now-part-of-the-windows-server-2008-tuning-guide.aspx

 

 

Posted by tvoellm | 0 Comments

Why can't I start my VM when there is plenty of free memory?

Issue #1: A number of customers have reported that VM's fail to start when there is plenty of memory.  The error reported from the Hyper-V manager is displayed below.  Here I tried to start a 4GB VM when there was 7+GB free.   What's going on?

[Image: Hyper-V Memory Error]

The most common cause has to do with a bug in Windows Server 2008 NUMA memory allocations.  This means you'll only hit this bug on NUMA machines (all multi-proc AMD machines are NUMA + many high end Intel based servers).  You can verify you have this bug by looking at the Task Manager reported "Cached" and "Free" memory.

[Image: Hyper-V Free Memory Task Manager]

Here free memory is 9MB but really much of the "Cached" memory can be made free also.  Standby Pages, Modified Pages, and System File Cache make up the "Cached" memory.  Most (all but a couple hundred MB) of the cached memory can be converted to "free".

Fix #1: To resolve the issue you will need to contact Microsoft Product Support http://support.microsoft.com/contactus/cu_sc_more_master#tab1 and request the following hotfix - KB953585

Alternate Fix #1: The second way to avoid this bug is to not do any work in the root partition directly.  Create a "management" VM and do your work there.

Background #1:  The most common way of hitting this issue is by running large file copies in the root (aka host) partition.  Doing lots of file copies causes the System File Cache to bloat in size.  When Hyper-V goes to start a VM it will flush the System File Cache if it needs memory.  The System File Cache pages (aka file pages from the copies) move to the Standby list and from there will get zeroed and freed.  The bug is that WS08 prevents NUMA allocations from the Standby list.

Issue #2: The second most common cause of the can't-start-the-VM issue when there appears to be plenty of memory is configuring a NUMA machine to use fewer processors than the system actually has.  Typically this is done using "bcdedit /set numproc X". 

Fix #2: If you set a NUMA machine in this mode for testing, memory attached to the "hidden" processors will be inaccessible to Hyper-V.  Some machines allow you to reassign the memory (via the BIOS or front end controller) to the "active" processors.

  - Tony Voellm

Posted by tvoellm | 1 Comments

Hyper-V Performance FAQ


Anthony F Voellm (aka Tony)

6/19/2008

http://blogs.msdn.com/tvoellm

 

Q: What is the recommended configuration for performance testing?

A: Here are some simple steps:

1.       Be sure to have the latest WS08 Hyper-V build – Hyper-V RC1 which is on Microsoft Downloads and Windows Update

2.       Next you need to make sure you are running a “Supported OS” with the latest SP. 

         Operating System                     Virtual Processor Limit
         Windows Server 2008 64-bit           4
         Windows Server 2003 32-bit           2
         Windows Server 2008 32-bit           4
         Windows Server 2003 64-bit           1
         Windows Vista SP1 32-bit / 64-bit    1
         Windows XP SP3 32-bit / 64-bit       1
         Windows 2000 32-bit                  1
         SUSE/RedHat LINUX                    1

 

3.       Make sure the guest and root OS have integration components installed (http://blogs.msdn.com/tvoellm/archive/2008/04/19/hyper-v-how-to-make-sure-you-are-getting-the-best-performance-when-doing-performance-comparisons.aspx and http://blogs.msdn.com/tvoellm/archive/2008/01/02/hyper-v-integration-components-and-enlightenments.aspx )

4.       Make sure you are using the “Network Adapter” and not the “Legacy Network Adapter”.  The legacy adapter has a lot of emulation which causes lots of CPU overhead.

5.       Use passthrough disks attached to SCSI for the best performance.  Next best is Fixed VHD attached to SCSI.  To understand storage better see (http://blogs.msdn.com/tvoellm/archive/2007/10/13/what-windows-server-virtualization-aka-viridian-storage-is-best-for-you.aspx )

6.       Follow these tips for avoiding pitfalls http://blogs.msdn.com/tvoellm/archive/2008/04/19/hyper-v-how-to-make-sure-you-are-getting-the-best-performance-when-doing-performance-comparisons.aspx

Q: How do I monitor performance?

A: First you need to understand that the clocks in the root and guest Virtual Machines may not be accurate (see http://blogs.msdn.com/tvoellm/archive/2008/03/20/hyper-v-clocks-lie.aspx ).  Given an understanding of clocks you can see why we implemented the “Hyper-V Hypervisor Logical Processor” performance counters (access using perfmon) which are not skewed by clock effects.  There are other Hyper-V performance counters that are useful.  See the following for more details (http://blogs.msdn.com/tvoellm/archive/tags/Hyper-V+Performance+Counters/default.aspx )

Q: Should I use passthrough or iSCSI attached to the guest for storage?

A: The decision depends on what features you need to expose to the guest.  In the passthrough case the drive will show up without knowledge of the underlying LUN. 

Educated guess: If you are looking for raw performance passthrough will give you the best result.

Reason: When doing IO from the guest using passthrough you traverse the guest storage stack + disk stack in the root.  When doing iSCSI you traverse the storage + networking stack in the guest + root networking stack.

Q: Is there a simple way to disable the hypervisor to run some baseline tests on the native system?

A: Yes.  Run “bcdedit /set hypervisorlaunchtype off” and reboot the server. You should also consider changing the protocols on the root network device to re-enable IP and turn off the “Microsoft Virtual Network Switch Protocol”.   See the following for more details (XXX – coming to my blog soon :) ).  For starters turning off the Hypervisor should be enough for testing native performance.

To turn it back on do “bcdedit /set hypervisorlaunchtype on” and reboot the server.

Q:  Is there any common terminology used to talk about virtual machine configurations?

A: Yes - Internal to Microsoft we typically use the following:

·         Native = System without the Hyper-V role. This means you have no virtual drivers, virtual switch, …

·         Root = you have the Hyper-V role installed but are not using a virtual switch for networking.

·         Guest = Guest Virtual Machine.

·         8p.child.2x1p or better 8p.child.2VMx1VP = A system with 8 logical processors / cores running 2 Virtual Machines (VM) each with 1 Virtual Processor (VP)

Q: Are there any services that should be stopped?

A: No, if you are running Server Core, which is the ideal root OS to use.  Regardless of running Server Core vs. full server you should close the Hyper-V Management Console because it has a noticeable impact on CPU.  If you want the details see http://blogs.msdn.com/tvoellm/archive/2008/04/19/hyper-v-how-to-make-sure-you-are-getting-the-best-performance-when-doing-performance-comparisons.aspx

Q: Is it OK to run applications / processes in the root OS?

A: You should avoid running any Role / Feature or custom service in the root.  If you have services you want to run put them in a guest VM.  Running roles in the root can have a negative impact on the guest VM’s.  This is due to how the Hypervisor scheduler handles the root virtual processors.

Q: Are there additional knobs for performance nuts?

A: We are trying to make Hyper-V knob-less.  However we are engineers and here are some tips.

1.       Remove the CDROM drive from the guest if you don’t need it

2.       Look into the Caps / Weights / Reserves in the CPU config.  You can use these to “balance” workloads.

3.       You can use the WMI interfaces to force a VM to a particular node (coming to my blog soon :) ).  We don’t guarantee node affinity for VP’s but we do for memory.  There is a good chance the VP’s will stay on the node because the scheduler is NUMA aware.

Q: Are there additional resources that are useful for understanding Hyper-V?

A: Yes – here is a list

       http://blogs.technet.com/windowsserver/default.aspx

       http://blogs.msdn.com/virtual_pc_guy/

       http://blogs.technet.com/jhoward/

       http://blogs.msdn.com/tvoellm

       http://blogs.technet.com/winserverperformance/

 

Posted by tvoellm | 2 Comments

Negative ping times in Windows VM's - what's up?

Just a quick blog post that might help you resolve an issue that some customers have seen running under Hyper-V VM's.  The issue is negative ping times on multi-processor guests.

If you see negative ping times in multiprocessor W2k3 guest OSes you might consider setting the /usepmtimer switch in the boot.ini file. 

The root issue comes about from the Win32 QueryPerformanceCounter function.  By default it uses a time source called the TSC.  This is a CPU time source that essentially counts CPU cycles.  The TSC for each (virtual) processor can be different so there is no guarantee that reading TSC on one processor has anything to do with reading TSC on another processor.  This means back to back reads of TSC on different VP's can actually go backwards.  Hyper-V guarantees that TSC will not go backwards on a single VP.

So the problem with negative ping times is that the time source is QueryPerformanceCounter, which is using TSC.  By using the /usepmtimer boot.ini flag you change the time source for QueryPerformanceCounter from TSC to the PM timer, which is a global time source.
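To make the effect concrete, here is a toy Python model of two virtual processors whose TSC values differ by a fixed offset.  This is a simulation only: the offsets, the cycle counts, and the names `read_tsc` and `elapsed_cycles` are invented for the illustration and do not call any Windows API.

```python
# Hypothetical per-VP TSC offsets in cycles; real values depend on the system.
tsc_offset = {0: 1_000_000, 1: 999_000}


def read_tsc(vp, true_cycles):
    """Model a TSC read on one VP as a shared cycle count plus that VP's offset."""
    return true_cycles + tsc_offset[vp]


def elapsed_cycles(start_vp, end_vp, start_time, end_time):
    """Elapsed cycles as a caller would compute them from two TSC reads."""
    return read_tsc(end_vp, end_time) - read_tsc(start_vp, start_time)


# 500 cycles really pass between the two reads in both cases.
same_vp = elapsed_cycles(0, 0, 5_000, 5_500)   # both reads on VP 0
cross_vp = elapsed_cycles(0, 1, 5_000, 5_500)  # thread moved from VP 0 to VP 1

print("same VP:", same_vp)
print("cross VP:", cross_vp)
```

With both reads on one VP the interval is positive, matching the guarantee that TSC does not go backwards on a single VP.  When the second read lands on a VP with a smaller offset the computed interval goes negative, which is the shape of the negative ping time.  A global time source such as the PM timer corresponds to using one offset for every VP.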

  - Tony Voellm

Posted by tvoellm | 3 Comments

Hyper-V Performance Counters – Part four of many – “Hyper-V Hypervisor Virtual Processor” and “Hyper-V Hypervisor Root Virtual Processor” counter set

The “Hyper-V Hypervisor Virtual Processor” and “Hyper-V Hypervisor Root Virtual Processor” counter sets have the same counters.  The only difference between the two is that “Hyper-V Hypervisor Root Virtual Processor” contains counters for only the Root Virtual Processors (VP’s) whereas “Hyper-V Hypervisor Virtual Processor” has counters for all other partitions.

The virtual processor counters are very useful because they help you understand how much guest VM’s are running and where they are running.  Unfortunately these counters do suffer from a small amount of clock skew in WS08 Hyper-V but this only slightly reduces their usefulness.  We hope to remove the clock skew in future releases.  The skew shows up in that some “%” counters may exceed 100%.  I’ve seen some go as high as 110% depending on the system load.  The problem has to do with the fact that this counter set uses the clock from the root rather than from the hypervisor as a basis of time. For more on clock skew see (http://blogs.msdn.com/tvoellm/archive/2008/03/20/hyper-v-clocks-lie.aspx). 

Virtual Processors (VP) are the unit of execution for a partition and each partition contains one guest virtual machine (VM).   For each VP there is a set of counters.  Perfmon.exe will let you view the counters separately or as an average for all VP’s called “_Total”.  VP counters are prefixed with the name of the partition like this “WS08 Guest 1:” followed by the VP id like this “Hv VP 0”.  This makes it easy to identify which VP’s go with which partitions.

The VP counters have a lot of detail on what the virtual processors are doing so I have ordered them with the most useful counters at the top.

Hyper-V Hypervisor [Root] Virtual Processor counters

·         %Guest Run Time – For guest VM’s this is the percentage of time the guest VP is running in non-hypervisor code on an LP or for the _Total the total across all guest VP’s.   For the root this is the percentage of time the root VP is running in non-hypervisor code on an LP or for _Total the total across all root VP’s.  If you sum the _Total for both the guest VP’s and root VP’s this will equal the % Guest Run Time _Total of the Logical Processor counter set.

·         %Hypervisor Run Time – For guest VM’s this is the percentage of time the guest VP is running in hypervisor code on an LP or for the _Total the total across all guest VP’s.   For the root this is the percentage of time the root VP is running in hypervisor code on an LP or for _Total the total across all root VP’s.  If you sum the _Total for both the guest VP’s and root VP’s this will equal the % Hypervisor Run Time _Total of the Logical Processor counter set.

·         %Total Run Time – This is just a sum of %Guest Run Time + %Hypervisor Run Time on a per VP basis.  If you add _Total from both the Root Virtual Processor and Virtual Processor counter sets it will equal  (% Total Run Time _Total - % Idle Time _Total) from the Logical Processor counters.

·         Total Intercepts/sec – Whenever a guest VP needs to exit its current mode of running for servicing in the hypervisor this is called an intercept.  Some common causes of intercepts are resolving Guest Physical Address (GPA) to System Physical Address (SPA) translations, privileged instructions like hlt / cpuid / in / out, and the end of the VP’s scheduled time slice.

·         Total Intercepts Cost – This is a relative measure of cost of intercepts.   The cost can vary based on the types of intercepts and the machine architecture.

·         Hypercalls/sec – Hypercalls are one form of enlightenment.  Guest OS’s use the enlightenments to more efficiently use the system via the hypervisor.   TLB flush is an example hypercall.  If this value is zero and stays zero this is an indication that Integration Components are not installed.  New OS’s like WS08 can use hypercalls without enlightened drivers, so a non-zero value is only a prereq. for, not a guarantee of, having Integration Components installed.

·         Hypercalls Cost – This is a relative measure of cost of hypercalls.   The cost can vary based on the types of calls and the machine architecture.

·         HLT Instructions/sec – Number of CPU halts per second on the VP.  A HLT will cause the hypervisor scheduler to de-schedule the current VP and move to the next VP in the runlist.

·         HLT Instructions Cost - This is a relative measure of cost of halt.   The cost can vary based on the machine architecture.