Virtual Machine Queue (VMQ) – Hyper-V 2012 R2 & 2016

In this post I will explain how to configure Virtual Machine Queue (VMQ) on Windows Server 2012 R2 and Windows Server 2016 with Hyper-V.

 Teaming Mode          Address Hash   Hyper-V Port   Dynamic
 Switch Independent    Min Queues     Sum Queues     Sum Queues
 Switch Dependent      Min Queues     Min Queues     Min Queues
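
The overview above can be read as a simple lookup from teaming mode and load distribution mode to the VMQ queue mode. A minimal sketch in Python; the names VMQ_QUEUE_MODE and vmq_queue_mode are made up for illustration, and the values are copied from the overview:

```python
# Lookup table copied from the teaming overview in this post.
# Key: (teaming mode, load distribution mode); value: VMQ queue mode.
VMQ_QUEUE_MODE = {
    ("Switch Independent", "Address Hash"): "Min Queues",
    ("Switch Independent", "Hyper-V Port"): "Sum Queues",
    ("Switch Independent", "Dynamic"): "Sum Queues",
    ("Switch Dependent", "Address Hash"): "Min Queues",
    ("Switch Dependent", "Hyper-V Port"): "Min Queues",
    ("Switch Dependent", "Dynamic"): "Min Queues",
}


def vmq_queue_mode(teaming_mode, load_distribution):
    """Return the VMQ queue mode for a team (hypothetical helper)."""
    return VMQ_QUEUE_MODE[(teaming_mode, load_distribution)]


print(vmq_queue_mode("Switch Independent", "Dynamic"))       # Sum Queues
print(vmq_queue_mode("Switch Independent", "Address Hash"))  # Min Queues
```

Only the two Switch Independent combinations with Hyper-V Port or Dynamic end up in Sum Queues mode; everything else is Min Queues.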

Guidelines for setting up VMQ:

  • Do not use CPU 0
  • Only physical cores can be used
  • Do not overlap NUMA nodes
  • Stay below 64 logical CPUs

Explanation of the guidelines:

  • Why do you skip CPU 0?
    • CPU 0 is reserved for system processes
  • Why can we only use physical cores?
    • With hyper-threading enabled, VMQ can only use the even-numbered logical processors, which correspond to the physical cores
  • Do not overlap NUMA nodes, use 1 NUMA node per NIC
    • NUMA overlap will result in a sum error when using Hyper-V Port/Dynamic (Event ID 106)
  • Stay below 64 logical CPUs
    • The Windows OS running in the management partition, known as the host or “root” partition, will only use up to a maximum of 64 root Virtual Processors (root VPs)
  • Impact on VMs
    • The Hyper‑V hypervisor will continue to manage and utilize all logical processors in the system to run any workload in guest Virtual Machines (VMs) on guest VPs
  • Spanning VMQ
    • Try to use 1 NUMA node per NIC. If the socket is split into 2 NUMA nodes, then each NUMA node has its own local resources like CPU & RAM
  • Check the maximum NumberOfReceiveQueues
    • Example: Emulex adapters currently support up to 30 VMQs on Windows Server 2012
  • RSS is disabled as soon as a network adapter is connected to a virtual switch
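
The guidelines can be turned into a quick sanity check before you touch a host. This is only a sketch in Python: check_vmq_assignment is a hypothetical helper, its base and max_number arguments mirror -BaseProcessorNumber and -MaxProcessorNumber, and the NUMA boundaries are values you supply yourself (here NUMA node 0 is assumed to cover logical processors 0 to 13).

```python
def check_vmq_assignment(base, max_number, numa_start, numa_end, hyperthreading=True):
    """Return a list of guideline violations for one NIC (hypothetical helper).

    base / max_number: first and last logical processor assigned to the NIC.
    numa_start / numa_end: first and last logical processor of the NUMA node
    the NIC is supposed to stay on (assumed to be a contiguous range).
    """
    problems = []
    if base == 0:
        problems.append("CPU 0 is reserved for system processes")
    if hyperthreading and (base % 2 or max_number % 2):
        problems.append("use even-numbered logical processors (physical cores)")
    if base < numa_start or max_number > numa_end:
        problems.append("processor range overlaps another NUMA node")
    if max_number > 63:
        problems.append("stay below 64 logical CPUs")
    return problems


print(check_vmq_assignment(2, 12, 0, 13))  # []
for problem in check_vmq_assignment(0, 15, 0, 13):
    print(problem)
```

An empty list means the range passes all four checks. The second call reports three problems: CPU 0 is used, 15 is an odd (hyper-threading) processor, and the range runs past the end of the NUMA node.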

I will describe the steps to configure VMQ on Windows Server 2012+ Hyper-V servers below.

Sub-NUMA Clustering (Skylake) and COD (Cluster-on-Die, Haswell) are two fairly new BIOS options that split the CPU resources of a socket into multiple NUMA nodes. Turn these options off unless you know how they work.

Microsoft has no information or best practices about these configurations yet. If you have more information about Sub-NUMA & Hyper-V, please let me know in the comments below.

VMware has support for COD in VMware 6 or later.

Each server has the following hardware:

  • 2 NICs (1 active/active team) and 1 Switch
  • 2 Sockets (Intel/AMD) with 1 NUMA node per socket
  • Hyperthreading enabled

Do you want to reset the current advanced adapter properties first?

Reset-NetAdapterAdvancedProperty -Name "NIC 1" -DisplayName "*"
Reset-NetAdapterAdvancedProperty -Name "NIC 2" -DisplayName "*"

Dynamic / Hyper-V Port (Sum of Queues) VMQ configuration:

Step 1. VMQs per NIC

Check the current configuration with PowerShell.

Get-NetAdapterVmq -Name "NIC 1" | fl

Output (ex.): NumberOfReceiveQueues : 31 (31 VMQs)

Get-NetAdapterVmq -Name "NIC 2" | fl

Output (ex.): NumberOfReceiveQueues : 31 (31 VMQs)

Note the maximum number of VMQs that the network adapters in your machine support.
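
The per-NIC queue counts translate into a team total that depends on the queue mode. As I understand the two modes, Sum Queues adds up the queues of the team members and Min Queues is limited by the member with the fewest queues. A small Python sketch of that arithmetic (team_queue_count is a hypothetical name, the 31 comes from the example output):

```python
def team_queue_count(queue_mode, queues_per_nic):
    """Total VMQs available to a team (hypothetical helper).

    queue_mode: "Sum Queues" or "Min Queues".
    queues_per_nic: NumberOfReceiveQueues of every team member.
    """
    if queue_mode == "Sum Queues":
        return sum(queues_per_nic)
    if queue_mode == "Min Queues":
        return min(queues_per_nic)
    raise ValueError(f"unknown queue mode: {queue_mode}")


print(team_queue_count("Sum Queues", [31, 31]))  # 62
print(team_queue_count("Min Queues", [31, 31]))  # 31
```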

Step 2. NUMA Nodes & Cores with hwloc

Get detailed information about your NUMA nodes & cores.
Download: https://www.open-mpi.org/software/hwloc/v1.11/
Unzip hwloc-win64-build-1.11.x.zip to your C:\ drive.

Open CMD and browse to the bin folder:

CD C:\hwloc-win64-build-1.11.10\bin

Open the following executable in CMD:

lstopo-no-graphics.exe

Output lstopo:
In this small example we only have 2 sockets and 2 NUMA nodes with 2 physical cores each.

Socket 1 - NUMANode 0 - NIC1:
NUMANode L#0 (P#0 116GB) + Package L#0 + L3 L#0 (35MB)
  L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
    PU L#0 (P#0)   <- CPU 0, do not use this one!
    PU L#1 (P#1)   <- Hyper-threading core (ignore)
  L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
    PU L#2 (P#2)   <- CPU 2, available for VMQ NIC 1
    PU L#3 (P#3)   <- Hyper-threading core (ignore)

Socket 2 - NUMANode 1 - NIC2:
NUMANode L#1 (P#1 133GB) + Package L#1 + L3 L#1 (35MB)
  L2 L#2 (1024KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
    PU L#4 (P#4)   <- CPU 4, do not use this one (reserved to keep the core count equal to NIC 1)
    PU L#5 (P#5)   <- Hyper-threading core (ignore)
  L2 L#3 (1024KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
    PU L#6 (P#6)   <- CPU 6, available for VMQ NIC 2
    PU L#7 (P#7)   <- Hyper-threading core (ignore)
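
If the lstopo output gets long, you can extract the first PU of every core per NUMA node with a few lines of Python. This is a rough sketch, not a full hwloc parser: it assumes the plain text layout shown above (NUMANode, Core and PU lines in that order), and first_pu_per_core and SAMPLE are names I made up. SAMPLE is the example output from this post without my remarks.

```python
import re
from collections import defaultdict

# Example lstopo-no-graphics output from this post (remarks removed).
SAMPLE = """\
NUMANode L#0 (P#0 116GB) + Package L#0 + L3 L#0 (35MB)
  L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
    PU L#0 (P#0)
    PU L#1 (P#1)
  L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
    PU L#2 (P#2)
    PU L#3 (P#3)
NUMANode L#1 (P#1 133GB) + Package L#1 + L3 L#1 (35MB)
  L2 L#2 (1024KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
    PU L#4 (P#4)
    PU L#5 (P#5)
  L2 L#3 (1024KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
    PU L#6 (P#6)
    PU L#7 (P#7)
"""


def first_pu_per_core(lstopo_text):
    """Map NUMA node index -> P# of the first PU of each core."""
    result = defaultdict(list)
    node = None
    core_open = False  # True until the first PU of the current core is seen
    for line in lstopo_text.splitlines():
        match = re.search(r"NUMANode L#(\d+)", line)
        if match:
            node = int(match.group(1))
        if re.search(r"Core L#\d+", line):
            core_open = True
        match = re.search(r"PU L#\d+ \(P#(\d+)\)", line)
        if match and core_open and node is not None:
            result[node].append(int(match.group(1)))
            core_open = False
    return dict(result)


print(first_pu_per_core(SAMPLE))  # {0: [0, 2], 1: [4, 6]}
```

The second PU of every core (the hyper-threading sibling) is skipped, so the result lists only the processors that are candidates for VMQ. Remember that the first entry of each node is still reserved in this setup.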

You can also use the following Cmdlet to get more details about the NUMA nodes:

Get-VMHostNumaNode

To show the MaxProcessors value:

Get-NetAdapterRss

In my opinion hwloc is the best option for an overview of the NUMA nodes.

Step 3. Add your configuration in a table

The easiest way is to create a table in Excel, using the following legend:
P = Physical core
HT = Hyper-threading core
0 = Start of NUMA node 0
14 = Start of NUMA node 1

Number (use lstopo) = 0 to 27 (28 logical cores in this scenario)
Physical cores = 28 / 2 = 14 physical cores in total
14 cores – 2 = 12 available cores
Processor 0 (first physical core of NUMA node 0) = Reserved for system processes
Processor 14 (first physical core of NUMA node 1) = Reserved, to keep the number of cores equal to NIC1
Total: 6 cores for NIC1 / 6 cores for NIC2
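
The arithmetic above can be reproduced with a short Python sketch. It assumes equally sized NUMA nodes with contiguous processor numbering, one NIC per NUMA node, and that the first physical core of every node is kept free, as in this scenario. plan_sum_queues and the dictionary keys are hypothetical names.

```python
def plan_sum_queues(logical_cpus, numa_nodes, hyperthreading=True):
    """Per-NUMA-node VMQ settings for Sum Queues mode (hypothetical helper)."""
    per_node = logical_cpus // numa_nodes
    step = 2 if hyperthreading else 1  # even numbers = physical cores
    plan = []
    for node in range(numa_nodes):
        first = node * per_node
        cores = list(range(first, first + per_node, step))
        usable = cores[1:]  # first core of every node stays reserved
        plan.append({
            "numa_node": node,
            "base_processor_number": usable[0],
            "max_processors": len(usable),
            "max_processor_number": usable[-1],
        })
    return plan


for entry in plan_sum_queues(logical_cpus=28, numa_nodes=2):
    print(entry)
```

For 28 logical processors and 2 NUMA nodes this gives base 2, 6 processors and last processor 12 for node 0, and base 16, 6 processors and last processor 26 for node 1.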

NUMA Node 0 / Socket 1 / NIC1:

Set-NetAdapterVmq -Name "NIC 1" -BaseProcessorNumber 2 -MaxProcessors 6 -MaxProcessorNumber 12

NUMA Node 1 / Socket 2 / NIC2:

Set-NetAdapterVmq -Name "NIC 2" -BaseProcessorNumber 16 -MaxProcessors 6 -MaxProcessorNumber 26

Step 4: Explanation

Please note that multiple VMQs can share the same CPU core.
In this example, 6 CPU cores serve a maximum of 31 VMQs per NIC.

-BaseProcessorNumber 2 = Processor 2 / Physical Core 2 on socket 1
-MaxProcessors 6 = Max 6 processors for NIC1
-MaxProcessorNumber 12 = Processor 12 (last physical core) / Physical Core 7 on socket 1

-BaseProcessorNumber 16 = Processor 16 / Physical Core 2 on socket 2
-MaxProcessors 6 = Max 6 processors for NIC2
-MaxProcessorNumber 26 = Processor 26 (last physical core) / Physical Core 7 on socket 2
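
To double-check which processors a given combination of parameters covers, you can expand the range yourself. A minimal Python sketch, assuming hyper-threading is enabled so only even-numbered processors count; vmq_processors is a hypothetical helper, and how the queues are actually placed on the cores is up to the OS and the driver.

```python
import math


def vmq_processors(base, max_number, max_processors, hyperthreading=True):
    """List the logical processors a NIC may use for VMQ (hypothetical helper)."""
    step = 2 if hyperthreading else 1
    candidates = list(range(base, max_number + 1, step))
    return candidates[:max_processors]


nic1 = vmq_processors(base=2, max_number=12, max_processors=6)
nic2 = vmq_processors(base=16, max_number=26, max_processors=6)
print(nic1)  # [2, 4, 6, 8, 10, 12]
print(nic2)  # [16, 18, 20, 22, 24, 26]

# No processor is shared between the NICs, as Sum Queues mode requires.
print(set(nic1) & set(nic2))  # set()

# With 31 queues on 6 cores, at least one core has to serve this many queues.
print(math.ceil(31 / len(nic1)))  # 6
```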

Address Hash (Min Queues) VMQ configuration:

The NICs in the team need to use overlapping processor sets in Address Hash mode (Min Queues). This means that you need to use Set-NetAdapterVmq to configure each NIC in your team to use the same processors.

Check your Excel table and start from Processor 2 / Physical Core 2.

Set-NetAdapterVmq -Name "NIC 1" -BaseProcessorNumber 2 -MaxProcessors 13 -MaxProcessorNumber 26
Set-NetAdapterVmq -Name "NIC 2" -BaseProcessorNumber 2 -MaxProcessors 13 -MaxProcessorNumber 26
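
The numbers in these two commands follow from the same table: every physical core except processor 0, shared by both NICs. A short Python sketch of that calculation, assuming 28 logical processors with hyper-threading enabled; plan_min_queues and the dictionary keys are hypothetical names.

```python
def plan_min_queues(logical_cpus, hyperthreading=True):
    """Shared VMQ settings for all NICs in Min Queues mode (hypothetical helper)."""
    step = 2 if hyperthreading else 1
    usable = list(range(step, logical_cpus, step))  # skips processor 0
    return {
        "base_processor_number": usable[0],
        "max_processors": len(usable),
        "max_processor_number": usable[-1],
    }


settings = plan_min_queues(logical_cpus=28)
print(settings)
# {'base_processor_number': 2, 'max_processors': 13, 'max_processor_number': 26}
```

Every NIC in the team gets these same values, which is what makes the processor sets overlap.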

Hyper-V is NUMA-aware and will always try to keep each VM within a single NUMA node.

“In switch independent / address hash configuration the team will use the MAC address of the primary team member (one selected from the initial set of team members) on outbound traffic. MAC addresses get used differently depending on the configuration and load distribution algorithm selected. This can, in unusual circumstances, cause a MAC address conflict.” Source

In my opinion, Address Hash is not the best option for Hyper-V in combination with converged servers because of the ‘risk’ of MAC address flapping. Hyper-V Port is in many cases the best solution. Check the advantages and disadvantages here.

Thanks for reading my first post! Was it useful? Let me know by leaving a comment.

Sources:
https://support.microsoft.com/en-us/help/2812283/hyper-v-limits-the-maximum-number-of-processors-in-the-hyper-v-host-os
https://blogs.technet.microsoft.com/networking/2016/01/04/virtual-machine-queue-vmq-cpu-assignment-tips-and-tricks/

2 thoughts on “Virtual Machine Queue (VMQ) – Hyper-V 2012 R2 & 2016”

    1. Tom van Brienen Post author

      Hello Max,

      The same applies to converged partitionable NICs in 2016. This also depends on your network configuration. (Physical switches etc.) VMQ is a hardware virtualization technology.

      VMMQ is also interesting for configurations with heavy VM network load. Your hardware must support this, just like VMQ.

      Or, in Server 2019, d.VMMQ: https://blogs.technet.microsoft.com/networking/2018/08/22/netperf4vw/

      You have to take a look at your physical network cards for the correct VMQ setup.
