NUMA Tuning on libvirt
“The only true wisdom is in knowing you know nothing” - Socrates
Introduction
Back when I built my server (Link), I assumed that right-sizing my VMs (so that 2 VMs fit in each NUMA node) would be enough for the hypervisor (libvirt/QEMU/KVM in my case) to take care of any NUMA tuning… I was so wrong…
François Donzé has a post on this: Link
Red Hat also has some guidelines on NUMA tuning: Link
Key Takeaways
- According to Red Hat, a combination of vcpupin, emulatorpin and numatune is required to actually pin a VM to a NUMA node.
- A “strict” numatune memory policy is dangerous: VMs will OOM when memory runs out on their NUMA node. According to François, the “interleave” policy is a much better option, since it lets a VM borrow memory from nearby nodes.
- numastat can be used to verify that the NUMA tuning is working as expected. For example, “watch -n 1 numastat -c qemu-kvm” is a great way to observe VMs requesting memory from other nodes.
- numactl can be used to check the NUMA topology of the machine. For example, “numactl -H”.
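As a rough sketch of the numactl check, the helper below pulls the CPU IDs of a single node out of `numactl -H` output, which is the list you need before choosing pin targets. The name `node_cpus` is made up for illustration, and the parsing assumes the usual `node N cpus: ...` line format.

```shell
#!/bin/sh
# node_cpus NODE: read `numactl -H` output on stdin and print the CPU IDs
# that belong to NUMA node NODE (hypothetical helper, not part of numactl).
node_cpus() {
    awk -v node="$1" '
        $1 == "node" && $2 == node && $3 == "cpus:" {
            out = ""
            for (i = 4; i <= NF; i++) out = out (i > 4 ? " " : "") $i
            print out
        }'
}

# Only query the real topology when numactl is installed.
if command -v numactl >/dev/null 2>&1; then
    numactl -H | node_cpus 0
fi
```

Once the VMs are pinned, the numastat command from the list shows how much memory each QEMU process holds on each node.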
NUMA Tuning on libvirt
Assuming each NUMA node has access to 16 cores and 32 GB of RAM, we can pin 2 VMs (each with 8 cores and 16 GB of RAM) to it.
Let’s assume NUMA node 0 has access to 16 cores (0-7 and 64-71).
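The split assumed here can be sketched as follows: take the CPU IDs of one node and chunk them into equal groups, one per VM. `split_cpus` is a hypothetical helper, and the CPU list is the example one from the text, not read from a real host.

```shell
#!/bin/sh
# split_cpus N: read whitespace-separated CPU IDs on stdin and print them
# in groups of N, one group (one VM) per line.
split_cpus() {
    tr -s ' ' '\n' | awk -v n="$1" '
        NF { printf "%s%s", $1, (++c % n == 0 ? "\n" : " ") }
        END { if (c % n) print "" }'
}

# Example node 0 from the text: CPUs 0-7 and 64-71, 8 CPUs per VM.
echo "0 1 2 3 4 5 6 7 64 65 66 67 68 69 70 71" | split_cpus 8
```

With the example list, the first line is the CPU set for the first VM and the second line is the set for the second VM.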
For machine1:
virsh vcpupin machine1 0 0 --config
virsh vcpupin machine1 1 1 --config
virsh vcpupin machine1 2 2 --config
virsh vcpupin machine1 3 3 --config
virsh vcpupin machine1 4 4 --config
virsh vcpupin machine1 5 5 --config
virsh vcpupin machine1 6 6 --config
virsh vcpupin machine1 7 7 --config
virsh emulatorpin machine1 0-7 --config
virsh numatune machine1 --mode interleave --nodeset 0 --config
For machine2:
virsh vcpupin machine2 0 64 --config
virsh vcpupin machine2 1 65 --config
virsh vcpupin machine2 2 66 --config
virsh vcpupin machine2 3 67 --config
virsh vcpupin machine2 4 68 --config
virsh vcpupin machine2 5 69 --config
virsh vcpupin machine2 6 70 --config
virsh vcpupin machine2 7 71 --config
virsh emulatorpin machine2 64-71 --config
virsh numatune machine2 --mode interleave --nodeset 0 --config
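The per-vCPU commands above follow a fixed pattern, so they can be generated instead of typed out. This is a minimal sketch, assuming each VM gets a contiguous block of host CPUs. `gen_pin_cmds` is a hypothetical helper that only prints the commands, so they can be reviewed before anything is run.

```shell
#!/bin/sh
# gen_pin_cmds DOMAIN FIRST_HOST_CPU VCPU_COUNT NUMA_NODE
# Print (do not run) the virsh commands that pin vCPU i to host CPU
# FIRST_HOST_CPU + i, pin the emulator threads to the same CPU range,
# and set an interleave memory policy on NUMA_NODE.
gen_pin_cmds() {
    domain=$1 first=$2 count=$3 node=$4
    last=$((first + count - 1))
    vcpu=0
    while [ "$vcpu" -lt "$count" ]; do
        echo "virsh vcpupin $domain $vcpu $((first + vcpu)) --config"
        vcpu=$((vcpu + 1))
    done
    echo "virsh emulatorpin $domain $first-$last --config"
    echo "virsh numatune $domain --mode interleave --nodeset $node --config"
}

# Same layout as above: two 8-vCPU VMs on node 0.
gen_pin_cmds machine1 0 8 0
gen_pin_cmds machine2 64 8 0
```

Piping the output to `sh` would apply it. Because `--config` only changes the persistent definition, the pinning takes effect the next time each VM is started.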