We run a Hyper-V host for a couple of VMs that we can't run in AWS or Google Cloud Platform. I arrived at work the other day to find that, at some point during my bike ride in, several of the VMs had become unreachable by my monitoring system. Further investigation showed that the VMs weren't completely unreachable, but the packet loss was so bad that they couldn't communicate with the monitoring server, or complete SSH negotiation. The host, however, was functioning perfectly.
After a while interrogating the host with PowerShell (it's a Server Core host), I was unable to find the root of the problem or a workaround... everything looked normal! Eventually I came across this article, which suggests that you should not have VMQ (Virtual Machine Queue) enabled on network adapters operating at <10Gbps with Hyper-V 2016. It appears VMQ is enabled by default for 10GbE adapters (even if the link speed is only 1Gbps) and must be manually disabled. I suspect that 1GbE adapters will have it disabled by default.
I disabled VMQ and voilà! Network performance was restored immediately.
VMQ is enabled or disabled on each physical adapter. You can check whether it is enabled by running:
PS C:\> Get-NetAdapterVmq | select Name,Enabled

Name       Enabled
----       -------
Ethernet 1 True
Ethernet 2 False
Ethernet 3 True
To disable it for a specific adapter:
PS C:\> Set-NetAdapterVmq -Name "Ethernet 2" -Enabled $false
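If the host has several adapters, the check and the fix can be combined into one pass. What follows is only a rough sketch and is not taken from the article. It assumes that the Speed property returned by Get-NetAdapter is the negotiated link speed in bits per second, and it uses the 10Gbps figure quoted above as the cut-off. The variable name $threshold is mine. I've left -WhatIf on so that it only reports what it would change.

```powershell
# Sketch: find physical adapters linked below 10Gbps that still have VMQ enabled.
# Assumption: .Speed is the negotiated link speed in bits per second.
$threshold = 10000000000   # 10Gbps

Get-NetAdapter -Physical |
    Where-Object { $_.Status -eq 'Up' -and $_.Speed -lt $threshold } |
    ForEach-Object {
        # Adapters without VMQ support return nothing here, so suppress the error.
        $vmq = Get-NetAdapterVmq -Name $_.Name -ErrorAction SilentlyContinue
        if ($vmq -and $vmq.Enabled) {
            # -WhatIf makes this a dry run; remove it to actually disable VMQ.
            Disable-NetAdapterVmq -Name $_.Name -WhatIf
        }
    }
```

Once the dry-run output looks right, remove -WhatIf and re-run Get-NetAdapterVmq to confirm the change.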
Footnote:
This issue just started happening out of the blue for me, and I have no idea what caused it to start dropping packets when it had been fine for 9 months. Even a reboot hadn't made a difference, suggesting that it wasn't just a case of a full queue.