
VirtIO RNG may cause high CPU utilization by rand_harvestq in a FreeBSD 13 VM

I noticed that my home server's power consumption had suddenly increased. The difference was as much as 30W, so it was clearly not a measurement error.

It happened whenever I powered up a FreeBSD VM on Proxmox VE, but I had no plausible explanation. When I happened to look at PVE's CPU utilization graph, I found that it had risen considerably after upgrading the VM to 13.0-RELEASE.

Checking top in FreeBSD showed no notably loaded processes, but on closer look "System" was using 4-5% CPU, which meant some kind of kernel process was eating a CPU. Digging into the details with top -SP revealed a rand_harvestq process constantly consuming 40-80% of one CPU core.

Judging from its name, the process harvests entropy for random number generation. See my article about entropy harvesting.

There was nothing odd in the sysctl variables relevant to harvesting. If anything, using the two random sources 'VirtIO Entropy Adapter' (VirtIO RNG) and 'Intel Secure Key RNG' (RDRAND) is specific to a system running in a virtual machine.

$ sysctl kern.random
kern.random.fortuna.concurrent_read: 1
kern.random.fortuna.minpoolsize: 64
kern.random.rdrand.rdrand_independent_seed: 0
kern.random.use_chacha20_cipher: 1
kern.random.block_seeded_status: 0
kern.random.random_sources: 'VirtIO Entropy Adapter','Intel Secure Key RNG'
kern.random.harvest.mask_symbolic: VMGENID,PURE_VIRTIO,PURE_RDRAND,[UMA],[FS_ATIME],SWI,INTERRUPT,NET_NG,[NET_ETHER],NET_TUN,MOUSE,KEYBOARD,ATTACH,CACHED
kern.random.harvest.mask_bin: 100001001000000111011111
kern.random.harvest.mask: 8683999
kern.random.initial_seeding.disable_bypass_warnings: 0
kern.random.initial_seeding.arc4random_bypassed_before_seeding: 0
kern.random.initial_seeding.read_random_bypassed_before_seeding: 0
kern.random.initial_seeding.bypass_before_seeding: 1

And as it turned out, the cause of the high CPU utilization was the VirtIO Entropy Adapter.

As its name suggests, it is a paravirtualized random number device that lets a VM use the host's physical device. Common sense says paravirtualization should use less CPU, but not in this case, and I don't know why. The VirtIO RNG's source on the host is /dev/urandom, so it shouldn't be blocking. I haven't changed any of these settings at all… hmm… why?

I decided to run the VM without the VirtIO RNG, because a paravirtualized device causing high load is putting the cart before the horse. The Intel Secure Key RNG still works as a random source in FreeBSD, so it should be no problem.
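
On the PVE side, removing the device just means deleting the VM's rng0 entry. A minimal sketch, assuming the VM ID is 100:

# qm set 100 --delete rng0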

The power consumption returned to its former level, and my wallet is no longer bleeding money.

ConnectX-3 VF works on a Win10 guest on PVE!

I finally succeeded in getting a ConnectX-3 SR-IOV virtual function working on a Windows 10 Pro guest on Proxmox VE 6.3!

Getting Device Manager to recognise the VF was a piece of cake, but I struggled with an "Error code 43" problem no matter how many times I reinstalled the driver. As it turned out, the cause was that PVE's built-in PF driver was simply outdated. I'll write that up at a later date if time permits.

Let's just look at the SR-IOV performance!

As shown in the picture above, two Windows 10 PCs with ConnectX-3 cards were connected via a Layer-3 switch; one was a physical machine (on the left in the picture) and the other was a virtual machine (on the right). I then measured throughput using NTttcp, Microsoft's official tool, with the left machine as the sender and the right machine as the receiver. The result is below:

There are four Task Manager windows in the picture; the active one in the bottom right corner is running on the physical machine, and the others on the virtual machine. The connection reaches about 26Gbps. When I tested with virtio-net instead of the VF on the VM, it only reached about 12Gbps under comparable conditions. On top of the higher speed, SR-IOV also gives a lower CPU load. Only SR-IOV can do that!

As a side note, I also saw it peak at about 32Gbps.
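
For reference, a minimal NTttcp run looks something like the following; the receiver address 192.168.0.10 and the thread count of 8 are assumptions for illustration, not the exact parameters I used:

receiver> ntttcp.exe -r -m 8,*,192.168.0.10
sender>   ntttcp.exe -s -m 8,*,192.168.0.10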

It seems a degraded ZFS pool being scrubbed can't be imported by another system?

To make a long story short, I imported a RAIDZ pool, which consisted of 4 HDDs and was created on FreeBSD, into a ZoL environment as a degraded pool of 3 HDDs.

A scrub started automatically; I cancelled it, exported the pool, and then tried to import it with all 4 HDDs on the BSD system, but it failed:

# zpool import zdata
cannot import 'zdata': no such pool available

That error came up. I broke into a sweat, because shuttling the degraded pool back and forth between BSD and Linux could corrupt it.

Back on the ZoL environment, I found the degraded pool could still be imported. After the pool was restored to 4 HDDs and had finished resilvering, it could be imported on the FreeBSD system again.
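
The recovery on the ZoL side boiled down to something like this (the device name sdd is an assumption for illustration):

# zpool import zdata            (succeeds despite the degraded state)
# zpool online zdata sdd        (or zpool replace, depending on how the 4th disk comes back)
# zpool status zdata            (wait for the resilver to finish)
# zpool export zdata            (then import on FreeBSD)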

It appears that once a degraded pool has been imported by one system, it can't be imported by other systems until its redundancy is restored, I guess. I don't know whether this behaviour is normal or a special case caused by moving the pool back and forth between FreeBSD (legacy ZFS) and ZoL.

If this is the intended behaviour, the pool would effectively become unavailable if the PC broke down, so I suspect it isn't. I'm logging this phenomenon as it actually happened.

ConnectX-3 VFs are not working on pfSense 2.4.5 on PVE 6.3

My project to build a beefy router with SR-IOV is now under way!

So I tried to create a pfSense 2.4.5-p1 VM on Proxmox VE 6.3-2 with ConnectX-3 VFs attached by PCI pass-through, but it didn't recognise them properly…😇

Errors such as "pcib1: failed to allocate initial I/O port window: 0xd000-0xdfff" and "pcib1: Failed to allocate interrupt for PCI-e events" are recorded in dmesg, and the VFs aren't listed by the pciconf -lv command. That means device probing fails somewhere, I guess.
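
For the record, this is roughly how I'd check whether the VFs show up at all; the Mellanox match string is just an example:

# pciconf -lv | grep -B4 Mellanox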

pfSense 2.4 is based on FreeBSD 11.3-RELEASE, which is getting old, so I tried the newer pfSense 2.5, which is still under development and is based on FreeBSD 12-STABLE.

That system recognises the VFs properly and mlxen devices are created, though the first error message still appears as before. I don't know why the two mlxen devices are identified as mlxen0 and mlxen2 even though I passed through sequentially numbered VFs. It's weird, but I'll settle for this for now.

It seems the mlx4 modules are built into the kernel in both pfSense 2.4.5 and 2.5.

A router whose NIC doesn't work is out of the question, so I decided to go with the in-development pfSense 2.5. A stable pfSense 2.5 will no doubt be released at some point anyway.

How to boot a FreeBSD system on any partition with loader.efi

Since around FreeBSD 12.0-RELEASE, loader.efi has been used as the UEFI bootloader instead of the previous boot1.efi.

Both loaders can boot a FreeBSD system from a ZFS or UFS file system, but loader.efi only looks for the file system on the storage device it was itself loaded from, whereas boot1.efi searches multiple devices. In short, the former can't boot a system on another HDD. Well, it can if you set the boot device manually at the loader prompt every time, but that's not realistic.

I figured there must be some way to boot such a system automatically, so I read the documentation and googled, but found nothing. After reluctantly reading the loader.efi source code, I found a way using a loader.env file and a rootdev variable.

Whether it's loader.efi or boot1.efi, the loader ultimately uses the value of the currdev variable as the boot target. In loader.efi, currdev is unconditionally set from the rootdev variable if the latter exists.

And rootdev, in turn, can be set via the /efi/freebsd/loader.env file in the ESP. This feature was added relatively recently and is supported in FreeBSD 12.2-RELEASE and later.

Add a line like the following to the file, specifying the file system path corresponding to the root directory in ZFS (for UFS, the value looks like disk0p1). The trailing colon is not a typo; it's necessary!

rootdev=zfs:zroot/ROOT/default:
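
Putting it together, writing the file looks something like this; the ESP device ada0p1 and the mount point /mnt are assumptions for illustration:

# mount -t msdosfs /dev/ada0p1 /mnt
# mkdir -p /mnt/efi/freebsd
# echo 'rootdev=zfs:zroot/ROOT/default:' > /mnt/efi/freebsd/loader.env
# umount /mnt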

These are undocumented as of 9 January 2021, so they may change in the future; use them as-is at your own risk.

Of course, you can simply use the older boot1.efi instead and avoid this fiddly approach altogether.
