
Some QSFP+ transceivers may keep 40GbE links from coming back up on FreeBSD

Five months have passed since I set up a 40GBASE-SR4 network between my PC (Windows 10) and home server (FreeBSD) using ConnectX-3 cards. When the PC went to sleep and resumed, for reasons unknown, the 40GBASE-SR4 connection would not link up again unless I physically unplugged and replugged the QSFP+ transceiver on the server side.

When it happens, the server side is clearly putting out no TX signal:

$ ifconfig -v mlxen0
        ether e4:1d:2d:74:16:e0
        hwaddr e4:1d:2d:74:16:e0
        media: Ethernet 40Gbase-CR4 <full-duplex> (autoselect)
        status: no carrier
        plugged: QSFP+ 40GBASE-SR4 (MPO Parallel Optic)
        vendor: Mellanox PN: MC2210411-SR4 SN: MEQSRIC0115 DATE: 2015-03-23
        compliance level: Unspecified
        nominal bitrate: 10300 Mbps
        module temperature: 40.00 C voltage: 3.22 Volts
        lane 1: RX: 0.57 mW (-2.37 dBm) TX: 0.36 mW (-4.38 dBm)
        lane 2: RX: 1.06 mW (0.26 dBm) TX: 0.37 mW (-4.30 dBm)
        lane 3: RX: 0.96 mW (-0.17 dBm) TX: 0.00 mW (-30.46 dBm)
        lane 4: RX: 1.12 mW (0.52 dBm) TX: 0.37 mW (-4.20 dBm)

If the PC side did this because of electrical instability around sleep and resume, it would make sense. However, I have no idea why the server side behaves this way. In the long run it looks like a compatibility issue. Two months ago I replaced the transceiver, a 10Gtek-compatible module AMQ10-SR4-M1, with an Avago AFBR-79EQPZ, and the problem has not happened since. The PC side still uses the AMQ10-SR4-M1, unchanged from before, and works fine.

It has to be a compatibility issue, doesn't it?

Windows' Tiered Storage Space causes weird hitching

I built a tiered Storage Space on Windows Storage Server 2016 and created an NTFS volume from it. After that, I ran into a problem where the server hitched and took painfully long to open folders while another machine was writing data to the server via CIFS. It looked like the problem occurred when the SSD tier of the Storage Space filled up, but I'm still looking for a good solution.

The storage components are below:

  • SSD-tier
    • Intel DC S3500 240GB x2 (RAID-0. Only 160GB is assigned to the Storage Space.)
  • HDD-tier
    • 8TB 7200RPM SATA x6 (RAID-10: a stripe of 3 mirrored pairs. All HDDs are CMR.)
  • One NTFS volume is allocated 100% of the pool.

Both tiers are logical drives on a hardware RAID card, so the server recognises each of them as a single drive. (Admittedly, this configuration is not recommended.)

I tried copying 96 files, ranging from 1KB to 4GB in size and totalling 25.2GB, to the volume. The copy goes well at first, but suddenly stalls partway through. Task Manager shows that Tiered Storage Management (記憶域階層管理) is active at that point, so a process moving data from the SSD tier to the HDD tier is probably running.

Resource Monitor then showed the target files' I/O response time exceeding 1000ms, with the disk queues growing quite long. (The queue is normally under 1, or around 2 or 3 at most when things go well.) That is far too much latency for a data-moving operation…

Judging from the actual responsiveness, the system seems to block file I/O until the SSD tier regains a certain amount of free space. According to Microsoft, IOPS drops drastically to HDD-equivalent performance when the SSD tier is full, but that shouldn't mean processing stops completely.

Most usage examples of tiered Storage Spaces I can find on the net are Hyper-V related, so I wonder whether it is simply unsuitable for a file server. Be that as it may, for file-server use I would expect the usual case to be exactly this: frequently accessed data placed in the SSD tier to speed things up.

I once went through a performance issue when handling tons of files with Samba, so I chose Windows Server, thinking its official CIFS implementation would make life comfortable. Sigh… this is just the beginning.

(2018-08-16 EDIT)

I found a report of the same problem on the Microsoft Japan forum, which was about as useful as a fart in a lift: 記憶域スペースで階層化構築時における急激なパフォーマンス低下について (on the drastic performance drop when building tiered Storage Spaces)

The utterly good-for-nothing answer left me with a hollow laugh.

Make sure zfs_enable="YES" when ZFS isn't mounted automatically

Make sure to set zfs_enable="YES" in /etc/rc.conf when ZFS pools other than the root pool aren't mounted automatically at system boot.

The root pool is mounted automatically even without this setting, which makes the problem hard to spot. I checked the canmount and mountpoint properties of the unmounted ZFS pool, but they were fine, so it took me a while to find the cause.

We should be extra careful about this when installing FreeBSD manually without bsdinstall.
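The check-and-append itself is trivial; here is a sketch using a local example file instead of the real /etc/rc.conf (the filename is just for illustration):

```shell
# Append zfs_enable="YES" to rc.conf if it is not already set.
# A local copy is used for illustration; the real file is /etc/rc.conf.
rcconf=rc.conf.example
printf 'hostname="myhost"\n' > "$rcconf"
grep -q '^zfs_enable="YES"' "$rcconf" || echo 'zfs_enable="YES"' >> "$rcconf"
grep '^zfs_enable' "$rcconf"
```

On an actual FreeBSD system, `sysrc zfs_enable="YES"` does the same thing more safely.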

Finally understood why Samba 4.7.4 wastes huge amounts of memory on NAS4Free

I found the Samba daemons wolfing down memory while investigating misbehaving CIFS sharing on a friend's NAS. They wasted gigabytes per process, eventually consuming all 16GB of physical memory plus 64GB of swap; I had no choice but to shut the machine down forcibly. It was clearly abnormal. I think this memory shortage was also the proximate cause of the original problem, since ARC couldn't get enough memory and storage performance suffered.

After fiddling with some options, it seemed the shadow copy option was what brought on the disaster. The following picture shows the difference in top output with the option on and off.

The difference is alarming at a glance: the memory usage literally differs by orders of magnitude. With shadow copy enabled, Samba nearly dried up the memory in less than a day; with it disabled, the machine has been running fine for four days, although the load average climbs up to 13. FYI, the file sharing service still works in this state.

Does VSS in Samba misbehaving mean the vfs_shadow_copy2 module has some bugs? My memory is dim, but I feel there was no problem with the option enabled back when NAS4Free was version 9 or 10.
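For reference, enabling shadow copies on a Samba share essentially comes down to a vfs_shadow_copy2 stanza like the following (a sketch; the share name, path, and snapshot name format are assumptions, not taken from the actual NAS4Free config):

```ini
[share]
    path = /mnt/tank/share
    vfs objects = shadow_copy2
    ; ZFS exposes snapshots of the dataset under this directory
    shadow:snapdir = .zfs/snapshot
    ; must match the snapshot naming scheme in use
    shadow:format = auto-%Y-%m-%d_%H-%M-%S
    shadow:sort = desc
```

Turning the option off in the NAS4Free UI removes this stanza, which is what made the memory usage sane again.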

Run "zpool labelclear" when reusing disks that were members of another pool

We should execute the zpool labelclear command to erase old ZFS pool information from the target disks when creating a new pool from storage that used to belong to another pool.

Trying to create a new pool with such reused storage usually fails with the following message:

# zpool create ztank da0p3
invalid vdev specification
use '-f' to override the following errors:
/dev/da0p3 is part of potentially active pool 'zroot'

The command kindly warns us, but in this example the zroot pool has definitely been destroyed and I really do want to create ztank with da0p3. I feel ZFS might as well clear the label itself, but ZFS is even capable of undoing a zpool destroy, which I guess is why it doesn't. (On restoring a destroyed pool, see the zpool import -D option.)

However, we can sometimes create a new pool on storage still carrying an old label without getting that warning at all. When that happens, it is a real nightmare: the administrative metadata now describes both an invalid old pool and a valid new pool on the same disks, an obvious mess. The following is a reproduction log:

# zpool status
  pool: newtank
 state: ONLINE
  scan: none requested

        NAME        STATE     READ WRITE CKSUM
        newtank     ONLINE       0     0     0
          da0p3     ONLINE       0     0     0

errors: No known data errors

  pool: oldtank
 state: UNAVAIL
  scan: none requested

        NAME                      STATE     READ WRITE CKSUM
        oldtank                   UNAVAIL      0     0     0
          1234567890123456789     UNAVAIL      0     0     0

errors: No known data errors

The log shows two pools, newtank and oldtank, apparently consisting of different storage. In fact, oldtank is the already-destroyed pool that consisted of da0p3, which is now a member of newtank, yet ZFS somehow still recognises the invalid oldtank. I don't get it…

If that happens, it is past saving. We can't destroy oldtank, since the pool doesn't actually exist, and we can't do anything with the vdev number "1234567890123456789" either. And precisely because we can't do anything, DO NOT EXECUTE zpool labelclear AT THIS STAGE, or both pools will be destroyed. _:(´ཀ`」∠):_ (Speaking from real experience…)

For the above reason, never forget to run zpool labelclear when creating a new pool.
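In command form, reusing a partition from a destroyed pool looks like this sketch (device and pool names follow the example above; requires root, and -f forces the clear when the label still looks active):

```shell
# Erase the stale ZFS label before building the new pool.
zpool labelclear -f /dev/da0p3
# The create now succeeds without needing -f.
zpool create ztank da0p3
```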

(2017-11-14 EDIT)

The problem occurred again at just the right moment, and I got a screenshot.

Steps are…

  1. Destroy the previous zroot pool (red) and run labelclear.
  2. Create a new zroot pool (green).
  3. Freshly install FreeBSD 11.0-RELEASE.
  4. Run freebsd-update to upgrade to 11.1-RELEASE.
  5. Reboot the system, which fails at the "Trying to mount root" sequence.
    • In retrospect, I think the previous pool may have reappeared at this step and the system tried to mount it.
  6. Boot the system with kernel.old and try to roll back to the latest 11.0-RELEASE with freebsd-update.
  7. The system breaks completely: the boot loader can no longer even load the kernel or the zfs module.
  8. Boot from installer media, run zpool import, and get the screenshot above.

I'm sure I ran labelclear. I wondered whether I had used a wrong zpool.cache file. No idea.

(2017-11-16 EDIT)

I… I need to tell you something.

The system recognised the old zroot despite my running labelclear again and zero-filling each partition with dd. Only after I ran zpool labelclear da0 did it finally disappear. Of course, doing it this way breaks the primary GPT table (though it can be recovered from the secondary table). I have no idea why a pool label was sitting in the GPT area, as if I had created the pool without partitioning at all. How did this happen!?


So: execute zpool labelclear on every target device /dev/daX and every partition /dev/daXpY. Zero-fill the entire disk if you like.
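Putting that lesson into commands, a thorough cleanup of a disk da0 carrying a partition da0p3 might look like this (a sketch with example device names; zeroing the start of the disk destroys the primary GPT, which gpart can restore from the secondary copy):

```shell
# Clear ZFS labels from every partition and from the whole device.
zpool labelclear -f /dev/da0p3
zpool labelclear -f /dev/da0
# Optionally zero the beginning of the disk as well (wipes the primary GPT!).
dd if=/dev/zero of=/dev/da0 bs=1m count=1
# Restore the GPT from the secondary table if the disk should stay partitioned.
gpart recover da0
```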

en/start.txt · Last modified: 2021-02-02 05:37 by Decomo
CC Attribution-Noncommercial-Share Alike 4.0 International