Some QSFP+ transceivers may prevent 40GbE connections from re-linking on FreeBSD

Five months have passed since I set up a 40GBASE-SR4 network between my PC (Windows 10) and my home server (FreeBSD) using ConnectX-3 cards. For some reason, after the PC went to sleep and resumed, the 40GBASE-SR4 link would not come back up unless I physically unplugged and re-plugged the QSFP+ transceiver on the server side.

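When it happens, the cause is clearly a missing TX signal on the server: in the output below, lane 3 reports a TX power of 0.00 mW (-30.46 dBm) while the other lanes transmit normally.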

$ ifconfig -v mlxen0
mlxen0: flags=8947<UP,BROADCAST,DEBUG,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=ad00b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWFILTER,VLAN_HWTSO,LINKSTATE,RXCSUM_IPV6>
        ether e4:1d:2d:74:16:e0
        hwaddr e4:1d:2d:74:16:e0
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
        media: Ethernet 40Gbase-CR4 <full-duplex> (autoselect)
        status: no carrier
        plugged: QSFP+ 40GBASE-SR4 (MPO Parallel Optic)
        vendor: Mellanox PN: MC2210411-SR4 SN: MEQSRIC0115 DATE: 2015-03-23
        compliance level: Unspecified
        nominal bitrate: 10300 Mbps
        module temperature: 40.00 C voltage: 3.22 Volts
        lane 1: RX: 0.57 mW (-2.37 dBm) TX: 0.36 mW (-4.38 dBm)
        lane 2: RX: 1.06 mW (0.26 dBm) TX: 0.37 mW (-4.30 dBm)
        lane 3: RX: 0.96 mW (-0.17 dBm) TX: 0.00 mW (-30.46 dBm)
        lane 4: RX: 1.12 mW (0.52 dBm) TX: 0.37 mW (-4.20 dBm)
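
To spot this condition without reading the output by eye, the lane lines can be parsed and checked for an abnormally low TX power. The following is only a minimal sketch: the function names and the -20 dBm cutoff are my own assumptions for illustration, not values from any specification, and the regular expression is written against the lane format shown above.

```python
import re

# Matches lines such as:
#   lane 3: RX: 0.96 mW (-0.17 dBm) TX: 0.00 mW (-30.46 dBm)
LANE_RE = re.compile(
    r"lane\s+(\d+):\s+RX:\s+[\d.]+\s+mW\s+\((-?[\d.]+)\s+dBm\)"
    r"\s+TX:\s+[\d.]+\s+mW\s+\((-?[\d.]+)\s+dBm\)"
)

# Assumed cutoff for "no TX signal"; chosen for illustration only.
TX_DEAD_DBM = -20.0


def parse_lanes(text):
    """Return {lane_number: (rx_dbm, tx_dbm)} for every lane line in text."""
    return {
        int(m.group(1)): (float(m.group(2)), float(m.group(3)))
        for m in LANE_RE.finditer(text)
    }


def dead_tx_lanes(text, threshold=TX_DEAD_DBM):
    """Return the sorted lane numbers whose TX power is below threshold."""
    lanes = parse_lanes(text)
    return sorted(n for n, (_rx, tx) in lanes.items() if tx < threshold)


# Lane lines copied from the ifconfig output above.
SAMPLE = """\
        lane 1: RX: 0.57 mW (-2.37 dBm) TX: 0.36 mW (-4.38 dBm)
        lane 2: RX: 1.06 mW (0.26 dBm) TX: 0.37 mW (-4.30 dBm)
        lane 3: RX: 0.96 mW (-0.17 dBm) TX: 0.00 mW (-30.46 dBm)
        lane 4: RX: 1.12 mW (0.52 dBm) TX: 0.37 mW (-4.20 dBm)
"""

if __name__ == "__main__":
    # Lane 3 is the only lane below the assumed cutoff.
    print(dead_tx_lanes(SAMPLE))
```

On the server itself, the text could come from running `ifconfig -v mlxen0` through Python's `subprocess.run(..., capture_output=True, text=True)` instead of the pasted sample.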

If the PC side did this because of electrical instability during sleep and resume, that would make sense. However, I have no idea why the server side would do so. In the end, it seems to be a compatibility issue. Two months ago I replaced the server-side transceiver, a 10Gtek compatible module (AMQ10-SR4-M1), with an Avago AFBR-79EQPZ, and the problem has not happened since. The PC side still uses the same AMQ10-SR4-M1 as before and works fine.

It has to be a compatibility issue, doesn't it?




  • Last modified: 2020-01-10 19:12
  • by Decomo