
Exasock Bonding support extensions

The exasock driver has been extended to support bonding, but only in the active-backup mode. Furthermore it's meant to manage only bonds containing Cisco Nexus SmartNIC (formerly ExaNIC) devices, so it will reject attempts to bind it to already-existing bonding interfaces which contain non-SmartNIC devices, or to add a non-SmartNIC device to a bond which is currently under its management. Finally, it isn't supported in combination with the ATE feature.

All configuration of bonding interfaces, including standard bonding settings such as MII monitor intervals, ARP monitor intervals and primary slave selection, is done as normal through /sys on the upstream Linux bonding driver.

In summary, the extension wraps itself around the Linux bonding driver, enforces the constraints it requires (SmartNIC-only membership and rejection of bond modes other than active-backup), and exports information about each bonding interface it has wrapped through a per-bond /dev file. Userspace software can then read this /dev metadata to determine which link within the bonding interface is the active one.

Known pitfalls

  • We strongly recommend that you set an MII or ARP interval on your bonding interfaces (/sys/class/net/<iface>/bonding/{miimon,arp_interval}) so that the Linux bonding driver's monitor service actively polls the bond and notices when the active SmartNIC goes down. There are two reasons. First, the SmartNIC driver doesn't currently support link down messages, so the Linux bonding driver can miss active link down events. Second, Linux itself doesn't issue the right kinds of event messages for the exasock-bonding extension to track the link status faithfully without polling. Using an active monitor (whether MII or ARP) circumvents this issue.
    • Exasock-bonding will actively monitor the miimon/arp_interval properties of the interfaces under its management, and it will poll for link status updates at the lower of the two intervals (i.e., MIN(miimon, arp_interval)).
    • However, Exasock-bonding doesn't actually query the drivers for the links in the bond -- it queries the Linux bonding driver for what it believes the current link status to be. So if you don't set an MII/ARP interval and the Linux bonding driver fails to detect a link status change, Exasock-bonding will miss it too.
  • We only support active-backup mode, and attempting to write a value other than active-backup into /sys/class/net/{iface}/bonding/mode will result in an error, if {iface} is a bonding interface under this extension's management.
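The two pitfalls above can be checked before a bond is handed to Exasock-bonding. The following is only a minimal sketch: the function name check_bond_monitor and the SYSFS_ROOT override (which exists purely so the function can be tried against a scratch directory) are hypothetical and not part of exanic-software. It relies on the stock bonding driver reporting the mode as a name followed by a number, for example "active-backup 1".

```shell
#!/bin/sh
# Sketch: report whether a bond uses active-backup mode and has a link monitor set.
# check_bond_monitor and SYSFS_ROOT are hypothetical names used for illustration.

check_bond_monitor() {
    iface=$1
    dir="${SYSFS_ROOT:-/sys}/class/net/$iface/bonding"

    # The bonding driver prints the mode as "<name> <number>"; keep only the name.
    read -r mode rest < "$dir/mode" || return 2
    read -r miimon < "$dir/miimon" || return 2
    read -r arp_interval < "$dir/arp_interval" || return 2

    if [ "$mode" != "active-backup" ]; then
        echo "$iface: mode is $mode, expected active-backup"
        return 1
    fi
    # A value of 0 means the corresponding monitor is disabled.
    if [ "$miimon" -eq 0 ] && [ "$arp_interval" -eq 0 ]; then
        echo "$iface: neither miimon nor arp_interval is set"
        return 1
    fi
    echo "$iface: ok (miimon=$miimon arp_interval=$arp_interval)"
}
```

Calling check_bond_monitor mybond prints a one-line verdict and returns non-zero when the bond does not follow the recommendations.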

Known limitations:

  • Exasock-bonding does not allow recursive bond membership. The extension will reject attempts to make it manage a bonding interface whose member devices are themselves bonding interfaces.
  • Libexasock in userspace requires that a bond have at least one member device before an attempt is made to use that bond with libexasock.
    • There doesn't have to be an active device in the bond, but there must be at least one member.
    • NOTE: The Linux bonding driver will usurp the MAC address of one of the member devices. We recommend that you not remove that particular device from the bond. Removing it works in practice, but the current behaviour of the Linux bonding driver is to retain the usurped MAC address even after releasing the device it was taken from, so you will end up with 2 devices having the same MAC address.
  • Exasock-bonding does not support VLAN pseudo-interfaces as slave devices - all slaves must be direct SmartNIC devices.
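Some of these limitations can be checked from sysfs before using a bond with libexasock. The sketch below is illustrative only: check_bond_members and SYSFS_ROOT are hypothetical names, and it assumes that a device which has its own bonding directory under /sys/class/net is itself a bonding master. It does not attempt to detect VLAN pseudo-interfaces or non-SmartNIC devices.

```shell
#!/bin/sh
# Sketch: verify that a bond has at least one member and no nested bonds.
# check_bond_members and SYSFS_ROOT are hypothetical names used for illustration.

check_bond_members() {
    iface=$1
    net="${SYSFS_ROOT:-/sys}/class/net"
    slaves_file="$net/$iface/bonding/slaves"

    if [ ! -r "$slaves_file" ]; then
        echo "$iface: not a bonding interface"
        return 2
    fi

    # An empty bond yields an empty file, so a failed read just means "no members".
    read -r slaves < "$slaves_file" || true
    if [ -z "$slaves" ]; then
        echo "$iface: no member devices"
        return 1
    fi

    for s in $slaves; do
        # Assumption: only bonding masters expose a "bonding" directory.
        if [ -d "$net/$s/bonding" ]; then
            echo "$iface: member $s is itself a bonding interface"
            return 1
        fi
    done
    echo "$iface: members ok: $slaves"
}
```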

Compiling the extensions

Just run make as you usually would for the exasock kernel module.

Your kernel may not support the bonding extension because before Linux v3.19, the kernel didn't publicly export the headers needed for the extension to build. You'll see a warning message if this is the case.
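If you want a rough check before building, the running kernel's release string can be compared against 3.19. This is only a sketch with a hypothetical function name (kernel_at_least_3_19); the version string is a guide at best, and the warning printed by the build itself remains the authoritative signal.

```shell
#!/bin/sh
# Sketch: succeed if a kernel release string is 3.19 or newer.
# kernel_at_least_3_19 is a hypothetical helper, not part of exanic-software.

kernel_at_least_3_19() {
    release=${1:-$(uname -r)}          # e.g. "5.15.0-91-generic"
    major=${release%%.*}               # text before the first dot
    rest=${release#*.}                 # text after the first dot
    minor=${rest%%[!0-9]*}             # leading digits of the remainder

    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "${minor:-0}" -ge 19 ]; }
}
```

With no argument the function inspects uname -r; passing a string such as 3.18.140 lets you test another kernel's release.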

Using the extensions

For a practical example, see the setupbonding shell script inside of exanic-software (examples/exasock/setupbonding).

All configuration, setup, etc is done using standard Linux utilities (ifconfig, ip, etc). All of the standard Linux bonding driver configuration interfaces work exactly the same way they usually do.

Creating a bonding master

This section is only in here for completeness and to make it clear that all configuration is exactly the same as the standard Linux bonding driver procedure.

Notice

Bonding can be set up using the sysfs interface or iproute2 commands; this guide uses the sysfs interface.

For example, to create a new bonding interface named mybond:

echo "+mybond" >/sys/class/net/bonding_masters

To add two slave devices, enp0s0 and enp0s0d1, to mybond:

echo "+enp0s0" >/sys/class/net/mybond/bonding/slaves
echo "+enp0s0d1" >/sys/class/net/mybond/bonding/slaves

To remove enp0s0d1 from mybond:

echo "-enp0s0d1" >/sys/class/net/mybond/bonding/slaves

All of this works exactly the same as the stock bonding configuration, because Exasock-bonding uses the Linux bonding module.
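The steps above can be combined with the mode and monitor settings that Exasock-bonding expects. The following sketch wraps them in a hypothetical function, create_active_backup_bond; SYSFS_ROOT is likewise a hypothetical override for trying the function against a scratch directory. The mode is written before any slaves are added because the stock bonding driver refuses to change the mode of a bond that already has slaves or is up.

```shell
#!/bin/sh
# Sketch: create a bond, set active-backup mode and an MII interval, add slaves.
# create_active_backup_bond and SYSFS_ROOT are hypothetical names.
# Usage: create_active_backup_bond <bond> <miimon_ms> [slave...]

create_active_backup_bond() {
    [ $# -ge 2 ] || return 2
    bond=$1
    miimon_ms=$2
    shift 2
    net="${SYSFS_ROOT:-/sys}/class/net"

    echo "+$bond" > "$net/bonding_masters" || return 1
    # Set the mode while the bond is still empty and down.
    echo active-backup > "$net/$bond/bonding/mode" || return 1
    echo "$miimon_ms" > "$net/$bond/bonding/miimon" || return 1

    for slave in "$@"; do
        echo "+$slave" > "$net/$bond/bonding/slaves" || return 1
    done
}
```

For example, create_active_backup_bond mybond 100 enp0s0 enp0s0d1 (run as root) would create mybond with a 100 ms MII monitor interval; the interval value is only an illustration.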

Placing an already existing bonding interface under management:

Before attempting to place a bonding interface under the management of Exasock-bonding, you must first create it with the Linux Bonding driver -- follow the steps above or see the Linux bonding documentation, or go on the web and use whichever tutorial suits your preferences.

From there, you simply tell Exasock-bonding about the bonding interface which you have created and which you wish to place under its management.

To do this, you have to first load the Exasock kernel module:

modprobe exasock

When the Exasock driver has been successfully loaded, it will create a /sys file called /sys/class/net/exabond_masters -- the naming is meant to be similar to the /sys file created by Linux's bonding module (bonding_masters).

Let's assume you've created a bonding interface called mybond, like the above example.

To tell Exasock-bonding that you want it to manage mybond, you just execute this command:

echo "+mybond" >/sys/class/net/exabond_masters

When you are ready to have Exasock-bonding stop managing this bonding interface, execute this command:

echo "-mybond" >/sys/class/net/exabond_masters
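Both commands fail with a fairly unhelpful shell error if the exasock module is not loaded, because exabond_masters does not exist yet. A small wrapper can make that case explicit. This is a sketch only; exabond_manage and SYSFS_ROOT are hypothetical names, not part of exanic-software.

```shell
#!/bin/sh
# Sketch: place a bond under Exasock-bonding management, or release it.
# exabond_manage and SYSFS_ROOT are hypothetical names used for illustration.
# Usage: exabond_manage add|remove <bond>

exabond_manage() {
    action=$1
    bond=$2
    masters="${SYSFS_ROOT:-/sys}/class/net/exabond_masters"

    if [ ! -e "$masters" ]; then
        echo "exabond_masters not found; is the exasock module loaded?" >&2
        return 2
    fi

    case $action in
        add)    echo "+$bond" > "$masters" ;;
        remove) echo "-$bond" > "$masters" ;;
        *)      echo "usage: exabond_manage add|remove <bond>" >&2
                return 2 ;;
    esac
}
```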

Notice

You do not need to remove all existing NICs from the bond before you remove it from being under the management of Exasock-bonding. You're free to place bonding interfaces under management and remove them from management fluidly -- just be sure to close all handles to the /dev/exabond-{iface-name} file because of course, Linux won't allow Exasock-bonding to delete the /dev file until all handles are closed.
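One way to check for lingering handles before releasing a bond is the fuser utility from psmisc, which exits with status zero when at least one process is accessing the given file. The sketch below assumes fuser is installed and that you run it as root so that other users' processes are visible; exabond_dev_in_use and DEV_ROOT are hypothetical names.

```shell
#!/bin/sh
# Sketch: succeed (status 0) if some process still has /dev/exabond-<iface> open.
# exabond_dev_in_use and DEV_ROOT are hypothetical names used for illustration.

exabond_dev_in_use() {
    dev="${DEV_ROOT:-/dev}/exabond-$1"

    # No device node means there is nothing to hold open.
    [ -e "$dev" ] || return 1

    if ! command -v fuser >/dev/null 2>&1; then
        echo "fuser not found; cannot check $dev" >&2
        return 2
    fi

    # fuser exits 0 when at least one process is accessing the file.
    fuser "$dev" >/dev/null 2>&1
}
```

A caller might use it as: if exabond_dev_in_use mybond; then echo "close your handles first"; fi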

Notice

You do not need to add member SmartNICs to a bond before placing it under management -- you can place empty bonds under management.

Reading the link status of the bond from userspace:

The bonding extension to the exasock kernel module exports metadata about exasock bonding interfaces to the userspace through device files in /dev.

Whenever you place a bonding interface under management a new /dev node will be created with a name of the form /dev/exabond-{BONDING_IFACE_NAME}.

For example, if you created a bond called mybond and then placed it under Exasock-bonding's management as shown above, then upon placing it under management, a new /dev file will be created called /dev/exabond-mybond.
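A script that places a bond under management and then starts a userspace reader may want to confirm that the device node is present first. The following sketch polls for it once per second; wait_for_exabond_dev and DEV_ROOT are hypothetical names, and the retry loop is only a precaution in case the node does not appear immediately on your system.

```shell
#!/bin/sh
# Sketch: wait for /dev/exabond-<iface> to appear, then print its path.
# wait_for_exabond_dev and DEV_ROOT are hypothetical names used for illustration.
# Usage: wait_for_exabond_dev <iface> [tries]

wait_for_exabond_dev() {
    iface=$1
    tries=${2:-10}
    dev="${DEV_ROOT:-/dev}/exabond-$iface"

    while [ "$tries" -gt 0 ]; do
        if [ -e "$dev" ]; then
            echo "$dev"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```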

To read the metadata in that file, mmap() it read-only (PROT_READ); the extension will reject attempts to map it as writable.

  • For documentation of the data structure and so on, please see src/libs/exasock/kernel/exasock-bonding.h.

  • For a convenient library which will do all the work of parsing the data structure for you as well as dealing with the integrity protocol for ensuring that you don't get partial reads due to races between the kernel and userspace, see the library implemented in src/libs/exasock/exasock-bonding-priv.h.