
Introducing the ExaNIC HPT

With a resolution of 0.25ns (250ps), the ExaNIC HPT has nearly 25x higher resolution than our previous capture solutions (6.2ns). With this high capture resolution, the device is capable of producing very high precision measurements. It is also more sensitive to measurement and methodological errors than previous devices. To understand how to make the best use of your ExaNIC HPT, we have produced this guide.

Quick Start: Using the ExaNIC HPT

The ExaNIC software package ships with an example ExaNIC HPT utility currently called exanic-benchmarker (src/examples/exanic/exanic-benchmark.c). This tool is designed to let you start working with the ExaNIC HPT quickly, and to serve as a starting point for writing your own software. It operates in two modes: cable length estimation mode and system measurement mode.

Warning: The Linux kernel sometimes sends packets of its own out of network interfaces (e.g. ARP requests). exanic-benchmarker can see these as spurious packets and report an error. Be sure to enable bypass-only mode on the NIC to avoid this error.

Cable length estimation mode

In this mode, exanic-benchmarker attempts to estimate the length of a loopback cable attached to the device. It does this by sending a small packet and measuring the time that the packet takes to return. We have supplied a short length of optical fibre in your ExaNIC HPT package to demonstrate this mode (you will need to supply your own optical SFP+ modules). The supplied fibre is nominally 20cm long, though we have found wide variations when measuring with the ExaNIC HPT, and these have been confirmed using a tape measure.

To estimate the length of a cable, the utility needs to know the type of media (fibre, direct attach AWG24 or direct attach AWG30 cable) and which ports to use for sending and receiving. The following example uses ports 0 and 1 for TX and RX respectively (on exanic0) to estimate the length of a loopback fibre:

$ ./exanic-benchmarker -d exanic0 -p 0 -P 1 -t fibre -T fibre -E
Percentile 99.00 = 1.72 ns
Percentile 95.00 = 1.47 ns
Percentile 90.00 = 1.47 ns
Percentile 75.00 = 1.47 ns
Percentile 50.00 = 1.47 ns
Percentile 25.00 = 1.47 ns
Percentile 10.00 = 1.22 ns
Percentile 5.00 = 1.22 ns
Percentile 1.00 = 1.22 ns

Fiber length estimated to be 0.30m +0.05m/-0.05m
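
The estimate above is consistent with dividing the median transit time by the propagation delay per metre of fibre. The following is a minimal sketch of that arithmetic, not the tool's actual code. The function names are hypothetical, the 4.98 ns/m figure is the fibre constant quoted later in this guide, and the sketch assumes that the reported percentiles already have all fixed offsets removed.

```c
#include <assert.h>

/* Propagation delay of fibre in ns per metre (value quoted later in this guide). */
#define NANOS_PER_METER_FIBER 4.98
/* One ExaNIC HPT timestamp step in ns. */
#define HPT_RESOLUTION_NS 0.25

/* Hypothetical helper: convert a compensated transit time to a cable length. */
double estimate_length_m(double transit_ns, double ns_per_meter)
{
    assert(ns_per_meter > 0.0);
    return transit_ns / ns_per_meter;
}

/* Hypothetical helper: length uncertainty caused by one timestamp step. */
double length_uncertainty_m(double ns_per_meter)
{
    assert(ns_per_meter > 0.0);
    return HPT_RESOLUTION_NS / ns_per_meter;
}
```

With the median of 1.47ns from the run above, estimate_length_m(1.47, NANOS_PER_METER_FIBER) gives roughly 0.30m, and length_uncertainty_m(NANOS_PER_METER_FIBER) gives roughly 0.05m, which agrees with the reported result.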

System measurement mode

Once the lengths of the cables have been estimated, exanic-benchmarker can be used to benchmark other equipment. For example, we can use it to measure the latency across a device. To begin with, we first measure the latency across the optical coupler supplied in your kit. This device should have a latency of 0, which gives a good calibration point to start with. In the following example, we have two cables that we have estimated (and measured) to be 1.1m each, connected to each other using an optical coupler. Again we use ports 0 and 1 for TX and RX respectively on exanic0.

$ ./exanic-benchmarker -d exanic0 -t fibre -T fibre -p 0 -P 1 -l 1.1 -L 1.1
Percentile 99.00 = 0.02 ns                                                         
Percentile 95.00 = 0.02 ns                                                         
Percentile 90.00 = 0.02 ns                                                         
Percentile 75.00 = 0.02 ns                                                         
Percentile 50.00 = 0.02 ns                                                         
Percentile 25.00 = -0.23 ns                                                        
Percentile 10.00 = -0.23 ns                                                        
Percentile 5.00 = -0.23 ns                                                         
Percentile 1.00 = -0.23 ns   

The above measurement confirms that our system is well calibrated. It has a median value of 0.02ns which is ~0 and a first percentile value of -0.23ns which is one ExaNIC HPT clock cycle from the median.

Now, we replace the optical coupler with an ExaLink Fusion patch. We expect the measured latency to be around 5ns.

$ ./exanic-benchmarker -d exanic0 -t fibre -T fibre -p 0 -P 1 -l 1.1 -L 1.1
Percentile 99.00 = 5.02 ns                                                         
Percentile 95.00 = 5.02 ns                                                         
Percentile 90.00 = 5.02 ns                                                         
Percentile 75.00 = 4.77 ns                                                         
Percentile 50.00 = 4.77 ns                                                         
Percentile 25.00 = 4.77 ns                                                         
Percentile 10.00 = 4.77 ns                                                         
Percentile 5.00 = 4.77 ns                                                          
Percentile 1.00 = 4.52 ns   

The above measurement reports a median value of 4.77ns +/- 0.25ns, which is exactly as expected (for ports B11/B12).

The exanic-benchmarker tool is capable of writing the raw capture values out to a file which can then be used for generating timeline or distribution plots. To do so, pass the -w filename option to the application.
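
Once raw values are available, a percentile summary like the ones shown above can be reproduced with a sort and an index lookup. The sketch below is an illustration only: the layout of the file written by -w is not described in this guide, so the code works on values already loaded into memory, the function names are hypothetical, and the nearest-index rule used here may differ from the method the tool uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator for doubles in ascending order. */
static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Sort samples in place so that percentiles can be read by index. */
void sort_samples(double *samples, size_t n)
{
    qsort(samples, n, sizeof(samples[0]), compare_doubles);
}

/* Return the sample nearest to percentile p (0..100) of a sorted array. */
double percentile_sorted(const double *sorted, size_t n, double p)
{
    assert(n > 0 && p >= 0.0 && p <= 100.0);
    /* Map p onto the index range [0, n-1] and round to the nearest index. */
    size_t idx = (size_t)(p / 100.0 * (double)(n - 1) + 0.5);
    return sorted[idx];
}
```

Call sort_samples() once, then percentile_sorted() for each percentile of interest (99, 95, 90 and so on).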

Programming the ExaNIC HPT

The ExaNIC HPT supports exactly the same programming API as existing ExaNIC network cards. It can thus be used as a drop-in replacement for the ExaNIC X10 and other Exablaze network adapters. Starting from version 1.9.0, the ExaNIC software library libexanic has been modified to include picosecond timing resolution. The following is a description of the relevant parts of this API for ExaNIC HPT operation.

typedef uint64_t exanic_cycles_t;
typedef uint32_t exanic_cycles32_t;

An ExaNIC timestamp value is represented by either a 32 bit exanic_cycles32_t value or a 64 bit exanic_cycles_t value. An exanic_cycles_t value holds the number of cycles of the ExaNIC clock since the UNIX epoch. An exanic_cycles32_t value contains the lower 32 bits of an exanic_cycles_t value.

Most of the libexanic API functions that send or receive frames return an exanic_cycles32_t value. The two most common examples are exanic_get_tx_timestamp() and exanic_receive_frame(). The exanic_receive_frame() function returns the timestamp corresponding to the moment when a frame was received. Conversely, the exanic_get_tx_timestamp() function returns a timestamp corresponding to the moment when the most recently transmitted frame was sent.

ssize_t exanic_receive_frame(exanic_rx_t *rx, char *rx_buf, size_t rx_buf_size,
    exanic_cycles32_t *timestamp);

exanic_cycles32_t exanic_get_tx_timestamp(exanic_tx_t *tx);

As described above, exanic_cycles32_t values represent only the lower 32 bits of a full 64 bit timestamp. To expand a 32 bit timestamp, use the exanic_expand_timestamp() function. Note that the high clock speed of ExaNIC HPT devices means that 32 bit values must be expanded quickly. An ExaNIC HPT has an internal clock speed of 4GHz, which means that a 32 bit value wraps around in a little over 1 second (2^32 cycles at 4GHz is about 1.07 seconds). If you do not expand a 32 bit timestamp within about 1 second of it being captured, the result is undefined.

exanic_cycles_t exanic_expand_timestamp(exanic_t *exanic,
    exanic_cycles32_t timestamp);
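
To see why the expansion window is tied to the wrap period, it can help to look at the general technique for widening a truncated counter. The sketch below is not the libexanic implementation and does not call the library. The function names are hypothetical, and it assumes the 32 bit timestamp was captured at or before a known 64 bit "now" value and fewer than 2^32 cycles ago.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a 32 bit cycle counter wraps at the given tick rate. */
double wrap_period_seconds(double tick_hz)
{
    assert(tick_hz > 0.0);
    return 4294967296.0 / tick_hz;   /* 2^32 cycles */
}

/*
 * Hypothetical helper: rebuild a 64 bit cycle count from its lower 32 bits.
 * Assumes the timestamp is not newer than `now` and is less than 2^32
 * cycles old. An older timestamp silently comes out 2^32 cycles (or a
 * multiple of that) too recent.
 */
uint64_t expand_before_now(uint64_t now, uint32_t ts32)
{
    /* Cycles elapsed since capture, computed modulo 2^32. */
    uint32_t age = (uint32_t)now - ts32;
    return now - age;
}
```

At 4GHz, wrap_period_seconds(4e9) is about 1.07, which is the "little over 1 second" mentioned above. A timestamp older than that is indistinguishable from a newer one with the same lower 32 bits.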

For speed sensitive applications, we recommend performing timestamp related calculations in cycles space. Timestamps can then be converted to picosecond or nanosecond values at a later stage. The libexanic API contains a variety of functions for converting to useful formats. Cycle values can be converted directly into a 64 bit nanoseconds-since-the-epoch value, or into a familiar UNIX struct timespec value.

void exanic_cycles_to_timespec(exanic_t *exanic, exanic_cycles_t cycles,
        struct timespec *ts);

uint64_t exanic_cycles_to_ns(exanic_t *exanic, exanic_cycles_t cycles);

Picosecond values since the UNIX epoch can be contained without overflow in a struct exanic_timespecps. Meanwhile, smaller values (such as the difference between two timestamps) can be converted directly into a 64 bit picosecond value.

struct exanic_timespecps
{
    uint64_t tv_sec; /* seconds since UNIX epoch */
    uint64_t tv_psec; /* picosecond portion */
};

void exanic_cycles_to_timespecps(exanic_t *exanic, exanic_cycles_t cycles,
        struct exanic_timespecps *tsps);

uint64_t exanic_cycles_to_ps(exanic_t *exanic, exanic_cycles_t cycles,
        bool *overflow);
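
The reason for having both a split seconds/picoseconds form and a flat 64 bit form becomes clear from the arithmetic: 2^64 picoseconds is only about 213 days, so a picosecond count since the UNIX epoch cannot fit in 64 bits, while a difference between two nearby timestamps easily can. The sketch below illustrates this without calling libexanic. The struct and function names are hypothetical stand-ins, and it assumes a fixed 4GHz tick rate (250ps per cycle) as described in this guide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HPT_TICK_HZ      UINT64_C(4000000000)  /* 4GHz clock, per this guide */
#define HPT_PS_PER_CYCLE UINT64_C(250)         /* 1 / 4GHz = 250ps */

/* Hypothetical stand-in for a seconds + picoseconds pair. */
struct timespec_ps_sketch {
    uint64_t sec;   /* whole seconds */
    uint64_t psec;  /* picosecond portion, always below 10^12 */
};

/* Split an absolute cycle count into whole seconds and leftover picoseconds. */
void cycles_to_sec_psec(uint64_t cycles, struct timespec_ps_sketch *out)
{
    assert(out != NULL);
    out->sec = cycles / HPT_TICK_HZ;
    out->psec = (cycles % HPT_TICK_HZ) * HPT_PS_PER_CYCLE;
}

/* Convert a cycle count to picoseconds, flagging values that do not fit. */
uint64_t cycles_to_ps_checked(uint64_t cycles, bool *overflow)
{
    assert(overflow != NULL);
    *overflow = cycles > UINT64_MAX / HPT_PS_PER_CYCLE;
    return cycles * HPT_PS_PER_CYCLE;   /* wraps modulo 2^64 on overflow */
}
```

A typical use is to subtract two expanded timestamps in cycles space first, then convert only the small difference to picoseconds.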

Performing high quality measurements with the ExaNIC HPT

Context: How far is 0.25ns?

Before getting too involved in the technical details, we need to consider the scale and context (physics) in which your measurements will occur. This is important to ensure that you get the best results from your ExaNIC HPT.

It may seem strange to relate a time (0.25ns) to distance ("how far"), but this is exactly the way we need to think to operate and calibrate the ExaNIC HPT correctly. To understand why, let's start with one of the fundamental constants of the universe, the speed of light.

The speed of light in a vacuum (c) is known to be very slightly below 300 million meters per second. The propagation of electrical and optical signals in copper/fibre is related to the speed of light. As a rough rule of thumb, we can say that a signal (electrical or optical) will travel at about 0.65c, or roughly 200 million meters per second. This means that in 0.25ns, a signal can propagate about 50mm. This is important, because the physical scale of the ExaNIC HPT is much larger than 50mm. If we are to operate and calibrate the ExaNIC HPT, we need to take into account many effects that are usually considered "in the noise" for other devices: for example, cable lengths, transceiver delays, through-chip delays and even propagation delays through the physical board itself. With this being the case, the first and most obvious question is "how do you compensate for these delays?".
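
The 50mm figure can be checked with a few lines of arithmetic. The sketch below is a worked example only. The function name is hypothetical and the 0.65 velocity factor is the rule of thumb from the paragraph above, not a measured property of any particular cable.

```c
#include <assert.h>

/* Speed of light in a vacuum, in metres per second. */
#define SPEED_OF_LIGHT_M_PER_S 299792458.0

/* Distance in metres a signal covers in `seconds` at `velocity_factor` * c. */
double propagation_distance_m(double seconds, double velocity_factor)
{
    assert(velocity_factor > 0.0 && velocity_factor <= 1.0);
    return SPEED_OF_LIGHT_M_PER_S * velocity_factor * seconds;
}
```

propagation_distance_m(0.25e-9, 0.65) comes to a little under 0.049m, i.e. about 50mm for one 0.25ns timestamp step.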

Architecture: Where does the measurement actually happen?

NIC pipeline

All network adapters and capture cards need to perform similar internal operations to receive packets. The figure above shows the typical path that a packet takes once it arrives at the SFP+ connector socket as an electrical signal (either from a DA cable or from the output of an SFP+ optical module).

The ExaNIC HPT operates quite differently to other network devices. Rather than running a 156MHz clock, it runs a 4GHz clock for timestamping. This gives a sampling resolution of 0.25ns. Also, unlike other devices, the ExaNIC HPT takes its timestamp at the moment that the first bit arrives at the FPGA, rather than when the first bit exits the MAC unit. This means that measurements from the ExaNIC HPT are more precise than those from other network capture devices. It also means that you should be careful when directly comparing round trip times and other values between the ExaNIC HPT and other devices.

Compensating for measurement effects

To perform high quality measurements on the ExaNIC HPT, you need to take into account two sources of propagation delay, both governed by the speed of light: the lengths and types of the cables that you use to connect your device, and the cable and track lengths through the device, which may differ between ports.

The exanic-benchmarker application has compensation values built in, though they are estimates and may vary depending on your measurement setup.

The following values are used to calculate the speed of signal propagation in various media as well as an estimate for the time taken to cross an SFP optical-electrical module.

#define SR_SFP_LATENCY                0.9   //900ps RX+TX                          
#define NANOS_PER_METER_FIBER         4.98                                         
#define NANOS_PER_METER_TWINAX_AWG30  4.45                                         
#define NANOS_PER_METER_TWINAX_AWG24  4.76        

switch (media) {
    case MEDIA_TYPE_FIBRE:
        /* Optical modules add a fixed delay on top of the fibre itself. */
        offset += SR_SFP_LATENCY;
        offset += NANOS_PER_METER_FIBER * tx_cable_len;
        offset += NANOS_PER_METER_FIBER * rx_cable_len;
        break;
    case MEDIA_TYPE_AWG24:
        offset += NANOS_PER_METER_TWINAX_AWG24 * tx_cable_len;
        offset += NANOS_PER_METER_TWINAX_AWG24 * rx_cable_len;
        break;
    case MEDIA_TYPE_AWG30:
        offset += NANOS_PER_METER_TWINAX_AWG30 * tx_cable_len;
        offset += NANOS_PER_METER_TWINAX_AWG30 * rx_cable_len;
        break;
}

Meanwhile, the following values are used to compensate for constant FPGA related effects and for per-port effects:

float offset = 33;
if (tx_port == 0 && rx_port == 0)                                           
    offset += 0.1;                                                          
else if (tx_port == 1 && rx_port == 0)                                      
    offset += 0.625;                                                        
else if (tx_port == 0 && rx_port == 1)                                      
    offset += 0.375;                                                        
else if (tx_port == 1 && rx_port == 1)                                      
    offset += 0.85;
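
The two excerpts above can be combined into a single function to see how a total compensation value is built up. This is a minimal sketch rather than the tool's actual code. The function names, the enum declaration and the use of one media type for both cables are assumptions made for illustration, and subtracting the offset from a raw measurement is an assumed usage. The constants are the ones quoted above.

```c
#include <assert.h>

#define SR_SFP_LATENCY                0.9   /* 900ps RX+TX */
#define NANOS_PER_METER_FIBER         4.98
#define NANOS_PER_METER_TWINAX_AWG30  4.45
#define NANOS_PER_METER_TWINAX_AWG24  4.76

/* Assumed declaration; only the constant names appear in the excerpt above. */
enum media_type { MEDIA_TYPE_FIBRE, MEDIA_TYPE_AWG24, MEDIA_TYPE_AWG30 };

/* Total delay in ns to remove from a raw TX-to-RX measurement (sketch). */
double compensation_offset_ns(enum media_type media, int tx_port, int rx_port,
                              double tx_cable_len, double rx_cable_len)
{
    assert(tx_cable_len >= 0.0 && rx_cable_len >= 0.0);

    double offset = 33.0;   /* constant FPGA related delay */

    /* Per-port adjustment for track length differences. */
    if (tx_port == 0 && rx_port == 0)
        offset += 0.1;
    else if (tx_port == 1 && rx_port == 0)
        offset += 0.625;
    else if (tx_port == 0 && rx_port == 1)
        offset += 0.375;
    else if (tx_port == 1 && rx_port == 1)
        offset += 0.85;

    /* Cable propagation delay, plus module delay for optical links. */
    switch (media) {
    case MEDIA_TYPE_FIBRE:
        offset += SR_SFP_LATENCY;
        offset += NANOS_PER_METER_FIBER * (tx_cable_len + rx_cable_len);
        break;
    case MEDIA_TYPE_AWG24:
        offset += NANOS_PER_METER_TWINAX_AWG24 * (tx_cable_len + rx_cable_len);
        break;
    case MEDIA_TYPE_AWG30:
        offset += NANOS_PER_METER_TWINAX_AWG30 * (tx_cable_len + rx_cable_len);
        break;
    }
    return offset;
}

/* Assumed usage: remove the offset from a raw measurement. */
double compensated_latency_ns(double raw_ns, double offset_ns)
{
    return raw_ns - offset_ns;
}
```

For the system measurement example earlier in this guide (fibre, TX port 0, RX port 1, two 1.1m cables), this sketch gives an offset of about 45.23ns.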