Reducing latency with Exasock

Author David - Date 2014/07/14

We've been working really hard lately to bring you our kernel bypass sockets library for the ExaNIC, and we're really pleased with the results. We call it ExaNIC sockets - “exasock” - and it's a library that allows you to transparently improve the latency of your existing applications without requiring a rebuild.

So just what is a kernel bypass sockets library and how could it help you? Let's say, like many of our customers, you have an existing TCP or UDP networking application that is latency critical. In some cases, either the source code for that application isn't available or the effort required to port it to a custom networking API is prohibitive. For these customers, an easy way to get instant performance gains is to use a library that intercepts regular socket calls and provides faster alternatives. This is exactly what exasock does, and the latency boost is impressive.

For those in finance, the industry standard way of measuring latency is via benchmarking through STAC, and we've certainly done that for exasock, but the results are only available to STAC members. For those with access I'd recommend you check them out, but in this post I'd like to show you the performance of exasock using sockperf, a fairly standard sockets benchmarking utility. At the same time I'd like to show you how easy it is to get started with exasock. Let's get going!

First things first, let's start sockperf with exasock acceleration. To do this we simply prefix the application with exasock, like this:

$ exasock taskset -c 5 ./sockperf pp -i -t 5 -m 12

Additionally, I'm using taskset to pin the sockperf process to an isolated CPU, which prevents the process from being interrupted by the scheduler and allows us to get more consistent results. In this case, we're doing a UDP ping-pong latency test, running for 5 seconds with a message payload of 12 bytes. By prefixing sockperf with exasock, all socket calls are transparently accelerated. We have a second machine running sockperf in server mode, also accelerated by exasock. The results look like this:

sockperf: == version #2.5.241 ==
sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)

[ 0] IP =    PORT = 11111 # UDP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=5.100 sec; SentMessages=2307403; ReceivedMessages=2307402
sockperf: ========= Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=5.000 sec; SentMessages=2263449; ReceivedMessages=2263449
sockperf: ====> avg-lat=  1.091 (std-dev=0.050)
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 1.091 usec
sockperf: Total 2263449 observations; each percentile contains 22634.49 observations
sockperf: ---> <MAX> observation =   10.729
sockperf: ---> percentile  99.99 =    2.403
sockperf: ---> percentile  99.90 =    1.973
sockperf: ---> percentile  99.50 =    1.199
sockperf: ---> percentile  99.00 =    1.182
sockperf: ---> percentile  95.00 =    1.130
sockperf: ---> percentile  90.00 =    1.117
sockperf: ---> percentile  75.00 =    1.102
sockperf: ---> percentile  50.00 =    1.083
sockperf: ---> percentile  25.00 =    1.073
sockperf: ---> <MIN> observation =    1.038

That's less than 1.1 microseconds average latency with an equally low median (50th percentile) latency. Of course, exasock also supports TCP. Running sockperf again, this time with the --tcp switch on both client and server, we get average and median figures of:

sockperf: ====> avg-lat=  1.554 (std-dev=6.217)
sockperf: ---> percentile  50.00 =    1.132
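If you want to collect these figures across many runs, the summary lines are regular enough to pull out with a couple of regular expressions. The following is a minimal sketch in Python, assuming the output format matches the lines shown in this post (other sockperf versions may print things differently). The function name parse_sockperf_summary is my own and is not part of sockperf or exasock.

```python
import re

# Patterns for the summary lines shown in this post; other sockperf
# versions may format their output differently.
AVG_RE = re.compile(r"avg-lat=\s*([\d.]+)\s*\(std-dev=([\d.]+)\)")
PCT_RE = re.compile(r"percentile\s+([\d.]+)\s*=\s*([\d.]+)")


def parse_sockperf_summary(text):
    """Return (avg_usec, std_dev_usec, {percentile: usec}) from sockperf output."""
    avg = std_dev = None
    percentiles = {}
    for line in text.splitlines():
        m = AVG_RE.search(line)
        if m:
            avg, std_dev = float(m.group(1)), float(m.group(2))
            continue
        m = PCT_RE.search(line)
        if m:
            percentiles[float(m.group(1))] = float(m.group(2))
    return avg, std_dev, percentiles


# The two TCP summary lines quoted above.
sample = """\
sockperf: ====> avg-lat=  1.554 (std-dev=6.217)
sockperf: ---> percentile  50.00 =    1.132
"""

avg, std_dev, percentiles = parse_sockperf_summary(sample)
print(f"avg={avg} usec, std-dev={std_dev}, median={percentiles[50.0]} usec")
```

Keying the percentiles by their numeric value makes it easy to look up the median (50.0) or a tail figure such as 99.9 from the same dictionary.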

If we remove exasock acceleration, running the command:

$ taskset -c 5 ./sockperf pp -i -t 5 -m 12

the latency increases noticeably:

sockperf: ====> avg-lat= 12.462 (std-dev=0.181)
sockperf: ---> percentile  50.00 =   12.432

and adding the --tcp switch to client and server, also without exasock:

sockperf: ====> avg-lat= 16.318 (std-dev=0.374)
sockperf: ---> percentile  50.00 =   16.235
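To put the four runs side by side, here is a small Python sketch that works out the ratio between the unaccelerated and accelerated latencies. The numbers are copied from the sockperf output in this post. The dictionary layout and the "kernel" label for the runs without exasock are just my own naming for illustration.

```python
# Latencies in microseconds, copied from the sockperf runs in this post.
# "kernel" is shorthand here for the runs without exasock acceleration.
results = {
    "UDP": {"exasock": {"avg": 1.091, "median": 1.083},
            "kernel": {"avg": 12.462, "median": 12.432}},
    "TCP": {"exasock": {"avg": 1.554, "median": 1.132},
            "kernel": {"avg": 16.318, "median": 16.235}},
}


def speedup(protocol, metric):
    """How many times lower the latency is with exasock for one run pair."""
    r = results[protocol]
    return r["kernel"][metric] / r["exasock"][metric]


for protocol in results:
    for metric in ("avg", "median"):
        ratio = speedup(protocol, metric)
        print(f"{protocol} {metric}: {ratio:.1f}x lower latency with exasock")
```

On these figures the median latency is about 11.5 times lower for UDP and about 14.3 times lower for TCP. The TCP average with exasock shows a smaller ratio (about 10.5) because that run has a much larger standard deviation, so the median is probably the fairer comparison there.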

We've repeated the above tests using sockperf to measure the performance of exasock over a range of different TCP and UDP packet sizes, and we think the results are impressive. Here we're looking at the median latency figures measured using sockperf with exasock acceleration:

[Graph: Exasock performance at different payload sizes]
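If you'd like a feel for what a payload-size sweep like this involves, here is a minimal sketch of a UDP ping-pong test in plain Python. To be clear about the assumptions: this is not how sockperf is implemented, it runs client and echo server on the same machine over loopback, and the payload sizes and round count are arbitrary values I picked for illustration. The absolute numbers it prints are not comparable to the results in this post. It only shows the shape of the measurement: send a payload, wait for the echo, record the round-trip time, and take the median.

```python
import socket
import statistics
import threading
import time


def echo_server(sock):
    """Echo every datagram back to its sender until an empty one arrives."""
    while True:
        data, addr = sock.recvfrom(65535)
        if not data:
            break
        sock.sendto(data, addr)


def median_rtt_usec(payload_size, rounds=200):
    """Median UDP round-trip time over loopback for one payload size (>= 1 byte)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    thread = threading.Thread(target=echo_server, args=(server,), daemon=True)
    thread.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    payload = b"x" * payload_size
    samples = []
    try:
        for _ in range(rounds):
            start = time.perf_counter_ns()
            client.sendto(payload, server.getsockname())
            client.recvfrom(65535)
            samples.append((time.perf_counter_ns() - start) / 1000.0)
    finally:
        client.sendto(b"", server.getsockname())  # ask the echo thread to stop
        thread.join(timeout=2.0)
        client.close()
        server.close()
    return statistics.median(samples)


if __name__ == "__main__":
    # Hypothetical payload sizes, chosen only for illustration.
    for size in (12, 64, 256, 1024):
        print(f"{size:5d} bytes: median RTT {median_rtt_usec(size):.1f} usec")
```

Because the script only uses ordinary socket calls, it is the same kind of unmodified application code that this post has been talking about throughout.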

So if you'd like to see how your application performs with exasock, get in touch. We can organise a trial for you, and we'd love to hear about the performance gains you get!