WireGuard Performance Tuning

WireGuard generally doesn’t require any performance tuning to use. It’s been designed to work well on modern network stacks under a variety of different configurations. Sending traffic through its encrypted tunnel requires only a little bit of overhead, in the form of slightly higher CPU and network usage.

That said, there are a few things you can adjust if you are experiencing WireGuard performance issues. This article will walk you through some strategies for Testing and Tuning your WireGuard network performance.

Testing

How do you know if you have a WireGuard performance issue? There are basically two classes of issues:

  1. Poor performance with a Single Connection

  2. Poor performance when Many Connections are active

Single Connection

To check for performance with a single WireGuard connection, try running a bandwidth-intensive operation between two endpoints that reflects the performance issue you appear to have, like uploading or downloading a large file to or from a web application, or participating in a video call. Run it several times with the WireGuard connection up, and several times using the same exact endpoints with the WireGuard connection down, and compare your results.

You should expect to see around 90% of the performance with your tests when using WireGuard as compared to when not using WireGuard. You may need to temporarily change some of your firewall or other network settings to allow the two endpoints to connect outside of the WireGuard tunnel.
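
For example, if one of the endpoints runs a web server hosting a large file, you could time the same download through the tunnel and outside of it with curl, and compare the average transfer speed it reports for each run (bigfile.bin here is a hypothetical file, and 10.0.0.2 and 203.0.113.2 stand in for the server’s WireGuard and public IP addresses, matching the examples later in this article):

$ curl -o /dev/null http://10.0.0.2/bigfile.bin
$ curl -o /dev/null http://203.0.113.2/bigfile.bin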

If it’s not possible to connect the two endpoints outside of WireGuard for testing, and you have one or more WireGuard hops between the two endpoints — for example, you have a WireGuard hub between the two endpoints, or a WireGuard site-to-site connection between the two endpoints — you could try testing each individual hop without WireGuard; and then compare the performance of the worst hop without WireGuard versus the full end-to-end connection with WireGuard. As long as you have only two or three WireGuard hops, you should expect to see at least 50% of the performance with your end-to-end WireGuard test as compared to sending the same traffic without WireGuard over the worst performing hop.

If the performance of a specific application over WireGuard is important to you, you should test with it; but if you don’t have a specific application in mind (or it’s not feasible to test it from the same endpoints when using WireGuard as when not), iPerf is a nifty utility for testing the general bandwidth available for a connection between two endpoints. It’s available via the iperf3 package in most Linux distros.

For example, to test the generic TCP upload throughput of a WireGuard connection between two endpoints, you can run iperf3 --server on the “server side” of the connection, and iperf3 --client 10.0.0.2 on the “client side” of the connection (where 10.0.0.2 is the IP address of the WireGuard interface on the server side). Then run iperf3 --client 10.0.0.2 --reverse to test the connection’s generic TCP download throughput.

The output of iPerf on the client side will look something like this:

$ iperf3 --client 10.0.0.2
Connecting to host 10.0.0.2, port 5201
[  5] local 10.0.0.1 port 51340 connected to 10.0.0.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  64.0 MBytes   537 Mbits/sec    1    854 KBytes
[  5]   1.00-2.00   sec  78.5 MBytes   659 Mbits/sec    0    854 KBytes
[  5]   2.00-3.00   sec  76.7 MBytes   643 Mbits/sec    0    854 KBytes
[  5]   3.00-4.00   sec  76.6 MBytes   642 Mbits/sec    0    854 KBytes
[  5]   4.00-5.00   sec  77.8 MBytes   653 Mbits/sec    0    854 KBytes
[  5]   5.00-6.00   sec  68.4 MBytes   574 Mbits/sec    7    730 KBytes
[  5]   6.00-7.00   sec  77.9 MBytes   653 Mbits/sec    0    846 KBytes
[  5]   7.00-8.00   sec  76.3 MBytes   640 Mbits/sec    0    846 KBytes
[  5]   8.00-9.00   sec  77.9 MBytes   654 Mbits/sec    0    846 KBytes
[  5]   9.00-10.00  sec  77.7 MBytes   652 Mbits/sec    0    846 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   752 MBytes   631 Mbits/sec    8             sender
[  5]   0.00-10.05  sec   750 MBytes   626 Mbits/sec                  receiver

iperf Done.
$ iperf3 --client 10.0.0.2 --reverse
Connecting to host 10.0.0.2, port 5201
Reverse mode, remote host 10.0.0.2 is sending
[  5] local 10.0.0.1 port 46262 connected to 10.0.0.2 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  15.9 MBytes   133 Mbits/sec
[  5]   1.00-2.00   sec  10.3 MBytes  86.6 Mbits/sec
[  5]   2.00-3.00   sec  7.55 MBytes  63.3 Mbits/sec
[  5]   3.00-4.00   sec  8.38 MBytes  70.2 Mbits/sec
[  5]   4.00-5.00   sec  14.6 MBytes   122 Mbits/sec
[  5]   5.00-6.00   sec  14.2 MBytes   119 Mbits/sec
[  5]   6.00-7.00   sec  18.3 MBytes   154 Mbits/sec
[  5]   7.00-8.00   sec  11.5 MBytes  96.5 Mbits/sec
[  5]   8.00-9.00   sec  12.9 MBytes   108 Mbits/sec
[  5]   9.00-10.00  sec  8.60 MBytes  72.2 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.02  sec   124 MBytes   104 Mbits/sec   64             sender
[  5]   0.00-10.00  sec   122 MBytes   103 Mbits/sec                  receiver

iperf Done.

And the output on the server side will look like this:

$ iperf3 --server
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.0.0.1, port 51334
[  5] local 10.0.0.2 port 5201 connected to 10.0.0.1 port 51340
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  59.0 MBytes   495 Mbits/sec
[  5]   1.00-2.00   sec  77.8 MBytes   652 Mbits/sec
[  5]   2.00-3.00   sec  77.2 MBytes   648 Mbits/sec
[  5]   3.00-4.00   sec  76.7 MBytes   643 Mbits/sec
[  5]   4.00-5.00   sec  77.7 MBytes   651 Mbits/sec
[  5]   5.00-6.00   sec  68.8 MBytes   578 Mbits/sec
[  5]   6.00-7.00   sec  76.7 MBytes   643 Mbits/sec
[  5]   7.00-8.00   sec  76.9 MBytes   645 Mbits/sec
[  5]   8.00-9.00   sec  77.4 MBytes   649 Mbits/sec
[  5]   9.00-10.00  sec  78.0 MBytes   655 Mbits/sec
[  5]  10.00-10.05  sec  4.02 MBytes   661 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.05  sec   750 MBytes   626 Mbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.0.0.1, port 46248
[  5] local 10.0.0.2 port 5201 connected to 10.0.0.1 port 46262
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  16.5 MBytes   138 Mbits/sec    6    194 KBytes
[  5]   1.00-2.00   sec  10.9 MBytes  91.1 Mbits/sec    8   93.1 KBytes
[  5]   2.00-3.00   sec  7.22 MBytes  60.5 Mbits/sec    7    101 KBytes
[  5]   3.00-4.00   sec  8.25 MBytes  69.2 Mbits/sec    7    124 KBytes
[  5]   4.00-5.00   sec  14.8 MBytes   124 Mbits/sec    5    132 KBytes
[  5]   5.00-6.00   sec  14.3 MBytes   120 Mbits/sec    5    186 KBytes
[  5]   6.00-7.00   sec  18.1 MBytes   152 Mbits/sec    4    186 KBytes
[  5]   7.00-8.00   sec  12.0 MBytes   101 Mbits/sec    7    116 KBytes
[  5]   8.00-9.00   sec  12.4 MBytes   104 Mbits/sec    5    210 KBytes
[  5]   9.00-10.00  sec  8.85 MBytes  74.3 Mbits/sec   10   85.4 KBytes
[  5]  10.00-10.02  sec   559 KBytes   215 Mbits/sec    0   93.1 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.02  sec   124 MBytes   104 Mbits/sec   64             sender
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------

From these two runs, it looks like our generic TCP upload throughput through the WireGuard tunnel is around 600 Mbits/sec, and download throughput is around 100 Mbits/sec.

The output of each side should look about the same, but notice that we only see the TCP congestion window size calculation (the column labeled “Cwnd” in the output) on the “sending side” of the connection — on the client side when testing upload speeds (first test), and on the server side when testing download speeds (second test).

The congestion window is a key factor in determining how fast the sending side of the connection can send data to the recipient. A bigger window allows the sender to send more data to the recipient before the sender requires an acknowledgement from the recipient that it received all the packets sent. The sender will pause when it hits the end of that window without getting an acknowledgement; and if it has to pause too long, it will try resending a subset of the packets it just sent (i.e. retransmit with a smaller congestion window).
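
If you want to watch the congestion window outside of iPerf, on Linux you can inspect the live TCP state of any connection with the ss utility. Run on the sending side, the following would show (among other internals) the current cwnd, counted in segments, for connections to the server’s WireGuard address:

$ ss -ti dst 10.0.0.2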

Ideally, the network stack of the endpoint on the sending side should be able to figure out the optimal congestion window size pretty quickly, and mostly stick to it for the duration of the test. That’s what happened with our first test above (upload), but not with the second (download). With the second test, the congestion window started at 194K, then immediately dropped to 93K, then eventually rose up to 210K, before ending the test back at 93K.

To get a more reproducible result, it’d be better to throw out the parts of the test where the network stack is searching for the best congestion window; and if the congestion window still bounces around, to take a longer sample of data than iPerf’s default of 10 seconds.

Let’s try running the download test again, but this time throwing out the first 10 seconds, and then taking a sample 20 seconds long. We’d run the following command on the client side:

$ iperf3 --client 10.0.0.2 --reverse --omit 10 --time 20
Connecting to host 10.0.0.2, port 5201
Reverse mode, remote host 10.0.0.2 is sending
[  5] local 10.0.0.1 port 52710 connected to 10.0.0.2 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  16.8 MBytes   141 Mbits/sec                  (omitted)
[  5]   1.00-2.00   sec  9.58 MBytes  80.4 Mbits/sec                  (omitted)
[  5]   2.00-3.00   sec  11.1 MBytes  93.1 Mbits/sec                  (omitted)
[  5]   3.00-4.00   sec  12.5 MBytes   105 Mbits/sec                  (omitted)
[  5]   4.00-5.00   sec  13.5 MBytes   114 Mbits/sec                  (omitted)
[  5]   5.00-6.00   sec  11.8 MBytes  98.9 Mbits/sec                  (omitted)
[  5]   6.00-7.00   sec  8.60 MBytes  72.1 Mbits/sec                  (omitted)
[  5]   7.00-8.00   sec  10.5 MBytes  88.5 Mbits/sec                  (omitted)
[  5]   8.00-9.00   sec  11.4 MBytes  95.7 Mbits/sec                  (omitted)
[  5]   9.00-10.00  sec  8.79 MBytes  73.8 Mbits/sec                  (omitted)
[  5]   0.00-1.00   sec  6.55 MBytes  54.9 Mbits/sec
[  5]   1.00-2.00   sec  7.75 MBytes  65.0 Mbits/sec
[  5]   2.00-3.00   sec  12.7 MBytes   107 Mbits/sec
[  5]   3.00-4.00   sec  11.1 MBytes  93.2 Mbits/sec
[  5]   4.00-5.00   sec  9.13 MBytes  76.6 Mbits/sec
[  5]   5.00-6.00   sec  11.6 MBytes  96.9 Mbits/sec
[  5]   6.00-7.00   sec  6.87 MBytes  57.6 Mbits/sec
[  5]   7.00-8.00   sec  10.3 MBytes  86.5 Mbits/sec
[  5]   8.00-9.00   sec  7.70 MBytes  64.6 Mbits/sec
[  5]   9.00-10.00  sec  12.2 MBytes   102 Mbits/sec
[  5]  10.00-11.00  sec  9.21 MBytes  77.3 Mbits/sec
[  5]  11.00-12.00  sec  8.99 MBytes  75.4 Mbits/sec
[  5]  12.00-13.00  sec  15.4 MBytes   129 Mbits/sec
[  5]  13.00-14.00  sec  13.7 MBytes   115 Mbits/sec
[  5]  14.00-15.00  sec  8.47 MBytes  71.0 Mbits/sec
[  5]  15.00-16.00  sec  13.3 MBytes   112 Mbits/sec
[  5]  16.00-17.00  sec  7.90 MBytes  66.3 Mbits/sec
[  5]  17.00-18.00  sec  14.2 MBytes   119 Mbits/sec
[  5]  18.00-19.00  sec  16.6 MBytes   139 Mbits/sec
[  5]  19.00-20.00  sec  7.82 MBytes  65.6 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-20.07  sec   211 MBytes  88.3 Mbits/sec  139             sender
[  5]   0.00-20.00  sec   211 MBytes  88.6 Mbits/sec                  receiver

iperf Done.

And see the following output on the server side:

-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 10.0.0.1, port 52702
[  5] local 10.0.0.2 port 5201 connected to 10.0.0.1 port 52710
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  18.9 MBytes   159 Mbits/sec   13    163 KBytes       (omitted)
[  5]   1.00-2.00   sec  10.0 MBytes  83.9 Mbits/sec    8   85.4 KBytes       (omitted)
[  5]   2.00-3.00   sec  11.0 MBytes  92.1 Mbits/sec    5    140 KBytes       (omitted)
[  5]   3.00-4.00   sec  12.1 MBytes   102 Mbits/sec    5    140 KBytes       (omitted)
[  5]   4.00-5.00   sec  13.2 MBytes   110 Mbits/sec    5    147 KBytes       (omitted)
[  5]   5.00-6.00   sec  12.1 MBytes   102 Mbits/sec    5    163 KBytes       (omitted)
[  5]   6.00-7.00   sec  8.79 MBytes  73.8 Mbits/sec    8    109 KBytes       (omitted)
[  5]   7.00-8.00   sec  9.94 MBytes  83.4 Mbits/sec    6    116 KBytes       (omitted)
[  5]   8.00-9.00   sec  12.1 MBytes   101 Mbits/sec    6    186 KBytes       (omitted)
[  5]   9.00-10.00  sec  8.79 MBytes  73.8 Mbits/sec   10   62.1 KBytes       (omitted)
[  5]   0.00-1.00   sec  6.73 MBytes  56.5 Mbits/sec    7    101 KBytes
[  5]   1.00-2.00   sec  7.70 MBytes  64.6 Mbits/sec   10    101 KBytes
[  5]   2.00-3.00   sec  12.1 MBytes   102 Mbits/sec    6   69.9 KBytes
[  5]   3.00-4.00   sec  12.2 MBytes   102 Mbits/sec    9   46.6 KBytes
[  5]   4.00-5.00   sec  7.70 MBytes  64.6 Mbits/sec    4    171 KBytes
[  5]   5.00-6.00   sec  12.1 MBytes   102 Mbits/sec    8   62.1 KBytes
[  5]   6.00-7.00   sec  7.64 MBytes  64.1 Mbits/sec    9   69.9 KBytes
[  5]   7.00-8.00   sec  9.82 MBytes  82.4 Mbits/sec    9   69.9 KBytes
[  5]   8.00-9.00   sec  6.61 MBytes  55.4 Mbits/sec    5    179 KBytes
[  5]   9.00-10.00  sec  12.0 MBytes   101 Mbits/sec    5    186 KBytes
[  5]  10.00-11.00  sec  9.94 MBytes  83.4 Mbits/sec    8   93.1 KBytes
[  5]  11.00-12.00  sec  8.85 MBytes  74.3 Mbits/sec    8    116 KBytes
[  5]  12.00-13.00  sec  15.3 MBytes   129 Mbits/sec    4    147 KBytes
[  5]  13.00-14.00  sec  14.3 MBytes   120 Mbits/sec    6    116 KBytes
[  5]  14.00-15.00  sec  7.64 MBytes  64.1 Mbits/sec    7    140 KBytes
[  5]  15.00-16.00  sec  14.3 MBytes   120 Mbits/sec    9   62.1 KBytes
[  5]  16.00-17.00  sec  7.82 MBytes  65.6 Mbits/sec    7   85.4 KBytes
[  5]  17.00-18.00  sec  13.3 MBytes   112 Mbits/sec    4    132 KBytes
[  5]  18.00-19.00  sec  17.6 MBytes   148 Mbits/sec    6    132 KBytes
[  5]  19.00-20.00  sec  7.64 MBytes  64.1 Mbits/sec    7    140 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-20.07  sec   211 MBytes  88.3 Mbits/sec  139             sender
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------

It looks like our download connection is just fundamentally pretty flaky, as the congestion window continued to bounce around from a low of 46K to a high of 186K throughout the test. Taking more samples did, however, allow us to arrive at what is probably a more reliable throughput number of around 90 Mbits/sec.

But is that good or bad?

To find out, we need to test the connection outside of the WireGuard tunnel. To do that, we’d run the same test using the public IP address of the server-side endpoint, instead of its WireGuard IP address. To access the iPerf server from the client, we may need to adjust our firewall rules (for example, to allow access to TCP port 5201 on the public interface of the server from the public IP address of the client).
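
For example, with an iptables-based firewall on the server, a rule like the following would allow the client (assuming 198.51.100.11 is the client’s public IP address) to reach the iPerf server directly:

$ sudo iptables -A INPUT -p tcp --dport 5201 -s 198.51.100.11 -j ACCEPT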

For this example, we’ll say the server side of the connection has a public IP address of 203.0.113.2. This is what we see when running the tests from the client side:

$ iperf3 --client 203.0.113.2
Connecting to host 203.0.113.2, port 5201
[  5] local 192.168.1.11 port 59094 connected to 203.0.113.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  84.1 MBytes   706 Mbits/sec    0   1.60 MBytes
[  5]   1.00-2.00   sec  98.8 MBytes   828 Mbits/sec   50   1.17 MBytes
[  5]   2.00-3.00   sec   102 MBytes   860 Mbits/sec    0   1.30 MBytes
[  5]   3.00-4.00   sec   102 MBytes   860 Mbits/sec    0   1.39 MBytes
[  5]   4.00-5.00   sec   102 MBytes   860 Mbits/sec    0   1.47 MBytes
[  5]   5.00-6.00   sec   104 MBytes   870 Mbits/sec    0   1.52 MBytes
[  5]   6.00-7.00   sec   104 MBytes   870 Mbits/sec    0   1.56 MBytes
[  5]   7.00-8.00   sec   104 MBytes   870 Mbits/sec    0   1.58 MBytes
[  5]   8.00-9.00   sec   104 MBytes   870 Mbits/sec    0   1.60 MBytes
[  5]   9.00-10.00  sec   105 MBytes   881 Mbits/sec    0   1.60 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1010 MBytes   848 Mbits/sec   50             sender
[  5]   0.00-10.04  sec  1008 MBytes   843 Mbits/sec                  receiver

iperf Done.
$ iperf3 --client 203.0.113.2 --reverse --omit 10 --time 20
Connecting to host 203.0.113.2, port 5201
Reverse mode, remote host 203.0.113.2 is sending
[  5] local 192.168.1.11 port 33482 connected to 203.0.113.2 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  8.84 MBytes  74.1 Mbits/sec                  (omitted)
[  5]   1.00-2.00   sec  3.70 MBytes  31.0 Mbits/sec                  (omitted)
[  5]   2.00-3.00   sec  5.58 MBytes  46.8 Mbits/sec                  (omitted)
[  5]   3.00-4.00   sec  3.89 MBytes  32.6 Mbits/sec                  (omitted)
[  5]   4.00-5.00   sec  4.19 MBytes  35.2 Mbits/sec                  (omitted)
[  5]   5.00-6.00   sec  3.74 MBytes  31.4 Mbits/sec                  (omitted)
[  5]   6.00-7.00   sec  3.70 MBytes  31.1 Mbits/sec                  (omitted)
[  5]   7.00-8.00   sec  2.51 MBytes  21.0 Mbits/sec                  (omitted)
[  5]   8.00-9.00   sec  4.02 MBytes  33.7 Mbits/sec                  (omitted)
[  5]   9.00-10.00  sec  3.63 MBytes  30.4 Mbits/sec                  (omitted)
[  5]   0.00-1.00   sec  3.84 MBytes  32.2 Mbits/sec
[  5]   1.00-2.00   sec  6.70 MBytes  56.2 Mbits/sec
[  5]   2.00-3.00   sec  4.86 MBytes  40.8 Mbits/sec
[  5]   3.00-4.00   sec  5.28 MBytes  44.3 Mbits/sec
[  5]   4.00-5.00   sec  5.49 MBytes  46.0 Mbits/sec
[  5]   5.00-6.00   sec  6.44 MBytes  54.0 Mbits/sec
[  5]   6.00-7.00   sec  4.43 MBytes  37.2 Mbits/sec
[  5]   7.00-8.00   sec  4.89 MBytes  41.0 Mbits/sec
[  5]   8.00-9.00   sec  5.32 MBytes  44.7 Mbits/sec
[  5]   9.00-10.00  sec  5.44 MBytes  45.7 Mbits/sec
[  5]  10.00-11.00  sec  7.04 MBytes  59.0 Mbits/sec
[  5]  11.00-12.00  sec  5.41 MBytes  45.4 Mbits/sec
[  5]  12.00-13.00  sec  3.16 MBytes  26.5 Mbits/sec
[  5]  13.00-14.00  sec  3.13 MBytes  26.2 Mbits/sec
[  5]  14.00-15.00  sec  3.91 MBytes  32.8 Mbits/sec
[  5]  15.00-16.00  sec  5.84 MBytes  49.0 Mbits/sec
[  5]  16.00-17.00  sec  6.81 MBytes  57.2 Mbits/sec
[  5]  17.00-18.00  sec  7.08 MBytes  59.4 Mbits/sec
[  5]  18.00-19.00  sec  5.71 MBytes  47.9 Mbits/sec
[  5]  19.00-20.00  sec  3.29 MBytes  27.6 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-20.02  sec   104 MBytes  43.5 Mbits/sec  169             sender
[  5]   0.00-20.00  sec   104 MBytes  43.7 Mbits/sec                  receiver

iperf Done.

And this from the server:

-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 198.51.100.11, port 48277
[  5] local 192.168.200.22 port 5201 connected to 198.51.100.11 port 10908
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  78.2 MBytes   656 Mbits/sec
[  5]   1.00-2.00   sec  98.8 MBytes   829 Mbits/sec
[  5]   2.00-3.00   sec   103 MBytes   862 Mbits/sec
[  5]   3.00-4.00   sec   102 MBytes   852 Mbits/sec
[  5]   4.00-5.00   sec   103 MBytes   868 Mbits/sec
[  5]   5.00-6.00   sec   104 MBytes   871 Mbits/sec
[  5]   6.00-7.00   sec   103 MBytes   867 Mbits/sec
[  5]   7.00-8.00   sec   104 MBytes   871 Mbits/sec
[  5]   8.00-9.00   sec   104 MBytes   873 Mbits/sec
[  5]   9.00-10.00  sec   104 MBytes   876 Mbits/sec
[  5]  10.00-10.04  sec  4.17 MBytes   868 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.04  sec  1008 MBytes   843 Mbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 198.51.100.11, port 10592
[  5] local 192.168.200.22 port 5201 connected to 198.51.100.11 port 39572
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  9.78 MBytes  82.0 Mbits/sec   17   48.1 KBytes       (omitted)
[  5]   1.00-2.00   sec  3.98 MBytes  33.4 Mbits/sec   15   38.2 KBytes       (omitted)
[  5]   2.00-3.00   sec  5.72 MBytes  48.0 Mbits/sec    3   52.3 KBytes       (omitted)
[  5]   3.00-4.00   sec  3.98 MBytes  33.4 Mbits/sec   11   29.7 KBytes       (omitted)
[  5]   4.00-5.00   sec  3.91 MBytes  32.8 Mbits/sec    5   50.9 KBytes       (omitted)
[  5]   5.00-6.00   sec  3.91 MBytes  32.8 Mbits/sec   10   63.6 KBytes       (omitted)
[  5]   6.00-7.00   sec  3.91 MBytes  32.8 Mbits/sec    8   24.0 KBytes       (omitted)
[  5]   7.00-8.00   sec  2.24 MBytes  18.8 Mbits/sec   10   33.9 KBytes       (omitted)
[  5]   8.00-9.00   sec  3.91 MBytes  32.8 Mbits/sec    6   45.2 KBytes       (omitted)
[  5]   9.00-10.00  sec  3.91 MBytes  32.8 Mbits/sec    8   39.6 KBytes       (omitted)
[  5]   0.00-1.00   sec  3.36 MBytes  28.1 Mbits/sec    6   50.9 KBytes
[  5]   1.00-2.00   sec  6.71 MBytes  56.3 Mbits/sec    7   56.6 KBytes
[  5]   2.00-3.00   sec  5.03 MBytes  42.2 Mbits/sec    5   82.0 KBytes
[  5]   3.00-4.00   sec  5.59 MBytes  46.9 Mbits/sec   15   48.1 KBytes
[  5]   4.00-5.00   sec  5.03 MBytes  42.3 Mbits/sec    2   91.9 KBytes
[  5]   5.00-6.00   sec  6.71 MBytes  56.3 Mbits/sec   13   67.9 KBytes
[  5]   6.00-7.00   sec  4.47 MBytes  37.5 Mbits/sec    7   50.9 KBytes
[  5]   7.00-8.00   sec  5.03 MBytes  42.2 Mbits/sec    8   50.9 KBytes
[  5]   8.00-9.00   sec  5.03 MBytes  42.2 Mbits/sec    6   63.6 KBytes
[  5]   9.00-10.00  sec  5.59 MBytes  46.8 Mbits/sec    5   79.2 KBytes
[  5]  10.00-11.00  sec  6.71 MBytes  56.4 Mbits/sec    6   56.6 KBytes
[  5]  11.00-12.00  sec  5.59 MBytes  46.9 Mbits/sec   12   46.7 KBytes
[  5]  12.00-13.00  sec  3.42 MBytes  28.7 Mbits/sec   11   33.9 KBytes
[  5]  13.00-14.00  sec  2.80 MBytes  23.5 Mbits/sec   14   41.0 KBytes
[  5]  14.00-15.00  sec  3.91 MBytes  32.8 Mbits/sec    8   49.5 KBytes
[  5]  15.00-16.00  sec  5.65 MBytes  47.4 Mbits/sec    5   60.8 KBytes
[  5]  16.00-17.00  sec  7.27 MBytes  61.0 Mbits/sec    5   91.9 KBytes
[  5]  17.00-18.00  sec  6.77 MBytes  56.8 Mbits/sec   10   63.6 KBytes
[  5]  18.00-19.00  sec  5.72 MBytes  48.0 Mbits/sec    6   79.2 KBytes
[  5]  19.00-20.00  sec  3.42 MBytes  28.7 Mbits/sec   18   38.2 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-20.02  sec   104 MBytes  43.5 Mbits/sec  169             sender
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------

For these two runs, it looks like our generic TCP upload throughput outside the WireGuard tunnel is around 850 Mbits/sec, and download throughput is around 45 Mbits/sec. Our previous results from within the WireGuard tunnel showed around 600 Mbits/sec upload, and 90 Mbits/sec download — so there does appear to be some room for improvement on the upload side (we’re only getting around 70% of our upload throughput when piped through the WireGuard tunnel) — but our download throughput is actually 100% better when using WireGuard!

Why is download throughput so much better with WireGuard than without in this example? This is a fairly unusual result, and it’s probably just a side-effect of a heavily-congested network in our download direction. Some of the hops in that direction may be configured to give UDP traffic more bandwidth than TCP traffic, so in this particular case, we’re able to get a lot more data through the connection simply by wrapping our TCP traffic in WireGuard UDP packets.

But these results do illustrate the need to test each set of endpoints with WireGuard vs non-WireGuard connections in order to figure out whether or not you can expect to achieve any performance benefit from WireGuard-specific tuning.

Many Connections

To check for performance issues when many WireGuard connections are active, you usually need a dedicated load-testing tool, such as JMeter, Gatling, or K6, so that you can simulate the actions of many individual WireGuard users all using their connections at the same time. Similar to testing a Single Connection, you’d run a load test several times with the WireGuard connection up, run the same test with the WireGuard connection down, and then compare the results.

And as with a Single Connection, you should expect to see around 90% of the performance with your tests when using WireGuard as compared to when not using WireGuard.

Also similar to a Single Connection, if it’s not possible to connect from the same endpoints running a load test against a specific application with WireGuard as without, you could use iPerf to at least simulate generic network traffic. (The downside to this is that you’re effectively tuning your performance to produce better iPerf results, rather than necessarily improving the performance of a real application.)

For example, you can simulate additional connections running through the WireGuard tunnel between two endpoints by adding the --parallel option to the iPerf command. The following would simulate 10 connections; the first test in the upload direction, and then the second in the download:

$ iperf3 --client 10.0.0.2 --parallel 10
Connecting to host 10.0.0.2, port 5201
[  5] local 10.0.0.1 port 52428 connected to 10.0.0.2 port 5201
[  7] local 10.0.0.1 port 52442 connected to 10.0.0.2 port 5201
[  9] local 10.0.0.1 port 52444 connected to 10.0.0.2 port 5201
[ 11] local 10.0.0.1 port 52446 connected to 10.0.0.2 port 5201
[ 13] local 10.0.0.1 port 52458 connected to 10.0.0.2 port 5201
[ 15] local 10.0.0.1 port 52468 connected to 10.0.0.2 port 5201
[ 17] local 10.0.0.1 port 52474 connected to 10.0.0.2 port 5201
[ 19] local 10.0.0.1 port 52480 connected to 10.0.0.2 port 5201
[ 21] local 10.0.0.1 port 52490 connected to 10.0.0.2 port 5201
[ 23] local 10.0.0.1 port 52504 connected to 10.0.0.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  21.0 MBytes   176 Mbits/sec   51    225 KBytes
[  7]   0.00-1.00   sec  10.8 MBytes  91.0 Mbits/sec   43    171 KBytes
[  9]   0.00-1.00   sec  8.40 MBytes  70.4 Mbits/sec   28   85.4 KBytes
[ 11]   0.00-1.00   sec  8.12 MBytes  68.1 Mbits/sec   26    147 KBytes
[ 13]   0.00-1.00   sec  11.7 MBytes  98.1 Mbits/sec   29    256 KBytes
[ 15]   0.00-1.00   sec  7.97 MBytes  66.9 Mbits/sec   28    171 KBytes
[ 17]   0.00-1.00   sec  10.7 MBytes  89.5 Mbits/sec   26    171 KBytes
[ 19]   0.00-1.00   sec  10.5 MBytes  88.4 Mbits/sec   30    194 KBytes
[ 21]   0.00-1.00   sec  9.51 MBytes  79.8 Mbits/sec   30    186 KBytes
[ 23]   0.00-1.00   sec  7.94 MBytes  66.6 Mbits/sec   33    179 KBytes
[SUM]   0.00-1.00   sec   107 MBytes   895 Mbits/sec  324
- - - - - - - - - - - - - - - - - - - - - - - - -
...
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   9.00-10.00  sec  7.12 MBytes  59.8 Mbits/sec    5    140 KBytes
[  7]   9.00-10.00  sec  10.4 MBytes  87.5 Mbits/sec    5    202 KBytes
[  9]   9.00-10.00  sec  11.8 MBytes  98.7 Mbits/sec   11    124 KBytes
[ 11]   9.00-10.00  sec  9.58 MBytes  80.4 Mbits/sec    7    186 KBytes
[ 13]   9.00-10.00  sec  11.2 MBytes  94.1 Mbits/sec    5    155 KBytes
[ 15]   9.00-10.00  sec  8.31 MBytes  69.7 Mbits/sec   10    116 KBytes
[ 17]   9.00-10.00  sec  12.3 MBytes   103 Mbits/sec    5    186 KBytes
[ 19]   9.00-10.00  sec  6.55 MBytes  54.9 Mbits/sec    7    155 KBytes
[ 21]   9.00-10.00  sec  6.73 MBytes  56.5 Mbits/sec    8    101 KBytes
[ 23]   9.00-10.00  sec  11.0 MBytes  92.6 Mbits/sec    4    225 KBytes
[SUM]   9.00-10.00  sec  95.1 MBytes   797 Mbits/sec   67
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   116 MBytes  97.5 Mbits/sec  120             sender
[  5]   0.00-10.02  sec   113 MBytes  94.8 Mbits/sec                  receiver
[  7]   0.00-10.00  sec   113 MBytes  95.2 Mbits/sec  111             sender
[  7]   0.00-10.02  sec   112 MBytes  93.8 Mbits/sec                  receiver
[  9]   0.00-10.00  sec  87.5 MBytes  73.4 Mbits/sec   94             sender
[  9]   0.00-10.02  sec  85.9 MBytes  71.9 Mbits/sec                  receiver
[ 11]   0.00-10.00  sec  94.2 MBytes  79.0 Mbits/sec   88             sender
[ 11]   0.00-10.02  sec  93.1 MBytes  77.9 Mbits/sec                  receiver
[ 13]   0.00-10.00  sec   101 MBytes  84.9 Mbits/sec   88             sender
[ 13]   0.00-10.02  sec  99.8 MBytes  83.5 Mbits/sec                  receiver
[ 15]   0.00-10.00  sec  84.9 MBytes  71.2 Mbits/sec   99             sender
[ 15]   0.00-10.02  sec  83.9 MBytes  70.2 Mbits/sec                  receiver
[ 17]   0.00-10.00  sec  96.3 MBytes  80.8 Mbits/sec   81             sender
[ 17]   0.00-10.02  sec  95.0 MBytes  79.5 Mbits/sec                  receiver
[ 19]   0.00-10.00  sec  96.0 MBytes  80.6 Mbits/sec  113             sender
[ 19]   0.00-10.02  sec  94.7 MBytes  79.2 Mbits/sec                  receiver
[ 21]   0.00-10.00  sec  93.3 MBytes  78.3 Mbits/sec   94             sender
[ 21]   0.00-10.02  sec  92.2 MBytes  77.1 Mbits/sec                  receiver
[ 23]   0.00-10.00  sec  87.8 MBytes  73.6 Mbits/sec   92             sender
[ 23]   0.00-10.02  sec  86.7 MBytes  72.5 Mbits/sec                  receiver
[SUM]   0.00-10.00  sec   971 MBytes   814 Mbits/sec  980             sender
[SUM]   0.00-10.02  sec   957 MBytes   800 Mbits/sec                  receiver

iperf Done.
$ iperf3 --client 10.0.0.2 --parallel 10 --reverse --omit 10 --time 20
Connecting to host 10.0.0.2, port 5201
Reverse mode, remote host 10.0.0.2 is sending
[  5] local 10.0.0.1 port 37494 connected to 10.0.0.2 port 5201
[  7] local 10.0.0.1 port 37506 connected to 10.0.0.2 port 5201
[  9] local 10.0.0.1 port 37518 connected to 10.0.0.2 port 5201
[ 11] local 10.0.0.1 port 37530 connected to 10.0.0.2 port 5201
[ 13] local 10.0.0.1 port 37534 connected to 10.0.0.2 port 5201
[ 15] local 10.0.0.1 port 37540 connected to 10.0.0.2 port 5201
[ 17] local 10.0.0.1 port 37550 connected to 10.0.0.2 port 5201
[ 19] local 10.0.0.1 port 37554 connected to 10.0.0.2 port 5201
[ 21] local 10.0.0.1 port 37564 connected to 10.0.0.2 port 5201
[ 23] local 10.0.0.1 port 37572 connected to 10.0.0.2 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  5.45 MBytes  45.7 Mbits/sec                  (omitted)
[  7]   0.00-1.00   sec  6.23 MBytes  52.3 Mbits/sec                  (omitted)
[  9]   0.00-1.00   sec  4.09 MBytes  34.3 Mbits/sec                  (omitted)
[ 11]   0.00-1.00   sec  6.37 MBytes  53.4 Mbits/sec                  (omitted)
[ 13]   0.00-1.00   sec  5.84 MBytes  49.0 Mbits/sec                  (omitted)
[ 15]   0.00-1.00   sec  2.58 MBytes  21.6 Mbits/sec                  (omitted)
[ 17]   0.00-1.00   sec  3.60 MBytes  30.2 Mbits/sec                  (omitted)
[ 19]   0.00-1.00   sec  4.68 MBytes  39.3 Mbits/sec                  (omitted)
[ 21]   0.00-1.00   sec  3.14 MBytes  26.3 Mbits/sec                  (omitted)
[ 23]   0.00-1.00   sec  5.37 MBytes  45.1 Mbits/sec                  (omitted)
[SUM]   0.00-1.00   sec  47.4 MBytes   397 Mbits/sec                  (omitted)
- - - - - - - - - - - - - - - - - - - - - - - - -
...
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  19.00-20.00  sec  9.37 MBytes  78.6 Mbits/sec
[  7]  19.00-20.00  sec  3.81 MBytes  31.9 Mbits/sec
[  9]  19.00-20.00  sec  5.42 MBytes  45.5 Mbits/sec
[ 11]  19.00-20.00  sec  6.81 MBytes  57.1 Mbits/sec
[ 13]  19.00-20.00  sec  7.72 MBytes  64.8 Mbits/sec
[ 15]  19.00-20.00  sec  4.66 MBytes  39.1 Mbits/sec
[ 17]  19.00-20.00  sec  4.79 MBytes  40.2 Mbits/sec
[ 19]  19.00-20.00  sec  4.38 MBytes  36.8 Mbits/sec
[ 21]  19.00-20.00  sec  5.56 MBytes  46.7 Mbits/sec
[ 23]  19.00-20.00  sec  5.11 MBytes  42.9 Mbits/sec
[SUM]  19.00-20.00  sec  57.6 MBytes   483 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-20.04  sec   125 MBytes  52.3 Mbits/sec  135             sender
[  5]   0.00-20.00  sec   125 MBytes  52.3 Mbits/sec                  receiver
[  7]   0.00-20.04  sec   114 MBytes  47.8 Mbits/sec  132             sender
[  7]   0.00-20.00  sec   114 MBytes  47.8 Mbits/sec                  receiver
[  9]   0.00-20.04  sec   120 MBytes  50.3 Mbits/sec  122             sender
[  9]   0.00-20.00  sec   120 MBytes  50.2 Mbits/sec                  receiver
[ 11]   0.00-20.04  sec   112 MBytes  46.8 Mbits/sec  146             sender
[ 11]   0.00-20.00  sec   112 MBytes  46.9 Mbits/sec                  receiver
[ 13]   0.00-20.04  sec   117 MBytes  49.1 Mbits/sec  137             sender
[ 13]   0.00-20.00  sec   117 MBytes  49.2 Mbits/sec                  receiver
[ 15]   0.00-20.04  sec   112 MBytes  46.9 Mbits/sec  131             sender
[ 15]   0.00-20.00  sec   112 MBytes  46.8 Mbits/sec                  receiver
[ 17]   0.00-20.04  sec   114 MBytes  47.8 Mbits/sec  139             sender
[ 17]   0.00-20.00  sec   114 MBytes  48.0 Mbits/sec                  receiver
[ 19]   0.00-20.04  sec   112 MBytes  46.8 Mbits/sec  130             sender
[ 19]   0.00-20.00  sec   111 MBytes  46.7 Mbits/sec                  receiver
[ 21]   0.00-20.04  sec   119 MBytes  50.0 Mbits/sec  130             sender
[ 21]   0.00-20.00  sec   119 MBytes  50.0 Mbits/sec                  receiver
[ 23]   0.00-20.04  sec   113 MBytes  47.3 Mbits/sec  122             sender
[ 23]   0.00-20.00  sec   113 MBytes  47.3 Mbits/sec                  receiver
[SUM]   0.00-20.04  sec  1.13 GBytes   485 Mbits/sec  1324             sender
[SUM]   0.00-20.00  sec  1.13 GBytes   485 Mbits/sec                  receiver

iperf Done.

Those 10 connections managed about 800 Mbits/sec of combined throughput when uploading, and 485 Mbits/sec downloading.

Just like with a Single Connection, we’d want to compare this to the same number of connections running without WireGuard. To do that, we’d run the same tests using the public IP address of the server endpoint, instead of its WireGuard IP:

$ iperf3 --client 203.0.113.2 --parallel 10
Connecting to host 203.0.113.2, port 5201
[  5] local 192.168.1.11 port 34860 connected to 203.0.113.2 port 5201
[  7] local 192.168.1.11 port 34864 connected to 203.0.113.2 port 5201
[  9] local 192.168.1.11 port 34872 connected to 203.0.113.2 port 5201
[ 11] local 192.168.1.11 port 34884 connected to 203.0.113.2 port 5201
[ 13] local 192.168.1.11 port 34898 connected to 203.0.113.2 port 5201
[ 15] local 192.168.1.11 port 34914 connected to 203.0.113.2 port 5201
[ 17] local 192.168.1.11 port 34922 connected to 203.0.113.2 port 5201
[ 19] local 192.168.1.11 port 34928 connected to 203.0.113.2 port 5201
[ 21] local 192.168.1.11 port 34938 connected to 203.0.113.2 port 5201
[ 23] local 192.168.1.11 port 34954 connected to 203.0.113.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  17.5 MBytes   147 Mbits/sec   24    529 KBytes
[  7]   0.00-1.00   sec  14.5 MBytes   122 Mbits/sec   29    281 KBytes
[  9]   0.00-1.00   sec  15.9 MBytes   133 Mbits/sec   39    297 KBytes
[ 11]   0.00-1.00   sec  10.1 MBytes  84.3 Mbits/sec   36    298 KBytes
[ 13]   0.00-1.00   sec  5.33 MBytes  44.7 Mbits/sec    8    140 KBytes
[ 15]   0.00-1.00   sec  15.2 MBytes   127 Mbits/sec   44    290 KBytes
[ 17]   0.00-1.00   sec  1.94 MBytes  16.3 Mbits/sec    8   38.2 KBytes
[ 19]   0.00-1.00   sec  6.08 MBytes  51.0 Mbits/sec   10    148 KBytes
[ 21]   0.00-1.00   sec  5.95 MBytes  49.9 Mbits/sec   16    174 KBytes
[ 23]   0.00-1.00   sec  17.8 MBytes   149 Mbits/sec    5    510 KBytes
[SUM]   0.00-1.00   sec   110 MBytes   925 Mbits/sec  219
- - - - - - - - - - - - - - - - - - - - - - - - -
...
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   9.00-10.00  sec  11.2 MBytes  94.4 Mbits/sec    0    273 KBytes
[  7]   9.00-10.00  sec  11.3 MBytes  94.9 Mbits/sec    0    276 KBytes
[  9]   9.00-10.00  sec  11.8 MBytes  98.9 Mbits/sec    0    286 KBytes
[ 11]   9.00-10.00  sec  10.4 MBytes  87.6 Mbits/sec    0    264 KBytes
[ 13]   9.00-10.00  sec  10.6 MBytes  88.6 Mbits/sec    0    256 KBytes
[ 15]   9.00-10.00  sec  13.9 MBytes   117 Mbits/sec    0    325 KBytes
[ 17]   9.00-10.00  sec  9.45 MBytes  79.2 Mbits/sec    0    215 KBytes
[ 19]   9.00-10.00  sec  10.6 MBytes  88.6 Mbits/sec    0    249 KBytes
[ 21]   9.00-10.00  sec  10.1 MBytes  85.0 Mbits/sec    0    235 KBytes
[ 23]   9.00-10.00  sec  10.0 MBytes  83.9 Mbits/sec    0    233 KBytes
[SUM]   9.00-10.00  sec   109 MBytes   918 Mbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   151 MBytes   127 Mbits/sec  113             sender
[  5]   0.00-10.01  sec   149 MBytes   125 Mbits/sec                  receiver
[  7]   0.00-10.00  sec   114 MBytes  95.3 Mbits/sec   85             sender
[  7]   0.00-10.01  sec   111 MBytes  93.2 Mbits/sec                  receiver
[  9]   0.00-10.00  sec   123 MBytes   103 Mbits/sec   97             sender
[  9]   0.00-10.01  sec   121 MBytes   101 Mbits/sec                  receiver
[ 11]   0.00-10.00  sec   122 MBytes   103 Mbits/sec   76             sender
[ 11]   0.00-10.01  sec   121 MBytes   101 Mbits/sec                  receiver
[ 13]   0.00-10.00  sec  89.7 MBytes  75.2 Mbits/sec   27             sender
[ 13]   0.00-10.01  sec  88.4 MBytes  74.1 Mbits/sec                  receiver
[ 15]   0.00-10.00  sec   132 MBytes   111 Mbits/sec   99             sender
[ 15]   0.00-10.01  sec   130 MBytes   109 Mbits/sec                  receiver
[ 17]   0.00-10.00  sec  63.0 MBytes  52.9 Mbits/sec   30             sender
[ 17]   0.00-10.01  sec  61.9 MBytes  51.8 Mbits/sec                  receiver
[ 19]   0.00-10.00  sec  89.0 MBytes  74.7 Mbits/sec   35             sender
[ 19]   0.00-10.01  sec  88.0 MBytes  73.8 Mbits/sec                  receiver
[ 21]   0.00-10.00  sec  93.6 MBytes  78.5 Mbits/sec   46             sender
[ 21]   0.00-10.01  sec  92.6 MBytes  77.6 Mbits/sec                  receiver
[ 23]   0.00-10.00  sec   118 MBytes  98.8 Mbits/sec  191             sender
[ 23]   0.00-10.01  sec   115 MBytes  96.0 Mbits/sec                  receiver
[SUM]   0.00-10.00  sec  1.07 GBytes   919 Mbits/sec  799             sender
[SUM]   0.00-10.01  sec  1.05 GBytes   903 Mbits/sec                  receiver

iperf Done.
$ iperf3 --client 203.0.113.2 --parallel 10 --reverse --omit 10 --time 20
Connecting to host 203.0.113.2, port 5201
Reverse mode, remote host 203.0.113.2 is sending
[  5] local 192.168.1.11 port 32818 connected to 203.0.113.2 port 5201
[  7] local 192.168.1.11 port 32830 connected to 203.0.113.2 port 5201
[  9] local 192.168.1.11 port 32832 connected to 203.0.113.2 port 5201
[ 11] local 192.168.1.11 port 32834 connected to 203.0.113.2 port 5201
[ 13] local 192.168.1.11 port 32850 connected to 203.0.113.2 port 5201
[ 15] local 192.168.1.11 port 32854 connected to 203.0.113.2 port 5201
[ 17] local 192.168.1.11 port 32856 connected to 203.0.113.2 port 5201
[ 19] local 192.168.1.11 port 32866 connected to 203.0.113.2 port 5201
[ 21] local 192.168.1.11 port 32880 connected to 203.0.113.2 port 5201
[ 23] local 192.168.1.11 port 32892 connected to 203.0.113.2 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  3.94 MBytes  33.0 Mbits/sec                  (omitted)
[  7]   0.00-1.00   sec  5.30 MBytes  44.5 Mbits/sec                  (omitted)
[  9]   0.00-1.00   sec  2.13 MBytes  17.8 Mbits/sec                  (omitted)
[ 11]   0.00-1.00   sec  4.10 MBytes  34.4 Mbits/sec                  (omitted)
[ 13]   0.00-1.00   sec  5.98 MBytes  50.2 Mbits/sec                  (omitted)
[ 15]   0.00-1.00   sec  4.69 MBytes  39.3 Mbits/sec                  (omitted)
[ 17]   0.00-1.00   sec  2.38 MBytes  20.0 Mbits/sec                  (omitted)
[ 19]   0.00-1.00   sec  4.76 MBytes  39.9 Mbits/sec                  (omitted)
[ 21]   0.00-1.00   sec  3.66 MBytes  30.7 Mbits/sec                  (omitted)
[ 23]   0.00-1.00   sec  3.54 MBytes  29.7 Mbits/sec                  (omitted)
[SUM]   0.00-1.00   sec  40.5 MBytes   340 Mbits/sec                  (omitted)
- - - - - - - - - - - - - - - - - - - - - - - - -
...
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]  19.00-20.00  sec  4.21 MBytes  35.3 Mbits/sec
[  7]  19.00-20.00  sec  3.53 MBytes  29.6 Mbits/sec
[  9]  19.00-20.00  sec  4.63 MBytes  38.8 Mbits/sec
[ 11]  19.00-20.00  sec  4.95 MBytes  41.5 Mbits/sec
[ 13]  19.00-20.00  sec  6.21 MBytes  52.1 Mbits/sec
[ 15]  19.00-20.00  sec  4.13 MBytes  34.7 Mbits/sec
[ 17]  19.00-20.00  sec  5.93 MBytes  49.7 Mbits/sec
[ 19]  19.00-20.00  sec  4.28 MBytes  35.9 Mbits/sec
[ 21]  19.00-20.00  sec  4.85 MBytes  40.7 Mbits/sec
[ 23]  19.00-20.00  sec  4.70 MBytes  39.4 Mbits/sec
[SUM]  19.00-20.00  sec  47.4 MBytes   398 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-20.01  sec  87.9 MBytes  36.8 Mbits/sec  172             sender
[  5]   0.00-20.00  sec  88.1 MBytes  37.0 Mbits/sec                  receiver
[  7]   0.00-20.01  sec  81.5 MBytes  34.2 Mbits/sec  168             sender
[  7]   0.00-20.00  sec  81.6 MBytes  34.2 Mbits/sec                  receiver
[  9]   0.00-20.01  sec  95.6 MBytes  40.1 Mbits/sec  146             sender
[  9]   0.00-20.00  sec  95.4 MBytes  40.0 Mbits/sec                  receiver
[ 11]   0.00-20.01  sec   114 MBytes  47.7 Mbits/sec  150             sender
[ 11]   0.00-20.00  sec   114 MBytes  47.8 Mbits/sec                  receiver
[ 13]   0.00-20.01  sec  97.3 MBytes  40.8 Mbits/sec  153             sender
[ 13]   0.00-20.00  sec  97.3 MBytes  40.8 Mbits/sec                  receiver
[ 15]   0.00-20.01  sec  91.3 MBytes  38.3 Mbits/sec  168             sender
[ 15]   0.00-20.00  sec  91.2 MBytes  38.2 Mbits/sec                  receiver
[ 17]   0.00-20.01  sec  92.3 MBytes  38.7 Mbits/sec  169             sender
[ 17]   0.00-20.00  sec  92.5 MBytes  38.8 Mbits/sec                  receiver
[ 19]   0.00-20.01  sec  92.4 MBytes  38.7 Mbits/sec  135             sender
[ 19]   0.00-20.00  sec  92.5 MBytes  38.8 Mbits/sec                  receiver
[ 21]   0.00-20.01  sec  95.1 MBytes  39.9 Mbits/sec  161             sender
[ 21]   0.00-20.00  sec  94.9 MBytes  39.8 Mbits/sec                  receiver
[ 23]   0.00-20.01  sec   108 MBytes  45.3 Mbits/sec  157             sender
[ 23]   0.00-20.00  sec   108 MBytes  45.2 Mbits/sec                  receiver
[SUM]   0.00-20.01  sec   955 MBytes   400 Mbits/sec  1579             sender
[SUM]   0.00-20.00  sec   955 MBytes   401 Mbits/sec                  receiver

iperf Done.

Without WireGuard, the tests show about 900 Mbits/sec uploading, and 400 Mbits/sec downloading. That’s about what you’d expect for upload throughput (800 Mbits/sec with WireGuard versus 900 Mbits/sec without), but surprisingly faster with WireGuard than without when downloading (485 Mbits/sec with WireGuard versus 400 Mbits/sec without). This is probably due to particularly bad TCP congestion on the download path.

For this particular example, no amount of WireGuard-specific tuning is likely to improve performance with 10 concurrent connections. It might be a different story with 100 or 1,000 concurrent connections, though, so it’s important to test with real-world load (but you will likely need at least a few additional client machines to test 100 concurrent connections, and a dozen or more to test 1,000 concurrent connections).

Tuning

If you’re getting significantly worse performance with WireGuard than you expect on your tests, here are several problem areas you can check and try to adjust:

CPU

Handling WireGuard traffic requires some extra CPU cycles in order to encrypt and decrypt packets being sent and received through the WireGuard tunnel. This overhead is usually minimal, but can have an impact on underpowered systems (like, say, a Raspberry Pi), or systems that handle many concurrent WireGuard connections.

While you’re running tests with WireGuard, check the CPU usage of each endpoint, as well as each WireGuard hub or gateway involved in the connection. Eyeballing the CPU usage with a utility like htop, or a similar tool that displays the CPU usage visually, is fine for this kind of check. If the CPU usage stays consistently above 90% during a test on one of the systems, it’s likely that the CPU on the system is limiting WireGuard throughput.
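
For example, while one of your tests is running, you could watch per-core CPU usage with mpstat (from the sysstat package), updating every 2 seconds:

$ mpstat -P ALL 2

If the %idle column for any core hovers near zero for the duration of the test, the CPU is the likely bottleneck.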

Unfortunately, about the only thing you can do to fix this issue is to replace the system with one that has a more powerful CPU.

TCP Congestion Control

The default TCP congestion-control algorithm used by most modern operating systems, CUBIC, is not a great fit for WireGuard (or any other protocol that encapsulates TCP in UDP). The BBR algorithm is a better fit, as it is less sensitive to packet loss and much more aggressive in its search for the optimal congestion window. If your performance tests show the TCP congestion window bouncing around (or narrowing significantly after initially opening quickly), TCP congestion control is likely the problem — and switching to the BBR algorithm may help.

Congestion control is managed on the sending side of the connection — the server side for downloads, the client side for uploads. It generally must be set globally for the host, so be aware that changing it for TCP connections sent through WireGuard will also change it for TCP connections not sent through WireGuard (possibly to their detriment).

On Linux, you can view the current TCP congestion-control algorithm by running the following command:

$ sysctl net.ipv4.tcp_congestion_control
net.ipv4.tcp_congestion_control = cubic

And you can change it to BBR (until the next reboot) with the following command:

$ sudo sysctl -w net.ipv4.tcp_congestion_control=bbr

To apply it automatically after a system reboot, add the following line to the /etc/sysctl.conf file (or any file in the /etc/sysctl.d/ directory, like /etc/sysctl.d/custom.conf):

net.ipv4.tcp_congestion_control=bbr
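
You can also apply the settings from that file immediately, without rebooting, by running sysctl with the -p flag:

$ sudo sysctl -p /etc/sysctl.d/custom.conf
net.ipv4.tcp_congestion_control = bbr
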
Caution

For Linux kernels prior to 4.20, you should also change the qdisc (queue discipline) for your physical network interfaces to use the fq qdisc (see the Can I use BBR with fq_codel? discussion). Usually the simplest way to do this is to add the following line to your /etc/sysctl.conf file, and then reboot:

net.core.default_qdisc=fq
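
After rebooting, you can confirm the change with the tc utility (assuming your physical interface is named eth0); the first line of output should show fq as the root qdisc:

$ tc qdisc show dev eth0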

On FreeBSD, BBR has been available since version 13.0, but you usually have to compile the kernel with some additional options. With a kernel that supports BBR, you can activate it by loading the tcp_bbr module and setting the net.inet.tcp.functions_default kernel parameter:

# kldload tcp_bbr
# sysctl net.inet.tcp.functions_default=bbr
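
To make those settings persist across reboots, following standard FreeBSD conventions, you’d load the module via /boot/loader.conf and set the parameter in /etc/sysctl.conf:

# echo 'tcp_bbr_load="YES"' >> /boot/loader.conf
# echo 'net.inet.tcp.functions_default=bbr' >> /etc/sysctl.conf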

See the How to enable TCP BBR in FreeBSD? thread for details.

On the latest versions of Windows, you can view the current TCP congestion-control algorithm by running the following command:

> Get-NetTCPSetting | Select SettingName, CongestionProvider
SettingName      CongestionProvider
-----------      ------------------
Automatic
InternetCustom   CUBIC
DatacenterCustom CUBIC
Compat           CUBIC
Datacenter       CUBIC
Internet         CUBIC

And you can change it to BBR with the following commands:

> netsh int tcp set supplemental Template=Internet CongestionProvider=bbr2
> netsh int tcp set supplemental Template=Datacenter CongestionProvider=bbr2
> netsh int tcp set supplemental Template=Compat CongestionProvider=bbr2
> netsh int tcp set supplemental Template=DatacenterCustom CongestionProvider=bbr2
> netsh int tcp set supplemental Template=InternetCustom CongestionProvider=bbr2

Packet Fragmentation

Many Ethernet connections have an MTU (Maximum Transmission Unit) of 1500, meaning that each Ethernet frame can carry up to 1500 bytes of content (not including the Ethernet headers themselves, but including the headers of the packets sent inside each frame). However, other MTU sizes are common, such as 9000 or more for network links inside a datacenter, or 1492 for many PPPoE (Point-to-Point Protocol over Ethernet) connections between ISPs (Internet Service Providers) and their subscribers.

Packets larger than an Ethernet connection’s MTU size cannot be transmitted. When too-large IPv4 packets are encountered, the network stack of the device can do one of two things:

  1. Drop the packet, and send back an ICMP “Fragmentation Needed” packet to the original packet’s source.

  2. Fragment the packet into smaller packets that fit the MTU size, and send the smaller packets along to the original packet’s destination.

(With IPv6, the packet is always dropped, and an ICMPv6 “Packet Too Big” packet is sent back.)

Both options will have a negative effect on performance, as they result in additional packets being sent and processed. Furthermore, when the first option is taken, the ICMP (or ICMPv6) packet may never make it back to the original source, as NAT (Network Address Translation) or a restrictive firewall in front of the source may block all ICMP and ICMPv6 packets. If those packets are blocked, the connection will appear to “hang” completely, as the traffic from the original source will never make it in any form through to its intended destination.

The fix for this is to manually set the MTU on the WireGuard connection low enough so that packets do not have to be fragmented. The MTU of each WireGuard interface should be set 60 bytes smaller than the MTU of the narrowest link in the connection when using IPv4 (and 80 bytes smaller when using IPv6).
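
For example, if you determine that the right MTU for your WireGuard interface is 1340 (a value we’ll derive from this connection’s actual MSS below), and you manage the interface with wg-quick, you can set it in the [Interface] section of the interface’s config file:

[Interface]
MTU = 1340

Or you can set it directly on a live interface (here assuming it’s named wg0) with the ip command:

$ sudo ip link set dev wg0 mtu 1340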

One way to figure out the right MTU size is by trial-and-error: start with an MTU of 1280 (the smallest MTU legal with IPv6), and run some performance tests with it. Keep increasing the MTU until you hit a performance cliff; the best setting is the largest MTU just below the point where performance drops off.
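
You can speed up this search by probing the MTU of the underlying path between the two endpoints with ping, using its “don’t fragment” flag. The ICMP and IPv4 headers together add 28 bytes, so a payload of 1372 tests a path MTU of 1400 (the address and sizes here are illustrative):

$ ping -c 1 -M do -s 1372 203.0.113.2

Keep increasing the payload size until the pings fail with a “message too long” or “fragmentation needed” error (or simply time out, if ICMP is blocked along the path); then add 28 to the largest payload that worked, and subtract 60 (or 80 for IPv6) from that total to get the MTU for your WireGuard interface.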

Negotiated MSS

Another way to figure out the right MTU for a WireGuard interface is to check the negotiated MSS (Maximum Segment Size) of a TCP connection made between the two endpoints outside of the WireGuard tunnel (this only works, however, if the routers where the MTU sizes change have implemented “MSS clamping”). MSS is the maximum payload size a TCP packet can carry through the connection (not including packet headers). MSS is effectively MTU minus the TCP/IP headers; and the combined TCP/IP headers are 40 bytes with IPv4, and 60 bytes with IPv6:

Figure 1. MTU and MSS with a regular IPv4 TCP packet

The MTU on a WireGuard interface should be 60 bytes smaller than the MTU of the Ethernet interface through which its tunneled packets travel (when using IPv4 to transport the tunneled packets; 80 bytes smaller when using IPv6). And the MSS of the tunneled TCP packets should be a further 40 bytes smaller (when the packets themselves are using IPv4; 60 bytes when using IPv6):

Figure 2. MTU and MSS with an IPv4 TCP packet tunneled through WireGuard via an IPv4 UDP packet

To check the negotiated MSS, run tcpdump on each endpoint to capture the SYN and SYN-ACK packets of a TCP handshake. On the recipient endpoint, the SYN packet will show the MSS as adjusted by any routers on the way to the recipient (whereas on the initiating endpoint, the SYN packet will show the MSS as originally requested by the initiator). On the initiating endpoint, the returned SYN-ACK packet will show the MSS as adjusted by any routers on the way back to the initiator (whereas on the recipient endpoint, the SYN-ACK will show the MSS as originally requested by the recipient).

If the connection has been routed through a different path on the way to the recipient than it was on the way back to the initiator, you may see a different value for the MSS in the SYN packet on the recipient than you see in the SYN-ACK packet on the initiator; but usually they will be the same.

For example, if you run tcpdump with an expression like tcp[tcpflags] == tcp-syn or tcp[tcpflags] == tcp-syn|tcp-ack on both endpoints, and then open an SSH connection from one to the other outside of WireGuard, you’ll see output from tcpdump like this on the initiating side of the connection:

$ sudo tcpdump -ni eth0 'tcp[tcpflags] == tcp-syn or tcp[tcpflags] == tcp-syn|tcp-ack'
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
17:16:55.656028 IP 192.168.1.11.19546 > 203.0.113.2.22: Flags [S], seq 3349242392, win 64240, options [mss 1460,sackOK,TS val 2936811543 ecr 0,nop,wscale 7], length 0
17:16:55.656089 IP 203.0.113.2.22 > 192.168.1.11.19546: Flags [S.], seq 735473634, ack 3349242393, win 62643, options [mss 1360,sackOK,TS val 3872048576 ecr 2936811543,nop,wscale 6], length 0

And output like this on the receiving side of the connection:

$ sudo tcpdump -ni eth0 'tcp[tcpflags] == tcp-syn or tcp[tcpflags] == tcp-syn|tcp-ack'
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
17:16:55.656028 IP 198.51.100.11.19546 > 192.168.200.22.22: Flags [S], seq 3349242392, win 64240, options [mss 1360,sackOK,TS val 2936811543 ecr 0,nop,wscale 7], length 0
17:16:55.656089 IP 192.168.200.22.22 > 198.51.100.11.19546: Flags [S.], seq 735473634, ack 3349242393, win 62643, options [mss 8960,sackOK,TS val 3872048576 ecr 2936811543,nop,wscale 6], length 0

This shows that the initiating endpoint originally sent a SYN packet with an MSS of 1460 (indicating an MTU of 1500 on its Ethernet interface), but by the time it got to the receiving endpoint, some intermediate router along the way had lowered the MSS to 1360. Similarly, the recipient endpoint had originally sent its SYN-ACK packet back with an MSS of 8960 (indicating an MTU of 9000 on its Ethernet interface), but it had been lowered to 1360 by the time it made it back to the initiator.

Since this connection uses IPv4, we’d add 40 bytes back to the negotiated MSS of 1360 to come up with an MTU size of 1400 for the narrowest link in the connection. Then we’d subtract 60 bytes from that link MTU size to come up with 1340 for the MTU size we need to set for a WireGuard interface using this connection (100 bytes smaller than we’d usually expect).

Packet Too Big

With IPv6, you can also use tcpdump to check for “Packet Too Big” ICMPv6 messages, which will include in the message the MTU size to which packets must be lowered. However, this will only work if the source endpoint is not behind NAT66 or a restrictive firewall that blocks ICMPv6 messages. Also, you must make sure you send a moderately large message from the initiator side (like a file upload) to get a “Packet Too Big” message back to the initiator, or a moderately large message from the receiver (like a file download) to get a “Packet Too Big” message back to the receiver. (Plus you may need to repeat the process one or more times to actually find the smallest chokepoint.)

For example, if you run tcpdump with an expression like icmp6[icmp6type] == icmp6-packettoobig on one endpoint, and then try to send a packet that’s too big from it over IPv6 to the other endpoint, you should see output like the following from tcpdump:

$ sudo tcpdump -ni eth0 'icmp6[icmp6type] == icmp6-packettoobig'
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
18:13:47.908474 IP6 2001:db8:100::1 > 2001:db8:200::2: ICMP6, packet too big, mtu 1400, length 1240

Subtract 80 bytes (40 for the outer IPv6 header, 8 for UDP, plus 32 for WireGuard) from the MTU listed in the ICMPv6 message (1400 in the above example), and set that as the MTU for your WireGuard interface.
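For example, assuming your WireGuard interface is named wg0 and you saw mtu 1400 in the ICMPv6 message as above, you could apply the adjusted MTU immediately with iproute2:

$ sudo ip link set dev wg0 mtu 1320

(An MTU set this way lasts only until the interface is next recreated; to make the change permanent, also add it to the interface’s configuration.)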

MSS Clamping

With a site-to-site WireGuard connection, once you determine the correct MTU to use for the WireGuard interfaces on each side of the connection, make sure you also add an MSS clamping rule to the firewall on one or both sides. Such a rule directs the firewall to rewrite the requested MSS in TCP SYN packets whenever the packet’s MSS value is larger than the firewall’s own MSS calculation (which the firewall derives from the MTU of its own network interfaces on the packet’s path, and the header sizes of the TCP/IP version in use).

This will ensure that TCP connections between two endpoints which themselves don’t know about the reduced MTU on the WireGuard connection between them will have their MSS automatically adjusted, without requiring additional ICMP messages, or risking packet fragmentation.

The following iptables rule will clamp the MSS of all TCP connections forwarded through the firewall:

iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu

With nftables, you’d add a similar rule somewhere in the forward hook of a filter chain:

table inet filter {
    ...
    chain forward {
        type filter hook forward priority 0; policy drop;
        ...
        tcp flags syn / syn,rst tcp option maxseg size set rt mtu
        ...
    }
    ...
}
Tip

See the WireGuard with Nftables guide for an example that performs MSS clamping in a slightly more targeted way, clamping just the packets going out the WireGuard tunnel at each site — thereby avoiding performing the clamping work if not necessary.
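For instance, a rule along the following lines (assuming the WireGuard interface is named wg0), placed in the same forward chain as above, would restrict the clamping to TCP connections routed out through the tunnel:

oifname "wg0" tcp flags syn / syn,rst tcp option maxseg size set rt mtu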

Packet Buffering

The default network settings for most modern operating systems are already well-tuned for handling a few dozen concurrent network connections. However, if you’re running a WireGuard hub, gateway, or service endpoint that handles hundreds of concurrent WireGuard connections, there are a few settings on it you can adjust that may improve performance.

On Linux specifically, there are three sets of kernel parameters related to packet buffering which can be tuned to improve WireGuard connection throughput. Run your load tests before and after each change to see whether it had any effect. Trial and error is the only way to determine the optimal value for each parameter; a good rule of thumb is to start by doubling the parameter’s current value.

You can view the current value of a kernel parameter with the sysctl command:

$ sysctl net.core.netdev_budget
net.core.netdev_budget = 300

And you can change it (until the next reboot) with the -w flag:

$ sudo sysctl -w net.core.netdev_budget=600

To apply your custom settings after a system reboot, add them to the /etc/sysctl.conf file (or any file in the /etc/sysctl.d/ directory, like /etc/sysctl.d/custom.conf).
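For example, a drop-in file for the setting above might look like this (the file name is arbitrary, and the doubled value is just a starting point for your own testing):

# /etc/sysctl.d/custom.conf
net.core.netdev_budget = 600

You can load it immediately, without rebooting, by running sudo sysctl --system.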

Packet Processing Rate

On a server with a beefy CPU, you may be able to improve throughput by increasing the number of packets processed per network-device driver poll, allowing the server to clear buffered packets more efficiently. Try adjusting these kernel parameters:

net.core.netdev_budget

Maximum number of packets that can be processed during one polling cycle. The cycle ends when either this limit or the net.core.netdev_budget_usecs time limit is reached.

net.core.netdev_budget_usecs

Maximum number of microseconds that can elapse during one polling cycle.

net.core.dev_weight

Maximum number of packets (per CPU core) from the backlog queue that can be processed per polling cycle.

net.core.netdev_tstamp_prequeue

Set this to 0 to allow processing to be more evenly distributed across CPU cores.
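One rough way to tell whether these limits are actually being exhausted is to watch the time_squeeze counter, the third (hexadecimal) column of /proc/net/softnet_stat, which increments each time a polling cycle ends with packets still waiting. The following one-liner (which assumes GNU awk, for its strtonum function) totals the counter across all CPUs:

$ awk '{ squeezed += strtonum("0x" $3) } END { print "time_squeeze total:", squeezed }' /proc/net/softnet_stat

If the total climbs steadily under load, increasing net.core.netdev_budget or net.core.netdev_budget_usecs may help.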

UDP Send/Receive Buffers

If your server’s CPU is powerful enough to clear lots of packets quickly, increasing the raw number of UDP packets it can buffer may also allow it to handle higher WireGuard throughput. Try adjusting these kernel parameters:

net.core.rmem_default

Default size of a socket receive buffer in bytes. When you increase this value, make sure you also increase the size of net.core.rmem_max to be at least as big as this. WireGuard packets will be added to a buffer of this size when they are received — if the buffer (and backlog queue) fills up, incoming packets will be dropped until space can be cleared by processing the packets already in the buffer.

net.core.rmem_max

Maximum size of a socket receive buffer in bytes. Usually there’s no reason to set this larger than net.core.rmem_default.

net.core.wmem_default

Default size of a socket send buffer in bytes. When you increase this value, make sure you also increase the size of net.core.wmem_max to be at least as big as this. WireGuard packets will be added to a buffer of this size when they are sent — if the buffer fills up, outgoing packets will be dropped until space can be cleared by sending the packets already in the buffer. (This is usually not as much of an issue as the receive buffer.)

net.core.wmem_max

Maximum size of a socket send buffer in bytes. Usually there’s no reason to set this larger than net.core.wmem_default.

net.core.netdev_max_backlog

Maximum number of packets (per CPU core) that can be buffered in the backlog queue. This is used to buffer packets temporarily until there’s room to add them to the appropriate socket’s receive buffer.
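To gauge whether the receive buffers are actually overflowing, you can check the kernel’s cumulative UDP error counters before and after a load test (shown here with the nstat utility from iproute2; the same counters also appear in /proc/net/snmp):

$ nstat -az UdpRcvbufErrors UdpSndbufErrors

A steadily climbing UdpRcvbufErrors count under load suggests that the receive buffers (or the rate at which packets are cleared from them) are the bottleneck.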

Note that there are diminishing returns to increasing these values, as eventually you’ll get to the point where you’re buffering more packets than the server can process. When that happens, your measured performance will rapidly get worse than before, as the server starts dropping packets again — but this time with a greater delay that makes it harder for clients to scale back their own send rate — creating the conditions for bufferbloat.

TCP Send/Receive Buffers

If you’re trying to improve the performance of an application that uses TCP (such as a web app), and that application is hosted on a WireGuard endpoint, you may also want to try tuning the kernel parameters related to TCP buffering on that endpoint. In particular, try adjusting these parameters:

net.ipv4.tcp_rmem

TCP socket receive buffer sizes in bytes: min, default, and max. The kernel can automatically adjust what buffer size is actually allocated for a socket between the min and the max.

net.ipv4.tcp_wmem

TCP socket send buffer sizes in bytes: min, default, and max. The kernel can automatically adjust what buffer size is actually allocated for a socket between the min and the max.

net.ipv4.tcp_adv_win_scale

TCP window scaling factor. See Cloudflare’s Optimizing TCP for high WAN throughput while preserving low latency article for an excellent explanation of how this works together with net.ipv4.tcp_rmem and net.ipv4.tcp_wmem.

net.core.somaxconn

Maximum number of connections that can be queued for acceptance by a listening TCP socket (the accept queue), rather than a count of packets. See Cloudflare’s SYN packet handling in the wild article for a good discussion about tuning this.
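For example, to double the maximum auto-tuned TCP buffer sizes while leaving the min and default values alone, you could run something like the following (the starting values shown are common distro defaults; check your own with sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem before changing them):

$ sudo sysctl -w net.ipv4.tcp_rmem="4096 131072 12582912"
$ sudo sysctl -w net.ipv4.tcp_wmem="4096 16384 8388608"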

Connection Tracking

Connection tracking (aka “conntrack”) allows a firewall to identify and group all the individual packets being sent back and forth between two endpoints as part of an ongoing, bidirectional connection between the two endpoints. Using the bigger picture of an ongoing connection allows the firewall to “remember” packets going one way through the connection, so it can apply special rules to the packets coming back the other way. Firewalls with this capability are called stateful firewalls.

If you are running your WireGuard connections through a stateful firewall, the firewall’s conntrack system will add some extra performance overhead. The impact of this will be minuscule, however, unless you’re running hundreds of concurrent connections through it.

Connection tracking is usually used for things like NAT, connection masquerading, port forwarding, or egress-only connections. If you’re not using any of those features with WireGuard, you may find you can improve throughput by disabling conntrack for your WireGuard connections.

For example, the following iptables rules will disable connection tracking for traffic sent or received inside a WireGuard tunnel (where the WireGuard interface is named wg0):

iptables -t raw -A PREROUTING -i wg0 -j NOTRACK
iptables -t raw -A OUTPUT -o wg0 -j NOTRACK

And the following iptables rules will do the same for the WireGuard tunnel itself (for a WireGuard interface listening on port 51820):

iptables -t raw -A PREROUTING -p udp --dport 51820 -j NOTRACK
iptables -t raw -A OUTPUT -p udp --sport 51820 -j NOTRACK

The equivalent rules for nftables would look something like this:

table inet raw {
    ...
    chain prerouting {
        type filter hook prerouting priority -300; policy accept;
        ...
        iifname "wg0" notrack
        udp dport 51820 notrack
        ...
    }
    ...
    chain output {
        type filter hook output priority -300; policy accept;
        ...
        oifname "wg0" notrack
        udp sport 51820 notrack
        ...
    }
    ...
}

On the other hand, if you do use connection tracking on a Linux system that processes hundreds or thousands of concurrent connections, you may want to adjust the following kernel parameters to make sure the system’s conntrack hashtable doesn’t completely fill up:

net.netfilter.nf_conntrack_buckets

Size of the connection-tracking hashtable.

net.netfilter.nf_conntrack_max

Maximum number of entries allowed in the connection-tracking hashtable.
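You can check how full the hashtable currently is by comparing the (read-only) net.netfilter.nf_conntrack_count parameter to net.netfilter.nf_conntrack_max (the values below are just illustrative):

$ sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max
net.netfilter.nf_conntrack_count = 6
net.netfilter.nf_conntrack_max = 262144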

You can use the conntrack command to check connection-tracking stats (it’s available in most Linux distros via the conntrack or conntrack-tools package). The conntrack -L command will list the current contents of the conntrack hashtable, and the conntrack -S command will show the system’s cumulative conntrack stats:

$ sudo conntrack -L
udp      17 19 src=192.168.200.22 dst=185.125.190.58 sport=60354 dport=123 src=185.125.190.58 dst=192.168.200.22 sport=123 dport=60354 mark=0 use=1
udp      17 21 src=192.168.200.22 dst=69.89.207.99 sport=46452 dport=123 src=69.89.207.99 dst=192.168.200.22 sport=123 dport=46452 mark=0 use=1
tcp      6 299 ESTABLISHED src=192.168.200.22 dst=5.180.19.95 sport=22 dport=28499 src=5.180.19.95 dst=192.168.200.22 sport=28499 dport=22 [ASSURED] mark=0 use=1
udp      17 25 src=192.168.200.22 dst=169.254.169.123 sport=56722 dport=123 src=169.254.169.123 dst=192.168.200.22 sport=123 dport=56722 mark=0 use=1
udp      17 29 src=127.0.0.1 dst=127.0.0.53 sport=51869 dport=53 src=127.0.0.53 dst=127.0.0.1 sport=53 dport=51869 mark=0 use=1
udp      17 25 src=192.168.200.22 dst=68.233.45.146 sport=59584 dport=123 src=68.233.45.146 dst=192.168.200.22 sport=123 dport=59584 mark=0 use=1
conntrack v1.4.6 (conntrack-tools): 6 flow entries have been shown.
$ sudo conntrack -S
cpu=0           found=3 invalid=0 insert=0 insert_failed=0 drop=0 early_drop=0 error=0 search_restart=5 (null)=0 (null)=0
cpu=1           found=2 invalid=1 insert=0 insert_failed=0 drop=0 early_drop=0 error=0 search_restart=3 (null)=0 (null)=0

Also, if you use connection tracking for the TCP connections transiting your WireGuard network, you may find that the conntrack system will occasionally flag established TCP connections as “invalid”, due to missing or out-of-order TCP packets. If your firewall drops invalid conntrack flows, these connections will terminate abruptly. Changing the value of the following two kernel parameters to 1 will prevent this from happening:

net.netfilter.nf_conntrack_tcp_be_liberal

Set this to 1 to avoid marking TCP state violations as invalid (except for out-of-window RST packets).

net.netfilter.nf_conntrack_tcp_ignore_invalid_rst

Set this to 1 to avoid marking out-of-window RST packets as invalid.
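For example, to make both settings persistent across reboots, you might add a drop-in file like this (the file name is arbitrary; note that the second parameter is available only on relatively recent kernels):

# /etc/sysctl.d/99-conntrack.conf
net.netfilter.nf_conntrack_tcp_be_liberal = 1
net.netfilter.nf_conntrack_tcp_ignore_invalid_rst = 1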