How to Use WireGuard With Nftables

Nftables is an increasingly popular firewall tool for Linux, replacing the venerable iptables. Most Linux distributions now offer first-class support for nftables; and while it’s technically possible to use iptables and nftables together on the same host, doing so usually causes problems — so you should pick just one or the other to run on a given host.

For hosts where you’re using nftables, follow this guide to set up a simple firewall that matches the WireGuard topology to which the host belongs:

Nftables Basics

Nftables is more powerful and flexible than iptables, with a correspondingly more complicated syntax. While it’s still possible to jam rules onto nftables chains with PreUp statements in your WireGuard config, it’s probably best to just put them all in a master nftables config file (or in a file included by your master nftables config file). Most distros use either /etc/nftables.conf or /etc/sysconfig/nftables.conf for this master config file.
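For example, if your master nftables config file is /etc/nftables.conf, you could keep your WireGuard-related rules in their own file and pull it into the master file with an include statement like the following (the /etc/nftables.d/wireguard.nft path here is just an illustration — use whatever location suits your distro):

include "/etc/nftables.d/wireguard.nft"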

Installing Nftables

Debian

The master nftables config file for Debian-based distros (including Ubuntu) is located at /etc/nftables.conf. You can install the nftables package with the following command:

sudo apt install nftables

Fedora

The master nftables config file for Fedora-based distros (including RHEL, CentOS, Amazon Linux, Oracle Linux, etc) is located at /etc/sysconfig/nftables.conf. You can install the nftables package with the following command:

sudo dnf install nftables

Arch

The master nftables config file for Arch Linux is located at /etc/nftables.conf. You can install the nftables package with the following command:

sudo pacman -S nftables

Alpine

The master nftables config file for Alpine Linux is located at /etc/nftables.nft. You can install the nftables package with the following command:

sudo apk add nftables

Running Nftables

The nftables package in most distros includes a systemd service that will automatically start nftables on boot if you run the following command:

sudo systemctl enable nftables

(For OpenRC distros like Alpine, instead run rc-update add nftables default.)

You can restart (or start if not already running) this systemd service to reload your nftables configuration with the following command:

sudo systemctl restart nftables

(For OpenRC distros like Alpine, instead run rc-service nftables restart.)
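Once the service has started, you can verify which rules are actually loaded into the kernel by running the following command:

sudo nft list ruleset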

If nftables fails to start, you can see its error messages by running the following command:

journalctl -u nftables
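You can also check a modified configuration file for syntax errors without applying it, by running nft with its --check flag:

sudo nft -c -f /etc/nftables.conf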

Base Configuration

Updated 2023-05-13

The original version of this article recommended a slightly more complicated base configuration, with several additional chains to drop known bad traffic earlier in the packet flow.

This updated version simplifies and consolidates all packet filtering into the main filter table.

Following is the base nftables configuration that we’ll use for all the examples in this article:

#!/usr/sbin/nft -f
flush ruleset

define pub_iface = "eth0"
define wg_port = 51820

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;

        # accept all loopback packets
        iif "lo" accept
        # accept all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # accept all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }
        # drop new connections over rate limit
        ct state new limit rate over 1/second burst 10 packets drop

        # accept all DHCPv6 packets received at a link-local address
        ip6 daddr fe80::/64 udp dport dhcpv6-client accept
        # accept all SSH packets received on a public interface
        iifname $pub_iface tcp dport ssh accept
        # accept all WireGuard packets received on a public interface
        iifname $pub_iface udp dport $wg_port accept

        # reject with polite "port unreachable" icmp response
        reject
    }

    chain forward {
        type filter hook forward priority 0; policy drop;
        reject with icmpx type host-unreachable
    }
}

Save it to /etc/nftables.conf (or to whichever file your Linux distro uses as its master nftables config file, as described above). Change its wg_port definition to the ListenPort of the host’s WireGuard interface (for “Endpoint A” in the examples in this article, that would be 51821). Change its pub_iface definition to the name of the host’s physical network interface.

If your host uses multiple physical interfaces (like a WAN interface at eth0 and a LAN interface at eth1; or a wired interface at ens3 and a wireless interface at wls2; etc), include each interface in the pub_iface definition using set notation:

define pub_iface = { "eth0", "eth1" }
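If you’re not sure what values to use for these definitions, you can look up the host’s WireGuard listen port and the names of its network interfaces with commands like the following (assuming the WireGuard interface is named wg0):

sudo wg show wg0 listen-port
ip -br link show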

The filter table is where the main packet-filtering firewall rules live. The input chain hooks the Netfilter input path (packets sent to the host), and the forward chain hooks the forward path (packets forwarded through the host). The base configuration rejects all packets on the forward path. For the input path, it allows:

  1. Loopback packets (used by services on the host communicating to itself through a network socket)

  2. ICMP and ICMPv6 packets (used for network diagnostics)

  3. Packets part of an already-established connection (usually from connections established by initiating an outbound request, like a connection to an external HTTP server)

  4. DHCPv6 packets sent to this host’s link-local address (on UDP port 546)

  5. SSH packets sent to eth0 (on TCP port 22)

  6. WireGuard packets sent to eth0 (on UDP port 51820)

And it rejects everything else (sending a “Destination port unreachable” ICMP or ICMPv6 packet in response).

It also hard drops (with no ICMP/ICMPv6 response) new TCP/UDP connections if they start coming in too fast:

        ct state new limit rate over 1/second burst 10 packets drop

This rule allows a burst of 10 new connections, then starts blocking new connections after that, replenishing the burst pool by one every second (using the standard “token bucket” methodology — see the Netfilter Limits documentation for details). Depending on traffic to the host, you may want to adjust this rate limit.
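For example, on a host that expects a heavier volume of legitimate new connections, you might loosen the rule to something like the following (the specific rate and burst values here are just an illustration):

        ct state new limit rate over 10/second burst 50 packets drop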

If you don’t use DHCPv6 for the host, you can omit the DHCPv6 rule. If you don’t need to remotely administer the host, you can omit the SSH rule; or if you know that you’re going to SSH into the host only from a specific set of addresses, you can adjust it to allow only those source addresses. For example, you could limit SSH connections to allow them only from 198.51.100.1 and the 203.0.113.0/24 block:

        iifname $pub_iface tcp dport ssh ip saddr { 198.51.100.1, 203.0.113.0/24 } accept

Similarly, if you know you are going to connect via WireGuard to the host only from a specific set of addresses, you can adjust the WireGuard rule to limit it to allowing only those source addresses. For example, you could limit WireGuard connections to allow them only from 203.0.113.2:

        iifname $pub_iface udp dport $wg_port ip saddr 203.0.113.2 accept
Tip

For nftables rules referencing the lo (loopback) interface, use iif (input interface by index) or oif (output interface by index). For other interfaces, use iifname (input interface by name) and oifname (output interface by name).

Rules built with the iif and oif expressions store references to the interface’s index at the time the rule was loaded. This allows for slightly faster execution of the rule — but if the interface is brought down and then back up again, the index will change, and the rule will become a no-op.

Also, on system boot, the nftables service provided by most distros will start before any network interface except lo is available (usually as part of systemd’s network-pre.target). If you have any iif or oif rules in your master nftables config file that reference an interface other than lo, the nftables service will fail to start — leaving the host exposed with no firewall.

Point to Point

With the simple two-host, point-to-point WireGuard VPN (Virtual Private Network) described in the WireGuard Point to Point Configuration guide, we can set up an nftables firewall on both points (replacing the iptables firewall described in the “extra” sections of the guide) pretty much by using the Base Configuration.

Here’s a network diagram of the scenario:

Point to Point VPN
Figure 1. Point to point scenario

In this scenario, Endpoint A makes HTTP requests through WireGuard to Endpoint B. Endpoint B doesn’t ever need to initiate connections to Endpoint A (it just responds to requests from Endpoint A).

Endpoint A

On Endpoint A, we can use our nftables Base Configuration practically verbatim. The only thing we need to tweak is the wg_port definition. Change it to 51821 to match the port used to Configure WireGuard on Endpoint A:

define wg_port = 51821

Endpoint B

On Endpoint B, we can similarly use our nftables Base Configuration almost verbatim. We also need to change the wg_port definition, this time to 51822, to match the port used to Configure WireGuard on Endpoint B:

define wg_port = 51822

But we also need to add an additional rule to the input chain of the filter table, to allow access to Endpoint B’s HTTP server. If we want to allow any other host in Site B to access this server, as well as Endpoint A, we can do it by adding the following rule to the bottom of the input chain (just before the reject statement):

        tcp dport http accept

If we want to limit access to the HTTP server so that only Endpoint A (or any other host connected via the same WireGuard interface) can access it, we can do it instead with this rule:

        iifname "wg0" tcp dport http accept

If we add a wg_iface definition for the wg0 interface, the full /etc/nftables.conf file for Endpoint B will look like the following:

#!/usr/sbin/nft -f
flush ruleset

define pub_iface = "eth0"
define wg_iface = "wg0"
define wg_port = 51822

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;

        # accept all loopback packets
        iif "lo" accept
        # accept all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # accept all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }
        # drop new connections over rate limit
        ct state new limit rate over 1/second burst 10 packets drop

        # accept all DHCPv6 packets received at a link-local address
        ip6 daddr fe80::/64 udp dport dhcpv6-client accept
        # accept all SSH packets received on a public interface
        iifname $pub_iface tcp dport ssh accept
        # accept all WireGuard packets received on a public interface
        iifname $pub_iface udp dport $wg_port accept

        # accept all HTTP packets received on a WireGuard interface
        iifname $wg_iface tcp dport http accept

        # reject with polite "port unreachable" icmp response
        reject
    }

    chain forward {
        type filter hook forward priority 0; policy drop;
        reject with icmpx type host-unreachable
    }
}

Test It Out

You can test this out by trying to access Endpoint B’s HTTP server from Endpoint A:

$ curl 10.0.0.2
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
...

You should see the HTML of Endpoint B’s homepage printed. If you start up another web server running on Endpoint B on some other port, like say TCP port 8080 (run python3 -m http.server 8080 for a temporary server serving the contents of the current directory), you won’t be able to access it from Endpoint A (or anywhere else):

$ curl 10.0.0.2:8080
curl: (7) Failed to connect to 10.0.0.2 port 8080: Connection refused

Hub and Spoke

With a hub-and-spoke topology, an nftables firewall for the spokes will look much like the points in a point-to-point topology. The hub, too, will be pretty similar to the Base Configuration, but will need at least one forwarding rule to allow traffic to be forwarded from spoke to spoke.

For an example, we’ll use the hub-and-spoke WireGuard VPN described in the WireGuard Hub and Spoke Configuration guide. Here’s a network diagram of the scenario:

Hub and Spoke VPN
Figure 2. Hub and spoke scenario

In this scenario, Endpoint A makes HTTP requests through Host C to Endpoint B. Endpoint B doesn’t ever need to initiate connections to Endpoint A (it just responds to requests made by Endpoint A).

Endpoint A

On Endpoint A, we can use our nftables Base Configuration practically verbatim. The only thing we need to tweak is the wg_port definition, to match the port used to Configure WireGuard on Endpoint A:

define wg_port = 51821

Endpoint B

On Endpoint B, we can similarly use our nftables Base Configuration almost verbatim. We also need to change the wg_port definition, this time to 51822, to match the port used to Configure WireGuard on Endpoint B:

define wg_port = 51822

But we also need to add an additional rule to the input chain of the filter table, to allow access to Endpoint B’s HTTP server. To allow any host in the WireGuard network to access it, add the following rule to the bottom of the input chain (just before the reject statement):

        iifname "wg0" tcp dport http accept

If we wanted to allow any host in the WireGuard network unrestricted access to any service on Endpoint B (like SSH, databases, and other network applications), we could use the following rule instead:

        iifname "wg0" accept

Host C

On Host C, we need a few more changes from our nftables Base Configuration. First, we need to change the wg_port definition, this time to 51823, to match the port used to Configure WireGuard on Host C:

define wg_port = 51823

And we also need to add a few additional rules to the forward chain of the filter table, to allow packets to be forwarded between Endpoint A and Endpoint B.

First let’s add a definition for our WireGuard interface name:

define wg_iface = "wg0"

Then, if we want to allow Host C to forward any traffic from any host in our WireGuard network to any other host, we could just add the following rule to the forward chain of its filter table (right before the reject with icmpx type host-unreachable rule):

        iifname $wg_iface oifname $wg_iface accept

But if we want to use Host C to enforce some access control rules for our WireGuard network, we might instead want to build a separate filter chain just for it, and have the rule in the forward chain instead jump to it for additional filtering (goto is like jump, but doesn’t jump back — we won’t need to return to the forward chain in this case):

        iifname $wg_iface oifname $wg_iface goto wg-forward

The minimal version of this wg-forward chain (which we’d put in our filter table), would look like this:

    chain wg-forward {
        # forward all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # forward all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }

        # forward all HTTP packets for Endpoint B
        ip daddr 10.0.0.2 tcp dport http accept

        # reject with polite "administratively prohibited" icmp response
        reject with icmpx type admin-prohibited
    }

At the beginning of the chain, it would forward all ICMP and ICMPv6 traffic from any host in our WireGuard network to any other host, as well as the traffic for any already-established connections. And at the end of the chain, it would reject all traffic not explicitly accepted (ie most types of new connections) by dropping it and replying with a “Destination unreachable (Communication administratively prohibited)” ICMP or ICMPv6 packet.

In the middle, we’d put our WireGuard VPN access rules. In the minimal version above, we allow connections to Endpoint B’s HTTP server (10.0.0.2 TCP port 80) from any host.

A more restrictive version of that rule might allow only Endpoint A (10.0.0.1) to access Endpoint B’s HTTP server:

        ip saddr 10.0.0.1 ip daddr 10.0.0.2 tcp dport http accept

If there were several services on Endpoint B that we wanted to allow Endpoint A to access, we could adjust this rule to allow those additional services, like the following (allowing access to TCP ports 22, 80, 8080, and 8081):

        ip saddr 10.0.0.1 ip daddr 10.0.0.2 tcp dport { ssh, http, 8080, 8081 } accept

Or if we added a couple more spokes to our network (say 10.0.0.4 and 10.0.0.5), we could adjust the rule to allow those spokes to access Endpoint B’s HTTP server, as well:

        ip saddr { 10.0.0.1, 10.0.0.4, 10.0.0.5 } ip daddr 10.0.0.2 tcp dport http accept

In the above cases, we could have instead added multiple separate rules for the separate spokes or services; but with nftables, it’s slightly more efficient to use sets of IP addresses or ports within a single rule (although you’d only notice the difference if you had a lot of traffic or a lot of rules).
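If you expect the list of allowed spokes to change over time, another option (not used in the rest of this article) is to define a named set in the filter table, and reference it from the rule with the @ prefix — that way you can update the set’s elements without touching the rule itself. The wg_http_clients name below is just an example:

    set wg_http_clients {
        type ipv4_addr
        elements = { 10.0.0.1, 10.0.0.4, 10.0.0.5 }
    }

And then in the wg-forward chain:

        ip saddr @wg_http_clients ip daddr 10.0.0.2 tcp dport http accept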

In cases where we have a number of similar rules, but with different source IP and destination IP or port combinations, we can also use an nftables set with concatenation to jam them all into a single rule. For example, if we had a group of similar rules like the following:

        ip saddr 10.0.0.1 ip daddr 10.0.0.2 tcp dport http accept
        ip saddr 10.0.0.4 ip daddr 10.0.0.2 tcp dport http accept
        ip saddr 10.0.0.4 ip daddr 10.0.0.5 tcp dport http accept
        ip saddr 10.0.0.1 ip daddr 10.0.0.5 tcp dport smtp accept
        ip saddr 10.0.0.4 ip daddr 10.0.0.5 tcp dport smtp accept

We could consolidate them all into a single rule like this (where . is the concatenation operator for nftables):

        ip saddr . ip daddr . tcp dport {
            10.0.0.1 . 10.0.0.2 . http,
            10.0.0.4 . 10.0.0.2 . http,
            10.0.0.4 . 10.0.0.5 . http,
            10.0.0.1 . 10.0.0.5 . smtp,
            10.0.0.4 . 10.0.0.5 . smtp
        } accept
Note

When using both IPv4 and IPv6 addresses, you do need a separate ip6 rule (with similar saddr and daddr arguments) for the IPv6 addresses — you can’t mix IPv4 and IPv6 addresses in the same expression.

The full /etc/nftables.conf file for Host C with the minimal wg-forward chain from above will look like the following:

#!/usr/sbin/nft -f
flush ruleset

define pub_iface = "eth0"
define wg_iface = "wg0"
define wg_port = 51823

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;

        # accept all loopback packets
        iif "lo" accept
        # accept all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # accept all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }
        # drop new connections over rate limit
        ct state new limit rate over 1/second burst 10 packets drop

        # accept all DHCPv6 packets received at a link-local address
        ip6 daddr fe80::/64 udp dport dhcpv6-client accept
        # accept all SSH packets received on a public interface
        iifname $pub_iface tcp dport ssh accept
        # accept all WireGuard packets received on a public interface
        iifname $pub_iface udp dport $wg_port accept

        # reject with polite "port unreachable" icmp response
        reject
    }

    chain wg-forward {
        # forward all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # forward all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }

        # forward all HTTP packets for Endpoint B
        ip daddr 10.0.0.2 tcp dport http accept

        # reject with polite "administratively prohibited" icmp response
        reject with icmpx type admin-prohibited
    }

    chain forward {
        type filter hook forward priority 0; policy drop;

        # filter all packets transiting WireGuard network via wg-forward chain
        iifname $wg_iface oifname $wg_iface goto wg-forward

        # reject with polite "host unreachable" icmp response
        reject with icmpx type host-unreachable
    }
}

Test It Out

You can test this out by trying to access Endpoint B’s HTTP server from Endpoint A:

$ curl 10.0.0.2
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
...

You should see the HTML of Endpoint B’s homepage printed. If you start up another web server running on Endpoint B on some other port, like say TCP port 8080 (run python3 -m http.server 8080 for a temporary server serving the contents of the current directory), you won’t be able to access it from Endpoint A (or anywhere else):

$ curl 10.0.0.2:8080
curl: (7) Failed to connect to 10.0.0.2 port 8080: Connection refused

Point to Site

With the standard point-to-site topology, an nftables firewall for the points will look much like the points in a point-to-point topology. The site, too, will be pretty similar to the Base Configuration, but will need a few additional rules to forward traffic from points to site.

For an example, we’ll use the point-to-site WireGuard VPN described in the WireGuard Point to Site Configuration guide. Here’s a network diagram of the scenario:

Point to Site VPN
Figure 3. Point to site scenario

In this scenario, Endpoint A makes HTTP requests through its WireGuard connection with Host β to Endpoint B in the Site B LAN. Endpoint B doesn’t ever need to initiate connections to Endpoint A (it just responds to requests made by Endpoint A).

Endpoint A

On Endpoint A, we can use our nftables Base Configuration practically verbatim. The only thing we need to tweak is the wg_port definition, to match the port used to Configure WireGuard on Endpoint A:

define wg_port = 51821

Endpoint B

Endpoint B in this scenario is not part of the WireGuard VPN — it’s just some server on the LAN at Site B. So while an nftables firewall for it will look a lot like our Base Configuration above, it won’t need to accept WireGuard packets; instead, it will need to accept HTTP packets. A minimal configuration for it would look like the following:

#!/usr/sbin/nft -f
flush ruleset

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;

        # accept all loopback packets
        iif "lo" accept
        # accept all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # accept all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }
        # drop new connections over rate limit
        ct state new limit rate over 1/second burst 10 packets drop

        # accept all DHCPv6 packets received at a link-local address
        ip6 daddr fe80::/64 udp dport dhcpv6-client accept
        # accept all SSH packets
        tcp dport ssh accept
        # accept all HTTP packets
        tcp dport http accept

        # reject with polite "port unreachable" icmp response
        reject
    }

    chain forward {
        type filter hook forward priority 0; policy drop;
        reject with icmpx type host-unreachable
    }
}

To limit connections to its HTTP server so that it allows access only from our WireGuard network (via Host β, 192.168.200.2 on the Site B LAN), adjust its HTTP rule to the following:

        # accept HTTP packets from Host β
        ip saddr 192.168.200.2 tcp dport http accept

Host β

On Host β, we need to make a few more changes from our nftables Base Configuration. We also need to change its wg_port definition, to match the port used to Configure WireGuard on Host β:

define wg_port = 51822

Also at the top of the configuration, let’s add a definition for our WireGuard interface name:

define wg_iface = "wg0"

Then, at the bottom, add a new nat table with a postrouting chain, which will masquerade packets from the WireGuard VPN to Site B’s LAN:

table inet nat {
    chain postrouting {
        type nat hook postrouting priority 100; policy accept;
        iifname $wg_iface oifname $pub_iface masquerade
    }
}

The name of the table and chain are arbitrary — what matters is that the chain type is set to nat, with a postrouting hook at priority 100 (the priority in the postrouting path where SNAT should be applied). The masquerade rule in this chain will translate the source address of any packets that Host β forwards from its connected WireGuard peers, rewriting them to use Host β’s own address on the Site B LAN (192.168.200.2). For replies to those packets, it will do the reverse — translate the packet’s destination from Host β’s own address to the IP address of the WireGuard peer that was the original source.
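One way to observe this translation (assuming the conntrack-tools package is installed on Host β) is to list the kernel’s connection-tracking table while a WireGuard peer has a connection open through the tunnel — each masqueraded connection’s entry shows both the original 10.0.0.x source address and the 192.168.200.2 address used for its replies:

sudo conntrack -L | grep 10.0.0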

Note

If Host β is running a version of Linux kernel older than 5.2 (released mid-2019), set the nat table’s type to ip instead of inet (or ip6 if you’re using IPv6 addresses).

And if Host β’s kernel is older than 4.18 (released mid-2018), you’ll also need to add the following prerouting chain to the nat table:

    chain prerouting {
        type nat hook prerouting priority -100; policy accept;
    }

Even though this chain doesn’t contain any rules, it’s needed with older kernels to ensure that replies to masqueraded packets are translated back to their original source correctly.

Finally, if we want to allow Host β to forward all traffic from any host in our WireGuard network to any host at Site B, we could add the following two rules to the forward chain of its filter table (right before the reject with icmpx type host-unreachable rule):

        ct state vmap { invalid : drop, established : accept, related : accept }
        iifname $wg_iface oifname $pub_iface accept

If, however, we want to use Host β to enforce some access control rules for our WireGuard network, we might instead want to build a separate filter chain just for it, jumping from the forward chain to it for additional filtering. In that case, add the following two rules to the forward chain instead:

        ct state vmap { invalid : drop, established : accept, related : accept }
        iifname $wg_iface oifname $pub_iface goto wg-forward

Then add the following wg-forward chain to the filter table:

    chain wg-forward {
        # forward all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept

        # forward all HTTP packets for Endpoint B
        ip daddr 192.168.200.22 tcp dport http accept

        # reject with polite "administratively prohibited" icmp response
        reject with icmpx type admin-prohibited
    }

The first rules in this chain forward all ICMP and ICMPv6 traffic from any host in our WireGuard network to any host in the Site B LAN. The last rule in the chain rejects all traffic not explicitly accepted (ie most types of new connections) by dropping it and replying with a “Destination unreachable (Communication administratively prohibited)” ICMP or ICMPv6 packet.

In the middle, we have our WireGuard VPN access rules. In the minimal version above, we allow all connections to Endpoint B’s HTTP server (192.168.200.22 TCP port 80) from any host.

A more restrictive version of that rule might allow only Endpoint A (10.0.0.1) to access Endpoint B’s HTTP server:

        ip saddr 10.0.0.1 ip daddr 192.168.200.22 tcp dport http accept

If there were several services on Endpoint B that we wanted to allow Endpoint A to access, we could adjust this rule to allow those additional services, like the following (allowing access to TCP ports 22, 80, 8080, and 8081):

        ip saddr 10.0.0.1 ip daddr 192.168.200.22 tcp dport { ssh, http, 8080, 8081 } accept

Or if we added a couple more points to our WireGuard VPN (say 10.0.0.4 and 10.0.0.5), we could adjust the rule to allow those points to access Endpoint B’s HTTP server, as well:

        ip saddr { 10.0.0.1, 10.0.0.4, 10.0.0.5 } ip daddr 192.168.200.22 tcp dport http accept

In the above cases, we could have alternatively added multiple separate rules for each point or service; but with nftables, it’s slightly more efficient to use sets of IP addresses or ports within a single rule (although you’d only notice the difference if you had a lot of traffic or a lot of rules).

In cases where we have a number of similar rules, but with different source IP and destination IP or port combinations, we can also use an nftables set with concatenation to jam them all into a single rule. For example, if we had a group of similar rules like the following:

        ip saddr 10.0.0.1 ip daddr 192.168.200.22 tcp dport http accept
        ip saddr 10.0.0.4 ip daddr 192.168.200.22 tcp dport http accept
        ip saddr 10.0.0.5 ip daddr 192.168.200.123 tcp dport http accept
        ip saddr 10.0.0.1 ip daddr 192.168.200.123 tcp dport smtp accept
        ip saddr 10.0.0.5 ip daddr 192.168.200.123 tcp dport smtp accept

We could consolidate them all into a single rule like this (where . is the concatenation operator for nftables):

        ip saddr . ip daddr . tcp dport {
            10.0.0.1 . 192.168.200.22 . http,
            10.0.0.4 . 192.168.200.22 . http,
            10.0.0.5 . 192.168.200.123 . http,
            10.0.0.1 . 192.168.200.123 . smtp,
            10.0.0.5 . 192.168.200.123 . smtp
        } accept
Note

When using both IPv4 and IPv6 addresses, you need a separate ip6 rule (with similar saddr and daddr arguments) for the IPv6 addresses — you can’t mix IPv4 and IPv6 addresses in the same expression.

The full /etc/nftables.conf file for Host β with the minimal wg-forward chain from above will look like the following:

#!/usr/sbin/nft -f
flush ruleset

define pub_iface = "eth0"
define wg_iface = "wg0"
define wg_port = 51822

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;

        # accept all loopback packets
        iif "lo" accept
        # accept all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # accept all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }
        # drop new connections over rate limit
        ct state new limit rate over 1/second burst 10 packets drop

        # accept all DHCPv6 packets received at a link-local address
        ip6 daddr fe80::/64 udp dport dhcpv6-client accept
        # accept all SSH packets received on a public interface
        iifname $pub_iface tcp dport ssh accept
        # accept all WireGuard packets received on a public interface
        iifname $pub_iface udp dport $wg_port accept

        # reject with polite "port unreachable" icmp response
        reject
    }

    chain wg-forward {
        # forward all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept

        # forward all HTTP packets for Endpoint B
        ip daddr 192.168.200.22 tcp dport http accept

        # reject with polite "administratively prohibited" icmp response
        reject with icmpx type admin-prohibited
    }

    chain forward {
        type filter hook forward priority 0; policy drop;

        # forward all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }
        # filter all packets from WireGuard VPN to Site B via wg-forward chain
        iifname $wg_iface oifname $pub_iface goto wg-forward

        # reject with polite "host unreachable" icmp response
        reject with icmpx type host-unreachable
    }
}
table inet nat {
    chain postrouting {
        type nat hook postrouting priority 100; policy accept;
        # masquerade all packets from WireGuard VPN to Site B
        iifname $wg_iface oifname $pub_iface masquerade
    }
}

Test It Out

You can test this out by trying to access Endpoint B’s HTTP server from Endpoint A:

$ curl 192.168.200.22
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
...

You should see the HTML of Endpoint B’s homepage printed. If you start up another web server running on Endpoint B on some other port (or any other host in Site B), like say at TCP port 8080 (run python3 -m http.server 8080 for a temporary server serving the contents of the current directory), you won’t be able to access it from Endpoint A (or anywhere else in the WireGuard network):

$ curl 192.168.200.22:8080
curl: (7) Failed to connect to 192.168.200.22 port 8080: Connection refused

Point to Site With Port Forwarding

When you have a point-to-site topology, but want to “reverse” the direction of access — so that hosts within the site initiate access to services on the points instead of vice-versa — you need to set up slightly different nftables rules on the site and points than you do with the standard Point to Site configuration.

For an example, we’ll use the point-to-site WireGuard VPN described in the WireGuard Point to Site With Port Forwarding guide. Here’s a network diagram of the scenario:

Point to Site VPN With Port Forwarding
Figure 4. Point to site with port forwarding

In this scenario, Endpoint B in Site B makes HTTP requests through Host β to Endpoint A in the WireGuard VPN. Endpoint A doesn’t initiate connections to Site B — it just responds to requests it receives from Host β.

Endpoint A

On Endpoint A, we can start with our nftables Base Configuration, and add one extra rule to allow access to Endpoint A’s HTTP service. First, though, we need to change its wg_port definition to 51821, to match the port used to Configure WireGuard on Endpoint A:

define wg_port = 51821

Then, add a rule to the input chain of the filter table to allow access to the HTTP server running on Endpoint A. If we wanted to allow access from any host at Site A or Site B, we could add a rule like the following to the bottom of the input chain (just before the reject statement):

        tcp dport http accept

However, if we want to limit access so that only hosts at Site B will have access, we can do it instead with this rule:

        iifname "wg0" tcp dport http accept

If we add a wg_iface definition for the wg0 interface, the full /etc/nftables.conf file for Endpoint A will look like the following:

#!/usr/sbin/nft -f
flush ruleset

define pub_iface = "eth0"
define wg_iface = "wg0"
define wg_port = 51821

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;

        # accept all loopback packets
        iif "lo" accept
        # accept all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # accept all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }
        # drop new connections over rate limit
        ct state new limit rate over 1/second burst 10 packets drop

        # accept all DHCPv6 packets received at a link-local address
        ip6 daddr fe80::/64 udp dport dhcpv6-client accept
        # accept all SSH packets received on a public interface
        iifname $pub_iface tcp dport ssh accept
        # accept all WireGuard packets received on a public interface
        iifname $pub_iface udp dport $wg_port accept

        # accept all HTTP packets received on a WireGuard interface
        iifname $wg_iface tcp dport http accept

        # reject with polite "port unreachable" icmp response
        reject
    }

    chain forward {
        type filter hook forward priority 0; policy drop;
        reject with icmpx type host-unreachable
    }
}

Endpoint B

Endpoint B in this scenario is not part of the WireGuard VPN — it’s just some computer on the LAN at Site B. If you want to set up an nftables firewall for it, use the Base Configuration, and omit the iifname $pub_iface udp dport $wg_port accept line.

Host β

On Host β, we’ll start with our nftables Base Configuration again, and add a few more rules to it. First, change its wg_port definition, to match the port used to Configure WireGuard on Host β:

define wg_port = 51822

Also at the top of the configuration, add a definition for our WireGuard interface name:

define wg_iface = "wg0"

Then, at the bottom, add a new nat table with a prerouting chain, containing a rule to forward TCP port 80 to Endpoint A (10.0.0.1):

table inet nat {
    chain prerouting {
        type nat hook prerouting priority -100; policy accept;
        ip daddr 192.168.200.2 tcp dport http dnat ip to 10.0.0.1
    }
}

The name of the table and chain are arbitrary — what matters is that the chain type is set to nat, with a prerouting hook at priority -100 (the priority in the prerouting path where DNAT should be applied). The dnat rule in this chain will translate the destination address of TCP packets that Host β receives with a destination address of its own Site B LAN address (192.168.200.2) and a destination port of 80, rewriting them to use Endpoint A’s IP address of 10.0.0.1 as their destination instead. For replies to these packets, it will do the reverse — translate the reply packet’s source address from 10.0.0.1 back to 192.168.200.2.

Note

If Host β is running a version of Linux kernel older than 5.2 (released mid-2019), set the nat table’s type to ip instead of inet (or ip6 if you’re using IPv6 addresses).

And if Host β’s kernel is older than 4.18 (released mid-2018), you’ll also need to add the following postrouting chain to the nat table:

    chain postrouting {
        type nat hook postrouting priority 100; policy accept;
    }

Even though this chain doesn’t contain any rules, it’s needed with older kernels to ensure that replies to translated packets are translated back to their original source correctly.

If you want to translate the port, as well — for example to forward port 8080 on Host β to port 80 on Endpoint A — append the translated port to the translated address:

        ip daddr 192.168.200.2 tcp dport 8080 dnat ip to 10.0.0.1:80

If you want to forward multiple ports, you can add one rule for each port:

        ip daddr 192.168.200.2 tcp dport 8080 dnat ip to 10.0.0.1:80
        ip daddr 192.168.200.2 tcp dport 8443 dnat ip to 10.0.0.1:443
        ip daddr 192.168.200.2 tcp dport 8993 dnat ip to 10.0.0.1:993
        ip daddr 192.168.200.2 tcp dport 8084 dnat ip to 10.0.0.4:80
        ip daddr 192.168.200.2 tcp dport 8085 dnat ip to 10.0.0.5:80

With nftables version 1.0.0 or newer, you can alternatively combine multiple destination ip:port rules into a single rule with a convenient map syntax (although you still need separate rules for TCP and UDP ports, as well as for IPv4 and IPv6 addresses):

        ip daddr 192.168.200.2 dnat ip to tcp dport map {
            8080 : 10.0.0.1 . 80,
            8443 : 10.0.0.1 . 443,
            8993 : 10.0.0.1 . 993,
            8084 : 10.0.0.4 . 80,
            8085 : 10.0.0.5 . 80
        }

The final step is to allow Host β to forward the packets with the translated destination addresses. If we want to allow Host β to forward the translated packets from any host at Site B, we could add the following two rules to the forward chain of the filter table (right before the reject with icmpx type host-unreachable rule):

        ct state vmap { invalid : drop, established : accept, related : accept }
        iifname $pub_iface oifname $wg_iface accept

If, however, we want to use Host β to enforce some access control rules for packets forwarded from Site B to our WireGuard network, we might instead want to build a separate chain just for it, jumping from the forward chain to it for additional filtering. In that case, add the following two rules to the forward chain instead:

        ct state vmap { invalid : drop, established : accept, related : accept }
        iifname $pub_iface oifname $wg_iface goto site-b-forward

Then add the following site-b-forward chain to the filter table:

    chain site-b-forward {
        # forward HTTP packets from Endpoint B to Endpoint A
        ip saddr 192.168.200.22 ip daddr 10.0.0.1 tcp dport http accept

        # reject with polite "administratively prohibited" icmp response
        reject with icmpx type admin-prohibited
    }

The last rule in the chain rejects all traffic not explicitly accepted (ie most types of new connections) by dropping it and replying with a “Destination unreachable (Communication administratively prohibited)” ICMP or ICMPv6 packet. Before that, we have our WireGuard VPN access rules. In the version above, we allow connections to Endpoint A’s HTTP server (10.0.0.1 TCP port 80) only from Endpoint B (192.168.200.22).

Because the DNAT rule from the prerouting chain (with its nat type prerouting hook) runs before this access-control rule in the forward chain (with its filter type forward hook), by this point packets originally sent from Endpoint B to Host β’s TCP port 80 will already have been rewritten to use Endpoint A’s IP address of 10.0.0.1. So this access-control rule will allow these packets to be forwarded (while blocking any similar packets sent from other hosts in Site B). See the Netfilter Hooks documentation for more details about packet flows and hook ordering.
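If you want to watch this ordering yourself, one option (purely for temporary debugging — remove it afterward) is to add a tracing rule at the top of the prerouting chain, above the dnat rule, and then view the trace events it generates by running nft monitor in another terminal:

        ip daddr 192.168.200.2 tcp dport http meta nftrace set 1

sudo nft monitor trace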

If there were several services on Endpoint A (and other endpoints that we might add to the WireGuard network, like say 10.0.0.4 and 10.0.0.5) that we wanted to allow Endpoint B and a few other hosts in Site B to access, we could add additional rules to allow access to those additional services, like the following:

        ip saddr 192.168.200.22 ip daddr 10.0.0.1 tcp dport 80 accept
        ip saddr 192.168.200.123 ip daddr 10.0.0.1 tcp dport 80 accept
        ip saddr 192.168.200.22 ip daddr 10.0.0.1 tcp dport 443 accept
        ip saddr 192.168.200.22 ip daddr 10.0.0.1 tcp dport 993 accept
        ip saddr 192.168.200.22 ip daddr 10.0.0.4 tcp dport 80 accept
        ip saddr 192.168.200.123 ip daddr 10.0.0.5 tcp dport 80 accept

Or we could consolidate them all into a single rule like this (where . is the concatenation operator for nftables):

        ip saddr . ip daddr . tcp dport {
            192.168.200.22 . 10.0.0.1 . 80,
            192.168.200.123 . 10.0.0.1 . 80,
            192.168.200.22 . 10.0.0.1 . 443,
            192.168.200.22 . 10.0.0.1 . 993,
            192.168.200.22 . 10.0.0.4 . 80,
            192.168.200.123 . 10.0.0.5 . 80
        } accept

The full /etc/nftables.conf file for Host β with the original version of the site-b-forward chain will look like the following:

#!/usr/sbin/nft -f
flush ruleset

define pub_iface = "eth0"
define wg_iface = "wg0"
define wg_port = 51822

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;

        # accept all loopback packets
        iif "lo" accept
        # accept all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # accept all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }
        # drop new connections over rate limit
        ct state new limit rate over 1/second burst 10 packets drop

        # accept all DHCPv6 packets received at a link-local address
        ip6 daddr fe80::/64 udp dport dhcpv6-client accept
        # accept all SSH packets received on a public interface
        iifname $pub_iface tcp dport ssh accept
        # accept all WireGuard packets received on a public interface
        iifname $pub_iface udp dport $wg_port accept

        # reject with polite "port unreachable" icmp response
        reject
    }

    chain site-b-forward {
        # forward HTTP packets from Endpoint B to Endpoint A
        ip saddr 192.168.200.22 ip daddr 10.0.0.1 tcp dport http accept

        # reject with polite "administratively prohibited" icmp response
        reject with icmpx type admin-prohibited
    }

    chain forward {
        type filter hook forward priority 0; policy drop;

        # forward all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }
        # filter all packets from Site B to WireGuard VPN via site-b-forward chain
        iifname $pub_iface oifname $wg_iface goto site-b-forward

        # reject with polite "host unreachable" icmp response
        reject with icmpx type host-unreachable
    }
}
table inet nat {
    chain prerouting {
        type nat hook prerouting priority -100; policy accept;
        # rewrite destination address of TCP port 80 packets to 10.0.0.1
        ip daddr 192.168.200.2 tcp dport http dnat ip to 10.0.0.1
    }
}

Test It Out

You can test this out by trying to access Endpoint A’s HTTP server from Endpoint B, via the forwarded TCP port 80 on Host β (192.168.200.2):

$ curl 192.168.200.2
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
...

You should see the HTML of Endpoint A’s homepage printed. If you start up another web server running on Endpoint A on some other port (or any other host in the WireGuard network), like say at TCP port 8080 (run python3 -m http.server 8080 for a temporary server serving the contents of the current directory), you won’t be able to access it from Endpoint B (or from anywhere else in Site B):

$ curl 192.168.200.2:8080
curl: (7) Failed to connect to 192.168.200.2 port 8080: Connection refused

Site to Site

With a site-to-site topology, the nftables firewall for each of the sites will look much like the firewall for the site in a point-to-site topology — but simpler, since no NAT is needed.

For an example, we’ll use the site-to-site WireGuard VPN described in the WireGuard Site to Site Configuration guide. Here’s a network diagram of the scenario:

Site to Site VPN
Figure 5. Site to site scenario

In this scenario, Endpoint A in Site A makes HTTP requests to Endpoint B in Site B. These requests are routed through the WireGuard VPN between Host α in Site A and Host β in Site B. Neither Endpoint A nor Endpoint B are part of the WireGuard VPN.

Endpoint A

An nftables firewall for Endpoint A can just use the Base Configuration; and since Endpoint A doesn’t have anything to do with WireGuard, you can omit the iifname $pub_iface udp dport $wg_port accept line from it.

Endpoint B

An nftables firewall for Endpoint B can also use the Base Configuration with the iifname $pub_iface udp dport $wg_port accept line omitted. It does need to accept HTTP connections, however, so add a rule like the following to the bottom of the filter table’s input chain (right above the reject rule):

        tcp dport http accept

This will allow any host that can connect to Endpoint B to access its HTTP server. To limit access so that only Endpoint A (192.168.1.11) can access its HTTP server, use this rule instead:

        ip saddr 192.168.1.11 tcp dport http accept

Host α

On Host α, we can again use our nftables Base Configuration, with a few additions. We need to change its wg_port definition to match the port used to Configure WireGuard on Host α:

define wg_port = 51821

Also at the top of the configuration, let’s add a definition for our WireGuard interface name:

define wg_iface = "wg0"

Then in the forward chain of its filter table, if we want to allow unrestricted access from any host in Site B to any host in Site A, we could add the following two rules (right before the reject with icmpx type host-unreachable rule):

        iifname $wg_iface accept
        oifname $wg_iface accept

If, however, we want to restrict access from Site B such that hosts in Site B can only respond to connections initiated in Site A (such as responding to HTTP requests made by hosts in Site A), and not initiate any connections to Site A themselves, we could instead build a small chain just for this purpose, jumping to it from the forward chain. In that case, add the following two rules to the forward chain instead:

        iifname $wg_iface goto wg-forward
        oifname $wg_iface accept

Then add the following wg-forward chain to the filter table:

    chain wg-forward {
        # forward all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # forward all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }

        # reject with polite "administratively prohibited" icmp response
        reject with icmpx type admin-prohibited
    }

The first rules in this chain forward all ICMP and ICMPv6 traffic from any host in Site B to any host in Site A, as well as the traffic for any already-established connections. The last rule in the chain rejects all traffic not explicitly accepted (ie any new connections) by dropping it and replying with a “Destination unreachable (Communication administratively prohibited)” ICMP or ICMPv6 packet.

If we wanted to allow access to a few specific services in Site A from Site B, we’d add additional access rules to the middle of this chain (like we will for Host β below). But since we don’t need to grant any other access for this scenario, the full /etc/nftables.conf file for Host α will look like the following:

#!/usr/sbin/nft -f
flush ruleset

define pub_iface = "eth0"
define wg_iface = "wg0"
define wg_port = 51821

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;

        # accept all loopback packets
        iif "lo" accept
        # accept all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # accept all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }
        # drop new connections over rate limit
        ct state new limit rate over 1/second burst 10 packets drop

        # accept all DHCPv6 packets received at a link-local address
        ip6 daddr fe80::/64 udp dport dhcpv6-client accept
        # accept all SSH packets received on a public interface
        iifname $pub_iface tcp dport ssh accept
        # accept all WireGuard packets received on a public interface
        iifname $pub_iface udp dport $wg_port accept

        # reject with polite "port unreachable" icmp response
        reject
    }

    chain wg-forward {
        # forward all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # forward all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }

        # reject with polite "administratively prohibited" icmp response
        reject with icmpx type admin-prohibited
    }

    chain forward {
        type filter hook forward priority 0; policy drop;

        # filter all packets inbound from Site B to Site A via wg-forward chain
        iifname $wg_iface goto wg-forward
        # clamp mss of tcp syn and syn-ack packets outbound from Site A to Site B
        oifname $wg_iface tcp flags syn / syn,rst tcp option maxseg size set rt mtu
        # allow all packets outbound from Site A to Site B
        oifname $wg_iface accept

        # reject with polite "host unreachable" icmp response
        reject with icmpx type host-unreachable
    }
}
Tip

We’ve also added a rule to the forward chain which automatically adjusts the MSS (Maximum Segment Size) option of TCP SYN and SYN-ACK packets sent from Site A through the WireGuard tunnel, to account for the lower MTU (Maximum Transmission Unit) size of the WireGuard interface. This helps the other end of the TCP connection in Site B optimize the size of the packets it sends to Site A, avoiding packet fragmentation.
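One way to check that the clamping is working (assuming tcpdump is installed on Host α) is to capture TCP SYN and SYN-ACK packets on the WireGuard interface and look at the mss value printed in their options. For IPv4 traffic, a capture like the following will show just those packets:

sudo tcpdump -ni wg0 'tcp[tcpflags] & tcp-syn != 0'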

Host β

The nftables configuration for Host β will be pretty much the same as Host α (except that it will also grant access for Endpoint A to initiate connections to Endpoint B).

We’ll start with our nftables Base Configuration, and change its wg_port definition to match the port used to Configure WireGuard on Host β:

define wg_port = 51822

Also at the top of the configuration, add a definition for our WireGuard interface name:

define wg_iface = "wg0"

Since we want to use Host β to enforce some access control rules for Site B, we’ll next build a separate filter chain for access through its WireGuard connection with Site A. Add the following two rules to the filter table’s forward chain:

        iifname $wg_iface goto wg-forward
        oifname $wg_iface accept

Then add the following wg-forward chain to the filter table:

    chain wg-forward {
        # forward all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # forward all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }

        # forward all HTTP packets for Endpoint B
        ip daddr 192.168.200.22 tcp dport http accept

        # reject with polite "administratively prohibited" icmp response
        reject with icmpx type admin-prohibited
    }

The first rules in this chain forward all ICMP and ICMPv6 traffic from any host in Site B to any host in Site A, as well as the traffic for any already-established connections. The last rule in the chain rejects all traffic not explicitly accepted (ie most types of new connections) by dropping it and replying with a “Destination unreachable (Communication administratively prohibited)” ICMP or ICMPv6 packet.

In the middle, we have our WireGuard VPN access rules. In the minimal version above, we allow all connections to Endpoint B’s HTTP server (192.168.200.22 TCP port 80) from any Site A host.

A more restrictive version of that rule might allow only Endpoint A (192.168.1.11) to access Endpoint B’s HTTP server:

        ip saddr 192.168.1.11 ip daddr 192.168.200.22 tcp dport http accept

If there were several services on Endpoint B that we wanted to allow Endpoint A to access, we could adjust this rule to allow those additional services, like the following (allowing access to TCP ports 22, 80, 8080, and 8081):

        ip saddr 192.168.1.11 ip daddr 192.168.200.22 tcp dport { ssh, http, 8080, 8081 } accept

Or if there were a few other hosts in Site A that we wanted to allow access to Endpoint B, we could adjust the rule to allow those hosts to access Endpoint B’s HTTP server, as well:

        ip saddr { 192.168.1.11, 192.168.1.14, 192.168.1.15 } ip daddr 192.168.200.22 tcp dport http accept

In the above cases, we could have alternatively added multiple separate rules for each host or service; but with nftables, it’s slightly more efficient to use sets of IP addresses or ports within a single rule (although you’d only notice the difference if you had a lot of traffic or a lot of rules).

In cases where we have a number of similar rules, but with different source IP and destination IP or port combinations, we can also use an nftables set with concatenation to jam them all into a single rule. For example, if we had a group of similar rules like the following:

        ip saddr 192.168.1.11 ip daddr 192.168.200.22 tcp dport http accept
        ip saddr 192.168.1.14 ip daddr 192.168.200.22 tcp dport http accept
        ip saddr 192.168.1.15 ip daddr 192.168.200.123 tcp dport http accept
        ip saddr 192.168.1.11 ip daddr 192.168.200.123 tcp dport smtp accept
        ip saddr 192.168.1.15 ip daddr 192.168.200.123 tcp dport smtp accept

We could consolidate them all into a single rule like this (where . is the concatenation operator for nftables):

        ip saddr . ip daddr . tcp dport {
            192.168.1.11 . 192.168.200.22 . http,
            192.168.1.14 . 192.168.200.22 . http,
            192.168.1.15 . 192.168.200.123 . http,
            192.168.1.11 . 192.168.200.123 . smtp,
            192.168.1.15 . 192.168.200.123 . smtp
        } accept
Note

When using both IPv4 and IPv6 addresses, you need a separate ip6 rule (with similar saddr and daddr arguments) for the IPv6 addresses — you can’t mix IPv4 and IPv6 addresses in the same expression.

The full /etc/nftables.conf file for Host β with the minimal wg-forward chain from above will look like the following:

#!/usr/sbin/nft -f
flush ruleset

define pub_iface = "eth0"
define wg_iface = "wg0"
define wg_port = 51822

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;

        # accept all loopback packets
        iif "lo" accept
        # accept all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # accept all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }
        # drop new connections over rate limit
        ct state new limit rate over 1/second burst 10 packets drop

        # accept all DHCPv6 packets received at a link-local address
        ip6 daddr fe80::/64 udp dport dhcpv6-client accept
        # accept all SSH packets received on a public interface
        iifname $pub_iface tcp dport ssh accept
        # accept all WireGuard packets received on a public interface
        iifname $pub_iface udp dport $wg_port accept

        # reject with polite "port unreachable" icmp response
        reject
    }

    chain wg-forward {
        # forward all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # forward all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }

        # forward all HTTP packets for Endpoint B
        ip daddr 192.168.200.22 tcp dport http accept

        # reject with polite "administratively prohibited" icmp response
        reject with icmpx type admin-prohibited
    }

    chain forward {
        type filter hook forward priority 0; policy drop;

        # filter all packets inbound from Site A to Site B via wg-forward chain
        iifname $wg_iface goto wg-forward
        # clamp mss of tcp syn and syn-ack packets outbound from Site B to Site A
        oifname $wg_iface tcp flags syn / syn,rst tcp option maxseg size set rt mtu
        # allow all packets outbound from Site B to Site A
        oifname $wg_iface accept

        # reject with polite "host unreachable" icmp response
        reject with icmpx type host-unreachable
    }
}
Tip

We’ve also added a rule to the forward chain which automatically adjusts the MSS option of TCP SYN and SYN-ACK packets sent from Site B through the WireGuard tunnel, to account for the lower MTU size of the WireGuard interface. This helps the other end of the TCP connection in Site A optimize the size of the packets it sends to Site B, avoiding packet fragmentation.

Test It Out

You can test this out by trying to access Endpoint B’s HTTP server from Endpoint A:

$ curl 192.168.200.22
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
...

You should see the HTML of Endpoint B’s homepage printed. However, if you start up another web server on Endpoint B (or on any other host in Site B) listening on a different port, say TCP port 8080 (run python3 -m http.server 8080 for a temporary server that serves the contents of the current directory), you won’t be able to access it from Endpoint A (or from any other host in Site A):

$ curl 192.168.200.22:8080
curl: (7) Failed to connect to 192.168.200.22 port 8080: Connection refused

Troubleshooting

Ping

Use ICMP tools like ping to check routing, not access control. For instance, our Site to Site firewall above allows any host in Site A to ping any host in Site B, and vice versa. This allows you to verify that the routing between Site A and Site B works.
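
For example, to verify the routing from Endpoint A to Endpoint B, you can run the following command on Endpoint A (the -c 3 flag simply limits it to three pings):

$ ping -c 3 192.168.200.22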

To check that your firewall works, try connecting to the specific services that your firewall is supposed to allow or deny. For example, our Site to Site firewall above only allows HTTP connections from Endpoint A to Endpoint B. You can check this with an HTTP tool like curl (or a web browser).

Tcpdump

Tcpdump is a great all-purpose tool for troubleshooting networking issues. For firewall issues in particular, you can use it to determine whether packets are transiting in and out of a given host as expected. For example, with the Site to Site scenario, if you get no response when you try to connect from Endpoint A to Endpoint B (ie when running curl 192.168.200.22), try running the following command on each host on the path between Endpoint A and Endpoint B (Endpoint A, Host α, Host β, and Endpoint B):

$ sudo tcpdump -ni any 'tcp port 80 and host 192.168.200.22'

While those tcpdump commands are running, try connecting from Endpoint A to Endpoint B again. You should see each terminal running tcpdump print a series of lines that look like the following:

22:12:16.156120 IP 192.168.1.11.36112 > 192.168.200.22.80: Flags [S], seq 767605079, win 62167, options [mss 8881,sackOK,TS val 1689421470 ecr 0,nop,wscale 6], length 0
22:12:16.157801 IP 192.168.200.22.80 > 192.168.1.11.36112: Flags [S.], seq 3979107600, ack 767605080, win 62083, options [mss 8881,sackOK,TS val 3921390190 ecr 1689421470,nop,wscale 6], length 0
22:12:16.158587 IP 192.168.1.11.36112 > 192.168.200.22.80: Flags [.], ack 1, win 972, options [nop,nop,TS val 1689421475 ecr 3921390190], length 0
22:12:16.158637 IP 192.168.1.11.36112 > 192.168.200.22.80: Flags [P.], seq 1:78, ack 1, win 972, options [nop,nop,TS val 1689421475 ecr 3921390190], length 77: HTTP: GET / HTTP/1.1
22:12:16.158755 IP 192.168.200.22.80 > 192.168.1.11.36112: Flags [.], ack 78, win 969, options [nop,nop,TS val 3921390191 ecr 1689421475], length 0

Lines with 192.168.1.11.36112 > 192.168.200.22.80 represent packets being sent to Endpoint B TCP port 80, and lines with 192.168.200.22.80 > 192.168.1.11.36112 represent packets being sent back from Endpoint B in response. If the tcpdump command on a host outputs only lines with the former, and none with the latter, it means only packets sent to Endpoint B are making it to the host — no packets on the return trip back from Endpoint B are. If the tcpdump command on a host doesn’t output anything, it means no packets are reaching the host, not even those on the first leg of the trip to Endpoint B.
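
If no packets show up on Host β at all, it can also be worth checking whether the encrypted WireGuard packets themselves are arriving on its public interface. For example, the following command (using Host β’s public interface and WireGuard port from the config above) captures that UDP traffic:

$ sudo tcpdump -ni eth0 'udp port 51822'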

Logging

You can use nftables log statements to log packets that make it to various points in your chains. For example, you could add a simple log statement to Host β’s wg-forward chain in the Site to Site example to log all packets that are about to be rejected:

    chain wg-forward {
        # forward all icmp/icmpv6 packets
        meta l4proto { icmp, ipv6-icmp } accept
        # forward all packets that are part of an already-established connection
        ct state vmap { invalid : drop, established : accept, related : accept }

        # forward all HTTP packets for Endpoint B
        ip daddr 192.168.200.22 tcp dport http accept

        log

        # reject with polite "administratively prohibited" icmp response
        reject with icmpx type admin-prohibited
    }

Any packet that makes it to that log statement will be logged to the kernel message facility. You can view these messages via the dmesg command; or if your system is set up with journald, via the journalctl -k command; or if your system uses rsyslogd, these messages will often be logged to files named /var/log/kern.log or /var/log/messages.
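
For example, on a system with journald, you can watch these kernel log messages in real time with the following command (most versions of dmesg support a similar --follow flag):

$ sudo journalctl -k -f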

For example, if you try to run curl 192.168.200.22:8080 from Endpoint A in the Site to Site example, you’d see a message like this:

Nov 17 19:21:56 host-beta kernel: IN=wg0 OUT=eth0 MAC= SRC=192.168.1.11 DST=192.168.200.22 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=58474 DF PROTO=TCP SPT=43678 DPT=8080 WINDOW=62167 RES=0x00 SYN URGP=0

You can limit what packets are logged by adding an expression before the log statement; for example, to log only packets received from Endpoint A (192.168.1.11 in the Site to Site example), use the following statement:

        ip saddr 192.168.1.11 log

And you can specify the log level for messages (default is warn), as well as add a custom prefix to them:

        log level info prefix "wg-filter reject: "
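
You can combine both, for example to log only the packets from Endpoint A that are about to be rejected, tagged with a recognizable prefix:

        ip saddr 192.168.1.11 log level info prefix "wg-filter reject: "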

You can also log packets that match a particular rule, by inserting the log statement into the rule right before its terminating statement (accept, drop, jump, etc):

        # forward all HTTP packets for Endpoint B
        ip daddr 192.168.200.22 tcp dport http log level debug prefix "new conn to endpoint b: " accept

Tracing

Tracing with nftables can also be useful when troubleshooting an issue: it shows you exactly which packets are being evaluated, and which rules from which chains are applied to each packet.

For example, you could turn on tracing by adding a meta nftrace set 1 statement to Host β’s forward chain in the Site to Site example. This would trace packets as they move through the forward chain (as well as any other chain they go on to after it):

    chain forward {
        type filter hook forward priority 0; policy drop;

        meta nftrace set 1

        # filter all packets inbound from Site A to Site B via wg-forward chain
        iifname $wg_iface goto wg-forward
        # allow all packets outbound from Site B to Site A
        oifname $wg_iface accept

        # reject with polite "host unreachable" icmp response
        reject with icmpx type host-unreachable
    }

Use the following command to watch these traces as they happen:

$ sudo nft monitor trace

If you try to run curl 192.168.200.22:8080 from Endpoint A in the Site to Site example, this is the output you’d see:

trace id 053f9724 inet filter forward packet: iifname "wg0" oifname "eth0" ip saddr 192.168.1.11 ip daddr 192.168.200.22 ip dscp cs0 ip ecn not-ect ip ttl 63 ip id 48759 ip protocol tcp ip length 60 tcp sport 43682 tcp dport 8080 tcp flags == syn tcp window 62167
trace id 053f9724 inet filter forward rule iifname "wg0" oifname "eth0" goto wg-forward (verdict goto wg-forward)
trace id 053f9724 inet filter wg-forward rule ct state vmap { invalid : drop, established : accept, related : accept } (verdict continue)
trace id 053f9724 inet filter wg-forward rule reject with icmpx type admin-prohibited (verdict drop)