The UDP/DF bug

I have observed that, when communicating over UDP, a packet is not delivered if the Don't Fragment (DF) bit is not set in the flags field of its IP header.

The following applications are impacted:

  • Multiplayer mode of Trackmania
  • Discord
  • Wireguard
  • dig
  • Rocket League

Identifying / reproducing the problem

Here is an example with DNS. The first request is sent by dnsmasq, which sets the DF bit; the second is sent by dig, which does not. Both requests go to Quad9 DNS to resolve the lesterpig.com domain. In the capture below, the dnsmasq request is answered, while dig retransmits the same query (ID 41045) every five seconds:

# tcpdump -vv -i wwan0 'port 53 and (host 9.9.9.10 or host 149.112.112.10)'
tcpdump: listening on wwan0, link-type RAW (Raw IP), capture size 262144 bytes
17:30:39.832839 IP (tos 0x0, ttl 64, id 40453, offset 0, flags [DF], proto UDP (17), length 70)
    192.0.0.2.61829 > dns10.quad9.net.53: [udp sum ok] 26437+ [1au] A? lesterpig.com. ar: . OPT UDPsize=512 (42)
17:30:39.913513 IP (tos 0x0, ttl 57, id 0, offset 0, flags [DF], proto UDP (17), length 86)
    dns10.quad9.net.53 > 192.0.0.2.61829: [udp sum ok] 26437 q: A? lesterpig.com. 1/0/1 lesterpig.com. A 89.89.231.11 ar: . OPT UDPsize=512 (58)
[...]
17:30:39.913513 IP (tos 0x0, ttl 57, id 0, offset 0, flags [DF], proto UDP (17), length 86)
    dns10.quad9.net.53 > 192.0.0.2.61829: [udp sum ok] 26437 q: A? lesterpig.com. 1/0/1 lesterpig.com. A 89.89.231.11 ar: . OPT UDPsize=512 (58)
[...]
17:30:49.497921 IP (tos 0x0, ttl 63, id 35598, offset 0, flags [none], proto UDP (17), length 82)
    192.0.0.2.42500 > dns10.quad9.net.53: [udp sum ok] 41045+ [1au] A? lesterpig.com. ar: . OPT UDPsize=4096 (54)
[...]
17:30:54.497440 IP (tos 0x0, ttl 63, id 36723, offset 0, flags [none], proto UDP (17), length 82)
    192.0.0.2.42500 > dns10.quad9.net.53: [udp sum ok] 41045+ [1au] A? lesterpig.com. ar: . OPT UDPsize=4096 (54)
[...]
17:30:59.497555 IP (tos 0x0, ttl 63, id 40590, offset 0, flags [none], proto UDP (17), length 82)
    192.0.0.2.42500 > dns10.quad9.net.53: [udp sum ok] 41045+ [1au] A? lesterpig.com. ar: . OPT UDPsize=4096 (54)

46 packets captured
46 packets received by filter
0 packets dropped by kernel

I have observed this problem at least twice before:

  • With Trackmania
  • With Wireguard

I also reproduced the bug with Scapy+netcat. First I use netcat to listen on UDP on a public server:

nc -ul 2372

Then I send two packets with scapy, one with the DF bit set and one without:

from scapy.all import IP, UDP, Raw, send

# Bits of the IP flags field:
# X.. Reserved
# .X. Don't Fragment (DF)
# ..X More Fragments (MF)
send(IP(dst="212.47.253.12", flags=0b010)/UDP(sport=2373, dport=2372)/Raw(load='does ... work\n'))  # DF set
send(IP(dst="212.47.253.12", flags=0b000)/UDP(sport=2373, dport=2372)/Raw(load='does not work\n'))  # DF not set
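The same test can probably be approximated without scapy or root privileges, because on Linux the DF bit of a UDP socket is controlled by the IP_MTU_DISCOVER socket option. The following is only a minimal sketch under that assumption: the numeric constants are the Linux values from <linux/in.h>, hard-coded here because Python's socket module does not always expose them, and `pmtudisc_mode` and `send_udp` are hypothetical helper names.

```python
import socket

# Linux-specific values from <linux/in.h> (assumption: a Linux host).
IP_MTU_DISCOVER = 10
IP_PMTUDISC_DONT = 0  # never set the DF bit
IP_PMTUDISC_DO = 2    # always set the DF bit


def pmtudisc_mode(df: bool) -> int:
    """Map the wanted DF state to an IP_MTU_DISCOVER value."""
    return IP_PMTUDISC_DO if df else IP_PMTUDISC_DONT


def send_udp(host: str, port: int, payload: bytes, df: bool) -> None:
    """Send one UDP datagram with the DF bit forced on or off."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, pmtudisc_mode(df))
        sock.sendto(payload, (host, port))
```

With the netcat listener above still running, `send_udp("212.47.253.12", 2372, b"with DF\n", df=True)` and `send_udp("212.47.253.12", 2372, b"without DF\n", df=False)` should produce the same two kinds of packets as the scapy commands. The flags can be checked with tcpdump on the sending side.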

I did not observe this bug when I was using a TP-Link + Huawei e3372h setup, so it probably does not come from my ISP. (It could still come from the ISP, though, since we activated IPv6 + 5G.)

Still, one question remains open for now: must the bit be set in both directions, or only in one? Additional tests are thus required.

Additional tests

On the server, we run a small UDP echo server with scapy:

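from scapy.all import IP, UDP, Raw, send, sniff

# Reply to the first UDP packet seen on port 2732, with the DF bit not set.
sniff(
  filter="udp and port 2732",
  count=1,
  prn=lambda p: send(
    IP(src=p[IP].dst, dst=p[IP].src, flags=0b000)/
    UDP(sport=p[UDP].dport, dport=p[UDP].sport)/
    Raw(load='answer\n')))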

And on the other side, we simply use netcat:

nc -u rayonx.machine.deuxfleurs.fr 2732

On the server, we observe:

# sudo tcpdump -i ens2 -vv udp and port 2732
tcpdump: listening on ens2, link-type EN10MB (Ethernet), capture size 262144 bytes
20:03:08.137728 IP (tos 0x0, ttl 48, id 0, offset 0, flags [DF], proto UDP (17), length 33)
    37-169-5-150.coucou-networks.fr.26054 > rayon-x.2732: [udp sum ok] UDP, length 5
20:03:08.207393 IP (tos 0x0, ttl 64, id 1, offset 0, flags [none], proto UDP (17), length 35)
    rayon-x.2732 > 37-169-5-150.coucou-networks.fr.26054: [udp sum ok] UDP, length 7

On the local machine, we observe:

$ sudo tcpdump -vv udp and port 2732
tcpdump: listening on enp4s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
20:03:08.125724 IP (tos 0x0, ttl 64, id 38908, offset 0, flags [DF], proto UDP (17), length 33)
    lheureduthe.lan.46395 > 12-253-47-212.instances.scw.cloud.g5m: [udp sum ok] UDP, length 5
20:03:08.246560 IP (tos 0x0, ttl 50, id 0, offset 0, flags [DF], proto UDP (17), length 35)
    12-253-47-212.instances.scw.cloud.g5m > lheureduthe.lan.46395: [udp sum ok] UDP, length 7

We see that our UDP response packet:

  • has been received even though it was sent without the DF flag
  • has been rewritten on the way: it arrives with the DF flag set (and with its IP ID changed from 1 to 0)
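To compare the flags on both sides without reading tcpdump output by eye, the DF bit can be decoded from the raw IPv4 header. This is a small sketch, not part of the original tests: `ipv4_flags` and `df_was_added` are hypothetical helper names, and the input is assumed to be the raw bytes of an IPv4 header (at least its first 8 bytes). The flags occupy the top three bits of the 16-bit word at offset 6, so scapy's `flags=0b010` corresponds to 0x4000 on the wire.

```python
import struct


def ipv4_flags(header: bytes) -> dict:
    """Decode DF, MF and the fragment offset from a raw IPv4 header."""
    (flags_frag,) = struct.unpack_from("!H", header, 6)
    return {
        "DF": bool(flags_frag & 0x4000),          # Don't Fragment
        "MF": bool(flags_frag & 0x2000),          # More Fragments
        "fragment_offset": flags_frag & 0x1FFF,   # in units of 8 bytes
    }


def df_was_added(sent_header: bytes, received_header: bytes) -> bool:
    """True if a packet left without DF and arrived with DF set."""
    return not ipv4_flags(sent_header)["DF"] and ipv4_flags(received_header)["DF"]
```

Applied to the headers of the response above as sent by the server and as received locally, `df_was_added` would be expected to return True.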

Some possible workarounds

If the problem is only one way:

If the problem is two-way:

  • Use a VPN (OpenVPN works, probably because it sets the DF bit)

Choosing XDP as the workaround

While we successfully validated the OpenVPN solution, we do not want to add this complexity to our daily setup. Moreover, as only the egress traffic (TX) is impacted and not the ingress traffic (RX), we can rewrite the packets locally. We selected XDP to rewrite the packets because it is fast and secure. Along the way, we learned three main things:

  • XDP is ingress-only; an egress patch has been proposed but is not yet merged. This means we cannot rewrite packets as they are sent over an interface, only packets that are received on an interface. Fortunately, when my computer sends a UDP packet (to a DNS server, for example), the packet leaves through the computer's Ethernet interface, is received on the br-lan interface of the router, and is then forwarded to the router's wwan interface. So the only place where we can do the rewrite is br-lan. This has two drawbacks: local packets are rewritten even though it is useless for them, and the router cannot benefit from the XDP module for its own traffic. Still, this situation seemed acceptable to us.
  • At first, we thought it would be simple, as we only want to flip one bit! But the IPv4 header carries a checksum (the IPv6 header does not), so we need to recompute it. bpf-helpers has some tools for that, but they often require an sk_buff, the internal packet structure of the kernel, and XDP runs at such a low level that this structure has not been created yet. It might still be possible to use some of these functions, but without any example, and given the rather strange interface and documentation, none of our attempts were successful. We finally reimplemented the original checksum algorithm from RFC 1071 in BPF, and it works! We know that, as we only modify part of the packet, we could do a more efficient incremental update as described in RFC 1624. We did not put in the effort to understand and implement this optimization, as the simple solution is easier for us to understand, debug and implement while providing satisfactory performance.
  • Finally, it appears that OpenWRT (in its upstream version) supports XDP out of the box, but you need a loader. Because we do not need BPF maps, we can use the loader embedded in iproute2. However, by default on OpenWRT, iproute2 is not compiled with XDP/eBPF support. To get an iproute2 with the loader, you need to select the ip-full package from OpenWRT (in the Network/Routing and Redirection section). You can find these details in the Makefile of the package.
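The two checksum strategies mentioned above can be illustrated as follows. This is a Python sketch for illustration only, not the BPF code we run on the router: the function names are hypothetical, and the example header is rebuilt from the dig packet in the first capture (ID 35598, TTL 63, length 82), assuming 9.9.9.10 as the destination address. `set_df_full` recomputes the whole header checksum in the style of RFC 1071, while `set_df_incremental` applies equation 3 of RFC 1624, HC' = ~(~HC + ~m + m'), to the single 16-bit word that changes.

```python
import socket
import struct

DF = 0x4000  # Don't Fragment bit in the 16-bit flags/fragment-offset word


def fold(total: int) -> int:
    """Fold the carries back in to get a 16-bit one's-complement sum."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(header: bytes) -> int:
    """RFC 1071 checksum over a header of even length."""
    words = struct.unpack(f"!{len(header) // 2}H", header)
    return ~fold(sum(words)) & 0xFFFF


def set_df_full(header: bytes) -> bytes:
    """Set DF, then recompute the checksum over the whole header."""
    h = bytearray(header)
    (flags_frag,) = struct.unpack_from("!H", h, 6)
    struct.pack_into("!H", h, 6, flags_frag | DF)
    struct.pack_into("!H", h, 10, 0)  # checksum field counts as zero
    struct.pack_into("!H", h, 10, internet_checksum(bytes(h)))
    return bytes(h)


def set_df_incremental(header: bytes) -> bytes:
    """Set DF, then patch the checksum using RFC 1624, equation 3."""
    h = bytearray(header)
    (old,) = struct.unpack_from("!H", h, 6)
    (hc,) = struct.unpack_from("!H", h, 10)
    new = old | DF
    total = fold((~hc & 0xFFFF) + (~old & 0xFFFF) + new)
    struct.pack_into("!H", h, 6, new)
    struct.pack_into("!H", h, 10, ~total & 0xFFFF)
    return bytes(h)


def example_header() -> bytes:
    """20-byte IPv4/UDP header without DF, with a valid checksum."""
    fields = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 82, 35598, 0x0000, 63, 17, 0,
        socket.inet_aton("192.0.0.2"), socket.inet_aton("9.9.9.10"),
    )
    checksum = struct.pack("!H", internet_checksum(fields))
    return fields[:10] + checksum + fields[12:]
```

Both functions should return the same bytes for `example_header()`. A receiver validates the result by running `internet_checksum` over the full header, checksum field included, and expecting 0. Only the IP header checksum needs updating here: the UDP checksum covers a pseudo-header made of the addresses, protocol and length, not the IP flags.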

Linked resources:

More definitive solutions

  • Find the code involved and patch it
    • I have sent an email to Simcom to ask them to patch the firmware of their modem as of 2021-04-17
  • If it is a misconfiguration from Free Mobile, inform them

➡️ The manufacturer has been contacted but did not answer