# The UDP/DF bug
I have observed that UDP packets sent without the Don't Fragment (DF) bit set in the IPv4 header flags
are not delivered.
The following applications are impacted:
- Multiplayer mode of Trackmania
- Discord
- Wireguard
- dig
- Rocket League
## Identifying / reproducing the problem
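Here is an example with DNS: the first request is made by dnsmasq, which sets the DF bit, and the second by dig, which does not. Both requests are sent to Quad9 DNS to resolve the `lesterpig.com` domain. The dnsmasq request is answered, while the dig request is retried with the same query ID at 5-second intervals, which suggests that no answer ever came back: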
```
# tcpdump -vv -i wwan0 'port 53 and (host 9.9.9.10 or host 149.112.112.10)'
tcpdump: listening on wwan0, link-type RAW (Raw IP), capture size 262144 bytes
17:30:39.832839 IP (tos 0x0, ttl 64, id 40453, offset 0, flags [DF], proto UDP (17), length 70)
192.0.0.2.61829 > dns10.quad9.net.53: [udp sum ok] 26437+ [1au] A? lesterpig.com. ar: . OPT UDPsize=512 (42)
17:30:39.913513 IP (tos 0x0, ttl 57, id 0, offset 0, flags [DF], proto UDP (17), length 86)
dns10.quad9.net.53 > 192.0.0.2.61829: [udp sum ok] 26437 q: A? lesterpig.com. 1/0/1 lesterpig.com. A 89.89.231.11 ar: . OPT UDPsize=512 (58)
[...]
17:30:39.913513 IP (tos 0x0, ttl 57, id 0, offset 0, flags [DF], proto UDP (17), length 86)
dns10.quad9.net.53 > 192.0.0.2.61829: [udp sum ok] 26437 q: A? lesterpig.com. 1/0/1 lesterpig.com. A 89.89.231.11 ar: . OPT UDPsize=512 (58)
[...]
17:30:49.497921 IP (tos 0x0, ttl 63, id 35598, offset 0, flags [none], proto UDP (17), length 82)
192.0.0.2.42500 > dns10.quad9.net.53: [udp sum ok] 41045+ [1au] A? lesterpig.com. ar: . OPT UDPsize=4096 (54)
[...]
17:30:54.497440 IP (tos 0x0, ttl 63, id 36723, offset 0, flags [none], proto UDP (17), length 82)
192.0.0.2.42500 > dns10.quad9.net.53: [udp sum ok] 41045+ [1au] A? lesterpig.com. ar: . OPT UDPsize=4096 (54)
[...]
17:30:59.497555 IP (tos 0x0, ttl 63, id 40590, offset 0, flags [none], proto UDP (17), length 82)
192.0.0.2.42500 > dns10.quad9.net.53: [udp sum ok] 41045+ [1au] A? lesterpig.com. ar: . OPT UDPsize=4096 (54)
46 packets captured
46 packets received by filter
0 packets dropped by kernel
```
I have observed this problem at least twice before:
- With Trackmania
- With Wireguard

I also reproduced the bug with Scapy and netcat.
First, I use netcat to listen for UDP packets on a public server:
```
nc -ul 2372
```
Then I send two packets with Scapy, one with the DF bit set and one without:
```python
from scapy.all import IP, UDP, Raw, send

# Bits of the IP flags field
# X.. Reserved
# .X. Don't Fragment (DF)
# ..X More Fragments (MF)

# DF set: the packet is delivered
send(IP(dst="212.47.253.12", flags=0b010)/UDP(sport=2373, dport=2372)/Raw(load='does ... work\n'))
# DF not set: the packet is not delivered
send(IP(dst="212.47.253.12", flags=0b000)/UDP(sport=2373, dport=2372)/Raw(load='does not work\n'))
```
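
The same two packets can probably be produced without Scapy or root privileges, using plain UDP sockets. The following is a minimal sketch, assuming a Linux sender and untested against the setup described here. It relies on the Linux-specific `IP_MTU_DISCOVER` socket option: the `IP_PMTUDISC_DO` mode sets DF on every datagram, and `IP_PMTUDISC_DONT` never sets it. The helper names `udp_socket` and `send_test_packets` are hypothetical, and the numeric fallbacks are the values from `<linux/in.h>`. Calling `send_test_packets("212.47.253.12", 2372)` would target the netcat listener above.

```python
import socket

# Linux-specific constants (values from <linux/in.h>); Python does not always
# export them, so the numeric values are used as fallbacks.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DONT = getattr(socket, "IP_PMTUDISC_DONT", 0)  # never set DF
IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)      # always set DF


def udp_socket(df: bool) -> socket.socket:
    """Create a UDP socket whose datagrams are sent with DF set (True) or cleared (False)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    mode = IP_PMTUDISC_DO if df else IP_PMTUDISC_DONT
    sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, mode)
    return sock


def send_test_packets(host: str, port: int) -> None:
    """Send one datagram with DF set and one without to host:port."""
    with udp_socket(df=True) as s:
        s.sendto(b"sent with DF\n", (host, port))
    with udp_socket(df=False) as s:
        s.sendto(b"sent without DF\n", (host, port))
```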
I did not observe this bug when I was using a TP-Link + Huawei e3372h, so it probably does not come from my ISP.
(But it *still* could, as we activated IPv6 + 5G.)
Still, one question remains unanswered for now: must the bit be set in both directions, or only in one?
Additional tests are thus required.
## Additional tests
On the server, we run a small UDP responder with Scapy that answers the first packet it receives with a reply whose DF bit is cleared:
```python
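from scapy.all import IP, UDP, Raw, send, sniff

# Answer the first UDP packet seen on port 2732 with a reply that has no IP flag set
sniff(
    filter="udp and port 2732",
    count=1,
    prn=lambda p: send(
        IP(src=p[IP].dst, dst=p[IP].src, flags=0b000)/
        UDP(sport=p[UDP].dport, dport=p[UDP].sport)/
        Raw(load='answer\n')))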
```
And on the other side, we simply use netcat:
```
nc -u rayonx.machine.deuxfleurs.fr 2732
```
On the server, we observe:
```
# sudo tcpdump -i ens2 -vv udp and port 2732
tcpdump: listening on ens2, link-type EN10MB (Ethernet), capture size 262144 bytes
20:03:08.137728 IP (tos 0x0, ttl 48, id 0, offset 0, flags [DF], proto UDP (17), length 33)
37-169-5-150.coucou-networks.fr.26054 > rayon-x.2732: [udp sum ok] UDP, length 5
20:03:08.207393 IP (tos 0x0, ttl 64, id 1, offset 0, flags [none], proto UDP (17), length 35)
rayon-x.2732 > 37-169-5-150.coucou-networks.fr.26054: [udp sum ok] UDP, length 7
```
On the local machine, we observe:
```
$ sudo tcpdump -vv udp and port 2732
tcpdump: listening on enp4s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
20:03:08.125724 IP (tos 0x0, ttl 64, id 38908, offset 0, flags [DF], proto UDP (17), length 33)
lheureduthe.lan.46395 > 12-253-47-212.instances.scw.cloud.g5m: [udp sum ok] UDP, length 5
20:03:08.246560 IP (tos 0x0, ttl 50, id 0, offset 0, flags [DF], proto UDP (17), length 35)
12-253-47-212.instances.scw.cloud.g5m > lheureduthe.lan.46395: [udp sum ok] UDP, length 7
```
We see that our UDP response:
- has been received even though it was sent without the DF flag
- has been rewritten on the way: it arrives with the DF flag set (and with its IP ID changed from 1 to 0)
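
To make the comparison between the two captures concrete, here is a small Python sketch that decodes the flags of a raw IPv4 header, which is what tcpdump prints as `[none]` or `[DF]`. The two headers are hypothetical reconstructions: only the ID and flag values come from the captures above, the addresses are documentation placeholders, and the checksum is left at zero.

```python
import struct

# Masks within the 16-bit "flags + fragment offset" word (bytes 6-7 of the IPv4 header)
IP_RF = 0x8000  # reserved bit
IP_DF = 0x4000  # Don't Fragment
IP_MF = 0x2000  # More Fragments


def ip_flags(header: bytes) -> list:
    """Return the names of the flags set in a raw IPv4 header."""
    (flags_frag,) = struct.unpack_from("!H", header, 6)
    names = []
    for mask, name in ((IP_RF, "RF"), (IP_DF, "DF"), (IP_MF, "MF")):
        if flags_frag & mask:
            names.append(name)
    return names


def make_header(ident: int, flags_frag: int) -> bytes:
    """Build a hypothetical 20-byte IPv4 header carrying UDP (checksum left at 0)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 35,           # version/IHL, TOS, total length
        ident, flags_frag,     # identification, flags + fragment offset
        64, 17, 0,             # TTL, protocol (UDP), checksum placeholder
        bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]),  # placeholder addresses
    )


sent = make_header(ident=1, flags_frag=0)          # as emitted by the server
received = make_header(ident=0, flags_frag=IP_DF)  # as captured on the local machine
print(ip_flags(sent))      # []
print(ip_flags(received))  # ['DF']
```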
## Some possible workarounds
If the problem is only one-way:
- [Add a netfilter extension to rewrite DF bits](https://github.com/semverchenko/dontfragment)
- Use netfilter queue (NFQUEUE)
- Write an XDP program

If the problem is two-way:
- Use a VPN (OpenVPN works, probably because it sets the DF bit)
## Choosing XDP as the workaround
While we successfully validated the OpenVPN solution, we do not want to add this complexity to our daily setup.
Moreover, as only egress traffic (TX) is impacted and not ingress traffic (RX), we can rewrite the packets locally.
We selected XDP to rewrite the packets because it is fast and secure.
Along the way, we learnt three main things:
- XDP is ingress-only; an egress patch has been proposed but is not yet merged. This means we cannot rewrite packets as they are sent over an interface, only packets that are received on an interface. Fortunately, when my computer sends a UDP packet (to a DNS server, for example), the packet leaves through the computer's Ethernet interface, is received on the router's `br-lan` interface, and is then forwarded to its `wwan` interface. So the only place where we can do the rewrite is on `br-lan`. This has two drawbacks: purely local packets are rewritten even though it is useless, and the router cannot benefit from the XDP module for its own traffic. Still, this situation seemed acceptable to us.
- At first, we thought it would be simple, as we only want to flip a bit! But the IPv4 header has a checksum (while the IPv6 header does not), so we need to recompute it. bpf-helpers has some tools for that, but they often require a `sk_buff`, the kernel's internal packet structure. XDP runs at such a low level that this structure has not been created yet. It might still be possible to use some of these functions, but without any example, and given the rather strange interface and documentation, none of our tests were successful. We finally reimplemented the original checksum algorithm from [RFC 1071](https://tools.ietf.org/html/rfc1071) in BPF, and it works! Since we only modify part of the packet, we know we could do a more efficient incremental update as described in [RFC 1624](https://tools.ietf.org/html/rfc1624). We did not put in the effort to understand and implement this optimization, as the simple solution is easier for us to understand, debug, and implement while providing satisfactory performance.
- Finally, it appears that OpenWRT (in its upstream version) supports XDP out of the box, but you need a loader. Because we do not need BPF maps, we can use the loader embedded in iproute2. However, by default on OpenWRT, iproute2 is not compiled with XDP/eBPF support. To get an iproute2 that embeds the loader, you need to install/select the `ip-full` package from OpenWRT (in the `Network/Routing and Redirection` section). You can find [these details in the Makefile](https://github.com/openwrt/openwrt/blob/master/package/network/utils/iproute2/Makefile) of the package.
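
The checksum handling discussed in the second point can be modelled in a few lines of Python. This is only a sketch of the two approaches, not the BPF code from the repository: `set_df_full` recomputes the whole header checksum as in RFC 1071, while `set_df_incremental` applies the RFC 1624 update `HC' = ~(~HC + ~m + m')` to the single 16-bit word that changes. The function names and the sample header (192.168.0.1 to 192.168.0.199) are hypothetical. Both functions should produce the same bytes, which the final assertion checks.

```python
import struct

IP_DF = 0x4000  # Don't Fragment bit in the 16-bit flags/fragment-offset word


def fold(total: int) -> int:
    """Fold carries back into the low 16 bits (one's complement addition)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(header: bytes) -> int:
    """RFC 1071: one's complement of the one's complement sum of the 16-bit words."""
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    return ~fold(total) & 0xFFFF


def set_df_full(header: bytes) -> bytes:
    """Set DF, then recompute the checksum over the whole header."""
    h = bytearray(header)
    (flags_frag,) = struct.unpack_from("!H", h, 6)
    struct.pack_into("!H", h, 6, flags_frag | IP_DF)
    struct.pack_into("!H", h, 10, 0)  # the checksum field counts as zero while summing
    struct.pack_into("!H", h, 10, checksum(bytes(h)))
    return bytes(h)


def set_df_incremental(header: bytes) -> bytes:
    """Set DF and patch the existing checksum using RFC 1624."""
    h = bytearray(header)
    (old_word,) = struct.unpack_from("!H", h, 6)
    (old_sum,) = struct.unpack_from("!H", h, 10)
    new_word = old_word | IP_DF
    total = fold((~old_sum & 0xFFFF) + (~old_word & 0xFFFF) + new_word)
    struct.pack_into("!H", h, 6, new_word)
    struct.pack_into("!H", h, 10, ~total & 0xFFFF)
    return bytes(h)


# Hypothetical 20-byte IPv4 header without DF, given a valid checksum first
original = bytearray.fromhex("450000730000000040110000c0a80001c0a800c7")
struct.pack_into("!H", original, 10, checksum(bytes(original)))
original = bytes(original)

fixed = set_df_full(original)
assert fixed == set_df_incremental(original)
assert checksum(fixed) == 0  # a header with a valid checksum sums to zero
```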

Linked resources:
- [The code + instructions to compile and load](../xdp)
- [The compiled file](../files/var/lib/xdp/xdp_udp.o)
## More definitive solutions
- Find the code involved and patch it
- As of 2021-04-17, I have sent an email to Simcom asking them to patch the firmware of their modem
- If it is a misconfiguration from Free Mobile, inform them

➡️ The manufacturer has been contacted but has not answered.