git.karo-electronics.de Git - karo-tx-linux.git/commit

author	Florian Westphal <fw@strlen.de>
	Fri, 22 May 2015 14:32:51 +0000 (16:32 +0200)
committer	David S. Miller <davem@davemloft.net>
	Wed, 27 May 2015 17:03:31 +0000 (13:03 -0400)
commit	d6b915e29f4adea94bc02ba7675bb4f84e6a1abd
tree	7e5f00b4c156f9549b8b3eeea92ce2a12dd1f0da
parent	c5501eb3406d0f88b3efb2c437c4c40b35f865d8

ip_fragment: don't forward defragmented DF packet

We currently always send fragments without DF bit set.

Thus, given the following setup:

mtu1500 - mtu1500:1400 - mtu1400:1280 - mtu1280
   A           R1             R2           B

Where R1 and R2 run Linux with netfilter defragmentation/conntrack
enabled, then if host A sends a fragmented packet _with_ DF set to B, R1
will respond with an ICMP "too big" error if any of these fragments
exceeds 1400 bytes.

However, if R1 receives fragments of sizes 1200 and 100, it forwards
the reassembled packet without refragmenting it, i.e.
R2 will send an ICMP error in response to a packet that was never sent,
citing an MTU that the original sender never exceeded.
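
The arithmetic behind this case can be sketched in plain userspace C.
This is an illustration only, not kernel code: the helper names are
hypothetical, and it assumes that 1200 and 100 are fragment payload
sizes carried behind a 20-byte IPv4 header without options.

```c
#include <stdbool.h>

#define IP_HDR_LEN 20u /* assumption: IPv4 header without options */

/* Hypothetical model of the old decision: a router forwards a
 * reassembled packet unfragmented whenever it fits the outgoing MTU. */
static bool old_forwards_whole(unsigned int pkt_len, unsigned int out_mtu)
{
	return pkt_len <= out_mtu;
}

/* Total length of a datagram rebuilt from two fragment payloads. */
static unsigned int reassembled_len(unsigned int payload_a,
				    unsigned int payload_b)
{
	return IP_HDR_LEN + payload_a + payload_b;
}
```

Under these assumptions reassembled_len(1200, 100) is 1320 bytes: small
enough for R1 to forward whole over its 1400-byte link, but larger than
the 1280-byte link behind R2.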

The other minor issue is that a refragmentation on R1 will conceal the
MTU of R2-B since refragmentation does not set the DF bit on the fragments.

This modifies ip_fragment so that we track the largest fragment size seen
both for DF and non-DF packets, and set frag_max_size to the largest
value.

If the DF fragment size is larger than or equal to the non-DF one, we will
consider the packet a path MTU probe:
We set the DF bit on the reassembled skb and also tag it with a new IPCB flag
to force refragmentation even if the skb fits the outdev MTU.

We will also set the DF bit on each fragment in this case.
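
A minimal userspace sketch of the rule described above follows. It is
not the code from this patch: struct frag_tracker, struct reasm_result
and the three helpers are hypothetical names chosen for illustration,
and the guard that requires at least one DF fragment is an assumption
added so that an empty tracker is not treated as a probe.

```c
#include <stdbool.h>

/* Hypothetical per-queue state: largest fragment seen per DF setting. */
struct frag_tracker {
	unsigned int max_df_size;    /* largest fragment with DF set */
	unsigned int max_nondf_size; /* largest fragment without DF */
};

/* Hypothetical result attached to the reassembled packet. */
struct reasm_result {
	unsigned int frag_max_size; /* largest fragment of either kind */
	bool pmtu_probe;            /* treat packet as a path MTU probe */
};

/* Record one incoming fragment. */
static void frag_tracker_add(struct frag_tracker *t, unsigned int size,
			     bool df)
{
	if (df) {
		if (size > t->max_df_size)
			t->max_df_size = size;
	} else if (size > t->max_nondf_size) {
		t->max_nondf_size = size;
	}
}

/* Evaluate the queue once all fragments have arrived. */
static struct reasm_result frag_tracker_finish(const struct frag_tracker *t)
{
	struct reasm_result r;

	r.frag_max_size = t->max_df_size > t->max_nondf_size ?
			  t->max_df_size : t->max_nondf_size;
	/* Assumption: a queue with no DF fragment is never a probe. */
	r.pmtu_probe = t->max_df_size > 0 &&
		       t->max_df_size >= t->max_nondf_size;
	return r;
}

/* Output side: a probe is refragmented even if it fits the MTU. */
static bool must_refragment(unsigned int pkt_len, unsigned int out_mtu,
			    bool pmtu_probe)
{
	return pkt_len > out_mtu || pmtu_probe;
}
```

With DF fragments of 1200 and 100 bytes, this sketch reports a
frag_max_size of 1200 and marks the packet as a probe, so it would be
refragmented with DF set on each fragment even on a 1400-byte link. If
the 1200-byte fragment arrives without DF, the packet is not a probe and
is forwarded whole when it fits.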

Joint work with Hannes Frederic Sowa.

Reported-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

include/net/inet_frag.h
include/net/ip.h
net/ipv4/ip_fragment.c
net/ipv4/ip_output.c