A world regulatory domain check was in place that
prevents user dynamic regulatory hints from being
processed. This was there for historical reasons
as this was only possible previously for world
roaming cards and dynamic regulatory settings was
only possible for country IEs. Fix this by enforcing
the world regulatory domain check only for when the
initiator is a country IE. Support for dynamic user
regulatory support is already checked.
Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johan Hedberg [Tue, 5 Nov 2013 09:30:39 +0000 (11:30 +0200)]
Bluetooth: Fix rejecting SMP security request in slave role
The SMP security request is for a slave role device to request the
master role device to initiate a pairing request. If we receive this
command while we're in the slave role we should reject it appropriately.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Seung-Woo Kim [Tue, 5 Nov 2013 09:46:33 +0000 (18:46 +0900)]
Bluetooth: Fix crash in l2cap_chan_send after l2cap_chan_del
Removing a bond and disconnecting from a specific remote device
can cause l2cap_chan_send() is called after l2cap_chan_del() is
called. This causes following crash.
Seung-Woo Kim [Tue, 5 Nov 2013 07:02:24 +0000 (16:02 +0900)]
Bluetooth: Fix RFCOMM bind fail for L2CAP sock
L2CAP socket bind checks its bdaddr type but RFCOMM kernel thread
does not assign proper bdaddr type for L2CAP sock. This can cause
that RFCOMM failure.
Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
0x1313 is in rfcomm_sock_getsockopt (net/bluetooth/rfcomm/sock.c:743).
738
739 static int rfcomm_sock_getsockopt_old(struct socket *sock, int optname, char __user *optval, int __user *optlen)
740 {
741 struct sock *sk = sock->sk;
742 struct rfcomm_conninfo cinfo;
743 struct l2cap_conn *conn = l2cap_pi(sk)->chan->conn;
744 int len, err = 0;
745 u32 opt;
746
747 BT_DBG("sk %p", sk);
The l2cap_pi(sk) is wrong here since it should have been rfcomm_pi(sk),
but that socket of course does not contain the low-level connection
details requested here.
Tracking down the actual offending commit, it seems that this has been
introduced when doing some L2CAP refactoring:
The l2cap_sk got accidentally mixed into the sk (which is RFCOMM) and
now causing a problem within getsocketopt() system call. To fix this,
just re-introduce l2cap_sk and make sure the right socket is used for
the low-level connection details.
Reported-by: Fabio Rossi <rossi.f@inwind.it> Reported-by: Janusz Dziedzic <janusz.dziedzic@gmail.com> Tested-by: Janusz Dziedzic <janusz.dziedzic@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Dan Williams [Fri, 8 Nov 2013 19:39:44 +0000 (13:39 -0600)]
prism54: set netdev type to "wlan"
Userspace uses the netdev devtype for stuff like device naming and type
detection. Be nice and set it. Remove the pointless #if/#endif around
SET_NETDEV_DEV too.
Signed-off-by: Dan Williams <dcbw@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>
Currently, the PLL is turned off for AR9485 when
switching to a low power state, but AR9485 has an issue
where the card will become unresponsive if left idle
for a long time without any traffic. To fix this,
force the PLL to always be on using a different initval
array, ar9485_1_1_pll_on_cdr_on_clkreq_disable_L1.
This is done for most of the AR9485 based cards
like HB125, WB225 etc. but certain models require the
feature to be turned off. Identify such cards and use
default values for them.
Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Dan Carpenter [Wed, 6 Nov 2013 07:41:28 +0000 (10:41 +0300)]
wcn36xx: harmless memory corruption bug in debugfs
On 64 bit systems we write past the end of the arg[] array.
Fixes: 8e84c2582169 ('wcn36xx: mac80211 driver for Qualcomm WCN3660/WCN3680 hardware') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Avinash Patil [Tue, 5 Nov 2013 23:01:44 +0000 (15:01 -0800)]
mwifiex: correct packet length for packets from SDIO interface
While receiving a packet on SDIO interface, we allocate skb with
size multiple of SDIO block size. We need to resize this skb
after RX using packet length from RX header.
Cc: <stable@vger.kernel.org> Signed-off-by: Avinash Patil <patila@marvell.com> Signed-off-by: Bing Zhao <bzhao@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Larry Finger [Tue, 5 Nov 2013 21:15:30 +0000 (15:15 -0600)]
rtlwifi: rtl8192de: Fix incorrect signal strength for unassociated AP
The routine that processes received frames was returning the RSSI value for the
signal strength; however, that value is available only for associated APs. As
a result, the strength was the absurd value of 10 dBm. As a result, scans
return incorrect values for the strength, which causes unwanted attempts to roam.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Stable <stable@vger.kernel.org> [3.1+] Signed-off-by: John W. Linville <linville@tuxdriver.com>
Larry Finger [Tue, 5 Nov 2013 21:15:29 +0000 (15:15 -0600)]
rtlwifi: rtl8192cu: Fix incorrect signal strength for unassociated AP
The routine that processes received frames was returning the RSSI value for the
signal strength; however, that value is available only for associated APs. As
a result, the strength was the absurd value of 10 dBm. As a result, scans
return incorrect values for the strength, which causes unwanted attempts to roam.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Stable <stable@vger.kernel.org> [2.6.39+] Signed-off-by: John W. Linville <linville@tuxdriver.com>
Larry Finger [Tue, 5 Nov 2013 21:15:28 +0000 (15:15 -0600)]
rtlwifi: rtl8192se: Fix incorrect signal strength for unassociated AP
The routine that processes received frames was returning the RSSI value for the
signal strength; however, that value is available only for associated APs. As
a result, the strength was the absurd value of 10 dBm. As a result, scans
return incorrect values for the strength, which causes unwanted attempts to roam.
This patch fixes https://bugzilla.kernel.org/show_bug.cgi?id=63881.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Reported-by: Matthieu Baerts <matttbe@gmail.com> Cc: Stable <stable@vger.kernel.org> [3.0 +] Signed-off-by: John W. Linville <linville@tuxdriver.com>
rtlwifi: Fix endian error in extracting packet type
All of the rtlwifi drivers have an error in the routine that tests if
the data is "special". If it is, the subsequant transmission will be
at the lowest rate to enhance reliability. The 16-bit quantity is
big-endian, but was being extracted in native CPU mode. One of the
effects of this bug is to inhibit association under some conditions
as the TX rate is too high.
Based on suggestions by Joe Perches, the entire routine is rewritten.
One of the local headers contained duplicates of some of the ETH_P_XXX
definitions. These are deleted.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Cc: Stable <stable@vger.kernel.org> [2.6.38+] Signed-off-by: John W. Linville <linville@tuxdriver.com>
Janusz Dziedzic [Fri, 1 Nov 2013 20:05:28 +0000 (21:05 +0100)]
ath9k: dfs_debug fix possible NULL dereference
Fix possible NULL (sc->dfs_detector) pointer dereference.
Detected by Smatch:
drivers/net/wireless/ath/ath9k/dfs_debug.c:67 read_file_dfs()
error: we previously assumed 'sc->dfs_detector' could be null (see line 47)
Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Wei Yongjun [Wed, 30 Oct 2013 05:22:34 +0000 (13:22 +0800)]
libertas: fix error return code in if_cs_probe()
Fix to return -ENODEV in the unknown model error handling
case instead of 0, as done elsewhere in this function.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Dan Williams <dcbw@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Dan Carpenter [Wed, 30 Oct 2013 17:12:51 +0000 (20:12 +0300)]
libertas: potential oops in debugfs
If we do a zero size allocation then it will oops. Also we can't be
sure the user passes us a NUL terminated string so I've added a
terminator.
This code can only be triggered by root.
Reported-by: Nico Golde <nico@ngolde.de> Reported-by: Fabian Yamaguchi <fabs@goesec.de> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Dan Williams <dcbw@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Colin Ian King [Mon, 28 Oct 2013 12:58:12 +0000 (12:58 +0000)]
rtlwifi: fix null dereference on efuse_word on kmalloc fail returns NULL
kmalloc on efuse_word can return null, leading to free'ing of
elements in efuse_word on the error exit path even though it has not
been allocated. Instead, don't free the elements of efuse_word if
kmalloc failed.
Also, kmalloc of any of the arrays in efuse_word[] can also fail,
leading to undefined contents in the remaining elements leading to
problems when free'ing these elements later on. So kzalloc efuse_word
to ensure the kfree on the remaining elements won't cause breakage.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Amitkumar Karwar [Tue, 22 Oct 2013 22:24:45 +0000 (15:24 -0700)]
mwifiex: fix invalid memory access in mwifiex_ret_tx_rate_cfg()
As tlv_buf_len is decremented at the end of the loop, we may have
accessed invalid memory in the last iteration.
Modify the while condition and add a break statement at the
begining of the loop to fix the problem.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Bing Zhao <bzhao@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Amitkumar Karwar [Tue, 22 Oct 2013 22:24:44 +0000 (15:24 -0700)]
mwifiex: fix invalid memory access in mwifiex_get_power_level()
With "while (length)" check we may end up in accessing invalid
memory in last iteration.
This patch makes sure that tlv length is not less than the length
of structure mwifiex_power_group when min/max power is calculated.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Bing Zhao <bzhao@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Amitkumar Karwar [Tue, 22 Oct 2013 22:24:43 +0000 (15:24 -0700)]
mwifiex: replace u16 with __le16 in struct mwifiex_types_power_group
__le16 to u16 conversion is missing for "pg_tlv_hdr->length"
in mwifiex_get_power_level(). This creates a problem on big
endian machines.
It is resolved by changing definition of the structure
and making required endianness changes.
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Bing Zhao <bzhao@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Felipe Pena [Sat, 19 Oct 2013 00:52:40 +0000 (21:52 -0300)]
rtlwifi: rtl8192se: Fix wrong assignment
There is a typo in the struct member name on assignment when checking
rtlphy->current_chan_bw == HT_CHANNEL_WIDTH_20_40, the check uses pwrgroup_ht40
for bound limit and uses pwrgroup_ht20 when assigning instead.
Signed-off-by: Felipe Pena <felipensp@gmail.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: stable@vger.kernel.org [3.0+] Signed-off-by: John W. Linville <linville@tuxdriver.com>
Felipe Pena [Sat, 19 Oct 2013 00:20:42 +0000 (21:20 -0300)]
wireless: rt2800lib: Fix typo on checking
On rt2800_config_channel_rf53xx function the member default_power1 is checked
for bound limit, but default_power2 is used instead.
Signed-off-by: Felipe Pena <felipensp@gmail.com> Acked-by: Gertjan van Wingerde <gwingerde@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh
Fixes a suspicious rcu derference warning.
Cc: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jacob Keller [Sat, 9 Nov 2013 12:52:32 +0000 (04:52 -0800)]
ixgbe: add warning when max_vfs is out of range.
The max_vfs parameter has a limit of 63 and silently fails (adding 0 vfs) when
it is out of range. This patch adds a warning so that the user knows something
went wrong. Also, this patch moves the warning in ixgbe_enable_sriov() to where
max_vfs is checked, so that even an out of range value will show the deprecated
warning. Previously, an out of range parameter didn't even warn the user to use
the new sysfs interface instead.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Carolyn Wyborny [Sat, 9 Nov 2013 12:52:14 +0000 (04:52 -0800)]
igb: Update link modes display in ethtool
This patch fixes multiple problems in the link modes display in ethtool.
Newer parts have more complicated methods to determine actual link
capabilities. Older parts cannot communicate with their SFP modules.
Finally, all the available defines are not displayed by ethtool. This
updates the link modes to be as accurate as possible depending on what data
is available to the driver at any given time.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
and on HOSTB you do:
ping6 HOSTA -s2000 (MTU is 1500)
Incoming echo requests will be filtered out on HOSTA. This issue does
not occur with smaller packets than MTU (where fragmentation does not happen)
</example>
As was discussed previously, the only correct solution seems to be to use
reassembled skb instead of separete frags. Doing this has positive side
effects in reducing sk_buff by one pointer (nfct_reasm) and also the reams
dances in ipvs and conntrack can be removed.
Future plan is to remove net/ipv6/netfilter/nf_conntrack_reasm.c
entirely and use code in net/ipv6/reassembly.c instead.
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
MAINTAINERS: mv643xx_eth: take over maintainership from Lennart
Lennart says: "I haven't been able to spend time on mv643xx_eth for a
while now, so if you want to take over maintainership, I'd be fine with
that."
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Acked-by: Lennert Buytenhek <buytenh@wantstofly.org> Acked-by: Jason Cooper <jason@lakedaemon.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Fri, 8 Nov 2013 02:23:34 +0000 (10:23 +0800)]
net_sched: tbf: support of 64bit rates
With psched_ratecfg_precompute(), tbf can deal with 64bit rates.
Add two new attributes so that tc can use them to break the 32bit
limit.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Suggested-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Fri, 8 Nov 2013 08:51:10 +0000 (00:51 -0800)]
ixgbe: deleting dfwd stations out of order can cause null ptr deref
The number of stations in use is kept in the num_rx_pools counter
in the ixgbe_adapter structure. This is in turn used by the queue
allocation scheme to determine how many queues are needed to support
the number of pools in use with the current feature set.
This works as long as the pools are added and destroyed in order
because (num_rx_pools * queues_per_pool) is equal to the last
queue in use by a pool. But as soon as you delete a pool out of
order this is no longer the case. So the above multiplication
allocates to few queues and a pool may reference a ring that has
not been allocated/initialized.
To resolve use the bit mask of in use pools to determine the final
pool being used and allocate enough queues so that we don't
inadvertently remove its queues.
# ip link add link eth2 \
numtxqueues 4 numrxqueues 4 txqueuelen 50 type macvlan
# ip link set dev macvlan0 up
# ip link add link eth2 \
numtxqueues 4 numrxqueues 4 txqueuelen 50 type macvlan
# ip link set dev macvlan1 up
# for i in {0..100}; do
ip link set dev macvlan0 down; ip link set dev macvlan0 up;
done;
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Fri, 8 Nov 2013 08:50:32 +0000 (00:50 -0800)]
ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS
In the recent support for layer 2 hardware acceleration, I added a
few references to real_num_rx_queues and num_rx_queues which are
only available with CONFIG_RPS.
The fix is first to remove unnecessary references to num_rx_queues.
Because the hardware offload case is limited to cases where RX queues
and TX queues are equal we only need a single check. Then wrap the
single case in an ifdef.
net: Add layer 2 hardware acceleration operations for macvlan devices
Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Duan Jiong [Fri, 8 Nov 2013 01:56:53 +0000 (09:56 +0800)]
ipv6: use rt6_get_dflt_router to get default router in rt6_route_rcv
As the rfc 4191 said, the Router Preference and Lifetime values in a
::/0 Route Information Option should override the preference and lifetime
values in the Router Advertisement header. But when the kernel deals with
a ::/0 Route Information Option, the rt6_get_route_info() always return
NULL, that means that overriding will not happen, because those default
routers were added without flag RTF_ROUTEINFO in rt6_add_dflt_router().
In order to deal with that condition, we should call rt6_get_dflt_router
when the prefix length is 0.
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Benc [Thu, 7 Nov 2013 18:59:19 +0000 (19:59 +0100)]
nfnetlink: do not ack malformed messages
Commit 0628b123c96d ("netfilter: nfnetlink: add batch support and use it
from nf_tables") introduced a bug leading to various crashes in netlink_ack
when netlink message with invalid nlmsg_len was sent by an unprivileged
user.
Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When trying to delete a table >= 256 using iproute2 the local table
will be deleted.
The table id is specified as a netlink attribute when it needs more then
8 bits and iproute2 then sets the table field to RT_TABLE_UNSPEC (0).
Preconditions to matching the table id in the rule delete code
doesn't seem to take the "table id in netlink attribute" into condition
so the frh_get_table helper function never gets to do its job when
matching against current rule.
Use the helper function twice instead of peaking at the table value directly.
Originally reported at: http://bugs.debian.org/724783
Reported-by: Nicolas HICHER <nhicher@avencall.com> Signed-off-by: Andreas Henriksson <andreas@fatal.se> Signed-off-by: David S. Miller <davem@davemloft.net>
Florent Fourcot [Thu, 7 Nov 2013 16:53:14 +0000 (17:53 +0100)]
ipv6: protect flow label renew against GC
Take ip6_fl_lock before to read and update
a label.
v2: protect only the relevant code
Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Florent Fourcot [Thu, 7 Nov 2013 16:53:13 +0000 (17:53 +0100)]
ipv6: increase maximum lifetime of flow labels
If the last RFC 6437 does not give any constraints
for lifetime of flow labels, the previous RFC 3697
spoke of a minimum of 120 seconds between
reattribution of a flow label.
The maximum linger is currently set to 60 seconds
and does not allow this configuration without
CAP_NET_ADMIN right.
This patch increase the maximum linger to 150
seconds, allowing more flexibility to standard
users.
Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Florent Fourcot [Thu, 7 Nov 2013 16:53:12 +0000 (17:53 +0100)]
ipv6: enable IPV6_FLOWLABEL_MGR for getsockopt
It is already possible to set/put/renew a label
with IPV6_FLOWLABEL_MGR and setsockopt. This patch
add the possibility to get information about this
label (current value, time before expiration, etc).
It helps application to take decision for a renew
or a release of the label.
v2:
* Add spin_lock to prevent race condition
* return -ENOENT if no result found
* check if flr_action is GET
v3:
* move the spin_lock to protect only the
relevant code
Signed-off-by: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Nov 2013 18:15:39 +0000 (13:15 -0500)]
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next
John W. Linville says:
====================
Here is one more pull request for the 3.13 window. This is primarily
composed of downstream pull requests that were posted while I was
traveling during the last part of the 3.12 release.
For the mac80211 bits, Johannes says:
"I have two DFS fixes (ath9k already supports DFS) and a fix for a
pointer race."
And...
"In this round for mac80211-next I have:
* mesh channel switch support
* a CCM rewrite, using potential hardware offloads
* SMPS for AP mode
* RF-kill GPIO driver updates to make it usable as an ACPI driver
* regulatory improvements
* documentation fixes
* DFS for IBSS mode
* and a few small other fixes/improvements"
For the TI driver bits, Luca says:
"Some patches intended for 3.13. Eliad continues upstreaming pending
patches from the internal tree."
For the iwlwifi bits, Emmanuel says:
"There are a few fixes from Johannes mostly clean up patches. We have
also a few other fixes that are relevant for the new firmware that has
not been released yet."
For the Bluetooth bits, Gustavo says:
"A last fix to the 3.12. I ended forgetting to send it before, I hope we can
still make the way to 3.12. It is a revert and it fixes an issue with bluetooth
suspend/hibernate that had many bug reports. Please pull or let me know of any
problems. Thanks!" (Obviously, that one didn't make 3.12...)
Also...
"One more big pull request for 3.13. These are the patches we queued during
last week. Here you will find a lot of improvements to the HCI and L2CAP and
MGMT layers with the main ones being a better debugfs support and end of work
of splitting L2CAP into Core and Socket parts."
Additionally, there is one ath9k patch to enable DFS in IBSS mode for
that driver.
I appreciate your consideration for taking this extra pull request
this cycle. Please let me know if there are problems!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 8 Nov 2013 02:32:06 +0000 (18:32 -0800)]
inet: fix a UFO regression
While testing virtio_net and skb_segment() changes, Hannes reported
that UFO was sending wrong frames.
It appears this was introduced by a recent commit : 8c3a897bfab1 ("inet: restore gso for vxlan")
The old condition to perform IP frag was :
tunnel = !!skb->encapsulation;
...
if (!tunnel && proto == IPPROTO_UDP) {
So the new one should be :
udpfrag = !skb->encapsulation && proto == IPPROTO_UDP;
...
if (udpfrag) {
Initialization of udpfrag must be done before call
to ops->callbacks.gso_segment(skb, features), as
skb_udp_tunnel_segment() clears skb->encapsulation
(We want udpfrag to be true for UFO, false for VXLAN)
With help from Alexei Starovoitov
Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This series moves pskb_put() to the core code, making the code
duplication in caif obsolete (patches 1 and 2).
Patch 3 fixes a few kernel-doc issues.
v2 of this series does no longer contain the skb_cow_data() patch and
therefore no performance improvements for IPsec. The change is still
under discussion, but otherwise independent from the above changes.
Please apply!
v2:
- kernel-doc fixes for pskb_put, as noticed by Ben
- dropped skb_cow_data patch as it's still discussed
- added a kernel-doc fixes patch (patch 3)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Mathias Krause [Thu, 7 Nov 2013 13:18:26 +0000 (14:18 +0100)]
net: skbuff - kernel-doc fixes
Use "@" to refer to parameters in the kernel-doc description. According
to Documentation/kernel-doc-nano-HOWTO.txt "&" shall be used to refer to
structures only.
Signed-off-by: Mathias Krause <mathias.krause@secunet.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Mathias Krause [Thu, 7 Nov 2013 13:18:25 +0000 (14:18 +0100)]
caif: use pskb_put() instead of reimplementing its functionality
Also remove the warning for fragmented packets -- skb_cow_data() will
linearize the buffer, removing all fragments.
Signed-off-by: Mathias Krause <mathias.krause@secunet.com> Cc: Dmitry Tarnyagin <dmitry.tarnyagin@lockless.no> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Mathias Krause [Thu, 7 Nov 2013 13:18:24 +0000 (14:18 +0100)]
net: move pskb_put() to core code
This function has usage beside IPsec so move it to the core skbuff code.
While doing so, give it some documentation and change its return type to
'unsigned char *' to be in line with skb_put().
Signed-off-by: Mathias Krause <mathias.krause@secunet.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
net: calxedaxgmac: Fix panic caused by MTU change of active interface
Changing MTU size of an xgmac network interface while it is active can
cause a panic like
skbuff: skb_over_panic: text:c03bc62c len:1090 put:1090 head:edfb6900 data:edfb6942 tail:0xedfb6d84 end:0xedfb6bc0 dev:eth0
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:126!
Internal error: Oops - BUG: 0 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 762 Comm: python Tainted: G W 3.10.0-00015-g3e33cd7 #309
task: edcfe000 ti: ed67e000 task.ti: ed67e000
PC is at skb_panic+0x64/0x70
LR is at wake_up_klogd+0x5c/0x68
This happens because xgmac_change_mtu modifies dev->mtu before the
network interface is quiesced. And thus there still might be buffers
in use which have a buffer size based on the old MTU.
To fix this I moved the change of dev->mtu after the call to
xgmac_stop.
Another modification is required (in xgmac_stop) to ensure that
xgmac_xmit is really not called anymore (xgmac_tx_complete might wake
up the queue again).
I've tested the fix by switching MTU size every second between 600 and
1500 while network traffic was going on. The test box survived a test
of several hours (until I've stopped it) whereas w/o this fix above
panic occurs after several minutes (at most).
Change since v1:
- remove call to netif_stop_queue at beginning of xgmac_stop
- use netif_tx_disable instead of locking+netif_stop_queue
Signed-off-by: Andreas Herrmann <andreas.herrmann@calxeda.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patchset contains some enhancements and bug fixes for the mlx4_* drivers.
Patchset was applied and tested against commit: "9bb8ca8 virtio-net: switch to
use XPS to choose txq"
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
net/mlx4_en: Datapath structures are allocated per NUMA node
For each RX/TX ring and its CQ, allocation is done on a NUMA node that
corresponds to the core that the data structure should operate on.
The assumption is that the core number is reflected by the ring index.
The affected allocations are the ring/CQ data structures,
the TX/RX info and the shared HW/SW buffer.
For TX rings, each core has rings of all UPs.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/mlx4_core: ICM pages are allocated on device NUMA node
This is done to optimize FW/HW access to host memory.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Currently all TX/RX rings and completion queues are part of the
netdev priv structure and are allocated statically. This patch
will change the priv to hold only arrays of pointers and therefore
all TX/RX rings and completetion queues will be allocated
dynamically. This is in preparation for NUMA aware allocations.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rony Efraim [Thu, 7 Nov 2013 10:19:51 +0000 (12:19 +0200)]
net/mlx4_core: Add immediate activate for VGT->VST->VGT
Allow immediate activate of VGT->VST and VST->VGT transitions, without
the need of rebinding in mlx4_master_immediate_activate_vlan_qos().
Also in struct res_qp: add qp parameters (vlan_index,fvl,vlan_cntrol..)
to the saved set, in order to restore when move to VGT.
- Clear at mlx4_RST2INIT_QP_wrapper()
- Save at mlx4_INIT2RTR_QP_wrapper()
- Restore at mlx4_vf_immed_vlan_work_handler()
Update mlx4_vf_immed_vlan_work_handler() to support VGT.
Signed-off-by: Rony Efraim <ronye@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/mlx4_core: Initialize all mailbox buffers to zero before use
To guarantee that all unused fields in all FW commands for both inboxes
and outboxes are zeroed out, initialize the mailbox buffer to all zeroes.
This is especially important for SRIOV comm-channel virtual commands
(such as QUERY_FUNC_CAP), where if new fields are added to support new
features, the driver can depend on older kernels passing zeroes in these
fields.
In addition to zeroing out the mailbox buffer at allocation time, all
(now unnecessary) calls to memset by the callers of
mlx4_alloc_cmd_mailbox() are removed.
Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support to offload macvlan net_devices to the
hardware. With these patches packets are pushed to the macvlan
net_device directly and do not pass through the lower dev.
The patches here have made it through multiple iterations
each with a slightly different focus. First I tried to
push these as a new link type called "VMDQ". The patches
shown here,
Following this implementation I renamed the link type
"VSI" and addressed various comments. Finally Neil
Horman picked up the patches and integrated the offload
into the macvlan code. Here,
The attached series is clean-up of his patches, with a
few fixes.
If folks find this series acceptable there are a few
items we can work on next. First broadcast and multicast
will use the hardware even for local traffic with this
series. It would be best (I think) to use the software
path for macvlan to macvlan traffic and save the PCIe
bus. This depends on how much you value CPU time vs
PCIE bandwidth. This will need another patch series
to flush out.
Also this series only allows for layer 2 mac forwarding
where some hardware supports more interesting forwarding
capabilities. Integrating with OVS may be useful here.
As always any comments/feedback welcome.
My basic I/O test is here but I've also done some link
testing, SRIOV/DCB with macvlans and others,
Changelog:
v2: two fixes to ixgbe when all features DCB, FCoE, SR-IOV
are enabled with macvlans. A VMDQ_P() reference
should have been accel->pool and do not set the offset
of the ring index from dfwd add call. The offset is used
by SR-IOV so clearing it can cause SR-IOV quue index's
to go sideways. With these fixes testing macvlan's with
SRIOV enabled was successful.
v3: addressed Neil's comments in ixgbe
fixed error path on dfwd_add_station() in ixgbe
fixed ixgbe to allow SRIOV and accelerated macvlans to
coexist.
v4: Dave caught some strange indentation, fixed it here
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Wed, 6 Nov 2013 17:54:52 +0000 (09:54 -0800)]
ixgbe: enable l2 forwarding acceleration for macvlans
Now that l2 acceleration ops are in place from the prior patch,
enable ixgbe to take advantage of these operations. Allow it to
allocate queues for a macvlan so that when we transmit a frame,
we can do the switching in hardware inside the ixgbe card, rather
than in software.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Wed, 6 Nov 2013 17:54:46 +0000 (09:54 -0800)]
net: Add layer 2 hardware acceleration operations for macvlan devices
Add a operations structure that allows a network interface to export
the fact that it supports package forwarding in hardware between
physical interfaces and other mac layer devices assigned to it (such
as macvlans). This operaions structure can be used by virtual mac
devices to bypass software switching so that forwarding can be done
in hardware more efficiently.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 7 Nov 2013 07:44:45 +0000 (10:44 +0300)]
6lowpan: release device on error path
We recently added a new error path and it needs a dev_put().
Fixes: 7adac1ec8198 ('6lowpan: Only make 6lowpan links to IEEE802154 devices') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eyal Perry [Wed, 6 Nov 2013 13:37:24 +0000 (15:37 +0200)]
RDMA/cma: Set IBoE SL (user-priority) by egress map when using vlans
On top of commit 366cddb40 "IB/rdma_cm: TOS <=> UP mapping for IBoE", add
support for case vlan egress map is used.
When the IBoE session is being set over a vlan, inherit the socket priority
to vlan priority mapping which was configured for the vlan device egress map.
Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eyal Perry [Wed, 6 Nov 2013 13:37:23 +0000 (15:37 +0200)]
net/vlan: Provide read access to the vlan egress map
Provide a method for read-only access to the vlan device egress mapping.
Do this by refactoring vlan_dev_get_egress_qos_mask() such that now it
receives as an argument the skb priority instead of pointer to the skb.
Such an access is needed for the IBoE stack where the control plane
goes through the network stack. This is an add-on step on top of commit d4a968658c "net/route: export symbol ip_tos2prio" which allowed the RDMA-CM
to use ip_tos2prio.
Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Vecera [Wed, 6 Nov 2013 13:02:36 +0000 (14:02 +0100)]
tg3: avoid double-freeing of rx data memory
If build_skb fails the memory associated with the ring buffer is freed but
the ri->data member is not zeroed in this case. This causes a double-free
of this memory in tg3_free_rings->... path. The patch moves this block after
setting ri->data to NULL.
It would be nice to fix this bug also in stable >= v3.4 trees.
Cc: Nithin Nayak Sujir <nsujir@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 7 Nov 2013 23:30:35 +0000 (18:30 -0500)]
Merge branch 'tipc_fragmentation'
Erik Hugne says:
====================
tipc: message reassembly using fragment chain
We introduce a new reassembly algorithm that improves performance
and eliminates the risk of causing out-of-memory situations.
v3: -Use skb_try_coalesce, and revert to fraglist if this does not succeed.
-Make sure reassembly list head is uncloned.
v2: -Rebased on Ying's indentation fix.
-Node unlock call in msg_fragmenter case moved from patch #2 to #1.
('continue' with this lock held would cause spinlock recursion if only
patch #1 is used)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Erik Hugne [Wed, 6 Nov 2013 08:28:07 +0000 (09:28 +0100)]
tipc: reassembly failures should cause link reset
If appending a received fragment to the pending fragment chain
in a unicast link fails, the current code tries to force a retransmission
of the fragment by decrementing the 'next received sequence number'
field in the link. This is done under the assumption that the failure
is caused by an out-of-memory situation, an assumption that does
not hold true after the previous patch in this series.
A failure to append a fragment can now only be caused by a protocol
violation by the sending peer, and it must hence be assumed that it
is either malicious or buggy. Either way, the correct behavior is now
to reset the link instead of trying to revert its sequence number.
So, this is what we do in this commit.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Erik Hugne [Wed, 6 Nov 2013 08:28:06 +0000 (09:28 +0100)]
tipc: message reassembly using fragment chain
When the first fragment of a long data data message is received on a link, a
reassembly buffer large enough to hold the data from this and all subsequent
fragments of the message is allocated. The payload of each new fragment is
copied into this buffer upon arrival. When the last fragment is received, the
reassembled message is delivered upwards to the port/socket layer.
Not only is this an inefficient approach, but it may also cause bursts of
reassembly failures in low memory situations. since we may fail to allocate
the necessary large buffer in the first place. Furthermore, after 100 subsequent
such failures the link will be reset, something that in reality aggravates the
situation.
To remedy this problem, this patch introduces a different approach. Instead of
allocating a big reassembly buffer, we now append the arriving fragments
to a reassembly chain on the link, and deliver the whole chain up to the
socket layer once the last fragment has been received. This is safe because
the retransmission layer of a TIPC link always delivers packets in strict
uninterrupted order, to the reassembly layer as to all other upper layers.
Hence there can never be more than one fragment chain pending reassembly at
any given time in a link, and we can trust (but still verify) that the
fragments will be chained up in the correct order.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Erik Hugne [Wed, 6 Nov 2013 08:28:05 +0000 (09:28 +0100)]
tipc: don't reroute message fragments
When a message fragment is received in a broadcast or unicast link,
the reception code will append the fragment payload to a big reassembly
buffer through a call to the function tipc_recv_fragm(). However, after
the return of that call, the logics goes on and passes the fragment
buffer to the function tipc_net_route_msg(), which will simply drop it.
This behavior is a remnant from the now obsolete multi-cluster
functionality, and has no relevance in the current code base.
Although currently harmless, this unnecessary call would be fatal
after applying the next patch in this series, which introduces
a completely new reassembly algorithm. So we change the code to
eliminate the redundant call.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
bonding: extend round-robin mode with packets_per_slave
This patch aims to extend round-robin mode with a new option called
packets_per_slave which can have the following values and effects:
0 - choose a random slave
1 (default) - standard round-robin, 1 packet per slave
>1 - round-robin when >1 packets have been transmitted per slave
The allowed values are between 0 and 65535.
This patch also fixes the comment style in bond_xmit_roundrobin().
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Tue, 5 Nov 2013 10:19:45 +0000 (18:19 +0800)]
virtio-net: switch to use XPS to choose txq
We used to use a percpu structure vq_index to record the cpu to queue
mapping, this is suboptimal since it duplicates the work of XPS and
loses all other XPS functionality such as allowing user to configure
their own transmission steering strategy.
So this patch switches to use XPS and suggest a default mapping when
the number of cpus is equal to the number of queues. With XPS support,
there's no need for keeping per-cpu vq_index and .ndo_select_queue(),
so they were removed also.
Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michael S. Tsirkin <mst@redhat.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Duan Jiong [Tue, 5 Nov 2013 05:34:53 +0000 (13:34 +0800)]
ipv6: drop the judgement in rt6_alloc_cow()
Now rt6_alloc_cow() is only called by ip6_pol_route() when
rt->rt6i_flags doesn't contain both RTF_NONEXTHOP and RTF_GATEWAY,
and rt->rt6i_flags hasn't been changed in ip6_rt_copy().
So there is no neccessary to judge whether rt->rt6i_flags contains
RTF_GATEWAY or not.
Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
ipv6: fix headroom calculation in udp6_ufo_fragment
Commit 1e2bd517c108816220f262d7954b697af03b5f9c ("udp6: Fix udp
fragmentation for tunnel traffic.") changed the calculation if
there is enough space to include a fragment header in the skb from a
skb->mac_header dervived one to skb_headroom. Because we already peeled
off the skb to transport_header this is wrong. Change this back to check
if we have enough room before the mac_header.
This fixes a panic Saran Neti reported. He used the tbf scheduler which
skb_gso_segments the skb. The offsets get negative and we panic in memcpy
because the skb was erroneously not expanded at the head.
Reported-by: Saran Neti <Saran.Neti@telus.com> Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Gunthorpe [Tue, 5 Nov 2013 00:27:19 +0000 (17:27 -0700)]
net: mv643xx_eth: Add missing phy_addr_set in DT mode
Commit cc9d4598 'net: mv643xx_eth: use of_phy_connect if phy_node
present' made the call to phy_scan optional, if the DT has a link to
the phy node.
However phy_scan has the side effect of calling phy_addr_set, which
writes the phy MDIO address to the ethernet controller. If phy_addr_set
is not called, and the bootloader has not set the correct address then
the driver will fail to function.
Tested on Kirkwood.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Tested-by: Arnaud Ebalard <arno@natisbad.org> Signed-off-by: David S. Miller <davem@davemloft.net>
ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE
Sockets marked with IP_PMTUDISC_INTERFACE won't do path mtu discovery,
their sockets won't accept and install new path mtu information and they
will always use the interface mtu for outgoing packets. It is guaranteed
that the packet is not fragmented locally. But we won't set the DF-Flag
on the outgoing frames.
Florian Weimer had the idea to use this flag to ensure DNS servers are
never generating outgoing fragments. They may well be fragmented on the
path, but the server never stores or usees path mtu values, which could
well be forged in an attack.
(The root of the problem with path MTU discovery is that there is
no reliable way to authenticate ICMP Fragmentation Needed But DF Set
messages because they are sent from intermediate routers with their
source addresses, and the IMCP payload will not always contain sufficient
information to identify a flow.)
Recent research in the DNS community showed that it is possible to
implement an attack where DNS cache poisoning is feasible by spoofing
fragments. This work was done by Amir Herzberg and Haya Shulman:
<https://sites.google.com/site/hayashulman/files/fragmentation-poisoning.pdf>
This issue was previously discussed among the DNS community, e.g.
<http://www.ietf.org/mail-archive/web/dnsext/current/msg01204.html>,
without leading to fixes.
This patch depends on the patch "ipv4: fix DO and PROBE pmtu mode
regarding local fragmentation with UFO/CORK" for the enforcement of the
non-fragmentable checks. If other users than ip_append_page/data should
use this semantic too, we have to add a new flag to IPCB(skb)->flags to
suppress local fragmentation and check for this in ip_finish_output.
Many thanks to Florian Weimer for the idea and feedback while implementing
this patch.
Cc: David S. Miller <davem@davemloft.net> Suggested-by: Florian Weimer <fweimer@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>