Alex Gartrell [Wed, 26 Aug 2015 16:40:37 +0000 (09:40 -0700)]
ipvs: attempt to schedule icmp packets
Invoke the try_to_schedule logic from the icmp path and update it to the
appropriate ip_vs_conn_put function. The schedule functions have been
updated to reject the packets immediately for now.
Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
Alex Gartrell [Wed, 26 Aug 2015 16:40:29 +0000 (09:40 -0700)]
ipvs: Add hdr_flags to iphdr
These flags contain information like whether or not the addresses are
inverted or from icmp. The first will allow us to drop an inverse param
all over the place, and the second will later be useful in scheduling icmp.
Signed-off-by: Alex Gartrell <agartrell@fb.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
*filter
:INPUT ACCEPT [365:25776]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [217:45832]
:t1 - [0:0]
:t2 - [0:0]
:t3 - [0:0]
:t4 - [0:0]
-A t1 -i lo -j t2
-A t2 -i lo -j t3
-A t3 -i lo -j t4
# -A INPUT -j t4
# -A INPUT -j t3
# -A INPUT -j t2
-A INPUT -j t1
COMMIT
Will compute a chain depth of 2 if the comments are removed.
Revert back to counting the number of chains for the time being.
Reported-by: Cong Wang <cwang@twopensource.com> Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
And RFC 7084 states in L-14 that IPv6 Router MUST send ICMPv6 Destination
Unreachable with code 5 for packets forwarded to it that use an address
from a prefix that has been invalidated.
Codes 5 and 6 are more informative subsets of code 1.
Signed-off-by: Andreas Herz <andi@geekosphere.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ipvs: add sync_maxlen parameter for the sync daemon
Allow setups with large MTU to send large sync packets by
adding sync_maxlen parameter. The default value is now based
on MTU but no more than 1500 for compatibility reasons.
To avoid problems if MTU changes allow fragmentation by
sending packets with DF=0. Problem reported by Dan Carpenter.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
When the sync damon is started we need to hold rtnl
lock while calling ip_mc_join_group. Currently, we have
a wrong locking order because the correct one is
rtnl_lock->__ip_vs_mutex. It is implied from the usage
of __ip_vs_mutex in ip_vs_dst_event() which is called
under rtnl lock during NETDEV_* notifications.
Fix the problem by calling rtnl_lock early only for the
start_sync_thread call. As a bonus this fixes the usage
__dev_get_by_name which was not called under rtnl lock.
This patch actually extends and depends on commit 54ff9ef36bdf
("ipv4, ipv6: kill ip_mc_{join, leave}_group and
ipv6_sock_mc_{join, drop}").
Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
The weighted overflow scheduling algorithm directs network connections
to the server with the highest weight that is currently available
and overflows to the next when active connections exceed the node's weight.
David Ahern [Wed, 19 Aug 2015 18:40:31 +0000 (11:40 -0700)]
net: Fix nexthop lookups
Andreas reported breakage adding routes with local nexthops:
$ ip route show table main
...
172.28.0.0/24 dev vnf-xe1p0 proto kernel scope link src 172.28.0.16
$ ip route add 10.0.0.0/8 via 172.28.0.32 table 100 dev vnf-xe1p0
RTNETLINK answers: Resource temporarily unavailable
3bfd847203c changed the lookup to use the passed in table but for cases like
this the nexthop is in the local table rather than the passed in table.
Fixes: 3bfd847203c ("net: Use passed in table for nexthop lookups") Reported-by: Andreas Schultz <aschultz@tpip.net> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Scott Feldman [Wed, 19 Aug 2015 18:29:35 +0000 (11:29 -0700)]
bridge: fix netlink max attr size
.maxtype should match .policy. Probably just been getting lucky here
because IFLA_BRPORT_MAX > IFLA_BR_MAX.
Fixes: 13323516 ("bridge: implement rtnl_link_ops->changelink") Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jeremy Linton [Wed, 19 Aug 2015 16:46:43 +0000 (11:46 -0500)]
smsc911x: Remove dev==NULL check.
The dev==NULL check in smsc911x_probe_config is useless
and isn't providing any additional protection. If a fwnode
doesn't exist then an appropriate error should be returned
by device_get_phy_mode() covering the original case
of a missing of/fwnode.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds MAC address length check back into
the device_get_mac_addr() function before calling
is_valid_ether_addr() similar to the way the OF
routine does it.
Update the comments for the two new functions.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 20 Aug 2015 21:13:25 +0000 (14:13 -0700)]
Merge tag 'wireless-drivers-next-for-davem-2015-08-19' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
Major changes:
ath10k:
* add support for qca99x0 family of devices
* improve performance of tx_lock
* add support for raw mode (802.11 frame format) and software crypto
engine enabled via a module parameter
ath9k:
* add fast-xmit support
wil6210:
* implement TSO support
* support bootloader v1 and onwards
iwlwifi:
* Deprecate -10.ucode
* Clean ups towards multiple Rx queues
* Add support for longer CMD IDs. This will be required by new
firmwares since we are getting close to the u8 limit.
* bugfixes for the D0i3 power state
* Add basic support for FTM
* polish the Miracast operation
* fix a few power consumption issues
* scan cleanup
* fixes for D0i3 system state
* add paging for devices that support it
* add again the new RBD allocation model
* add more options to the firmware debug system
* add support for frag SKBs in Tx
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 20 Aug 2015 21:10:23 +0000 (14:10 -0700)]
Merge branch 'ewma'
Johannes Berg says:
====================
average: convert users to inline implementation
Since there's very little benefit of the out-of-line implementation
(a single byte of .text in one driver as far as I've seen), convert
all drivers to the inline implementation, saving memory, and remove
the out-of-line implementation.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Wed, 19 Aug 2015 07:46:21 +0000 (09:46 +0200)]
rt2x00: use DECLARE_EWMA
Instead of using the out-of-line EWMA calculation, use DECLARE_EWMA()
to create static inlines. On x86/64 this results in code that's one
byte larger (for me), but reduces struct link_ant and struct link
size by the two unsigned long values that store the parameters each.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Wed, 19 Aug 2015 07:48:40 +0000 (09:48 +0200)]
virtio_net: use DECLARE_EWMA
Instead of using the out-of-line EWMA calculation, use DECLARE_EWMA()
to create static inlines. On x86/64 this results in no change in code
size for me, but reduces the struct receive_queue size by the two
unsigned long values that store the parameters.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 20 Aug 2015 20:01:57 +0000 (13:01 -0700)]
Merge branch 'vrf-cleanups-part-2'
Nikolay Aleksandrov says:
====================
vrf: cleanups part 2
This is the next part of vrf cleanups, patch 1 drops the SLAB_PANIC
when creating kmem cache since it's handled, patch 02 removes a slave
duplicate check which is already done by the lower/upper code, patch 3
moves the ndo_add_slave code around a bit so we can drop an error
label and patch 4 drops the master device checks which are unnecessary
because the ops are taken from the master device itself so it can't be
different.
====================
When ndo_add|del_slave ops are used, they're taken from the respective
master device's netdev ops, so if the master device is a VRF only then
the VRF ops will get called thus no need to check the type of the
master.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
vrf: move vrf_insert_slave so we can drop a goto label
We can simplify do_vrf_add_slave by moving vrf_insert_slave in the end
of the enslaving and thus eliminate an error goto label. It always
succeeds and isn't needed before that anyway.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Currently whenever a packet different from ETH_P_IP is sent through the
VRF device it is leaked so plug the leaks and properly drop these
packets.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
vrf: vrf_master_ifindex_rcu is not always called with rcu read lock
While running net-next I hit this:
[ 634.073119] ===============================
[ 634.073150] [ INFO: suspicious RCU usage. ]
[ 634.073182] 4.2.0-rc6+ #45 Not tainted
[ 634.073213] -------------------------------
[ 634.073244] include/net/vrf.h:38 suspicious rcu_dereference_check()
usage!
[ 634.073274]
other info that might help us debug this:
It would seem vrf_master_ifindex_rcu() can be called without RCU held in
other contexts as well so introduce a new helper which acquires rcu and
returns the ifindex.
Also add curly braces around both the "if" and "else" parts as per the
style guide.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ying Xue [Wed, 19 Aug 2015 07:46:17 +0000 (15:46 +0800)]
lwtunnel: Fix the sparse warnings in fib_encap_match
When CONFIG_LWTUNNEL config is not enabled, the lwtstate_free() is not
declared in lwtunnel.h at all. However, even in this case, the function
is still referenced in fib_semantics.c so that there appears the
following sparse warnings:
net/ipv4/fib_semantics.c:553:17: error: undefined identifier 'lwtstate_free'
CC net/ipv4/fib_semantics.o
net/ipv4/fib_semantics.c: In function ‘fib_encap_match’:
net/ipv4/fib_semantics.c:553:3: error: implicit declaration of function ‘lwtstate_free’ [-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors
make[1]: *** [net/ipv4/fib_semantics.o] Error 1
make: *** [net/ipv4/fib_semantics.o] Error 2
To eliminate the error, we define an empty function for lwtstate_free()
in lwtunnel.h when CONFIG_LWTUNNEL is disabled.
Fixes: df383e6240ef ("lwtunnel: fix memory leak") Cc: Jiri Benc <jbenc@redhat.com> Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Mon, 17 Aug 2015 16:09:55 +0000 (18:09 +0200)]
netfilter: nft_payload: work around vlan header stripping
make payload expression aware of the fact that VLAN offload may have
removed a vlan header.
When we encounter tagged skb, transparently insert the tag into the
register so that vlan header matching can work without userspace being
aware of offload features.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
David S. Miller [Wed, 19 Aug 2015 03:21:32 +0000 (20:21 -0700)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-08-18
This series contains updates to igb, e100, e1000e and ixgbe.
Shota Suzuki provides a fix for a possible overflow in
igb_set_interrupt_capability() which leads to an oops. When changing the
number of queues by "ethtool -L", set IGB_FLAG_QUEUE_PAIRS in the same
manner as when initializing the igb driver.
Vasily Averin provides a fix for a missing rtnl_unlock() for when we
error out due to not being able to allocate memory for our queues.
Stefan Assman provides a couple of fixes for igb/igbvf. First changes
the igb driver in probe to simply call igb_enable_sriov() instead of
igb_sriov_reinit() since we are starting from scratch. Then in igbvf,
fix the driver where it does not clear the buffer_info->dma in all
cases after calling dma_unmap_single(), which was found by changing the
MTU twice.
Richard Cochran implements the periodic output function using the
programmable clock outputs available in i210 when possible, falling
back to the target time for longer periods.
Todd adds support for the Marvell PHY 1512 which is required for i354
devices. Then updates igb to make sure SR-IOV init uses the correct
number of queues, since recent changes could result in the PF holding
onto all of the queues.
Alex Williamson provides a fix in the case where a guest OS does not
support hot-unplug, so disable SR-IOV prior to unregister_netdev() to
avoid the problem.
Jia-Ju Bai provides several patches, first knocks some collecting dust
off an old e100 driver to add a check to avoid a null pointer
dereference. Then cleans up a possible resource leak by releasing the
skb buffer allocated when the e100_xmit_prepare() runs into an issue
in the DMA mapping. In igb, add a missing rtnl_unlock() for when we
error out due to igb_sriov_reinit() in the igb_init_interrupt_scheme().
Provides a e1000e fix, based on suggestions from Alex Duyck to move
head/tail register writing to e1000_configure_tx/rx() to avoid a
possible null pointer dereference (similar to igb driver). Lastly,
fix a possible memory leak in igb_probe(), where the memory shadow_vfta
allocated by kcalloc in igb_sw_init() is not freed.
Mark simplifies port-specific macros for ixgbe by eliminating explicit
comparisons with 0 and enclose formal parameters in parens to eliminate
the risk of an operator precedence issue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 19 Aug 2015 03:16:53 +0000 (20:16 -0700)]
Merge branch 'vrf-next'
Nikolay Aleksandrov says:
====================
vrf: a few simplifications and cleanups
These patches remove some unnecessary checks (patches 3, 4), unnecessary
num_slaves member and refcnt manipulations which are already done by the
upper functions.
====================
Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
vrf: don't check for dstats and rth in uninit path
dstats and rth are always present because we fail the device registration
if they can't be allocated in vrf_init() (ndo_init) so drop the unnecessary
checks.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Benc [Tue, 18 Aug 2015 16:42:09 +0000 (18:42 +0200)]
lwtunnel: ip tunnel: fix multiple routes with different encap
Currently, two routes going through the same tunnel interface are considered
the same even when they are routed to a different host after encapsulation.
This causes all routes added after the first one to have incorrect
encapsulation parameters.
This is nicely visible by doing:
# ip r a 192.168.1.2/32 dev vxlan0 tunnel dst 10.0.0.2
# ip r a 192.168.1.3/32 dev vxlan0 tunnel dst 10.0.0.3
# ip r
[...]
192.168.1.2/32 tunnel id 0 src 0.0.0.0 dst 10.0.0.2 [...]
192.168.1.3/32 tunnel id 0 src 0.0.0.0 dst 10.0.0.2 [...]
Implement the missing comparison function.
Fixes: 3093fbe7ff4bc ("route: Per route IP tunnel metadata via lightweight tunnel") Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Benc [Tue, 18 Aug 2015 16:41:13 +0000 (18:41 +0200)]
lwtunnel: fix memory leak
The built lwtunnel_state struct has to be freed after comparison.
Fixes: 571e722676fe3 ("ipv4: support for fib route lwtunnel encap attributes") Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Tue, 18 Aug 2015 09:31:44 +0000 (12:31 +0300)]
cxgb4: memory corruption in debugfs
You can't use kstrtoul() with an int or it causes memory corruption.
Also j should be unsigned or we have underflow bugs.
I considered changing "j" to unsigned long but everything fits in a u32.
Fixes: 8e3d04fd7d70 ('cxgb4: Add MPS tracing support') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/built-in.o: In function `.vnic_wq_devcmd2_alloc':
(.text+0x49fe40): multiple definition of `.vnic_wq_devcmd2_alloc'
drivers/scsi/built-in.o:(.text+0xb4318): first defined here
drivers/net/built-in.o:(.opd+0x2af00): multiple definition of `vnic_wq_devcmd2_alloc'
drivers/scsi/built-in.o:(.opd+0xad70): first defined here
drivers/net/built-in.o: In function `.vnic_wq_init_start':
(.text+0x49f9c0): multiple definition of `.vnic_wq_init_start'
drivers/scsi/built-in.o:(.text+0xb3b58): first defined here
drivers/net/built-in.o:(.opd+0x2ae88): multiple definition of `vnic_wq_init_start'
drivers/scsi/built-in.o:(.opd+0xace0): first defined here
Rename these to 'enic_*' to avoid the conflict with the functiosn of
the same name in the snic scsi driver.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Rajesh Borundia [Tue, 18 Aug 2015 07:22:59 +0000 (10:22 +0300)]
bnx2x: Add vxlan RSS support
Latest FW submission added some vxlan offload capabilities to our device.
This patch adds the ability to connect to the vxlan NDOs and configure
the UDP port associated with it in the HW.
The device would now be capable of performing RSS according to the
inner headers of the vxlan packets.
Signed-off-by: Rajesh Borundia <Rajesh.Borundia@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 18 Aug 2015 21:17:22 +0000 (14:17 -0700)]
Merge branch 'dsa-multi-swtich'
Andrew Lunn says:
====================
D in DSA patches
The D in DSA is distributed, meaning multiple switches can be
connected together. Currently no mainline system does this, and so the
code is broken. This patchset contains two fixes, and a small helper.
With three of more switches, the current device tree binding is not
sufficient to express the routing between the switches. The first
patch extends the binding, in a backwards compatible way, to allow a
link between a switch to describe all the switches accessible over the
link, not just the direct neighbor.
The third patch fixes the port configuration on newer devices for
links connecting switches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Mon, 17 Aug 2015 21:52:52 +0000 (23:52 +0200)]
dsa: mv88e6xxx: Set DSA mode based on chip abilities
Older devices only support a single DSA frame format, where as newer
devices have two. Take this into account when configuring a DSA port.
The port needs to be in plain old DSA mode, since this is a DSA link,
where as the newer format can be used for the CPU port.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Mon, 17 Aug 2015 21:52:50 +0000 (23:52 +0200)]
net: dsa: Allow multi hop routes to be expressed
With more than two switches in a hierarchy, it becomes necessary to
describe multi-hop routes between switches. The current binding does
not allow this, although the older platform_data did. Extend the link
property to be a list rather than a single phandle to a remote switch.
It is then possible to express that a port should be used to reach
more than one switch and the switch maybe more than one hop away.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Jacob Keller [Wed, 10 Jun 2015 18:44:45 +0000 (11:44 -0700)]
ixgbe: TRIVIAL fix up double 'the' and comment style
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mark Rustad [Sat, 6 Jun 2015 17:41:03 +0000 (10:41 -0700)]
ixgbe: Simplify port-specific macros
Simplify port-specific macros by eliminating explicit comparison
with 0. More importantly, enclose formal parameter in parens to
eliminate the risk of an operator precedence surprise.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Todd Fujinaka [Sat, 8 Aug 2015 00:27:39 +0000 (17:27 -0700)]
igb: make sure SR-IOV init uses the right number of queues
Recent changes to igb_probe_vfs() could lead to the PF holding onto all
of the queues. Reorder igb_probe_vfs() to be before
gb_init_queue_configuration() and add some more error checking.
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Stefan Assmann [Thu, 6 Aug 2015 07:32:17 +0000 (09:32 +0200)]
igbvf: clear buffer_info->dma after dma_unmap_single()
The driver doesn't clear buffer_info->dma after calling
dma_unmap_single() in all cases. This has been discovered by changing
the mtu twice, which caused the following backtrace.
Jia-Ju Bai [Wed, 5 Aug 2015 14:05:16 +0000 (22:05 +0800)]
igb: Fix a memory leak in igb_probe
In error handling code of igb_probe, the memory adapter->shadow_vfta
allocated by kcalloc in igb_sw_init is not freed. So when register_netdev
or igb_init_i2c is failed, a memory leak will occur.
This patch adds kfree to fix it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jia-Ju Bai [Wed, 5 Aug 2015 10:16:10 +0000 (18:16 +0800)]
e1000e: Modify Tx/Rx configurations to avoid null pointer dereferences in e1000_open
When e1000e_setup_rx_resources is failed in e1000_open,
e1000e_free_tx_resources in "err_setup_rx" segment is executed.
"writel(0, tx_ring->head)" statement in e1000_clean_tx_ring
in e1000e_free_tx_resources will cause a null poonter dereference(crash),
because "tx_ring->head" is only assigned in e1000_configure_tx
in e1000_configure, but it is after e1000e_setup_rx_resources.
This patch moves head/tail register writing to e1000_configure_tx/rx,
which can fix this problem. It is inspired by igb_configure_tx_ring
in the igb driver.
Specially, thank Alexander Duyck for his valuable suggestion.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jia-Ju Bai [Mon, 3 Aug 2015 03:36:26 +0000 (11:36 +0800)]
igb: Fix a deadlock in igb_sriov_reinit
When igb_init_interrupt_scheme in igb_sriov_reinit is failed, the lock
acquired by rtnl_lock() is not released, which causes a deadlock.
This patch adds rtnl_unlock() in error handling to fix it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jia-Ju Bai [Mon, 3 Aug 2015 02:40:48 +0000 (10:40 +0800)]
e100: Release skb when DMA mapping is failed in e100_xmit_prepare
When pci_dma_mapping_error in e100_xmit_prepare is failed, the skb buffer
allocated by netdev_alloc_skb_ip_align in e100_rx_alloc_skb is not
released, which causes a possible resource leak.
This patch adds error handling code to fix it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jia-Ju Bai [Mon, 3 Aug 2015 02:17:08 +0000 (10:17 +0800)]
e100: Add a check after pci_pool_create to avoid null pointer dereference
The driver lacks the check of nic->cbs_pool after pci_pool_create
in e100_probe. When this function is failed, a null pointer dereference
occurs when pci_pool_alloc uses nic->cbs_pool in e100_alloc_cbs.
This patch adds a check and related error handling code to fix it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alex Williamson [Wed, 29 Jul 2015 20:38:15 +0000 (14:38 -0600)]
igb: Teardown SR-IOV before unregister_netdev()
When the .remove() callback for a PF is called, SR-IOV support for the
device is disabled, which requires unbinding and removing the VFs.
The VFs may be in-use either by the host kernel or userspace, such as
assigned to a VM through vfio-pci. In this latter case, the VFs may
be removed either by shutting down the VM or hot-unplugging the
devices from the VM. Unfortunately in the case of a Windows 2012 R2
guest, hot-unplug is broken due to the ordering of the PF driver
teardown. Disabling SR-IOV prior to unregister_netdev() avoids this
issue.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Richard Cochran [Thu, 23 Jul 2015 21:59:30 +0000 (14:59 -0700)]
igb: implement high frequency periodic output signals
In addition to interrupt driven target time output events, the i210
also has two programmable clock outputs. These clocks support periods
between 16 nanoseconds and 140 milliseconds. This patch implements
the periodic output function using the clock outputs when possible,
falling back to the target time for longer periods.
Signed-off-by: Richard Cochran <richardcochran@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Stefan Assmann [Fri, 10 Jul 2015 13:01:12 +0000 (15:01 +0200)]
igb: do not re-init SR-IOV during probe
During driver probing the following code path is triggered.
igb_probe
->igb_sw_init
->igb_probe_vfs
->igb_pci_enable_sriov
->igb_sriov_reinit
Doing the SR-IOV re-init is not necessary during probing since we're
starting from scratch. Here we can call igb_enable_sriov() right away.
Running igb_sriov_reinit() during igb_probe() also seems to cause
occasional packet loss on some onboard 82576 NICs. Reproduced on
Dell and HP servers with onboard 82576 NICs.
Example:
Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
Subsystem: Dell Device [1028:0481]
Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When initializing igb driver (e.g. 82576, I350), IGB_FLAG_QUEUE_PAIRS is
set if adapter->rss_queues exceeds half of max_rss_queues in
igb_init_queue_configuration().
On the other hand, IGB_FLAG_QUEUE_PAIRS is not set even if the number of
queues exceeds half of max_combined in igb_set_channels() when changing
the number of queues by "ethtool -L".
In this case, if numvecs is larger than MAX_MSIX_ENTRIES (10), the size
of adapter->msix_entries[], an overflow can occur in
igb_set_interrupt_capability(), which in turn leads to an oops.
Fix this problem as follows:
- When changing the number of queues by "ethtool -L", set
IGB_FLAG_QUEUE_PAIRS in the same way as initializing igb driver.
- When increasing the size of q_vector, reallocate it appropriately.
(With IGB_FLAG_QUEUE_PAIRS set, the size of q_vector gets larger.)
Another possible way to fix this problem is to cap the queues at its
initial number, which is the number of the initial online cpus. But this
is not the optimal way because we cannot increase queues when another
cpu becomes online.
Note that before commit cd14ef54d25b ("igb: Change to use statically
allocated array for MSIx entries"), this problem did not cause oops
but just made the number of queues become 1 because of entering msi_only
mode in igb_set_interrupt_capability().
Fixes: 907b7835799f ("igb: Add ethtool support to configure number of channels") CC: stable <stable@vger.kernel.org> Signed-off-by: Shota Suzuki <suzuki_shota_t3@lab.ntt.co.jp> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Tue, 18 Aug 2015 18:55:08 +0000 (11:55 -0700)]
Merge branch 'drivers_iff_no_queue'
Phil Sutter says:
====================
net: Convert drivers to IFF_NO_QUEUE and cleanup afterwards
This series converts in-tree users away from the old and deprecated
'tx_queue_len = 0' idiom, adds a warning to notify out-of-tree driver
maintainers that there is need for action on their behalf and finally drops any
workarounds in scheduling algorithm implementations.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Phil Sutter [Tue, 18 Aug 2015 08:30:49 +0000 (10:30 +0200)]
net: sched: drop all special handling of tx_queue_len == 0
Those were all workarounds for the formerly double meaning of
tx_queue_len, which broke scheduling algorithms if untreated.
Now that all in-tree drivers have been converted away from setting
tx_queue_len = 0, it should be safe to drop these workarounds for
categorically broken setups.
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Phil Sutter [Tue, 18 Aug 2015 08:30:48 +0000 (10:30 +0200)]
net: warn if drivers set tx_queue_len = 0
Due to the introduction of IFF_NO_QUEUE, there is a better way for
drivers to indicate that no qdisc should be attached by default. Though,
the old convention can't be dropped since ignoring that setting would
break drivers still using it. Instead, add a warning so out-of-tree
driver maintainers get a chance to adjust their code before we finally
get rid of any special handling of tx_queue_len == 0.
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
Phil Sutter [Tue, 18 Aug 2015 08:30:47 +0000 (10:30 +0200)]
staging: wilc1000: convert to using IFF_NO_QUEUE
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Johnny Kim <johnny.kim@atmel.com> Cc: Rachel Kim <rachel.kim@atmel.com> Cc: Dean Lee <dean.lee@atmel.com> Cc: Chris Park <chris.park@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Phil Sutter [Tue, 18 Aug 2015 08:30:44 +0000 (10:30 +0200)]
net: batman-adv: convert to using IFF_NO_QUEUE
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Marek Lindner <mareklindner@neomailbox.ch> Cc: Simon Wunderlich <sw@simonwunderlich.de> Cc: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Phil Sutter [Tue, 18 Aug 2015 08:30:39 +0000 (10:30 +0200)]
net: bonding: convert to using IFF_NO_QUEUE
Signed-off-by: Phil Sutter <phil@nwl.cc> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kalle Valo [Tue, 18 Aug 2015 14:20:11 +0000 (17:20 +0300)]
Merge tag 'iwlwifi-next-for-kalle-2015-08-18' of https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next
* polish the Miracast operation
* fix a few power consumption issues
* scan cleanup
* fixes for D0i3 system state
* add paging for devices that support it
* add again the new RBD allocation model
* add more options to the firmware debug system
* add support for frag SKBs in Tx
When we enter D0i3, we must stop TXing otherwise the
sequence number we use might conflict with the firmware's
internal TX. In order to do so, we have
IWL_MVM_STATUS_IN_D0I3 which should prevent any Tx while we
enter D0i3. There is a bug in this code since we may Tx even
if IWL_MVM_STATUS_IN_D0I3 is set. This can happen as long as
mvm->d0i3_ap_sta_id is not set.
To make sure that we don't have any packet in the Tx path
while we set mvm->d0i3_ap_sta_id, call synchronize_net only
after we already set mvm->d0i3_ap_sta_id.
David Spinadel [Thu, 6 Aug 2015 07:26:50 +0000 (10:26 +0300)]
iwlwifi: mvm: don't disconnect on beacon loss in D0I3
Currently if we wake up during D0I3 due to beacon loss we disconnect
immediately. This behaviour causes redundant disconnection, which could
be prevented by polling as it is usually done in mac80211.
Instead, we prefer reporting beacon loss and let mac80211 try polling
before disconnection.
Signed-off-by: David Spinadel <david.spinadel@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
iwlwifi: out-of-bounds access in iwl_init_sband_channels
KASan error report:
==================================================================
BUG: KASan: out of bounds access in iwl_init_sband_channels+0x207/0x260 [iwlwifi] at addr ffff8800c2d0aac8
Read of size 4 by task modprobe/329
==================================================================
Both loops of this function compare data from the 'chan' array and then
check if the index is valid.
The 2 conditions should be inverted to avoid an out-of-bounds access.