Eric Dumazet [Thu, 28 Aug 2014 03:49:34 +0000 (20:49 -0700)]
net: attempt a single high order allocation
In commit ed98df3361f0 ("net: use __GFP_NORETRY for high order
allocations") we tried to address one issue caused by order-3
allocations.
We still observe high latencies and system overhead in situations where
compaction is not successful.
Instead of trying order-3, order-2, and order-1, do a single order-3
best effort and immediately fallback to plain order-0.
This mimics slub strategy to fallback to slab min order if the high
order allocation used for performance failed.
Order-3 allocations give a performance boost only if they can be done
without recurring and expensive memory scan.
Quoting David :
The page allocator relies on synchronous (sync light) memory compaction
after direct reclaim for allocations that don't retry and deferred
compaction doesn't work with this strategy because the allocation order
is always decreasing from the previous failed attempt.
This means sync light compaction will always be encountered if memory
cannot be defragmented or reclaimed several times during the
skb_page_frag_refill() iteration.
Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 30 Aug 2014 03:13:05 +0000 (20:13 -0700)]
Merge branch 'mlx4-net'
Or Gerlitz says:
====================
Setup mlx4 user space Ethernet QPs to properly handle VXLAN
This short series fixes the mlx4 driver setting of user space Ethernet QPs
(e.g those opened by DPDK applications) such that they will properly handle
VXLAN traffic/offloads
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Wed, 27 Aug 2014 13:47:48 +0000 (16:47 +0300)]
net/mlx4: Move the tunnel steering helper function to mlx4_core
Move the function which we use to set VXLAN DMFS (flow-steering) rules
from mlx4_en to mlx4_core. This refactoring will allow the mlx4_ib driver
to call the helper for the use case of user-space RAW Ethernet QPs, such
that they can serve VXLAN traffic too.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Enabling DMA_API_DEBUG, warnings are reported at runtime
because the device driver frees DMA memory with wrong functions
and it does not call dma_mapping_error after mapping dma memory.
The first problem is fixed by of introducing a flag that helps us
keeping track which mapping technique was used, so that we can use
the right API for unmap.
This approach was inspired by the e1000 driver, which uses a similar
technique.
Signed-off-by: Andre Draszik <andre.draszik@st.com> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Reviewed-by: Denis Kirjanov <kda@linux-powerpc.org> Cc: Hans de Goede <hdegoede@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The PTP reference clock, used for setting the addend in the Timestamp Addend
Register, was erroneously hard-coded (as reported in the databook just as
example).
The patch removes the macro named: STMMAC_SYSCLOCK and allows to use a
reference clock (clk_ptp_ref_i) that can be passed from the platform.
If not passed, the main driver clock will be used as default; note that
this can be fine on some platforms.
Note that, prior this patch, using the old STMMAC_SYSCLOCK on some platforms,
as side effect, the ptp clock can move faster/slower than the system clock.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to w/a a problem that happens on some boxes when run at 10Mbps
Half duplex mode.
During the transmission the CSR signal is asserted for some time and the frames
aborted because of carrier sense error.
This is reported by MMC HW counter: txcarrier signal.
This actually is a false carrier so the frames are good and there is no reason
to ask for dropping them.
This patch so disables the Carrier Sense During Transmission
and this means that the MAC transmitter ignore the CRS signal
during frame transmission in Half-Duplex mode.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Acked-by: Vince Bridgers <vbridgers2013@gmail.com> Acked-by: Ley Foon Tan <lftan@altera.com> Acked-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: David S. Miller <davem@davemloft.net>
According to the Std 802.3az if the EEE Adv (Reg 7.60), Link partner ability
(Reg 7.61) and EEE capability (Register 3.20) bits return 0 this means no EEE
is supported. So this patch fixes the checks inside the phy_init_eee function.
Signed-off-by: Nandini Sharma <nandini.sharma@st.com> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 26 Aug 2014 03:21:55 +0000 (20:21 -0700)]
mvneta: Add missing if_vlan.h include.
drivers/net/ethernet/marvell/mvneta.c: In function 'mvneta_skb_tx_csum':
drivers/net/ethernet/marvell/mvneta.c:1374:3: error: implicit declaration of function 'vlan_get_protocol' [-Werror=implicit-function-declaration]
__be16 l3_proto = vlan_get_protocol(skb);
^
Reporeted-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Liu [Mon, 25 Aug 2014 15:44:00 +0000 (16:44 +0100)]
xen-netback: move netif_napi_add before binding interrupt
Interrupt is enabled when bind_interdomain_evtchn_to_irqhandler returns.
If there's interrupt pending interrupt handler is invoked.
NAPI needs to be initialised before binding interrupt otherwise the
interrupt handler will try to scheduling a NAPI instance that is not
initialised yet, resulting in kernel OOPS.
This fixes a regression introduced in ea2c5e13 ("xen-netback: move NAPI
add/remove calls").
Ideally function calls to create kthreads should also be moved before
binding but I intent to fix this regression with minimal changes and
refactor the code with another patch.
Reported-by: Thomas Leonard <talex5@gmail.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 26 Aug 2014 00:27:47 +0000 (17:27 -0700)]
Merge branch 'tso_fix'
Vladislav Yasevich says:
====================
Fix TSO and checksum issues with non-accelerated vlan traffic.
I've recently ran across something rather interesting when testing vlans
from inside VMs. In some scenarios I was getting awfull thruput.
Some debugging uncovered a very scary packet corruption. I was
seeing packets that had original TSO length as IP total length
and their ip checksum was 0. This was with e1000e driver.
A bit more debugging uncovered an assumption made by that driver
that skb->protocol will contain l3 protocol information. This
was not the case in my setup since I manually turned off vlan
tx acceleration for the device. This caused the driver to not
initialize the tso information correctly and resulted in
corrupt TSO frames on the wire.
I decided to do some auditing of the usage of skb->protocols
in the drivers. Out of all the drivers, the included 8 appear
to be effected. They all allow user to control vlan acceleration
settings, all support TSO on vlan devices, and all use
skb->protocol to decide how to encode TSO information. Some
also have similar problems when initializing hw checksum information.
On such device, it is simple enough to reproduce the issue.
Simply turn off TX VLAN acceleration on the device, create a vlan,
and run you favorite network performance tool.
There is 1 driver I ran across that I belive will trigger a BUG
in the system when used with vlans. That driver is tile/tilepro.c
I have not changed it in this patch set and would hope that
the maintainer has time to look at it.
V2: Fix i40ev using the wrong function name. Full build.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Mon, 25 Aug 2014 14:34:55 +0000 (10:34 -0400)]
qlge: Fix TSO for non-accelerated vlan traffic
This device claims TSO support for vlans. It also allows a user to
control vlan acceleration offloading. As such, it is possible to turn
off vlan acceleration and configure a vlan which will continue to send
TSO traffic.
In such situation the packet passed down the the device will contain
a vlan header and skb->protocol will be set to ETH_P_8021Q.
The device assumes that skb->protocol contains network protocol
value and uses that value to set up TSO information.
This results in corrupted frames sent on the wire.
This patch extracts the protocol value correctly by using a
vlan_get_protocol() helper and corrects corrupt TSO frames.
CC: Shahed Shaikh <shahed.shaikh@qlogic.com> CC: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> CC: Ron Mercer <ron.mercer@qlogic.com> CC: linux-driver@qlogic.com Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Mon, 25 Aug 2014 14:34:54 +0000 (10:34 -0400)]
mvneta: Fix TSO and checksum for non-acceleration vlan traffic
This driver doesn't appear to support vlan acceleration at
all. However, it does claim to support TSO and IP checksums
for vlan devices. Thus any configured vlan device would
end up passing down partial checksums or TSO frames.
The driver also uses the value from skb->protocol to
determine TSO and checksum offload information, but assumes
that skb->protocol holds the l3 protocol information.
As a result, vlan traffic with partial checksums or TSO
will fail those checks and TSO will not happen.
Fix this by using vlan_get_protocol() helper.
CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Mon, 25 Aug 2014 14:34:53 +0000 (10:34 -0400)]
i40evf: Fix TSO and hw checksums for non-accelerated vlan packets.
This device claims TSO and checksum support for vlans. It also
allows a user to control vlan acceleration offloading. As such,
it is possible to turn off vlan acceleration and configure a vlan
which will continue to support TSO and hw checksums.
In such situation the packet passed down the the device will contain
a vlan header and skb->protocol will be set to ETH_P_8021Q.
The device assumes that skb->protocol contains network protocol
value and uses that value to set up TSO and checksum information.
This results in corrupted frames sent on the wire.
This patch extract the protocol value correctly and corrects TSO
and checksums for non-accelerated traffic.
Fix this by using vlan_get_protocol() helper.
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Jesse Brandeburg <jesse.brandeburg@intel.com> CC: Bruce Allan <bruce.w.allan@intel.com> CC: Carolyn Wyborny <carolyn.wyborny@intel.com> CC: Don Skidmore <donald.c.skidmore@intel.com> CC: Greg Rose <gregory.v.rose@intel.com> CC: Alex Duyck <alexander.h.duyck@intel.com> CC: John Ronciak <john.ronciak@intel.com> CC: Mitch Williams <mitch.a.williams@intel.com> CC: Linux NICS <linux.nics@intel.com> CC: e1000-devel@lists.sourceforge.net Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Mon, 25 Aug 2014 14:34:52 +0000 (10:34 -0400)]
i40e: Fix TSO and hw checksums for non-accelerated vlan packets.
This device claims TSO and checksum support for vlans. It also
allows a user to control vlan acceleration offloading. As such,
it is possible to turn off vlan acceleration and configure a vlan
which will continue to support TSO and hw checksums.
In such situation the packet passed down the the device will contain
a vlan header and skb->protocol will be set to ETH_P_8021Q.
The device assumes that skb->protocol contains network protocol
value and uses that value to set up TSO and checksum information.
This results in corrupted frames sent on the wire.
This patch extract the protocol value correctly and corrects TSO
and checksums for non-accelerated traffic.
Fix this by using vlan_get_protocol() helper.
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Jesse Brandeburg <jesse.brandeburg@intel.com> CC: Bruce Allan <bruce.w.allan@intel.com> CC: Carolyn Wyborny <carolyn.wyborny@intel.com> CC: Don Skidmore <donald.c.skidmore@intel.com> CC: Greg Rose <gregory.v.rose@intel.com> CC: Alex Duyck <alexander.h.duyck@intel.com> CC: John Ronciak <john.ronciak@intel.com> CC: Mitch Williams <mitch.a.williams@intel.com> CC: Linux NICS <linux.nics@intel.com> CC: e1000-devel@lists.sourceforge.net Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Mon, 25 Aug 2014 14:34:51 +0000 (10:34 -0400)]
ehea: Fix TSO and hw checksums with non-accelerated vlan packets.
The driver claims that it can do TSO and IP checksums on vlan
devices and also allows user to control vlan acceleration offloading.
This makes it possible to push traffic to this driver that has TSO or
partial checksums set, but also have a non-accelearted vlan
header. In this case, the driver will fail to correctly
identify such traffic and will not correctly perform
segmentation and checksum calculation.
Fix this by using vlan_get_protocol() helper instead of
assuming skb->protocol always has this information.
CC: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Mon, 25 Aug 2014 14:34:50 +0000 (10:34 -0400)]
bna: Support TSO and partial checksum with non-accelerated vlans.
This device claims TSO and checksum support for vlans. It also
allows a user to control vlan acceleration offloading. As such,
it is possible to turn off vlan acceleration and configure a vlan
which will continue to support TSO.
In such situation the packet passed down the the device will contain
a vlan header and skb->protocol will be set to ETH_P_8021Q.
The device assumes that skb->protocol contains network protocol
value and uses that value to set up TSO information. This results
in corrupted frames sent on the wire.
This patch extract the protocol value correctly and corrects TSO
and checksums for non-accelerated traffic.
CC: Rasesh Mody <rmody@brocade.com> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Mon, 25 Aug 2014 14:34:49 +0000 (10:34 -0400)]
e1000: Fix TSO for non-accelerated vlan traffic
This device claims TSO and checksum support for vlans. It also
allows a user to control vlan acceleration offloading. As such,
it is possible to turn off vlan acceleration and configure a vlan
which will continue to support TSO.
In such situation the packet passed down the the device will contain
a vlan header and skb->protocol will be set to ETH_P_8021Q.
The device assumes that skb->protocol contains network protocol
value and uses that value to set up TSO and checksum information.
This will results in corrupted frames sent on the wire.
This patch extract the protocol value correctly and corrects TSO
for non-accelerated traffic.
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Jesse Brandeburg <jesse.brandeburg@intel.com> CC: Bruce Allan <bruce.w.allan@intel.com> CC: Carolyn Wyborny <carolyn.wyborny@intel.com> CC: Don Skidmore <donald.c.skidmore@intel.com> CC: Greg Rose <gregory.v.rose@intel.com> CC: Alex Duyck <alexander.h.duyck@intel.com> CC: John Ronciak <john.ronciak@intel.com> CC: Mitch Williams <mitch.a.williams@intel.com> CC: Linux NICS <linux.nics@intel.com> CC: e1000-devel@lists.sourceforge.net Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Mon, 25 Aug 2014 14:34:48 +0000 (10:34 -0400)]
e1000e: Fix TSO with non-accelerated vlans
This device claims TSO support for vlans. It also allows a
user to control vlan acceleration offloading. As such, it is
possible to turn off vlan acceleration and configure a vlan
which will continue to support TSO.
In such situation the packet passed down the the device will contain
a vlan header and skb->protocol will be set to ETH_P_8021Q.
The device assumes that skb->protocol contains network protocol
value and uses that value to set up TSO information. This results
in corrupted frames sent on the wire. Corruptions include
incorrect IP total length and invalid IP checksum.
This patch extract the protocol value correctly and corrects TSO
for non-accelerated traffic.
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: Jesse Brandeburg <jesse.brandeburg@intel.com> CC: Bruce Allan <bruce.w.allan@intel.com> CC: Carolyn Wyborny <carolyn.wyborny@intel.com> CC: Don Skidmore <donald.c.skidmore@intel.com> CC: Greg Rose <gregory.v.rose@intel.com> CC: Alex Duyck <alexander.h.duyck@intel.com> CC: John Ronciak <john.ronciak@intel.com> CC: Mitch Williams <mitch.a.williams@intel.com> CC: Linux NICS <linux.nics@intel.com> CC: e1000-devel@lists.sourceforge.net Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jonas Jensen [Mon, 25 Aug 2014 14:22:22 +0000 (16:22 +0200)]
net: moxa: replace build_skb() with netdev_alloc_skb_ip_align() / memcpy()
build_skb() is used to make skbs out of existing RX ring memory
which is bad because the RX ring is allocated only once, on probe.
Memory corruption occur because said memory is reclaimed, i.e.
__kfree_skb() (and eventually put_page()).
Replace build_skb() with netdev_alloc_skb_ip_align() and use memcpy().
Remove SKB_DATA_ALIGN() from RX buffer size while we're at it.
Michal Kubeček [Mon, 25 Aug 2014 13:16:22 +0000 (15:16 +0200)]
net: fix checksum features handling in netif_skb_features()
This is follow-up to
da08143b8520 ("vlan: more careful checksum features handling")
which introduced more careful feature intersection in vlan code,
taking into account that HW_CSUM should be considered superset
of IP_CSUM/IPV6_CSUM. The same is needed in netif_skb_features()
in order to avoid offloading mismatch warning when vlan is
created on top of a bond consisting of slaves supporting IP/IPv6
checksumming but not vlan Tx offloading.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
Code manipulating sysfs symlinks on adjacent net_devices(s)
currently doesn't take into account that devices potentially
belong to different namespaces.
This patch trying to fix an issue as follows:
- check for net_ns before creating / deleting symlink.
for now only netdev_adjacent_rename_links and
__netdev_adjacent_dev_remove are affected, afaics
__netdev_adjacent_dev_insert implies both net_devs
belong to the same namespace.
- Drop all existing symlinks to / from all adj_devs before
switching namespace and recreate them just after.
Signed-off-by: Alexander Y. Fomichev <git.user@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
is optimised away by the compiler, due to the second initializer,
therefore initialising .sin.sin_addr.s_addr always to 0.
This results in netlink messages indicating a L3 miss never contain the
missed IP address. This was observed with GCC 4.8 and 4.9. I do not know about previous versions.
The problem affects user space programs relying on an IP address being
sent as part of a netlink message indicating a L3 miss.
Changing
.sa.sa_family = AF_INET,
to
.sin.sin_family = AF_INET,
fixes the problem.
Signed-off-by: Gerhard Stenzel <gerhard.stenzel@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 22 Aug 2014 21:50:21 +0000 (14:50 -0700)]
Merge tag 'pwm/for-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
Pull pwm fix from Thierry Reding:
"Just one bugfix for the PWM lookup table code that would cause a PWM
channel to be set to the wrong period and polarity for non-perfect
matches"
* tag 'pwm/for-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
pwm: Fix period and polarity in pwm_get() for non-perfect matches
Pull networking fixes from David Miller:
"Here are some bug fixes that have piled up during ksummit/linuxcon.
1) Fix endian problems in ibmveth, from Anton Blanchard.
2) IPV6 routing code does GFP_KERNEL allocation in atomic, fix from
Benjamin Block.
3) SCTP association fixes from Daniel Borkmann.
4) When multiple VLAN headers are present we have to make sure the
second and subsequent ones are pullable in the SKB otherwise we
blindly dereference garbage. From Jiri Benc.
5) The argument adjustment of the signature of hlist_add_after*()
introduced a regression in the batman-adv code, fix from Sven
Eckelmann.
6) Fix TX hang handling to avoid a panic in i40e, from Anjali Singhai
Jain.
7) PTP flag test is inverted in i40e driver, from Jesse Brandeburg.
8) ATM LEC driver needs to hold RTNL mutex over MTU changes, from
Chas Williams.
9) Truncate packets larger then the TPACKET_V3 format configured
buffers, otherwise we overwrite past the end of said buffers.
From Eric Dumazet.
10) Fix endianness bugs in qlcnic firmware handling, from Rajesh
Borundia and Shahed Shaikh.
11) CXGB4 sometimes doesn't get all of the TX completion events it
should resulting in SKBs getting stuck in the TX queue, from
Hariprasad Shenai.
12) When the FEC chip's PTP clock is disabled, you can't access the
register. Add necessary checks to avoid the resulting hang, from
Fugang Duan"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (37 commits)
drivers: isdn: eicon: xdi_msg.h: Fix typo in #ifndef
net: sctp: fix suboptimal edge-case on non-active active/retrans path selection
net: sctp: spare unnecessary comparison in sctp_trans_elect_best
net: ethernet: broadcom: bnx2x: Remove redundant #ifdef
ibmveth: Fix endian issues with rx_no_buffer statistic
net: xgene: fix possible NULL dereference in xgene_enet_free_desc_rings()
openvswitch: fix panic with multiple vlan headers
net: ipv6: fib: don't sleep inside atomic lock
net: fec: ptp: avoid register access when ipg clock is disabled
cxgb4: Free completed tx skbs promptly
cxgb4: Fix race condition in cleanup
sctp: not send SCTP_PEER_ADDR_CHANGE notifications with failed probe
bnx2x: Revert UNDI flushing mechanism
qlcnic: Fix endianess issue in firmware load from file operation
qlcnic: Fix endianess issue in FW dump template header
qlcnic: Fix flash access interface to application
MAINTAINERS: Add section for MRF24J40 IEEE 802.15.4 radio driver
macvlan: Allow setting multicast filter on all macvlan types
packet: handle too big packets for PACKET_V3
MAINTAINERS: add entry for ec_bhf driver
...
Daniel Borkmann [Fri, 22 Aug 2014 11:03:30 +0000 (13:03 +0200)]
net: sctp: fix suboptimal edge-case on non-active active/retrans path selection
In SCTP, selection of active (T.ACT) and retransmission (T.RET)
transports is being done whenever transport control operations
(UP, DOWN, PF, ...) are engaged through sctp_assoc_control_transport().
Commits 4c47af4d5eb2 ("net: sctp: rework multihoming retransmission
path selection to rfc4960") and a7288c4dd509 ("net: sctp: improve
sctp_select_active_and_retran_path selection") have both improved
it towards a more fine-grained and optimal path selection.
Currently, the selection algorithm for T.ACT and T.RET is as follows:
1) Elect the two most recently used ACTIVE transports T1, T2 for
T.ACT, T.RET, where T.ACT<-T1 and T1 is most recently used
2) In case primary path T.PRI not in {T1, T2} but ACTIVE, set
T.ACT<-T.PRI and T.RET<-T1
3) If only T1 is ACTIVE from the set, set T.ACT<-T1 and T.RET<-T1
4) If none is ACTIVE, set T.ACT<-best(T.PRI, T.RET, T3) where
T3 is the most recently used (if avail) in PF, set T.RET<-T.PRI
Prior to above commits, 4) was simply a camp on T.ACT<-T.PRI and
T.RET<-T.PRI, ignoring possible paths in PF. Camping on T.PRI is
still slightly suboptimal as it can lead to the following scenario:
T.PRI is permanently down, T2 is put briefly into PF state (e.g. due to
link flapping). Here, the first time transmission is sent over PF path
T2 as it's the only non-INACTIVE path, but the retransmitted data-chunks
are sent over the INACTIVE path T1 (T.PRI), which is not good.
After the patch, it's choosing better transports in both cases by
modifying step 4):
4) If none is ACTIVE, set T.ACT_new<-best(T.ACT_old, T3) where T3 is
the most recently used (if avail) in PF, set T.RET<-T.ACT_new
This will still select a best possible path in PF if available (which
can also include T.PRI/T.RET), and set both T.ACT/T.RET to it.
In case sctp_assoc_control_transport() *just* put T.ACT_old into INACTIVE
as it transitioned from ACTIVE->PF->INACTIVE and stays in INACTIVE just
for a very short while before going back ACTIVE, it will guarantee that
this path will be reselected for T.ACT/T.RET since T3 (PF) is not
available.
Previously, this was not possible, as we would only select between T.PRI
and T.RET, and a possible T3 would be NULL due to the fact that we have
just transitioned T3 in sctp_assoc_control_transport() from PF->INACTIVE
and would select a suboptimal path when T.PRI/T.RET have worse properties.
In the case that T.ACT_old permanently went to INACTIVE during this
transition and there's no PF path available, plus T.PRI and T.RET are
INACTIVE as well, we would now camp on T.ACT_old, but if everything is
being INACTIVE there's really not much we can do except hoping for a
successful HB to bring one of the transports back up again and, thus
cause a new selection through sctp_assoc_control_transport().
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Fri, 22 Aug 2014 11:03:29 +0000 (13:03 +0200)]
net: sctp: spare unnecessary comparison in sctp_trans_elect_best
When both transports are the same, we don't have to go down that
road only to realize that we will return the very same transport.
We are guaranteed that curr is always non-NULL. Therefore, just
short-circuit this special case.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Anton Blanchard [Fri, 22 Aug 2014 01:36:52 +0000 (11:36 +1000)]
ibmveth: Fix endian issues with rx_no_buffer statistic
Hidden away in the last 8 bytes of the buffer_list page is a solitary
statistic. It needs to be byte swapped or else ethtool -S will
produce numbers that terrify the user.
Since we do this in multiple places, create a helper function with a
comment explaining what is going on.
Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
net: xgene: fix possible NULL dereference in xgene_enet_free_desc_rings()
A NULL pointer dereference is possible for the argument ring->buf_pool
which is passed to xgene_enet_free_desc_ring(), as ring could be NULL.
And now since NULL pointers are being checked for before the calls to
xgene_enet_free_desc_ring(), might as well take advantage of them and
not call the function if the argument would be NULL.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Benc [Thu, 21 Aug 2014 19:33:44 +0000 (21:33 +0200)]
openvswitch: fix panic with multiple vlan headers
When there are multiple vlan headers present in a received frame, the first
one is put into vlan_tci and protocol is set to ETH_P_8021Q. Anything in the
skb beyond the VLAN TPID may be still non-linear, including the inner TCI
and ethertype. While ovs_flow_extract takes care of IP and IPv6 headers, it
does nothing with ETH_P_8021Q. Later, if OVS_ACTION_ATTR_POP_VLAN is
executed, __pop_vlan_tci pulls the next vlan header into vlan_tci.
This leads to two things:
1. Part of the resulting ethernet header is in the non-linear part of the
skb. When eth_type_trans is called later as the result of
OVS_ACTION_ATTR_OUTPUT, kernel BUGs in __skb_pull. Also, __pop_vlan_tci
is in fact accessing random data when it reads past the TPID.
2. network_header points into the ethernet header instead of behind it.
mac_len is set to a wrong value (10), too.
Reported-by: Yulong Pei <ypei@redhat.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin Block [Thu, 21 Aug 2014 17:37:48 +0000 (19:37 +0200)]
net: ipv6: fib: don't sleep inside atomic lock
The function fib6_commit_metrics() allocates a piece of memory in mode
GFP_KERNEL while holding an atomic lock from higher up in the stack, in
the function __ip6_ins_rt(). This produces the following BUG:
Fixing this by replacing the mode GFP_KERNEL with GFP_ATOMIC.
Signed-off-by: Benjamin Block <bebl@mageta.org> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Nimrod Andy [Thu, 21 Aug 2014 09:09:38 +0000 (17:09 +0800)]
net: fec: ptp: avoid register access when ipg clock is disabled
The current kernel hang on i.MX6SX with rootfs mount from MMC.
The root cause is that ptp uses a periodic timer to access enet register
even if ipg clock is disabled.
FEC ptp driver start one period timer to read 1588 counter register in the
ptp init function that is called after FEC driver is probed.
To save power, after FEC probe finish, FEC driver disable all clocks including
ipg clock that is needed for register access.
i.MX5x, i.MX6q/dl/sl FEC register access don't cause system hang when ipg clock
is disabled, just return zero value. But for i.MX6sx SOC, it cause system hang.
To avoid the issue, we need to check ptp clock status before ptp timer count access.
Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 22 Aug 2014 16:08:20 +0000 (09:08 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"This small set of fixes addresses a few issues introduced during the
merge window, including:
- fix typo in I-cache detection that was causing us to treat all
I-caches as aliasing
- hook up memfd_create and getrandom syscalls for native and compat
- revert a temporary hack for defconfig builds in -next (the audit
tree changes didn't make it in this merge window)
- a couple of UEFI fixes for TEXT_OFFSET fuzzing and /memreserve/
- a simple sparsemem fix for 48-bit physical addressing
- small defconfig updates to get autotesters working with X-gene"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
Revert "arm64: Do not invoke audit_syscall_* functions if !CONFIG_AUDIT_SYSCALL"
arm64: mm: update max pa bits to 48
arm64: ignore DT memreserve entries when booting in UEFI mode
arm64: configs: Enable X-Gene SATA and ethernet in defconfig
arm64: align randomized TEXT_OFFSET on 4 kB boundary
asm-generic: add memfd_create system call to unistd.h
arm64: compat: wire up memfd_create and getrandom syscalls for aarch32
arm64: fix typo in I-cache policy detection
Linus Torvalds [Fri, 22 Aug 2014 16:06:22 +0000 (09:06 -0700)]
Merge tag 'iommu-fixes-v3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"The fixes include:
- fix a crash in the VT-d driver when devices with a driver attached
are hot-unplugged
- fix a AMD IOMMU driver crash with device assignment of 32 bit PCI
devices to KVM guests
- fix for a copy&paste error in generic IOMMU code. Now the right
function pointer is checked before calling"
* tag 'iommu-fixes-v3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/core: Check for the right function pointer in iommu_map()
iommu/amd: Fix cleanup_domain for mass device removal
iommu/vt-d: Defer domain removal if device is assigned to a driver
Description of problem:
The NIC card is not reporting back to the driver the transmitted skbs,
so they get stuck in the TX ring causing issues with reference
counters in other kernel components.
Developed a new Automatic Egress Queue Update firmware facility to slowly tick
through Egress Queues and send back any outstanding CIDX Updates which are
laying around.
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 22 Aug 2014 04:53:15 +0000 (21:53 -0700)]
Merge tag 'linux-can-fixes-for-3.17-20140821' of git://gitorious.org/linux-can/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2014-08-21
The first patch is from Mirza Krak, it fixes the initialization of the hardware
in the sja1000 driver. The next patch is contributed by Dan Carpenter, it fixes
the error handling in the c_can's probe function. Then there are two patches
for the flexcan driver, one by Alexander Stein, which fixes the resetting of
the bus error interrupt mask, the other one by Sebastian Andrzej Siewior which
adds an additional error state transition message.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Anish Bhatt [Wed, 20 Aug 2014 20:44:06 +0000 (13:44 -0700)]
cxgb4: Fix race condition in cleanup
There is a possible race condition when we unregister the PCI Driver and then
flush/destroy the global "workq". This could lead to situations where there
are tasks on the Work Queue with references to now deleted adapter data
structures. Instead, have per-adapter Work Queues which were instantiated and
torn down in init_one() and remove_one(), respectively.
v2: Remove unnecessary call to flush_workqueue() before destroy_workqueue()
Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
zhuyj [Wed, 20 Aug 2014 09:31:43 +0000 (17:31 +0800)]
sctp: not send SCTP_PEER_ADDR_CHANGE notifications with failed probe
Since the transport has always been in state SCTP_UNCONFIRMED, it
therefore wasn't active before and hasn't been used before, and it
always has been, so it is unnecessary to bug the user with a
notification.
Reported-by: Deepak Khandelwal <khandelwal.deepak.1987@gmail.com> Suggested-by: Vlad Yasevich <vyasevich@gmail.com> Suggested-by: Michael Tuexen <tuexen@fh-muenster.de> Suggested-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Zhu Yanjun <Yanjun.Zhu@windriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 18 Aug 2014 19:36:23 +0000 (22:36 +0300)]
bnx2x: Revert UNDI flushing mechanism
Commit 91ebb929b6f8 ("bnx2x: Add support for Multi-Function UNDI") [which was
later supposedly fixed by de682941eef3 ("bnx2x: Fix UNDI driver unload")]
introduced a bug in which in some [yet-to-be-determined] scenarios the
alternative flushing mechanism which was to guarantee the Rx buffers are
empty before resetting them during device probe will fail.
If this happens, when device will be loaded once more a fatal attention will
occur; Since this most likely happens in boot from SAN scenarios, the machine
will fail to load.
Notice this may occur not only in the 'Multi-Function' scenario but in the
regular scenario as well, i.e., this introduced a regression in the driver's
ability to perform boot from SAN.
The patch reverts the mechanism and applies the old scheme to multi-function
devices as well as to single-function devices.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Shahed Shaikh [Mon, 18 Aug 2014 13:31:55 +0000 (09:31 -0400)]
qlcnic: Fix endianess issue in firmware load from file operation
Firmware binary file is in little endian. On big-endian architecture, while
writing this binary FW file to adapters memory, writel() swaps the data resulting into
corruption of FW image. So, swap the data before writing into adapters memory.
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rajesh Borundia [Mon, 18 Aug 2014 13:31:54 +0000 (09:31 -0400)]
qlcnic: Fix endianess issue in FW dump template header
Firmware dump template header is read from adapter using
readl() which swaps the data. So, adjust structure
element on the boundary of 32bit dword.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Application expects flash data in little endian, but driver reads/writes
flash data using readl()/writel() APIs which swaps data on big endian machine.
So, swap the data after reading from and before writing to flash memory.
Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Fri, 15 Aug 2014 17:04:59 +0000 (13:04 -0400)]
macvlan: Allow setting multicast filter on all macvlan types
Currently, macvlan code restricts multicast and unicast
filter setting only to passthru devices. As a result,
if a guest using macvtap wants to receive multicast
traffic, it has to set IFF_ALLMULTI or IFF_PROMISC.
This patch makes it possible to use the fdb interface
to add multicast addresses to the filter thus allowing
a guest to receive only targeted multicast traffic.
CC: John Fastabend <john.r.fastabend@intel.com> CC: Michael S. Tsirkin <mst@redhat.com> CC: Jason Wang <jasowang@redhat.com> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 15 Aug 2014 16:16:04 +0000 (09:16 -0700)]
packet: handle too big packets for PACKET_V3
af_packet can currently overwrite kernel memory by out of bound
accesses, because it assumed a [new] block can always hold one frame.
This is not generally the case, even if most existing tools do it right.
This patch clamps too long frames as API permits, and issue a one time
error on syslog.
[ 394.357639] tpacket_rcv: packet too big, clamped from 5042 to 3966. macoff=82
In this example, packet header tp_snaplen was set to 3966,
and tp_len was set to 5042 (skb->len)
Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation.") Acked-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 21 Aug 2014 21:26:27 +0000 (14:26 -0700)]
Merge branch 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"Nothing drastic but pushing out early due to build breakage in the new
tegra platform.
Additionally:
- M550 tagged trim blacklist pattern is widened so that it matches
the new 1TB model
- three controller specific fixes"
* 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
libata: widen Crucial M550 blacklist matching
pata_scc: propagate return value of scc_wait_after_reset
ata: ahci_tegra: Change include to fix compilation
pata_samsung_cf: change ret type to signed
ahci_xgene: Removing NCQ support from the APM X-Gene SoC AHCI SATA Host Controller driver.
Linus Torvalds [Thu, 21 Aug 2014 21:25:20 +0000 (14:25 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- fixes for a couple potential memory corruption problems (the HW would
have to be manufactured to be deliberately evil to trigger those)
found by Ben Hawkes
- fix for potential infinite loop when using sysfs interface of
logitech driver, from Simon Wood
- a couple more simple driver fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: fix a couple of off-by-ones
HID: logitech: perform bounds checking on device_id early enough
HID: logitech: fix bounds checking on LED report size
HID: logitech: Prevent possibility of infinite loop when using /sys interface
HID: rmi: print an error if F11 is not found instead of stopping the device
HID: hid-sensor-hub: use devm_ functions consistently
HID: huion: Use allocated buffer for DMA
HID: huion: Fail on parameter retrieval errors
Linus Torvalds [Thu, 21 Aug 2014 21:24:40 +0000 (14:24 -0700)]
Merge tag 'sound-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A bunch of ASoC fixes with a few HD-audio fixes in this pull request.
All fairly small, boring and device-specific fixes, in addition to
MAINTAINERS update for better reviewing"
* tag 'sound-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/hdmi - apply Valleyview fix-ups to Cherryview display codec
ALSA: hda/hdmi - set depop_delay for haswell plus
ALSA: hda - restore the gpio led after resume
ALSA: hda/realtek - Avoid setting wrong COEF on ALC269 & co
ASoC: pxa-ssp: drop SNDRV_PCM_FMTBIT_S24_LE
ASoC: fsl-esai: Revert .xlate_tdm_slot_mask() support
ASoC: mcasp: Fix implicit BLCK divider setting
ASoC: arizona: Fix TDM slot length handling in arizona_hw_params
ASoC: pcm512x: Correct Digital Playback control names
ASoC: dapm: Fix uninitialized variable in snd_soc_dapm_get_enum_double()
ASoC: Intel: Restore Baytrail ADSP streams only when ADSP was in reset
ASoC: Intel: Wait Baytrail ADSP boot at resume_early stage
ASoC: Intel: Merge Baytrail ADSP suspend_noirq into suspend_late
MAINTAINERS: Add i.MX maintainers and paths to Freescale ASoC entry
ASoC: Intel: Update Baytrail ADSP firmware name
Linus Torvalds [Thu, 21 Aug 2014 21:07:44 +0000 (14:07 -0700)]
Merge branch 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Here is the fixup for the 'lowlight' of my last pull request. I2C is
not selected anymore by I2C_ACPI. Instead, the code in question now
depends on I2C=y.
Also, Mika has agreed to support me and be the maintainer for I2C-ACPI
related patches. Finally, a new-ID-patch came along last week"
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
MAINTAINERS: add maintainer for ACPI parts of I2C
i2c: i801: Add PCI ID for Intel Braswell
i2c: rework kernel config I2C_ACPI
Jiri Kosina [Thu, 21 Aug 2014 14:57:17 +0000 (09:57 -0500)]
HID: logitech: perform bounds checking on device_id early enough
device_index is a char type and the size of paired_dj_deivces is 7
elements, therefore proper bounds checking has to be applied to
device_index before it is used.
We are currently performing the bounds checking in
logi_dj_recv_add_djhid_device(), which is too late, as malicious device
could send REPORT_TYPE_NOTIF_DEVICE_UNPAIRED early enough and trigger the
problem in one of the report forwarding functions called from
logi_dj_raw_event().
Fix this by performing the check at the earliest possible ocasion in
logi_dj_raw_event().
Cc: stable@vger.kernel.org Reported-by: Ben Hawkes <hawkes@google.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Jiri Kosina [Thu, 21 Aug 2014 14:56:47 +0000 (09:56 -0500)]
HID: logitech: fix bounds checking on LED report size
The check on report size for REPORT_TYPE_LEDS in logi_dj_ll_raw_request()
is wrong; the current check doesn't make any sense -- the report allocated
by HID core in hid_hw_raw_request() can be much larger than
DJREPORT_SHORT_LENGTH, and currently logi_dj_ll_raw_request() doesn't
handle this properly at all.
Fix the check by actually trimming down the report size properly if it is
too large.
Cc: stable@vger.kernel.org Reported-by: Ben Hawkes <hawkes@google.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
can: flexcan: handle state passive -> warning transition
Once the CAN-bus is open and a packet is sent, the controller switches
into the PASSIVE state. Once the BUS is closed again it goes the back
err-warning. The TX error counter goes 0 -> 0x80 -> 0x7f.
This patch makes sure that the user learns about this state chang
(CAN_STATE_ERROR_WARNING => CAN_STATE_ERROR_PASSIVE)
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Matthias Klein <matthias.klein@optimeas.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Alexander Stein [Tue, 12 Aug 2014 08:47:21 +0000 (10:47 +0200)]
can: flexcan: Disable error interrupt when bus error reporting is disabled
In case we don't have FLEXCAN_HAS_BROKEN_ERR_STATE and the user set
CAN_CTRLMODE_BERR_REPORTING once it can not be unset again until reboot.
So in case neither hardware nor user wants the error interrupt disable
the bit.
Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Mirza Krak [Fri, 8 Aug 2014 12:30:50 +0000 (14:30 +0200)]
can: sja1000: Validate initialization state in start method
When sja1000 is not compiled as module the SJA1000 chip is only
initialized during device registration on kernel boot. Should the chip
get a hardware reset there is no way to reinitialize it without re-
booting the Linux kernel.
This patch adds a check in sja1000_start if the chip is initialized, if
not we initialize it.
Signed-off-by: Mirza Krak <mirza.krak@hostmobility.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Linus Torvalds [Wed, 20 Aug 2014 23:33:21 +0000 (18:33 -0500)]
Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Most important fixes in this set include three SMB3 fixes for stable
(including fix for possible kernel oops), and a workaround to allow
writes to Mac servers (only cifs dialect, not more current SMB2.1,
worked to Mac servers). Also fallocate support added, and lease fix
from Jeff"
* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
[SMB3] Enable fallocate -z support for SMB3 mounts
enable fallocate punch hole ("fallocate -p") for SMB3
Incorrect error returned on setting file compressed on SMB2
CIFS: Fix wrong directory attributes after rename
CIFS: Fix SMB2 readdir error handling
[CIFS] Possible null ptr deref in SMB2_tcon
[CIFS] Workaround MacOS server problem with SMB2.1 write response
cifs: handle lease F_UNLCK requests properly
Cleanup sparse file support by creating worker function for it
Add sparse file support to SMB2/SMB3 mounts
Add missing definitions for CIFS File System Attributes
cifs: remove unused function cifs_oplock_break_wait
Linus Torvalds [Wed, 20 Aug 2014 23:32:16 +0000 (18:32 -0500)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull filesystem fixes from Jan Kara:
"udf, isofs, and ext3 bug fixes"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
ext3: Count internal journal as bsddf overhead in ext3_statfs
isofs: Fix unbounded recursion when processing relocated directories
udf: avoid unneeded up_write when fail to add entry in ->symlink
Linus Torvalds [Wed, 20 Aug 2014 23:22:10 +0000 (18:22 -0500)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"Reverting a 3.16 patch, fixing two bugs in device assignment (one has
a CVE), and fixing some problems introduced during the merge window
(the CMA bug came in via Andrew, the x86 ones via yours truly)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
virt/kvm/assigned-dev.c: Set 'dev->irq_source_id' to '-1' after free it
Revert "KVM: x86: Increase the number of fixed MTRR regs to 10"
KVM: x86: do not check CS.DPL against RPL during task switch
KVM: x86: Avoid emulating instructions on #UD mistakenly
PC, KVM, CMA: Fix regression caused by wrong get_order() use
kvm: iommu: fix the third parameter of kvm_iommu_put_pages (CVE-2014-3601)
Linus Torvalds [Wed, 20 Aug 2014 23:20:50 +0000 (18:20 -0500)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"These are the two bug fixes I mentioned in the final merge window
pull. One is a reversed logic check in the device busy tests which
can cause a nasty hang and another crash seen in the new SCSI pool
support if the use count ever goes to zero"
[ The device busy test already got merged from a patch earlier, so is
now duplicated. ]
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
[SCSI] save command pool address of Scsi_Host
[SCSI] fix qemu boot hang problem
Will Deacon [Tue, 19 Aug 2014 21:05:45 +0000 (22:05 +0100)]
Revert "arm64: Do not invoke audit_syscall_* functions if !CONFIG_AUDIT_SYSCALL"
For some reason, the audit patches didn't make it out of -next this
merge window, so revert our temporary hack and let the audit guys deal
with fixing up -next.
arm64: ignore DT memreserve entries when booting in UEFI mode
UEFI provides its own method for marking regions to reserve, via the
memory map which is also used to initialise memblock. So when using the
UEFI memory map, ignore any memreserve entries present in the DT.
Reported-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Leif Lindholm <leif.lindholm@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Brown [Thu, 14 Aug 2014 19:57:16 +0000 (20:57 +0100)]
arm64: configs: Enable X-Gene SATA and ethernet in defconfig
Currently when run on an APM platform the ARMv8 defconfig has no viable
options for rootfs other than ramdisk which is rather limiting. Since
we already have both SATA and the bits needed for NFS root enabled we just
need to enable the relevant drivers so do that, helping enable direct
testing of upstream.
If the configuration ends up becoming too big we can consider modularising
some of the drivers and asking people to use an initramfs but for now this
is not an issue.
Signed-off-by: Mark Brown <broonie@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
Ard Biesheuvel [Wed, 13 Aug 2014 17:53:03 +0000 (18:53 +0100)]
arm64: align randomized TEXT_OFFSET on 4 kB boundary
When booting via UEFI, the kernel Image is loaded at a 4 kB boundary and
the embedded EFI stub is executed in place. The EFI stub relocates the
Image to reside TEXT_OFFSET bytes above a 2 MB boundary, and jumps into
the kernel proper.
In AArch64, PC relative symbol references are emitted using adrp/add or
adrp/ldr pairs, where the offset into a 4 kB page is resolved using a
separate :lo12: relocation. This implicitly assumes that the code will
always be executed at the same relative offset with respect to a 4 kB
boundary, or the references will point to the wrong address.
This means we should link the kernel at a 4 kB aligned base address in
order to remain compatible with the base address the UEFI loader uses
when doing the initial load of Image. So update the code that generates
TEXT_OFFSET to choose a multiple of 4 kB.
At the same time, update the code so it chooses from the interval [0..2MB)
as the author originally intended.
Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
v2: patch description changes Fixes: f0f6ee1f70c4 ("cbq: incorrect processing of high limits")
Mainstream commit f0f6ee1f70c4 ("cbq: incorrect processing of high limits")
have side effect: if cbq bandwidth setting is less than real interface
throughput non-limited traffic can delay limited traffic for a very long time.
This happen because of q->now changes incorrectly in cbq_dequeue():
in described scenario L2T is much greater than real time delay,
and q->now gets an extra boost for each transmitted packet.
Accumulated boost prevents update q->now, and blocked class can wait
very long time until (q->now >= cl->undertime) will be true again.
More detailed problem description can be found here:
http://www.spinics.net/lists/netdev/msg292493.html
Following patches should fix the problem.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Mainstream commit f0f6ee1f70c4 ("cbq: incorrect processing of high limits")
have side effect: if cbq bandwidth setting is less than real interface
throughput non-limited traffic can delay limited traffic for a very long time.
This happen because of q->now changes incorrectly in cbq_dequeue():
in described scenario L2T is much greater than real time delay,
and q->now gets an extra boost for each transmitted packet.
Accumulated boost prevents update q->now, and blocked class can wait
very long time until (q->now >= cl->undertime) will be true again.
To fix the problem the patch updates q->now on each cbq_update() call.
L2T-related pre-modification q->now was moved to cbq_update().
My testing confirmed that it fixes the problem and did not discover
any side-effects
Fixes: f0f6ee1f70c4 ("cbq: incorrect processing of high limits") Signed-off-by: Vasily Averin <vvs@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Guenter Roeck [Sun, 10 Aug 2014 12:54:25 +0000 (05:54 -0700)]
scsi: Fix qemu boot hang problem
The latest kernel fails to boot qemu arm images when using scsi
for disk access. Boot gets stuck after the following messages.
brd: module loaded
sym53c8xx 0000:00:0c.0: enabling device (0100 -> 0103)
sym0: <895a> rev 0x0 at pci 0000:00:0c.0 irq 93
sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking
sym0: SCSI BUS has been reset.
scsi host0: sym-2.2.3
Bisect points to commit 71e75c97f97a ("scsi: convert device_busy to
atomic_t"). Code inspection shows the following suspicious change
in scsi_request_fn.
out_delay:
- if (sdev->device_busy == 0 && !scsi_device_blocked(sdev))
+ if (atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev))
blk_delay_queue(q, SCSI_QUEUE_DELAY);
}
'sdev->device_busy == 0' was replaced with 'atomic_read(&sdev->device_busy)',
meaning the logic was reversed. Changing this expression to
'!atomic_read(&sdev->device_busy)' fixes the problem.
Jan Kara [Sun, 17 Aug 2014 09:49:57 +0000 (11:49 +0200)]
isofs: Fix unbounded recursion when processing relocated directories
We did not check relocated directory in any way when processing Rock
Ridge 'CL' tag. Thus a corrupted isofs image can possibly have a CL
entry pointing to another CL entry leading to possibly unbounded
recursion in kernel code and thus stack overflow or deadlocks (if there
is a loop created from CL entries).
Fix the problem by not allowing CL entry to point to a directory entry
with CL entry (such use makes no good sense anyway) and by checking
whether CL entry doesn't point to itself.
CC: stable@vger.kernel.org Reported-by: Chris Evans <cevans@google.com> Signed-off-by: Jan Kara <jack@suse.cz>
Alan Cox [Tue, 19 Aug 2014 14:37:28 +0000 (17:37 +0300)]
i2c: i801: Add PCI ID for Intel Braswell
The SMBus host controller is the same as used in Baytrail so add the new
PCI ID to the driver's list of supported IDs.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Lan Tianyu [Fri, 15 Aug 2014 05:38:59 +0000 (13:38 +0800)]
i2c: rework kernel config I2C_ACPI
Commit da3c6647(I2C/ACPI: Clean up I2C ACPI code and Add CONFIG_I2C_ACPI
config) adds a new kernel config I2C_ACPI and make I2C core built in
when the config is selected. This is wrong because distributions
etc generally compile I2C as a module and the commit broken that.
This patch is to rename I2C_ACPI to ACPI_I2C_OPREGION. New config
only controls ACPI I2C operation region code and depends on I2C=y.
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[wsa: removed unrelated change for Kconfig] Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Linus Torvalds [Tue, 19 Aug 2014 14:47:01 +0000 (09:47 -0500)]
Merge tag 'md/3.17-fixes' of git://neil.brown.name/md
Pull md bugfixes from Neil Brown:
"Here are the bug-fixes I promised :-)
Funny how you start looking for one and other start appearing.
- raid6 data corruption during recovery
- raid6 livelock
- raid10 memory leaks"
* tag 'md/3.17-fixes' of git://neil.brown.name/md:
md/raid10: always initialise ->state on newly allocated r10_bio
md/raid10: avoid memory leak on error path during reshape.
md/raid10: Fix memory leak when raid10 reshape completes.
md/raid10: fix memory leak when reshaping a RAID10.
md/raid6: avoid data corruption during recovery of double-degraded RAID6
md/raid5: avoid livelock caused by non-aligned writes.
NVIDIA Tegra
- Add debugfs support (Thierry Reding)
Synopsys DesignWare
- Look for configuration space in 'reg', not 'ranges' (Kishon Vijay Abraham I)
- Program ATU with untranslated address (Kishon Vijay Abraham I)
- Add config access-related pcie_host_ops for v3.65 hardware (Murali Karicheri)
- Add MSI-related pcie_host_ops for v3.65 hardware (Murali Karicheri)
TI DRA7xx
- Add TI DR7xx PCIe driver (Kishon Vijay Abraham I)"
* tag 'pci-v3.17-changes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: designware: Add MSI-related pcie_host_ops for v3.65 hardware
PCI: designware: Add config access-related pcie_host_ops for v3.65 hardware
PCI: dra7xx: Add TI DRA7xx PCIe driver
PCI: designware: Program ATU with untranslated address
PCI: designware: Look for configuration space in 'reg', not 'ranges'
PCI: tegra: Add debugfs support
PCI: mvebu: Remove ARCH_KIRKWOOD dependency
Linus Torvalds [Tue, 19 Aug 2014 14:43:48 +0000 (09:43 -0500)]
Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux
Pull devicetree fixes from Grant Likely:
"Three more commits needed for v3.17: A bug fix for reserved regions
based at address zero, a clarification on how to interpret existence
of both interrupts and interrupts-extended properties, and a fix to
allow device tree testcases to run on any platform"
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
of/irq: Fix lookup to use 'interrupts-extended' property first
Enabling OF selftest to run without machine's devicetree
of: Allow mem_reserve of memory with a base address of zero
Chen Gang [Fri, 8 Aug 2014 15:37:59 +0000 (23:37 +0800)]
virt/kvm/assigned-dev.c: Set 'dev->irq_source_id' to '-1' after free it
As a generic function, deassign_guest_irq() assumes it can be called
even if assign_guest_irq() is not be called successfully (which can be
triggered by ioctl from user mode, indirectly).
So for assign_guest_irq() failure process, need set 'dev->irq_source_id'
to -1 after free 'dev->irq_source_id', or deassign_guest_irq() may free
it again.
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>