Michael Chan [Tue, 3 Jun 2014 06:08:48 +0000 (23:08 -0700)]
cnic: Fix missing ISCSI_KEVENT_IF_DOWN message
The iSCSI netlink message needs to be sent before the ulp_ops is cleared
as it is sent through a function pointer in the ulp_ops. This bug
causes iscsid to not get the message when the bnx2i driver is unloaded.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Tue, 3 Jun 2014 06:08:47 +0000 (23:08 -0700)]
cnic: Don't take cnic_dev_lock in cnic_alloc_uio_rings()
We are allocating memory with GFP_KERNEL under spinlock. Since this is
the only call manipulating the cnic_udev_list and it is always under
rtnl_lock, cnic_dev_lock can be safely removed.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Tue, 3 Jun 2014 06:08:46 +0000 (23:08 -0700)]
cnic: Don't take rcu_read_lock in cnic_rcv_netevent()
Because the called function, such as bnx2fc_indicate_netevent(), can sleep,
we cannot take rcu_lock(). To prevent the rcu protected ulp_ops from going
away, we use the cnic_lock mutex and set the ULP_F_CALL_PENDING flag.
The code already waits for ULP_F_CALL_PENDING flag to clear in
cnic_unregister_device().
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
In commit cd11cf505318ff24e42f35145f9cdf8596fa1958 I accidentally
added an error message. I used it for debugging and forgot to remove
it before submitting the patch.
Signed-off-by: Christian Riesch <christian.riesch@omicron.at> Signed-off-by: David S. Miller <davem@davemloft.net>
1) Unbreak zebra and other netlink apps, from Eric W Biederman.
2) Some new qmi_wwan device IDs, from Aleksander Morgado.
3) Fix info leak in DCB netlink handler of qlcnic driver, from Dan
Carpenter.
4) inet_getid() and ipv6_select_ident() do not generate monotonically
increasing ID numbers, fix from Eric Dumazet.
5) Fix memory leak in __sk_prepare_filter(), from Leon Yu.
6) Netlink leftover bytes warning message is user triggerable, rate
limit it. From Michal Schmidt.
7) Fix non-linear SKB panic in ipvs, from Peter Christensen.
8) Congestion window undo needs to be performed even if only never
retransmitted data is SACK'd, fix from Yuching Cheng.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (24 commits)
net: filter: fix possible memory leak in __sk_prepare_filter()
net: ec_bhf: Add runtime dependencies
tcp: fix cwnd undo on DSACK in F-RTO
netlink: Only check file credentials for implicit destinations
ipheth: Add support for iPad 2 and iPad 3
team: fix mtu setting
net: fix inet_getid() and ipv6_select_ident() bugs
net: qmi_wwan: interface #11 in Sierra Wireless MC73xx is not QMI
net: qmi_wwan: add additional Sierra Wireless QMI devices
bridge: Prevent insertion of FDB entry with disallowed vlan
netlink: rate-limit leftover bytes warning and print process name
bridge: notify user space after fdb update
net: qmi_wwan: add Netgear AirCard 341U
net: fix wrong mac_len calculation for vlans
batman-adv: fix NULL pointer dereferences
net/mlx4_core: Reset RoCE VF gids when guest driver goes down
emac: aggregation of v1-2 PLB errors for IER register
emac: add missing support of 10mbit in emac/rgmii
can: only rename enabled led triggers when changing the netdev name
ipvs: Fix panic due to non-linear skb
...
Leon Yu [Sun, 1 Jun 2014 05:37:25 +0000 (05:37 +0000)]
net: filter: fix possible memory leak in __sk_prepare_filter()
__sk_prepare_filter() was reworked in commit bd4cf0ed3 (net: filter:
rework/optimize internal BPF interpreter's instruction set) so that it should
have uncharged memory once things went wrong. However that work isn't complete.
Error is handled only in __sk_migrate_filter() while memory can still leak in
the error path right after sk_chk_filter().
Fixes: bd4cf0ed331a ("net: filter: rework/optimize internal BPF interpreter's instruction set") Signed-off-by: Leon Yu <chianglungyu@gmail.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Tested-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 3 Jun 2014 00:04:37 +0000 (17:04 -0700)]
Merge tag 'md/3.15-fixes' of git://neil.brown.name/md
Pull two md bugfixes from Neil Brown:
"Two md bugfixes for possible corruption when restarting reshape
If a raid5/6 reshape is restarted (After stopping and re-assembling
the array) and the array is marked read-only (or read-auto), then the
reshape will appear to complete immediately, without actually moving
anything around. This can result in corruption.
There are two patches which do much the same thing in different
places. They are separate because one is an older bug and so can be
applied to more -stable kernels"
* tag 'md/3.15-fixes' of git://neil.brown.name/md:
md: always set MD_RECOVERY_INTR when interrupting a reshape thread.
md: always set MD_RECOVERY_INTR when aborting a reshape or other "resync".
Jean Delvare [Sat, 31 May 2014 15:32:27 +0000 (17:32 +0200)]
net: ec_bhf: Add runtime dependencies
The ec_bhf driver is specific to the Beckhoff CX embedded PC series.
These are based on Intel x86 CPU. So we can add a dependency on
X86, with COMPILE_TEST as an alternative to still allow for broader
build-testing.
Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Darek Marcinkiewicz <reksio@newterm.pl> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Queued trim only works for some users with MU05 firmware. Revert to
blacklisting all firmware versions.
Introduced by commit d121f7d0cbb8 ("libata: Update queued trim blacklist
for M5x0 drives") which this effectively reverts, while retaining the
blacklisting of M550.
See
https://bugzilla.kernel.org/show_bug.cgi?id=71371
for reports of trouble with MU05 firmware.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 2 Jun 2014 23:57:23 +0000 (16:57 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Peter Anvin:
"A single quite small patch that managed to get overlooked earlier, to
prevent a user space triggerable oops on systems without HPET"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET
Linus Torvalds [Mon, 2 Jun 2014 23:56:42 +0000 (16:56 -0700)]
Merge tag 'usb-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some fixes for 3.15-rc8 that resolve a number of tiny USB
issues that have been reported, and there are some new device ids as
well.
All have been tested in linux-next"
* tag 'usb-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
xhci: delete endpoints from bandwidth list before freeing whole device
usb: pci-quirks: Prevent Sony VAIO t-series from switching usb ports
USB: cdc-wdm: properly include types.h
usb: cdc-wdm: export cdc-wdm uapi header
USB: serial: option: add support for Novatel E371 PCIe card
USB: ftdi_sio: add NovaTech OrionLXm product ID
USB: io_ti: fix firmware download on big-endian machines (part 2)
USB: Avoid runtime suspend loops for HCDs that can't handle suspend/resume
Linus Torvalds [Mon, 2 Jun 2014 23:55:18 +0000 (16:55 -0700)]
Merge tag 'staging-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are some staging driver fixes for 3.15.
Three are for the speakup drivers (one fixes a regression caused in
3.15-rc, and the other two resolve a tty issue found by Ben Hutchings)
The comedi and r8192e_pci driver fixes also resolve reported issues"
* tag 'staging-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: r8192e_pci: fix htons error
Staging: speakup: Update __speakup_paste_selection() tty (ab)usage to match vt
Staging: speakup: Move pasting into a work item
staging: comedi: ni_daq_700: add mux settling delay
speakup: fix incorrect perms on speakup_acntsa.c
Yuchung Cheng [Fri, 30 May 2014 22:25:59 +0000 (15:25 -0700)]
tcp: fix cwnd undo on DSACK in F-RTO
This bug is discovered by an recent F-RTO issue on tcpm list
https://www.ietf.org/mail-archive/web/tcpm/current/msg08794.html
The bug is that currently F-RTO does not use DSACK to undo cwnd in
certain cases: upon receiving an ACK after the RTO retransmission in
F-RTO, and the ACK has DSACK indicating the retransmission is spurious,
the sender only calls tcp_try_undo_loss() if some never retransmisted
data is sacked (FLAG_ORIG_DATA_SACKED).
The correct behavior is to unconditionally call tcp_try_undo_loss so
the DSACK information is used properly to undo the cwnd reduction.
Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
netlink: Only check file credentials for implicit destinations
It was possible to get a setuid root or setcap executable to write to
it's stdout or stderr (which has been set made a netlink socket) and
inadvertently reconfigure the networking stack.
To prevent this we check that both the creator of the socket and
the currentl applications has permission to reconfigure the network
stack.
Unfortunately this breaks Zebra which always uses sendto/sendmsg
and creates it's socket without any privileges.
To keep Zebra working don't bother checking if the creator of the
socket has privilege when a destination address is specified. Instead
rely exclusively on the privileges of the sender of the socket.
Note from Andy: This is exactly Eric's code except for some comment
clarifications and formatting fixes. Neither I nor, I think, anyone
else is thrilled with this approach, but I'm hesitant to wait on a
better fix since 3.15 is almost here.
Note to stable maintainers: This is a mess. An earlier series of
patches in 3.15 fix a rather serious security issue (CVE-2014-0181),
but they did so in a way that breaks Zebra. The offending series
includes:
net: Add variants of capable for use on netlink messages
If a given kernel version is missing that series of fixes, it's
probably worth backporting it and this patch. if that series is
present, then this fix is critical if you care about Zebra.
Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Kristian Evensen [Fri, 30 May 2014 10:17:00 +0000 (12:17 +0200)]
ipheth: Add support for iPad 2 and iPad 3
Each iPad model has a different product id, this patch adds support for iPad 2
(pid 0x12a2) and iPad 3 (pid 0x12a6). Note that iPad 2 must be jailbroken and a
third-party app must be used for tethering to work. On iPad 3, tethering works
out of the box (assuming your ISP is nice).
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Thu, 29 May 2014 18:46:17 +0000 (20:46 +0200)]
team: fix mtu setting
Now it is not possible to set mtu to team device which has a port
enslaved to it. The reason is that when team_change_mtu() calls
dev_set_mtu() for port device, notificator for NETDEV_PRECHANGEMTU
event is called and team_device_event() returns NOTIFY_BAD forbidding
the change. So fix this by returning NOTIFY_DONE here in case team is
changing mtu in team_change_mtu().
Introduced-by: 3d249d4c "net: introduce ethernet teaming device" Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 29 May 2014 15:45:14 +0000 (08:45 -0700)]
net: fix inet_getid() and ipv6_select_ident() bugs
I noticed we were sending wrong IPv4 ID in TCP flows when MTU discovery
is disabled.
Note how GSO/TSO packets do not have monotonically incrementing ID.
06:37:41.575531 IP (id 14227, proto: TCP (6), length: 4396)
06:37:41.575534 IP (id 14272, proto: TCP (6), length: 65212)
06:37:41.575544 IP (id 14312, proto: TCP (6), length: 57972)
06:37:41.575678 IP (id 14317, proto: TCP (6), length: 7292)
06:37:41.575683 IP (id 14361, proto: TCP (6), length: 63764)
It appears I introduced this bug in linux-3.1.
inet_getid() must return the old value of peer->ip_id_count,
not the new one.
Lets revert this part, and remove the prevention of
a null identification field in IPv6 Fragment Extension Header,
which is dubious and not even done properly.
Fixes: 87c48fa3b463 ("ipv6: make fragment identifications less predictable") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: qmi_wwan: interface #11 in Sierra Wireless MC73xx is not QMI
This interface is unusable, as the cdc-wdm character device doesn't reply to
any QMI command. Also, the out-of-tree Sierra Wireless GobiNet driver fully
skips it.
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
Toshiaki Makita [Mon, 26 May 2014 06:15:53 +0000 (15:15 +0900)]
bridge: Prevent insertion of FDB entry with disallowed vlan
br_handle_local_finish() is allowing us to insert an FDB entry with
disallowed vlan. For example, when port 1 and 2 are communicating in
vlan 10, and even if vlan 10 is disallowed on port 3, port 3 can
interfere with their communication by spoofed src mac address with
vlan id 10.
Note: Even if it is judged that a frame should not be learned, it should
not be dropped because it is destined for not forwarding layer but higher
layer. See IEEE 802.1Q-2011 8.13.10.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maxwell [Thu, 29 May 2014 07:27:16 +0000 (17:27 +1000)]
bridge: notify user space after fdb update
There has been a number incidents recently where customers running KVM have
reported that VM hosts on different Hypervisors are unreachable. Based on
pcap traces we found that the bridge was broadcasting the ARP request out
onto the network. However some NICs have an inbuilt switch which on occasions
were broadcasting the VMs ARP request back through the physical NIC on the
Hypervisor. This resulted in the bridge changing ports and incorrectly learning
that the VMs mac address was external. As a result the ARP reply was directed
back onto the external network and VM never updated it's ARP cache. This patch
will notify the bridge command, after a fdb has been updated to identify such
port toggling.
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
After 1e785f48d29a ("net: Start with correct mac_len in
skb_network_protocol") skb->mac_len is used as a start of the
calculation in skb_network_protocol() but that is not always correct. If
skb->protocol == 8021Q/AD, usually the vlan header is already inserted
in the skb (i.e. vlan reorder hdr == 0). Usually when the packet enters
dev_hard_xmit it has mac_len == 0 so we take 2 bytes from the
destination mac address (skb->data + VLAN_HLEN) as a type in
skb_network_protocol() and return vlan_depth == 4. In the case where TSO is
off, then the mac_len is set but it's == 18 (ETH_HLEN + VLAN_HLEN), so
skb_network_protocol() returns a type from inside the packet and
offset == 22. Also make vlan_depth unsigned as suggested before.
As suggested by Eric Dumazet, move the while() loop in the if() so we
can avoid additional testing in fast path.
Here are few netperf tests + debug printk's to illustrate:
cat netperf.tso-on.reorder-on.bugged
- Vlan -> device (reorder on, default, this case is okay)
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
- Vlan -> device (reorder off, bad)
cat netperf.tso-on.reorder-off.bugged
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
87380 16384 16384 10.00 241.35
[ 204.578332] skb->len 1518 skb->gso_size 0 skb->proto 0x8100
skb->mac_len 0 vlan_depth 4 type 0x5301
0x5301 are the last two bytes of the destination mac.
And if we stop TSO, we may get even the following:
[ 83.343156] skb->len 2966 skb->gso_size 1448 skb->proto 0x8100
skb->mac_len 18 vlan_depth 22 type 0xb84
Because mac_len already accounts for VLAN_HLEN.
After the fix:
cat netperf.tso-on.reorder-off.fixed
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
CC: Vlad Yasevich <vyasevic@redhat.com> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Daniel Borkman <dborkman@redhat.com> CC: David S. Miller <davem@davemloft.net>
Fixes:1e785f48d29a ("net: Start with correct mac_len in
skb_network_protocol") Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 2 Jun 2014 01:28:58 +0000 (18:28 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"A fair number of fixes across the field. Nothing terribly
complicated; the one liners in below changelog should be fairly
descriptive.
Noteworthy is the SB1 change which the result of changes to binutils
resulting in one big gas warning for most files being assembled as
well as the asid_cache and branch emulation fixes which fix corruption
or possible uninteded behaviour of kernel or application code. The
remainder of fixes are more platforms or subsystem specific"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: R46000: Fix Micro-assembler field overflow for R4600 V2
MIPS: ptrace: Avoid smp_processor_id() in preemptible code
MIPS: Lemote 2F: cs5536: mfgpt: use raw locks
MIPS: SB1: Fix excessive kernel warnings.
MIPS: RC32434: fix broken PCI resource initialization
MIPS: malta: memory.c: Initialize the 'memsize' variable
MIPS: Fix typo when reporting cache and ftlb errors for ImgTec cores
MIPS: Fix inconsistancy of __NR_Linux_syscalls value
MIPS: Fix branch emulation of branch likely instructions.
MIPS: Fix a typo error in AUDIT_ARCH definition
MIPS: Change type of asid_cache to unsigned long
Linus Torvalds [Mon, 2 Jun 2014 01:26:59 +0000 (18:26 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Various fixlets, mostly related to the (root-only) SCHED_DEADLINE
policy, but also a hotplug bug fix and a fix for a NR_CPUS related
overallocation bug causing a suspend/resume regression"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix hotplug vs. set_cpus_allowed_ptr()
sched/cpupri: Replace NR_CPUS arrays
sched/deadline: Replace NR_CPUS arrays
sched/deadline: Restrict user params max value to 2^63 ns
sched/deadline: Change sched_getparam() behaviour vs SCHED_DEADLINE
sched: Disallow sched_attr::sched_policy < 0
sched: Make sched_setattr() correctly return -EFBIG
here you have another very small fix intended for net/linux-3.15.
It prevents some multicast functions from dereferencing a NULL pointer.
(Actually it was nothing more than a typo)
I hope it is not too late for such a small patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 31 May 2014 16:47:55 +0000 (09:47 -0700)]
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core futex/rtmutex fixes from Thomas Gleixner:
"Three fixlets for long standing issues in the futex/rtmutex code
unearthed by Dave Jones syscall fuzzer:
- Add missing early deadlock detection checks in the futex code
- Prevent user space from attaching a futex to kernel threads
- Make the deadlock detector of rtmutex work again
Looks large, but is more comments than code change"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rtmutex: Fix deadlock detector for real
futex: Prevent attaching to kernel threads
futex: Add another early deadlock detection check
Linus Torvalds [Sat, 31 May 2014 16:19:02 +0000 (09:19 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Mostly quiet now:
i915:
fixing userspace visiblie issues, all stable marked
radeon:
one more pll fix, two crashers, one suspend/resume regression"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: Resume fbcon last
drm/radeon: only allocate necessary size for vm bo list
drm/radeon: don't allow RADEON_GEM_DOMAIN_CPU for command submission
drm/radeon: avoid crash if VM command submission isn't available
drm/radeon: lower the ref * post PLL maximum once more
drm/i915: Prevent negative relocation deltas from wrapping
drm/i915: Only copy back the modified fields to userspace from execbuffer
drm/i915: Fix dynamic allocation of physical handles
Linus Torvalds [Sat, 31 May 2014 16:13:21 +0000 (09:13 -0700)]
dcache: add missing lockdep annotation
lock_parent() very much on purpose does nested locking of dentries, and
is careful to maintain the right order (lock parent first). But because
it didn't annotate the nested locking order, lockdep thought it might be
a deadlock on d_lock, and complained.
Add the proper annotation for the inner locking of the child dentry to
make lockdep happy.
Introduced by commit 046b961b45f9 ("shrink_dentry_list(): take parent's
->d_lock earlier").
Reported-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Acked-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
David S. Miller [Sat, 31 May 2014 00:56:09 +0000 (17:56 -0700)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
The following patchset contains a late fix for IPVS:
* Fix crash when trying to remove the transport header with non-linear
skbuffs, this was introduced in 3.6-rc. Patch from Peter Christensen
via the IPVS folks.
I'll pass this to -stable once this hits mainstream.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Tue, 27 May 2014 06:26:38 +0000 (09:26 +0300)]
net/mlx4_core: Reset RoCE VF gids when guest driver goes down
Reset the GIDs assigned to a VF in the port RoCE GID table when
that guest goes down (either crashes or goes down cleanly).
As part of this fix, we refactor the RoCE gid table driver copy,
moving it to the mlx4_port_info structure (together with the MAC
and VLAN tables).
As with the MAC and VLAN tables, we now use a mutex per port
for the GID table so that modifying the driver copy and
modifying the firmware copy of a port GID table becomes an
atomic operation (thus avoiding driver-copy/FW-copy mismatches).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Mikhaylov [Mon, 26 May 2014 18:34:39 +0000 (22:34 +0400)]
emac: aggregation of v1-2 PLB errors for IER register
Aggreagation of version 1-2 because of version 1 can hit
PLB errors too. If it's not set so we missing events for PLB bits
and driver can't process those interrupts.
Signed-off-by: Ivan Mikhaylov <ivan@ru.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
which was merged long ago and actually part of 3.14.
Digging deeper I've noticed (again) that the call to
drm_helper_resume_force_mode in the radeon resume handlers was a no-op
previously because everything gets shut down on suspend. radeon does
this with explicit calls to drm_helper_connector_dpms with DPMS_OFF.
But with 177c we now force the dpms state to ON, so suddenly
resume_force_mode actually forced the crtcs back on.
This is the intention of the change after all, the problem is that
radeon resumes the fbdev console layer _before_ restoring the display,
through calling fb_set_suspend. And fbcon does an immediate ->set_par,
which in turn causes the same forced mode restore to happen.
Two concurrent modeset operations didn't lead to happiness. Fix this
by delaying the fbcon resume until the end of the readeon resum
functions.
v2: Fix up a bit of the spelling fail.
References: https://lkml.org/lkml/2014/5/29/1043
References: https://lkml.org/lkml/2014/5/2/388
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=74751 Tested-by: Ken Moffat <zarniwhoop@ntlworld.com> Cc: Alex Deucher <alexdeucher@gmail.com> Cc: Ken Moffat <zarniwhoop@ntlworld.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@gmail.com>
Dave Airlie [Fri, 30 May 2014 23:19:05 +0000 (09:19 +1000)]
Merge branch 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux into drm-fixes
this is the next pull request for stashed up radeon fixes for 3.15. This is finally calming down with only four patches in this pull request.
* 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux:
drm/radeon: only allocate necessary size for vm bo list
drm/radeon: don't allow RADEON_GEM_DOMAIN_CPU for command submission
drm/radeon: avoid crash if VM command submission isn't available
drm/radeon: lower the ref * post PLL maximum once more
Linus Torvalds [Fri, 30 May 2014 19:07:48 +0000 (12:07 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input subsystem fixes from Dmitry Torokhov:
"A couple of driver/build fixups and also redone quirk for Synaptics
touchpads on Lenovo boxes (now using PNP IDs instead of DMI data to
limit number of quirks)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics - change min/max quirk table to pnp-id matching
Input: synaptics - add a matches_pnp_id helper function
Input: synaptics - T540p - unify with other LEN0034 models
Input: synaptics - add min/max quirk for the ThinkPad W540
Input: ambakmi - request a shared interrupt for AMBA KMI devices
Input: pxa27x-keypad - fix generating scancode
Input: atmel-wm97xx - only build for AVR32
Input: fix ps2/serio module dependency
Linus Torvalds [Fri, 30 May 2014 19:06:15 +0000 (12:06 -0700)]
Merge tag 'firewire-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fix from Stefan Richter:
"A regression fix for the IEEE 1394 subsystem: re-enable IRQ-based
asynchronous request reception at addresses below 128 TB"
* tag 'firewire-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: revert to 4 GB RDMA, fix protocols using Memory Space
Linus Torvalds [Fri, 30 May 2014 19:04:56 +0000 (12:04 -0700)]
Merge tag 'dm-3.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device-mapper fixes from Mike Snitzer:
"A dm-cache stable fix to split discards on cache block boundaries
because dm-cache cannot yet handle discards that span cache blocks.
Really fix a dm-mpath LOCKDEP warning that was introduced in -rc1.
Add a 'no_space_timeout' control to dm-thinp to restore the ability to
queue IO indefinitely when no data space is available. This fixes a
change in behavior that was introduced in -rc6 where the timeout
couldn't be disabled"
* tag 'dm-3.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm mpath: really fix lockdep warning
dm cache: always split discards on cache block boundaries
dm thin: add 'no_space_timeout' dm-thin-pool module param
Minchan Kim [Wed, 28 May 2014 06:53:59 +0000 (15:53 +0900)]
x86_64: expand kernel stack to 16K
While I play inhouse patches with much memory pressure on qemu-kvm,
3.14 kernel was randomly crashed. The reason was kernel stack overflow.
When I investigated the problem, the callstack was a little bit deeper
by involve with reclaim functions but not direct reclaim path.
I tried to diet stack size of some functions related with alloc/reclaim
so did a hundred of byte but overflow was't disappeard so that I encounter
overflow by another deeper callstack on reclaim/allocator path.
Of course, we might sweep every sites we have found for reducing
stack usage but I'm not sure how long it saves the world(surely,
lots of developer start to add nice features which will use stack
agains) and if we consider another more complex feature in I/O layer
and/or reclaim path, it might be better to increase stack size(
meanwhile, stack usage on 64bit machine was doubled compared to 32bit
while it have sticked to 8K. Hmm, it's not a fair to me and arm64
already expaned to 16K. )
So, my stupid idea is just let's expand stack size and keep an eye
toward stack consumption on each kernel functions via stacktrace of ftrace.
For example, we can have a bar like that each funcion shouldn't exceed 200K
and emit the warning when some function consumes more in runtime.
Of course, it could make false positive but at least, it could make a
chance to think over it.
I guess this topic was discussed several time so there might be
strong reason not to increase kernel stack size on x86_64, for me not
knowing so Ccing x86_64 maintainers, other MM guys and virtio
maintainers.
Here's an example call trace using up the kernel stack:
[ Note: the problem is exacerbated by certain gcc versions that seem to
generate much bigger stack frames due to apparently bad coalescing of
temporaries and generating too many spills. Rusty saw gcc-4.6.4 using
35% more stack on the virtio path than 4.8.2 does, for example.
Minchan not only uses such a bad gcc version (4.6.3 in his case), but
some of the stack use is due to debugging (CONFIG_DEBUG_PAGEALLOC is
what causes that kernel_map_pages() frame, for example). But we're
clearly getting too close.
The VM code also seems to have excessive stack frames partly for the
same compiler reason, triggered by excessive inlining and lots of
function arguments.
We need to improve on our stack use, but in the meantime let's do this
simple stack increase too. Unlike most earlier reports, there is
nothing simple that stands out as being really horribly wrong here,
apart from the fact that the stack frames are just bigger than they
should need to be. - Linus ]
Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Jones <davej@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michael S Tsirkin <mst@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: PJ Waskiewicz <pjwaskiewicz@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 30 May 2014 16:52:55 +0000 (09:52 -0700)]
Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs dcache livelock fix from Al Viro:
"Fixes for livelocks in shrink_dentry_list() introduced by fixes to
shrink list corruption; the root cause was that trylock of parent's
->d_lock could be disrupted by d_walk() happening on other CPUs,
resulting in shrink_dentry_list() making no progress *and* the same
d_walk() being called again and again for as long as
shrink_dentry_list() doesn't get past that mess.
The solution is to have shrink_dentry_list() treat that trylock
failure not as 'try to do the same thing again', but 'lock them in the
right order'"
* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
dentry_kill() doesn't need the second argument now
dealing with the rest of shrink_dentry_list() livelock
shrink_dentry_list(): take parent's ->d_lock earlier
expand dentry_kill(dentry, 0) in shrink_dentry_list()
split dentry_kill()
lift the "already marked killed" case into shrink_dentry_list()
Al Viro [Thu, 29 May 2014 13:11:45 +0000 (09:11 -0400)]
dealing with the rest of shrink_dentry_list() livelock
We have the same problem with ->d_lock order in the inner loop, where
we are dropping references to ancestors. Same solution, basically -
instead of using dentry_kill() we use lock_parent() (introduced in the
previous commit) to get that lock in a safe way, recheck ->d_count
(in case if lock_parent() has ended up dropping and retaking ->d_lock
and somebody managed to grab a reference during that window), trylock
the inode->i_lock and use __dentry_kill() to do the rest.
Al Viro [Thu, 29 May 2014 12:54:52 +0000 (08:54 -0400)]
shrink_dentry_list(): take parent's ->d_lock earlier
The cause of livelocks there is that we are taking ->d_lock on
dentry and its parent in the wrong order, forcing us to use
trylock on the parent's one. d_walk() takes them in the right
order, and unfortunately it's not hard to create a situation
when shrink_dentry_list() can't make progress since trylock
keeps failing, and shrink_dcache_parent() or check_submounts_and_drop()
keeps calling d_walk() disrupting the very shrink_dentry_list() it's
waiting for.
Solution is straightforward - if that trylock fails, let's unlock
the dentry itself and take locks in the right order. We need to
stabilize ->d_parent without holding ->d_lock, but that's doable
using RCU. And we'd better do that in the very beginning of the
loop in shrink_dentry_list(), since the checks on refcount, etc.
would need to be redone anyway.
That deals with a half of the problem - killing dentries on the
shrink list itself. Another one (dropping their parents) is
in the next commit.
locking parent is interesting - it would be easy to do rcu_read_lock(),
lock whatever we think is a parent, lock dentry itself and check
if the parent is still the right one. Except that we need to check
that *before* locking the dentry, or we are risking taking ->d_lock
out of order. Fortunately, once the D1 is locked, we can check if
D2->d_parent is equal to D1 without the need to lock D2; D2->d_parent
can start or stop pointing to D1 only under D1->d_lock, so taking
D1->d_lock is enough. In other words, the right solution is
rcu_read_lock/lock what looks like parent right now/check if it's
still our parent/rcu_read_unlock/lock the child.
Linus Torvalds [Thu, 29 May 2014 21:14:43 +0000 (14:14 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Will Deacon:
"Fix CoW regression for transparent hugepages by routing set_pmd_at to
set_pte_at, which correctly handles PTE_WRITE and will mark the
resulting table entry as read-only where appropriate"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: fix pmd_write CoW brokenness
Linus Torvalds [Thu, 29 May 2014 21:05:57 +0000 (14:05 -0700)]
Merge tag 'pm+acpi-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"These are three stable-candidate fixes, one for the ACPI thermal
driver and two for cpufreq drivers.
Specifics:
- A workqueue is destroyed too early during the ACPI thermal driver
module unload which leads to a NULL pointer dereference in the
driver's remove callback. Fix from Aaron Lu.
- A wrong argument is passed to devm_regulator_get_optional() in the
probe routine of the cpu0 cpufreq driver which leads to resource
leaks if the driver is unbound from the cpufreq platform device.
Fix from Lucas Stach.
- A lock is missing in cpufreq_governor_dbs() which leads to memory
corruption and NULL pointer dereferences during system
suspend/resume, for example. Fix from Bibek Basu"
* tag 'pm+acpi-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / thermal: fix workqueue destroy order
cpufreq: cpu0: drop wrong devm usage
cpufreq: remove race while accessing cur_policy
Linus Torvalds [Thu, 29 May 2014 20:59:18 +0000 (13:59 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux
Pull clock fixes from Mike Turquette:
"Small number of user-visible regression fixes for clock drivers.
There is a memory leak fix for an ST platform, an infinite Loop Of
Doom fix for the recent changes to the basic clock divider (hopefully
the last fix for those recent changes) and some Tegra PLL changes
which keep PCI from being hosed on that platform"
* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux:
clk: st: Fix memory leak
clk: divider: Fix table round up function
clk: tegra: Fix enabling of PLLE
clk: tegra: Introduce divider mask and shift helpers
clk: tegra: Fix PLLE programming
Stefan Richter [Thu, 29 May 2014 13:23:26 +0000 (15:23 +0200)]
firewire: revert to 4 GB RDMA, fix protocols using Memory Space
Undo a feature introduced in v3.14 by commit fcd46b34425d
"firewire: Enable remote DMA above 4 GB". That change raised the
minimum address at which protocol drivers and user programs can register
for request reception from 0x0001'0000'0000 to 0x8000'0000'0000.
It turned out that at least one vendor-specific protocol exists which
uses lower addresses: https://bugzilla.kernel.org/show_bug.cgi?id=76921
For the time being, revert most of commit fcd46b34425d so that affected
protocols work like with kernel v3.13 and before. Just keep the valid
documentation parts from the regressing commit, and the ability to
identify controllers which could be programmed to accept >32 bit
physical DMA addresses. The rest of fcd46b34425d should probably be
brought back as an optional instead of default feature.
Al Viro [Wed, 28 May 2014 17:59:13 +0000 (13:59 -0400)]
expand dentry_kill(dentry, 0) in shrink_dentry_list()
Result will be massaged to saner shape in the next commits. It is
ugly, no questions - the point of that one is to be a provably
equivalent transformation (and it might be worth splitting a bit
more).
Will Deacon [Tue, 27 May 2014 18:11:58 +0000 (19:11 +0100)]
arm64: mm: fix pmd_write CoW brokenness
Commit 9c7e535fcc17 ("arm64: mm: Route pmd thp functions through pte
equivalents") changed the pmd manipulator and accessor functions to
convert the target pmd to a pte, process it with the pte functions, then
convert it back. Along the way, we gained support for PTE_WRITE, however
this is completely ignored by set_pmd_at, and so we fail to set the
PMD_SECT_RDONLY for PMDs, resulting in all sorts of lovely failures (like
CoW not working).
Partially reverting the offending commit (by making use of
PMD_SECT_RDONLY explicitly for pmd_{write,wrprotect,mkwrite} functions)
leads to further issues because pmd_write can then return potentially
incorrect values for page table entries marked as RDONLY, leading to
BUG_ON(pmd_write(entry)) tripping under some THP workloads.
This patch fixes the issue by routing set_pmd_at through set_pte_at,
which correctly takes the PTE_WRITE flag into account. Given that
THP mappings are always anonymous, the additional cache-flushing code
in __sync_icache_dcache won't impose any significant overhead as the
flush will be skipped.
Cc: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
added a called to md_reap_sync_thread() which cause a reshape thread
to be interrupted (in particular, it could cause md_thread() to never even
call md_do_sync()).
However it didn't set MD_RECOVERY_INTR so ->finish_reshape() would not
know that the reshape didn't complete.
This only happens when mddev->ro is set and normally reshape threads
don't run in that situation. But raid5 and raid10 can start a reshape
thread during "run" is the array is in the middle of a reshape.
They do this even if ->ro is set.
So it is best to set MD_RECOVERY_INTR before abortingg the
sync thread, just in case.
Though it rare for this to trigger a problem it can cause data corruption
because the reshape isn't finished properly.
So it is suitable for any stable which the offending commit was applied to.
(3.2 or later)
Mathias Nyman [Wed, 28 May 2014 20:51:13 +0000 (23:51 +0300)]
xhci: delete endpoints from bandwidth list before freeing whole device
Lists of endpoints are stored for bandwidth calculation for roothub ports.
Make sure we remove all endpoints from the list before the whole device,
containing its endpoints list_head stuctures, is freed.
This used to be done in the wrong order in xhci_mem_cleanup(),
and triggered an oops in resume from S4 (hibernate).
Sean MacLennan [Wed, 28 May 2014 15:19:00 +0000 (11:19 -0400)]
staging: r8192e_pci: fix htons error
A sparse error fixup removed a htons() which is required for the driver
to function. This patch puts the htons() back and fixes the sparse
warning correctly by changing the left side cast.
Mathias Nyman [Wed, 28 May 2014 20:18:35 +0000 (23:18 +0300)]
usb: pci-quirks: Prevent Sony VAIO t-series from switching usb ports
Sony VAIO t-series machines are not capable of switching usb2 ports over
from Intel EHCI to xHCI controller. If tried the USB2 port will be left
unconnected and unusable.
This patch should be backported to stable kernels as old as 3.12,
that contain the commit 26b76798e0507429506b93cd49f8c4cfdab06896
"Intel xhci: refactor EHCI/xHCI port switching"
Linus Torvalds [Wed, 28 May 2014 18:17:41 +0000 (11:17 -0700)]
Merge tag 'sound-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Just two small stable fixes: an HD-audio fix for the new Intel
chipsets and a PM handling fix in PCM dmaengine core"
* tag 'sound-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix onboard audio on Intel H97/Z97 chipsets
ALSA: pcm_dmaengine: Add check during device suspend
Linus Torvalds [Wed, 28 May 2014 18:15:57 +0000 (11:15 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fix from Al Viro:
"Oh, well... Still nothing useful on that livelock (I had something
that looked kinda-sorta like a non-invasive solution, but it
deadlocks), so it's just Miklos' vmsplice fix for now"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: fix vmplice_to_user()
Nicolas Pitre [Fri, 23 May 2014 21:31:44 +0000 (22:31 +0100)]
ARM: 8063/1: bL_switcher: fix individual online status reporting of removed CPUs
The content of /sys/devices/system/cpu/cpu*/online is still 1 for those
CPUs that the switcher has removed even though the global state in
/sys/devices/system/cpu/online is updated correctly.
It turns out that commit 0902a9044f ("Driver core: Use generic
offline/online for CPU offline/online") has changed the way those files
retrieve their content by relying on on the generic attribute handling
code. The switcher, by calling cpu_down() directly, bypasses this
handling and the attribute value doesn't get updated.
Fix this by calling device_offline()/device_online() instead.
Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Thomas Gleixner [Thu, 22 May 2014 03:25:39 +0000 (03:25 +0000)]
rtmutex: Fix deadlock detector for real
The current deadlock detection logic does not work reliably due to the
following early exit path:
/*
* Drop out, when the task has no waiters. Note,
* top_waiter can be NULL, when we are in the deboosting
* mode!
*/
if (top_waiter && (!task_has_pi_waiters(task) ||
top_waiter != task_top_pi_waiter(task)))
goto out_unlock_pi;
So this not only exits when the task has no waiters, it also exits
unconditionally when the current waiter is not the top priority waiter
of the task.
So in a nested locking scenario, it might abort the lock chain walk
and therefor miss a potential deadlock.
Simple fix: Continue the chain walk, when deadlock detection is
enabled.
We also avoid the whole enqueue, if we detect the deadlock right away
(A-A). It's an optimization, but also prevents that another waiter who
comes in after the detection and before the task has undone the damage
observes the situation and detects the deadlock and returns
-EDEADLOCK, which is wrong as the other task is not in a deadlock
situation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20140522031949.725272460@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Wed, 28 May 2014 15:08:03 +0000 (08:08 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Small fixes for x86, slightly larger fixes for PPC, and a forgotten
s390 patch. The PPC fixes are important because they fix breakage
that is new in 3.15"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: s390: announce irqfd capability
KVM: x86: disable master clock if TSC is reset during suspend
KVM: vmx: disable APIC virtualization in nested guests
KVM guest: Make pv trampoline code executable
KVM: PPC: Book3S: ifdef on CONFIG_KVM_BOOK3S_32_HANDLER for 32bit
KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit
KVM: PPC: Book3S: HV: make _PAGE_NUMA take effect
Linus Torvalds [Wed, 28 May 2014 15:06:50 +0000 (08:06 -0700)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull two powerpc fixes from Ben Herrenschmidt:
"Here's a pair of powerpc fixes for 3.15 which are also going to
stable.
One's a fix for building with newer binutils (the problem currently
only affects the BookE kernels but the affected macro might come back
into use on BookS platforms at any time). Unfortunately, the binutils
maintainer did a backward incompatible change to a construct that we
use so we have to add Makefile check.
The other one is a fix for CPUs getting stuck in kexec when running
single threaded. Since we routinely use kexec on power (including in
our newer bootloaders), I deemed that important enough"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST mode
powerpc: Fix 64 bit builds with binutils 2.24
Al Viro [Wed, 28 May 2014 13:48:44 +0000 (09:48 -0400)]
lift the "already marked killed" case into shrink_dentry_list()
It can happen only when dentry_kill() is called with unlock_on_failure
equal to 0 - other callers had dentry pinned until the moment they've
got ->d_lock and DCACHE_DENTRY_KILLED is set only after lockref_mark_dead().
IOW, only one of three call sites of dentry_kill() might end up reaching
that code. Just move it there.
Alex Smith [Thu, 1 May 2014 11:51:19 +0000 (12:51 +0100)]
MIPS: ptrace: Avoid smp_processor_id() in preemptible code
ptrace_{get,set}_watch_regs access current_cpu_data to get the watch
register count/masks, which calls smp_processor_id(). However they are
run in preemptible context and therefore trigger warnings like so:
[ 6340.092000] BUG: using smp_processor_id() in preemptible [00000000] code: gdb/367
[ 6340.092000] caller is ptrace_get_watch_regs+0x44/0x220
Since the watch register count/masks should be the same across all
CPUs, use boot_cpu_data instead. Note that this may need to change in
future should a heterogenous system be supported where the count/masks
are not the same across all CPUs (the current code is also incorrect
for this scenario - current_cpu_data here would not necessarily be
correct for the CPU that the target task will execute on).
Signed-off-by: Alex Smith <alex.smith@imgtec.com> Reviewed-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6879/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The lock is taken in the raw irq path and therefore a rawlock should be
used instead of a normal spinlock.
While here I drop the export symbol on that variable since there are no
other users.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: linux-mips@linux-mips.org Cc: Hua Yan <yanh@lemote.com> Cc: Huacai Chen <chenhc@lemote.com> Cc: Alex Smith <alex.smith@imgtec.com> Cc: Hongliang Tao <taohl@lemote.com> Cc: Wu Zhangjin <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6936/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Wed, 28 May 2014 06:36:23 +0000 (08:36 +0200)]
MIPS: SB1: Fix excessive kernel warnings.
A kernel build with binutils 2.24 is going to emit warnings like
CC kernel/sys.o
{standard input}: Assembler messages:
{standard input}:701: Warning: the 32-bit MIPS architecture does not support the `mdmx' extension
{standard input}:701: Warning: the `mdmx' extension requires 64-bit FPRs
{standard input}:701: Warning: the `mips3d' extension requires MIPS32 revision 2 or greater
{standard input}:701: Warning: the `mips3d' extension requires 64-bit FPRs
for almost every file. This is caused by changes to gas' interpretation
of .set semantics. Fixed by explicitly disabling MIPS3D and MDMX for
Sibyte builds.
NeilBrown [Wed, 28 May 2014 03:39:23 +0000 (13:39 +1000)]
md: always set MD_RECOVERY_INTR when aborting a reshape or other "resync".
If mddev->ro is set, md_to_sync will (correctly) abort.
However in that case MD_RECOVERY_INTR isn't set.
If a RESHAPE had been requested, then ->finish_reshape() will be
called and it will think the reshape was successful even though
nothing happened.
Normally a resync will not be requested if ->ro is set, but if an
array is stopped while a reshape is on-going, then when the array is
started, the reshape will be restarted. If the array is also set
read-only at this point, the reshape will instantly appear to success,
resulting in data corruption.
Consequently, this patch is suitable for any -stable kernel.
Srivatsa S. Bhat [Tue, 27 May 2014 10:55:34 +0000 (16:25 +0530)]
powerpc, kexec: Fix "Processor X is stuck" issue during kexec from ST mode
If we try to perform a kexec when the machine is in ST (Single-Threaded) mode
(ppc64_cpu --smt=off), the kexec operation doesn't succeed properly, and we
get the following messages during boot:
[ 0.089866] POWER8 performance monitor hardware support registered
[ 0.089985] power8-pmu: PMAO restore workaround active.
[ 5.095419] Processor 1 is stuck.
[ 10.097933] Processor 2 is stuck.
[ 15.100480] Processor 3 is stuck.
[ 20.102982] Processor 4 is stuck.
[ 25.105489] Processor 5 is stuck.
[ 30.108005] Processor 6 is stuck.
[ 35.110518] Processor 7 is stuck.
[ 40.113369] Processor 9 is stuck.
[ 45.115879] Processor 10 is stuck.
[ 50.118389] Processor 11 is stuck.
[ 55.120904] Processor 12 is stuck.
[ 60.123425] Processor 13 is stuck.
[ 65.125970] Processor 14 is stuck.
[ 70.128495] Processor 15 is stuck.
[ 75.131316] Processor 17 is stuck.
Note that only the sibling threads are stuck, while the primary threads (0, 8,
16 etc) boot just fine. Looking closer at the previous step of kexec, we observe
that kexec tries to wakeup (bring online) the sibling threads of all the cores,
before performing kexec:
[ 9464.131231] Starting new kernel
[ 9464.148507] kexec: Waking offline cpu 1.
[ 9464.148552] kexec: Waking offline cpu 2.
[ 9464.148600] kexec: Waking offline cpu 3.
[ 9464.148636] kexec: Waking offline cpu 4.
[ 9464.148671] kexec: Waking offline cpu 5.
[ 9464.148708] kexec: Waking offline cpu 6.
[ 9464.148743] kexec: Waking offline cpu 7.
[ 9464.148779] kexec: Waking offline cpu 9.
[ 9464.148815] kexec: Waking offline cpu 10.
[ 9464.148851] kexec: Waking offline cpu 11.
[ 9464.148887] kexec: Waking offline cpu 12.
[ 9464.148922] kexec: Waking offline cpu 13.
[ 9464.148958] kexec: Waking offline cpu 14.
[ 9464.148994] kexec: Waking offline cpu 15.
[ 9464.149030] kexec: Waking offline cpu 17.
Instrumenting this piece of code revealed that the cpu_up() operation actually
fails with -EBUSY. Thus, only the primary threads of all the cores are online
during kexec, and hence this is a sure-shot receipe for disaster, as explained
in commit e8e5c2155b (powerpc/kexec: Fix orphaned offline CPUs across kexec),
as well as in the comment above wake_offline_cpus().
It turns out that cpu_up() was returning -EBUSY because the variable
'cpu_hotplug_disabled' was set to 1; and this disabling of CPU hotplug was done
by migrate_to_reboot_cpu() inside kernel_kexec().
Now, migrate_to_reboot_cpu() was originally written with the assumption that
any further code will not need to perform CPU hotplug, since we are anyway in
the reboot path. However, kexec is clearly not such a case, since we depend on
onlining CPUs, atleast on powerpc.
So re-enable cpu-hotplug after returning from migrate_to_reboot_cpu() in the
kexec path, to fix this regression in kexec on powerpc.
Also, wrap the cpu_up() in powerpc kexec code within a WARN_ON(), so that we
can catch such issues more easily in the future.
Fixes: c97102ba963 (kexec: migrate to reboot cpu) Cc: stable@vger.kernel.org Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Guenter Roeck [Thu, 15 May 2014 16:33:42 +0000 (09:33 -0700)]
powerpc: Fix 64 bit builds with binutils 2.24
With binutils 2.24, various 64 bit builds fail with relocation errors
such as
arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e':
(.text+0x165ee): relocation truncated to fit: R_PPC64_ADDR16_HI
against symbol `interrupt_base_book3e' defined in .text section
in arch/powerpc/kernel/built-in.o
arch/powerpc/kernel/built-in.o: In function `exc_debug_crit_book3e':
(.text+0x16602): relocation truncated to fit: R_PPC64_ADDR16_HI
against symbol `interrupt_end_book3e' defined in .text section
in arch/powerpc/kernel/built-in.o
The assembler maintainer says:
I changed the ABI, something that had to be done but unfortunately
happens to break the booke kernel code. When building up a 64-bit
value with lis, ori, shl, oris, ori or similar sequences, you now
should use @high and @higha in place of @h and @ha. @h and @ha
(and their associated relocs R_PPC64_ADDR16_HI and R_PPC64_ADDR16_HA)
now report overflow if the value is out of 32-bit signed range.
ie. @h and @ha assume you're building a 32-bit value. This is needed
to report out-of-range -mcmodel=medium toc pointer offsets in @toc@h
and @toc@ha expressions, and for consistency I did the same for all
other @h and @ha relocs.
Replacing @h with @high in one strategic location fixes the relocation
errors. This has to be done conditionally since the assembler either
supports @h or @high but not both.
Cc: <stable@vger.kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Dave Airlie [Tue, 27 May 2014 23:18:32 +0000 (09:18 +1000)]
Merge tag 'drm-intel-fixes-2014-05-27' of git://anongit.freedesktop.org/drm-intel into drm-fixes
Fixes from Chris, all cc: stable.
* tag 'drm-intel-fixes-2014-05-27' of git://anongit.freedesktop.org/drm-intel:
drm/i915: Prevent negative relocation deltas from wrapping
drm/i915: Only copy back the modified fields to userspace from execbuffer
drm/i915: Fix dynamic allocation of physical handles
Linus Torvalds [Tue, 27 May 2014 22:59:34 +0000 (15:59 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull virtio_blk fix from Jens Axboe:
"There's a start/stop queue race in virtio_blk, which causes stalls and
erratic behaviour for some. I've had this queued up for 3.16 for a
while, but I think we should push it into the current series as well.
So I cherry picked the commit and added a stable marker as well, so it
can propagate down"
* 'for-linus' of git://git.kernel.dk/linux-block:
virtio_blk: fix race between start and stop queue
Linus Torvalds [Tue, 27 May 2014 22:58:20 +0000 (15:58 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull two timer fixes from Thomas Gleixner:
"Two small fixlets for ARM SoC clocksource drivers:
- avoid calling functions which might sleep from interrupt [disabled]
context in tcb_clksrc used on Atmel SoCs
- use irq_force_affinity() to pin the per cpu timer interrupt on a
not yet online cpu in the SiRFprimaII driver"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: tcb_clksrc: Make tc_mode interrupt safe
clocksource: marco: Fix the affinity set for local timer of CPU1
Johan Hovold [Sat, 26 Apr 2014 09:53:44 +0000 (11:53 +0200)]
USB: io_ti: fix firmware download on big-endian machines (part 2)
A recent patch that purported to fix firmware download on big-endian
machines failed to add the corresponding sparse annotation to the
i2c-header. This was reported by the kbuild test robot.
Adding the appropriate annotation revealed another endianess bug related
to the i2c-header Size-field in a code path that is exercised when the
firmware is actually being downloaded (and not just verified and left
untouched unless older than the firmware at hand).
This patch adds the required sparse annotation to the i2c-header and
makes sure that the Size-field is sent in little-endian byte order
during firmware download also on big-endian machines.
Note that this patch is only compile-tested, but that there is no
functional change for little-endian systems.
Reported-by: kbuild test robot <fengguang.wu@intel.com> Cc: Ludovic Drolez <ldrolez@debian.org> Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Tue, 27 May 2014 20:59:24 +0000 (13:59 -0700)]
Merge tag 'fixes-for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A slightly larger set of fixes than we'd like at this point in the
release. Hopefully our very last batch before 3.15:
OMAP:
- Fix boot regression with CPU_IDLE enabled
- Fixes for audio playback on OMAP5
- Clock rate setting fix for OMAP3
- Misc idle/PM fixes
Exynos:
- Removal of a couple of power domains to work around issues with
access when they are powered down
- Enabling missing highspeed-i2c driver to make MMC regulators work
- Secondary CPU spin-up fix for 4212
- Remove MDMA1 engine to avoid conflicts on secure mode platforms
- A few other DT fixes
Marvell:
- PCI-e fixes for clocks and resource allocation
plus a few other smaller fixes, add a MAINTAINERS entry for reset
drivers, etc"
* tag 'fixes-for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (21 commits)
MAINTAINERS: Add reset controller framework entry
ARM: trusted_foundations: fix compile error on non-SMP
ARM: at91: sam9260: fix compilation issues
ARM: mvebu: fix definitions of PCIe interfaces on Armada 38x
ARM: imx: fix error handling in ipu device registration
ARM: OMAP4: Fix the boot regression with CPU_IDLE enabled
ARM: dts: Keep LDO4 always ON for exynos5250-arndale board
ARM: dts: Fix SPI interrupt numbers for exynos5420
ARM: dts: fix incorrect ak8975 compatible for exynos4412-trats2 board
ARM: OMAP2+: Fix DMA hang after off-idle
ARM: OMAP2+: nand: Fix NAND on OMAP2 and OMAP3 boards
ARM: dts: Remove g2d_pd node for exynos5420
ARM: dts: Remove mau_pd node for exynos5420
ARM: exynos_defconfig: enable HS-I2C to fix for mmc partition mount
ARM: dts: disable MDMA1 node for exynos5420
ARM: EXYNOS: fix the secondary CPU boot of exynos4212
ARM: omap5: hwmod_data: Correct IDLEMODE for McPDM
ARM: mvebu: mvebu-soc-id: keep clock enabled if PCIe unit is enabled
ARM: mvebu: mvebu-soc-id: add missing clk_put() call
ARM: at91/dt: sam9260: correct external trigger value
...
Linus Torvalds [Tue, 27 May 2014 20:58:46 +0000 (13:58 -0700)]
Merge tag 'pinctrl-v3.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pinctrl fix from Linus Walleij:
"A single last pinctrl fix for the v3.15 series: the vt8500 driver was
failing to update the output value when the combined set direction
output and set value was executed"
* tag 'pinctrl-v3.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: vt8500: Ensure value reg is updated when setting direction
Linus Torvalds [Tue, 27 May 2014 20:57:00 +0000 (13:57 -0700)]
Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine fixes from Vinod Koul:
"We have three small fixes.
First one from Andy reverts the devm_request irq as we need to ensure
the tasklet is killed after irq is freed, so we need to do free irq in
our code. Other two from Arnd are fixing the compilation issue in
omap and sa11x0 drivers with ARM randconfigs"
* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: sa11x0: remove broken #ifdef
dmaengine: omap: hide filter_fn for built-in drivers
dmaengine: dw: went back to plain {request,free}_irq() calls
Hannes Reinecke [Mon, 26 May 2014 12:45:39 +0000 (14:45 +0200)]
dm mpath: really fix lockdep warning
lockdep complains about a circular locking. And indeed, we need to
release the lock before calling dm_table_run_md_queue_async().
As such, commit 4cdd2ad ("dm mpath: fix lock order inconsistency in
multipath_ioctl") must also be reverted in addition to fixing the
lock order in the other dm_table_run_md_queue_async() callers.
Reported-by: Bart van Assche <bvanassche@acm.org> Tested-by: Bart van Assche <bvanassche@acm.org> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Ming Lei [Fri, 16 May 2014 15:31:21 +0000 (23:31 +0800)]
virtio_blk: fix race between start and stop queue
When there isn't enough vring descriptor for adding to vq,
blk-mq will be put as stopped state until some of pending
descriptors are completed & freed.
Unfortunately, the vq's interrupt may come just before
blk-mq's BLK_MQ_S_STOPPED flag is set, so the blk-mq will
still be kept as stopped even though lots of descriptors
are completed and freed in the interrupt handler. The worst
case is that all pending descriptors are freed in the
interrupt handler, and the queue is kept as stopped forever.
This patch fixes the problem by starting/stopping blk-mq
with holding vq_lock.
Cc: Jens Axboe <axboe@kernel.dk> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ming Lei <tom.leiming@gmail.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <axboe@fb.com>
Conflicts:
drivers/block/virtio_blk.c
dm cache: always split discards on cache block boundaries
The DM cache target cannot cope with discards that span multiple cache
blocks, so each discard bio that spans more than one cache block must
get split by the DM core.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Acked-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # v3.9+
Oliver Hartkopp [Tue, 27 May 2014 11:30:56 +0000 (13:30 +0200)]
can: only rename enabled led triggers when changing the netdev name
Commit a1ef7bd9fce8 ("can: rename LED trigger name on netdev renames") renames
the led trigger names according to the changed netdevice name.
As not every CAN driver supports and initializes the led triggers, checking for
the CAN private datastructure with safe_candev_priv() in the notifier chain is
not enough.
This patch adds a check when CONFIG_CAN_LEDS is enabled and the driver does not
support led triggers.
For stable 3.9+
Cc: Fabio Baltieri <fabio.baltieri@gmail.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Acked-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Chris Wilson [Fri, 23 May 2014 06:48:08 +0000 (08:48 +0200)]
drm/i915: Prevent negative relocation deltas from wrapping
This is pure evil. Userspace, I'm looking at you SNA, repacks batch
buffers on the fly after generation as they are being passed to the
kernel for execution. These batches also contain self-referenced
relocations as a single buffer encompasses the state commands, kernels,
vertices and sampler. During generation the buffers are placed at known
offsets within the full batch, and then the relocation deltas (as passed
to the kernel) are tweaked as the batch is repacked into a smaller buffer.
This means that userspace is passing negative relocations deltas, which
subsequently wrap to large values if the batch is at a low address. The
GPU hangs when it then tries to use the large value as a base for its
address offsets, rather than wrapping back to the real value (as one
would hope). As the GPU uses positive offsets from the base, we can
treat the relocation address as the minimum address read by the GPU.
For the upper bound, we trust that userspace will not read beyond the
end of the buffer.
So, how do we fix negative relocations from wrapping? We can either
check that every relocation looks valid when we write it, and then
position each object such that we prevent the offset wraparound, or we
just special-case the self-referential behaviour of SNA and force all
batches to be above 256k. Daniel prefers the latter approach.
This fixes a GPU hang when it tries to use an address (relocation +
offset) greater than the GTT size. The issue would occur quite easily
with full-ppgtt as each fd gets its own VM space, so low offsets would
often be handed out. However, with the rearrangement of the low GTT due
to capturing the BIOS framebuffer, it is already affecting kernels 3.15
onwards. I think only IVB+ is susceptible to this bug, but the workaround
should only kick in rarely, so it seems sensible to always apply it.
v3: Use a bias for batch buffers to prevent small negative delta relocations
from wrapping.
v4 from Daniel:
- s/BIAS/BATCH_OFFSET_BIAS/
- Extract eb_vma_misplaced/i915_vma_misplaced since the conditions
were growing rather cumbersome.
- Add a comment to eb_get_batch explaining why we do this.
- Apply the batch offset bias everywhere but mention that we've only
observed it on gen7 gpus.
- Drop PIN_OFFSET_FIX for now, that slipped in from a feature patch.
v5: Add static to eb_get_batch, spotted by 0-day tester.
Testcase: igt/gem_bad_reloc
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78533 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v3) Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Fri, 23 May 2014 09:45:52 +0000 (10:45 +0100)]
drm/i915: Only copy back the modified fields to userspace from execbuffer
We only want to modifiy a single field in the userspace view of the
execbuffer command buffer, so explicitly change that rather than copy
everything back again.
This serves two purposes:
1. The single fields are much cheaper to copy (constant size so the
copy uses special case code) and much smaller than the whole array.
2. We modify the array for internal use that need to be masked from
the user.
Note: We need this backported since without it the next bugfix will
blow up when userspace recycles batchbuffers and relocations.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>