git.karo-electronics.de Git - linux-beck.git/log

Merge tag 'nfs-rdma-4.8-1' of git://git.linux-nfs.org/projects/anna/nfs-rdma

NFS: NFSoRDMA Client Side Changes

New Features:
- Add kerberos support

Bugfixes and cleanups:
- Remove ALLPHYSICAL memory registration mode
- Fix FMR disconnect recovery
- Reduce memory usage

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

NFS: Don't drop CB requests with invalid principals

Before commit 778be232a207 ("NFS do not find client in NFSv4
pg_authenticate"), the Linux callback server replied with
RPC_AUTH_ERROR / RPC_AUTH_BADCRED, instead of dropping the CB
request. Let's restore that behavior so the server has a chance to
do something useful about it, and provide a warning that helps
admins correct the problem.

Fixes: 778be232a207 ("NFS do not find client in NFSv4 ...")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

svc: Avoid garbage replies when pc_func() returns rpc_drop_reply

If an RPC program does not set vs_dispatch and pc_func() returns
rpc_drop_reply, the server sends a reply anyway containing a single
word containing the value RPC_DROP_REPLY (in network byte-order, of
course). This is a nonsense RPC message.

Fixes: 9e701c610923 ("svcrpc: simpler request dropping")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: No direct data placement with krb5i and krb5p

Direct data placement is not allowed when using flavors that
guarantee integrity or privacy. When such security flavors are in
effect, don't allow the use of Read and Write chunks for moving
individual data items. All messages larger than the inline threshold
are sent via Long Call or Long Reply.

On my systems (CX-3 Pro on FDR), for small I/O operations, the use
of Long messages adds only around 5 usecs of latency in each
direction.

Note that when integrity or encryption is used, the host CPU touches
every byte in these messages. Even if it could be used, data
movement offload doesn't buy much in this case.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Clean up fixup_copy_count accounting

fixup_copy_count should count only the number of bytes copied to the
page list. The head and tail are now always handled without a data
copy.

And the debugging at the end of rpcrdma_inline_fixup() is also no
longer necessary, since copy_len will be non-zero when there is reply
data in the tail (a normal and valid case).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Update only specific fields in private receive buffer

Now that rpcrdma_inline_fixup() updates only two fields in
rq_rcv_buf, a full memcpy of that structure to rq_private_buf is
unwarranted. Updating rq_private_buf fields only where needed also
better documents what is going on.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Do not update {head, tail}.iov_len in rpcrdma_inline_fixup()

While trying NFSv4.0/RDMA with sec=krb5p, I noticed small NFS READ
operations failed. After the client unwrapped the NFS READ reply
message, the NFS READ XDR decoder was not able to decode the reply.
The message was "Server cheating in reply", with the reported
number of received payload bytes being zero. Applications reported
a read(2) that returned -1/EIO.

The problem is rpcrdma_inline_fixup() sets the tail.iov_len to zero
when the incoming reply fits entirely in the head iovec. The zero
tail.iov_len confused xdr_buf_trim(), which then mangled the actual
reply data instead of simply removing the trailing GSS checksum.

As near as I can tell, RPC transports are not supposed to update the
head.iov_len, page_len, or tail.iov_len fields in the receive XDR
buffer when handling an incoming RPC reply message. These fields
contain the length of each component of the XDR buffer, and hence
the maximum number of bytes of reply data that can be stored in each
XDR buffer component. I've concluded this because:

- This is how xdr_partial_copy_from_skb() appears to behave
- rpcrdma_inline_fixup() already does not alter page_len
- call_decode() compares rq_private_buf and rq_rcv_buf and WARNs
if they are not exactly the same

Unfortunately, as soon as I tried the simple fix to just remove the
line that sets tail.iov_len to zero, I saw that the logic that
appends the implicit Write chunk pad inline depends on inline_fixup
setting tail.iov_len to zero.

To address this, re-organize the tail iovec handling logic to use
the same approach as with the head iovec: simply point tail.iov_base
to the correct bytes in the receive buffer.

While I remember all this, write down the conclusion in documenting
comments.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: rpcrdma_inline_fixup() overruns the receive page list

When the remaining length of an incoming reply is longer than the
XDR buf's page_len, switch over to the tail iovec instead of
copying more than page_len bytes into the page list.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Chunk list encoders no longer share one rl_segments array

Currently, all three chunk list encoders each use a portion of the
one rl_segments array in rpcrdma_req. This is because the MWs for
each chunk list were preserved in rl_segments so that ro_unmap could
find and invalidate them after the RPC was complete.

However, now that MWs are placed on a per-req linked list as they
are registered, there is no longer any information in rpcrdma_mr_seg
that is shared between ro_map and ro_unmap_{sync,safe}, and thus
nothing in rl_segments needs to be preserved after
rpcrdma_marshal_req is complete.

Thus the rl_segments array can be used now just for the needs of
each rpcrdma_convert_iovs call. Once each chunk list is encoded, the
next chunk list encoder is free to re-use all of rl_segments.

This means all three chunk lists in one RPC request can now each
encode a full size data payload with no increase in the size of
rl_segments.

This is a key requirement for Kerberos support, since both the Call
and Reply for a single RPC transaction are conveyed via Long
messages (RDMA Read/Write). Both can be large.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Place registered MWs on a per-req list

Instead of placing registered MWs sparsely into the rl_segments
array, place these MWs on a per-req list.

ro_unmap_{sync,safe} can then simply pull those MWs off the list
instead of walking through the array.

This change significantly reduces the size of struct rpcrdma_req
by removing nsegs and rl_mw from every array element.

As an additional clean-up, chunk co-ordinates are returned in the
"*mw" output argument so they are no longer needed in every
array element.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Release orphaned MRs immediately

Instead of leaving orphaned MRs to be released when the transport
is destroyed, release them immediately. The MR free list can now be
replenished if it becomes exhausted.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Allocate MRs on demand

Frequent MR list exhaustion can impact I/O throughput, so enough MRs
are always created during transport set-up to prevent running out.
This means more MRs are created than most workloads need.

Commit 94f58c58c0b4 ("xprtrdma: Allow Read list and Reply chunk
simultaneously") introduced support for sending two chunk lists per
RPC, which consumes more MRs per RPC.

Instead of trying to provision more MRs, introduce a mechanism for
allocating MRs on demand. A few MRs are allocated during transport
set-up to kick things off.

This significantly reduces the average number of MRs per transport
while allowing the MR count to grow for workloads or devices that
need more MRs.

FRWR with mlx4 allocated almost 400 MRs per transport before this
patch. Now it starts with 32.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Chunk list encoders must not return zero

Clean up, based on code audit: Remove the possibility that the
chunk list XDR encoders can return zero, which would be interpreted
as a NULL.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Honor ->send_request API contract

Commit c93c62231cf5 ("xprtrdma: Disconnect on registration failure")
added a disconnect for some RPC marshaling failures. This is needed
only in a handful of cases, but it was triggering for simple stuff
like temporary resource shortages. Try to straighten this out.

Fix up the lower layers so they don't return -ENOMEM or other error
codes that the RPC client's FSM doesn't explicitly recognize.

Also fix up the places in the send_request path that do want a
disconnect. For example, when ib_post_send or ib_post_recv fail,
this is a sign that there is a send or receive queue resource
miscalculation. That should be rare, and is a sign of a software
bug. But xprtrdma can recover: disconnect to reset the transport and
start over.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Reply buffer exhaustion can be catastrophic

Not having an rpcrdma_rep at call_allocate time can be a problem.
It means that send_request can't post a receive buffer to catch
the RPC's reply. Possible consequences are RPC timeouts or even
transport deadlock.

Instead of allowing an RPC to proceed if an rpcrdma_rep is
not available, return NULL to force call_allocate to wait and
try again.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Clean up device capability detection

Clean up: Move device capability detection into memreg-specific
source files.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Remove rpcrdma_map_one() and friends

Clean up: ALLPHYSICAL is gone and FMR has been converted to use
scatterlists. There are no more users of these functions.

This patch shrinks the size of struct rpcrdma_req by about 3500
bytes on x86_64. There is one of these structs for each RPC credit
(128 credits per transport connection).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Remove ALLPHYSICAL memory registration mode

No HCA or RNIC in the kernel tree requires the use of ALLPHYSICAL.

ALLPHYSICAL advertises in the clear on the network fabric an R_key
that is good for all of the client's memory. No known exploit
exists, but theoretically any user on the server can use that R_key
on the client's QP to read or update any part of the client's memory.

ALLPHYSICAL exposes the client to server bugs, including:
o base/bounds errors causing data outside the i/o buffer to be
accessed
o RDMA access after reply causing data corruption and/or integrity
fail

ALLPHYSICAL can't protect application memory regions from server
update after a local signal or soft timeout has terminated an RPC.

ALLPHYSICAL chunks are no larger than a page. Special cases to
handle small chunks and long chunk lists have been a source of
implementation complexity and bugs.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Do not leak an MW during a DMA map failure

Based on code audit.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Refactor MR recovery work queues

I found that commit ead3f26e359e ("xprtrdma: Add ro_unmap_safe
memreg method"), which introduces ro_unmap_safe, never wired up the
FMR recovery worker.

The FMR and FRWR recovery work queues both do the same thing.
Instead of setting up separate individual work queues for this,
schedule a delayed worker to deal with them, since recovering MRs is
not performance-critical.

Fixes: ead3f26e359e ("xprtrdma: Add ro_unmap_safe memreg method")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Use scatterlist for DMA mapping and unmapping under FMR

The use of a scatterlist for handling DMA mapping and unmapping
was recently introduced in frwr_ops.c in commit 4143f34e01e9
("xprtrdma: Port to new memory registration API"). That commit did
not make a similar update to xprtrdma's FMR support because the
core ib_map_phys_fmr() and ib_unmap_fmr() APIs have not been changed
to take a scatterlist argument.

However, FMR still needs to do DMA mapping and unmapping. It appears
that RDS, for example, uses a scatterlist for this, then builds the
DMA addr array for the ib_map_phys_fmr call separately. I see that
SRP also utilizes a scatterlist for DMA mapping. xprtrdma can do
something similar.

This modernization is used immediately to properly defer DMA
unmapping during fmr_unmap_safe (a FIXME). It separates the DMA
unmapping coordinates from the rl_segments array. This array, being
part of an rpcrdma_req, is always re-used immediately when an RPC
exits. A scatterlist is allocated in memory independent of the
rl_segments array, so it can be preserved indefinitely (ie, until
the MR invalidation and DMA unmapping can actually be done by a
worker thread).

The FRWR and FMR DMA mapping code are slightly different from each
other now, and will diverge further when the "Check for holes" logic
can be removed from FRWR (support for SG_GAP MRs). So I chose not to
create helpers for the common-looking code.

Fixes: ead3f26e359e ("xprtrdma: Add ro_unmap_safe memreg method")
Suggested-by: Sagi Grimberg <sagi@lightbits.io>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Rename fields in rpcrdma_fmr

Clean up: Use the same naming convention used in other
RPC/RDMA-related data structures.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Move init and release helpers

Clean up: Moving these helpers in a separate patch makes later
patches more readable.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Create common scatterlist fields in rpcrdma_mw

Clean up: FMR is about to replace the rpcrdma_map_one code with
scatterlists. Move the scatterlist fields out of the FRWR-specific
union and into the generic part of rpcrdma_mw.

One minor change: -EIO is now returned if FRWR registration fails.
The RPC is terminated immediately, since the problem is likely due
to a software bug, thus retrying likely won't help.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Remove FMRs from the unmap list after unmapping

ib_unmap_fmr() takes a list of FMRs to unmap. However, it does not
remove the FMRs from this list as it processes them. Other
ib_unmap_fmr() call sites are careful to remove FMRs from the list
after ib_unmap_fmr() returns.

Since commit 7c7a5390dc6c8 ("xprtrdma: Add ro_unmap_sync method for FMR")
fmr_op_unmap_sync passes more than one FMR to ib_unmap_fmr(), but
it didn't bother to remove the FMRs from that list once the call was
complete.

I've noticed some instability that could be related to list
tangling by the new fmr_op_unmap_sync() logic. In an abundance
of caution, add some defensive logic to clean up properly after
ib_unmap_fmr().

Fixes: 7c7a5390dc6c8 ("xprtrdma: Add ro_unmap_sync method for FMR")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

Linux 4.7-rc7

tmpfs: fix regression hang in fallocate undo

The well-spotted fallocate undo fix is good in most cases, but not when
fallocate failed on the very first page. index 0 then passes lend -1
to shmem_undo_range(), and that has two bad effects: (a) that it will
undo every fallocation throughout the file, unrestricted by the current
range; but more importantly (b) it can cause the undo to hang, because
lend -1 is treated as truncation, which makes it keep on retrying until
every page has gone, but those already fully instantiated will never go
away. Big thank you to xfstests generic/269 which demonstrates this.

Fixes: b9b4bb26af01 ("tmpfs: don't undo fallocate past its last page")
Cc: stable@vger.kernel.org
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus

Pull MIPS fix from Ralf Baechle:
"Another week with just a single 4.7 fix.

  This fixes a possible 'loss' of the huge page bit from pmd on
  permission change"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Fix page table corruption on THP permission changes.

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Three fixes.  One is the qla24xx MSI regression, one is a theoretical
  problem over blacklist matching, which would bite USB badly if it ever
  triggered and one is a system hang with a particular type of IPR
  device"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  qla2xxx: Fix NULL pointer deref in QLA interrupt
  SCSI: fix new bug in scsi_dev_info_list string matching
  ipr: Clear interrupt on croc/crocodile when running with LSI

Merge tag 'ecryptfs-4.7-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs

Pull eCryptfs fixes from Tyler Hicks:
"Provide a more concise fix for CVE-2016-1583:
   - Additionally fixes linux-stable regressions caused by the
     cherry-picking of the original fix

  Some very minor changes that have queued up:
   - Fix typos in code comments
   - Remove unnecessary check for NULL before destroying kmem_cache"

* tag 'ecryptfs-4.7-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
  ecryptfs: don't allow mmap when the lower fs doesn't support it
  Revert "ecryptfs: forbid opening files without mmap handler"
  ecryptfs: fix spelling mistakes
  eCryptfs: fix typos in comment
  ecryptfs: drop null test before destroy functions

Merge tag 'iommu-fixes-v4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU fixes from Joerg Roedel:
"Two Fixes:

   - Intel VT-d fix for a suspend/resume issue, introduced with the
     scalability improvements in this cycle.

   - AMD IOMMU fix for systems that have unity mappings defined.  There
     was a race where translation got enabled before the unity mappings
     were in place.  This issue was seen on some HP servers"

* tag 'iommu-fixes-v4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/amd: Fix unity mapping initialization race
  iommu/vt-d: Fix infinite loop in free_all_cpu_cached_iovas

Merge tag 'for-linus-4.7b-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen bug fixes from David Vrabel:

- Fix two bugs in the handling of xenbus transactions.

- Make the xen acpi driver compatible with Xen 4.7.

* tag 'for-linus-4.7b-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7
  xenbus: simplify xenbus_dev_request_and_reply()
  xenbus: don't bail early from xenbus_dev_request_and_reply()
  xenbus: don't BUG() on user mode induced condition

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
"A couple of late fixes here, but one that we've been sitting on for a
  few weeks while the details were worked out.  Specifically, we now
  enforce USER_DS on taking exceptions whilst in the kernel, which
  avoids leaking kernel data to userspace through things like perf.  The
  other patch is an update to a workaround for a hardware erratum on
  some Cavium SoCs.

  Summary:

   - Enforce USER_DS on exception entry from EL1

   - Apply workaround for Cavium errata #27456 on Thunderx-81xx parts"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Enable workaround for Cavium erratum 27456 on thunderx-81xx
  arm64: kernel: Save and restore UAO and addr_limit on exception entry

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"Three fixes:

   - A boot crash fix with certain configs
   - a MAINTAINERS entry update
   - Documentation typo fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/Documentation: Fix various typos in Documentation/x86/ files
  x86/amd_nb: Fix boot crash on non-AMD systems
  MAINTAINERS: Update the Calgary IOMMU entry

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
"Two load-balancing fixes for cgroups-intense workloads"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix calc_cfs_shares() fixed point arithmetics width confusion
sched/fair: Fix effective_load() to consistently use smoothed load

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"Various fixes:

   - 32-bit callgraph bug fix
   - suboptimal event group scheduling bug fix
   - event constraint fixes for Broadwell/Skylake
   - RAPL module name collision fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix pmu::filter_match for SW-led groups
  x86/perf/intel/rapl: Fix module name collision with powercap intel-rapl
  perf/x86: Fix 32-bit perf user callgraph collection
  perf/x86/intel: Update event constraints when HT is off

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Ingo Molnar:
"Two MIPS-GIC irqchip driver fixes to unbreak certain MIPS boards"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mips-gic: Match IPI IRQ domain by bus token only
irqchip/mips-gic: Map to VPs using HW VPNum

Merge tag 'gpio-v4.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
"I don't like to toss in last minute patches, but these are all for
  things that are broken, and have bitten people for real.  Two of them
  go into stable.  Maybe all of them if the compile test problem is a
  pain in the ass also for stable folks.

  Final (hopefully) GPIO fixes for v4.7:

   - Fix an oops on the Asus Eee PC 1201

   - Revert a patch trying to split GPIO parsing and GPIO configuration

   - Revert a too liberal compile testing thing"

* tag 'gpio-v4.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  Revert "gpio: gpiolib-of: Allow compile testing"
  Revert "gpiolib: Split GPIO flags parsing and GPIO configuration"
  gpio: sch: Fix Oops on module load on Asus Eee PC 1201

Merge tag 'drm-fixes-for-v4.7-rc7' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"One nouveau fix, and a few AMD Polaris fixes and some Allwinner fixes.

  I've got some vmware fixes that I might send separate over the
  weekend, they fix some black screens, but I'm still debating them"

* tag 'drm-fixes-for-v4.7-rc7' of git://people.freedesktop.org/~airlied/linux:
  drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation.
  drm/amd/powerplay: fix bug that get wrong polaris evv voltage.
  drm/amd/powerplay: incorrectly use of the function return value
  drm/amd/powerplay: fix incorrect voltage table value for tonga
  drm/amd/powerplay: fix incorrect voltage table value for polaris10
  drm/nouveau/disp/sor/gf119: select correct sor when poking training pattern
  gpu: drm: sun4i_drv: add missing of_node_put after calling of_parse_phandle
  drm/sun4i: Send vblank event when the CRTC is disabled
  drm/sun4i: Report proper vblank

ecryptfs: don't allow mmap when the lower fs doesn't support it

There are legitimate reasons to disallow mmap on certain files, notably
in sysfs or procfs. We shouldn't emulate mmap support on file systems
that don't offer support natively.

CVE-2016-1583

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: stable@vger.kernel.org
[tyhicks: clean up f_op check by using ecryptfs_file_to_lower()]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>

xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7

As of Xen 4.7 PV CPUID doesn't expose either of CPUID[1].ECX[7] and
CPUID[0x80000007].EDX[7] anymore, causing the driver to fail to load on
both Intel and AMD systems. Doing any kind of hardware capability
checks in the driver as a prerequisite was wrong anyway: With the
hypervisor being in charge, all such checking should be done by it. If
ACPI data gets uploaded despite some missing capability, the hypervisor
is free to ignore part or all of that data.

Ditch the entire check_prereq() function, and do the only valid check
(xen_initial_domain()) in the caller in its place.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xenbus: simplify xenbus_dev_request_and_reply()

No need to retain a local copy of the full request message, only the
type is really needed.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xenbus: don't bail early from xenbus_dev_request_and_reply()

xenbus_dev_request_and_reply() needs to track whether a transaction is
open. For XS_TRANSACTION_START messages it calls transaction_start()
and for XS_TRANSACTION_END messages it calls transaction_end().

If sending an XS_TRANSACTION_START message fails or responds with an
an error, the transaction is not open and transaction_end() must be
called.

If sending an XS_TRANSACTION_END message fails, the transaction is
still open, but if an error response is returned the transaction is
closed.

Commit 027bd7e89906 ("xen/xenbus: Avoid synchronous wait on XenBus
stalling shutdown/restart") introduced a regression where failed
XS_TRANSACTION_START messages were leaving the transaction open. This
can cause problems with suspend (and migration) as all transactions
must be closed before suspending.

It appears that the problematic change was added accidentally, so just
remove it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull apparmor fix from James Morris.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
apparmor: fix oops, validate buffer size in apparmor_setprocattr()

Merge tag 'acpi-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
"All of these fix recent regressions in ACPICA, in the ACPI PCI IRQ
  management code and in the ACPI AML debugger.

  Specifics:

   - Fix a lock ordering issue in ACPICA introduced by a recent commit
     that attempted to fix a deadlock in the dynamic table loading code
     which in turn appeared after changes related to the handling of
     module-level AML also made in this cycle (Lv Zheng).

   - Fix a recent regression in the ACPI IRQ management code that may
     cause PCI drivers to be unable to register an IRQ if that IRQ
     happens to be shared with a device on the ISA bus, like the
     parallel port, by reverting one commit entirely and restoring the
     previous behavior in two other places (Sinan Kaya).

   - Fix a recent regression in the ACPI AML debugger introduced by the
     commit that removed incorrect usage of IS_ERR_VALUE() from multiple
     places (Lv Zheng)"

* tag 'acpi-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / debugger: Fix regression introduced by IS_ERR_VALUE() removal
  ACPICA: Namespace: Fix namespace/interpreter lock ordering
  ACPI,PCI,IRQ: separate ISA penalty calculation
  Revert "ACPI, PCI, IRQ: remove redundant code in acpi_irq_penalty_init()"
  ACPI,PCI,IRQ: factor in PCI possible

Merge tag 'pm-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"One fix for a recent cpuidle core change that, against all odds,
  introduced a functional regression on Power systems and the fix for
  the crash during resume from hibernation on x86-64 that has been in
  the works for the last few weeks (it actually was ready last week, but
  I wanted to allow the reporters to test if for some more time).

  Specifics:

   - Fix a recent performance regression on Power systems (powernv and
     pseries) introduced by a core cpuidle commit that decreased the
     precision of the last_residency conversion from nano- to
     microseconds, which should not matter in theory, but turned out to
     play not-so-well with the special "snooze" idle state on Power
     (Shreyas B Prabhu).

   - Fix a crash during resume from hibernation on x86-64 caused by
     possible corruption of the kernel text part of page tables in the
     last phase of image restoration exposed by a security-related
     change during the 4.3 development cycle (Rafael Wysocki)"

* tag 'pm-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpuidle: Fix last_residency division
  x86/power/64: Fix kernel text mapping corruption during image restoration

Merge tag 'sunxi-drm-fixes-for-4.7-2' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into drm-fixes

Allwinner DRM driver fixes for 4.7, take 2

A new set of fixes for the sun4i driver, mostly related to vblank handling,
and a minor fix to release a reference on the device tree nodes we're
parsing in the probe logic.

* tag 'sunxi-drm-fixes-for-4.7-2' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
  gpu: drm: sun4i_drv: add missing of_node_put after calling of_parse_phandle
  drm/sun4i: Send vblank event when the CRTC is disabled
  drm/sun4i: Report proper vblank

apparmor: fix oops, validate buffer size in apparmor_setprocattr()

When proc_pid_attr_write() was changed to use memdup_user apparmor's
(interface violating) assumption that the setprocattr buffer was always
a single page was violated.

The size test is not strictly speaking needed as proc_pid_attr_write()
will reject anything larger, but for the sake of robustness we can keep
it in.

SMACK and SELinux look safe to me, but somebody else should probably
have a look just in case.

Based on original patch from Vegard Nossum <vegard.nossum@oracle.com>
modified for the case that apparmor provides null termination.

Fixes: bb646cdb12e75d82258c2f2e7746d5952d3e321a
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: stable@kernel.org
Signed-off-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>

Revert "ecryptfs: forbid opening files without mmap handler"

This reverts commit 2f36db71009304b3f0b95afacd8eba1f9f046b87.

It fixed a local root exploit but also introduced a dependency on
the lower file system implementing an mmap operation just to open a file,
which is a bit of a heavy hammer. The right fix is to have mmap depend
on the existence of the mmap handler instead.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block IO fixes from Jens Axboe:
"Three small fixes that have been queued up and tested for this series:

   - A bug fix for xen-blkfront from Bob Liu, fixing an issue with
     incomplete requests during migration.

   - A fix for an ancient issue in retrieving the IO priority of a
     different PID than self, preventing that task from going away while
     we access it.  From Omar.

   - A writeback fix from Tahsin, fixing a case where we'd call ihold()
     with a zero ref count inode"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: fix use-after-free in sys_ioprio_get()
  writeback: inode cgroup wb switch should not call ihold()
  xen-blkfront: save uncompleted reqs in blkfront_resume()

Merge tag 'configfs-for-4.7' of git://git.infradead.org/users/hch/configfs

Pull configfs fix from Christoph Hellwig:
"A fix from Marek for ppos handling in configfs_write_bin_file, which
was introduced in Linux 4.5, but didn't have any users until recently"

* tag 'configfs-for-4.7' of git://git.infradead.org/users/hch/configfs:
configfs: Remove ppos increment in configfs_write_bin_file

Merge branches 'acpica-fixes', 'acpi-pci-fixes' and 'acpi-debug-fixes'

* acpica-fixes:
  ACPICA: Namespace: Fix namespace/interpreter lock ordering

* acpi-pci-fixes:
  ACPI,PCI,IRQ: separate ISA penalty calculation
  Revert "ACPI, PCI, IRQ: remove redundant code in acpi_irq_penalty_init()"
  ACPI,PCI,IRQ: factor in PCI possible

* acpi-debug-fixes:
  ACPI / debugger: Fix regression introduced by IS_ERR_VALUE() removal

Merge branches 'pm-cpuidle-fixes' and 'pm-sleep-fixes'

* pm-cpuidle-fixes:
cpuidle: Fix last_residency division

* pm-sleep-fixes:
x86/power/64: Fix kernel text mapping corruption during image restoration

arm64: Enable workaround for Cavium erratum 27456 on thunderx-81xx

Cavium erratum 27456 commit 104a0c02e8b1
("arm64: Add workaround for Cavium erratum 27456")
is applicable for thunderx-81xx pass1.0 SoC as well.
Adding code to enable to 81xx.

Signed-off-by: Ganapatrao Kulkarni <gkulkarni@cavium.com>
Reviewed-by: Andrew Pinski <apinski@cavium.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

arm64: kernel: Save and restore UAO and addr_limit on exception entry

If we take an exception while at EL1, the exception handler inherits
the original context's addr_limit and PSTATE.UAO values. To be consistent
always reset addr_limit and PSTATE.UAO on (re-)entry to EL1. This
prevents accidental re-use of the original context's addr_limit.

Based on a similar patch for arm from Russell King.

Cc: <stable@vger.kernel.org> # 4.6-
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

xenbus: don't BUG() on user mode induced condition

Inability to locate a user mode specified transaction ID should not
lead to a kernel crash. For other than XS_TRANSACTION_START also
don't issue anything to xenbus if the specified ID doesn't match that
of any active transaction.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

perf/core: Fix pmu::filter_match for SW-led groups

The following commit:

  66eb579e66ec ("perf: allow for PMU-specific event filtering")

added the pmu::filter_match() callback. This was intended to
avoid HW constraints on events from resulting in extremely
pessimistic scheduling.

However, pmu::filter_match() is only called for the leader of each event
group. When the leader is a SW event, we do not filter the groups, and
may fail at pmu::add() time, and when this happens we'll give up on
scheduling any event groups later in the list until they are rotated
ahead of the failing group.

This can result in extremely sub-optimal event scheduling behaviour,
e.g. if running the following on a big.LITTLE platform:

$ taskset -c 0 ./perf stat \
-e 'a57{context-switches,armv8_cortex_a57/config=0x11/}' \
-e 'a53{context-switches,armv8_cortex_a53/config=0x11/}' \
ls

     <not counted>      context-switches                                              (0.00%)
     <not counted>      armv8_cortex_a57/config=0x11/                                 (0.00%)
                24      context-switches                                              (37.36%)
          57589154      armv8_cortex_a53/config=0x11/                                 (37.36%)

Here the 'a53' event group was always eligible to be scheduled, but
the 'a57' group never eligible to be scheduled, as the task was always
affine to a Cortex-A53 CPU. The SW (group leader) event in the 'a57'
group was eligible, but the HW event failed at pmu::add() time,
resulting in ctx_flexible_sched_in giving up on scheduling further
groups with HW events.

One way of avoiding this is to check pmu::filter_match() on siblings
as well as the group leader. If any of these fail their
pmu::filter_match() call, we must skip the entire group before
attempting to add any events.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: 66eb579e66ec ("perf: allow for PMU-specific event filtering")
Link: http://lkml.kernel.org/r/1465917041-15339-1-git-send-email-mark.rutland@arm.com
[ Small readability edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge branch 'linux-4.7' of git://github.com/skeggsb/linux into drm-fixes

Just one fix for a stupid thinko in a DP training pattern commit.

* 'linux-4.7' of git://github.com/skeggsb/linux:
drm/nouveau/disp/sor/gf119: select correct sor when poking training pattern

Merge branch 'drm-fixes-4.7' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Just a couple of fixes for amdgpu for 4.7:
- 2 small tonga powerplay fixes
- Additional Polaris fixes

* 'drm-fixes-4.7' of git://people.freedesktop.org/~agd5f/linux:
  drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation.
  drm/amd/powerplay: fix bug that get wrong polaris evv voltage.
  drm/amd/powerplay: incorrectly use of the function return value
  drm/amd/powerplay: fix incorrect voltage table value for tonga
  drm/amd/powerplay: fix incorrect voltage table value for polaris10

init/Kconfig: keep Expert users menu together

The "expert" menu was broken (split) such that all entries in it after
KALLSYMS were displayed in the "General setup" area instead of in the
"Expert users" area.  Fix this by adding one kconfig dependency.

Yes, the Expert users menu is fragile.  Problems like this have happened
several times in the past.  I will attempt to isolate the Expert users
menu if there is interest in that.

Fixes: 4d5d5664c900 ("x86: kallsyms: disable absolute percpu symbols on !SMP")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: stable@vger.kernel.org # 4.6
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation.

As get the right evv voltage, update them to latest coefficients to
align with BB.

agd: squash in Slava's 32 bit build fix

Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/powerplay: fix bug that get wrong polaris evv voltage.

value is 32 bits for polaris, not 16.

Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/powerplay: incorrectly use of the function return value

'0' means true.

Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amd/powerplay: fix incorrect voltage table value for tonga

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amd/powerplay: fix incorrect voltage table value for polaris10

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) All users of AF_PACKET's fanout feature want a symmetric packet
    header hash for load balancing purposes, so give it to them.

2) Fix vlan state synchronization in e1000e, from Jarod Wilson.

3) Use correct socket pointer in ip_skb_dst_mtu(), from Shmulik
    Ladkani.

4) mlx5 bug fixes from Mohamad Haj Yahia, Daniel Jurgens, Matthew
    Finlay, Rana Shahout, and Shaker Daibes.  Mostly to do with
    operation timeouts and PCI error handling.

5) Fix checksum handling in mirred packet action, from WANG Cong.

6) Set skb->dev correctly when transmitting in !protect_frames case of
    macsec driver, from Daniel Borkmann.

7) Fix MTU calculation in geneve driver, from Haishuang Yan.

8) Missing netif_napi_del() in unregister path of qeth driver, from
    Ursula Braun.

9) Handle malformed route netlink messages in decnet properly, from
    Vergard Nossum.

10) Memory leak of percpu data in ipv6 routing code, from Martin KaFai
    Lau.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
  ipv6: Fix mem leak in rt6i_pcpu
  net: fix decnet rtnexthop parsing
  cxgb4: update latest firmware version supported
  net/mlx5: Avoid setting unused var when modifying vport node GUID
  bonding: fix enslavement slave link notifications
  r8152: fix runtime function for RTL8152
  qeth: delete napi struct when removing a qeth device
  Revert "fsl/fman: fix error handling"
  fsl/fman: fix error handling
  cdc_ncm: workaround for EM7455 "silent" data interface
  RDS: fix rds_tcp_init() error path
  geneve: fix max_mtu setting
  net: phy: dp83867: Fix initialization of PHYCR register
  enc28j60: Fix race condition in enc28j60 driver
  net: stmmac: Fix null-function call in ISR on stmmac1000
  tipc: fix nl compat regression for link statistics
  net: bcmsysport: Device stats are unsigned long
  macsec: set actual real device for xmit when !protect_frames
  net_sched: fix mirrored packets checksum
  packet: Use symmetric hash for PACKET_FANOUT_HASH.
  ...

Merge tag 'sound-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"Here are a collection of small fixes: at this time, we've got a
  slightly high amount, but all small and trivial fixes, and nothing
  scary can be seen there"

* tag 'sound-4.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits)
  ALSA: hda/realtek: Add Lenovo L460 to docking unit fixup
  ALSA: timer: Fix negative queue usage by racy accesses
  ASoC: rt5645: fix reg-2f default value.
  ASoC: fsl_ssi: Fix number of words per frame for I2S-slave mode
  ALSA: au88x0: Fix calculation in vortex_wtdma_bufshift()
  ALSA: hda - Add PCI ID for Kabylake-H
  ALSA: echoaudio: Fix memory allocation
  ASoC: Intel: atom: fix missing breaks that would cause the wrong operation to execute
  ALSA: hda - fix read before array start
  ASoC: cx20442: set tty->receiver_room in v253_open
  ASoC: ak4613: Enable cache usage to fix crashes on resume
  ASoC: wm8940: Enable cache usage to fix crashes on resume
  ASoC: Intel: Skylake: Initialize module list for Broxton
  ASoC: wm5102: Correct supported channels on trace compressed DAI
  ASoC: wm5110: Add missing route from OUT3R to SYSCLK
  ASoC: rt5670: fix HP Playback Volume control
  ASoC: hdmi-codec: select CONFIG_HDMI
  ASoC: davinci-mcasp: Fix dra7 DMA offset when using CFG port
  ASoC: hdac_hdmi: Fix potential NULL dereference
  ASoC: ak4613: Remove owner assignment from platform_driver
  ...

Merge tag 'chrome-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform

Pull chrome platform fix from Olof Johansson:
"A single fix this time, closing a window where ioctl args are fetched
twice"

* tag 'chrome-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform:
platform/chrome: cros_ec_dev - double fetch bug in ioctl

iommu/amd: Fix unity mapping initialization race

There is a race condition in the AMD IOMMU init code that
causes requested unity mappings to be blocked by the IOMMU
for a short period of time. This results on boot failures
and IO_PAGE_FAULTs on some machines.

Fix this by making sure the unity mappings are installed
before all other DMA is blocked.

Fixes: aafd8ba0ca74 ('iommu/amd: Implement add_device and remove_device')
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Joerg Roedel <jroedel@suse.de>

Merge branch 'jejb-fixes' into fixes

MIPS: Fix page table corruption on THP permission changes.

When the core THP code is modifying the permissions of a huge page it
calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit
of the page table entry.  The result can be kernel messages like:

mm/memory.c:397: bad pmd 000000040080004d.
mm/memory.c:397: bad pmd 00000003ff00004d.
mm/memory.c:397: bad pmd 000000040100004d.

or:

------------[ cut here ]------------
WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158()
Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80
CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4
Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0
          0000000000000000 0000000000000000 ffffffff85110000 0000000000000119
          0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f
          0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000
          0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80
          ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001
          0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924
          80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780
          80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f
          0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000
          ...
Call Trace:
[<ffffffff80865c4c>] show_stack+0x6c/0xf8
[<ffffffff80885780>] warn_slowpath_common+0x78/0xa8
[<ffffffff809207a0>] exit_mmap+0x150/0x158
[<ffffffff80882d44>] mmput+0x5c/0x110
[<ffffffff8088b450>] do_exit+0x230/0xa68
[<ffffffff8088be34>] do_group_exit+0x54/0x1d0
[<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18

---[ end trace c7b38293191c57dc ]---
BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536

Fix by not clearing _PAGE_HUGE bit.

Signed-off-by: David Daney <david.daney@cavium.com>
Tested-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Cc: stable@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13687/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

x86/perf/intel/rapl: Fix module name collision with powercap intel-rapl

Since commit 4b6e2571bf00 the rapl perf module calls itself intel-rapl. That
name was already in use by the rapl powercap driver, which now fails to load
if the perf module is loaded. Fix the problem by renaming the perf module to
intel-rapl-perf, so that both modules can coexist.

Fixes: 4b6e2571bf00 ("x86/perf/intel/rapl: Make the Intel RAPL PMU driver modular")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1466694409-3620-1-git-send-email-ville.syrjala@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ipv6: Fix mem leak in rt6i_pcpu

It was first reported and reproduced by Petr (thanks!) in
https://bugzilla.kernel.org/show_bug.cgi?id=119581

free_percpu(rt->rt6i_pcpu) used to always happen in ip6_dst_destroy().

However, after fixing a deadlock bug in
commit 9c7370a166b4 ("ipv6: Fix a potential deadlock when creating pcpu rt"),
free_percpu() is not called before setting non_pcpu_rt->rt6i_pcpu to NULL.

It is worth to note that rt6i_pcpu is protected by table->tb6_lock.

kmemleak somehow did not report it. We nailed it down by
observing the pcpu entries in /proc/vmallocinfo (first suggested
by Hannes, thanks!).

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Fixes: 9c7370a166b4 ("ipv6: Fix a potential deadlock when creating pcpu rt")
Reported-by: Petr Novopashenniy <pety@rusnet.ru>
Tested-by: Petr Novopashenniy <pety@rusnet.ru>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Petr Novopashenniy <pety@rusnet.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fix decnet rtnexthop parsing

dn_fib_count_nhs() could enter an infinite loop if nhp->rtnh_len == 0
(i.e. if userspace passes a malformed netlink message).

Let's use the helpers from net/nexthop.h which take care of all this
stuff. We can do exactly the same as e.g. fib_count_nexthops() and
fib_get_nhs() from net/ipv4/fib_semantics.c.

This fixes the softlockup for me.

Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ACPI / debugger: Fix regression introduced by IS_ERR_VALUE() removal

The FIFO unlocking mechanism in acpi_dbg has been broken by the
following commit:

Commit: 287980e49ffc0f6d911601e7e352a812ed27768e
Subject: remove lots of IS_ERR_VALUE abuses

It converted !IS_ERR_VALUE(ret) into !ret which was not entirely
correct. Fix the regression by taking ret > 0 into account too as
appropriate.

Fixes: 287980e49ffc (remove lots of IS_ERR_VALUE abuses)
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Simplifications, changelog & subject massage ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

platform/chrome: cros_ec_dev - double fetch bug in ioctl

We verify "u_cmd.outsize" and "u_cmd.insize" but we need to make sure
that those values have not changed between the two copy_from_user()
calls. Otherwise it could lead to a buffer overflow.

Additionally, cros_ec_cmd_xfer() can set s_cmd->insize to a lower value.
We should use the new smaller value so we don't copy too much data to
the user.

Reported-by: Pengfei Wang <wpengfeinudt@gmail.com>
Fixes: a841178445bb ('mfd: cros_ec: Use a zero-length array for command data')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Gwendal Grignou <gwendal@chromium.org>
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Olof Johansson <olof@lixom.net>

drm/nouveau/disp/sor/gf119: select correct sor when poking training pattern

Fixes a regression caused by a stupid thinko from "disp/sor/gf119: both
links use the same training register".

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org

ACPICA: Namespace: Fix namespace/interpreter lock ordering

There is a lock order issue in acpi_load_tables(). The namespace lock
is held before holding the interpreter lock.

With ACPI_MUTEX_DEBUG enabled in the kernel, this is printed to the
log during boot:

  [    0.885699] ACPI Error: Invalid acquire order: Thread 405884224 owns [ACPI_MTX_Namespace], wants [ACPI_MTX_Interpreter] (20160422/utmutex-263)
  [    0.885881] ACPI Error: Could not acquire AML Interpreter mutex (20160422/exutils-95)
  [    0.893846] ACPI Error: Mutex [0x0] is not acquired, cannot release (20160422/utmutex-326)
  [    0.894019] ACPI Error: Could not release AML Interpreter mutex (20160422/exutils-133)

The issue has been introduced by the following commit:

  Commit: 2f38b1b16d9280689e5cfa47a4c50956bf437f0d
  ACPICA Commit: bfe03ffcde8ed56a7eae38ea0b188aeb12f9c52e
  Subject: ACPICA: Namespace: Fix a regression that MLC support triggers
           dead lock in dynamic table loading

Which fixed a deadlock issue for acpi_ns_load_table() in
acpi_ex_add_table() but didn't take care of the lock order in
acpi_ns_load_table() correctly.

Originally (before the above commit), ACPICA used the
namespace/interpreter locks in the following 2 key code
paths:

1. Table loading:
acpi_ns_load_table
L(Namespace)
acpi_ns_parse_table
acpi_ns_one_complete_parse
U(Namespace)
2. Object evaluation:
acpi_ns_evaluate
L(Interpreter)
acpi_ps_execute_method
U(Interpreter)
acpi_ns_load_table
L(Namespace)
U(Namespace)
acpi_ev_initialize_region
L(Namespace)
U(Namespace)
address_space.setup
L(Namespace)
U(Namespace)
address_space.handler
L(Namespace)
U(Namespace)
acpi_os_wait_semaphore
acpi_os_acquire_mutex
acpi_os_sleep
L(Interpreter)
U(Interpreter)

During runtime, while acpi_ns_evaluate is called, the lock order is
always Interpreter -> Namespace.

In turn, the problematic commit acquires the locks in the following
order:

3. Table loading:
acpi_ns_load_table
L(Namespace)
acpi_ns_parse_table
L(Interpreter)
acpi_ns_one_complete_parse
U(Interpreter)
U(Namespace)

To fix the lock order issue, move the interpreter lock to
acpi_ns_load_table() to ensure the lock order correctness:

4. Table loading:
acpi_ns_load_table
L(Interpreter)
L(Namespace)
acpi_ns_parse_table
acpi_ns_one_complete_parse
U(Namespace)
U(Interpreter)

However, this doesn't fix the current design issues related to the
namespace lock. For example, we can notice that in acpi_ns_evaluate(),
outside of acpi_ns_load_table(), the namespace objects may be created
by the named object creation control methods. And the creation of
the method-owned namespace objects are not locked by the namespace
lock. This patch doesn't try to fix such kind of existing issues.

Fixes: 2f38b1b16d92 (ACPICA: Namespace: Fix a regression that MLC support triggers dead lock in dynamic table loading)
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

qla2xxx: Fix NULL pointer deref in QLA interrupt

In qla24xx_process_response_queue() rsp->msix->cpuid may trigger NULL
pointer dereference when rsp->msix is NULL:

[    5.622457] NULL pointer dereference at 0000000000000050
[    5.622457] IP: [<ffffffff8155e614>] qla24xx_process_response_queue+0x44/0x4b0
[    5.622457] PGD 0
[    5.622457] Oops: 0000 [#1] SMP
[    5.622457] Modules linked in:
[    5.622457] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.6.3-x86_64 #1
[    5.622457] Hardware name: HP ProLiant DL360 G5, BIOS P58 05/02/2011
[    5.622457] task: ffff8801a88f3740 ti: ffff8801a8954000 task.ti: ffff8801a8954000
[    5.622457] RIP: 0010:[<ffffffff8155e614>]  [<ffffffff8155e614>] qla24xx_process_response_queue+0x44/0x4b0
[    5.622457] RSP: 0000:ffff8801afb03de8  EFLAGS: 00010002
[    5.622457] RAX: 0000000000000000 RBX: 0000000000000032 RCX: 00000000ffffffff
[    5.622457] RDX: 0000000000000002 RSI: ffff8801a79bf8c8 RDI: ffff8800c8f7e7c0
[    5.622457] RBP: ffff8801afb03e68 R08: 0000000000000000 R09: 0000000000000000
[    5.622457] R10: 00000000ffff8c47 R11: 0000000000000002 R12: ffff8801a79bf8c8
[    5.622457] R13: ffff8800c8f7e7c0 R14: ffff8800c8f60000 R15: 0000000000018013
[    5.622457] FS:  0000000000000000(0000) GS:ffff8801afb00000(0000) knlGS:0000000000000000
[    5.622457] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.622457] CR2: 0000000000000050 CR3: 0000000001e07000 CR4: 00000000000006e0
[    5.622457] Stack:
[    5.622457]  ffff8801afb03e30 ffffffff810c0f2d 0000000000000086 0000000000000002
[    5.622457]  ffff8801afb03e28 ffffffff816570e1 ffff8800c8994628 0000000000000002
[    5.622457]  ffff8801afb03e60 ffffffff816772d4 b47c472ad6955e68 0000000000000032
[    5.622457] Call Trace:
[    5.622457]  <IRQ>
[    5.622457]  [<ffffffff810c0f2d>] ? __wake_up_common+0x4d/0x80
[    5.622457]  [<ffffffff816570e1>] ? usb_hcd_resume_root_hub+0x51/0x60
[    5.622457]  [<ffffffff816772d4>] ? uhci_hub_status_data+0x64/0x240
[    5.622457]  [<ffffffff81560d00>] qla24xx_intr_handler+0xf0/0x2e0
[    5.622457]  [<ffffffff810d569e>] ? get_next_timer_interrupt+0xce/0x200
[    5.622457]  [<ffffffff810c89b4>] handle_irq_event_percpu+0x64/0x100
[    5.622457]  [<ffffffff810c8a77>] handle_irq_event+0x27/0x50
[    5.622457]  [<ffffffff810cb965>] handle_edge_irq+0x65/0x140
[    5.622457]  [<ffffffff8101a498>] handle_irq+0x18/0x30
[    5.622457]  [<ffffffff8101a276>] do_IRQ+0x46/0xd0
[    5.622457]  [<ffffffff817f8fff>] common_interrupt+0x7f/0x7f
[    5.622457]  <EOI>
[    5.622457]  [<ffffffff81020d38>] ? mwait_idle+0x68/0x80
[    5.622457]  [<ffffffff8102114a>] arch_cpu_idle+0xa/0x10
[    5.622457]  [<ffffffff810c1b97>] default_idle_call+0x27/0x30
[    5.622457]  [<ffffffff810c1d3b>] cpu_startup_entry+0x19b/0x230
[    5.622457]  [<ffffffff810324c6>] start_secondary+0x136/0x140
[    5.622457] Code: 00 00 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 8b 47 58 a8 02 0f 84 c5 00 00 00 48 8b 46 50 49 89 f4 65 8b 15 34 bb aa 7e <39> 50 50 74 11 89 50 50 48 8b 46 50 8b 40 50 41 89 86 60 8b 00
[    5.622457] RIP  [<ffffffff8155e614>] qla24xx_process_response_queue+0x44/0x4b0
[    5.622457]  RSP <ffff8801afb03de8>
[    5.622457] CR2: 0000000000000050
[    5.622457] ---[ end trace fa2b19c25106d42b ]---
[    5.622457] Kernel panic - not syncing: Fatal exception in interrupt

The affected code was introduced by commit cdb898c52d1dfad4b4800b83a58b3fe5d352edde
(qla2xxx: Add irq affinity notification).

Only dereference rsp->msix when it has been set so the machine can boot
fine. Possibly rsp->msix is unset because:
[    3.479679] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 8.07.00.33-k.
[    3.481839] qla2xxx [0000:13:00.0]-001d: : Found an ISP2432 irq 17 iobase 0xffffc90000038000.
[    3.484081] qla2xxx [0000:13:00.0]-0035:0: MSI-X; Unsupported ISP2432 (0x2, 0x3).
[    3.485804] qla2xxx [0000:13:00.0]-0037:0: Falling back-to MSI mode -258.
[    3.890145] scsi host0: qla2xxx
[    3.891956] qla2xxx [0000:13:00.0]-00fb:0: QLogic QLE2460 - PCI-Express Single Channel 4Gb Fibre Channel HBA.
[    3.894207] qla2xxx [0000:13:00.0]-00fc:0: ISP2432: PCIe (2.5GT/s x4) @ 0000:13:00.0 hdma+ host#=0 fw=7.03.00 (9496).
[    5.714774] qla2xxx [0000:13:00.0]-500a:0: LOOP UP detected (4 Gbps).

Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Acked-by: Quinn Tran <quinn.tran@qlogic.com>
CC: <stable@vger.kernel.org> # 4.5+
Fixes: cdb898c52d1dfad4b4800b83a58b3fe5d352edde
Signed-off-by: James Bottomley <jejb@linux.vnet.ibm.com>

cxgb4: update latest firmware version supported

Change t4fw_version.h to update latest firmware version number

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: Avoid setting unused var when modifying vport node GUID

GCC complains on unused-but-set-variable, clean this up.

Fixes: 23898c763f4a ('net/mlx5: E-Switch, Modify node guid on vf set MAC')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: fix enslavement slave link notifications

Currently, link notifications are not sent by
bond_set_slave_link_state() upon enslavement if
the slave is enslaved when up.

This happens because slave->link default init value
is 0, which is the same as BOND_LINK_UP, resulting
in bond_set_slave_link_state() ignoring this transition.

This patch sets the default value of slave->link to
BOND_LINK_NOCHANGE, assuring it will count as a state
transition and thus trigger notification logic.

Signed-off-by: Aviv Heller <avivh@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: fix runtime function for RTL8152

The RTL8152 doesn't have U1U2 and U2P3 features, so use different
runtime functions for RTL812 and RTL8153 by adding autosuspend_en()
to rtl_ops.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "gpio: gpiolib-of: Allow compile testing"

This reverts commit 1e4a80640338924b9f9fd7a121ac31d08134410a.

This creates more problems than it solves right now. Compile
testing needs to go in with patches fixing the problems it
uncovers.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

irqchip/mips-gic: Match IPI IRQ domain by bus token only

Commit fbde2d7d8290 ("MIPS: Add generic SMP IPI support") introduced
code which calls irq_find_matching_host with a NULL node parameter in
order to discover IPI IRQ domains which are not associated with the DT
root node's interrupt parent. This suggests that implementations of IPI
IRQ domains should effectively ignore the node parameter if it is NULL
and search purely based upon the bus token. Commit 2af70a962070
("irqchip/mips-gic: Add a IPI hierarchy domain") did not do this when
implementing the GIC IPI IRQ domain, and on MIPS Boston boards this
leads to no IPI domain being discovered and a NULL pointer dereference
when attempting to send an IPI:

  CPU 0 Unable to handle kernel paging request at virtual address 0000000000000040, epc == ffffffff8016e70c, ra == ffffffff8010ff5c
  Oops[#1]:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc6-00223-gad0d1b6 #945
  task: a8000000ff066fc0 ti: a8000000ff068000 task.ti: a8000000ff068000
  $ 0   : 0000000000000000 0000000000000001 ffffffff80730000 0000000000000003
  $ 4   : 0000000000000000 ffffffff8057e5b0 a800000001e3ee00 0000000000000000
  $ 8   : 0000000000000000 0000000000000023 0000000000000001 0000000000000001
  $12   : 0000000000000000 ffffffff803323d0 0000000000000000 0000000000000000
  $16   : 0000000000000000 0000000000000000 0000000000000001 ffffffff801108fc
  $20   : 0000000000000000 ffffffff8057e5b0 0000000000000001 0000000000000000
  $24   : 0000000000000000 ffffffff8012de28
  $28   : a8000000ff068000 a8000000ff06fbc0 0000000000000000 ffffffff8010ff5c
  Hi    : ffffffff8014c174
  Lo    : a800000001e1e140
  epc   : ffffffff8016e70c __ipi_send_mask+0x24/0x11c
  ra    : ffffffff8010ff5c mips_smp_send_ipi_mask+0x68/0x178
  Status: 140084e2        KX SX UX KERNEL EXL
  Cause : 00800008 (ExcCode 02)
  BadVA : 0000000000000040
  PrId  : 0001a920 (MIPS I6400)
  Process swapper/0 (pid: 1, threadinfo=a8000000ff068000, task=a8000000ff066fc0, tls=0000000000000000)
  Stack : 0000000000000000 0000000000000000 0000000000000001 ffffffff801108fc
            0000000000000000 ffffffff8057e5b0 0000000000000001 ffffffff8010ff5c
            0000000000000001 0000000000000020 0000000000000000 0000000000000000
            0000000000000000 ffffffff801108fc 0000000000000000 0000000000000001
            0000000000000001 0000000000000000 0000000000000000 ffffffff801865e8
            a8000000ff0c7500 a8000000ff06fc90 0000000000000001 0000000000000002
            ffffffff801108fc ffffffff801868b8 0000000000000000 ffffffff801108fc
            0000000000000000 0000000000000003 ffffffff8068c700 0000000000000001
            ffffffff80730000 0000000000000001 a8000000ff00a290 ffffffff80110c50
            0000000000000003 a800000001e48308 0000000000000003 0000000000000008
            ...
  Call Trace:
  [<ffffffff8016e70c>] __ipi_send_mask+0x24/0x11c
  [<ffffffff8010ff5c>] mips_smp_send_ipi_mask+0x68/0x178
  [<ffffffff801865e8>] generic_exec_single+0x150/0x170
  [<ffffffff801868b8>] smp_call_function_single+0x108/0x160
  [<ffffffff80110c50>] cps_boot_secondary+0x328/0x394
  [<ffffffff80110534>] __cpu_up+0x38/0x90
  [<ffffffff8012de4c>] bringup_cpu+0x24/0xac
  [<ffffffff8012df40>] cpuhp_up_callbacks+0x58/0xdc
  [<ffffffff8012e648>] cpu_up+0x118/0x18c
  [<ffffffff806dc158>] smp_init+0xbc/0xe8
  [<ffffffff806d4c18>] kernel_init_freeable+0xa0/0x228
  [<ffffffff8056c908>] kernel_init+0x10/0xf0
  [<ffffffff80105098>] ret_from_kernel_thread+0x14/0x1c

Fix this by allowing the GIC IPI IRQ domain to match purely based upon
the bus token if the node provided is NULL.

Fixes: 2af70a962070 ("irqchip/mips-gic: Add a IPI hierarchy domain")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20160705132600.27730-2-paul.burton@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

irqchip/mips-gic: Map to VPs using HW VPNum

When mapping an interrupt to a VP(E) we must use the identifier for the
VP that the hardware expects, and this does not always match up with the
Linux CPU number. Commit d46812bb0bef ("irqchip: mips-gic: Use HW IDs
for VPE_OTHER_ADDR") corrected this for the cases that existed at the
time it was written, but commit 2af70a962070 ("irqchip/mips-gic: Add a
IPI hierarchy domain") added another case before the former patch was
merged. This leads to incorrectly using Linux CPU numbers when mapping
interrupts to VPs, which breaks on certain systems such as those with
multi-core I6400 CPUs. Fix by adding the appropriate call to
mips_cm_vp_id() to retrieve the expected VP identifier.

Fixes: d46812bb0bef ("irqchip: mips-gic: Use HW IDs for VPE_OTHER_ADDR")
Fixes: 2af70a962070 ("irqchip/mips-gic: Add a IPI hierarchy domain")
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20160705132600.27730-1-paul.burton@imgtec.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

ALSA: hda/realtek: Add Lenovo L460 to docking unit fixup

This solves the issue that a headphone is not working on the docking
unit.

Signed-off-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

gpu: drm: sun4i_drv: add missing of_node_put after calling of_parse_phandle

of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.

Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>

qeth: delete napi struct when removing a qeth device

A qeth_card contains a napi_struct linked to the net_device during
device probing. This struct must be deleted when removing the qeth
device, otherwise Panic on oops can occur when qeth devices are
repeatedly removed and added.

Fixes: a1c3ed4c9ca ("qeth: NAPI support for l2 and l3 discipline")
Cc: stable@vger.kernel.org # v2.6.37+
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Tested-by: Alexander Klein <ALKL@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "fsl/fman: fix error handling"

This reverts commit a788a4a040e003574b8ad17115706ab1601ec572.

This patch is wrong, the type returned doesn't fit
what the error pointer macros expect.

Signed-off-by: David S. Miller <davem@davemloft.net>

fsl/fman: fix error handling

This is likely that checking 'fman->fifo_offset' instead of
'fman->cam_offset' is expected here.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

cdc_ncm: workaround for EM7455 "silent" data interface

Several Lenovo users have reported problems with their Sierra
Wireless EM7455 modem. The driver has loaded successfully and
the MBIM management channel has appeared to work, including
establishing a connection to the mobile network. But no frames
have been received over the data interface.

The problem affects all EM7455 and MC7455, and is assumed to
affect other modems based on the same Qualcomm chipset and
baseband firmware.

Testing narrowed the problem down to what seems to be a
firmware timing bug during initialization. Adding a short sleep
while probing is sufficient to make the problem disappear.
Experiments have shown that 1-2 ms is too little to have any
effect, while 10-20 ms is enough to reliably succeed.

Reported-by: Stefan Armbruster <ml001@armbruster-it.de>
Reported-by: Ralph Plawetzki <ralph@purejava.org>
Reported-by: Andreas Fett <andreas.fett@secunet.com>
Reported-by: Rasmus Lerdorf <rasmus@lerdorf.com>
Reported-by: Samo Ratnik <samo.ratnik@gmail.com>
Reported-and-tested-by: Aleksander Morgado <aleksander@aleksander.es>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDS: fix rds_tcp_init() error path

If register_pernet_subsys() fails, we shouldn't try to call
unregister_pernet_subsys().

Fixes: 467fa15356 ("RDS-TCP: Support multiple RDS-TCP listen endpoints, one per netns.")
Cc: stable@vger.kernel.org
Cc: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

geneve: fix max_mtu setting

For ipv6+udp+geneve encapsulation data, the max_mtu should subtract
sizeof(ipv6hdr), instead of sizeof(iphdr).

Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "gpiolib: Split GPIO flags parsing and GPIO configuration"

This reverts commit 923b93e451db876d1479d3e4458fce14fec31d1c.

Make sure consumers do not overwrite gpio flags for pins that have
already been claimed.

While adding support for gpio drivers to refuse a request using
unsupported flags, the order of when the requested flag was checked and
the new flags were applied was reversed to that consumers could
overwrite flags for already requested gpios.

This not only affects device-tree setups where two drivers could request
the same gpio using conflicting configurations, but also allowed user
space to clear gpio flags for already claimed pins simply by attempting
to export them through the sysfs interface. By for example clearing the
FLAG_ACTIVE_LOW flag this way, user space could effectively change the
polarity of a signal.

Reverting this change obviously prevents gpio drivers from doing sanity
checks on the flags in their request callbacks. Fortunately only one
recently added driver (gpio-tps65218 in v4.6) appears to do this, and a
follow up patch could restore this functionality through a different
interface.

Cc: stable <stable@vger.kernel.org> # 4.4
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

gpio: sch: Fix Oops on module load on Asus Eee PC 1201

This fixes the issue descirbe in bug 117531
(https://bugzilla.kernel.org/show_bug.cgi?id=117531).
It's a regression introduced in linux 4.5 that causes a Oops at load of
gpio_sch and prevents powering off the computer.

The issue is that sch_gpio_reg_set is called in sch_gpio_probe before
gpio_chip data is initialized with the pointer to the sch_gpio struct. As
sch_gpio_reg_set calls gpiochip_get_data, it returns NULL which causes
the Oops.

The patch follows Mika's advice (https://lkml.org/lkml/2016/5/9/61) and
consists in modifying sch_gpio_reg_get and sch_gpio_reg_set to take a
sch_gpio struct directly instead of a gpio_chip, which avoids the call to
gpiochip_get_data.

Thanks Mika for your patience with me :-)

Cc: stable@vger.kernel.org
Signed-off-by: Colin Pitrat <colin.pitrat@gmail.com>
Acked-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

cpuidle: Fix last_residency division

Snooze is a poll idle state in powernv and pseries platforms. Snooze
has a timeout so that if a CPU stays in snooze for more than target
residency of the next available idle state, then it would exit
thereby giving chance to the cpuidle governor to re-evaluate and
promote the CPU to a deeper idle state. Therefore whenever snooze
exits due to this timeout, its last_residency will be target_residency
of the next deeper state.

Commit e93e59ce5b85 "cpuidle: Replace ktime_get() with local_clock()"
changed the math around last_residency calculation. Specifically,
while converting last_residency value from nano- to microseconds, it
carries out right shift by 10. Because of that, in snooze timeout
exit scenarios last_residency calculated is roughly 2.3% less than
target_residency of the next available state. This pattern is picked
up by get_typical_interval() in the menu governor and therefore
expected_interval in menu_select() is frequently less than the
target_residency of any state other than snooze.

Due to this we are entering snooze at a higher rate, thereby
affecting the single thread performance.

Fix this by using more precise division via ktime_us_delta().

Fixes: e93e59ce5b85 "cpuidle: Replace ktime_get() with local_clock()"
Reported-by: Anton Blanchard <anton@samba.org>
Bisected-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

ALSA: timer: Fix negative queue usage by racy accesses

The user timer tu->qused counter may go to a negative value when
multiple concurrent reads are performed since both the check and the
decrement of tu->qused are done in two individual locked contexts.
This results in bogus read outs, and the endless loop in the
user-space side.

The fix is to move the decrement of the tu->qused counter into the
same spinlock context as the zero-check of the counter.

Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

iommu/vt-d: Fix infinite loop in free_all_cpu_cached_iovas

Per VT-d spec Section 10.4.2 ("Capability Register"), the maximum
number of possible domains is 64K; indeed this is the maximum value
that the cap_ndoms() macro will expand to. Since the value 65536
will not fix in a u16, the 'did' variable must be promoted to an
int, otherwise the test for < 65536 will always be true and the
loop will never end.

The symptom, in my case, was a hung machine during suspend.

Fixes: 3bd4f9112f87 ("iommu/vt-d: Fix overflow of iommu->domains array")
Signed-off-by: Aaron Campbell <aaron@monkey.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>

Linux 4.7-rc6