git.karo-electronics.de Git - linux-beck.git/log

KVM: s390: do not expose random data via facility bitmap

commit 04478197416e3a302e9ebc917ba1aa884ef9bfab upstream.

kvm_s390_get_machine() populates the facility bitmap by copying bytes
from the host results that are stored in a 256 byte array in the prefix
page. The KVM code does use the size of the target buffer (2k), thus
copying and exposing unrelated kernel memory (mostly machine check
related logout data).

Let's use the size of the source buffer instead. This is ok, as the
target buffer will always be greater or equal than the source buffer as
the KVM internal buffers (and thus S390_ARCH_FAC_LIST_SIZE_BYTE) cover
the maximum possible size that is allowed by STFLE, which is 256
doublewords. All structures are zero allocated so we can leave bytes
256-2047 unchanged.

Add a similar fix for kvm_arch_init_vm().

Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
[found with smatch]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: nand: xway: fix build because of module functions

commit a2724663494f7313f53da10d8c0a729c5e3c4dea upstream.

Remove the usage of modules functions to make this driver compile
again. Otherwise an include of linux/modules.h would be needed.

Fixes: 024366750c2e ("mtd: nand: xway: convert to normal platform driver")
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: nand: xway: disable module support

commit 73529c872a189c747bdb528ce9b85b67b0e28dec upstream.

The xway_nand driver accesses the ltq_ebu_membase symbol which is not
exported. This also should not get exported and we should handle the
EBU interface in a better way later. This quick fix just deactivated
support for building as module.

Fixes: 99f2b107924c ("mtd: lantiq: Add NAND support on Lantiq XWAY SoC.")
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: nand: lpc32xx: fix invalid error handling of a requested irq

commit cf9e1672a66c49ed8903c01b4c380a2f2dc91b40 upstream.

Semantics of NR_IRQS is different on machines with SPARSE_IRQ option
disabled or enabled, in the latter case IRQs are allocated starting
at least from the value specified by NR_IRQS and going upwards, so
the check of (irq >= NR_IRQ) to decide about an error code returned by
platform_get_irq() is completely invalid, don't attempt to overrule
irq subsystem in the driver.

The change fixes LPC32xx NAND MLC driver initialization on boot.

Fixes: 8cb17b5ed017 ("irqchip: Add LPC32xx interrupt controller driver")
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Acked-by: Sylvain Lemieux <slemieux.tyco@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ieee802154: atusb: do not use the stack for buffers to make them DMA able

commit 05a974efa4bdf6e2a150e3f27dc6fcf0a9ad5655 upstream.

From 4.9 we should really avoid using the stack here as this will not be DMA
able on various platforms. This changes the buffers already being present in
time of 4.9 being released. This should go into stable as well.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: mxs-mmc: Fix additional cycles after transmission stop

commit 01167c7b9cbf099c69fe411a228e4e9c7104e123 upstream.

According to the code the intention is to append 8 SCK cycles
instead of 4 at end of a MMC_STOP_TRANSMISSION command. But this
will never happened because it's an AC command not an ADTC command.
So fix this by moving the statement into the right function.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: e4243f13d10e (mmc: mxs-mmc: add mmc host driver for i.MX23/28)
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: sdhci-acpi: Only powered up enabled acpi child devices

commit e1d070c3793a2766122865a7c2142853b48808c5 upstream.

Commit e5bbf30733f9 ("mmc: sdhci-acpi: Ensure connected devices are
powered when probing") introduced code to powerup any acpi child
nodes listed in the dstd. But some dstd-s list all possible devices
used on some board variants, while reporting if the device is actually
present and enabled in the status field of the device.

So we end up calling the acpi _PS0 (power-on) method for devices which
are not actually present. This does not always end well, e.g. on my
cube iwork8 air tablet, this results in freezing the entire tablet as
soon as the r8723bs module is loaded.

This commit fixes this by checking the child device's status.present
and status.enabled bits and only call acpi_device_fix_up_power()
if both are set.

Fixes: e5bbf30733f9 ("mmc: sdhci-acpi: Ensure connected devices are powered when probing")
BugLink: https://github.com/hadess/rtl8723bs/issues/80
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

HID: corsair: fix control-transfer error handling

commit 7a546af50eb78ab99840903083231eb635c8a566 upstream.

Make sure to check for short control transfers in order to avoid parsing
uninitialised buffer data and leaking it to user space.

Note that the backlight and macro-mode buffer constraints are kept as
loose as possible in order to avoid any regressions should the current
buffer sizes be larger than necessary.

Fixes: 6f78193ee9ea ("HID: corsair: Add Corsair Vengeance K90 driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

HID: corsair: fix DMA buffers on stack

commit 6d104af38b570d37aa32a5803b04c354f8ed513d upstream.

Not all platforms support DMA to the stack, and specifically since v4.9
this is no longer supported on x86 with VMAP_STACK either.

Note that the macro-mode buffer was larger than necessary.

Fixes: 6f78193ee9ea ("HID: corsair: Add Corsair Vengeance K90 driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

PCI: Enumerate switches below PCI-to-PCIe bridges

commit 51ebfc92b72b4f7dac1ab45683bf56741e454b8c upstream.

A PCI-to-PCIe bridge (a "reverse bridge") has a PCI or PCI-X primary
interface and a PCI Express secondary interface.  The PCIe interface is a
Downstream Port that originates a Link.  See the "PCI Express to PCI/PCI-X
Bridge Specification", rev 1.0, sections 1.2 and A.6.

The bug report below involves a PCI-to-PCIe bridge and a PCIe switch below
the bridge:

  00:1e.0 Intel 82801 PCI Bridge to [bus 01-0a]
  01:00.0 Pericom PI7C9X111SL PCIe-to-PCI Reversible Bridge to [bus 02-0a]
  02:00.0 Pericom Device 8608 [PCIe Upstream Port] to [bus 03-0a]
  03:01.0 Pericom Device 8608 [PCIe Downstream Port] to [bus 0a]

01:00.0 is configured as a PCI-to-PCIe bridge (despite the name printed by
lspci).  As we traverse a PCIe hierarchy, device connections alternate
between PCIe Links and internal Switch logic.  Previously we did not
recognize that 01:00.0 had a secondary link, so we thought the 02:00.0
Upstream Port *did* have a secondary link.  In fact, it's the other way
around: 01:00.0 has a secondary link, and 02:00.0 has internal Switch logic
on its secondary side.

When we thought 02:00.0 had a secondary link, the pci_scan_slot() ->
only_one_child() path assumed 02:00.0 could have only one child, so 03:00.0
was the only possible downstream device.  But 03:00.0 doesn't exist, so we
didn't look for any other devices on bus 03.

Booting with "pci=pcie_scan_all" is a workaround, but we don't want users
to have to do that.

Recognize that PCI-to-PCIe bridges originate links on their secondary
interfaces.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=189361
Fixes: d0751b98dfa3 ("PCI: Add dev->has_secondary_link to track downstream PCIe links")
Tested-by: Blake Moore <blake.moore@men.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

PCI: designware: Check for iATU unroll only on platforms that use ATU

commit a782b5f986c3fa1cfa7f2b57941200c6a5809242 upstream.

Previously we checked for iATU unroll support by reading PCIE_ATU_VIEWPORT
even on platforms, e.g., Keystone, that do not have ATU ports.  This can
cause bad behavior such as asynchronous external aborts:

  OF: PCI:   MEM 0x60000000..0x6fffffff -> 0x60000000
  Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
  pgd = c0003000
  [00000000] *pgd=80000800004003, *pmd=00000000
  Internal error: : 1211 [#1] PREEMPT SMP ARM
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-00009-g6ff59d2-dirty #7
  Hardware name: Keystone
  task: eb878000 task.stack: eb866000
  PC is at dw_pcie_setup_rc+0x24/0x380
  LR is at ks_pcie_host_init+0x10/0x170

Move the dw_pcie_iatu_unroll_enabled() check so we only call it on
platforms that do not use the ATU.  These platforms supply their own
->rd_other_conf() and ->wr_other_conf() methods.

[bhelgaas: changelog]
Fixes: a0601a470537 ("PCI: designware: Add iATU Unroll feature")
Fixes: 416379f9ebde ("PCI: designware: Check for iATU unroll support after initializing host")
Tested-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-By: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fuse: fix time_to_jiffies nsec sanity check

commit 210675270caa33253e4c33f3c5e657e7d6060812 upstream.

Commit bcb6f6d2b9c2 ("fuse: use timespec64") introduced clamped nsec values
in time_to_jiffies but used the max of nsec and NSEC_PER_SEC - 1 instead of
the min. Because of this, dentries would stay in the cache longer than
requested and go stale in scenarios that relied on their timely eviction.

Fixes: bcb6f6d2b9c2 ("fuse: use timespec64")
Signed-off-by: David Sheets <dsheets@docker.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fuse: clear FR_PENDING flag when moving requests out of pending queue

commit a8a86d78d673b1c99fe9b0064739fde9e9774184 upstream.

fuse_abort_conn() moves requests from pending list to a temporary list
before canceling them. This operation races with request_wait_answer()
which also tries to remove the request after it gets a fatal signal. It
checks FR_PENDING flag to determine whether the request is still in the
pending list.

Make fuse_abort_conn() clear FR_PENDING flag so that request_wait_answer()
does not remove the request from temporary list.

This bug causes an Oops when trying to delete an already deleted list entry
in end_requests().

Fixes: ee314a870e40 ("fuse: abort: no fc->lock needed for request ending")
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARC: module: Fix !CONFIG_ARC_DW2_UNWIND builds

commit eb1357d942e5d96de6b4c20a8ffa55acf96233a2 upstream.

commit d65283f7b695b5 added mod->arch.secstr under
CONFIG_ARC_DW2_UNWIND, but used it unconditionally which broke builds
when the option was disabled. Fix that by adjusting the #ifdef guard.

And while at it add a missing guard (for unwinder) in module.c as well

Reported-by: Waldemar Brodkorb <wbx@openadk.org>
Fixes: d65283f7b695b5 ("ARC: module: elide loop to save reference to .eh_frame")
Tested-by: Anton Kolesov <akolesov@synopsys.com>
Reviewed-by: Alexey Brodkin <abrodkin@synopsys.com>
[abrodkin: provided fixlet to Kconfig per failure in allnoconfig build]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

libnvdimm, namespace: fix pmem namespace leak, delete when size set to zero

commit 1f19b983a8877f81763fab3e693c6befe212736d upstream.

Commit 98a29c39dc68 ("libnvdimm, namespace: allow creation of multiple
pmem-namespaces per region") added support for establishing additional
pmem namespace beyond the seed device, similar to blk namespaces.
However, it neglected to delete the namespace when the size is set to
zero.

Fixes: 98a29c39dc68 ("libnvdimm, namespace: allow creation of multiple pmem-namespaces per region")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

svcrpc: don't leak contexts on PROC_DESTROY

commit 78794d1890708cf94e3961261e52dcec2cc34722 upstream.

Context expiry times are in units of seconds since boot, not unix time.

The use of get_seconds() here therefore sets the expiry time decades in
the future. This prevents timely freeing of contexts destroyed by
client RPC_GSS_PROC_DESTROY requests. We'd still free them eventually
(when the module is unloaded or the container shut down), but a lot of
contexts could pile up before then.

Fixes: c5b29f885afe "sunrpc: use seconds since boot in expiry cache"
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sunrpc: don't call sleeping functions from the notifier block callbacks

commit 546125d1614264d26080817d0c8cddb9b25081fa upstream.

The inet6addr_chain is an atomic notifier chain, so we can't call
anything that might sleep (like lock_sock)... instead of closing the
socket from svc_age_temp_xprts_now (which is called by the notifier
function), just have the rpc service threads do it instead.

Fixes: c3d4879e01be "sunrpc: Add a function to close..."
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rcu: Narrow early boot window of illegal synchronous grace periods

commit 52d7e48b86fc108e45a656d8e53e4237993c481d upstream.

The current preemptible RCU implementation goes through three phases
during bootup.  In the first phase, there is only one CPU that is running
with preemption disabled, so that a no-op is a synchronous grace period.
In the second mid-boot phase, the scheduler is running, but RCU has
not yet gotten its kthreads spawned (and, for expedited grace periods,
workqueues are not yet running.  During this time, any attempt to do
a synchronous grace period will hang the system (or complain bitterly,
depending).  In the third and final phase, RCU is fully operational and
everything works normally.

This has been OK for some time, but there has recently been some
synchronous grace periods showing up during the second mid-boot phase.
This code worked "by accident" for awhile, but started failing as soon
as expedited RCU grace periods switched over to workqueues in commit
8b355e3bc140 ("rcu: Drive expedited grace periods from workqueue").
Note that the code was buggy even before this commit, as it was subject
to failure on real-time systems that forced all expedited grace periods
to run as normal grace periods (for example, using the rcu_normal ksysfs
parameter).  The callchain from the failure case is as follows:

early_amd_iommu_init()
|-> acpi_put_table(ivrs_base);
|-> acpi_tb_put_table(table_desc);
|-> acpi_tb_invalidate_table(table_desc);
|-> acpi_tb_release_table(...)
|-> acpi_os_unmap_memory
|-> acpi_os_unmap_iomem
|-> acpi_os_map_cleanup
|-> synchronize_rcu_expedited

The kernel showing this callchain was built with CONFIG_PREEMPT_RCU=y,
which caused the code to try using workqueues before they were
initialized, which did not go well.

This commit therefore reworks RCU to permit synchronous grace periods
to proceed during this mid-boot phase.  This commit is therefore a
fix to a regression introduced in v4.9, and is therefore being put
forward post-merge-window in v4.10.

This commit sets a flag from the existing rcu_scheduler_starting()
function which causes all synchronous grace periods to take the expedited
path.  The expedited path now checks this flag, using the requesting task
to drive the expedited grace period forward during the mid-boot phase.
Finally, this flag is updated by a core_initcall() function named
rcu_exp_runtime_mode(), which causes the runtime codepaths to be used.

Note that this arrangement assumes that tasks are not sent POSIX signals
(or anything similar) from the time that the first task is spawned
through core_initcall() time.

Fixes: 8b355e3bc140 ("rcu: Drive expedited grace periods from workqueue")
Reported-by: "Zheng, Lv" <lv.zheng@intel.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Stan Kain <stan.kain@gmail.com>
Tested-by: Ivan <waffolz@hotmail.com>
Tested-by: Emanuel Castelo <emanuel.castelo@gmail.com>
Tested-by: Bruno Pesavento <bpesavento@infinito.it>
Tested-by: Borislav Petkov <bp@suse.de>
Tested-by: Frederic Bezies <fredbezies@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rcu: Remove cond_resched() from Tiny synchronize_sched()

commit f466ae66fa6a599f9a53b5f9bafea4b8cfffa7fb upstream.

It is now legal to invoke synchronize_sched() at early boot, which causes
Tiny RCU's synchronize_sched() to emit spurious splats. This commit
therefore removes the cond_resched() from Tiny RCU's synchronize_sched().

Fixes: 8b355e3bc140 ("rcu: Drive expedited grace periods from workqueue")
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/PCI: Ignore _CRS on Supermicro X8DTH-i/6/iF/6F

commit 89e9f7bcd8744ea25fcf0ac671b8d72c10d7d790 upstream.

Martin reported that the Supermicro X8DTH-i/6/iF/6F advertises incorrect
host bridge windows via _CRS:

  pci_root PNP0A08:00: host bridge window [io  0xf000-0xffff]
  pci_root PNP0A08:01: host bridge window [io  0xf000-0xffff]

Both bridges advertise the 0xf000-0xffff window, which cannot be correct.

Work around this by ignoring _CRS on this system.  The downside is that we
may not assign resources correctly to hot-added PCI devices (if they are
possible on this system).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=42606
Reported-by: Martin Burnicki <martin.burnicki@meinberg.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tmpfs: clear S_ISGID when setting posix ACLs

commit 497de07d89c1410d76a15bec2bb41f24a2a89f31 upstream.

This change was missed the tmpfs modification in In CVE-2016-7097
commit 073931017b49 ("posix_acl: Clear SGID bit when setting
file permissions")
It can test by xfstest generic/375, which failed to clear
setgid bit in the following test case on tmpfs:

  touch $testfile
  chown 100:100 $testfile
  chmod 2755 $testfile
  _runas -u 100 -g 101 -- setfacl -m u::rwx,g::rwx,o::rwx $testfile

Signed-off-by: Gu Zheng <guzheng1@huawei.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: omap3: Add DTS for Logic PD SOM-LV 37xx Dev Kit

commit 7245f67f86e847769f41dacad26bb8f5b5d74bf4 upstream.

Fixes: ("ab8dd3aed011 ARM: DTS: Add minimal Support for Logic PD
DM3730 SOM-LV")

This adds the dts file into the Makefile. This should have been included in
the original patch.

V2: Update patch description - same source code
V1: Original patch

Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: imx31: fix AVIC base address

commit af92305e567b7f4c9cf48b9e46c1f48ec9ffb1fb upstream.

On i.MX31 AVIC interrupt controller base address is at 0x68000000.

The problem was shadowed by the AVIC driver, which takes the correct
base address from a SoC specific header file.

Fixes: d2a37b3d91f4 ("ARM i.MX31: Add devicetree support")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: imx31: move CCM device node to AIPS2 bus devices

commit 1f87aee6a2e55eda466a43ba6248a8b75eede153 upstream.

i.MX31 Clock Control Module controller is found on AIPS2 bus, move it
there from SPBA bus to avoid a conflict of device IO space mismatch.

Fixes: ef0e4a606fb6 ("ARM: mx31: Replace clk_register_clkdev with clock DT lookup")
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: imx31: fix clock control module interrupts description

commit 2e575cbc930901718cc18e084566ecbb9a4b5ebb upstream.

The type of AVIC interrupt controller found on i.MX31 is one-cell,
namely 31 for CCM DVFS and 53 for CCM, however for clock control
module its interrupts are specified as 3-cells, fix it.

Fixes: ef0e4a606fb6 ("ARM: mx31: Replace clk_register_clkdev with clock DT lookup")
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: imx6q-cm-fx6: fix fec pinctrl

commit 72649a46067903d00f46e2ebef6543768224f1a0 upstream.

According to the schematics of CompuLab's sbc-fx6 baseboard and the
vendor devicetree GPIO_16 is *not* muxed to ENET_REF_CLK but to SPDIF_IN.

Remove the wrong pinctrl setting.

Fixes: 682d055e6ac5 ("ARM: dts: Add initial support for cm-fx6.")
Signed-off-by: Christopher Spinrath <christopher.spinrath@rwth-aachen.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: r8a7794: remove Z clock

commit 68cc085a4daaa32f7138de1e918331c05165a484 upstream.

R8A7794 doesn't have Cortex-A15 CPUs, thus there's no Z clock...

Fixes: 0dce5454d5c2 ("ARM: shmobile: Initial r8a7794 SoC device tree")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: r8a7794: Use SYSC "always-on" PM Domain for sound

commit 24b2d930a50662c11918fd0c22931f1448488da4 upstream.

Hook up the Audio-DMAC and sound device nodes to the SYSC "always-on" PM
Domain, for a more consistent device-power-area description in DT.

Cfr. commit 0761ff2ad0c581f3 ("ARM: dts: r8a7794: Add SYSC PM Domains").

Fixes: 320d6c5a08a4abd3 ("ARM: dts: r8a7794: add sound support")
Fixes: 298e4ee3d213a076 ("ARM: dts: r8a7794: add Audio-DMAC support")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: bcm283x: fix typo in mailbox address

commit 7d891a685dd46b925cf25b74ada0280a2531c34f upstream.

The address of the mailbox node in the bcm283x.dtsi also has a typo.
So fix it accordingly.

Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Fixes: 05b682b7a3b2 ("ARM: bcm2835: dt: Add the mailbox to the device tree")
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf jit: Enable jitdump support without dwarf

commit 621cb4e7837e39d25a5af5a785ad282cdd2b4ce8 upstream.

This patch modifies the build dependencies on the jitdump support in
perf. As it stands jitdump was wrongfully made dependent 100% on using
DWARF. However, the dwarf dependency, only exist if generating the
source line table in genelf_debug.c. The rest of the support does not
need DWARF.

This patch removes the dependency on DWARF for the entire jitdump
support. It keeps it only for the genelf_debug.c support.

Signed-off-by: Maciej Debski <maciejd@google.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-3-git-send-email-eranian@google.com
Fixes: e12b202f8fb9 ("perf jitdump: Build only on supported archs")
[ Make it build only if NO_LIBELF isn't defined, as jitdump.o will only be built in that case ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf scripting: Avoid leaking the scripting_context variable

commit cf346d5bd4b9d61656df2f72565c9b354ef3ca0d upstream.

Both register_perl_scripting() and register_python_scripting() allocate
this variable, fix it by checking if it already was.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 7e4b21b84c43 ("perf/scripts: Add Python scripting engine")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf callchain: Fixup help/config for no-unwinding

commit c56cb33b56c13493eeb95612f80e4dd6e35cd109 upstream.

Since 841e3558b2d ("perf callchain: Recording 'dwarf' callchains do not
need DWARF unwinding support"), --call-graph dwarf is allowed in 'perf
record' even without unwind support. A couple of other places don't
reflect this yet though: the help text should list dwarf as a valid
record mode and the dump_size config should be respected too.

Signed-off-by: Rabin Vincent <rabinv@axis.com>
Cc: He Kuang <hekuang@huawei.com>
Fixes: 841e3558b2de ("perf callchain: Recording 'dwarf' callchains do not need DWARF unwinding support")
Link: http://lkml.kernel.org/r/1470837148-7642-1-git-send-email-rabin.vincent@axis.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf diff: Do not overwrite valid build id

commit ed6c166cc7dc329736cace3affd2df984fb22ec8 upstream.

Fixes a perf diff regression issue which was introduced by commit
5baecbcd9c9a ("perf symbols: we can now read separate debug-info files
based on a build ID")

The binary name could be same when perf diff different binaries. Build
id is used to distinguish between them.
However, the previous patch assumes the same binary name has same build
id. So it overwrites the build id according to the binary name,
regardless of whether the build id is set or not.

Check the has_build_id in dso__load. If the build id is already set, use
it.

Before the fix:

  $ perf diff 1.perf.data 2.perf.data
  # Event 'cycles'
  #
  # Baseline    Delta  Shared Object     Symbol
  # ........  .......  ................  .............................
  #
    99.83%  -99.80%  tchain_edit       [.] f2
     0.12%  +99.81%  tchain_edit       [.] f3
     0.02%   -0.01%  [ixgbe]           [k] ixgbe_read_reg

  After the fix:
  $ perf diff 1.perf.data 2.perf.data
  # Event 'cycles'
  #
  # Baseline    Delta  Shared Object     Symbol
  # ........  .......  ................  .............................
  #
    99.83%   +0.10%  tchain_edit       [.] f3
     0.12%   -0.08%  tchain_edit       [.] f2

Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
CC: Dima Kogan <dima@secretsauce.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 5baecbcd9c9a ("perf symbols: we can now read separate debug-info files based on a build ID")
Link: http://lkml.kernel.org/r/1481642984-13593-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf trace: Check if MAP_32BIT is defined (again)

commit 2bd42f3aaa53ebe78b9be6f898b7945dd61f9773 upstream.

There might be systems where MAP_32BIT is not defined, like some some
RHEL7 powerpc versions.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kyle McMartin <kyle@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 256763b01741 ("perf trace beauty mmap: Add more conditional defines")
Link: http://lkml.kernel.org/r/1481831814-23683-1-git-send-email-jolsa@kernel.org
[ Changed the Fixme cset to the one removing the conditional switch case for MAP_32BIT ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf mem: Fix --all-user/--all-kernel options

commit 631ac41b46d293fb3ee43a809776c1663de8d9c6 upstream.

Removing extra '--' prefix.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: ad16511b0e40 ("perf mem: Add -U/-K (--all-user/--all-kernel) options")
Link: http://lkml.kernel.org/r/1481538943-21874-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf trace: Use the syscall raw_syscalls:sys_enter timestamp

commit ecf1e2253ea79c6204f4d6a5e756e8fb4aed5a7e upstream.

Instead of the one when another syscall takes place while another is being
processed (in another CPU, but we show it serialized, so need to "interrupt"
the other), and also when finally showing the sys_enter + sys_exit + duration,
where we were showing the sample->time for the sys_exit, duh.

Before:

  # perf trace sleep 1
  <SNIP>
     0.373 (   0.001 ms): close(fd: 3                   ) = 0
  1000.626 (1000.211 ms): nanosleep(rqtp: 0x7ffd6ddddfb0) = 0
  1000.653 (   0.003 ms): close(fd: 1                   ) = 0
  1000.657 (   0.002 ms): close(fd: 2                   ) = 0
  1000.667 (   0.000 ms): exit_group(                   )
  #

After:

  # perf trace sleep 1
  <SNIP>
     0.336 (   0.001 ms): close(fd: 3                   ) = 0
     0.373 (1000.086 ms): nanosleep(rqtp: 0x7ffe303e9550) = 0
  1000.481 (   0.002 ms): close(fd: 1                   ) = 0
  1000.485 (   0.001 ms): close(fd: 2                   ) = 0
  1000.494 (   0.000 ms): exit_group(                   )
[root@jouet linux]#

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ecbzgmu2ni6glc6zkw8p1zmx@git.kernel.org
Fixes: 752fde44fd1c ("perf trace: Support interrupted syscalls")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/IPoIB: Remove can't use GFP_NOIO warning

commit 0b59970e7d96edcb3c7f651d9d48e1a59af3c3b0 upstream.

Remove the warning print of "can't use of GFP_NOIO" to avoid prints in
each QP creation when devices aren't supporting IB_QP_CREATE_USE_GFP_NOIO.

This print become more annoying when the IPoIB interface is configured
to work in connected mode.

Fixes: 09b93088d750 ('IB: Add a QP creation flag to use GFP_NOIO allocations')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/mlx4: Check if GRH is available before using it

commit bf08e884bfd5be068fd2ccf2bc450f085d8dd853 upstream.

Before reading GRH attributes, need to make sure AH contains GRH,
and in addition, initialize GID type.

Fixes: dbf727de7440 ('IB/core: Use GID table in AH creation and dmac resolution')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/mlx4: When no DMFS for IPoIB, don't allow NET_IF QPs

commit 1f22e454df2eb99ba6b7ace3f594f6805cdf5cbc upstream.

According to the firmware spec, FLOW_STEERING_IB_UC_QP_RANGE command is
supported only if dmfs_ipoib bit is set.

If it isn't set we want to ensure allocating NET_IF QPs fail. We do so
by filling out the allocation bitmap. By thus, the NET_IF QPs allocating
function won't find any free QP and will fail.

Fixes: c1c98501121e ('IB/mlx4: Add support for steerable IB UD QPs')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/mlx4: Fix port query for 56Gb Ethernet links

commit 6fa26208206c406fa529cd73f7ae6bf4181e270b upstream.

Report the correct speed in the port attributes when using a 56Gbps
ethernet link. Without this change the field is incorrectly set to 10.

Fixes: a9c766bb75ee ('IB/mlx4: Fix info returned when querying IBoE ports')
Fixes: 2e96691c31ec ('IB: Use central enum for speed instead of hard-coded values')
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/mlx4: Handle well-known-gid in mad_demux processing

commit befcabcd530e4ffb6f016638f693b7d94986d2ba upstream.

If OpenSM runs over a ConnectX-3, and there are ConnectX-4 or Connect-IB
VFs active on the network, the OpenSM will receive QP1 packets containing
a GRH where the destination GID is the "Well-Known GID" -- which is not a
GID in the HCA Port's GID Table.

This GID must be tested-for separately -- and packets which contain
this destination GID should be routed to slave 0 (the PF).

Fixes: 37bfc7c1e83f ('IB/mlx4: SR-IOV multiplex and demultiplex MADs')
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/mlx4: Fix out-of-range array index in destroy qp flow

commit c482af646d0809a8d5e1b7f4398cce3592589b98 upstream.

For non-special QPs, the port value becomes non-zero only at the
RESET-to-INIT transition. If the QP has not undergone that transition,
its port number value is still zero.

If such a QP is destroyed before being moved out of the RESET state,
subtracting one from the qp port number results in a negative value.
Using that negative value as an index into the qp1_proxy array
results in an out-of-bounds array reference.

Fix this by testing that the QP type is one that uses qp1_proxy before
using the port number. For special QPs of all types, the port number is
specified at QP creation time.

Fixes: 9433c188915c ("IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/mlx4: Set traffic class in AH

commit af4295c117b82a521b05d0daf39ce879d26e6cb1 upstream.

Set traffic class within sl_tclass_flowlabel when create iboe AH.
Without this the TOS value will be empty when running VLAN tagged
traffic, because the TOS value is taken from the traffic class in the
address handle attributes.

Fixes: 9106c4106974 ('IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/mlx5: Wait for all async command completions to complete

commit acbda523884dcf45613bf6818d8ead5180df35c2 upstream.

Wait before continuing unload till all pending mkey async creation requests
are done.

Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/mlx5: Assign SRQ type earlier

commit c73b7911de97fad3ab9032a110af48d6ab2da48f upstream.

Move the SRQ type assignment to be before actually using it
in create_srq_user() and in create_srq_kernel() functions.

Fixes: af1ba291c5e4 ('{net, IB}/mlx5: Refactor internal SRQ API')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/mlx5: Fix reported max SGE calculation

commit 288c01b746aab484651391ca6d64b585d3eb5ec6 upstream.

Add the 512 bytes limit of RDMA READ and the size of remote
address to the max SGE calculation.

Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/mlx5: Avoid system crash when enabling many VFs

commit afd02cd3a9b6c04b41d946b5d7f6e17b3fc30c6b upstream.

When enabling many VFs, the total amount of DMA mappings increase
significantly. This causes DMA allocations to take a lot of time
since they are serialized in the kernel.

As a result the driver enters into fatal condition due to
timeout and the system hangs. To recover from this we disable
MR cache for VFs.

PFs will still have a full cache and VFs cache can be manipulated
as usual after driver load.

Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/rxe: avoid putting a large struct rxe_qp on stack

commit a0fa72683e78979ef1123d679b1c40ae28bd9096 upstream.

A race condition fix added an rxe_qp structure to the stack in order
to be able to perform rollback in rxe_requester(), but the structure
is large enough to trigger the warning for possible stack overflow:

drivers/infiniband/sw/rxe/rxe_req.c: In function 'rxe_requester':
drivers/infiniband/sw/rxe/rxe_req.c:757:1: error: the frame size of 2064 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

This changes the rollback function to only save the psn inside
the qp, which is the only field we access in the rollback_qp
anyway.

Fixes: 3050b9985024 ("IB/rxe: Fix race condition between requester and completer")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/rxe: Increase max number of completions to 32k

commit d680ebed91e0b45c43ae03a880a0b43211096161 upstream.

Increase limit of max CQE from 8K to 32K to allow demanding
applications to work over SoftRoCE with same configuration
as most RoCEv2 HW vendors have.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/core: Release allocated memory in cache setup failure

commit aa6aae38f7fb2c030f326a6dd10b58fff1851dfa upstream.

The failure in ib_cache_setup_one function during
ib_register_device will leave leaked allocated memory.

Fixes: 03db3a2d81e6 ("IB/core: Add RoCE GID table management")
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pinctrl: sh-pfc: Do not unconditionally support PIN_CONFIG_BIAS_DISABLE

commit 5d7400c4acbf7fe633a976a89ee845f7333de3e4 upstream.

Always stating PIN_CONFIG_BIAS_DISABLE is supported gives untrue output
when examining /sys/kernel/debug/pinctrl/e6060000.pfc/pinconf-pins if
the operation get_bias() is implemented but the pin is not handled by
the get_bias() implementation. In that case the output will state that
"input bias disabled" indicating that this pin has bias control
support.

Make support for PIN_CONFIG_BIAS_DISABLE depend on that the pin either
supports SH_PFC_PIN_CFG_PULL_UP or SH_PFC_PIN_CFG_PULL_DOWN. This also
solves the issue where SoC specific implementations print error messages
if their particular implementation of {set,get}_bias() is called with a
pin it does not know about.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: hugetlb: fix the wrong return value for huge_ptep_set_access_flags

commit 69d012345a1a32d3f03957f14d972efccc106a98 upstream.

In current code, the @changed always returns the last one's status for
the huge page with the contiguous bit set. This is really not what we
want. Even one of the PTEs is changed, we should tell it to the caller.

This patch fixes this issue.

Fixes: 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit")
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: hugetlb: remove the wrong pmd check in find_num_contig()

commit 20156ce2365d61beaa6f5a78a7a789044e0e7acc upstream.

The find_num_contig() will return 1 when the pmd is not present.
It will cause a kernel dead loop in the following scenaro:

   1.) pmd entry is not present.

   2.) the page fault occurs:
       ... hugetlb_fault() --> hugetlb_no_page() --> set_huge_pte_at()

   3.) set_huge_pte_at() will only set the first PMD entry, since the
       find_num_contig just return 1 in this case. So the PMD entries
       are all empty except the first one.

   4.) when kernel accesses the address mapped by the second PMD entry,
       a new page fault occurs:
       ... hugetlb_fault() --> huge_ptep_set_access_flags()

       The second PMD entry is still empty now.

   5.) When the kernel returns, the access will cause a page fault again.
       The kernel will run like the "4)" above.
       We will see a dead loop since here.

The dead loop is caught in the 32M hugetlb page (2M PMD + Contiguous bit).

This patch removes wrong pmd check, and fixes this dead loop.

This patch also removes the redundant checks for PGD/PUD in
the find_num_contig().

Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: hugetlb: fix the wrong address for several functions

commit 0c2f0afe3582c58efeef93bc57bc07d502132618 upstream.

The libhugetlbfs meets several failures since the following functions
do not use the correct address:
   huge_ptep_get_and_clear()
   huge_ptep_set_access_flags()
   huge_ptep_set_wrprotect()
   huge_ptep_clear_flush()

This patch fixes the wrong address for them.

Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

powerpc/powernv: Don't warn on PE init if unfreeze is unsupported

commit d4791db527bf397c84c9956c3ece9692ed5322ac upstream.

Whenever a PE is initialised in powernv, opal_pci_eeh_freeze_clear() is
called. This is to remove any existing freeze, and has no negative side
effects if the PE is already in an unfrozen state. On PHB backends that
don't support this operation and return OPAL_UNSUPPORTED, this creates a
scary and misleading warning message.

Skip the warning message on init if OPAL_UNSUPPORTED is returned.

As far as I'm aware, this currently only affects NPUs.

Fixes: 313483d ("powerpc/powernv: Unfreeze PE on allocation")
Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

powerpc/ibmebus: Fix device reference leaks in sysfs interface

commit fe0f3168169f7c34c29b0cf0c489f126a7f29643 upstream.

Make sure to drop any reference taken by bus_find_device() in the sysfs
callbacks that are used to create and destroy devices based on
device-tree entries.

Fixes: 6bccf755ff53 ("[POWERPC] ibmebus: dynamic addition/removal of adapters, some code cleanup")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

powerpc/ibmebus: Fix further device reference leaks

commit 815a7141c4d1b11610dccb7fcbb38633759824f2 upstream.

Make sure to drop any reference taken by bus_find_device() when creating
devices during init and driver registration.

Fixes: 55347cc9962f ("[POWERPC] ibmebus: Add device creation and bus probing based on of_device")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

powerpc/mm: Correct process and partition table max size

commit 555c16328ae6d75a90e234eac9b51998d68f185b upstream.

Version 3.00 of the ISA states that the PATS (partition table size) field
of the PTCR (partition table control register) and the PRTS (process table
size) field of the partition table entry must both be less than or equal
to 24. However the actual size of the partition and process tables is equal
to 2 to the power of 12 plus the PATS and PRTS fields, respectively. This
means that the max allowable size of each of these tables is 2^36 or 64GB
for both.

Thus when checking the size shift for each we should be checking for values
of greater than 36 instead of the current check for shifts larger than 24
and 23.

Fixes: 2bfd65e45e877fb5704730244da67c748d28a1b8
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bus: vexpress-config: fix device reference leak

commit c090959b9dd8c87703e275079aa4b4a824ba3f8e upstream.

Make sure to drop the reference to the parent device taken by
class_find_device() after populating the bus.

Fixes: 3b9334ac835b ("mfd: vexpress: Convert custom func API to regmap")
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

blk-mq: Always schedule hctx->next_cpu

commit c02ebfdddbafa9a6a0f52fbd715e6bfa229af9d3 upstream.

Commit 0e87e58bf60e ("blk-mq: improve warning for running a queue on the
wrong CPU") attempts to avoid triggering the WARN_ON in
__blk_mq_run_hw_queue when the expected CPU is dead. Problem is, in the
last batch execution before round robin, blk_mq_hctx_next_cpu can
schedule a dead CPU and also update next_cpu to the next alive CPU in
the mask, which will trigger the WARN_ON despite the previous
workaround.

The following patch fixes this scenario by always scheduling the value
in hctx->next_cpu. This changes the moment when we round-robin the CPU
running the hctx, but it really doesn't matter, since it still executes
BLK_MQ_CPU_WORK_BATCH times in a row before switching to another CPU.

Fixes: 0e87e58bf60e ("blk-mq: improve warning for running a queue on the wrong CPU")
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

power: supply: bq27xxx_battery: Fix register map for BQ27510 and BQ27520

commit 3bee9ea1de687925d116670f036599cbed8b66b0 upstream.

The BQ27510 and BQ27520 use a slightly different register map than the
BQ27500, add a new type enum and add these gauges to it.

Fixes: d74534c27775 ("power: bq27xxx_battery: Add support for additional bq27xxx family devices")
Based-on-patch-by: Kenneth R. Crudup <kenny@panix.com>
Signed-off-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bq24190_charger: Fix PM runtime use for bq24190_battery_set_property

commit 075eb5719d53e8bb4a406ad87e1de99319aa50f0 upstream.

There's a typo, it should do pm_runtime_get_sync, not put.

Fixes: d7bf353fd0aa3 ("bq24190_charger: Add support for TI BQ24190 Battery Charger")
Cc: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Mark Greer <mgreer@animalcreek.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iw_cxgb4: Fix error return code in c4iw_rdev_open()

commit 15f7e3c21b76598bc6e5816d2577ce843b2b963f upstream.

Fix to return error code -ENOMEM from the __get_free_page() error
handling case instead of 0, as done elsewhere in this function.

Fixes: 05eb23893c2c ("cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

powercap/intel_rapl: fix and tidy up error handling

commit cb43f81b8489dcb87555e16c17453f0a9fa690f2 upstream.

Commit e1399ba20eee ("powercap / RAPL: handle missing MSRs") added
contraint_to_pl() function to return index into an array. But it
can potentially return -EINVAL if powercap layer sends an out of
range constraint ID. This patch adds sanity check.

Unnecessary RAPL domain pointer check is removed since it must be
initialized before calling rapl_unit_xlate().

Fixes: e1399ba20eee ("powercap / RAPL: handle missing MSRs")
Reported-by: Odzioba, Lukasz <lukasz.odzioba@intel.com>
Reported-by: Koss, Marcin <marcin.koss@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI / APEI: Fix NMI notification handling

commit a545715d2dae8d071c5b06af947b07ffa846b288 upstream.

When removing and adding cpu 0 on a system with GHES NMI the following stack
trace is seen when re-adding the cpu:

WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1349 setup_local_APIC+
Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache coretemp intel_ra
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc6+ #2
Call Trace:
dump_stack+0x63/0x8e
__warn+0xd1/0xf0
warn_slowpath_null+0x1d/0x20
setup_local_APIC+0x275/0x370
apic_ap_setup+0xe/0x20
start_secondary+0x48/0x180
set_init_arg+0x55/0x55
early_idt_handler_array+0x120/0x120
x86_64_start_reservations+0x2a/0x2c
x86_64_start_kernel+0x13d/0x14c

During the cpu bringup, wakeup_cpu_via_init_nmi() is called and issues an
NMI on CPU 0. The GHES NMI handler, ghes_notify_nmi() runs the
ghes_proc_irq_work work queue which ends up setting IRQ_WORK_VECTOR
(0xf6). The "faulty" IR line set at arch/x86/kernel/apic/apic.c:1349 is also
0xf6 (specifically APIC IRR for irqs 255 to 224 is 0x400000) which confirms
that something has set the IRQ_WORK_VECTOR line prior to the APIC being
initialized.

Commit 2383844d4850 ("GHES: Elliminate double-loop in the NMI handler")
incorrectly modified the behavior such that the handler returns
NMI_HANDLED only if an error was processed, and incorrectly runs the ghes
work queue for every NMI.

This patch modifies the ghes_proc_irq_work() to run as it did prior to
2383844d4850 ("GHES: Elliminate double-loop in the NMI handler") by
properly returning NMI_HANDLED and only calling the work queue if
NMI_HANDLED has been set.

Fixes: 2383844d4850 (GHES: Elliminate double-loop in the NMI handler)
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

block: cfq_cpd_alloc() should use @gfp

commit ebc4ff661fbe76781c6b16dfb7b754a5d5073f8e upstream.

cfq_cpd_alloc() which is the cpd_alloc_fn implementation for cfq was
incorrectly hard coding GFP_KERNEL instead of using the mask specified
through the @gfp parameter. This currently doesn't cause any actual
issues because all current callers specify GFP_KERNEL. Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: e4a9bde9589f ("blkcg: replace blkcg_policy->cpd_size with ->cpd_alloc/free_fn() methods")
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

block: Change extern inline to static inline

commit 9a05e7541c39680d28ecf91892338e074738d5fd upstream.

With compilers which follow the C99 standard (like modern versions of
gcc and clang), "extern inline" does the opposite thing from older
versions of gcc (emits code for an externally linkable version of the
inline function).

"static inline" does the intended behavior in all cases instead.

Description taken from commit 6d91857d4826 ("staging, rtl8192e,
LLVMLinux: Change extern inline to static inline").

This also fixes the following GCC warning when building with CONFIG_PM
disabled:

./include/linux/blkdev.h:1143:20: warning: no previous prototype for 'blk_set_runtime_active' [-Wmissing-prototypes]

Fixes: d07ab6d11477 ("block: Add blk_set_runtime_active()")
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI / CPPC: set an error code on probe error path

commit 501634759d55a5b56967de6d9465acf02bbc3565 upstream.

We should return -EINVAL (instead of 0) if get_cpu_device() fails.

Fixes: 158c998ea44b (ACPI / CPPC: add sysfs support to compute delivered performance)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

regulators: helpers: Fix handling of bypass_val_on in get_bypass_regmap

commit 85b037442e3f0e84296ab1010fd6b057eee18496 upstream.

The handling of bypass_val_on that was added in
regulator_get_bypass_regmap is done unconditionally however
several drivers don't define a value for bypass_val_on. This
results in those drivers reporting bypass being enabled when
it is not. In regulator_set_bypass_regmap we use bypass_mask
if bypass_val_on is zero. This patch adds similar handling in
regulator_get_bypass_regmap.

Fixes: commit dd1a571daee7 ("regulator: helpers: Ensure bypass register field matches ON value")
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cpufreq: powernv: Disable preemption while checking CPU throttling state

commit 8a10c06a20ec8097a68fd7a4a1c0e285095b4d2f upstream.

With preemption turned on we can read incorrect throttling state
while being switched to CPU on a different chip.

BUG: using smp_processor_id() in preemptible [00000000] code: cat/7343
caller is .powernv_cpufreq_throttle_check+0x2c/0x710
CPU: 13 PID: 7343 Comm: cat Not tainted 4.8.0-rc5-dirty #1
Call Trace:
[c0000007d25b75b0] [c000000000971378] .dump_stack+0xe4/0x150 (unreliable)
[c0000007d25b7640] [c0000000005162e4] .check_preemption_disabled+0x134/0x150
[c0000007d25b76e0] [c0000000007b63ac] .powernv_cpufreq_throttle_check+0x2c/0x710
[c0000007d25b7790] [c0000000007b6d18] .powernv_cpufreq_target_index+0x288/0x360
[c0000007d25b7870] [c0000000007acee4] .__cpufreq_driver_target+0x394/0x8c0
[c0000007d25b7920] [c0000000007b22ac] .cpufreq_set+0x7c/0xd0
[c0000007d25b79b0] [c0000000007adf50] .store_scaling_setspeed+0x80/0xc0
[c0000007d25b7a40] [c0000000007ae270] .store+0xa0/0x100
[c0000007d25b7ae0] [c0000000003566e8] .sysfs_kf_write+0x88/0xb0
[c0000007d25b7b70] [c0000000003553b8] .kernfs_fop_write+0x178/0x260
[c0000007d25b7c10] [c0000000002ac3cc] .__vfs_write+0x3c/0x1c0
[c0000007d25b7cf0] [c0000000002ad584] .vfs_write+0xc4/0x230
[c0000007d25b7d90] [c0000000002aeef8] .SyS_write+0x58/0x100
[c0000007d25b7e30] [c00000000000bfec] system_call+0x38/0xfc

Fixes: 09a972d16209 (cpufreq: powernv: Report cpu frequency throttling)
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

powerpc/64: Simplify adaptation to new ISA v3.00 HPTE format

commit 6b243fcfb5f1e16bcf732e6f86a63f8af5b59a9f upstream.

This changes the way that we support the new ISA v3.00 HPTE format.
Instead of adapting everything that uses HPTE values to handle either
the old format or the new format, depending on which CPU we are on,
we now convert explicitly between old and new formats if necessary
in the low-level routines that actually access HPTEs in memory.
This limits the amount of code that needs to know about the new
format and makes the conversions explicit. This is OK because the
old format contains all the information that is in the new format.

This also fixes operation under a hypervisor, because the H_ENTER
hypercall (and other hypercalls that deal with HPTEs) will continue
to require the HPTE value to be supplied in the old format. At
present the kernel will not boot in HPT mode on POWER9 under a
hypervisor.

This fixes and partially reverts commit 50de596de8be
("powerpc/mm/hash: Add support for Power9 Hash", 2016-04-29).

Fixes: 50de596de8be ("powerpc/mm/hash: Add support for Power9 Hash")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

remoteproc: st: Fix error return code in st_rproc_probe()

commit 1d701d3dd8caf6660ff33c3c23a115b4649c5cdb upstream.

Fix to return a negative error code from the st_rproc_state() error
handling case instead of 0, as done elsewhere in this function.

Fixes: 63edb0310a5c ("remoteproc: Supply controller driver for ST's Remote Processors")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

remoteproc: qcom_wcnss: Fix circular module dependency

commit 6de1a507c46bf22ed97043495b9ab96e4d5c213b upstream.

The tie between the main WCNSS driver and the IRIS driver causes a
circular dependency between the two modules. Neither part makes sense to
have on their own so lets merge them into one module.

For the sake of picking up the clock and regulator resources described
in the iris of_node we need an associated struct device. But, to keep
the size of the patch down we continue to represent the IRIS part as its
own platform_driver, within the same module, rather than setting up a
dummy device.

Fixes: aed361adca9f ("remoteproc: qcom: Introduce WCNSS peripheral image loader")
Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm: Initialise drm_mm.head_node.allocated

commit cc98e6ce6abe1c0103cbd7aff1ee586622a9361e upstream.

commit 202b52b7fbf7 ("drm: Track drm_mm nodes with an interval tree")
introduced a requirement that the special drm_mm.head_node was
initialised and marked as not being allocated. It is a very special node
that has no side but has a hole that represents the drm_mm address
space, and holds the list of nodes. Since it is not a real node, it is
not part of the node rbtree and we detect this as it being unallocated.
This presumed that drm_mm_init() was initialising it to zero. It happens
that i915 kzallocs its objects and so it was accidentally setting it,
but for generic use we cannot make that assumption.

[   22.981519] general protection fault: 0000 [#1] SMP
[   22.981521] Modules linked in: test_drm_mm(+) ctr ccm arc4 rt2800usb rt2x00usb rt2800lib rt2x00lib crc_ccitt mac80211 cmac rfcomm bnep snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel dcdbas snd_hda_codec x86_pkg_temp_thermal intel_powerclamp btusb snd_hda_core coretemp crct10dif_pclmul cfg80211 btrtl btbcm btintel bluetooth crc32_pclmul ghash_clmulni_intel aesni_intel snd_pcm i2c_hid aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_timer hid_multitouch snd joydev serio_raw lpc_ich mfd_core i2c_designware_platform i2c_designware_core 8250_dw binfmt_misc soundcore acpi_pad nls_iso8859_1 usbhid hid psmouse ahci libahci [last unloaded: test_drm_mm]
[   22.981544] CPU: 1 PID: 2088 Comm: drm_mm Tainted: G        W       4.9.0-rc7+ #234
[   22.981545] Hardware name: Dell Inc. XPS 13 9343/0310JH, BIOS A07 11/11/2015
[   22.981546] task: ffff88020c971cc0 task.stack: ffffc90001728000
[   22.981547] RIP: 0010:[<ffffffff814050f0>]  [<ffffffff814050f0>] drm_mm_interval_tree_add_node+0xa0/0xd0
[   22.981551] RSP: 0018:ffffc9000172ba98  EFLAGS: 00010202
[   22.981552] RAX: 0f0000c69cf63d80 RBX: ffff88020be00000 RCX: ffff88020be00000
[   22.981553] RDX: 0000000000000fff RSI: ffffc9000172bc48 RDI: ffffffff810ac4df
[   22.981553] RBP: ffffc9000172bb08 R08: ffffc9000172bc70 R09: 0000000000000fff
[   22.981554] R10: ffffffff810ac4d7 R11: 4dc04d8b4cffffe5 R12: 0000000000001000
[   22.981555] R13: ffffc9000172bbd0 R14: ffffc9000172bbe0 R15: 0000000002000000
[   22.981556] FS:  00007f80c9fab740(0000) GS:ffff88021f480000(0000) knlGS:0000000000000000
[   22.981557] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   22.981558] CR2: 00007f80c9fd5000 CR3: 000000020c191000 CR4: 00000000003406e0
[   22.981559] Stack:
[   22.981560]  ffffffff81405d09 ffff88020be00000 ffffc9000172bbe0 000000000172bb08
[   22.981562]  ffffffffffffffff 0000000000000000 0000000000000000 0000000000000000
[   22.981563]  0000000002000000 0000000002000000 ffffffffa02f3000 ffff88020be00000
[   22.981565] Call Trace:
[   22.981568]  [<ffffffff81405d09>] ? drm_mm_insert_node_generic+0x229/0x310
[   22.981570]  [<ffffffffa02f3000>] ? 0xffffffffa02f3000
[   22.981572]  [<ffffffffa02903c1>] __subtest_insert_range.constprop.7+0xd1/0x5b0 [test_drm_mm]
[   22.981575]  [<ffffffff81081222>] ? default_wake_function+0x12/0x20
[   22.981576]  [<ffffffff81096905>] ? __wake_up_common+0x55/0x90
[   22.981578]  [<ffffffff81085f42>] ? sched_clock_cpu+0x72/0xa0
[   22.981581]  [<ffffffff811308ad>] ? irq_work_queue+0xd/0x80
[   22.981582]  [<ffffffff810abcc4>] ? wake_up_klogd+0x34/0x40
[   22.981584]  [<ffffffff810ac19d>] ? console_unlock+0x4cd/0x530
[   22.981585]  [<ffffffff810ac4d7>] ? vprintk_emit+0x2d7/0x490
[   22.981587]  [<ffffffff810ac82f>] ? vprintk_default+0x1f/0x30
[   22.981589]  [<ffffffff81146e1c>] ? printk+0x4d/0x4f
[   22.981590]  [<ffffffffa02f3000>] ? 0xffffffffa02f3000
[   22.981592]  [<ffffffffa02908b5>] subtest_insert_range+0x15/0x80 [test_drm_mm]
[   22.981594]  [<ffffffffa02f3088>] test_drm_mm_init+0x88/0x1000 [test_drm_mm]
[   22.981597]  [<ffffffff8100043d>] do_one_initcall+0x3d/0x150
[   22.981600]  [<ffffffff8119dfbf>] ? kfree+0x13f/0x180
[   22.981602]  [<ffffffff811471f2>] do_init_module+0x60/0x1f1
[   22.981606]  [<ffffffff810db878>] load_module+0x2228/0x2790
[   22.981608]  [<ffffffff810d8590>] ? __symbol_put+0x40/0x40
[   22.981612]  [<ffffffff811c52b1>] ? kernel_read+0x41/0x60
[   22.981614]  [<ffffffff810dbfb6>] SYSC_finit_module+0x96/0xd0
[   22.981617]  [<ffffffff810dc00e>] SyS_finit_module+0xe/0x10
[   22.981620]  [<ffffffff816e7aa4>] entry_SYSCALL_64_fastpath+0x17/0x98
[   22.981622] Code: c7 41 30 00 00 00 00 48 89 e5 48 89 3a 48 c7 c2 20 4e 40 81 e8 b2 a1 f0 ff 5d c3 48 8d 56 78 45 31 d2 48 89 d6 eb 25 48 8b 51 58 <48> 39 50 38 73 04 48 89 50 38 4c 8b 58 28 4c 39 59 48 48 8d 50
[   22.981651] RIP  [<ffffffff814050f0>] drm_mm_interval_tree_add_node+0xa0/0xd0
[   22.981655]  RSP <ffffc9000172ba98>

Testcase: igt/drm_mm
Fixes: 202b52b7fbf7 ("drm: Track drm_mm nodes with an interval tree")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: <drm-intel-fixes@lists.freedesktop.org> # v4.9-rc1+
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161130205126.31106-1-chris@chris-wilson.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915: Move the min_pixclk[] handling to the end of readout

commit 00b2b7288299a8c73c0c37b531a075ba5c849e67 upstream.

Trying to determine the pixel rate of the pipe can't be done until we
know the clock, which means it can't be done until the encoder
.get_config() hooks have been called. So let's move the min_pixclk[]
stuff to the end of intel_modeset_readout_hw_state() when we actually
have gathered all the required infromation.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Fixes: 565602d7501a ("drm/i915: Do not acquire crtc state to check clock during modeset, v4.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161220153902.15621-1-ville.syrjala@linux.intel.com
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
(cherry picked from commit aca1ebf491518910df156f3dab6a66306bb52e28)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/panel: simple: Check against num_timings when setting preferred for timing

commit 230c5b44233ff0543c0b5ccf4ff9400057010fbe upstream.

In the loop on .timings, we should check .num_timings to see if it's the
only mode specified, not .num_modes, which should be used with .modes.

Fixes: cda553725c92 ("drm/panel: simple: Set appropriate mode type")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm: avoid uninitialized timestamp use in wait_vblank

commit cff52e5fc4cfc978b7df898dc14a0492c7ef0ae8 upstream.

gcc warns about the timestamp in drm_wait_vblank being possibly
used without an initialization:

drivers/gpu/drm/drm_irq.c: In function 'drm_crtc_send_vblank_event':
drivers/gpu/drm/drm_irq.c:992:24: error: 'now.tv_usec' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/gpu/drm/drm_irq.c:1069:17: note: 'now.tv_usec' was declared here
drivers/gpu/drm/drm_irq.c:991:23: error: 'now.tv_sec' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This can happen if drm_vblank_count_and_time() returns 0 in its
error path. To sanitize the error case, I'm changing that function
to return a zero timestamp when it fails.

Fixes: e6ae8687a87b ("drm: idiot-proof vblank")
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161017221355.1861551-6-arnd@arndb.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915/gen9: Fix PCODE polling during SAGV disabling

commit dccf82ad1775f2b9c36ec85e25e39d88c7e86818 upstream.

According to the previous patch, it's possible atm that we call
intel_do_sagv_disable() only once during the 1ms period and time out if
that call fails. As opposed to this the spec says that we need to keep
retrying this request for a 1ms duration, so let's do this similarly to
the CDCLK change notification request.

v4-5:
- Rebased on the reply_mask, reply change.
v6:
- Remove w/s change. (Lyude)
- Rebased on the timeout_base argument change.

Cc: Lyude <cpaul@redhat.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 656d1b89e5ff ("drm/i915/skl: Add support for the SAGV, fix underrun hangs")
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Lyude <lyude@redhat.com> (v4)
Link: http://patchwork.freedesktop.org/patch/msgid/1480955258-26311-2-git-send-email-imre.deak@intel.com
(cherry picked from commit b3b8e99984a4eace91bc097e8f8cec71441cae16)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: mux: pca954x: fix i2c mux selection caching

commit 7f638c1cb0a1112dbe0b682a42db30521646686b upstream.

smbus functions return -ve on error, 0 on success. However,
__i2c_transfer() have a different return signature - -ve on error, or
number of buffers transferred (which may be zero or greater.)

The upshot of this is that the sense of the test is reversed when using
the mux on a bus supporting the master_xfer method: we cache the value
and never retry if we fail to transfer any buffers, but if we succeed,
we clear the cached value.

Fix this by making pca954x_reg_write() return a negative error code for
all failure cases.

Fixes: 463e8f845cbf ("i2c: mux: pca954x: retry updating the mux selection on failure")
Acked-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFSv4.1: nfs4_fl_prepare_ds must be careful about reporting success.

commit cfd278c280f997cf2fe4662e0acab0fe465f637b upstream.

Various places assume that if nfs4_fl_prepare_ds() turns a non-NULL 'ds',
then ds->ds_clp will also be non-NULL.

This is not necessasrily true in the case when the process received a fatal signal
while nfs4_pnfs_ds_connect is waiting in nfs4_wait_ds_connect().
In that case ->ds_clp may not be set, and the devid may not recently have been marked
unavailable.

So add a test for ds_clp == NULL and return NULL in that case.

Fixes: c23266d532b4 ("NFS4.1 Fix data server connection race")
Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: Olga Kornievskaia <aglo@umich.edu>
Acked-by: Adamson, Andy <William.Adamson@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFS: Fix a performance regression in readdir

commit 79f687a3de9e3ba2518b4ea33f38ca6cbe9133eb upstream.

Ben Coddington reports that commit 311324ad1713, by adding the function
nfs_dir_mapping_need_revalidate() that checks page cache validity on
each call to nfs_readdir() causes a performance regression when
the directory is being modified.

If the directory is changing while we're iterating through the directory,
POSIX does not require us to invalidate the page cache unless the user
calls rewinddir(). However, we still do want to ensure that we use
readdirplus in order to avoid a load of stat() calls when the user
is doing an 'ls -l' workload.

The fix should be to invalidate the page cache immediately when we're
setting the NFS_INO_ADVISE_RDPLUS bit.

Reported-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: 311324ad1713 ("NFS: Be more aggressive in using readdirplus...")
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pNFS: Fix race in pnfs_wait_on_layoutreturn

commit ee284e35d8c71bf5d4d807eaff6f67a17134b359 upstream.

We must put the task to sleep while holding the inode->i_lock in order
to ensure atomicity with the test for NFS_LAYOUT_RETURN.

Fixes: 500d701f336b ("NFS41: make close wait for layoutreturn")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFS: fix typo in parameter description

commit f36ab161bebe464d33b998294eff29b17a9c8918 upstream.

Fix typo in parameter description.

Fixes: 5405fc44c337 ("NFSv4.x: Add kernel parameter to control the
callback server")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pinctrl: meson: fix gpio request disabling other modes

commit f24d311f92b516a8aadef5056424ccabb4068e7b upstream.

The pinctrl_gpio_request is called with the "full" gpio number, already
containing the base, then meson_pmx_request_gpio is then called with the
final pin number.
Remove the base addition when calling meson_pmx_disable_other_groups.

Fixes: 6ac730951104 ("pinctrl: add driver for Amlogic Meson SoCs")
CC: Beniamino Galvani <b.galvani@gmail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Acked-by: Beniamino Galvani <b.galvani@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: fix error handling when run_delayed_extent_op fails

commit aa7c8da35d1905d80e840d075f07d26ec90144b5 upstream.

In __btrfs_run_delayed_refs, the error path when run_delayed_extent_op
fails sets locked_ref->processing = 0 but doesn't re-increment
delayed_refs->num_heads_ready. As a result, we end up triggering
the WARN_ON in btrfs_select_ref_head.

Fixes: d7df2c796d7 (Btrfs: attach delayed ref updates to delayed ref heads)
Reported-by: Jon Nelson <jnelson-suse@jamponi.net>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: fix locking when we put back a delayed ref that's too new

commit d0280996437081dd12ed1e982ac8aeaa62835ec4 upstream.

In __btrfs_run_delayed_refs, when we put back a delayed ref that's too
new, we have already dropped the lock on locked_ref when we set
->processing = 0.

This patch keeps the lock to cover that assignment.

Fixes: d7df2c796d7 (Btrfs: attach delayed ref updates to delayed ref heads)
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nvme: apply DELAY_BEFORE_CHK_RDY quirk at probe time too

commit b5a10c5f7532b7473776da87e67f8301bbc32693 upstream.

Commit 54adc01055b7 ("nvme/quirk: Add a delay before checking for adapter
readiness") introduced a quirk to adapters that cannot read the bit
NVME_CSTS_RDY right after register NVME_REG_CC is set; these adapters
need a delay or else the action of reading the bit NVME_CSTS_RDY could
somehow corrupt adapter's registers state and it never recovers.

When this quirk was added, we checked ctrl->tagset in order to avoid
quirking in probe time, supposing we would never require such delay
during probe. Well, it was too optimistic; we in fact need this quirk
at probe time in some cases, like after a kexec.

In some experiments, after abnormal shutdown of machine (aka power cord
unplug), we booted into our bootloader in Power, which is a Linux kernel,
and kexec'ed into another distro. If this kexec is too quick, we end up
reaching the probe of NVMe adapter in that distro when adapter is in
bad state (not fully initialized on our bootloader). What happens next
is that nvme_wait_ready() is unable to complete, except if the quirk is
enabled.

So, this patch removes the original ctrl->tagset verification in order
to enable the quirk even on probe time.

Fixes: 54adc01055b7 ("nvme/quirk: Add a delay before checking for adapter readiness")
Reported-by: Andrew Byrne <byrneadw@ie.ibm.com>
Reported-by: Jaime A. H. Gomez <jahgomez@mx1.ibm.com>
Reported-by: Zachary D. Myers <zdmyers@us.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Jeffrey Lien <Jeff.Lien@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/cpu: Fix bootup crashes by sanitizing the argument of the 'clearcpuid=' command-line option

commit dd853fd216d1485ed3045ff772079cc8689a9a4a upstream.

A negative number can be specified in the cmdline which will be used as
setup_clear_cpu_cap() argument. With that we can clear/set some bit in
memory predceeding boot_cpu_data/cpu_caps_cleared which may cause kernel
to misbehave. This patch adds lower bound check to setup_disablecpuid().

Boris Petkov reproduced a crash:

[ 1.234575] BUG: unable to handle kernel paging request at ffffffff858bd540
[ 1.236535] IP: memcpy_erms+0x6/0x10

Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: andi.kleen@intel.com
Cc: bp@alien8.de
Cc: dave.hansen@linux.intel.com
Cc: luto@kernel.org
Cc: slaoub@gmail.com
Fixes: ac72e7888a61 ("x86: add generic clearcpuid=... option")
Link: http://lkml.kernel.org/r/1482933340-11857-1-git-send-email-lukasz.odzioba@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: piix4: Avoid race conditions with IMC

commit 701dc207bf551d9fe6defa36e84a911e880398c3 upstream.

On AMD's SB800 and upwards, the SMBus is shared with the Integrated
Micro Controller (IMC).

The platform provides a hardware semaphore to avoid race conditions
among them. (Check page 288 of the SB800-Series Southbridges Register
Reference Guide http://support.amd.com/TechDocs/45482.pdf)

Without this patch, many access to the SMBus end with an invalid
transaction or even with the bus stalled.

Reported-by: Alexandre Desnoyers <alex@qtec.com>
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>:
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/mlx5: Only cancel recovery work when cleaning up device

commit 5e44fca5047054f1762813751626b5245e0da022 upstream.

Do not attempt to drain the health workqueue when unloading the device in
the recovery flow, this can cause a deadlock when the recovery work
tries to cancel itself with sync.

Because the work is no longer unconditionally canceled when unloading, it
must be explicitly canceled in the AER flow.

fixes: 689a248df83b ("net/mlx5: Cancel recovery work in remove flow")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: serial: ch341: fix modem-control and B0 handling

commit 030ee7ae52a46a2be52ccc8242c4a330aba8d38e upstream.

The modem-control signals are managed by the tty-layer during open and
should not be asserted prematurely when set_termios is called from
driver open.

Also make sure that the signals are asserted only when changing speed
from B0.

Fixes: 664d5df92e88 ("USB: usb-serial ch341: support for DTR/RTS/CTS")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amdgpu: drop verde dpm quirks

commit 7192c54a68013f6058b1bb505645fcd07015191c upstream.

Port of radeon change to amdgpu.

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amdgpu: update si kicker smc firmware

commit 5165484b02f2cbedb5bf3a41ff5e8ae16069016c upstream.

Use the appropriate smc firmware for each chip revision.
Using the wrong one can cause stability issues.

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Flora Cui <Flora.Cui@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: drop verde dpm quirks

commit 8a08403bcb39f5d0e733bcf59a8a74f16b538f6e upstream.

fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=98897
https://bugs.launchpad.net/bugs/1651981

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Adrian Fiergolski <A.Fiergolski@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: update smc firmware selection for SI

commit 6458bd4dfd9414cba5804eb9907fe2a824278c34 upstream.

Use the appropriate smc firmware for each chip revision.
Using the wrong one can cause stability issues.

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm: Clean up planes in atomic commit helper failure path

commit aebe55c2d4b998741c0847ace1b4af47d73c763b upstream.

If waiting for fences fails for blocking commits, planes must be cleaned
up before returning.

Fixes: f6ce410a59a4 ("drm/fence: allow fence waiting to be interrupted by userspace")
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170102231427.7192-1-laurent.pinchart@ideasonboard.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915/gen9: Fix PCODE polling timeout in stable backport

The backport of
2c7d0602c - "Fix PCODE polling during CDCLK change notification"
to the 4.9 stable tree used an incorrect timeout value. Fix this up
so the backport matches the upstream commit.

Reported-by: Thomas Backlund <tmb@mageia.org>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/af_iucv: don't use paged skbs for TX on HiperSockets

commit dc5367bcc556e97555fc94a32cd1aadbebdff47e upstream.

With commit e53743994e21
("af_iucv: use paged SKBs for big outbound messages"),
we transmit paged skbs for both of AF_IUCV's transport modes
(IUCV or HiperSockets).
The qeth driver for Layer 3 HiperSockets currently doesn't
support NETIF_F_SG, so these skbs would just be linearized again
by the stack.
Avoid that overhead by using paged skbs only for IUCV transport.

cc stable, since this also circumvents a significant skb leak when
sending large messages (where the skb then needs to be linearized).

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Fixes: e53743994e21 ("af_iucv: use paged SKBs for big outbound messages")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sysctl: Drop reference added by grab_header in proc_sys_readdir

commit 93362fa47fe98b62e4a34ab408c4a418432e7939 upstream.

Fixes CVE-2016-9191, proc_sys_readdir doesn't drop reference
added by grab_header when return from !dir_emit_dots path.
It can cause any path called unregister_sysctl_table will
wait forever.

The calltrace of CVE-2016-9191:

[ 5535.960522] Call Trace:
[ 5535.963265]  [<ffffffff817cdaaf>] schedule+0x3f/0xa0
[ 5535.968817]  [<ffffffff817d33fb>] schedule_timeout+0x3db/0x6f0
[ 5535.975346]  [<ffffffff817cf055>] ? wait_for_completion+0x45/0x130
[ 5535.982256]  [<ffffffff817cf0d3>] wait_for_completion+0xc3/0x130
[ 5535.988972]  [<ffffffff810d1fd0>] ? wake_up_q+0x80/0x80
[ 5535.994804]  [<ffffffff8130de64>] drop_sysctl_table+0xc4/0xe0
[ 5536.001227]  [<ffffffff8130de17>] drop_sysctl_table+0x77/0xe0
[ 5536.007648]  [<ffffffff8130decd>] unregister_sysctl_table+0x4d/0xa0
[ 5536.014654]  [<ffffffff8130deff>] unregister_sysctl_table+0x7f/0xa0
[ 5536.021657]  [<ffffffff810f57f5>] unregister_sched_domain_sysctl+0x15/0x40
[ 5536.029344]  [<ffffffff810d7704>] partition_sched_domains+0x44/0x450
[ 5536.036447]  [<ffffffff817d0761>] ? __mutex_unlock_slowpath+0x111/0x1f0
[ 5536.043844]  [<ffffffff81167684>] rebuild_sched_domains_locked+0x64/0xb0
[ 5536.051336]  [<ffffffff8116789d>] update_flag+0x11d/0x210
[ 5536.057373]  [<ffffffff817cf61f>] ? mutex_lock_nested+0x2df/0x450
[ 5536.064186]  [<ffffffff81167acb>] ? cpuset_css_offline+0x1b/0x60
[ 5536.070899]  [<ffffffff810fce3d>] ? trace_hardirqs_on+0xd/0x10
[ 5536.077420]  [<ffffffff817cf61f>] ? mutex_lock_nested+0x2df/0x450
[ 5536.084234]  [<ffffffff8115a9f5>] ? css_killed_work_fn+0x25/0x220
[ 5536.091049]  [<ffffffff81167ae5>] cpuset_css_offline+0x35/0x60
[ 5536.097571]  [<ffffffff8115aa2c>] css_killed_work_fn+0x5c/0x220
[ 5536.104207]  [<ffffffff810bc83f>] process_one_work+0x1df/0x710
[ 5536.110736]  [<ffffffff810bc7c0>] ? process_one_work+0x160/0x710
[ 5536.117461]  [<ffffffff810bce9b>] worker_thread+0x12b/0x4a0
[ 5536.123697]  [<ffffffff810bcd70>] ? process_one_work+0x710/0x710
[ 5536.130426]  [<ffffffff810c3f7e>] kthread+0xfe/0x120
[ 5536.135991]  [<ffffffff817d4baf>] ret_from_fork+0x1f/0x40
[ 5536.142041]  [<ffffffff810c3e80>] ? kthread_create_on_node+0x230/0x230

One cgroup maintainer mentioned that "cgroup is trying to offline
a cpuset css, which takes place under cgroup_mutex.  The offlining
ends up trying to drain active usages of a sysctl table which apprently
is not happening."
The real reason is that proc_sys_readdir doesn't drop reference added
by grab_header when return from !dir_emit_dots path. So this cpuset
offline path will wait here forever.

See here for details: http://www.openwall.com/lists/oss-security/2016/11/04/13

Fixes: f0c3b5093add ("[readdir] convert procfs")
Reported-by: CAI Qian <caiqian@redhat.com>
Tested-by: Yang Shukui <yangshukui@huawei.com>
Signed-off-by: Zhou Chengming <zhouchengming1@huawei.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>