git.karo-electronics.de Git - karo-tx-linux.git/log

Merge tag 'drm-intel-fixes-2017-05-29' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes

drm/i915 fixes for v4.12-rc4

* tag 'drm-intel-fixes-2017-05-29' of git://anongit.freedesktop.org/git/drm-intel:
  drm/i915: Stop pretending to mask/unmask LPE audio interrupts
  drm/i915/selftests: Silence compiler warning in igt_ctx_exec
  Revert "drm/i915: Restore lost "Initialized i915" welcome message"
  drm/i915/gvt: clean up unsubmited workloads before destroying kmem cache
  drm/i915/gvt: Disable compression workaround for Gen9
  drm/i915: set initialised only when init_context callback is NULL
  drm/i915: Fix new -Wint-in-bool-context gcc compiler warning
  drm/i915: use vma->size for appgtt allocate_va_range
  drm/i915: Do not sync RCU during shrinking

iscsi-target: Always wait for kthread_should_stop() before kthread exit

There are three timing problems in the kthread usages of iscsi_target_mod:

- np_thread of struct iscsi_np
- rx_thread and tx_thread of struct iscsi_conn

In iscsit_close_connection(), it calls

send_sig(SIGINT, conn->tx_thread, 1);
kthread_stop(conn->tx_thread);

In conn->tx_thread, which is iscsi_target_tx_thread(), when it receive
SIGINT the kthread will exit without checking the return value of
kthread_should_stop().

So if iscsi_target_tx_thread() exit right between send_sig(SIGINT...)
and kthread_stop(...), the kthread_stop() will try to stop an already
stopped kthread.

This is invalid according to the documentation of kthread_stop().

(Fix -ECONNRESET logout handling in iscsi_target_tx_thread and
early iscsi_target_rx_thread failure case - nab)

Signed-off-by: Jiang Yi <jiangyilism@gmail.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

iscsi-target: Fix initial login PDU asynchronous socket close OOPs

This patch fixes a OOPs originally introduced by:

   commit bb048357dad6d604520c91586334c9c230366a14
   Author: Nicholas Bellinger <nab@linux-iscsi.org>
   Date:   Thu Sep 5 14:54:04 2013 -0700

   iscsi-target: Add sk->sk_state_change to cleanup after TCP failure

which would trigger a NULL pointer dereference when a TCP connection
was closed asynchronously via iscsi_target_sk_state_change(), but only
when the initial PDU processing in iscsi_target_do_login() from iscsi_np
process context was blocked waiting for backend I/O to complete.

To address this issue, this patch makes the following changes.

First, it introduces some common helper functions used for checking
socket closing state, checking login_flags, and atomically checking
socket closing state + setting login_flags.

Second, it introduces a LOGIN_FLAGS_INITIAL_PDU bit to know when a TCP
connection has dropped via iscsi_target_sk_state_change(), but the
initial PDU processing within iscsi_target_do_login() in iscsi_np
context is still running.  For this case, it sets LOGIN_FLAGS_CLOSED,
but doesn't invoke schedule_delayed_work().

The original NULL pointer dereference case reported by MNC is now handled
by iscsi_target_do_login() doing a iscsi_target_sk_check_close() before
transitioning to FFP to determine when the socket has already closed,
or iscsi_target_start_negotiation() if the login needs to exchange
more PDUs (eg: iscsi_target_do_login returned 0) but the socket has
closed.  For both of these cases, the cleanup up of remaining connection
resources will occur in iscsi_target_start_negotiation() from iscsi_np
process context once the failure is detected.

Finally, to handle to case where iscsi_target_sk_state_change() is
called after the initial PDU procesing is complete, it now invokes
conn->login_work -> iscsi_target_do_login_rx() to perform cleanup once
existing iscsi_target_sk_check_close() checks detect connection failure.
For this case, the cleanup of remaining connection resources will occur
in iscsi_target_do_login_rx() from delayed workqueue process context
once the failure is detected.

Reported-by: Mike Christie <mchristi@redhat.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Tested-by: Mike Christie <mchristi@redhat.com>
Cc: Mike Christie <mchristi@redhat.com>
Reported-by: Hannes Reinecke <hare@suse.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Varun Prakash <varun@chelsio.com>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

drm/amdgpu: Program ring for vce instance 1 at its register space

We need program ring buffer on instance 1 register space domain,
when only if instance 1 available, with two instances or instance 0,
and we need only program instance 0 regsiter space domain for ring.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

SUNRPC: ensure correct error is reported by xs_tcp_setup_socket()

If you attempt a TCP mount from an host that is unreachable in a way
that triggers an immediate error from kernel_connect(), that error
does not propagate up, instead EAGAIN is reported.

This results in call_connect_status receiving the wrong error.

A case that it easy to demonstrate is to attempt to mount from an
address that results in ENETUNREACH, but first deleting any default
route.
Without this patch, the mount.nfs process is persistently runnable
and is hard to kill.  With this patch it exits as it should.

The problem is caused by the fact that xs_tcp_force_close() eventually
calls
      xprt_wake_pending_tasks(xprt, -EAGAIN);
which causes an error return of -EAGAIN.  so when xs_tcp_setup_sock()
calls
      xprt_wake_pending_tasks(xprt, status);
the status is ignored.

Fixes: 4efdd92c9211 ("SUNRPC: Remove TCP client connection reset hack")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

md: Make flush bios explicitely sync

Commit b685d3d65ac7 "block: treat REQ_FUA and REQ_PREFLUSH as
synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions. generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions

Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.

CC: linux-raid@vger.kernel.org
CC: Shaohua Li <shli@kernel.org>
Fixes: b685d3d65ac791406e0dfd8779cc9b3707fea5a3
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Shaohua Li <shli@fb.com>

Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs fixes from Miklos Szeredi:
"Fix regressions:

   - missing CONFIG_EXPORTFS dependency

   - failure if upper fs doesn't support xattr

   - bad error cleanup

  This also adds the concept of "impure" directories complementing the
  "origin" marking introduced in -rc1. Together they enable getting
  consistent st_ino and d_ino for directory listings.

  And there's a bug fix and a cleanup as well"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: filter trusted xattr for non-admin
  ovl: mark upper merge dir with type origin entries "impure"
  ovl: mark upper dir with type origin entries "impure"
  ovl: remove unused arg from ovl_lookup_temp()
  ovl: handle rename when upper doesn't support xattr
  ovl: don't fail copy-up if upper doesn't support xattr
  ovl: check on mount time if upper fs supports setting xattr
  ovl: fix creds leak in copy up error path
  ovl: select EXPORTFS

cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode

When adding a cfq_group into the cfq service tree, we use CFQ_IDLE_DELAY
as the delay of cfq_group's vdisktime if there have been other cfq_groups
already.

When cfq is under iops mode, commit 9a7f38c42c2b ("cfq-iosched: Convert
from jiffies to nanoseconds") could result in a large iops delay and
lead to an abnormal io schedule delay for the added cfq_group. To fix
it, we just need to revert to the old CFQ_IDLE_DELAY value: HZ / 5
when iops mode is enabled.

Despite having the same value, the delay of a cfq_queue in idle class
and the delay of cfq_group are different things, so I define two new
macros for the delay of a cfq_group under time-slice mode and iops mode.

Fixes: 9a7f38c42c2b ("cfq-iosched: Convert from jiffies to nanoseconds")
Cc: <stable@vger.kernel.org> # 4.8+
Signed-off-by: Hou Tao <houtao1@huawei.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>

xfs: use ->b_state to fix buffer I/O accounting release race

We've had user reports of unmount hangs in xfs_wait_buftarg() that
analysis shows is due to btp->bt_io_count == -1. bt_io_count
represents the count of in-flight asynchronous buffers and thus
should always be >= 0. xfs_wait_buftarg() waits for this value to
stabilize to zero in order to ensure that all untracked (with
respect to the lru) buffers have completed I/O processing before
unmount proceeds to tear down in-core data structures.

The value of -1 implies an I/O accounting decrement race. Indeed,
the fact that xfs_buf_ioacct_dec() is called from xfs_buf_rele()
(where the buffer lock is no longer held) means that bp->b_flags can
be updated from an unsafe context. While a user-level reproducer is
currently not available, some intrusive hacks to run racing buffer
lookups/ioacct/releases from multiple threads was used to
successfully manufacture this problem.

Existing callers do not expect to acquire the buffer lock from
xfs_buf_rele(). Therefore, we can not safely update ->b_flags from
this context. It turns out that we already have separate buffer
state bits and associated serialization for dealing with buffer LRU
state in the form of ->b_state and ->b_lock. Therefore, replace the
_XBF_IN_FLIGHT flag with a ->b_state variant, update the I/O
accounting wrappers appropriately and make sure they are used with
the correct locking. This ensures that buffer in-flight state can be
modified at buffer release time without racing with modifications
from a buffer lock holder.

Fixes: 9c7504aa72b6 ("xfs: track and serialize in-flight async buffers against unmount")
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Tested-by: Libor Pechacek <lpechacek@suse.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

dm: make flush bios explicitly sync

Commit b685d3d65ac7 ("block: treat REQ_FUA and REQ_PREFLUSH as
synchronous") removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...}
definitions. generic_make_request_checks() however strips REQ_FUA and
REQ_PREFLUSH flags from a bio when the storage doesn't report volatile
write cache and thus write effectively becomes asynchronous which can
lead to performance regressions.

Fix the problem by making sure all bios which are synchronous are
properly marked with REQ_SYNC.

Fixes: b685d3d65ac7 ("block: treat REQ_FUA and REQ_PREFLUSH as synchronous")
Cc: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

ALSA: usb: Avoid VLA in mixer_us16x08.c

This is another attempt to work around the VLA used in
mixer_us16x08.c. Basically the temporary array is used individually
for two cases, and we can declare locally in each block, instead of
hackish max() usage.

Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: usb: Fix a typo in Tascam US-16x08 mixer element

A mixer element created in a quirk for Tascam US-16x08 contains a
typo: it should be "EQ MidLow Q" instead of "EQ MidQLow Q".

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195875
Fixes: d2bb390a2081 ("ALSA: usb-audio: Tascam US-16x08 DSP mixer quirk")
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Takashi Iwai <tiwai@suse.de>

Revert "ALSA: usb-audio: purge needless variable length array"

This reverts commit 89b593c30e83 ("ALSA: usb-audio: purge needless
variable length array").  The patch turned out to cause a severe
regression, triggering an Oops at snd_usb_ctl_msg().  It was overseen
that snd_usb_ctl_msg() writes back the response to the given buffer,
while the patch changed it to a read-only const buffer.  (One should
always double-check when an extra pointer cast is present...)

As a simple fix, just revert the affected commit.  It was merely a
cleanup.  Although it brings VLA again, it's clearer as a fix.  We'll
address the VLA later in another patch.

Fixes: 89b593c30e83 ("ALSA: usb-audio: purge needless variable length array")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195875
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Takashi Iwai <tiwai@suse.de>

hwmon: (aspeed-pwm-tacho) On read failure return -ETIMEDOUT

When the controller fails to provide an RPM reading within the alloted
time; the driver returns -ETIMEDOUT and no file contents.

Signed-off-by: Patrick Venture <venture@google.com>
Fixes: 2d7a548a3eff ("drivers: hwmon: Support for ASPEED PWM/Fan tach")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (aspeed-pwm-tacho) Select REGMAP

The driver uses regmap and thus has to select it to avoid build
errors such as the following.

drivers/hwmon/aspeed-pwm-tacho.c:337:21: error: variable
'aspeed_pwm_tacho_regmap_config' has initializer but incomplete type

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Acked-by: Joel Stanley <joel@jms.id.au>
Fixes: 2d7a548a3eff ("drivers: hwmon: Support for ASPEED PWM/Fan tach")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

"Yes, people use FOLL_FORCE ;)"

This effectively reverts commit 8ee74a91ac30 ("proc: try to remove use
of FOLL_FORCE entirely")

It turns out that people do depend on FOLL_FORCE for the /proc/<pid>/mem
case, and we're talking not just debuggers. Talking to the affected people, the use-cases are:

Keno Fischer:
"We used these semantics as a hardening mechanism in the julia JIT. By
  opening /proc/self/mem and using these semantics, we could avoid
  needing RWX pages, or a dual mapping approach. We do have fallbacks to
  these other methods (though getting EIO here actually causes an assert
  in released versions - we'll updated that to make sure to take the
  fall back in that case).

  Nevertheless the /proc/self/mem approach was our favored approach
  because it a) Required an attacker to be able to execute syscalls
  which is a taller order than getting memory write and b) didn't double
  the virtual address space requirements (as a dual mapping approach
  would).

  I think in general this feature is very useful for anybody who needs
  to precisely control the execution of some other process. Various
  debuggers (gdb/lldb/rr) certainly fall into that category, but there's
  another class of such processes (wine, various emulators) which may
  want to do that kind of thing.

  Now, I suspect most of these will have the other process under ptrace
  control, so maybe allowing (same_mm || ptraced) would be ok, but at
  least for the sandbox/remote-jit use case, it would be perfectly
  reasonable to not have the jit server be a ptracer"

Robert O'Callahan:
"We write to readonly code and data mappings via /proc/.../mem in lots
  of different situations, particularly when we're adjusting program
  state during replay to match the recorded execution.

  Like Julia, we can add workarounds, but they could be expensive."

so not only do people use FOLL_FORCE for both reads and writes, but they
use it for both the local mm and remote mm.

With these comments in mind, we likely also cannot add the "are we
actively ptracing" check either, so this keeps the new code organization
and does not do a real revert that would add back the original comment
about "Maybe we should limit FOLL_FORCE to actual ptrace users?"

Reported-by: Keno Fischer <keno@juliacomputing.com>
Reported-by: Robert O'Callahan <robert@ocallahan.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

blk-mq: Take tagset lock when updating hw queues

The tagset lock needs to be held when iterating the tag_list, so a
lockdep assert was added when updating number of hardware queues. The
drivers calling this API, however, were unaware of the new requirement,
so are failing the assertion.

This patch takes the lock within the blk-mq function so the drivers do
not have to be modified in order to be safe.

Fixes: 705cda97e ("blk-mq: Make it safe to use RCU to iterate over blk_mq_tag_set.tag_list")
Reported-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

KVM: SVM: ignore type when setting segment registers

Commit 19bca6ab75d8 ("KVM: SVM: Fix cross vendor migration issue with
unusable bit") added checking type when setting unusable.
So unusable can be set if present is 0 OR type is 0.
According to the AMD processor manual, long mode ignores the type value
in segment descriptor. And type can be 0 if it is read-only data segment.
Therefore type value is not related to unusable flag.

This patch is based on linux-next v4.12.0-rc3.

Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: nVMX: fix nested_vmx_check_vmptr failure paths under debugging

kvm_skip_emulated_instruction() will return 0 if userspace is
single-stepping the guest.

kvm_skip_emulated_instruction() uses return status convention of exit
handler: 0 means "exit to userspace" and 1 means "continue vm entries".
The problem is that nested_vmx_check_vmptr() return status means
something else: 0 is ok, 1 is error.

This means we would continue executing after a failure. Static checker
noticed it because vmptr was not initialized.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 6affcbedcac7 ("KVM: x86: Add kvm_skip_emulated_instruction and use it.")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

nbd: don't leak nbd_config

nbd_config is allocated in nbd_alloc_config(), but never freed.

Fixes: 5ea8d10802ec ("nbd: separate out the config information")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

nbd: nbd_reset() call in nbd_dev_add() is redundant

There is nothing to clear -- nbd_device has just been allocated.
Fold nbd_reset() into its other caller, nbd_config_put().

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

drivers/perf: arm_pmu_acpi: avoid perf IRQ init when guest PMU is off

We saw perf IRQ init failures when running Linux kernel in an ACPI
guest without PMU (i.e. pmu=off). This is because perf IRQ is not
present when pmu=off, but arm_pmu_acpi still tries to register
or unregister GSI. This patch addresses the problem by checking
gicc->performance_interrupt. If it is 0, which is the value set
by qemu when pmu=off, we skip the IRQ register/unregister process.

[    4.069470] bc00: 0000000000040b00 ffff0000089db190
[    4.070267] [<ffff000008134f80>] enable_percpu_irq+0xdc/0xe4
[    4.071192] [<ffff000008667cc4>] arm_perf_starting_cpu+0x108/0x10c
[    4.072200] [<ffff0000080cbdd4>] cpuhp_invoke_callback+0x14c/0x4ac
[    4.073210] [<ffff0000080ccd3c>] cpuhp_thread_fun+0xd4/0x11c
[    4.074132] [<ffff0000080f1394>] smpboot_thread_fn+0x1b4/0x1c4
[    4.075081] [<ffff0000080ec90c>] kthread+0x10c/0x138
[    4.075921] [<ffff0000080833c0>] ret_from_fork+0x10/0x50
[    4.076947] genirq: Setting trigger mode 4 for irq 43 failed
(gic_set_type+0x0/0x74)

Signed-off-by: Wei Huang <wei@redhat.com>
[will: add comment justifying deviation from ACPI spec, removed redundant hunk]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

rcar-dmac: fixup descriptor pointer for descriptor mode

In descriptor mode, the descriptor running pointer is not maintained
by the interrupt handler, thus, driver finds the running descriptor
from the descriptor pointer field in the CHCRB register.
But, CHCRB::DPTR indicates *next* descriptor pointer, not current.
Thus, The residue calculation will be missed. This patch fixup it.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>

Input: synaptics - tell users to report when they should be using rmi-smbus

Users should really consider switching to rmi-smbus instead of plain PS/2.
Notify them that they should report a missing pnpID in the file.

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

Input: synaptics - warn the users when there is a better mode

The Synaptics touchpads are now either using i2c-hid or rmi-smbus.
Warn the users if they are missing the rmi-smbus modules and have no
chance of reporting correct data.

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

Input: synaptics - keep PS/2 around when RMI4_SMB is not enabled

Or the user might have the touchpad unbound from PS/2 but never picked
up by rmi-smbus.ko

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

Input: synaptics - clear device info before filling in

synaptics_query_hardware() was being passed a 'struct synaptics_device_info'
in uninitialized stack memory, then not always initializing all fields.
This caused garbage to show up in certain fields, making the touchpad
unusable.

Fix by zeroing the device info, so all fields default to 0.

Fixes: 6c53694fb222 ("Input: synaptics - split device info into a separate structure")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

Input: silead - disable interrupt during suspend

When we put the touchscreen controller in low-power mode the irq
pin may trigger (float) and if we then try to read a data packet we
get the following error in dmesg:

[ 478.801017] silead_ts i2c-MSSL1680:00: Data read error -121

This commit disables the irq during suspend/resume fixing this error.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

cpufreq: kirkwood-cpufreq:- Handle return value of clk_prepare_enable()

clk_prepare_enable() can fail here and we must check its return value.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

cpufreq: cpufreq_register_driver() should return -ENODEV if init fails

For a driver that does not set the CPUFREQ_STICKY flag, if all of the
->init() calls fail, cpufreq_register_driver() should return an error.
This will prevent the driver from loading.

Fixes: ce1bcfe94db8 (cpufreq: check cpufreq_policy_list instead of scanning policies for all CPUs)
Cc: 4.0+ <stable@vger.kernel.org> # 4.0+
Signed-off-by: David Arcari <darcari@redhat.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

ACPICA: Tables: Fix regression introduced by a too early mechanism enabling

In the Linux kernel, acpi_get_table() "clones" haven't been fully
balanced by acpi_put_table() invocations. In upstream ACPICA, due to
the design change, there are also unbalanced acpi_get_table_by_index()
invocations requiring special care.

acpi_get_table() reference counting mismatches may occor due to that
and printing error messages related to them is not useful at this
point. The strict balanced validation count check should only be
enabled after confirming that all invocations are safe and aligned
with their designed purposes.

Thus this patch removes the error value returned by acpi_tb_get_table()
in that case along with the accompanying error message to fix the
issue.

Fixes: 174cc7187e6f (ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel)
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Reported-by: Anush Seetharaman <anush.seetharaman@intel.com>
Reported-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Revert "ACPI / button: Change default behavior to lid_init_state=open"

Revert commit 77e9a4aa9de1 (ACPI / button: Change default behavior to
lid_init_state=open) which changed the kernel's behavior on laptops
that boot with closed lids and expect the lid switch state to be
reported accurately by the kernel.

If you boot or resume your laptop with the lid closed on a docking
station while using an external monitor connected to it, both internal
and external displays will light on, while only the external should.

There is a design choice in gdm to only provide the greeter on the
internal display when lit on, so users only see a gray area on the
external monitor. Also, the cursor will not show up as it's by
default on the internal display too.

To "fix" that, users have to open the laptop once and close it once
again to sync the state of the switch with the hardware state.

Even if the "method" operation mode implementation can be buggy on
some platforms, the "open" choice is worse.  It breaks docking
stations basically and there is no way to have a user-space hwdb to
fix that.

On the contrary, it's rather easy in user-space to have a hwdb
with the problematic platforms. Then,  libinput (1.7.0+) can fix
the state of the lid switch for us: you need to set the udev
property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.

When libinput detects internal keyboard events, it will overwrite the
state of the switch to open, making it reliable again.  Given that
logind only checks the lid switch value after a timeout, we can
assume the user will use the internal keyboard before this timeout
expires.

For example, such a hwdb entry is:

libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open

Link: https://bugzilla.gnome.org/show_bug.cgi?id=782380
Cc: 4.11+ <stable@vger.kernel.org> # 4.11+
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Merge tag 'pinctrl-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
"Here is an overdue pull request for pin control fixes, the most
  prominent feature is to make Intel Chromebooks (and I suspect any
  other Cherryview-based Intel thing) happy again, which we really want
  to see.

  There is a patch hitting drivers/firmware/* that I was uncertain to
  who actually manages, but I got Andy Shevchenko's and Dmitry Torokov's
  review tags on it and I trust them both 100% to do the right thing for
  Intel platform drivers.

  Summary:

   - Make a few Intel Chromebooks with Cherryview DMI firmware work
     smoothly.

   - A fix for some bogus allocations in the generic group management
     code.

   - Some GPIO descriptor lookup table stubs. Merged through the pin
     control tree for administrative reasons.

   - Revert the "bi-directional" and "output-enable" generic properties:
     we need more discussions around this. It seems other SoCs are using
     input/output gate enablement and these terms are not correct.

   - Fix mux and drive strength atomically in the MXS driver.

   - Fix the SPDIF function on sunxi A83T.

   - OF table terminators and other small fixes"

* tag 'pinctrl-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: sunxi: Fix SPDIF function name for A83T
  pinctrl: mxs: atomically switch mux and drive strength config
  pinctrl: cherryview: Extend the Chromebook DMI quirk to Intel_Strago systems
  firmware: dmi: Add DMI_PRODUCT_FAMILY identification string
  pinctrl: core: Fix warning by removing bogus code
  gpiolib: Add stubs for gpiod lookup table interface
  Revert "pinctrl: generic: Add bi-directional and output-enable"
  pinctrl: cherryview: Add terminate entry for dmi_system_id tables

kthread: fix boot hang (regression) on MIPS/OpenRISC

This fixes a regression in commit 4d6501dce079 where I didn't notice
that MIPS and OpenRISC were reinitialising p->{set,clear}_child_tid to
NULL after our initialisation in copy_process().

We can simply get rid of the arch-specific initialisation here since it
is now always done in copy_process() before hitting copy_thread{,_tls}().

Review notes:

- As far as I can tell, copy_process() is the only user of
   copy_thread_tls(), which is the only caller of copy_thread() for
   architectures that don't implement copy_thread_tls().

- After this patch, there is no arch-specific code touching
   p->set_child_tid or p->clear_child_tid whatsoever.

- It may look like MIPS/OpenRISC wanted to always have these fields be
   NULL, but that's not true, as copy_process() would unconditionally
   set them again _after_ calling copy_thread_tls() before commit
   4d6501dce079.

Fixes: 4d6501dce079c1eb6bf0b1d8f528a5e81770109e ("kthread: Fix use-after-free if kthread fork fails")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net> # MIPS only
Acked-by: Stafford Horne <shorne@gmail.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: openrisc@lists.librecores.org
Cc: Jamie Iles <jamie.iles@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ovl: filter trusted xattr for non-admin

Filesystems filter out extended attributes in the "trusted." domain for
unprivlieged callers.

Overlay calls underlying filesystem's method with elevated privs, so need
to do the filtering in overlayfs too.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

HID: i2c: Call acpi_device_fix_up_power for ACPI-enumerated devices

For ACPI devices which do not have a _PSC method, the ACPI subsys cannot
query their initial state at boot, so these devices are assumed to have
been put in D0 by the BIOS, but for touchscreens that is not always true.

This commit adds a call to acpi_device_fix_up_power to explicitly put
devices without a _PSC method into D0 state (for devices with a _PSC
method it is a nop). Note we only need to do this on probe, after a
resume the ACPI subsys knows the device is in D3 and will properly
put it in D0.

This fixes the SIS0817 i2c-hid touchscreen on a Peaq C1010 2-in-1
device failing to probe with a "hid_descr_cmd failed" error.

Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

drm/i915: Detect USB-C specific dongles before reducing M and N

The Analogix 7737 DP to HDMI converter requires reduced M and N values
when to operate correctly at HBR2. We tried to reduce the M/N values for
all devices in commit 9a86cda07af2 ("drm/i915/dp: reduce link M/N
parameters"), but that regressed some other sinks. Detect this IC by its
OUI value of 0x0022B9 via the DPCD quirk list, and only reduce the M/N
values for that.

v2 by Jani: Rebased on the DP quirk database

v3 by Jani: Rebased on the reworked DP quirk database

v4 by Jani: Improve commit message (Daniel)

Fixes: 9a86cda07af2 ("drm/i915/dp: reduce link M/N parameters")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93578
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100755
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/2d2e30f8f47d3f28c9b74ca2612336a54585c3ec.1495105635.git.jani.nikula@intel.com

drm/dp: start a DPCD based DP sink/branch device quirk database

Face the fact, there are Display Port sink and branch devices out there
in the wild that don't follow the Display Port specifications, or they
have bugs, or just otherwise require special treatment. Start a common
quirk database the drivers can query based on the DP device
identification. At least for now, we leave the workarounds for the
drivers to implement as they see fit.

For starters, add a branch device that can't handle full 24-bit main
link Mdiv and Ndiv main link attributes properly. Naturally, the
workaround of reducing main link attributes for all devices ended up in
regressions for other devices. So here we are.

v2: Rebase on DRM DP desc read helpers

v3: Fix the OUI memcmp blunder (Clint)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Clint Taylor <clinton.a.taylor@intel.com>
Cc: Adam Jackson <ajax@redhat.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Tested-by: Clinton Taylor <clinton.a.taylor@intel.com>
Reviewed-by: Clinton Taylor <clinton.a.taylor@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> # v2
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/91ec198dd95258dbf3bee2f6be739e0da73b4fdd.1495105635.git.jani.nikula@intel.com

drm/i915: use drm DP helper to read DPCD desc

Switch to using the common DP helpers instead of using our own.

v2: also remove leftover struct intel_dp_desc (Daniel)

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/dp: add helper for reading DP sink/branch device desc from DPCD

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/acba54da7d80eafea9e59a893e27e3c31028c0ba.1495105635.git.jani.nikula@intel.com

ovl: mark upper merge dir with type origin entries "impure"

An upper dir is marked "impure" to let ovl_iterate() know that this
directory may contain non pure upper entries whose d_ino may need to be
read from the origin inode.

We already mark a non-merge dir "impure" when moving a non-pure child
entry inside it, to let ovl_iterate() know not to iterate the non-merge
dir directly.

Mark also a merge dir "impure" when moving a non-pure child entry inside
it and when copying up a child entry inside it.

This can be used to optimize ovl_iterate() to perform a "pure merge" of
upper and lower directories, merging the content of the directories,
without having to read d_ino from origin inodes.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

rbd: implement REQ_OP_WRITE_ZEROES

Commit 93c1defedcae ("rbd: remove the discard_zeroes_data flag")
explicitly didn't implement REQ_OP_WRITE_ZEROES for rbd, while the
following commit 48920ff2a5a9 ("block: remove the discard_zeroes_data
flag") dropped ->discard_zeroes_data in favor of REQ_OP_WRITE_ZEROES.

rbd does support efficient zeroing via CEPH_OSD_OP_ZERO opcode and will
release either some or all blocks depending on whether the zeroing
request is rbd_obj_bytes() aligned.  This is how we currently implement
discards, so REQ_OP_WRITE_ZEROES can be identical to REQ_OP_DISCARD for
now.  Caveats:

- REQ_NOUNMAP is ignored, but AFAICT that's true of at least two other
  current implementations - nvme and loop

- there is no ->write_zeroes_alignment and blk_bio_write_zeroes_split()
  is hence less helpful than blk_bio_discard_split(), but this can (and
  should) be fixed on the rbd side

In the future we will split these into two code paths to respect
REQ_NOUNMAP on zeroout and save on zeroing blocks that couldn't be
released on discard.

Fixes: 93c1defedcae ("rbd: remove the discard_zeroes_data flag")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

x86/debug/32: Convert a smp_processor_id() call to raw to avoid DEBUG_PREEMPT warning

... to raw_smp_processor_id() to not trip the

BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1

check. The reasoning behind it is that __warn() already uses the raw_
variants but the show_regs() path on 32-bit doesn't.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170528092212.fiod7kygpjm23m3o@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/microcode/AMD: Change load_microcode_amd()'s param to bool to fix preemptibility bug

With CONFIG_DEBUG_PREEMPT enabled, I get:

  BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
  caller is debug_smp_processor_id
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.12.0-rc2+ #2
  Call Trace:
   dump_stack
   check_preemption_disabled
   debug_smp_processor_id
   save_microcode_in_initrd_amd
   ? microcode_init
   save_microcode_in_initrd
   ...

because, well, it says it above, we're using smp_processor_id() in
preemptible code.

But passing the CPU number is not really needed. It is only used to
determine whether we're on the BSP, and, if so, to save the microcode
patch for early loading.

[ We don't absolutely need to do it on the BSP but we do that
   customarily there. ]

Instead, convert that function parameter to a boolean which denotes
whether the patch should be saved or not, thereby avoiding the use of
smp_processor_id() in preemptible code.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170528200414.31305-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

drm/exynos: clean up description of exynos_drm_crtc

This patch removes unnecessary descriptions on
exynos_drm_crtc structure and adds one description
which specifies what pipe_clk member does.

pipe_clk support had been added by below patch without any description,
drm/exynos: add support for pipeline clock to the framework
Commit-id : f26b9343f582f44ec920474d71b4b2220b1ed9a8

Signed-off-by: Inki Dae <inki.dae@samsung.com>

drm/exynos: dsi: Remove bridge node reference in removal

Since bridge node is referenced during in the probe, it should be
released on removal.

Suggested-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Hoegeun Kwon <hoegeun.kwon@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>

drm/exynos: dsi: Fix the parse_dt function

The dsi + panel is a parental relationship, so OF grpah is not needed.
Therefore, the current dsi_parse_dt function will throw an error,
because there is no linked OF graph for the case fimd + dsi + panel.

Parse the Pll burst and esc clock frequency properties in dsi_parse_dt()
and create a bridge_node only if there is an OF graph associated with dsi.

Signed-off-by: Hoegeun Kwon <hoegeun.kwon@samsung.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Reviewed-by: Andi Shyti <andi.shyti@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>

drm/exynos: Merge pre/postclose hooks

Again no apparent explanation for the split except hysterical raisins.

Cc: Inki Dae <inki.dae@samsung.com>
Cc: Joonyoung Shim <jy0922.shim@samsung.com>
Cc: Seung-Woo Kim <sw0312.kim@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>

Linux 4.12-rc3

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal

Pull thermal SoC management fixes from Eduardo Valentin:

- fixes to TI SoC driver, Broadcom, qoriq

- small sparse warning fix on thermal core

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
  thermal: broadcom: ns-thermal: default on iProc SoCs
  ti-soc-thermal: Fix a typo in a comment line
  ti-soc-thermal: Delete error messages for failed memory allocations in ti_bandgap_build()
  ti-soc-thermal: Use devm_kcalloc() in ti_bandgap_build()
  thermal: core: make thermal_emergency_poweroff static
  thermal: qoriq: remove useless call for of_thermal_get_trip_points()

mtk-vcodec: Use designated initializers

The randstruct plugin requires designated initializers for structures
that are entirely function pointers.

Cc: Wu-Cheng Li <wuchengli@chromium.org>
Cc: Tiffany Lin <tiffany.lin@mediatek.com>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Kees Cook <keescook@chromium.org>

drm/amd/powerplay: Use designated initializers

The randstruct plugin requires designated initializers for structures
that are entirely function pointers.

Cc: Christian König <christian.koenig@amd.com>
Cc: Eric Huang <JinHuiEric.Huang@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kees Cook <keescook@chromium.org>

drm/amdgpu: Use designated initializers

The randstruct plugin requires structures that are entirely function
pointers be initialized using designated initializers.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Kees Cook <keescook@chromium.org>

sgi-xp: Use designated initializers

Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. In this case, no initializers
are needed (they can be NULL initialized and callers adjusted to check
for NULL, which is more efficient than an indirect call).

Cc: Robin Holt <robinmholt@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@infradead.org>

ocfs2: Use ERR_CAST() to avoid cross-structure cast

When trying to propagate an error result, the error return path attempts
to retain the error, but does this with an open cast across very different
types, which the upcoming structure layout randomization plugin flags as
being potentially dangerous in the face of randomization. This is a false
positive, but what this code actually wants to do is use ERR_CAST() to
retain the error value.

Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Kees Cook <keescook@chromium.org>

ntfs: Use ERR_CAST() to avoid cross-structure cast

When trying to propagate an error result, the error return path attempts
to retain the error, but does this with an open cast across very different
types, which the upcoming structure layout randomization plugin flags as
being potentially dangerous in the face of randomization. This is a false
positive, but what this code actually wants to do is use ERR_CAST() to
retain the error value.

Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kees Cook <keescook@chromium.org>

NFS: Use ERR_CAST() to avoid cross-structure cast

When the call to nfs_devname() fails, the error path attempts to retain
the error via the mnt variable, but this requires a cast across very
different types (char * to struct vfsmount *), which the upcoming
structure layout randomization plugin flags as being potentially
dangerous in the face of randomization. This is a false positive, but
what this code actually wants to do is retain the error value, so this
patch explicitly sets it, instead of using what seems to be an
unexpected cast.

Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>

efi/bgrt: Skip efi_bgrt_init() in case of non-EFI boot

Sabrina Dubroca reported an early panic:

  BUG: unable to handle kernel paging request at ffffffffff240001
  IP: efi_bgrt_init+0xdc/0x134

  [...]

  ---[ end Kernel panic - not syncing: Attempted to kill the idle task!

... which was introduced by:

  7b0a911478c7 ("efi/x86: Move the EFI BGRT init code to early init code")

The cause is that on this machine the firmware provides the EFI ACPI BGRT
table even on legacy non-EFI bootups - which table should be EFI only.

The garbage BGRT data causes the efi_bgrt_init() panic.

Add a check to skip efi_bgrt_init() in case non-EFI bootup to work around
this firmware bug.

Tested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: <stable@vger.kernel.org> # v4.11+
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 7b0a911478c7 ("efi/x86: Move the EFI BGRT init code to early init code")
Link: http://lkml.kernel.org/r/20170526113652.21339-6-matt@codeblueprint.co.uk
[ Rewrote the changelog to be more readable. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/efi: Correct EFI identity mapping under 'efi=old_map' when KASLR is enabled

For EFI with the 'efi=old_map' kernel option specified, the kernel will panic
when KASLR is enabled:

  BUG: unable to handle kernel paging request at 000000007febd57e
  IP: 0x7febd57e
  PGD 1025a067
  PUD 0

  Oops: 0010 [#1] SMP
  Call Trace:
   efi_enter_virtual_mode()
   start_kernel()
   x86_64_start_reservations()
   x86_64_start_kernel()
   start_cpu()

The root cause is that the identity mapping is not built correctly
in the 'efi=old_map' case.

On 'nokaslr' kernels, PAGE_OFFSET is 0xffff880000000000 which is PGDIR_SIZE
aligned. We can borrow the PUD table from the direct mappings safely. Given a
physical address X, we have pud_index(X) == pud_index(__va(X)).

However, on KASLR kernels, PAGE_OFFSET is PUD_SIZE aligned. For a given physical
address X, pud_index(X) != pud_index(__va(X)). We can't just copy the PGD entry
from direct mapping to build identity mapping, instead we need to copy the
PUD entries one by one from the direct mapping.

Fix it.

Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Young <dyoung@redhat.com>
Cc: Frank Ramsay <frank.ramsay@hpe.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20170526113652.21339-5-matt@codeblueprint.co.uk
[ Fixed and reworded the changelog and code comments to be more readable. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/efi: Disable runtime services on kexec kernel if booted with efi=old_map

Booting kexec kernel with "efi=old_map" in kernel command line hits
kernel panic as shown below.

BUG: unable to handle kernel paging request at ffff88007fe78070
IP: virt_efi_set_variable.part.7+0x63/0x1b0
PGD 7ea28067
PUD 7ea2b067
PMD 7ea2d067
PTE 0
[...]
Call Trace:
  virt_efi_set_variable()
  efi_delete_dummy_variable()
  efi_enter_virtual_mode()
  start_kernel()
  x86_64_start_reservations()
  x86_64_start_kernel()
  start_cpu()

[ efi=old_map was never intended to work with kexec. The problem with
  using efi=old_map is that the virtual addresses are assigned from the
  memory region used by other kernel mappings; vmalloc() space.
  Potentially there could be collisions when booting kexec if something
  else is mapped at the virtual address we allocated for runtime service
  regions in the initial boot - Matt Fleming ]

Since kexec was never intended to work with efi=old_map, disable
runtime services in kexec if booted with efi=old_map, so that we don't
panic.

Tested-by: Lee Chun-Yi <jlee@suse.com>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20170526113652.21339-4-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>

efi: Remove duplicate 'const' specifiers

gcc-7 shows these harmless warnings:

  drivers/firmware/efi/libstub/secureboot.c:19:27: error: duplicate 'const' declaration specifier [-Werror=duplicate-decl-specifier]
   static const efi_char16_t const efi_SecureBoot_name[] = {
  drivers/firmware/efi/libstub/secureboot.c:22:27: error: duplicate 'const' declaration specifier [-Werror=duplicate-decl-specifier]

Removing one of the specifiers gives us the expected behavior.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: David Howells <dhowells@redhat.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: de8cb458625c ("efi: Get and store the secure boot status")
Link: http://lkml.kernel.org/r/20170526113652.21339-3-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>

efi: Don't issue error message when booted under Xen

When booted as Xen dom0 there won't be an EFI memmap allocated. Avoid
issuing an error message in this case:

[ 0.144079] efi: Failed to allocate new EFI memmap

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: <stable@vger.kernel.org> # v4.9+
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20170526113652.21339-2-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>

drm/msm: Fix the check for the command size

The overrun check for the size of submitted commands is off by one.
It should allow the offset plus the size to be equal to the
size of the memory object when the command stream is very tightly
constructed.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

drm/msm: Take the mutex before calling msm_gem_new_impl

Amongst its other duties, msm_gem_new_impl adds the newly created
GEM object to the shared inactive list which may also be actively
modifiying the list during submission. All the paths to modify
the list are protected by the mutex except for the one through
msm_gem_import which can end up causing list corruption.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[add extra WARN_ON(!mutex_is_locked(&dev->struct_mutex))]
Signed-off-by: Rob Clark <robdclark@gmail.com>

drm/msm: for array in-fences, check if all backing fences are from our own context before waiting

Use the dma_fence_match_context helper to check if all backing fences
are from our own context, in which case we don't have to wait.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Gustavo Padovan <gustavo.padovan@collabora.com>
[rebased on code-motion]
Signed-off-by: Rob Clark <robdclark@gmail.com>

drm/msm: constify irq_domain_ops

struct irq_domain_ops is not modified, so it can be made const.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Rob Clark <robdclark@gmail.com>

drm/msm/mdp5: release hwpipe(s) for unused planes

Otherwise, if userspace doesn't re-use a given plane, it's hwpipe(s)
could stay permanently assigned.

Signed-off-by: Rob Clark <robdclark@gmail.com>

drm/msm: Reuse dma_fence_release.

If we follow the typical pattern of the base class being the first
member, we can use the default dma_fence_free function.

Signed-off-by: Eric Anholt <eric@anholt.net>
Cc: Rob Clark <robdclark@gmail.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>

drm/msm: Expose our reservation object when exporting a dmabuf.

Without this, polling on the dma-buf (and presumably other devices
synchronizing against our rendering) would return immediately, even
while the BO was busy.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Cc: Rob Clark <robdclark@gmail.com>
Cc: linux-arm-msm@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>

drm/msm/gpu: check legacy clk names in get_clocks()

Otherwise if someone was using old bindings with "core_clk" instead of
"core" as the clock name, we'd never find it and gpu would be stuck at
27MHz (or whatever it's slowest rate is).

Fixes: 98db803 ("msm/drm: gpu: Dynamically locate the clocks from the device tree")
Signed-off-by: Rob Clark <robdclark@gmail.com>

drm/msm/mdp5: use __drm_atomic_helper_plane_duplicate_state()

Somehow the helper was never retrofitted for mdp5.  Which meant when
plane_state->fence was added, it could get copied into new state in
mdp5_plane_duplicate_state().

If an update to disable the plane (for example on rmfb) managed to sneak
in after an nonblock update had swapped state, but before it was
committed, we'd get a splat:

    WARNING: CPU: 1 PID: 69 at ../drivers/gpu/drm/drm_atomic_helper.c:1061 drm_atomic_helper_wait_for_fences+0xe0/0xf8
   Modules linked in:

   CPU: 1 PID: 69 Comm: kworker/1:1 Tainted: G        W       4.11.0-rc8+ #1187
   Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
   Workqueue: events drm_mode_rmfb_work_fn
   task: ffffffc036560d00 task.stack: ffffffc036550000
   PC is at drm_atomic_helper_wait_for_fences+0xe0/0xf8
   LR is at complete_commit.isra.1+0x44/0x1c0
   pc : [<ffffff80084f6040>] lr : [<ffffff800854176c>] pstate: 20000145
   sp : ffffffc036553b60
   x29: ffffffc036553b60 x28: ffffffc0264e6a00
   x27: ffffffc035659000 x26: 0000000000000000
   x25: ffffffc0240e8000 x24: 0000000000000038
   x23: 0000000000000000 x22: ffffff800858f200
   x21: ffffffc0240e8000 x20: ffffffc02f56a800
   x19: 0000000000000000 x18: 0000000000000000
   x17: 0000000000000000 x16: 0000000000000000
   x15: 0000000000000000 x14: ffffffc00a192700
   x13: 0000000000000004 x12: 0000000000000000
   x11: ffffff80089a1690 x10: 00000000000008f0
   x9 : ffffffc036553b20 x8 : ffffffc036561650
   x7 : ffffffc03fe6cb40 x6 : 0000000000000000
   x5 : 0000000000000001 x4 : 0000000000000002
   x3 : ffffffc035659000 x2 : ffffffc0240e8c80
   x1 : 0000000000000000 x0 : ffffffc02adbe588

   ---[ end trace 13aeec77c3fb55e2 ]---
   Call trace:
   Exception stack(0xffffffc036553990 to 0xffffffc036553ac0)
   3980:                                   0000000000000000 0000008000000000
   39a0: ffffffc036553b60 ffffff80084f6040 0000000000004ff0 0000000000000038
   39c0: ffffffc0365539d0 ffffff800857e098 ffffffc036553a00 ffffff800857e1b0
   39e0: ffffffc036553a10 ffffff800857c554 ffffffc0365e8400 ffffffc0365e8400
   3a00: ffffffc036553a20 ffffff8008103358 000000000001aad7 ffffff800851b72c
   3a20: ffffffc036553a50 ffffff80080e9228 ffffffc02adbe588 0000000000000000
   3a40: ffffffc0240e8c80 ffffffc035659000 0000000000000002 0000000000000001
   3a60: 0000000000000000 ffffffc03fe6cb40 ffffffc036561650 ffffffc036553b20
   3a80: 00000000000008f0 ffffff80089a1690 0000000000000000 0000000000000004
   3aa0: ffffffc00a192700 0000000000000000 0000000000000000 0000000000000000
   [<ffffff80084f6040>] drm_atomic_helper_wait_for_fences+0xe0/0xf8
   [<ffffff800854176c>] complete_commit.isra.1+0x44/0x1c0
   [<ffffff8008541c64>] msm_atomic_commit+0x32c/0x350
   [<ffffff8008516230>] drm_atomic_commit+0x50/0x60
   [<ffffff8008517548>] drm_atomic_remove_fb+0x158/0x250
   [<ffffff80085186d0>] drm_framebuffer_remove+0x50/0x158
   [<ffffff8008518818>] drm_mode_rmfb_work_fn+0x40/0x58
   [<ffffff80080d5668>] process_one_work+0x1d0/0x378
   [<ffffff80080d5a54>] worker_thread+0x244/0x488
   [<ffffff80080db7fc>] kthread+0xfc/0x128
   [<ffffff8008082ec0>] ret_from_fork+0x10/0x50

Fixes: 9626014 ("drm/fence: add in-fences support")
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reported-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

drm/msm: select PM_OPP

Otherwise, if nothing else enabled selects it, dev_pm_opp_of_add_table()
will return -ENOTSUPP.

Fixes: e2af8b6 ("drm/msm: gpu: Use OPP tables if we can")
Signed-off-by: Rob Clark <robdclark@gmail.com>

Merge tag 'tty-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
"Here are some serial and tty fixes for 4.12-rc3. They are a bit bigger
  than normal, which is why I had them bake in linux-next for a few
  weeks and didn't send them to you for -rc2.

  They revert a few of the serdev patches from 4.12-rc1, and bring
  things back to how they were in 4.11, to try to make things a bit more
  stable there. Rob and Johan both agree that this is the way forward,
  so this isn't people squabbling over semantics. Other than that, just
  a few minor serial driver fixes that people have had problems with.

  All of these have been in linux-next for a few weeks with no reported
  issues"

* tag 'tty-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: altera_uart: call iounmap() at driver remove
  serial: imx: ensure UCR3 and UFCR are setup correctly
  MAINTAINERS/serial: Change maintainer of jsm driver
  serial: enable serdev support
  tty/serdev: add serdev registration interface
  serdev: Restore serdev_device_write_buf for atomic context
  serial: core: fix crash in uart_suspend_port
  tty: fix port buffer locking
  tty: ehv_bytechan: clean up init error handling
  serial: ifx6x60: fix use-after-free on module unload
  serial: altera_jtaguart: adding iounmap()
  serial: exar: Fix stuck MSIs
  serial: efm32: Fix parity management in 'efm32_uart_console_get_options()'
  serdev: fix tty-port client deregistration
  Revert "tty_port: register tty ports with serdev bus"
  drivers/tty: 8250: only call fintek_8250_probe when doing port I/O

Merge tag 'powerpc-4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
"Fix running SPU programs on Cell, and a few other minor fixes.

  Thanks to Alistair Popple, Jeremy Kerr, Michael Neuling, Nicholas
  Piggin"

* tag 'powerpc-4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: Add PPC_FEATURE userspace bits for SCV and DARN instructions
  powerpc/spufs: Fix hash faults for kernel regions
  powerpc: Fix booting P9 hash with CONFIG_PPC_RADIX_MMU=N
  powerpc/powernv/npu-dma.c: Fix opal_npu_destroy_context() call
  selftests/powerpc: Fix TM resched DSCR test with some compilers

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
"A series of fixes for X86:

   - The final fix for the end-of-stack issue in the unwinder
   - Handle non PAT systems gracefully
   - Prevent access to uninitiliazed memory
   - Move early delay calaibration after basic init
   - Fix Kconfig help text
   - Fix a cross compile issue
   - Unbreak older make versions"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/timers: Move simple_udelay_calibration past init_hypervisor_platform
  x86/alternatives: Prevent uninitialized stack byte read in apply_alternatives()
  x86/PAT: Fix Xorg regression on CPUs that don't support PAT
  x86/watchdog: Fix Kconfig help text file path reference to lockup watchdog documentation
  x86/build: Permit building with old make versions
  x86/unwind: Add end-of-stack check for ftrace handlers
  Revert "x86/entry: Fix the end of the stack for newly forked tasks"
  x86/boot: Use CROSS_COMPILE prefix for readelf

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fixlet from Thomas Gleixner:
"Silence dmesg spam by making the posix cpu timer printks depend on
print_fatal_signals"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
posix-timers: Make signal printks conditional

Merge branch 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RAS fixes from Thomas Gleixner:
"Two fixlets for RAS:

   - Export memory_error() so the NFIT module can utilize it

   - Handle memory errors in NFIT correctly"

* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  acpi, nfit: Fix the memory error check in nfit_handle_mce()
  x86/MCE: Export memory_error()

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf tooling fixes from Thomas Gleixner:

- Synchronization of tools and kernel headers

- A series of fixes for perf report addressing various failures:
    * Handle invalid maps proper
    * Plug a memory leak
    * Handle frames and callchain order correctly

- Fixes for handling inlines and children mode

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools/include: Sync kernel ABI headers with tooling headers
  perf tools: Put caller above callee in --children mode
  perf report: Do not drop last inlined frame
  perf report: Always honor callchain order for inlined nodes
  perf script: Add --inline option for debugging
  perf report: Fix off-by-one for non-activation frames
  perf report: Fix memory leak in addr2line when called by addr2inlines
  perf report: Don't crash on invalid maps in `-g srcline` mode

Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Thomas Gleixner:
"A fix for a state leak which was introduced in the recent rework of
futex/rtmutex interaction"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()

Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull kthread fix from Thomas Gleixner:
"A single fix which prevents a use after free when kthread fork fails"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kthread: Fix use-after-free if kthread fork fails

Merge tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull ftrace fixes from Steven Rostedt:
"There's been a few memory issues found with ftrace.

  One was simply a memory leak where not all was being freed that should
  have been in releasing a file pointer on set_graph_function.

  Then Thomas found that the ftrace trampolines were marked for
  read/write as well as execute. To shrink the possible attack surface,
  he added calls to set them to ro. Which also uncovered some other
  issues with freeing module allocated memory that had its permissions
  changed.

  Kprobes had a similar issue which is fixed and a selftest was added to
  trigger that issue again"

* tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  x86/ftrace: Make sure that ftrace trampolines are not RWX
  x86/mm/ftrace: Do not bug in early boot on irqs_disabled in cpu_flush_range()
  selftests/ftrace: Add a testcase for many kprobe events
  kprobes/x86: Fix to set RWX bits correctly before releasing trampoline
  ftrace: Fix memory leak in ftrace_graph_release()

x86/ftrace: Make sure that ftrace trampolines are not RWX

ftrace use module_alloc() to allocate trampoline pages. The mapping of
module_alloc() is RWX, which makes sense as the memory is written to right
after allocation. But nothing makes these pages RO after writing to them.

Add proper set_memory_rw/ro() calls to protect the trampolines after
modification.

Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705251056410.1862@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

x86/mm/ftrace: Do not bug in early boot on irqs_disabled in cpu_flush_range()

With function tracing starting in early bootup and having its trampoline
pages being read only, a bug triggered with the following:

kernel BUG at arch/x86/mm/pageattr.c:189!
invalid opcode: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.12.0-rc2-test+ #3
Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
task: ffffffffb4222500 task.stack: ffffffffb4200000
RIP: 0010:change_page_attr_set_clr+0x269/0x302
RSP: 0000:ffffffffb4203c88 EFLAGS: 00010046
RAX: 0000000000000046 RBX: 0000000000000000 RCX: 00000001b6000000
RDX: ffffffffb4203d40 RSI: 0000000000000000 RDI: ffffffffb4240d60
RBP: ffffffffb4203d18 R08: 00000001b6000000 R09: 0000000000000001
R10: ffffffffb4203aa8 R11: 0000000000000003 R12: ffffffffc029b000
R13: ffffffffb4203d40 R14: 0000000000000001 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff9a639ea00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff9a636b384000 CR3: 00000001ea21d000 CR4: 00000000000406b0
Call Trace:
change_page_attr_clear+0x1f/0x21
set_memory_ro+0x1e/0x20
arch_ftrace_update_trampoline+0x207/0x21c
? ftrace_caller+0x64/0x64
? 0xffffffffc029b000
ftrace_startup+0xf4/0x198
register_ftrace_function+0x26/0x3c
function_trace_init+0x5e/0x73
tracer_init+0x1e/0x23
tracing_set_tracer+0x127/0x15a
register_tracer+0x19b/0x1bc
init_function_trace+0x90/0x92
early_trace_init+0x236/0x2b3
start_kernel+0x200/0x3f5
x86_64_start_reservations+0x29/0x2b
x86_64_start_kernel+0x17c/0x18f
secondary_startup_64+0x9f/0x9f
? secondary_startup_64+0x9f/0x9f

Interrupts should not be enabled at this early in the boot process. It is
also fine to leave interrupts enabled during this time as there's only one
CPU running, and on_each_cpu() means to only run on the current CPU.

If early_boot_irqs_disabled is set, it is safe to run cpu_flush_range() with
interrupts disabled. Don't trigger a BUG_ON() in that case.

Link: http://lkml.kernel.org/r/20170526093717.0be3b849@gandalf.local.home
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

selftests/ftrace: Add a testcase for many kprobe events

Add a testcase to test kprobes via ftrace interface
with many concurrent kprobe events.

This tries to add many kprobe events (up to 256) on
kernel functions. To avoid making ftrace-based
kprobes (kprobes on fentry), it skips first N bytes
(on x86 N=5, on ppc or arm N=4) of function entry.
After that, it enables all those events, disable it,
and remove it.

Since the unoptimization buffer reclaiming will
be delayed, after removing events, it will wait
enough time.

Link: http://lkml.kernel.org/r/149577388470.11702.11832460851769204511.stgit@devbox
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

kprobes/x86: Fix to set RWX bits correctly before releasing trampoline

Fix kprobes to set(recover) RWX bits correctly on trampoline
buffer before releasing it. Releasing readonly page to
module_memfree() crash the kernel.

Without this fix, if kprobes user register a bunch of kprobes
in function body (since kprobes on function entry usually
use ftrace) and unregister it, kernel hits a BUG and crash.

Link: http://lkml.kernel.org/r/149570868652.3518.14120169373590420503.stgit@devbox
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: d0381c81c2f7 ("kprobes/x86: Set kprobes pages read-only")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

ftrace: Fix memory leak in ftrace_graph_release()

ftrace_hash is being kfree'ed in ftrace_graph_release(), however the
->buckets field is not.  This results in a memory leak that is easily
captured by kmemleak:

unreferenced object 0xffff880038afe000 (size 8192):
  comm "trace-cmd", pid 238, jiffies 4294916898 (age 9.736s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff815f561e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff8113964d>] __kmalloc+0x12d/0x1a0
    [<ffffffff810bf6d1>] alloc_ftrace_hash+0x51/0x80
    [<ffffffff810c0523>] __ftrace_graph_open.isra.39.constprop.46+0xa3/0x100
    [<ffffffff810c05e8>] ftrace_graph_open+0x68/0xa0
    [<ffffffff8114003d>] do_dentry_open.isra.1+0x1bd/0x2d0
    [<ffffffff81140df7>] vfs_open+0x47/0x60
    [<ffffffff81150f95>] path_openat+0x2a5/0x1020
    [<ffffffff81152d6a>] do_filp_open+0x8a/0xf0
    [<ffffffff811411df>] do_sys_open+0x12f/0x200
    [<ffffffff811412ce>] SyS_open+0x1e/0x20
    [<ffffffff815fa6e0>] entry_SYSCALL_64_fastpath+0x13/0x94
    [<ffffffffffffffff>] 0xffffffffffffffff

Link: http://lkml.kernel.org/r/20170525152038.7661-1-lhenriques@suse.com
Cc: stable@vger.kernel.org
Fixes: b9b0c831bed2 ("ftrace: Convert graph filter to use hash tables")
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input layer fixes from Dmitry Torokhov:
"Just a few fixups to a couple of drivers"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: elan_i2c - ignore signals when finishing updating firmware
  Input: elan_i2c - clear INT before resetting controller
  Input: atmel_mxt_ts - add T100 as a readable object
  Input: edt-ft5x06 - increase allowed data range for threshold parameter

livepatch: Make livepatch dependent on !TRIM_UNUSED_KSYMS

If TRIM_UNUSED_KSYMS is enabled, all unneeded exported symbols are made
unexported. Two-pass build of the kernel is done to find out which
symbols are needed based on a configuration. This effectively
complicates things for out-of-tree modules.

Livepatch exports functions to (un)register and enable/disable a live
patch. The only in-tree module which uses these functions is a sample in
samples/livepatch/. If the sample is disabled, the functions are
trimmed and out-of-tree live patches cannot be built.

Note that live patches are intended to be built out-of-tree.

Suggested-by: Michal Marek <mmarek@suse.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Jessica Yu <jeyu@redhat.com>
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

Merge tag 'led_fixes_for_4-12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds

Pull LED fix from Jacek Anaszewski:
"A single LED fix for 4.12-rc3.

  leds-pca955x driver uses only i2c_smbus API and thus it should pass
  I2C_FUNC_SMBUS_BYTE_DATA flag to i2c_check_functionality"

* tag 'led_fixes_for_4-12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
  leds: pca955x: Correct I2C Functionality

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix state pruning in bpf verifier wrt. alignment, from Daniel
    Borkmann.

2) Handle non-linear SKBs properly in SCTP ICMP parsing, from Davide
    Caratti.

3) Fix bit field definitions for rss_hash_type of descriptors in mlx5
    driver, from Jesper Brouer.

4) Defer slave->link updates until bonding is ready to do a full commit
    to the new settings, from Nithin Sujir.

5) Properly reference count ipv4 FIB metrics to avoid use after free
    situations, from Eric Dumazet and several others including Cong Wang
    and Julian Anastasov.

6) Fix races in llc_ui_bind(), from Lin Zhang.

7) Fix regression of ESP UDP encapsulation for TCP packets, from
    Steffen Klassert.

8) Fix mdio-octeon driver Kconfig deps, from Randy Dunlap.

9) Fix regression in setting DSCP on ipv6/GRE encapsulation, from Peter
    Dawson.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
  ipv4: add reference counting to metrics
  net: ethernet: ax88796: don't call free_irq without request_irq first
  ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets
  sctp: fix ICMP processing if skb is non-linear
  net: llc: add lock_sock in llc_ui_bind to avoid a race condition
  bonding: Don't update slave->link until ready to commit
  test_bpf: Add a couple of tests for BPF_JSGE.
  bpf: add various verifier test cases
  bpf: fix wrong exposure of map_flags into fdinfo for lpm
  bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data
  bpf: properly reset caller saved regs after helper call and ld_abs/ind
  bpf: fix incorrect pruning decision when alignment must be tracked
  arp: fixed -Wuninitialized compiler warning
  tcp: avoid fastopen API to be used on AF_UNSPEC
  net: move somaxconn init from sysctl code
  net: fix potential null pointer dereference
  geneve: fix fill_info when using collect_metadata
  virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
  be2net: Fix offload features for Q-in-Q packets
  vlan: Fix tcp checksum offloads in Q-in-Q vlans
  ...

Merge tag 'xfs-4.12-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull XFS fixes from Darrick Wong:
"A few miscellaneous bug fixes & cleanups:

   - Fix indlen block reservation accounting bug when splitting delalloc
     extent

   - Fix warnings about unused variables that appeared in -rc1.

   - Don't spew errors when bmapping a local format directory

   - Fix an off-by-one error in a delalloc eof assertion

   - Make fsmap only return inode information for CAP_SYS_ADMIN

   - Fix a potential mount time deadlock recovering cow extents

   - Fix unaligned memory access in _btree_visit_blocks

   - Fix various SEEK_HOLE/SEEK_DATA bugs"

* tag 'xfs-4.12-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: Move handling of missing page into one place in xfs_find_get_desired_pgoff()
  xfs: Fix off-by-in in loop termination in xfs_find_get_desired_pgoff()
  xfs: Fix missed holes in SEEK_HOLE implementation
  xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff()
  xfs: fix unaligned access in xfs_btree_visit_blocks
  xfs: avoid mount-time deadlock in CoW extent recovery
  xfs: only return detailed fsmap info if the caller has CAP_SYS_ADMIN
  xfs: bad assertion for delalloc an extent that start at i_size
  xfs: fix warnings about unused stack variables
  xfs: BMAPX shouldn't barf on inline-format directories
  xfs: fix indlen accounting error on partial delalloc conversion

ipv4: add reference counting to metrics

Andrey Konovalov reported crashes in ipv4_mtu()

I could reproduce the issue with KASAN kernels, between
10.246.7.151 and 10.246.7.152 :

1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 &

2) At the same time run following loop :
while :
do
ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
done

Cong Wang attempted to add back rt->fi in commit
82486aa6f1b9 ("ipv4: restore rt->fi for reference counting")
but this proved to add some issues that were complex to solve.

Instead, I suggested to add a refcount to the metrics themselves,
being a standalone object (in particular, no reference to other objects)

I tried to make this patch as small as possible to ease its backport,
instead of being super clean. Note that we believe that only ipv4 dst
need to take care of the metric refcount. But if this is wrong,
this patch adds the basic infrastructure to extend this to other
families.

Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang
for his efforts on this problem.

Fixes: 2860583fe840 ("ipv4: Kill rt->fi")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: ax88796: don't call free_irq without request_irq first

The function ax_init_dev (which is called only from the driver's .probe
function) calls free_irq in the error path without having requested the
irq in the first place. So drop the free_irq call in the error path.

Fixes: 825a2ff1896e ("AX88796 network driver")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets

This fix addresses two problems in the way the DSCP field is formulated
on the encapsulating header of IPv6 tunnels.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195661

1) The IPv6 tunneling code was manipulating the DSCP field of the
encapsulating packet using the 32b flowlabel. Since the flowlabel is
only the lower 20b it was incorrect to assume that the upper 12b
containing the DSCP and ECN fields would remain intact when formulating
the encapsulating header. This fix handles the 'inherit' and
'fixed-value' DSCP cases explicitly using the extant dsfield u8 variable.

2) The use of INET_ECN_encapsulate(0, dsfield) in ip6_tnl_xmit was
incorrect and resulted in the DSCP value always being set to 0.

Commit 90427ef5d2a4 ("ipv6: fix flow labels when the traffic class
is non-0") caused the regression by masking out the flowlabel
which exposed the incorrect handling of the DSCP portion of the
flowlabel in ip6_tunnel and ip6_gre.

Fixes: 90427ef5d2a4 ("ipv6: fix flow labels when the traffic class is non-0")
Signed-off-by: Peter Dawson <peter.a.dawson@boeing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: fix ICMP processing if skb is non-linear

sometimes ICMP replies to INIT chunks are ignored by the client, even if
the encapsulated SCTP headers match an open socket. This happens when the
ICMP packet is carried by a paged skb: use skb_header_pointer() to read
packet contents beyond the SCTP header, so that chunk header and initiate
tag are validated correctly.

v2:
- don't use skb_header_pointer() to read the transport header, since
icmp_socket_deliver() already puts these 8 bytes in the linear area.
- change commit message to make specific reference to INIT chunks.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: llc: add lock_sock in llc_ui_bind to avoid a race condition

There is a race condition in llc_ui_bind if two or more processes/threads
try to bind a same socket.

If more processes/threads bind a same socket success that will lead to
two problems, one is this action is not what we expected, another is
will lead to kernel in unstable status or oops(in my simple test case,
cause llc2.ko can't unload).

The current code is test SOCK_ZAPPED bit to avoid a process to
bind a same socket twice but that is can't avoid more processes/threads
try to bind a same socket at the same time.

So, add lock_sock in llc_ui_bind like others, such as llc_ui_connect.

Signed-off-by: Lin Zhang <xiaolou4617@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"A collection of fixes that should go into this series. This contains:

   - A set of NVMe fixes, pulled from Christoph. This includes a set of
     fixes for the fiber channel bits from James Smart, rdma queue depth
     fix from Marta, controller removal fixes from Ming, and some more
     APST quirk updates from Andy.

   - A blk-mq debugfs fix from Bart, fixing a problem with the
     untangling of the sysfs and debugfs blk-mq bits that was added in
     this series.

   - Error code fix in add_partition() from Dan.

   - A small series of fixes for the new blk-throttle code from Shaohua"

* 'for-linus' of git://git.kernel.dk/linux-block: (21 commits)
  blk-mq: Only register debugfs attributes for blk-mq queues
  nvme: Quirk APST on Intel 600P/P3100 devices
  nvme: only setup block integrity if supported by the driver
  nvme: replace is_flags field in nvme_ctrl_ops with a flags field
  nvme-pci: consistencly use ctrl->device for logging
  partitions/msdos: FreeBSD UFS2 file systems are not recognized
  block: fix an error code in add_partition()
  blk-throttle: force user to configure all settings for io.low
  blk-throttle: respect 0 bps/iops settings for io.low
  blk-throttle: output some debug info in trace
  blk-throttle: add hierarchy support for latency target and idle time
  nvme_fc: remove extra controller reference taken on reconnect
  nvme_fc: correct nvme status set on abort
  nvme_fc: set logging level on resets/deletes
  nvme_fc: revise comment on teardown
  nvme_fc: Support ctrl_loss_tmo
  nvme_fc: get rid of local reconnect_delay
  blk-mq: remove blk_mq_abort_requeue_list()
  nvme: avoid to use blk_mq_abort_requeue_list()
  nvme: use blk_mq_start_hw_queues() in nvme_kill_queues()
  ...

Merge tag 'pci-v4.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

- fix PCI_ENDPOINT build error (merged for v4.12)

- fix Switchtec driver (merged for v4.12)

- fix imx6 config read timeouts, fallout from changing to non-postable
   reads

- add PM "needs_resume" flag for i915 suspend issue

* tag 'pci-v4.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI/PM: Add needs_resume flag to avoid suspend complete optimization
  PCI: imx6: Fix config read timeout handling
  switchtec: Fix minor bug with partition ID register
  switchtec: Use new cdev_device_add() helper function
  PCI: endpoint: Make PCI_ENDPOINT depend on HAS_DMA

Merge tag 'ceph-for-4.12-rc3' of git://github.com/ceph/ceph-client

Pul ceph fixes from Ilya Dryomov:
"A bunch of make W=1 and static checker fixups, a RECONNECT_SEQ
  messenger patch from Zheng and Luis' fallocate fix"

* tag 'ceph-for-4.12-rc3' of git://github.com/ceph/ceph-client:
  ceph: check that the new inode size is within limits in ceph_fallocate()
  libceph: cleanup old messages according to reconnect seq
  libceph: NULL deref on crush_decode() error path
  libceph: fix error handling in process_one_ticket()
  libceph: validate blob_struct_v in process_one_ticket()
  libceph: drop version variable from ceph_monmap_decode()
  libceph: make ceph_msg_data_advance() return void
  libceph: use kbasename() and kill ceph_file_part()

Merge tag 'mmc-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
"This contains fixes to make the WiFi work again for the ARM64 Hikey
  board.

  Together with a couple of DTS updates for the Hikey board we have also
  extended the mmc pwrseq_simple, to support a new power-off-delay-us DT
  property, as that was required to enable a graceful power off sequence
  for the WiFi chip"

* tag 'mmc-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  arm64: dts: hikey: Fix WiFi support
  arm64: dts: hi6220: Move board data from the dwmmc nodes to hikey dts
  arm64: dts: hikey: Add the SYS_5V and the VDD_3V3 regulators
  arm64: dts: hi6220: Move the fixed_5v_hub regulator to the hikey dts
  arm64: dts: hikey: Add clock for the pmic mfd
  mfd: dts: hi655x: Add clock binding for the pmic
  mmc: pwrseq_simple: Parse DTS for the power-off-delay-us property
  mmc: dt: pwrseq-simple: Invent power-off-delay-us