Fabio Estevam [Thu, 13 Aug 2015 17:02:05 +0000 (14:02 -0300)]
mtd: spi-nor: Improve Kconfig help text for SPI_FSL_QUADSPI
The current "We only connect the NOR to this controller now." text
is not very clear, so explain it better by saying that generic SPI
is not supported by SPI_FSL_QUADSPI and only SPI NOR is.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Joachim Eastwood [Thu, 13 Aug 2015 17:19:40 +0000 (19:19 +0200)]
mtd: spi-nor: add driver for NXP SPI Flash Interface (SPIFI)
Add SPI-NOR driver for the SPI Flash Interface (SPIFI)
controller that is found on newer NXP MCU devices.
The controller supports serial SPI Flash devices with 1-, 2-
and 4-bit width in either SPI mode 0 or 3. The controller
can operate in either command or memory mode. In memory mode
the Flash is exposed as normal memory and can be directly
accessed by the CPU.
Signed-off-by: Joachim Eastwood <manabian@gmail.com> Reviewed-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Ezequiel Garcia [Mon, 3 Aug 2015 14:31:26 +0000 (11:31 -0300)]
nand: pxa3xx: Increase initial buffer size
The initial buffer is used for the initial commands used to detect
a flash device (STATUS, READID and PARAM).
ONFI param page is 256 bytes, and there are three redundant copies
to be read. JEDEC param page is 512 bytes, and there are also three
redundant copies to be read. Hence this buffer should be at least
512 x 3. This commits rounds the buffer size to 2048.
Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Frank Li [Tue, 4 Aug 2015 15:26:10 +0000 (10:26 -0500)]
mtd: spi-nor: fsl-quadspi: reset the module in the probe
The uboot may run the QuadSpi controler with command:
#sf probe
So we should reset the module in the probe.
This patch also clear the pending interrupts which arised by the uboot
code.
Signed-off-by: Huang Shijie <shijie8@gmail.com> Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Frank Li [Tue, 4 Aug 2015 15:26:04 +0000 (10:26 -0500)]
mtd: spi-nor: fsl-quadspi: workaround qspi can't wakeup from wait mode
QSPI1 cannot wake up CCM from WAIT mode on SX ARD board, add pmqos to
let PM NOT enter WAIT mode when accessing QSPI1, refer to TKT245618.
Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: Han Xu <Han.xu@freescale.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Allen Xu [Tue, 4 Aug 2015 15:25:58 +0000 (10:25 -0500)]
mtd: spi-nor: fsl-quadspi: i.MX6SX: fixed the random QSPI access failed issue
We found there is a low probability(5%) QSPI access timeout issue,
usually it happened on kernel boot stage, the first time kernel tried to
access QSPI chip. The READ_ID command was sent but not executed,
consequently the probe function failed.
The root cause is that the divider is not glitchless in i.MX6SX chip.
If qspi clock enabled then change clock frequency by call clk_set_rate,
there will be glitch at low possiblity rate and pass to qspi controller.
The controler will be hang by this glitch.
Based on the new clock flag(CLK_SET_RATE_GATE) and new framework, we
need to change the approach of seting clock rate.
1. Disable clock.
2. call clk_set_rate.
3. Enable clock again.
Signed-off-by: Han Xu <han.xu@freescale.com> Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Frank Li [Tue, 4 Aug 2015 15:25:35 +0000 (10:25 -0500)]
mtd: spi-nor: fsl-quadspi: add imx7d support
Support i.mx7d.
quadspi in i.mx7d increase rxfifo.
require fill at least 16byte to trigger data transfer.
Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: Han Xu <han.xu@freescale.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Han Xu [Tue, 4 Aug 2015 15:25:29 +0000 (10:25 -0500)]
mtd: spi-nor: fsl-quadspi: use quirk to distinguish different qspi version
add several quirk to distinguish different version of qspi module.
Signed-off-by: Han Xu <han.xu@freescale.com> Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Han Xu [Tue, 4 Aug 2015 15:25:22 +0000 (10:25 -0500)]
mtd: spi-nor: fsl-quadspi: dynamically map memory space for AHB read
QSPI may failed to map enough memory (256MB) for AHB read in
previous implementation, especially in 3G/1G memory layout kernel.
Dynamically map memory to avoid such issue.
This implementation generally map QUADSPI_MAX_IOMAP (default 4MB) memory
for AHB read, it should be enough for common scenarios, and the side
effect (0.6% performance drop) is minor.
Previous implementation
root@imx6qdlsolo:~# dd if=/dev/mtd0 of=/dev/null bs=1K count=32K
32768+0 records in
32768+0 records out 33554432 bytes (34 MB) copied, 2.16006 s, 15.5 MB/s
root@imx6qdlsolo:~# dd if=/dev/mtd0 of=/dev/null bs=32M count=1
1+0 records in
1+0 records out 33554432 bytes (34 MB) copied, 1.43149 s, 23.4 MB/s
After applied the patch
root@imx6qdlsolo:~# dd if=/dev/mtd0 of=/dev/null bs=1K count=32K
32768+0 records in
32768+0 records out 33554432 bytes (34 MB) copied, 2.1743 s, 15.4 MB/s
root@imx6qdlsolo:~# dd if=/dev/mtd0 of=/dev/null bs=32M count=1
1+0 records in
1+0 records out 33554432 bytes (34 MB) copied, 1.43158 s, 23.4 MB/s
Signed-off-by: Han Xu <han.xu@freescale.com> Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Brian Norris [Tue, 19 May 2015 21:38:22 +0000 (14:38 -0700)]
mtd: m25p80: allow arbitrary OF matching for "jedec,spi-nor"
When we added the "jedec,spi-nor" compatible string for use in this
driver, we added it as a modalias option. The modalias can be derived in
different ways for platform devices vs. device tree (of_*) matching. But
for device tree matching (the primary target of this identifier string),
the modalias is determined from the first entry in the 'compatible'
property. IOW, the following properties would bind to this driver:
So, we'd like to match (a) and (c) (even when we don't have an explicit
entry for "shinynewdevice"), and we'd rather not allow (b).
To do this, we
(1) always (for devices without specific platform data) pass the
modalias to the spi-nor library;
(2) rework the spi-nor library to not reject "bad" names, and
instead just fall back to autodetection; and
(3) add the .of_match_table to properly catch all "jedec,spi-nor".
This allows (a) and (c) without warnings, and rejects (b).
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Roy Spliet [Fri, 26 Jun 2015 09:00:11 +0000 (11:00 +0200)]
mtd: nand: sunxi: Set serial access mode correctly
Replaces the hard coded "always use EDO" policy with that prescribed
by the ONFI 3.1 specification that EDO mode should always be used if tRC
is below 30ns.
Signed-off-by: Roy Spliet <r.spliet@ultimaker.com> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Roy Spliet [Fri, 26 Jun 2015 09:00:10 +0000 (11:00 +0200)]
mtd: nand: sunxi: Replace failsafe timing cfg with calculated value
The TIMING_CFG register was previously statically set to a magic value
(extracted from Allwinner's BSP) when initializing the NAND controller.
Now that we have more details about the TIMING_CFG register layout
(extracted from the A83 user manual) we can dynamically calculate the
appropriate value for each NAND chip and set it when selecting the
chip.
Signed-off-by: Roy Spliet <r.spliet@ultimaker.com> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Nicolas Iooss [Sun, 5 Jul 2015 01:57:41 +0000 (09:57 +0800)]
mtd: r852: make ecc_reg 32-bit in r852_ecc_correct
r852_ecc_correct() reads a 32-bit register into a 16-bit variable,
ecc_reg, but this variable is later used as if it was larger. This is
reported by clang when building the kernel with many warnings:
drivers/mtd/nand/r852.c:512:11: error: shift count >= width of type
[-Werror,-Wshift-count-overflow]
ecc_reg >>= 16;
^ ~~
Fix this by making ecc_reg 32-bit, like the return type of
r852_read_reg_dword().
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Scott Wood [Sat, 27 Jun 2015 00:43:58 +0000 (19:43 -0500)]
mtd: nand: Fix NAND_USE_BOUNCE_BUFFER flag conflict
Commit 66507c7bc8895f0da6b ("mtd: nand: Add support to use nand_base
poi databuf as bounce buffer") added a flag NAND_USE_BOUNCE_BUFFER
using the same bit value as the existing NAND_BUSWIDTH_AUTO.
Cc: Kamal Dasu <kdasu.kdev@gmail.com> Fixes: 66507c7bc8895f0da6b ("mtd: nand: Add support to use nand_base
poi databuf as bounce buffer") Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Merge tag 'platform-drivers-x86-v4.2-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86
Pull late x86 platform driver updates from Darren Hart:
"The following came in a bit later and I wanted them to bake in next a
few more days before submitting, thus the second pull.
A new intel_pmc_ipc driver, a symmetrical allocation and free fix in
dell-laptop, a couple minor fixes, and some updated documentation in
the dell-laptop comments.
intel_pmc_ipc:
- Add Intel Apollo Lake PMC IPC driver
tc1100-wmi:
- Delete an unnecessary check before the function call "kfree"
dell-laptop:
- Fix allocating & freeing SMI buffer page
- Show info about WiGig and UWB in debugfs
- Update information about wireless control"
* tag 'platform-drivers-x86-v4.2-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
intel_pmc_ipc: Add Intel Apollo Lake PMC IPC driver
tc1100-wmi: Delete an unnecessary check before the function call "kfree"
dell-laptop: Fix allocating & freeing SMI buffer page
dell-laptop: Show info about WiGig and UWB in debugfs
dell-laptop: Update information about wireless control
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
"Assorted VFS fixes and related cleanups (IMO the most interesting in
that part are f_path-related things and Eric's descriptor-related
stuff). UFS regression fixes (it got broken last cycle). 9P fixes.
fs-cache series, DAX patches, Jan's file_remove_suid() work"
[ I'd say this is much more than "fixes and related cleanups". The
file_table locking rule change by Eric Dumazet is a rather big and
fundamental update even if the patch isn't huge. - Linus ]
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
9p: cope with bogus responses from server in p9_client_{read,write}
p9_client_write(): avoid double p9_free_req()
9p: forgetting to cancel request on interrupted zero-copy RPC
dax: bdev_direct_access() may sleep
block: Add support for DAX reads/writes to block devices
dax: Use copy_from_iter_nocache
dax: Add block size note to documentation
fs/file.c: __fget() and dup2() atomicity rules
fs/file.c: don't acquire files->file_lock in fd_install()
fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
vfs: avoid creation of inode number 0 in get_next_ino
namei: make set_root_rcu() return void
make simple_positive() public
ufs: use dir_pages instead of ufs_dir_pages()
pagemap.h: move dir_pages() over there
remove the pointless include of lglock.h
fs: cleanup slight list_entry abuse
xfs: Correctly lock inode when removing suid and file capabilities
fs: Call security_ops->inode_killpriv on truncate
fs: Provide function telling whether file_remove_privs() will do anything
...
Commit 835a6a2f8603 ("Bluetooth: Stop sabotaging list poisoning")
thought that the code was sabotaging the list poisoning when NULL'ing
out the list pointers and removed it.
But what was going on was that the bluetooth code was using NULL
pointers for the list as a way to mark it empty, and that commit just
broke it (and replaced the test with NULL with a "list_empty()" test on
a uninitialized list instead, breaking things even further).
So fix it all up to use the regular and real list_empty() handling
(which does not use NULL, but a pointer to itself), also making sure to
initialize the list properly (the previous NULL case was initialized
implicitly by the session being allocated with kzalloc())
This is a combination of patches by Marcel Holtmann and Tedd Ho-Jeong
An.
[ I would normally expect to get this through the bt tree, but I'm going
to release -rc1, so I'm just committing this directly - Linus ]
Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
"It's been a busy development cycle for target-core in a number of
different areas.
The fabric API usage for se_node_acl allocation is now within
target-core code, dropping the external API callers for all fabric
drivers tree-wide.
There is a new conversion to RCU hlists for se_node_acl and
se_portal_group LUN mappings, that turns fast-past LUN lookup into a
completely lockless code-path. It also removes the original
hard-coded limitation of 256 LUNs per fabric endpoint.
The configfs attributes for backends can now be shared between core
and driver code, allowing existing drivers to use common code while
still allowing flexibility for new backend provided attributes.
The highlights include:
- Merge sbc_verify_dif_* into common code (sagi)
- Remove iscsi-target support for obsolete IFMarker/OFMarker
(Christophe Vu-Brugier)
- Add bidi support in target/user backend (ilias + vangelis + agover)
- Move se_node_acl allocation into target-core code (hch)
- Add crc_t10dif_update common helper (akinobu + mkp)
- Handle target-core odd SGL mapping for data transfer memory
(akinobu)
- Move transport ID handling into target-core (hch)
- Move task tag into struct se_cmd + support 64-bit tags (bart)
- Convert se_node_acl->device_list[] to RCU hlist (nab + hch +
paulmck)
- Convert se_portal_group->tpg_lun_list[] to RCU hlist (nab + hch +
paulmck)
- Simplify target backend driver registration (hch)
- Consolidate + simplify target backend attribute implementations
(hch + nab)
- Subsume se_port + t10_alua_tg_pt_gp_member into se_lun (hch)
- Drop lun_sep_lock for se_lun->lun_se_dev RCU usage (hch + nab)
- Drop unnecessary core_tpg_register TFO parameter (nab)
- Use 64-bit LUNs tree-wide (hannes)
- Drop left-over TARGET_MAX_LUNS_PER_TRANSPORT limit (hannes)"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (76 commits)
target: Bump core version to v5.0
target: remove target_core_configfs.h
target: remove unused TARGET_CORE_CONFIG_ROOT define
target: consolidate version defines
target: implement WRITE_SAME with UNMAP bit using ->execute_unmap
target: simplify UNMAP handling
target: replace se_cmd->execute_rw with a protocol_data field
target/user: Fix inconsistent kmap_atomic/kunmap_atomic
target: Send UA when changing LUN inventory
target: Send UA upon LUN RESET tmr completion
target: Send UA on ALUA target port group change
target: Convert se_lun->lun_deve_lock to normal spinlock
target: use 'se_dev_entry' when allocating UAs
target: Remove 'ua_nacl' pointer from se_ua structure
target_core_alua: Correct UA handling when switching states
xen-scsiback: Fix compile warning for 64-bit LUN
target: Remove TARGET_MAX_LUNS_PER_TRANSPORT
target: use 64-bit LUNs
target: Drop duplicate + unused se_dev_check_wce
target: Drop unnecessary core_tpg_register TFO parameter
...
Merge tag 'ntb-4.2' of git://github.com/jonmason/ntb
Pull NTB updates from Jon Mason:
"This includes a pretty significant reworking of the NTB core code, but
has already produced some significant performance improvements.
An abstraction layer was added to allow the hardware and clients to be
easily added. This required rewriting the NTB transport layer for
this abstraction layer. This modification will allow future "high
performance" NTB clients.
In addition to this change, a number of performance modifications were
added. These changes include NUMA enablement, using CPU memcpy
instead of asyncdma, and modification of NTB layer MTU size"
* tag 'ntb-4.2' of git://github.com/jonmason/ntb: (22 commits)
NTB: Add split BAR output for debugfs stats
NTB: Change WARN_ON_ONCE to pr_warn_once on unsafe
NTB: Print driver name and version in module init
NTB: Increase transport MTU to 64k from 16k
NTB: Rename Intel code names to platform names
NTB: Default to CPU memcpy for performance
NTB: Improve performance with write combining
NTB: Use NUMA memory in Intel driver
NTB: Use NUMA memory and DMA chan in transport
NTB: Rate limit ntb_qp_link_work
NTB: Add tool test client
NTB: Add ping pong test client
NTB: Add parameters for Intel SNB B2B addresses
NTB: Reset transport QP link stats on down
NTB: Do not advance transport RX on link down
NTB: Differentiate transport link down messages
NTB: Check the device ID to set errata flags
NTB: Enable link for Intel root port mode in probe
NTB: Read peer info from local SPAD in transport
NTB: Split ntb_hw_intel and ntb_transport drivers
...
Al Viro [Sat, 4 Jul 2015 20:11:05 +0000 (16:11 -0400)]
p9_client_write(): avoid double p9_free_req()
Braino in "9p: switch p9_client_write() to passing it struct iov_iter *";
if response is impossible to parse and we discard the request, get the
out of the loop right there.
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 4 Jul 2015 20:04:19 +0000 (16:04 -0400)]
9p: forgetting to cancel request on interrupted zero-copy RPC
If we'd already sent a request and decide to abort it, we *must*
issue TFLUSH properly and not just blindly reuse the tag, or
we'll get seriously screwed when response eventually arrives
and we confuse it for response to later request that had reused
the same tag.
Cc: stable@vger.kernel.org # v3.2 and later Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Matthew Wilcox [Fri, 3 Jul 2015 14:40:43 +0000 (10:40 -0400)]
dax: bdev_direct_access() may sleep
The brd driver is the only in-tree driver that may sleep currently.
After some discussion on linux-fsdevel, we decided that any driver
may choose to sleep in its ->direct_access method. To ensure that all
callers of bdev_direct_access() are prepared for this, add a call
to might_sleep().
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Matthew Wilcox [Fri, 3 Jul 2015 14:40:42 +0000 (10:40 -0400)]
block: Add support for DAX reads/writes to block devices
If a block device supports the ->direct_access methods, bypass the normal
DIO path and use DAX to go straight to memcpy() instead of allocating
a DIO and a BIO.
Includes support for the DIO_SKIP_DIO_COUNT flag in DAX, as is done in
do_blockdev_direct_IO().
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Except for the preempt notifiers fix, these are all small bugfixes
that could have been waited for -rc2. Sending them now since I was
taking care of Peter's patch anyway"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: add hyper-v crash msrs values
KVM: x86: remove data variable from kvm_get_msr_common
KVM: s390: virtio-ccw: don't overwrite config space values
KVM: x86: keep track of LVT0 changes under APICv
KVM: x86: properly restore LVT0
KVM: x86: make vapics_in_nmi_mode atomic
sched, preempt_notifier: separate notifier registration from static_key inc/dec
Dave Jiang [Tue, 19 May 2015 20:52:04 +0000 (16:52 -0400)]
NTB: Default to CPU memcpy for performance
Disable DMA usage by default, since the CPU provides much better
performance with write combining. Provide a module parameter to enable
DMA usage when offloading the memcpy is preferred.
Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
Dave Jiang [Tue, 19 May 2015 20:45:46 +0000 (16:45 -0400)]
NTB: Improve performance with write combining
Changing the memory window BAR mappings to write combining significantly
boosts the performance. We will also use memcpy that uses non-temporal
store, which showed performance improvement when doing non-cached
memcpys.
Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
Allen Hubbe [Mon, 11 May 2015 14:08:26 +0000 (10:08 -0400)]
NTB: Rate limit ntb_qp_link_work
When the ntb transport is connecting and waiting for the peer, the debug
console receives lots of debug level messages about the remote qp link
status being down. Rate limit those messages.
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
Allen Hubbe [Thu, 21 May 2015 06:51:39 +0000 (02:51 -0400)]
NTB: Add tool test client
This is a simple debugging driver that enables the doorbell and
scratch pad registers to be read and written from the debugfs. This
tool enables more complicated debugging to be scripted from user space.
This driver may be used to test that your ntb hardware and drivers are
functioning at a basic level.
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
Allen Hubbe [Wed, 15 Apr 2015 15:12:41 +0000 (11:12 -0400)]
NTB: Add ping pong test client
This is a simple ping pong driver that exercises the scratch pads and
doorbells of the ntb hardware. This driver may be used to test that
your ntb hardware and drivers are functioning at a basic level.
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
Allen Hubbe [Tue, 12 May 2015 12:09:15 +0000 (08:09 -0400)]
NTB: Reset transport QP link stats on down
Reset the link stats when the link goes down. In particular, the TX and
RX index and count must be reset, or else the TX side will be sending
packets to the RX side where the RX side is not expecting them. Reset
all the stats, to be consistent.
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
Dave Jiang [Tue, 19 May 2015 20:59:34 +0000 (16:59 -0400)]
NTB: Enable link for Intel root port mode in probe
Link training should be enabled in the driver probe for root port mode.
We should not have to wait for transport to be loaded for this to
happen. Otherwise the ntb device will not show up on the transparent
bridge side of the link.
Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
Dave Jiang [Tue, 2 Jun 2015 07:45:07 +0000 (03:45 -0400)]
NTB: Read peer info from local SPAD in transport
The transport was writing and then reading the peer scratch pad,
essentially reading what it just wrote instead of exchanging any
information with the peer. The transport expects the peer values to be
the same as the local values, so this issue was not obvious.
Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq update from Thomas Gleixner:
"The last update for 4.2 is just moving a macro from a local header to
the global one, so it can be used in architecture code as well.
Cleanup of the now empty local header is 4.3 material"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip: Move IRQCHIP_DECLARE macro to include/linux/irqchip.h
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Two FPU rewrite related fixes. This addresses all known x86
regressions at this stage. Also some other misc fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fpu: Fix boot crash in the early FPU code
x86/asm/entry/64: Update path names
x86/fpu: Fix FPU related boot regression when CPUID masking BIOS feature is enabled
x86/boot/setup: Clean up the e820_reserve_setup_data() code
x86/kaslr: Fix typo in the KASLR_FLAG documentation
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Debug info and other statistics fixes and related enhancements"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/numa: Fix numa balancing stats in /proc/pid/sched
sched/numa: Show numa_group ID in /proc/sched_debug task listings
sched/debug: Move print_cfs_rq() declaration to kernel/sched/sched.h
sched/stat: Expose /proc/pid/schedstat if CONFIG_SCHED_INFO=y
sched/stat: Simplify the sched_info accounting dependency
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"This tree includes an x86 PMU scheduling fix, but most changes are
late breaking tooling fixes and updates:
User visible fixes:
- Create config.detected into OUTPUT directory, fixing parallel
builds sharing the same source directory (Aaro Kiskinen)
- Allow to specify custom linker command, fixing some MIPS64 builds.
(Aaro Kiskinen)
- Fix to show proper convergence stats in 'perf bench numa' (Srikar
Dronamraju)
User visible changes:
- Validate syscall list passed via -e argument to 'perf trace'.
(Arnaldo Carvalho de Melo)
- Introduce 'perf stat --per-thread' (Jiri Olsa)
- Check access permission for --kallsyms and --vmlinux (Li Zhang)
- Move toggling event logic from 'perf top' and into hists browser,
allowing freeze/unfreeze with event lists with more than one entry
(Namhyung Kim)
- Add missing newlines when dumping PERF_RECORD_FINISHED_ROUND and
showing the Aggregated stats in 'perf report -D' (Adrian Hunter)
Infrastructure fixes:
- Add missing break for PERF_RECORD_ITRACE_START, which caused those
events samples to be parsed as well as PERF_RECORD_LOST_SAMPLES.
ITRACE_START only appears when Intel PT or BTS are present, so..
(Jiri Olsa)
- Call the perf_session destructor when bailing out in the inject,
kmem, report, kvm and mem tools (Taeung Song)
Infrastructure changes:
- Move stuff out of 'perf stat' and into the lib for further use
(Jiri Olsa)
- Reference count the cpu_map and thread_map classes (Jiri Olsa)
- Set evsel->{cpus,threads} from the evlist, if not set, allowing the
generalization of some 'perf stat' functions that previously were
accessing private static evlist variable (Jiri Olsa)
- Delete an unnecessary check before the calling free_event_desc()
(Markus Elfring)
- Allow auxtrace data alignment (Adrian Hunter)
- Allow events with dot (Andi Kleen)
- Fix failure to 'perf probe' events on arm (He Kuang)
- Add testing for Makefile.perf (Jiri Olsa)
- Add test for make install with prefix (Jiri Olsa)
- Fix single target build dependency check (Jiri Olsa)
- Access thread_map entries via accessors, prep patch to hold more
info per entry, for ongoing 'perf stat --per-thread' work (Jiri
Olsa)
- Use __weak definition from compiler.h (Sukadev Bhattiprolu)
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
perf tools: Allow to specify custom linker command
perf tools: Create config.detected into OUTPUT directory
perf mem: Fill in the missing session freeing after an error occurs
perf kvm: Fill in the missing session freeing after an error occurs
perf report: Fill in the missing session freeing after an error occurs
perf kmem: Fill in the missing session freeing after an error occurs
perf inject: Fill in the missing session freeing after an error occurs
perf tools: Add missing break for PERF_RECORD_ITRACE_START
perf/x86: Fix 'active_events' imbalance
perf symbols: Check access permission when reading symbol files
perf stat: Introduce --per-thread option
perf stat: Introduce print_counters function
perf stat: Using init_stats instead of memset
perf stat: Rename print_interval to process_interval
perf stat: Remove perf_evsel__read_cb function
perf stat: Move perf_stat initialization counter process code
perf stat: Move zero_per_pkg into counter process code
perf stat: Separate counters reading and processing
perf stat: Introduce read_counters function
perf stat: Introduce perf_evsel__read function
...
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull second round of input updates from Dmitry Torokhov:
"A new driver for Weida wdt87xx touch controllers, and a bunch of
fixups for other drivers"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: wdt87xx_i2c - add a scaling factor for TOUCH_MAJOR event
Input: wdt87xx_i2c - remove stray newline in diagnostic message
Input: arc_ps2 - add HAS_IOMEM dependency
Input: wdt87xx_i2c - fix format warning
Input: improve parsing OF parameters for touchscreens
Input: edt-ft5x06 - mark as direct input device
Input: use for_each_set_bit() where appropriate
Input: add a driver for wdt87xx touchscreen controller
Input: axp20x-pek - fix reporting button state as inverted
Input: xpad - re-send LED command on present event
Input: xpad - set the LEDs properly on XBox Wireless controllers
Input: imx_keypad - check for clk_prepare_enable() error
91a8c2a5b43f ("x86/fpu: Clean up and fix MXCSR handling")
The reason is that the on-stack FPU registers state variable,
used by the FXSAVE instruction, did not have the required
minimum alignment of 16 bytes, causing the general protection
fault.
This is most likely a GCC bug in older GCC versions, but the
offending commit also added a bogus extra 32-byte alignment
(which GCC ignored too).
So fix this bug by making the variable static again, but also
mark it __initdata this time, because fpu__init_system_mxcsr()
is now an __init function.
Reported-and-bisected-by: Jan Kara <jack@suse.cz> Reported-bisected-and-tested-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan Kara <jack@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150704075819.GA9201@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
sched/numa: Fix numa balancing stats in /proc/pid/sched
Commit 44dba3d5d6a1 ("sched: Refactor task_struct to use
numa_faults instead of numa_* pointers") modified the way
tsk->numa_faults stats are accounted.
However that commit never touched show_numa_stats() that is displayed
in /proc/pid/sched and thus the numbers displayed in /proc/pid/sched
don't match the actual numbers.
Fix it by making sure that /proc/pid/sched reflects the task
fault numbers. Also add group fault stats too.
Also couple of more modifications are added here:
1. Format changes:
- Previously we would list two entries per node, one for private
and one for shared. Also the home node info was listed in each entry.
- Now preferred node, total_faults and current node are
displayed separately.
- Now there is one entry per node, that lists private,shared task and
group faults.
2. Unit changes:
- p->numa_pages_migrated was getting reset after every read of
/proc/pid/sched. It's more useful to have absolute numbers since
differential migrations between two accesses can be more easily
calculated.
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Iulia Manda <iulia.manda21@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1435252903-1081-4-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
sched/debug: Move print_cfs_rq() declaration to kernel/sched/sched.h
Currently print_cfs_rq() is declared in include/linux/sched.h.
However it's not used outside kernel/sched. Hence move the
declaration to kernel/sched/sched.h
Also some functions are only available for CONFIG_SCHED_DEBUG=y.
Hence move the declarations to within the #ifdef.
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Iulia Manda <iulia.manda21@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1435252903-1081-2-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Naveen N. Rao [Tue, 30 Jun 2015 09:06:03 +0000 (14:36 +0530)]
sched/stat: Expose /proc/pid/schedstat if CONFIG_SCHED_INFO=y
Expand /proc/pid/schedstat output:
- enable it on CONFIG_TASK_DELAY_ACCT=y && !CONFIG_SCHEDSTATS kernels.
- dump all zeroes on kernels that are booted with the 'nodelayacct'
option, which boot option disables delay accounting on
CONFIG_TASK_DELAY_ACCT=y kernels.
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/vhost cross endian support from Michael Tsirkin:
"I have just queued some more bugfix patches today but none fix
regressions and none are related to these ones, so it looks like a
good time for a merge for -rc1.
The motivation for this is support for legacy BE guests on the new LE
hosts. There are two redeeming properties that made me merge this:
- It's a trivial amount of code: since we wrap host/guest accesses
anyway, almost all of it is well hidden from drivers.
- Sane platforms would never set flags like VHOST_CROSS_ENDIAN_LEGACY,
and when it's clear, there's zero overhead (as some point it was
tested by compiling with and without the patches, got the same
stripped binary).
Maybe we could create a Kconfig symbol to enforce the second point:
prevent people from enabling it eg on x86. I will look into this"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio-pci: alloc only resources actually used.
macvtap/tun: cross-endian support for little-endian hosts
vhost: cross-endian support for legacy devices
virtio: add explicit big-endian support to memory accessors
vhost: introduce vhost_is_little_endian() helper
vringh: introduce vringh_is_little_endian() helper
macvtap: introduce macvtap_is_little_endian() helper
tun: add tun_is_little_endian() helper
virtio: introduce virtio_is_little_endian() helper
The length of each EDID block is EDID_LENGTH, and number of blocks is
(1 + edid->extensions) - we need to multiply not add them.
This causes wrong EDID to be passed on, and is a regression introduced
by d2ed34362a52 (drm: Introduce helper for replacing blob properties)
Signed-off-by: Shixin Zeng <zeng.shixin@gmail.com> Cc: Daniel Stone <daniels@collabora.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com>
[danvet: Add Cc: and fix commit summary.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace updates from Eric Biederman:
"Long ago and far away when user namespaces where young it was realized
that allowing fresh mounts of proc and sysfs with only user namespace
permissions could violate the basic rule that only root gets to decide
if proc or sysfs should be mounted at all.
Some hacks were put in place to reduce the worst of the damage could
be done, and the common sense rule was adopted that fresh mounts of
proc and sysfs should allow no more than bind mounts of proc and
sysfs. Unfortunately that rule has not been fully enforced.
There are two kinds of gaps in that enforcement. Only filesystems
mounted on empty directories of proc and sysfs should be ignored but
the test for empty directories was insufficient. So in my tree
directories on proc, sysctl and sysfs that will always be empty are
created specially. Every other technique is imperfect as an ordinary
directory can have entries added even after a readdir returns and
shows that the directory is empty. Special creation of directories
for mount points makes the code in the kernel a smidge clearer about
it's purpose. I asked container developers from the various container
projects to help test this and no holes were found in the set of mount
points on proc and sysfs that are created specially.
This set of changes also starts enforcing the mount flags of fresh
mounts of proc and sysfs are consistent with the existing mount of
proc and sysfs. I expected this to be the boring part of the work but
unfortunately unprivileged userspace winds up mounting fresh copies of
proc and sysfs with noexec and nosuid clear when root set those flags
on the previous mount of proc and sysfs. So for now only the atime,
read-only and nodev attributes which userspace happens to keep
consistent are enforced. Dealing with the noexec and nosuid
attributes remains for another time.
This set of changes also addresses an issue with how open file
descriptors from /proc/<pid>/ns/* are displayed. Recently readlink of
/proc/<pid>/fd has been triggering a WARN_ON that has not been
meaningful since it was added (as all of the code in the kernel was
converted) and is not now actively wrong.
There is also a short list of issues that have not been fixed yet that
I will mention briefly.
It is possible to rename a directory from below to above a bind mount.
At which point any directory pointers below the renamed directory can
be walked up to the root directory of the filesystem. With user
namespaces enabled a bind mount of the bind mount can be created
allowing the user to pick a directory whose children they can rename
to outside of the bind mount. This is challenging to fix and doubly
so because all obvious solutions must touch code that is in the
performance part of pathname resolution.
As mentioned above there is also a question of how to ensure that
developers by accident or with purpose do not introduce exectuable
files on sysfs and proc and in doing so introduce security regressions
in the current userspace that will not be immediately obvious and as
such are likely to require breaking userspace in painful ways once
they are recognized"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
vfs: Remove incorrect debugging WARN in prepend_path
mnt: Update fs_fully_visible to test for permanently empty directories
sysfs: Create mountpoints with sysfs_create_mount_point
sysfs: Add support for permanently empty directories to serve as mount points.
kernfs: Add support for always empty directories.
proc: Allow creating permanently empty directories that serve as mount points
sysctl: Allow creating permanently empty directories that serve as mountpoints.
fs: Add helper functions for permanently empty directories.
vfs: Ignore unlocked mounts in fs_fully_visible
mnt: Modify fs_fully_visible to deal with locked ro nodev and atime
mnt: Refactor the logic for mounting sysfs and proc in a user namespace
Merge tag 'remoteproc-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/remoteproc
Pull remoteproc updates from Ohad Ben-Cohen:
- remoteproc fixes/cleanups from Suman Anna
- new remoteproc TI Wakeup M3 driver from Dave Gerlach
- remoteproc core support for TI's Wakeup M3 driver from both Dave and Suman
- tiny remoteproc build fix from myself
* tag 'remoteproc-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/remoteproc:
remoteproc: fix !CONFIG_OF build breakage
remoteproc/wkup_m3: add a remoteproc driver for TI Wakeup M3
Documentation: dt: add bindings for TI Wakeup M3 processor
remoteproc: add a rproc ops for performing address translation
remoteproc: introduce rproc_get_by_phandle API
remoteproc: fix various checkpatch warnings
remoteproc/davinci: fix quoted split string checkpatch warning
remoteproc/ste: add blank lines after declarations
Merge tag 'hwspinlock-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock
Pull hwspinlock updates from Ohad Ben-Cohen:
- hwspinlock core DT support from Suman Anna
- OMAP hwspinlock DT support from Suman Anna
- QCOM hwspinlock DT support from Bjorn Andersson
- a new CSR atlas7 hwspinlock driver from Wei Chen
- CSR atlas7 hwspinlock DT binding document from Wei Chen
- a tiny QCOM hwspinlock driver fix from Bjorn Andersson
* tag 'hwspinlock-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock:
hwspinlock: qcom: Correct msb in regmap_field
DT: hwspinlock: add the CSR atlas7 hwspinlock bindings document
hwspinlock: add a CSR atlas7 driver
hwspinlock: qcom: Add support for Qualcomm HW Mutex block
DT: hwspinlock: Add binding documentation for Qualcomm hwmutex
hwspinlock/omap: add support for dt nodes
Documentation: dt: add the omap hwspinlock bindings document
hwspinlock/core: add device tree support
Documentation: dt: add common bindings for hwspinlock
- suspicious RCU usage warning
- BPF (out of bounds array read and endianness conversion)
- perf (of_node usage after of_node_put, cpu_pmu->plat_device
assignment)
- huge pmd/pud check for value 0
- rate-limiting should only take unhandled signals into account
Clean-up:
- incorrect use of pgprot_t type
- unused header include
- __init annotation to arm_cpuidle_init
- pr_debug instead of pr_error for disabled GICC entries in
ACPI/MADT"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Fix show_unhandled_signal_ratelimited usage
ARM64 / SMP: Switch pr_err() to pr_debug() for disabled GICC entry
arm64: cpuidle: add __init section marker to arm_cpuidle_init
arm64: Don't report clear pmds and puds as huge
arm64: perf: fix unassigned cpu_pmu->plat_device when probing PMU PPIs
arm64: perf: Don't use of_node after putting it
arm64: fix incorrect use of pgprot_t variable
arm64/hw_breakpoint.c: remove unnecessary header
arm64: bpf: fix endianness conversion bugs
arm64: bpf: fix out-of-bounds read in bpf2a64_offset()
ARM64: smp: Fix suspicious RCU usage with ipi tracepoints
Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull more hwmon updates from Jean Delvare.
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
hwmon: (w83627ehf) Use swap() in w82627ehf_swap_tempreg()
hwmon: Document which I2C addresses can be probed
hwmon: (w83792d) Additional PWM outputs support
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Mainly sending this off now for the writeback fixes, since they fix a
real regression introduced with the cgroup writeback changes. The
NVMe fix could wait for next pull for this series, but it's simple
enough that we might as well include it.
This contains:
- two cgroup writeback fixes from Tejun, fixing a user reported issue
with luks crypt devices hanging when being closed.
- NVMe error cleanup fix from Jon Derrick, fixing a case where we'd
attempt to free an unregistered IRQ"
* 'for-linus' of git://git.kernel.dk/linux-block:
NVMe: Fix irq freeing when queue_request_irq fails
writeback: don't drain bdi_writeback_congested on bdi destruction
writeback: don't embed root bdi_writeback_congested in bdi_writeback
Nicolas Iooss [Mon, 29 Jun 2015 10:39:23 +0000 (18:39 +0800)]
KVM: x86: remove data variable from kvm_get_msr_common
Commit 609e36d372ad ("KVM: x86: pass host_initiated to functions that
read MSRs") modified kvm_get_msr_common function to use msr_info->data
instead of data but missed one occurrence. Replace it and remove the
unused local variable.
Fixes: 609e36d372ad ("KVM: x86: pass host_initiated to functions that
read MSRs") Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cornelia Huck [Mon, 29 Jun 2015 14:44:01 +0000 (16:44 +0200)]
KVM: s390: virtio-ccw: don't overwrite config space values
Eric noticed problems with vhost-scsi and virtio-ccw: vhost-scsi
complained about overwriting values in the config space, which
was triggered by a broken implementation of virtio-ccw's config
get/set routines. It was probably sheer luck that we did not hit
this before.
When writing a value to the config space, the WRITE_CONF ccw will
always write from the beginning of the config space up to and
including the value to be set. If the config space up to the value
has not yet been retrieved from the device, however, we'll end up
overwriting values. Keep track of the known config space and update
if needed to avoid this.
Moreover, READ_CONF will only read the number of bytes it has been
instructed to retrieve, so we must not copy more than that to the
buffer, or we might overwrite trailing values.
Reported-by: Eric Farman <farman@linux.vnet.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com> Tested-by: Eric Farman <farman@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Radim Krčmář [Tue, 30 Jun 2015 20:19:16 +0000 (22:19 +0200)]
KVM: x86: keep track of LVT0 changes under APICv
Memory-mapped LVT0 register already contains the new value when APICv
traps so we can't directly detect a change.
Memorize a bit we are interested in to enable legacy NMI watchdog.
Suggested-by: Yoshida Nobuo <yoshida.nb@ncos.nec.co.jp> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Writes were a bit racy, but hard to turn into a bug at the same time.
(Particularly because modern Linux doesn't use this feature anymore.)
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
[Actually the next patch makes it much, much easier to trigger the race
so I'm including this one for stable@ as well. - Paolo] Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Peter Zijlstra [Fri, 3 Jul 2015 16:53:58 +0000 (18:53 +0200)]
sched, preempt_notifier: separate notifier registration from static_key inc/dec
Commit 1cde2930e154 ("sched/preempt: Add static_key() to preempt_notifiers")
had two problems. First, the preempt-notifier API needs to sleep with the
addition of the static_key, we do however need to hold off preemption
while modifying the preempt notifier list, otherwise a preemption could
observe an inconsistent list state. KVM correctly registers and
unregisters preempt notifiers with preemption disabled, so the sleep
caused dmesg splats.
Second, KVM registers and unregisters preemption notifiers very often
(in vcpu_load/vcpu_put). With a single uniprocessor guest the static key
would move between 0 and 1 continuously, hitting the slow path on every
userspace exit.
To fix this, wrap the static_key inc/dec in a new API, and call it from
KVM.
Fixes: 1cde2930e154 ("sched/preempt: Add static_key() to preempt_notifiers") Reported-by: Pontus Fuchs <pontus.fuchs@gmail.com> Reported-by: Takashi Iwai <tiwai@suse.de> Tested-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 86dca36e6ba introduced ratelimited usage for
'unhandled_signal' messages.
The commit checks the ratelimit irrespective of whether
the signal is handled or not, which is wrong and leads
to false reports like the below in dmesg :
__do_user_fault: 127 callbacks suppressed
Do the ratelimit check only if the signal is unhandled.
Fixes: 86dca36e6ba0 ("arm64: use private ratelimit state along with show_unhandled_signals") Cc: Vladimir Murzin <Vladimir.Murzin@arm.com> Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>