Gabor Juhos [Wed, 17 Feb 2016 11:58:20 +0000 (12:58 +0100)]
pata-rb532-cf: get rid of the irq_to_gpio() call
The RB532 platform specific irq_to_gpio() implementation has been
removed with commit 832f5dacfa0b ("MIPS: Remove all the uses of
custom gpio.h"). Now the platform uses the generic stub which causes
the following error:
pata-rb532-cf pata-rb532-cf: no GPIO found for irq149
pata-rb532-cf: probe of pata-rb532-cf failed with error -2
Drop the irq_to_gpio() call and get the GPIO number from platform
data instead. After this change, the driver works again:
scsi host0: pata-rb532-cf
ata1: PATA max PIO4 irq 149
ata1.00: CFA: CF 1GB, 20080820, max MWDMA4
ata1.00: 1989792 sectors, multi 0: LBA
ata1.00: configured for PIO4
scsi 0:0:0:0: Direct-Access ATA CF 1GB 0820 PQ: 0\
ANSI: 5
sd 0:0:0:0: [sda] 1989792 512-byte logical blocks: (1.01 GB/971 MiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't\
support DPO or FUA
sda: sda1 sda2
sd 0:0:0:0: [sda] Attached SCSI disk
Fixes: 832f5dacfa0b ("MIPS: Remove all the uses of custom gpio.h") Cc: Alban Bedel <albeu@free.fr> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: <stable@vger.kernel.org> #v4.3+ Signed-off-by: Gabor Juhos <juhosg@openwrt.org> Signed-off-by: Tejun Heo <tj@kernel.org>
Arnd Bergmann [Thu, 11 Feb 2016 13:16:27 +0000 (14:16 +0100)]
libata: fix HDIO_GET_32BIT ioctl
As reported by Soohoon Lee, the HDIO_GET_32BIT ioctl does not
work correctly in compat mode with libata.
I have investigated the issue further and found multiple problems
that all appeared with the same commit that originally introduced
HDIO_GET_32BIT handling in libata back in linux-2.6.8 and presumably
also linux-2.4, as the code uses "copy_to_user(arg, &val, 1)" to copy
a 'long' variable containing either 0 or 1 to user space.
The problems with this are:
* On big-endian machines, this will always write a zero because it
stores the wrong byte into user space.
* In compat mode, the upper three bytes of the variable are updated
by the compat_hdio_ioctl() function, but they now contain
uninitialized stack data.
* The hdparm tool calling this ioctl uses a 'static long' variable
to store the result. This means at least the upper bytes are
initialized to zero, but calling another ioctl like HDIO_GET_MULTCOUNT
would fill them with data that remains stale when the low byte
is overwritten. Fortunately libata doesn't implement any of the
affected ioctl commands, so this would only happen when we query
both an IDE and an ATA device in the same command such as
"hdparm -N -c /dev/hda /dev/sda"
* The libata code for unknown reasons started using ATA_IOC_GET_IO32
and ATA_IOC_SET_IO32 as aliases for HDIO_GET_32BIT and HDIO_SET_32BIT,
while the ioctl commands that were added later use the normal
HDIO_* names. This is harmless but rather confusing.
This addresses all four issues by changing the code to use put_user()
on an 'unsigned long' variable in HDIO_GET_32BIT, like the IDE subsystem
does, and by clarifying the names of the ioctl commands.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Soohoon Lee <Soohoon.Lee@f5.com> Tested-by: Soohoon Lee <Soohoon.Lee@f5.com> Cc: stable@vger.kernel.org Signed-off-by: Tejun Heo <tj@kernel.org>
Suman Tripathi [Sat, 6 Feb 2016 05:55:24 +0000 (11:25 +0530)]
ahci_xgene: Implement the workaround to fix the missing of the edge interrupt for the HOST_IRQ_STAT.
Due to H/W errata, the HOST_IRQ_STAT register misses the edge interrupt
when clearing the HOST_IRQ_STAT register and hardware reporting the
PORT_IRQ_STAT register happens to be at the same clock cycle.
Suman Tripathi [Sat, 6 Feb 2016 05:55:23 +0000 (11:25 +0530)]
ata: Remove the AHCI_HFLAG_EDGE_IRQ support from libahci.
The flexibility to override the irq handles in the LLD's are already
present, so controllers implementing a edge trigger latch can
implement their own interrupt handler inside the driver. This patch
removes the AHCI_HFLAG_EDGE_IRQ support from libahci and moves edge
irq handling to ahci_xgene.
Suman Tripathi [Sat, 6 Feb 2016 05:55:22 +0000 (11:25 +0530)]
libahci: Implement the capability to override the generic ahci interrupt handler.
This patch implements the capability to override the generic AHCI
interrupt handler so that specific ahci drivers can implement their
own custom interrupt handler routines. It also exports
ahci_handle_port_intr so that custom irq_handler implementations can
use it.
tj: s/ahci_irq_handler/irq_handler/ and updated description.
Tejun Heo [Mon, 1 Feb 2016 16:33:21 +0000 (11:33 -0500)]
libata: fix sff host state machine locking while polling
The bulk of ATA host state machine is implemented by
ata_sff_hsm_move(). The function is called from either the interrupt
handler or, if polling, a work item. Unlike from the interrupt path,
the polling path calls the function without holding the host lock and
ata_sff_hsm_move() selectively grabs the lock.
This is completely broken. If an IRQ triggers while polling is in
progress, the two can easily race and end up accessing the hardware
and updating state machine state at the same time. This can put the
state machine in an illegal state and lead to a crash like the
following.
Fix it by ensuring that the polling path is holding the host lock
before entering ata_sff_hsm_move() so that all hardware accesses and
state updates are performed under the host lock.
Tejun Heo [Fri, 29 Jan 2016 12:06:53 +0000 (07:06 -0500)]
libata-sff: use WARN instead of BUG on illegal host state machine state
ata_sff_hsm_move() triggers BUG if it sees a host state machine state
that it dind't expect. The risk for data corruption when the
condition occurs is low as it's highly unlikely that it would lead to
spurious completion of commands. The BUG occasionally triggered for
subtle race conditions in the driver. Let's downgrade it to WARN so
that it doesn't kill the machine unnecessarily.
Tejun Heo [Fri, 15 Jan 2016 20:13:05 +0000 (15:13 -0500)]
libata: disable forced PORTS_IMPL for >= AHCI 1.3
Some early controllers incorrectly reported zero ports in PORTS_IMPL
register and the ahci driver fabricates PORTS_IMPL from the number of
ports in those cases. This hasn't mattered but with the new nvme
controllers there are cases where zero PORTS_IMPL is valid and should
be honored.
Danesh Petigara [Mon, 11 Jan 2016 21:22:26 +0000 (13:22 -0800)]
drivers: ata: wake port before DMA stop for ALPM
The AHCI driver code stops and starts port DMA engines at will
without considering the power state of the particular port. The
AHCI specification isn't very clear on how to handle this scenario,
leaving implementation open to interpretation.
Broadcom's STB SATA host controller is unable to handle port DMA
controller restarts when the port in question is in low power mode.
When a port enters partial or slumber mode, its PHY is powered down.
When a controller restart is requested, the controller's internal
state machine expects the PHY to be brought back up by software which
never happens in this case, resulting in failures.
To avoid this situation, logic is added to manually wake up the port
just before its DMA engine is stopped, if the port happens to be in
a low power state. HBA initiated power management ensures that the port
eventually returns to its configured low power state, when the link is
idle (as per the conditions listed in the spec). A new host flag is also
added to ensure this logic is only exercised for hosts with the above
limitation.
Linus Torvalds [Sun, 24 Jan 2016 20:50:56 +0000 (12:50 -0800)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS updates from Ralf Baechle:
"This is the main pull request for MIPS for 4.5 plus some 4.4 fixes.
The executive summary:
- ATH79 platform improvments, use DT bindings for the ATH79 USB PHY.
- Avoid useless rebuilds for zboot.
- jz4780: Add NEMC, BCH and NAND device tree nodes
- Initial support for the MicroChip's DT platform. As all the device
drivers are missing this is still of limited use.
- Some Loongson3 cleanups.
- The unavoidable whitespace polishing.
- Reduce clock skew when synchronizing the CPU cycle counters on CPU
startup.
- Add MIPS R6 fixes.
- Lots of cleanups across arch/mips as fallout from KVM.
- Lots of minor fixes and changes for IEEE 754-2008 support to the
FPU emulator / fp-assist software.
- Minor Ralink, BCM47xx and bcm963xx platform support improvments.
- Support SMP on BCM63168"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (84 commits)
MIPS: zboot: Add support for serial debug using the PROM
MIPS: zboot: Avoid useless rebuilds
MIPS: BMIPS: Enable ARCH_WANT_OPTIONAL_GPIOLIB
MIPS: bcm63xx: nvram: Remove unused bcm63xx_nvram_get_psi_size() function
MIPS: bcm963xx: Update bcm_tag field image_sequence
MIPS: bcm963xx: Move extended flash address to bcm_tag header file
MIPS: bcm963xx: Move Broadcom BCM963xx image tag data structure
MIPS: bcm63xx: nvram: Use nvram structure definition from header file
MIPS: bcm963xx: Add Broadcom BCM963xx board nvram data structure
MAINTAINERS: Add KVM for MIPS entry
MIPS: KVM: Add missing newline to kvm_err()
MIPS: Move KVM specific opcodes into asm/inst.h
MIPS: KVM: Use cacheops.h definitions
MIPS: Break down cacheops.h definitions
MIPS: Use EXCCODE_ constants with set_except_vector()
MIPS: Update trap codes
MIPS: Move Cause.ExcCode trap codes to mipsregs.h
MIPS: KVM: Make kvm_mips_{init,exit}() static
MIPS: KVM: Refactor added offsetof()s
MIPS: KVM: Convert EXPORT_SYMBOL to _GPL
...
Linus Torvalds [Sun, 24 Jan 2016 20:45:35 +0000 (12:45 -0800)]
Merge tag 'platform-drivers-x86-v4.5-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86
Pull x86 platform driver updates from Darren Hart:
"Emergency travel prevented me from completing my final testing on this
until today. Nothing here that couldn't wait until RC1 fixes, but I
thought it best to get it out sooner rather than later as it does
contain a build warning fix.
Summary:
A build warning fix, MAINTAINERS cleanup, and a new DMI quirk:
ideapad-laptop:
- Add Lenovo Yoga 700 to no_hw_rfkill dmi list
MAINTAINERS:
- Combine multiple telemetry entries
intel_telemetry_debugfs:
- Fix unused warnings in telemetry debugfs"
* tag 'platform-drivers-x86-v4.5-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
ideapad-laptop: Add Lenovo Yoga 700 to no_hw_rfkill dmi list
MAINTAINERS: Combine multiple telemetry entries
intel_telemetry_debugfs: Fix unused warnings in telemetry debugfs
Linus Torvalds [Sun, 24 Jan 2016 20:43:06 +0000 (12:43 -0800)]
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management updates from Zhang Rui:
"The top merge commit was re-generated yesterday because two topic
branches were dropped from this pull request in the last minute due to
some unaddressed comments. All the other material has been in
linux-next for quite a while.
Specifics:
- Enhance thermal core to handle unexpected device cooling states
after fresh boot and system resume. From Zhang Rui and Chen Yu.
- Several fixes and cleanups on Rockchip and RCAR thermal drivers.
From Caesar Wang and Kuninori Morimoto.
- Add Broxton support for Intel processor thermal reporting device
driver. From Amy Wiles"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
thermal: trip_point_temp_store() calls thermal_zone_device_update()
thermal: rcar: rcar_thermal_get_temp() return error if strange temp
thermal: rcar: check irq possibility in rcar_thermal_irq_xxx()
thermal: rcar: check every rcar_thermal_update_temp() return value
thermal: rcar: move rcar_thermal_dt_ids to upside
thermal: rockchip: Support the RK3399 SoCs in thermal driver
thermal: rockchip: Support the RK3228 SoCs in thermal driver
dt-bindings: rockchip-thermal: Support the RK3228/RK3399 SoCs compatible
thermal: rockchip: fix a trivial typo
Thermal: Enable Broxton SoC thermal reporting device
thermal: constify pch_dev_ops structure
Thermal: do thermal zone update after a cooling device registered
Thermal: handle thermal zone device properly during system sleep
Thermal: initialize thermal zone device correctly
Linus Torvalds [Sun, 24 Jan 2016 20:39:09 +0000 (12:39 -0800)]
Merge tag 'for-linus-4.5-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
Pull 9p updates from Eric Van Hensbergen:
"Sorry for the last minute pull request, there's was a change that
didn't get pulled into for-next until two weeks ago and I wanted to
give it some bake time.
Summary:
Rework and error handling fixes, primarily in the fscatch and fd
transports"
* tag 'for-linus-4.5-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
fs/9p: use fscache mutex rather than spinlock
9p: trans_fd, bail out if recv fcall if missing
9p: trans_fd, read rework to use p9_parse_header
net/9p: Add device name details on error
Linus Torvalds [Sun, 24 Jan 2016 20:34:13 +0000 (12:34 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph updates from Sage Weil:
"The two main changes are aio support in CephFS, and a series that
fixes several issues in the authentication key timeout/renewal code.
On top of that are a variety of cleanups and minor bug fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: remove outdated comment
libceph: kill off ceph_x_ticket_handler::validity
libceph: invalidate AUTH in addition to a service ticket
libceph: fix authorizer invalidation, take 2
libceph: clear messenger auth_retry flag if we fault
libceph: fix ceph_msg_revoke()
libceph: use list_for_each_entry_safe
ceph: use i_size_{read,write} to get/set i_size
ceph: re-send AIO write request when getting -EOLDSNAP error
ceph: Asynchronous IO support
ceph: Avoid to propagate the invalid page point
ceph: fix double page_unlock() in page_mkwrite()
rbd: delete an unnecessary check before rbd_dev_destroy()
libceph: use list_next_entry instead of list_entry_next
ceph: ceph_frag_contains_value can be boolean
ceph: remove unused functions in ceph_frag.h
Linus Torvalds [Sun, 24 Jan 2016 20:31:12 +0000 (12:31 -0800)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull SMB3 fixes from Steve French:
"A collection of CIFS/SMB3 fixes.
It includes a couple bug fixes, a few for improved debugging of
cifs.ko and some improvements to the way cifs does key generation.
I do have some additional bug fixes I expect in the next week or two
(to address a problem found by xfstest, and some fixes for SMB3.11
dialect, and a couple patches that just came in yesterday that I am
reviewing)"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cifs_dbg() outputs an uninitialized buffer in cifs_readdir()
cifs: fix race between call_async() and reconnect()
Prepare for encryption support (first part). Add decryption and encryption key generation. Thanks to Metze for helping with this.
cifs: Allow using O_DIRECT with cache=loose
cifs: Make echo interval tunable
cifs: Check uniqueid for SMB2+ and return -ESTALE if necessary
Print IP address of unresponsive server
cifs: Ratelimit kernel log messages
Josh Boyer [Sun, 24 Jan 2016 15:46:42 +0000 (10:46 -0500)]
ideapad-laptop: Add Lenovo Yoga 700 to no_hw_rfkill dmi list
Like the Yoga 900 models the Lenovo Yoga 700 does not have a
hw rfkill switch, and trying to read the hw rfkill switch through the
ideapad module causes it to always reported blocking breaking wifi.
This commit adds the Lenovo Yoga 700 to the no_hw_rfkill dmi list, fixing
the wifi breakage.
If we detect that there is nothing to do just set the flag and do not
check if it was already set before. Races really do not matter. If the
flag is set by any code then the shepherd will start dealing with the
situation and reenable the vmstat workers when necessary again.
Since commit 0eb77e988032 ("vmstat: make vmstat_updater deferrable again
and shut down on idle") quiet_vmstat might update cpu_stat_off and mark
a particular cpu to be handled by vmstat_shepherd. This might trigger a
VM_BUG_ON in vmstat_update because the work item might have been
sleeping during the idle period and see the cpu_stat_off updated after
the wake up. The VM_BUG_ON is therefore misleading and no more
appropriate. Moreover it doesn't really suite any protection from real
bugs because vmstat_shepherd will simply reschedule the vmstat_work
anytime it sees a particular cpu set or vmstat_update would do the same
from the worker context directly. Even when the two would race the
result wouldn't be incorrect as the counters update is fully idempotent.
Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Christoph Lameter <cl@linux.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alban Bedel [Thu, 10 Dec 2015 09:57:20 +0000 (10:57 +0100)]
MIPS: zboot: Avoid useless rebuilds
Add dummy.o to the targets list, and fill targets automatically from
$(vmlinuzobjs) to avoid having to maintain two lists.
When building with XZ compression copy ashldi3.c to the build
directory to use a different object file for the kernel and zboot.
Without this the same object file need to be build with different
flags which cause a rebuild at every run.
Signed-off-by: Alban Bedel <albeu@free.fr> Cc: linux-mips@linux-mips.org Cc: Alex Smith <alex.smith@imgtec.com> Cc: Wu Zhangjin <wuzhangjin@gmail.com> Cc: Andrew Bresticker <abrestic@chromium.org> Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/11810/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Simon Arlott [Sun, 13 Dec 2015 22:47:55 +0000 (22:47 +0000)]
MIPS: bcm963xx: Move extended flash address to bcm_tag header file
The extended flash address needs to be subtracted from bcm_tag flash
image offsets. Move this value to the bcm_tag header file.
Renamed define name to consistently use bcm963xx for flash layout
which should be considered a property of the board and not the SoC
(i.e. bcm63xx could theoretically be used on a board without CFE
or any flash).
Signed-off-by: Simon Arlott <simon@fire.lp0.eu> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Brian Norris <computersforpeace@gmail.com> Cc: Kevin Cernekee <cernekee@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Jonas Gorski <jogo@openwrt.org> Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org> Cc: MIPS Mailing List <linux-mips@linux-mips.org> Cc: MTD Maling List <linux-mtd@lists.infradead.org>
Patchwork: https://patchwork.linux-mips.org/patch/11833/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Simon Arlott [Sun, 13 Dec 2015 22:45:30 +0000 (22:45 +0000)]
MIPS: bcm963xx: Add Broadcom BCM963xx board nvram data structure
Broadcom BCM963xx boards have multiple nvram variants across different
SoCs with additional checksum fields added whenever the size of the
nvram was extended.
Add this structure as a header file so that multiple drivers can use it.
Signed-off-by: Simon Arlott <simon@fire.lp0.eu> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Brian Norris <computersforpeace@gmail.com> Cc: Kevin Cernekee <cernekee@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Jonas Gorski <jogo@openwrt.org> Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org> Cc: MIPS Mailing List <linux-mips@linux-mips.org> Cc: MTD Maling List <linux-mtd@lists.infradead.org>
Patchwork: https://patchwork.linux-mips.org/patch/11830/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Linus Torvalds [Sun, 24 Jan 2016 02:45:06 +0000 (18:45 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma updates from Doug Ledford:
"Initial roundup of 4.5 merge window patches
- Remove usage of ib_query_device and instead store attributes in
ib_device struct
- Move iopoll out of block and into lib, rename to irqpoll, and use
in several places in the rdma stack as our new completion queue
polling library mechanism. Update the other block drivers that
already used iopoll to use the new mechanism too.
- Replace the per-entry GID table locks with a single GID table lock
- IPoIB multicast cleanup
- Cleanups to the IB MR facility
- Add support for 64bit extended IB counters
- Fix for netlink oops while parsing RDMA nl messages
- Add support for remote invalidate to the iSER driver (pushed
through the RDMA tree due to dependencies, acknowledged by nab)
- Update to NFSoRDMA (pushed through the RDMA tree due to
dependencies, acknowledged by Bruce)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (169 commits)
IB/mlx5: Unify CQ create flags check
IB/mlx5: Expose Raw Packet QP to user space consumers
{IB, net}/mlx5: Move the modify QP operation table to mlx5_ib
IB/mlx5: Support setting Ethernet priority for Raw Packet QPs
IB/mlx5: Add Raw Packet QP query functionality
IB/mlx5: Add create and destroy functionality for Raw Packet QP
IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types
IB/mlx5: Allocate a Transport Domain for each ucontext
net/mlx5_core: Warn on unsupported events of QP/RQ/SQ
net/mlx5_core: Add RQ and SQ event handling
net/mlx5_core: Export transport objects
IB/mlx5: Expose CQE version to user-space
IB/mlx5: Add CQE version 1 support to user QPs and SRQs
IB/mlx5: Fix data validation in mlx5_ib_alloc_ucontext
IB/sa: Fix netlink local service GFP crash
IB/srpt: Remove redundant wc array
IB/qib: Improve ipoib UD performance
IB/mlx4: Advertise RoCE v2 support
IB/mlx4: Create and use another QP1 for RoCEv2
IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers
...
James Hogan [Wed, 16 Dec 2015 23:49:38 +0000 (23:49 +0000)]
MIPS: Move KVM specific opcodes into asm/inst.h
The header arch/mips/kvm/opcode.h defines a few extra opcodes which
aren't in arch/mips/include/uapi/asm/inst.h. There's nothing KVM
specific about them, so lets move them into inst.h where they belong and
delete the header.
Note that mfmcz_op is renamed to mfmc0_op to match the instruction set
manual, and wait_op was already added to inst.h in commit b0a3eae2b943
("MIPS: inst.h: define COP0 wait op"), merged in v3.16-rc1.
James Hogan [Wed, 16 Dec 2015 23:49:37 +0000 (23:49 +0000)]
MIPS: KVM: Use cacheops.h definitions
Drop the custom cache operation code definitions used by KVM for
emulating guest CACHE instructions, and switch to use the existing
definitions in <asm/cacheops.h>.
James Hogan [Wed, 16 Dec 2015 23:49:36 +0000 (23:49 +0000)]
MIPS: Break down cacheops.h definitions
Most of the cache op codes defined in cacheops.h are split into a 2-bit
cache identifier, and a 3-bit cache op code which does largely the same
thing semantically regardless of the cache identifier.
To allow the use of these definitions by KVM for decoding cache ops,
break the definitions down into parts where it makes sense to do so, and
add masks for the Cache and Op field within the cache op.
James Hogan [Wed, 16 Dec 2015 23:49:34 +0000 (23:49 +0000)]
MIPS: Update trap codes
Add a few missing trap codes.
[ralf@linux-mips.org: Drop removal of exception codes. I don't care what
the incomplete architecture spec says; it can't change existing hardware
and VCEI is supported indeed.]
James Hogan [Wed, 16 Dec 2015 23:49:33 +0000 (23:49 +0000)]
MIPS: Move Cause.ExcCode trap codes to mipsregs.h
Move the Cause.ExcCode trap code definitions from kvm_host.h to
mipsregs.h, since they describe architectural bits rather than KVM
specific constants, and change the prefix from T_ to EXCCODE_.
James Hogan [Wed, 16 Dec 2015 23:49:31 +0000 (23:49 +0000)]
MIPS: KVM: Refactor added offsetof()s
When calculating the offsets into the commpage for dynamically
translated mtc0/mfc0 guest instructions, multiple offsetof()s are added
together to find the offset of the specific register in the mips_coproc,
within the commpage.
Simplify each of these cases to a single offsetof() to find the offset
of the specific register within the commpage.
James Hogan [Wed, 16 Dec 2015 23:49:28 +0000 (23:49 +0000)]
MIPS: Move definition of DC bit to mipsregs.h
The CAUSEB_DC and CAUSEF_DC definitions used by KVM are defined in
asm/kvm_host.h, but all the other Cause register field definitions are
found in asm/mipsregs.h.
Lets reunite the DC bit definitions with its friends in mipsregs.h.
James Hogan [Wed, 16 Dec 2015 23:49:27 +0000 (23:49 +0000)]
MIPS: KVM: Drop some unused definitions from kvm_host.h
Some definitions in the MIPS asm/kvm_host.h are completely unused, so
lets drop them.
MS_TO_NS is no longer used since commit e30492bbe95a ("MIPS: KVM:
Rewrite count/compare timer emulation"). The others don't appear ever to
have been used.
Joshua Henderson [Thu, 14 Jan 2016 01:15:39 +0000 (18:15 -0700)]
MIPS: Add support for PIC32MZDA platform
This adds support for the Microchip PIC32 MIPS microcontroller with the
specific variant PIC32MZDA. PIC32MZDA is based on the MIPS m14KEc core
and boots using device tree.
This includes an early pin setup and early clock setup needed prior to
device tree being initialized. In additon, an interface is provided to
synchronize access to registers shared across several peripherals.
Cristian Birsan [Thu, 14 Jan 2016 01:15:35 +0000 (18:15 -0700)]
IRQCHIP: irq-pic32-evic: Add support for PIC32 interrupt controller
This adds support for the interrupt controller present on PIC32 class
devices. It handles all internal and external interrupts. This controller
exists outside of the CPU core and is the arbitrator of all interrupts
(including interrupts from the CPU itself) before they are presented to
the CPU.
The following features are supported:
- DT properties for EVIC and for devices/peripherals that use interrupt lines
- Persistent and non-persistent interrupt handling
- irqdomain and generic chip support
- Configuration of external interrupt edge polarity
Signed-off-by: Cristian Birsan <cristian.birsan@microchip.com> Signed-off-by: Joshua Henderson <joshua.henderson@microchip.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12092/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
James Hogan [Tue, 22 Dec 2015 13:56:39 +0000 (13:56 +0000)]
MIPS: ptrace: Drop cp0_tcstatus from regoffset_table[]
The cp0_tcstatus member of struct pt_regs was removed along with the
rest of SMTC in v3.16, commit b633648c5ad3 ("MIPS: MT: Remove SMTC
support"), however recent uprobes support in v4.3 added back a reference
to it in the regoffset_table[] in ptrace.c. Remove it.
Linus Walleij [Tue, 22 Dec 2015 14:41:44 +0000 (15:41 +0100)]
MIPS: TXx9: iocled: Be sure to clamp return value
As we want gpio_chip .get() calls to be able to return negative
error codes and propagate to drivers, we need to go over all
drivers and make sure their return values are clamped to [0,1].
We do this by using the ret = !!(val) design pattern.
Linus Walleij [Tue, 22 Dec 2015 14:41:19 +0000 (15:41 +0100)]
MIPS: RB532: Be sure to clamp return value
As we want gpio_chip .get() calls to be able to return negative
error codes and propagate to drivers, we need to go over all
drivers and make sure their return values are clamped to [0,1].
We do this by using the ret = !!(val) design pattern.
Linus Walleij [Tue, 22 Dec 2015 14:41:01 +0000 (15:41 +0100)]
MIPS: TXx9: Be sure to clamp return value
As we want gpio_chip .get() calls to be able to return negative
error codes and propagate to drivers, we need to go over all
drivers and make sure their return values are clamped to [0,1].
We do this by using the ret = !!(val) design pattern.
Linus Walleij [Tue, 22 Dec 2015 14:40:27 +0000 (15:40 +0100)]
MIPS: ar7: Be sure to clamp return value
As we want gpio_chip .get() calls to be able to return negative
error codes and propagate to drivers, we need to go over all
drivers and make sure their return values are clamped to [0,1].
We do this by using the ret = !!(val) design pattern.
Linus Walleij [Tue, 22 Dec 2015 14:40:02 +0000 (15:40 +0100)]
MIPS: Alchemy: Be sure to clamp return value
As we want gpio_chip .get() calls to be able to return negative
error codes and propagate to drivers, we need to go over all
drivers and make sure their return values are clamped to [0,1].
We do this by using the ret = !!(val) design pattern.
Matt Redfearn [Fri, 18 Dec 2015 12:47:00 +0000 (12:47 +0000)]
MIPS: smp-cps: Ensure secondary cores start with EVA disabled
The kernel currently assumes that a core will start up in legacy mode
using the exception base provided through the CM GCR registers. If a
core has been configured in hardware to start in EVA mode, these
assumptions will fail.
This patch ensures that secondary cores are initialized to meet these
assumptions.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com> Reviewed-by: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/11907/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Fix the description of the microMIPS NOP16 encoding or MM_NOP16, which
is not equivalent to the MIPS16 NOP instruction. This is 0x0c00 and
represents the microMIPS `MOVE16 $0, $0' operation, whereas MIPS16 NOP
is encoded as 0x6500, representing `MOVE $0, $16'.
Also fix a typo in `mm_fp0_format' description.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12177/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MIPS: math-emu: dsemul: Correct description of the emulation frame
Remove irrelevant content from the description of the emulation frame in
`mips_dsemul', referring to bare-metal configurations. Update the text,
reflecting the change made with commit ba3049ed4086 ("MIPS: Switch FPU
emulator trap to BREAK instruction."), where we switched from using an
address error exception on an unaligned access to the use of a BREAK 514
instruction causing a breakpoint exception instead.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12176/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MIPS: math-emu: Correct the emulation of microMIPS ADDIUPC instruction
Emulate the microMIPS ADDIUPC instruction directly in `mips_dsemul'. If
executed in the emulation frame, this instruction produces an incorrect
result, because the value of the PC there is not the same as where the
instruction originated.
Reshape code so as to handle all microMIPS cases together.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12175/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MIPS: math-emu: Make microMIPS branch delay slot emulation work
Complement commit 102cedc32a6e ("MIPS: microMIPS: Floating point
support.") which introduced microMIPS FPU emulation, but did not adjust
the encoding of the BREAK instruction used to terminate the branch delay
slot emulation frame. Consequently the execution of any such frame is
indeterminate and, depending on CPU configuration, will result in random
code execution or an offending program being terminated with SIGILL.
This is because the regular MIPS BREAK instruction is encoded with the 0
major and the 0xd minor opcode, however in the microMIPS instruction set
this major/minor opcode pair denotes an encoding reserved for the DSP
ASE. Instead the microMIPS BREAK instruction is encoded with the 0
major and the 0x7 minor opcode.
Use the correct BREAK encoding for microMIPS FPU emulation then.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12174/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MIPS: math-emu: dsemul: Fix ill formatting of microMIPS part
Correct formatting breakage introduced with commit 102cedc32a6e ("MIPS:
microMIPS: Floating point support."), so that further changes to this
code can be consistent.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12173/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Fix an issue introduced with commit 9ab4471c9f1b ("MIPS: math-emu:
Correct delay-slot exception propagation") where the emulation of a NOP
instruction signals the need to terminate the emulation loop. This in
turn, if the PC has not changed from the entry to the loop, will cause
the kernel to terminate the program with SIGILL.
Consider this program:
static double div(double d)
{
do
d /= 2.0;
while (d > .5);
return d;
}
int main(int argc, char **argv)
{
return div(argc);
}
Where the FPU emulator is used, depending on the number of command-line
arguments this code will either run to completion or terminate with
SIGILL.
If no arguments are specified, then BC1T will not be taken, NOP will not
be emulated and code will complete successfully.
If one argument is specified, then BC1T will be taken once and NOP will
be emulated. At this point the entry PC value will be 0x400498 and the
new PC value, set by `mips_dsemul' will be 0x4004a0, the target of BC1T.
The emulation loop will terminate, but SIGILL will not be issued,
because the PC has changed. The FPU emulator will be entered again and
on the second execution BC1T will not be taken, NOP will not be emulated
and code will complete successfully.
If two or more arguments are specified, then the first execution of BC1T
will proceed as above. Upon reentering the FPU emulator the emulation
loop will continue to BC1T, at which point the branch will be taken and
NOP emulated again. At this point however the entry PC value will be
0x4004a0, the same as the target of BC1T. This will make the emulator
conclude that execution has not advanced and therefore an unsupported
FPU instruction has been encountered, and SIGILL will be sent to the
process.
Fix the problem by extending the internal API of `mips_dsemul', making
it return -1 if no delay slot emulation frame has been made, the
instruction has been handled and execution of the emulation loop needs
to continue as if nothing happened. Remove code from `mips_dsemul' to
reproduce steps made by the emulation loop at the conclusion of each
iteration, as those will be reached normally now. Adjust call sites
accordingly. Document the API.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12172/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Huacai Chen [Thu, 21 Jan 2016 13:09:52 +0000 (21:09 +0800)]
MIPS: Fix some missing CONFIG_CPU_MIPSR6 #ifdefs
Commit be0c37c985eddc4 (MIPS: Rearrange PTE bits into fixed positions.)
defines fixed PTE bits for MIPS R2. Then, commit d7b631419b3d230a4d383
(MIPS: pgtable-bits: Fix XPA damage to R6 definitions.) adds the MIPS
R6 definitions in the same way as MIPS R2. But some R6 #ifdefs in the
later commit are missing, so in this patch I fix that.
Huacai Chen [Thu, 21 Jan 2016 13:09:51 +0000 (21:09 +0800)]
MIPS: sync-r4k: reduce skew while synchronization
While synchronization, count register will go backwards for the master.
If synchronise_count_master() runs before synchronise_count_slave(),
skew becomes even more. The skew is very harmful for CPU hotplug (CPU0
do synchronization with CPU1, then CPU0 do synchronization with CPU2
and CPU0's count goes backwards, so it will be out of sync with CPU1).
After the commit cf9bfe55f24973a8f40e2 (MIPS: Synchronize MIPS count one
CPU at a time), we needn't evaluate count_reference at the beginning of
synchronise_count_master() any more. Thus, we evaluate the initcount (It
seems like count_reference is redundant) in the 2nd loop. Since we write
the count register in the last loop, we don't need additional barriers
(the existing memory barriers are enough).
Moreover, I think we loop 3 times is enough to get a primed instruction
cache, this can also get less skew than looping 5 times.
Comments are also updated in this patch.
Signed-off-by: Huacai Chen <chenhc@lemote.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/12163/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Linus Torvalds [Sun, 24 Jan 2016 00:00:52 +0000 (16:00 -0800)]
Merge tag 'ntb-4.5' of git://github.com/jonmason/ntb
Pull NTB updates from Jon Mason:
"A new driver to support AMD NTB, a NTB performance test driver, NTB
bugs fixes, and the ability to recover from running out of DMA
descriptors"
* tag 'ntb-4.5' of git://github.com/jonmason/ntb:
NTB: Fix macro parameter conflict with field name
NTB: Add support for AMD PCI-Express Non-Transparent Bridge
ntb: ntb perf tool
NTB: Address out of DMA descriptor issue with NTB
NTB: Clear property bits in BAR value
NTB: ntb_process_tx error path bug
Linus Torvalds [Sat, 23 Jan 2016 20:24:56 +0000 (12:24 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull final vfs updates from Al Viro:
- The ->i_mutex wrappers (with small prereq in lustre)
- a fix for too early freeing of symlink bodies on shmem (they need to
be RCU-delayed) (-stable fodder)
- followup to dedupe stuff merged this cycle
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: abort dedupe loop if fatal signals are pending
make sure that freeing shmem fast symlinks is RCU-delayed
wrappers for ->i_mutex access
lustre: remove unused declaration
Linus Torvalds [Sat, 23 Jan 2016 19:47:13 +0000 (11:47 -0800)]
Merge tag 'nfs-for-4.5-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes and cleanups from Trond Myklebust:
"Bugfixes:
- pNFS/flexfiles: Fix an XDR encoding bug in layoutreturn
- pNFS/flexfiles: Improve merging of errors in LAYOUTRETURN
* tag 'nfs-for-4.5-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
pNFS/flexfiles: Fix an XDR encoding bug in layoutreturn
NFS: Simplify nfs_request_add_commit_list() arguments
pNFS/flexfiles: Improve merging of errors in LAYOUTRETURN
Linus Torvalds [Sat, 23 Jan 2016 19:13:56 +0000 (11:13 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge small final update from Andrew Morton:
- DAX feature work: add fsync/msync support
- kfree cleanup, MAINTAINERS update
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
MAINTAINERS: return arch/sh to maintained state, with new maintainers
tree wide: use kvfree() than conditional kfree()/vfree()
dax: never rely on bh.b_dev being set by get_block()
xfs: call dax_pfn_mkwrite() for DAX fsync/msync
ext4: call dax_pfn_mkwrite() for DAX fsync/msync
ext2: call dax_pfn_mkwrite() for DAX fsync/msync
dax: add support for fsync/sync
mm: add find_get_entries_tag()
dax: support dirty DAX entries in radix tree
pmem: add wb_cache_pmem() to the PMEM API
dax: fix conversion of holes to PMDs
dax: fix NULL pointer dereference in __dax_dbg()
Linus Torvalds [Sat, 23 Jan 2016 01:30:52 +0000 (17:30 -0800)]
Merge tag 'armsoc-tegra' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC support for Tegra platforms from Olof Johansson:
"Here's a single-SoC topic branch that we've staged separately. Mainly
because it was hard to sort the branch contents in a way that fit our
existing branches due to some refactorings.
The code has been in -next for quite a while, but we staged it in
arm-soc a bit late, which is why we've kept it separate from the other
updates and are sending it separately here"
* tag 'armsoc-tegra' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: tegra: Add NVIDIA Jetson TX1 Developer Kit support
arm64: tegra: Add NVIDIA P2597 I/O board support
arm64: tegra: Add NVIDIA Jetson TX1 support
arm64: tegra: Add NVIDIA P2571 board support
arm64: tegra: Add NVIDIA P2371 board support
arm64: tegra: Add NVIDIA P2595 I/O board support
arm64: tegra: Add NVIDIA P2530 main board support
arm64: tegra: Add Tegra210 support
arm64: tegra: Add NVIDIA Tegra132 Norrin support
arm64: tegra: Add Tegra132 support
ARM: tegra: select USB_ULPI from EHCI rather than platform
ARM: tegra: Ensure entire dcache is flushed on entering LP0/1
amba: Hide TEGRA_AHB symbol
soc/tegra: Add Tegra210 support
soc/tegra: Provide per-SoC Kconfig symbols
Linus Torvalds [Sat, 23 Jan 2016 01:26:00 +0000 (17:26 -0800)]
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A few fixes for fallout that we didn't catch in time in -next, or
smaller warning fixes that have been discovered since"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
soc: qcom/spm: shut up uninitialized variable warning
ARM: realview: fix device tree build
ARM: debug-ll: fix BCM63xx entry for multiplatform
ARM: dts: armadillo800eva Correct extal1 frequency to 24 MHz
Linus Torvalds [Sat, 23 Jan 2016 01:20:30 +0000 (17:20 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull more input updates from Dmitry Torokhov:
"The second round of updates for the input subsystem, mainly changes to
xpad driver to better hanlde Xbox One controllers"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: gpio-keys - allow disabling individual buttons in DT
Input: gpio-keys - allow setting input device name in DT
Input: xpad - correct xbox one pad device name
Input: atmel_mxt_ts - improve touchscreen size/orientation handling
Input: xpad - use LED API when identifying wireless controllers
Input: xpad - workaround dead irq_out after suspend/ resume
Input: xpad - update Xbox One Force Feedback Support
Input: xpad - correctly handle concurrent LED and FF requests
Input: xpad - handle "present" and "gone" correctly
Input: xpad - remove spurious events of wireless xpad 360 controller
Linus Torvalds [Sat, 23 Jan 2016 01:13:15 +0000 (17:13 -0800)]
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"This is mostly stuff which missed the first pull request because it
needed to incubate longer. It's mainly made up of the ncr 5380 rework
but also has a few assorted bug fixes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (88 commits)
imm: Use new parport device model
megaraid: Fix possible NULL pointer deference in mraid_mm_ioctl
storvsc: Fix typo in MODULE_PARM_DESC
cxgbi: Typo in MODULE_PARM_DESC
3w-xxxx: Pass through compat mode ioctls
hisi_sas: Use u64 for qw0 in free_device_v1_hw()
hisi_sas: Fix typo in setup_itct_v1_hw()
hisi_sas: Fix v1 itct masks
ipr: Fix out-of-bounds null overwrite
scsi: add Synology to 1024 sector blacklist
ncr5380: Add support for HP C2502
ncr5380: Fix wait for 53C80 registers registers after PDMA
ncr5380: Enable PDMA for DTC chips
ncr5380: Enable PDMA for NCR53C400A
ncr5380: Use runtime register mapping
ncr5380: Fix pseudo DMA transfers on 53C400
ncr5380: Cleanup whitespace and parentheses
atari_NCR5380: Merge changes from NCR5380.c
ncr5380: Merge changes from atari_NCR5380.c
ncr5380: Fix whitespace in comments using regexp
...
Tetsuo Handa [Fri, 22 Jan 2016 23:11:02 +0000 (15:11 -0800)]
tree wide: use kvfree() than conditional kfree()/vfree()
There are many locations that do
if (memory_was_allocated_by_vmalloc)
vfree(ptr);
else
kfree(ptr);
but kvfree() can handle both kmalloc()ed memory and vmalloc()ed memory
using is_vmalloc_addr(). Unless callers have special reasons, we can
replace this branch with kvfree(). Please check and reply if you found
problems.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Jan Kara <jack@suse.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net> Acked-by: David Rientjes <rientjes@google.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: Boris Petkov <bp@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ross Zwisler [Fri, 22 Jan 2016 23:10:59 +0000 (15:10 -0800)]
dax: never rely on bh.b_dev being set by get_block()
Previously in DAX we assumed that calls to get_block() would set
bh.b_bdev, and we would then use that value even in error cases for
debugging. This caused a NULL pointer dereference in __dax_dbg() which
was fixed by a previous commit, but that commit only changed the one
place where we were hitting an error.
Instead, update dax.c so that we always initialize bh.b_bdev as best we
can based on the information that DAX has. get_block() may or may not
update to a new value, but this at least lets us get something helpful
from bh.b_bdev for error messages and not have to worry about whether it
was set by get_block() or not.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reported-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ross Zwisler [Fri, 22 Jan 2016 23:10:56 +0000 (15:10 -0800)]
xfs: call dax_pfn_mkwrite() for DAX fsync/msync
To properly support the new DAX fsync/msync infrastructure filesystems
need to call dax_pfn_mkwrite() so that DAX can track when user pages are
dirtied.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.com> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ross Zwisler [Fri, 22 Jan 2016 23:10:53 +0000 (15:10 -0800)]
ext4: call dax_pfn_mkwrite() for DAX fsync/msync
To properly support the new DAX fsync/msync infrastructure filesystems
need to call dax_pfn_mkwrite() so that DAX can track when user pages are
dirtied.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.com> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ross Zwisler [Fri, 22 Jan 2016 23:10:50 +0000 (15:10 -0800)]
ext2: call dax_pfn_mkwrite() for DAX fsync/msync
To properly support the new DAX fsync/msync infrastructure filesystems
need to call dax_pfn_mkwrite() so that DAX can track when user pages are
dirtied.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.com> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ross Zwisler [Fri, 22 Jan 2016 23:10:47 +0000 (15:10 -0800)]
dax: add support for fsync/sync
To properly handle fsync/msync in an efficient way DAX needs to track
dirty pages so it is able to flush them durably to media on demand.
The tracking of dirty pages is done via the radix tree in struct
address_space. This radix tree is already used by the page writeback
infrastructure for tracking dirty pages associated with an open file,
and it already has support for exceptional (non struct page*) entries.
We build upon these features to add exceptional entries to the radix
tree for DAX dirty PMD or PTE pages at fault time.
[dan.j.williams@intel.com: fix dax_pmd_dbg build warning] Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.com> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ross Zwisler [Fri, 22 Jan 2016 23:10:44 +0000 (15:10 -0800)]
mm: add find_get_entries_tag()
Add find_get_entries_tag() to the family of functions that include
find_get_entries(), find_get_pages() and find_get_pages_tag(). This is
needed for DAX dirty page handling because we need a list of both page
offsets and radix tree entries ('indices' and 'entries' in this
function) that are marked with the PAGECACHE_TAG_TOWRITE tag.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ross Zwisler [Fri, 22 Jan 2016 23:10:40 +0000 (15:10 -0800)]
dax: support dirty DAX entries in radix tree
Add support for tracking dirty DAX entries in the struct address_space
radix tree. This tree is already used for dirty page writeback, and it
already supports the use of exceptional (non struct page*) entries.
In order to properly track dirty DAX pages we will insert new
exceptional entries into the radix tree that represent dirty DAX PTE or
PMD pages. These exceptional entries will also contain the writeback
addresses for the PTE or PMD faults that we can use at fsync/msync time.
There are currently two types of exceptional entries (shmem and shadow)
that can be placed into the radix tree, and this adds a third. We rely
on the fact that only one type of exceptional entry can be found in a
given radix tree based on its usage. This happens for free with DAX vs
shmem but we explicitly prevent shadow entries from being added to radix
trees for DAX mappings.
The only shadow entries that would be generated for DAX radix trees
would be to track zero page mappings that were created for holes. These
pages would receive minimal benefit from having shadow entries, and the
choice to have only one type of exceptional entry in a given radix tree
makes the logic simpler both in clear_exceptional_entry() and in the
rest of DAX.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.com> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ross Zwisler [Fri, 22 Jan 2016 23:10:37 +0000 (15:10 -0800)]
pmem: add wb_cache_pmem() to the PMEM API
__arch_wb_cache_pmem() was already an internal implementation detail of
the x86 PMEM API, but this functionality needs to be exported as part of
the general PMEM API to handle the fsync/msync case for DAX mmaps.
One thing worth noting is that we really do want this to be part of the
PMEM API as opposed to a stand-alone function like clflush_cache_range()
because of ordering restrictions. By having wb_cache_pmem() as part of
the PMEM API we can leave it unordered, call it multiple times to write
back large amounts of memory, and then order the multiple calls with a
single wmb_pmem().
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.com> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ross Zwisler [Fri, 22 Jan 2016 23:10:34 +0000 (15:10 -0800)]
dax: fix conversion of holes to PMDs
When we get a DAX PMD fault for a write it is possible that there could
be some number of 4k zero pages already present for the same range that
were inserted to service reads from a hole. These 4k zero pages need to
be unmapped from the VMAs and removed from the struct address_space
radix tree before the real DAX PMD entry can be inserted.
For PTE faults this same use case also exists and is handled by a
combination of unmap_mapping_range() to unmap the VMAs and
delete_from_page_cache() to remove the page from the address_space radix
tree.
For PMD faults we do have a call to unmap_mapping_range() (protected by
a buffer_new() check), but nothing clears out the radix tree entry. The
buffer_new() check is also incorrect as the current ext4 and XFS
filesystem code will never return a buffer_head with BH_New set, even
when allocating new blocks over a hole. Instead the filesystem will
zero the blocks manually and return a buffer_head with only BH_Mapped
set.
Fix this situation by removing the buffer_new() check and adding a call
to truncate_inode_pages_range() to clear out the radix tree entries
before we insert the DAX PMD.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reported-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ross Zwisler [Fri, 22 Jan 2016 23:10:31 +0000 (15:10 -0800)]
dax: fix NULL pointer dereference in __dax_dbg()
In __dax_pmd_fault() we currently assume that get_block() will always
set bh.b_bdev and we unconditionally dereference it in __dax_dbg().
This assumption isn't always true - when called for reads of holes
ext4_dax_mmap_get_block() returns a buffer head where bh->b_bdev is
never set. I hit this BUG while testing the DAX PMD fault path.
Instead, initialize bh.b_bdev before passing bh into get_block(). It is
possible that the filesystem's get_block() will update bh.b_bdev, and
this is fine - we just want to initialize bh.b_bdev to something
reasonable so that the calls to __dax_dbg() work and print something
useful.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reported-by: Dan Williams <dan.j.williams@intel.com> Cc: Jan Kara <jack@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It was originally sent in an earlier revision of the pfn_t patchset.
Besides being broken, the warning is also fixed by PFN_FLAGS_MASK
casting the PAGE_MASK to an unsigned long.
Reported-by: Manuel Lauss <manuel.lauss@gmail.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Cc: linux-kernel@vger.kernel.org Cc: Linux-MIPS <linux-mips@linux-mips.org> Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/12182/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>