Olof Johansson [Thu, 2 Jan 2014 19:25:10 +0000 (11:25 -0800)]
Merge tag 'renesas-boards2-for-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/boards
From Simon Horman:
Second Round of Renesas ARM based SoC Board updates for v3.14
* r8a7791 (R-Car M2) based Koelsch board
- Let Koelsch multiplatform boot with Koelsch DTB
- Remove non-multiplatform DT reference support
- Instantiate clkdevs for SCIF and CMT
- Remove duplicate CCF initialization
- Add Ether and DU support
* r8a7790 (R-Car H2) based Lager board
- Let Lager multiplatform boot with Lager DTB
- Remove non-multiplatform DT reference support
- Instantiate clkdevs for SCIF and CMT
- Enable multiplaform kernel support
* r8a7740 (R-Mobile A1) based Armadillo board
- Set backlight enable GPIO
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org> Signed-off-by: David Brown <davidb@codeaurora.org> Signed-off-by: Olof Johansson <olof@lixom.net>
Rohit Vaswani [Thu, 2 Jan 2014 18:17:31 +0000 (10:17 -0800)]
ARM: msm: Add support for APQ8074 Dragonboard
This patch adds basic board support for APQ8074 Dragonboard
which belongs to the Snapdragon 800 family.
For now, just support a basic machine with device tree.
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org> Acked-by: Kumar Gala <galak@codeaurora.org> Signed-off-by: David Brown <davidb@codeaurora.org>
[olof: Split off SoC and board support in separate patches] Signed-off-by: Olof Johansson <olof@lixom.net>
Stephen Boyd [Fri, 20 Dec 2013 19:09:17 +0000 (11:09 -0800)]
ARM: msm: Simplify ARCH_MSM_DT config
This doesn't need to be a def_bool y. Instead we can have every
DT supported platform select ARCH_MSM_DT and we achieve the same
thing with less chance of conflicts.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: David Brown <davidb@codeaurora.org> Signed-off-by: Olof Johansson <olof@lixom.net>
Rohit Vaswani [Fri, 20 Dec 2013 19:09:15 +0000 (11:09 -0800)]
ARM: msm: Add support for MSM8974 SoC
Add support for the Snapdragon 800 MSM8974 SoC, used on the Dragonboard
and others. Board support added in separate patch.
Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org> Acked-by: Kumar Gala <galak@codeaurora.org> Signed-off-by: David Brown <davidb@codeaurora.org>
[olof: split off SoC support in separate patch] Signed-off-by: Olof Johansson <olof@lixom.net>
Josh Cartwright [Fri, 20 Dec 2013 19:09:14 +0000 (11:09 -0800)]
ARM: msm: trout: fix uninit var warning
Fix the following warning when !CONFIG_MMC:
arch/arm/mach-msm/board-trout.c: In function 'trout_init':
arch/arm/mach-msm/board-trout.c:67:6: warning: unused variable 'rc' [-Wunused-variable]
int rc;
^
Also, while we're here, rework explicit printk(KERN_CRIT..) to use
pr_crit.
Signed-off-by: Josh Cartwright <joshc@codeaurora.org> Signed-off-by: David Brown <davidb@codeaurora.org> Signed-off-by: Olof Johansson <olof@lixom.net>
Olof Johansson [Sun, 29 Dec 2013 21:24:02 +0000 (13:24 -0800)]
Merge tag 'renesas-defconfig2-for-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/boards
From Simon Horman:
Second Round of Renesas ARM Based SoC Defconfig Updates for v3.14
* r7s72100 SoC (RZ/A1H) based Genmai Board
- Fixup I2C device on defconfig
- Add gpio regulator support on defconfig
* r8a7791 (R-Car M2) based Koelsch board
- Do not disable CONFIG_{INOTIFY_USER,UNIX} in defconfig
- Enable CONFIG_PACKET in defconfig
- Enable Ether in defconfig
* r8a7740 (R-Mobile A1) based Armadillo board
- Enable backlight control in defconfig
* r8a7740 (R-Mobile A1) based Armadillo board
- Rename ARCH_SHMOBILE to ARCH_SHMOBILE_LEGACY
* tag 'renesas-defconfig2-for-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: koelsch: Do not disable CONFIG_{INOTIFY_USER,UNIX} in defconfig
ARM: shmobile: koelsch: Enable CONFIG_PACKET in defconfig
ARM: shmobile: armadillo800eva: Enable backlight control in defconfig
ARM: shmobile: Koelsch: enable Ether in defconfig
ARM: shmobile: genmai: Rename ARCH_SHMOBILE to ARCH_SHMOBILE_LEGACY
ARM: shmobile: lager: fixup I2C device on defconfig
ARM: shmobile: lager: add gpio regulator support on defconfig
Olof Johansson [Sun, 29 Dec 2013 21:15:03 +0000 (13:15 -0800)]
Merge tag 'renesas-boards-for-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/boards
From Simon Horman:
Renesas ARM based SoC board updates for v3.14
* Global
- Kconfig: Mention Renesas ARM SoCs instead of SH-Mobile
* r7s72100 SoC (RZ/A1H) based Genmai Board
- Add Multiplatform support
- Add Reference DT
* r8a7791 (R-Car M2) based Koelsch board
- Add pinctrl_register_mappings() for Koelsch
- Hook up SW30-SW36 on Koelsch
- Mark GPIO keys as wake-up sources
- Use ->init_late()
- Add Multiplatform support
- Set .debounce_interval for GPIO keys
- Add SW2 to GPIO keys
- Add Led 6, 7 and 8 support
- Add reference DT
- Enable PFC/GPIO
* r8a7790 (R-Car H2) based Lager board
- Add gpio/fixed regulator for SDHI
- Add SPI FLASH support on QSPI
- Mark GPIO keys as wake-up sources
- Use ->init_late()
- Set .debounce_interval for GPIO keys
* r8a7778 (R-Car M1) based Bock-W board
- bockw: remove unused RSND_SSI_CLK_FROM_ADG
- Set .debounce_interval for GPIO keys
- Correct FPGA ioremap area
- Use regulator for MMCIF
* sh7374 (SH-Mobile AP4) based Mackerel board
- Use pinconf API to configure pin pull-down
- clk_round_rate() can return a zero to indicate an error
* tag 'renesas-boards-for-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: (75 commits)
ARM: shmobile: lager: add gpio/fixed regulator for SDHI
ARM: shmobile: bockw: remove unused RSND_SSI_CLK_FROM_ADG
ARM: shmobile: armadillo: fixup FSI address size
ARM: Kconfig: Mention Renesas ARM SoCs instead of SH-Mobile
ARM: shmobile: mackerel: Use pinconf API to configure pin pull-down
ARM: shmobile: Lager:add SPI FLASH support on QSPI
ARM: shmobile: mackerel: clk_round_rate() can return a zero to indicate an error
ARM: shmobile: Add pinctrl_register_mappings() for Koelsch
ARM: shmobile: Use ->init_late() on Lager
ARM: shmobile: Hook up SW30-SW36 on Koelsch
ARM: shmobile: koelsch: mark GPIO keys as wake-up sources
ARM: shmobile: Use ->init_late() on Koelsch
ARM: shmobile: lager: mark GPIO keys as wake-up sources
ARM: shmobile: r7s72100 Genmai Multiplatform
ARM: shmobile: r7s72100 Genmai DT reference C bits
ARM: shmobile: r7s72100 Genmai DT reference DTS bits
ARM: shmobile: Initial r8a7791 and Koelsch multiplatform support
ARM: shmobile: koelsch: set .debounce_interval
ARM: shmobile: lager: set .debounce_interval
ARM: shmobile: bockw: add pin pull-up setting for SDHI
...
Olof Johansson [Sun, 29 Dec 2013 05:38:35 +0000 (21:38 -0800)]
Merge tag 'samsung-defconfig-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into next/boards
From Kukjin Kim:
Samsung defconfig 2nd update for v3.14
- enable REGULATOR S2MPS11 for Arndale Octa board
* tag 'samsung-defconfig-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
ARM: exynos_defconfig: Enable S2MPS11 voltage regulator
Olof Johansson [Sun, 22 Dec 2013 22:04:34 +0000 (14:04 -0800)]
Merge tag 'samsung-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into next/boards
From Kukjin Kim:
exynos_defconfig updates for v3.14
- increase number of CPU to 8 for EXYNOS SoCs
* tag 'samsung-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
ARM: exynos_defconfig: increase CONFIG_NR_CPUS value to 8
Linus Torvalds [Sun, 22 Dec 2013 19:13:02 +0000 (11:13 -0800)]
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Much smaller batch of fixes this week.
Biggest one is a revert of an OMAP display change that removed some
non-DT pinmux code that was still needed for 3.13 to get DSI displays
to work.
There's also a fix that resolves some misdescribed GPIO controller
resources on shmobile. The rest are mostly smaller fixes, a couple of
MAINTAINERS updates, etc"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
Revert "ARM: OMAP2+: Remove legacy mux code for display.c"
MAINTAINERS: Add keystone clock drivers
MAINTAINERS: Add keystone git tree information
ARM: s3c64xx: dt: Fix boot failure due to double clock initialization
ARM: shmobile: r8a7790: Fix GPIO resources in DTS
irqchip: renesas-intc-irqpin: Fix register bitfield shift calculation
ARM: shmobile: lager: phy fixup needs CONFIG_PHYLIB
Linus Torvalds [Sun, 22 Dec 2013 19:11:57 +0000 (11:11 -0800)]
Merge tag 'firewire-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fixlet from Stefan Richter:
"A one-liner to reenable WRITE SAME over SBP-2 like in v3.8...v3.12.
Buggy targets which could malfunction when being subjected to this
command are already sufficiently protected by a scsi_level check in sd
+ SCSI core"
* tag 'firewire-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: sbp2: bring back WRITE SAME support
Pull SCSI target fixes from Nicholas Bellinger:
"Mostly minor items this time around, the most notable being a FILEIO
backend change to enforce hw_max_sectors based upon the current
block_size to address a bug where large sized I/Os (> 1M) where being
rejected"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
qla2xxx: Fix scsi_host leak on qlt_lport_register callback failure
target: Remove extra percpu_ref_init
target/file: Update hw_max_sectors based on current block_size
iser-target: Move INIT_WORK setup into isert_create_device_ib_res
iscsi-target: Fix incorrect np->np_thread NULL assignment
qla2xxx: Fix schedule_delayed_work() for target timeout calculations
iser-target: fix error return code in isert_create_device_ib_res()
iscsi-target: Fix-up all zero data-length CDBs with R/W_BIT set
target: Remove write-only stats fields and lock from struct se_node_acl
iscsi-target: return -EINVAL on oversized configfs parameter
Linus Torvalds [Sun, 22 Dec 2013 19:03:49 +0000 (11:03 -0800)]
Merge git://git.kvack.org/~bcrl/aio-next
Pull AIO leak fixes from Ben LaHaise:
"I've put these two patches plus Linus's change through a round of
tests, and it passes millions of iterations of the aio numa
migratepage test, as well as a number of repetitions of a few simple
read and write tests.
The first patch fixes the memory leak Kent introduced, while the
second patch makes aio_migratepage() much more paranoid and robust"
* git://git.kvack.org/~bcrl/aio-next:
aio/migratepages: make aio migrate pages sane
aio: fix kioctx leak introduced by "aio: Fix a trinity splat"
Linus Torvalds [Thu, 19 Dec 2013 20:11:12 +0000 (05:11 +0900)]
aio: clean up and fix aio_setup_ring page mapping
Since commit 36bc08cc01709 ("fs/aio: Add support to aio ring pages
migration") the aio ring setup code has used a special per-ring backing
inode for the page allocations, rather than just using random anonymous
pages.
However, rather than remembering the pages as it allocated them, it
would allocate the pages, insert them into the file mapping (dirty, so
that they couldn't be free'd), and then forget about them. And then to
look them up again, it would mmap the mapping, and then use
"get_user_pages()" to get back an array of the pages we just created.
Now, not only is that incredibly inefficient, it also leaked all the
pages if the mmap failed (which could happen due to excessive number of
mappings, for example).
So clean it all up, making it much more straightforward. Also remove
some left-overs of the previous (broken) mm_populate() usage that was
removed in commit d6c355c7dabc ("aio: fix race in ring buffer page
lookup introduced by page migration support") but left the pointless and
now misleading MAP_POPULATE flag around.
Tested-and-acked-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Benjamin LaHaise [Sat, 21 Dec 2013 22:56:08 +0000 (17:56 -0500)]
aio/migratepages: make aio migrate pages sane
The arbitrary restriction on page counts offered by the core
migrate_page_move_mapping() code results in rather suspicious looking
fiddling with page reference counts in the aio_migratepage() operation.
To fix this, make migrate_page_move_mapping() take an extra_count parameter
that allows aio to tell the code about its own reference count on the page
being migrated.
While cleaning up aio_migratepage(), make it validate that the old page
being passed in is actually what aio_migratepage() expects to prevent
misbehaviour in the case of races.
Benjamin LaHaise [Sat, 21 Dec 2013 20:49:28 +0000 (15:49 -0500)]
aio: fix kioctx leak introduced by "aio: Fix a trinity splat"
e34ecee2ae791df674dfb466ce40692ca6218e43 reworked the percpu reference
counting to correct a bug trinity found. Unfortunately, the change lead
to kioctxes being leaked because there was no final reference count to
put. Add that reference count back in to fix things.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Cc: stable@vger.kernel.org
Linus Torvalds [Sat, 21 Dec 2013 00:52:45 +0000 (16:52 -0800)]
Don't set the INITRD_COMPRESS environment variable automatically
Commit 1bf49dd4be0b ("./Makefile: export initial ramdisk compression
config option") started setting the INITRD_COMPRESS environment variable
depending on which decompression models the kernel had available.
That is completely broken.
For example, we by default have CONFIG_RD_LZ4 enabled, and are able to
decompress such an initrd, but the user tools to *create* such an initrd
may not be availble. So trying to tell dracut to generate an
lz4-compressed image just because we can decode such an image is
completely inappropriate.
Cc: J P <ppandit@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Beulich <JBeulich@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 20 Dec 2013 23:48:45 +0000 (15:48 -0800)]
Merge tag 'xfs-for-linus-v3.13-rc5' of git://oss.sgi.com/xfs/xfs
Pull xfs bugfixes from Ben Myers:
"This contains fixes for some asserts
related to project quotas, a memory leak, a hang when disabling group or
project quotas before disabling user quotas, Dave's email address, several
fixes for the alignment of file allocation to stripe unit/width geometry, a
fix for an assertion with xfs_zero_remaining_bytes, and the behavior of
metadata writeback in the face of IO errors.
Details:
- fix memory leak in xfs_dir2_node_removename
- fix quota assertion in xfs_setattr_size
- fix quota assertions in xfs_qm_vop_create_dqattach
- fix for hang when disabling group and project quotas before
disabling user quotas
- fix Dave Chinner's email address in MAINTAINERS
- fix for file allocation alignment
- fix for assertion in xfs_buf_stale by removing xfsbdstrat
- fix for alignment with swalloc mount option
- fix for "retry forever" semantics on IO errors"
* tag 'xfs-for-linus-v3.13-rc5' of git://oss.sgi.com/xfs/xfs:
xfs: abort metadata writeback on permanent errors
xfs: swalloc doesn't align allocations properly
xfs: remove xfsbdstrat error
xfs: align initial file allocations correctly
MAINTAINERS: fix incorrect mail address of XFS maintainer
xfs: fix infinite loop by detaching the group/project hints from user dquot
xfs: fix assertion failure at xfs_setattr_nonsize
xfs: fix false assertion at xfs_qm_vop_create_dqattach
xfs: fix memory leak in xfs_dir2_node_removename
Olof Johansson [Fri, 20 Dec 2013 22:28:05 +0000 (14:28 -0800)]
mm: fix build of split ptlock code
Commit 597d795a2a78 ('mm: do not allocate page->ptl dynamically, if
spinlock_t fits to long') restructures some allocators that are compiled
even if USE_SPLIT_PTLOCKS arn't used. It results in compilation
failure:
mm/memory.c:4282:6: error: 'struct page' has no member named 'ptl'
mm/memory.c:4288:12: error: 'struct page' has no member named 'ptl'
Add in the missing ifdef.
Fixes: 597d795a2a78 ('mm: do not allocate page->ptl dynamically, if spinlock_t fits to long') Signed-off-by: Olof Johansson <olof@lixom.net> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 20 Dec 2013 21:50:42 +0000 (13:50 -0800)]
Merge tag 'arc-fixes-for-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fix from Vineet Gupta:
"Fix busted syscall table due to unistd header inclusion issue"
* tag 'arc-fixes-for-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: Allow conditional multiple inclusion of uapi/asm/unistd.h
Sachin Kamat [Fri, 20 Dec 2013 21:27:47 +0000 (06:27 +0900)]
ARM: exynos_defconfig: Enable S2MPS11 voltage regulator
S2MPS11 voltage regulator is commonly used on the latest Exynos
boards like SMDK5420, Arndale-Octa, etc. Hence it makes sense to
enable it like S5M8767A voltage regulator.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Luck, Tony [Wed, 18 Dec 2013 23:17:10 +0000 (15:17 -0800)]
pstore: Don't allow high traffic options on fragile devices
Some pstore backing devices use on board flash as persistent
storage. These have limited numbers of write cycles so it
is a poor idea to use them from high frequency operations.
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 20 Dec 2013 20:27:41 +0000 (12:27 -0800)]
Merge tag 'dmaengine-fixes-3.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine
Pull dmaengine fixes from Dan Williams:
- deprecation of net_dma to be removed in 3.14
- crash regression fix in pl330 from the dmaengine_unmap rework
- crash regression fix for any channel running raid ops without
CONFIG_ASYNC_TX_DMA from dmaengine_unmap
- memory leak regression in mv_xor from dmaengine_unmap
- build warning regressions in mv_xor, fsldma, ppc4xx, txx9, and
at_hdmac from dmaengine_unmap
- sleep in atomic regression in dma_async_memcpy_pg_to_pg
- new fix in mv_xor for handling channel initialization failures
* tag 'dmaengine-fixes-3.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine:
net_dma: mark broken
dma: pl330: ensure DMA descriptors are zero-initialised
dmaengine: fix sleep in atomic
dmaengine: mv_xor: fix oops when channels fail to initialise
dma: mv_xor: Use dmaengine_unmap_data for the self-tests
dmaengine: fix enable for high order unmap pools
dma: fix build warnings in txx9
dmatest: fix build warning on mips
dma: fix fsldma build warnings
dma: fix build warnings in ppc4xx
dmaengine: at_hdmac: remove unused function
dma: mv_xor: remove mv_desc_get_dest_addr()
mm: do not allocate page->ptl dynamically, if spinlock_t fits to long
In struct page we have enough space to fit long-size page->ptl there,
but we use dynamically-allocated page->ptl if size(spinlock_t) is larger
than sizeof(int).
It hurts 64-bit architectures with CONFIG_GENERIC_LOCKBREAK, where
sizeof(spinlock_t) == 8, but it easily fits into struct page.
Johannes Weiner [Fri, 20 Dec 2013 14:54:12 +0000 (14:54 +0000)]
mm: page_alloc: revert NUMA aspect of fair allocation policy
Commit 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy") meant
to bring aging fairness among zones in system, but it was overzealous
and badly regressed basic workloads on NUMA systems.
Due to the way kswapd and page allocator interacts, we still want to
make sure that all zones in any given node are used equally for all
allocations to maximize memory utilization and prevent thrashing on the
highest zone in the node.
While the same principle applies to NUMA nodes - memory utilization is
obviously improved by spreading allocations throughout all nodes -
remote references can be costly and so many workloads prefer locality
over memory utilization. The original change assumed that
zone_reclaim_mode would be a good enough predictor for that, but it
turned out to be as indicative as a coin flip.
Revert the NUMA aspect of the fairness until we can find a proper way to
make it configurable and agree on a sane default.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: <stable@kernel.org> # 3.12 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Fri, 20 Dec 2013 14:54:11 +0000 (14:54 +0000)]
Revert "mm: page_alloc: exclude unreclaimable allocations from zone fairness policy"
This reverts commit 73f038b863df. The NUMA behaviour of this patch is
less than ideal. An alternative approch is to interleave allocations
only within local zones which is implemented in the next patch.
mm: Fix NULL pointer dereference in madvise(MADV_WILLNEED) support
Sasha Levin found a NULL pointer dereference that is due to a missing
page table lock, which in turn is due to the pmd entry in question being
a transparent huge-table entry.
The code - introduced in commit 1998cc048901 ("mm: make
madvise(MADV_WILLNEED) support swap file prefetch") - correctly checks
for this situation using pmd_none_or_trans_huge_or_clear_bad(), but it
turns out that that function doesn't work correctly.
pmd_none_or_trans_huge_or_clear_bad() expected that pmd_bad() would
trigger if the transparent hugepage bit was set, but it doesn't do that
if pmd_numa() is also set. Note that the NUMA bit only gets set on real
NUMA machines, so people trying to reproduce this on most normal
development systems would never actually trigger this.
Fix it by removing the very subtle (and subtly incorrect) expectation,
and instead just checking pmd_trans_huge() explicitly.
Reported-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Andrea Arcangeli <aarcange@redhat.com>
[ Additionally remove the now stale test for pmd_trans_huge() inside the
pmd_bad() case - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This problem has been present since GPIOs were added to the r8a7790 SoC
by f98e10c88aa95bf7 ("ARM: shmobile: r8a7790: Add GPIO controller
devices to device tree") in v3.12-rc1.
This bug has been present since the renesas-intc-irqpin driver was
introduced by 443580486e3b9657 ("irqchip: Renesas INTC External IRQ pin
driver") in v3.10-rc1
* Lager board
- Do not build the phy fixup unless CONFIG_PHYLIB is enabled
Paolo Bonzini [Fri, 20 Dec 2013 18:13:58 +0000 (19:13 +0100)]
Merge tag 'signed-for-3.13' of git://github.com/agraf/linux-2.6 into kvm-master
Patch queue for 3.13 - 2013-12-18
This fixes some grave issues we've only found after 3.13-rc1:
- Make the modularized HV/PR book3s kvm work well as modules
- Fix some race conditions
- Fix compilation with certain compilers (booke)
- Fix THP for book3s_hv
- Fix preemption for book3s_pr
Alexander Graf (4):
KVM: PPC: Book3S: PR: Don't clobber our exit handler id
KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu
KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy
KVM: PPC: Book3S: PR: Enable interrupts earlier
Linus Torvalds [Fri, 20 Dec 2013 17:34:54 +0000 (09:34 -0800)]
Merge tag 'stable/for-linus-3.13-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen bugfixes from Konrad Rzeszutek Wilk:
- Fix balloon driver for auto-translate guests (PVHVM, ARM) to not use
scratch pages.
- Fix block API header for ARM32 and ARM64 to have proper layout
- On ARM when mapping guests, stick on PTE_SPECIAL
- When using SWIOTLB under ARM, don't call swiotlb functions twice
- When unmapping guests memory and if we fail, don't return pages which
failed to be unmapped.
- Grant driver was using the wrong address on ARM.
* tag 'stable/for-linus-3.13-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/balloon: Seperate the auto-translate logic properly (v2)
xen/block: Correctly define structures in public headers on ARM32 and ARM64
arm: xen: foreign mapping PTEs are special.
xen/arm64: do not call the swiotlb functions twice
xen: privcmd: do not return pages which we have failed to unmap
XEN: Grant table address, xen_hvm_resume_frames, is a phys_addr not a pfn
Linus Torvalds [Fri, 20 Dec 2013 17:32:30 +0000 (09:32 -0800)]
Merge tag 'trace-fixes-v3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull ftrace fix from Steven Rostedt:
"This fixes a long standing bug in the ftrace profiler. The problem is
that the profiler only initializes the online CPUs, and not possible
CPUs. This causes issues if the user takes CPUs online or offline
while the profiler is running.
If we online a CPU after starting the profiler, we lose all the trace
information on the CPU going online.
If we offline a CPU after running a test and start a new test, it will
not clear the old data from that CPU.
This bug causes incorrect data to be reported to the user if they
online or offline CPUs during the profiling"
* tag 'trace-fixes-v3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Initialize the ftrace profiler for each possible cpu
Kevin Hilman [Fri, 20 Dec 2013 16:30:50 +0000 (08:30 -0800)]
Merge tag 'omap-for-v3.13/display-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
I accidentally removed some mux code for omap4 that I thought was
dead code as omap4 has been booting with device tree only since
v3.10. Turns out I also removed some display related mux code,
so let's revert that except for the dead code parts.
* tag 'omap-for-v3.13/display-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: (439 commits)
Revert "ARM: OMAP2+: Remove legacy mux code for display.c"
+Linux 3.13-rc4
Kevin Hilman [Thu, 19 Dec 2013 23:12:23 +0000 (15:12 -0800)]
Merge tag 'keystone/maintainer-file' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone into fixes
From Santosh Shilimkar:
Couple of updates to MAINTAINERS file for Keystone
- Add git tree information
- Add clock drivers entry
* tag 'keystone/maintainer-file' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone:
MAINTAINERS: Add keystone clock drivers
MAINTAINERS: Add keystone git tree information
qla2xxx: Fix scsi_host leak on qlt_lport_register callback failure
This patch fixes a possible scsi_host reference leak in qlt_lport_register(),
when a non zero return from the passed (*callback) does not call drop the
local reference via scsi_host_put() before returning.
This currently does not effect existing tcm_qla2xxx code as the passed callback
will never fail, but fix this up regardless for future code.
Cc: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Will Deacon [Tue, 17 Dec 2013 17:09:08 +0000 (17:09 +0000)]
arm64: ptrace: avoid using HW_BREAKPOINT_EMPTY for disabled events
Commit 8f34a1da35ae ("arm64: ptrace: use HW_BREAKPOINT_EMPTY type for
disabled breakpoints") fixed an issue with GDB trying to zero breakpoint
control registers. The problem there is that the arch hw_breakpoint code
will attempt to create a (disabled), execute breakpoint of length 0.
This will fail validation and report unexpected failure to GDB. To avoid
this, we treated disabled breakpoints as HW_BREAKPOINT_EMPTY, but that
seems to have broken with recent kernels, causing watchpoints to be
treated as TYPE_INST in the core code and returning ENOSPC for any
further breakpoints.
This patch fixes the problem by prioritising the `enable' field of the
breakpoint: if it is cleared, we simply update the perf_event_attr to
indicate that the thing is disabled and don't bother changing either the
type or the length. This reinforces the behaviour that the breakpoint
control register is essentially read-only apart from the enable bit
when disabling a breakpoint.
Cc: <stable@vger.kernel.org> Reported-by: Aaron Liu <liucy214@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Linus Torvalds [Thu, 19 Dec 2013 17:10:46 +0000 (09:10 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"An ABI documentation fix, and a mixed-PMU perf-info-corruption fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Document the new transaction sample type
perf: Disable all pmus on unthrottling and rescheduling
Linus Torvalds [Thu, 19 Dec 2013 17:10:10 +0000 (09:10 -0800)]
Merge tag 'sound-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"We have a bit more changes than usual in ASoC here, as it was slipped
from the previous update. There are one minr ASoC PCM code fix and
ASoC dmaengine fix, in addition of a collection of small ASoC driver
fixes. The rest are a couple of HD-audio stable fixups, and a
long-standing fix for the paused stream handling.
So, all commits look not scary (and hopefully won't give you
disastrous holiday season)"
* tag 'sound-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Add Dell headset detection quirk for one more laptop model
ASoC: wm8904: fix DSP mode B configuration
ASoC: wm_adsp: Add small delay while polling DSP RAM start
ALSA: Add SNDRV_PCM_STATE_PAUSED case in wait_for_avail function
ASoC: kirkwood: Fix the CPU DAI rates
ASoC: wm5110: Correct HPOUT3 DAPM route typo
ALSA: hda - Add Dell headset detection quirk for three laptop models
ALSA: hda - Add enable_msi=0 workaround for four HP machines
ASoC: don't leak on error in snd_dmaengine_pcm_register
ASoC: fsl: imx-wm8962: Don't update bias_level in machine driver
ASoC: tegra: fix uninitialized variables in set_fmt
ASoC: wm8962: Enable SYSCLK provisonally before fetching generated DSPCLK_DIV
ASoC: sam9x5_wm8731: change to work in DSP A mode
ASoC: atmel_ssc_dai: add dai trigger ops
ASoC: soc-pcm: Use valid condition for snd_soc_dai_digital_mute() in hw_free()
Vineet Gupta [Thu, 19 Dec 2013 13:25:58 +0000 (18:55 +0530)]
ARC: Allow conditional multiple inclusion of uapi/asm/unistd.h
Commit 97bc386fc12d "ARC: Add guard macro to uapi/asm/unistd.h"
inhibited multiple inclusion of ARCH unistd.h. This however hosed the system
since Generic syscall table generator relies on it being included twice,
and in lack-of an empty table was emitted by C preprocessor.
Fix that by allowing one exception to rule for the special case (just
like Xtensa)
Suggested-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Laurent Pinchart [Wed, 11 Dec 2013 14:13:57 +0000 (15:13 +0100)]
ARM: shmobile: Remove non-multiplatform Koelsch reference support
Now that r8a7791 has CCF support remove the legacy Koelsch reference
Kconfig bits CONFIG_MACH_KOELSCH_REFERENCE for the non-multiplatform
case.
Starting from this commit Koelsch board support is always enabled via
CONFIG_MACH_KOELSCH, and CONFIG_ARCH_MULTIPLATFORM is used to select
between board-koelsch.c and board-koelsch-reference.c
The file board-koelsch-reference.c can no longer be used together with
the legacy sh-clk clock framework, instead CCF is used.
Laurent Pinchart [Wed, 11 Dec 2013 14:13:56 +0000 (15:13 +0100)]
ARM: shmobile: Remove non-multiplatform Lager reference support
Now that r8a7790 has CCF support remove the legacy Lager reference
Kconfig bits CONFIG_MACH_LAGER_REFERENCE for the non-multiplatform
case.
Starting from this commit Lager board support is always enabled via
CONFIG_MACH_LAGER, and CONFIG_ARCH_MULTIPLATFORM is used to select
between board-lager.c and board-lager-reference.c
The file board-lager-reference.c can no longer be used together with
the legacy sh-clk clock framework, instead CCF is used.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Laurent Pinchart [Wed, 11 Dec 2013 14:13:55 +0000 (15:13 +0100)]
ARM: shmobile: koelsch-reference: Instantiate clkdevs for SCIF and CMT
Now that the common clock framework is supported, the clock lookup
entries in clock-r8a7791.c are not registered anymore. Devices must
instead reference their clocks in the device tree. However, SCIF and CMT
devices are still instantiated through platform code, and thus need a
clock lookup entry.
Retrieve the SCIF and CMT clock entries by name and register clkdevs for
the corresponding devices. This will be removed when the SCIF and CMT
devices will be instantiated from the device tree.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Laurent Pinchart [Wed, 11 Dec 2013 14:13:54 +0000 (15:13 +0100)]
ARM: shmobile: lager-reference: Instantiate clkdevs for SCIF and CMT
Now that the common clock framework is supported, the clock lookup
entries in clock-r8a7790.c are not registered anymore. Devices must
instead reference their clocks in the device tree. However, SCIF and CMT
devices are still instantiated through platform code, and thus need a
clock lookup entry.
Retrieve the SCIF and CMT clock entries by name and register clkdevs for
the corresponding devices. This will be removed when the SCIF and CMT
devices will be instantiated from the device tree.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Laurent Pinchart [Wed, 11 Dec 2013 14:13:52 +0000 (15:13 +0100)]
ARM: shmobile: lager-reference: Enable multiplaform kernel support
Enable multiplaform ARM architecture support for the Lager reference
board. Common clock framework initialization will be handled by the
rcar_gen2_init_timer() call, we just need to remove the legacy clock
code initialization.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Laurent Pinchart [Wed, 11 Dec 2013 02:48:16 +0000 (03:48 +0100)]
ARM: shmobile: armadillo: Set backlight enable GPIO
The Armadillo 800 EVA panel module has a backlight enable signal
connected to GPIO 61. Instead of requesting the GPIO in board code and
setting it to a high level unconditionally, pass the GPIO number to the
PWM backlight driver as the backlight enable GPIO.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Simon Horman [Mon, 16 Dec 2013 00:34:09 +0000 (09:34 +0900)]
ARM: shmobile: koelsch: Do not disable CONFIG_{INOTIFY_USER,UNIX} in defconfig
CONFIG_INOTIFY_USER and CONFIG_UNIX are required for udev to function.
This change brings the koelsch defconfig into line with
other shmobile defconfigs.
Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
The koelsch board uses has an SH ethernet controller which uses a Micrel
phy. Select MICREL_PHY for koelsch if SH_ETH is enabled to make use of the
Micrel-specific phy driver rather than relying on the generic phy driver.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
target/file: Update hw_max_sectors based on current block_size
This patch allows FILEIO to update hw_max_sectors based on the current
max_bytes_per_io. This is required because vfs_[writev,readv]() can accept
a maximum of 2048 iovecs per call, so the enforced hw_max_sectors really
needs to be calculated based on block_size.
This addresses a >= v3.5 bug where block_size=512 was rejecting > 1M
sized I/O requests, because FD_MAX_SECTORS was hardcoded to 2048 for
the block_size=4096 case.
(v2: Use max_bytes_per_io instead of ->update_hw_max_sectors)
Reported-by: Henrik Goldman <hg@x-formation.com> Cc: <stable@vger.kernel.org> #3.5+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
iser-target: Move INIT_WORK setup into isert_create_device_ib_res
This patch moves INIT_WORK setup for cq_desc->cq_[rx,tx]_work into
isert_create_device_ib_res(), instead of being done each callback
invocation in isert_cq_[rx,tx]_callback().
This also fixes a 'INFO: trying to register non-static key' warning
when cancel_work_sync() is called before INIT_WORK has setup the
struct work_struct.
Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> #3.12+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
When shutting down a target there is a race condition between
iscsit_del_np() and __iscsi_target_login_thread().
The latter sets the thread pointer to NULL, and the former
tries to issue kthread_stop() on that pointer without any
synchronization.
This patch moves the np->np_thread NULL assignment into
iscsit_del_np(), after kthread_stop() has completed. It also
removes the signal_pending() + np_state check, and only
exits when kthread_should_stop() is true.
Reported-by: Hannes Reinecke <hare@suse.de> Cc: <stable@vger.kernel.org> #3.12+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Laurent Pinchart [Wed, 11 Dec 2013 14:13:51 +0000 (15:13 +0100)]
ARM: shmobile: rcar-gen2: Initialize CCF before clock sources
When CONFIG_COMMON_CLOCK is enabled, call rcar_gen2_clocks_init() in the
timer init function to initialize the common clock framework before
initializing the clock sources. This will take care of clock
initialization when the r8a779[01] boards will be switched to
multiplatform kernels.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Linus Torvalds [Thu, 19 Dec 2013 03:05:00 +0000 (19:05 -0800)]
Merge branch 'akpm' (incoming from Andrew)
Merge patches from Andrew Morton:
"23 fixes and a MAINTAINERS update"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (24 commits)
mm/hugetlb: check for pte NULL pointer in __page_check_address()
fix build with make 3.80
mm/mempolicy: fix !vma in new_vma_page()
MAINTAINERS: add Davidlohr as GPT maintainer
mm/memory-failure.c: recheck PageHuge() after hugetlb page migrate successfully
mm/compaction: respect ignore_skip_hint in update_pageblock_skip
mm/mempolicy: correct putback method for isolate pages if failed
mm: add missing dependency in Kconfig
sh: always link in helper functions extracted from libgcc
mm: page_alloc: exclude unreclaimable allocations from zone fairness policy
mm: numa: defer TLB flush for THP migration as long as possible
mm: numa: guarantee that tlb_flush_pending updates are visible before page table updates
mm: fix TLB flush race between migration, and change_protection_range
mm: numa: avoid unnecessary disruption of NUMA hinting during migration
mm: numa: clear numa hinting information on mprotect
sched: numa: skip inaccessible VMAs
mm: numa: avoid unnecessary work on the failure path
mm: numa: ensure anon_vma is locked to prevent parallel THP splits
mm: numa: do not clear PTE for pte_numa update
mm: numa: do not clear PMD during PTE update scan
...
Jan Beulich [Thu, 19 Dec 2013 01:08:57 +0000 (17:08 -0800)]
fix build with make 3.80
According to Documentation/Changes, make 3.80 is still being supported
for building the kernel, hence make files must not make (unconditional)
use of features introduced only in newer versions.
Commit 1bf49dd4be0b ("./Makefile: export initial ramdisk compression
config option") however introduced "else ifeq" constructs which make
3.80 doesn't understand. Replace the logic there with more conventional
(in the kernel build infrastructure) list constructs (except that the
list here is intentionally limited to exactly one element).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: P J P <ppandit@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wanpeng Li [Thu, 19 Dec 2013 01:08:56 +0000 (17:08 -0800)]
mm/mempolicy: fix !vma in new_vma_page()
BUG_ON(!vma) assumption is introduced by commit 0bf598d863e3 ("mbind:
add BUG_ON(!vma) in new_vma_page()"), however, even if
address = __vma_address(page, vma);
and
vma->start < address < vma->end
page_address_in_vma() may still return -EFAULT because of many other
conditions in it. As a result the while loop in new_vma_page() may end
with vma=NULL.
This patch revert the commit and also fix the potential dereference NULL
pointer reported by Dan.
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Reported-by: Dave Jones <davej@redhat.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reviewed-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jianguo Wu [Thu, 19 Dec 2013 01:08:54 +0000 (17:08 -0800)]
mm/memory-failure.c: recheck PageHuge() after hugetlb page migrate successfully
After a successful hugetlb page migration by soft offline, the source
page will either be freed into hugepage_freelists or buddy(over-commit
page). If page is in buddy, page_hstate(page) will be NULL. It will
hit a NULL pointer dereference in dequeue_hwpoisoned_huge_page().
Joonsoo Kim [Thu, 19 Dec 2013 01:08:52 +0000 (17:08 -0800)]
mm/compaction: respect ignore_skip_hint in update_pageblock_skip
update_pageblock_skip() only fits to compaction which tries to isolate
by pageblock unit. If isolate_migratepages_range() is called by CMA, it
try to isolate regardless of pageblock unit and it don't reference
get_pageblock_skip() by ignore_skip_hint. We should also respect it on
update_pageblock_skip() to prevent from setting the wrong information.
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Rafael Aquini <aquini@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: <stable@vger.kernel.org> [3.7+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sh: always link in helper functions extracted from libgcc
E.g. landisk_defconfig, which has CONFIG_NTFS_FS=m:
ERROR: "__ashrdi3" [fs/ntfs/ntfs.ko] undefined!
For "lib-y", if no symbols in a compilation unit are referenced by other
units, the compilation unit will not be included in vmlinux. This
breaks modules that do reference those symbols.
This doesn't fix all cases. There are others, e.g. udivsi3.
This is also not limited to sh, many architectures handle this in the
same way.
A simple solution is to unconditionally include all helper functions.
A more complex solution is to make the choice of "lib-y" or "obj-y" depend
on CONFIG_MODULES:
Johannes Weiner [Thu, 19 Dec 2013 01:08:47 +0000 (17:08 -0800)]
mm: page_alloc: exclude unreclaimable allocations from zone fairness policy
Dave Hansen noted a regression in a microbenchmark that loops around
open() and close() on an 8-node NUMA machine and bisected it down to
commit 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy").
That change forces the slab allocations of the file descriptor to spread
out to all 8 nodes, causing remote references in the page allocator and
slab.
The round-robin policy is only there to provide fairness among memory
allocations that are reclaimed involuntarily based on pressure in each
zone. It does not make sense to apply it to unreclaimable kernel
allocations that are freed manually, in this case instantly after the
allocation, and incur the remote reference costs twice for no reason.
Only round-robin allocations that are usually freed through page reclaim
or slab shrinking.
Bisected by Dave Hansen.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:45 +0000 (17:08 -0800)]
mm: numa: guarantee that tlb_flush_pending updates are visible before page table updates
According to documentation on barriers, stores issued before a LOCK can
complete after the lock implying that it's possible tlb_flush_pending
can be visible after a page table update. As per revised documentation,
this patch adds a smp_mb__before_spinlock to guarantee the correct
ordering.
Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rik van Riel [Thu, 19 Dec 2013 01:08:44 +0000 (17:08 -0800)]
mm: fix TLB flush race between migration, and change_protection_range
There are a few subtle races, between change_protection_range (used by
mprotect and change_prot_numa) on one side, and NUMA page migration and
compaction on the other side.
The basic race is that there is a time window between when the PTE gets
made non-present (PROT_NONE or NUMA), and the TLB is flushed.
During that time, a CPU may continue writing to the page.
This is fine most of the time, however compaction or the NUMA migration
code may come in, and migrate the page away.
When that happens, the CPU may continue writing, through the cached
translation, to what is no longer the current memory location of the
process.
This only affects x86, which has a somewhat optimistic pte_accessible.
All other architectures appear to be safe, and will either always flush,
or flush whenever there is a valid mapping, even with no permissions
(SPARC).
The basic race looks like this:
CPU A CPU B CPU C
load TLB entry
make entry PTE/PMD_NUMA
fault on entry
read/write old page
start migrating page
change PTE/PMD to new page
read/write old page [*]
flush TLB
reload TLB from new entry
read/write new page
lose data
[*] the old page may belong to a new user at this point!
The obvious fix is to flush remote TLB entries, by making sure that
pte_accessible aware of the fact that PROT_NONE and PROT_NUMA memory may
still be accessible if there is a TLB flush pending for the mm.
This should fix both NUMA migration and compaction.
[mgorman@suse.de: fix build] Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Alex Thorlton <athorlton@sgi.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:42 +0000 (17:08 -0800)]
mm: numa: avoid unnecessary disruption of NUMA hinting during migration
do_huge_pmd_numa_page() handles the case where there is parallel THP
migration. However, by the time it is checked the NUMA hinting
information has already been disrupted. This patch adds an earlier
check with some helpers.
Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:41 +0000 (17:08 -0800)]
mm: numa: clear numa hinting information on mprotect
On a protection change it is no longer clear if the page should be still
accessible. This patch clears the NUMA hinting fault bits on a
protection change.
Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Thu, 19 Dec 2013 01:08:38 +0000 (17:08 -0800)]
mm: numa: ensure anon_vma is locked to prevent parallel THP splits
The anon_vma lock prevents parallel THP splits and any associated
complexity that arises when handling splits during THP migration. This
patch checks if the lock was successfully acquired and bails from THP
migration if it failed for any reason.
Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>