Linus Torvalds [Sat, 28 Mar 2015 17:58:53 +0000 (10:58 -0700)]
Merge branch 'parisc-4.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parsic fixes from Helge Deller:
"One patch from Mikulas fixes a bug on parisc by artifically
incrementing the counter in pmd_free when the kernel tries to free
the preallocated pmd.
Other than that we now prevent that syscalls gets added without
incrementing __NR_Linux_syscalls and fix the initial pmd setup code
if a default page size greater than 4k has been selected"
* 'parisc-4.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix pmd code to depend on PT_NLEVELS value, not on CONFIG_64BIT
parisc: mm: don't count preallocated pmds
parisc: Add compile-time check when adding new syscalls
Linus Torvalds [Sat, 28 Mar 2015 17:47:27 +0000 (09:47 -0800)]
Merge tag 'arc-4.0-fixes-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"We found some issues with signal handling taking down the system. I
know its late, but these are important and all marked for stable.
ARC signal handling related fixes uncovered during recent testing of
NPTL tools"
* tag 'arc-4.0-fixes-part-2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: signal handling robustify
ARC: SA_SIGINFO ucontext regs off-by-one
Linus Torvalds [Fri, 27 Mar 2015 21:38:02 +0000 (14:38 -0700)]
Merge tag 'sound-4.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Three trivial oneliner fixes for HD-audio.
Two are device-specific quirks while one is a generic fix for recent
Realtek codecs"
* tag 'sound-4.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Add one more node in the EAPD supporting candidate list
ALSA: hda_intel: apply the Seperate stream_tag for Sunrise Point
ALSA: hda - Add dock support for Thinkpad T450s (17aa:5036)
James Hogan [Fri, 20 Feb 2015 23:45:45 +0000 (23:45 +0000)]
watchdog: imgpdc: Fix default heartbeat
The IMG PDC watchdog driver heartbeat module parameter has no default so
it is initialised to zero. This results in the following warning during
probe:
imgpdc-wdt 2006000.wdt: Initial timeout out of range! setting max timeout
The module parameter description implies that the default value should
be PDC_WDT_DEF_TIMEOUT, which isn't yet used, so initialise it to that.
Also tweak the heartbeat module parameter description for consistency.
The IMG PDC watchdog probe function calls pdc_wdt_stop() prior to
watchdog_set_drvdata(), causing a NULL pointer dereference when
pdc_wdt_stop() retrieves the struct pdc_wdt_dev pointer using
watchdog_get_drvdata() and reads the register base address through it.
Fix by moving the watchdog_set_drvdata() call earlier, to where various
other pdc_wdt->wdt_dev fields are initialised.
Linus Torvalds [Thu, 26 Mar 2015 22:04:05 +0000 (15:04 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm refcounting fixes from Dave Airlie:
"Here is the complete set of i915 bug/warn/refcounting fixes"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/i915: Fixup legacy plane->crtc link for initial fb config
drm/i915: Fix atomic state when reusing the firmware fb
drm/i915: Keep ring->active_list and ring->requests_list consistent
drm/i915: Don't try to reference the fb in get_initial_plane_config()
drm: Fixup racy refcounting in plane_force_disable
Linus Torvalds [Thu, 26 Mar 2015 21:53:47 +0000 (14:53 -0700)]
Merge tag 'dm-4.0-fix-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fix from Mike Snitzer:
"Fix DM core device cleanup regression -- due to a latent race that was
exposed by the bdi changes that were introduced during the 4.0 merge"
* tag 'dm-4.0-fix-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: fix add_disk() NULL pointer due to race with free_dev()
Linus Torvalds [Thu, 26 Mar 2015 21:43:42 +0000 (14:43 -0700)]
Merge tag 'linux-kselftest-4.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fix from Shuah Khan.
* tag 'linux-kselftest-4.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: Fix build failures when invoked from kselftest target
Dave Airlie [Thu, 26 Mar 2015 21:39:45 +0000 (07:39 +1000)]
Merge tag 'drm-intel-fixes-2015-03-26' of git://anongit.freedesktop.org/drm-intel into drm-fixes
This should cover the final warnings in -rc5 with two more backports
from our development branch (drm-intel-next-queued). They're the ones
from Daniel and Damien, with references to the reports.
This is on top of drm-fixes because of the dependency on the two earlier
fixes not yet in Linus' tree.
There's an additional regression fix from Chris.
* tag 'drm-intel-fixes-2015-03-26' of git://anongit.freedesktop.org/drm-intel:
drm/i915: Fixup legacy plane->crtc link for initial fb config
drm/i915: Fix atomic state when reusing the firmware fb
drm/i915: Keep ring->active_list and ring->requests_list consistent
Linus Torvalds [Thu, 26 Mar 2015 21:11:17 +0000 (14:11 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"A couple of bug fixes for s390.
The ftrace comile fix is quite large for a -rc6 release, but it would
be nice to have it in 4.0"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/smp: reenable smt after resume
s390/mm: limit STACK_RND_MASK for compat tasks
s390/ftrace: fix compile error if CONFIG_KPROBES is disabled
s390/cpum_sf: add diagnostic sampling event only if it is authorized
drm/i915: Fix modeset state confusion in the load detect code
But this time around it was the initial fb code that forgot to update
the plane->crtc pointer. Otherwise it's the exact same bug, with the
exact same restrains (any set_config call/ioctl that doesn't disable
the pipe papers over the bug for free, so fairly hard to hit in normal
testing). So if you want the full explanation just go read that one
over there - it's rather long ...
Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Josh Boyer <jwboyer@fedoraproject.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Reported-and-tested-by: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[Jani: backported to drm-intel-fixes for v4.0-rc]
Reference: http://mid.gmane.org/CA+5PVA7ChbtJrknqws1qvZcbrg1CW2pQAFkSMURWWgyASRyGXg@mail.gmail.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Hui Wang [Thu, 26 Mar 2015 09:14:55 +0000 (17:14 +0800)]
ALSA: hda - Add one more node in the EAPD supporting candidate list
We have a HP machine which use the codec node 0x17 connecting the
internal speaker, and from the node capability, we saw the EAPD,
if we don't set the EAPD on for this node, the internal speaker
can't output any sound.
Chris Wilson [Wed, 18 Mar 2015 18:19:22 +0000 (18:19 +0000)]
drm/i915: Keep ring->active_list and ring->requests_list consistent
If we retire requests last, we may use a later seqno and so clear
the requests lists without clearing the active list, leading to
confusion. Hence we should retire requests first for consistency with
the early return. The order used to be important as the lifecycle for
the object on the active list was determined by request->seqno. However,
the requests themselves are now reference counted removing the
constraint from the order of retirement.
drm/i915: Convert 'i915_seqno_passed' calls into 'i915_gem_request_completed
'
and a
WARNING: CPU: 0 PID: 1383 at drivers/gpu/drm/i915/i915_gem_evict.c:279 i915_gem_evict_vm+0x10c/0x140()
WARN_ON(!list_empty(&vm->active_list))
Identified by updating WATCH_LISTS:
[drm:i915_verify_lists] *ERROR* blitter ring: active list not empty, but no requests
WARNING: CPU: 0 PID: 681 at drivers/gpu/drm/i915/i915_gem.c:2751 i915_gem_retire_requests_ring+0x149/0x230()
WARN_ON(i915_verify_lists(ring->dev))
Note that this is only a problem in evict_vm where the following happens
after a retire_request has cleaned out all requests, but not all active
bo:
- intel_ring_idle called from i915_gpu_idle notices that no requests are
outstanding and immediately returns.
- i915_gem_retire_requests_ring called from i915_gem_retire_requests also
immediately returns when there's no request, still leaving the bo on the
active list.
- evict_vm hits the WARN_ON(!list_empty(&vm->active_list)) after evicting
all active objects that there's still stuff left that shouldn't be
there.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Libin Yang [Thu, 26 Mar 2015 05:28:39 +0000 (13:28 +0800)]
ALSA: hda_intel: apply the Seperate stream_tag for Sunrise Point
The total stream number of Sunrise Point's input and output stream
exceeds 15, which will cause some streams do not work because
of the overflow on SDxCTL.STRM field if using the legacy
stream tag allocation method.
This patch uses the new stream tag allocation method by add
the flag AZX_DCAPS_SEPARATE_STREAM_TAG for Skylake platform.
Signed-off-by: Libin Yang <libin.yang@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Vineet Gupta [Thu, 26 Mar 2015 05:44:41 +0000 (11:14 +0530)]
ARC: signal handling robustify
A malicious signal handler / restorer can DOS the system by fudging the
user regs saved on stack, causing weird things such as sigreturn returning
to user mode PC but cpu state still being kernel mode....
Ensure that in sigreturn path status32 always has U bit; any other bogosity
(gargbage PC etc) will be taken care of by normal user mode exceptions mechanisms.
Vineet Gupta [Thu, 26 Mar 2015 03:55:44 +0000 (09:25 +0530)]
ARC: SA_SIGINFO ucontext regs off-by-one
The regfile provided to SA_SIGINFO signal handler as ucontext was off by
one due to pt_regs gutter cleanups in 2013.
Before handling signal, user pt_regs are copied onto user_regs_struct and copied
back later. Both structs are binary compatible. This was all fine until
commit 2fa919045b72 (ARC: pt_regs update #2) which removed the empty stack slot
at top of pt_regs (corresponding to first pad) and made the corresponding
fixup in struct user_regs_struct (the pad in there was moved out of
@scratch - not removed altogether as it is part of ptrace ABI)
struct user_regs_struct {
+ long pad;
struct {
- long pad;
long bta, lp_start, lp_end,....
} scratch;
...
}
This meant that now user_regs_struct was off by 1 reg w.r.t pt_regs and
signal code needs to user_regs_struct.scratch to reflect it as pt_regs,
which is what this commit does.
This problem was hidden for 2 years, because both save/restore, despite
using wrong location, were using the same location. Only an interim
inspection (reproducer below) exposed the issue.
Linus Torvalds [Wed, 25 Mar 2015 23:21:17 +0000 (16:21 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"15 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: numa: mark huge PTEs young when clearing NUMA hinting faults
mm: numa: slow PTE scan rate if migration failures occur
mm: numa: preserve PTE write permissions across a NUMA hinting fault
mm: numa: group related processes based on VMA flags instead of page table flags
hfsplus: fix B-tree corruption after insertion at position 0
MAINTAINERS: add Jan as DMI/SMBIOS support maintainer
fs/affs/file.c: unlock/release page on error
mm/page_alloc.c: call kernel_map_pages in unset_migrateype_isolate
mm/slub: fix lockups on PREEMPT && !SMP kernels
mm/memory hotplug: postpone the reset of obsolete pgdat
MAINTAINERS: correct rtc armada38x pattern entry
mm/pagewalk.c: prevent positive return value of walk_page_test() from being passed to callers
mm: fix anon_vma->degree underflow in anon_vma endless growing prevention
drivers/rtc/rtc-mrst: fix suspend/resume
aoe: update aoe maintainer information
Mel Gorman [Wed, 25 Mar 2015 22:55:45 +0000 (15:55 -0700)]
mm: numa: mark huge PTEs young when clearing NUMA hinting faults
Base PTEs are marked young when the NUMA hinting information is cleared
but the same does not happen for huge pages which this patch addresses.
Note that migrated pages are not marked young as the base page migration
code does not assume that migrated pages have been referenced. This
could be addressed but beyond the scope of this series which is aimed at
Dave Chinners shrink workload that is unlikely to be affected by this
issue.
Mel Gorman [Wed, 25 Mar 2015 22:55:42 +0000 (15:55 -0700)]
mm: numa: slow PTE scan rate if migration failures occur
Dave Chinner reported the following on https://lkml.org/lkml/2015/3/1/226
Across the board the 4.0-rc1 numbers are much slower, and the degradation
is far worse when using the large memory footprint configs. Perf points
straight at the cause - this is from 4.0-rc1 on the "-o bhash=101073" config:
This is showing excessive migration activity even though excessive
migrations are meant to get throttled. Normally, the scan rate is tuned
on a per-task basis depending on the locality of faults. However, if
migrations fail for any reason then the PTE scanner may scan faster if
the faults continue to be remote. This means there is higher system CPU
overhead and fault trapping at exactly the time we know that migrations
cannot happen. This patch tracks when migration failures occur and
slows the PTE scanner.
Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: Dave Chinner <david@fromorbit.com> Tested-by: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Wed, 25 Mar 2015 22:55:40 +0000 (15:55 -0700)]
mm: numa: preserve PTE write permissions across a NUMA hinting fault
Protecting a PTE to trap a NUMA hinting fault clears the writable bit
and further faults are needed after trapping a NUMA hinting fault to set
the writable bit again. This patch preserves the writable bit when
trapping NUMA hinting faults. The impact is obvious from the number of
minor faults trapped during the basis balancing benchmark and the system
CPU usage;
autonumabench
4.0.0-rc4 4.0.0-rc4
baseline preserve
Time System-NUMA01 107.13 ( 0.00%) 103.13 ( 3.73%)
Time System-NUMA01_THEADLOCAL 131.87 ( 0.00%) 83.30 ( 36.83%)
Time System-NUMA02 8.95 ( 0.00%) 10.72 (-19.78%)
Time System-NUMA02_SMT 4.57 ( 0.00%) 3.99 ( 12.69%)
Time Elapsed-NUMA01 515.78 ( 0.00%) 517.26 ( -0.29%)
Time Elapsed-NUMA01_THEADLOCAL 384.10 ( 0.00%) 384.31 ( -0.05%)
Time Elapsed-NUMA02 48.86 ( 0.00%) 48.78 ( 0.16%)
Time Elapsed-NUMA02_SMT 47.98 ( 0.00%) 48.12 ( -0.29%)
4.0.0-rc4 4.0.0-rc4
baseline preserve
User 44383.95 43971.89
System 252.61 201.24
Elapsed 998.68 1000.94
The patch looks hacky but the alternatives looked worse. The tidest was
to rewalk the page tables after a hinting fault but it was more complex
than this approach and the performance was worse. It's not generally
safe to just mark the page writable during the fault if it's a write
fault as it may have been read-only for COW so that approach was
discarded.
Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: Dave Chinner <david@fromorbit.com> Tested-by: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Wed, 25 Mar 2015 22:55:37 +0000 (15:55 -0700)]
mm: numa: group related processes based on VMA flags instead of page table flags
These are three follow-on patches based on the xfsrepair workload Dave
Chinner reported was problematic in 4.0-rc1 due to changes in page table
management -- https://lkml.org/lkml/2015/3/1/226.
Much of the problem was reduced by commit 53da3bc2ba9e ("mm: fix up numa
read-only thread grouping logic") and commit ba68bc0115eb ("mm: thp:
Return the correct value for change_huge_pmd"). It was known that the
performance in 3.19 was still better even if is far less safe. This
series aims to restore the performance without compromising on safety.
For the test of this mail, I'm comparing 3.19 against 4.0-rc4 and the
three patches applied on top
Overall the system CPU usage is comparable and the test is naturally a
bit variable. The slowing of the scanner hurts numa01 but on this
machine it is an adverse workload and patches that dramatically help it
often hurt absolutely everything else.
Due to patch 2, the fault activity is interesting
3.19.0 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4
vanilla vanillavmwrite-v5r8preserve-v5r8slowscan-v5r8
Minor Faults 20978112656646259724919812301636841
Major Faults 362 450 365 364 365
Note the impact preserving the write bit across protection updates and
fault reduces faults.
Here the impact of slowing the PTE scanner on migratrion failures is
obvious as "NUMA base PTE updates" and "NUMA huge PMD updates" are
massively reduced even though the headline performance is very similar.
As xfsrepair was the reported workload here is the impact of the series
on it.
The really relevant lines as syst-xfsrepair which is the system CPU
usage when running xfsrepair. Note that on my machine the overhead was
45% higher on 4.0-rc4 which may be part of what Dave is seeing. Once we
preserve the write bit across faults, it's only 2.51% higher on average.
With the full series applied, system CPU usage is 24.6% lower on
average.
Again, the impact of preserving the write bit on minor faults is obvious
and the impact of slowing scanning after migration failures is obvious
on the PTE updates. Note also that the number of pages migrated is much
reduced even though the headline performance is comparable.
3.19.0 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4 4.0.0-rc4
vanilla vanillavmwrite-v5r8preserve-v5r8slowscan-v5r8
Mean sdb-avgqusz 13.47 2.54 2.55 2.47 2.49
Mean sdb-avgrqsz 202.32 140.22 139.50 139.02 138.12
Mean sdb-await 25.92 5.09 5.33 5.02 5.22
Mean sdb-r_await 4.71 0.19 0.83 0.51 0.11
Mean sdb-w_await 104.13 5.21 5.38 5.05 5.32
Mean sdb-svctm 0.59 0.13 0.14 0.13 0.14
Mean sdb-rrqm 0.16 0.00 0.00 0.00 0.00
Mean sdb-wrqm 3.59 1799.43 1826.84 1812.21 1785.67
Max sdb-avgqusz 111.06 12.13 14.05 11.66 15.60
Max sdb-avgrqsz 255.60 190.34 190.01 187.33 191.78
Max sdb-await 168.24 39.28 49.22 44.64 65.62
Max sdb-r_await 660.00 52.00 280.00 76.00 12.00
Max sdb-w_await 7804.00 39.28 49.22 44.64 65.62
Max sdb-svctm 4.00 2.82 2.86 1.98 2.84
Max sdb-rrqm 8.30 0.00 0.00 0.00 0.00
Max sdb-wrqm 34.20 5372.80 5278.60 5386.60 5546.15
FWIW, I also checked SPECjbb in different configurations but it's
similar observations -- minor faults lower, PTE update activity lower
and performance is roughly comparable against 3.19.
This patch (of 3):
Threads that share writable data within pages are grouped together as
related tasks. This decision is based on whether the PTE is marked
dirty which is subject to timing races between the PTE scanner update
and when the application writes the page. If the page is file-backed,
then background flushes and sync also affect placement. This is
unpredictable behaviour which is impossible to reason about so this
patch makes grouping decisions based on the VMA flags.
Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: Dave Chinner <david@fromorbit.com> Tested-by: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sergei Antonov [Wed, 25 Mar 2015 22:55:34 +0000 (15:55 -0700)]
hfsplus: fix B-tree corruption after insertion at position 0
Fix B-tree corruption when a new record is inserted at position 0 in the
node in hfs_brec_insert(). In this case a hfs_brec_update_parent() is
called to update the parent index node (if exists) and it is passed
hfs_find_data with a search_key containing a newly inserted key instead
of the key to be updated. This results in an inconsistent index node.
The bug reproduces on my machine after an extents overflow record for
the catalog file (CNID=4) is inserted into the extents overflow B-tree.
Because of a low (reserved) value of CNID=4, it has to become the first
record in the first leaf node.
A change in hfs_brec_insert() makes hfs_brec_update_parent() work
correctly by preventing it from getting fd->record=-1 value from
__hfs_brec_find().
Along the way, I removed duplicate code with unification of the if
condition. The resulting code is equivalent to the original code
because node is never 0.
Also hfs_brec_update_parent() will now return an error after getting a
negative fd->record value. However, the return value of
hfs_brec_update_parent() is not checked anywhere in the file and I'm
leaving it unchanged by this patch. brec.c lacks error checking after
some other calls too, but this issue is of less importance than the one
being fixed by this patch.
Signed-off-by: Sergei Antonov <saproj@gmail.com> Cc: Joe Perches <joe@perches.com> Reviewed-by: Vyacheslav Dubeyko <slava@dubeyko.com> Acked-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Anton Altaparmakov <aia21@cam.ac.uk> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Taesoo Kim [Wed, 25 Mar 2015 22:55:29 +0000 (15:55 -0700)]
fs/affs/file.c: unlock/release page on error
When affs_bread_ino() fails, correctly unlock the page and release the
page cache with proper error value. All write_end() should
unlock/release the page that was locked by write_beg().
Signed-off-by: Taesoo Kim <tsgatesv@gmail.com> Cc: Fabian Frederick <fabf@skynet.be> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Laura Abbott [Wed, 25 Mar 2015 22:55:26 +0000 (15:55 -0700)]
mm/page_alloc.c: call kernel_map_pages in unset_migrateype_isolate
Commit 3c605096d315 ("mm/page_alloc: restrict max order of merging on
isolated pageblock") changed the logic of unset_migratetype_isolate to
check the buddy allocator and explicitly call __free_pages to merge.
The page that is being freed in this path never had prep_new_page called
so set_page_refcounted is called explicitly but there is no call to
kernel_map_pages. With the default kernel_map_pages this is mostly
harmless but if kernel_map_pages does any manipulation of the page
tables (unmapping or setting pages to read only) this may trigger a
fault:
Mark Rutland [Wed, 25 Mar 2015 22:55:23 +0000 (15:55 -0700)]
mm/slub: fix lockups on PREEMPT && !SMP kernels
Commit 9aabf810a67c ("mm/slub: optimize alloc/free fastpath by removing
preemption on/off") introduced an occasional hang for kernels built with
CONFIG_PREEMPT && !CONFIG_SMP.
The problem is the following loop the patch introduced to
slab_alloc_node and slab_free:
do {
tid = this_cpu_read(s->cpu_slab->tid);
c = raw_cpu_ptr(s->cpu_slab);
} while (IS_ENABLED(CONFIG_PREEMPT) && unlikely(tid != c->tid));
GCC 4.9 has been observed to hoist the load of c and c->tid above the
loop for !SMP kernels (as in this case raw_cpu_ptr(x) is compile-time
constant and does not force a reload). On arm64 the generated assembly
looks like:
If the thread is preempted between the load of c->tid (into x1) and tid
(into x4), and an allocation or free occurs in another thread (bumping
the cpu_slab's tid), the thread will be stuck in the loop until
s->cpu_slab->tid wraps, which may be forever in the absence of
allocations/frees on the same CPU.
This patch changes the loop condition to access c->tid with READ_ONCE.
This ensures that the value is reloaded even when the compiler would
otherwise assume it could cache the value, and also ensures that the
load will not be torn.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Steve Capper <steve.capper@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The cause is the "memset(pgdat, 0, sizeof(*pgdat))" at the end of
try_offline_node, which will reset all the content of pgdat to 0, as the
pgdat is accessed lock-free, so that the users still using the pgdat
will panic, such as the vmstat_update routine.
process A: offline node XX:
vmstat_updat()
refresh_cpu_vm_stats()
for_each_populated_zone()
find online node XX
cond_resched()
offline cpu and memory, then try_offline_node()
node_set_offline(nid), and memset(pgdat, 0, sizeof(*pgdat))
zone = next_zone(zone)
pg_data_t *pgdat = zone->zone_pgdat; // here pgdat is NULL now
next_online_pgdat(pgdat)
next_online_node(pgdat->node_id); // NULL pointer access
So the solution here is postponing the reset of obsolete pgdat from
try_offline_node() to hotadd_new_pgdat(), and just resetting
pgdat->nr_zones and pgdat->classzone_idx to be 0 rather than the memset
0 to avoid breaking pointer information in pgdat.
Naoya Horiguchi [Wed, 25 Mar 2015 22:55:14 +0000 (15:55 -0700)]
mm/pagewalk.c: prevent positive return value of walk_page_test() from being passed to callers
walk_page_test() is purely pagewalk's internal stuff, and its positive
return values are not intended to be passed to the callers of pagewalk.
However, in the current code if the last vma in the do-while loop in
walk_page_range() happens to return a positive value, it leaks outside
walk_page_range(). So the user visible effect is invalid/unexpected
return value (according to the reporter, mbind() causes it.)
This patch fixes it simply by reinitializing the return value after
checked.
Another exposed interface, walk_page_vma(), already returns 0 for such
cases so no problem.
Leon Yu [Wed, 25 Mar 2015 22:55:11 +0000 (15:55 -0700)]
mm: fix anon_vma->degree underflow in anon_vma endless growing prevention
I have constantly stumbled upon "kernel BUG at mm/rmap.c:399!" after
upgrading to 3.19 and had no luck with 4.0-rc1 neither.
So, after looking into new logic introduced by commit 7a3ef208e662 ("mm:
prevent endless growth of anon_vma hierarchy"), I found chances are that
unlink_anon_vmas() is called without incrementing dst->anon_vma->degree
in anon_vma_clone() due to allocation failure. If dst->anon_vma is not
NULL in error path, its degree will be incorrectly decremented in
unlink_anon_vmas() and eventually underflow when exiting as a result of
another call to unlink_anon_vmas(). That's how "kernel BUG at
mm/rmap.c:399!" is triggered for me.
This patch fixes the underflow by dropping dst->anon_vma when allocation
fails. It's safe to do so regardless of original value of dst->anon_vma
because dst->anon_vma doesn't have valid meaning if anon_vma_clone()
fails. Besides, callers don't care dst->anon_vma in such case neither.
Also suggested by Michal Hocko, we can clean up vma_adjust() a bit as
anon_vma_clone() now does the work.
[akpm@linux-foundation.org: tweak comment] Fixes: 7a3ef208e662 ("mm: prevent endless growth of anon_vma hierarchy") Signed-off-by: Leon Yu <chianglungyu@gmail.com> Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Moorestown RTC driver implements suspend and resume callbacks and
assigns them to the suspend and resume fields of the device_driver
struct. These callbacks are never actually called by anything though.
Modify the driver to properly use dev_pm_ops so that the suspend and
resume functions are actually executed upon suspend/resume.
Ed Cashin [Wed, 25 Mar 2015 22:55:06 +0000 (15:55 -0700)]
aoe: update aoe maintainer information
The coraid.com email address is defunct. The old aoe support area hosted
at coraid.com is no longer up. These changes update the email and website
to current ones.
Signed-off-by: Ed Cashin <ed.cashin@acm.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 25 Mar 2015 22:40:21 +0000 (15:40 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
"A small collection of fixes that has been gathered over the last few
weeks. This contains:
- A one-liner fix for NVMe, fixing a missing list_head init that
could makes us oops on hitting recovery at load time.
- Two small blk-mq fixes:
- Fixup a bad goto jump on error handling.
- Fix for oopsing if running out of reserved tags.
- A memory leak fix for NBD.
- Two small writeback fixes from Tejun, fixing a missing init to
INITIAL_JIFFIES, and a possible underflow introduced recently.
- A core merge fixup in sg gap detection, where rq->biotail was
indexed with the count of rq->bio"
* 'for-linus' of git://git.kernel.dk/linux-block:
writeback: fix possible underflow in write bandwidth calculation
NVMe: Initialize device list head before starting
Fix bug in blk_rq_merge_ok
blkmq: Fix NULL pointer deref when all reserved tags in
blk-mq: fix use of incorrect goto label in blk_mq_init_queue error path
nbd: fix possible memory leak
writeback: add missing INITIAL_JIFFIES init in global_update_bandwidth()
Joe Perches [Tue, 24 Mar 2015 01:01:35 +0000 (18:01 -0700)]
selinux: fix sel_write_enforce broken return value
Return a negative error value like the rest of the entries in this function.
Cc: <stable@vger.kernel.org> Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: tweaked subject line] Signed-off-by: Paul Moore <pmoore@redhat.com>
Heiko Carstens [Sat, 21 Mar 2015 11:43:08 +0000 (12:43 +0100)]
s390/smp: reenable smt after resume
After a suspend/resume cycle we missed to enable smt again, which leads
to all sorts of bugs, since the kernel assumes smt is enabled, while the
hardware thinks it is not.
Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reported-by: Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Linus Torvalds [Wed, 25 Mar 2015 00:27:18 +0000 (17:27 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull two arm64 fixes from Catalin Marinas:
- switch_mm() fix where init_mm.pgd ends up in the user TTBR0;
swapper_pg_dir is not suitable for user mappings
- this_cpu accessors fix for preemption safety
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: percpu: Make this_cpu accessors pre-empt safe
arm64: Use the reserved TTBR0 if context switching to the init_mm
Linus Torvalds [Wed, 25 Mar 2015 00:23:03 +0000 (17:23 -0700)]
Merge tag 'powerpc-4.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc fixes from Michael Ellerman:
- Fix the MCE code to use CONFIG_KVM_BOOK3S_64_HANDLER
- Little endian fixes for post mobility device tree update
- Add PVR for POWER8NVL processor
- Fixes for hypervisor doorbell handling
* tag 'powerpc-4.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
powerpc/book3s: Fix the MCE code to use CONFIG_KVM_BOOK3S_64_HANDLER
powerpc/pseries: Little endian fixes for post mobility device tree update
powerpc: Add PVR for POWER8NVL processor
powerpc/powernv: Fixes for hypervisor doorbell handling
Linus Torvalds [Wed, 25 Mar 2015 00:08:29 +0000 (17:08 -0700)]
Merge branch 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fix from Tejun Heo:
"One patch to fix a regression from the recent switch to blk-mq tag
allocation which can cause oops on SAS-attached SATA drives"
* 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ata: Add a new flag to destinguish sas controller
Linus Torvalds [Wed, 25 Mar 2015 00:02:45 +0000 (17:02 -0700)]
Merge tag 'mfd-fixes-4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD fixes from Lee Jones:
- Use DMA'able addresses for DMA; rtsx_usb
- Use return value in the correct way; kempld-core
* tag 'mfd-fixes-4.0' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
mfd: kempld-core: Fix callback return value check
mfd: rtsx_usb: Prevent DMA from stack
drm/i915: Ensure plane->state->fb stays in sync with plane->fb
v2: Don't move update_state_fb(). It was moved around because I
originally put update_state_fb() in intel_alloc_plane_obj() before
finding a better place. (Matt)
Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From drm-next:
(cherry picked from commit f55548b5af87ebfc586ca75748947f1c1b1a4a52) Signed-off-by: Dave Airlie <airlied@redhat.com>
Linus Torvalds [Tue, 24 Mar 2015 23:58:29 +0000 (16:58 -0700)]
Merge tag 'spi-v4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A couple of driver specific fixes of the usual "important if you have
that device" kind together with a fix for a use after free bug that
was introduced into the trace code in some of the recent refactoring
of the message queue handling"
* tag 'spi-v4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: trigger trace event for message-done before mesg->complete
spi: dw-mid: clear BUSY flag fist and test other one
spi: qup: Fix cs-num DT property parsing
Linus Torvalds [Tue, 24 Mar 2015 23:51:42 +0000 (16:51 -0700)]
Merge tag 'regulator-fix-v4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"Two fixes here, one typo fix in the documentation and one fix for a
system hang with one of the Palmas chips caused by the use of an
incorrect offset being provided for one of the registers"
* tag 'regulator-fix-v4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: Fix documentation for regmap in the config
regulator: palmas: Correct TPS659038 register definition for REGEN2
Linus Torvalds [Tue, 24 Mar 2015 23:42:54 +0000 (16:42 -0700)]
Merge tag 'regmap-fix-v4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"This patch fixes a bad interaction between the support that was added
for having regmaps without devices for early system controller
initialization and the trace support.
There's a very good analysis of the actual issue in the commit message
for the change"
* tag 'regmap-fix-v4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: introduce regmap_name to fix syscon regmap trace events
Steve Capper [Sun, 22 Mar 2015 14:51:51 +0000 (14:51 +0000)]
arm64: percpu: Make this_cpu accessors pre-empt safe
this_cpu operations were implemented for arm64 in: 5284e1b arm64: xchg: Implement cmpxchg_double f97fc81 arm64: percpu: Implement this_cpu operations
Unfortunately, it is possible for pre-emption to take place between
address generation and data access. This can lead to cases where data
is being manipulated by this_cpu for a different CPU than it was
called on. Which effectively breaks the spec.
This patch disables pre-emption for the this_cpu operations
guaranteeing that address generation and data manipulation take place
without a pre-emption in-between.
Fixes: 5284e1b4bc8a ("arm64: xchg: Implement cmpxchg_double") Fixes: f97fc810798c ("arm64: percpu: Implement this_cpu operations") Reported-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Steve Capper <steve.capper@linaro.org>
[catalin.marinas@arm.com: remove space after type cast] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
we've switched to weak references, broke that assumption but forgot to
fix it up.
Since we still force-disable planes it's only possible to hit this
when racing multiple rmfb with fbdev restoring or similar evil things.
As long as userspace is nice it's impossible to hit the BUG_ON.
But the BUG_ON would most likely be hit from fbdev code, which usually
invovles the console_lock besides all modeset locks. So very likely
we'd never get the bug reports if this was hit in the wild, hence
better be safe than sorry and backport.
Spotted by Matt Roper while reviewing other patches.
[airlied: pull this back into 4.0 - the oops happens there]
Cc: stable@vger.kernel.org Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Failure happens when attempting to allocate pages for
'struct kvm_memslots', however it doesn't have to be
present in physically contiguous (kmalloc-ed) address
space, change allocation to kvm_kvzalloc() so that
it will be vmalloc-ed when its size is more then a page.
Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Mike Snitzer [Mon, 23 Mar 2015 21:01:43 +0000 (17:01 -0400)]
dm: fix add_disk() NULL pointer due to race with free_dev()
Commit c4db59d31e39 ("fs: don't reassign dirty inodes to
default_backing_dev_info") exposed DM to a latent race in free_dev() vs
add_disk() in relation to management of the device's minor number.
Fix this by refactoring free_dev() to match cleanup order of the
alloc_dev() error path. Move cleanup of the gendisk, queue, and bdev
to _before_ the cleanup of the idr managed minor number.
Also, purely due to cleanup that fell out during the free_dev() audit:
- adjust dm_blk_close() to access the gendisk's private_data under
the _minor_lock spinlock.
- move __dm_destroy()'s dm_get_live_table() call out from under the
_minor_lock spinlock.
Catalin Marinas [Mon, 23 Mar 2015 15:06:50 +0000 (15:06 +0000)]
arm64: Use the reserved TTBR0 if context switching to the init_mm
The idle_task_exit() function may call switch_mm() with next ==
&init_mm. On arm64, init_mm.pgd cannot be used for user mappings, so
this patch simply sets the reserved TTBR0.
Cc: <stable@vger.kernel.org> Reported-by: Jon Medhurst (Tixy) <tixy@linaro.org> Tested-by: Jon Medhurst (Tixy) <tixy@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
1) Validate iov ranges before feeding them into iov_iter_init(), from
Al Viro.
2) We changed copy_from_msghdr_from_user() to zero out the msg_namelen
is a NULL pointer is given for the msg_name. Do the same in the
compat code too. From Catalin Marinas.
3) Fix partially initialized tuples in netfilter conntrack helper, from
Ian Wilson.
4) Missing continue; statement in nft_hash walker can lead to crashes,
from Herbert Xu.
5) tproxy_tg6_check looks for IP6T_INV_PROTO in ->flags instead of
->invflags, fix from Pablo Neira Ayuso.
6) Incorrect memory account of TCP FINs can result in negative socket
memory accounting values. Fix from Josh Hunt.
7) Don't allow virtual functions to enable VLAN promiscuous mode in
be2net driver, from Vasundhara Volam.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
netfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set
cx82310_eth: wait for firmware to become ready
net: validate the range we feed to iov_iter_init() in sys_sendto/sys_recvfrom
net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour
be2net: use PCI MMIO read instead of config read for errors
be2net: restrict MODIFY_EQ_DELAY cmd to a max of 8 EQs
be2net: Prevent VFs from enabling VLAN promiscuous mode
tcp: fix tcp fin memory accounting
ipv6: fix backtracking for throw routes
net: ethernet: pcnet32: Setup the SRAM and NOUFLO on Am79C97{3, 5}
ipv6: call ipv6_proxy_select_ident instead of ipv6_select_ident in udp6_ufo_fragment
netfilter: xt_TPROXY: fix invflags check in tproxy_tg6_check()
netfilter: restore rule tracing via nfnetlink_log
netfilter: nf_tables: allow to change chain policy without hook if it exists
netfilter: Fix potential crash in nft_hash walker
netfilter: Zero the tuple in nfnl_cthelper_parse_tuple()
David S. Miller [Mon, 23 Mar 2015 16:22:10 +0000 (09:22 -0700)]
sparc64: Fix several bugs in memmove().
Firstly, handle zero length calls properly. Believe it or not there
are a few of these happening during early boot.
Next, we can't just drop to a memcpy() call in the forward copy case
where dst <= src. The reason is that the cache initializing stores
used in the Niagara memcpy() implementations can end up clearing out
cache lines before we've sourced their original contents completely.
For example, considering NG4memcpy, the main unrolled loop begins like
this:
Assume dst is 64 byte aligned and let's say that dst is src - 8 for
this memcpy() call. That store at the end there is the one to the
first line in the cache line, thus clearing the whole line, which thus
clobbers "src + 0x28" before it even gets loaded.
To avoid this, just fall through to a simple copy only mildly
optimized for the case where src and dst are 8 byte aligned and the
length is a multiple of 8 as well. We could get fancy and call
GENmemcpy() but this is good enough for how this thing is actually
used.
Reported-by: David Ahern <david.ahern@oracle.com> Reported-by: Bob Picco <bpicco@meloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2f800fbd777b ("writeback: fix dirtied pages accounting on redirty")
introduced account_page_redirty() which reverts stat updates for a
redirtied page, making BDI_DIRTIED no longer monotonically increasing.
bdi_update_write_bandwidth() uses the delta in BDI_DIRTIED as the
basis for bandwidth calculation. While unlikely, since the above
patch, the newer value may be lower than the recorded past value and
underflow the bandwidth calculation leading to a wild result.
Fix it by subtracing min of the old and new values when calculating
delta. AFAIK, there hasn't been any report of it happening but the
resulting erratic behavior would be non-critical and temporary, so
it's possible that the issue is happening without being reported. The
risk of the fix is very low, so tagged for -stable.
James Hogan [Mon, 23 Mar 2015 11:17:56 +0000 (11:17 +0000)]
metag: Fix ioremap_wc/ioremap_cached build errors
When ioremap_wc() or ioremap_cached() are used without first including
asm/pgtable.h, the _PAGE_CACHEABLE or _PAGE_WR_COMBINE definitions
aren't found, resulting in build errors like the following (in
next-20150323 due to "lib: devres: add a helper function for
ioremap_wc"):
lib/devres.c: In function ‘devm_ioremap_wc’:
lib/devres.c:91: error: ‘_PAGE_WR_COMBINE’ undeclared
We can't easily include asm/pgtable.h in asm/io.h due to dependency
problems, so split out the _PAGE_* definitions from asm/pgtable.h into a
separate asm/pgtable-bits.h header (as a couple of other architectures
already do), and include that in io.h instead.
Helge Deller [Mon, 16 Mar 2015 20:17:50 +0000 (21:17 +0100)]
parisc: Fix pmd code to depend on PT_NLEVELS value, not on CONFIG_64BIT
Make the code which sets up the pmd depend on PT_NLEVELS == 3, not on
CONFIG_64BIT. The reason is, that a 64bit kernel with a page size
greater than 4k doesn't need the pmd and thus has PT_NLEVELS = 2.
The PA-RISC architecture preallocates one pmd with each pgd. This
preallocated pmd can never be freed - pmd_free does nothing when it is
called with this pmd. When the kernel attempts to free this preallocated
pmd, it decreases the count of allocated pmds. The result is that the
counter underflows and this error is reported.
This patch fixes the bug by artifically incrementing the counter in
pmd_free when the kernel tries to free the preallocated pmd.
powerpc/book3s: Fix the MCE code to use CONFIG_KVM_BOOK3S_64_HANDLER
commit id 2ba9f0d has changed CONFIG_KVM_BOOK3S_64_HV to tristate to allow
HV/PR bits to be built as modules. But the MCE code still depends on
CONFIG_KVM_BOOK3S_64_HV which is wrong. When user selects
CONFIG_KVM_BOOK3S_64_HV=m to build HV/PR bits as a separate module the
relevant MCE code gets excluded.
This patch fixes the MCE code to use CONFIG_KVM_BOOK3S_64_HANDLER. This
makes sure that the relevant MCE code is included when HV/PR bits
are built as a separate modules.
Fixes: 2ba9f0d88750 ("kvm: powerpc: book3s: Support building HV and PR KVM as module") Cc: stable@vger.kernel.org # v3.14+ Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The following patchset contains Netfilter fixes for your net tree,
they are:
1) Fix missing initialization of tuple structure in nfnetlink_cthelper
to avoid mismatches when looking up to attach userspace helpers to
flows, from Ian Wilson.
2) Fix potential crash in nft_hash when we hit -EAGAIN in
nft_hash_walk(), from Herbert Xu.
3) We don't need to indicate the hook information to update the
basechain default policy in nf_tables.
4) Restore tracing over nfnetlink_log due to recent rework to
accomodate logging infrastructure into nf_tables.
5) Fix wrong IP6T_INV_PROTO check in xt_TPROXY.
6) Set IP6T_F_PROTO flag in nft_compat so we can use SYNPROXY6 and
REJECT6 from xt over nftables.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 22 Mar 2015 19:07:47 +0000 (12:07 -0700)]
Merge tag 'driver-core-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are two bugfixes for things reported. One regression in kernfs,
and another issue fixed in the LZ4 code that was fixed in the
"upstream" codebase that solves a reported kernel crash
Both have been in linux-next for a while"
* tag 'driver-core-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
LZ4 : fix the data abort issue
kernfs: handle poll correctly on 'direct_read' files.
Linus Torvalds [Sun, 22 Mar 2015 19:03:14 +0000 (12:03 -0700)]
Merge tag 'char-misc-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc fixes from Greg KH:
"Here are three fixes for 4.0-rc5 that revert 3 PCMCIA patches that
were merged in 4.0-rc1 that cause regressions. So let's revert them
for now and they will be reworked and resent sometime in the future.
All have been tested in linux-next for a while"
* tag 'char-misc-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Revert "pcmcia: add a new resource manager for non ISA systems"
Revert "pcmcia: fix incorrect bracketing on a test"
Revert "pcmcia: add missing include for new pci resource handler"
Linus Torvalds [Sun, 22 Mar 2015 18:59:02 +0000 (11:59 -0700)]
Merge tag 'staging-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are four small staging driver fixes, all for the vt6656 and
vt6655 drivers, that resolve some reported issues with them.
All of these patches have been in linux next for a while"
* tag 'staging-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
vt6655: Fix late setting of byRFType.
vt6655: RFbSetPower fix missing rate RATE_12M
staging: vt6656: vnt_rf_setpower: fix missing rate RATE_12M
staging: vt6655: vnt_tx_packet fix dma_idx selection.
Linus Torvalds [Sun, 22 Mar 2015 18:54:29 +0000 (11:54 -0700)]
Merge tag 'tty-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fix from Greg KH:
"Here's a single 8250 serial driver that fixes a reported deadlock with
the serial console and the tty driver.
It's been in linux-next for a while now"
* tag 'tty-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: 8250_dw: Fix deadlock in LCR workaround
Linus Torvalds [Sun, 22 Mar 2015 18:33:55 +0000 (11:33 -0700)]
Merge tag 'usb-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / PHY driver fixes from Greg KH:
"Here's a number of USB and PHY driver fixes for 4.0-rc5.
The largest thing here is a revert of a gadget function driver patch
that removes 500 lines of code. Other than that, it's a number of
reported bugs fixes and new quirk/id entries.
All have been in linux-next for a while"
* tag 'usb-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (33 commits)
usb: common: otg-fsm: only signal connect after switching to peripheral
uas: Add US_FL_NO_ATA_1X for Initio Corporation controllers / devices
USB: ehci-atmel: rework clk handling
MAINTAINERS: add entry for USB OTG FSM
usb: chipidea: otg: add a_alt_hnp_support response for B device
phy: omap-usb2: Fix missing clk_prepare call when using old dt name
phy: ti/omap: Fix modalias
phy: core: Fixup return value of phy_exit when !pm_runtime_enabled
phy: miphy28lp: Convert to devm_kcalloc and fix wrong sizof
phy: miphy365x: Convert to devm_kcalloc and fix wrong sizeof
phy: twl4030-usb: Remove redundant assignment for twl->linkstat
phy: exynos5-usbdrd: Fix off-by-one valid value checking for args->args[0]
phy: Find the right match in devm_phy_destroy()
phy: rockchip-usb: Fixup rockchip_usb_phy_power_on failure path
phy: ti-pipe3: Simplify ti_pipe3_dpll_wait_lock implementation
phy: samsung-usb2: Remove NULL terminating entry from phys array
phy: hix5hd2-sata: Check return value of platform_get_resource
phy: exynos-dp-video: Kill exynos_dp_video_phy_pwr_isol function
Revert "usb: gadget: zero: Add support for interrupt EP"
Revert "xhci: Clear the host side toggle manually when endpoint is 'soft reset'"
...
Ondrej Zary [Sat, 21 Mar 2015 10:29:37 +0000 (11:29 +0100)]
cx82310_eth: wait for firmware to become ready
When the device is powered up, some (older) firmware versions fail to work
properly if we send commands before the boot is complete (everything is OK
when the device is hot-plugged). The firmware indicates its ready status by
putting the link up.
Newer firmwares delay the first command so they don't suffer from this problem.
They also report the link being always up.
Wait for firmware to become ready (link up) before sending any commands and/or
data.
This also allows lowering CMD_TIMEOUT value to a reasonable time.
Tested with 4.1.0.9 (old) and 4.1.0.30 (new) firmware versions.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 21 Mar 2015 20:05:37 +0000 (13:05 -0700)]
Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave dmaengine fixes from Vinod Koul:
"Four fixes for dw, pl08x, imx-sdma and at_hdmac driver. Nothing
unusual here, simple fixes to these drivers"
* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: pl08x: Define capabilities for generic capabilities reporting
dmaengine: dw: append MODULE_ALIAS for platform driver
dmaengine: imx-sdma: switch to dynamic context mode after script loaded
dmaengine: at_hdmac: Fix calculation of the residual bytes
Linus Torvalds [Sat, 21 Mar 2015 19:51:36 +0000 (12:51 -0700)]
Merge tag 'pm+acpi-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"These are fixes for recent regressions (PCI/ACPI resources and at91
RTC locking), a stable-candidate powercap RAPL driver fix and two ARM
cpuidle fixes (one stable-candidate too).
Specifics:
- Revert a recent PCI commit related to IRQ resources management that
introduced a regression for drivers attempting to bind to devices
whose previous drivers did not balance pci_enable_device() and
pci_disable_device() as expected (Rafael J Wysocki).
- Fix a deadlock in at91_rtc_interrupt() introduced by a typo in a
recent commit related to wakeup interrupt handling (Dan Carpenter).
- Allow the power capping RAPL (Running-Average Power Limit) driver
to use different energy units for domains within one CPU package
which is necessary to handle Intel Haswell EP processors correctly
(Jacob Pan).
- Improve the cpuidle mvebu driver's handling of Armada XP SoCs by
updating the target residency and exit latency numbers for those
chips (Sebastien Rannou).
- Prevent the cpuidle mvebu driver from calling cpu_pm_enter() twice
in a row before cpu_pm_exit() is called on the same CPU which
breaks the core's assumptions regarding the usage of those
functions (Gregory Clement)"
* tag 'pm+acpi-4.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "x86/PCI: Refine the way to release PCI IRQ resources"
rtc: at91rm9200: double locking bug in at91_rtc_interrupt()
powercap / RAPL: handle domains with different energy units
cpuidle: mvebu: Update cpuidle thresholds for Armada XP SOCs
cpuidle: mvebu: Fix the CPU PM notifier usage
Linus Torvalds [Sat, 21 Mar 2015 19:41:50 +0000 (12:41 -0700)]
Merge git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
"A bunch of fixes across drivers:
radeon:
disable two ended allocation for now, it breaks some stuff
amdkfd:
misc fixes
nouveau:
fix irq loop problem, add basic support for GM206 (new hw)
i915:
fix some WARNs people were seeing
exynos:
fix some iommu interactions causing boot failures"
* git://people.freedesktop.org/~airlied/linux:
drm/radeon: drop ttm two ended allocation
drm/exynos: fix the initialization order in FIMD
drm/exynos: fix typo config name correctly.
drm/exynos: Check for NULL dereference of crtc
drm/exynos: IS_ERR() vs NULL bug
drm/exynos: remove unused files
drm/i915: Make sure the primary plane is enabled before reading out the fb state
drm/nouveau/bios: fix i2c table parsing for dcb 4.1
drm/nouveau/device/gm100: Basic GM206 bring up (as copy of GM204)
drm/nouveau/device: post write to NV_PMC_BOOT_1 when flipping endian switch
drm/nouveau/gr/gf100: fix some accidental or'ing of buffer addresses
drm/nouveau/fifo/nv04: remove the loop from the interrupt handler
drm/radeon: Changing number of compute pipe lines
drm/amdkfd: Fix SDMA queue init. in non-HWS mode
drm/amdkfd: destroy mqd when destroying kernel queue
drm/i915: Ensure plane->state->fb stays in sync with plane->fb
Linus Torvalds [Sat, 21 Mar 2015 19:33:01 +0000 (12:33 -0700)]
Merge tag 'devicetree-fixes-for-4.0-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull more DeviceTree fixes vfom Rob Herring:
- revert setting stdout-path as preferred console. This caused
regressions in PowerMACs and other systems.
- yet another fix for stdout-path option parsing.
- fix error path handling in of_irq_parse_one
* tag 'devicetree-fixes-for-4.0-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
Revert "of: Fix premature bootconsole disable with 'stdout-path'"
of: handle both '/' and ':' in path strings
of: unittest: Add option string test case with longer path
of/irq: Fix of_irq_parse_one() returned error codes
Pull SCSI target fixes from Nicholas Bellinger:
"Here are current target-pending fixes for v4.0-rc5 code that have made
their way into the queue over the last weeks.
The fixes this round include:
- Fix long-standing iser-target logout bug related to early
conn_logout_comp completion, resulting in iscsi_conn use-after-tree
OOpsen. (Sagi + nab)
- Fix long-standing tcm_fc bug in ft_invl_hw_context() failure
handing for DDP hw offload. (DanC)
- Fix incorrect use of unprotected __transport_register_session() in
tcm_qla2xxx + other single local se_node_acl fabrics. (Bart)
- Fix reference leak in target_submit_cmd() -> target_get_sess_cmd()
for ack_kref=1 failure path. (Bart)
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target: do not reject FUA CDBs when write cache is enabled but emulate_write_cache is 0
target: Fix virtual LUN=0 target_configure_device failure OOPs
target/pscsi: Fix NULL pointer dereference in get_device_type
tcm_fc: missing curly braces in ft_invl_hw_context()
target: Fix reference leak in target_get_sess_cmd() error path
loop/usb/vhost-scsi/xen-scsiback: Fix use of __transport_register_session
tcm_qla2xxx: Fix incorrect use of __transport_register_session
iscsi-target: Avoid early conn_logout_comp for iser connections
Revert "iscsi-target: Avoid IN_LOGOUT failure case for iser-target"
target: Disallow changing of WRITE cache/FUA attrs after export
Linus Torvalds [Sat, 21 Mar 2015 18:15:13 +0000 (11:15 -0700)]
Merge tag 'dm-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull devicemapper fixes from Mike Snitzer:
"A handful of stable fixes for DM:
- fix thin target to always zero-fill reads to unprovisioned blocks
- fix to interlock device destruction's suspend from internal
suspends
- fix 2 snapshot exception store handover bugs
- fix dm-io to cope with DISCARD and WRITE_SAME capabilities changing"
* tag 'dm-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm io: deal with wandering queue limits when handling REQ_DISCARD and REQ_WRITE_SAME
dm snapshot: suspend merging snapshot when doing exception handover
dm snapshot: suspend origin when doing exception handover
dm: hold suspend_lock while suspending device during device deletion
dm thin: fix to consistently zero-fill reads to unprovisioned blocks
Linus Torvalds [Sat, 21 Mar 2015 17:53:37 +0000 (10:53 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"Most of these are fixing extent reservation accounting, or corners
with tree writeback during commit.
Josef's set does add a test, which isn't strictly a fix, but it'll
keep us from making this same mistake again"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix outstanding_extents accounting in DIO
Btrfs: add sanity test for outstanding_extents accounting
Btrfs: just free dummy extent buffers
Btrfs: account merges/splits properly
Btrfs: prepare block group cache before writing
Btrfs: fix ASSERT(list_empty(&cur_trans->dirty_bgs_list)
Btrfs: account for the correct number of extents for delalloc reservations
Btrfs: fix merge delalloc logic
Btrfs: fix comp_oper to get right order
Btrfs: catch transaction abortion after waiting for it
btrfs: fix sizeof format specifier in btrfs_check_super_valid()
Linus Torvalds [Sat, 21 Mar 2015 17:24:10 +0000 (10:24 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- mm switching fix where the kernel pgd ends up in the user TTBR0 after
returning from an EFI run-time services call
- fix __GFP_ZERO handling for atomic pool and CMA DMA allocations (the
generic code does get the gfp flags, so it's left with the arch code
to memzero accordingly)
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Honor __GFP_ZERO in dma allocations
arm64: efi: don't restore TTBR0 if active_mm points at init_mm
Linus Torvalds [Sat, 21 Mar 2015 17:03:22 +0000 (10:03 -0700)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"Another few ARM fixes. Fabrice fixed the L2 cache DT parsing to allow
prefetch configuration to be specified even when the cache size
parsing fails.
Laura noticed that the setting of page attributes wasn't working for
modules due to is_module_addr() always returning false.
Marc Gonzalez (aka Mason) noticed a potential latent bug with the way
we read one of the CPUID registers (where we could attempt to read a
non-present CPUID register which may fault.)
I've fixed an issue where 32-bit DMA masks were failing with memory
which extended to the top of physical address space, and I've also
added debugging output of the page tables when we hit a data access
exception which we don't specifically handle - prompted by the lack of
information in a bug report"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8313/1: Use read_cpuid_ext() macro instead of inline asm
ARM: 8311/1: Don't use is_module_addr in setting page attributes
ARM: 8310/1: l2c: Fix prefetch settings dt parsing
ARM: dump pgd, pmd and pte states on unhandled data abort faults
ARM: dma-api: fix off-by-one error in __dma_supported()
NeilBrown [Fri, 13 Mar 2015 00:51:18 +0000 (11:51 +1100)]
md: fix problems with freeing private data after ->run failure.
If ->run() fails, it can either free the data structures it
allocated, or leave that task to ->free() which will be called
on failures.
However:
md.c calls ->free() even if ->private_data is NULL, which
causes problems in some personalities.
raid0.c frees the data, but doesn't clear ->private_data,
which will become a problem when we fix md.c
So better fix both these issues at once.
Reported-by: Richard W.M. Jones <rjones@redhat.com> Fixes: 5aa61f427e4979be733e4847b9199ff9cc48a47e
URL: https://bugzilla.kernel.org/show_bug.cgi?id=94381 Signed-off-by: NeilBrown <neilb@suse.de>
Catalin Marinas [Fri, 20 Mar 2015 16:48:13 +0000 (16:48 +0000)]
net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour
Commit db31c55a6fb2 (net: clamp ->msg_namelen instead of returning an
error) introduced the clamping of msg_namelen when the unsigned value
was larger than sizeof(struct sockaddr_storage). This caused a
msg_namelen of -1 to be valid. The native code was subsequently fixed by
commit dbb490b96584 (net: socket: error on a negative msg_namelen).
In addition, the native code sets msg_namelen to 0 when msg_name is
NULL. This was done in commit (6a2a2b3ae075 net:socket: set msg_namelen
to 0 if msg_name is passed as NULL in msghdr struct from userland) and
subsequently updated by 08adb7dabd48 (fold verify_iovec() into
copy_msghdr_from_user()).
This patch brings the get_compat_msghdr() in line with
copy_msghdr_from_user().
Fixes: db31c55a6fb2 (net: clamp ->msg_namelen instead of returning an error) Cc: David S. Miller <davem@davemloft.net> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 20 Mar 2015 17:25:56 +0000 (13:25 -0400)]
Merge branch 'be2net'
Sathya Perla says:
====================
be2net: patch set
Hi David, this patch set includes 3 bug fixes to the be2net driver.
Patch 1 fixes a vlan isolation issue with VFs. When a VF is placed in
promiscous mode, it could receive packets belonging to any vlan, as
the PF driver grants vlan promisc capability to VFs. The PF
driver now disables the vlan promisc capability for VFs to fix this
problem.
Patch 2 fixes the call to MODIFY_EQ_DELAY FW cmd to not include more
than 8 EQs per cmd. The FW is not capable of handling more than 8 EQs
per cmd.
Patch 3 fixes an EEH error detection issue. On Power platforms,
when an EEH error occurs, the slot disconnect state is more reliably
detected via an MMIO read compared to a config read. So, the error
register reads that occur every second are now done via MMIO.
Pls apply this patch set to the "net" tree. Thanks!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Suresh Reddy [Fri, 20 Mar 2015 10:28:25 +0000 (06:28 -0400)]
be2net: use PCI MMIO read instead of config read for errors
When an EEH error occurs, the device/slot is disconnected. This condition
is more reliably detected (i.e., returns all ones) with an MMIO read rather
than a config read -- especially on power platforms.
Hence, this patch fixes EEH error detection by replacing config reads with
MMIO reads for reading the error registers. The error registers in
Skyhawk-R/BE2/BE3 are accessible both via the config space and the
PCICFG (BAR0) memory space.
Reported-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Suresh Reddy [Fri, 20 Mar 2015 10:28:24 +0000 (06:28 -0400)]
be2net: restrict MODIFY_EQ_DELAY cmd to a max of 8 EQs
Issuing this cmd for more than 8 EQs does not have the intended effect
even on BEx and Skyhawk-R.
This patch fixes this by issuing this cmd for upto 8 EQs at a time. Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>