Rik van Riel [Sat, 27 Oct 2012 16:12:11 +0000 (12:12 -0400)]
x86/mm: Completely drop the TLB flush from ptep_set_access_flags()
Intel has an architectural guarantee that the TLB entry causing
a page fault gets invalidated automatically. This means
we should be able to drop the local TLB invalidation.
Because of the way other areas of the page fault code work,
chances are good that all x86 CPUs do this. However, if
someone somewhere has an x86 CPU that does not invalidate
the TLB entry causing a page fault, this one-liner should
be easy to revert - or a CPU model specific quirk could
be added to retain this optimization on most CPUs.
Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Linus Torvalds <torvalds@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michel Lespinasse <walken@google.com>
[ Applied changelog massage and moved this last in the series,
to create bisection distance. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Mon, 22 Oct 2012 18:15:40 +0000 (20:15 +0200)]
sched, numa, mm: Implement slow start for working set sampling
Add a 1 second delay before starting to scan the working set of
a task and starting to balance it amongst nodes.
[ note that before the constant per task WSS sampling rate patch
the initial scan would happen much later still, in effect that
patch caused this regression. ]
The theory is that short-run tasks benefit very little from NUMA
placement: they come and go, and they better stick to the node
they were started on. As tasks mature and rebalance to other CPUs
and nodes, so does their NUMA placement have to change and so
does it start to matter more and more.
In practice this change fixes an observable kbuild regression:
# [ a perf stat --null --repeat 10 test of ten bzImage builds to /dev/shm ]
!NUMA:
45.291088843 seconds time elapsed ( +- 0.40% )
45.154231752 seconds time elapsed ( +- 0.36% )
+NUMA, no slow start:
46.172308123 seconds time elapsed ( +- 0.30% )
46.343168745 seconds time elapsed ( +- 0.25% )
The implementation is simple and straightforward, most of the patch
deals with adding the /proc/sys/kernel/sched_numa_scan_delay_ms tunable
knob.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Link: http://lkml.kernel.org/n/tip-vn7p3ynbwqt3qqewhdlvjltc@git.kernel.org
[ Wrote the changelog, ran measurements, tuned the default. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Sun, 14 Oct 2012 14:59:13 +0000 (16:59 +0200)]
sched, numa, mm: Implement constant, per task Working Set Sampling (WSS) rate
Previously, to probe the working set of a task, we'd use
a very simple and crude method: mark all of its address
space PROT_NONE.
That method has various (obvious) disadvantages:
- it samples the working set at dissimilar rates,
giving some tasks a sampling quality advantage
over others.
- creates performance problems for tasks with very
large working sets
- over-samples processes with large address spaces but
which only very rarely execute
Improve that method by keeping a rotating offset into the
address space that marks the current position of the scan,
and advance it by a constant rate (in a CPU cycles execution
proportional manner). If the offset reaches the last mapped
address of the mm then it then it starts over at the first
address.
The per-task nature of the working set sampling functionality
in this tree allows such constant rate, per task,
execution-weight proportional sampling of the working set,
with an adaptive sampling interval/frequency that goes from
once per 100 msecs up to just once per 1.6 seconds.
The current sampling volume is 256 MB per interval.
As tasks mature and converge their working set, so does the
sampling rate slow down to just a trickle, 256 MB per 1.6
seconds of CPU time executed.
This, beyond being adaptive, also rate-limits rarely
executing systems and does not over-sample on overloaded
systems.
[ In AutoNUMA speak, this patch deals with the effective sampling
rate of the 'hinting page fault'. AutoNUMA's scanning is
currently rate-limited, but it is also fundamentally
single-threaded, executing in the knuma_scand kernel thread,
so the limit in AutoNUMA is global and does not scale up with
the number of CPUs, nor does it scan tasks in an execution
proportional manner.
So the idea of rate-limiting the scanning was first implemented
in the AutoNUMA tree via a global rate limit. This patch goes
beyond that by implementing an execution rate proportional
working set sampling rate that is not implemented via a single
global scanning daemon. ]
[ Dan Carpenter pointed out a possible NULL pointer dereference in the
first version of this patch. ]
Based-on-idea-by: Andrea Arcangeli <aarcange@redhat.com> Bug-Found-By: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Link: http://lkml.kernel.org/n/tip-wt5b48o2226ec63784i58s3j@git.kernel.org
[ Wrote changelog and fixed bug. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
Rik van Riel [Thu, 18 Oct 2012 21:19:28 +0000 (17:19 -0400)]
sched, numa, mm: Add credits for NUMA placement
The NUMA placement code has been rewritten several times, but
the basic ideas took a lot of work to develop. The people who
put in the work deserve credit for it. Thanks Andrea & Peter :)
[ The Documentation/scheduler/numa-problem.txt file should
probably be rewritten once we figure out the final details of
what the NUMA code needs to do, and why. ]
Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: aarcange@redhat.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20121018171928.24d06af4@cuia.bos.redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
----
This is against tip.git numa/core
Peter Zijlstra [Tue, 9 Oct 2012 11:46:22 +0000 (13:46 +0200)]
sched, numa, mm: Add fault driven placement and migration policy
As per the problem/design document Documentation/scheduler/numa-problem.txt
implement 3ac & 4.
( A pure 3a was found too unstable, I did briefly try 3bc
but found no significant improvement. )
Implement a per-task memory placement scheme relying on a regular
PROT_NONE 'migration' fault to scan the memory space of the procress
and uses a two stage migration scheme to reduce the invluence of
unlikely usage relations.
It relies on the assumption that the compute part is tied to a
paticular task and builds a task<->page relation set to model the
compute<->data relation.
In the previous patch we made memory migrate towards where the task
is running, here we select the node on which most memory is located
as the preferred node to run on.
This creates a feed-back control loop between trying to schedule a
task on a node and migrating memory towards the node the task is
scheduled on.
Suggested-by: Andrea Arcangeli <aarcange@redhat.com> Suggested-by: Rik van Riel <riel@redhat.com> Fixes-by: David Rientjes <rientjes@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-8ejt0ioj62k5ruf5zd2ix9zu@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Mon, 13 Aug 2012 13:22:20 +0000 (15:22 +0200)]
sched, numa, mm: Introduce last_nid in the pageframe
Introduce a per-page last_nid field, fold this into the struct
page::flags field whenever possible.
The unlikely/rare 32bit NUMA configs will likely grow the page-frame.
Completely dropping 32bit support for CONFIG_SCHED_NUMA would simplify
things, but it would also remove the warning if we grow enough 64bit
only page-flags to push the last-nid out.
Suggested-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Link: http://lkml.kernel.org/n/tip-0uois4f9skfw9mwyk1yoy0jq@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Sat, 3 Mar 2012 15:56:25 +0000 (16:56 +0100)]
sched, numa, mm: Implement home-node awareness
Implement home node preference in the scheduler's load-balancer.
This is done in four pieces:
- task_numa_hot(); make it harder to migrate tasks away from their
home-node, controlled using the NUMA_HOT feature flag.
- select_task_rq_fair(); prefer placing the task in their home-node,
controlled using the NUMA_TTWU_BIAS feature flag. Disabled by
default for we found this to be far too agressive.
- load_balance(); during the regular pull load-balance pass, try
pulling tasks that are on the wrong node first with a preference
of moving them nearer to their home-node through task_numa_hot(),
controlled through the NUMA_PULL feature flag.
- load_balance(); when the balancer finds no imbalance, introduce
some such that it still prefers to move tasks towards their
home-node, using active load-balance if needed, controlled through
the NUMA_PULL_BIAS feature flag.
In particular, only introduce this BIAS if the system is otherwise
properly (weight) balanced and we either have an offnode or !numa
task to trade for it.
In order to easily find off-node tasks, split the per-cpu task list
into two parts.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Turner <pjt@google.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Cc: Christoph Lameter <cl@linux.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-g0gzzzjrvrzxl6kgwnil1e97@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Sat, 3 Mar 2012 16:05:16 +0000 (17:05 +0100)]
sched, numa, mm: Introduce tsk_home_node()
Introduce the home-node concept for tasks. In order to keep memory
locality we need to have a something to stay local to, we define the
home-node of a task as the node we prefer to allocate memory from and
prefer to execute on.
These are no hard guarantees, merely soft preferences. This allows for
optimal resource usage, we can run a task away from the home-node, the
remote memory hit -- while expensive -- is less expensive than not
running at all, or very little, due to severe cpu overload.
Similarly, we can allocate memory from another node if our home-node
is depleted, again, some memory is better than no memory.
This patch merely introduces the basic infrastructure, all policy
comes later.
NOTE: we introduce the concept of EMBEDDED_NUMA, these are
architectures where the memory access cost doesn't depend on the cpu
but purely on the physical address -- embedded boards with cheap
(slow) and expensive (fast) memory banks.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-ii8j8cp87cgctecfqp2ib6rn@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Tue, 17 Jul 2012 20:54:51 +0000 (22:54 +0200)]
mm/mpol: Use special PROT_NONE to migrate pages
Combine our previous PROT_NONE, mpol_misplaced and
migrate_misplaced_page() pieces into an effective migrate on fault
scheme.
Note that (on x86) we rely on PROT_NONE pages being !present and avoid
the TLB flush from try_to_unmap(TTU_MIGRATION). This greatly improves
the page-migration performance.
Suggested-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Paul Turner <pjt@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Link: http://lkml.kernel.org/n/tip-e98gyl8kr9jzooh2s4piuils@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
numa, mm: Support NUMA hinting page faults from gup/gup_fast
Introduce FOLL_NUMA to tell follow_page to check
pte/pmd_numa. get_user_pages must use FOLL_NUMA, and it's safe to do
so because it always invokes handle_mm_fault and retries the
follow_page later.
KVM secondary MMU page faults will trigger the NUMA hinting page
faults through gup_fast -> get_user_pages -> follow_page ->
handle_mm_fault.
Other follow_page callers like KSM should not use FOLL_NUMA, or they
would fail to get the pages if they use follow_page instead of
get_user_pages.
[ This patch was picked up from the AutoNUMA tree. ]
Originally-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com>
[ ported to this tree. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
Lee Schermerhorn [Thu, 12 Jan 2012 11:37:17 +0000 (12:37 +0100)]
mm/mpol: Add MPOL_MF_LAZY
This patch adds another mbind() flag to request "lazy migration". The
flag, MPOL_MF_LAZY, modifies MPOL_MF_MOVE* such that the selected
pages are marked PROT_NONE. The pages will be migrated in the fault
path on "first touch", if the policy dictates at that time.
"Lazy Migration" will allow testing of migrate-on-fault via mbind().
Also allows applications to specify that only subsequently touched
pages be migrated to obey new policy, instead of all pages in range.
This can be useful for multi-threaded applications working on a
large shared data area that is initialized by an initial thread
resulting in all pages on one [or a few, if overflowed] nodes.
After PROT_NONE, the pages in regions assigned to the worker threads
will be automatically migrated local to the threads on 1st touch.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org>
[ nearly complete rewrite.. ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-7rsodo9x8zvm5awru5o7zo0y@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Tue, 17 Jul 2012 16:25:14 +0000 (18:25 +0200)]
mm/mpol: Create special PROT_NONE infrastructure
In order to facilitate a lazy -- fault driven -- migration of pages,
create a special transient PROT_NONE variant, we can then use the
'spurious' protection faults to drive our migrations from.
Pages that already had an effective PROT_NONE mapping will not
be detected to generate these 'spuriuos' faults for the simple reason
that we cannot distinguish them on their protection bits, see
pte_numa().
This isn't a problem since PROT_NONE (and possible PROT_WRITE with
dirty tracking) aren't used or are rare enough for us to not care
about their placement.
Suggested-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Paul Turner <pjt@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Link: http://lkml.kernel.org/n/tip-0g5k80y4df8l83lha9j75xph@git.kernel.org
[ fixed various cross-arch and THP/!THP details ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
Lee Schermerhorn [Wed, 11 Jan 2012 14:48:13 +0000 (15:48 +0100)]
mm/mpol: Check for misplaced page
This patch provides a new function to test whether a page resides
on a node that is appropriate for the mempolicy for the vma and
address where the page is supposed to be mapped. This involves
looking up the node where the page belongs. So, the function
returns that node so that it may be used to allocated the page
without consulting the policy again.
A subsequent patch will call this function from the fault path.
Because of this, I don't want to go ahead and allocate the page, e.g.,
via alloc_page_vma() only to have to free it if it has the correct
policy. So, I just mimic the alloc_page_vma() node computation
logic--sort of.
Note: we could use this function to implement a MPOL_MF_STRICT
behavior when migrating pages to match mbind() mempolicy--e.g.,
to ensure that pages in an interleaved range are reinterleaved
rather than left where they are when they reside on any page in
the interleave nodemask.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org>
[ Added MPOL_F_LAZY to trigger migrate-on-fault;
simplified code now that we don't have to bother
with special crap for interleaved ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-z3mgep4tgrc08o07vl1ahb2m@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Lee Schermerhorn [Mon, 16 Jan 2012 13:43:29 +0000 (14:43 +0100)]
mm/mpol: Add MPOL_MF_NOOP
This patch augments the MPOL_MF_LAZY feature by adding a "NOOP" policy
to mbind(). When the NOOP policy is used with the 'MOVE and 'LAZY
flags, mbind() will map the pages PROT_NONE so that they will be
migrated on the next touch.
This allows an application to prepare for a new phase of operation
where different regions of shared storage will be assigned to
worker threads, w/o changing policy. Note that we could just use
"default" policy in this case. However, this also allows an
application to request that pages be migrated, only if necessary,
to follow any arbitrary policy that might currently apply to a
range of pages, without knowing the policy, or without specifying
multiple mbind()s for ranges with different policies.
[ Bug in early version of mpol_parse_str() reported by Fengguang Wu. ]
Bug-Reported-by: Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@kernel.org>
mm/pgprot: Move the pgprot_modify() fallback definition to mm.h
pgprot_modify() is available on x86, but on other architectures it only
gets defined in mm/mprotect.c - breaking the build if anything outside
of mprotect.c tries to make use of this function.
Move it to the generic pgprot area in mm.h, so that an upcoming patch
can make use of it.
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rik van Riel <riel@redhat.com> Cc: Paul Turner <pjt@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-nfvarGMj9gjavowroorkizb4@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Rik van Riel [Tue, 9 Oct 2012 13:31:59 +0000 (15:31 +0200)]
mm: Only flush the TLB when clearing an accessible pte
If ptep_clear_flush() is called to clear a page table entry that is
accessible anyway by the CPU, eg. a _PAGE_PROTNONE page table entry,
there is no need to flush the TLB on remote CPUs.
Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-vm3rkzevahelwhejx5uwm8ex@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Rik van Riel [Tue, 9 Oct 2012 13:31:12 +0000 (15:31 +0200)]
x86/mm: Introduce pte_accessible()
We need pte_present to return true for _PAGE_PROTNONE pages, to indicate that
the pte is associated with a page.
However, for TLB flushing purposes, we would like to know whether the pte
points to an actually accessible page. This allows us to skip remote TLB
flushes for pages that are not actually accessible.
Fill in this method for x86 and provide a safe (but slower) method
on other architectures.
Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Fixed-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-66p11te4uj23gevgh4j987ip@git.kernel.org
[ Added Linus's review fixes. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Fri, 12 Oct 2012 10:13:10 +0000 (12:13 +0200)]
sched, numa, mm: Describe the NUMA scheduling problem formally
This is probably a first: formal description of a complex high-level
computing problem, within the kernel source.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Mike Galbraith <efault@gmx.de>
Rik van Riel <riel@redhat.com> Link: http://lkml.kernel.org/n/tip-mmnlpupoetcatimvjEld16Pb@git.kernel.org
[ Next step: generate the kernel source from such formal descriptions and retire to a tropical island! ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
Rik van Riel [Sat, 27 Oct 2012 16:12:11 +0000 (12:12 -0400)]
mm/generic: Only flush the local TLB in ptep_set_access_flags()
The function ptep_set_access_flags() is only ever used to upgrade
access permissions to a page - i.e. they make it less restrictive.
That means the only negative side effect of not flushing remote
TLBs in this function is that other CPUs may incur spurious page
faults, if they happen to access the same address, and still have
a PTE with the old permissions cached in their TLB caches.
Having another CPU maybe incur a spurious page fault is faster
than always incurring the cost of a remote TLB flush, so replace
the remote TLB flush with a purely local one.
This should be safe on every architecture that correctly
implements flush_tlb_fix_spurious_fault() to actually invalidate
the local TLB entry that caused a page fault, as well as on
architectures where the hardware invalidates TLB entries that
cause page faults.
In the unlikely event that you are hitting what appears to be
an infinite loop of page faults, and 'git bisect' took you to
this changeset, your architecture needs to implement
flush_tlb_fix_spurious_fault() to actually flush the TLB entry.
Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michel Lespinasse <walken@google.com>
[ Changelog massage. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull networking fixes from David Miller:
"This is what we usually expect at this stage of the game, lots of
little things, mostly in drivers. With the occasional 'oops didn't
mean to do that' kind of regressions in the core code."
1) Uninitialized data in __ip_vs_get_timeouts(), from Arnd Bergmann
2) Reject invalid ACK sequences in Fast Open sockets, from Jerry Chu.
3) Lost error code on return from _rtl_usb_receive(), from Christian
Lamparter.
4) Fix reset resume on USB rt2x00, from Stanislaw Gruszka.
5) Release resources on error in pch_gbe driver, from Veaceslav Falico.
6) Default hop limit not set correctly in ip6_template_metrics[], fix
from Li RongQing.
7) Gianfar PTP code requests wrong kind of resource during probe, fix
from Wei Yang.
8) Fix VHOST net driver on big-endian, from Michael S Tsirkin.
9) Mallenox driver bug fixes from Jack Morgenstein, Or Gerlitz, Moni
Shoua, Dotan Barak, and Uri Habusha.
10) usbnet leaks memory on TX path, fix from Hemant Kumar.
11) Use socket state test, rather than presence of FIN bit packet, to
determine FIONREAD/SIOCINQ value. Fix from Eric Dumazet.
12) Fix cxgb4 build failure, from Vipul Pandya.
13) Provide a SYN_DATA_ACKED state to complement SYN_FASTOPEN in socket
info dumps. From Yuchung Cheng.
14) Fix leak of security path in kfree_skb_partial(). Fix from Eric
Dumazet.
15) Handle RX FIFO overflows more resiliently in pch_gbe driver, from
Veaceslav Falico.
16) Fix MAINTAINERS file pattern for networking drivers, from Jean
Delvare.
17) Add iPhone5 IDs to IPHETH driver, from Jay Purohit.
18) VLAN device type change restriction is too strict, and should not
trigger for the automatically generated vlan0 device. Fix from Jiri
Pirko.
19) Make PMTU/redirect flushing work properly again in ipv4, from
Steffen Klassert.
20) Fix memory corruptions by using kfree_rcu() in netlink_release().
From Eric Dumazet.
21) More qmi_wwan device IDs, from Bjørn Mork.
22) Fix unintentional change of SNAT/DNAT hooks in generic NAT
infrastructure, from Elison Niven.
23) Fix 3.6.x regression in xt_TEE netfilter module, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (57 commits)
tilegx: fix some issues in the SW TSO support
qmi_wwan/cdc_ether: move Novatel 551 and E362 to qmi_wwan
net: usb: Fix memory leak on Tx data path
net/mlx4_core: Unmap UAR also in the case of error flow
net/mlx4_en: Don't use vlan tag value as an indication for vlan presence
net/mlx4_en: Fix double-release-range in tx-rings
bas_gigaset: fix pre_reset handling
vhost: fix mergeable bufs on BE hosts
gianfar_ptp: use iomem, not ioports resource tree in probe
ipv6: Set default hoplimit as zero.
NET_VENDOR_TI: make available for am33xx as well
pch_gbe: fix error handling in pch_gbe_up()
b43: Fix oops on unload when firmware not found
mwifiex: clean up scan state on error
mwifiex: return -EBUSY if specific scan request cannot be honored
brcmfmac: fix potential NULL dereference
Revert "ath9k_hw: Updated AR9003 tx gain table for 5GHz"
ath9k_htc: Add PID/VID for a Ubiquiti WiFiStation
rt2x00: usb: fix reset resume
rtlwifi: pass rx setup error code to caller
...
Linus Torvalds [Fri, 26 Oct 2012 21:59:01 +0000 (14:59 -0700)]
Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine fixes from Vinod Koul:
"Three fixes for slave dmanegine.
Two are for typo omissions in sifr dmaengine driver and the last one
is for the imx driver fixing a missing unlock"
* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: sirf: fix a typo in moving running dma_desc to active queue
dmaengine: sirf: fix a typo in dma_prep_interleaved
dmaengine: imx-dma: fix missing unlock on error in imxdma_xfer_desc()
Linus Torvalds [Fri, 26 Oct 2012 21:23:35 +0000 (14:23 -0700)]
Merge tag 'pm+acpi-for-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael J Wysocki:
- Fix for a memory leak in acpi_bind_one() from Jesper Juhl.
- Fix for an error code path memory leak in pm_genpd_attach_cpuidle()
from Jonghwan Choi.
- Fix for smp_processor_id() usage in preemptible code in powernow-k8
from Andreas Herrmann.
- Fix for a suspend-related memory leak in cpufreq stats from Xiaobing
Tu.
- Freezer fix for failure to clear PF_NOFREEZE along with PF_KTHREAD in
flush_old_exec() from Oleg Nesterov.
- acpi_processor_notify() fix from Alan Cox.
* tag 'pm+acpi-for-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: missing break
freezer: exec should clear PF_NOFREEZE along with PF_KTHREAD
Fix memory leak in cpufreq stats.
cpufreq / powernow-k8: Remove usage of smp_processor_id() in preemptible code
PM / Domains: Fix memory leak on error path in pm_genpd_attach_cpuidle
ACPI: Fix memory leak in acpi_bind_one()
Linus Torvalds [Fri, 26 Oct 2012 20:46:41 +0000 (13:46 -0700)]
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband fixes from Roland Dreier:
"Small batch of fixes for 3.7:
- Fix crash in error path in cxgb4
- Fix build error on 32 bits in mlx4
- Fix SR-IOV bugs in mlx4"
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
mlx4_core: Perform correct resource cleanup if mlx4_QUERY_ADAPTER() fails
mlx4_core: Remove annoying debug messages from SR-IOV flow
RDMA/cxgb4: Don't free chunk that we have failed to allocate
IB/mlx4: Synchronize cleanup of MCGs in MCG paravirtualization
IB/mlx4: Fix QP1 P_Key processing in the Primary Physical Function (PPF)
IB/mlx4: Fix build error on platforms where UL is not 64 bits
Linus Torvalds [Fri, 26 Oct 2012 17:26:36 +0000 (10:26 -0700)]
Merge tag 'usb-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg Kroah-Hartman:
"Here are a bunch of USB fixes for the 3.7-rc tree.
There's a lot of small USB serial driver fixes, and one larger one
(the mos7840 driver changes are mostly just moving code around to fix
problems.) Thanks to Johan Hovold for finding the problems and fixing
them all up.
Other than those, there is the usual new device ids, xhci bugfixes,
and gadget driver fixes, nothing out of the ordinary.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'usb-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (49 commits)
xhci: trivial: Remove assigned but unused ep_ctx.
xhci: trivial: Remove assigned but unused slot_ctx.
xhci: Fix missing break in xhci_evaluate_context_result.
xhci: Fix potential NULL ptr deref in command cancellation.
ehci: Add yet-another Lucid nohandoff pci quirk
ehci: fix Lucid nohandoff pci quirk to be more generic with BIOS versions
USB: mos7840: fix port_probe flow
USB: mos7840: fix port-data memory leak
USB: mos7840: remove invalid disconnect handling
USB: mos7840: remove NULL-urb submission
USB: qcserial: fix interface-data memory leak in error path
USB: option: fix interface-data memory leak in error path
USB: ipw: fix interface-data memory leak in error path
USB: mos7840: fix port-device leak in error path
USB: mos7840: fix urb leak at release
USB: sierra: fix port-data memory leak
USB: sierra: fix memory leak in probe error path
USB: sierra: fix memory leak in attach error path
USB: usb-wwan: fix multiple memory leaks in error paths
USB: keyspan: fix NULL-pointer dereferences and memory leaks
...
Linus Torvalds [Fri, 26 Oct 2012 17:25:31 +0000 (10:25 -0700)]
Merge tag 'staging-3.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg Kroah-Hartman:
"Here are some staging driver fixes for your 3.7-rc tree.
Nothing major here, a number of iio driver fixups that were causing
problems, some comedi driver bugfixes, and a bunch of tidspbridge
warning squashing and other regressions fixed from the 3.6 release.
All have been in the linux-next releases for a bit.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'staging-3.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (32 commits)
staging: tidspbridge: delete unused mmu functions
staging: tidspbridge: ioremap physical address of the stack segment in shm
staging: tidspbridge: ioremap dsp sync addr
staging: tidspbridge: change type to __iomem for per and core addresses
staging: tidspbridge: drop const from custom mmu implementation
staging: tidspbridge: request the right irq for mmu
staging: ipack: add missing include (implicit declaration of function 'kfree')
staging: ramster: depends on NET
staging: omapdrm: fix allocation size for page addresses array
staging: zram: Fix handling of incompressible pages
Staging: android: binder: Allow using highmem for binder buffers
Staging: android: binder: Fix memory leak on thread/process exit
staging: comedi: ni_labpc: fix possible NULL deref during detach
staging: comedi: das08: fix possible NULL deref during detach
staging: comedi: amplc_pc263: fix possible NULL deref during detach
staging: comedi: amplc_pc236: fix possible NULL deref during detach
staging: comedi: amplc_pc236: fix invalid register access during detach
staging: comedi: amplc_dio200: fix possible NULL deref during detach
staging: comedi: 8255_pci: fix possible NULL deref during detach
staging: comedi: ni_daq_700: fix dio subdevice regression
...
Linus Torvalds [Fri, 26 Oct 2012 17:24:51 +0000 (10:24 -0700)]
Merge tag 'driver-core-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg Kroah-Hartman:
"Here are a number of firmware core fixes for 3.7, and some other minor
fixes. And some documentation updates thrown in for good measure.
All have been in the linux-next tree for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'driver-core-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
Documentation:Chinese translation of Documentation/arm64/memory.txt
Documentation:Chinese translation of Documentation/arm64/booting.txt
Documentation:Chinese translation of Documentation/IRQ.txt
firmware loader: document kernel direct loading
sysfs: sysfs_pathname/sysfs_add_one: Use strlcat() instead of strcat()
dynamic_debug: Remove unnecessary __used
firmware loader: sync firmware cache by async_synchronize_full_domain
firmware loader: let direct loading back on 'firmware_buf'
firmware loader: fix one reqeust_firmware race
firmware loader: cancel uncache work before caching firmware
Linus Torvalds [Fri, 26 Oct 2012 17:24:19 +0000 (10:24 -0700)]
Merge tag 'char-misc-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg Kroah-Hartman:
"Here are some driver fixes for 3.7. They include extcon driver fixes,
a hyper-v bugfix, and two other minor driver fixes.
All of these have been in the linux-next releases for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'char-misc-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
sonypi: suspend/resume callbacks should be conditionally compiled on CONFIG_PM_SLEEP
Drivers: hv: Cleanup error handling in vmbus_open()
extcon : register for cable interest by cable name
extcon: trivial: kfree missed from remove path
extcon: driver model release call not needed
extcon: MAX77693: Add platform data for MUIC device to initialize registers
extcon: max77693: Use max77693_update_reg for rmw operations
extcon: Fix kerneldoc for extcon_set_cable_state and extcon_set_cable_state_
extcon: adc-jack: Add missing MODULE_LICENSE
extcon: adc-jack: Fix checking return value of request_any_context_irq
extcon: Fix return value in extcon_register_interest()
extcon: unregister compat link on cleanup
extcon: Unregister compat class at module unload to fix oops
extcon: optimising the check_mutually_exclusive function
extcon: standard cable names definition and declaration changed
extcon-max8997: remove usage of ret in max8997_muic_handle_charger_type_detach
extcon: Remove duplicate inclusion of extcon.h header file
Linus Torvalds [Fri, 26 Oct 2012 17:05:07 +0000 (10:05 -0700)]
VFS: don't do protected {sym,hard}links by default
In commit 800179c9b8a1 ("This adds symlink and hardlink restrictions to
the Linux VFS"), the new link protections were enabled by default, in
the hope that no actual application would care, despite it being
technically against legacy UNIX (and documented POSIX) behavior.
However, it does turn out to break some applications. It's rare, and
it's unfortunate, but it's unacceptable to break existing systems, so
we'll have to default to legacy behavior.
In particular, it has broken the way AFD distributes files, see
http://www.dwd.de/AFD/
along with some legacy scripts.
Distributions can end up setting this at initrd time or in system
scripts: if you have security problems due to link attacks during your
early boot sequence, you have bigger problems than some kernel sysctl
setting. Do:
Alternatively, we may at some point introduce a kernel config option
that sets these kinds of "more secure but not traditional" behavioural
options automatically.
Reported-by: Nick Bowler <nbowler@elliptictech.com> Reported-by: Holger Kiehl <Holger.Kiehl@dwd.de> Cc: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # v3.6 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 26 Oct 2012 17:03:22 +0000 (10:03 -0700)]
Merge tag 'sound-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Slightly a high amount of commits come from Adrian Knoth's HDSPM
driver fixes. Other than that, all small trival fixes or quirks that
are pretty driver-specific."
* tag 'sound-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: wm8994: Only enable extra BCLK cycles when required
ALSA: als3000: check for the kzalloc return value
ALSA: sound/isa/opti9xx/miro.c: eliminate possible double free
ALSA: hda - Fix silent headphone output from Toshiba P200
ALSA: hdspm - Fix coding style in CTL_ELEM macros
ALSA: hdspm - Fix typo in kcontrol element on RME MADI cards
ALSA: hdspm - Fix sync_in detection on AES/AES32
ALSA: hdspm - Fix sync_in reporting on RME MADI cards
ALSA: hdspm - Also report autosync_sample_rate on MADI and MADIface
ALSA: hdspm - Fix reported autosync_sample_rate
ALSA: hdspm - Fix sync check reporting on all RME HDSPM cards
ALSA: hdspm - Report external rate in slave mode on PCI MADI
ALSA: hdspm - Allow DDS/Varispeed to be set from userspace
ALSA: hda - add dock support for Thinkpad T430
ASoC: ux500_msp_i2s: Fix devm_* and return code merge error
ASoC: Ux500: Dispose of device nodes correctly
Linus Torvalds [Fri, 26 Oct 2012 17:01:43 +0000 (10:01 -0700)]
Merge branch 'fixes_for_linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull DMA-mapping revert from Marek Szyprowski:
"Due to my mistake, my previous pull request (merged as commit cff7b8ba60e3: "Merge branch 'fixes_for_linus' ..") contained a patch
which is aimed for v3.8 and lacks its dependences. This pull request
reverts it and fixes build break of ARM architecture."
* 'fixes_for_linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
Revert "ARM: dma-mapping: support debug_dma_mapping_error"
Linus Torvalds [Fri, 26 Oct 2012 16:35:46 +0000 (09:35 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"This fixes a couple of nasty page table initialization bugs which were
causing kdump regressions. A clean rearchitecturing of the code is in
the works - meanwhile these are reverts that restore the
best-known-working state of the kernel.
There's also EFI fixes and other small fixes."
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, mm: Undo incorrect revert in arch/x86/mm/init.c
x86: efi: Turn off efi_enabled after setup on mixed fw/kernel
x86, mm: Find_early_table_space based on ranges that are actually being mapped
x86, mm: Use memblock memory loop instead of e820_RAM
x86, mm: Trim memory in memblock to be page aligned
x86/irq/ioapic: Check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt
x86/efi: Fix oops caused by incorrect set_memory_uc() usage
x86-64: Fix page table accounting
Revert "x86/mm: Fix the size calculation of mapping tables"
MAINTAINERS: Add EFI git repository location
Linus Torvalds [Fri, 26 Oct 2012 16:35:00 +0000 (09:35 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Most of the kernel diffstat relates to a group of Intel P6 and KNC
(Xeon-Phi Knights Corner) PMU driver fixes, neither of which is in
heavy use, so we took the fixes.
The rest is diverse smallish fixes to the tooling and kernel side."
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Remove unused variable in nhmex_rbox_alter_er()
perf/x86: Enable overflow on Intel KNC with a custom knc_pmu_handle_irq()
perf/x86: Remove cpuc->enable check on Intl KNC event enable/disable
perf/x86: Make Intel KNC use full 40-bit width of counters
perf/x86/uncore: Handle pci_read_config_dword() errors
perf/x86: Remove P6 cpuc->enabled check
perf/x86: Update/fix generic events on P6 PMU
perf/x86: Fix P6 FP_ASSIST event constraint
perf, cpu hotplug: Use cached value of smp_processor_id()
perf, cpu hotplug: Run CPU_STARTING notifiers with irqs disabled
x86/perf: Fix virtualization sanity check
perf test: Fix exclude_guest parse events tests
perf tools: do not flush maps on COMM for perf report
perf help: Fix --help for builtins
perf trace: Check if sample raw_data field is set
perf trace: Validate syscall id before growing syscall table
Linus Torvalds [Fri, 26 Oct 2012 16:34:04 +0000 (09:34 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"This has our series of fixes for the next rc. The biggest batch is
from Jan Schmidt, fixing up some problems in our subvolume quota code
and fixing btrfs send/receive to work with the new extended inode
refs."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: do not bug when we fail to commit the transaction
Btrfs: fix memory leak when cloning root's node
Btrfs: Use btrfs_update_inode_fallback when creating a snapshot
Btrfs: Send: preserve ownership (uid and gid) also for symlinks.
Btrfs: fix deadlock caused by the nested chunk allocation
btrfs: Return EINVAL when length to trim is less than FSB
Btrfs: fix memory leak in btrfs_quota_enable()
Btrfs: send correct rdev and mode in btrfs-send
Btrfs: extended inode refs support for send mechanism
Btrfs: Fix wrong error handling code
Fix a sign bug causing invalid memory access in the ino_paths ioctl.
Btrfs: comment for loop in tree_mod_log_insert_move
Btrfs: fix extent buffer reference for tree mod log roots
Btrfs: determine level of old roots
Btrfs: tree mod log's old roots could still be part of the tree
Btrfs: fix a tree mod logging issue for root replacement operations
Btrfs: don't put removals from push_node_left into tree mod log twice
Andrew Vagin [Tue, 7 Aug 2012 12:56:05 +0000 (16:56 +0400)]
perf inject: Mark a dso if it's used
Otherwise they will be not written in an output file.
Signed-off-by: Andrew Vagin <avagin@openvz.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1344344165-369636-5-git-send-email-avagin@openvz.org
[ committer note: Fixed up wrt changes made in the immediate previous patches ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andrew Vagin [Tue, 7 Aug 2012 12:56:04 +0000 (16:56 +0400)]
perf inject: Merge sched_stat_* and sched_switch events
You may want to know where and how long a task is sleeping. A callchain
may be found in sched_switch and a time slice in stat_iowait, so I add
handler in perf inject for merging this events.
My code saves sched_switch event for each process and when it meets
stat_iowait, it reports the sched_switch event, because this event
contains a correct callchain. By another words it replaces all
stat_iowait events on proper sched_switch events.
Namhyung Kim [Fri, 26 Oct 2012 08:55:52 +0000 (17:55 +0900)]
perf tools: Fix LIBELF_MMAP checking
Currently checking mmap support in libelf failed due to wrong flags.
CHK libelf
CHK libdw
CHK libunwind
CHK -DLIBELF_MMAP
/tmp/ccYJwdR0.o: In function `main':
:(.text+0x18): undefined reference to `elf_begin'
collect2: error: ld returned 1 exit status
This cannot happen since we checked the elf_begin() when checking
libelf and it succeeded.
Fix it by using a same flag with libelf checking.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Borislav Petkov <bp@amd64.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1351241752-2919-5-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 26 Oct 2012 08:55:48 +0000 (17:55 +0900)]
tools lib traceevent: Do not generate dependency for system header files
Ingo reported (again!) that 'make clean' on perf/traceevent does not
work due to some reason with system header file. Quotes Ingo:
"Note that the old dependency related build failure thought to be
fixed in commit 860df5833e46 is back:
make[1]: *** No rule to make target
`/usr/lib/gcc/x86_64-redhat-linux/4.7.0/include/stddef.h', needed by `.trace-seq.d'. Stop.
'make clean' itself does not work in libtraceevent:
comet:~/tip/tools/lib/traceevent> make clean
make: *** No rule to make target `/usr/lib/gcc/x86_64-redhat-linux/4.7.0/include/stddef.h', needed by `.trace-seq.d'. Stop.
Ingo Molnar [Fri, 26 Oct 2012 12:50:17 +0000 (14:50 +0200)]
Merge tag 'mca_cfg' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras into x86/ras
Pull x86 RAS changes from Borislav Petkov:
"Rework all config variables used throughout the MCA code and collect
them together into a mca_config struct. This keeps them tightly and
neatly packed together instead of spilled all over the place.
Then, convert those which are used as booleans into real booleans and
save some space."
Borislav Petkov [Wed, 17 Oct 2012 10:05:33 +0000 (12:05 +0200)]
x86, MCA: Finish mca_config conversion
mce_ser, mce_bios_cmci_threshold and mce_disabled are the last three
bools which need conversion. Move them to the mca_config struct and
adjust usage sites accordingly.
Signed-off-by: Borislav Petkov <bp@alien8.de> Acked-by: Tony Luck <tony.luck@intel.com>
Borislav Petkov [Tue, 9 Oct 2012 17:52:05 +0000 (19:52 +0200)]
drivers/base: Add a DEVICE_BOOL_ATTR macro
... which, analogous to DEVICE_INT_ATTR provides functionality to
set/clear bools. Its purpose is to be used where values need to be used
as booleans in configuration context.
Ingo Molnar [Fri, 26 Oct 2012 08:30:49 +0000 (10:30 +0200)]
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core trace improvements from Arnaldo Carvalho de Melo:
* Don't stop synthesizing threads when one vanishes, this is for
the existing threads when we start a tool like trace.
* Use sched:sched_stat_runtime to provide a thread summary, this
produces the same output as the 'trace summary' subcommand of
tglx's original "trace" tool.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
Chris Metcalf [Thu, 25 Oct 2012 07:25:20 +0000 (07:25 +0000)]
tilegx: fix some issues in the SW TSO support
This change correctly computes the header length and data length in
the fragments to avoid a bug where we would end up with extremely
slow performance. Also adopt use of skb_frag_size() accessor.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Cc: stable@vger.kernel.org [v3.6] Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Williams [Wed, 24 Oct 2012 12:10:34 +0000 (12:10 +0000)]
qmi_wwan/cdc_ether: move Novatel 551 and E362 to qmi_wwan
These devices provide QMI and ethernet functionality via a standard CDC
ethernet descriptor. But when driven by cdc_ether, the QMI
functionality is unavailable because only cdc_ether can claim the USB
interface. Thus blacklist the devices in cdc_ether and add their IDs to
qmi_wwan, which enables both QMI and ethernet simultaneously.
Signed-off-by: Dan Williams <dcbw@redhat.com> Cc: stable@vger.kernel.org Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
Hemant Kumar [Thu, 25 Oct 2012 18:17:54 +0000 (18:17 +0000)]
net: usb: Fix memory leak on Tx data path
Driver anchors the tx urbs and defers the urb submission if
a transmit request comes when the interface is suspended.
Anchoring urb increments the urb reference count. These
deferred urbs are later accessed by calling usb_get_from_anchor()
for submission during interface resume. usb_get_from_anchor()
unanchors the urb but urb reference count remains same.
This causes the urb reference count to remain non-zero
after usb_free_urb() gets called and urb never gets freed.
Hence call usb_put_urb() after anchoring the urb to properly
balance the reference count for these deferred urbs. Also,
unanchor these deferred urbs during disconnect, to free them
up.
Signed-off-by: Hemant Kumar <hemantk@codeaurora.org> Acked-by: Oliver Neukum <oneukum@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Dotan Barak [Thu, 25 Oct 2012 01:12:49 +0000 (01:12 +0000)]
net/mlx4_core: Unmap UAR also in the case of error flow
If a failure takes place during the EQ creation, we need to unmap the
UAR memory block too.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Uri Habusha <urih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Thu, 25 Oct 2012 01:12:47 +0000 (01:12 +0000)]
net/mlx4_en: Fix double-release-range in tx-rings
The QP range is reserved as a single block. However, when freeing the
en resources, the tx-ring QPs are released both in mlx4_en_destroy_tx_ring
(one at a time) and in mlx4_en_free_resources (as a block release).
Fix by eliminating the one-at-a-time release in mlx4_en_destroy_tx_ring.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Wed, 24 Oct 2012 08:44:32 +0000 (08:44 +0000)]
bas_gigaset: fix pre_reset handling
The delayed work function int_in_work() may call usb_reset_device()
and thus, indirectly, the driver's pre_reset method. Trying to
cancel the work synchronously in that situation would deadlock.
Fix by avoiding cancel_work_sync() in the pre_reset method.
If the reset was NOT initiated by int_in_work() this might cause
int_in_work() to run after the post_reset method, with urb_int_in
already resubmitted, so handle that case gracefully.
Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 26 Oct 2012 02:26:54 +0000 (19:26 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm radeon fixes from Dave Airlie:
"Just radeon fixes in this one:
- some new PCI IDs
- ATPX regression fix
- async VM regression fixes
- some module options fixes"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: fix ATPX regression in acpi rework
drm/radeon: fix ATPX function documentation
drm/radeon: move the retry to gem_object_create
drm/radeon: move size limits to gem_object_create.
drm/radeon: use vzalloc for gart pages
drm/radeon: fix and simplify pot argument checks v3
drm/radeon: fix header size estimation in VM code
drm/radeon: remove set_page check from VM code
drm/radeon: fix si_set_page v2
drm/radeon: fix cayman_vm_set_page v2
drm/radeon: fix PFP sync in vm_flush
drm/radeon: add error output if VM CS fails on cayman
drm/radeon: give each backlight a unique id
drm/radeon: fix sparse warning
drm/radeon: add some new SI PCI ids
Linus Torvalds [Fri, 26 Oct 2012 02:26:16 +0000 (19:26 -0700)]
Merge tag 'nfs-for-3.7-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS bugfixes from Trond Myklebust:
- Fix the NFSv2/v3 kernel statd protocol, which broke due to net
namespace related changes.
- Fix a number of races in the SUNRPC TCP disconnect/reconnect code.
* tag 'nfs-for-3.7-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
LOCKD: Clear ln->nsm_clnt only when ln->nsm_users is zero
LOCKD: fix races in nsm_client_get
SUNRPC: Get rid of the xs_error_report socket callback
SUNRPC: Prevent races in xs_abort_connection()
Revert "SUNRPC: Ensure we close the socket on EPIPE errors too..."
SUNRPC: Clear the connect flag when socket state is TCP_CLOSE_WAIT
Dave Airlie [Thu, 25 Oct 2012 01:36:05 +0000 (11:36 +1000)]
Merge branch 'drm-fixes-3.7' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Alex writes:
"Fixes pull request for radeon. The main things here are
fixing a ATPX regression from the acpi rework, fixing some
fallout from the async VM work, and fixing some module options
that were broken in certain cases. Other than that, mainly
just bug fixes."
* 'drm-fixes-3.7' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: fix ATPX regression in acpi rework
drm/radeon: fix ATPX function documentation
drm/radeon: move the retry to gem_object_create
drm/radeon: move size limits to gem_object_create.
drm/radeon: use vzalloc for gart pages
drm/radeon: fix and simplify pot argument checks v3
drm/radeon: fix header size estimation in VM code
drm/radeon: remove set_page check from VM code
drm/radeon: fix si_set_page v2
drm/radeon: fix cayman_vm_set_page v2
drm/radeon: fix PFP sync in vm_flush
drm/radeon: add error output if VM CS fails on cayman
drm/radeon: give each backlight a unique id
drm/radeon: fix sparse warning
drm/radeon: add some new SI PCI ids
Linus Torvalds [Thu, 25 Oct 2012 23:05:57 +0000 (16:05 -0700)]
Merge branch 'akpm' (Andrew's fixes)
Merge misc fixes from Andrew Morton:
"18 total. 15 fixes and some updates to a device_cgroup patchset which
bring it up to date with the version which I should have merged in the
first place."
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (18 patches)
fs/compat_ioctl.c: VIDEO_SET_SPU_PALETTE missing error check
gen_init_cpio: avoid stack overflow when expanding
drivers/rtc/rtc-imxdi.c: add missing spin lock initialization
mm, numa: avoid setting zone_reclaim_mode unless a node is sufficiently distant
pidns: limit the nesting depth of pid namespaces
drivers/dma/dw_dmac: make driver's endianness configurable
mm/mmu_notifier: allocate mmu_notifier in advance
tools/testing/selftests/epoll/test_epoll.c: fix build
UAPI: fix tools/vm/page-types.c
mm/page_alloc.c:alloc_contig_range(): return early for err path
rbtree: include linux/compiler.h for definition of __always_inline
genalloc: stop crashing the system when destroying a pool
backlight: ili9320: add missing SPI dependency
device_cgroup: add proper checking when changing default behavior
device_cgroup: stop using simple_strtoul()
device_cgroup: rename deny_all to behavior
cgroup: fix invalid rcu dereference
mm: fix XFS oops due to dirty pages without buffers on s390
Jason Gerecke [Sun, 21 Oct 2012 07:38:03 +0000 (00:38 -0700)]
Input: wacom - handle split-sensor devices with internal hubs
Like our other pen-and-touch products, the Cintiq 24HD touch needs data
to be shared between its two sensors to facilitate proximity-based palm
rejection.
Unlike other tablets that report sensor data through separate interfaces
of the same USB device, the Cintiq 24HD touch has separate USB devices
that are connected to an internal USB hub.
This patch makes it possible to designate the USB VID/PID of the other
device so that the two may share data. To ensure we don't accidentally
link to a sensor from a physically separate device (if several have been
plugged in), we limit the search to siblings (i.e., devices directly
connected to the same hub).
H. Peter Anvin [Wed, 24 Oct 2012 21:11:48 +0000 (14:11 -0700)]
Makefile: Documentation for external tool should be correct
If one includes documentation for an external tool, it should be
correct. This is not:
1. Overriding the input to rngd should typically be neither
necessary nor desired. This is especially so since newer
versions of rngd support a number of different *types* of sources.
2. The default kernel-exported device is called /dev/hwrng not
/dev/hwrandom nor /dev/hw_random (both of which were used in the
past; however, kernel and udev seem to have converged on
/dev/hwrng.)
Overall it is better if the documentation for rngd is kept with rngd
rather than in a kernel Makefile.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: David Howells <dhowells@redhat.com> Cc: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 25 Oct 2012 22:59:34 +0000 (15:59 -0700)]
Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM fixes from Russell King:
"A random collection of various fixes, mainly from Arnd and a few other
people. Not thing really stands out here."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: drop experimental status for hotplug and Thumb2
ARM: 7560/1: SMP_TWD: use DIV_ROUND_CLOSEST() for periodic mode
ARM: 7559/1: smp: switch away from the idmap before updating init_mm.mm_count
ARM: 7556/1: perf: fix updated event period in response to PERF_EVENT_IOC_PERIOD
ARM: 7555/1: kexec: fix segment memory addresses check
ARM: warnings in arch/arm/include/asm/uaccess.h
ARM: binfmt_flat: unused variable 'persistent'
ARM: be really quiet when building with 'make -s'
ARM: pass -marm to gcc by default for both C and assembler
ARM: Xen: fix initial build problems
ARM: export default read_current_timer
ARM: Fix another build warning in arch/arm/mm/alignment.c
ARM: export set_irq_flags
ARM: kprobes: make more tests conditional