git.karo-electronics.de Git - karo-tx-linux.git/log

USB: fix oops in cdc-wdm in case of malformed descriptors

upstream commit: e13c594f3a1fc2c78e7a20d1a07974f71e4b448f

cdc-wdm needs to ignore extremely malformed descriptors.

Signed-off-by: Oliver Neukum <oliver@neukum.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

USB: ftdi_sio: add vendor/project id for JETI specbos 1201 spectrometer

upstream commit: ae27d84351f1f3568118318a8c40ff3a154bd629

Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

usb gadget: fix ethernet link reports to ethtool

upstream commit: 237e75bf1e558f7330f8deb167fa3116405bef2c

The g_ether USB gadget driver currently decides whether or not there's a
link to report back for eth_get_link based on if the USB link speed is
set. The USB gadget speed is however often set even before the device is
enumerated. It seems more sensible to only report a "link" if we're
actually connected to a host that wants to talk to us. The patch below
does this for me - tested with the PXA27x UDC driver.

Signed-off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

x86: disable X86_PTRACE_BTS for now

upstream commit: d45b41ae8da0f54aec0eebcc6f893ba5f22a1e8e

Oleg Nesterov found a couple of races in the ptrace-bts code
and fixes are queued up for it but they did not get ready in time
for the merge window. We'll merge them in v2.6.31 - until then
mark the feature as CONFIG_BROKEN. There's no user-space yet
making use of this so it's not a big issue.

Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[chrisw: trivial 2.6.29 backport]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

SCSI: sg: fix q->queue_lock on scsi_error_handler path

upstream commit: 015640edb1f346e0b2eda703587c4cd1c310ec1d

sg_rq_end_io() is called via rq->end_io. In some rare cases,
sg_rq_end_io calls blk_put_request/blk_rq_unmap_user (when a program
issuing a command has gone before the command completion; e.g. by
interrupting a program issuing a command before the command
completes).

We can't call blk_put_request/blk_rq_unmap_user in interrupt so the
commit c96952ed7031e7c576ecf90cf95b8ec099d5295a uses
execute_in_process_context().

The problem is that scsi_error_handler() calls rq->end_io too. We
can't call blk_put_request/blk_rq_unmap_user too in this path (we hold
q->queue_lock).

To avoid the above problem, in these rare cases, this patch always
uses schedule_work() instead of execute_in_process_context().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

SCSI: sg: avoid blk_put_request/blk_rq_unmap_user in interrupt

upstream commit: c96952ed7031e7c576ecf90cf95b8ec099d5295a

This fixes the following oops:

http://marc.info/?l=linux-kernel&m=123316111415677&w=2

You can reproduce this bug by interrupting a program before a sg
response completes. This leads to the special sg state (the orphan
state), then sg calls blk_put_request in interrupt (rq->end_io).

The above bug report shows the recursive lock problem because sg calls
blk_put_request in interrupt. We could call __blk_put_request here
instead however we also need to handle blk_rq_unmap_user here, which
can't be called in interrupt too.

In the orphan state, we don't need to care about the data transfer
(the program revoked the command) so adding 'just free the resource'
mode to blk_rq_unmap_user is a possible option.

I prefer to avoid complicating the blk mapping API when possible. I
change the orphan state to call sg_finish_rem_req via
execute_in_process_context. We hold sg_fd->kref so sg_fd doesn't go
away until keventd_wq finishes our work. copy_from_user/to_user fails
so blk_rq_unmap_user just frees the resource without the data
transfer.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

SCSI: sg: fix races with ioctl(SG_IO)

upstream commit: a2dd3b4cea335713b58996bb07b3abcde1175f47

sg_io_owned needs to be set before the command is sent to the midlevel;
otherwise, a quickly-completing command may cause a different CPU
to see "srp->done == 1 && !srp->sg_io_owned", which would lead to
incorrect behavior.

Check srp->done and set srp->orphan while holding rq_list_lock to
prevent races with sg_rq_end_io().

There is no need to check sfp->closed from read/write/ioctl/poll/etc.
since the kernel guarantees that this won't happen.

The usefulness of sg_srp_done() was questionable before; now it is
definitely not needed.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

SCSI: sg: fix races during device removal

upstream commit: c6517b7942fad663cc1cf3235cbe4207cf769332

sg has the following problems related to device removal:

* opening a sg fd races with removing a device
* closing a sg fd races with removing a device
* /proc/scsi/sg/* access races with removing a device
* command completion races with removing a device
* command completion races with closing a sg fd
* can rmmod sg with active commands

These problems can cause kernel oopses, memory-use-after-free, or
double-free errors.  This patch fixes these problems by using krefs
to manage the lifetime of sg_device and sg_fd.

Each command submitted to the midlevel holds a reference to sg_fd
until the completion callback.  This ensures that sg_fd doesn't go
away if the fd is closed with commands still outstanding.

sg_fd gets the reference of sg_device (with scsi_device) and also
makes sure that the sg module doesn't go away.

/proc/scsi/sg/* functions don't play nicely with krefs because they
give information about sg_fds which have been closed but not yet
freed due to still having outstanding commands and sg_devices which
have been removed but not yet freed due to still being referenced
by one or more sg_fds.  To deal with this safely without removing
functionality, /proc functions now access sg_device and sg_fd while
holding a lock instead of using kref_get()/kref_put().

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
[chrisw: big for -stable, helps fix real bug, and made it through rc2 upstream]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

mm: pass correct mm when growing stack

upstream commit: 05fa199d45c54a9bda7aa3ae6537253d6f097aa9

Tetsuo Handa reports seeing the WARN_ON(current->mm == NULL) in
security_vm_enough_memory(), when do_execve() is touching the
target mm's stack, to set up its args and environment.

Yes, a UMH_NO_WAIT or UMH_WAIT_PROC call_usermodehelper() spawns
an mm-less kernel thread to do the exec. And in any case, that
vm_enough_memory check when growing stack ought to be done on the
target mm, not on the execer's mm (though apart from the warning,
it only makes a slight tweak to OVERCOMMIT_NEVER behaviour).

Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

pata_hpt37x: fix HPT370 DMA timeouts

upstream commit: 265b7215aed36941620b65ecfff516200fb190c1

The libata driver has copied the code from the IDE driver which caused a post
2.4.18 regression on many HPT370[A] chips -- DMA stopped to work completely,
only causing timeouts. Now remove hpt370_bmdma_start() for good...

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

hpt366: fix HPT370 DMA timeouts

upstream commit: c018f1ee5cf81e58b93d9e93a2ee39cad13dc1ac

The big driver change in 2.4.19-rc1 introduced a regression for many HPT370[A]
chips -- DMA stopped to work completely, only causing endless timeouts...

The culprit has been identified (at last!): it turned to be the code resetting
the DMA state machine before each transfer. Stop doing it now as this counter-
measure has clearly caused more harm than good.

This should fix the kernel.org bug #7703.

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

powerpc: Fix data-corrupting bug in __futex_atomic_op

upstream commit: 306a82881b14d950d59e0b59a55093a07d82aa9a

Richard Henderson pointed out that the powerpc __futex_atomic_op has a
bug: it will write the wrong value if the stwcx. fails and it has to
retry the lwarx/stwcx. loop, since 'oparg' will have been overwritten
by the result from the first time around the loop. This happens
because it uses the same register for 'oparg' (an input) as it uses
for the result.

This fixes it by using separate registers for 'oparg' and 'ret'.

Cc: stable@kernel.org
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ALSA: hda - Fix the cmd cache keys for amp verbs

upstream commit: fcad94a4c71c36a05f4d5c6dcb174534b4e0b136

Fix the key value generation for get/set amp verbs. The upper bits of
the parameter have to be combined with the verb value to be unique for
each direction/index of amp access.

This fixes the resume problem on some hardwares like Macbook after
the channel mode is changed.

Tested-by: Johannes Berg <johannes@sipsolutions.net>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

sfc: Match calls to netif_napi_add() and netif_napi_del()

upstream commit: 718cff1eec595ce6ab0635b8160a51ee37d9268d

sfc could call netif_napi_add() multiple times for the same
napi_struct, corrupting the list of napi_structs for the associated
device and leading to a busy-loop on device removal. Move the call to
netif_napi_add() and add a call to netif_napi_del() in the obvious
places.

[bhutchings: backport to 2.6.29]
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

tty: Fix leak in ti-usb

upstream commit: cf5450930db0ae308584e5361f3345e0ff73e643

If the ti-usb adapter returns an zero data length frame (which happens)
then we leak a kref. Found by Christoph Mair <christoph.mair@gmail.com>
who proposed a patch. The patch here is different as Christoph's patch
didn't work for the case where tty = NULL and data arrived but Christoph
did all the hard work chasing it down.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

spi: spi_write_then_read() bugfixes

upstream commit: bdff549ebeff92b1a6952e5501caf16a6f8898c8

The "simplify spi_write_then_read()" patch included two regressions from
the 2.6.27 behaviors:

- The data it wrote out during the (full duplex) read side
   of the transfer was not zeroed.

- It fails completely on half duplex hardware, such as
   Microwire and most "3-wire" SPI variants.

So, revert that patch.  A revised version should be submitted at some
point, which can get the speedup on standard hardware (full duplex)
without breaking on less-capable half-duplex stuff.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: <stable@kernel.org> [2.6.28.x, 2.6.29.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

add some long-missing capabilities to fs_mask

upstream commit: 0ad30b8fd5fe798aae80df6344b415d8309342cc

When POSIX capabilities were introduced during the 2.1 Linux
cycle, the fs mask, which represents the capabilities which having
fsuid==0 is supposed to grant, did not include CAP_MKNOD and
CAP_LINUX_IMMUTABLE. However, before capabilities the privilege
to call these did in fact depend upon fsuid==0.

This patch introduces those capabilities into the fsmask,
restoring the old behavior.

See the thread starting at http://lkml.org/lkml/2009/3/11/157 for
reference.

Note that if this fix is deemed valid, then earlier kernel versions (2.4
and 2.2) ought to be fixed too.

Changelog:
[Mar 23] Actually delete old CAP_FS_SET definition...
[Mar 20] Updated against J. Bruce Fields's patch

Reported-by: Igor Zhbanov <izh1979@gmail.com>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: stable@kernel.org
Cc: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

hrtimer: fix rq->lock inversion (again)

upstream commit: 7f1e2ca9f04b02794597f60e7b1d43f0a1317939

It appears I inadvertly introduced rq->lock recursion to the
hrtimer_start() path when I delegated running already expired
timers to softirq context.

This patch fixes it by introducing a __hrtimer_start_range_ns()
method that will not use raise_softirq_irqoff() but
__raise_softirq_irqoff() which avoids the wakeup.

It then also changes schedule() to check for pending softirqs and
do the wakeup then, I'm not quite sure I like this last bit, nor
am I convinced its really needed.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
LKML-Reference: <20090313112301.096138802@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

x86: fix broken irq migration logic while cleaning up multiple vectors

upstream commit: 68a8ca593fac82e336a792226272455901fa83df

Impact: fix spurious IRQs

During irq migration, we send a low priority interrupt to the previous
irq destination. This happens in non interrupt-remapping case after interrupt
starts arriving at new destination and in interrupt-remapping case after
modifying and flushing the interrupt-remapping table entry caches.

This low priority irq cleanup handler can cleanup multiple vectors, as
multiple irq's can be migrated at almost the same time. While
there will be multiple invocations of irq cleanup handler (one cleanup
IPI for each irq migration), first invocation of the cleanup handler
can potentially cleanup more than one vector (as the first invocation can
see the requests for more than vector cleanup). When we cleanup multiple
vectors during the first invocation of the smp_irq_move_cleanup_interrupt(),
other vectors that are to be cleanedup can still be pending in the local
cpu's IRR (as smp_irq_move_cleanup_interrupt() runs with interrupts disabled).

When we are ready to unhook a vector corresponding to an irq, check if that
vector is registered in the local cpu's IRR. If so skip that cleanup and
do a self IPI with the cleanup vector, so that we give a chance to
service the pending vector interrupt and then cleanup that vector
allocation once we execute the lowest priority handler.

This fixes spurious interrupts seen when migrating multiple vectors
at the same time.

[ This is apparently possible even on conventional xapic, although to
the best of our knowledge it has never been seen. The stable
maintainers may wish to consider this one for -stable. ]

[suresh: backport to 2.6.29]
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: stable@kernel.org
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

sched: do not count frozen tasks toward load

upstream commit: e3c8ca8336707062f3f7cb1cd7e6b3c753baccdd

Freezing tasks via the cgroup freezer causes the load average to climb
because the freezer's current implementation puts frozen tasks in
uninterruptible sleep (D state).

Some applications which perform job-scheduling functions consult the
load average when making decisions.  If a cgroup is frozen, the load
average does not provide a useful measure of the system's utilization
to such applications.  This is especially inconvenient if the job
scheduler employs the cgroup freezer as a mechanism for preempting low
priority jobs.  Contrast this with using SIGSTOP for the same purpose:
the stopped tasks do not count toward system load.

Change task_contributes_to_load() to return false if the task is
frozen.  This results in /proc/loadavg behavior that better meets
users' expectations.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Nigel Cunningham <nigel@tuxonice.net>
Tested-by: Nigel Cunningham <nigel@tuxonice.net>
Cc: <stable@kernel.org>
Cc: containers@lists.linux-foundation.org
Cc: linux-pm@lists.linux-foundation.org
Cc: Matt Helsley <matthltc@us.ibm.com>
LKML-Reference: <20090408194512.47a99b95@manatee.lan>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

dm kcopyd: fix callback race

upstream commit: 340cd44451fb0bfa542365e6b4b565bbd44836e2

If the thread calling dm_kcopyd_copy is delayed due to scheduling inside
split_job/segment_complete and the subjobs complete before the loop in
split_job completes, the kcopyd callback could be invoked from the
thread that called dm_kcopyd_copy instead of the kcopyd workqueue.

dm_kcopyd_copy -> split_job -> segment_complete -> job->fn()

Snapshots depend on the fact that callbacks are called from the singlethreaded
kcopyd workqueue and expect that there is no racing between individual
callbacks. The racing between callbacks can lead to corruption of exception
store and it can also mean that exception store callbacks are called twice
for the same exception - a likely reason for crashes reported inside
pending_complete() / remove_exception().

This patch fixes two problems:

1. job->fn being called from the thread that submitted the job (see above).

- Fix: hand over the completion callback to the kcopyd thread.

2. job->fn(read_err, write_err, job->context); in segment_complete
reports the error of the last subjob, not the union of all errors.

- Fix: pass job->write_err to the callback to report all error bits
(it is done already in run_complete_job)

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

dm kcopyd: prepare for callback race fix

upstream commit: 73830857bca6f6c9dbd48e906daea50bea42d676

Use a variable in segment_complete() to point to the dm_kcopyd_client
struct and only release job->pages in run_complete_job() if any are
defined. These changes are needed by the next patch.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

posix-timers: fix RLIMIT_CPU && setitimer(CPUCLOCK_PROF)

upstream commit: 8f2e586567b1bad72dac7c3810fe9a2ef7117506

update_rlimit_cpu() tries to optimize out set_process_cpu_timer() in case
when we already have CPUCLOCK_PROF timer which should expire first. But it
uses cputime_lt() instead of cputime_gt().

Test case:

int main(void)
{
struct itimerval it = {
.it_value = { .tv_sec = 1000 },
};

assert(!setitimer(ITIMER_PROF, &it, NULL));

struct rlimit rl = {
.rlim_cur = 1,
.rlim_max = 1,
};

assert(!setrlimit(RLIMIT_CPU, &rl));

for (;;)
;

return 0;
}

Without this patch, the task is not killed as RLIMIT_CPU demands.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Peter Lojkin <ia6432@inbox.ru>
Cc: Roland McGrath <roland@redhat.com>
Cc: stable@kernel.org
LKML-Reference: <20090327000610.GA10108@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

posix-timers: fix RLIMIT_CPU && fork()

upstream commit: 6279a751fe096a21dc7704e918d570d3ff06e769

See http://bugzilla.kernel.org/show_bug.cgi?id=12911

copy_signal() copies signal->rlim, but RLIMIT_CPU is "lost". Because
posix_cpu_timers_init_group() sets cputime_expires.prof_exp = 0 and thus
fastpath_timer_check() returns false unless we have other expired cpu timers.

Change copy_signal() to set cputime_expires.prof_exp if we have RLIMIT_CPU.
Also, set cputimer.running = 1 in that case. This is not strictly necessary,
but imho makes sense.

Reported-by: Peter Lojkin <ia6432@inbox.ru>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Peter Lojkin <ia6432@inbox.ru>
Cc: Roland McGrath <roland@redhat.com>
Cc: stable@kernel.org
LKML-Reference: <20090327000607.GA10104@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

posixtimers, sched: Fix posix clock monotonicity

upstream commit: c5f8d99585d7b5b7e857fabf8aefd0174903a98c

Impact: Regression fix (against clock_gettime() backwarding bug)

This patch re-introduces a couple of functions, task_sched_runtime
and thread_group_sched_runtime, which was once removed at the
time of 2.6.28-rc1.

These functions protect the sampling of thread/process clock with
rq lock.  This rq lock is required not to update rq->clock during
the sampling.

i.e.
  The clock_gettime() may return
   ((accounted runtime before update) + (delta after update))
  that is less than what it should be.

v2 -> v3:
- Rename static helper function __task_delta_exec()
  to do_task_delta_exec() since -tip tree already has
  a __task_delta_exec() of different version.

v1 -> v2:
- Revises comments of function and patch description.
- Add note about accuracy of thread group's runtime.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org [2.6.28.x][2.6.29.x]
LKML-Reference: <49D1CC93.4080401@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

cap_prctl: don't set error to 0 at 'no_change'

upstream commit: 5bf37ec3e0f5eb79f23e024a7fbc8f3557c087f0

One-liner: capsh --print is broken without this patch.

In certain cases, cap_prctl returns error > 0 for success. However,
the 'no_change' label was always setting error to 0. As a result,
for example, 'prctl(CAP_BSET_READ, N)' would always return 0.
It should return 1 if a process has N in its bounding set (as
by default it does).

I'm keeping the no_change label even though it's now functionally
the same as 'error'.

Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

SCSI: libiscsi: fix iscsi pool error path

upstream commit: fd6e1c14b73dbab89cb76af895d5612e4a8b5522

Le lundi 30 mars 2009, Chris Wright a écrit :
> q->queue could be ERR_PTR(-ENOMEM) which will break unwinding
> on error. Make iscsi_pool_free more defensive.
>

Making the freeing of q->queue dependent on q->pool being set looks
really weird (although it is correct at the moment. But this seems
to be fixable in a much simpler way.

With the benefit that only the error case is slowed down. In both
cases we have a problem if q->queue contains an error value but it's
not -ENOMEM. Apparently this can't happen today, but it doesn't feel
right to assume this will always be true. Maybe it's the right time
to fix this as well.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
[chrisw: this is a fixlet to f474a37b, also in -stable]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

SCSI: libiscsi: fix iscsi pool error path

upstream commit: f474a37bc48667595b5653a983b635c95ed82a3b

Memory freeing in iscsi_pool_free() looks wrong to me. Either q->pool
can be NULL and this should be tested before dereferencing it, or it
can't be NULL and it shouldn't be tested at all. As far as I can see,
the only case where q->pool is NULL is on early error in
iscsi_pool_init(). One possible way to fix the bug is thus to not
call iscsi_pool_free() in this case (nothing needs to be freed anyway)
and then we can get rid of the q->pool check.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

sparc64: Fix bug in ("sparc64: Flush TLB before releasing pages.")

[ No upstream commit, this regression was added only to 2.6.29.1 ]

Unfortunately I merged an earlier version of commit
b6816b706138c3870f03115071872cad824f90b4 ("sparc64: Flush TLB before
releasing pages.") than what I actually tested and merged upstream.

Simply diffing asm/tlb_64.h in Linus's tree vs. what ended up in
2.6.29.1 confirms this.

Sync things up to fix BUG() triggers some users are seeing.

Reported-by: Dennis Gilmore <dennis@ausil.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ALSA: hda - add missing comma in ad1884_slave_vols

upstream commit: bca68467b59a24396554d8dd5979ee363c174854

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

splice: fix deadlock in splicing to file

upstream commit: 7bfac9ecf0585962fe13584f5cf526d8c8e76f17

There's a possible deadlock in generic_file_splice_write(),
splice_from_pipe() and ocfs2_file_splice_write():

- task A calls generic_file_splice_write()
- this calls inode_double_lock(), which locks i_mutex on both
   pipe->inode and target inode
- ordering depends on inode pointers, can happen that pipe->inode is
   locked first
- __splice_from_pipe() needs more data, calls pipe_wait()
- this releases lock on pipe->inode, goes to interruptible sleep
- task B calls generic_file_splice_write(), similarly to the first
- this locks pipe->inode, then tries to lock inode, but that is
   already held by task A
- task A is interrupted, it tries to lock pipe->inode, but fails, as
   it is already held by task B
- ABBA deadlock

Fix this by explicitly ordering locks: the outer lock must be on
target inode and the inner lock (which is later unlocked and relocked)
must be on pipe->inode.  This is OK, pipe inodes and target inodes
form two nonoverlapping sets, generic_file_splice_write() and friends
are not called with a target which is a pipe.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

netfilter: {ip, ip6, arp}_tables: fix incorrect loop detection

upstream commit: 1f9352ae2253a97b07b34dcf16ffa3b4ca12c558

Commit e1b4b9f ([NETFILTER]: {ip,ip6,arp}_tables: fix exponential worst-case
search for loops) introduced a regression in the loop detection algorithm,
causing sporadic incorrectly detected loops.

When a chain has already been visited during the check, it is treated as
having a standard target containing a RETURN verdict directly at the
beginning in order to not check it again. The real target of the first
rule is then incorrectly treated as STANDARD target and checked not to
contain invalid verdicts.

Fix by making sure the rule does actually contain a standard target.

Based on patch by Francis Dupont <Francis_Dupont@isc.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

kprobes: Fix locking imbalance in kretprobes

upstream commit: f02b8624fedca39886b0eef770dca70c2f0749b3

Fix locking imbalance in kretprobes:

=====================================
[ BUG: bad unlock balance detected! ]
-------------------------------------
kthreadd/2 is trying to release lock (&rp->lock) at:
[<c06b3080>] pre_handler_kretprobe+0xea/0xf4
but there are no more locks to release!

other info that might help us debug this:
1 lock held by kthreadd/2:
#0: (rcu_read_lock){..--}, at: [<c06b2b24>] __atomic_notifier_call_chain+0x0/0x5a

stack backtrace:
Pid: 2, comm: kthreadd Not tainted 2.6.29-rc8 #1
Call Trace:
[<c06ae498>] ? printk+0xf/0x17
[<c06b3080>] ? pre_handler_kretprobe+0xea/0xf4
[<c044ce6c>] print_unlock_inbalance_bug+0xc3/0xce
[<c0444d4b>] ? clocksource_read+0x7/0xa
[<c04450a4>] ? getnstimeofday+0x5f/0xf6
[<c044a9ca>] ? register_lock_class+0x17/0x293
[<c044b72c>] ? mark_lock+0x1e/0x30b
[<c0448956>] ? tick_dev_program_event+0x4a/0xbc
[<c0498100>] ? __slab_alloc+0xa5/0x415
[<c06b2fbe>] ? pre_handler_kretprobe+0x28/0xf4
[<c06b3080>] ? pre_handler_kretprobe+0xea/0xf4
[<c044cf1b>] lock_release_non_nested+0xa4/0x1a5
[<c06b3080>] ? pre_handler_kretprobe+0xea/0xf4
[<c044d15d>] lock_release+0x141/0x166
[<c06b07dd>] _spin_unlock_irqrestore+0x19/0x50
[<c06b3080>] pre_handler_kretprobe+0xea/0xf4
[<c06b20b5>] kprobe_exceptions_notify+0x1c9/0x43e
[<c06b2b02>] notifier_call_chain+0x26/0x48
[<c06b2b5b>] __atomic_notifier_call_chain+0x37/0x5a
[<c06b2b24>] ? __atomic_notifier_call_chain+0x0/0x5a
[<c06b2b8a>] atomic_notifier_call_chain+0xc/0xe
[<c0442d0d>] notify_die+0x2d/0x2f
[<c06b0f9c>] do_int3+0x1f/0x71
[<c06b0e84>] int3+0x2c/0x34
[<c042d476>] ? do_fork+0x1/0x288
[<c040221b>] ? kernel_thread+0x71/0x79
[<c043ed1b>] ? kthread+0x0/0x60
[<c043ed1b>] ? kthread+0x0/0x60
[<c04040b8>] ? kernel_thread_helper+0x0/0x10
[<c043ec7f>] kthreadd+0xac/0x148
[<c043ebd3>] ? kthreadd+0x0/0x148
[<c04040bf>] kernel_thread_helper+0x7/0x10

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Tested-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@kernel.org> [2.6.29.x, 2.6.28.x, 2.6.27.x]
LKML-Reference: <20090318113621.GB4129@in.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

acer-wmi: Blacklist Acer Aspire One

upstream commit: a74dd5fdabcd34c93e17e9c7024eeb503c92b048

The Aspire One's ACPI-WMI interface is a placeholder that does nothing,
and the invalid results that we get from it are now causing userspace
problems as acer-wmi always returns that the rfkill is enabled (i.e. the
radio is off, when it isn't). As it's hardware controlled, acer-wmi
isn't needed on the Aspire One either.

Thanks to Andy Whitcroft at Canonical for tracking down Ubuntu's userspace
issues to this.

Signed-off-by: Carlos Corbacho <carlos@strangeworlds.co.uk>
Reported-by: Andy Whitcroft <apw@canonical.com>
Cc: stable@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

crypto: shash - Fix unaligned calculation with short length

upstream commit: f4f689933c63e0fbfba62f2a80efb2b424b139ae

When the total length is shorter than the calculated number of unaligned bytes, the call to shash->update breaks. For example, calling crc32c on unaligned buffer with length of 1 can result in a system crash.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

net/netrom: Fix socket locking

upstream commit: cc29c70dd581f85ee7a3e7980fb031f90b90a2ab

Patch "af_rose/x25: Sanity check the maximum user frame size"
(commit 83e0bbcbe2145f160fbaa109b0439dae7f4a38a9) from Alan Cox got
locking wrong. If we bail out due to user frame size being too large,
we must unlock the socket beforehand.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

af_rose/x25: Sanity check the maximum user frame size

upstream commit: 83e0bbcbe2145f160fbaa109b0439dae7f4a38a9

CVE-2009-0795.

Otherwise we can wrap the sizes and end up sending garbage.

Closes #10423

Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

dm table: fix upgrade mode race

upstream commit: 570b9d968bf9b16974252ef7cbce73fa6dac34f3

upgrade_mode() sets bdev to NULL temporarily, and does not have any
locking to exclude anything from seeing that NULL.

In dm_table_any_congested() bdev_get_queue() can dereference that NULL and
cause a reported oops.

Fix this by not changing that field during the mode upgrade.

Cc: stable@kernel.org
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

dm: path selector use module refcount directly

upstream commit: aea9058801c0acfa2831af1714da412dfb0018c2

Fix refcount corruption in dm-path-selector

Refcounting with non-atomic ops under shared lock will corrupt the counter
in multi-processor system and may trigger BUG_ON().
Use module refcount.
# same approach as dm-target-use-module-refcount-directly.patch here
# https://www.redhat.com/archives/dm-devel/2008-December/msg00075.html

Typical oops:
  kernel BUG at linux-2.6.29-rc3/drivers/md/dm-path-selector.c:90!
  Pid: 11148, comm: dmsetup Not tainted 2.6.29-rc3-nm #1
  dm_put_path_selector+0x4d/0x61 [dm_multipath]
  Call Trace:
   [<ffffffffa031d3f9>] free_priority_group+0x33/0xb3 [dm_multipath]
   [<ffffffffa031d4aa>] free_multipath+0x31/0x67 [dm_multipath]
   [<ffffffffa031d50d>] multipath_dtr+0x2d/0x32 [dm_multipath]
   [<ffffffffa015d6c2>] dm_table_destroy+0x64/0xd8 [dm_mod]
   [<ffffffffa015b73a>] __unbind+0x46/0x4b [dm_mod]
   [<ffffffffa015b79f>] dm_swap_table+0x60/0x14d [dm_mod]
   [<ffffffffa015f963>] dev_suspend+0xfd/0x177 [dm_mod]
   [<ffffffffa0160250>] dm_ctl_ioctl+0x24c/0x29c [dm_mod]
   [<ffffffff80288cd3>] ? get_page_from_freelist+0x49c/0x61d
   [<ffffffffa015f866>] ? dev_suspend+0x0/0x177 [dm_mod]
   [<ffffffff802bf05c>] vfs_ioctl+0x2a/0x77
   [<ffffffff802bf4f1>] do_vfs_ioctl+0x448/0x4a0
   [<ffffffff802bf5a0>] sys_ioctl+0x57/0x7a
   [<ffffffff8020c05b>] system_call_fastpath+0x16/0x1b

Cc: stable@kernel.org
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

dm target: use module refcount directly

upstream commit: 5642b8a61a15436231adf27b2b1bd96901b623dd

The tt_internal's 'use' field is superfluous: the module's refcount can do
the work properly. An acceptable side-effect is that this increases the
reference counts reported by 'lsmod'.

Remove the superfluous test when removing a target module.

[Crash possible without this on SMP - agk]

Cc: stable@kernel.org
Signed-off-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

dm snapshot: avoid having two exceptions for the same chunk

upstream commit: 35bf659b008e83e725dcd30f542e38461dbb867c

We need to check if the exception was completed after dropping the lock.

After regaining the lock, __find_pending_exception checks if the exception
was already placed into &s->pending hash.

But we don't check if the exception was already completed and placed into
&s->complete hash. If the process waiting in alloc_pending_exception was
delayed at this point because of a scheduling latency and the exception
was meanwhile completed, we'd miss that and allocate another pending
exception for already completed chunk.

It would lead to a situation where two records for the same chunk exist
and potential data corruption because multiple snapshot I/Os to the
affected chunk could be redirected to different locations in the
snapshot.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

dm snapshot: avoid dropping lock in __find_pending_exception

upstream commit: c66213921c816f6b1b16a84911618ba9a363b134

It is uncommon and bug-prone to drop a lock in a function that is called with
the lock held, so this is moved to the caller.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

dm snapshot: refactor __find_pending_exception

upstream commit: 2913808eb56a6445a7b277eb8d17651c8defb035

Move looking-up of a pending exception from __find_pending_exception to another
function.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

dm io: make sync_io uninterruptible

upstream commit: b64b6bf4fd8b678a9f8477c11773c38a0a246a6d

If someone sends signal to a process performing synchronous dm-io call,
the kernel may crash.

The function sync_io attempts to exit with -EINTR if it has pending signal,
however the structure "io" is allocated on stack, so already submitted io
requests end up touching unallocated stack space and corrupting kernel memory.

sync_io sets its state to TASK_UNINTERRUPTIBLE, so the signal can't break out
of io_schedule() --- however, if the signal was pending before sync_io entered
while (1) loop, the corruption of kernel memory will happen.

There is no way to cancel in-progress IOs, so the best solution is to ignore
signals at this point.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

dm raid1: switch read_record from kmalloc to slab to save memory

upstream commit: 95f8fac8dc6139fedfb87746e0c8fda9b803cb46

With my previous patch to save bi_io_vec, the size of dm_raid1_read_record
is significantly increased (the vector list takes 3072 bytes on 32-bit machines
and 4096 bytes on 64-bit machines).

The structure dm_raid1_read_record used to be allocated with kmalloc,
but kmalloc aligns the size on the next power-of-two so an object
slightly greater than 4096 will allocate 8192 bytes of memory and half of
that memory will be wasted.

This patch turns kmalloc into a slab cache which doesn't have this
padding so it will reduce the memory consumed.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

vfs: skip I_CLEAR state inodes

upstream commit: b6fac63cc1f52ec27f29fe6c6c8494a2ffac33fd

clear_inode() will switch inode state from I_FREEING to I_CLEAR, and do so
_outside_ of inode_lock.  So any I_FREEING testing is incomplete without a
coupled testing of I_CLEAR.

So add I_CLEAR tests to drop_pagecache_sb(), generic_sync_sb_inodes() and
add_dquot_ref().

Masayoshi MIZUMA discovered the bug in drop_pagecache_sb() and Jan Kara
reminds fixing the other two cases.

Masayoshi MIZUMA has a nice panic flow:

=====================================================================
            [process A]               |        [process B]
|                                    |
|    prune_icache()                  | drop_pagecache()
|      spin_lock(&inode_lock)        |   drop_pagecache_sb()
|      inode->i_state |= I_FREEING;  |       |
|      spin_unlock(&inode_lock)      |       V
|          |                         |     spin_lock(&inode_lock)
|          V                         |         |
|      dispose_list()                |         |
|        list_del()                  |         |
|        clear_inode()               |         |
|          inode->i_state = I_CLEAR  |         |
|            |                       |         V
|            |                       |      if (inode->i_state & (I_FREEING|I_WILL_FREE))
|            |                       |              continue;           <==== NOT MATCH
|            |                       |
|            |                       | (DANGER from here on! Accessing disposing inode!)
|            |                       |
|            |                       |      __iget()
|            |                       |        list_move() <===== PANIC on poisoned list !!
V            V                       |
(time)
=====================================================================

Reported-by: Masayoshi MIZUMA <m.mizuma@jp.fujitsu.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[chrisw: backport to 2.6.29]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

dm: preserve bi_io_vec when resubmitting bios

upstream commit: a920f6b3accc77d9dddbc98a7426be23ee479625

Device mapper saves and restores various fields in the bio, but it doesn't save
bi_io_vec. If the device driver modifies this after a partially successful
request, dm-raid1 and dm-multipath may attempt to resubmit a bio that has
bi_size inconsistent with the size of vector.

To make requests resubmittable in dm-raid1 and dm-multipath, we must save
and restore the bio vector as well.

To reduce the memory overhead involved in this, we do not save the pages in a
vector and use a 16-bit field size if the page size is less than 65536.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ixgbe: Fix potential memory leak/driver panic issue while setting up Tx & Rx ring parameters

upstream commit: f9ed88549e2ec73922b788e3865282d221233662

While setting up the ring parameters using ethtool the driver can
panic or leak memory as ixgbe_open tries to setup tx & rx resources.
The updated logic will use ixgbe_down/up after successful allocation of
tx & rx resources

Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

mm: do_xip_mapping_read: fix length calculation

upstream commit: 58984ce21d315b70df1a43644df7416ea7c9bfd8

The calculation of the value nr in do_xip_mapping_read is incorrect. If
the copy required more than one iteration in the do while loop the copies
variable will be non-zero. The maximum length that may be passed to the
call to copy_to_user(buf+copied, xip_mem+offset, nr) is len-copied but the
check only compares against (nr > len).

This bug is the cause for the heap corruption Carsten has been chasing
for so long:

*** glibc detected *** /bin/bash: free(): invalid next size (normal): 0x00000000800e39f0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x200000b9b44]
/lib64/libc.so.6(cfree+0x8e)[0x200000bdade]
/bin/bash(free_buffered_stream+0x32)[0x80050e4e]
/bin/bash(close_buffered_stream+0x1c)[0x80050ea4]
/bin/bash(unset_bash_input+0x2a)[0x8001c366]
/bin/bash(make_child+0x1d4)[0x8004115c]
/bin/bash[0x8002fc3c]
/bin/bash(execute_command_internal+0x656)[0x8003048e]
/bin/bash(execute_command+0x5e)[0x80031e1e]
/bin/bash(execute_command_internal+0x79a)[0x800305d2]
/bin/bash(execute_command+0x5e)[0x80031e1e]
/bin/bash(reader_loop+0x270)[0x8001efe0]
/bin/bash(main+0x1328)[0x8001e960]
/lib64/libc.so.6(__libc_start_main+0x100)[0x200000592a8]
/bin/bash(clearerr+0x5e)[0x8001c092]

With this bug fix the commit 0e4a9b59282914fe057ab17027f55123964bc2e2
"ext2/xip: refuse to change xip flag during remount with busy inodes" can
be removed again.

Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Jared Hulbert <jaredeh@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

mm: define a UNIQUE value for AS_UNEVICTABLE flag

upstream commit: 9a896c9a48ac6704c0ce8ee081b836644d0afe40

A new "address_space flag"--AS_MM_ALL_LOCKS--was defined to use the next
available AS flag while the Unevictable LRU was under development. The
Unevictable LRU was using the same flag and "no one" noticed. Current
mainline, since 2.6.28, has same value for two symbolic flag names.

So, define a unique flag value for AS_UNEVICTABLE--up close to the other
flags, [at the cost of an additional #ifdef] so we'll notice next time.
Note that #ifdef is not actually required, if we don't mind having the
unused flag value defined.

Replace #defines with an enum.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: <stable@kernel.org> [2.6.28.x, 2.6.29.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

sysctl: fix suid_dumpable and lease-break-time sysctls

upstream commit: 8e654fba4a376f436bdfe361fc5cdbc87ac09b35

Arne de Bruijn points out that commit
76fdbb25f963de5dc1e308325f0578a2f92b1c2d ("coredump masking: bound
suid_dumpable sysctl") mistakenly limits lease-break-time instead of
suid_dumpable.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Reported-by: Arne de Bruijn <kernelbt@arbruijn.dds.nl>
Cc: Kawai, Hidehiro <hidehiro.kawai.ez@hitachi.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

cpumask: fix slab corruption caused by alloc_cpumask_var_node()

upstream commit: 4f032ac4122a77dbabf7a24b2739b2790448180f

Fix slab corruption caused by alloc_cpumask_var_node() overwriting the
tail end of an off-stack cpumask.

The function zeros out cpumask bits beyond the last possible cpu.  The
starting point for zeroing should be the beginning of the mask offset by a
byte count derived from the number of possible cpus.  The offset was
calculated in bits instead of bytes.  This resulted in overwriting the end
of the cpumask.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Acked-by: Mike Travis <travis.sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: <stable@kernel.org> [2.6.29.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ide-atapi: start DMA after issuing a packet command

upstream commit: 2eba08270990b99fb5429b76ee97184ddd272f7f

Apparently¹, some ATAPI devices want to see the packet command first
before enabling DMA otherwise they simply hang indefinitely. Reorder the
two steps and start DMA only after having issued the command first.

[1] http://marc.info/?l=linux-kernel&m=123835520317235&w=2

Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Reported-by: Michael Roth <mroth@nessie.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ide: drivers/ide/ide-atapi.c needs <linux/scatterlist.h>

upstream commit: 479edf065576aeed7ac99d10838bb3b4f870b5f9

On m68k:
| drivers/ide/ide-atapi.c: In function 'ide_io_buffers':
| drivers/ide/ide-atapi.c:87: error: implicit declaration of function 'sg_page'
| drivers/ide/ide-atapi.c:87: warning: passing argument 1 of 'PageHighMem' makes pointer from integer without a cast
| drivers/ide/ide-atapi.c:91: warning: passing argument 1 of 'kmap_atomic' makes pointer from integer without a cast
| drivers/ide/ide-atapi.c:96: error: implicit declaration of function 'sg_virt'
| drivers/ide/ide-atapi.c:96: warning: assignment makes pointer from integer without a cast
| drivers/ide/ide-atapi.c:107: error: implicit declaration of function 'sg_next'
| drivers/ide/ide-atapi.c:107: warning: assignment makes pointer from integer without a cast

[bart: Dmitri Vorobiev submitted similar patch fixing MIPS]

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Dmitri Vorobiev <dmitri.vorobiev@movial.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

V4L/DVB (10943): cx88: Prevent general protection fault on rmmod

upstream commit: 569b7ec73abf576f9a9e4070d213aadf2cce73cb

When unloading the cx8800 driver I sometimes get a general protection
fault. Analysis revealed a race in cx88_ir_stop(). It can be solved by
using a delayed work instead of a timer for infrared input polling.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

r8169: Reset IntrStatus after chip reset

upstream commit: d78ad8cbfe73ad568de38814a75e9c92ad0a907c

Original comment (Karsten):
On a MSI MS-6702E mainboard, when in rtl8169_init_one() for the first time
after BIOS has run, IntrStatus reads 5 after chip has been reset.
IntrStatus should equal 0 there, so patch changes IntrStatus reset to happen
after chip reset instead of before.

Remark (Francois):
Assuming that the loglevel of the driver is increased above NETIF_MSG_INTR,
the bug reveals itself with a typical "interrupt 0025 in poll" message
at startup. In retrospect, the message should had been read as an hint of
an unexpected hardware state several months ago :o(

Fixes (at least part of) https://bugzilla.redhat.com/show_bug.cgi?id=460747

Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Josep <josep.puigdemont@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

md/raid1 - don't assume newly allocated bvecs are initialised.

upstream commit: 303a0e11d0ee136ad8f53f747f3c377daece763b

Since commit d3f761104b097738932afcc310fbbbbfb007ef92
newly allocated bvecs aren't initialised to NULL, so we have
to be more careful about freeing a bio which only managed
to get a few pages allocated to it. Otherwise the resync
process crashes.

This patch is appropriate for 2.6.29-stable.

Cc: stable@kernel.org
Cc: "Jens Axboe" <jens.axboe@oracle.com>
Reported-by: Gabriele Tozzi <gabriele@tozzi.eu>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

SCSI: sg: fix iovec bugs introduced by the block layer conversion

upstream commit: 0fdf96b67ac2649cc1ddb29b316a0db11586c6a8

- needs to use copy_from_user for iovec before passing it to
blk_rq_map_user_iov().

- before the block layer conversion, if ->dxfer_len and sum of iovec
disagrees, the shorter one wins. However, currently sg returns
-EINVAL. This restores the old behavior.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

drm/i915: fix TV mode setting in property change

upstream commit: 7d6ff7851c23740c3813bdf457be638381774b69

Only set TV DAC in property change seems doesn't work, we have to
setup whole crtc pipe which assigned to TV alone.

Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com>
[anholt: Note that this should also fix the oops at startup with new 2D]
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

drm/i915: only set TV mode when any property changed

upstream commit: ebcc8f2eade76946dbb5d5c545b91f8157051aa8

If there's no real property change, don't need to set TV mode again.

Signed-off-by: Zhenyu Wang <zhenyu.z.wang@intel.com>
[anholt: checkpatch.pl fix]
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

drm: Use pgprot_writecombine in GEM GTT mapping to get the right bits for !PAT.

upstream commit: 1055f9ddad093f54dfd708a0f976582034d4ce1a

Otherwise, the PAGE_CACHE_WC would end up getting us a UC-only mapping, and
the write performance of GTT maps dropped 10x.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[anholt: cleaned up unused var]
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

drm/i915: check for -EINVAL from vm_insert_pfn

upstream commit: 959b887cf42fd63cf10e28a7f26126f78aa1c0b0

Indicates something is wrong with the mapping; and apparently triggers
in current kernels.

Signed-off-by: Jesse Barnes <jbarnes@virtuosugeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

drm/i915: Check for dev->primary->master before dereference.

upstream commit: 98787c057fdefdce6230ff46f2c1105835005a4c

I've hit the occasional oops inside i915_wait_ring() with an indication of
a NULL derefence of dev->primary->master. Adding a NULL check is
consistent with the other potential users of dev->primary->master.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

drm/i915: Sync crt hotplug detection with intel video driver

upstream commit: 771cb081354161eea21534ba58e5cc1a2db94a25

This covers:
Use long crt hotplug activation time on GM45.

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

drm/i915: Read the right SDVO register when detecting SVDO/HDMI.

upstream commit: 13520b051e8888dd3af9bda639d83e7df76613d1

This fixes incorrect detection of the second SDVO/HDMI output on G4X, and
extra boot time on pre-G4X.

Signed-off-by: Kristian Høgsberg <krh@redhat.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

drm/i915: Change DCC tiling detection case to cover only mobile parts.

upstream commit: 568d9a8f6d4bf81e0672c74573dc02981d31e3ea

Later spec investigation has revealed that every 9xx mobile part has
had this register in this format. Also, no non-mobile parts have been shown
to have this register. So make all mobile use the same code, and all
non-mobile use the hack 965 detection.

Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

dock: fix dereference after kfree()

upstream commit: f240729832dff3785104d950dad2d3ced4387f6d

dock_remove() calls kfree() on dock_station so we should use
list_for_each_entry_safe() to avoid dereferencing freed memory.

Found by smatch (http://repo.or.cz/w/smatch.git/). Compile tested.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ACPI: cap off P-state transition latency from buggy BIOSes

upstream commit: a59d1637eb0e0a37ee0e5c92800c60abe3624e24

Some BIOSes report very high frequency transition latency which are plainly
wrong on CPus that can change frequency using native MSR interface.

One such system is IBM T42 (2327-8ZU) as reported by Owen Taylor and
Rik van Riel.

cpufreq_ondemand driver uses this transition latency to come up with a
reasonable sampling interval to sample CPU usage and with such high
latency value, ondemand sampling interval ends up being very high
(0.5 sec, in this particular case), resulting in performance impact due to
slow response to increasing frequency.

Fix it by capping-off the transition latency to 20uS for native MSR based
frequency transitions.

mjg: We've confirmed that this also helps on the X31

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Acked-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

x86, setup: mark %esi as clobbered in E820 BIOS call

upstream commit: 01522df346f846906eaf6ca57148641476209909

Jordan Hargrave diagnosed a BIOS clobbering %esi in the E820 call.
That particular BIOS has been fixed, but there is a possibility that
this is responsible for other occasional reports of early boot
failure, and it does not hurt to add %esi to the clobbers.

-stable candidate patch.

Cc: Justin Forbes <jmforbes@linuxtx.org>
Signed-off-by: Michael K Johnson <johnsonm@rpath.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: stable@kernel.org
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

tracing/core: fix early free of cpumasks

upstream commit: 2fc1dfbe17e7705c55b7a99da995fa565e26f151

Impact: fix crashes when tracing cpumasks

While ring-buffer allocation, the cpumasks are allocated too,
including the tracing cpumask and the per-cpu file mask handler.
But these cpumasks are freed accidentally just after.
Fix it.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1237164303-11476-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

rt2x00: Fix SLAB corruption during rmmod

At rmmod stage, the code path is the following one :

rt2x00lib_remove_dev
  ->  rt2x00lib_uninitialize()
        -> rt2x00rfkill_unregister()
             -> rfkill_unregister()
        -> rt2x00rfkill_free()
             -> rfkill_free()

The problem is that rfkill_free should not be called after rfkill_register
otherwise put_device(&rfkill->dev) will be called 2 times. This patch
fixes this by only calling rt2x00rfkill_free() when rt2x00rfkill_register()
hasn't been called or has failed.

This patch is for 2.6.29 only. The code in question has completely disappeared
in 2.6.30 and does not contain this bug.

Signed-off-by: Gertjan van Wingerde <gwingerde@gmail.com>
Tested-by: Arnaud Patard <apatard@mandriva.com>
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ext4: fix locking typo in mballoc which could cause soft lockup hangs

upstream commit: e7c9e3e99adf6c49c5d593a51375916acc039d1e

Smatch (http://repo.or.cz/w/smatch.git/) complains about the locking in
ext4_mb_add_n_trim() from fs/ext4/mballoc.c

  4438          list_for_each_entry_rcu(tmp_pa, &lg->lg_prealloc_list[order],
  4439                                                  pa_inode_list) {
  4440                  spin_lock(&tmp_pa->pa_lock);
  4441                  if (tmp_pa->pa_deleted) {
  4442                          spin_unlock(&pa->pa_lock);
  4443                          continue;
  4444                  }

Brown paper bag time...

Reported-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ext4: fix typo which causes a memory leak on error path

upstream commit: a7b19448ddbdc34b2b8fedc048ba154ca798667b

This was found by smatch (http://repo.or.cz/w/smatch.git/)

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

MIPS: Compat: Zero upper 32-bit of offset_high and offset_low.

upstream commit: d6c178e9694e7e0c7ffe0289cf4389a498cac735

Through sys_llseek() arguably should do exactly that it doesn't which
means llseek(2) will fail for o32 processes if offset_low has bit 31 set.

As suggested by Heiko Carstens.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

PCI/x86: detect host bridge config space size w/o using quirks

upstream commit: dfadd9edff498d767008edc6b2a6e86a7a19934d

Many host bridges support a 4k config space, so check them directy
instead of using quirks to add them.

We only need to do this extra check for host bridges at this point,
because only host bridges are known to have extended address space
without also having a PCI-X/PCI-E caps. Other devices with this
property could be done with quirks (if there are any).

As a bonus, we can remove the quirks for AMD host bridges with family
10h and 11h since they're not needed any more.

With this patch, we can get correct pci cfg size of new Intel CPUs/IOHs
with host bridges.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ide: Fix code dealing with sleeping devices in do_ide_request()

upstream commit: 9010941c5483a7a5bb1f7d97ee62491fb078bb51

Unfortunately, I missed a catch when reviewing the patch committed as
201bffa4. Here is the fix to the currently broken handling of sleeping
devices. In particular, this is required to get the disk shock
protection code working again.

Reported-by: Christian Thaeter <ct@pipapo.org>
Cc: stable@kernel.org
Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

fbdev: fix info->lock deadlock in fbcon_event_notify()

upstream commit: 513adb58685615b0b1d47a3f0d40f5352beff189

fb_notifier_call_chain() is called with info->lock held, i.e. in
do_fb_ioctl() => FBIOPUT_VSCREENINFO => fb_set_var() and the some
notifier callbacks, like fbcon_event_notify(), try to re-acquire
info->lock again.

Remove the lock/unlock_fb_info() in all the framebuffer notifier
callbacks' and be sure to always call fb_notifier_call_chain() with
info->lock held.

[fixes hang caused by 66c1ca01]

Reported-by: Pavel Roskin <proski@gnu.org>
Reported-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

fbmem: fix fb_info->lock and mm->mmap_sem circular locking dependency

upstream commit: 66c1ca019078220dc1bf968f2bb18421100ef147

Fix a circular locking dependency in the frame buffer console driver
pushing down the mutex fb_info->lock.

Circular locking dependecies occur calling the blocking
fb_notifier_call_chain() with fb_info->lock held. Notifier callbacks can
try to acquire mm->mmap_sem, while fb_mmap() acquires the locks in the
reverse order mm->mmap_sem => fb_info->lock.

Tested-by: Andrey Borzenkov <arvidjaar@mail.ru>
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

security/smack: fix oops when setting a size 0 SMACK64 xattr

upstream commit: 4303154e86597885bc3cbc178a48ccbc8213875f

this patch fix an oops in smack when setting a size 0 SMACK64 xattr eg
attr -S -s SMACK64 -V '' somefile
This oops because smk_import_entry treats a 0 length as SMK_MAXLEN

Signed-off-by: Etienne Basset <etienne.basset@numericable.fr>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

Linux 2.6.29.1

V4L: v4l2-common: remove incorrect MODULE test

upstream commit: d64260d58865004c6354e024da3450fdd607ea07

v4l2-common doesn't have to be a module for it to call request_module().
Just remove that test.

Without this patch loading ivtv as a module while v4l2-common is compiled
into the kernel will cause a delayed load of the i2c modules that ivtv
needs since request_module is never called directly.

While it is nice to see the delayed load in action, it is not so nice in
that ivtv fails to do a lot of necessary i2c initializations and will oops
later on with a division-by-zero.

Thanks to Mark Lord for reporting this and helping me figure out what was
wrong.

Thanks-to: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Thanks-to: Mark Lord <lkml@rtr.ca>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

sparc64: Fix reset hangs on Niagara systems.

[ Upstream commit ffaba674090f287afe0c44fd8d978c64c03581a8 ]

Hypervisor versions older than version 1.6.1 cannot handle
leaving the profile counter overflow interrupt chirping
when the system does a soft reset.

So use a reboot notifier to shut off the NMI watchdog.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

sparc64: Flush TLB before releasing pages.

[ Upstream commit a552a42cfa91ab653128dff89a70c8dde7fed042 ]

tlb_flush_mmu() needs to flush pending TLB entries before
processing the mmu_gather ->pages list.

Noticed by Benjamin Herrenschmidt.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

sparc64: Fix MM refcount check in smp_flush_tlb_pending().

[ Upstream commit f9384d41c02408dd404aa64d66d0ef38adcf6479 ]

As explained by Benjamin Herrenschmidt:

> CPU 0 is running the context, task->mm == task->active_mm == your
> context. The CPU is in userspace happily churning things.
>
> CPU 1 used to run it, not anymore, it's now running fancyfsd which
> is a kernel thread, but current->active_mm still points to that
> same context.
>
> Because there's only one "real" user, mm_users is 1 (but mm_count is
> elevated, it's just that the presence on CPU 1 as active_mm has no
> effect on mm_count().
>
> At this point, fancyfsd decides to invalidate a mapping currently mapped
> by that context, for example because a networked file has changed
> remotely or something like that, using unmap_mapping_ranges().
>
> So CPU 1 goes into the zapping code, which eventually ends up calling
> flush_tlb_pending(). Your test will succeed, as current->active_mm is
> indeed the target mm for the flush, and mm_users is indeed 1. So you
> will -not- send an IPI to the other CPU, and CPU 0 will continue happily
> accessing the pages that should have been unmapped.

To fix this problem, check ->mm instead of ->active_mm, and this
means:

> So if you test current->mm, you effectively account for mm_users == 1,
> so the only way the mm can be active on another processor is as a lazy
> mm for a kernel thread. So your test should work properly as long
> as you don't have a HW that will do speculative TLB reloads into the
> TLB on that other CPU (and even if you do, you flush-on-switch-in should
> get rid of any crap here).

And therefore we should be OK.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

KVM: MMU: Fix another largepage memory leak

upstream commit: c5bc22424021cabda862727fb3f5098b866f074d

In the paging_fetch function rmap_remove is called after setting a large
pte to non-present. This causes rmap_remove to not drop the reference to
the large page. The result is a memory leak of that page.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
[chrisw: backport to 2.6.29]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

cfg80211: fix incorrect assumption on last_request for 11d

upstream commit: cc0b6fe88e99096868bdbacbf486c97299533b5a

The incorrect assumption is the last regulatory request
(last_request) is always a country IE when processing
country IEs. Although this is true 99% of the time the
first time this happens this could not be true.

This fixes an oops in the branch check for the last_request
when accessing drv_last_ie. The access was done under the
assumption the struct won't be null.

Note to stable: to port to 29 replace as follows, only 29 has
country IE code:

s|NL80211_REGDOM_SET_BY_COUNTRY_IE|REGDOM_SET_BY_COUNTRY_IE

Cc: stable@kernel.org
Reported-by: Quentin Armitage <Quentin@armitage.org.uk>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
[chrisw: backport to 2.6.29]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

lguest: fix spurious BUG_ON() on invalid guest stack.

upstream commit: 6afbdd059c27330eccbd85943354f94c2b83a7fe

Impact: fix crash on misbehaving guest

gpte_addr() contains a BUG_ON(), insisting that the present flag is
set. We need to return before we call it if that isn't the case.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

lguest: wire up pte_update/pte_update_defer

upstream commit: b7ff99ea53cd16de8f6166c0e98f19a7c6ca67ee

Impact: intermittent guest segv/crash fix

I've been seeing random guest bad address crashes and segmentation faults:
bisect led to 4f98a2fee8 (vmscan: split LRU lists into anon & file sets),
but that's a red herring.

It turns out that lguest never hooked up the pte_update/pte_update_defer
calls, so our ptes were not always in sync. After the vmscan commit, the
bug became reproducible; now a fsck in a 64MB guest causes reproducible
pagetable corruption.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: jeremy@xensource.com
Cc: virtualization@lists.osdl.org
Cc: stable@kernel.org
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

VM, x86, PAT: Change is_linear_pfn_mapping to not use vm_pgoff

upstream commit: 4bb9c5c02153dfc89a6c73a6f32091413805ad7d

Impact: fix false positive PAT warnings - also fix VirtalBox hang

Use of vma->vm_pgoff to identify the pfnmaps that are fully
mapped at mmap time is broken. vm_pgoff is set by generic mmap
code even for cases where drivers are setting up the mappings
at the fault time.

The problem was originally reported here:

http://marc.info/?l=linux-kernel&m=123383810628583&w=2

Change is_linear_pfn_mapping logic to overload VM_INSERTPAGE
flag along with VM_PFNMAP to mean full PFNMAP setup at mmap
time.

Problem also tracked at:

http://bugzilla.kernel.org/show_bug.cgi?id=12800

Reported-by: Thomas Hellstrom <thellstrom@vmware.com>
Tested-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: "ebiederm@xmission.com" <ebiederm@xmission.com>
Cc: <stable@kernel.org> # only for 2.6.29.1, not .28
LKML-Reference: <20090313004527.GA7176@linux-os.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

x86: mtrr: don't modify RdDram/WrDram bits of fixed MTRRs

upstream commit: 3ff42da5048649503e343a32be37b14a6a4e8aaf

Impact: bug fix + BIOS workaround

BIOS is expected to clear the SYSCFG[MtrrFixDramModEn] on AMD CPUs
after fixed MTRRs are configured.

Some BIOSes do not clear SYSCFG[MtrrFixDramModEn] on BP (and on APs).

This can lead to obfuscation in Linux when this bit is not cleared on
BP but cleared on APs. A consequence of this is that the saved
fixed-MTRR state (from BP) differs from the fixed-MTRRs of APs --
because RdDram/WrDram bits are read as zero when
SYSCFG[MtrrFixDramModEn] is cleared -- and Linux tries to sync
fixed-MTRR state from BP to AP. This implies that Linux sets
SYSCFG[MtrrFixDramEn] and activates those bits.

More important is that (some) systems change these bits in SMM when
ACPI is enabled. Hence it is racy if Linux modifies RdMem/WrMem bits,
too.

(1) The patch modifies an old fix from Bernhard Kaindl to get
    suspend/resume working on some Acer Laptops. Bernhard's patch
    tried to sync RdMem/WrMem bits of fixed MTRR registers and that
    helped on those old Laptops. (Don't ask me why -- can't test it
    myself). But this old problem was not the motivation for the
    patch. (See http://lkml.org/lkml/2007/4/3/110)

(2) The more important effect is to fix issues on some more current systems.

    On those systems Linux panics or just freezes, see

    http://bugzilla.kernel.org/show_bug.cgi?id=11541
    (and also duplicates of this bug:
    http://bugzilla.kernel.org/show_bug.cgi?id=11737
    http://bugzilla.kernel.org/show_bug.cgi?id=11714)

    The affected systems boot only using acpi=ht, acpi=off or
    when the kernel is built with CONFIG_MTRR=n.

    The acpi options prevent full enablement of ACPI.  Obviously when
    ACPI is enabled the BIOS/SMM modfies RdMem/WrMem bits.  When
    CONFIG_MTRR=y Linux also accesses and modifies those bits when it
    needs to sync fixed-MTRRs across cores (Bernhard's fix, see (1)).
    How do you synchronize that? You can't. As a consequence Linux
    shouldn't touch those bits at all (Rationale are AMD's BKDGs which
    recommend to clear the bit that makes RdMem/WrMem accessible).
    This is the purpose of this patch. And (so far) this suffices to
    fix (1) and (2).

I suggest not to touch RdDram/WrDram bits of fixed-MTRRs and
SYSCFG[MtrrFixDramEn] and to clear SYSCFG[MtrrFixDramModEn] as
suggested by AMD K8, and AMD family 10h/11h BKDGs.
BIOS is expected to do this anyway. This should avoid that
Linux and SMM tread on each other's toes ...

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: trenn@suse.de
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20090312163937.GH20716@alberich.amd.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

x86: ptrace, bts: fix an unreachable statement

upstream commit: 5a8ac9d28dae5330c70562c7d7785f5104059c17

Commit c2724775ce57c98b8af9694857b941dc61056516 put a statement
after return, which makes that statement unreachable.

Move that statement before return.

Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Markus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090313075622.GB8933@hack>
Cc: <stable@kernel.org> # .29 only
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

x86: fix 64k corruption-check

upstream commit: 6d7942dc2a70a7e74c352107b150265602671588

Impact: fix boot crash

Need to exit early if the addr is far above 64k.

The crash got exposed by:

78a8b35: x86: make e820_update_range() handle small range update

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: <stable@kernel.org>
LKML-Reference: <49BC2279.2030101@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

x86, uv: fix cpumask iterator in uv_bau_init()

upstream commit: 2c74d66624ddbda8101d54d1e184cf9229b378bc

Impact: fix boot crash on UV systems

Commit 76ba0ecda0de9accea9a91cb6dbde46782110e1c "cpumask: use
cpumask_var_t in uv_flush_tlb_others" used cur_cpu as an iterator;
it was supposed to be zero for the code below it.

Reported-by: Cliff Wickman <cpw@sgi.com>
Original-From: Cliff Wickman <cpw@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Cc: steiner@sgi.com
Cc: <stable@kernel.org>
LKML-Reference: <200903180822.31196.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

x86, PAT, PCI: Change vma prot in pci_mmap to reflect inherited prot

upstream commit: 9cdec049389ce2c324fd1ec508a71528a27d4a07

While looking at the issue in the thread:

http://marc.info/?l=dri-devel&m=123606627824556&w=2

noticed a bug in pci PAT code and memory type setting.

PCI mmap code did not set the proper protection in vma, when it
inherited protection in reserve_memtype. This bug only affects
the case where there exists a WC mapping before X does an mmap
with /proc or /sys pci interface. This will cause X userlevel
mmap from /proc or /sysfs to fail on fork.

Reported-by: Kevin Winchester <kjwinchester@gmail.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: <stable@kernel.org>
LKML-Reference: <20090323190720.GA16831@linux-os.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

Add a missing unlock_kernel() in raw_open()

upstream commit: 996ff68d8b358885c1de82a45517c607999947c7

Cc: stable@kernel.org
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

fuse: fix fuse_file_lseek returning with lock held

upstream commit: 5291658d87ac1ae60418e79e7b6bad7d5f595e0c

This bug was found with smatch (http://repo.or.cz/w/smatch.git/). If
we return directly the inode->i_mutex lock doesn't get released.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@kernel.org
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ARM: 5435/1: fix compile warning in sanity_check_meminfo()

upstream commit: f0bba9f934517533acbda7329be93f55d5a01c03

Compiling recent 2.6.29-rc kernels for ARM gives me the following warning:

arch/arm/mm/mmu.c: In function 'sanity_check_meminfo':
arch/arm/mm/mmu.c:697: warning: comparison between pointer and integer

This is because commit 3fd9825c42c784a59b3b90bdf073f49d4bb42a8d
"[ARM] 5402/1: fix a case of wrap-around in sanity_check_meminfo()"
in 2.6.29-rc5-git4 added a comparison of a pointer with PAGE_OFFSET,
which is an integer.

Fixed by casting PAGE_OFFSET to void *.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ARM: twl4030 - leak fix

upstream commit: 803c78e4da28d7d7cb0642caf643b9289ae7838a

Trivial error path leak fix. Problem found by Daniel Marjamäki using
cppcheck

Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ARM: fix leak in iop13xx/pci

upstream commit: b23c7a427e4b3764ad686a46de89ab652811c50a

Another leak found by Daniel Marjamäki

Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ARM: cumana: Fix a long standing bogon

upstream commit: ecbf61e7357d5c7047c813edd6983902d158688c

Should be using strncmp as the data from user space may be unterminated

(Bug #8004)

Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>