git.karo-electronics.de Git - karo-tx-linux.git/log

NET: Add preemption point in qdisc_run

Upstream commit: 2ba2506ca7ca62c56edaa334b0fe61eb5eab6ab0

The qdisc_run loop is currently unbounded and runs entirely in a
softirq. This is bad as it may create an unbounded softirq run.

This patch fixes this by calling need_resched and breaking out if
necessary.

It also adds a break out if the jiffies value changes since that would
indicate we've been transmitting for too long which starves other
softirqs.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

PPPOL2TP: Fix SMP issues in skb reorder queue handling

Upstream commit: e653181dd6b3ad38ce14904351b03a5388f4b0f7

When walking a session's packet reorder queue, use
skb_queue_walk_safe() since the list could be modified inside the
loop.

Rearrange the unlinking skbs from the reorder queue such that it is
done while the queue lock is held in pppol2tp_recv_dequeue() when
walking the skb list.

A version of this patch was suggested by Jarek Poplawski.

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

PPPOL2TP: Make locking calls softirq-safe

Upstream commit: cf3752e2d203bbbfc88d29e362e6938cef4339b3

Fix locking issues in the pppol2tp driver which can cause a kernel
crash on SMP boxes. There were two problems:-

1. The driver was violating read_lock() and write_lock() scheduling
   rules because it wasn't using softirq-safe locks in softirq
   contexts. So we now consistently use the _bh variants of the lock
   functions.

2. The driver was calling sk_dst_get() in pppol2tp_xmit() which was
   taking sk_dst_lock in softirq context. We now call __sk_dst_get().

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

netpoll: zap_completion_queue: adjust skb->users counter

Upstream commit: 8a455b087c9629b3ae3b521b4f1ed16672f978cc

zap_completion_queue() retrieves skbs from completion_queue where they have
zero skb->users counter. Before dev_kfree_skb_any() it should be non-zero
yet, so it's increased now.

Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

LLC: Restrict LLC sockets to root

Upstream commit: 3480c63bdf008e9289aab94418f43b9592978fff

LLC currently allows users to inject raw frames, including IP packets
encapsulated in SNAP. While Linux doesn't handle IP over SNAP, other
systems do. Restrict LLC sockets to root similar to packet sockets.

[ Modified Patrick's patch to use CAP_NEW_RAW --DaveM ]

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

INET: inet_frag_evictor() must run with BH disabled

Part of upstream commit: e8e16b706e8406f1ab3bccab16932ebc513896d8

Based upon a lockdep trace from Dave Jones.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

SUNGEM: Fix NAPI assertion failure.

Upstream commit: da990a2402aeaee84837f29054c4628eb02f7493

As reported by Johannes Berg:

I started getting this warning with recent kernels:

[ 773.908927] ------------[ cut here ]------------
[ 773.908954] Badness at net/core/dev.c:2204
...

If we loop more than once in gem_poll(), we'll
use more than the real budget in our gem_rx()
calls, thus eventually trigger the caller's
assertions in net_rx_action().

Subtract "work_done" from "budget" for the second
arg to gem_rx() to fix the bug.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

NET: include <linux/types.h> into linux/ethtool.h for __u* typedef

Upstream commit: e621e69137b24fdbbe7ad28214e8d81e614c25b7

Signed-off-by: Kirill A. Shutemov <k.shutemov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

AX25 ax25_out: check skb for NULL in ax25_kick()

Upstream commit: f47b7257c7368698eabff6fd7b340071932af640

According to some OOPS reports ax25_kick tries to clone NULL skbs
sometimes. It looks like a race with ax25_clear_queues(). Probably
there is no need to add more than a simple check for this yet.
Another report suggested there are probably also cases where ax25
->paclen == 0 can happen in ax25_output(); this wasn't confirmed
during testing but let's leave this debugging check for some time.

Reported-and-tested-by: Jann Traschewski <jann@gmx.de>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ipmi: change device node ordering to reflect probe order

upstream commit: abd24df828f1a72971db29d1b74fefae104ea9e2

In 2.6.14 a patch was merged which switching the order of the ipmi device
naming from in-order-of-discovery over to reverse-order-of-discovery.

So on systems with multiple BMC interfaces, the ipmi device names are being
created in reverse order relative to how they are discovered on the system
(e.g. on an IBM x3950 multinode server with N nodes, the device name for the
BMC in the first node is /dev/ipmiN-1 and the device name for the BMC in the
last node is /dev/ipmi0, etc.).

The problem is caused by the list handling routines chosen in dmi_scan.c.
Using list_add() causes the multiple ipmi devices to be added to the device
list using a stack-paradigm and so the ipmi driver subsequently pulls them off
during initialization in LIFO order. This patch changes the
dmi_save_ipmi_device() list handling paradigm to a queue, thereby allowing the
ipmi driver to build the ipmi device names in the order in which they are
found on the system.

Signed-off-by: Carol Hebert <cah@us.ibm.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

mtd: fix broken state in CFI driver caused by FL_SHUTDOWN

upstream commit: fb6d080c6f75dfd7e23d5a3575334785aa8738eb

THe CFI driver in 2.6.24 kernel is broken. Not so intensive read/write
operations cause incomplete writes which lead to kernel panics in JFFS2.

We investigated the issue - it is caused by bug in FL_SHUTDOWN parsing code.
Sometimes chip returns -EIO as if it is in FL_SHUTDOWN state when it should
wait in FL_PONT (error in order of conditions).

The following patch fixes the bug in state parsing code of CFI. Also I've
added comments to notify developers if they want to add new case in future.

Signed-off-by: Alexey Korolev <akorolev@infradead.org>
Reviewed-by: Joern Engel <joern@logfs.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

CRYPTO xcbc: Fix crash when ipsec uses xcbc-mac with big data chunk

upstream commit: 1edcf2e1ee2babb011cfca80ad9d202e9c491669

The kernel crashes when ipsec passes a udp packet of about 14XX bytes
of data to aes-xcbc-mac.

It seems the first xxxx bytes of the data are in first sg entry,
and remaining xx bytes are in next sg entry. But we don't
check next sg entry to see if we need to go look the page up.

I noticed in hmac.c, we do a scatterwalk_sg_next(), to do this check
and possible lookup, thus xcbc.c needs to use this routine too.

A 15-hour run of an ipsec stress test sending streams of tcp and
udp packets of various sizes, using this patch and
aes-xcbc-mac completed successfully, so hopefully this fixes the
problem.

Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
[chrisw@sous-sol.org: backport to 2.6.24.4]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

USB: serial: ti_usb_3410_5052: Correct TUSB3410 endpoint requirements.

upstream commit: 1bfd6693cd66f1e79abce62d3e8c3647e1f59a55

The changes introduced in commit
063a2da8f01806906f7d7b1a1424b9afddebc443 changed the semantics of the
num_interrupt_in, num_interrupt_out, num_bulk_in and num_bulk_out
entries of the usb_serial_driver struct to be the number of endpoints
the device has when probed.

This patch changes the ti_1port_device usb_serial_driver struct to
reflect this change. The single port devices only have 1
bulk_out endpoint in their initial configuration, and so this patch
changes the number of other types to NUM_DONT_CARE.

The same change probably needs doing to the ti_2port_device struct,
but I don't have a two port device at hand.

Signed-off-by: Robert Spanton <rspanton@zepler.net>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

USB: serial: fix regression in Visor/Palm OS module for kernels >= 2.6.24

upstream commit: d04863e9e65767feff7807c8f693ac2719dd1944

Fixes a bug/inconsistency revealed by the additional sanity checking in
   commit 063a2da8f01806906f7d7b1a1424b9afddebc443
introduced in the original 2.6.24 branch.

The Handspring Visor / PalmOS 4 device structure defines .num_bulk_out=2
but the usb-serial probe returns num_bulk_out=3, triggering the check in
the above commit and forcing a bail out when the device (a Garmin iQue in
my case) attempts to connect.  The patch bumps the expected number of
endpoints to 3.

FWIW, this patch will probably solve the following kernel bug report for
Treo users (identical symptoms, different model PalmOS units):
  <http://bugzilla.kernel.org/show_bug.cgi?id=10118>

Signed-off-by: Brad Sawatzky <brad+kernel@swatter.net>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

USB: Allow initialization of broken keyspan serial adapters.

upstream commit: 822470537d0fc1dee38a2a9c8b8c398bfbb332bb

Fixes the keyspan driver after the addition of additional
checking of driver requirements introduced in usb-serial.c
commit 063a2da8f01806906f7d7b1a1424b9afddebc443. The initialization
of the keyspan usb_serial_driver structs were not initializing the
num_interrupt_out field and the additional checking was rejecting
the end point so the driver wouldn't finish initializing.

This commit initializes the fields to NUM_DONT_CARE.
It works for the keyspan USA-49WG and doesn't break the USA-19HS
which are the two keyspan devices I have to test with.

Signed-off-by: Clark Rawlins <clark.rawlins@escient.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

vmcoreinfo: add the symbol "phys_base"

upstream commit: 629c8b4cdb354518308663aff2f719e02f69ffbe

Fix the problem that makedumpfile sometimes fails on x86_64 machine.

This patch adds the symbol "phys_base" to a vmcoreinfo data.  The
vmcoreinfo data has the minimum debugging information only for dump
filtering.  makedumpfile (dump filtering command) gets it to distinguish
unnecessary pages, and makedumpfile creates a small dumpfile.

On x86_64 kernel which compiled with CONFIG_PHYSICAL_START=0x0 and
CONFIG_RELOCATABLE=y, makedumpfile fails like the following:

# makedumpfile -d31 /proc/vmcore dumpfile
The kernel version is not supported.
The created dumpfile may be incomplete.
_exclude_free_page: Can't get next online node.

makedumpfile Failed.
#

The cause is the lack of the symbol "phys_base" in a vmcoreinfo data.
If the symbol "phys_base" does not exist, makedumpfile considers an
x86_64 kernel as non relocatable.  As the result, makedumpfile
misunderstands the physical address where the kernel is loaded, and it
cannot translate a kernel virtual address to physical address correctly.

To fix this problem, this patch adds the symbol "phys_base" to a
vmcoreinfo data.

Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: <stable@kernel.org>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

hwmon: (w83781d) Fix I/O resource conflict with PNP

upstream commit: 2961cb22ef02850d90e7a12c28a14d74e327df8d

Only request I/O ports 0x295-0x296 instead of the full I/O address
range. This solves a conflict with PNP resources on a few motherboards.

Also request the I/O ports in two parts (4 low ports, 4 high ports)
during device detection, otherwise the PNP resource makes the request
(and thus the detection) fail.

This fixes lm-sensors ticket #2306:
http://www.lm-sensors.org/ticket/2306

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

pci: revert SMBus unhide on HP Compaq nx6110

upstream commit: a99acc832de1104afaba02d7c2576fd9b9fd6422

This reverts commit 3c0a654e390d00fef9d8faed758f5e1e8078adb5 and
fixes kernel bug #10245:

http://bugzilla.kernel.org/show_bug.cgi?id=10245

The HP Compaq nc6120 has the same PCI sub-device ID as the nx6110, and the
SMBus is used by ACPI for thermal management on the nc6120, so Linux should
not attach a native driver to it. This means that this quirk is unsafe and
has to be removed.

I also added a comment to help developers realize that adding new IDs to this
SMBus unhiding quirk table should be done only with great care, and in
particular only after checking that ACPI is not making use of the SMBus.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Tomasz Koprowski <tomek@koprowski.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

vfs: fix data leak in nobh_write_end()

upstream commit: 5b41e74ad1b0bf7bc51765ae74e5dc564afc3e48

Current nobh_write_end() implementation ignore partial writes(copied < len)
case if page was fully mapped and simply mark page as Uptodate, which is
totally wrong because area [pos+copied, pos+len) wasn't updated explicitly in
previous write_begin call.  It simply contains garbage from pagecache and
result in data leakage.

#TEST_CASE_BEGIN:
~~~~~~~~~~~~~~~~
In fact issue triggered by classical testcase
open("/mnt/test", O_RDWR|O_CREAT|O_TRUNC, 0666) = 3
ftruncate(3, 409600)                    = 0
writev(3, [{"a", 1}, {NULL, 4095}], 2)  = 1
##TESTCASE_SOURCE:
~~~~~~~~~~~~~~~~~
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <sys/mman.h>
#include <errno.h>
int main(int argc, char **argv)
{
int fd,  ret;
void* p;
struct iovec iov[2];
fd = open(argv[1], O_RDWR|O_CREAT|O_TRUNC, 0666);
ftruncate(fd, 409600);
iov[0].iov_base="a";
iov[0].iov_len=1;
iov[1].iov_base=NULL;
iov[1].iov_len=4096;
ret = writev(fd, iov, sizeof(iov)/sizeof(struct iovec));
printf("writev  = %d, err = %d\n", ret, errno);
return 0;
}
##TESTCASE RESULT:
~~~~~~~~~~~~~~~~~~
[root@ts63 ~]# mount | grep mnt2
/dev/mapper/test on /mnt2 type ext2 (rw,nobh)
[root@ts63 ~]#  /tmp/writev /mnt2/test
writev  = 1, err = 0
[root@ts63 ~]# hexdump -C /mnt2/test

00000000  61 65 62 6f 6f 74 00 00  f0 b9 b4 59 3a 00 00 00  |aeboot.....Y:...|
00000010  20 00 00 00 00 00 00 00  21 00 00 00 00 00 00 00  | .......!.......|
00000020  df df df df df df df df  df df df df df df df df  |................|
00000030  3a 00 00 00 2a 00 00 00  21 00 00 00 00 00 00 00  |:...*...!.......|
00000040  60 c0 8c 00 00 00 00 00  40 4a 8d 00 00 00 00 00  |`.......@J......|
00000050  00 00 00 00 00 00 00 00  41 00 00 00 00 00 00 00  |........A.......|
00000060  74 69 6d 65 20 64 64 20  69 66 3d 2f 64 65 76 2f  |time dd if=/dev/|
00000070  6c 6f 6f 70 30 20 20 6f  66 3d 2f 64 65 76 2f 6e  |loop0  of=/dev/n|
skip..
00000f50  00 00 00 00 00 00 00 00  31 00 00 00 00 00 00 00  |........1.......|
00000f60  6d 6b 66 73 2e 65 78 74  33 20 2f 64 65 76 2f 76  |mkfs.ext3 /dev/v|
00000f70  7a 76 67 2f 74 65 73 74  20 2d 62 34 30 39 36 00  |zvg/test -b4096.|
00000f80  a0 fe 8c 00 00 00 00 00  21 00 00 00 00 00 00 00  |........!.......|
00000f90  23 31 32 30 35 39 35 30  34 30 34 00 3a 00 00 00  |#1205950404.:...|
00000fa0  20 00 8d 00 00 00 00 00  21 00 00 00 00 00 00 00  | .......!.......|
00000fb0  d0 cf 8c 00 00 00 00 00  10 d0 8c 00 00 00 00 00  |................|
00000fc0  00 00 00 00 00 00 00 00  41 00 00 00 00 00 00 00  |........A.......|
00000fd0  6d 6f 75 6e 74 20 2f 64  65 76 2f 76 7a 76 67 2f  |mount /dev/vzvg/|
00000fe0  74 65 73 74 20 20 2f 76  7a 20 2d 6f 20 64 61 74  |test  /vz -o dat|
00000ff0  61 3d 77 72 69 74 65 62  61 63 6b 00 00 00 00 00  |a=writeback.....|
00001000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

As you can see file's page contains garbage from pagecache instead of zeros.
#TEST_CASE_END

Attached patch:
- Add sanity check BUG_ON in order to prevent incorrect usage by caller,
  This is function invariant because page can has buffers and in no zero
  *fadata pointer at the same time.
- Always attach buffers to page is it is partial write case.
- Always switch back to generic_write_end if page has buffers.
  This is reasonable because if page already has buffer then generic_write_begin
  was called previously.

Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org>
Reviewed-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

alloc_percpu() fails to allocate percpu data

upstream commit: be852795e1c8d3829ddf3cb1ce806113611fa555

Some oprofile results obtained while using tbench on a 2x2 cpu machine were
very surprising.

For example, loopback_xmit() function was using high number of cpu cycles
to perform the statistic updates, supposed to be real cheap since they use
percpu data

        pcpu_lstats = netdev_priv(dev);
        lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
        lb_stats->packets++;  /* HERE : serious contention */
        lb_stats->bytes += skb->len;

struct pcpu_lstats is a small structure containing two longs.  It appears
that on my 32bits platform, alloc_percpu(8) allocates a single cache line,
instead of giving to each cpu a separate cache line.

Using the following patch gave me impressive boost in various benchmarks
( 6 % in tbench)
(all percpu_counters hit this bug too)

Long term fix (ie >= 2.6.26) would be to let each CPU allocate their own
block of memory, so that we dont need to roudup sizes to L1_CACHE_BYTES, or
merging the SGI stuff of course...

Note : SLUB vs SLAB is important here to *show* the improvement, since they
dont have the same minimum allocation sizes (8 bytes vs 32 bytes).  This
could very well explain regressions some guys reported when they switched
to SLUB.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

PERCPU : __percpu_alloc_mask() can dynamically size percpu_data storage

upstream commit: b3242151906372f30f57feaa43b4cac96a23edb1

Instead of allocating a fix sized array of NR_CPUS pointers for percpu_data,
we can use nr_cpu_ids, which is generally < NR_CPUS.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

xen: fix UP setup of shared_info

upstream commit: 2e8fe719b57bbdc9e313daed1204bb55fed3ed44

We need to set up the shared_info pointer once we've mapped the real
shared_info into its fixmap slot.  That needs to happen once the general
pagetable setup has been done.  Previously, the UP shared_info was set
up one in xen_start_kernel, but that was left pointing to the dummy
shared info.  Unfortunately there's no really good place to do a later
setup of the shared_info in UP, so just do it once the pagetable setup
has been done.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[chrisw@sous-sol.org: backport to 2.6.24.4]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

xen: mask out SEP from CPUID

upstream commit: d40e705903397445c6861a0a56c23e5b2e8f9b9a

Fix 32-on-64 pvops kernel:

we don't want userspace using syscall/sysenter, even if the hypervisor
supports it, so mask it out from CPUID.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

xen: fix RMW when unmasking events

upstream commit: 04c44a080d2f699a3042d4e743f7ad2ffae9d538

xen_irq_enable_direct and xen_sysexit were using "andw $0x00ff,
XEN_vcpu_info_pending(vcpu)" to unmask events and test for pending ones
in one instuction.

Unfortunately, the pending flag must be modified with a locked operation
since it can be set by another CPU, and the unlocked form of this
operation was causing the pending flag to get lost, allowing the processor
to return to usermode with pending events and ultimately deadlock.

The simple fix would be to make it a locked operation, but that's rather
costly and unnecessary. The fix here is to split the mask-clearing and
pending-testing into two instructions; the interrupt window between
them is of no concern because either way pending or new events will
be processed.

This should fix lingering bugs in using direct vcpu structure access too.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

slab: fix cache_cache bootstrap in kmem_cache_init()

upstream commit: ec1f5eeeb5a79a0d48036de649a3498da42db565

Commit 556a169dab38b5100df6f4a45b655dddd3db94c1 ("slab: fix bootstrap on
memoryless node") introduced bootstrap-time cache_cache list3s for all nodes
but forgot that initkmem_list3 needs to be accessed by [somevalue + node]. This
patch fixes list_add() corruption in mm/slab.c seen on the ES7000.

Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Dan Yeisley <dan.yeisley@unisys.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

NOHZ: reevaluate idle sleep length after add_timer_on()

upstream commit: 06d8308c61e54346585b2691c13ee3f90cb6fb2f

add_timer_on() can add a timer on a CPU which is currently in a long
idle sleep, but the timer wheel is not reevaluated by the nohz code on
that CPU. So a timer can be delayed for quite a long time. This
triggered a false positive in the clocksource watchdog code.

To avoid this we need to wake up the idle CPU and enforce the
reevaluation of the timer wheel for the next timer event.

Add a function, which checks a given CPU for idle state, marks the
idle task with NEED_RESCHED and sends a reschedule IPI to notify the
other CPU of the change in the timer wheel.

Call this function from add_timer_on().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
--
include/linux/sched.h |    6 ++++++
kernel/sched.c        |   43 +++++++++++++++++++++++++++++++++++++++++++
kernel/timer.c        |   10 +++++++++-
3 files changed, 58 insertions(+), 1 deletion(-)

inotify: remove debug code

upstream commit: 0d71bd5993b630a989d15adc2562a9ffe41cd26d

The inotify debugging code is supposed to verify that the
DCACHE_INOTIFY_PARENT_WATCHED scalability optimisation does not result in
notifications getting lost nor extra needless locking generated.

Unfortunately there are also some races in the debugging code. And it isn't
very good at finding problems anyway. So remove it for now.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Robert Love <rlove@google.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Jan Kara <jack@ucw.cz>
Cc: Yan Zheng <yanzheng@21cn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Lamparter <chunkeey@web.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

inotify: fix race

upstream commit: d599e36a9ea85432587f4550acc113cd7549d12a

There is a race between setting an inode's children's "parent watched" flag
when placing the first watch on a parent, and instantiating new children of
that parent: a child could miss having its flags set by
set_dentry_child_flags, but then inotify_d_instantiate might still see
!inotify_inode_watched.

The solution is to set_dentry_child_flags after adding the watch. Locking is
taken care of, because both set_dentry_child_flags and inotify_d_instantiate
hold dcache_lock and child->d_locks.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Robert Love <rlove@google.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Jan Kara <jack@ucw.cz>
Cc: Yan Zheng <yanzheng@21cn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Lamparter <chunkeey@web.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

USB: new quirk flag to avoid Set-Interface

upstream commit: 392e1d9817d0024c96aae237c3c4349e47c976fd

This patch (as1057) fixes a problem with the X-Rite/Gretag-Macbeth
Eye-One Pro display colorimeter; the device crashes when it receives a
Set-Interface request. A new quirk (USB_QUIRK_NO_SET_INTF) is
introduced and a quirks entry is created for this device.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[chrisw@sous-sol.org: backport to 2.6.24.4]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

USB: add support for Motorola ROKR Z6 cellphone in mass storage mode

upstream commit: cc36bdd47ae51b66780b317c1fa519221f894405

Motorola ROKR Z6 cellphone has bugs in its USB, so it is impossible to use
it as mass storage. Patch describes new "unusual" USB device for it with
FIX_INQUIRY and FIX_CAPACITY flags and new BULK_IGNORE_TAG flag.
Last flag relaxes check for equality of bcs->Tag and us->tag in
usb_stor_Bulk_transport routine.

Signed-off-by: Constantin Baranov <const@tltsu.ru>
Signed-off-by: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

UIO: add pgprot_noncached() to UIO mmap code

upstream commit: c9698d6b1a90929e427a165bd8283f803f57d9bd

Mapping of physical memory in UIO needs pgprot_noncached() to ensure
that IO memory is not cached. Without pgprot_noncached(), it (accidentally)
works on x86 and arm, but fails on PPC.

Signed-off-by: Jean-Samuel Chenard <jsamch@gmail.com>
Signed-off-by: Hans J Koch <hjk@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

V4L: ivtv: Add missing sg_init_table()

upstream commit: 165e1213e13b49761f8b3fd9314701f83cf3db3a

If a dma transfer is attempted for either yuv or framebuffer output, a
missing sg_init_table() call causes a kernel BUG in scatterlist.h if
CONFIG_DEBUG_SG is set.

Signed-off-by: Ian Armstrong <ian@iarmst.demon.co.uk>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

md: remove the 'super' sysfs attribute from devices in an 'md' array

upstream commit: 0e82989d95cc46cc58622381eafa54f7428ee679

Exposing the binary blob which is the md 'super-block' via sysfs doesn't
really fit with the whole sysfs model, and ever since commit
8118a859dc7abd873193986c77a8d9bdb877adc8 ("sysfs: fix off-by-one error
in fill_read_buffer()") it doesn't actually work at all (as the size of
the blob is often one page).

(akpm: as in, fs/sysfs/file.c:fill_read_buffer() goes BUG)

So just remove it altogether. It isn't really useful.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

mtd: memory corruption in block2mtd.c

upstream commit: 2875fb65f8e40401c4b781ebc5002df10485f635

The block2mtd driver (drivers/mtd/devices/block2mtd.c) will kfree an on-stack
pointer when handling an invalid argument line (e.g.
block2mtd=/dev/loop0,xxx).

The kfree was added some time ago when "name" was dynamically allocated.

Signed-off-by: Ingo van Lil <inguin@gmx.de>
Acked-by: Joern Engel <joern@lazybastard.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

kbuild: soften modpost checks when doing cross builds

upstream commit: 4ce6efed48d736e3384c39ff87bda723e1f8e041

The module alias support in the kernel have a consistency
check where it is checked that the size of a structure
in the kernel and on the build host are the same.
For cross builds this check does not make sense so detect
when we do cross builds and silently skip the check in these
situations.
This fixes a build bug for a wireless driver when cross building
for arm.

Acked-by: Michael Buesch <mb@bu3sch.de>
Tested-by: Gordon Farquharson <gordonfarquharson@gmail.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
[chrisw@sous-sol.org: backport to 2.6.24.4]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

time: prevent the loop in timespec_add_ns() from being optimised away

upstream commit: 38332cb98772f5ea757e6486bed7ed0381cb5f98

Since some architectures don't support __udivdi3().

Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sedat Dilek <sedat.dilek@googlemail.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

Linux 2.6.24.4

S390 futex: let futex_atomic_cmpxchg_pt survive early functional tests.

a0c1e9073ef7428a14309cba010633a6cd6719ea "futex: runtime enable pi and
robust functionality" introduces a test wether futex in atomic stuff
works or not.
It does that by writing to address 0 of the kernel address space. This
will crash on older machines where addressing mode switching is enabled
but where the mvcos instruction is not available. Page table walking is
done by hand and therefore the code tries to access current->mm which
is NULL.
Therefore add an extra check, so we survive the early test.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

slab: NUMA slab allocator migration bugfix

NUMA slab allocator cpu migration bugfix

The NUMA slab allocator (specifically, cache_alloc_refill)
is not refreshing its local copies of what cpu and what
numa node it is on, when it drops and reacquires the irq
block that it inherited from its caller. As a result
those values become invalid if an attempt to migrate the
process to another numa node occured while the irq block
had been dropped.

The solution is to make cache_alloc_refill reload these
variables whenever it drops and reacquires the irq block.

The error is very difficult to hit. When it does occur,
one gets the following oops + stack traceback bits in
check_spinlock_acquired:

kernel BUG at mm/slab.c:2417
cache_alloc_refill+0xe6
kmem_cache_alloc+0xd0
...

This patch was developed against 2.6.23, ported to and
compiled-tested only against 2.6.25-rc4.

Signed-off-by: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

relay: fix subbuf_splice_actor() adding too many pages

If subbuf_pages was larger than the max number of pages the pipe
buffer will hold, subbuf_splice_actor() would happily go beyond
the array size.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

BLUETOOTH: Fix bugs in previous conn add/del workqueue changes.

Jens Axboe noticed that we were queueing &conn->work on both btaddconn
and keventd_wq.

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

SCSI advansys: Fix bug in AdvLoadMicrocode

commit: 951b62c11e86acf8c55d9828aa8c921575023c29

buf[i] can be up to 0xfd, so doubling it and assigning the result to an
unsigned char truncates the value. Just use an unsigned int instead;
it's only a temporary.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

async_tx: avoid the async xor_zero_sum path when src_cnt > device->max_xor

commit: 8d8002f642886ae256a3c5d70fe8aff4faf3631a

If the channel cannot perform the operation in one call to
->device_prep_dma_zero_sum, then fallback to the xor+page_is_zero path.
This only affects users with arrays larger than 16 devices on iop13xx or
32 devices on iop3xx.

Cc: <stable@kernel.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

aio: bad AIO race in aio_complete() leads to process hang

commit: 6cb2a21049b8990df4576c5fce4d48d0206c22d5

My group ran into a AIO process hang on a 2.6.24 kernel with the process
sleeping indefinitely in io_getevents(2) waiting for the last wakeup to come
and it never would.

We ran the tests on x86_64 SMP.  The hang only occurred on a Xeon box
("Clovertown") but not a Core2Duo ("Conroe").  On the Xeon, the L2 cache isn't
shared between all eight processors, but is L2 is shared between between all
two processors on the Core2Duo we use.

My analysis of the hang is if you go down to the second while-loop
in read_events(), what happens on processor #1:
1) add_wait_queue_exclusive() adds thread to ctx->wait
2) aio_read_evt() to check tail
3) if aio_read_evt() returned 0, call [io_]schedule() and sleep

In aio_complete() with processor #2:
A) info->tail = tail;
B) waitqueue_active(&ctx->wait)
C) if waitqueue_active() returned non-0, call wake_up()

The way the code is written, step 1 must be seen by all other processors
before processor 1 checks for pending events in step 2 (that were recorded by
step A) and step A by processor 2 must be seen by all other processors
(checked in step 2) before step B is done.

The race I believed I was seeing is that steps 1 and 2 were
effectively swapped due to the __list_add() being delayed by the L2
cache not shared by some of the other processors.  Imagine:
proc 2: just before step A
proc 1, step 1: adds to ctx->wait, but is not visible by other processors yet
proc 1, step 2: checks tail and sees no pending events
proc 2, step A: updates tail
proc 1, step 3: calls [io_]schedule() and sleeps
proc 2, step B: checks ctx->wait, but sees no one waiting, skips wakeup
                so proc 1 sleeps indefinitely

My patch adds a memory barrier between steps A and B.  It ensures that the
update in step 1 gets seen on processor 2 before continuing.  If processor 1
was just before step 1, the memory barrier makes sure that step A (update
tail) gets seen by the time processor 1 makes it to step 2 (check tail).

Before the patch our AIO process would hang virtually 100% of the time.  After
the patch, we have yet to see the process ever hang.

Signed-off-by: Quentin Barnes <qbarnes+linux@yahoo-inc.com>
Reviewed-by: Zach Brown <zach.brown@oracle.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: <stable@kernel.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ We should probably disallow that "if (waitqueue_active()) wake_up()"
  coding pattern, because it's so often buggy wrt memory ordering ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

jbd: correctly unescape journal data blocks

commit: 439aeec639d7c57f3561054a6d315c40fd24bb74

Fix a long-standing typo (predating git) that will cause data corruption if a
journal data block needs unescaping. At the moment the wrong buffer head's
data is being unescaped.

To test this case mount a filesystem with data=journal, start creating and
deleting a bunch of files containing only JFS_MAGIC_NUMBER (0xc03b3998), then
pull the plug on the device. Without this patch the files will contain zeros
instead of the correct data after recovery.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

jbd2: correctly unescape journal data blocks

commit: d00256766a0b4f1441931a7f569a13edf6c68200

Fix a long-standing typo (predating git) that will cause data corruption if a
journal data block needs unescaping. At the moment the wrong buffer head's
data is being unescaped.

To test this case mount a filesystem with data=journal, start creating and
deleting a bunch of files containing only JBD2_MAGIC_NUMBER (0xc03b3998), then
pull the plug on the device. Without this patch the files will contain zeros
instead of the correct data after recovery.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

zisofs: fix readpage() outside i_size

commit: 08ca0db8aa2db4ddcf487d46d85dc8ffb22162cc

A read request outside i_size will be handled in do_generic_file_read(). So
we just return 0 to avoid getting -EIO as normal reading, let
do_generic_file_read do the rest.

At the same time we need unlock the page to avoid system stuck.

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10227

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: Jan Kara <jack@suse.cz>
Report-by: Christian Perle <chris@linuxinfotag.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

NETFILTER: nfnetlink_log: fix computation of netlink skb size

Upstream commit 7000d38d:

This patch is similar to nfnetlink_queue fixes. It fixes the computation
of skb size by using NLMSG_SPACE instead of NLMSG_ALIGN.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

NETFILTER: nfnetlink_queue: fix computation of allocated size for netlink skb

Upstream commit cabaa9bf:

Size of the netlink skb was wrongly computed because the formula was using
NLMSG_ALIGN instead of NLMSG_SPACE. NLMSG_ALIGN does not add the room for
netlink header as NLMSG_SPACE does. This was causing a failure of message
building in some cases.

On my test system, all messages for packets in range [8*k+41, 8*k+48] where k
is an integer were invalid and the corresponding packets were dropped.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

NETFILTER: xt_time: fix failure to match on Sundays

Upstream commit 4f4c9430:

xt_time_match() in net/netfilter/xt_time.c in kernel 2.6.24 never
matches on Sundays. On my host I have a rule like

iptables -A OUTPUT -m time --weekdays Sun -j REJECT

and it never matches. The problem is in localtime_2(), which uses

    r->weekday = (4 + r->dse) % 7;

to map the epoch day onto a weekday in {0,...,6}. In particular this
gives 0 for Sundays. But 0 has to be wrong; a weekday of 0 can never
match. xt_time_match() has

    if (!(info->weekdays_match & (1 << current_time.weekday)))
        return false;

and when current_time.weekday = 0, the result of the & is always
zero, even when info->weekdays_match = XT_TIME_ALL_WEEKDAYS = 0xFE.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sched_nr_migrate wrong mode bits

sched_nr_migrate has strange permission bits:

$ ls -l /proc/sys/kernel/sched_nr_migrate
--w----r-T 1 root root 0 2008-03-17 23:31 /proc/sys/kernel/sched_nr_migrate

The bug is an obvious decimal/octal confusion.

Fixed (collaterally) in Linus's tree by Peter Zijlstra with commit fa85ae241
"sched: rt time limit" (in 2.6.25-rc1).

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

nfsd: fix oops on access from high-numbered ports

This bug was always here, but before my commit 6fa02839bf9412e18e77
("recheck for secure ports in fh_verify"), it could only be triggered by
failure of a kmalloc(). After that commit it could be triggered by a
client making a request from a non-reserved port for access to an export
marked "secure". (Exports are "secure" by default.)

The result is a struct svc_export with a reference count one too low,
resulting in likely oopses next time the export is accessed.

The reference counting here is not straightforward; a later patch will
clean up fh_verify().

Thanks to Lukas Hejtmanek for the bug report and followup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Lukas Hejtmanek <xhejtman@ics.muni.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sched: fix race in schedule()

Fix a hard to trigger crash seen in the -rt kernel that also affects
the vanilla scheduler.

There is a race condition between schedule() and some dequeue/enqueue
functions; rt_mutex_setprio(), __setscheduler() and sched_move_task().

When scheduling to idle, idle_balance() is called to pull tasks from
other busy processor. It might drop the rq lock. It means that those 3
functions encounter on_rq=0 and running=1. The current task should be
put when running.

Here is a possible scenario:

   CPU0                               CPU1
    |                              schedule()
    |                              ->deactivate_task()
    |                              ->idle_balance()
    |                              -->load_balance_newidle()
rt_mutex_setprio()                     |
    |                              --->double_lock_balance()
    *get lock                          *rel lock
    * on_rq=0, ruuning=1               |
    * sched_class is changed           |
    *rel lock                          *get lock
    :                                  |
                                       :
                                   ->put_prev_task_rt()
                                   ->pick_next_task_fair()
                                       => panic

The current process of CPU1(P1) is scheduling. Deactivated P1, and the
scheduler looks for another process on other CPU's runqueue because CPU1
will be idle. idle_balance(), load_balance_newidle() and
double_lock_balance() are called and double_lock_balance() could drop
the rq lock. On the other hand, CPU0 is trying to boost the priority of
P1. The result of boosting only P1's prio and sched_class are changed to
RT. The sched entities of P1 and P1's group are never put. It makes
cfs_rq invalid, because the cfs_rq has curr and no leaf, but
pick_next_task_fair() is called, then the kernel panics.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

SCSI: mpt fusion: don't oops if NumPhys==0

Don't oops if NumPhys==0, instead return -ENODEV.
This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=9909

Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Acked-by: Eric Moore <Eric.Moore@lsi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

SCSI: gdth: fix to internal commands execution

commit: ee54cc6af95a7fa09da298493b853a9e64fa8abd

The recent patch named:
  [SCSI] gdth: !use_sg cleanup and use of scsi accessors

has done a bad job in handling internal commands issued by gdth_execute().

Internal commands are issued with device gdth_cmd_str ready made directly
to the card, without any mapping or translations of scsi commands. So here
I added a gdth_cmd_str pointer to the gdth_cmndinfo private structure which
is then copied directly to host.

following this patch is a cleanup that removes the home cooked accessors
and reverts them to regular scsi_cmnd accessors. Since they are not used
anymore. After review maybe the 2 patches should be squashed together.

FIXME: There is still a problem with gdth_get_info(). as reported there
   is a WARN_ON trigerd in dma_free_coherent() when doing:
   $ cat /proc/sys/gdth/0

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Tested-by: Joerg Dorchain: <joerg@dorchain.net>
Tested-by: Stefan Priebe <s.priebe@allied-internet.ag>
Tested-by: Jon Chelton <jchelton@ffpglobal.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

SCSI: gdth: bugfix for the at-exit problems

commit: b31ddd31c266c2ad1b708cad0d3d8e0aa7fa2737

gdth_exit would first remove all cards then stop the timer
and would not sync with the timer function. This caused a crash
in gdth_timer() when module was unloaded.
So del_timer_sync the timer before we delete the cards.

also the reboot notifier function would crash. So clean
that up and fix the crashes.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Tested-by: Joerg Dorchain: <joerg@dorchain.net>
Tested-by: Stefan Priebe <s.priebe@allied-internet.ag>
Tested-by: Jon Chelton <jchelton@ffpglobal.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Fix default compose table initialization

commit: 5ce2087ed0eb424e0889bdc9102727f65d2ecdde

Oddly enough, unsigned int c = '\300'; puts a "negative" value in c, not
0300... This fixes the default unicode compose table by using integers
instead of character constants.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fold in upstream commit 10a7f3135ac4937a3dc8ed11614a2b70cbd44728 (Build
fix for drivers/s390/char/defkeymap.c) from Tony Breeds.

Commit 5ce2087ed0eb424e0889bdc9102727f65d2ecdde (Fix default compose
table initialization) left a trailing quote.

CC drivers/s390/char/defkeymap.o
drivers/s390/char/defkeymap.c:155: error: missing terminating ' character
drivers/s390/char/defkeymap.c:156: error: syntax error before ';' token
make[3]: *** [drivers/s390/char/defkeymap.o] Error 1

Fix that.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86: don't use P6_NOPs if compiling with CONFIG_X86_GENERIC

Commit: 959b3be64cab9160cd74532a49b89cdd918d38e9

x86: don't use P6_NOPs if compiling with CONFIG_X86_GENERIC

P6_NOPs are definitely not supported on some VIA CPUs, and possibly
(unverified) on AMD K7s. It is also the only thing that prevents a
686 kernel from running on Transmeta TM3x00/5x00 (Crusoe) series.

The performance benefit over generic NOPs is very small, so when
building for generic consumption, avoid using them.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[cebbert@redhat.com: backport take 2, with parens this time]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

SCSI: fix BUG when sum(scatterlist) > bufflen

commit: 4d2de3a50ce19af2008a90636436a1bf5b3b697b

When sending a SCSI command to a tape drive via the SCSI Generic (sg)
driver, if the command has a data transfer length more than
scatter_elem_sz (32 KB default) and not a multiple of 512, then I either
hit BUG_ON(!valid_dma_direction(direction)) in dma_unmap_sg() or else
the command never completes (depending on the LLDD).

When constructing scatterlists, the sg driver rounds up the scatterlist
element sizes to be a multiple of 512.  This can result in
sum(scatterlist lengths) > bufflen.  In this case, scsi_req_map_sg()
incorrectly sets bio->bi_size to sum(scatterlist lengths) rather than to
bufflen.  When the command completes, req_bio_endio() detects that
bio->bi_size != 0, and so it doesn't call bio_endio().  This causes the
command to be resubmitted, resulting in BUG_ON or the command never
completing.

This patch makes scsi_req_map_sg() set bio->bi_size to bufflen rather
than to sum(scatterlist lengths), which fixes the problem.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: ehci: handle large bulk URBs correctly (again)

commit: b5f7a0ec11694e60c99d682549dfaf8a03d7ad97

USB: ehci: Fixes completion for multi-qtd URB the short read case

When use of urb->status in the EHCI driver was reworked last August
(commit 14c04c0f88f228fee1f412be91d6edcb935c78aa), a bug was inserted
in the handling of early completion for bulk transactions that need
more than one qTD (e.g. more than 20KB in one URB).

This patch resolves that problem by ensuring that the early completion
status is preserved until the URB is handed back to its submitter,
instead of resetting it after each qTD.

Signed-off-by: Misha Zhilin <misha@epiphan.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

USB: ftdi_sio - really enable EM1010PC

[upstream commit: 4ae897df]

Add EM1010PC to ftdi_sio.c

Signed-off-by: Sven Andersen <s.andersen@cryonet.de>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: ftdi_sio: Workaround for broken Matrix Orbital serial port

commit: 546d7eec389a3df3173b3131d92829c14e614601

Workaround for the FT232RL-based, Matrix Orbital VK204-25-USB serial port
added to the ftdi_sio driver.

The device has an invalid endpoint descriptor, which must be modified
before it can be used.

Signed-off-by: Kevin Vance <kvance@kvance.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[chrisw@sous-sol.org: backport to 2.6.24.3 w/out ftdi_jtag_probe]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

VT notifier fix for VT switch

commit: 8182ec49a73729334f5a6c65a607ba7009ebd6d6

VT notifier callbacks need to be aware of console switches. This is already
partially done from console_callback(), but at that time fg_console, cursor
positions, etc. are not yet updated and hence screen readers fetch the old
values.

This adds an update notify after all of the values are updated in
redraw_screen(vc, 1).

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

eCryptfs: make ecryptfs_prepare_write decrypt the page

commit: e4465fdaeb3f7b5ef47f389d3eac76db79ff20d8

When the page is not up to date, ecryptfs_prepare_write() should be
acting much like ecryptfs_readpage(). This includes the painfully
obvious step of actually decrypting the page contents read from the
lower encrypted file.

Note that this patch resolves a bug in eCryptfs in 2.6.24 that one can
produce with these steps:

# mount -t ecryptfs /secret /secret
# echo "abc" > /secret/file.txt
# umount /secret
# mount -t ecryptfs /secret /secret
# echo "def" >> /secret/file.txt
# cat /secret/file.txt

Without this patch, the resulting data returned from cat is likely to
be something other than "abc\ndef\n".

(Thanks to Benedikt Driessen for reporting this.)

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Benedikt Driessen <bdriessen@escrypt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ioat: fix 'ack' handling, driver must ensure that 'ack' is zero

commit: 6497dcffe07b7c3d863f9899280c4f6eae999161

Initialize 'ack' to zero in case the descriptor has been recycled.

Prevents "kernel BUG at crypto/async_tx/async_xor.c:185!"

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

macb: Fix speed setting

commit: 179956f498bd8cc55fb803c4ee0cf18be59c8b01

Fix NCFGR.SPD setting on 10Mbps. This bug was introduced by
conversion to generic PHY layer in kernel 2.6.23.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86: move out tick_nohz_stop_sched_tick() call from the loop

[upstream commit: 3d97775a]

Move out tick_nohz_stop_sched_tick() call from the loop in cpu_idle
same as 32-bit version.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ed Tomlinson <edt@aei.ca>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

atmel_spi: fix clock polarity

commit: f6febccd7f86fbe94858a4a32d9384cc014c9f40

The atmel_spi driver does not initialize clock polarity correctly (except for
at91rm9200 CS0 channel) in some case.

The atmel_spi driver uses gpio-controlled chipselect.  OTOH spi clock signal
is controlled by CSRn.CPOL bit, but this register controls clock signal
correctly only in 'real transfer' duration.  At the time of cs_activate()
call, CSRn.CPOL will be initialized correctly, but the controller do not know
which channel is to be used next, so clock signal will stay at the inactive
state of last transfer.  If clock polarity of new transfer and last transfer
was differ, new transfer will start with wrong clock signal state.

For example, if you started SPI MODE 2 or 3 transfer after SPI MODE 0 or 1
transfer, the clock signal state at the assertion of chipselect will be low.
Of course this will violates SPI transfer.

This patch is short term solution for this problem.  It makes all CSRn.CPOL
match for the transfer before activating chipselect.  For longer term, the
best fix might be to let NPCS0 stay selected permanently in MR and overwrite
CSR0 with to the new slave's settings before asserting CS.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

b43: Backport bcm4311 fix

This is a backport of upstream commit 013978b6 ("b43: Changes to enable
BCM4311 rev 02 with wireless core revision 13") and the changes include
the following:

(1) Add the 802.11 rev 13 device to the ssb_device_id table to load b43.
(2) Add PHY revision 9 to the supported list.
(3) Change the 2-bit routing code for address extensions to 0b10 rather
than the 0b01 used for the 32-bit case.
(4) Remove some magic numbers in the DMA setup.

The DMA implementation for this chip supports full 64-bit addressing with
one exception. Whenever the Descriptor Ring Buffer is in high memory, a
fatal DMA error occurs. This problem was not present in 2.6.24-rc2 due
to code to "Bias the placement of kernel pages at lower PFNs". When
commit 44048d70 reverted that code, the DMA error appeared. As a "fix",
use the GFP_DMA flag when allocating the buffer for 64-bit DMA. At present,
this problem is thought to arise from a hardware error.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Cc: John W. Linville <linville@tuxdriver.com>
Cc: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

arcmsr: fix IRQs disabled warning spew

As of 2.6.24, running the archttp passthrough daemon with the arcmsr
driver produces an endless spew of dma_free_coherent warnings:

WARNING: at arch/x86/kernel/pci-dma_64.c:169 dma_free_coherent()

It turns out that coherent memory is not needed, so commit 76d78300 by
Nick Cheng <nick.cheng@areca.com.tw> switched it to kmalloc (as well as
making a lot of other changes which have not been included here).

James Bottomley pointed out that the new kmalloc usage was also wrong,
I corrected this in commit 69e562c2.

This patch combines both of the above for the purpose of fixing 2.6.24.
details in http://bugs.gentoo.org/208493.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Cc: Nick Cheng <nick.cheng@areca.com.tw>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

e1000e: Fix CRC stripping in hardware context bug

CRC stripping was only correctly enabled for packet split recieves
which is used when receiving jumbo frames. Correctly enable SECRC
also for normal buffer packet receives.

Tested by Andy Gospodarek and Johan Andersson, see bugzilla #9940.

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Cc: Mike Pagano <mike@mpagano.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

PCI x86: always use conf1 to access config space below 256 bytes

[upsteam commit: a0ca9909]

Thanks to Loic Prylli <loic@myri.com>, who originally proposed
this idea.

Always using legacy configuration mechanism for the legacy config space
and extended mechanism (mmconf) for the extended config space is
a simple and very logical approach. It's supposed to resolve all
known mmconf problems. It still allows per-device quirks (tweaking
dev->cfg_size). It also allows to get rid of mmconf fallback code.

This patch fixes a boot hang on Intel Q35 chipset as detailed at:
http://bugs.gentoo.org/198810

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Pagano <mpagano@gentoo.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

moduleparam: fix alpha, ia64 and ppc64 compile failures

[upstream commit: 91d35dd9]

On alpha, ia64 and ppc64 only relocations to local data can go into
read-only sections. The vast majority of module parameters use the global
generic param_set_*/param_get_* functions, so the 'const' attribute for
struct kernel_param is not only useless, but it also causes compile
failures due to 'section type conflict' in those rare cases where
param_set/get are local functions.

This fixes http://bugzilla.kernel.org/show_bug.cgi?id=8964

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Pagano <mpagano@gentoo.org>
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

pata_hpt*, pata_serverworks: fix UDMA masking

[upstream commit: 6ddd6861]

When masking, mask out the modes that are unsupported not the ones
that are supported. This makes life happier.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

SCSI advansys: fix overrun_buf aligned bug

commit 7d5d408c77cee95d1380511de46b7a4c8dc2211d

struct asc_dvc_var needs overrun buffer to be placed on an 8 byte
boundary. advansys defines struct asc_dvc_var:

struct asc_dvc_var {
    ...
    uchar overrun_buf[ASC_OVERRUN_BSIZE] __aligned(8);

The problem is that struct asc_dvc_var is placed on
shost->hostdata. So if the hostdata is not on an 8 byte boundary, the
advansys crashes. The hostdata is placed on a sizeof(unsigned long)
boundary so the 8 byte boundary is not garanteed with x86_32.

With 2.6.23 and 2.6.24, the hostdata is on an 8 byte boundary by
chance, but with the current git, it's not.

This patch removes overrun_buf static array and use kzalloc.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
FUJITA Tomonori notes:
  We thought that 2.6.24 doesn't have this bug, however it does.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

NETFILTER: fix ebtable targets return

Upstream commit 1b04ab459:

The function ebt_do_table doesn't take NF_DROP as a verdict from the targets.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

NETFILTER: Fix incorrect use of skb_make_writable

Upstream commit eb1197bc0:

http://bugzilla.kernel.org/show_bug.cgi?id=9920
The function skb_make_writable returns true or false.

Signed-off-by: Joonwoo Park <joonwpark81@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

NETFILTER: nfnetlink_queue: fix SKB_LINEAR_ASSERT when mangling packet data

Upstream commit e2b58a67:

As reported by Tomas Simonaitis <tomas.simonaitis@gmail.com>, inserting new
data in skbs queued over {ip,ip6,nfnetlink}_queue triggers a SKB_LINEAR_ASSERT
in skb_put().

Going back through the git history, it seems this bug is present since at
least 2.6.12-rc2, probably even since the removal of skb_linearize() for
netfilter.

Linearize non-linear skbs through skb_copy_expand() when enlarging them.
Tested by Thomas, fixes bugzilla #9933.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

spi: pxa2xx_spi clock polarity fix

commit: b97c74bddce4e2c6fef6b3b58910b4fd9eb7f3b8

Fixes a sequencing bug in spi driver pxa2xx_spi.c in which the chip select
for a transfer may be asserted before the clock polarity is set on the
interface.  As a result of this bug, the clock signal may have the wrong
polarity at transfer start, so it may need to make an extra half transition
before the intended clock/data signals begin.  (This probably means all
transfers are one bit out of sequence.)

This only occurs on the first transfer following a change in clock polarity
in systems using more than one more than one such polarity.  The fix
assures that the clock mode is properly set before asserting chip select.

This bug was introduced in a patch merged on 2006/12/10, kernel 2.6.20.
The patch defines an additional bit in: include/asm-arm/arch-pxa/regs-ssp.h
for 2.6.25 and newer kernels but this addition must be made in:
include/asm-arm/arch-pxa/pxa-regs.h for kernels between 2.6.20 and 2.6.24,
inclusive

Signed-off-by: Ned Forrester <nforrester@whoi.edu>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[chrisw@sous-sol.org: backport to 2.6.24.3]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ufs: fix parenthesisation in ufs_set_fs_state()

commit: f81e8a43871f44f98dd14e83a83bf9ca0b3b46c5

This bug snuck in with

commit 252e211e90ce56bf005cb533ad5a297c18c19407
Author: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Date: Tue Oct 16 23:26:31 2007 -0700

Add in SunOS 4.1.x compatible mode for UFS

Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Acked-by: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

hugetlb: ensure we do not reference a surplus page after handing it to buddy

commit: e5df70ab194543522397fa3da8c8f80564a0f7d3

When we free a page via free_huge_page and we detect that we are in surplus
the page will be returned to the buddy.  After this we no longer own the page.

However at the end free_huge_page we clear out our mapping pointer from
page private.  Even where the page is not a surplus we free the page to
the hugepage pool, drop the pool locks and then clear page private.  In
either case the page may have been reallocated.  BAD.

Make sure we clear out page private before we free the page.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

file capabilities: simplify signal check

commit: 094972840f2e7c1c6fc9e1a97d817cc17085378e

Simplify the uid equivalence check in cap_task_kill(). Anyone can kill a
process owned by the same uid.

Without this patch wireshark is reported to fail.

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

futex: runtime enable pi and robust functionality

commit: a0c1e9073ef7428a14309cba010633a6cd6719ea

Not all architectures implement futex_atomic_cmpxchg_inatomic().  The default
implementation returns -ENOSYS, which is currently not handled inside of the
futex guts.

Futex PI calls and robust list exits with a held futex result in an endless
loop in the futex code on architectures which have no support.

Fixing up every place where futex_atomic_cmpxchg_inatomic() is called would
add a fair amount of extra if/else constructs to the already complex code.  It
is also not possible to disable the robust feature before user space tries to
register robust lists.

Compile time disabling is not a good idea either, as there are already
architectures with runtime detection of futex_atomic_cmpxchg_inatomic support.

Detect the functionality at runtime instead by calling
cmpxchg_futex_value_locked() with a NULL pointer from the futex initialization
code.  This is guaranteed to fail, but the call of
futex_atomic_cmpxchg_inatomic() happens with pagefaults disabled.

On architectures, which use the asm-generic implementation or have a runtime
CPU feature detection, a -ENOSYS return value disables the PI/robust features.

On architectures with a working implementation the call returns -EFAULT and
the PI/robust features are enabled.

The relevant syscalls return -ENOSYS and the robust list exit code is blocked,
when the detection fails.

Fixes http://lkml.org/lkml/2008/2/11/149
Originally reported by: Lennart Buytenhek

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Lennert Buytenhek <buytenh@wantstofly.org>
Cc: Riku Voipio <riku.voipio@movial.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

futex: fix init order

commit: 3e4ab747efa8e78562ec6782b08bbf21a00aba1b

When the futex init code fails to initialize the futex pseudo file system it
returns early without initializing the hash queues. Should the boot succeed
then a futex syscall which tries to enqueue a waiter on the hashqueue will
crash due to the unitilialized plist heads.

Initialize the hash queues before the filesystem.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Lennert Buytenhek <buytenh@wantstofly.org>
Cc: Riku Voipio <riku.voipio@movial.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ARM pxa: fix clock lookup to find specific device clocks

commit: a0dd005d1d9f4c3beab52086f3844ef9342d1e67

Ensure that the clock lookup always finds an entry for a specific
device and ID before it falls back to finding just by ID.  This
fixes a problem reported by Holger Schurig where the BTUART was
assigned the wrong clock.

Tested-by: Holger Schurig <hs4233@mail.mn-solutions.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Uli Luckas notes:

  The patch fixes the otherwise unusable bluetooth uart on pxa25x. The
  patch is written by Russell King [1] who also gave his OK for
  stable inclusion [2].  The patch is also available as commit
  a0dd005d1d9f4c3beab52086f3844ef9342d1e67 to Linus' tree.

  [1] http://marc.info/?l=linux-arm-kernel&m=120298366510315
  [2] http://marc.info/?l=linux-arm-kernel&m=120384388411097

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86: replace LOCK_PREFIX in futex.h

Commit: 9d55b9923a1b7ea8193b8875c57ec940dc2ff027

The exception fixup for the futex macros __futex_atomic_op1/2 and
futex_atomic_cmpxchg_inatomic() is missing an entry when the lock
prefix is replaced by a NOP via SMP alternatives.

Chuck Ebert tracked this down from the information provided in:
https://bugzilla.redhat.com/show_bug.cgi?id=429412

A possible solution would be to add another fixup after the
LOCK_PREFIX, so both the LOCK and NOP case have their own entry in the
exception table, but it's not really worth the trouble.

Simply replace LOCK_PREFIX with lock and keep those untouched by SMP
alternatives.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[cebbert@redhat.com: backport to 2.6.24]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

SCSI aic94xx: fix REQ_TASK_ABORT and REQ_DEVICE_RESET

commit: cb84e2d2ff3b50c0da5a7604a6d8634294a00a01

This driver has been failing under heavy load with

aic94xx: escb_tasklet_complete: REQ_TASK_ABORT, reason=0x6
aic94xx: escb_tasklet_complete: Can't find task (tc=4) to abort!

The second message is because the driver fails to identify the task
it's being asked to abort. On closer inpection, there's a thinko in
the for each task loop over pending tasks in both the REQ_TASK_ABORT
and REQ_DEVICE_RESET cases where it doesn't look at the task on the
pending list but at the one on the ESCB (which is always NULL).

Fix by looking at the right task. Also add a print for the case where
the pending SCB doesn't have a task attached.

Not sure if this will fix all the problems, but it's a definite first
step.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

SCSI gdth: don't call pci_free_consistent under spinlock

commit: ff83efacf2b77a1fe8942db6613825a4b80ee5e2

The spinlock is held over too large a region: pscratch is a permanent
address (it's allocated at boot time and never changes). All you need
the smp lock for is mediating the scratch in use flag, so fix this by
moving the spinlock into the case where we set the pscratch_busy flag
to false.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

SCSI ips: fix data buffer accessors conversion bug

commit: 2b28a4721e068ac89bd5435472723a1bc44442fe

This fixes a bug that can't handle a passthru command with more than
two sg entries.

Big thanks to Tim Pepper for debugging the problem.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Mark Salyzyn <Mark_Salyzyn@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

usb-storage: don't access beyond the end of the sg buffer

This patch (as1038) fixes a bug in usb_stor_access_xfer_buf() and
usb_stor_set_xfer_buf() (the bug was originally found by Boaz
Harrosh): The routine must not attempt to write beyond the end of a
scatter-gather list or beyond the number of bytes requested.

This is the minimal 2.6.24 equivalent to as1035 +
as1037 (7084191d53b224b953c8e1db525ea6c31aca5fc7 "USB:
usb-storage: don't access beyond the end of the sg buffer" +
6d512a80c26d87f8599057c86dc920fbfe0aa3aa "usb-storage: update earlier
scatter-gather bug fix"). Mark Glines has confirmed that it fixes
his problem.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Mark Glines <mark@glines.org>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

fuse: fix permission checking

[upstream commit 1a823ac9ff09cbdf39201df37b7ede1f9395de83]

I added a nasty local variable shadowing bug to fuse in 2.6.24, with the
result, that the 'default_permissions' mount option is basically ignored.

How did this happen?

- old err declaration in inner scope
- new err getting declared in outer scope
- 'return err' from inner scope getting removed
- old declaration not being noticed

-Wshadow would have saved us, but it doesn't seem practical for
the kernel :(

More testing would have also saved us :((

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

CRYPTO xts: Use proper alignment

[ Upstream commit: 6212f2c7f70c591efb0d9f3d50ad29112392fee2 ]

The XTS blockmode uses a copy of the IV which is saved on the stack
and may or may not be properly aligned. If it is not, it will break
hardware cipher like the geode or padlock.
This patch encrypts the IV in place so we don't have to worry about
alignment.

Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Tested-by: Stefan Hellermann <stefan@the2masters.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

CRYPTO xcbc: Fix crash with IPsec

[ Upstream commit: 2f40a178e70030c4712fe63807c883f34c3645eb ]

When using aes-xcbc-mac for authentication in IPsec,
the kernel crashes. It seems this algorithm doesn't
account for the space IPsec may make in scatterlist for authtag.
Thus when crypto_xcbc_digest_update2() gets called,
nbytes may be less than sg[i].length.
Since nbytes is an unsigned number, it wraps
at the end of the loop allowing us to go back
into loop and causing crash in memcpy.

I used update function in digest.c to model this fix.
Please let me know if it looks ok.

Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

SCSI ips: handle scsi_add_host() failure, and other err cleanups

commit 2551a13e61d3c3df6c2da6de5a3ece78e6d67111

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Acked-by: Mark Salyzyn <mark_salyzyn@adaptec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
FUJITA Tomonori notes:
  It didn't intend to fix a critical bug, however, it turned out that it
  does. Without this patch, the ips driver in 2.6.23 and 2.6.24 doesn't
  work at all. You can find the more details at the following thread:

  http://marc.info/?t=120293911900023&r=1&w=2

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86: adjust enable_NMI_through_LVT0()

commit e94271017f0933b29362a3c9dea5a6b9d04d98e1

Its previous use in a call to on_each_cpu() was pointless, as at the
time that code gets executed only one CPU is online. Further, the
function can be __cpuinit, and for this to work without
CONFIG_HOTPLUG_CPU setup_nmi() must also get an attribute (this one
can even be __init; on 64-bits check_timer() also was lacking that
attribute).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ tglx@linutronix.de: backport to 2.6.24.3]
Cc: Justin Piszcz <jpiszcz@lucidpixels.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drivers: fix dma_get_required_mask

commit: e88a0c2ca81207a75afe5bbb8020541dabf606ac
Date: Sun, 9 Mar 2008 11:57:56 -0500
Subject: drivers: fix dma_get_required_mask

There's a bug in the current implementation of dma_get_required_mask()
where it ands the returned mask with the current device mask. This
rather defeats the purpose if you're using the call to determine what
your mask should be (since you will at that time have the default
DMA_32BIT_MASK). This bug results in any driver that uses this function
*always* getting a 32 bit mask, which is wrong.

Fix by removing the and with dev->dma_mask.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

iov_iter_advance() fix

commit: f7009264c519603b8ec67c881bd368a56703cfc9

iov_iter_advance() skips over zero-length iovecs, however it does not properly
terminate at the end of the iovec array. Fix this by checking against
i->count before we skip a zero-length iov.

The bug was reproduced with a test program that continually randomly creates
iovs to writev. The fix was also verified with the same program and also it
could verify that the correct data was contained in the file after each
writev.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Tested-by: "Kevin Coffman" <kwc@citi.umich.edu>
Cc: "Alexey Dobriyan" <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

x86: Clear DF before calling signal handler

x86: Clear DF before calling signal handler

The Linux kernel currently does not clear the direction flag before
calling a signal handler, whereas the x86/x86-64 ABI requires that.
This become a real problem with gcc version 4.3, which assumes that
the direction flag is correctly cleared at the entry of a function.

This patches changes the setup_frame() functions to clear the
direction before entering the signal handler.

This is a backport of patch e40cd10ccff3d9fbffd57b93780bee4b7b9bff51
("x86: clear DF before calling signal handler") that has been applied
in 2.6.25-rc.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ub: fix up the conversion to sg_init_table()

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Cc: "Oliver Pinter" <oliver.pntr@gmail.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

MIPS: Mark all but i8259 interrupts as no-probe.

Use set_irq_noprobe() to mark all MIPS interrupts as non-probe. Override that
default for i8259 interrupts.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Acked-and-tested-by: Rob Landley <rob@landley.net>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>