git.karo-electronics.de Git - linux-beck.git/log

Fix possible runqueue lock starvation in wait_task_inactive()

Miklos Szeredi reported very long pauses (several seconds, sometimes
more) on his T60 (with a Core2Duo) which he managed to track down to
wait_task_inactive()'s open-coded busy-loop.

He observed that an interrupt on one core tries to acquire the
runqueue-lock but does not succeed in doing so for a very long time -
while wait_task_inactive() on the other core loops waiting for the first
core to deschedule a task (which it wont do while spinning in an
interrupt handler).

This rewrites wait_task_inactive() to do all its waiting optimistically
without any locks taken at all, and then just double-check the end
result with the proper runqueue lock held over just a very short
section.  If there were races in the optimistic wait, of a preemption
event scheduled the process away, we simply re-synchronize, and start
over.

So the code now looks like this:

repeat:
/* Unlocked, optimistic looping! */
rq = task_rq(p);
while (task_running(rq, p))
cpu_relax();

/* Get the *real* values */
rq = task_rq_lock(p, &flags);
running = task_running(rq, p);
array = p->array;
task_rq_unlock(rq, &flags);

/* Check them.. */
if (unlikely(running)) {
cpu_relax();
goto repeat;
}

/* Preempted away? Yield if so.. */
if (unlikely(array)) {
yield();
goto repeat;
}

Basically, that first "while()" loop is done entirely without any
locking at all (and doesn't check for the case where the target process
might have been preempted away), and so it's possibly "incorrect", but
we don't really care.  Both the runqueue used, and the "task_running()"
check might be the wrong tests, but they won't oops - they just mean
that we could possibly get the wrong results due to lack of locking and
exit the loop early in the case of a race condition.

So once we've exited the loop, we then get the proper (and careful) rq
lock, and check the running/runnable state _safely_.  And if it turns
out that our quick-and-dirty and unsafe loop was wrong after all, we
just go back and try it all again.

(The patch also adds a lot of comments, which is the actual bulk of it
all, to make it more obvious why we can do these things without holding
the locks).

Thanks to Miklos for all the testing and tracking it down.

Tested-by: Miklos Szeredi <miklos@szeredi.hu>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sched: fix SysRq-N (normalize RT tasks)

Gene Heskett reported the following problem while testing CFS: SysRq-N
is not always effective in normalizing tasks back to SCHED_OTHER.

The reason for that turns out to be the following bug:

- normalize_rt_tasks() uses for_each_process() to iterate through all
tasks in the system. The problem is, this method does not iterate
through all tasks, it iterates through all thread groups.

The proper mechanism to enumerate over all threads is to use a
do_each_thread() + while_each_thread() loop.

Reported-by: Gene Heskett <gene.heskett@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] ESP: Don't forget to clear ESP_FLAG_RESETTING.
[SCSI] fusion: fix for BZ 8426 - massive slowdown on SCSI CD/DVD drive

Fix signalfd interaction with thread-private signals

Don't let signalfd dequeue private signals off other threads (in the
case of things like SIGILL or SIGSEGV, trying to do so would result
in undefined behaviour on who actually gets the signal, since they
are force unblocked).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Revert "futex_requeue_pi optimization"

This reverts commit d0aa7a70bf03b9de9e995ab272293be1f7937822.

It not only introduced user space visible changes to the futex syscall,
it is also non-functional and there is no way to fix it proper before
the 2.6.22 release.

The breakage report ( http://lkml.org/lkml/2007/5/12/17 ) went
unanswered, and unfortunately it turned out that the concept is not
feasible at all. It violates the rtmutex semantics badly by introducing
a virtual owner, which hacks around the coupling of the user-space
pi_futex and the kernel internal rt_mutex representation.

At the moment the only safe option is to remove it fully as it contains
user-space visible changes to broken kernel code, which we do not want
to expose in the 2.6.22 release.

The patch reverts the original patch mostly 1:1, but contains a couple
of trivial manual cleanups which were necessary due to patches, which
touched the same area of code later.

Verified against the glibc tests and my own PI futex tests.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Ulrich Drepper <drepper@redhat.com>
Cc: Pierre Peiffer <pierre.peiffer@bull.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Linux 2.6.22-rc5

The manatees, they are dancing!

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

shm: fix the filename of hugetlb sysv shared memory

Some user space tools need to identify SYSV shared memory when examining
/proc/<pid>/maps. To do so they look for a block device with major zero, a
dentry named SYSV<sysv key>, and having the minor of the internal sysv
shared memory kernel mount.

To help these tools and to make it easier for people just browsing
/proc/<pid>/maps this patch modifies hugetlb sysv shared memory to use the
SYSV<key> dentry naming convention.

User space tools will still have to be aware that hugetlb sysv shared
memory lives on a different internal kernel mount and so has a different
block device minor number from the rest of sysv shared memory.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Albert Cahalan <acahalan@gmail.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

hugetlb: fix get_policy for stacked shared memory files

Here's another breakage as a result of shared memory stacked files :(

The NUMA policy for a VMA is determined by checking the following (in the
order given):

1) vma->vm_ops->get_policy() (if defined)
2) vma->vm_policy (if defined)
3) task->mempolicy (if defined)
4) Fall back to default_policy

By switching to stacked files for shared memory, get_policy() is now always
set to shm_get_policy which is a wrapper function. This causes us to stop
at step 1, which yields NULL for hugetlb instead of task->mempolicy which
was the previous (and correct) result.

This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for
the wrapped vm_ops.

(akpm: the refcounting of mempolicies is busted and this patch does nothing to
improve it)

Signed-off-by: Adam Litke <agl@us.ibm.com>
Acked-by: William Irwin <bill.irwin@oracle.com>
Cc: dean gaudet <dean@arctic.org>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

udf: fix possible leakage of blocks

We have to take care that when we call udf_discard_prealloc() from
udf_clear_inode() we have to write inode ourselves afterwards (otherwise,
some changes might be lost leading to leakage of blocks, use of free blocks
or improperly aligned extents).

Also udf_discard_prealloc() does two different things - it removes
preallocated blocks and truncates the last extent to exactly match i_size.
We move the latter functionality to udf_truncate_tail_extent(), call
udf_discard_prealloc() when last reference to a file is dropped and call
udf_truncate_tail_extent() when inode is being removed from inode cache
(udf_clear_inode() call).

We cannot call udf_truncate_tail_extent() earlier as subsequent open+write
would find the last block of the file mapped and happily write to the end
of it, although the last extent says it's shorter.

[akpm@linux-foundation.org: Make checkpatch.pl happier]
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Eric Sandeen <sandeen@sandeen.net>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

SLUB: minimum alignment fixes

If ARCH_KMALLOC_MINALIGN is set to a value greater than 8 (SLUBs smallest
kmalloc cache) then SLUB may generate duplicate slabs in sysfs (yes again)
because the object size is padded to reach ARCH_KMALLOC_MINALIGN.  Thus the
size of the small slabs is all the same.

No arch sets ARCH_KMALLOC_MINALIGN larger than 8 though except mips which
for some reason wants a 128 byte alignment.

This patch increases the size of the smallest cache if
ARCH_KMALLOC_MINALIGN is greater than 8.  In that case more and more of the
smallest caches are disabled.

If we do that then the count of the active general caches that is displayed
on boot is not correct anymore since we may skip elements of the kmalloc
array.  So count them separately.

This approach was tested by Havard yesterday.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Rework ptep_set_access_flags and fix sun4c

Some changes done a while ago to avoid pounding on ptep_set_access_flags and
update_mmu_cache in some race situations break sun4c which requires
update_mmu_cache() to always be called on minor faults.

This patch reworks ptep_set_access_flags() semantics, implementations and
callers so that it's now responsible for returning whether an update is
necessary or not (basically whether the PTE actually changed). This allow
fixing the sparc implementation to always return 1 on sun4c.

[akpm@linux-foundation.org: fixes, cleanups]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: David Miller <davem@davemloft.net>
Cc: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Acked-by: William Lee Irwin III <wli@holomorphy.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

random: fix output buffer folding

(As reported by linux@horizon.com)

Folding is done to minimize the theoretical possibility of systematic
weakness in the particular bits of the SHA1 hash output. The result of
this bug is that 16 out of 80 bits are un-folded. Without a major new
vulnerability being found in SHA1, this is harmless, but still worth
fixing.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: <linux@horizon.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

uml: kill x86_64 STACK_TOP_MAX

The x86_64 a.out.h got a definition of STACK_TOP_MAX, which interferes with
the UML version. So, just undef it like STACK_TOP.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

uml: remove PAGE_SIZE from libc code

Distros seem to be removing PAGE_SIZE from asm/page.h. So, the libc side of
UML should stop using it.

I replace it with UM_KERN_PAGE_SIZE, which is defined to be the same as
PAGE_SIZE on the kernel side of the house. I could also use getpagesize(),
but it's more important that UML have the same value of PAGE_SIZE everywhere.
It's conceivable that it could be built with a larger PAGE_SIZE, and use of
getpagesize() would break that badly.

PAGE_MASK got the same treatment, as it is closely tied to PAGE_SIZE.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

spi doc updates

Update two points in the SPI interface documentation:

- Update description of the "chip stays selected after message ends"
  mode.  In some cases it's required for correctness; it isn't just a
  performance tweak.  (Yes: to use this mode on mult-device busses, another
  programming interface will be needed.  One draft has been circulated
  already.)

- Clarify spi_setup(), highlighting that callers must ensure that no
  requests are queued (can't change configuration except between I/Os), and
  that the device must be deselected when this returns (which is a key part
  of why it's called during device init).

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: fix bug in error handling during raid1 repair

If raid1/repair (which reads all block and fixes any differences it finds)
hits a read error, it doesn't reset the bio for writing before writing
correct data back, so the read error isn't fixed, and the device probably
gets a zero-length write which it might complain about.

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: fix two raid10 bugs

1/ When resyncing a degraded raid10 which has more than 2 copies of each block,
garbage can get synced on top of good data.

2/ We round the wrong way in part of the device size calculation, which
can cause confusion.

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fuse: ->fs_flags fixlet

fs/fuse/inode.c:658:3: error: Initializer entry defined twice
fs/fuse/inode.c:661:3: also defined here

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

perfctr-watchdog: fix interchanged parameters to release_{evntsel,perfctr}_nmi

Fix oops triggered during: echo 0 > /proc/sys/kernel/nmi_watchdog

The culprit seems to be 09198e68501a7e34737cd9264d266f42429abcdc:
[PATCH] i386: Clean up NMI watchdog code

In two places, the parameters to release_{evntsel,perfctr}_nmi
got interchanged during the cleanup.

Fix interchanged parameters to release_{evntsel,perfctr}_nmi.

Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
Cc: Stephane Eranian <eranian@hpl.hp.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

swsusp: Fix userland interface

Fix oops caused by 'cat /dev/snapshot', reported by Arkadiusz Miskiewicz,
and make it impossible to thaw tasks with the help of the swsusp userland
interface while there is a snapshot image ready to save.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

toshiba_acpi: fix section mismatch in allyesconfig

Fix section error (allyesconfig). The exit function is called from init,
so functions that are called by the exit function cannot be marked __exit.

WARNING: drivers/built-in.o(.text+0xe5bc6): Section mismatch: reference to .exit.
text: (between 'toshiba_acpi_exit' and 'hci_raw')

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Len Brown <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cpuset: zero malloc - fix for old cpusets

The cpuset code to present a list of tasks using a cpuset to user space could
write to an array that it had kmalloc'd, after a kmalloc request of zero size.

The problem was that the code didn't check for writes past the allocated end
of the array until -after- the first write.

This is a race condition that is likely rare -- it would only show up if a
cpuset went from being empty to having a task in it, during the brief time
between the allocation and the first write.

Prior to roughly 2.6.22 kernels, this was also a benign problem, because a
zero kmalloc returned a few usable bytes anyway, and no harm was done with the
bogus write.

With the 2.6.22 kernel changes to make issue a warning if code tries to write
to the location returned from a zero size allocation, this problem is no
longer benign. This cpuset code would occassionally trigger that warning.

The fix is trivial -- check before storing into the array, not after, whether
the array is big enough to hold the store.

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Serge E. Hallyn" <serue@us.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: Paul Menage <menage@google.com>
Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

i386 mm: use pte_update() in ptep_test_and_clear_dirty()

It is not safe to use pte_update_defer() in ptep_test_and_clear_young():
its only user, /proc/<pid>/clear_refs, drops pte lock before flushing TLB.
Use the safe though less efficient pte_update() paravirtop in its place.
Likewise in ptep_test_and_clear_dirty(), though that has no current use.

These are macros (header file dependency stops them from becoming inline
functions), so be more liberal with the underscores and parentheses.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Restore shmid as inode# to fix /proc/pid/maps ABI breakage

shmid used to be stored as inode# for shared memory segments. Some of
the proc-ps tools use this from /proc/pid/maps. Recent cleanups
to newseg() changed it. This patch sets inode number back to shared
memory id to fix breakage.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Cc: "Albert Cahalan" <acahalan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

SLUB slab validation: Alloc while interrupts are disabled must use GFP_ATOMIC

The data structure to manage the information gathered about functions
allocating and freeing objects is allocated when the list_lock has already
been taken. We need to allocate with GFP_ATOMIC instead of GFP_KERNEL.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

i386: use the right wrapper to disable the NMI watchdog

When disabled through /proc/sys/kernel/nmi_watchdog, the NMI watchdog uses the
stop() method directly, which does not decrement the activity counter, leading
to a BUG(). Use the wrapper function instead to fix that.

Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

i386: fix NMI watchdog not reserving its MSRs

At system boot time, the NMI watchdog no longer reserved its MSRs, allowing
other subsystems to mess with them. Fix that.

Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

tty: restore locked ioctl file op

Restore tty locked ioctl handler which was replaced with
an unlocked ioctl handler in hung_up_tty_fops by the patch:

commit e10cc1df1d2014f68a4bdcf73f6dd122c4561f94
Author: Paul Fulghum <paulkf@microgate.com>
Date:   Thu May 10 22:22:50 2007 -0700

    tty: add compat_ioctl

This was reported in:
[Bug 8473] New: Oops: 0010 [1] SMP

The bug is caused by switching to hung_up_tty_fops in do_tty_hangup.  An
ioctl call can be waiting on BLK after testing for existence of the locked
ioctl handler in the normal tty fops, but before calling the locked ioctl
handler.  If a hangup occurs at that point, the locked ioctl fop is NULL
and an oops occurs.

(akpm: we can remove my debugging code from do_ioctl() now, but it'll be OK to
do that for 2.6.23)

Signed-off-by: Paul Fulghum <paulkf@microgate.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fix radeon setparam on 32/64 systems, harder.

Commit 9b01bd5b284bbf519b726b39f1352023cb5e9e69 introduced a
compat_ioctl handler for RADEON_SETPARAM, the sole purpose of which was
to handle the fact that on i386, alignof(uint64_t)==4.

Unfortunately, this handler was installed for _all_ 64-bit
architectures, instead of only x86_64 and ia64. And thus it breaks
32-bit compatibility on every other arch, where 64-bit integers are
aligned to 8 bytes in 32-bit mode just the same as in 64-bit mode.

Arnd has a cunning plan to use 'compat_u64' with appropriate alignment
attributes according to the 32-bit ABI, but for now let's just make the
compat_radeon_cp_setparam routine entirely disappear on 64-bit machines
whose 32-bit compat support isn't for i386. It would be a no-op with
compat_u64 anyway.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Airlie <airlied@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6:
ide-scsi: fix OOPS in idescsi_expiry()
Resume from RAM on HPC nx6325 broken

ide-scsi: fix OOPS in idescsi_expiry()

drive->driver_data contains pointer to Scsi_Host not idescsi_scsi_t.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

Resume from RAM on HPC nx6325 broken

generic_ide_resume() should check if dev->driver is not NULL before applying
to_ide_driver() to it. Fix that.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [RXRPC] net/rxrpc/ar-connection.c: fix NULL dereference
  [TCP]: Fix logic breakage due to DSACK separation
  [TCP]: Congestion control API RTT sampling fix

mm: Fix memory/cpu hotplug section mismatch and oops.

When building with memory hotplug enabled and cpu hotplug disabled, we
end up with the following section mismatch:

WARNING: mm/built-in.o(.text+0x4e58): Section mismatch: reference to
.init.text: (between 'free_area_init_node' and '__build_all_zonelists')

This happens as a result of:

        -> free_area_init_node()
          -> free_area_init_core()
            -> zone_pcp_init() <-- all __meminit up to this point
              -> zone_batchsize() <-- marked as __cpuinit                     fo

This happens because CONFIG_HOTPLUG_CPU=n sets __cpuinit to __init, but
CONFIG_MEMORY_HOTPLUG=y unsets __meminit.

Changing zone_batchsize() to __devinit fixes this.

__devinit is the only thing that is common between CONFIG_HOTPLUG_CPU=y and
CONFIG_MEMORY_HOTPLUG=y. In the long run, perhaps this should be moved to
another section identifier completely. Without this, memory hot-add
of offline nodes (via hotadd_new_pgdat()) will oops if CPU hotplug is
not also enabled.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
--

mm/page_alloc.c |    2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6

* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: (30 commits)
  Blackfin SMC91X ethernet supporting driver: SMC91C111 LEDs are note drived in the kernel like in uboot
  Blackfin SPI driver: fix bug SPI DMA incomplete transmission
  Blackfin SPI driver: tweak spi cleanup function to match newer kernel changes
  Blackfin RTC drivers: update MAINTAINERS information
  Blackfin serial driver: decouple PARODD and CMSPAR checking from PARENB
  Blackfin serial driver: actually implement the break_ctl() function
  Blackfin serial driver: ignore framing and parity errors
  Blackfin serial driver: hook up our UARTs STP bit with userspaces CMSPAR
  Blackfin arch: move HI/LO macros into blackfin.h and punt the rest of macros.h as it includes VDSP macros we never use
  Blackfin arch: redo our linker script a bit
  Blackfin arch: make sure we initialize our L1 Data B section properly based on the linked kernel
  Blackfin arch: fix bug can not wakeup from sleep via push buttons
  Blackfin arch: add support for Alon Bar-Lev's dynamic kernel command-line
  Blackfin arch: add missing gpio.h header to fix compiling in some pm configurations
  Blackfin arch: As Mike pointed out range goes form m..MAX_BLACKFIN_GPIO -1
  Blackfin arch: fix spelling typo in output
  Blackfin arch: try to split up functions like this into smaller units according to LKML review
  Blackfin arch: add proper ENDPROC()
  Blackfin arch: move more of our startup code to .init so it can be freed once we are up and running
  Blackfin arch: unify differences between our diff head.S files -- no functional changes
  ...

Merge branch 'splice-2.6.22' of git://git.kernel.dk/data/git/linux-2.6-block

* 'splice-2.6.22' of git://git.kernel.dk/data/git/linux-2.6-block:
  splice: only check do_wakeup in splice_to_pipe() for a real pipe
  splice: fix leak of pages on short splice to pipe
  splice: adjust balance_dirty_pages_ratelimited() call

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
KVM: Prevent guest fpu state from leaking into the host

Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  [MIPS] Fix builds where MSC01E_xxx is undefined.
  [MIPS] Separate performance counter interrupts
  [MIPS] Malta: Fix for SOCitSC based Maltas

Merge branch 'for-linus' of git://www.atmel.no/~hskinnemoen/linux/kernel/avr32

* 'for-linus' of git://www.atmel.no/~hskinnemoen/linux/kernel/avr32:
  [AVR32] Define ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES
  [AVR32] STK1000: Set SPI_MODE_3 in the ltv350qv board info
  [AVR32] gpio_*_cansleep() fix
  [AVR32] ratelimit segfault reporting rate

block: always requeue !fs requests at the front

SCSI marks internal commands with REQ_PREEMPT and push it at the front
of the request queue using blk_execute_rq().  When entering suspended
or frozen state, SCSI devices are quiesced using
scsi_device_quiesce().  In quiesced state, only REQ_PREEMPT requests
are processed.  This is how SCSI blocks other requests out while
suspending and resuming.  As all internal commands are pushed at the
front of the queue, this usually works.

Unfortunately, this interacts badly with ordered requeueing.  To
preserve request order on requeueing (due to busy device, active EH or
other failures), requests are sorted according to ordered sequence on
requeue if IO barrier is in progress.

The following sequence deadlocks.

1. IO barrier sequence issues.

2. Suspend requested.  Queue is quiesced with part or all of IO
   barrier sequence at the front.

3. During suspending or resuming, SCSI issues internal command which
   gets deferred and requeued for some reason.  As the command is
   issued after the IO barrier in #1, ordered requeueing code puts the
   request after IO barrier sequence.

4. The device is ready to process requests again but still is in
   quiesced state and the first request of the queue isn't
   REQ_PREEMPT, so command processing is deadlocked -
   suspending/resuming waits for the issued request to complete while
   the request can't be processed till device is put back into
   running state by resuming.

This can be fixed by always putting !fs requests at the front when
requeueing.

The following thread reports this deadlock.

  http://thread.gmane.org/gmane.linux.kernel/537473

Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: David Greaves <david@dgreaves.com>
Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[RXRPC] net/rxrpc/ar-connection.c: fix NULL dereference

This patch fixes a NULL dereference spotted by the Coverity checker.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Fix logic breakage due to DSACK separation

Commit 6f74651ae626ec672028587bc700538076dfbefb is found guilty
of breaking DSACK counting, which should be done only for the
SACK block reported by the DSACK instead of every SACK block
that is received along with DSACK information.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Congestion control API RTT sampling fix

Commit 164891aadf1721fca4dce473bb0e0998181537c6 broke RTT
sampling of congestion control modules. Inaccurate timestamps
could be fed to them without providing any way for them to
identify such cases. Previously RTT sampler was called only if
FLAG_RETRANS_DATA_ACKED was not set filtering inaccurate
timestamps nicely. In addition, the new behavior could give an
invalid timestamp (zero) to RTT sampler if only skbs with
TCPCB_RETRANS were ACKed. This solves both problems.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Fix console output getting dropped on platforms without udbg_putc
[POWERPC] Fix per-cpu allocation on oldworld SMP powermacs

splice: only check do_wakeup in splice_to_pipe() for a real pipe

We only ever set do_wakeup to non-zero if the pipe has an inode
backing, so it's pointless to check outside the pipe->inode
check.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

splice: fix leak of pages on short splice to pipe

If the destination pipe is full and we already transferred
data, we break out instead of waiting for more pipe room.
The exit logic looks at spd->nr_pages to see if we moved
everything inside the spd container, but we decrement that
variable in the loop to decide when spd has emptied.

Instead we want to compare to the original page count in
the spd, so cache that in a local variable.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

splice: adjust balance_dirty_pages_ratelimited() call

As we have potentially dirtied more than 1 page, we should indicate as
such to the dirty page balancing. So call
balance_dirty_pages_ratelimited_nr() and pass in the approximate number
of pages we dirtied.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

KVM: Prevent guest fpu state from leaking into the host

The lazy fpu changes did not take into account that some vmexit handlers
can sleep. Move loading the guest state into the inner loop so that it
can be reloaded if necessary, and move loading the host state into
vmx_vcpu_put() so it can be performed whenever we relinquish the vcpu.

Signed-off-by: Avi Kivity <avi@qumranet.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fix

* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fix:
kbuild: fix sh64 section mismatch problems

Merge branch 'drm-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6

* 'drm-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm: fix radeon setparam on 32/64 bit systems.
  drm/i915:  Add support for the G33, Q33, and Q35 chipsets.
  i915: add new pciids for 945GME, 965GME/GLE

Merge master.kernel.org:/pub/scm/linux/kernel/git/kyle/parisc-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/kyle/parisc-2.6: (30 commits)
  [PARISC] remove global_ack_eiem
  [PARISC] Fix kernel panic in check_ivt
  [PARISC] Fix bug when syscall nr is __NR_Linux_syscalls
  [PARISC] be more defensive in process.c::get_wchan
  [PARISC] fix "reduce size of task_struct on 64-bit machines" fallout
  [PARISC] fix null ptr deref in unwind.c
  [PARISC] fix trivial spelling nit in asm/linkage.h
  [PARISC] remove remnants of parisc-specific softirq code
  [PARISC] fix section mismatch in smp.c
  [PARISC] fix "ENTRY" macro redefinition
  [PARISC] Wire up utimensat/signalfd/timerfd/eventfd syscalls
  [PARISC] fix section mismatch in superio serial drivers
  [PARISC] fix section mismatch in parisc eisa driver
  [PARISC] fix section mismatches in arch/parisc/kernel
  [PARISC] fix section mismatch in ccio-dma
  [PARISC] fix section mismatch in parisc STI video drivers
  [PARISC] fix section mismatch in parport_gsc
  [PARISC] fix lasi_82596 build
  [PARISC] Build fixes for power.c
  [PARISC] kobject is embedded in subsys, not kset
  ...

Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart

* master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart:
[AGPGART] intel_agp: fix device probe

Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6

* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC64]: Fix args to sun4v_ldc_revoke().
  [SPARC64]: Really fix parport.
  [SPARC64]: Fix IO/MEM space sizing for PCI.
  [SPARC64]: Wire up cookie based sun4v interrupt registry.

[AGPGART] intel_agp: fix device probe

This patch trys to fix device probe in two cases. First we should
correctly detect device if integrated graphics device is not enabled
or exists, like an add-in card is plugged. Second on some type of intel
GMCH, it might have multiple graphic chip models, like 945GME case, so
we should be sure the detect works through the whole table.

Signed-off-by: Wang Zhenyu <zhenyu.z.wang@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>

Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [IPV6] addrconf: Fix IPv6 on tuntap tunnels
  [TCP]: Add missing break to TCP option parsing code
  [SCTP] Don't disable PMTU discovery when mtu is small
  [SCTP] Flag a pmtu change request
  [SCTP] Update pmtu handling to be similar to tcp
  [SCTP] Fix leak in sctp_getsockopt_local_addrs when copy_to_user fails
  [SCTP]: Allow unspecified port in sctp_bindx()
  [SCTP]: Correctly set daddr for IPv6 sockets during peeloff
  [TCP]: Set initial_ssthresh default to zero in Cubic and BIC.
  [TCP]: Fix left_out setting during FRTO
  [TCP]: Disable TSO if MD5SIG is enabled.
  [PPP_MPPE]: Fix "osize too small" check.
  [PATCH] mac80211: Don't stop tx queue on master device while scanning.
  [PATCH] mac80211: fix debugfs tx power reduction output
  [PATCH] cfg80211: fix signed macaddress in sysfs
  [IrDA]: f-timer reloading when sending rejected frames.
  [IrDA]: Fix Rx/Tx path race.

Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 4445/1: ANUBIS: Fix CPLD registers
  [ARM] 4444/2: OSIRIS: CPLD suspend fix
  [ARM] 4443/1: OSIRIS: Add watchdog device to machine devices
  [ARM] 4442/1: OSIRIS: Fix CPLD register definitions
  [ARM] VFP: fix section mismatch error

Merge master.kernel.org:/pub/scm/linux/kernel/git/vxy/lksctp-dev

[IPV6] addrconf: Fix IPv6 on tuntap tunnels

The recent patch that added ipv6_hwtype is broken on tuntap tunnels.
Indeed, it's broken on any device that does not pass the ipv6_hwtype
test.

The reason is that the original test only applies to autoconfiguration,
not IPv6 support.  IPv6 support is allowed on any device.  In fact,
even with the ipv6_hwtype patch applied you can still add IPv6 addresses
to any interface that doesn't pass thw ipv6_hwtype test provided that
they have a sufficiently large MTU.  This is a serious problem because
come deregistration time these devices won't be cleaned up properly.

I've gone back and looked at the rationale for the patch.  It appears
that the real problem is that we were creating IPv6 devices even if the
MTU was too small.  So here's a patch which fixes that and reverts the
ipv6_hwtype stuff.

Thanks to Kanru Chen for reporting this issue.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

[TCP]: Add missing break to TCP option parsing code

This flaw does not affect any behavior (currently).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

[MIPS] Fix builds where MSC01E_xxx is undefined.

Signed-off-by: Chris Dearman <chris@mips.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

[MIPS] Separate performance counter interrupts

Support for performance counter overflow interrupt that is on a separate
interrupt from the timer.

Signed-off-by: Chris Dearman <chris@mips.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

[MIPS] Malta: Fix for SOCitSC based Maltas

And an attempt to tidy up the core/controller differences.

Signed-off-by: Chris Dearman <chris@mips.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

[AVR32] Define ARCH_KMALLOC_MINALIGN to L1_CACHE_BYTES

This allows SLUB debugging to be used without fear of messing up DMA
transfers. SPI is one example that easily breaks without this patch.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>

[AVR32] STK1000: Set SPI_MODE_3 in the ltv350qv board info

In the latest incarnation of the ltv350qv driver the call to
spi_setup() has been removed. So we need to initialize things more
carefully in the board info struct.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>

[AVR32] gpio_*_cansleep() fix

The AVR32 <asm/gpio.h> was missing the gpio_*_cansleep() calls,
breaking compilation for some code using them.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>

[AVR32] ratelimit segfault reporting rate

Limit the rate of the kernel logging for the segfaults of user
applications, to avoid potential message floods or denial-of-service
attacks.

Signed-off-by: Andrea Righi <a.righi@cineca.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>

[POWERPC] Fix console output getting dropped on platforms without udbg_putc

Previously, registering this early console would just result
in dropping early buffered printk output until a udbg_putc
was registered.

However, commit 69331af79cf29e26d1231152a172a1a10c2df511
clears the CON_PRINTBUFFER flag on the main console when a
CON_BOOT (early) console has been registered, resulting in
the buffered messages never being displayed to the user.

This fixes the problem by making sure we don't register udbg_console
on platforms that don't implement udbg_putc.

Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Mark A. Greer <mgreer@mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>

[POWERPC] Fix per-cpu allocation on oldworld SMP powermacs

The per-cpu area(a) for the secondary CPU(s) isn't getting allocated
on old SMP powermacs that don't have the secondary CPU(s) listed in
the device tree, as per-cpu areas are now only allocated for CPUs in
the cpu_possible_map, and we aren't setting the bits for the secondary
CPU(s) until smp_prepare_cpus(), which is after per-cpu allocation.
Therefore this sets the bits for CPUs 1..3 in cpu_possible_map in
pmac_setup_arch, so they get per-cpu data allocated.

Signed-off-by: Paul Mackerras <paulus@samba.org>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: move input-polldev to drivers/input
  Input: i8042 - add ULI EV4873 to noloop list
  Input: i8042 - add ASUS P65UP5 to the noloop list
  Input: usbtouchscreen - fix fallout caused by move from drivers/usb

[SCTP] Don't disable PMTU discovery when mtu is small

Right now, when we receive a mtu estimate smaller then minim
threshold in the ICMP message, we disable the path mtu discovery
on the transport. This leads to the never increasing sctp fragmentation
point even when the real path mtu has increased.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>

[SCTP] Flag a pmtu change request

Currently, if the socket is owned by the user, we drop the ICMP
message.  As a result SCTP forgets that path MTU changed and
never adjusting it's estimate.  This causes all subsequent
packets to be fragmented.  With this patch, we'll flag the association
that it needs to udpate it's estimate based on the already updated
routing information.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>

[SCTP] Update pmtu handling to be similar to tcp

Introduce new function sctp_transport_update_pmtu that updates
the transports and destination caches view of the path mtu.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>

[SCTP] Fix leak in sctp_getsockopt_local_addrs when copy_to_user fails

If the copy_to_user or copy_user calls fail in sctp_getsockopt_local_addrs(),
the function should free locally allocated storage before returning error.
Spotted by Coverity.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>

[SCTP]: Allow unspecified port in sctp_bindx()

Allow sctp_bindx() to accept multiple address with
unspecified port. In this case, all addresses inherit
the first bound port. We still catch full mis-matches.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>

[SCTP]: Correctly set daddr for IPv6 sockets during peeloff

During peeloff of AF_INET6 socket, the inet6_sk(sk)->daddr
wasn't set correctly since the code was assuming IPv4 only.
Now we use a correct call to set the destination address.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>

[SCSI] ESP: Don't forget to clear ESP_FLAG_RESETTING.

esp_reset_cleanup() does everything necessary except clear
the flag, so we never exit resetting state.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

mmc: get back read-only switch function

Somehow the code to read the read-only switch of SD cards got lost
in the reorganisation.

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>

mmc-omap: fix sd response type 6 vs. 1

Ignoring OMAP_MMC_STAT_CARD_ERR, treating it as if the command
completed correctly.

Signed-off-by: Ragner Magalhaes <ragner.magalhaes@indt.org.br>
Signed-off-by: Carlos Eduardo Aguiar <carlos.aguiar@indt.org.br>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>

[TCP]: Set initial_ssthresh default to zero in Cubic and BIC.

Because of the current default of 100, Cubic and BIC perform very
poorly compared to standard Reno.

In the worst case, this change makes Cubic and BIC as aggressive as
Reno. So this change should be very safe.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Fix args to sun4v_ldc_revoke().

First argument is LDC channel ID, then mapping cookie,
then the MTE revoke cookie.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Really fix parport.

We were passing a "struct pci_dev *" instead of a
"struct device *" to the parport registry routines.
No wonder things exploded.

The ebus_bus_type hacks can be backed out from
asm-sparc64/dma-mapping.h, those were wrong.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Fix IO/MEM space sizing for PCI.

In pci_determine_mem_io_space(), do not hard code the region sizes.
Instead, use the values given to us in the ranges property.

Thanks goes to Mikael Petterson for the original Xorg failure
bug repoert, and strace dumps from Mikael and Dmitry Artamonow.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Wire up cookie based sun4v interrupt registry.

This will be used for logical domain channel interrupts.

Signed-off-by: David S. Miller <davem@davemloft.net>

Input: move input-polldev to drivers/input

To work around deficiences in Kconfig that allows to "select"
a symbol without automatically selecting all dependencies for
that symbol move input-polldev from drivers/input/misc to
drivers/input thus removing extra dependency on CONFIG_INPUT_MISC.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>

Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6

* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (89 commits)
  myri10ge: update driver version
  myri10ge: report when the link partner is running in Myrinet mode
  myri10ge: limit the number of recoveries
  NetXen: Fix link status messages
  Revert "[netdrvr e100] experiment with doing RX in a similar manner to eepro100"
  [PATCH] libertas: convert libertas_mpp into anycast_mask
  [PATCH] libertas: actually send mesh frames to mesh netdev
  [PATCH] libertas: deauthenticate from AP in channel switch
  [PATCH] libertas: pull current channel from firmware on mesh autostart
  [PATCH] libertas: reduce SSID and BSSID mixed-case abuse
  [PATCH] libertas: remove WPA_SUPPLICANT structure
  [PATCH] libertas: remove structure WLAN_802_11_SSID and libertas_escape_essid
  [PATCH] libertas: tweak association debug output
  [PATCH] libertas: fix big-endian associate command.
  [PATCH] libertas: don't byte-swap firmware version number. It's a byte array.
  [PATCH] libertas: more endianness fixes, in tx.c this time
  [PATCH] libertas: More endianness fixes.
  [PATCH] libertas: first pass at fixing up endianness issues
  [PATCH] libertas: sparse fixes
  [PATCH] libertas: fix character set in README
  ...

Merge branch 'master' into upstream-fixes

Merge branch 'libertas-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream-fixes

Merge branch 'libertas' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream-fixes

[TCP]: Fix left_out setting during FRTO

Without FRTO, the tcp_try_to_open is never called with
lost_out > 0 (see tcp_time_to_recover). However, when FRTO is
enabled, the !tp->lost condition is not used until end of FRTO
because that way TCP avoids premature entry to fast recovery
during FRTO.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

sysfs: fix race condition around sd->s_dentry, take#2

Allowing attribute and symlink dentries to be reclaimed means
sd->s_dentry can change dynamically.  However, updates to the field
are unsynchronized leading to race conditions.  This patch adds
sysfs_lock and use it to synchronize updates to sd->s_dentry.

Due to the locking around ->d_iput, the check in sysfs_drop_dentry()
is complex.  sysfs_lock only protect sd->s_dentry pointer itself.  The
validity of the dentry is protected by dcache_lock, so whether dentry
is alive or not can only be tested while holding both locks.

This is minimal backport of sysfs_drop_dentry() rewrite in devel
branch.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sysfs: fix condition check in sysfs_drop_dentry()

The condition check doesn't make much sense as it basically always
succeeds. This causes NULL dereferencing on certain cases. It seems
that parentheses are put in the wrong place. Fix it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses

Backport of
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch

For regular files in sysfs, sysfs_readdir wants to traverse
sysfs_dirent->s_dentry->d_inode->i_ino to get to the inode number.
But, the dentry can be reclaimed under memory pressure, and there is
no synchronization with readdir. This patch follows Tejun's scheme of
allocating and storing an inode number in the new s_ino member of a
sysfs_dirent, when dirents are created, and retrieving it from there
for readdir, so that the pointer chain doesn't have to be traversed.

Tejun's upstream patch uses a new-ish "ida" allocator which brings
along some extra complexity; this -stable patch has a brain-dead
incrementing counter which does not guarantee uniqueness, but because
sysfs doesn't hash inodes as iunique expects, uniqueness wasn't
guaranteed today anyway.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

myri10ge: update driver version

Update myri10ge driver version to 1.3.1-1.248.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

myri10ge: report when the link partner is running in Myrinet mode

Since Myri-10G boards may also run in Myrinet mode instead of Ethernet,
add a message when we detect that the link partner is not running in the
right mode.

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

myri10ge: limit the number of recoveries

Limit the number of recoveries from a NIC hw watchdog reset to 1 by default.
It enables detection of defective NICs immediately since these memory parity
errors are expected to happen very rarely (less than once per century*NIC).

Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

NetXen: Fix link status messages

NetXen: Fix incorrect link status even with switch turned OFF.
NetXen driver failed to accurately indicate when a link is up or down.
This was encountered during failover testing, when the first port
indicated that the link was up even when the 10G switch it was assigned
to in the Bladecenter was turned off completely.

Signed-off by: Wen Xiong <wenxiong@us.ibm.com>
Signed-off by: Mithlesh Thukral <mithlesh@netxen.com>

Signed-off-by: Jeff Garzik <jeff@garzik.org>

Revert "[netdrvr e100] experiment with doing RX in a similar manner to eepro100"

This reverts commit d52df4a35af569071fda3f4eb08e47cc7023f094.

This patch attempted to fix e100 for non-cache coherent memory
architectures by using the cb style code that eepro100 had and using
the EL and s bits from the RFD list. Unfortunately the hardware
doesn't work exactly like this and therefore this patch actually
breaks e100. Reverting the change brings it back to the previously
known good state for 2.6.22. The pending rewrite in progress to this
code can then be safely merged later.

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

[TCP]: Disable TSO if MD5SIG is enabled.

Signed-off-by: David S. Miller <davem@davemloft.net>

[PPP_MPPE]: Fix "osize too small" check.

Prevent mppe_decompress() from generating "osize too small" errors when
checking for output buffer size. When receiving a packet of mru size the
output buffer for decrypted data is 1 byte too small since
mppe_decompress() tries to account for possible PFC, however later in code
it is assumed no PFC.

Adjusting the check prevented these errors from occurring.

Signed-off-by: Konstantin Sharlaimov <konstantin.sharlaimov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mac80211-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6