git.karo-electronics.de Git - mv-sheeva.git/log

net: Fix IP_MULTICAST_IF

ipv4/ipv6 setsockopt(IP_MULTICAST_IF) have dubious __dev_get_by_index() calls.

This function should be called only with RTNL or dev_base_lock held, or reader
could see a corrupt hash chain and eventually enter an endless loop.

Fix is to call dev_get_by_index()/dev_put().

If this happens to be performance critical, we could define a new dev_exist_by_index()
function to avoid touching dev refcount.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bluetooth: static lock key fix

When shutdown ppp connection, lockdep waring about non-static key
will happen, it is caused by the lock is not initialized properly
at that time.

Fix with tuning the lock/skb_queue_head init order

[   94.339261] INFO: trying to register non-static key.
[   94.342509] the code is fine but needs lockdep annotation.
[   94.342509] turning off the locking correctness validator.
[   94.342509] Pid: 0, comm: swapper Not tainted 2.6.31-mm1 #2
[   94.342509] Call Trace:
[   94.342509]  [<c0248fbe>] register_lock_class+0x58/0x241
[   94.342509]  [<c024b5df>] ? __lock_acquire+0xb57/0xb73
[   94.342509]  [<c024ab34>] __lock_acquire+0xac/0xb73
[   94.342509]  [<c024b7fa>] ? lock_release_non_nested+0x17b/0x1de
[   94.342509]  [<c024b662>] lock_acquire+0x67/0x84
[   94.342509]  [<c04cd1eb>] ? skb_dequeue+0x15/0x41
[   94.342509]  [<c054a857>] _spin_lock_irqsave+0x2f/0x3f
[   94.342509]  [<c04cd1eb>] ? skb_dequeue+0x15/0x41
[   94.342509]  [<c04cd1eb>] skb_dequeue+0x15/0x41
[   94.342509]  [<c054a648>] ? _read_unlock+0x1d/0x20
[   94.342509]  [<c04cd641>] skb_queue_purge+0x14/0x1b
[   94.342509]  [<fab94fdc>] l2cap_recv_frame+0xea1/0x115a [l2cap]
[   94.342509]  [<c024b5df>] ? __lock_acquire+0xb57/0xb73
[   94.342509]  [<c0249c04>] ? mark_lock+0x1e/0x1c7
[   94.342509]  [<f8364963>] ? hci_rx_task+0xd2/0x1bc [bluetooth]
[   94.342509]  [<fab95346>] l2cap_recv_acldata+0xb1/0x1c6 [l2cap]
[   94.342509]  [<f8364997>] hci_rx_task+0x106/0x1bc [bluetooth]
[   94.342509]  [<fab95295>] ? l2cap_recv_acldata+0x0/0x1c6 [l2cap]
[   94.342509]  [<c02302c4>] tasklet_action+0x69/0xc1
[   94.342509]  [<c022fbef>] __do_softirq+0x94/0x11e
[   94.342509]  [<c022fcaf>] do_softirq+0x36/0x5a
[   94.342509]  [<c022fe14>] irq_exit+0x35/0x68
[   94.342509]  [<c0204ced>] do_IRQ+0x72/0x89
[   94.342509]  [<c02038ee>] common_interrupt+0x2e/0x34
[   94.342509]  [<c024007b>] ? pm_qos_add_requirement+0x63/0x9d
[   94.342509]  [<c038e8a5>] ? acpi_idle_enter_bm+0x209/0x238
[   94.342509]  [<c049d238>] cpuidle_idle_call+0x5c/0x94
[   94.342509]  [<c02023f8>] cpu_idle+0x4e/0x6f
[   94.342509]  [<c0534153>] rest_init+0x53/0x55
[   94.342509]  [<c0781894>] start_kernel+0x2f0/0x2f5
[   94.342509]  [<c0781091>] i386_start_kernel+0x91/0x96

Reported-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Tested-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

bluetooth: scheduling while atomic bug fix

Due to driver core changes dev_set_drvdata will call kzalloc which should be
in might_sleep context, but hci_conn_add will be called in atomic context

Like dev_set_name move dev_set_drvdata to work queue function.

oops as following:

Oct  2 17:41:59 darkstar kernel: [  438.001341] BUG: sleeping function called from invalid context at mm/slqb.c:1546
Oct  2 17:41:59 darkstar kernel: [  438.001345] in_atomic(): 1, irqs_disabled(): 0, pid: 2133, name: sdptool
Oct  2 17:41:59 darkstar kernel: [  438.001348] 2 locks held by sdptool/2133:
Oct  2 17:41:59 darkstar kernel: [  438.001350]  #0:  (sk_lock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+.+.}, at: [<faa1d2f5>] lock_sock+0xa/0xc [l2cap]
Oct  2 17:41:59 darkstar kernel: [  438.001360]  #1:  (&hdev->lock){+.-.+.}, at: [<faa20e16>] l2cap_sock_connect+0x103/0x26b [l2cap]
Oct  2 17:41:59 darkstar kernel: [  438.001371] Pid: 2133, comm: sdptool Not tainted 2.6.31-mm1 #2
Oct  2 17:41:59 darkstar kernel: [  438.001373] Call Trace:
Oct  2 17:41:59 darkstar kernel: [  438.001381]  [<c022433f>] __might_sleep+0xde/0xe5
Oct  2 17:41:59 darkstar kernel: [  438.001386]  [<c0298843>] __kmalloc+0x4a/0x15a
Oct  2 17:41:59 darkstar kernel: [  438.001392]  [<c03f0065>] ? kzalloc+0xb/0xd
Oct  2 17:41:59 darkstar kernel: [  438.001396]  [<c03f0065>] kzalloc+0xb/0xd
Oct  2 17:41:59 darkstar kernel: [  438.001400]  [<c03f04ff>] device_private_init+0x15/0x3d
Oct  2 17:41:59 darkstar kernel: [  438.001405]  [<c03f24c5>] dev_set_drvdata+0x18/0x26
Oct  2 17:41:59 darkstar kernel: [  438.001414]  [<fa51fff7>] hci_conn_init_sysfs+0x40/0xd9 [bluetooth]
Oct  2 17:41:59 darkstar kernel: [  438.001422]  [<fa51cdc0>] ? hci_conn_add+0x128/0x186 [bluetooth]
Oct  2 17:41:59 darkstar kernel: [  438.001429]  [<fa51ce0f>] hci_conn_add+0x177/0x186 [bluetooth]
Oct  2 17:41:59 darkstar kernel: [  438.001437]  [<fa51cf8a>] hci_connect+0x3c/0xfb [bluetooth]
Oct  2 17:41:59 darkstar kernel: [  438.001442]  [<faa20e87>] l2cap_sock_connect+0x174/0x26b [l2cap]
Oct  2 17:41:59 darkstar kernel: [  438.001448]  [<c04c8df5>] sys_connect+0x60/0x7a
Oct  2 17:41:59 darkstar kernel: [  438.001453]  [<c024b703>] ? lock_release_non_nested+0x84/0x1de
Oct  2 17:41:59 darkstar kernel: [  438.001458]  [<c028804b>] ? might_fault+0x47/0x81
Oct  2 17:41:59 darkstar kernel: [  438.001462]  [<c028804b>] ? might_fault+0x47/0x81
Oct  2 17:41:59 darkstar kernel: [  438.001468]  [<c033361f>] ? __copy_from_user_ll+0x11/0xce
Oct  2 17:41:59 darkstar kernel: [  438.001472]  [<c04c9419>] sys_socketcall+0x82/0x17b
Oct  2 17:41:59 darkstar kernel: [  438.001477]  [<c020329d>] syscall_call+0x7/0xb

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fix TCP_DEFER_ACCEPT retrans calculation

Fix TCP_DEFER_ACCEPT conversion between seconds and
retransmission to match the TCP SYN-ACK retransmission periods
because the time is converted to such retransmissions. The old
algorithm selects one more retransmission in some cases. Allow
up to 255 retransmissions.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPT

Change SYN-ACK retransmitting code for the TCP_DEFER_ACCEPT
users to not retransmit SYN-ACKs during the deferring period if
ACK from client was received. The goal is to reduce traffic
during the deferring period. When the period is finished
we continue with sending SYN-ACKs (at least one) but this time
any traffic from client will change the request to established
socket allowing application to terminate it properly.
Also, do not drop acked request if sending of SYN-ACK fails.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: accept socket after TCP_DEFER_ACCEPT period

Willy Tarreau and many other folks in recent years
were concerned what happens when the TCP_DEFER_ACCEPT period
expires for clients which sent ACK packet. They prefer clients
that actively resend ACK on our SYN-ACK retransmissions to be
converted from open requests to sockets and queued to the
listener for accepting after the deferring period is finished.
Then application server can decide to wait longer for data
or to properly terminate the connection with FIN if read()
returns EAGAIN which is an indication for accepting after
the deferring period. This change still can have side effects
for applications that expect always to see data on the accepted
socket. Others can be prepared to work in both modes (with or
without TCP_DEFER_ACCEPT period) and their data processing can
ignore the read=EAGAIN notification and to allocate resources for
clients which proved to have no data to send during the deferring
period. OTOH, servers that use TCP_DEFER_ACCEPT=1 as flag (not
as a timeout) to wait for data will notice clients that didn't
send data for 3 seconds but that still resend ACKs.
Thanks to Willy Tarreau for the initial idea and to
Eric Dumazet for the review and testing the change.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "tcp: fix tcp_defer_accept to consider the timeout"

This reverts commit 6d01a026b7d3009a418326bdcf313503a314f1ea.

Julian Anastasov, Willy Tarreau and Eric Dumazet have come up
with a more correct way to deal with this.

Signed-off-by: David S. Miller <davem@davemloft.net>

perf timechart: Improve the visual appearance of scheduler delays

[from KS feedback]

Currently, scheduler delays are shown in a mostly transparent,
light yellow color. This color is rather hard to see on several
screens, especially projectors.

This patch changes the color of the scheduler delays to be a
much more "hard" yellow that survived the kernel summit
projector.

Reported-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020064731.20ae126a@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf timechart: Fix the wakeup-arrows that point to non-visible processes

The timechart wakeup arrows currently show no process
information when the waker/wakee are processes that are not
actually chosen to be shown on the timechart.

This patch fixes this oversight, by looking through all
processes (after giving preference to visible processes) as well
as falling back to just showing the PID if no name for the
process can be resolved.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091020064649.0e4959b2@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

perf top: Fix --delay_secs 0 division by zero

Add delay_secs sanity check to handle_keypress,
this fixes a division by zero crash.

Signed-off-by: Tim Blechmann <tim@klingt.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4AD9EBFD.106@klingt.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

AF_UNIX: Fix deadlock on connecting to shutdown socket

I found a deadlock bug in UNIX domain socket, which makes able to DoS
attack against the local machine by non-root users.

How to reproduce:
1. Make a listening AF_UNIX/SOCK_STREAM socket with an abstruct
    namespace(*), and shutdown(2) it.
2. Repeat connect(2)ing to the listening socket from the other sockets
    until the connection backlog is full-filled.
3. connect(2) takes the CPU forever. If every core is taken, the
    system hangs.

PoC code: (Run as many times as cores on SMP machines.)

int main(void)
{
int ret;
int csd;
int lsd;
struct sockaddr_un sun;

/* make an abstruct name address (*) */
memset(&sun, 0, sizeof(sun));
sun.sun_family = PF_UNIX;
sprintf(&sun.sun_path[1], "%d", getpid());

/* create the listening socket and shutdown */
lsd = socket(AF_UNIX, SOCK_STREAM, 0);
bind(lsd, (struct sockaddr *)&sun, sizeof(sun));
listen(lsd, 1);
shutdown(lsd, SHUT_RDWR);

/* connect loop */
alarm(15); /* forcely exit the loop after 15 sec */
for (;;) {
csd = socket(AF_UNIX, SOCK_STREAM, 0);
ret = connect(csd, (struct sockaddr *)&sun, sizeof(sun));
if (-1 == ret) {
perror("connect()");
break;
}
puts("Connection OK");
}
return 0;
}

(*) Make sun_path[0] = 0 to use the abstruct namespace.
    If a file-based socket is used, the system doesn't deadlock because
    of context switches in the file system layer.

Why this happens:
Error checks between unix_socket_connect() and unix_wait_for_peer() are
inconsistent. The former calls the latter to wait until the backlog is
processed. Despite the latter returns without doing anything when the
socket is shutdown, the former doesn't check the shutdown state and
just retries calling the latter forever.

Patch:
The patch below adds shutdown check into unix_socket_connect(), so
connect(2) to the shutdown socket will return -ECONREFUSED.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com>
Signed-off-by: Masanori Yoshida <masanori.yoshida.tv@hitachi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethoc: clear only pending irqs

This patch fixed the problem of dropped packets due to lost of
interrupt requests. We should only clear what was pending at the
moment we read the irq source reg.

Signed-off-by: Thomas Chou <thomas@wytron.com.tw>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethoc: inline regs access

Signed-off-by: Thomas Chou <thomas@wytron.com.tw>
Signed-off-by: David S. Miller <davem@davemloft.net>

pcmcia: do not try to store more than 4 version strings

... for struct pcmcia_device only provides for 4 anyway.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

pcmcia: pccard_read_tuple and TUPLE_RETURN_COMMON cleanup

pccard_read_tuple(), which is only used by the PCMCIA core, should
handle TUPLE_RETURN_COMMON more sensibly: If a specific function (which
may be 0) is requested, set tuple.Attributes = 0 as was done in all
PCMCIA drivers. If, however, BIND_FN_ALL is requested, return the
"common" tuple. As to the callers of pccard_read_tuple():

- All calls to pcmcia_validate_cis() had set the "function" parameter to
  BIND_FN_ALL. Therefore, remove the "function" parameter and make the
  parameter to pccard_read_tuple explicit.

- Calls to CISTPL_VERS_1 and CISTPL_MANFID now set BIND_FN_ALL. This was
  already the case for calls to CISTPL_LONGLINK_MFC.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

inotify: fix coalesce duplicate events into a single event in special case

If we do rename a dir entry, like this:

rename("/tmp/ino7UrgoJ.rename1", "/tmp/ino7UrgoJ.rename2")
rename("/tmp/ino7UrgoJ.rename2", "/tmp/ino7UrgoJ")

The duplicate events should be coalesced into a single event. But those two
events do not be coalesced into a single event, due to some bad check in
event_compare(). It can not match the two NULL inodes as the same event.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Eric Paris <eparis@redhat.com>

inotify: deprecate the inotify kernel interface

In 2.6.33 there will be no users of the inotify interface. Mark it for
removal as fsnotify is more generic and is easier to use.

Signed-off-by: Eric Paris <eparis@redhat.com>

fsnotify: do not set group for a mark before it is on the i_list

fsnotify_add_mark is supposed to add a mark to the g_list and i_list and to
set the group and inode for the mark.  fsnotify_destroy_mark_by_entry uses
the fact that ->group != NULL to know if this group should be destroyed or
if it's already been done.

But fsnotify_add_mark sets the group and inode before it actually adds the
mark to the i_list and g_list.  This can result in a race in inotify, it
requires 3 threads.

sys_inotify_add_watch("file") sys_inotify_add_watch("file") sys_inotify_rm_watch([a])
inotify_update_watch()
inotify_new_watch()
inotify_add_to_idr()
   ^--- returns wd = [a]
inotfiy_update_watch()
inotify_new_watch()
inotify_add_to_idr()
fsnotify_add_mark()
   ^--- returns wd = [b]
returns to userspace;
inotify_idr_find([a])
   ^--- gives us the pointer from task 1
fsnotify_add_mark()
   ^--- this is going to set the mark->group and mark->inode fields, but will
return -EEXIST because of the race with [b].
fsnotify_destroy_mark()
   ^--- since ->group != NULL we call back
into inotify_freeing_mark() which calls
inotify_remove_from_idr([a])

since fsnotify_add_mark() failed we call:
inotify_remove_from_idr([a])     <------WHOOPS it's not in the idr, this could
have been any entry added later!

The fix is to make sure we don't set mark->group until we are sure the mark is
on the inode and fsnotify_add_mark will return success.

Signed-off-by: Eric Paris <eparis@redhat.com>

Input: hp_sdc_rtc - fix test in hp_sdc_rtc_read_rt()

If left unsigned the hp_sdc_rtc_read_i8042timer() return value will not
be checked correctly.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>

Input: atkbd - consolidate force release quirks for volume keys

Some machines share same key list for volume up/down release key quirks,
use only one key list.

Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>

Input: logips2pp - model 73 is actually TrackMan FX

Reported-and-tested-by: Harald Dunkel <harald.dunkel@t-online.de>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>

Input: i8042 - add Sony Vaio VGN-FZ240E to the nomux list

On this model, when KBD is in active multiplexing mode, acknowledgements
to reset and get ID commands issued on KBD port sometimes are delivered
to AUX3 port (touchpad) which messes up device detection. Legacy KBC
mode works fine and since there are no external PS/2 ports on this laptop
and no support for docking station we can safely disable active MUX mode.

Tested-by: Carlos R. Mafra <crmafra2@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>

ARM: 5764/1: bcmring: add oprofile pmu support

add oprofile pmu support for bcmring.

Signed-off-by: Leo Hao Chen <leochen@broadcom.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

Merge branch 'fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ycmiao/pxa-linux-2.6

[ARM] pxa/spitz: add gpio button support (fixes regression)

Updating desc for lid keys and resending patch with proper comments:
Define Spitz buttons as GPIO keys in a way compatible with the old driver:

On/Off: As Suspend EV_PWR key
Raw values of lid sensors SWA and SWB: As EV_SW switches
SWA: Display Down
SWB: Lid Closed
Recommended user space decoding:
SWA==0 & SWB==0: lid opened (landscape mode)
SWA==1 & SWB==0: invalid (or mechanic race condition)
SWA==0 & SWB==1: lid closed with display up (portrait mode or mechanic
race condition while closing to display-less mode)
SWA==1 & SWB==1: lid closed with display down (display-less mode)

AK_INT remote trigger is not mapped as input event. Without complete
remote driver and remote pull-up control it has no useful
interpretation.

Signed-off-by: Stanislav Brabec <utx@penguin.cz>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>

[ARM] pxa/cm-x300: fix mmc numbering

CM-X300 has libertas on mmc2 and SD card slot on mmc1.
This patch fixes wrong MMC ports assignment.

Signed-off-by: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>

pcmcia: properly close previous dev_printk if kzalloc fails in do_io_probe

Signed-off-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

pcmcia: fix controller printk format warnings

Fix new pcmcia printk format warnings:
[This has now moved from linux-next to mainline.
Originally sent 2009-SEP-17.]

drivers/pcmcia/i82365.c:1055: warning: format '%#x' expects type 'unsigned int', but argument 6 has type 'phys_addr_t'
drivers/pcmcia/i82365.c:1055: warning: format '%#x' expects type 'unsigned int', but argument 7 has type 'phys_addr_t'
drivers/pcmcia/tcic.c:734: warning: format '%#x' expects type 'unsigned int', but argument 6 has type 'phys_addr_t'
drivers/pcmcia/tcic.c:734: warning: format '%#x' expects type 'unsigned int', but argument 7 has type 'phys_addr_t'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n

vmxnet3 was using dprintk() for debugging output. This was
defined in <linux/dst.h> and was the only thing that was
used from that header file. This caused compile errors
when CONFIG_BLOCK was not enabled due to bio* and BIO*
uses in the header file, so change this driver to use
dev_dbg() for debugging output.

include/linux/dst.h:520: error: dereferencing pointer to incomplete type
include/linux/dst.h:520: error: 'BIO_POOL_BITS' undeclared (first use in this function)
include/linux/dst.h:521: error: dereferencing pointer to incomplete type
include/linux/dst.h:522: error: dereferencing pointer to incomplete type
include/linux/dst.h:525: error: dereferencing pointer to incomplete type
make[4]: *** [drivers/net/vmxnet3/vmxnet3_drv.o] Error 1

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Bhavesh Davda <bhavesh@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dm snapshot: allow chunk size to be less than page size

Allow the snapshot chunk size to be smaller than the page size
The code is now capable of handling this due to some previous
fixes and enhancements.

As the page size varies between computers, prior to this patch,
the chunk size of a snapshot dictated which machines could read it:
Snapshots created on one machine might not be readable on another.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm snapshot: use unsigned integer chunk size

Use unsigned integer chunk size.

Maximum chunk size is 512kB, there won't ever be need to use 4GB chunk size,
so the number can be 32-bit. This fixes compiler failure on 32-bit systems
with large block devices.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm snapshot: lock snapshot while supplying status

This patch locks the snapshot when returning status. It fixes a race
when it could return an invalid number of free chunks if someone
was simultaneously modifying it.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm exception store: fix failed set_chunk_size error path

Properly close the device if failing because of an invalid chunk size.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm snapshot: require non zero chunk size by end of ctr

If we are creating snapshot with memory-stored exception store, fail if
the user didn't specify chunk size. Zero chunk size would probably crash
a lot of places in the rest of snapshot code.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: dec_pending needs locking to save error value

Multiple instances of dec_pending() can run concurrently so a lock is
needed when it saves the first error code.

I have never experienced actual problem without locking and just found
this during code inspection while implementing the barrier support
patch for request-based dm.

This patch adds the locking.
I've done compile, boot and basic I/O testings.

Cc: stable@kernel.org
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm: add missing del_gendisk to alloc_dev error path

Add missing del_gendisk() to error path when creation of workqueue fails.
Otherwice there is a resource leak and following warning is shown:

WARNING: at fs/sysfs/dir.c:487 sysfs_add_one+0xc5/0x160()
sysfs: cannot create duplicate filename '/devices/virtual/block/dm-0'

Cc: stable@kernel.org
Signed-off-by: Zdenek Kabelac <zkabelac@redhat.com>
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm log: userspace fix incorrect luid cast in userspace_ctr

mips:

drivers/md/dm-log-userspace-base.c: In function `userspace_ctr':
drivers/md/dm-log-userspace-base.c:159: warning: cast from pointer to integer of different size

Cc: stable@kernel.org
Cc: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm snapshot: free exception store on init failure

While initializing the snapshot module, if we fail to register
the snapshot target then we must back-out the exception store
module initialization.

Cc: stable@kernel.org
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm snapshot: sort by chunk size to fix race

Avoid a race causing corruption when snapshots of the same origin have
different chunk sizes by sorting the internal list of snapshots by chunk
size, largest first.
  https://bugzilla.redhat.com/show_bug.cgi?id=182659

For example, let's have two snapshots with different chunk sizes. The
first snapshot (1) has small chunk size and the second snapshot (2) has
large chunk size.  Let's have chunks A, B, C in these snapshots:
snapshot1: ====A====   ====B====
snapshot2: ==========C==========

(Chunk size is a power of 2. Chunks are aligned.)

A write to the origin at a position within A and C comes along. It
triggers reallocation of A, then reallocation of C and links them
together using A as the 'primary' exception.

Then another write to the origin comes along at a position within B and
C.  It creates pending exception for B.  C already has a reallocation in
progress and it already has a primary exception (A), so nothing is done
to it: B and C are not linked.

If the reallocation of B finishes before the reallocation of C, because
there is no link with the pending exception for C it does not know to
wait for it and, the second write is dispatched to the origin and causes
data corruption in the chunk C in snapshot2.

To avoid this situation, we maintain snapshots sorted in descending
order of chunk size.  This leads to a guaranteed ordering on the links
between the pending exceptions and avoids the problem explained above -
both A and B now get linked to C.

Cc: stable@kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
amd64_edac: fix DRAM base and limit extraction masks, v2

amd64_edac: fix DRAM base and limit extraction masks, v2

This is a proper fix as a follow-up to 66216a7 and 916d11b.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>

Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  sata_mv: Prevent PIO commands to be defered too long if traffic in progress.
  pata_sc1200: Fix crash on boot
  libata: fix internal command failure handling
  libata: fix PMP initialization
  sata_nv: make sure link is brough up online when skipping hardreset
  ahci / atiixp / pci quirks: rename AMD SB900 into Hudson-2
  ahci: Add the AHCI controller Linux Device ID for NVIDIA chipsets.
  pata_via: extend the rev_max for VT6330

KVM: Prevent kvm_init from corrupting debugfs structures

I'm seeing an oops condition when kvm-intel and kvm-amd are modprobe'd
during boot (say on an Intel system) and then rmmod'd:

   # modprobe kvm-intel
     kvm_init()
     kvm_init_debug()
     kvm_arch_init()  <-- stores debugfs dentries internally
     (success, etc)

   # modprobe kvm-amd
     kvm_init()
     kvm_init_debug() <-- second initialization clobbers kvm's
                          internal pointers to dentries
     kvm_arch_init()
     kvm_exit_debug() <-- and frees them

   # rmmod kvm-intel
     kvm_exit()
     kvm_exit_debug() <-- double free of debugfs files!

     *BOOM*

If execution gets to the end of kvm_init(), then the calling module has been
established as the kvm provider.  Move the debugfs initialization to the end of
the function, and remove the now-unnecessary call to kvm_exit_debug() from the
error path.  That way we avoid trampling on the debugfs entries and freeing
them twice.

Cc: stable@kernel.org
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: fix pointer cast

On a 32 bits compile, commit 3da0dd433dc399a8c0124d0614d82a09b6a49bce
introduced the following warnings:

arch/x86/kvm/mmu.c: In function ‘kvm_set_pte_rmapp’:
arch/x86/kvm/mmu.c:770: warning: cast to pointer from integer of different size
arch/x86/kvm/mmu.c: In function ‘kvm_set_spte_hva’:
arch/x86/kvm/mmu.c:849: warning: cast from pointer to integer of different size

The following patch uses 'unsigned long' instead of u64 to match the
pointer size on both arches.

Signed-off-by: Frederik Deweerdt <frederik.deweerdt@xprog.eu>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: use proper hrtimer function to retrieve expiration time

hrtimer->base can be temporarily NULL due to racing hrtimer_start.
See switch_hrtimer_base/lock_hrtimer_base.

Use hrtimer_get_remaining which is robust against it.

CC: stable@kernel.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

x86, UV: Fix and clean up bau code to use uv_gpa_to_pnode()

Create an inline function to extract the pnode from a global
physical address and then convert the broadcast assist unit to
use the newly created uv_gpa_to_pnode function.

The open-coded code was wrong as well - it might explain a
few of our unexplained bau hangs.

Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Cliff Whickman <cpw@sgi.com>
Cc: linux-mm@kvack.org
Cc: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20091016112920.GZ8903@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sata_mv: Prevent PIO commands to be defered too long if traffic in progress.

Use excl_link when non NCQ commands are defered, to be sure they are processed
as soon as outstanding commands are completed. It prevents some commands to be
defered indifinitely when using a port multiplier.

Signed-off-by: Gwendal Grignou <gwendal@google.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

pata_sc1200: Fix crash on boot

The SC1200 needs a NULL terminator or it may cause a crash on boot.

Bug #14227

Also correct a bogus comment as the driver had serializing added so can run
dual port.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

libata: fix internal command failure handling

When an internal command fails, it should be failed directly without
invoking EH. In the original implemetation, this was accomplished by
letting internal command bypass failure handling in ata_qc_complete().
However, later changes added post-successful-completion handling to
that code path and the success path is no longer adequate as internal
command failure path. One of the visible problems is that internal
command failure due to timeout or other freeze conditions would
spuriously trigger WARN_ON_ONCE() in the success path.

This patch updates failure path such that internal command failure
handling is contained there.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

libata: fix PMP initialization

Commit 842faa6c1a1d6faddf3377948e5cf214812c6c90 fixed error handling
during attach by not committing detected device class to dev->class
while attaching a new device.  However, this change missed the PMP
class check in the configuration loop causing a new PMP device to go
through ata_dev_configure() as if it were an ATA or ATAPI device.

As PMP device doesn't have a regular IDENTIFY data, this makes
ata_dev_configure() tries to configure a PMP device using an invalid
data.  For the most part, it wasn't too harmful and went unnoticed but
this ends up clearing dev->flags which may have ATA_DFLAG_AN set by
sata_pmp_attach().  This means that SATA_PMP_FEAT_NOTIFY ends up being
disabled on PMPs and on PMPs which honor the flag breaks hotplug
support.

This problem was discovered and reported by Ethan Hsiao.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ethan Hsiao <ethanhsiao@jmicron.com>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

sata_nv: make sure link is brough up online when skipping hardreset

prereset doesn't bring link online if hardreset is about to happen and
nv_hardreset() may skip if conditions are not right so softreset may
be entered with non-working link status if the system firmware didn't
bring it up before entering OS code which can happen during resume.
This patch makes nv_hardreset() to bring up the link if it's skipping
reset.

This bug was reported by frodone@gmail.com in the following bug entry.

http://bugzilla.kernel.org/show_bug.cgi?id=14329

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: frodone@gmail.com
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

ahci / atiixp / pci quirks: rename AMD SB900 into Hudson-2

This patch renames the code name SB900 into Hudson-2

Signed-off-by: Shane Huang <shane.huang@amd.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

ahci: Add the AHCI controller Linux Device ID for NVIDIA chipsets.

Add the generic device ID for NVIDIA AHCI controller.

Signed-off-by: Peer Chen <peerchen@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

pata_via: extend the rev_max for VT6330

Fix the VT6330 issue, it's because the rev_max of VT6330 exceeds 0x2f.
The VT6415 and VT6330 share the same device ID.

Signed-off-by: Joseph Chan <josephchan@via.com.tw>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

sh: Kill off stray HAVE_FTRACE_SYSCALLS reference.

This seems to have popped back in via some merge damage. Kill it off.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>

perf tools: Bump version to 0.0.2

We released the first version of perf with 0.0.1 in v2.6.31,
time to double our version number to 0.0.2 ;-)

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

futex: Move drop_futex_key_refs out of spinlock'ed region

When requeuing tasks from one futex to another, the reference held
by the requeued task to the original futex location needs to be
dropped eventually.

Dropping the reference may ultimately lead to a call to
"iput_final" and subsequently call into filesystem- specific code -
which may be non-atomic.

It is therefore safer to defer this drop operation until after the
futex_hash_bucket spinlock has been dropped.

Originally-From: Helge Bahmann <hcb@chaoticmind.net>
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Cc: <stable@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Dinakar Guniguntala <dino@in.ibm.com>
Cc: John Stultz <johnstul@linux.vnet.ibm.com>
Cc: Sven-Thorsten Dietrich <sdietrich@novell.com>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <4AD7A298.5040802@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86: Don't print number of MCE banks for every CPU

The MCE initialization code explicitly says it doesn't handle
asymmetric configurations where different CPUs support different
numbers of MCE banks, and it prints a big warning in that case.

Therefore, printing the "mce: CPU supports <x> MCE banks"
message into the kernel log for every CPU is pure redundancy
that clutters the log significantly for systems with lots of
CPUs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
LKML-Reference: <adaeip473qt.fsf@cisco.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86, UV: Fix information in __uv_hub_info structure

A few parts of the uv_hub_info structure are initialized
incorrectly.

- n_val is being loaded with m_val.
- gpa_mask is initialized with a bytes instead of an unsigned long.
- Handle the case where none of the alias registers are used.

Lastly I converted the bau over to using the uv_hub_info->m_val
which is the correct value.

Without this patch, booting a large configuration hits a
problem where the upper bits of the gnode affect the pnode
and the bau will not operate.

Signed-off-by: Robin Holt <holt@sgi.com>
Acked-by: Jack Steiner <steiner@sgi.com>
Cc: Cliff Whickman <cpw@sgi.com>
Cc: stable@kernel.org
LKML-Reference: <20091015224946.396355000@alcatraz.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sh: Remove BKL from landisk gio.

The open function got the BKL via the big push down. Replace it by
preempt_enable/disable as this is sufficient for an UP machine.

The ioctl can be unlocked because there is no functionality which
requires serialization. The usage by multiple callers is broken with
and without the BKL due to the local static variable addr.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>

sh: disabled cache handling fix.

Add code to handle the cache disabled case. Fixes breakage introduced by
37443ef3f0406e855e169c87ae3f4ffb4b6ff635 ("sh: Migrate SH-4 cacheflush
ops to function pointers."). Without this patch configuring caches off
with CONFIG_CACHE_OFF=y makes kfr2r09 and migo-r lock up in fbdev
deferred io or early user space.

Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>

x86: Document linker script ASSERT() quirk

Older binutils breaks if ASSERT() is used without a sink
for the output.

For example 2.14.90.0.6 is known to be broken, the link
fails with:

  LD      .tmp_vmlinux1
  ld:arch/x86/kernel/vmlinux.lds:678: parse error

Document this quirk in all three files that use it.

  See:    http://marc.info/?l=linux-kbuild&m=124930110427870&w=2
  See[2]: d2ba8b2 ("x86: Fix assert syntax in vmlinux.lds.S")

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
LKML-Reference: <4AD6523D.5030909@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

sh: Fix up single page flushing to use PAGE_SIZE.

Presently The SH-4 cache flushing code uses flush_cache_4096() for most
of the real flushing work, which breaks down to a fixed 4096 unroll and
increment. Not only is this sub-optimal for larger page sizes, it's also
uncovered a bug in sh4_flush_dcache_page() when large page sizes are used
and we have no cache aliases -- resulting in only a part of the page's
D-cache lines being written back.

Signed-off-by: Valentin Sitdikov <valentin.sitdikov@siemens.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>

Linux 2.6.32-rc5

Merge branch 'docs-next' of git://git.lwn.net/linux-2.6

* 'docs-next' of git://git.lwn.net/linux-2.6:
Update flex_arrays.txt

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
dlm: fix socket fd translation
dlm: fix lowcomms_connect_node for sctp

Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
Revert "x86: linker script syntax nits"
x86, perf_event: Rename 'performance counter interrupt'

KEYS: get_instantiation_keyring() should inc the keyring refcount in all cases

The destination keyring specified to request_key() and co. is made available to
the process that instantiates the key (the slave process started by
/sbin/request-key typically).  This is passed in the request_key_auth struct as
the dest_keyring member.

keyctl_instantiate_key and keyctl_negate_key() call get_instantiation_keyring()
to get the keyring to attach the newly constructed key to at the end of
instantiation.  This may be given a specific keyring into which a link will be
made later, or it may be asked to find the keyring passed to request_key().  In
the former case, it returns a keyring with the refcount incremented by
lookup_user_key(); in the latter case, it returns the keyring from the
request_key_auth struct - and does _not_ increment the refcount.

The latter case will eventually result in an oops when the keyring prematurely
runs out of references and gets destroyed.  The effect may take some time to
show up as the key is destroyed lazily.

To fix this, the keyring returned by get_instantiation_keyring() must always
have its refcount incremented, no matter where it comes from.

This can be tested by setting /etc/request-key.conf to:

#OP TYPE DESCRIPTION CALLOUT INFO PROGRAM ARG1 ARG2 ARG3 ...
#====== ======= =============== =============== ===============================
create  * test:* * |/bin/false %u %g %d %{user:_display}
negate * * * /bin/keyctl negate %k 10 @u

and then doing:

keyctl add user _display aaaaaaaa @u
        while keyctl request2 user test:x test:x @u &&
        keyctl list @u;
        do
                keyctl request2 user test:x test:x @u;
                sleep 31;
                keyctl list @u;
        done

which will oops eventually.  Changing the negate line to have @u rather than
%S at the end is important as that forces the latter case by passing a special
keyring ID rather than an actual keyring ID.

Reported-by: Alexander Zangerl <az@bond.edu.au>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Alexander Zangerl <az@bond.edu.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/pci: Fix MODPOST warning
  powerpc/oprofile: Add ppc750 CL as supported by oprofile
  powerpc: warning: allocated section `.data_nosave' not in segment
  powerpc/kgdb: Fix build failure caused by "kgdb.c: unused variable 'acc'"
  powerpc: Fix hypervisor TLB batching
  powerpc/mm: Fix hang accessing top of vmalloc space
  powerpc: Fix memory leak in axon_msi.c
  powerpc/pmac: Fix issues with sleep on some powerbooks
  powerpc64/ftrace: use PACA to retrieve TOC in mod_return_to_handler
  powerpc/ftrace: show real return addresses in modules

Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPI button: don't try to use a non-existent lid device
  ACPI: video: Loosen strictness of video bus detection code
  eeepc-laptop: Prevent a panic when disabling RT2860 wireless when associated
  eeepc-laptop: Properly annote eeepc_enable_camera().
  ACPI / PCI: Fix NULL pointer dereference in acpi_get_pci_dev() (rev. 2)
  fujitsu-laptop: address missed led-class ifdef fixup
  ACPI: Kconfig, fix proc aggregator text
  ACPI: add AC/DC notifier

Merge branch 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6

* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
  OMAP2xxx clock: set up clockdomain pointer in struct clk
  OMAP: Fix race condition with autodeps
  omap: McBSP: Fix incorrect receiver stop in omap_mcbsp_stop
  omap: Initialization of SDRC params on Zoom2
  omap: RX-51: Drop I2C-1 speed to 2200
  omap: SDMA: Fixing bug in omap_dma_set_global_params()
  omap: CONFIG_ISP1301_OMAP redefined in Beagle defconfig

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: always pin metadata in discard mode
  Btrfs: enable discard support
  Btrfs: add -o discard option
  Btrfs: properly wait log writers during log sync
  Btrfs: fix possible ENOSPC problems with truncate
  Btrfs: fix btrfs acl #ifdef checks
  Btrfs: streamline tree-log btree block writeout
  Btrfs: avoid tree log commit when there are no changes
  Btrfs: only write one super copy during fsync

Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
tty: fix vt_compat_ioctl

Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
sysfs: Allow sysfs_notify_dirent to be called from interrupt context.
sysfs: Allow sysfs_move_dir(..., NULL) again.

Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  USB: gadget: Fix EEM driver comments and VID/PID
  usb-storage: Workaround devices with bogus sense size
  USB: ehci: Fix IST boundary checking interval math.
  USB: option: Support for AIRPLUS MCD650 Datacard
  USB: whci-hcd: always do an update after processing a halted qTD
  USB: whci-hcd: handle early deletion of endpoints
  USB: wusb: don't use the stack to read security descriptor
  USB: rename Documentation/ABI/.../sysfs-class-usb_host

Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6:
  Staging: rt2860sta: prevent a panic when disabling when associated
  staging: more sched.h fixes
  Staging: et131x: Fix the add_10bit macro
  Staging: et131x: Correct WRAP bit handling
  staging: Complete sched.h removal from interrupt.h
  Staging: vme: fix sched.h build breakage
  Staging: poch: fix sched.h build breakage
  Staging: b3dfg: fix sched.h build breakage
  Staging: comedi: fix sched.h build breakage
  Staging: iio: Fix missing include <linux/sched.h>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (53 commits)
  vmxnet: fix 2 build problems
  net: add support for STMicroelectronics Ethernet controllers.
  net: ks8851_mll uses mii interfaces
  net/fec_mpc52xx: Fix kernel panic on FEC error
  net: Fix OF platform drivers coldplug/hotplug when compiled as modules
  TI DaVinci EMAC: Clear statistics register properly.
  r8169: partial support and phy init for the 8168d
  irda/sa1100_ir: check return value of startup hook
  udp: Fix udp_poll() and ioctl()
  WAN: fix Cisco HDLC handshaking.
  tcp: fix tcp_defer_accept to consider the timeout
  3c574_cs: spin_lock the set_multicast_list function
  net: Teach pegasus driver to ignore bluetoother adapters with clashing Vendor:Product IDs
  netxen: fix pci bar mapping
  ethoc: fix warning from 32bit build
  libertas: fix build
  net: VMware virtual Ethernet NIC driver: vmxnet3
  net: Fix IXP 2000 network driver building.
  libertas: fix build
  mac80211: document ieee80211_rx() context requirement
  ...

Merge the right tty-fixes branch

* branch 'tty-fixes'
  tty: use the new 'flush_delayed_work()' helper to do ldisc flush
  workqueue: add 'flush_delayed_work()' to run and wait for delayed work
  tty: Make flush_to_ldisc() locking more robust

rcu: Fix TREE_PREEMPT_RCU CPU_HOTPLUG bad-luck hang

If the following sequence of events occurs, then
TREE_PREEMPT_RCU will hang waiting for a grace period to
complete, eventually OOMing the system:

o A TREE_PREEMPT_RCU build of the kernel is booted on a system
with more than 64 physical CPUs present (32 on a 32-bit system).
Alternatively, a TREE_PREEMPT_RCU build of the kernel is booted
with RCU_FANOUT set to a sufficiently small value that the
physical CPUs populate two or more leaf rcu_node structures.

o A task is preempted in an RCU read-side critical section
while running on a CPU corresponding to a given leaf rcu_node
structure.

o All CPUs corresponding to this same leaf rcu_node structure
record quiescent states for the current grace period.

o All of these same CPUs go offline (hence the need for enough
physical CPUs to populate more than one leaf rcu_node structure).
This causes the preempted task to be moved to the root rcu_node
structure.

At this point, there is nothing left to cause the quiescent
state to be propagated up the rcu_node tree, so the current
grace period never completes.

The simplest fix, especially after considering the deadlock
possibilities, is to detect this situation when the last CPU is
offlined, and to set that CPU's ->qsmask bit in its leaf
rcu_node structure. This will cause the next invocation of
force_quiescent_state() to end the grace period.

Without this fix, this hang can be triggered in an hour or so on
some machines with rcutorture and random CPU onlining/offlining.
With this fix, these same machines pass a full 10 hours of this
sort of abuse.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
LKML-Reference: <20091015162614.GA19131@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ARM: 5763/1: ARM: SMP: Fix the BUG with CONFIG_PREEMPT enabled

This patch fixes the BUG: using smp_processor_id() in preemptible
Below is the stripped backtrace.

BUG: using smp_processor_id() in preemptible [00000000] code: init/1
caller is flush_tlb_mm+0x44/0x70
Backtrace:
[<c00225c4>] (dump_backtrace+0x0/0x110) from [<c01713a0>] (dump_stack+0x18/0x1c)
r7:00000000 r6:c00234f0 r5:00000001 r4:c7828000
[<c0171388>] (dump_stack+0x0/0x1c) from [<c0135364>] (debug_smp_processor_id+0xc0/0xf0)
[<c01352a4>] (debug_smp_processor_id+0x0/0xf0) from [<c00234f0>] (flush_tlb_mm+0x44/0x70)
r7:00000000 r6:c60b41a0 r5:c60b4154 r4:00000001
[<c00234ac>] (flush_tlb_mm+0x0/0x70) from [<c0039568>] (dup_mm+0x304/0x38c)
r5:c1f09058 r4:00000000
[<c0039264>] (dup_mm+0x0/0x38c) from [<c0039de4>] (copy_process+0x7b8/0xeb0)
[<c003962c>] (copy_process+0x0/0xeb0) from [<c003a638>] (do_fork+0x15c/0x29c)
[<c003a4dc>] (do_fork+0x0/0x29c) from [<c0021df0>] (sys_clone+0x34/0x3c)
[<c0021dbc>] (sys_clone+0x0/0x3c) from [<c001efa0>] (ret_fast_syscall+0x0/0x2c)

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

Update flex_arrays.txt

The 2.6.32 merge window brought a number of changes to the flexible array
API; this patch updates the documentation to match the new state of
affairs.

Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Merge branch 'for-rmk-rc' of git://git.pengutronix.de/git/imx/linux-2.6

rcu: Stopgap fix for synchronize_rcu_expedited() for TREE_PREEMPT_RCU

For the short term, map synchronize_rcu_expedited() to
synchronize_rcu() for TREE_PREEMPT_RCU and to
synchronize_sched_expedited() for TREE_RCU.

Longer term, there needs to be a real expedited grace period for
TREE_PREEMPT_RCU, but candidate patches to date are considerably
more complex and intrusive.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: npiggin@suse.de
Cc: jens.axboe@oracle.com
LKML-Reference: <12555405592331-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>

rcu: Prevent RCU IPI storms in presence of high call_rcu() load

As the number of callbacks on a given CPU rises, invoke
force_quiescent_state() only every blimit number of callbacks
(defaults to 10,000), and even then only if no other CPU has
invoked force_quiescent_state() in the meantime.

This should fix the performance regression reported by Nick.

Reported-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: jens.axboe@oracle.com
LKML-Reference: <12555405592133-git-send-email->
Signed-off-by: Ingo Molnar <mingo@elte.hu>

virtio_net: use dev_kfree_skb_any() in free_old_xmit_skbs()

Because netpoll can call netdevice start_xmit() method with
irqs disabled, drivers should not call kfree_skb() from
their start_xmit(), but use dev_kfree_skb_any() instead.

Oct  8 11:16:52 172.30.1.31 [113074.791813] ------------[ cut here ]------------
Oct  8 11:16:52 172.30.1.31 [113074.791813] WARNING: at net/core/skbuff.c:398 \
                skb_release_head_state+0x64/0xc8()
Oct  8 11:16:52 172.30.1.31 [113074.791813] Hardware name:
Oct  8 11:16:52 172.30.1.31 [113074.791813] Modules linked in: netconsole ocfs2 jbd2 quota_tree \
ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs crc32c drbd cn loop \
serio_raw psmouse snd_pcm snd_timer snd soundcore snd_page_alloc virtio_net pcspkr parport_pc parport \
i2c_piix4 i2c_core button processor evdev ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot \
dm_mod ide_cd_mod cdrom ata_generic ata_piix virtio_blk libata scsi_mod piix ide_pci_generic ide_core \
                virtio_pci virtio_ring virtio floppy thermal fan thermal_sys [last unloaded: netconsole]
Oct  8 11:16:52 172.30.1.31 [113074.791813] Pid: 11132, comm: php5-cgi Tainted: G        W  \
                2.6.31.2-vserver #1
Oct  8 11:16:52 172.30.1.31 [113074.791813] Call Trace:
Oct  8 11:16:52 172.30.1.31 [113074.791813] <IRQ>  [<ffffffff81253cd5>] ? \
                skb_release_head_state+0x64/0xc8
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81253cd5>] ? skb_release_head_state+0x64/0xc8
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81049ae1>] ? warn_slowpath_common+0x77/0xa3
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81253cd5>] ? skb_release_head_state+0x64/0xc8
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81253a1a>] ? __kfree_skb+0x9/0x7d
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffffa01cb139>] ? free_old_xmit_skbs+0x51/0x6e \
                [virtio_net]
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffffa01cbc85>] ? start_xmit+0x26/0xf2 [virtio_net]
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff8126934f>] ? netpoll_send_skb+0xd2/0x205
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffffa0429216>] ? write_msg+0x90/0xeb [netconsole]
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81049f06>] ? __call_console_drivers+0x5e/0x6f
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff8102b49d>] ? kvm_clock_read+0x4d/0x52
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff8104a082>] ? release_console_sem+0x115/0x1ba
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff8104a632>] ? vprintk+0x2f2/0x34b
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff8106b142>] ? vx_update_load+0x18/0x13e
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81308309>] ? printk+0x4e/0x5d
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff8102b49d>] ? kvm_clock_read+0x4d/0x52
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81070b62>] ? getnstimeofday+0x55/0xaf
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81062683>] ? ktime_get_ts+0x21/0x49
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff810626b7>] ? ktime_get+0xc/0x41
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81062788>] ? hrtimer_interrupt+0x9c/0x146
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81024a4b>] ? smp_apic_timer_interrupt+0x80/0x93
Oct  8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81011663>] ? apic_timer_interrupt+0x13/0x20
Oct  8 11:16:52 172.30.1.31 [113074.791813] <EOI>  [<ffffffff8130a9eb>] ? _spin_unlock_irq+0xd/0x31

Reported-and-tested-by: Massimo Cetra <mcetra@navynet.it>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Bug-Entry: http://bugzilla.kernel.org/show_bug.cgi?id=14378
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: fix support for PCI hot plug

Before issuing any cmds to the FW, the driver must first wait
till the fW becomes ready. This is needed for PCI hot plug when
the driver can be probed while the card fw is being initialized.

Signed-off-by: Sathya Perla <sathyap@serverengines.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: fix promiscuous and multicast promiscuous modes being enabled always

Signed-off-by: Sathya Perla <sathyap@serverengines.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "x86: linker script syntax nits"

This reverts commit e9a63a4e559fbdc522072281d05e6b13c1022f4b.

This breaks older binutils, where sink-less asserts are broken.

See this commit for further details:

d2ba8b2: x86: Fix assert syntax in vmlinux.lds.S

Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4AD6523D.5030909@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Merge branch 'linus' into x86/urgent

Merge reason: pull in latest, to be able to revert a patch there.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

Merge branch 'misc' into release

Merge branch 'launchpad-333386' into release

Merge branch 'eeepc-laptop' into release

Merge branch 'bugzilla-14129' into release

vmxnet: fix 2 build problems

vmxnet3 uses in_dev* interfaces so it should depend on INET.
Also fix so that the driver builds when CONFIG_PCI_MSI is disabled.

vmxnet3_drv.c:(.text+0x2a88cb): undefined reference to `in_dev_finish_destroy'

drivers/net/vmxnet3/vmxnet3_drv.c:1335: error: 'struct vmxnet3_intr' has no member named 'msix_entries'
drivers/net/vmxnet3/vmxnet3_drv.c:1384: error: 'struct vmxnet3_intr' has no member named 'msix_entries'
drivers/net/vmxnet3/vmxnet3_drv.c:2137: error: 'struct vmxnet3_intr' has no member named 'msix_entries'
drivers/net/vmxnet3/vmxnet3_drv.c:2138: error: 'struct vmxnet3_intr' has no member named 'msix_entries'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Bhavesh davda <bhavesh@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge commit 'ftrace/ppc' into merge

Merge branch '2_6_32rc4_fixes' of git://git.pwsan.com/linux-2.6 into omap-fixes-for-linus

tty: fix vt_compat_ioctl

Call compat_unimap_ioctl, not do_unimap_ioctl.

This was broken by commit e9216651.

The compat_unimap_ioctl was originally called do_unimap_ioctl in
fs/compat_ioctl.h which got moved to drivers/char/vt_ioctl.c.
In that patch, the caller was not updated and consequently called
the native handler.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

OMAP2xxx clock: set up clockdomain pointer in struct clk

clock24xx.c is missing a omap2_init_clk_clkdm() in its
omap2_clk_init() function. Among other bad effects, this causes the
OMAP hwmod layer to oops on boot.

Thanks to Carlos Aguiar <carlos.aguiar@indt.org.br> and Stefano
Panella <Stefano.Panella@csr.com> for reporting this bug. Thanks to Tony
Lindgren <tony@atomide.com> for N800 booting advice.

Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Carlos Aguiar <carlos.aguiar@indt.org.br>
Cc: Stefano Panella <Stefano.Panella@csr.com>
Cc: Tony Lindgren <tony@atomide.com>

OMAP: Fix race condition with autodeps

There is a possible race condition in clockdomain
code handling hw supported idle transitions.

When multiple autodeps dependencies are being added
or removed, a transition of still remaining dependent
powerdomain can result in false readings of the
state counter. This is especially fatal for off mode
state counter, as it could result in a driver not
noticing a context loss.

Fixed by disabling hw supported state transitions
when autodeps are being changed.

Signed-off-by: Kalle Jokiniemi <kalle.jokiniemi@digia.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  firewire: sbp2: provide fallback if mgt_ORB_timeout is missing
  ieee1394: add documentation entry to MAINTAINERS
  ieee1394: update URLs in debugging-via-ohci1394.txt