git.karo-electronics.de Git - karo-tx-linux.git/log

xfrm_user: don't copy esn replay window twice for new states

The ESN replay window was already fully initialized in
xfrm_alloc_replay_state_esn(). No need to copy it again.

Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mathias Krause [Thu, 20 Sep 2012 10:01:49 +0000 (10:01 +0000)]

xfrm_user: ensure user supplied esn replay window is valid

The current code fails to ensure that the netlink message actually
contains as many bytes as the header indicates. If a user creates a new
state or updates an existing one but does not supply the bytes for the
whole ESN replay window, the kernel copies random heap bytes into the
replay bitmap, the ones happen to follow the XFRMA_REPLAY_ESN_VAL
netlink attribute. This leads to following issues:

1. The replay window has random bits set confusing the replay handling
   code later on.

2. A malicious user could use this flaw to leak up to ~3.5kB of heap
   memory when she has access to the XFRM netlink interface (requires
   CAP_NET_ADMIN).

Known users of the ESN replay window are strongSwan and Steffen's
iproute2 patch (<http://patchwork.ozlabs.org/patch/85962/>). The latter
uses the interface with a bitmap supplied while the former does not.
strongSwan is therefore prone to run into issue 1.

To fix both issues without breaking existing userland allow using the
XFRMA_REPLAY_ESN_VAL netlink attribute with either an empty bitmap or a
fully specified one. For the former case we initialize the in-kernel
bitmap with zero, for the latter we copy the user supplied bitmap. For
state updates the full bitmap must be supplied.

To prevent overflows in the bitmap length calculation the maximum size
of bmp_len is limited to 128 by this patch -- resulting in a maximum
replay window of 4096 packets. This should be sufficient for all real
life scenarios (RFC 4303 recommends a default replay window size of 64).

Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Martin Willi <martin@revosec.ch>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mathias Krause [Wed, 19 Sep 2012 11:33:41 +0000 (11:33 +0000)]

xfrm_user: fix info leak in copy_to_user_tmpl()

The memory used for the template copy is a local stack variable. As
struct xfrm_user_tmpl contains multiple holes added by the compiler for
alignment, not initializing the memory will lead to leaking stack bytes
to userland. Add an explicit memset(0) to avoid the info leak.

Initial version of the patch by Brad Spengler.

Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mathias Krause [Wed, 19 Sep 2012 11:33:40 +0000 (11:33 +0000)]

xfrm_user: fix info leak in copy_to_user_policy()

The memory reserved to dump the xfrm policy includes multiple padding
bytes added by the compiler for alignment (padding bytes in struct
xfrm_selector and struct xfrm_userpolicy_info). Add an explicit
memset(0) before filling the buffer to avoid the heap info leak.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mathias Krause [Wed, 19 Sep 2012 11:33:39 +0000 (11:33 +0000)]

xfrm_user: fix info leak in copy_to_user_state()

The memory reserved to dump the xfrm state includes the padding bytes of
struct xfrm_usersa_info added by the compiler for alignment (7 for
amd64, 3 for i386). Add an explicit memset(0) before filling the buffer
to avoid the info leak.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Mathias Krause [Wed, 19 Sep 2012 11:33:38 +0000 (11:33 +0000)]

xfrm_user: fix info leak in copy_to_user_auth()

copy_to_user_auth() fails to initialize the remainder of alg_name and
therefore discloses up to 54 bytes of heap memory via netlink to
userland.

Use strncpy() instead of strcpy() to fill the trailing bytes of alg_name
with null bytes.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Bjørn Mork [Wed, 19 Sep 2012 10:03:36 +0000 (10:03 +0000)]

net: qmi_wwan: adding Huawei E367, ZTE MF683 and Pantech P4200

One of the modes of Huawei E367 has this QMI/wwan interface:

I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=01 Prot=07 Driver=(none)
E:  Ad=83(I) Atr=03(Int.) MxPS=  64 Ivl=2ms
E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=4ms

Huawei use subclass and protocol to identify vendor specific
functions, so adding a new vendor rule for this combination.

The Pantech devices UML290 (106c:3718) and P4200 (106c:3721) use
the same subclass to identify the QMI/wwan function.  Replace the
existing device specific UML290 entries with generic vendor matching,
adding support for the Pantech P4200.

The ZTE MF683 has 6 vendor specific interfaces, all using
ff/ff/ff for cls/sub/prot.  Adding a match on interface #5 which
is a QMI/wwan interface.

Cc: Fangxiaozhi (Franko) <fangxiaozhi@huawei.com>
Cc: Thomas Schäfer <tschaefer@t-online.de>
Cc: Dan Williams <dcbw@redhat.com>
Cc: Shawn J. Goff <shawn7400@gmail.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Andrey Vagin [Wed, 19 Sep 2012 09:40:00 +0000 (09:40 +0000)]

tcp: restore rcv_wscale in a repair mode (v2)

rcv_wscale is a symetric parameter with snd_wscale.

Both this parameters are set on a connection handshake.

Without this value a remote window size can not be interpreted correctly,
because a value from a packet should be shifted on rcv_wscale.

And one more thing is that wscale_ok should be set too.

This patch doesn't break a backward compatibility.
If someone uses it in a old scheme, a rcv window
will be restored with the same bug (rcv_wscale = 0).

v2: Save backward compatibility on big-endian system. Before
    the first two bytes were snd_wscale and the second two bytes were
    rcv_wscale. Now snd_wscale is opt_val & 0xFFFF and rcv_wscale >> 16.
    This approach is independent on byte ordering.

Cc: David S. Miller <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
CC: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Tony Luck [Thu, 20 Sep 2012 18:47:13 +0000 (11:47 -0700)]

[IA64] Must enable interrupts in do_notify_resume_user before calling tracehook_notify_resume()

If we call with interrupts disabled, we'll be hit with:

WARNING: at kernel/softirq.c:160 local_bh_enable_ip+0x150/0x180() and a stack
trace like this:

Call Trace:
[<a000000100015480>] show_stack+0x80/0xa0
[<a000000100d9a520>] dump_stack+0x30/0x50
[<a000000100072fc0>] warn_slowpath_common+0xc0/0x100
[<a000000100073040>] warn_slowpath_null+0x40/0x60
[<a0000001000884d0>] local_bh_enable_ip+0x150/0x180
[<a000000100da2960>] _raw_write_unlock_bh+0x40/0x60
[<a000000100cf03c0>] unix_release_sock+0x120/0x5a0
[<a000000100cf0880>] unix_release+0x40/0x60
[<a000000100b84400>] sock_release+0x60/0x1a0
[<a000000100b84b70>] sock_close+0x30/0xa0
[<a0000001001d10f0>] __fput+0x190/0x500
[<a0000001001d1580>] ____fput+0x20/0x40
[<a0000001000b6570>] task_work_run+0x1b0/0x260
[<a000000100015190>] do_notify_resume_user+0x110/0x2a0
[<a00000010000c5a0>] notify_resume_user+0x40/0x60
[<a00000010000c4d0>] skip_rbs_switch+0xe0/0xf0
[<a000000000040720>] ia64_ivt+0xffffffff00040720/0x400

Fix-suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Tony Luck <tony.luck@intel.com>

commit | commitdiff | tree

Tao Ma [Thu, 20 Sep 2012 15:35:38 +0000 (11:35 -0400)]

ext4: remove erroneous ext4_superblock_csum_set() in update_backups()

The update_backups() function is used to backup all the metadata
blocks, so we should not take it for granted that 'data' is pointed to
a super block and use ext4_superblock_csum_set to calculate the
checksum there. In case where the data is a group descriptor block,
it will corrupt the last group descriptor, and then e2fsck will
complain about it it.

As all the metadata checksums should already be OK when we do the
backup, remove the wrong ext4_superblock_csum_set and it should be
just fine.

Reported-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

commit | commitdiff | tree

Theodore Ts'o [Thu, 20 Sep 2012 02:42:36 +0000 (22:42 -0400)]

ext4: fix potential deadlock in ext4_nonda_switch()

In ext4_nonda_switch(), if the file system is getting full we used to
call writeback_inodes_sb_if_idle().  The problem is that we can be
holding i_mutex already, and this causes a potential deadlock when
writeback_inodes_sb_if_idle() when it tries to take s_umount.  (See
lockdep output below).

As it turns out we don't need need to hold s_umount; the fact that we
are in the middle of the write(2) system call will keep the superblock
pinned.  So we can just call writeback_inodes_sb() directly without
taking s_umount first.

   [ INFO: possible circular locking dependency detected ]
   3.6.0-rc1-00042-gce894ca #367 Not tainted
   -------------------------------------------------------
   dd/8298 is trying to acquire lock:
    (&type->s_umount_key#18){++++..}, at: [<c02277d4>] writeback_inodes_sb_if_idle+0x28/0x46

   but task is already holding lock:
    (&sb->s_type->i_mutex_key#8){+.+...}, at: [<c01ddcce>] generic_file_aio_write+0x5f/0xd3

   which lock already depends on the new lock.

   2 locks held by dd/8298:
    #0:  (sb_writers#2){.+.+.+}, at: [<c01ddcc5>] generic_file_aio_write+0x56/0xd3
    #1:  (&sb->s_type->i_mutex_key#8){+.+...}, at: [<c01ddcce>] generic_file_aio_write+0x5f/0xd3

   stack backtrace:
   Pid: 8298, comm: dd Not tainted 3.6.0-rc1-00042-gce894ca #367
   Call Trace:
    [<c015b79c>] ? console_unlock+0x345/0x372
    [<c06d62a1>] print_circular_bug+0x190/0x19d
    [<c019906c>] __lock_acquire+0x86d/0xb6c
    [<c01999db>] ? mark_held_locks+0x5c/0x7b
    [<c0199724>] lock_acquire+0x66/0xb9
    [<c02277d4>] ? writeback_inodes_sb_if_idle+0x28/0x46
    [<c06db935>] down_read+0x28/0x58
    [<c02277d4>] ? writeback_inodes_sb_if_idle+0x28/0x46
    [<c02277d4>] writeback_inodes_sb_if_idle+0x28/0x46
    [<c026f3b2>] ext4_nonda_switch+0xe1/0xf4
    [<c0271ece>] ext4_da_write_begin+0x27/0x193
    [<c01dcdb0>] generic_file_buffered_write+0xc8/0x1bb
    [<c01ddc47>] __generic_file_aio_write+0x1dd/0x205
    [<c01ddce7>] generic_file_aio_write+0x78/0xd3
    [<c026d336>] ext4_file_write+0x480/0x4a6
    [<c0198c1d>] ? __lock_acquire+0x41e/0xb6c
    [<c0180944>] ? sched_clock_cpu+0x11a/0x13e
    [<c01967e9>] ? trace_hardirqs_off+0xb/0xd
    [<c018099f>] ? local_clock+0x37/0x4e
    [<c0209f2c>] do_sync_write+0x67/0x9d
    [<c0209ec5>] ? wait_on_retry_sync_kiocb+0x44/0x44
    [<c020a7b9>] vfs_write+0x7b/0xe6
    [<c020a9a6>] sys_write+0x3b/0x64
    [<c06dd4bd>] syscall_call+0x7/0xb

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org

commit | commitdiff | tree

Axel Lin [Wed, 19 Sep 2012 22:56:23 +0000 (15:56 -0700)]

Input: edt-ft5x06 - return -EFAULT on copy_to_user() error

copy_to_user() returns the number of bytes remaining, but we want a
negative error code here.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>

commit | commitdiff | tree

Tai-hwa Liang [Wed, 19 Sep 2012 18:10:47 +0000 (11:10 -0700)]

Input: sentelic - filter out erratic movement when lifting finger

When lifing finger off the surface some versions of touchpad send movement
packets with very low coordinates, which cause cursor to jump to the upper
left corner of the screen. Let's ignore least significant bits of X and Y
coordinates if higher bits are all zeroes and consider finger not touching
the pad.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43197
Reported-and-tested-by: Aleksey Spiridonov <leks13@leks13.ru>
Tested-by: Eddie Dunn <eddie.dunn@gmail.com>
Tested-by: Jakub Luzny <limoto94@gmail.com>
Tested-by: Olivier Goffart <olivier@woboq.com>
Signed-off-by: Tai-hwa Liang <avatar@sentelic.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>

commit | commitdiff | tree

Pawel Moll [Wed, 19 Sep 2012 18:10:47 +0000 (11:10 -0700)]

Input: ambakmi - [un]prepare clocks when enabling amd disabling

Clocks must be prepared before enabling and unprepared
after disabling. Use appropriate functions to do this
in one go.

Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>

commit | commitdiff | tree

Anisse Astier [Wed, 19 Sep 2012 18:10:48 +0000 (11:10 -0700)]

Input: i8042 - disable mux on Toshiba C850D

On Toshiba Satellite C850D, the touchpad and the keyboard might randomly
not work at boot. Preventing MUX mode activation solves this issue.

Signed-off-by: Anisse Astier <anisse@astier.eu>
Cc: stable@kernel.org
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>

commit | commitdiff | tree

Li RongQing [Tue, 18 Sep 2012 16:53:21 +0000 (16:53 +0000)]

net/core: fix comment in skb_try_coalesce

It should be the skb which is not cloned

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Søren holm [Mon, 17 Sep 2012 21:50:57 +0000 (21:50 +0000)]

asix: Support DLink DUB-E100 H/W Ver C1

Signed-off-by: Søren Holm <sgh@sgh.dk>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Paolo Valente [Sat, 15 Sep 2012 00:41:35 +0000 (00:41 +0000)]

pkt_sched: fix virtual-start-time update in QFQ

If the old timestamps of a class, say cl, are stale when the class
becomes active, then QFQ may assign to cl a much higher start time
than the maximum value allowed. This may happen when QFQ assigns to
the start time of cl the finish time of a group whose classes are
characterized by a higher value of the ratio
max_class_pkt/weight_of_the_class with respect to that of
cl. Inserting a class with a too high start time into the bucket list
corrupts the data structure and may eventually lead to crashes.
This patch limits the maximum start time assigned to a class.

Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Michal Kubeček [Fri, 14 Sep 2012 04:59:52 +0000 (04:59 +0000)]

tcp: flush DMA queue before sk_wait_data if rcv_wnd is zero

If recv() syscall is called for a TCP socket so that
  - IOAT DMA is used
  - MSG_WAITALL flag is used
  - requested length is bigger than sk_rcvbuf
  - enough data has already arrived to bring rcv_wnd to zero
then when tcp_recvmsg() gets to calling sk_wait_data(), receive
window can be still zero while sk_async_wait_queue exhausts
enough space to keep it zero. As this queue isn't cleaned until
the tcp_service_net_dma() call, sk_wait_data() cannot receive
any data and blocks forever.

If zero receive window and non-empty sk_async_wait_queue is
detected before calling sk_wait_data(), process the queue first.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Linus Lüssing [Fri, 14 Sep 2012 00:40:54 +0000 (00:40 +0000)]

batman-adv: make batadv_test_bit() return 0 or 1 only

On some architectures test_bit() can return other values than 0 or 1:

With a generic x86 OpenWrt image in a kvm setup (batadv_)test_bit()
frequently returns -1 for me, leading to batadv_iv_ogm_update_seqnos()
wrongly signaling a protected seqno window.

This patch tries to fix this issue by making batadv_test_bit() return 0
or 1 only.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Acked-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Zheng Liu [Wed, 19 Sep 2012 19:24:40 +0000 (15:24 -0400)]

ext4: add SEEK_HOLE/SEEK_DATA support

This patch makes ext4 really support SEEK_DATA/SEEK_HOLE flags. Extent-based
and block-based files are implemented together because ext4_map_blocks hides
this difference.

CC: Hugh Dickins <hughd@google.com>
CC: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

commit | commitdiff | tree

Steve French [Wed, 19 Sep 2012 19:24:02 +0000 (14:24 -0500)]

[CIFS] Allow SMB3 negotiation

SMB3 will all optional features disabled, negotiates
almost the same as SMB2.1 so now we also can allow "vers=3.0"
(not just "vers=2.1"). Key SMB3 new features (improved packet signing
for example) will be added in subsequent patches, but the
basic operations work.

Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>

commit | commitdiff | tree

Pavel Shilovsky [Wed, 19 Sep 2012 12:03:26 +0000 (16:03 +0400)]

CIFS: Fix possible memory leaks in SMB2 code

and add missed increments of failed async read and write requests.

Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>

commit | commitdiff | tree

Pavel Shilovsky [Wed, 19 Sep 2012 08:36:52 +0000 (12:36 +0400)]

CIFS: Fix endian conversion of IndexNumber

by making it __le64 rather than __u64 in FILE_AL_INFO structure.

Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>

commit | commitdiff | tree

Steve French [Wed, 19 Sep 2012 16:19:39 +0000 (09:19 -0700)]

Trivial endian fixes

Some trivial endian fixes for the SMB2 code. One
warning remains which I asked Pavel to look at.

Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>

commit | commitdiff | tree

Zheng Liu [Wed, 19 Sep 2012 19:15:34 +0000 (15:15 -0400)]

ext4: reimplement ext4_find_delay_alloc_range on extent status tree

ext4_find_delay_alloc_range is reimplemented on extent status tree.

Reviewed-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Reviewed-by: Allison Henderson <achender@linux.vnet.ibm.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

commit | commitdiff | tree

Zheng Liu [Wed, 19 Sep 2012 19:11:33 +0000 (15:11 -0400)]

ext4: reimplement fiemap on extent status tree

This patch reimplements fiemap on extent status tree.

Reviwed-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Reviewed-by: Allison Henderson <achender@linux.vnet.ibm.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

commit | commitdiff | tree

Zheng Liu [Wed, 19 Sep 2012 19:07:46 +0000 (15:07 -0400)]

ext4: add some tracepoints in extent status tree

This patch adds some tracepoints in extent status tree.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

commit | commitdiff | tree

Zheng Liu [Wed, 19 Sep 2012 19:06:50 +0000 (15:06 -0400)]

ext4: let ext4 maintain extent status tree

This patch lets ext4 maintain extent status tree.

Reviewed-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Reviewed-by: Allison Henderson <achender@linux.vnet.ibm.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

commit | commitdiff | tree

Zheng Liu [Wed, 19 Sep 2012 18:45:51 +0000 (14:45 -0400)]

ext4: initialize extent status tree

Let ext4 initialize extent status tree of an inode.

Reviewed-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Reviewed-by: Allison Henderson <achender@linux.vnet.ibm.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

commit | commitdiff | tree

Zheng Liu [Wed, 19 Sep 2012 18:35:32 +0000 (14:35 -0400)]

ext4: add operations on extent status tree

This patch adds operations on a extent status tree.

CC: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Reviewed-by: Allison Henderson <achender@linux.vnet.ibm.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>

commit | commitdiff | tree

Zheng Liu [Wed, 19 Sep 2012 18:19:24 +0000 (14:19 -0400)]

ext4: add extent status tree data structures

This patch adds two structures that supports status extent tree and
status_extent_info to inode_info.

Reviewed-by: Yongqiang Yang <xiaoqiangnk@gmail.com>
Reviewed-by: Allison Henderson <achender@linux.vnet.ibm.com>
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

commit | commitdiff | tree

Andrey Sidorov [Wed, 19 Sep 2012 18:14:53 +0000 (14:14 -0400)]

ext4: speed up truncate/unlink by not using bforget() unless needed

Do not iterate over data blocks scanning for bh's to forget as they're
never exist. This improves time taken by unlink / truncate syscall.
Tested by continuously truncating file that is being written by dd.
Another test is rm -rf of linux tree while tar unpacks it. With
ordered data mode condition unlikely(!tbh) was always met in
ext4_free_blocks. With journal data mode tbh was found only few times,
so optimisation is also possible.

Unlinking fallocated 60G file after doing sync && echo 3 >
/proc/sys/vm/drop_caches && time rm --help

X86 before (linux 3.6-rc4):
# time rm -f test1
real    0m2.710s
user    0m0.000s
sys     0m1.530s

X86 after:
# time rm -f test1
real    0m0.644s
user    0m0.003s
sys     0m0.060s

MIPS before (linux 2.6.37):
# time rm -f test1
real    0m 4.93s
user    0m 0.00s
sys     0m 4.61s

MIPS after:
# time rm -f test1
real    0m 0.16s
user    0m 0.00s
sys     0m 0.06s

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrey Sidorov <qrxd43@motorola.com>

commit | commitdiff | tree

Linus Torvalds [Wed, 19 Sep 2012 18:04:34 +0000 (11:04 -0700)]

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"A small collection of driver fixes/updates and a core fix for 3.6.  It
  contains:

   - Bug fixes for mtip32xx, and support for new hardware (just addition
     of IDs).  They have been queued up for 3.7 for a few weeks as well.

   - rate-limit a failing command error message in block core.

   - A fix for an old cciss bug from Stephen.

   - Prevent overflow of partition count from Alan."

* 'for-linus' of git://git.kernel.dk/linux-block:
  cciss: fix handling of protocol error
  blk: add an upper sanity check on partition adding
  mtip32xx: fix user_buffer check in exec_drive_command
  mtip32xx: Remove dead code
  mtip32xx: Change printk to pr_xxxx
  mtip32xx: Proper reporting of write protect status on big-endian
  mtip32xx: Increase timeout for standby command
  mtip32xx: Handle NCQ commands during the security locked state
  mtip32xx: Add support for new devices
  block: rate-limit the error message from failing commands

commit | commitdiff | tree

Linus Torvalds [Wed, 19 Sep 2012 18:03:55 +0000 (11:03 -0700)]

Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh

Pull SuperH fixes from Paul Mundt.

* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
  sh: Fix up TIF_NOTIFY_RESUME sans TIF_SIGPENDING handling.
  sh: pfc: Release spinlock in sh_pfc_gpio_request_enable() error path
  sh: intc: Fix up multi-evt irq association.

commit | commitdiff | tree

Linus Torvalds [Wed, 19 Sep 2012 18:03:13 +0000 (11:03 -0700)]

Merge tag 'rpmsg-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/rpmsg

Pull rpmsg fix from Ohad Ben-Cohen:
"A quick rpmsg fix from Fernando, fixing two buggy invocations of
dma_free_coherent"

* tag 'rpmsg-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/rpmsg:
rpmsg: fix dma_free_coherent dev parameter

commit | commitdiff | tree

Linus Torvalds [Wed, 19 Sep 2012 18:01:38 +0000 (11:01 -0700)]

Merge tag 'md-3.6-fixes' of git://neil.brown.name/md

Pull md fixes from NeilBrown:
"3 fixes for md in 3.6.

  One reverts a recent patch which turns out to not be such a good idea.

  Other two fix minor bugs with the new (since 3.3) 'replacement' code
  and have been tagged for -stable."

* tag 'md-3.6-fixes' of git://neil.brown.name/md:
  md: make sure metadata is updated when spares are activated or removed.
  md/raid5: fix calculate of 'degraded' when a replacement becomes active.
  Revert "md/raid5: For odirect-write performance, do not set STRIPE_PREREAD_ACTIVE."

commit | commitdiff | tree

Linus Torvalds [Wed, 19 Sep 2012 18:00:07 +0000 (11:00 -0700)]

Merge branch 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue / powernow-k8 fix from Tejun Heo:
"This is the fix for the bug where cpufreq/powernow-k8 was tripping
  BUG_ON() in try_to_wake_up_local() by migrating workqueue worker to a
  different CPU.

    https://bugzilla.kernel.org/show_bug.cgi?id=47301

  As discussed, the fix is now two parts - one to reimplement
  work_on_cpu() so that it doesn't create a new kthread each time and
  the actual fix which makes powernow-k8 use work_on_cpu() instead of
  performing manual migration.

  While pretty late in the merge cycle, both changes are on the safer
  side.  Jiri and I verified two existing users of work_on_cpu() and
  Duncan confirmed that the powernow-k8 fix survived about 18 hours of
  testing."

* 'for-3.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  cpufreq/powernow-k8: workqueue user shouldn't migrate the kworker to another CPU
  workqueue: reimplement work_on_cpu() using system_wq

commit | commitdiff | tree

Dmitry Torokhov [Wed, 19 Sep 2012 17:23:18 +0000 (10:23 -0700)]

Revert "input: ab8500-ponkey: Create AB8500 domain IRQ mapping"

This reverts commit ca3b3faf9bee4dc5df4f10eae2d1e48f7de0a8ad.

There was a plan to place ab8500_irq_get_virq() calls in each AB8500
child device prior to requesting an IRQ, but as we're no longer using
Device Tree to collect our IRQ numbers, it's actually better to allow
the core to do this during device registration time. So the IRQ number
we pull from its resource has already been converted to a virtual IRQ.

Signed-off-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

commit | commitdiff | tree

Dmitry Torokhov [Wed, 19 Sep 2012 17:21:21 +0000 (10:21 -0700)]

Merge tag 'v3.6-rc5' into for-linus

Sync with mainline so that I can revert an input patch that came in through
another subsystem tree.

commit | commitdiff | tree

Tejun Heo [Tue, 18 Sep 2012 21:24:59 +0000 (14:24 -0700)]

cpufreq/powernow-k8: workqueue user shouldn't migrate the kworker to another CPU

powernowk8_target() runs off a per-cpu work item and if the
cpufreq_policy->cpu is different from the current one, it migrates the
kworker to the target CPU by manipulating current->cpus_allowed.  The
function migrates the kworker back to the original CPU but this is
still broken.  Workqueue concurrency management requires the kworkers
to stay on the same CPU and powernowk8_target() ends up triggerring
BUG_ON(rq != this_rq()) in try_to_wake_up_local() if it contends on
fidvid_mutex and sleeps.

It is unclear why this bug is being reported now.  Duncan says it
appeared to be a regression of 3.6-rc1 and couldn't reproduce it on
3.5.  Bisection seemed to point to 63d95a91 "workqueue: use @pool
instead of @gcwq or @cpu where applicable" which is an non-functional
change.  Given that the reproduce case sometimes took upto days to
trigger, it's easy to be misled while bisecting.  Maybe something made
contention on fidvid_mutex more likely?  I don't know.

This patch fixes the bug by using work_on_cpu() instead if @pol->cpu
isn't the same as the current one.  The code assumes that
cpufreq_policy->cpu is kept online by the caller, which Rafael tells
me is the case.

stable: ed48ece27c ("workqueue: reimplement work_on_cpu() using
        system_wq") should be applied before this; otherwise, the
        behavior could be horrible.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Duncan <1i5t5.duncan@cox.net>
Tested-by: Duncan <1i5t5.duncan@cox.net>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: stable@vger.kernel.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=47301

commit | commitdiff | tree

Tejun Heo [Tue, 18 Sep 2012 19:48:43 +0000 (12:48 -0700)]

workqueue: reimplement work_on_cpu() using system_wq

The existing work_on_cpu() implementation is hugely inefficient.  It
creates a new kthread, execute that single function and then let the
kthread die on each invocation.

Now that system_wq can handle concurrent executions, there's no
advantage of doing this.  Reimplement work_on_cpu() using system_wq
which makes it simpler and way more efficient.

stable: While this isn't a fix in itself, it's needed to fix a
        workqueue related bug in cpufreq/powernow-k8.  AFAICS, this
        shouldn't break other existing users.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@vger.kernel.org

commit | commitdiff | tree

Jeff Layton [Wed, 19 Sep 2012 13:22:46 +0000 (06:22 -0700)]

cifs: fix return value in cifsConvertToUTF16

This function returns the wrong value, which causes the callers to get
the length of the resulting pathname wrong when it contains non-ASCII
characters.

This seems to fix https://bugzilla.samba.org/show_bug.cgi?id=6767

Cc: <stable@vger.kernel.org>
Reported-by: Baldvin Kovacs <baldvin.kovacs@gmail.com>
Reported-and-Tested-by: Nicolas Lefebvre <nico.lefebvre@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>

commit | commitdiff | tree

Steve French [Wed, 19 Sep 2012 13:22:46 +0000 (06:22 -0700)]

[CIFS] MARK SMB2 support EXPERIMENTAL

Now that the merge of the remaining pieces needed for
SMB2 (SMB2.1 dialect) are in, and most test cases pass,
we can consider SMB2.1 EXPERIMENTAL rather than "BROKEN."

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <smfrench@gmail.com>

commit | commitdiff | tree

Steve French [Wed, 19 Sep 2012 13:22:46 +0000 (06:22 -0700)]

[CIFS] Update cifs version number

With SMB2 support, update from version 1.79 to 2.0 to make
it easier for users to recognize which version has SMB2 support.

Signed-off-by: Steve French <sfrench@us.ibm.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>

commit | commitdiff | tree

Jeff Layton [Wed, 19 Sep 2012 13:22:46 +0000 (06:22 -0700)]

cifs: add FL_CLOSE to fl_flags mask in cifs_read_flock

FL_CLOSE is quite common when you close a file on which you hold a
lock. The spurious "Unknown lock flags" message in cFYI is
confusing in this case.

Reported-by: Alexander Bokovoy <abokovoy@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>

commit | commitdiff | tree

Sachin Prabhu [Wed, 19 Sep 2012 13:22:45 +0000 (06:22 -0700)]

cifs: Mangle string used for unc in /proc/mounts

The string for "unc=" in /proc/mounts needs to be escaped. The current
behaviour can create problems in cases when mounting a share starting
with a number.

example:
>mount -t cifs -o username=test,password=x vm140-31:/17000-test /mnt
>mount -o remount,password=x /mnt
mount error: could not resolve address for vm140-31x00-test: Unknown
error

The sub-string "\170" which is part of the unc for the mount above in
/proc/mounts is interpreted as character'x' in the case above. Escaping
the string fixes the problem.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

commit | commitdiff | tree

Jeff Layton [Wed, 19 Sep 2012 13:22:45 +0000 (06:22 -0700)]

cifs: cleanups for cifs_mkdir_qinfo

Rename inode pointers for better clarity. Move the d_instantiate call to
the end of the function to prevent other tasks from seeing it before
we've finished constructing it. Since we should have exclusive access to
the inode at this point, remove the spinlock around i_nlink update.

Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

commit | commitdiff | tree

Pavel Shilovsky [Wed, 19 Sep 2012 13:22:45 +0000 (06:22 -0700)]

CIFS: Fix fast lease break after open problem

Now we walk though cifsFileInfo's list for every incoming lease
break and look for an equivalent there. That approach misses lease
breaks that come just after an open response - we don't have time
to populate new cifsFileInfo structure to the list. Fix this by
adding new list of pending opens and look for a lease there if we
didn't find it in the list of cifsFileInfo structures.

Signed-off-by: Pavel Shilovsky <pshilovsky@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>

commit | commitdiff | tree

Pavel Shilovsky [Wed, 19 Sep 2012 13:22:45 +0000 (06:22 -0700)]

CIFS: Add SMB2.1 lease break support

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <sfrench@us.ibm.com>

commit | commitdiff | tree

Pavel Shilovsky [Wed, 19 Sep 2012 13:22:45 +0000 (06:22 -0700)]

CIFS: Fix cache coherency for read oplock case

When we have a file opened with read oplock and we are writing a data
to this file, we need to store the data in the cache and then send to
the server to ensure that the next read operation will get a coherent
data.

Also mark it as CONFIG_CIFS_SMB2 because it's more suitable for SMB2
code but can fix some CIFS problems too (when server delays sending
an oplock break after a write request). We can drop this ifdefs
dependence in future.

Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>

commit | commitdiff | tree

Pavel Shilovsky [Wed, 19 Sep 2012 13:22:44 +0000 (06:22 -0700)]

CIFS: Request SMB2.1 leases

if server supports them and we need oplocks.

Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

commit | commitdiff | tree

Pavel Shilovsky [Wed, 19 Sep 2012 13:22:44 +0000 (06:22 -0700)]

CIFS: Check for mandatory brlocks on read/write

Currently CIFS code accept read/write ops on mandatory locked area
when two processes use the same file descriptor - it's wrong.
Fix this by serializing io and brlock operations on the inode.

Signed-off-by: Pavel Shilovsky <pshilovsky@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>

commit | commitdiff | tree

Pavel Shilovsky [Wed, 19 Sep 2012 13:22:44 +0000 (06:22 -0700)]

CIFS: Turn lock mutex into rw semaphore

and allow several processes to walk through the lock list and read
can_cache_brlcks value if they are not going to modify them.

Signed-off-by: Pavel Shilovsky <pshilovsky@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>

commit | commitdiff | tree

Pavel Shilovsky [Wed, 19 Sep 2012 13:22:44 +0000 (06:22 -0700)]

CIFS: Use brlock cache for SMB2

Signed-off-by: Pavel Shilovsky <pshilovsky@etersoft.ru>
Signed-off-by: Steve French <sfrench@us.ibm.com>

commit | commitdiff | tree