]> git.karo-electronics.de Git - karo-tx-linux.git/log
karo-tx-linux.git
11 years agoixgbe: add support for X540-AT1
joshua.a.hay@intel.com [Fri, 21 Sep 2012 00:08:21 +0000 (00:08 +0000)]
ixgbe: add support for X540-AT1

commit df376f0de167754da9b3ece4afdb5bb8bf3fbf3e upstream.

This patch adds device support for Ethernet Controller X540-AT1.

Signed-off-by: Josh Hay <joshua.a.hay@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Abdallah Chatila <Abdallah.Chatila@ericsson.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoKVM: x86: invalid opcode oops on SET_SREGS with OSXSAVE bit set (CVE-2012-4461)
Petr Matousek [Tue, 6 Nov 2012 18:24:07 +0000 (19:24 +0100)]
KVM: x86: invalid opcode oops on SET_SREGS with OSXSAVE bit set (CVE-2012-4461)

commit 6d1068b3a98519247d8ba4ec85cd40ac136dbdf9 upstream.

On hosts without the XSAVE support unprivileged local user can trigger
oops similar to the one below by setting X86_CR4_OSXSAVE bit in guest
cr4 register using KVM_SET_SREGS ioctl and later issuing KVM_RUN
ioctl.

invalid opcode: 0000 [#2] SMP
Modules linked in: tun ip6table_filter ip6_tables ebtable_nat ebtables
...
Pid: 24935, comm: zoog_kvm_monito Tainted: G      D      3.2.0-3-686-pae
EIP: 0060:[<f8b9550c>] EFLAGS: 00210246 CPU: 0
EIP is at kvm_arch_vcpu_ioctl_run+0x92a/0xd13 [kvm]
EAX: 00000001 EBX: 000f387e ECX: 00000000 EDX: 00000000
ESI: 00000000 EDI: 00000000 EBP: ef5a0060 ESP: d7c63e70
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process zoog_kvm_monito (pid: 24935, ti=d7c62000 task=ed84a0c0
task.ti=d7c62000)
Stack:
 00000001 f70a1200 f8b940a9 ef5a0060 00000000 00200202 f8769009 00000000
 ef5a0060 000f387e eda5c020 8722f9c8 00015bae 00000000 ed84a0c0 ed84a0c0
 c12bf02d 0000ae80 ef7f8740 fffffffb f359b740 ef5a0060 f8b85dc1 0000ae80
Call Trace:
 [<f8b940a9>] ? kvm_arch_vcpu_ioctl_set_sregs+0x2fe/0x308 [kvm]
...
 [<c12bfb44>] ? syscall_call+0x7/0xb
Code: 89 e8 e8 14 ee ff ff ba 00 00 04 00 89 e8 e8 98 48 ff ff 85 c0 74
1e 83 7d 48 00 75 18 8b 85 08 07 00 00 31 c9 8b 95 0c 07 00 00 <0f> 01
d1 c7 45 48 01 00 00 00 c7 45 1c 01 00 00 00 0f ae f0 89
EIP: [<f8b9550c>] kvm_arch_vcpu_ioctl_run+0x92a/0xd13 [kvm] SS:ESP
0068:d7c63e70

QEMU first retrieves the supported features via KVM_GET_SUPPORTED_CPUID
and then sets them later. So guest's X86_FEATURE_XSAVE should be masked
out on hosts without X86_FEATURE_XSAVE, making kvm_set_cr4 with
X86_CR4_OSXSAVE fail. Userspaces that allow specifying guest cpuid with
X86_FEATURE_XSAVE even on hosts that do not support it, might be
susceptible to this attack from inside the guest as well.

Allow setting X86_CR4_OSXSAVE bit only if host has XSAVE support.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoscsi: Silence unnecessary warnings about ioctl to partition
Jan Kara [Fri, 15 Jun 2012 10:52:46 +0000 (12:52 +0200)]
scsi: Silence unnecessary warnings about ioctl to partition

commit 6d9359280753d2955f86d6411047516a9431eb51 upstream.

Sometimes, warnings about ioctls to partition happen often enough that they
form majority of the warnings in the kernel log and users complain. In some
cases warnings are about ioctls such as SG_IO so it's not good to get rid of
the warnings completely as they can ease debugging of userspace problems
when ioctl is refused.

Since I have seen warnings from lots of commands, including some proprietary
userspace applications, I don't think disallowing the ioctls for processes
with CAP_SYS_RAWIO will happen in the near future if ever. So lets just
stop warning for processes with CAP_SYS_RAWIO for which ioctl is allowed.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: James Bottomley <JBottomley@parallels.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: satoru takeuchi <satoru.takeuchi@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agobas_gigaset: fix pre_reset handling
Tilman Schmidt [Wed, 24 Oct 2012 08:44:32 +0000 (08:44 +0000)]
bas_gigaset: fix pre_reset handling

commit c6fdd8e5d0c65bb8821dc6da26ee1a2ddd58b3cc upstream.

The delayed work function int_in_work() may call usb_reset_device()
and thus, indirectly, the driver's pre_reset method. Trying to
cancel the work synchronously in that situation would deadlock.
Fix by avoiding cancel_work_sync() in the pre_reset method.

If the reset was NOT initiated by int_in_work() this might cause
int_in_work() to run after the post_reset method, with urb_int_in
already resubmitted, so handle that case gracefully.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoALSA: hda - Add support for Realtek ALC292
David Henningsson [Wed, 21 Nov 2012 07:57:58 +0000 (08:57 +0100)]
ALSA: hda - Add support for Realtek ALC292

commit af02dde8a609d8d071c4b31a82df811a55690a4a upstream.

We found a new codec ID 292, and that just a simple quirk would enable
sound output/input on this ALC292 chip.

BugLink: https://bugs.launchpad.net/bugs/1081466
Tested-by: Acelan Kao <acelan.kao@canonical.com>
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoALSA: hda - Fix missing beep on ASUS X43U notebook
Duncan Roe [Wed, 10 Oct 2012 12:19:50 +0000 (14:19 +0200)]
ALSA: hda - Fix missing beep on ASUS X43U notebook

commit 7110005e8d5c3cd418fc4b64f9f124f004422a9a upstream.

Signed-off-by: Duncan Roe <duncan_roe@acslink.net.au>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoALSA: hda - Add new codec ALC283 ALC290 support
Kailang Yang [Sat, 6 Oct 2012 15:02:30 +0000 (17:02 +0200)]
ALSA: hda - Add new codec ALC283 ALC290 support

commit 7ff34ad80b7080fafaac8efa9ef0061708eddd51 upstream.

These are compatible with standard ALC269 parser.

Signed-off-by: Kailang Yang <kailang@realtek.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoPM / QoS: fix wrong error-checking condition
Guennadi Liakhovetski [Fri, 23 Nov 2012 19:55:06 +0000 (20:55 +0100)]
PM / QoS: fix wrong error-checking condition

commit a7227a0faa117d0bc532aea546ae5ac5f89e8ed7 upstream.

dev_pm_qos_add_request() can return 0, 1, or a negative error code,
therefore the correct error test is "if (error < 0)." Checking just for
non-zero return code leads to erroneous setting of the req->dev pointer
to NULL, which then leads to a repeated call to
dev_pm_qos_add_ancestor_request() in st1232_ts_irq_handler(). This in turn
leads to an Oops, when the I2C host adapter is unloaded and reloaded again
because of the inconsistent state of its QoS request list.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agosparc64: not any error from do_sigaltstack() should fail rt_sigreturn()
Al Viro [Mon, 19 Nov 2012 03:27:03 +0000 (22:27 -0500)]
sparc64: not any error from do_sigaltstack() should fail rt_sigreturn()

commit fae2ae2a900a5c7bb385fe4075f343e7e2d5daa2 upstream.

If a signal handler is executed on altstack and another signal comes,
we will end up with rt_sigreturn() on return from the second handler
getting -EPERM from do_sigaltstack().  It's perfectly OK, since we
are not asking to change the settings; in fact, they couldn't have been
changed during the second handler execution exactly because we'd been
on altstack all along.  64bit sigreturn on sparc treats any error from
do_sigaltstack() as "SIGSEGV now"; we need to switch to the same semantics
we are using on other architectures.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agojbd: Fix lock ordering bug in journal_unmap_buffer()
Jan Kara [Fri, 23 Nov 2012 13:03:04 +0000 (14:03 +0100)]
jbd: Fix lock ordering bug in journal_unmap_buffer()

commit 25389bb207987b5774182f763b9fb65ff08761c8 upstream.

Commit 09e05d48 introduced a wait for transaction commit into
journal_unmap_buffer() in the case we are truncating a buffer undergoing commit
in the page stradding i_size on a filesystem with blocksize < pagesize. Sadly
we forgot to drop buffer lock before waiting for transaction commit and thus
deadlock is possible when kjournald wants to lock the buffer.

Fix the problem by dropping the buffer lock before waiting for transaction
commit. Since we are still holding page lock (and that is OK), buffer cannot
disappear under us.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agocan: bcm: initialize ifindex for timeouts without previous frame reception
Oliver Hartkopp [Mon, 26 Nov 2012 21:24:23 +0000 (22:24 +0100)]
can: bcm: initialize ifindex for timeouts without previous frame reception

commit 81b401100c01d2357031e874689f89bd788d13cd upstream.

Set in the rx_ifindex to pass the correct interface index in the case of a
message timeout detection. Usually the rx_ifindex value is set at receive
time. But when no CAN frame has been received the RX_TIMEOUT notification
did not contain a valid value.

Reported-by: Andre Naujoks <nautsch2@googlemail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agocan: peak_usb: fix hwtstamp assignment
Oliver Hartkopp [Wed, 21 Nov 2012 21:43:59 +0000 (22:43 +0100)]
can: peak_usb: fix hwtstamp assignment

commit c9faaa09e2a1335678f09c70a0d0eda095564bab upstream.

The skb->tstamp is set to the hardware timestamp when available in the USB
urb message. This leads to user visible timestamps which contain the 'uptime'
of the USB adapter - and not the usual system generated timestamp.

Fix this wrong assignment by applying the available hardware timestamp to the
skb_shared_hwtstamps data structure - which is intended for this purpose.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoradeon: add AGPMode 1 quirk for RV250
Paul Bolle [Mon, 19 Nov 2012 20:17:31 +0000 (21:17 +0100)]
radeon: add AGPMode 1 quirk for RV250

commit 45171002b01b2e2ec4f991eca81ffd8430fd0aec upstream.

The Intel 82855PM host bridge / Mobility FireGL 9000 RV250 combination
in an (outdated) ThinkPad T41 needs AGPMode 1 for suspend/resume (under
KMS, that is). So add a quirk for it.

(Change R250 to RV250 in comment for preceding quirk too.)

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agomac80211: deinitialize ibss-internals after emptiness check
Simon Wunderlich [Tue, 13 Nov 2012 17:43:03 +0000 (18:43 +0100)]
mac80211: deinitialize ibss-internals after emptiness check

commit b78a4932f5fb11fadf41e69c606a33fa6787574c upstream.

The check whether the IBSS is active and can be removed should be
performed before deinitializing the fields used for the check/search.
Otherwise, the configured BSS will not be found and removed properly.

To make it more clear for the future, rename sdata->u.ibss to the
local pointer ifibss which is used within the checks.

This behaviour was introduced by
f3209bea110cade12e2b133da8b8499689cb0e2e
("mac80211: fix IBSS teardown race")

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Cc: Ignacy Gawedzki <i@lri.fr>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agofutex: avoid wake_futex() for a PI futex_q
Darren Hart [Tue, 27 Nov 2012 00:29:56 +0000 (16:29 -0800)]
futex: avoid wake_futex() for a PI futex_q

commit aa10990e028cac3d5e255711fb9fb47e00700e35 upstream.

Dave Jones reported a bug with futex_lock_pi() that his trinity test
exposed.  Sometime between queue_me() and taking the q.lock_ptr, the
lock_ptr became NULL, resulting in a crash.

While futex_wake() is careful to not call wake_futex() on futex_q's with
a pi_state or an rt_waiter (which are either waiting for a
futex_unlock_pi() or a PI futex_requeue()), futex_wake_op() and
futex_requeue() do not perform the same test.

Update futex_wake_op() and futex_requeue() to test for q.pi_state and
q.rt_waiter and abort with -EINVAL if detected.  To ensure any future
breakage is caught, add a WARN() to wake_futex() if the same condition
is true.

This fix has seen 3 hours of testing with "trinity -c futex" on an
x86_64 VM with 4 CPUS.

[akpm@linux-foundation.org: tidy up the WARN()]
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Reported-by: Dave Jones <davej@redat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: John Kacur <jkacur@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agodm: fix deadlock with request based dm and queue request_fn recursion
Jens Axboe [Tue, 6 Nov 2012 11:24:26 +0000 (12:24 +0100)]
dm: fix deadlock with request based dm and queue request_fn recursion

commit a8c32a5c98943d370ea606a2e7dc04717eb92206 upstream.

Request based dm attempts to re-run the request queue off the
request completion path. If used with a driver that potentially does
end_io from its request_fn, we could deadlock trying to recurse
back into request dispatch. Fix this by punting the request queue
run to kblockd.

Tested to fix a quickly reproducible deadlock in such a scenario.

Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agomd/raid10: decrement correct pending counter when writing to replacement.
NeilBrown [Thu, 22 Nov 2012 04:12:09 +0000 (15:12 +1100)]
md/raid10: decrement correct pending counter when writing to replacement.

commit 884162df2aadd7414bef4935e1a54976fd4e3988 upstream.

When a write to a replacement device completes, we carefully
and correctly found the rdev that the write actually went to
and the blithely called rdev_dec_pending on the primary rdev,
even if this write was to the replacement.

This means that any writes to an array while a replacement
was ongoing would cause the nr_pending count for the primary
device to go negative, so it could never be removed.

This bug has been present since replacement was introduced in
3.3, so it is suitable for any -stable kernel since then.

Reported-by: "George Spelvin" <linux@horizon.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agomd: Avoid write invalid address if read_seqretry returned true.
majianpeng [Thu, 8 Nov 2012 00:56:27 +0000 (08:56 +0800)]
md: Avoid write invalid address if read_seqretry returned true.

commit 35f9ac2dcec8f79d7059ce174fd7b7ee3290d620 upstream.

If read_seqretry returned true and bbp was changed, it will write
invalid address which can cause some serious problem.

This bug was introduced by commit v3.0-rc7-130-g2699b67.
So fix is suitable for 3.0.y thru 3.6.y.

Reported-by: zhuwenfeng@kedacom.com
Tested-by: zhuwenfeng@kedacom.com
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agomd: Reassigned the parameters if read_seqretry returned true in func md_is_badblock.
majianpeng [Tue, 6 Nov 2012 09:13:44 +0000 (17:13 +0800)]
md: Reassigned the parameters if read_seqretry returned true in func md_is_badblock.

commit ab05613a0646dcc11049692d54bae76ca9ffa910 upstream.

This bug was introduced by commit(v3.0-rc7-126-g2230dfe).
So fix is suitable for 3.0.y thru 3.6.y.

Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agojffs2: Fix lock acquisition order bug in jffs2_write_begin
Thomas Betker [Wed, 17 Oct 2012 20:59:30 +0000 (22:59 +0200)]
jffs2: Fix lock acquisition order bug in jffs2_write_begin

commit 5ffd3412ae5536a4c57469cb8ea31887121dcb2e upstream.

jffs2_write_begin() first acquires the page lock, then f->sem. This
causes an AB-BA deadlock with jffs2_garbage_collect_live(), which first
acquires f->sem, then the page lock:

jffs2_garbage_collect_live
    mutex_lock(&f->sem)                         (A)
    jffs2_garbage_collect_dnode
        jffs2_gc_fetch_page
            read_cache_page_async
                do_read_cache_page
                    lock_page(page)             (B)

jffs2_write_begin
    grab_cache_page_write_begin
        find_lock_page
            lock_page(page)                     (B)
    mutex_lock(&f->sem)                         (A)

We fix this by restructuring jffs2_write_begin() to take f->sem before
the page lock. However, we make sure that f->sem is not held when
calling jffs2_reserve_space(), as this is not permitted by the locking
rules.

The deadlock above was observed multiple times on an SoC with a dual
ARMv7 (Cortex-A9), running the long-term 3.4.11 kernel; it occurred
when using scp to copy files from a host system to the ARM target
system. The fix was heavily tested on the same target system.

Signed-off-by: Thomas Betker <thomas.betker@rohde-schwarz.com>
Acked-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agomtd: ofpart: Fix incorrect NULL check in parse_ofoldpart_partitions()
Sachin Kamat [Tue, 25 Sep 2012 09:57:13 +0000 (15:27 +0530)]
mtd: ofpart: Fix incorrect NULL check in parse_ofoldpart_partitions()

commit 5a6ea4af0907f995dc06df21a9c9ef764c7cd3bc upstream.

The pointer returned by kzalloc should be tested for NULL
to avoid potential NULL pointer dereference later. Incorrect
pointer was being tested for NULL. Bug introduced by commit fbcf62a3
(mtd: physmap_of: move parse_obsolete_partitions to become separate
parser).
This patch fixes this bug.

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Artem Bityutskiy <artem.bityutskiy@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agomtd: slram: invalid checking of absolute end address
Jiri Engelthaler [Thu, 20 Sep 2012 14:49:50 +0000 (16:49 +0200)]
mtd: slram: invalid checking of absolute end address

commit c36a7ff4578ab6294885aef5ef241aeec4cdb1f0 upstream.

Fixed parsing end absolute address.

Signed-off-by: Jiri Engelthaler <engycz@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoPARISC: fix user-triggerable panic on parisc
Al Viro [Wed, 21 Nov 2012 19:27:23 +0000 (19:27 +0000)]
PARISC: fix user-triggerable panic on parisc

commit 441a179dafc0f99fc8b3a8268eef66958621082e upstream.

int sys32_rt_sigprocmask(int how, compat_sigset_t __user *set, compat_sigset_t __user *oset,
                                    unsigned int sigsetsize)
{
        sigset_t old_set, new_set;
        int ret;

        if (set && get_sigset32(set, &new_set, sigsetsize))

...
static int
get_sigset32(compat_sigset_t __user *up, sigset_t *set, size_t sz)
{
        compat_sigset_t s;
        int r;

        if (sz != sizeof *set) panic("put_sigset32()");

In other words, rt_sigprocmask(69, (void *)69, 69) done by 32bit process
will promptly panic the box.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoPARISC: fix virtual aliasing issue in get_shared_area()
James Bottomley [Fri, 2 Nov 2012 12:30:53 +0000 (12:30 +0000)]
PARISC: fix virtual aliasing issue in get_shared_area()

commit 949a05d03490e39e773e8652ccab9157e6f595b4 upstream.

On Thu, 2012-11-01 at 16:45 -0700, Michel Lespinasse wrote:
> Looking at the arch/parisc/kernel/sys_parisc.c implementation of
> get_shared_area(), I do have a concern though. The function basically
> ignores the pgoff argument, so that if one creates a shared mapping of
> pages 0-N of a file, and then a separate shared mapping of pages 1-N
> of that same file, both will have the same cache offset for their
> starting address.
>
> This looks like this would create obvious aliasing issues. Am I
> misreading this ? I can't understand how this could work good enough
> to be undetected, so there must be something I'm missing here ???

This turns out to be correct and we need to pay attention to the pgoff as
well as the address when creating the virtual address for the area.
Fortunately, the bug is rarely triggered as most applications which use pgoff
tend to use large values (git being the primary one, and it uses pgoff in
multiples of 16MB) which are larger than our cache coherency modulus, so the
problem isn't often seen in practise.

Reported-by: Michel Lespinasse <walken@google.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoALSA: hda - Cirrus: Correctly clear line_out_pins when moving to speaker
David Henningsson [Wed, 21 Nov 2012 09:03:10 +0000 (10:03 +0100)]
ALSA: hda - Cirrus: Correctly clear line_out_pins when moving to speaker

commit 34c3d1926bdaf45d3a891dd577482abcdd9faa34 upstream.

If this array is not cleared, the jack related code later might
fail to create "Internal Speaker Phantom Jack" on Dell Inspiron 3420 and
Dell Vostro 2420.

BugLink: https://bugs.launchpad.net/bugs/1076840
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoALSA: ua101, usx2y: fix broken MIDI output
Clemens Ladisch [Wed, 31 Oct 2012 15:35:30 +0000 (16:35 +0100)]
ALSA: ua101, usx2y: fix broken MIDI output

commit e99ddfde6ae0dd2662bb40435696002b590e4057 upstream.

Commit 88a8516a2128 (ALSA: usbaudio: implement USB autosuspend) added
autosuspend code to all files making up the snd-usb-audio driver.
However, midi.c is part of snd-usb-lib and is also used by other
drivers, not all of which support autosuspend.  Thus, calls to
usb_autopm_get_interface() could fail, and this unexpected error would
result in the MIDI output being completely unusable.

Make it work by ignoring the error that is expected with drivers that do
not support autosuspend.

Reported-by: Colin Fletcher <colin.m.fletcher@googlemail.com>
Reported-by: Devin Venable <venable.devin@gmail.com>
Reported-by: Dr Nick Bailey <nicholas.bailey@glasgow.ac.uk>
Reported-by: Jannis Achstetter <jannis_achstetter@web.de>
Reported-by: Rui Nuno Capela <rncbc@rncbc.org>
Cc: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agodrm/radeon: add new SI pci id
Alex Deucher [Wed, 21 Nov 2012 23:37:38 +0000 (18:37 -0500)]
drm/radeon: add new SI pci id

commit 0181bd5dea2ed0696f84591a92da0b6a1f1a2e62 upstream.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoSCSI: isci: copy fis 0x34 response into proper buffer
Maciej Patelczyk [Mon, 15 Oct 2012 12:29:03 +0000 (14:29 +0200)]
SCSI: isci: copy fis 0x34 response into proper buffer

commit 49bd665c5407a453736d3232ee58f2906b42e83c upstream.

SATA MICROCODE DOWNALOAD fails on isci driver. After receiving Register
Device to Host (FIS 0x34) frame Initiator resets phy.
In the frame handler routine response (FIS 0x34) was copied into wrong
buffer and upper layer did not receive any answer which resulted in
timeout and reset.
This patch corrects this bug.

Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agomwifiex: fix system hang issue in cmd timeout error case
Bing Zhao [Thu, 15 Nov 2012 23:58:47 +0000 (15:58 -0800)]
mwifiex: fix system hang issue in cmd timeout error case

commit b1a47aa5e1e159e2cb06d7dfcc17ef5149b09299 upstream.

Reported by Tim Shepard:
I was seeing sporadic failures (wedgeups), and the majority of those
failures I saw printed the printouts in mwifiex_cmd_timeout_func with
cmd = 0xe5 which is CMD_802_11_HS_CFG_ENH.  When this happens, two
minutes later I get notified that the rtcwake thread is blocked, like
this:
      INFO: task rtcwake:3495 blocked for more than 120 seconds.

To get the hung thread unblocked we wake up the cmd wait queue and
cancel the ioctl.

Reported-by: Tim Shepard <shep@laptop.org>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agomwifiex: report error to MMC core if we cannot suspend
Bing Zhao [Thu, 15 Nov 2012 23:58:48 +0000 (15:58 -0800)]
mwifiex: report error to MMC core if we cannot suspend

commit dd321acddc3be1371263b8c9e6c6f2af89f63d57 upstream.

When host_sleep_config command fails we should return error to
MMC core to indicate the failure for our device.

The misspelled variable is also removed as it's redundant.

Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agortlwifi: rtl8192cu: Add new USB ID
Albert Pool [Tue, 30 Oct 2012 19:58:06 +0000 (20:58 +0100)]
rtlwifi: rtl8192cu: Add new USB ID

commit a485e827f07bfdd0762059386e6e787bed6e81ee upstream.

This is an ISY IWL 2000. Probably a clone of Belkin F7D1102 050d:1102.
Its FCC ID is the same.

Signed-off-by: Albert Pool <albertpool@solcon.nl>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agox86, microcode, AMD: Add support for family 16h processors
Boris Ostrovsky [Thu, 15 Nov 2012 18:41:50 +0000 (13:41 -0500)]
x86, microcode, AMD: Add support for family 16h processors

commit 36c46ca4f322a7bf89aad5462a3a1f61713edce7 upstream.

Add valid patch size for family 16h processors.

[ hpa: promoting to urgent/stable since it is hw enabling and trivial ]

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@amd.com>
Acked-by: Andreas Herrmann <herrmann.der.user@googlemail.com>
Link: http://lkml.kernel.org/r/1353004910-2204-1-git-send-email-boris.ostrovsky@amd.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agox86, efi: Fix processor-specific memcpy() build error
Matt Fleming [Tue, 20 Nov 2012 13:07:46 +0000 (13:07 +0000)]
x86, efi: Fix processor-specific memcpy() build error

commit 0f905a43ce955b638139bd84486194770a6a2c08 upstream.

Building for Athlon/Duron/K7 results in the following build error,

arch/x86/boot/compressed/eboot.o: In function `__constant_memcpy3d':
eboot.c:(.text+0x385): undefined reference to `_mmx_memcpy'
arch/x86/boot/compressed/eboot.o: In function `efi_main':
eboot.c:(.text+0x1a22): undefined reference to `_mmx_memcpy'

because the boot stub code doesn't link with the kernel proper, and
therefore doesn't have access to the 3DNow version of memcpy. So,
follow the example of misc.c and #undef memcpy so that we use the
version provided by misc.c.

See https://bugzilla.kernel.org/show_bug.cgi?id=50391

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Ryan Underwood <nemesis@icequake.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agox86-32: Fix invalid stack address while in softirq
Robert Richter [Mon, 3 Sep 2012 18:54:48 +0000 (20:54 +0200)]
x86-32: Fix invalid stack address while in softirq

commit 1022623842cb72ee4d0dbf02f6937f38c92c3f41 upstream.

In 32 bit the stack address provided by kernel_stack_pointer() may
point to an invalid range causing NULL pointer access or page faults
while in NMI (see trace below). This happens if called in softirq
context and if the stack is empty. The address at &regs->sp is then
out of range.

Fixing this by checking if regs and &regs->sp are in the same stack
context. Otherwise return the previous stack pointer stored in struct
thread_info. If that address is invalid too, return address of regs.

 BUG: unable to handle kernel NULL pointer dereference at 0000000a
 IP: [<c1004237>] print_context_stack+0x6e/0x8d
 *pde = 00000000
 Oops: 0000 [#1] SMP
 Modules linked in:
 Pid: 4434, comm: perl Not tainted 3.6.0-rc3-oprofile-i386-standard-g4411a05 #4 Hewlett-Packard HP xw9400 Workstation/0A1Ch
 EIP: 0060:[<c1004237>] EFLAGS: 00010093 CPU: 0
 EIP is at print_context_stack+0x6e/0x8d
 EAX: ffffe000 EBX: 0000000a ECX: f4435f94 EDX: 0000000a
 ESI: f4435f94 EDI: f4435f94 EBP: f5409ec0 ESP: f5409ea0
  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
 CR0: 8005003b CR2: 0000000a CR3: 34ac9000 CR4: 000007d0
 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
 DR6: ffff0ff0 DR7: 00000400
 Process perl (pid: 4434, ti=f5408000 task=f5637850 task.ti=f4434000)
 Stack:
  000003e8 ffffe000 00001ffc f4e39b00 00000000 0000000a f4435f94 c155198c
  f5409ef0 c1003723 c155198c f5409f04 00000000 f5409edc 00000000 00000000
  f5409ee8 f4435f94 f5409fc4 00000001 f5409f1c c12dce1c 00000000 c155198c
 Call Trace:
  [<c1003723>] dump_trace+0x7b/0xa1
  [<c12dce1c>] x86_backtrace+0x40/0x88
  [<c12db712>] ? oprofile_add_sample+0x56/0x84
  [<c12db731>] oprofile_add_sample+0x75/0x84
  [<c12ddb5b>] op_amd_check_ctrs+0x46/0x260
  [<c12dd40d>] profile_exceptions_notify+0x23/0x4c
  [<c1395034>] nmi_handle+0x31/0x4a
  [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [<c13950ed>] do_nmi+0xa0/0x2ff
  [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [<c13949e5>] nmi_stack_correct+0x28/0x2d
  [<c1029dc5>] ? ftrace_define_fields_irq_handler_entry+0x45/0x45
  [<c1003603>] ? do_softirq+0x4b/0x7f
  <IRQ>
  [<c102a06f>] irq_exit+0x35/0x5b
  [<c1018f56>] smp_apic_timer_interrupt+0x6c/0x7a
  [<c1394746>] apic_timer_interrupt+0x2a/0x30
 Code: 89 fe eb 08 31 c9 8b 45 0c ff 55 ec 83 c3 04 83 7d 10 00 74 0c 3b 5d 10 73 26 3b 5d e4 73 0c eb 1f 3b 5d f0 76 1a 3b 5d e8 73 15 <8b> 13 89 d0 89 55 e0 e8 ad 42 03 00 85 c0 8b 55 e0 75 a6 eb cc
 EIP: [<c1004237>] print_context_stack+0x6e/0x8d SS:ESP 0068:f5409ea0
 CR2: 000000000000000a
 ---[ end trace 62afee3481b00012 ]---
 Kernel panic - not syncing: Fatal exception in interrupt

V2:
* add comments to kernel_stack_pointer()
* always return a valid stack address by falling back to the address
  of regs

Reported-by: Yang Wei <wei.yang@windriver.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Link: http://lkml.kernel.org/r/20120912135059.GZ8285@erda.amd.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jun Zhang <jun.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agortlwifi: rtl8192se: Fix gcc 4.7.x warning
Larry Finger [Wed, 20 Jun 2012 16:47:26 +0000 (11:47 -0500)]
rtlwifi: rtl8192se: Fix gcc 4.7.x warning

commit f761b6947dde42890beea59b020e1be87491809e upstream.

With gcc 4.7.x, the following warning is issued as the routine that sets
the array has the possibility of not initializing the values:

  CC [M]  drivers/net/wireless/rtlwifi/rtl8192se/phy.o
drivers/net/wireless/rtlwifi/rtl8192se/phy.c: In function â€˜rtl92s_phy_set_txpower’:
drivers/net/wireless/rtlwifi/rtl8192se/phy.c:1268:23: warning: â€˜ofdmpowerLevel[0]’ may be used uninitialized in this function [-Wuninitialized]

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoscsi: aha152x: Fix sparse warning and make printing pointer address more portable.
Krzysztof Wilczynski [Wed, 2 May 2012 10:34:01 +0000 (11:34 +0100)]
scsi: aha152x: Fix sparse warning and make printing pointer address more portable.

commit b631cf1f899f9d2e449884dbccc34940637c639f upstream.

This is to change use of "0x%08x" in favour of "%p" as per ../Documentation/printk-formats.txt,
which also takes care about the following warning during compilation time:

  drivers/scsi/aha152x.c: In function â€˜get_command’:
  drivers/scsi/aha152x.c:2987: warning: cast from pointer to integer of different size

Signed-off-by: Krzysztof Wilczynski <krzysztof.wilczynski@linux.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agomvsas: remove unused variable in mvs_task_exec()
Dan Carpenter [Fri, 22 Jun 2012 06:36:35 +0000 (23:36 -0700)]
mvsas: remove unused variable in mvs_task_exec()

commit cca85013ef54f66eb4616e6f3860549a96c8338b upstream.

We don't use "dev" any more after 07ec747a5f ("libsas: remove
ata_port.lock management duties from lldds") and it causes a compile
warning.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Xiangliang Yu <yuxiangl@marvell.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agodrivers/leds/leds-lp5521.c: fix lp5521_read() error handling
Dan Carpenter [Tue, 29 May 2012 22:07:26 +0000 (15:07 -0700)]
drivers/leds/leds-lp5521.c: fix lp5521_read() error handling

commit 5bc9ad774c063f6b41965e7314f2c26aa5e465a0 upstream.

Gcc 4.6.2 complains that:

  drivers/leds/leds-lp5521.c: In function `lp5521_load_program':
  drivers/leds/leds-lp5521.c:214:21: warning: `mode' may be used uninitialized in this function [-Wuninitialized]
  drivers/leds/leds-lp5521.c: In function `lp5521_probe':
  drivers/leds/leds-lp5521.c:788:5: warning: `buf' may be used uninitialized in this function [-Wuninitialized]
  drivers/leds/leds-lp5521.c:740:6: warning: `ret' may be used uninitialized in this function [-Wuninitialized]

These are real problems if lp5521_read() returns an error.  When that
happens we should handle it, instead of ignoring it or doing a bitwise
OR with all the other error codes and continuing.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Milo <Milo.Kim@ti.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Bryan Wu <bryan.wu@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoALSA: snd-usb-caiaq: initialize card pointer
Daniel Mack [Mon, 18 Jun 2012 19:16:31 +0000 (21:16 +0200)]
ALSA: snd-usb-caiaq: initialize card pointer

commit da185443c12f5ef7416af50293833a5654854186 upstream.

Fixes the following warning:

  CC [M]  sound/usb/caiaq/device.o
sound/usb/caiaq/device.c: In function â€˜snd_probe’:
sound/usb/caiaq/device.c:500:16: warning: â€˜card’ may be used
uninitialized in this function [-Wmaybe-uninitialized]

Signed-off-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoBluetooth: Fix using uninitialized option in RFCMode
Szymon Janc [Fri, 8 Jun 2012 09:33:33 +0000 (11:33 +0200)]
Bluetooth: Fix using uninitialized option in RFCMode

commit 8f321f853ea33330c7141977cd34804476e2e07e upstream.

If remote device sends bogus RFC option with invalid length,
undefined options values are used. Fix this by using defaults when
remote misbehaves.

This also fixes the following warning reported by gcc 4.7.0:

net/bluetooth/l2cap_core.c: In function 'l2cap_config_rsp':
net/bluetooth/l2cap_core.c:3302:13: warning: 'rfc.max_pdu_size' may be used uninitialized in this function [-Wmaybe-uninitialized]
net/bluetooth/l2cap_core.c:3266:24: note: 'rfc.max_pdu_size' was declared here
net/bluetooth/l2cap_core.c:3298:25: warning: 'rfc.monitor_timeout' may be used uninitialized in this function [-Wmaybe-uninitialized]
net/bluetooth/l2cap_core.c:3266:24: note: 'rfc.monitor_timeout' was declared here
net/bluetooth/l2cap_core.c:3297:25: warning: 'rfc.retrans_timeout' may be used uninitialized in this function [-Wmaybe-uninitialized]
net/bluetooth/l2cap_core.c:3266:24: note: 'rfc.retrans_timeout' was declared here
net/bluetooth/l2cap_core.c:3295:2: warning: 'rfc.mode' may be used uninitialized in this function [-Wmaybe-uninitialized]
net/bluetooth/l2cap_core.c:3266:24: note: 'rfc.mode' was declared here

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoNVMe: Fix uninitialized iod compiler warning
Keith Busch [Fri, 27 Jul 2012 17:53:28 +0000 (11:53 -0600)]
NVMe: Fix uninitialized iod compiler warning

commit c7d36ab8fa04c213328119a9c0d66985fe204ee5 upstream.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoUBIFS: fix compilation warning
Alexandre Pereira da Silva [Mon, 25 Jun 2012 20:47:49 +0000 (17:47 -0300)]
UBIFS: fix compilation warning

commit 782759b9f5f5223e0962af60c3457c912fab755f upstream.

Fix the following compilation warning:

fs/ubifs/dir.c: In function 'ubifs_rename':
fs/ubifs/dir.c:972:15: warning: 'saved_nlink' may be used uninitialized
in this function

Use the 'uninitialized_var()' macro to get rid of this false-positive.

Artem: massaged the patch a bit.

Signed-off-by: Alexandre Pereira da Silva <aletes.xgr@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoLinux 3.4.20 v3.4.20
Greg Kroah-Hartman [Mon, 26 Nov 2012 19:40:05 +0000 (11:40 -0800)]
Linux 3.4.20

11 years agolibceph: drop declaration of ceph_con_get()
Alex Elder [Thu, 21 Jun 2012 19:49:23 +0000 (12:49 -0700)]
libceph: drop declaration of ceph_con_get()

commit 261030215d970c62f799e6e508e3c68fc7ec2aa9 upstream.

For some reason the declaration of ceph_con_get() and
ceph_con_put() did not get deleted in this commit:
    d59315ca libceph: drop ceph_con_get/put helpers and nref member

Clean that up.

Signed-off-by: Alex Elder <elder@inktank.com>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoRevert "serial: omap: fix software flow control"
Felipe Balbi [Tue, 16 Oct 2012 14:09:22 +0000 (17:09 +0300)]
Revert "serial: omap: fix software flow control"

commit a4f743851f74fc3e0cc40c13082e65c24139f481 upstream.

This reverts commit 957ee7270d632245b43f6feb0e70d9a5e9ea6cf6
(serial: omap: fix software flow control).

As Russell has pointed out, that commit isn't fixing
Software Flow Control at all, and it actually makes
it even more broken.

It was agreed to revert this commit and use Russell's
latest UART patches instead.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Tony Lindgren <tony@atomide.com>
Cc: Andreas Bießmann <andreas.devel@googlemail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoACPI video: Ignore errors after _DOD evaluation.
Igor Murzov [Sat, 13 Oct 2012 00:41:25 +0000 (04:41 +0400)]
ACPI video: Ignore errors after _DOD evaluation.

commit fba4e087361605d1eed63343bb08811f097c83ee upstream.

There are systems where video module known to work fine regardless
of broken _DOD and ignoring returned value here doesn't cause
any issues later. This should fix brightness controls on some laptops.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=47861

Signed-off-by: Igor Murzov <e-mail@date.by>
Reviewed-by: Sergey V <sftp.mtuci@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Abdallah Chatila <abdallah.chatila@ericsson.com>
11 years agoceph: avoid 32-bit page index overflow
Alex Elder [Tue, 2 Oct 2012 15:25:51 +0000 (10:25 -0500)]
ceph: avoid 32-bit page index overflow

(cherry picked from commit 6285bc231277419255f3498d3eb5ddc9f8e7fe79)

A pgoff_t is defined (by default) to have type (unsigned long).  On
architectures such as i686 that's a 32-bit type.  The ceph address
space code was attempting to produce 64 bit offsets by shifting a
page's index by PAGE_CACHE_SHIFT, but the result was not what was
desired because the shift occurred before the result got promoted
to 64 bits.

Fix this by converting all uses of page->index used in this way to
use the page_offset() macro, which ensures the 64-bit result has the
intended value.

This fixes http://tracker.newdream.net/issues/3112

Reported-by: Mohamed Pakkeer <pakkeer.mohideen@realimage.com>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: check for invalid mapping
Sage Weil [Tue, 25 Sep 2012 03:59:48 +0000 (20:59 -0700)]
libceph: check for invalid mapping

(cherry picked from commit d63b77f4c552cc3a20506871046ab0fcbc332609)

If we encounter an invalid (e.g., zeroed) mapping, return an error
and avoid a divide by zero.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoceph: Fix oops when handling mdsmap that decreases max_mds
Yan, Zheng [Thu, 20 Sep 2012 09:42:25 +0000 (17:42 +0800)]
ceph: Fix oops when handling mdsmap that decreases max_mds

(cherry picked from commit 3e8f43a089f06279c5f76a9ccd42578eebf7bfa5)

When i >= newmap->m_max_mds, ceph_mdsmap_get_addr(newmap, i) return
NULL. Passing NULL to memcmp() triggers oops.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: avoid NULL kref_put when osd reset races with alloc_msg
Sage Weil [Wed, 24 Oct 2012 23:12:58 +0000 (16:12 -0700)]
libceph: avoid NULL kref_put when osd reset races with alloc_msg

(cherry picked from commit 9bd952615a42d7e2ce3fa2c632e808e804637a1a)

The ceph_on_in_msg_alloc() method drops con->mutex while it allocates a
message.  If that races with a timeout that resends a zillion messages and
resets the connection, and the ->alloc_msg() method returns a NULL message,
it will call ceph_msg_put(NULL) and BUG.

Fix by only calling put if msg is non-NULL.

Fixes http://tracker.newdream.net/issues/3142

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agorbd: reset BACKOFF if unable to re-queue
Alex Elder [Tue, 9 Oct 2012 03:37:30 +0000 (20:37 -0700)]
rbd: reset BACKOFF if unable to re-queue

(cherry picked from commit 588377d6199034c36d335e7df5818b731fea072c)

If ceph_fault() is unable to queue work after a delay, it sets the
BACKOFF connection flag so con_work() will attempt to do so.

In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't
result in newly-queued work, it simply ignores this condition and
proceeds as if no backoff delay were desired.  There are two
problems with this--one of which is a bug.

The first problem is simply that the intended behavior is to back
off, and if we aren't able queue the work item to run after a delay
we're not doing that.

The only reason queue_delayed_work() won't queue work is if the
provided work item is already queued.  In the messenger, this
means that con_work() is already scheduled to be run again.  So
if we simply set the BACKOFF flag again when this occurs, we know
the next con_work() call will again attempt to hold off activity
on the connection until after the delay.

The second problem--the bug--is a leak of a reference count.  If
queue_delayed_work() returns 0 in con_work(), con->ops->put() drops
the connection reference held on entry to con_work().  However,
processing is (was) allowed to continue, and at the end of the
function a second con->ops->put() is called.

This patch fixes both problems.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: only kunmap kmapped pages
Alex Elder [Fri, 21 Sep 2012 22:59:58 +0000 (17:59 -0500)]
libceph: only kunmap kmapped pages

(cherry picked from commit 5ce765a540f34d1e2005e1210f49f67fdf11e997)

In write_partial_msg_pages(), pages need to be kmapped in order to
perform a CRC-32c calculation on them.  As an artifact of the way
this code used to be structured, the kunmap() call was separated
from the kmap() call and both were done conditionally.  But the
conditions under which the kmap() and kunmap() calls were made
differed, so there was a chance a kunmap() call would be done on a
page that had not been mapped.

The symptom of this was tripping a BUG() in kunmap_high() when
pkmap_count[nr] became 0.

Reported-by: Bryan K. Wright <bryan@virginia.edu>
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: avoid truncation due to racing banners
Jim Schutt [Fri, 10 Aug 2012 17:37:38 +0000 (10:37 -0700)]
libceph: avoid truncation due to racing banners

(cherry picked from commit 6d4221b53707486dfad3f5bfe568d2ce7f4c9863)

Because the Ceph client messenger uses a non-blocking connect, it is
possible for the sending of the client banner to race with the
arrival of the banner sent by the peer.

When ceph_sock_state_change() notices the connect has completed, it
schedules work to process the socket via con_work().  During this
time the peer is writing its banner, and arrival of the peer banner
races with con_work().

If con_work() calls try_read() before the peer banner arrives, there
is nothing for it to do, after which con_work() calls try_write() to
send the client's banner.  In this case Ceph's protocol negotiation
can complete succesfully.

The server-side messenger immediately sends its banner and addresses
after accepting a connect request, *before* actually attempting to
read or verify the banner from the client.  As a result, it is
possible for the banner from the server to arrive before con_work()
calls try_read().  If that happens, try_read() will read the banner
and prepare protocol negotiation info via prepare_write_connect().
prepare_write_connect() calls con_out_kvec_reset(), which discards
the as-yet-unsent client banner.  Next, con_work() calls
try_write(), which sends the protocol negotiation info rather than
the banner that the peer is expecting.

The result is that the peer sees an invalid banner, and the client
reports "negotiation failed".

Fix this by moving con_out_kvec_reset() out of
prepare_write_connect() to its callers at all locations except the
one where the banner might still need to be sent.

[elder@inktak.com: added note about server-side behavior]

Signed-off-by: Jim Schutt <jaschut@sandia.gov>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: delay debugfs initialization until we learn global_id
Sage Weil [Sun, 19 Aug 2012 19:29:16 +0000 (12:29 -0700)]
libceph: delay debugfs initialization until we learn global_id

(cherry picked from commit d1c338a509cea5378df59629ad47382810c38623)

The debugfs directory includes the cluster fsid and our unique global_id.
We need to delay the initialization of the debug entry until we have
learned both the fsid and our global_id from the monitor or else the
second client can't create its debugfs entry and will fail (and multiple
client instances aren't properly reflected in debugfs).

Reported by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: fix crypto key null deref, memory leak
Sylvain Munaut [Thu, 2 Aug 2012 16:12:59 +0000 (09:12 -0700)]
libceph: fix crypto key null deref, memory leak

(cherry picked from commit f0666b1ac875ff32fe290219b150ec62eebbe10e)

Avoid crashing if the crypto key payload was NULL, as when it was not correctly
allocated and initialized.  Also, avoid leaking it.

Signed-off-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: recheck con state after allocating incoming message
Sage Weil [Tue, 31 Jul 2012 01:19:45 +0000 (18:19 -0700)]
libceph: recheck con state after allocating incoming message

(cherry picked from commit 6139919133377652992a5fe134e22abce3e9c25e)

We drop the lock when calling the ->alloc_msg() con op, which means
we need to (a) not clobber con->in_msg without the mutex held, and (b)
we need to verify that we are still in the OPEN state when we retake
it to avoid causing any mayhem.  If the state does change, -EAGAIN
will get us back to con_work() and loop.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: change ceph_con_in_msg_alloc convention to be less weird
Sage Weil [Tue, 31 Jul 2012 01:19:30 +0000 (18:19 -0700)]
libceph: change ceph_con_in_msg_alloc convention to be less weird

(cherry picked from commit 4740a623d20c51d167da7f752b63e2b8714b2543)

This function's calling convention is very limiting.  In particular,
we can't return any error other than ENOMEM (and only implicitly),
which is a problem (see next patch).

Instead, return an normal 0 or error code, and make the skip a pointer
output parameter.  Drop the useless in_hdr argument (we have the con
pointer).

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: avoid dropping con mutex before fault
Sage Weil [Tue, 31 Jul 2012 01:17:13 +0000 (18:17 -0700)]
libceph: avoid dropping con mutex before fault

(cherry picked from commit 8636ea672f0c5ab7478c42c5b6705ebd1db7eb6a)

The ceph_fault() function takes the con mutex, so we should avoid
dropping it before calling it.  This fixes a potential race with
another thread calling ceph_con_close(), or _open(), or similar (we
don't reverify con->state after retaking the lock).

Add annotation so that lockdep realizes we will drop the mutex before
returning.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: verify state after retaking con lock after dispatch
Sage Weil [Tue, 31 Jul 2012 01:16:56 +0000 (18:16 -0700)]
libceph: verify state after retaking con lock after dispatch

(cherry picked from commit 7b862e07b1a4d5c963d19027f10ea78085f27f9b)

We drop the con mutex when delivering a message.  When we retake the
lock, we need to verify we are still in the OPEN state before
preparing to read the next tag, or else we risk stepping on a
connection that has been closed.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: revoke mon_client messages on session restart
Sage Weil [Tue, 31 Jul 2012 01:16:40 +0000 (18:16 -0700)]
libceph: revoke mon_client messages on session restart

(cherry picked from commit 4f471e4a9c7db0256834e1b376ea50c82e345c3c)

Revoke all mon_client messages when we shut down the old connection.
This is mostly moot since we are re-using the same ceph_connection,
but it is cleaner.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: fix handling of immediate socket connect failure
Sage Weil [Tue, 31 Jul 2012 01:16:16 +0000 (18:16 -0700)]
libceph: fix handling of immediate socket connect failure

(cherry picked from commit 8007b8d626b49c34fb146ec16dc639d8b10c862f)

If the connect() call immediately fails such that sock == NULL, we
still need con_close_socket() to reset our socket state to CLOSED.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: clear all flags on con_close
Sage Weil [Sat, 21 Jul 2012 00:30:40 +0000 (17:30 -0700)]
libceph: clear all flags on con_close

Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 43c7427d100769451601b8a36988ac0528ce0124)

11 years agolibceph: clean up con flags
Sage Weil [Sat, 21 Jul 2012 00:29:55 +0000 (17:29 -0700)]
libceph: clean up con flags

(cherry picked from commit 4a8616920860920abaa51193146fe36b38ef09aa)

Rename flags with CON_FLAG prefix, move the definitions into the c file,
and (better) document their meaning.

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: replace connection state bits with states
Sage Weil [Sat, 21 Jul 2012 00:24:40 +0000 (17:24 -0700)]
libceph: replace connection state bits with states

(cherry picked from commit 8dacc7da69a491c515851e68de6036f21b5663ce)

Use a simple set of 6 enumerated values for the socket states (CON_STATE_*)
and use those instead of the state bits.  All of the con->state checks are
now under the protection of the con mutex, so this is safe.  It also
simplifies many of the state checks because we can check for anything other
than the expected state instead of various bits for races we can think of.

This appears to hold up well to stress testing both with and without socket
failure injection on the server side.

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: drop unnecessary CLOSED check in socket state change callback
Sage Weil [Sat, 21 Jul 2012 00:19:43 +0000 (17:19 -0700)]
libceph: drop unnecessary CLOSED check in socket state change callback

(cherry picked from commit d7353dd5aaf22ed611fbcd0d4a4a12fb30659290)

If we are CLOSED, the socket is closed and we won't get these.

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: close socket directly from ceph_con_close()
Sage Weil [Fri, 20 Jul 2012 23:45:49 +0000 (16:45 -0700)]
libceph: close socket directly from ceph_con_close()

(cherry picked from commit ee76e0736db8455e3b11827d6899bd2a4e1d0584)

It is simpler to do this immediately, since we already hold the con mutex.
It also avoids the need to deal with a not-quite-CLOSED socket in con_work.

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: drop gratuitous socket close calls in con_work
Sage Weil [Fri, 20 Jul 2012 22:40:04 +0000 (15:40 -0700)]
libceph: drop gratuitous socket close calls in con_work

(cherry picked from commit 2e8cb10063820af7ed7638e3fd9013eee21266e7)

If the state is CLOSED or OPENING, we shouldn't have a socket.

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: move ceph_con_send() closed check under the con mutex
Sage Weil [Fri, 20 Jul 2012 22:34:04 +0000 (15:34 -0700)]
libceph: move ceph_con_send() closed check under the con mutex

(cherry picked from commit a59b55a602b6c741052d79c1e3643f8440cddd27)

Take the con mutex before checking whether the connection is closed to
avoid racing with someone else closing it.

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: move msgr clear_standby under con mutex protection
Sage Weil [Fri, 20 Jul 2012 22:33:04 +0000 (15:33 -0700)]
libceph: move msgr clear_standby under con mutex protection

(cherry picked from commit 00650931e52e97fe64096bec167f5a6780dfd94a)

Avoid dropping and retaking con->mutex in the ceph_con_send() case by
leaving locking up to the caller.

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: fix fault locking; close socket on lossy fault
Sage Weil [Fri, 20 Jul 2012 22:22:53 +0000 (15:22 -0700)]
libceph: fix fault locking; close socket on lossy fault

(cherry picked from commit 3b5ede07b55b52c3be27749d183d87257d032065)

If we fault on a lossy connection, we should still close the socket
immediately, and do so under the con mutex.

We should also take the con mutex before printing out the state bits in
the debug output.

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: reset connection retry on successfully negotiation
Sage Weil [Mon, 30 Jul 2012 23:22:05 +0000 (16:22 -0700)]
libceph: reset connection retry on successfully negotiation

(cherry picked from commit 85effe183dd45854d1ad1a370b88cddb403c4c91)

We exponentially back off when we encounter connection errors.  If several
errors accumulate, we will eventually wait ages before even trying to
reconnect.

Fix this by resetting the backoff counter after a successful negotiation/
connection with the remote node.  Fixes ceph issue #2802.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: protect ceph_con_open() with mutex
Sage Weil [Mon, 30 Jul 2012 23:21:40 +0000 (16:21 -0700)]
libceph: protect ceph_con_open() with mutex

(cherry picked from commit 5469155f2bc83bb2c88b0a0370c3d54d87eed06e)

Take the con mutex while we are initiating a ceph open.  This is necessary
because the may have previously been in use and then closed, which could
result in a racing workqueue running con_work().

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: (re)initialize bio_iter on start of message receive
Sage Weil [Mon, 30 Jul 2012 23:20:25 +0000 (16:20 -0700)]
libceph: (re)initialize bio_iter on start of message receive

(cherry picked from commit a4107026976f06c9a6ce8cc84a763564ee39d901)

Previously, we were opportunistically initializing the bio_iter if it
appeared to be uninitialized in the middle of the read path.  The problem
is that a sequence like:

 - start reading message
 - initialize bio_iter
 - read half a message
 - messenger fault, reconnect
 - restart reading message
 - ** bio_iter now non-NULL, not reinitialized **
 - read past end of bio, crash

Instead, initialize the bio_iter unconditionally when we allocate/claim
the message for read.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: resubmit linger ops when pg mapping changes
Sage Weil [Mon, 30 Jul 2012 23:19:28 +0000 (16:19 -0700)]
libceph: resubmit linger ops when pg mapping changes

(cherry picked from commit 6194ea895e447fdf4adfd23f67873a32bf4f15ae)

The linger op registration (i.e., watch) modifies the object state.  As
such, the OSD will reply with success if it has already applied without
doing the associated side-effects (setting up the watch session state).
If we lose the ACK and resubmit, we will see success but the watch will not
be correctly registered and we won't get notifies.

To fix this, always resubmit the linger op with a new tid.  We accomplish
this by re-registering as a linger (i.e., 'registered') if we are not yet
registered.  Then the second loop will treat this just like a normal
case of re-registering.

This mirrors a similar fix on the userland ceph.git, commit 5dd68b95, and
ceph bug #2796.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: fix mutex coverage for ceph_con_close
Sage Weil [Mon, 30 Jul 2012 23:24:37 +0000 (16:24 -0700)]
libceph: fix mutex coverage for ceph_con_close

(cherry picked from commit 8c50c817566dfa4581f82373aac39f3e608a7dc8)

Hold the mutex while twiddling all of the state bits to avoid possible
races.  While we're here, make not of why we cannot close the socket
directly.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: report socket read/write error message
Sage Weil [Mon, 30 Jul 2012 23:24:21 +0000 (16:24 -0700)]
libceph: report socket read/write error message

(cherry picked from commit 3a140a0d5c4b9e35373b016e41dfc85f1e526bdb)

We need to set error_msg to something useful before calling ceph_fault();
do so here for try_{read,write}().  This is more informative than

libceph: osd0 192.168.106.220:6801 (null)

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: prevent the race of incoming work during teardown
Guanjun He [Mon, 9 Jul 2012 02:50:33 +0000 (19:50 -0700)]
libceph: prevent the race of incoming work during teardown

(cherry picked from commit a2a3258417eb6a1799cf893350771428875a8287)

Add an atomic variable 'stopping' as flag in struct ceph_messenger,
set this flag to 1 in function ceph_destroy_client(), and add the condition code
in function ceph_data_ready() to test the flag value, if true(1), just return.

Signed-off-by: Guanjun He <gjhe@suse.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: initialize msgpool message types
Sage Weil [Mon, 9 Jul 2012 21:22:34 +0000 (14:22 -0700)]
libceph: initialize msgpool message types

(cherry picked from commit d50b409fb8698571d8209e5adfe122e287e31290)

Initialize the type field for messages in a msgpool.  The caller was doing
this for osd ops, but not for the reply messages.

Reported-by: Alex Elder <elder@inktank.com>
Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: allow sock transition from CONNECTING to CLOSED
Sage Weil [Wed, 27 Jun 2012 19:31:02 +0000 (12:31 -0700)]
libceph: allow sock transition from CONNECTING to CLOSED

(cherry picked from commit fbb85a478f6d4cce6942f1c25c6a68ec5b1e7e7f)

It is possible to close a socket that is in the OPENING state.  For
example, it can happen if ceph_con_close() is called on the con before
the TCP connection is established.  con_work() will come around and shut
down the socket.

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: initialize mon_client con only once
Sage Weil [Wed, 27 Jun 2012 19:24:34 +0000 (12:24 -0700)]
libceph: initialize mon_client con only once

(cherry picked from commit 735a72ef952d42a256f79ae3e6dc1c17a45c041b)

Do not re-initialize the con on every connection attempt.  When we
ceph_con_close, there may still be work queued on the socket (e.g., to
close it), and re-initializing will clobber the work_struct state.

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: set peer name on con_open, not init
Sage Weil [Wed, 27 Jun 2012 19:24:08 +0000 (12:24 -0700)]
libceph: set peer name on con_open, not init

(cherry picked from commit b7a9e5dd40f17a48a72f249b8bbc989b63bae5fd)

The peer name may change on each open attempt, even when the connection is
reused.

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: add some fine ASCII art
Alex Elder [Thu, 21 Jun 2012 02:53:53 +0000 (21:53 -0500)]
libceph: add some fine ASCII art

(cherry picked from commit bc18f4b1c850ab355e38373fbb60fd28568d84b5)

Sage liked the state diagram I put in my commit description so
I'm putting it in with the code.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: small changes to messenger.c
Alex Elder [Mon, 11 Jun 2012 19:57:13 +0000 (14:57 -0500)]
libceph: small changes to messenger.c

(cherry picked from commit 5821bd8ccdf5d17ab2c391c773756538603838c3)

This patch gathers a few small changes in "net/ceph/messenger.c":
  out_msg_pos_next()
    - small logic change that mostly affects indentation
  write_partial_msg_pages().
    - use a local variable trail_off to represent the offset into
      a message of the trail portion of the data (if present)
    - once we are in the trail portion we will always be there, so we
      don't always need to check against our data position
    - avoid computing len twice after we've reached the trail
    - get rid of the variable tmpcrc, which is not needed
    - trail_off and trail_len never change so mark them const
    - update some comments
  read_partial_message_bio()
    - bio_iovec_idx() will never return an error, so don't bother
      checking for it

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: distinguish two phases of connect sequence
Alex Elder [Thu, 24 May 2012 16:55:03 +0000 (11:55 -0500)]
libceph: distinguish two phases of connect sequence

(cherry picked from commit 7593af920baac37752190a0db703d2732bed4a3b)

Currently a ceph connection enters a "CONNECTING" state when it
begins the process of (re-)connecting with its peer.  Once the two
ends have successfully exchanged their banner and addresses, an
additional NEGOTIATING bit is set in the ceph connection's state to
indicate the connection information exhange has begun.  The
CONNECTING bit/state continues to be set during this phase.

Rather than have the CONNECTING state continue while the NEGOTIATING
bit is set, interpret these two phases as distinct states.  In other
words, when NEGOTIATING is set, clear CONNECTING.  That way only
one of them will be active at a time.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: separate banner and connect writes
Alex Elder [Thu, 31 May 2012 16:37:29 +0000 (11:37 -0500)]
libceph: separate banner and connect writes

(cherry picked from commit ab166d5aa3bc036fba7efaca6e4e43a7e9510acf)

There are two phases in the process of linking together the two ends
of a ceph connection.  The first involves exchanging a banner and
IP addresses, and if that is successful a second phase exchanges
some detail about each side's connection capabilities.

When initiating a connection, the client side now queues to send
its information for both phases of this process at the same time.
This is probably a bit more efficient, but it is slightly messier
from a layering perspective in the code.

So rearrange things so that the client doesn't send the connection
information until it has received and processed the response in the
initial banner phase (in process_banner()).

Move the code (in the (con->sock == NULL) case in try_write()) that
prepares for writing the connection information, delaying doing that
until the banner exchange has completed.  Move the code that begins
the transition to this second "NEGOTIATING" phase out of
process_banner() and into its caller, so preparing to write the
connection information and preparing to read the response are
adjacent to each other.

Finally, preparing to write the connection information now requires
the output kvec to be reset in all cases, so move that into the
prepare_write_connect() and delete it from all callers.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: define and use an explicit CONNECTED state
Alex Elder [Wed, 23 May 2012 19:35:23 +0000 (14:35 -0500)]
libceph: define and use an explicit CONNECTED state

(cherry picked from commit e27947c767f5bed15048f4e4dad3e2eb69133697)

There is no state explicitly defined when a ceph connection is fully
operational.  So define one.

It's set when the connection sequence completes successfully, and is
cleared when the connection gets closed.

Be a little more careful when examining the old state when a socket
disconnect event is reported.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: clear NEGOTIATING when done
Alex Elder [Wed, 23 May 2012 19:35:23 +0000 (14:35 -0500)]
libceph: clear NEGOTIATING when done

(cherry picked from commit 3ec50d1868a9e0493046400bb1fdd054c7f64ebd)

A connection state's NEGOTIATING bit gets set while in CONNECTING
state after we have successfully exchanged a ceph banner and IP
addresses with the connection's peer (the server).  But that bit
is not cleared again--at least not until another connection attempt
is initiated.

Instead, clear it as soon as the connection is fully established.
Also, clear it when a socket connection gets prematurely closed
in the midst of establishing a ceph connection (in case we had
reached the point where it was set).

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: clear CONNECTING in ceph_con_close()
Alex Elder [Thu, 21 Jun 2012 02:53:53 +0000 (21:53 -0500)]
libceph: clear CONNECTING in ceph_con_close()

(cherry picked from commit bb9e6bba5d8b85b631390f8dbe8a24ae1ff5b48a)

A connection that is closed will no longer be connecting.  So
clear the CONNECTING state bit in ceph_con_close().  Similarly,
if the socket has been closed we no longer are in connecting
state (a new connect sequence will need to be initiated).

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: don't touch con state in con_close_socket()
Alex Elder [Thu, 21 Jun 2012 02:53:53 +0000 (21:53 -0500)]
libceph: don't touch con state in con_close_socket()

(cherry picked from commit 456ea46865787283088b23a8a7f69244513b95f0)

In con_close_socket(), a connection's SOCK_CLOSED flag gets set and
then cleared while its shutdown method is called and its reference
gets dropped.

Previously, that flag got set only if it had not already been set,
so setting it in con_close_socket() might have prevented additional
processing being done on a socket being shut down.  We no longer set
SOCK_CLOSED in the socket event routine conditionally, so setting
that bit here no longer provides whatever benefit it might have
provided before.

A race condition could still leave the SOCK_CLOSED bit set even
after we've issued the call to con_close_socket(), so we still clear
that bit after shutting the socket down.  Add a comment explaining
the reason for this.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: just set SOCK_CLOSED when state changes
Alex Elder [Thu, 21 Jun 2012 02:53:53 +0000 (21:53 -0500)]
libceph: just set SOCK_CLOSED when state changes

(cherry picked from commit d65c9e0b9eb43d14ece9dd843506ccba06162ee7)

When a TCP_CLOSE or TCP_CLOSE_WAIT event occurs, the SOCK_CLOSED
connection flag bit is set, and if it had not been previously set
queue_con() is called to ensure con_work() will get a chance to
handle the changed state.

con_work() atomically checks--and if set, clears--the SOCK_CLOSED
bit if it was set.  This means that even if the bit were set
repeatedly, the related processing in con_work() only gets called
once per transition of the bit from 0 to 1.

What's important then is that we ensure con_work() gets called *at
least* once when a socket close event occurs, not that it gets
called *exactly* once.

The work queue mechanism already takes care of queueing work
only if it is not already queued, so there's no need for us
to call queue_con() conditionally.

So this patch just makes it so the SOCK_CLOSED flag gets set
unconditionally in ceph_sock_state_change().

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: don't change socket state on sock event
Alex Elder [Thu, 21 Jun 2012 02:53:53 +0000 (21:53 -0500)]
libceph: don't change socket state on sock event

(cherry picked from commit 188048bce311ee41e5178bc3255415d0eae28423)

Currently the socket state change event handler records an error
message on a connection to distinguish a close while connecting from
a close while a connection was already established.

Changing connection information during handling of a socket event is
not very clean, so instead move this assignment inside con_work(),
where it can be done during normal connection-level processing (and
under protection of the connection mutex as well).

Move the handling of a socket closed event up to the top of the
processing loop in con_work(); there's no point in handling backoff
etc. if we have a newly-closed socket to take care of.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: SOCK_CLOSED is a flag, not a state
Alex Elder [Thu, 21 Jun 2012 02:53:53 +0000 (21:53 -0500)]
libceph: SOCK_CLOSED is a flag, not a state

(cherry picked from commit a8d00e3cdef4c1c4f194414b72b24cd995439a05)

The following commit changed it so SOCK_CLOSED bit was stored in
a connection's new "flags" field rather than its "state" field.

    libceph: start separating connection flags from state
    commit 928443cd

That bit is used in con_close_socket() to protect against setting an
error message more than once in the socket event handler function.

Unfortunately, the field being operated on in that function was not
updated to be "flags" as it should have been.  This fixes that
error.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: don't use bio_iter as a flag
Alex Elder [Mon, 11 Jun 2012 19:57:13 +0000 (14:57 -0500)]
libceph: don't use bio_iter as a flag

(cherry picked from commit abdaa6a849af1d63153682c11f5bbb22dacb1f6b)

Recently a bug was fixed in which the bio_iter field in a ceph
message was not being properly re-initialized when a message got
re-transmitted:
    commit 43643528cce60ca184fe8197efa8e8da7c89a037
    Author: Yan, Zheng <zheng.z.yan@intel.com>
    rbd: Clear ceph_msg->bio_iter for retransmitted message

We are now only initializing the bio_iter field when we are about to
start to write message data (in prepare_write_message_data()),
rather than every time we are attempting to write any portion of the
message data (in write_partial_msg_pages()).  This means we no
longer need to use the msg->bio_iter field as a flag.

So just don't do that any more.  Trust prepare_write_message_data()
to ensure msg->bio_iter is properly initialized, every time we are
about to begin writing (or re-writing) a message's bio data.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: move init of bio_iter
Alex Elder [Mon, 11 Jun 2012 19:57:13 +0000 (14:57 -0500)]
libceph: move init of bio_iter

(cherry picked from commit 572c588edadaa3da3992bd8a0fed830bbcc861f8)

If a message has a non-null bio pointer, its bio_iter field is
initialized in write_partial_msg_pages() if this has not been done
already.  This is really a one-time setup operation for sending a
message's (bio) data, so move that initialization code into
prepare_write_message_data() which serves that purpose.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: move init_bio_*() functions up
Alex Elder [Mon, 11 Jun 2012 19:57:13 +0000 (14:57 -0500)]
libceph: move init_bio_*() functions up

(cherry picked from commit df6ad1f97342ebc4270128222e896541405eecdb)

Move init_bio_iter() and iter_bio_next() up in their source file so
the'll be defined before they're needed.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: don't mark footer complete before it is
Alex Elder [Mon, 11 Jun 2012 19:57:13 +0000 (14:57 -0500)]
libceph: don't mark footer complete before it is

(cherry picked from commit fd154f3c75465abd83b7a395033e3755908a1e6e)

This is a nit, but prepare_write_message() sets the FOOTER_COMPLETE
flag before the CRC for the data portion (recorded in the footer)
has been completely computed.  Hold off setting the complete flag
until we've decided it's ready to send.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: encapsulate advancing msg page
Alex Elder [Mon, 11 Jun 2012 19:57:13 +0000 (14:57 -0500)]
libceph: encapsulate advancing msg page

(cherry picked from commit 84ca8fc87fcf4ab97bb8acdb59bf97bb4820cb14)

In write_partial_msg_pages(), once all the data from a page has been
sent we advance to the next one.  Put the code that takes care of
this into its own function.

While modifying write_partial_msg_pages(), make its local variable
"in_trail" be Boolean, and use the local variable "msg" (which is
just the connection's current out_msg pointer) consistently.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: encapsulate out message data setup
Alex Elder [Mon, 11 Jun 2012 19:57:13 +0000 (14:57 -0500)]
libceph: encapsulate out message data setup

(cherry picked from commit 739c905baa018c99003564ebc367d93aa44d4861)

Move the code that prepares to write the data portion of a message
into its own function.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: drop ceph_con_get/put helpers and nref member
Sage Weil [Thu, 21 Jun 2012 19:49:23 +0000 (12:49 -0700)]
libceph: drop ceph_con_get/put helpers and nref member

(cherry picked from commit d59315ca8c0de00df9b363f94a2641a30961ca1c)

These are no longer used.  Every ceph_connection instance is embedded in
another structure, and refcounts manipulated via the get/put ops.

Signed-off-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolibceph: use con get/put methods
Sage Weil [Thu, 21 Jun 2012 19:47:08 +0000 (12:47 -0700)]
libceph: use con get/put methods

(cherry picked from commit 36eb71aa57e6a33d61fd90a2fd87f00c6844bc86)

The ceph_con_get/put() helpers manipulate the embedded con ref
count, which isn't used now that ceph_connections are embedded in
other structures.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>