cfq_cic_link() has race condition. When some processes which shared ioc
issue I/O to same block device simultaneously, cfq_cic_link() returns -EEXIST
sometimes. The race condition might stop I/O by following steps:
step 1: Process A: Issue an I/O to /dev/sda
step 2: Process A: Get an ioc (iocA here) in get_io_context() which does not
linked with a cic for the device
step 3: Process A: Get a new cic for the device (cicA here) in
cfq_alloc_io_context()
step 4: Process B: Issue an I/O to /dev/sda
step 5: Process B: Get iocA in get_io_context() since process A and B share the
same ioc
step 6: Process B: Get a new cic for the device (cicB here) in
cfq_alloc_io_context() since iocA has not been linked with a
cic for the device yet
step 7: Process A: Link cicA to iocA in cfq_cic_link()
step 8: Process A: Dispatch I/O to driver and finish it
step 9: Process B: Try to link cicB to iocA in cfq_cic_link()
But it fails with showing "cfq: cic link failed!" kernel
message, since iocA has already linked with cicA at step 7.
step 10: Process B: Wait for finishig I/O in get_request_wait()
The function does not wake up, when there is no I/O to the
device.
When cfq_cic_link() returns -EEXIST, it means ioc has already linked with cic.
So when cfq_cic_link() return -EEXIST, retry cfq_cic_lookup().
which got bisected down to the stable version of this commit.
Reported-by: Jonathan Nieder <jrnieder@gmail.com> Reported-by: Phil Miller <mille121@illinois.edu> Reported-by: Philip Langdale <philipl@overt.org> Reported-by: Tim Gardner <tim.gardner@canonical.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If the pte mapping in generic_perform_write() is unmapped between
iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic(), the
"copied" parameter to ->end_write can be zero. ext4 couldn't cope with
it with delayed allocations enabled. This skips the i_disksize
enlargement logic if copied is zero and no new data was appeneded to
the inode.
gdb> bt
#0 0xffffffff811afe80 in ext4_da_should_update_i_disksize (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x1\
08000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2467
#1 ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
#2 0xffffffff810d97f1 in generic_perform_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value o\
ptimized out>, pos=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2440
#3 generic_file_buffered_write (iocb=<value optimized out>, iov=<value optimized out>, nr_segs=<value optimized out>, p\
os=0x108000, ppos=0xffff88001e26be40, count=<value optimized out>, written=0x0) at mm/filemap.c:2482
#4 0xffffffff810db5d1 in __generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, ppos=0\
xffff88001e26be40) at mm/filemap.c:2600
#5 0xffffffff810db853 in generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=<value optimi\
zed out>, pos=<value optimized out>) at mm/filemap.c:2632
#6 0xffffffff811a71aa in ext4_file_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, pos=0x108000) a\
t fs/ext4/file.c:136
#7 0xffffffff811375aa in do_sync_write (filp=0xffff88003f606a80, buf=<value optimized out>, len=<value optimized out>, \
ppos=0xffff88001e26bf48) at fs/read_write.c:406
#8 0xffffffff81137e56 in vfs_write (file=0xffff88003f606a80, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x4\
000, pos=0xffff88001e26bf48) at fs/read_write.c:435
#9 0xffffffff8113816c in sys_write (fd=<value optimized out>, buf=0x1ec2960 <Address 0x1ec2960 out of bounds>, count=0x\
4000) at fs/read_write.c:487
#10 <signal handler called>
#11 0x00007f120077a390 in __brk_reservation_fn_dmi_alloc__ ()
#12 0x0000000000000000 in ?? ()
gdb> print offset
$22 = 0xffffffffffffffff
gdb> print idx
$23 = 0xffffffff
gdb> print inode->i_blkbits
$24 = 0xc
gdb> up
#1 ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
2512 if (ext4_da_should_update_i_disksize(page, end)) {
gdb> print start
$25 = 0x0
gdb> print end
$26 = 0xffffffffffffffff
gdb> print pos
$27 = 0x108000
gdb> print new_i_size
$28 = 0x108000
gdb> print ((struct ext4_inode_info *)((char *)inode-((int)(&((struct ext4_inode_info *)0)->vfs_inode))))->i_disksize
$29 = 0xd9000
gdb> down
2467 for (i = 0; i < idx; i++)
gdb> print i
$30 = 0xd44acbee
This is 100% reproducible with some autonuma development code tuned in
a very aggressive manner (not normal way even for knumad) which does
"exotic" changes to the ptes. It wouldn't normally trigger but I don't
see why it can't happen normally if the page is added to swap cache in
between the two faults leading to "copied" being zero (which then
hangs in ext4). So it should be fixed. Especially possible with lumpy
reclaim (albeit disabled if compaction is enabled) as that would
ignore the young bits in the ptes.
Robert Richter [Mon, 12 Dec 2011 23:40:36 +0000 (00:40 +0100)]
oprofile, x86: Fix crash when unloading module (timer mode)
Based on 97f7f81 oprofile, x86: Fix crash when unloading module (nmi timer
mode) upstream.
Fix for stable kernels v2.6.28.y to v2.6.34.y. This patch is for .32.
Oprofile crashs while unlaoding modules and if in timer mode. Timer
mode is the fallback if the architectural initialization fails. The
pointer variable model is then used uninitialzied during exit causing
a NULL pointer dereference.
It can be triggered with kernel parameters oprofile.timer=1 nolapic
used. Happens esp. in virtual machine environments.
Backport for stable kernel v2.6.32.y to v2.6.36.y.
Current oprofile's x86 callgraph support may trigger page faults
throwing the BUG_ON(in_nmi()) message below. This patch fixes this by
using the same nmi-safe copy-from-user code as in perf.
------------[ cut here ]------------
kernel BUG at .../arch/x86/kernel/traps.c:436!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:0a.0/0000:07:00.0/0000:08:04.0/net/eth0/broadcast
CPU 5
Modules linked in:
Clement Lecigne reports a filesystem which causes a kernel oops in
hfs_find_init() trying to dereference sb->ext_tree which is NULL.
This proves to be because the filesystem has a corrupted MDB extent
record, where the extents file does not fit into the first three extents
in the file record (the first blocks).
In hfs_get_block() when looking up the blocks for the extent file
(HFS_EXT_CNID), it fails the first blocks special case, and falls
through to the extent code (which ultimately calls hfs_find_init())
which is in the process of being initialised.
Hfs avoids this scenario by always having the extents b-tree fitting
into the first blocks (the extents B-tree can't have overflow extents).
The fix is to check at mount time that the B-tree fits into first
blocks, i.e. fail if HFS_I(inode)->alloc_blocks >=
HFS_I(inode)->first_blocks
Note, the existing commit 47f365eb57573 ("hfs: fix oops on mount with
corrupted btree extent records") becomes subsumed into this as a special
case, but only for the extents B-tree (HFS_EXT_CNID), it is perfectly
acceptable for the catalog B-Tree file to grow beyond three extents,
with the remaining extent descriptors in the extents overfow.
This fixes CVE-2011-2203
Reported-by: Clement LECIGNE <clement.lecigne@netasq.com> Signed-off-by: Phillip Lougher <plougher@redhat.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Moritz Mühlenhoff <jmm@inutil.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Ok, this isn't optimal, since it means that 'iotop' needs admin
capabilities, and we may have to work on this some more. But at the
same time it is very much not acceptable to let anybody just read
anybody elses IO statistics quite at this level.
Use of the GENL_ADMIN_PERM suggested by Johannes Berg as an alternative
to checking the capabilities by hand.
I hit a J_ASSERT(blocknr != 0) failure in cleanup_journal_tail() when
mounting a fsfuzzed ext3 image. It turns out that the corrupted ext3
image has s_first = 0 in journal superblock, and the 0 is passed to
journal->j_head in journal_reset(), then to blocknr in
cleanup_journal_tail(), in the end the J_ASSERT failed.
So validate s_first after reading journal superblock from disk in
journal_get_superblock() to ensure s_first is valid.
Exactly like roundup_pow_of_two(1), the rounddown version was buggy for
the case of a compile-time constant '1' argument. Probably because it
originated from the same code, sharing history with the roundup version
from before the bugfix (for that one, see commit 1a06a52ee1b0: "Fix
roundup_pow_of_two(1)").
However, unlike the roundup version, the fix for rounddown is to just
remove the broken special case entirely. It's simply not needed - the
generic code
1UL << ilog2(n)
does the right thing for the constant '1' argment too. The only reason
roundup needed that special case was because rounding up does so by
subtracting one from the argument (and then adding one to the result)
causing the obvious problems with "ilog2(0)".
But rounddown doesn't do any of that, since ilog2() naturally truncates
(ie "rounds down") to the right rounded down value. And without the
ilog2(0) case, there's no reason for the special case that had the wrong
value.
tl;dr: rounddown_pow_of_two(1) should be 1, not 0.
Percpu allocator recorded the cpus which map to the first and last
units in pcpu_first/last_unit_cpu respectively and used them to
determine the address range of a chunk - e.g. it assumed that the
first unit has the lowest address in a chunk while the last unit has
the highest address.
This simply isn't true. Groups in a chunk can have arbitrary positive
or negative offsets from the previous one and there is no guarantee
that the first unit occupies the lowest offset while the last one the
highest.
Fix it by actually comparing unit offsets to determine cpus occupying
the lowest and highest offsets. Also, rename pcu_first/last_unit_cpu
to pcpu_low/high_unit_cpu to avoid confusion.
The chunk address range is used to flush cache on vmalloc area
map/unmap and decide whether a given address is in the first chunk by
per_cpu_ptr_to_phys() and the bug was discovered by invalid
per_cpu_ptr_to_phys() translation for crash_note.
Kudos to Dave Young for tracking down the problem.
Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: WANG Cong <xiyou.wangcong@gmail.com> Reported-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com>
LKML-Reference: <4EC21F67.10905@redhat.com> Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This fixes the A->B/B->A locking dependency, see the warning below.
The function task_exit_notify() is called with (task_exit_notifier)
.rwsem set and then calls sync_buffer() which locks buffer_mutex. In
sync_start() the buffer_mutex was set to prevent notifier functions to
be started before sync_start() is finished. But when registering the
notifier, (task_exit_notifier).rwsem is locked too, but now in
different order than in sync_buffer(). In theory this causes a locking
dependency, what does not occur in practice since task_exit_notify()
is always called after the notifier is registered which means the lock
is already released.
However, after checking the notifier functions it turned out the
buffer_mutex in sync_start() is unnecessary. This is because
sync_buffer() may be called from the notifiers even if sync_start()
did not finish yet, the buffers are already allocated but empty. No
need to protect this with the mutex.
So we fix this theoretical locking dependency by removing buffer_mutex
in sync_start(). This is similar to the implementation before commit:
750d857 oprofile: fix crash when accessing freed task structs
which introduced the locking dependency.
Lockdep warning:
oprofiled/4447 is trying to acquire lock:
(buffer_mutex){+.+...}, at: [<ffffffffa0000e55>] sync_buffer+0x31/0x3ec [oprofile]
but task is already holding lock:
((task_exit_notifier).rwsem){++++..}, at: [<ffffffff81058026>] __blocking_notifier_call_chain+0x39/0x67
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Carl Love <carll@us.ibm.com> Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
After registering the task free notifier we possibly have tasks in our
dying_tasks list. Free them after unregistering the notifier in case
of an error.
Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The function setup_vpif_input_channel_mode() used the VSCLKDIS register
instead of VIDCLKCTL. This meant that when in HD mode videoport channel 0
used a different clock from channel 1.
Clearly a copy-and-paste error.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Manjunath Hadli <manjunath.hadli@ti.com> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When no imux is available (e.g. a single capture source),
alc_auto_init_input_src() may trigger an Oops due to the access to -1.
Add a proper zero-check to avoid it.
There are some AC97 codec and board combinations that have been observed
to take a very long time to respond after the cold reset has completed.
In one case, more than 350 ms was required. To allow users to have sound
on those platforms, we'll wait up to 500ms for the codec to become
ready.
As a board may have multiple codecs, with some faster than others to
reset, we add a module parameter to inform the driver which codecs
should be present.
If a device is shutdown, then there might be a pending interrupt,
which will be processed after we reenable interrupts, which causes the
original handler to be run. If the old handler is the (broadcast)
periodic handler the shutdown state might hang the kernel completely.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In irq_wait_for_interrupt(), the should_stop member is verified before
setting the task's state to TASK_INTERRUPTIBLE and calling schedule().
In case kthread_stop sets should_stop and wakes up the process after
should_stop is checked by the irq thread but before the task's state
is changed, the irq thread might never exit:
If oprofile uses the nmi timer interrupt there is a crash while
unloading the module. The bug can be triggered with oprofile build as
module and kernel parameter nolapic set. This patch fixes this.
oprofile: using NMI timer interrupt.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
PGD 42dbca067 PUD 41da6a067 PMD 0
Oops: 0002 [#1] PREEMPT SMP
CPU 5
Modules linked in: oprofile(-) [last unloaded: oprofile]
In commit f8924e770e04 ("x86: unify mp_bus_info"), the 32-bit
and 64-bit versions of MP_bus_info were rearranged to match each
other better. Unfortunately it introduced a regression: prior
to that change we used to always set the mp_bus_not_pci bit,
then clear it if we found a PCI bus. After it, we set
mp_bus_not_pci for ISA buses, clear it for PCI buses, and leave
it alone otherwise.
In the cases of ISA and PCI, there's not much difference. But
ISA is not the only non-PCI bus, so it's better to always set
mp_bus_not_pci and clear it only for PCI.
Without this change, Dan's Dell PowerEdge 4200 panics on boot
with a log indicating interrupt routing trouble unless the
"noapic" option is supplied. With this change, the machine
boots reliably without "noapic".
In hundreds of days, the __cycles_2_ns calculation in sched_clock
has an overflow. cyc * per_cpu(cyc2ns, cpu) exceeds 64 bits, causing
the final value to become zero. We can solve this without losing
any precision.
We can decompose TSC into quotient and remainder of division by the
scale factor, and then use this to convert TSC into nanoseconds.
Signed-off-by: Salman Qazi <sqazi@google.com> Acked-by: John Stultz <johnstul@us.ibm.com> Reviewed-by: Paul Turner <pjt@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20111115221121.7262.88871.stgit@dungbeetle.mtv.corp.google.com Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The stable release 2.6.32.32 added the upstream commit 12fed00de963433128b5366a21a55808fab2f756. However, one of the hunks of
the original patch seems missing from the stable backport which can be
found here:
http://permalink.gmane.org/gmane.linux.kernel.stable/5676
This hunk corresponds to the change in is_valid_oplock_break() at
fs/cifs/misc.c.
This patch backports the missing hunk and is against
linux-2.6.32.y stable kernel.
Cc: Steve French <sfrench@us.ibm.com> Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When we tear down a device we try to flush all outstanding
commands in scsi_free_queue(). However the check in
scsi_request_fn() is imperfect as it only signals that
we _might start_ aborting commands, not that we've actually
aborted some.
So move the printk inside the scsi_kill_request function,
this will also give us a hint about which commands are aborted.
Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Interrupts must be disabled prior to calling usb_hcd_unlink_urb_from_ep.
If interrupts are not disabled, it can potentially lead to a deadlock.
The deadlock is readily reproduceable on a slower (ARM based) device
such as the TI Pandaboard.
vlan: Centralize handling of hardware acceleration.
The bulk of that commit is a rework of the hardware assisted vlan tagging
driver interface, and as such doesn't classify for -stable inclusion. The fix
that is needed is a part of that commit but can work independently of the
rest.
This patch can avoid panics on the 2.6.32.y -stable kernels and is in the same
spirit as mainline commits 66c46d7 gro: Reset dev pointer on reuse 6d152e2 gro: reset skb_iif on reuse
which are already in -stable.
For drivers using the vlan_gro_frags() interface, a packet with an invalid tci
leads to GRO_DROP and napi_reuse_skb(). The skb has to be sanitized before
being reused or we may send an skb with an invalid vlan_tci field up the stack
where it is not expected.
Signed-off-by: Benjamin Poirier <bpoirier@suse.de> Cc: Jesse Gross <jesse@nicira.com> Acked-by: David S. Miller <davem@davemloft.net>
priv->work must not be synced while priv->mutex is locked, because
the mutex is taken in the work handler.
Move cancel_work_sync down to after the device shutdown code.
This is safe, because the work handler checks fw_state and bails out
early in case of a race.
Signed-off-by: Michael Buesch <m@bues.ch> Acked-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The tx_lock is not initialized properly. Add spin_lock_init().
Signed-off-by: Michael Buesch <m@bues.ch> Acked-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ktime_get and ktime_get_ts were calling timekeeping_get_ns()
but later they were not calling arch_gettimeoffset() so architectures
using this mechanism returned 0 ns when calling these functions.
This happened for example when running Busybox's ping which calls
syscall(__NR_clock_gettime, CLOCK_MONOTONIC, ts) which eventually
calls ktime_get. As a result the returned ping travel time was zero.
By returning '0' instead of 'EAGAIN' when the tests in xs_nospace() fail
to find evidence of socket congestion, we are making the RPC engine believe
that the message was incorrectly sent and so it disconnects the socket
instead of just retrying.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Tested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 6175ddf06b6172046a329e3abfd9c901a43efd2e optimized the mem*io
functions that have been used to send commands to the device. these
optimizations somehow corrupted the communication with the lx6464es,
that resulted the device to be unusable with kernels after 2.6.33.
this patch emulates the memcpy_*_io functions via a loop to avoid these
problems.
This patch implements a workaround for PL310 erratum 769419. On
revisions of the PL310 prior to r3p2, the Store Buffer does not
automatically drain. This can cause normal, non-cacheable writes to be
retained when the memory system is idle, leading to suboptimal I/O
performance for drivers using coherent DMA.
This patch adds an optional wmb() call to the cpu_idle loop. On systems
with an outer cache, this causes an explicit flush of the store buffer.
Acked-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Previously we claimed device ID 0x7450, regardless of the vendor, which is
clearly wrong. Now we'll claim that device ID only for AMD.
I suspect this was just a typo in the original code, but it's possible this
change will break shpchp on non-7450 AMD bridges. If so, we'll have to fix
them as we find them.
Characters with ASCII values greater than the size of
filename_rev_map[] are valid filename characters.
ecryptfs_decode_from_filename() will access kernel memory beyond
that array, and ecryptfs_parse_tag_70_packet() will then decrypt
those characters. The attacker, using the FNEK of the crafted file,
can then re-encrypt the characters to reveal the kernel memory past
the end of the filename_rev_map[] array. I expect low security
impact since this array is statically allocated in the text area,
and the amount of memory past the array that is accessible is
limited by the largest possible ASCII filename character.
This patch solves the issue reported by mhalcrow but with an
implementation suggested by Linus to simply extend the length of
filename_rev_map[] to 256. Characters greater than 0x7A are mapped to
0x00, which is how invalid characters less than 0x7A were previously
being handled.
Dan Rosenberg noted that various drivers return the struct with uncleared
fields. Instead of spending forever trying to stomp all the drivers that
get it wrong (and every new driver) do the job in one place.
This first patch adds the needed operations and hooks them up, including
the needed USB midlayer and serial core plumbing.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
The 8020i protocol (also 8070i and QIC-157) uses 12-byte commands;
shorter commands must be padded. Simon Detheridge reports that his
3-TB USB disk drive claims to use the 8020i protocol (which is
normally meant for ATAPI devices like CD drives), and because of its
large size, the disk drive requires the use of 16-byte commands.
However the usb_stor_pad12_command() routine in usb-storage always
sets the command length to 12, making the drive impossible to use.
Since the SFF-8020i specification allows for 16-byte commands in
future extensions, we may as well accept them. This patch (as1490)
changes usb_stor_pad12_command() to leave commands larger than 12
bytes alone rather than truncating them.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Tested-by: Simon Detheridge <simon@widgit.com> CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ftdi_set_termios() is constantly setting the baud rate, data bits and parity
unnecessarily on every call, . When called while characters are being
transmitted can cause the FTDI chip to corrupt the serial port bit stream
output by stalling the output half a bit during the output of a character.
Simple fix by skipping this setting if the baud rate/data bits/parity are
unchanged.
Signed-off-by: Andrew Worsley <amworsley@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I get report from customer that his usb-serial
converter doesn't work well,it sometimes work,
but sometimes it doesn't.
The usb-serial converter's id:
vendor_id product_id
0x4348 0x5523
Then I search the usb-serial codes, and there are
two drivers announce support this device, pl2303
and ch341, commit 026dfaf1 cause it. Through many
times to test, ch341 works well with this device,
and pl2303 doesn't work quite often(it just work quite little).
ch341 works well with this device, so we doesn't
need pl2303 to support.I try to revert 026dfaf1 first,
but it failed. So I prepare this patch by hand to revert it.
Signed-off-by: Wang YanQing <Udknight@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Starting with 4.4, gcc will happily accept -Wno-<anything> in the
cc-option test and complain later when compiling a file that has some
other warning. This rather unexpected behavior is intentional as per
http://gcc.gnu.org/PR28322, so work around it by testing for support of
the opposite option (without the no-). Introduce a new Makefile function
cc-disable-warning that does this and update two uses of cc-option in
the toplevel Makefile.
Reported-and-tested-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Michal Marek <mmarek@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
At this point, skb->data points to skb_transport_header.
So, headroom check is wrong.
For some case:bridge(UFO is on) + eth device(UFO is off),
there is no enough headroom for IPv6 frag head.
But headroom check is always false.
This will bring about data be moved to there prior to skb->head,
when adding IPv6 frag header to skb.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The /proc/vmallocinfo shows information about vmalloc allocations in vmlist
that is a linklist of vm_struct. It, however, may access pages field of
vm_struct where a page was not allocated. This results in a null pointer
access and leads to a kernel panic.
Why this happen:
In __vmalloc_node() called from vmalloc(), newly allocated vm_struct
is added to vmlist at __get_vm_area_node() and then, some fields of
vm_struct such as nr_pages and pages are set at __vmalloc_area_node(). In
other words, it is added to vmlist before it is fully initialized. At the
same time, when the /proc/vmallocinfo is read, it accesses the pages field
of vm_struct according to the nr_pages field at show_numa_info(). Thus, a
null pointer access happens.
Patch:
This patch adds newly allocated vm_struct to the vmlist *after* it is fully
initialized. So, it can avoid accessing the pages field with unallocated
page when show_numa_info() is called.
Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Rientjes <rientjes@google.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This adds a mechanism to resume selected IRQs during syscore_resume
instead of dpm_resume_noirq.
Under Xen we need to resume IRQs associated with IPIs early enough
that the resched IPI is unmasked and we can therefore schedule
ourselves out of the stop_machine where the suspend/resume takes
place.
This issue was introduced by 676dc3cf5bc3 "xen: Use IRQF_FORCE_RESUME".
Back ported to 2.6.32 (which lacks syscore support) by calling the relavant
resume function directly from sysdev_resume).
v2: Fixed non-x86 build errors.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/1318713254.11016.52.camel@dagon.hellion.org.uk Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Move the access control up from the fast paths, which are no longer
universally taken first, up into the caller. This then duplicates some
sanity checking along the slow paths, but is much simpler.
Tracked as CVE-2010-2962.
Reported-by: Kees Cook <kees@ubuntu.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It was wrong included in 2.6.32 stable (was intended for 2.6.38+ in the
original commit changelog in Linus tree), and causes a regression on
2.6.32 (https://launchpad.net/bugs/875300).
Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This example file uses the old V4L1 API. It also doesn't use libv4l.
So, it is completely obsolete. A good example already exists at
v4l-utils (v4l2grab.c):
http://git.linuxtv.org/v4l-utils.git
When the number of failed devices exceeds the allowed number
we must abort any active parity operations (checks or updates) as they
are no longer meaningful, and can lead to a BUG_ON in
handle_parity_checks6.
Reported-by: Chris Paulson-Ellis <chris@edesix.com> Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Disable the new -Wunused-but-set-variable that was added in gcc 4.6.0
It produces more false positives than useful warnings.
This can still be enabled using W=1
[gregkh - No it can not for 2.6.32, but we don't care]
Signed-off-by: Dave Jones <davej@redhat.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Tested-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Michal Marek <mmarek@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In enter_state() we use "state" as an offset for the pm_states[]
array. The pm_states[] array only has PM_SUSPEND_MAX elements so
this test is off by one.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On writes in MODE_RAW the mtd_oob_ops struct is not sufficiently
initialized which may cause nandwrite to fail. With this patch
it is possible to write raw nand/oob data without additional ECC
(either for testing or when some sectors need different oob layout
e.g. bootloader) like
nandwrite -n -r -o /dev/mtd0 <myfile>
L2TP for example uses NLA_MSECS like this:
policy:
[L2TP_ATTR_RECV_TIMEOUT] = { .type = NLA_MSECS, },
code:
if (info->attrs[L2TP_ATTR_RECV_TIMEOUT])
cfg.reorder_timeout = nla_get_msecs(info->attrs[L2TP_ATTR_RECV_TIMEOUT]);
As nla_get_msecs() is essentially nla_get_u64() plus the
conversion to a HZ-based value, this will not properly
reject attributes from userspace that aren't long enough
and might overrun the message.
Add NLA_MSECS to the attribute minlen array to check the
size properly.
Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The sunrpc layer keeps a cache of recently used credentials and
'unx_match' is used to find the credential which matches the current
process.
However unx_match allows a match when the cached credential has extra
groups at the end of uc_gids list which are not in the process group list.
So if a process with a list of (say) 4 group accesses a file and gains
access because of the last group in the list, then another process
with the same uid and gid, and a gid list being the first tree of the
gids of the original process tries to access the file, it will be
granted access even though it shouldn't as the wrong rpc credential
will be used.
Make sure that SCSI device removal via scsi_remove_host() does finish
all pending SCSI commands. Currently that's not the case and hence
removal of a SCSI host during I/O can cause a deadlock. See also
"blkdev_issue_discard() hangs forever if underlying storage device is
removed" (http://bugzilla.kernel.org/show_bug.cgi?id=40472). See also
http://lkml.org/lkml/2011/8/27/6.
Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The call to complete() in st_scsi_execute_end() wakes up sleeping thread
in write_behind_check(), which frees the st_request, thus invalidating
the pointer to the associated bio structure, which is then passed to the
blk_rq_unmap_user(). Fix by storing pointer to bio structure into
temporary local variable.
This bug is present since at least linux-2.6.32.
Signed-off-by: Petr Uzel <petr.uzel@suse.cz> Reported-by: Juergen Groß <juergen.gross@ts.fujitsu.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
arch/powerpc/sysdev/mpic.c: In function 'irq_choose_cpu':
arch/powerpc/sysdev/mpic.c:574: error: passing argument 1 of '__cpus_equal' from incompatible pointer type
Reported-by: Jiri Slaby <jslaby@suse.cz> Cc: Jiajun Wu <b06378@freescale.com> Cc: Li Yang <leoli@freescale.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It breaks the build and probably shouldn't be in the 2.6.32 kernel
Reported-by: Stefan Bader <stefan.bader@canonical.com> Cc: Chris Paulson-Ellis <chris@edesix.com> Cc: Axel Lin <axel.lin@gmail.com> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This doesn't make much sense, and it exposes a bug in the kernel where
attempts to create a new file in an append-only directory using
O_CREAT will fail (but still leave a zero-length file). This was
discovered when xfstests #79 was generalized so it could run on all
file systems.
The reason is that it forgot to mark dirty when splitting two extents in
ext4_ext_convert_to_initialized(). Althrough ex has been updated in
memory, it is not dirtied both in ext4_ext_convert_to_initialized() and
ext4_ext_insert_extent(). The disk layout is corrupted. Then it will
meet with a BUG_ON() when writting at the start of that extent again.
Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Xiaoyun Mao <xiaoyun.maoxy@aliyun-inc.com> Cc: Yingbin Wang <yingbin.wangyb@aliyun-inc.com> Cc: Jia Wan <jia.wanj@aliyun-inc.com> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
tc_fill_qdisc() should not be called for builtin qdisc, or it
dereference a NULL pointer to get device ifindex.
Fix is to always use tc_qdisc_dump_ignore() before calling
tc_fill_qdisc().
Reported-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When one of the SSID's length passed in a scan or sched_scan request
is larger than 255, there will be an overflow in the u8 that is used
to store the length before checking. This causes the check to fail
and we overrun the buffer when copying the SSID.
Fix this by checking the nl80211 attribute length before copying it to
the struct.
A remote user can provide a small value for the command size field in
the command header of an l2cap configuration request, resulting in an
integer underflow when subtracting the size of the configuration request
header. This results in copying a very large amount of data via
memcpy() and destroying the kernel heap. Check for underflow.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit a626ca6a6564 ("vm: fix vm_pgoff wrap in stack expansion") fixed
the case of an expanding mapping causing vm_pgoff wrapping when you had
downward stack expansion. But there was another case where IA64 and
PA-RISC expand mappings: upward expansion.
Commit 982134ba6261 ("mm: avoid wrapping vm_pgoff in mremap()") fixed
the case of a expanding mapping causing vm_pgoff wrapping when you used
mremap. But there was another case where we expand mappings hiding in
plain sight: the automatic stack expansion.
This fixes that case too.
This one also found by Robert Święcki, using his nasty system call
fuzzer tool. Good job.
If the NLM daemon is killed on the NFS server, we can currently end up
hanging forever on an 'unlock' request, instead of aborting. Basically,
if the rpcbind request fails, or the server keeps returning garbage, we
really want to quit instead of retrying.
All of those are rw-r--r-- and all are broken for suid - if you open
a file before the target does suid-root exec, you'll be still able
to access it. For personality it's not a big deal, but for syscall
and stack it's a real problem.
Fix: check that task is tracable for you at the time of read().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Lower SCM_MAX_FD from 255 to 253 so that allocations for scm_fp_list are
halved. (commit f8d570a4 added two pointers in this structure)
scm_fp_dup() should not copy whole structure (and trigger kmemcheck
warnings), but only the used part. While we are at it, only allocate
needed size.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The newer Lenovo ThinkPads have HKEY HID of LEN0068 instead
of IBM0068. Added new HID so that thinkpad_acpi module will
auto load on these newer Lenovo ThinkPads.
Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: stable@vger.kernel.org Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com> Signed-off-by: Andy Lutomirski <luto@mit.edu> Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Florian Fainelli [Mon, 17 Oct 2011 17:47:44 +0000 (19:47 +0200)]
watchdog: mtx1-wdt: fix build failure
Commit 6d86a0ee (watchdog: mtx1-wdt: request gpio before using it) was
backported from upstream. The patch is using a gpiolib call which is only
available in kernel 2.6.34+. Fix build by using the "old" gpiolib API
instead.
This adds a mechanism to resume selected IRQs during syscore_resume
instead of dpm_resume_noirq.
Under Xen we need to resume IRQs associated with IPIs early enough
that the resched IPI is unmasked and we can therefore schedule
ourselves out of the stop_machine where the suspend/resume takes
place.
This issue was introduced by 676dc3cf5bc3 "xen: Use IRQF_FORCE_RESUME".
Back ported to 2.6.32 (which lacks syscore support) by calling the relavant
resume function directly from sysdev_resume).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/1318713254.11016.52.camel@dagon.hellion.org.uk Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
vcpu->last_guest_tsc is updated in vcpu_enter_guest() and kvm_arch_vcpu_put()
by getting the last value of the TSC from the guest.
On reset, the SeaBIOS resets the TSC to 0, which triggers a bug on the next
call to kvm_write_guest_time(): Since vcpu->hw_clock.tsc_timestamp still
contains the old value before the reset, "max_kernel_ns = vcpu->last_guest_tsc
- vcpu->hw_clock.tsc_timestamp" gets negative. Since the variable is u64, it
gets translated to a large positive value.
For completeness, here are the values for my 3 GHz CPU:
vcpu->hv_clock.tsc_shift =-1
vcpu->hv_clock.tsc_to_system_mul =2_863_019_502
This makes the guest kernel crawl very slowly when clocksource=kvmclock is
used: sleeps take way longer than expected and don't match wall clock any more.
The times printed with printk() don't match real time and the reboot often
stalls for long times.
In linux-git this isn't a problem, since on every MSR_IA32_TSC write
vcpu->arch.hv_clock.tsc_timestamp is reset to 0, which disables above logic.
The code there is only in arch/x86/kvm/x86.c, since much of the kvm-clock
related code has been refactured for 2.6.37: 99e3e30a arch/x86/kvm/x86.c
(Zachary Amsden 2010-08-19 22:07:17 -1000 1084)
vcpu->arch.hv_clock.tsc_timestamp = 0;