While we are never normally passed an instruction that exceeds 15 bytes,
smp games can cause us to attempt to interpret one, which will cause
large latencies in non-preempt hosts.
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1274) simplifies the counting of transaction-error
retries. Now we will count up from 0 to QH_XACTERR_MAX instead of
down from QH_XACTERR_MAX to 0.
The patch also fixes a small bug: qh->xacterr was not getting
initialized for interrupt endpoints.
I notice that the processcompl_compat() function seems to be leaking the
'struct async *as' in the error paths.
I think that the calling convention is fundamentally buggered. The
caller is the one that did the "reap_as()" to get the as thing, the
caller should be the one to free it too.
Freeing it in the caller also means that it very clearly always gets
freed, and avoids the need for any "free in the error case too".
From: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Marcus Meissner <meissner@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When controlling an industrial radio modem it can be necessary to
manipulate the handshake lines in order to control the radio modem's
transmitter, from userspace.
The transmitter should not be turned off before all characters have been
transmitted. serial8250_tx_empty() was reporting that all characters were
transmitted before they actually were.
===
Discovered in parallel with more testing and analysis by Kees Schoenmakers
as follows:
I ran into an NetMos 9835 serial pci board which behaves a little
different than the standard. This type of expansion board is very common.
"Standard" 8250 compatible devices clear the 'UART_LST_TEMT" bit together
with the "UART_LSR_THRE" bit when writing data to the device.
The NetMos device does it slightly different
I believe that the TEMT bit is coupled to the shift register. The problem
is that after writing data to the device and very quickly after that one
does call serial8250_tx_empty, it returns the wrong information.
My patch makes the test more robust (and solves the problem) and it does
not affect the already correct devices.
Alan:
We may yet need to quirk this but now we know which chips we have a
way to do that should we find this breaks some other 8250 clone with
dodgy THRE.
Signed-off-by: Dick Hollenbeck <dick@softplc.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: Kees Schoenmakers <k.schoenmakers@sigmae.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
dev_dbg outputs dev_name, which is released with device_unregister. This bug
resulted in output like this:
i2c Xy2�0: adapter [SMBus I801 adapter at 1880] unregistered
The right output would be:
i2c i2c-0: adapter [SMBus I801 adapter at 1880] unregistered
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
While running 20 parallel instances of dd as follows:
#!/bin/bash
for i in `seq 1 20`; do
dd if=/dev/zero of=/export/hda3/dd_$i bs=1073741824 count=1 &
done
wait
on a 16G machine, we noticed that rather than just killing the processes,
the entire kernel went down. Stracing dd reveals that it first does an
mmap2, which makes 1GB worth of zero page mappings. Then it performs a
read on those pages from /dev/zero, and finally it performs a write.
The machine died during the reads. Looking at the code, it was noticed
that /dev/zero's read operation had been changed by 557ed1fa2620dc119adb86b34c614e152a629a80 ("remove ZERO_PAGE") from giving
zero page mappings to actually zeroing the page.
The zeroing of the pages causes physical pages to be allocated to the
process. But, when the process exhausts all the memory that it can, the
kernel cannot kill it, as it is still in the kernel mode allocating more
memory. Consequently, the kernel eventually crashes.
To fix this, I propose that when a fatal signal is pending during
/dev/zero read operation, we simply return and let the user process die.
Signed-off-by: Salman Qazi <sqazi@google.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Modified error return and comment trivially. - Linus] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Lin Ming reported a 10% OLTP regression against 2.6.27-rc4.
The difference seems to come from different preemption agressiveness,
which affects the cache footprint of the workload and its effective
cache trashing.
Aggresively preempt a task if its avg overlap is very small, this should
avoid the task going to sleep and find it still running when we schedule
back to it - saving a wakeup.
Reported-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The SKY2_HW_RAM_BUFFER bit in hw->flags was checked in sky2_mac_init(),
before being set later in sky2_up().
Setting SKY2_HW_RAM_BUFFER in sky2_init() where other hw->flags are set
should avoid this problem recurring.
Signed-off-by: Mike McCormack <mikem@ring3k.org> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Just like ip_fast_csum, the assembly snippet in csum_ipv6_magic needs a
memory clobber, as it is only passed the address of the buffer, not a
memory reference to the buffer itself.
This caused failures in Hurd's pfinetv4 when we tried to compile it with
gcc-4.3 (bogus checksums).
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: "David S. Miller" <davem@davemloft.net> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This appeared after I enabled CONFIG_TCP_MD5SIG and played with it a
bit, so I looked at what might have caused it.
One thing that struck me as strange is tcp_twsk_destructor(), as it
calls tcp_put_md5sig_pool() -- which entails a put_cpu(), causing the
detected imbalance. Found on 2.6.23.9, but 2.6.31 is affected as well,
as far as I can tell.
Signed-off-by: Robert Varga <nite@hq.alert.sk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When requesting all prl entries (kprl.addr == INADDR_ANY) and there are
more prl entries than there is space passed from userspace, the existing
code would always copy cmax+1 entries, which is more than can be handled.
This patch makes the kernel copy only exactly cmax entries.
Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de> Acked-By: Fred L. Templin <Fred.L.Templin@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
"..., when one process calls sendmsg once to send 43804 bytes of
data and one file descriptor, and another process then calls recvmsg
three times to receive the 16032+16032+11740 bytes, each of those
recvmsg calls returns the file descriptor in the ancillary data. I
confirmed this with strace. The behaviour differs from Linux
2.6.26, where reportedly only one of those recvmsg calls (I think
the first one) returned the file descriptor."
This bug was introduced by a patch from me titled "net: unix: fix inflight
counting bug in garbage collector", commit 6209344f5.
And the reason is, quoting Kalle:
"Before your patch, unix_attach_fds() would set scm->fp = NULL, so
that if the loop in unix_stream_sendmsg() ran multiple iterations,
it could not call unix_attach_fds() again. But now,
unix_attach_fds() leaves scm->fp unchanged, and I think this causes
it to be called multiple times and duplicate the same file
descriptors to each struct sk_buff."
Fix this by introducing a flag that is cleared at the start and set
when the fds attached to the first buffer. The resulting code should
work equivalently to the one on 2.6.26.
Reported-by: Kalle Olavi Niemitalo <kon@iki.fi> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In ax25_make_new, if kmemdup of digipeat returns an error, there would
be an oops in sk_free while calling sk_destruct, because sk_protinfo
is NULL at the moment; move sk->sk_destruct initialization after this.
BTW of reported-by: Bernard Pidoux F6BVP <f6bvp@free.fr>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern [Mon, 8 Feb 2010 14:43:22 +0000 (09:43 -0500)]
EHCI: fix bug in keeping track of resuming ports
This patch fixes a bug caused by backporting commit cec3a53c7fe794237b582e8e77fc0e48465e65ee (USB: EHCI & UHCI: fix race
between root-hub suspend and port resume) to 2.6.27.stable without
also backporting commit eafe5b99f2135488b21cf17a262c54997c44f784 (USB:
EHCI: fix remote-wakeup support for ARC/TDI core). This extracts the
necessary changes from the earlier patch and backports them.
The symptom of the bug is that the system will fail to suspend more
than once. The problem is caused by setting ehci->reset_done[i] but
never clearing it. When ehci_bus_suspend() sees a nonzero value
there, it assumes this means the port is in the middle of resuming so
it aborts the bus suspend.
In r8169 driver MTU is used to calculate receive buffer size.
Receive buffer size is used to configure hardware incoming packet filter.
For jumbo frames:
Receive buffer size = Max frame size = MTU + 14 (ethernet header) + 4
(vlan header) + 4 (ethernet checksum) = MTU + 22
Bug:
driver for all MTU up to 1536 use receive buffer size 1536
As you can see from formula, this mean all IP packets > 1536 - 22
(for vlan tagged, 1536 - 18 for not tagged) are dropped by hardware
filter.
Example:
host_good> ifconfig eth0 mtu 1536
host_r8169> ifconfig eth0 mtu 1536
host_good> ping host_r8169
Ok
host_good> ping -s 1500 host_r8169
Fail
host_good> ifconfig eth0 mtu 7000
host_r8169> ifconfig eth0 mtu 7000
host_good> ping -s 1500 host_r8169
Ok
Bonus: got rid of magic number 8
Signed-off-by: Raimonds Cicans <ray@apollo.lv> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We want to be sure that compiler fetches the limit variable only
once, so add helpers for fetching current and maximal resource
limits which do that.
Add them to sched.h (instead of resource.h) due to circular dependency
sched.h->resource.h->task_struct
Alternative would be to create a separate res_access.h or similar.
If the owner of a PI futex dies we fix up the pi_state and set
pi_state->owner to NULL. When a malicious or just sloppy programmed
user space application sets the futex value to 0 e.g. by calling
pthread_mutex_init(), then the futex can be acquired again. A new
waiter manages to enqueue itself on the pi_state w/o damage, but on
unlock the kernel dereferences pi_state->owner and oopses.
Prevent this by checking pi_state->owner in the unlock path. If
pi_state->owner is not current we know that user space manipulated the
futex value. Ignore the mess and return -EINVAL.
This catches the above case and also the case where a task hijacks the
futex by setting the tid value and then tries to unlock it.
Reported-by: Jermome Marchand <jmarchan@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Darren Hart <dvhltc@us.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The WARN_ON in lookup_pi_state which complains about a mismatch
between pi_state->owner->pid and the pid which we retrieved from the
user space futex is completely bogus.
The code just emits the warning and then continues despite the fact
that it detected an inconsistent state of the futex. A conveniant way
for user space to spam the syslog.
Replace the WARN_ON by a consistency check. If the values do not match
return -EINVAL and let user space deal with the mess it created.
This also fixes the missing task_pid_vnr() when we compare the
pi_state->owner pid with the futex value.
Reported-by: Jermome Marchand <jmarchan@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Darren Hart <dvhltc@us.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This reverts commit 703625118069 ("tty: fix race in tty_fasync") and
commit b04da8bfdfbb ("fnctl: f_modown should call write_lock_irqsave/
restore") that tried to fix up some of the fallout but was incomplete.
It turns out that we really cannot hold 'tty->ctrl_lock' over calling
__f_setown, because not only did that cause problems with interrupt
disables (which the second commit fixed), it also causes a potential
ABBA deadlock due to lock ordering.
Thanks to Tetsuo Handa for following up on the issue, and running
lockdep to show the problem. It goes roughly like this:
- f_getown gets filp->f_owner.lock for reading without interrupts
disabled, so an interrupt that happens while that lock is held can
cause a lockdep chain from f_owner.lock -> sighand->siglock.
- at the same time, the tty->ctrl_lock -> f_owner.lock chain that
commit 703625118069 introduced, together with the pre-existing
sighand->siglock -> tty->ctrl_lock chain means that we have a lock
dependency the other way too.
So instead of extending tty->ctrl_lock over the whole __f_setown() call,
we now just take a reference to the 'pid' structure while holding the
lock, and then release it after having done the __f_setown. That still
guarantees that 'struct pid' won't go away from under us, which is all
we really ever needed.
We incorrectly depended on the 'node_state/node_isset()' functions
testing the node range, rather than checking it explicitly. That's not
reliable, even if it might often happen to work. So do the proper
explicit test.
Commit 703625118069f9f8960d356676662d3db5a9d116 exposed that f_modown()
should call write_lock_irqsave instead of just write_lock_irq so that
because a caller could have a spinlock held and it would not be good to
renable interrupts.
Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Tavis Ormandy <taviso@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kvm_handle_sie_intercept uses a jump table to get the intercept handler
for a SIE intercept. Static code analysis revealed a potential problem:
the intercept_funcs jump table was defined to contain (0x48 >> 2) entries,
but we only checked for code > 0x48 which would cause an off-by-one
array overflow if code == 0x48.
Use the compiler and ARRAY_SIZE to automatically set the limits.
We have apparently had a memory leak since 7ca7e564e049d8b350ec9d958ff25eaa24226352 "ipc: store ipcs into IDRs" in
2007. The idr of which 3 exist for each ipc namespace is never freed.
This patch simply frees them when the ipcns is freed. I don't believe any
idr_remove() are done from rcu (and could therefore be delayed until after
this idr_destroy()), so the patch should be safe. Some quick testing
showed no harm, and the memory leak fixed.
Caught by kmemleak.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1321) fixes a problem with EHCI and UHCI root-hub
suspends: If the suspend occurs while a port is trying to resume, the
resume doesn't finish and simply gets lost. When remote wakeup is
enabled, this is undesirable behavior.
The patch checks first to see if any port resumes are in progress, and
if they are then it fails the root-hub suspend with -EBUSY.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1320) fixes two problems related to interrupt-URB
scheduling in ehci-hcd.
URBs with an interval of 2 or 4 microframes aren't handled.
For the time being, the patch reduces to interval to 1 uframe.
URBs are constrained to have an interval no larger than 1024
frames by usb_submit_urb(). But some EHCI controllers allow
use of a schedule as short as 256 frames; for these
controllers we may have to decrease the interval to the
actual schedule length.
The second problem isn't very significant since few devices expose
interrupt endpoints with an interval larger than 256 frames. But the
first problem is critical; it will prevent the kernel from working
with devices having interrupt intervals of 2 or 4 uframes.
This patch (as1330) fixes a bug in khbud's handling of remote
wakeups. When a device sends a remote-wakeup request, the parent hub
(or the host controller driver, for directly attached devices) begins
the resume sequence and notifies khubd when the sequence finishes. At
this point the port's SUSPEND feature is automatically turned off.
However the device needs an additional 10-ms resume-recovery time
(TRSMRCY in the USB spec). Khubd does not wait for this delay if the
SUSPEND feature is off, and as a result some devices fail to behave
properly following a remote wakeup. This patch adds the missing
delay to the remote-wakeup path.
It also extends the resume-signalling delay used by ehci-hcd and
uhci-hcd from 20 ms (the value in the spec) to 25 ms (the value we use
for non-remote-wakeup resumes). The extra time appears to help some
devices.
Ecryptfs_open dereferences a pointer to the private lower file (the one
stored in the ecryptfs inode), without checking if the pointer is NULL.
Right afterward, it initializes that pointer if it is NULL. Swap order of
statements to first initialize. Bug discovered by Duckjin Kang.
It can happen that write does not use all the blocks allocated in
write_begin either because of some filesystem error (like ENOSPC) or
because page with data to write has been removed from memory. We truncate
these blocks so that we don't have dangling blocks beyond i_size.
Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
EDAC MC0: INTERNAL ERROR: channel-b out of range (4 >= 4)
Kernel panic - not syncing: EDAC MC0: Uncorrected Error (XEN) Domain 0 crashed: 'noreboot' set - not rebooting.
This happens because FERR_NF_FBD bit 28 is not updated on i5000. Due to
that, both bits 28 and 29 may be equal to one, returning channel = 3. As
this value is invalid, EDAC core generates the panic.
When we call giveup_fpu, we need to need to turn off VSX for the
current process. If we don't, on return to userspace it may execute a
VSX instruction before the next FP instruction, and not have its
register state refreshed correctly from the thread_struct. Ditto for
altivec.
This caused a bug where an unaligned lfs or stfs results in
fix_alignment calling giveup_fpu so it can use the FPRs (in order to
do a single <-> double conversion), and then returning to userspace
with FP off but VSX on. Then if a VSX instruction is executed, before
another FP instruction, it will proceed without another exception and
hence have the incorrect register state for VSX registers 0-31.
lfs unaligned <- alignment exception turns FP off but leaves VSX on
VSX instruction <- no exception since VSX on, hence we get the
wrong VSX register values for VSX registers 0-31,
which overlap the FPRs.
Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Several leaks in audit_tree didn't get caught by commit 318b6d3d7ddbcad3d6867e630711b8a705d873d7, including the leak on normal
exit in case of multiple rules refering to the same chunk.
... aka "Al had badly fscked up when writing that thing and nobody
noticed until Eric had fixed leaks that used to mask the breakage".
The function essentially creates a copy of old array sans one element
and replaces the references to elements of original (they are on cyclic
lists) with those to corresponding elements of new one. After that the
old one is fair game for freeing.
First of all, there's a dumb braino: when we get to list_replace_init we
use indices for wrong arrays - position in new one with the old array
and vice versa.
Another bug is more subtle - termination condition is wrong if the
element to be excluded happens to be the last one. We shouldn't go
until we fill the new array, we should go until we'd finished the old
one. Otherwise the element we are trying to kill will remain on the
cyclic lists...
That crap used to be masked by several leaks, so it was not quite
trivial to hit. Eric had fixed some of those leaks a while ago and the
shit had hit the fan...
When print-fatal-signals is enabled it's possible to dump any memory
reachable by the kernel to the log by simply jumping to that address from
user space.
Or crash the system if there's some hardware with read side effects.
The fatal signals handler will dump 16 bytes at the execution address,
which is fully controlled by ring 3.
In addition when something jumps to a unmapped address there will be up to
16 additional useless page faults, which might be potentially slow (and at
least is not very efficient)
Fortunately this option is off by default and only there on i386.
But fix it by checking for kernel addresses and also stopping when there's
a page fault.
generic_permission was refusing CAP_DAC_READ_SEARCH-enabled
processes from opening DAC-protected files read-only, because
do_filp_open adds MAY_OPEN to the open mask.
Ignore MAY_OPEN. After this patch, CAP_DAC_READ_SEARCH is
again sufficient to open(fname, O_RDONLY) on a file to which
DAC otherwise refuses us read permission.
Reported-by: Mike Kazantsev <mk.fraggod@gmail.com> Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Tested-by: Mike Kazantsev <mk.fraggod@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The loop condition is fragile: we compare an unsigned value to zero, and
then decrement it by something larger than one in the loop. All the
callers should be passing in appropriately aligned buffer lengths, but
it's better to just not rely on it, and have some appropriate defensive
loop limits.
When a DASD device is used with the DIAG discipline, the DIAG
initialization will indicate success or error with a respective
return code. So far we have interpreted a return code of 4 as error,
but it actually means that the initialization was successful, but
the device is read-only. To allow read-only devices to be used with
DIAG we need to accept a return code of 4 as success.
Re-initialization of the DIAG access is also part of the DIAG error
recovery. If we find that the access mode of a device has been
changed from writable to read-only while the device was in use,
we print an error message.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Stephen Powell <zlinuxman@wowway.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Currently the same reassembly queue might be used for packets reassembled
by conntrack in different positions in the stack (PREROUTING/LOCAL_OUT),
as well as local delivery. This can cause "packet jumps" when the fragment
completing a reassembled packet is queued from a different position in the
stack than the previous ones.
Add a "user" identifier to the reassembly queue key to seperate the queues
of each caller, similar to what we do for IPv4.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
According to the TAOS Application Note 'Controlling a Backlight with
the TSL2550 Ambient Light Sensor' (page 14), the actual lux value in
extended mode should be obtained multiplying the calculated lux value
by 5.
Signed-off-by: Michele Jr De Candia <michele.decandia@valueteam.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When allocating the PCM buffer, use vmalloc_user() instead of vmalloc().
Otherwise, it would be possible for applications to play the previous
contents of the kernel memory to the speakers, or to read it directly if
the buffer is exported to userspace.
adev->dma_mode stores the transfer mode value not UDMA mode number
so the condition in cmd64x_set_dmamode() is always true and the higher
UDMA clock is always selected. This can potentially result in data
corruption when UDMA33 device is used, when 40-wire cable is used or
when the error recovery code decides to lower the device speed down.
The issue was introduced in the commit 6a40da0 ("libata cmd64x: whack
into a shape that looks like the documentation") which goes back to
kernel 2.6.20.
Regression caused in 2.6.23 and then despite repeated requests never fixed
or dealt with (Petr promised to sort it in 2008 but seems to have
forgotten).
Enough is enough - remove the problem line that was added. If it upsets
someone they've had two years to deal with it and at the very least it'll
rattle their cage and wake them up.
Ever since jffs2_garbage_collect_metadata() was first half-written in
February 2001, it's been broken on architectures where 'char' is signed.
When garbage collecting a symlink with target length above 127, the payload
length would end up negative, causing interesting and bad things to happen.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch fixes a problem with any mos7840 device where the use of the
field "minor" before it is initialised results in all the devices being
overlaid in memory (minor = 0 for all instances)
Contributed by: Phillip Branch
Backported to .27 by Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Signed-off-by: Tony Cook <tony-cook@bigpond.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The 32-bit parameters (len and csum) of csum_ipv6_magic() are passed in 64-bit
registers in2 and in4. The high order 32 bits of the registers were never
cleared, and garbage was sometimes calculated into the checksum.
Fix this by clearing the high order 32 bits of these registers.
Signed-off-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Dennis Schridde <devurandom@gmx.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On a multi-node x3950M2 system, there's a slight oddity in the
PCI device tree for all secondary nodes:
30:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
\-33:00.0 PCI bridge: IBM CalIOC2 PCI-E Root Port (rev 01)
\-34:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)
...as compared to the primary node:
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
\-01:00.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)
03:00.0 PCI bridge: IBM CalIOC2 PCI-E Root Port (rev 01)
\-04:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)
In both nodes, the LSI RAID controller hangs off a CalIOC2
device, but on the secondary nodes, the BIOS hides the VGA
device and substitutes the device tree ending with the disk
controller.
It would seem that Calgary devices don't necessarily appear at
the top of the PCI tree, which means that the current code to
find the Calgary IOMMU that goes with a particular device is
buggy.
Rather than walk all the way to the top of the PCI
device tree and try to match bus number with Calgary descriptor,
the code needs to examine each parent of the particular device;
if it encounters a Calgary with a matching bus number, simply
use that.
Otherwise, we BUG() when the bus number of the Calgary doesn't
match the bus number of whatever's at the top of the device tree.
Extra note: This patch appears to work correctly for the x3950
that came before the x3950 M2.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Acked-by: Muli Ben-Yehuda <muli@il.ibm.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Jon D. Mason <jdmason@kudzu.us> Cc: Corinna Schultz <coschult@us.ibm.com>
LKML-Reference: <20091202230556.GG10295@tux1.beaverton.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Bug reporter noted their system with an ASUS P4S800 motherboard would
hang when rebooting unless reboot=b was specified. Their dmidecode
didn't contain descriptive System Information for Manufacturer or
Product Name, so I used their Base Board Information to create a
reboot quirk patch. The bug reporter confirmed this patch resolves
the reboot hang.
Handle 0x0001, DMI type 1, 25 bytes
System Information
Manufacturer: System Manufacturer
Product Name: System Name
Version: System Version
Serial Number: SYS-1234567890
UUID: E0BFCD8B-7948-D911-A953-E486B4EEB67F
Wake-up Type: Power Switch
Handle 0x0002, DMI type 2, 8 bytes
Base Board Information
Manufacturer: ASUSTeK Computer INC.
Product Name: P4S800
Version: REV 1.xx
Serial Number: xxxxxxxxxxx
BugLink: http://bugs.launchpad.net/bugs/366682
ASUS P4S800 will hang when rebooting unless reboot=b is specified.
Add a quirk to reboot through the bios.
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
LKML-Reference: <1259972107.4629.275.camel@emiko> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The range check in the sprom image parser hex2sprom() is broken.
One sprom word is 4 hex characters.
This fixes the check and also adds much better sanity checks to the code.
We better make sure the image is OK by doing some sanity checks to avoid
bricking the device by accident.
Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
These drivers inherited from the older 'hpt366' IDE driver the buggy timing
register masks in their set_piomode() metods. As a result, too low command
cycle active time is programmed for slow PIO modes. Quite fortunately, it's
later "fixed up" by the set_dmamode() methods which also "helpfully" reprogram
the command timings, usually to PIO mode 4; unfortunately, setting an UltraDMA
mode #N also reprograms already set PIO data timings, usually to MWDMA mode #
max(N, 2) timings...
However, the drivers added some breakage of their own too: the bit that they
set/clear to control the FIFO is sometimes wrong -- it's actually the MSB of
the command cycle setup time; also, setting it in DMA mode is wrong as this
bit is only for PIO actually and clearing it for PIO modes is not needed as
no mode in any timing table has it set...
Fix all this, inverting the masks while at it, like in the 'hpt366' and
'pata_hpt366' drivers; bump the drivers' versions, accounting for recent
patches that forgot to do it...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A specially-crafted Hierarchical File System (HFS) filesystem could cause
a buffer overflow to occur in a process's kernel stack during a memcpy()
call within the hfs_bnode_read() function (at fs/hfs/bnode.c:24). The
attacker can provide the source buffer and length, and the destination
buffer is a local variable of a fixed length. This local variable (passed
as "&entry" from fs/hfs/dir.c:112 and allocated on line 60) is stored in
the stack frame of hfs_bnode_read()'s caller, which is hfs_readdir().
Because the hfs_readdir() function executes upon any attempt to read a
directory on the filesystem, it gets called whenever a user attempts to
inspect any filesystem contents.
[amwang@redhat.com: modify this patch and fix coding style problems] Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Eugene Teo <eteo@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Dave Anderson <anderson@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Queueing to receive an ISO packet with a payload length of zero
silently does nothing in dualbuffer mode, and crashes the kernel in
packet-per-buffer mode. Return an error in dualbuffer mode, because
the DMA controller won't let us do what we want, and work correctly in
packet-per-buffer mode.
Signed-off-by: Jay Fenlason <fenlason@redhat.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Setting fops and private data outside of the mutex at debugfs file
creation introduces a race where the files can be opened with the wrong
file operations and private data. It is easy to trigger with a process
waiting on file creation notification.
All architectures in the kernel increment/decrement the stack pointer
before storing values on the stack.
On architectures which have the stack grow down sas_ss_sp == sp is not
on the alternate signal stack while sas_ss_sp + sas_ss_size == sp is
on the alternate signal stack.
On architectures which have the stack grow up sas_ss_sp == sp is on
the alternate signal stack while sas_ss_sp + sas_ss_size == sp is not
on the alternate signal stack.
The current implementation fails for architectures which have the
stack grow down on the corner case where sas_ss_sp == sp.This was
reported as Debian bug #544905 on AMD64.
Simplified test case: http://download.breakpoint.cc/tc-sig-stack.c
The test case creates the following stack scenario:
0xn0300 stack top
0xn0200 alt stack pointer top (when switching to alt stack)
0xn01ff alt stack end
0xn0100 alt stack start == stack pointer
If the signal is sent the stack pointer is pointing to the base
address of the alt stack and the kernel erroneously decides that it
has already switched to the alternate stack because of the current
check for "sp - sas_ss_sp < sas_ss_size"
On parisc (stack grows up) the scenario would be:
0xn0200 stack pointer
0xn01ff alt stack end
0xn0100 alt stack start = alt stack pointer base
(when switching to alt stack)
0xn0000 stack base
This is handled correctly by the current implementation.
[ tglx: Modified for archs which have the stack grow up (parisc) which
would fail with the correct implementation for stack grows
down. Added a check for sp >= current->sas_ss_sp which is
strictly not necessary but makes the code symetric for both
variants ]
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca>
LKML-Reference: <20091025143758.GA6653@Chamillionaire.breakpoint.cc> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Change spin_locks to irqsave to prevent dead-locks.
Protect adding and deleting to/from dca_providers list.
Drop the lock during dca_sysfs_add_req() and dca_sysfs_remove_req() calls
as they might sleep (use GFP_KERNEL allocation).
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1254-2) splits up the shutdown method of usb_serial_driver
into a disconnect and a release method.
The problem is that the usb-serial core was calling shutdown during
disconnect handling, but drivers didn't expect it to be called until
after all the open file references had been closed. The result was an
oops when the close method tried to use memory that had been
deallocated by shutdown.
This patch implements suspend and resume methods for the
option driver. With my hardware I can even suspend the system
and keep up a connection for a short time.
The following patch in the driver is required to avoid USB 1.1 device
failures that may occur due to requests from USB OHCI controllers may
be overwritten if the latency for any pending request by the USB
controller is very long (in the range of milliseconds).
Signed-off-by: Libin Yang <libin.yang@amd.com> Cc: David Brownell <dbrownell@users.sourceforge.net> Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
fuse_direct_io() has a loop where requests are allocated in each
iteration. if allocation fails, the loop is broken out and follows
into an unconditional fuse_put_request() on that invalid pointer.
Signed-off-by: Anand V. Avati <avati@gluster.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
also holds for fuse_create, however, the same kind of check was missing there.
As an impact of this bug, open(newfile, O_RDWR|O_CREAT|O_DIRECT) fails, but a
stub newfile will remain if the fuse server handled the implied FUSE_CREATE
request appropriately.
Other impact: in the above situation ima_file_free() will complain to open/free
imbalance if CONFIG_IMA is set.
In commit 0de51088e6a82bc8413d3ca9e28bbca2788b5b53, we introduced the
use of acpi-cpufreq on VIA/Centaur CPU's by removing a vendor check for
VENDOR_INTEL. However, as it turns out, at least the Nano CPU's also
need the PDC (processor driver capabilities) handshake in order to
activate the methods required for acpi-cpufreq.
Since arch_acpi_processor_init_pdc() contains another vendor check for
Intel, the PDC is not initialized on VIA CPU's. The resulting behavior
of a current mainline kernel on such systems is: acpi-cpufreq
loads and it indicates CPU frequency changes. However, the CPU stays at
a single frequency
This trivial patch ensures that init_intel_pdc() is called on Intel and
VIA/Centaur CPU's alike.
Signed-off-by: Harald Welte <HaraldWelte@viatech.com> Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The s2255 driver had logic which aborted processing of a video frame
if there was no process waiting on the video buffer in question. That
simply doesn't work when the application is doing things in an
asynchronous manner. If the application went to the trouble to queue
the buffer in the first place, then the driver should always attempt
to complete it - even if the application at that moment has its
attention turned elsewhere. Applications which always blocked waiting
for I/O on the capture device would not have been affected by this.
Applications which *mostly* blocked waiting for I/O on the capture
device probably only would have been somewhat affected (frame lossage,
at a rate which goes up as the application blocks less). Applications
which never blocked on the capture device (e.g. polling only) however
would never have been able to receive any video frames, since in that
case this "is anyone waiting on this?" check on the buffer never would
have evalutated true. This patch just deletes that harmful check
against the buffer's wait queue.
Signed-off-by: Mike Isely <isely@pobox.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Michael Krufky <mkrufky@kernellabs.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Because the counters were not reset when starting up streaming, they would
be reused from the previous run. This can result in cases such that when the
second instance of streaming starts up, the "cnt" variable in
em28xx_audio_isocirq() can end up being negative, resulting in attempting to
write to memory before the start of runtime->dma_area (as well as having a
negative number of bytes to copy).
While having tda18271 module set with debug=17 (cal & info prints) and
cal=0 (delay calibration process until first use) - I discovered that
during the calibration process, if the frequency test for 69750000
returned a bcal of 0 (see tda18721-fe.c in tda18271_powerscan func) that
the tuner wouldn't be able to pickup any of the frequencies in the range
(all the other frequencies bands returned bcal=1). I spent some time
going over the code and the NXP's tda18271 spec (ver.4 of it i think) and
adding a lot of debug prints and walking/stepping through the calibration
process. I found that when the powerscan fails to find a frequency, the
rf calibration is not run and the default value is supposed to be used in
its place (pulled from the RF_CAL_map table) - but something was getting
goofed up there.
Now, my c coding skills are very rusty, but i think root of the problem is
a signedness issue with the math operation for calculating the rf_a1 and
rf_a2 values in tda18271_rf_tracking_filters_init func, which results in
values like 20648 for rf_a1 (when it should probably have a value like 0,
or so slightly negative that it should be zero - this bad value for rf_a1
would in turn makes the approx calc within
tda18271c2_rf_tracking_filters_correction go out of whack). The simplest
solution i found was to explicitly convert the signedness of the
denominator to avoid the implicit conversion. The values placed into the
u32 rf_freq array should never exceed about 900mhz, so i think the s32 max
value shouldn't be an issue in this case.
I've tested it out a little, and even when i get a bcal=0 with the
modified code, the default calibration value gets used, rf_a1 is zero, and
the tuner seems to lock on the stream and mythtv seems to play it fine.
Signed-off-by: Seth Barry <seth@cyberseth.com> Signed-off-by: Michael Krufky <mkrufky@kernellabs.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Multiplication by 62500 causes an overflow in the 32 bit freq variable,
which is later divided by 1000 when using FM radio.
This patch prevents the overflow by scaling the frequency value correctly
upfront. Thanks to Henk Vergonet for spotting the problem and providing
a preliminary patch, which this changeset was based upon.
Fixing kernel oops when driver attemps to load xc2028 firmware.
Note by djh: the patch contribute by Martin is a port of a fix I made during
the PCTV 340e development. It's a temporary workaround that fixes a regression
(an OOPS condition) and the real fix should be in the code that manages the
i2c master on the dib7000p. But this fix does address the immmediate
regression and should be merged upstream until we do a cleaner fix.
In 2.6.23 kernel, commit a32ea1e1f925399e0d81ca3f7394a44a6dafa12c
("Fix read/truncate race") fixed a race in the generic code, and as a
side effect, now do_generic_file_read() can ask us to readpage() past
the i_size. This seems to be correctly handled by the block routines
(e.g. block_read_full_page() fills the page with zeroes in case if
somebody is trying to read past the last inode's block).
JFFS2 doesn't handle this; it assumes that it won't be asked to read
pages which don't exist -- and thus that there will be at least _one_
valid 'frag' on the page it's being asked to read. It will fill any
holes with the following memset:
When the 'closest smaller match' returned by jffs2_lookup_node_frag() is
actually on a previous page and ends before 'offset', that results in:
memset(buf, 0, <huge unsigned negative>);
Hopefully, in most cases the corruption is fatal, and quickly causing
random oopses, like this:
root@10.0.0.4:~/ltp-fs-20090531# ./testcases/kernel/fs/ftest/ftest01
Unable to handle kernel paging request for data at address 0x00000008
Faulting instruction address: 0xc01cd980
Oops: Kernel access of bad area, sig: 11 [#1]
[...]
NIP [c01cd980] rb_insert_color+0x38/0x184
LR [c0043978] enqueue_hrtimer+0x88/0xc4
Call Trace:
[c6c63b60] [c004f9a8] tick_sched_timer+0xa0/0xe4 (unreliable)
[c6c63b80] [c0043978] enqueue_hrtimer+0x88/0xc4
[c6c63b90] [c0043a48] __run_hrtimer+0x94/0xbc
[c6c63bb0] [c0044628] hrtimer_interrupt+0x140/0x2b8
[c6c63c10] [c000f8e8] timer_interrupt+0x13c/0x254
[c6c63c30] [c001352c] ret_from_except+0x0/0x14
--- Exception: 901 at memset+0x38/0x5c
LR = jffs2_read_inode_range+0x144/0x17c
[c6c63cf0] [00000000] (null) (unreliable)
This patch fixes the issue, plus fixes all LTP tests on NAND/UBI with
JFFS2 filesystem that were failing since 2.6.23 (seems like the bug
above also broke the truncation).
Reported-By: Anton Vorontsov <avorontsov@ru.mvista.com> Tested-By: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A negative offset could be used to index before the event buffer and
lead to a security breach.
Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix combine_word problem where first octet is not
read properly. The only affected place seems to be the
INPUT_TERMINAL type. Before now, sound controls can be created
with the output terminal's name which is a fallback mechanism
used only for unknown input terminal types. For example,
Line can wrongly appear as Speaker. After the change it
should appear as Line.
The side effect of this change can be that users
can expect the wrong control name in their scripts or
programs while now we return the correct one.
Probably, these defines should use get_unaligned_le16 and
friends.
There is an erratum for IOMMU hardware which documents
undefined behavior when forwarding SMI requests from
peripherals and the DTE of that peripheral has a sysmgt
value of 01b. This problem caused weird IO_PAGE_FAULTS in my
case.
This patch implements the suggested workaround for that
erratum into the AMD IOMMU driver. The erratum is
documented with number 63.
The function iommu_feature_disable is required on system
shutdown to disable the IOMMU but it is marked as __init.
This may result in a panic if the memory is reused. This
patch fixes this bug.