The qdisc_run loop is currently unbounded and runs entirely in a
softirq. This is bad as it may create an unbounded softirq run.
This patch fixes this by calling need_resched and breaking out if
necessary.
It also adds a break out if the jiffies value changes since that would
indicate we've been transmitting for too long which starves other
softirqs.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
When walking a session's packet reorder queue, use
skb_queue_walk_safe() since the list could be modified inside the
loop.
Rearrange the unlinking skbs from the reorder queue such that it is
done while the queue lock is held in pppol2tp_recv_dequeue() when
walking the skb list.
A version of this patch was suggested by Jarek Poplawski.
Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Fix locking issues in the pppol2tp driver which can cause a kernel
crash on SMP boxes. There were two problems:-
1. The driver was violating read_lock() and write_lock() scheduling
rules because it wasn't using softirq-safe locks in softirq
contexts. So we now consistently use the _bh variants of the lock
functions.
2. The driver was calling sk_dst_get() in pppol2tp_xmit() which was
taking sk_dst_lock in softirq context. We now call __sk_dst_get().
Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
zap_completion_queue() retrieves skbs from completion_queue where they have
zero skb->users counter. Before dev_kfree_skb_any() it should be non-zero
yet, so it's increased now.
Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
LLC currently allows users to inject raw frames, including IP packets
encapsulated in SNAP. While Linux doesn't handle IP over SNAP, other
systems do. Restrict LLC sockets to root similar to packet sockets.
[ Modified Patrick's patch to use CAP_NEW_RAW --DaveM ]
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
I started getting this warning with recent kernels:
[ 773.908927] ------------[ cut here ]------------
[ 773.908954] Badness at net/core/dev.c:2204
...
If we loop more than once in gem_poll(), we'll
use more than the real budget in our gem_rx()
calls, thus eventually trigger the caller's
assertions in net_rx_action().
Subtract "work_done" from "budget" for the second
arg to gem_rx() to fix the bug.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Kirill A. Shutemov <k.shutemov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
According to some OOPS reports ax25_kick tries to clone NULL skbs
sometimes. It looks like a race with ax25_clear_queues(). Probably
there is no need to add more than a simple check for this yet.
Another report suggested there are probably also cases where ax25
->paclen == 0 can happen in ax25_output(); this wasn't confirmed
during testing but let's leave this debugging check for some time.
Reported-and-tested-by: Jann Traschewski <jann@gmx.de> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
In 2.6.14 a patch was merged which switching the order of the ipmi device
naming from in-order-of-discovery over to reverse-order-of-discovery.
So on systems with multiple BMC interfaces, the ipmi device names are being
created in reverse order relative to how they are discovered on the system
(e.g. on an IBM x3950 multinode server with N nodes, the device name for the
BMC in the first node is /dev/ipmiN-1 and the device name for the BMC in the
last node is /dev/ipmi0, etc.).
The problem is caused by the list handling routines chosen in dmi_scan.c.
Using list_add() causes the multiple ipmi devices to be added to the device
list using a stack-paradigm and so the ipmi driver subsequently pulls them off
during initialization in LIFO order. This patch changes the
dmi_save_ipmi_device() list handling paradigm to a queue, thereby allowing the
ipmi driver to build the ipmi device names in the order in which they are
found on the system.
Signed-off-by: Carol Hebert <cah@us.ibm.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
THe CFI driver in 2.6.24 kernel is broken. Not so intensive read/write
operations cause incomplete writes which lead to kernel panics in JFFS2.
We investigated the issue - it is caused by bug in FL_SHUTDOWN parsing code.
Sometimes chip returns -EIO as if it is in FL_SHUTDOWN state when it should
wait in FL_PONT (error in order of conditions).
The following patch fixes the bug in state parsing code of CFI. Also I've
added comments to notify developers if they want to add new case in future.
Signed-off-by: Alexey Korolev <akorolev@infradead.org> Reviewed-by: Joern Engel <joern@logfs.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
The kernel crashes when ipsec passes a udp packet of about 14XX bytes
of data to aes-xcbc-mac.
It seems the first xxxx bytes of the data are in first sg entry,
and remaining xx bytes are in next sg entry. But we don't
check next sg entry to see if we need to go look the page up.
I noticed in hmac.c, we do a scatterwalk_sg_next(), to do this check
and possible lookup, thus xcbc.c needs to use this routine too.
A 15-hour run of an ipsec stress test sending streams of tcp and
udp packets of various sizes, using this patch and
aes-xcbc-mac completed successfully, so hopefully this fixes the
problem.
Signed-off-by: Joy Latten <latten@austin.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
[chrisw@sous-sol.org: backport to 2.6.24.4] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
The changes introduced in commit 063a2da8f01806906f7d7b1a1424b9afddebc443 changed the semantics of the
num_interrupt_in, num_interrupt_out, num_bulk_in and num_bulk_out
entries of the usb_serial_driver struct to be the number of endpoints
the device has when probed.
This patch changes the ti_1port_device usb_serial_driver struct to
reflect this change. The single port devices only have 1
bulk_out endpoint in their initial configuration, and so this patch
changes the number of other types to NUM_DONT_CARE.
The same change probably needs doing to the ti_2port_device struct,
but I don't have a two port device at hand.
Signed-off-by: Robert Spanton <rspanton@zepler.net> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Fixes a bug/inconsistency revealed by the additional sanity checking in
commit 063a2da8f01806906f7d7b1a1424b9afddebc443
introduced in the original 2.6.24 branch.
The Handspring Visor / PalmOS 4 device structure defines .num_bulk_out=2
but the usb-serial probe returns num_bulk_out=3, triggering the check in
the above commit and forcing a bail out when the device (a Garmin iQue in
my case) attempts to connect. The patch bumps the expected number of
endpoints to 3.
FWIW, this patch will probably solve the following kernel bug report for
Treo users (identical symptoms, different model PalmOS units):
<http://bugzilla.kernel.org/show_bug.cgi?id=10118>
Fixes the keyspan driver after the addition of additional
checking of driver requirements introduced in usb-serial.c
commit 063a2da8f01806906f7d7b1a1424b9afddebc443. The initialization
of the keyspan usb_serial_driver structs were not initializing the
num_interrupt_out field and the additional checking was rejecting
the end point so the driver wouldn't finish initializing.
This commit initializes the fields to NUM_DONT_CARE.
It works for the keyspan USA-49WG and doesn't break the USA-19HS
which are the two keyspan devices I have to test with.
Signed-off-by: Clark Rawlins <clark.rawlins@escient.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Fix the problem that makedumpfile sometimes fails on x86_64 machine.
This patch adds the symbol "phys_base" to a vmcoreinfo data. The
vmcoreinfo data has the minimum debugging information only for dump
filtering. makedumpfile (dump filtering command) gets it to distinguish
unnecessary pages, and makedumpfile creates a small dumpfile.
On x86_64 kernel which compiled with CONFIG_PHYSICAL_START=0x0 and
CONFIG_RELOCATABLE=y, makedumpfile fails like the following:
# makedumpfile -d31 /proc/vmcore dumpfile
The kernel version is not supported.
The created dumpfile may be incomplete.
_exclude_free_page: Can't get next online node.
makedumpfile Failed.
#
The cause is the lack of the symbol "phys_base" in a vmcoreinfo data.
If the symbol "phys_base" does not exist, makedumpfile considers an
x86_64 kernel as non relocatable. As the result, makedumpfile
misunderstands the physical address where the kernel is loaded, and it
cannot translate a kernel virtual address to physical address correctly.
To fix this problem, this patch adds the symbol "phys_base" to a
vmcoreinfo data.
Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <stable@kernel.org> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Only request I/O ports 0x295-0x296 instead of the full I/O address
range. This solves a conflict with PNP resources on a few motherboards.
Also request the I/O ports in two parts (4 low ports, 4 high ports)
during device detection, otherwise the PNP resource makes the request
(and thus the detection) fail.
This fixes lm-sensors ticket #2306:
http://www.lm-sensors.org/ticket/2306
Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Mark M. Hoffman <mhoffman@lightlink.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
The HP Compaq nc6120 has the same PCI sub-device ID as the nx6110, and the
SMBus is used by ACPI for thermal management on the nc6120, so Linux should
not attach a native driver to it. This means that this quirk is unsafe and
has to be removed.
I also added a comment to help developers realize that adding new IDs to this
SMBus unhiding quirk table should be done only with great care, and in
particular only after checking that ACPI is not making use of the SMBus.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Tomasz Koprowski <tomek@koprowski.org> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Current nobh_write_end() implementation ignore partial writes(copied < len)
case if page was fully mapped and simply mark page as Uptodate, which is
totally wrong because area [pos+copied, pos+len) wasn't updated explicitly in
previous write_begin call. It simply contains garbage from pagecache and
result in data leakage.
As you can see file's page contains garbage from pagecache instead of zeros.
#TEST_CASE_END
Attached patch:
- Add sanity check BUG_ON in order to prevent incorrect usage by caller,
This is function invariant because page can has buffers and in no zero
*fadata pointer at the same time.
- Always attach buffers to page is it is partial write case.
- Always switch back to generic_write_end if page has buffers.
This is reasonable because if page already has buffer then generic_write_begin
was called previously.
Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org> Reviewed-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Some oprofile results obtained while using tbench on a 2x2 cpu machine were
very surprising.
For example, loopback_xmit() function was using high number of cpu cycles
to perform the statistic updates, supposed to be real cheap since they use
percpu data
struct pcpu_lstats is a small structure containing two longs. It appears
that on my 32bits platform, alloc_percpu(8) allocates a single cache line,
instead of giving to each cpu a separate cache line.
Using the following patch gave me impressive boost in various benchmarks
( 6 % in tbench)
(all percpu_counters hit this bug too)
Long term fix (ie >= 2.6.26) would be to let each CPU allocate their own
block of memory, so that we dont need to roudup sizes to L1_CACHE_BYTES, or
merging the SGI stuff of course...
Note : SLUB vs SLAB is important here to *show* the improvement, since they
dont have the same minimum allocation sizes (8 bytes vs 32 bytes). This
could very well explain regressions some guys reported when they switched
to SLUB.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Instead of allocating a fix sized array of NR_CPUS pointers for percpu_data,
we can use nr_cpu_ids, which is generally < NR_CPUS.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Cc: Christoph Lameter <clameter@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
We need to set up the shared_info pointer once we've mapped the real
shared_info into its fixmap slot. That needs to happen once the general
pagetable setup has been done. Previously, the UP shared_info was set
up one in xen_start_kernel, but that was left pointing to the dummy
shared info. Unfortunately there's no really good place to do a later
setup of the shared_info in UP, so just do it once the pagetable setup
has been done.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
[chrisw@sous-sol.org: backport to 2.6.24.4] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
xen_irq_enable_direct and xen_sysexit were using "andw $0x00ff,
XEN_vcpu_info_pending(vcpu)" to unmask events and test for pending ones
in one instuction.
Unfortunately, the pending flag must be modified with a locked operation
since it can be set by another CPU, and the unlocked form of this
operation was causing the pending flag to get lost, allowing the processor
to return to usermode with pending events and ultimately deadlock.
The simple fix would be to make it a locked operation, but that's rather
costly and unnecessary. The fix here is to split the mask-clearing and
pending-testing into two instructions; the interrupt window between
them is of no concern because either way pending or new events will
be processed.
This should fix lingering bugs in using direct vcpu structure access too.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Commit 556a169dab38b5100df6f4a45b655dddd3db94c1 ("slab: fix bootstrap on
memoryless node") introduced bootstrap-time cache_cache list3s for all nodes
but forgot that initkmem_list3 needs to be accessed by [somevalue + node]. This
patch fixes list_add() corruption in mm/slab.c seen on the ES7000.
Cc: Mel Gorman <mel@csn.ul.ie> Cc: Olaf Hering <olaf@aepfle.de> Signed-off-by: Dan Yeisley <dan.yeisley@unisys.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
add_timer_on() can add a timer on a CPU which is currently in a long
idle sleep, but the timer wheel is not reevaluated by the nohz code on
that CPU. So a timer can be delayed for quite a long time. This
triggered a false positive in the clocksource watchdog code.
To avoid this we need to wake up the idle CPU and enforce the
reevaluation of the timer wheel for the next timer event.
Add a function, which checks a given CPU for idle state, marks the
idle task with NEED_RESCHED and sends a reschedule IPI to notify the
other CPU of the change in the timer wheel.
The inotify debugging code is supposed to verify that the
DCACHE_INOTIFY_PARENT_WATCHED scalability optimisation does not result in
notifications getting lost nor extra needless locking generated.
Unfortunately there are also some races in the debugging code. And it isn't
very good at finding problems anyway. So remove it for now.
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Robert Love <rlove@google.com> Cc: John McCutchan <ttb@tentacle.dhs.org> Cc: Jan Kara <jack@ucw.cz> Cc: Yan Zheng <yanzheng@21cn.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christian Lamparter <chunkeey@web.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
There is a race between setting an inode's children's "parent watched" flag
when placing the first watch on a parent, and instantiating new children of
that parent: a child could miss having its flags set by
set_dentry_child_flags, but then inotify_d_instantiate might still see
!inotify_inode_watched.
The solution is to set_dentry_child_flags after adding the watch. Locking is
taken care of, because both set_dentry_child_flags and inotify_d_instantiate
hold dcache_lock and child->d_locks.
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Robert Love <rlove@google.com> Cc: John McCutchan <ttb@tentacle.dhs.org> Cc: Jan Kara <jack@ucw.cz> Cc: Yan Zheng <yanzheng@21cn.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christian Lamparter <chunkeey@web.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
This patch (as1057) fixes a problem with the X-Rite/Gretag-Macbeth
Eye-One Pro display colorimeter; the device crashes when it receives a
Set-Interface request. A new quirk (USB_QUIRK_NO_SET_INTF) is
introduced and a quirks entry is created for this device.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[chrisw@sous-sol.org: backport to 2.6.24.4] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Motorola ROKR Z6 cellphone has bugs in its USB, so it is impossible to use
it as mass storage. Patch describes new "unusual" USB device for it with
FIX_INQUIRY and FIX_CAPACITY flags and new BULK_IGNORE_TAG flag.
Last flag relaxes check for equality of bcs->Tag and us->tag in
usb_stor_Bulk_transport routine.
Signed-off-by: Constantin Baranov <const@tltsu.ru> Signed-off-by: Matthew Dharm <mdharm-usb@one-eyed-alien.net> Signed-off-by: Daniel Drake <dsd@gentoo.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Mapping of physical memory in UIO needs pgprot_noncached() to ensure
that IO memory is not cached. Without pgprot_noncached(), it (accidentally)
works on x86 and arm, but fails on PPC.
Signed-off-by: Jean-Samuel Chenard <jsamch@gmail.com> Signed-off-by: Hans J Koch <hjk@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
If a dma transfer is attempted for either yuv or framebuffer output, a
missing sg_init_table() call causes a kernel BUG in scatterlist.h if
CONFIG_DEBUG_SG is set.
Signed-off-by: Ian Armstrong <ian@iarmst.demon.co.uk> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Exposing the binary blob which is the md 'super-block' via sysfs doesn't
really fit with the whole sysfs model, and ever since commit 8118a859dc7abd873193986c77a8d9bdb877adc8 ("sysfs: fix off-by-one error
in fill_read_buffer()") it doesn't actually work at all (as the size of
the blob is often one page).
(akpm: as in, fs/sysfs/file.c:fill_read_buffer() goes BUG)
So just remove it altogether. It isn't really useful.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
The block2mtd driver (drivers/mtd/devices/block2mtd.c) will kfree an on-stack
pointer when handling an invalid argument line (e.g.
block2mtd=/dev/loop0,xxx).
The kfree was added some time ago when "name" was dynamically allocated.
Signed-off-by: Ingo van Lil <inguin@gmx.de> Acked-by: Joern Engel <joern@lazybastard.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
The module alias support in the kernel have a consistency
check where it is checked that the size of a structure
in the kernel and on the build host are the same.
For cross builds this check does not make sense so detect
when we do cross builds and silently skip the check in these
situations.
This fixes a build bug for a wireless driver when cross building
for arm.
Acked-by: Michael Buesch <mb@bu3sch.de> Tested-by: Gordon Farquharson <gordonfarquharson@gmail.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
[chrisw@sous-sol.org: backport to 2.6.24.4] Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Heiko Carstens [Thu, 20 Mar 2008 16:33:38 +0000 (17:33 +0100)]
S390 futex: let futex_atomic_cmpxchg_pt survive early functional tests.
a0c1e9073ef7428a14309cba010633a6cd6719ea "futex: runtime enable pi and
robust functionality" introduces a test wether futex in atomic stuff
works or not.
It does that by writing to address 0 of the kernel address space. This
will crash on older machines where addressing mode switching is enabled
but where the mvcos instruction is not available. Page table walking is
done by hand and therefore the code tries to access current->mm which
is NULL.
Therefore add an extra check, so we survive the early test.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Joe Korty [Wed, 5 Mar 2008 23:04:59 +0000 (15:04 -0800)]
slab: NUMA slab allocator migration bugfix
NUMA slab allocator cpu migration bugfix
The NUMA slab allocator (specifically, cache_alloc_refill)
is not refreshing its local copies of what cpu and what
numa node it is on, when it drops and reacquires the irq
block that it inherited from its caller. As a result
those values become invalid if an attempt to migrate the
process to another numa node occured while the irq block
had been dropped.
The solution is to make cache_alloc_refill reload these
variables whenever it drops and reacquires the irq block.
The error is very difficult to hit. When it does occur,
one gets the following oops + stack traceback bits in
check_spinlock_acquired:
kernel BUG at mm/slab.c:2417
cache_alloc_refill+0xe6
kmem_cache_alloc+0xd0
...
This patch was developed against 2.6.23, ported to and
compiled-tested only against 2.6.25-rc4.
Signed-off-by: Joe Korty <joe.korty@ccur.com> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Dave Young [Fri, 1 Feb 2008 02:33:10 +0000 (18:33 -0800)]
BLUETOOTH: Fix bugs in previous conn add/del workqueue changes.
Jens Axboe noticed that we were queueing &conn->work on both btaddconn
and keventd_wq.
Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
buf[i] can be up to 0xfd, so doubling it and assigning the result to an
unsigned char truncates the value. Just use an unsigned int instead;
it's only a temporary.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If the channel cannot perform the operation in one call to
->device_prep_dma_zero_sum, then fallback to the xor+page_is_zero path.
This only affects users with arrays larger than 16 devices on iop13xx or
32 devices on iop3xx.
Cc: <stable@kernel.org> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
[chrisw@sous-sol.org: backport to 2.6.24.3] Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
My group ran into a AIO process hang on a 2.6.24 kernel with the process
sleeping indefinitely in io_getevents(2) waiting for the last wakeup to come
and it never would.
We ran the tests on x86_64 SMP. The hang only occurred on a Xeon box
("Clovertown") but not a Core2Duo ("Conroe"). On the Xeon, the L2 cache isn't
shared between all eight processors, but is L2 is shared between between all
two processors on the Core2Duo we use.
My analysis of the hang is if you go down to the second while-loop
in read_events(), what happens on processor #1:
1) add_wait_queue_exclusive() adds thread to ctx->wait
2) aio_read_evt() to check tail
3) if aio_read_evt() returned 0, call [io_]schedule() and sleep
In aio_complete() with processor #2:
A) info->tail = tail;
B) waitqueue_active(&ctx->wait)
C) if waitqueue_active() returned non-0, call wake_up()
The way the code is written, step 1 must be seen by all other processors
before processor 1 checks for pending events in step 2 (that were recorded by
step A) and step A by processor 2 must be seen by all other processors
(checked in step 2) before step B is done.
The race I believed I was seeing is that steps 1 and 2 were
effectively swapped due to the __list_add() being delayed by the L2
cache not shared by some of the other processors. Imagine:
proc 2: just before step A
proc 1, step 1: adds to ctx->wait, but is not visible by other processors yet
proc 1, step 2: checks tail and sees no pending events
proc 2, step A: updates tail
proc 1, step 3: calls [io_]schedule() and sleeps
proc 2, step B: checks ctx->wait, but sees no one waiting, skips wakeup
so proc 1 sleeps indefinitely
My patch adds a memory barrier between steps A and B. It ensures that the
update in step 1 gets seen on processor 2 before continuing. If processor 1
was just before step 1, the memory barrier makes sure that step A (update
tail) gets seen by the time processor 1 makes it to step 2 (check tail).
Before the patch our AIO process would hang virtually 100% of the time. After
the patch, we have yet to see the process ever hang.
Signed-off-by: Quentin Barnes <qbarnes+linux@yahoo-inc.com> Reviewed-by: Zach Brown <zach.brown@oracle.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: <stable@kernel.org> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ We should probably disallow that "if (waitqueue_active()) wake_up()"
coding pattern, because it's so often buggy wrt memory ordering ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix a long-standing typo (predating git) that will cause data corruption if a
journal data block needs unescaping. At the moment the wrong buffer head's
data is being unescaped.
To test this case mount a filesystem with data=journal, start creating and
deleting a bunch of files containing only JFS_MAGIC_NUMBER (0xc03b3998), then
pull the plug on the device. Without this patch the files will contain zeros
instead of the correct data after recovery.
Signed-off-by: Duane Griffin <duaneg@dghda.com> Acked-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix a long-standing typo (predating git) that will cause data corruption if a
journal data block needs unescaping. At the moment the wrong buffer head's
data is being unescaped.
To test this case mount a filesystem with data=journal, start creating and
deleting a bunch of files containing only JBD2_MAGIC_NUMBER (0xc03b3998), then
pull the plug on the device. Without this patch the files will contain zeros
instead of the correct data after recovery.
Signed-off-by: Duane Griffin <duaneg@dghda.com> Acked-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A read request outside i_size will be handled in do_generic_file_read(). So
we just return 0 to avoid getting -EIO as normal reading, let
do_generic_file_read do the rest.
At the same time we need unlock the page to avoid system stuck.
Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Acked-by: Jan Kara <jack@suse.cz> Report-by: Christian Perle <chris@linuxinfotag.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Size of the netlink skb was wrongly computed because the formula was using
NLMSG_ALIGN instead of NLMSG_SPACE. NLMSG_ALIGN does not add the room for
netlink header as NLMSG_SPACE does. This was causing a failure of message
building in some cases.
On my test system, all messages for packets in range [8*k+41, 8*k+48] where k
is an integer were invalid and the corresponding packets were dropped.
Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
xt_time_match() in net/netfilter/xt_time.c in kernel 2.6.24 never
matches on Sundays. On my host I have a rule like
iptables -A OUTPUT -m time --weekdays Sun -j REJECT
and it never matches. The problem is in localtime_2(), which uses
r->weekday = (4 + r->dse) % 7;
to map the epoch day onto a weekday in {0,...,6}. In particular this
gives 0 for Sundays. But 0 has to be wrong; a weekday of 0 can never
match. xt_time_match() has
if (!(info->weekdays_match & (1 << current_time.weekday)))
return false;
and when current_time.weekday = 0, the result of the & is always
zero, even when info->weekdays_match = XT_TIME_ALL_WEEKDAYS = 0xFE.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
J. Bruce Fields [Fri, 14 Mar 2008 23:37:11 +0000 (19:37 -0400)]
nfsd: fix oops on access from high-numbered ports
This bug was always here, but before my commit 6fa02839bf9412e18e77
("recheck for secure ports in fh_verify"), it could only be triggered by
failure of a kmalloc(). After that commit it could be triggered by a
client making a request from a non-reserved port for access to an export
marked "secure". (Exports are "secure" by default.)
The result is a struct svc_export with a reference count one too low,
resulting in likely oopses next time the export is accessed.
The reference counting here is not straightforward; a later patch will
clean up fh_verify().
Thanks to Lukas Hejtmanek for the bug report and followup.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Lukas Hejtmanek <xhejtman@ics.muni.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix a hard to trigger crash seen in the -rt kernel that also affects
the vanilla scheduler.
There is a race condition between schedule() and some dequeue/enqueue
functions; rt_mutex_setprio(), __setscheduler() and sched_move_task().
When scheduling to idle, idle_balance() is called to pull tasks from
other busy processor. It might drop the rq lock. It means that those 3
functions encounter on_rq=0 and running=1. The current task should be
put when running.
The current process of CPU1(P1) is scheduling. Deactivated P1, and the
scheduler looks for another process on other CPU's runqueue because CPU1
will be idle. idle_balance(), load_balance_newidle() and
double_lock_balance() are called and double_lock_balance() could drop
the rq lock. On the other hand, CPU0 is trying to boost the priority of
P1. The result of boosting only P1's prio and sched_class are changed to
RT. The sched entities of P1 and P1's group are never put. It makes
cfs_rq invalid, because the cfs_rq has curr and no leaf, but
pick_next_task_fair() is called, then the kernel panics.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
[chrisw@sous-sol.org: backport to 2.6.24.3] Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Don't oops if NumPhys==0, instead return -ENODEV.
This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=9909
Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl> Acked-by: Eric Moore <Eric.Moore@lsi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The recent patch named:
[SCSI] gdth: !use_sg cleanup and use of scsi accessors
has done a bad job in handling internal commands issued by gdth_execute().
Internal commands are issued with device gdth_cmd_str ready made directly
to the card, without any mapping or translations of scsi commands. So here
I added a gdth_cmd_str pointer to the gdth_cmndinfo private structure which
is then copied directly to host.
following this patch is a cleanup that removes the home cooked accessors
and reverts them to regular scsi_cmnd accessors. Since they are not used
anymore. After review maybe the 2 patches should be squashed together.
FIXME: There is still a problem with gdth_get_info(). as reported there
is a WARN_ON trigerd in dma_free_coherent() when doing:
$ cat /proc/sys/gdth/0
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Tested-by: Joerg Dorchain: <joerg@dorchain.net> Tested-by: Stefan Priebe <s.priebe@allied-internet.ag> Tested-by: Jon Chelton <jchelton@ffpglobal.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
gdth_exit would first remove all cards then stop the timer
and would not sync with the timer function. This caused a crash
in gdth_timer() when module was unloaded.
So del_timer_sync the timer before we delete the cards.
also the reboot notifier function would crash. So clean
that up and fix the crashes.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Tested-by: Joerg Dorchain: <joerg@dorchain.net> Tested-by: Stefan Priebe <s.priebe@allied-internet.ag> Tested-by: Jon Chelton <jchelton@ffpglobal.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Oddly enough, unsigned int c = '\300'; puts a "negative" value in c, not
0300... This fixes the default unicode compose table by using integers
instead of character constants.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fold in upstream commit 10a7f3135ac4937a3dc8ed11614a2b70cbd44728 (Build
fix for drivers/s390/char/defkeymap.c) from Tony Breeds.
x86: don't use P6_NOPs if compiling with CONFIG_X86_GENERIC
P6_NOPs are definitely not supported on some VIA CPUs, and possibly
(unverified) on AMD K7s. It is also the only thing that prevents a
686 kernel from running on Transmeta TM3x00/5x00 (Crusoe) series.
The performance benefit over generic NOPs is very small, so when
building for generic consumption, avoid using them.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[cebbert@redhat.com: backport take 2, with parens this time] Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When sending a SCSI command to a tape drive via the SCSI Generic (sg)
driver, if the command has a data transfer length more than
scatter_elem_sz (32 KB default) and not a multiple of 512, then I either
hit BUG_ON(!valid_dma_direction(direction)) in dma_unmap_sg() or else
the command never completes (depending on the LLDD).
When constructing scatterlists, the sg driver rounds up the scatterlist
element sizes to be a multiple of 512. This can result in
sum(scatterlist lengths) > bufflen. In this case, scsi_req_map_sg()
incorrectly sets bio->bi_size to sum(scatterlist lengths) rather than to
bufflen. When the command completes, req_bio_endio() detects that
bio->bi_size != 0, and so it doesn't call bio_endio(). This causes the
command to be resubmitted, resulting in BUG_ON or the command never
completing.
This patch makes scsi_req_map_sg() set bio->bi_size to bufflen rather
than to sum(scatterlist lengths), which fixes the problem.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Acked-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
USB: ehci: Fixes completion for multi-qtd URB the short read case
When use of urb->status in the EHCI driver was reworked last August
(commit 14c04c0f88f228fee1f412be91d6edcb935c78aa), a bug was inserted
in the handling of early completion for bulk transactions that need
more than one qTD (e.g. more than 20KB in one URB).
This patch resolves that problem by ensuring that the early completion
status is preserved until the URB is handed back to its submitter,
instead of resetting it after each qTD.
Signed-off-by: Misha Zhilin <misha@epiphan.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
VT notifier callbacks need to be aware of console switches. This is already
partially done from console_callback(), but at that time fg_console, cursor
positions, etc. are not yet updated and hence screen readers fetch the old
values.
This adds an update notify after all of the values are updated in
redraw_screen(vc, 1).
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When the page is not up to date, ecryptfs_prepare_write() should be
acting much like ecryptfs_readpage(). This includes the painfully
obvious step of actually decrypting the page contents read from the
lower encrypted file.
Note that this patch resolves a bug in eCryptfs in 2.6.24 that one can
produce with these steps:
Initialize 'ack' to zero in case the descriptor has been recycled.
Prevents "kernel BUG at crypto/async_tx/async_xor.c:185!"
Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Shannon Nelson <shannon.nelson@intel.com>
[chrisw@sous-sol.org: backport to 2.6.24.3] Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The atmel_spi driver does not initialize clock polarity correctly (except for
at91rm9200 CS0 channel) in some case.
The atmel_spi driver uses gpio-controlled chipselect. OTOH spi clock signal
is controlled by CSRn.CPOL bit, but this register controls clock signal
correctly only in 'real transfer' duration. At the time of cs_activate()
call, CSRn.CPOL will be initialized correctly, but the controller do not know
which channel is to be used next, so clock signal will stay at the inactive
state of last transfer. If clock polarity of new transfer and last transfer
was differ, new transfer will start with wrong clock signal state.
For example, if you started SPI MODE 2 or 3 transfer after SPI MODE 0 or 1
transfer, the clock signal state at the assertion of chipselect will be low.
Of course this will violates SPI transfer.
This patch is short term solution for this problem. It makes all CSRn.CPOL
match for the transfer before activating chipselect. For longer term, the
best fix might be to let NPCS0 stay selected permanently in MR and overwrite
CSR0 with to the new slave's settings before asserting CS.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Acked-by: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Michael Buesch [Fri, 29 Feb 2008 11:55:41 +0000 (12:55 +0100)]
b43: Backport bcm4311 fix
This is a backport of upstream commit 013978b6 ("b43: Changes to enable
BCM4311 rev 02 with wireless core revision 13") and the changes include
the following:
(1) Add the 802.11 rev 13 device to the ssb_device_id table to load b43.
(2) Add PHY revision 9 to the supported list.
(3) Change the 2-bit routing code for address extensions to 0b10 rather
than the 0b01 used for the 32-bit case.
(4) Remove some magic numbers in the DMA setup.
The DMA implementation for this chip supports full 64-bit addressing with
one exception. Whenever the Descriptor Ring Buffer is in high memory, a
fatal DMA error occurs. This problem was not present in 2.6.24-rc2 due
to code to "Bias the placement of kernel pages at lower PFNs". When
commit 44048d70 reverted that code, the DMA error appeared. As a "fix",
use the GFP_DMA flag when allocating the buffer for 64-bit DMA. At present,
this problem is thought to arise from a hardware error.
Signed-off-by: Michael Buesch <mb@bu3sch.de> Cc: Larry Finger <Larry.Finger@lwfinger.net> Cc: John W. Linville <linville@tuxdriver.com> Cc: Alexey Zaytsev <alexey.zaytsev@gmail.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mike Pagano [Thu, 28 Feb 2008 00:35:01 +0000 (19:35 -0500)]
arcmsr: fix IRQs disabled warning spew
As of 2.6.24, running the archttp passthrough daemon with the arcmsr
driver produces an endless spew of dma_free_coherent warnings:
WARNING: at arch/x86/kernel/pci-dma_64.c:169 dma_free_coherent()
It turns out that coherent memory is not needed, so commit 76d78300 by
Nick Cheng <nick.cheng@areca.com.tw> switched it to kmalloc (as well as
making a lot of other changes which have not been included here).
James Bottomley pointed out that the new kmalloc usage was also wrong,
I corrected this in commit 69e562c2.
This patch combines both of the above for the purpose of fixing 2.6.24.
details in http://bugs.gentoo.org/208493.
Signed-off-by: Daniel Drake <dsd@gentoo.org> Cc: Nick Cheng <nick.cheng@areca.com.tw> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Auke Kok [Tue, 12 Feb 2008 23:20:24 +0000 (15:20 -0800)]
e1000e: Fix CRC stripping in hardware context bug
CRC stripping was only correctly enabled for packet split recieves
which is used when receiving jumbo frames. Correctly enable SECRC
also for normal buffer packet receives.
Tested by Andy Gospodarek and Johan Andersson, see bugzilla #9940.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com> Signed-off-by: Jeff Garzik <jeff@garzik.org> Cc: Mike Pagano <mike@mpagano.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thanks to Loic Prylli <loic@myri.com>, who originally proposed
this idea.
Always using legacy configuration mechanism for the legacy config space
and extended mechanism (mmconf) for the extended config space is
a simple and very logical approach. It's supposed to resolve all
known mmconf problems. It still allows per-device quirks (tweaking
dev->cfg_size). It also allows to get rid of mmconf fallback code.
This patch fixes a boot hang on Intel Q35 chipset as detailed at:
http://bugs.gentoo.org/198810
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Pagano <mpagano@gentoo.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On alpha, ia64 and ppc64 only relocations to local data can go into
read-only sections. The vast majority of module parameters use the global
generic param_set_*/param_get_* functions, so the 'const' attribute for
struct kernel_param is not only useless, but it also causes compile
failures due to 'section type conflict' in those rare cases where
param_set/get are local functions.
This fixes http://bugzilla.kernel.org/show_bug.cgi?id=8964
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Cc: Tony Luck <tony.luck@intel.com> Cc: Anton Blanchard <anton@samba.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Adrian Bunk <bunk@stusta.de> Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Pagano <mpagano@gentoo.org>
[chrisw@sous-sol.org: backport to 2.6.24.3] Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When masking, mask out the modes that are unsupported not the ones
that are supported. This makes life happier.
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The problem is that struct asc_dvc_var is placed on
shost->hostdata. So if the hostdata is not on an 8 byte boundary, the
advansys crashes. The hostdata is placed on a sizeof(unsigned long)
boundary so the 8 byte boundary is not garanteed with x86_32.
With 2.6.23 and 2.6.24, the hostdata is on an 8 byte boundary by
chance, but with the current git, it's not.
This patch removes overrun_buf static array and use kzalloc.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
FUJITA Tomonori notes:
We thought that 2.6.24 doesn't have this bug, however it does.
Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The function ebt_do_table doesn't take NF_DROP as a verdict from the targets.
Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
http://bugzilla.kernel.org/show_bug.cgi?id=9920
The function skb_make_writable returns true or false.
Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As reported by Tomas Simonaitis <tomas.simonaitis@gmail.com>, inserting new
data in skbs queued over {ip,ip6,nfnetlink}_queue triggers a SKB_LINEAR_ASSERT
in skb_put().
Going back through the git history, it seems this bug is present since at
least 2.6.12-rc2, probably even since the removal of skb_linearize() for
netfilter.
Linearize non-linear skbs through skb_copy_expand() when enlarging them.
Tested by Thomas, fixes bugzilla #9933.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fixes a sequencing bug in spi driver pxa2xx_spi.c in which the chip select
for a transfer may be asserted before the clock polarity is set on the
interface. As a result of this bug, the clock signal may have the wrong
polarity at transfer start, so it may need to make an extra half transition
before the intended clock/data signals begin. (This probably means all
transfers are one bit out of sequence.)
This only occurs on the first transfer following a change in clock polarity
in systems using more than one more than one such polarity. The fix
assures that the clock mode is properly set before asserting chip select.
This bug was introduced in a patch merged on 2006/12/10, kernel 2.6.20.
The patch defines an additional bit in: include/asm-arm/arch-pxa/regs-ssp.h
for 2.6.25 and newer kernels but this addition must be made in:
include/asm-arm/arch-pxa/pxa-regs.h for kernels between 2.6.20 and 2.6.24,
inclusive
Signed-off-by: Ned Forrester <nforrester@whoi.edu> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[chrisw@sous-sol.org: backport to 2.6.24.3] Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When we free a page via free_huge_page and we detect that we are in surplus
the page will be returned to the buddy. After this we no longer own the page.
However at the end free_huge_page we clear out our mapping pointer from
page private. Even where the page is not a surplus we free the page to
the hugepage pool, drop the pool locks and then clear page private. In
either case the page may have been reallocated. BAD.
Make sure we clear out page private before we free the page.
Signed-off-by: Andy Whitcroft <apw@shadowen.org> Acked-by: Adam Litke <agl@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Simplify the uid equivalence check in cap_task_kill(). Anyone can kill a
process owned by the same uid.
Without this patch wireshark is reported to fail.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Andrew G. Morgan <morgan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Not all architectures implement futex_atomic_cmpxchg_inatomic(). The default
implementation returns -ENOSYS, which is currently not handled inside of the
futex guts.
Futex PI calls and robust list exits with a held futex result in an endless
loop in the futex code on architectures which have no support.
Fixing up every place where futex_atomic_cmpxchg_inatomic() is called would
add a fair amount of extra if/else constructs to the already complex code. It
is also not possible to disable the robust feature before user space tries to
register robust lists.
Compile time disabling is not a good idea either, as there are already
architectures with runtime detection of futex_atomic_cmpxchg_inatomic support.
Detect the functionality at runtime instead by calling
cmpxchg_futex_value_locked() with a NULL pointer from the futex initialization
code. This is guaranteed to fail, but the call of
futex_atomic_cmpxchg_inatomic() happens with pagefaults disabled.
On architectures, which use the asm-generic implementation or have a runtime
CPU feature detection, a -ENOSYS return value disables the PI/robust features.
On architectures with a working implementation the call returns -EFAULT and
the PI/robust features are enabled.
The relevant syscalls return -ENOSYS and the robust list exit code is blocked,
when the detection fails.
Fixes http://lkml.org/lkml/2008/2/11/149
Originally reported by: Lennart Buytenhek
When the futex init code fails to initialize the futex pseudo file system it
returns early without initializing the hash queues. Should the boot succeed
then a futex syscall which tries to enqueue a waiter on the hashqueue will
crash due to the unitilialized plist heads.
Ensure that the clock lookup always finds an entry for a specific
device and ID before it falls back to finding just by ID. This
fixes a problem reported by Holger Schurig where the BTUART was
assigned the wrong clock.
Tested-by: Holger Schurig <hs4233@mail.mn-solutions.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Uli Luckas notes:
The patch fixes the otherwise unusable bluetooth uart on pxa25x. The
patch is written by Russell King [1] who also gave his OK for
stable inclusion [2]. The patch is also available as commit a0dd005d1d9f4c3beab52086f3844ef9342d1e67 to Linus' tree.
The exception fixup for the futex macros __futex_atomic_op1/2 and
futex_atomic_cmpxchg_inatomic() is missing an entry when the lock
prefix is replaced by a NOP via SMP alternatives.
Chuck Ebert tracked this down from the information provided in:
https://bugzilla.redhat.com/show_bug.cgi?id=429412
A possible solution would be to add another fixup after the
LOCK_PREFIX, so both the LOCK and NOP case have their own entry in the
exception table, but it's not really worth the trouble.
Simply replace LOCK_PREFIX with lock and keep those untouched by SMP
alternatives.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
[cebbert@redhat.com: backport to 2.6.24] Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The second message is because the driver fails to identify the task
it's being asked to abort. On closer inpection, there's a thinko in
the for each task loop over pending tasks in both the REQ_TASK_ABORT
and REQ_DEVICE_RESET cases where it doesn't look at the task on the
pending list but at the one on the ESCB (which is always NULL).
Fix by looking at the right task. Also add a print for the case where
the pending SCB doesn't have a task attached.
Not sure if this will fix all the problems, but it's a definite first
step.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The spinlock is held over too large a region: pscratch is a permanent
address (it's allocated at boot time and never changes). All you need
the smp lock for is mediating the scratch in use flag, so fix this by
moving the spinlock into the case where we set the pscratch_busy flag
to false.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern [Fri, 22 Feb 2008 22:03:25 +0000 (17:03 -0500)]
usb-storage: don't access beyond the end of the sg buffer
This patch (as1038) fixes a bug in usb_stor_access_xfer_buf() and
usb_stor_set_xfer_buf() (the bug was originally found by Boaz
Harrosh): The routine must not attempt to write beyond the end of a
scatter-gather list or beyond the number of bytes requested.
I added a nasty local variable shadowing bug to fuse in 2.6.24, with the
result, that the 'default_permissions' mount option is basically ignored.
How did this happen?
- old err declaration in inner scope
- new err getting declared in outer scope
- 'return err' from inner scope getting removed
- old declaration not being noticed
-Wshadow would have saved us, but it doesn't seem practical for
the kernel :(
More testing would have also saved us :((
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The XTS blockmode uses a copy of the IV which is saved on the stack
and may or may not be properly aligned. If it is not, it will break
hardware cipher like the geode or padlock.
This patch encrypts the IV in place so we don't have to worry about
alignment.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc> Tested-by: Stefan Hellermann <stefan@the2masters.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When using aes-xcbc-mac for authentication in IPsec,
the kernel crashes. It seems this algorithm doesn't
account for the space IPsec may make in scatterlist for authtag.
Thus when crypto_xcbc_digest_update2() gets called,
nbytes may be less than sg[i].length.
Since nbytes is an unsigned number, it wraps
at the end of the loop allowing us to go back
into loop and causing crash in memcpy.
I used update function in digest.c to model this fix.
Please let me know if it looks ok.
Signed-off-by: Joy Latten <latten@austin.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Acked-by: Mark Salyzyn <mark_salyzyn@adaptec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
FUJITA Tomonori notes:
It didn't intend to fix a critical bug, however, it turned out that it
does. Without this patch, the ips driver in 2.6.23 and 2.6.24 doesn't
work at all. You can find the more details at the following thread:
Its previous use in a call to on_each_cpu() was pointless, as at the
time that code gets executed only one CPU is online. Further, the
function can be __cpuinit, and for this to work without
CONFIG_HOTPLUG_CPU setup_nmi() must also get an attribute (this one
can even be __init; on 64-bits check_timer() also was lacking that
attribute).
Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[ tglx@linutronix.de: backport to 2.6.24.3] Cc: Justin Piszcz <jpiszcz@lucidpixels.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There's a bug in the current implementation of dma_get_required_mask()
where it ands the returned mask with the current device mask. This
rather defeats the purpose if you're using the call to determine what
your mask should be (since you will at that time have the default
DMA_32BIT_MASK). This bug results in any driver that uses this function
*always* getting a 32 bit mask, which is wrong.
Fix by removing the and with dev->dma_mask.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
iov_iter_advance() skips over zero-length iovecs, however it does not properly
terminate at the end of the iovec array. Fix this by checking against
i->count before we skip a zero-length iov.
The bug was reproduced with a test program that continually randomly creates
iovs to writev. The fix was also verified with the same program and also it
could verify that the correct data was contained in the file after each
writev.
Signed-off-by: Nick Piggin <npiggin@suse.de> Tested-by: "Kevin Coffman" <kwc@citi.umich.edu> Cc: "Alexey Dobriyan" <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Aurelien Jarno [Sat, 8 Mar 2008 10:43:52 +0000 (11:43 +0100)]
x86: Clear DF before calling signal handler
x86: Clear DF before calling signal handler
The Linux kernel currently does not clear the direction flag before
calling a signal handler, whereas the x86/x86-64 ABI requires that.
This become a real problem with gcc version 4.3, which assumes that
the direction flag is correctly cleared at the entry of a function.
This patches changes the setup_frame() functions to clear the
direction before entering the signal handler.