This fixes an incorrect assumption (BUG_ON) made in
cfg80211 when handling country IE regulatory requests.
The assumption was that we won't try to call_crda()
twice for the same event and therefore we will not
recieve two replies through nl80211 for the regulatory
request. As it turns out it is true we don't call_crda()
twice for the same event, however, kobject_uevent_env()
*might* send the udev event twice and/or userspace can
simply process the udev event twice. We remove the BUG_ON()
and simply ignore the duplicate request.
Reported-by: Maxim Levitsky <maximlevitsky@gmail.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The minstrel rate controller periodically looks up rate indexes in
a sampling table. When accessing a specific row and column, minstrel
correctly does a bounds check which, on the surface, appears to handle
the case where mi->n_rates < 2. However, mi->sample_idx is actually
defined as an unsigned, so the right hand side is taken to be a huge
positive number when negative, and the check will always fail.
Consequently, the RC will overrun the array and cause random memory
corruption when communicating with a peer that has only a single rate.
The max value of mi->sample_idx is around 25 so casting to int should
have no ill effects.
Without the change, uptime is a few minutes under load with an AP
that has a single hard-coded rate, and both the AP and STA could
potentially crash. With the change, both lasted 12 hours with a
steady load.
Thanks to Ognjen Maric for providing the single-rate clue so I could
reproduce this.
This fixes http://bugzilla.kernel.org/show_bug.cgi?id=12490 on the
regression list (also http://bugzilla.kernel.org/show_bug.cgi?id=13000).
Reported-by: Sergey S. Kostyliov <rathamahata@gmail.com> Reported-by: Ognjen Maric <ognjen.maric@gmail.com> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On systems where CONFIG_SHMEM is disabled, mounting tmpfs filesystems can
fail when tmpfs options are used. This is because tmpfs creates a small
wrapper around ramfs which rejects unknown options, and ramfs itself only
supports a tiny subset of what tmpfs supports. This makes it pretty hard
to use the same userspace systems across different configuration systems.
As such, ramfs should ignore the tmpfs options when tmpfs is merely a
wrapper around ramfs.
This used to work before commit c3b1b1cbf0 as previously, ramfs would
ignore all options. But now, we get:
ramfs: bad mount option: size=10M
mount: mounting mdev on /dev failed: Invalid argument
Another option might be to restore the previous behavior, where ramfs
simply ignored all unknown mount options ... which is what Hugh prefers.
x86 stack traces are a piece of crap without frame pointers, and its not
like the 'performance gain' of not having stack pointers matters when you
selected lockdep.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <new-submission> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I unfortunately accepted a patch time ago, to drop the "current" usage
from possible IRQ context, w/out proper thought over it. The patch
switched to using the CPU id by bounding the nested call callback with a
get_cpu()/put_cpu().
Unfortunately the ep_call_nested() function can be called with a callback
that grabs sleepy locks (from own f_op->poll()), that results in epic
fails. The following patch uses the proper "context" depending on the
path where it is called, and on the kind of callback.
This has been reported by Stefan Richter, that has also verified the patch
is his previously failing environment.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The ConnectX Programmer's Reference Manual states that the "SO" bit
must be set when posting Fast Register and Local Invalidate send work
requests. When this bit is set, the work request will be executed
only after all previous work requests on the send queue have been
executed. (If the bit is not set, Fast Register and Local Invalidate
WQEs may begin execution too early, which violates the defined
semantics for these operations)
This fixes the issue with NFS/RDMA reported in
<http://lists.openfabrics.org/pipermail/general/2009-April/059253.html>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Without this, the default implementation is a no op which is completely
wrong with a VIVT cache, and usage of sg_copy_buffer() produces
unpredictable results.
Tested-by: Sebastian Andrzej Siewior <bigeasy@breakpoint.cc> Signed-off-by: Nicolas Pitre <nico@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
timer interrupts are excluded from being disabled during suspend. The
clock events code manages the disabling of clock events on its own
because the timer interrupt needs to be functional before the resume
code reenables the device interrupts.
The hpet per cpu timers request their interrupt without setting the
IRQF_TIMER flag so suspend_device_irqs() disables them as well which
results in a fatal resume failure on the boot CPU.
Adding IRQF_TIMER to the interupt flags when requesting the hpet per
cpu timer interrupts solves the problem.
Reported-by: Benjamin S. <sbenni@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Benjamin S. <sbenni@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The commit f9e336f65b666b8f1764d17e9b7c21c90748a37e
ALSA: hda - Unify capture mixer creation in realtek codes
removed the "Input Source" mixer element creation for toshiba-s06 model
because it contains a digital-mic input.
The PCM pointer callback sometimes returns invalid positions and this
screws up the hw_ptr updater in PCM core. Especially since now the
jiffies check is optional with xrun_debug, the invalid position is
handled as is, and causes serious sound skips, etc.
This patch simplifies the position-fix strategy in intel8x0 to be more
robust:
- just falls back to the last position if bogus position is detected
- another sanity check for the backward move of the position due to
a race of register update and the base-index update
This patch is applicable also for 2.6.30.
Tested-by: David Miller <davem@davemloft.net> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On a system where system memory (according e820) is not covered by
mtrr, mtrr_trim_memory converts a portion of memory to reserved, but
bootloader has already put the initrd in that range.
Thus, we need to have 64bit to use relocate_initrd too.
[ Impact: fix using initrd when mtrr_trim_memory happen ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The initialization of the UV Broadcast Assist Unit's sending
buffers was making an invalid assumption about the
initialization of an MMR that defines its address.
The BIOS will not be providing that MMR. So
uv_activation_descriptor_init() should unconditionally set it.
The *fence instructions were moved to vsyscall_64.c by commit cb9e35dce94a1b9c59d46224e8a94377d673e204. But this breaks the
vDSO, because vread methods are also called from there.
Besides, the synchronization might be unnecessary for other
time sources than TSC.
[ Impact: fix potential time warp in VDSO ]
Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
LKML-Reference: <9d0ea9ea0f866bdc1f4d76831221ae117f11ea67.1243241859.git.ptesarik@suse.cz> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The current code to set up the GART as an IOMMU enables GART
translations before it removes the aperture from the kernel memory
map, sets the GART PTEs to UC, sets up the guard and scratch
pages, or does a wbinvd(). This leaves the possibility of cache
aliasing open and can cause system crashes.
Re-order the code so as to enable the GART translations only
after all safeguards are in place and the tlb has been flushed.
AMD has tested this patch on both Istanbul systems and 1st
generation Opteron systems with APG enabled and seen no adverse
effects. Istanbul systems with HT Assist enabled sometimes
see MCE errors due to cache artifacts with the unmodified
code.
Fix bug in the SGI UV macros that support systems with multiple
coherency domains. The macros used for referencing global MMR
(chipset registers) are failing to correctly "or" the NASID
(node identifier) bits that reside above M+N. These high bits
are supplied automatically by the chipset for memory accesses
coming from the processor socket.
However, the bits must be present for references to the special
global MMR space used to map chipset registers. (See uv_hub.h
for more details ...)
The bug results in references to invalid/incorrect nodes.
The UV tlb shootdown code has a serious initialization error.
An array of structures [32*8] is initialized as if it were [32].
The array is indexed by (cpu number on the blade)*8, so the short
initialization works for up to 4 cpus on a blade.
But above that, we provide an invalid opcode to the hub's
broadcast assist unit.
This patch changes the allocation of the array to use its symbolic
dimensions for better clarity. And initializes all 32*8 entries.
Shortened 'UV_ACTIVATION_DESCRIPTOR_SIZE' to 'UV_ADP_SIZE' per Ingo's
recommendation.
Using gcc 3.3.5 a "make allmodconfig" + "CONFIG_KVM=n"
triggers a build error:
arch/x86/mm/built-in.o(.init.text+0x43f7): In function `__change_page_attr':
arch/x86/mm/pageattr.c:114: undefined reference to `__udivdi3'
make: *** [.tmp_vmlinux1] Error 1
The culprit turned out to be a division in arch/x86/mm/memtest.c
For more info see this thread:
The Blackfin serial driver never initialized the spin_lock that is part of
the serial core structure, but we never noticed because spin_lock's are
rarely enabled on UP systems. Yeah lockdep and friends.
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit fec1878fe952b994125a3be7c94b1322db586f3b caused a regression in
which contiguous blocks being allocated to the end of an extent were
getting a new extent created. This typically results in files entirely
made up of 1-block extents even though the blocks are contiguous on
disk.
Apparently grub doesn't handle a jfs file being fragmented into too many
extents, since it refuses to boot a kernel from jfs that was created by
the 2.6.30 kernel.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Reported-by: Alex <alevkovich@tut.by> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Please note that this patch attached has been BACKPORTED to fit kernel
2.6.30.y
Since i2c autoprobing is no longer supported by v4l2 we need to make sure
that the i2c modules are linked before the v4l2 modules. The v4l2 modules
now rely on the presence of the i2c modules, so these must have initialized
themselves before the v4l2 modules.
The exception is the ir-kbd-i2c module, which is the only one still using
autoprobing. This one should be loaded at the end of the v4l2 module. Loading
it earlier actually causes problems with tveeprom. Once ir-kbd-i2c is no
longer autoprobing, then it has to move up as well.
This is only an issue when everything is compiled into the kernel.
Thanks to Marcus Swoboda for reporting this and Udo Steinberg for testing
this patch.
Tested-by: Udo A. Steinberg <udo@hypervisor.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The cx25840 module's VBI initialization logic uses the current video
standard as part of its internal algorithm. This therefore means that
we probably need to make sure that the correct video standard has been
set before initializing VBI. (Normally we would not care about VBI,
but as described in an earlier changeset, VBI must be initialized
correctly on the cx25840 in order for the chip's hardware scaler to
operate correctly.)
It's kind of messy to force the video standard to be set before
initializing VBI (mainly because we can't know what the app really
wants that early in the initialization process). So this patch does
the next best thing: VBI is re-initialized after any point where the
video standard has been set.
Signed-off-by: Mike Isely <isely@pobox.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The cx25840 module requires that its VBI initialization entry point be
called in order for hardware-scaled video capture to work properly -
even if we don't care about VBI. Making this behavior even more
subtle is that if the capture resolution is set to 720x480 - which is
the default that the pvrusb2 driver sets up - then the cx25840
bypasses the hardware scaler. Therefore this problem does not
manifest itself until some other resolution, e.g. 640x480, is tried.
MythTV typically defaults to 640x480 or 480x480, which means that
things break whenever the driver is used with MythTV.
This all has been known for a while (since at least Nov 2006), but
recent changes in the pvrusb2 driver (specifically in regards to
sub-device support) caused this to break again. VBI initialization
must happen *after* the chip's firmware is loaded, not before. With
this fix, 24xxx devices work correctly again.
A related fix that is part of this changeset is that now we
re-initialize VBI any time after we issue a reset to the cx25840
driver. Issuing a chip reset erases the state that the VBI setup
previously did. Until the HVR-1950 came along this subtlety went
unnoticed, because the pvrusb2 driver previously never issued such a
reset. But with the HVR-1950 we have to do that reset in order to
correctly transition from digital back to analog mode - and since the
HVR-1950 always starts in digital mode (required for the DVB side to
initialize correctly) then this device has never had a chance to work
correctly in analog mode! Analog capture on the HVR-1950 has been
broken this *ENTIRE* time. I had missed it until now because I've
usually been testing at the default 720x480 resolution which does not
require scaling... What fun. By re-initializing VBI after a cx25840
chip reset, correct behavior is restored.
Signed-off-by: Mike Isely <isely@pobox.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A previous change (v4l2-common: remove v4l2_ctrl_query_fill_std) broke
the handling of class controls in VIDIOC_QUERYCTRL. The MPEG class control
was broken for all drivers that use the cx2341x module and the USER class
control was broken for ivtv and cx18.
This change adds back proper class control support.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Steve Holland pointed out that we forgot to call break; in the switch
statment. This probably resolves a lot of the bug reports I've gotten
for the driver lately.
Stupid me...
Reported-by: Steve Holland <sdh4@iastate.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Booting a 32-bit kernel on Magny-Cours results in the following panic:
...
Using APIC driver default
...
Overriding APIC driver with bigsmp
...
Getting VERSION: 80050010
Getting VERSION: 80050010
Getting ID: 10000000
Getting ID: ef000000
Getting LVT0: 700
Getting LVT1: 10000
Kernel panic - not syncing: Boot APIC ID in local APIC unexpected (16 vs 0)
Pid: 1, comm: swapper Not tainted 2.6.30-rcX #2
Call Trace:
[<c05194da>] ? panic+0x38/0xd3
[<c0743102>] ? native_smp_prepare_cpus+0x259/0x31f
[<c073b19d>] ? kernel_init+0x3e/0x141
[<c073b15f>] ? kernel_init+0x0/0x141
[<c020325f>] ? kernel_thread_helper+0x7/0x10
The reason is that default_get_apic_id handled extension of local APIC
ID field just in case of XAPIC.
Thus for this AMD CPU, default_get_apic_id() returns 0 and
bigsmp_get_apic_id() returns 16 which leads to the respective kernel
panic.
This patch introduces a Linux specific feature flag to indicate
support for extended APIC id (8 bits instead of 4 bits width) and sets
the flag on AMD CPUs if applicable.
In moxa.c there are 32 minor numbers reserved for each device. The number
of ports actually available per device is stored in
moxa_board_conf->numPorts. This number is not considered in moxa_open().
Opening a port that is not available results in a kernel oops. This patch
adds a test to moxa_open() that prevents opening unavailable ports.
Some device drivers map the same physical address multiple times to a
dma address. Without an IOMMU this results in the same dma address being
put into the dma-debug hash multiple times. With a first-fit match in
hash_bucket_find() this function may return the wrong dma_debug_entry.
This can result in false positive warnings. This patch fixes it by
changing the first-fit behavior of hash_bucket_find() into a best-fit
algorithm.
Some users still load bond module multiple times to create bonding
devices. This accidentally was broken by a later patch about
the time sysfs was fixed. According to Jay, it was broken
by:
commit b8a9787eddb0e4665f31dd1d64584732b2b5d051
Author: Jay Vosburgh <fubar@us.ibm.com>
Date: Fri Jun 13 18:12:04 2008 -0700
bonding: Allow setting max_bonds to zero
Note: sysfs and procfs still produce WARN() messages when this is done
so the sysfs method is the recommended API.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It is possible for tun_chr_close to race with dellink on the
a tun device. In which case if __tun_get runs before dellink
but dellink runs before tun_chr_close calls unregister_netdevice
we will attempt to unregister the netdevice after it is already
gone.
The two cases are already serialized on the rtnl_lock, so I have
gone for the cheap simple fix of moving rtnl_lock to cover __tun_get
in tun_chr_close. Eliminating the possibility of the tun device
being unregistered between __tun_get and unregister_netdevice in
tun_chr_close.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Tested-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The code to compute VPD size didn't handle some systems that use
chip without VPD. Also some of the newer chips use some additional
registers to store the actual size, and wasn't worth putting the
additional complexity in, so just remove the code.
No big loss since the code to set the VPD size was only a
convenience so that utilities would not read the extra space past
the end of the available VPD.
Move the first PCI config read earlier to detect bad hardware
where it returns all ones and refuse loading driver before furthur
damage.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Tested-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When route caching is disabled (rt_caching returns false), We still use route
cache entries that are created and passed into rt_intern_hash once. These
routes need to be made usable for the one call path that holds a reference to
them, and they need to be reclaimed when they're finished with their use. To be
made usable, they need to be associated with a neighbor table entry (which they
currently are not), otherwise iproute_finish2 just discards the packet, since we
don't know which L2 peer to send the packet to. To do this binding, we need to
follow the path a bit higher up in rt_intern_hash, which calls
arp_bind_neighbour, but not assign the route entry to the hash table.
Currently, if caching is off, we simply assign the route to the rp pointer and
are reutrn success. This patch associates us with a neighbor entry first.
Secondly, we need to make sure that any single use routes like this are known to
the garbage collector when caching is off. If caching is off, and we try to
hash in a route, it will leak when its refcount reaches zero. To avoid this,
this patch calls rt_free on the route cache entry passed into rt_intern_hash.
This places us on the gc list for the route cache garbage collector, so that
when its refcount reaches zero, it will be reclaimed (Thanks to Alexey for this
suggestion).
I've tested this on a local system here, and with these patches in place, I'm
able to maintain routed connectivity to remote systems, even if I set
/proc/sys/net/ipv4/rt_cache_rebuild_count to -1, which forces rt_caching to
return false.
Signed-off-by: Neil Horman <nhorman@redhat.com> Reported-by: Jarek Poplawski <jarkao2@gmail.com> Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I recently got a report of an oops on a route lookup. Maxime was
testing what would happen if route caching was turned off (doing so by setting
making rt_caching always return 0), and found that it triggered an oops. I
looked at it and found that the problem stemmed from the fact that the route
lookup routines were returning success from their lookup paths (which is good),
but never set the **rp pointer to anything (which is bad). This happens because
in rt_intern_hash, if rt_caching returns false, we call rt_drop and return 0.
This almost emulates slient success. What we should be doing is assigning *rp =
rt and _not_ dropping the route. This way, during slow path lookups, when we
create a new route cache entry, we don't immediately discard it, rather we just
don't add it into the cache hash table, but we let this one lookup use it for
the purpose of this route request. Maxime has tested and reports it prevents
the oops. There is still a subsequent routing issue that I'm looking into
further, but I'm confident that, even if its related to this same path, this
patch makes sense to take.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
e1000e: commonize tx cleanup routine to match e1000 & igb
changed the logic for determining if we should call napi_complete or
not at then end of a napi poll.
If the NIC is using MSI-X with no work to do in ->poll, net_rx_action
can just spin indefinitely on older kernels and for 2 jiffies on newer
kernels since napi_complete is never called and budget isn't
decremented.
Discovered and verified while testing driver backport to an older
kernel.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If IMA tried to measure a file which was larger than 4G dentry_open would fail
with -EOVERFLOW since IMA wasn't passing O_LARGEFILE. This patch passes
O_LARGEFILE to all IMA opens to avoid this problem.
Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Currently IMA does not handle failures from dentry_open(). This means that we
leave a pointer set to ERR_PTR(errno) and then try to use it just a few lines
later in fput(). Oops.
Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Matt T. Yourst notes that kvm_arch_vcpu_ioctl_set_sregs lacks validity
checking for the new cr3 value:
"Userspace callers of KVM_SET_SREGS can pass a bogus value of cr3 to
the kernel. This will trigger a NULL pointer access in gfn_to_rmap()
when userspace next tries to call KVM_RUN on the affected VCPU and kvm
attempts to activate the new non-existent page table root.
This happens since kvm only validates that cr3 points to a valid guest
physical memory page when code *inside* the guest sets cr3. However, kvm
currently trusts the userspace caller (e.g. QEMU) on the host machine to
always supply a valid page table root, rather than properly validating
it along with the rest of the reloaded guest state."
If a slots guest physical address and host virtual address unequal (mod
large page size), then we would erronously try to back guest large pages
with host large pages. Detect this misalignment and diable large page
support for the trouble slot.
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
VT-x needs an explicit MC vector intercept to handle machine checks in
the hypervisor.
It also has a special option to catch machine checks that happen
during VT entry.
Do these interceptions and forward them to the Linux machine check
handler. Make it always look like user space is interrupted because
the machine check handler treats kernel/user space differently.
Thanks to Huang Ying and Jiang Yunhong for help and testing.
Cc: ying.huang@intel.com
v2: Handle machine checks still in interrupt off context
to avoid problems on preemptible kernels.
v3: Handle old style 32bit and make fully standalone
Some filesystems can call in to sync an inode that is still in the
I_NEW state (eg. ext family, when mounted with -osync). This is OK
because the filesystem has sole access to the new inode, so it can
modify i_state without races (because no other thread should be
modifying it, by definition of I_NEW). Ie. a false positive, so
remove the warnings.
Peer reported:
| The bug is introduced from kernel 2.6.27, if E820 table reserve the memory
| above 4G in 32bit OS(BIOS-e820: 00000000fff80000 - 0000000120000000
| (reserved)), system will report Int 6 error and hang up. The bug is caused by
| the following code in drivers/firmware/memmap.c, the resource_size_t is 32bit
| variable in 32bit OS, the BUG_ON() will be invoked to result in the Int 6
| error. I try the latest 32bit Ubuntu and Fedora distributions, all hit this
| bug.
|======
|static int firmware_map_add_entry(resource_size_t start, resource_size_t end,
| const char *type,
| struct firmware_map_entry *entry)
and it only happen with CONFIG_PHYS_ADDR_T_64BIT is not set.
it turns out we need to pass u64 instead of resource_size_t for that.
[akpm@linux-foundation.org: add comment] Reported-and-tested-by: Peer Chen <pchen@nvidia.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Peter Botha [Wed, 10 Jun 2009 00:16:32 +0000 (17:16 -0700)]
char: mxser, fix ISA board lookup
There's a bug in the mxser kernel module that still appears in the
2.6.29.4 kernel.
mxser_get_ISA_conf takes a ioaddress as its first argument, by passing the
not of the ioaddr, you're effectively passing 0 which means it won't be
able to talk to an ISA card. I have tested this, and removing the !
fixes the problem.
Jan Kara [Tue, 9 Jun 2009 23:26:26 +0000 (16:26 -0700)]
jbd: fix race in buffer processing in commit code
In commit code, we scan buffers attached to a transaction. During this
scan, we sometimes have to drop j_list_lock and then we recheck whether
the journal buffer head didn't get freed by journal_try_to_free_buffers().
But checking for buffer_jbd(bh) isn't enough because a new journal head
could get attached to our buffer head. So add a check whether the journal
head remained the same and whether it's still at the same transaction and
list.
This is a nasty bug and can cause problems like memory corruption (use after
free) or trigger various assertions in JBD code (observed).
Signed-off-by: Jan Kara <jack@suse.cz> Cc: <stable@kernel.org> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ian Kent [Tue, 9 Jun 2009 23:26:24 +0000 (16:26 -0700)]
autofs4: remove hashed check in validate_wait()
The recent ->lookup() deadlock correction required the directory inode
mutex to be dropped while waiting for expire completion. We were
concerned about side effects from this change and one has been identified.
I saw several error messages.
They cause autofs to become quite confused and don't really point to the
actual problem.
Things like:
handle_packet_missing_direct:1376: can't find map entry for (43,1827932)
which is usually totally fatal (although in this case it wouldn't be
except that I treat is as such because it normally is).
do_mount_direct: direct trigger not valid or already mounted
/test/nested/g3c/s1/ss1
which is recoverable, however if this problem is at play it can cause
autofs to become quite confused as to the dependencies in the mount tree
because mount triggers end up mounted multiple times. It's hard to
accurately check for this over mounting case and automount shouldn't need
to if the kernel module is doing its job.
There was one other message, similar in consequence of this last one but I
can't locate a log example just now.
When checking if a mount has already completed prior to adding a new mount
request to the wait queue we check if the dentry is hashed and, if so, if
it is a mount point. But, if a mount successfully completed while we
slept on the wait queue mutex the dentry must exist for the mount to have
completed so the test is not really needed.
Mounts can also be done on top of a global root dentry, so for the above
case, where a mount request completes and the wait queue entry has already
been removed, the hashed test returning false can cause an incorrect
callback to the daemon. Also, d_mountpoint() is not sufficient to check
if a mount has completed for the multi-mount case when we don't have a
real mount at the base of the tree.
Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Tue, 9 Jun 2009 23:26:23 +0000 (16:26 -0700)]
shm: fix unused warnings on nommu
The massive nommu update (8feae131) resulted in these warnings:
ipc/shm.c: In function `sys_shmdt':
ipc/shm.c:974: warning: unused variable `size'
ipc/shm.c:972: warning: unused variable `next'
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
cls_cgroup: Fix oops when user send improperly 'tc filter add' request
r8169: fix crash when large packets are received
Linus Torvalds [Tue, 9 Jun 2009 15:41:22 +0000 (08:41 -0700)]
Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
md/raid5: fix bug in reshape code when chunk_size decreases.
md/raid5 - avoid deadlocks in get_active_stripe during reshape
md/raid5: use conf->raid_disks in preference to mddev->raid_disk
it turns out that we need clear cpus_hardware_enabled in that case.
Reported-and-tested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Minoru Usui [Tue, 9 Jun 2009 11:03:09 +0000 (04:03 -0700)]
cls_cgroup: Fix oops when user send improperly 'tc filter add' request
I found a bug in cls_cgroup_change() in cls_cgroup.c.
cls_cgroup_change() expected tca[TCA_OPTIONS] was set from user space properly,
but tc in iproute2-2.6.29-1 (which I used) didn't set it.
In the current source code of tc in git, it set tca[TCA_OPTIONS].
If we always use a newest iproute2 in git when we use cls_cgroup,
we don't face this oops probably.
But I think, kernel shouldn't panic regardless of use program's behaviour.
Signed-off-by: Minoru Usui <usui@mxm.nes.nec.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 9 Jun 2009 11:01:02 +0000 (04:01 -0700)]
r8169: fix crash when large packets are received
Michael Tokarev reported receiving a large packet could crash
a machine with RTL8169 NIC.
( original thread at http://lkml.org/lkml/2009/6/8/192 )
Problem is this driver tells that NIC frames up to 16383 bytes
can be received but provides skb to rx ring allocated with
smaller sizes (1536 bytes in case standard 1500 bytes MTU is used)
When a frame larger than what was allocated by driver is received,
dma transfert can occurs past the end of buffer and corrupt
kernel memory.
Fix is to tell to NIC what is the maximum size a frame can be.
This bug is very old, (before git introduction, linux-2.6.10), and
should be backported to stable versions.
Reported-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
NeilBrown [Tue, 9 Jun 2009 06:32:22 +0000 (16:32 +1000)]
md/raid5: fix bug in reshape code when chunk_size decreases.
Now that we support changing the chunksize, we calculate
"reshape_sectors" to be the max of number of sectors in old
and new chunk size.
However there is one please where we still use 'chunksize'
rather than 'reshape_sectors'.
This causes a reshape that reduces the size of chunks to freeze.
NeilBrown [Tue, 9 Jun 2009 04:39:59 +0000 (14:39 +1000)]
md/raid5 - avoid deadlocks in get_active_stripe during reshape
md has functionality to 'quiesce' and array so that all pending
IO completed and no new IO starts. This is used to achieve a
stable state before making internal changes.
Currently this quiescing applies equally to normal IO, resync
IO, and reshape IO.
However there is a problem with applying it to reshape IO.
Reshape can have multiple 'stripe_heads' that must be active together.
If the quiesce come between allocating the first and the last of
such a collection, then we deadlock, as the last will not be allocated
until the quiesce is lifted, the quiesce will not be lifted until the
first (which has been allocated) gets used, and that first cannot be
used until the last is allocated.
It is not necessary to inhibit reshape IO when a quiesce is
requested. Those places in the code that require a full quiesce will
ensure the reshape thread is not running at all.
So allow reshape requests to get access to new stripe_heads without
being blocked by a 'quiesce'.
This only affects in-place reshapes (i.e. where the array does not
grow or shrink) and these are only newly supported. So this patch is
not needed in earlier kernels.
Linus Torvalds [Mon, 8 Jun 2009 19:31:53 +0000 (12:31 -0700)]
async: Fix lack of boot-time console due to insufficient synchronization
Our async work synchronization was broken by "async: make sure
independent async domains can't accidentally entangle" (commit d5a877e8dd409d8c702986d06485c374b705d340), because it would report
the wrong lowest active async ID when there was both running and
pending async work.
This caused things like no being able to read the root filesystem,
resulting in missing console devices and inability to run 'init',
causing a boot-time panic.
This fixes it by properly returning the lowest pending async ID: if
there is any running async work, that will have a lower ID than any
pending work, and we should _not_ look at the pending work list.
There were alternative patches from Jaswinder and James, but this one
also cleans up the code by removing the pointless 'ret' variable and
the unnecesary testing for an empty list around 'for_each_entry()' (if
the list is empty, the for_each_entry() thing just won't execute).
Fixes-bug: http://bugzilla.kernel.org/show_bug.cgi?id=13474 Reported-and-tested-by: Chris Clayton <chris2553@googlemail.com> Cc: Jaswinder Singh Rajput <jaswinder@kernel.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 8 Jun 2009 16:22:53 +0000 (09:22 -0700)]
Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
MIPS: Outline udelay and fix a few issues.
MIPS: ioctl.h: Fix headers_check warnings
MIPS: Cobalt: PCI bus is always required to obtain the board ID
MIPS: Kconfig: Remove "Support for" from Cavium system type
MIPS: Sibyte: Honor CONFIG_CMDLINE
SSB: BCM47xx: Export ssb_watchdog_timer_set
Ralf Baechle [Sat, 28 Feb 2009 09:44:28 +0000 (09:44 +0000)]
MIPS: Outline udelay and fix a few issues.
Outlining fixes the issue were on certain CPUs such as the R10000 family
the delay loop would need an extra cycle if it overlaps a cacheline
boundary.
The rewrite also fixes build errors with GCC 4.4 which was changed in
way incompatible with the kernel's inline assembly.
Relying on pure C for computation of the delay value removes the need for
explicit. The price we pay is a slight slowdown of the computation - to
be fixed on another day.
Linus Torvalds [Mon, 8 Jun 2009 15:29:31 +0000 (08:29 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 5543/1: arm: serial amba: add missing declaration in serial.h
[ARM] pxa: fix pxa27x_udc default pullup GPIO
[ARM] pxa/imote2: fix UCAM sensor board ADC model number
mx[23]: don't put clock lookups in __initdata
fix oops when using console=ttymxcN with N > 0
[ARM] ARMv7 errata: only apply fixes when running on applicable CPU
[ARM] 5534/1: kmalloc must return a cache line aligned buffer
Linus Torvalds [Mon, 8 Jun 2009 14:53:59 +0000 (07:53 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
sdhci-of: Fix the wrong accessor to HOSTVER register
mvsdio: fix config failure with some high speed SDHC cards
mvsdio: ignore high speed timing requests from the core
mmc/omap: Use disable_irq_nosync() from within irq handlers.
sdhci-of: Add fsl,esdhc as a valid compatible to bind against
mvsdio: allow automatic loading when modular
mxcmmc: Fix missing return value checking in DMA setup code.
mxcmmc : Reset the SDHC hardware if software timeout occurs.
omap_hsmmc: Trivial fix for a typo in comment
mxcmmc: decrease minimum frequency to make MMC cards work
Avi Kivity [Sat, 6 Jun 2009 09:34:39 +0000 (12:34 +0300)]
KVM: Explicity initialize cpus_hardware_enabled
Under CONFIG_MAXSMP, cpus_hardware_enabled is allocated from the heap and
not statically initialized. This causes a crash on reboot when kvm thinks
vmx is enabled on random nonexistent cpus and accesses nonexistent percpu
lists.
Fix by explicitly clearing the variable.
Cc: stable@kernel.org Reported-and-tested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Avi Kivity <avi@redhat.com>
[ARM] 5543/1: arm: serial amba: add missing declaration in serial.h
This header is sometimes included in the uncompress stage to get
register values, but no <linux/amba/bus.h> can be included there.
So declare "struct amba_device" here before using it in a prototype.
Signed-off-by: Alessandro Rubini <rubini@unipv.it> Acked-by: Andrea Gallo <andrea.gallo@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Sergei Shtylyov [Sun, 7 Jun 2009 11:52:50 +0000 (13:52 +0200)]
pdc202xx_old: fix resetproc() method
pdc202xx_reset() calls pdc202xx_reset_host() twice, for both channels, while
that function actually twiddles the single, shared software reset bit -- the
net effect is a duplicated reset and horrendous 4 second delay happening not
only on a channel reset but also when dma_lost_irq() and dma_clear() methods
are called. Fold pdc202xx_reset_host() into pdc202xx_reset(), fix printk(),
and move it before the actual reset...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Sergei Shtylyov [Sun, 7 Jun 2009 11:52:50 +0000 (13:52 +0200)]
pdc202xx_old: fix 'pdc20246_dma_ops'
Commit ac95beedf8bc97b24f9540d4da9952f07221c023 (ide: add struct ide_port_ops
(take 2)) erroneously converted the driver's dma_timeout() and dma_lost_irq()
methods to call the driver's resetproc() method regardless of whether it was
defined for this specific controller while it hadn't been defined and hence
called for PDC20246. So the dma_clear() method, the successor of dma_timeout(),
shouldn't exist and the dma_lost_irq() method should be standard for PDC20246.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Linus Torvalds [Sat, 6 Jun 2009 21:33:54 +0000 (14:33 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
x86/pci: fix mmconfig detection with 32bit near 4g
PCI: use fixed-up device class when configuring device
Hugh Dickins [Sat, 6 Jun 2009 20:18:09 +0000 (21:18 +0100)]
integrity: fix IMA inode leak
CONFIG_IMA=y inode activity leaks iint_cache and radix_tree_node objects
until the system runs out of memory. Nowhere is calling ima_inode_free()
a.k.a. ima_iint_delete(). Fix that by calling it from destroy_inode().
Linus Torvalds [Sat, 6 Jun 2009 19:18:14 +0000 (12:18 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
ext3/4 with synchronous writes gets wedged by Postfix
Fix nobh_truncate_page() to not pass stack garbage to get_block()