Disabling all runtime PM during system shutdown turns out not to be a
good idea, because some devices may need to be woken up from a
low-power state at that time.
The whole point of disabling runtime PM for system shutdown was to
prevent untimely runtime-suspend method calls. This patch (as1504)
accomplishes the same result by incrementing the usage count for each
device and waiting for ongoing runtime-PM callbacks to finish. This
is what we already do during system suspend and hibernation, which
makes sense since the shutdown method is pretty much a legacy analog
of the pm->poweroff method.
This fixes a recent regression on some OMAP systems introduced by
commit af8db1508f2c9f3b6e633e2d2d906c6557c617f9 (PM / driver core:
disable device's runtime PM during shutdown).
Reported-and-tested-by: NeilBrown <neilb@suse.de> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Kostyantyn Shlyakhovoy <x0155534@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some older Panasonic made camcorders (Panasonic AG-EZ30 and NV-DX110,
Grundig Scenos DLC 2000) reject requests with ack_busy_X if a request is
sent immediately after they sent a response to a prior transaction.
This causes firewire-core to fail probing of the camcorder with "giving
up on config rom for node id ...". Consequently, programs like kino or
dvgrab are unaware of the presence of a camcorder.
Such transaction failures happen also with the ieee1394 driver stack
(of the 2.4...2.6 kernel series until 2.6.36 inclusive) but with a lower
likelihood, such that kino or dvgrab are generally able to use these
camcorders via the older driver stack. The cause for firewire-ohci's or
firewire-core's worse behavior is not yet known. Gap count optimization
in firewire-core is not the cause. Perhaps the slightly higher latency
of transaction completion in the older stack plays a role. (ieee1394:
AR-resp DMA context tasklet -> packet completion ktread -> user process;
firewire-core: tasklet -> user process.)
This change introduces retries and delays after ack_busy_X into
firewire-core's Config ROM reader, such that at least firewire-core's
probing and /dev/fw* creation are successful. This still leaves the
problem that userland processes are facing transaction failures.
gscanbus's built-in retry routines deal with them successfully, but
neither kino's nor dvgrab's do ever succeed.
But at least DV capture with "dvgrab -noavc -card 0" works now. Live
video preview in kino works too, but not actual capture.
One way to prevent Configuration ROM reading failures in application
programs is to modify libraw1394 to synthesize read responses by means
of firewire-core's Configuration ROM cache. This would only leave
CMP and FCP transaction failures as a potential problem source for
applications.
Reported-and-tested-by: Thomas Seilund <tps@netmaster.dk> Reported-and-tested-by: René Fritz <rene@colorcube.de> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clemens points out that we need to use compat_ptr() in order to safely
cast from u64 to addresses of a 32-bit usermode client.
Before, our conversion went wrong
- in practice if the client cast from pointer to integer such that
sign-extension happened, (libraw1394 and libdc1394 at least were not
doing that, IOW were not affected)
or
- in theory on s390 (which doesn't have FireWire though) and on the
tile architecture, regardless of what the client does.
The bug would usually be observed as the initial get_info ioctl failing
with "Bad address" (EFAULT).
Reported-by: Carl Karsten <carl@personnelware.com> Reported-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Right now we won't touch ASPM state if ASPM is disabled, except in the case
where we find a device that appears to be too old to reliably support ASPM.
Right now we'll clear it in that case, which is almost certainly the wrong
thing to do. The easiest way around this is just to disable the blacklisting
when ASPM is disabled.
Commit f0fbf0abc093 ("x86: integrate delay functions") converted
delay_tsc() into a random delay generator for 64 bit. The reason is
that it merged the mostly identical versions of delay_32.c and
delay_64.c. Though the subtle difference of the result was:
static void delay_tsc(unsigned long loops)
{
- unsigned bclock, now;
+ unsigned long bclock, now;
Now the function uses rdtscl() which returns the lower 32bit of the
TSC. On 32bit that's not problematic as unsigned long is 32bit. On 64
bit this fails when the lower 32bit are close to wrap around when
bclock is read, because the following check
if ((now - bclock) >= loops)
break;
evaluated to true on 64bit for e.g. bclock = 0xffffffff and now = 0
because the unsigned long (now - bclock) of these values results in
0xffffffff00000001 which is definitely larger than the loops
value. That explains Tvortkos observation:
"Because I am seeing udelay(500) (_occasionally_) being short, and
that by delaying for some duration between 0us (yep) and 491us."
Make those variables explicitely u32 again, so this works for both 32
and 64 bit.
Current code has put_ioctx() called asynchronously from aio_fput_routine();
that's done *after* we have killed the request that used to pin ioctx,
so there's nothing to stop io_destroy() waiting in wait_for_all_aios()
from progressing. As the result, we can end up with async call of
put_ioctx() being the last one and possibly happening during exit_mmap()
or elf_core_dump(), neither of which expects stray munmap() being done
to them...
We do need to prevent _freeing_ ioctx until aio_fput_routine() is done
with that, but that's all we care about - neither io_destroy() nor
exit_aio() will progress past wait_for_all_aios() until aio_fput_routine()
does really_put_req(), so the ioctx teardown won't be done until then
and we don't care about the contents of ioctx past that point.
Since actual freeing of these suckers is RCU-delayed, we don't need to
bump ioctx refcount when request goes into list for async removal.
All we need is rcu_read_lock held just over the ->ctx_lock-protected
area in aio_fput_routine().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Acked-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Have ioctx_alloc() return an extra reference, so that caller would drop it
on success and not bother with re-grabbing it on failure exit. The current
code is obviously broken - io_destroy() from another thread that managed
to guess the address io_setup() would've returned would free ioctx right
under us; gets especially interesting if aio_context_t * we pass to
io_setup() points to PROT_READ mapping, so put_user() fails and we end
up doing io_destroy() on kioctx another thread has just got freed...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Benjamin LaHaise <bcrl@kvack.org> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Silence following warnings:
WARNING: drivers/mfd/cs5535-mfd.o(.data+0x20): Section mismatch in
reference from the variable cs5535_mfd_drv to the function
.devinit.text:cs5535_mfd_probe()
The variable cs5535_mfd_drv references
the function __devinit cs5535_mfd_probe()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console
WARNING: drivers/mfd/cs5535-mfd.o(.data+0x28): Section mismatch in
reference from the variable cs5535_mfd_drv to the function
.devexit.text:cs5535_mfd_remove()
The variable cs5535_mfd_drv references
the function __devexit cs5535_mfd_remove()
If the reference is valid then annotate the
variable with __exit* (see linux/init.h) or name the variable:
*driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console
Rename the variable from *_drv to *_driver so
modpost ignore the OK references to __devinit/__devexit
functions.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Acked-by: Andres Salomon <dilinger@queued.net> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Both md and dm have support for flush, but the dm-raid target
forgot to set the flag to indicate that flushes should be
passed on. (Important for data integrity e.g. with writeback cache
enabled.)
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|kernel BUG at kernel/rtmutex.c:724!
|[<c029599c>] (rt_spin_lock_slowlock+0x108/0x2bc) from [<c01c2330>] (defer_bh+0x1c/0xb4)
|[<c01c2330>] (defer_bh+0x1c/0xb4) from [<c01c3afc>] (rx_complete+0x14c/0x194)
|[<c01c3afc>] (rx_complete+0x14c/0x194) from [<c01cac88>] (usb_hcd_giveback_urb+0xa0/0xf0)
|[<c01cac88>] (usb_hcd_giveback_urb+0xa0/0xf0) from [<c01e1ff4>] (musb_giveback+0x34/0x40)
|[<c01e1ff4>] (musb_giveback+0x34/0x40) from [<c01e2b1c>] (musb_advance_schedule+0xb4/0x1c0)
|[<c01e2b1c>] (musb_advance_schedule+0xb4/0x1c0) from [<c01e2ca8>] (musb_cleanup_urb.isra.9+0x80/0x8c)
|[<c01e2ca8>] (musb_cleanup_urb.isra.9+0x80/0x8c) from [<c01e2ed0>] (musb_urb_dequeue+0xec/0x108)
|[<c01e2ed0>] (musb_urb_dequeue+0xec/0x108) from [<c01cbb90>] (unlink1+0xbc/0xcc)
|[<c01cbb90>] (unlink1+0xbc/0xcc) from [<c01cc2ec>] (usb_hcd_unlink_urb+0x54/0xa8)
|[<c01cc2ec>] (usb_hcd_unlink_urb+0x54/0xa8) from [<c01c2a84>] (unlink_urbs.isra.17+0x2c/0x58)
|[<c01c2a84>] (unlink_urbs.isra.17+0x2c/0x58) from [<c01c2b44>] (usbnet_terminate_urbs+0x94/0x10c)
|[<c01c2b44>] (usbnet_terminate_urbs+0x94/0x10c) from [<c01c2d68>] (usbnet_stop+0x100/0x15c)
|[<c01c2d68>] (usbnet_stop+0x100/0x15c) from [<c020f718>] (__dev_close_many+0x94/0xc8)
defer_bh() takes the lock which is hold during unlink_urbs(). The safe
walk suggest that the skb will be removed from the list and this is done
by defer_bh() so it seems to be okay to drop the lock here.
Reported-by: AnÃbal Almeida Pinto <anibal.pinto@efacec.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Oliver Neukum <oliver@neukum.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On Access Point mode, when transmitting a packet, if the destination
station is in powersave mode, we abort transmitting the packet to the
device queue, but we do not reclaim the allocated memory. Given enough
packets, we can go in a state where there is no packet on the device
queue, but we think the device has no memory left, so no packet gets
transmitted, connections breaks and the AP stops working.
This undo the allocation done in the TX path when the station is in
power-save mode.
Signed-off-by: Nicolas Cavallari <cavallar@lri.fr> Acked-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently the module init function registers a platform_device and
only then allocates its IRQ and I/O region. This allows allocation to
race with the device's suspend() function. Instead, allocate
resources in the platform driver's probe() function and free them in
the remove() function.
The module exit function removes the platform device before the
character device that provides access to it. Change it to reverse the
order of initialisation.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Erratum #743622 affects all r2 variants of the Cortex-A9 processor, so
ensure that the workaround is applied regardless of the revision.
Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A hardware bug in the OMAP4 HDMI PHY causes physical damage to the board
if the HDMI PHY is kept powered on when the cable is not connected.
This patch solves the problem by adding hot-plug-detection into the HDMI
IP driver. This is not a real HPD support in the sense that nobody else
than the IP driver gets to know about the HPD events, but is only meant
to fix the HW bug.
The strategy is simple: If the display device is turned off by the user,
the PHY power is set to OFF. When the display device is turned on by the
user, the PHY power is set either to LDOON or TXON, depending on whether
the HDMI cable is connected.
The reason to avoid PHY OFF when the display device is on, but the cable
is disconnected, is that when the PHY is turned OFF, the HDMI IP is not
"ticking" and thus the DISPC does not receive pixel clock from the HDMI
IP. This would, for example, prevent any VSYNCs from happening, and
would thus affect the users of omapdss. By using LDOON when the cable is
disconnected we'll avoid the HW bug, but keep the HDMI working as usual
from the user's point of view.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
"hdmi_hpd" pin is muxed to INPUT and PULLUP, but the pin is not
currently used, and in the future when it is used, the pin is used as a
GPIO and is board specific, not an OMAP4 wide thing.
So remove the muxing for now.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The GPIO 60 on 4430sdp and Panda is not HPD GPIO, as currently marked in
the board files, but CT_CP_HPD, which is used to enable/disable HPD
functionality.
This patch renames the GPIO.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use default regn and regm2 dividers in the hdmi driver if the board file
does not define them.
Cc: Mythri P K <mythripk@ti.com> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Patchset "ARM: orion: Refactor the MPP code common in the orion
platform" broke at least Orion5x based platforms. These platforms have
pins configured as GPIO when the selector is not 0x0. However the
common code assumes the selector is always 0x0 for a GPIO lines. It
then ignores the GPIO bits in the MPP definitions, resulting in that
Orion5x machines cannot correctly configure there GPIO lines.
The Fix removes the assumption that the selector is always 0x0.
In order that none GPIO configurations are correctly blocked,
Kirkwood and mv78xx0 MPP definitions are corrected to only set the
GPIO bits for GPIO configurations.
This third version, which does not contain any whitespace changes,
and is rebased on v3.3-rc2.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net>
The patch "ARM: orion: Consolidate USB platform setup code.", commit 4fcd3f374a928081d391cd9a570afe3b2c692fdc broke USB on TS-7800 and
other orion5x boards, because the wrong type of PHY was being passed
to the EHCI driver in the platform data. Orion5x needs EHCI_PHY_ORION
and all the others want EHCI_PHY_NA.
Allow the mach- code to tell the generic plat-orion code which USB PHY
enum to place into the platform data.
Version 2: Rebase to v3.3-rc2.
Reported-by: Ambroz Bizjak <ambrop7@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Ambroz Bizjak <ambrop7@gmail.com> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Beulich [Tue, 28 Feb 2012 10:41:37 +0000 (10:41 +0000)]
kprobes: adjust "fix a memory leak in function pre_handler_kretprobe()"
3.0.21's 603b63484725a6e88e4ae5da58716efd88154b1e directly used
the upstream patch, yet kprobes locking in 3.0.x uses spin_lock...()
rather than raw_spin_lock...().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Enable use of the generic atomic64 implementation on AVR32 platforms.
Without this the kernel fails to build as the architecture does not
provide its version.
The models do not resume correctly without acpi_sleep=nonvs.
Signed-off-by: Keng-Yu Lin <kengyu@canonical.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to i.MX27 Reference Manual (p 1593) TXBIT0 bit selects
whether the most significant or the less significant part of the
data word written to the FIFO is transmitted.
As DSP_A is the same as DSP_B with a data offset of 1 bit, it
doesn't make any sense to remove TXBIT0 bit here.
Signed-off-by: Javier Martin <javier.martin@vista-silicon.com> Acked-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Recent enhancements in the bias management means that we might not be
in standby when the CODEC is idle and can have active widgets without
being in full power mode but the shutdown functionality assumes these
things. Add checks for the bias level at each stage so that we don't
do transitions other than the ON->PREPARE->STANDBY->OFF ones that the
drivers are expecting.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It used to be that minors where 8 bit. But now they
are actually 20 bit. So the fix is simplicity itself.
I've tested with 300 devices and all user-mode utils
work just fine. I have also mechanically added 10,000
to the ida (so devices are /dev/osd10000, /dev/osd10001 ...)
and was able to mkfs an exofs filesystem and access osds
from user-mode.
All the open-osd user-mode code uses the same library
to access devices through their symbolic names in
/dev/osdX so I'd say it's pretty safe. (Well tested)
This patch is very important because some of the systems
that will be deploying the 3.2 pnfs-objects code are larger
than 64 OSDs and will stop to work properly when reaching
that number.
This patch (as1531) adds a NOGET quirk for the Slim+ keyboard marketed
by AIREN. This keyboard seems to have a lot of bugs; NOGET works
around only one of them.
Dave Jones reports a few Fedora users hitting the BUG_ON(mm->nr_ptes...)
in exit_mmap() recently.
Quoting Hugh's discovery and explanation of the SMP race condition:
"mm->nr_ptes had unusual locking: down_read mmap_sem plus
page_table_lock when incrementing, down_write mmap_sem (or mm_users
0) when decrementing; whereas THP is careful to increment and
decrement it under page_table_lock.
Now most of those paths in THP also hold mmap_sem for read or write
(with appropriate checks on mm_users), but two do not: when
split_huge_page() is called by hwpoison_user_mappings(), and when
called by add_to_swap().
It's conceivable that the latter case is responsible for the
exit_mmap() BUG_ON mm->nr_ptes that has been reported on Fedora."
The simplest way to fix it without having to alter the locking is to make
split_huge_page() a noop in nr_ptes terms, so by counting the preallocated
pagetables that exists for every mapped hugepage. It was an arbitrary
choice not to count them and either way is not wrong or right, because
they are not used but they're still allocated.
Reported-by: Dave Jones <davej@redhat.com> Reported-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Josh Boyer <jwboyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Duplicate the data for iniAddac early on, to avoid having to do redundant
memcpy calls later. While we're at it, make AR5416 < v2.2 use the same
codepath. Fixes a reported crash on x86.
Signed-off-by: Felix Fietkau <nbd@openwrt.org> Reported-by: Magnus Määttä <magnus.maatta@logica.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
rate control algorithms concludes the rate as invalid
with rate[i].idx < -1 , while they do also check for rate[i].count is
non-zero. it would be safer to zero initialize the 'count' field.
recently we had a ath9k rate control crash where the ath9k rate control
in ath_tx_status assumed to check only for rate[i].count being non-zero
in one instance and ended up in using invalid rate index for
'connection monitoring NULL func frames' which eventually lead to the crash.
thanks to Pavel Roskin for fixing it and finding the root cause.
https://bugzilla.redhat.com/show_bug.cgi?id=768639
Cc: Pavel Roskin <proski@gnu.org> Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The cifs code will attempt to open files on lookup under certain
circumstances. What happens though if we find that the file we opened
was actually a FIFO or other special file?
Currently, the open filehandle just ends up being leaked leading to
a dentry refcount mismatch and oops on umount. Fix this by having the
code close the filehandle on the server if it turns out not to be a
regular file. While we're at it, change this spaghetti if statement
into a switch too.
Reported-by: CAI Qian <caiqian@redhat.com> Tested-by: CAI Qian <caiqian@redhat.com> Reviewed-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is an issue when memcg unregisters events that were attached to
the same eventfd:
- On the first call mem_cgroup_usage_unregister_event() removes all
events attached to a given eventfd, and if there were no events left,
thresholds->primary would become NULL;
- Since there were several events registered, cgroups core will call
mem_cgroup_usage_unregister_event() again, but now kernel will oops,
as the function doesn't expect that threshold->primary may be NULL.
That's a good question whether mem_cgroup_usage_unregister_event()
should actually remove all events in one go, but nowadays it can't
do any better as cftype->unregister_event callback doesn't pass
any private event-associated cookie. So, let's fix the issue by
simply checking for threshold->primary.
FWIW, w/o the patch the following oops may be observed:
On i.MX53 we have to write a special SDHCI_CMD_ABORTCMD to the
SDHCI_TRANSFER_MODE register during a MMC_STOP_TRANSMISSION
command. This works for SD cards. However, with MMC cards
the MMC_SET_BLOCK_COUNT command is used instead, but this
needs the same handling. Fix MMC cards by testing for the
MMC_SET_BLOCK_COUNT command aswell. Tested on a custom i.MX53
board with a Transcend MMC+ card and eMMC.
The kernel started used MMC_SET_BLOCK_COUNT in 3.0, so this
is a regression for these boards introduced in 3.0; it should
go to 3.0/3.1/3.2-stable.
: : I have noticed some user space problems (pulseaudio crashes in pthread
: : code, glibc/nptl test suite failures, java compiler freezes on SMP alpha
: : systems) that arise when using a 2.6.39 or later kernel on Alpha.
: : Bisecting between 2.6.38 and 2.6.39 (using glibc/nptl test suite as
: : criterion for good/bad kernel) eventually leads to:
: :
: : 8d7718aa082aaf30a0b4989e1f04858952f941bc is the first bad commit
: : commit 8d7718aa082aaf30a0b4989e1f04858952f941bc
: : Author: Michel Lespinasse <walken@google.com>
: : Date: Thu Mar 10 18:50:58 2011 -0800
: :
: : futex: Sanitize futex ops argument types
: :
: : Change futex_atomic_op_inuser and futex_atomic_cmpxchg_inatomic
: : prototypes to use u32 types for the futex as this is the data type the
: : futex core code uses all over the place.
: :
: : Looking at the commit I see there is a change of the uaddr argument in
: : the Alpha architecture specific code for futexes from int to u32, but I
: : don't see why this should cause a problem.
Richard Henderson said:
: futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
: u32 oldval, u32 newval)
: ...
: : "r"(uaddr), "r"((long)oldval), "r"(newval)
:
:
: There is no 32-bit compare instruction. These are implemented by
: consistently extending the values to a 64-bit type. Since the
: load instruction sign-extends, we want to sign-extend the other
: quantity as well (despite the fact it's logically unsigned).
:
: So:
:
: - : "r"(uaddr), "r"((long)oldval), "r"(newval)
: + : "r"(uaddr), "r"((long)(int)oldval), "r"(newval)
:
: should do the trick.
Michael said:
: This fixes the glibc test suite failures and the pulseaudio related
: crashes, but it does not fix the java compiiler lockups that I was (and
: are still) observing. That is some other problem.
Reported-by: Michael Cree <mcree@orcon.net.nz> Tested-by: Michael Cree <mcree@orcon.net.nz> Acked-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Michel Lespinasse <walken@google.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the current kernel implementation, the Logitech Harmony 900 remote
control is matched to the cdc_ether driver through the generic
USB_CDC_SUBCLASS_MDLM entry. However, this device appears to be of the
pseudo-MDLM (Belcarra) type, rather than the standard one. This patch
blacklists the Harmony 900 from the cdc_ether driver and whitelists it for
the pseudo-MDLM driver in zaurus.
Signed-off-by: Scott Talbert <talbert@techie.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
s3c2410_dma_suspend suspends channels from 0 to dma_channels.
s3c2410_dma_resume resumes channels in reverse order. So
pointer should be decremented instead of being incremented.
Xommit ac5637611(genirq: Unmask oneshot irqs when thread was not woken)
fails to unmask when a !IRQ_ONESHOT threaded handler is handled by
handle_level_irq.
This happens because thread_mask is or'ed unconditionally in
irq_wake_thread(), but for !IRQ_ONESHOT interrupts never cleared. So
the check for !desc->thread_active fails and keeps the interrupt
disabled.
Keep the thread_mask zero for !IRQ_ONESHOT interrupts.
Document the thread_mask magic while at it.
Reported-and-tested-by: Sven Joachim <svenjoac@gmx.de> Reported-and-tested-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The code is currently always checking the first resource of every
device only (several times.) This has been broken since the ACPI check
was added in February 2010 in commit 91fedede0338eb6203cdd618d8ece873fdb7c22c.
Fix the check to run on each resource individually, once.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is only one error code to return for a bad user-space buffer
pointer passed to a system call in the same address space as the
system call is executed, and that is EFAULT. Furthermore, the
low-level access routines, which catch most of the faults, return
EFAULT already.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@hack.frob.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The regset common infrastructure assumed that regsets would always
have .get and .set methods, but not necessarily .active methods.
Unfortunately people have since written regsets without .set methods.
Rather than putting in stub functions everywhere, handle regsets with
null .get or .set methods explicitly.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@hack.frob.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some codecs don't supply the mute amp-capabilities although the lowest
volume gives the mute. It'd be handy if the parser provides the mute
mixers in such a case.
This patch adds an extension amp-cap bit (which is used only in the
driver) to represent the min volume = mute state. Also modified the
amp cache code to support the fake mute feature when this bit is set
but the real mute bit is unset.
In addition, conexant cx5051 parser uses this new feature to implement
the missing mute controls.
Enable the compat keyctl wrapper on s390x so that 32-bit s390 userspace can
call the keyctl() syscall.
There's an s390x assembly wrapper that truncates all the register values to
32-bits and this then calls compat_sys_keyctl() - but the latter only exists if
CONFIG_KEYS_COMPAT is enabled, and the s390 Kconfig doesn't enable it.
Without this patch, 32-bit calls to the keyctl() syscall are given an ENOSYS
error:
[root@devel4 ~]# keyctl show
Session Keyring
-3: key inaccessible (Function not implemented)
Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: dan@danny.cz Cc: Carsten Otte <cotte@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: linux-s390@vger.kernel.org Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The hardware generates an interrupt for every completed command in the
queue while the code assumed that it will only generate one interrupt
when the queue is empty. So, explicitly check if the queue is really
empty. This patch fixed problems which occurred due to high traffic on
the bus. While we are here, move the completion-initialization after the
parameter error checking.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Cc: Shawn Guo <shawn.guo@linaro.org> Cc: Marek Vasut <marek.vasut@gmail.com> Cc: Lothar Waßmann <LW@KARO-electronics.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1. address has to be page aligned.
2. set_memory_x uses page size argument, not size.
Bug causes with following commit:
commit da28179b4e90dda56912ee825c7eaa62fc103797
Author: Mingarelli, Thomas <Thomas.Mingarelli@hp.com>
Date: Mon Nov 7 10:59:00 2011 +0100
watchdog: hpwdt: Changes to handle NX secure bit in 32bit path
The GPI_28 IRQ was not registered properly. The registration of
IRQ_LPC32XX_GPI_28 was added and the (wrong) IRQ_LPC32XX_GPI_11 at
LPC32XX_SIC1_IRQ(4) was replaced by IRQ_LPC32XX_GPI_28 (see manual of
LPC32xx / interrupt controller).
Signed-off-by: Roland Stigge <stigge@antcom.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes the initialization of the interrupt controller of the LPC32xx
by correctly setting up SIC1 and SIC2 instead of (wrongly) using the same value
as for the Main Interrupt Controller (MIC).
Signed-off-by: Roland Stigge <stigge@antcom.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The autofs compat handling fix caused a compile failure when
CONFIG_COMPAT isn't defined.
Instead of adding random #ifdef'fery in autofs, let's just make the
compat helpers earlier to use: without CONFIG_COMPAT, is_compat_task()
just hardcodes to zero.
We could probably do something similar for a number of other cases where
we have #ifdef's in code, but this is the low-hanging fruit.
Reported-and-tested-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the autofs protocol version 5 packet type was added in commit 5c0a32fc2cd0 ("autofs4: add new packet type for v5 communications"), it
obvously tried quite hard to be word-size agnostic, and uses explicitly
sized fields that are all correctly aligned.
However, with the final "char name[NAME_MAX+1]" array at the end, the
actual size of the structure ends up being not very well defined:
because the struct isn't marked 'packed', doing a "sizeof()" on it will
align the size of the struct up to the biggest alignment of the members
it has.
And despite all the members being the same, the alignment of them is
different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
alignment on x86-64. And while 'NAME_MAX+1' ends up being a nice round
number (256), the name[] array starts out a 4-byte aligned.
End result: the "packed" size of the structure is 300 bytes: 4-byte, but
not 8-byte aligned.
As a result, despite all the fields being in the same place on all
architectures, sizeof() will round up that size to 304 bytes on
architectures that have 8-byte alignment for u64.
Note that this is *not* a problem for 32-bit compat mode on POWER, since
there __u64 is 8-byte aligned even in 32-bit mode. But on x86, 32-bit
and 64-bit alignment is different for 64-bit entities, and as a result
the structure that has exactly the same layout has different sizes.
So on x86-64, but no other architecture, we will just subtract 4 from
the size of the structure when running in a compat task. That way we
will write the properly sized packet that user mode expects.
Not pretty. Sadly, this very subtle, and unnecessary, size difference
has been encoded in user space that wants to read packets of *exactly*
the right size, and will refuse to touch anything else.
Reported-and-tested-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
"nframes" comes from the user and "nframes * CD_FRAMESIZE_RAW" can wrap
on 32 bit systems. That would have been ok if we used the same wrapped
value for the copy, but we use a shifted value. We should just use the
checked version of copy_to_user() because it's not going to make a
difference to the speed.
The current epoll code can be tickled to run basically indefinitely in
both loop detection path check (on ep_insert()), and in the wakeup paths.
The programs that tickle this behavior set up deeply linked networks of
epoll file descriptors that cause the epoll algorithms to traverse them
indefinitely. A couple of these sample programs have been previously
posted in this thread: https://lkml.org/lkml/2011/2/25/297.
To fix the loop detection path check algorithms, I simply keep track of
the epoll nodes that have been already visited. Thus, the loop detection
becomes proportional to the number of epoll file descriptor and links.
This dramatically decreases the run-time of the loop check algorithm. In
one diabolical case I tried it reduced the run-time from 15 mintues (all
in kernel time) to .3 seconds.
Fixing the wakeup paths could be done at wakeup time in a similar manner
by keeping track of nodes that have already been visited, but the
complexity is harder, since there can be multiple wakeups on different
cpus...Thus, I've opted to limit the number of possible wakeup paths when
the paths are created.
This is accomplished, by noting that the end file descriptor points that
are found during the loop detection pass (from the newly added link), are
actually the sources for wakeup events. I keep a list of these file
descriptors and limit the number and length of these paths that emanate
from these 'source file descriptors'. In the current implemetation I
allow 1000 paths of length 1, 500 of length 2, 100 of length 3, 50 of
length 4 and 10 of length 5. Note that it is sufficient to check the
'source file descriptors' reachable from the newly added link, since no
other 'source file descriptors' will have newly added links. This allows
us to check only the wakeup paths that may have gotten too long, and not
re-check all possible wakeup paths on the system.
In terms of the path limit selection, I think its first worth noting that
the most common case for epoll, is probably the model where you have 1
epoll file descriptor that is monitoring n number of 'source file
descriptors'. In this case, each 'source file descriptor' has a 1 path of
length 1. Thus, I believe that the limits I'm proposing are quite
reasonable and in fact may be too generous. Thus, I'm hoping that the
proposed limits will not prevent any workloads that currently work to
fail.
In terms of locking, I have extended the use of the 'epmutex' to all
epoll_ctl add and remove operations. Currently its only used in a subset
of the add paths. I need to hold the epmutex, so that we can correctly
traverse a coherent graph, to check the number of paths. I believe that
this additional locking is probably ok, since its in the setup/teardown
paths, and doesn't affect the running paths, but it certainly is going to
add some extra overhead. Also, worth noting is that the epmuex was
recently added to the ep_ctl add operations in the initial path loop
detection code using the argument that it was not on a critical path.
Another thing to note here, is the length of epoll chains that is allowed.
Currently, eventpoll.c defines:
/* Maximum number of nesting allowed inside epoll sets */
#define EP_MAX_NESTS 4
This basically means that I am limited to a graph depth of 5 (EP_MAX_NESTS
+ 1). However, this limit is currently only enforced during the loop
check detection code, and only when the epoll file descriptors are added
in a certain order. Thus, this limit is currently easily bypassed. The
newly added check for wakeup paths, stricly limits the wakeup paths to a
length of 5, regardless of the order in which ep's are linked together.
Thus, a side-effect of the new code is a more consistent enforcement of
the graph depth.
Thus far, I've tested this, using the sample programs previously
mentioned, which now either return quickly or return -EINVAL. I've also
testing using the piptest.c epoll tester, which showed no difference in
performance. I've also created a number of different epoll networks and
tested that they behave as expectded.
I believe this solves the original diabolical test cases, while still
preserving the sane epoll nesting.
Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Nelson Elhage <nelhage@ksplice.com> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
signalfd_cleanup() ensures that ->signalfd_wqh is not used, but
this is not enough. eppoll_entry->whead still points to the memory
we are going to free, ep_unregister_pollwait()->remove_wait_queue()
is obviously unsafe.
Change ep_poll_callback(POLLFREE) to set eppoll_entry->whead = NULL,
change ep_unregister_pollwait() to check pwq->whead != NULL under
rcu_read_lock() before remove_wait_queue(). We add the new helper,
ep_remove_wait_queue(), for this.
This works because sighand_cachep is SLAB_DESTROY_BY_RCU and because
->signalfd_wqh is initialized in sighand_ctor(), not in copy_sighand.
ep_unregister_pollwait()->remove_wait_queue() can play with already
freed and potentially reused ->sighand, but this is fine. This memory
must have the valid ->signalfd_wqh until rcu_read_unlock().
This patch is intentionally incomplete to simplify the review.
It ignores ep_unregister_pollwait() which plays with the same wqh.
See the next change.
epoll assumes that the EPOLL_CTL_ADD'ed file controls everything
f_op->poll() needs. In particular it assumes that the wait queue
can't go away until eventpoll_release(). This is not true in case
of signalfd, the task which does EPOLL_CTL_ADD uses its ->sighand
which is not connected to the file.
This patch adds the special event, POLLFREE, currently only for
epoll. It expects that init_poll_funcptr()'ed hook should do the
necessary cleanup. Perhaps it should be defined as EPOLLFREE in
eventpoll.
__cleanup_sighand() is changed to do wake_up_poll(POLLFREE) if
->signalfd_wqh is not empty, we add the new signalfd_cleanup()
helper.
ep_poll_callback(POLLFREE) simply does list_del_init(task_list).
This make this poll entry inconsistent, but we don't care. If you
share epoll fd which contains our sigfd with another process you
should blame yourself. signalfd is "really special". I simply do
not know how we can define the "right" semantics if it used with
epoll.
The main problem is, epoll calls signalfd_poll() once to establish
the connection with the wait queue, after that signalfd_poll(NULL)
returns the different/inconsistent results depending on who does
EPOLL_CTL_MOD/signalfd_read/etc. IOW: apart from sigmask, signalfd
has nothing to do with the file, it works with the current thread.
In short: this patch is the hack which tries to fix the symptoms.
It also assumes that nobody can take tasklist_lock under epoll
locks, this seems to be true.
Note:
- we do not have wake_up_all_poll() but wake_up_poll()
is fine, poll/epoll doesn't use WQ_FLAG_EXCLUSIVE.
- signalfd_cleanup() uses POLLHUP along with POLLFREE,
we need a couple of simple changes in eventpoll.c to
make sure it can't be "lost".
By hwmon sysfs interface convention, setting pwm_enable to zero sets a fan
to full speed. In the f75375s driver, this need be done by enabling
manual fan control, plus duty mode for the F875387 chip, and then setting
the maximum duty cycle. Fix a bug where the two necessary register writes
were swapped, effectively discarding the setting to full-speed.
The current use of /tmp for file lists is insecure. Put them under
$objtree/debian instead.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: maximilian attems <max@stro.at> Signed-off-by: Michal Marek <mmarek@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Said commit adds a check whether the carrier link is ok. If the link is
not ok, the skb is freed and no new dma descriptor added to the rx dma
channel. This causes trouble during initialization when the carrier
status has not yet been updated. If a lot of packets are received while
netif_carrier_ok returns false, all dma descriptors are freed and the
rx dma transfer is stopped.
The bug occurs when the board is connected to a network with lots of
traffic and the ifconfig down/up is done, e.g., when reconfiguring
the interface with DHCP.
The bug can be reproduced by flood pinging the davinci board while doing
ifconfig eth0 down
ifconfig eth0 up
on the board.
After that, the rx path stops working and the overrun value reported
by ifconfig is counting up.
Set the RX FIFO flush watermark lower.
According to Federico and JMicron's reply,
setting it to 16QW would be stable on most platforms.
Otherwise, user might experience packet drop issue.
Reported-by: Federico Quagliata <federico@quagliata.org> Fixed-by: Federico Quagliata <federico@quagliata.org> Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit f11017ec2d1859c661f4e2b12c4a8d250e1f47cf (2.6.37)
moved the fwmark variable in subcontext that is invalidated before
reaching the ip_vs_ct_in_get call. As vaddr is provided as pointer
in the param structure make sure the fwmark variable is in
same context. As the fwmark templates can not be matched,
more and more template connections are created and the
controlled connections can not go to single real server.
Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch (as1520) fixes a bug in the SCSI layer's power management
implementation.
LUN scanning can be carried out asynchronously in do_scan_async(), and
sd uses an asynchronous thread for the time-consuming parts of disk
probing in sd_probe_async(). Currently nothing coordinates these
async threads with system sleep transitions; they can and do attempt
to continue scanning/probing SCSI devices even after the host adapter
has been suspended. As one might expect, the outcome is not ideal.
This is what the "prepare" stage of system suspend was created for.
After the prepare callback has been called for a host, target, or
device, drivers are not allowed to register any children underneath
them. Currently the SCSI prepare callback is not implemented; this
patch rectifies that omission.
For SCSI hosts, the prepare routine calls scsi_complete_async_scans()
to wait until async scanning is finished. It might be slightly more
efficient to wait only until the host in question has been scanned,
but there's currently no way to do that. Besides, during a sleep
transition we will ultimately have to wait until all the host scanning
has finished anyway.
For SCSI devices, the prepare routine calls async_synchronize_full()
to wait until sd probing is finished. The routine does nothing for
SCSI targets, because asynchronous target scanning is done only as
part of host scanning.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In do_scan_async(), calling scsi_autopm_put_host(shost) may reference
freed shost, and cause Posison overwitten warning.
Yes, this case can happen, for example, an USB is disconnected just
when do_scan_async() thread starts to run, then scsi_host_put() called
in scsi_finish_async_scan() will lead to shost be freed(because the
refcount of shost->shost_gendev decreases to 1 after USB disconnects),
at this point, if references shost again, system will show following
warning msg.
To make scsi_autopm_put_host(shost) always reference a valid shost,
put it just before scsi_host_put() in function
scsi_finish_async_scan().
An interrupt might be pending when irq_startup() is called, but the
startup code does not invoke the resend logic. In some cases this
prevents the device from issuing another interrupt which renders the
device non functional.
Call the resend function in irq_startup() to keep things going.
Reported-and-tested-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the primary handler of an interrupt which is marked IRQ_ONESHOT
returns IRQ_HANDLED or IRQ_NONE, then the interrupt thread is not
woken and the unmask logic of the interrupt line is never
invoked. This keeps the interrupt masked forever.
This was not noticed as most IRQ_ONESHOT users wake the thread
unconditionally (usually because they cannot access the underlying
device from hard interrupt context). Though this behaviour was nowhere
documented and not necessarily intentional. Some drivers can avoid the
thread wakeup in certain cases and run into the situation where the
interrupt line s kept masked.
Rate control algorithms are supposed to stop processing when they
encounter a rate with the index -1. Checking for rate->count not being
zero is not enough.
Allowing a rate with negative index leads to memory corruption in
ath_debug_stat_rc().
One consequence of the bug is discussed at
https://bugzilla.redhat.com/show_bug.cgi?id=768639
Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For L1 instruction cache and L2 cache the shared CPU information
is wrong. On current AMD family 15h CPUs those caches are shared
between both cores of a compute unit.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=42607
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Petkov Borislav <Borislav.Petkov@amd.com> Cc: Dave Jones <davej@redhat.com> Link: http://lkml.kernel.org/r/20120208195229.GA17523@alberich.amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Intel has a PCI USB xhci host controller on a new platform. It doesn't
have a line IRQ definition in BIOS. The Linux driver refuses to
initialize this controller, but Windows works well because it only depends
on MSI.
Actually, Linux also can work for MSI. This patch avoids the line IRQ
checking for USB3 HCDs in usb core PCI probe. It allows the xHCI driver
to try to enable MSI or MSI-X first. It will fail the probe if MSI
enabling failed and there's no legacy PCI IRQ.
This patch should be backported to kernels as old as 2.6.32.
[Maintainer note: This patch is a backport of commit 68d07f64b8a11a852d48d1b05b724c3e20c0d94b "USB: Don't fail USB3 probe on
missing legacy PCI IRQ." to the 3.0 kernel. Note, the original patch
description was wrong. We should not back port this to kernels older
than 2.6.36, since that was the first kernel to support MSI and MSI-X
for xHCI hosts. These systems will just not work without MSI support,
so the probe should fail on kernels older than 2.6.36.]
Signed-off-by: Alex Shi <alex.shi@intel.com> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch (as1521b) fixes the interaction between usb-storage's
scanning thread and the freezer. The current implementation has a
race: If the device is unplugged shortly after being plugged in and
just as a system sleep begins, the scanning thread may get frozen
before the khubd task. Khubd won't be able to freeze until the
disconnect processing is complete, and the disconnect processing can't
proceed until the scanning thread finishes, so the sleep transition
will fail.
The implementation in the 3.2 kernel suffers from an additional
problem. There the scanning thread calls set_freezable_with_signal(),
and the signals sent by the freezer will mess up the thread's I/O
delays, which are all interruptible.
The solution to both problems is the same: Replace the kernel thread
used for scanning with a delayed-work routine on the system freezable
work queue. Freezable work queues have the nice property that you can
cancel a work item even while the work queue is frozen, and no signals
are needed.
The 3.2 version of this patch solves the problem in Bugzilla #42730.
After all the FPU state cleanups and finally finding the problem that
caused all our FPU save/restore problems, this re-introduces the
preloading of FPU state that was removed in commit b3b0870ef3ff ("i387:
do not preload FPU state at task switch time").
However, instead of simply reverting the removal, this reimplements
preloading with several fixes, most notably
- properly abstracted as a true FPU state switch, rather than as
open-coded save and restore with various hacks.
In particular, implementing it as a proper FPU state switch allows us
to optimize the CR0.TS flag accesses: there is no reason to set the
TS bit only to then almost immediately clear it again. CR0 accesses
are quite slow and expensive, don't flip the bit back and forth for
no good reason.
- Make sure that the same model works for both x86-32 and x86-64, so
that there are no gratuitous differences between the two due to the
way they save and restore segment state differently due to
architectural differences that really don't matter to the FPU state.
- Avoid exposing the "preload" state to the context switch routines,
and in particular allow the concept of lazy state restore: if nothing
else has used the FPU in the meantime, and the process is still on
the same CPU, we can avoid restoring state from memory entirely, just
re-expose the state that is still in the FPU unit.
That optimized lazy restore isn't actually implemented here, but the
infrastructure is set up for it. Of course, older CPU's that use
'fnsave' to save the state cannot take advantage of this, since the
state saving also trashes the state.
In other words, there is now an actual _design_ to the FPU state saving,
rather than just random historical baggage. Hopefully it's easier to
follow as a result.
This moves the bit that indicates whether a thread has ownership of the
FPU from the TS_USEDFPU bit in thread_info->status to a word of its own
(called 'has_fpu') in task_struct->thread.has_fpu.
This fixes two independent bugs at the same time:
- changing 'thread_info->status' from the scheduler causes nasty
problems for the other users of that variable, since it is defined to
be thread-synchronous (that's what the "TS_" part of the naming was
supposed to indicate).
So perfectly valid code could (and did) do
ti->status |= TS_RESTORE_SIGMASK;
and the compiler was free to do that as separate load, or and store
instructions. Which can cause problems with preemption, since a task
switch could happen in between, and change the TS_USEDFPU bit. The
change to TS_USEDFPU would be overwritten by the final store.
In practice, this seldom happened, though, because the 'status' field
was seldom used more than once, so gcc would generally tend to
generate code that used a read-modify-write instruction and thus
happened to avoid this problem - RMW instructions are naturally low
fat and preemption-safe.
- On x86-32, the current_thread_info() pointer would, during interrupts
and softirqs, point to a *copy* of the real thread_info, because
x86-32 uses %esp to calculate the thread_info address, and thus the
separate irq (and softirq) stacks would cause these kinds of odd
thread_info copy aliases.
This is normally not a problem, since interrupts aren't supposed to
look at thread information anyway (what thread is running at
interrupt time really isn't very well-defined), but it confused the
heck out of irq_fpu_usable() and the code that tried to squirrel
away the FPU state.
(It also caused untold confusion for us poor kernel developers).
It also turns out that using 'task_struct' is actually much more natural
for most of the call sites that care about the FPU state, since they
tend to work with the task struct for other reasons anyway (ie
scheduling). And the FPU data that we are going to save/restore is
found there too.
Thanks to Arjan Van De Ven <arjan@linux.intel.com> for pointing us to
the %esp issue.
Cc: Arjan van de Ven <arjan@linux.intel.com> Reported-and-tested-by: Raphael Prevost <raphael@buro.asia> Acked-and-tested-by: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
pending. In order to not leak FIP state from one process to another, we
need to do a floating point load after the fxsave of the old process,
and before the fxrstor of the new FPU state. That resets the state to
the (uninteresting) kernel load, rather than some potentially sensitive
user information.
We used to do this directly after the FPU state save, but that is
actually very inconvenient, since it
(a) corrupts what is potentially perfectly good FPU state that we might
want to lazy avoid restoring later and
(b) on x86-64 it resulted in a very annoying ordering constraint, where
"__unlazy_fpu()" in the task switch needs to be delayed until after
the DS segment has been reloaded just to get the new DS value.
Coupling it to the fxrstor instead of the fxsave automatically avoids
both of these issues, and also ensures that we only do it when actually
necessary (the FP state after a save may never actually get used). It's
simply a much more natural place for the leaked state cleanup.
Yes, taking the trap to re-load the FPU/MMX state is expensive, but so
is spending several days looking for a bug in the state save/restore
code. And the preload code has some rather subtle interactions with
both paravirtualization support and segment state restore, so it's not
nearly as simple as it should be.
Also, now that we no longer necessarily depend on a single bit (ie
TS_USEDFPU) for keeping track of the state of the FPU, we migth be able
to do better. If we are really switching between two processes that
keep touching the FP state, save/restore is inevitable, but in the case
of having one process that does most of the FPU usage, we may actually
be able to do much better than the preloading.
In particular, we may be able to keep track of which CPU the process ran
on last, and also per CPU keep track of which process' FP state that CPU
has. For modern CPU's that don't destroy the FPU contents on save time,
that would allow us to do a lazy restore by just re-enabling the
existing FPU state - with no restore cost at all!
This creates three helper functions that do the TS_USEDFPU accesses, and
makes everybody that used to do it by hand use those helpers instead.
In addition, there's a couple of helper functions for the "change both
CR0.TS and TS_USEDFPU at the same time" case, and the places that do
that together have been changed to use those. That means that we have
fewer random places that open-code this situation.
The intent is partly to clarify the code without actually changing any
semantics yet (since we clearly still have some hard to reproduce bug in
this area), but also to make it much easier to use another approach
entirely to caching the CR0.TS bit for software accesses.
Right now we use a bit in the thread-info 'status' variable (this patch
does not change that), but we might want to make it a full field of its
own or even make it a per-cpu variable.