The core exports the hci_conn_hold_device() and hci_conn_put_device()
functions for device reference of connections. Use this to ensure that
the uevents from the parent are send after the child ones.
Based on a report by Brian Rogers <brian@xyzw.org>
The device model itself has no real usable reference counting at the
moment and this causes problems if parents are deleted before their
children. The device model itself handles the memory details of this
correctly, but the uevent order is not consistent. This causes various
problems for systems like HAL or even X.
So until device_put() does a proper cleanup, the device for Bluetooth
connection will be protected with an extra reference counting to ensure
the correct order of uevents when connections are terminated.
This is not an automatic feature. Higher Bluetooth layers like HIDP or
BNEP should grab this new reference to ensure that their uevents are
send before the ones from the parent device.
Based on a report by Brian Rogers <brian@xyzw.org>
Currently the HID subsystem will create HIDRAW devices for the transport
driver, but it will not disconnect them. Until the HID subsytem gets
fixed, ensure that HIDRAW and HIDDEV devices are disconnected when the
Bluetooth HID device gets removed.
Based on a patch from Brian Rogers <brian@xyzw.org>
PI futexes do not use the same plist_node_empty() test for wakeup.
It was possible for the waiter (in futex_wait_requeue_pi()) to set
TASK_INTERRUPTIBLE after the waker assigned the rtmutex to the
waiter. The waiter would then note the plist was not empty and call
schedule(). The task would not be found by any subsequeuent futex
wakeups, resulting in a userspace hang.
By moving the setting of TASK_INTERRUPTIBLE to before the call to
queue_me(), the race with the waker is eliminated. Since we no
longer call get_user() from within queue_me(), there is no need to
delay the setting of TASK_INTERRUPTIBLE until after the call to
queue_me().
The FUTEX_LOCK_PI operation is not affected as futex_lock_pi()
relies entirely on the rtmutex code to handle schedule() and
wakeup. The requeue PI code is affected because the waiter starts
as a non-PI waiter and is woken on a PI futex.
Remove the crusty old comment about holding spinlocks() across
get_user() as we no longer do that. Correct the locking statement
with a description of why the test is performed.
Signed-off-by: Darren Hart <dvhltc@us.ibm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Dinakar Guniguntala <dino@in.ibm.com> Cc: John Stultz <johnstul@us.ibm.com>
LKML-Reference: <20090922053038.8717.97838.stgit@Aeon> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is currently no check to ensure that userspace uses the same
futex requeue target (uaddr2) in futex_requeue() that the waiter used
in futex_wait_requeue_pi(). A mismatch here could very unexpected
results as the waiter assumes it either wakes on uaddr1 or uaddr2. We
could detect this on wakeup in the waiter, but the cleanup is more
intense after the improper requeue has occured.
This patch stores the waiter's expected requeue target in a new
requeue_pi_key pointer in the futex_q which futex_requeue() checks
prior to attempting to do a proxy lock acquistion or a requeue when
requeue_pi=1. If they don't match, return -EINVAL from futex_requeue,
aborting the requeue of any remaining waiters.
Signed-off-by: Darren Hart <dvhltc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: John Kacur <jkacur@redhat.com> Cc: Dinakar Guniguntala <dino@in.ibm.com> Cc: John Stultz <johnstul@us.ibm.com>
LKML-Reference: <20090814003650.14634.63916.stgit@Aeon> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Yet another reason why trusting this stuff to the BIOS was a bad idea.
The HP DC7900 BIOS reports an iommu at an address which just returns all
ones, when VT-d is disabled in the BIOS.
Fix up the missing iounmap in the error paths while we're at it.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: Arto Jantunen <viiru@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Bash 4 filters out variables which contain a dot in them.
This happends to be the case of CPPFLAGS_vmlinux.lds.
This is rather unfortunate, as it now causes
build failures when using SHELL=/bin/bash to compile,
or when bash happens to be used by make (eg when it's /bin/sh)
Remove the common definition of CPPFLAGS_vmlinux.lds by
pushing relevant stuff to either Makefile.build or the
arch specific kernel/Makefile where we build the linker script.
This is also nice cleanup as we move the information out where
it is used.
Notes for the different architectures touched:
arm - we use an already exported symbol
cris - we use a config symbol aleady available
[Not build tested]
mips - the jiffies complexity has moved to vmlinux.lds.S where we need it.
Added a few variables to CPPFLAGS - they are only used by
the linker script.
[Not build tested]
powerpc - removed assignment that is not needed
[not build tested]
sparc - simplified it using $(BITS)
um - introduced a few new exported variables to deal with this
xtensa - added options to CPP invocation
[not build tested]
Cc: Albin Tonnerre <albin.tonnerre@free-electrons.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1294) fixes a problem that has plagued users for several
kernel releases. Some USB mass-storage devices don't return any sense
data when they encounter certain kinds of errors. The SCSI layer
interprets this to mean that the operation should be retried, and the
same thing happens -- over and over again with no limit. In some
circumstances (such as when a bus reset occurs) that is the right
thing to do, but not here.
The patch checks for this condition (a transport failure with no sense
data) and changes the result code to DID_ERROR and the sense code to
Hardware Error. This does get only a limited number of retries, and
so the command will fail relatively quickly instead of getting stuck
in an infinite loop.
The generic usbserial driver in Linux 2.6.31 halts its receiving
channel in response to throttle requests from the line discipline.
Unfortunately it drops the contents of the first URB received after
throttling takes effect. This patch corrects that problem.
Signed-off-by: Joris van Rantwijk <jorispubl@xs4all.nl> Acked-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In the Dell inspiron mini 10, the GPS is connected via a cp2102. This patch
adds detection of this USB device. (I haven't managed to use the GPS under
Linux yet, though)
This patch (as1293) fixes a problem with the ipaq serial driver. It
tries to bind to all the interfaces, even those that don't have enough
endpoints. The symptom is an invalid memory reference and oops when
the device is plugged in.
This patch (as1295) fixes a recently-added bug in the USB serial core.
If certain kinds of errors occur during probing, the core may call a
serial driver's release method without previously calling the attach
method. This causes some drivers (io_ti in particular) to perform an
invalid memory access.
The patch adds a new flag to keep track of whether or not attach has
been called.
- Re-structure read processing.
- Kill obsolete work queue and always push to tty in completion handler.
- Use tty_insert_flip_string instead of per character push when
possible.
- Fix stalled-read regression in 2.6.31 by using urb status to
determine when port is closed rather than port count.
- Fix race with open/close by checking ASYNCB_INITIALIZED in
unthrottle.
- Kill private rx_flag and lock and use throttle flags in
usb_serial_port instead.
Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove superfluous error checks in completion handler:
- No need to check private data and urb pointers as we check urb-status
before dereferencing priv (which is not freed until urb has been killed
on close).
- No need to check tty as it is checked again when processing.
- No need to check urb->number_of_packets on bulk urb.
Note that both private data and tty are checked again before processing
(possibly from work queue which also is cancelled on close).
Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Bastian Blank reported a boot crash with stackprotector enabled,
and debugged it back to edx register corruption.
For historical reasons irq enable/disable/save/restore had special
calling sequences to make them more efficient. With the more
recent introduction of higher-level and more general optimisations
this is no longer necessary so we can just use the normal PVOP_
macros.
This fixes some residual bugs in the old implementations which left
edx liable to inadvertent clobbering. Also, fix some bugs in
__PVOP_VCALLEESAVE which were revealed by actual use.
Running sg_luns on s390x with CONFIG_DEBUG_PAGEALLOC enabled fails
with EFAULT from the SG_IO ioctl. The EFAULT is the result from
copy_to_user failing in this call chain:
The sg driver calls sg_remove_scat to free the memory pages before
calling blk_rq_unmap_user that tries to copy the data back to
userspace. Change the order to first call blk_rq_unmap_user before
freeing the pages in sg_remove_scat.
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A target reset when I/O is ongoing might result
an eventual device offline, as scsi_eh_completed_normally()
might return ADD_TO_MLQUEUE in addition to the
advertised SUCCESS, FAILED, and NEEDS_RETRY.
Which is unfortunate as scsi_send_eh_cmnd() will
therefore map ADD_TO_MLQUEUE to FAILED instead of
the more appropriate NEEDS_RETRY.
Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When requesting all prl entries (kprl.addr == INADDR_ANY) and there are
more prl entries than there is space passed from userspace, the existing
code would always copy cmax+1 entries, which is more than can be handled.
This patch makes the kernel copy only exactly cmax entries.
Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de> Acked-By: Fred L. Templin <Fred.L.Templin@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jan Rafaj <jr+netfilter-devel@cedric.unob.cz> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In ax25_make_new, if kmemdup of digipeat returns an error, there would
be an oops in sk_free while calling sk_destruct, because sk_protinfo
is NULL at the moment; move sk->sk_destruct initialization after this.
BTW of reported-by: Bernard Pidoux F6BVP <f6bvp@free.fr>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
And also do a better job of returning proper NET_{RX,XMIT}_ values.
Based on a patch by Mark Smith.
This fixes CVE-2009-2903
Reported-by: Mark Smith <lk-netdev@lk-netdev.nosense.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The SKY2_HW_RAM_BUFFER bit in hw->flags was checked in sky2_mac_init(),
before being set later in sky2_up().
Setting SKY2_HW_RAM_BUFFER in sky2_init() where other hw->flags are set
should avoid this problem recurring.
Signed-off-by: Mike McCormack <mikem@ring3k.org> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Usbnet framework assumes USB hardware doesn't handle zero length
packets, but SMSC LAN95xx requires these to be sent for correct
operation.
This patch fixes an easily reproducible tx lockup when sending a frame
that results in exactly 512 bytes in a USB transmission (e.g. a UDP
frame with 458 data bytes, due to IP headers and our USB headers). It
adds an extra flag to usbnet for the hardware driver to indicate that
it can handle and requires the zero length packets.
This patch should not affect other usbnet users, please also consider
for -stable.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
(net: No more expensive sock_hold()/sock_put() on each tx)
opens a window in sock_wfree() where another cpu
might free the socket we are working on.
A fix is to call sk->sk_write_space(sk) while still
holding a reference on sk.
Reported-by: Jike Song <albcamus@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This appeared after I enabled CONFIG_TCP_MD5SIG and played with it a
bit, so I looked at what might have caused it.
One thing that struck me as strange is tcp_twsk_destructor(), as it
calls tcp_put_md5sig_pool() -- which entails a put_cpu(), causing the
detected imbalance. Found on 2.6.23.9, but 2.6.31 is affected as well,
as far as I can tell.
Signed-off-by: Robert Varga <nite@hq.alert.sk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
After commit 2b980dbd77d229eb60588802162c9659726b11f4
("lsm: Add hooks to the TUN driver") tun_set_iff doesn't
return -EINVAL though neither IFF_TUN nor IFF_TAP is set.
Signed-off-by: Kusanagi Kouichi <slash@ma.neweb.ne.jp> Reviewed-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
"..., when one process calls sendmsg once to send 43804 bytes of
data and one file descriptor, and another process then calls recvmsg
three times to receive the 16032+16032+11740 bytes, each of those
recvmsg calls returns the file descriptor in the ancillary data. I
confirmed this with strace. The behaviour differs from Linux
2.6.26, where reportedly only one of those recvmsg calls (I think
the first one) returned the file descriptor."
This bug was introduced by a patch from me titled "net: unix: fix inflight
counting bug in garbage collector", commit 6209344f5.
And the reason is, quoting Kalle:
"Before your patch, unix_attach_fds() would set scm->fp = NULL, so
that if the loop in unix_stream_sendmsg() ran multiple iterations,
it could not call unix_attach_fds() again. But now,
unix_attach_fds() leaves scm->fp unchanged, and I think this causes
it to be called multiple times and duplicate the same file
descriptors to each struct sk_buff."
Fix this by introducing a flag that is cleared at the start and set
when the fds attached to the first buffer. The resulting code should
work equivalently to the one on 2.6.26.
Reported-by: Kalle Olavi Niemitalo <kon@iki.fi> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We lost rx timestamping of packets received on accelerated vlans.
Effect is that tcpdump on real dev can show strange timings, since it gets rx timestamps
too late (ie at skb dequeueing time, not at skb queueing time)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The message "ACPI: Device needs an ACPI driver" is misleading. The
device _may_ need an ACPI driver, if the BIOS implemented a custom
API for the device in question (which, AFAIK, can't be checked.) If
not, then either a generic ACPI driver may be used (for example
"thermal"), or nothing can be done (other than a white list).
I propose to reword the message to:
ACPI: If an ACPI driver is available for this device, you should use
it instead of the native driver
which I think is more correct. Comments and suggestions welcome.
I also added a message warning about possible problems and system
instability when users pass acpi_enforce_resources=lax, as suggested
by Len.
Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Thomas Renninger <trenn@suse.de> Cc: Alan Jenkins <sourcejedi.lkml@googlemail.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When creating a new file, ima_path_check() assumed the new file
was being opened for write. Call ima_path_check() with the
appropriate acc_mode so that the read/write counters are
incremented correctly.
The problem was that on suspend, the PIT is removed as a clocksource,
and was using the mult value essentially as a is_enabled() flag. The
mult adjustments done in the commit above caused that usage to break,
causing bad list manipulation and the oops.
Further, on resume, the PIT clocksource is never restored, causing the
system to run in a degraded mode with jiffies as the clocksource.
This issue has since been resolved in 2.6.32-rc by commit 8cab02dc3c58a12235c6d463ce684dded9696848 which removes the clocksource
disabling on suspend. Testing shows no issues there.
So the following patch rectifies the situation for 2.6.31 users of the
PIT clocksource that use suspend and resume (which is probably not that
many).
Many thanks to Ondrej for helping narrow down what was happening, what
caused it, and verifying the fix.
---------------
Avoid using the unprotected clocksource.mult value as an "is_registered"
flag, instead us an explicit flag variable. This avoids possible list
corruption if the clocksource is double-unregistered.
Also re-register the PIT clocksource on resume so folks don't have to
use jiffies after suspend.
Signed-off-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Clear prefetch setting before potentially (re-)enabling it in
config_drive_art_rwp() so the transition of the device type on
the port from ATA to ATAPI (i.e. during warm-plug operation)
is handled correctly.
This is a really old bug (it probably goes back to very early
days of the driver) but it was only affecting warm-plug operation
until the recent "ide: try to use PIO Mode 0 during probe if
possible" change (commit 6029336426a2b43e4bc6f4a84be8789a047d139e).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Tested-by: David Fries <david@fries.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
After commit 355cfa73 ("mm: modify swap_map and add SWAP_HAS_CACHE flag"),
read_swap_cache_async() will busy-wait while a entry doesn't exist in swap
cache but it has SWAP_HAS_CACHE flag.
Such entries can exist on add/delete path of swap cache. On add path,
add_to_swap_cache() is called soon after SWAP_HAS_CACHE flag is set, and
on delete path, swapcache_free() will be called (SWAP_HAS_CACHE flag is
cleared) soon after __delete_from_swap_cache() is called. So, the
busy-wait works well in most cases.
But this mechanism can cause soft lockup if add_to_swap_cache() sleeps and
read_swap_cache_async() tries to swap-in the same entry on the same cpu.
This patch calls radix_tree_preload() before swapcache_prepare() and
divides add_to_swap_cache() into two part: radix_tree_preload() part and
radix_tree_insert() part(define it as __add_to_swap_cache()).
Which is why I have always preferred sizeof(struct foo) over
sizeof(var).
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When svm_vcpu_load is called while the vcpu is running in
guest mode the tsc adjustment made there is lost on the next
emulated #vmexit. This causes the tsc running backwards in
the guest. This patch fixes the issue by also adjusting the
tsc_offset in the emulated hsave area so that it will not
get lost.
The "VIA DXS" controls are actually volume controls that apply to the
four PCM substreams, so we better indicate this connection by moving the
controls to the PCM interface.
Commit b452e08e73c0e3dbb0be82130217be4b7084299e in 2.6.30 broke the
restoring of these volumes by "alsactl restore" that most distributions
use; the renaming in this patch cures that regression by preventing
alsactl from applying the old, wrong volume levels to the new controls.
http://bugzilla.kernel.org/show_bug.cgi?id=14151
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=532613
While trying to work around spurious detection retries for
non-existent devices on slave links, commit 816ab89782ac139a8b65147cca990822bb7e8675 incorrectly added link
offline check logic before ata_eh_thaw() was called. This means that
if an occupied link goes down briefly at the time that offline check
was performed, device class will be cleared to ATA_DEV_NONE and libata
wouldn't retry thus failing detection of the device.
The offline check should be done after the port is thawed together
with online check so that such link glitches can be detected by the
interrupt handler and handled properly.
Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Tim Blechmann <tim@klingt.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If the system did not switch into NOHZ mode ts->inidle is not set when
tick_nohz_stop_sched_tick() is called from the idle routine. Therefor
all subsequent calls from irq_exit() to tick_nohz_stop_sched_tick()
fail to call tick_nohz_start_idle(). This results in bogus idle
accounting information which is passed to cpufreq governors.
Set the inidle flag unconditionally of the NOHZ active state to keep
the idle time accounting correct in any case.
[ tglx: Added comment and tweaked the changelog ]
Reported-by: Steven Noonan <steven@uplinklabs.net> Signed-off-by: Eero Nurkkala <ext-eero.nurkkala@nokia.com> Cc: Rik van Riel <riel@redhat.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Steven Noonan <steven@uplinklabs.net>
LKML-Reference: <1254907901.30157.93.camel@eenurkka-desktop> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It's caused by the displacement of the retry_private label in
futex_wake_op(). The code unlocks the hash bucket locks in the
error handling path and retries without locking them again which
makes the next unlock fail.
Move retry_private so we lock the hash bucket locks when we retry.
Reported-by: Rich Ercolany <rercola@acm.jhu.edu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Darren Hart <dvhltc@us.ibm.com>
LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The robust list pointers of user space held futexes are kept intact
over an exec() call. When the exec'ed task exits exit_robust_list() is
called with the stale pointer. The risk of corruption is minimal, but
still it is incorrect to keep the pointers valid. Actually glibc
should uninstall the robust list before calling exec() but we have to
deal with it anyway.
Nullify the pointers after [compat_]exit_robust_list() has been
called.
Reported-by: Anirban Sinha <ani@anirban.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If futex_wait_requeue_pi() wakes prior to requeue, we drop the
reference to the source futex_key twice, once in
handle_early_requeue_pi_wakeup() and once on our way out.
Remove the drop from the handle_early_requeue_pi_wakeup() and keep
the get/drops together in futex_wait_requeue_pi().
Reported-by: Helge Bahmann <hcb@chaoticmind.net> Signed-off-by: Darren Hart <dvhltc@us.ibm.com> Cc: Helge Bahmann <hcb@chaoticmind.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Dinakar Guniguntala <dino@in.ibm.com> Cc: John Stultz <johnstul@us.ibm.com>
LKML-Reference: <4ACCE21E.5030805@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Due to legacy code from back when the dynamic tracer used a daemon,
only core kernel code was checking for failures. This is no longer
the case. We must check for failures any time we perform text modifications.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When the module is about the unload we release its call records.
The ftrace_release function was given wrong values representing
the module core boundaries, thus not releasing its call records.
Plus making ftrace_release function module specific.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
LKML-Reference: <1254934835-363-3-git-send-email-jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A couple of people have hit the WARN_ON() in drivers/char/tty_io.c,
tty_open() that is unhappy about seeing the tty line discipline go away
during the tty hangup. See for example
http://bugzilla.kernel.org/show_bug.cgi?id=14255
and the reason is that we do the tty_ldisc_halt() outside the
ldisc_mutex in order to be able to flush the scheduled work without a
deadlock with vhangup_work.
However, it turns out that we can solve this particular case by
- using "cancel_delayed_work_sync()" in tty_ldisc_halt(), which waits
for just the particular work, rather than synchronizing with any
random outstanding pending work.
This won't deadlock, since the buf.work we synchronize with doesn't
care about the ldisc_mutex, it just flushes the tty ldisc buffers.
- realize that for this particular case, we don't need to wait for any
hangup work, because we are inside the hangup codepaths ourselves.
so as a result we can just drop the flush_scheduled_work() entirely, and
then move the tty_ldisc_halt() call to inside the mutex. That way we
never expose the partially torn down ldisc state to tty_open(), and hold
the ldisc_mutex over the whole sequence.
Just like ip_fast_csum, the assembly snippet in csum_ipv6_magic needs a
memory clobber, as it is only passed the address of the buffer, not a
memory reference to the buffer itself.
This caused failures in Hurd's pfinetv4 when we tried to compile it with
gcc-4.3 (bogus checksums).
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: "David S. Miller" <davem@davemloft.net> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
For devices using OTP memory, EEPROM image can start from
any one of the OTP blocks. If shadow RAM is disabled, we need to
traverse link list to find the last valid block, then start the EEPROM
image reading.
If OTP is not full, the valid block is the block _before_ the last block
on the link list; the last block on the link list is the empty block
ready for next OTP refresh/update.
If OTP is full, then the last block is the valid block to be used for
configure the device.
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On 1000, there are two Switching Voltage Regulators (SVR). The first one
apply digital voltage level (1.32V) for PCIe block and core. We need to
use this regulator to solve a stability issue related to noisy DC2DC
line in the silicon.
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Adding new API version to account for change to ucode file format. New
header includes the build number of the ucode. This build number is the
SVN revision thus allowing for exact correlation to the code that
generated it.
The header adds the build number so that older ucode images can also be
enhanced to include the build in the future.
some cleanup in iwl_read_ucode needed to ensure old header not used and
reduce unnecessary references through pointer with the data is already
in heap variable.
Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix MAP_PRIVATE mmap() of files and devices where the data in the backing store
might be mapped directly. Use the BDI_CAP_MAP_DIRECT capability flag to govern
whether or not we should be trying to map a file directly. This can be used to
determine whether or not a region has been filled in at the point where we call
do_mmap_shared() or do_mmap_private().
The BDI_CAP_MAP_DIRECT capability flag is cleared by validate_mmap_request() if
there's any reason we can't use it. It's also cleared in do_mmap_pgoff() if
f_op->get_unmapped_area() fails.
Without this fix, attempting to run a program from a RomFS image on a
non-mappable MTD partition results in a BUG as the kernel attempts XIP, and
this can be caught in gdb:
Program received signal SIGABRT, Aborted.
0xc005dce8 in add_nommu_region (region=<value optimized out>) at mm/nommu.c:547
(gdb) bt
#0 0xc005dce8 in add_nommu_region (region=<value optimized out>) at mm/nommu.c:547
#1 0xc005f168 in do_mmap_pgoff (file=0xc31a6620, addr=<value optimized out>, len=3808, prot=3, flags=6146, pgoff=0) at mm/nommu.c:1373
#2 0xc00a96b8 in elf_fdpic_map_file (params=0xc33fbbec, file=0xc31a6620, mm=0xc31bef60, what=0xc0213144 "executable") at mm.h:1145
#3 0xc00aa8b4 in load_elf_fdpic_binary (bprm=0xc316cb00, regs=<value optimized out>) at fs/binfmt_elf_fdpic.c:343
#4 0xc006b588 in search_binary_handler (bprm=0x6, regs=0xc33fbce0) at fs/exec.c:1234
#5 0xc006c648 in do_execve (filename=<value optimized out>, argv=0xc3ad14cc, envp=0xc3ad1460, regs=0xc33fbce0) at fs/exec.c:1356
#6 0xc0008cf0 in sys_execve (name=<value optimized out>, argv=0xc3ad14cc, envp=0xc3ad1460) at arch/frv/kernel/process.c:263
#7 0xc00075dc in __syscall_call () at arch/frv/kernel/entry.S:897
Note that this fix does the following commit differently:
commit a190887b58c32d19c2eee007c5eb8faa970a69ba
Author: David Howells <dhowells@redhat.com>
Date: Sat Sep 5 11:17:07 2009 -0700
nommu: fix error handling in do_mmap_pgoff()
Reported-by: Graff Yang <graff.yang@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch is solving problem for PAE kernel DMA operation.
On PAE system dma_addr and unsigned long will have different
values.
Now dma_addr is not type casted using unsigned long.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Cc: Jan Beulich <JBeulich@novell.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit fa047e4f6fa63a6e9d0ae4d7749538830d14a343 "HID: fix inverted
wheel for bluetooth version of apple mighty mouse" is incomplete. If
we remove Apple MightyMouse (bluetooth version) from the list of
apple_devices in drivers/hid/hid-apple.c we have to remove it from
hid_blacklist in drivers/hid/hid-core.c as well.
powerpc: Fix incorrect setting of __HAVE_ARCH_PTE_SPECIAL
[I'm going to fix upstream differently, by having all CPU types
actually support _PAGE_SPECIAL, but I prefer the simple and obvious
fix for -stable. -- Ben]
The test that decides whether to define __HAVE_ARCH_PTE_SPECIAL on
powerpc is bogus and will end up always defining it, even when
_PAGE_SPECIAL is not supported (in which case it's 0) such as on
8xx or 40x processors.
Signed-off-by: Bernhard Weirich <bernhard.weirich@riedel.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
After upgrading to the latest kernel on my mpc875 userspace started
running incredibly slow (hours to get to a shell, even!).
I tracked it down to commit 8d30c14cab30d405a05f2aaceda1e9ad57800f36,
that patch removed a work-around for the 8xx. Adding it
back makes my problem go away.
Signed-off-by: Rex Feany <rfeany@mrv.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ir-kbd-i2c's ir_probe() function can be called much later (i.e. at
ir-kbd-i2c module load), than the lifetime of a struct IR_i2c_init_data
allocated off of the stack in cx18_i2c_new_ir() at registration time.
Make sure we pass a pointer to a persistent IR_i2c_init_data object at
i2c registration time.
Thanks to Brian Rogers, Dustin Mitchell, Andy Walls and Jean Delvare to
rise this question.
Before this patch, if ir-kbd-i2c were probed after SAA7134, trash data
were used.
Compile tested only, but the patch is identical to em28xx one. So, it
should work properly.
ir-kbd-i2c's ir_probe() function can be called much later (i.e. at
ir-kbd-i2c module load), than the lifetime of a struct IR_i2c_init_data
allocated off of the stack in cx18_i2c_new_ir() at registration time.
Make sure we pass a pointer to a persistent IR_i2c_init_data object at
i2c registration time.
Thanks to Brian Rogers, Dustin Mitchell, Andy Walls and Jean Delvare to
rise this question.
Before this patch, if ir-kbd-i2c were probed after em28xx, trash data
were used. After the patch, no matter what order, it is properly
reported as tested by me:
input: i2c IR (i2c IR (EM2840 Hauppaug as /class/input/input10
ir-kbd-i2c: i2c IR (i2c IR (EM2840 Hauppaug detected at i2c-4/4-0030/ir0 [em28xx #0]
During a page fault and rebinding the buffer there exists a window for a
signal to arrive during the i915_wait_request() and trigger a
ERESTARTSYS. This used to be handled by returning SIGBUS and thereby
killing the application. Try 'cairo-perf-trace & cairo-test-suite' and
watch X go boom!
The solution as suggested by H. Peter Anvin is to simply return NOPAGE and
leave the higher layers to spot we did not fill the page and resubmit
the page fault.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[anholt: Mostly squash it with another commit] Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Git commit 79741dd changes idle cputime accounting, but unfortunately
the /proc/uptime file hasn't caught up. Here the idle time calculation
from /proc/stat is copied over.
Signed-off-by: Michael Abbott <michael.abbott@diamond.ac.uk> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We noticed very erratic behavior [throughput] with the AIM7 shared
workload running on recent distro [SLES11] and mainline kernels on an
8-socket, 32-core, 256GB x86_64 platform. On the SLES11 kernel
[2.6.27.19+] with Barcelona processors, as we increased the load [10s of
thousands of tasks], the throughput would vary between two "plateaus"--one
at ~65K jobs per minute and one at ~130K jpm. The simple patch below
causes the results to smooth out at the ~130k plateau.
But wait, there's more:
We do not see this behavior on smaller platforms--e.g., 4 socket/8 core.
This could be the result of the larger number of cpus on the larger
platform--a scalability issue--or it could be the result of the larger
number of interconnect "hops" between some nodes in this platform and how
the tasks for a given load end up distributed over the nodes' cpus and
memories--a stochastic NUMA effect.
The variability in the results are less pronounced [on the same platform]
with Shanghai processors and with mainline kernels. With 31-rc6 on
Shanghai processors and 288 file systems on 288 fibre attached storage
volumes, the curves [jpm vs load] are both quite flat with the patched
kernel consistently producing ~3.9% better throughput [~80K jpm vs ~77K
jpm] than the unpatched kernel.
Profiling indicated that the "slow" runs were incurring high[er]
contention on an anon_vma lock in vma_adjust(), apparently called from the
sbrk() system call.
The patch:
A comment in mm/mmap.c:vma_adjust() suggests that we don't really need the
anon_vma lock when we're only adjusting the end of a vma, as is the case
for brk(). The comment questions whether it's worth while to optimize for
this case. Apparently, on the newer, larger x86_64 platforms, with
interesting NUMA topologies, it is worth while--especially considering
that the patch [if correct!] is quite simple.
We can detect this condition--no overlap with next vma--by noting a NULL
"importer". The anon_vma pointer will also be NULL in this case, so
simply avoid loading vma->anon_vma to avoid the lock.
However, we DO need to take the anon_vma lock when we're inserting a vma
['insert' non-NULL] even when we have no overlap [NULL "importer"], so we
need to check for 'insert', as well. And Hugh points out that we should
also take it when adjusting vm_start (so that rmap.c can rely upon
vma_address() while it holds the anon_vma lock).
akpm: Zhang Yanmin reprts a 150% throughput improvement with aim7, so it
might be -stable material even though thiss isn't a regression: "this
issue is not clear on dual socket Nehalem machine (2*4*2 cpu), but is
severe on large machine (4*8*2 cpu)"
[hugh.dickins@tiscali.co.uk: test vma start too] Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Nick Piggin <npiggin@suse.de> Cc: Eric Whitney <eric.whitney@hp.com> Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
do_anonymous_page() has been wrong to dirty the pte regardless.
If it's not going to mark the pte writable, then it won't help
to mark it dirty here, and clogs up memory with pages which will
need swap instead of being thrown away. Especially wrong if no
overcommit is chosen, and this vma is not yet VM_ACCOUNTed -
we could exceed the limit and OOM despite no overcommit.
Hiroaki Wakabayashi points out that when mlock() has been interrupted
by SIGKILL, the subsequent munlock() takes unnecessarily long because
its use of __get_user_pages() insists on faulting in all the pages
which mlock() never reached.
It's worse than slowness if mlock() is terminated by Out Of Memory kill:
the munlock_vma_pages_all() in exit_mmap() insists on faulting in all the
pages which mlock() could not find memory for; so innocent bystanders are
killed too, and perhaps the system hangs.
__get_user_pages() does a lot that's silly for munlock(): so remove the
munlock option from __mlock_vma_pages_range(), and use a simple loop of
follow_page()s in munlock_vma_pages_range() instead; ignoring absent
pages, and not marking present pages as accessed or dirty.
(Change munlock() to only go so far as mlock() reached? That does not
work out, given the convention that mlock() claims complete success even
when it has to give up early - in part so that an underlying file can be
extended later, and those pages locked which earlier would give SIGBUS.)
After anti-fragmentation was merged, a bug was reported whereby devices
that depended on high-order atomic allocations were failing. The solution
was to preserve a property in the buddy allocator which tended to keep the
minimum number of free pages in the zone at the lower physical addresses
and contiguous. To preserve this property, MIGRATE_RESERVE was introduced
and a number of pageblocks at the start of a zone would be marked
"reserve", the number of which depended on min_free_kbytes.
Anti-fragmentation works by avoiding the mixing of page migratetypes
within the same pageblock. One way of helping this is to increase
min_free_kbytes because it becomes less like that it will be necessary to
place pages of of MIGRATE_RESERVE is unbounded, the free memory is kept
there in large contiguous blocks instead of helping anti-fragmentation as
much as it should. With the page-allocator tracepoint patches applied, it
was found during anti-fragmentation tests that the number of
fragmentation-related events were far higher than expected even with
min_free_kbytes at higher values.
This patch limits the number of MIGRATE_RESERVE blocks that exist per zone
to two. For example, with a sufficient min_free_kbytes, 4MB of memory
will be kept aside on an x86-64 and remain more or less free and
contiguous for the systems uptime. This should be sufficient for devices
depending on high-order atomic allocations while helping fragmentation
control when min_free_kbytes is tuned appropriately. As side-effect of
this patch is that the reserve variable is converted to int as unsigned
long was the wrong type to use when ensuring that only the required number
of reserve blocks are created.
With the patches applied, fragmentation-related events as measured by the
page allocator tracepoints were significantly reduced when running some
fragmentation stress-tests on systems with min_free_kbytes tuned to a
value appropriate for hugepage allocations at runtime. On x86, the events
recorded were reduced by 99.8%, on x86-64 by 99.72% and on ppc64 by
99.83%.
Lee Schermerhorn [Tue, 22 Sep 2009 00:01:04 +0000 (17:01 -0700)]
hugetlb: restore interleaving of bootmem huge pages (2.6.31)
Not upstream as it is fixed differently in .32
I noticed that alloc_bootmem_huge_page() will only advance to the next
node on failure to allocate a huge page. I asked about this on linux-mm
and linux-numa, cc'ing the usual huge page suspects. Mel Gorman
responded:
I strongly suspect that the same node being used until allocation
failure instead of round-robin is an oversight and not deliberate
at all. It appears to be a side-effect of a fix made way back in
commit 63b4613c3f0d4b724ba259dc6c201bb68b884e1a ["hugetlb: fix
hugepage allocation with memoryless nodes"]. Prior to that patch
it looked like allocations would always round-robin even when
allocation was successful.
Andy Whitcroft countered that the existing behavior looked like Andi
Kleen's original implementation and suggested that we ask him. We did and
Andy replied that his intention was to interleave the allocations. So,
...
This patch moves the advance of the hstate next node from which to
allocate up before the test for success of the attempted allocation. This
will unconditionally advance the next node from which to alloc,
interleaving successful allocations over the nodes with sufficient
contiguous memory, and skipping over nodes that fail the huge page
allocation attempt.
Note that alloc_bootmem_huge_page() will only be called for huge pages of
order > MAX_ORDER.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: David Rientjes <rientjes@google.com> Cc: Adam Litke <agl@us.ibm.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Eric Whitney <eric.whitney@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When there's a descriptor after the SuperSpeed endpoint companion
descriptor, the previous code would have skipped over twice the length it
was supposed to. This code fixes crashes seen with UASP devices (which
have a UASP descriptor after the SS endpoint companion descriptor).
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Interrupt transfers are submitted to the xHCI hardware using the same TRB
type as bulk transfers. Re-use the bulk transfer enqueueing code to
enqueue interrupt transfers.
Interrupt transfers are a bit different than bulk transfers. When the
interrupt endpoint is to be serviced, the xHC will consume (at most) one
TD. A TD (comprised of sg list entries) can take several service
intervals to transmit. The important thing for device drivers to note is
that if they use the scatter gather interface to submit interrupt
requests, they will not get data sent from two different scatter gather
lists in the same service interval.
For now, the xHCI driver will use the service interval from the endpoint's
descriptor (bInterval). Drivers will need a hook to poll at a more
frequent interval. Set urb->interval to the interval that the xHCI
hardware will use.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The xHCI hardware reports the number of bytes untransferred for a given
transfer buffer. If the hardware reports a bytes untransferred value
greater than the submitted buffer size, we want to play it safe and say no
data was transferred. If the driver considers a short packet to be an
error, remember to set -EREMOTEIO.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Make sure that the amount of data the xHC says was transmitted is less
than or equal to the size of the requested transfer buffer. Before, if
the host controller erroneously reported that the number of bytes
untransferred was bigger than the buffer in the URB, urb->actual_length
could be set to a very large size.
Make sure urb->actual_length <= urb->transfer_buffer_length.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On a successful transfer, urb->td is freed before the URB is ready to be
given back to the driver. Don't touch urb->td after it's freed. This bug
would have only shown up when xHCI debugging was turned on, and the freed
memory was quickly reused for something else.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The 0.95 xHCI spec says that non-control endpoints will be halted if a
babble is detected on a transfer. The 0.96 xHCI spec says all types of
endpoints will be halted when a babble is detected. Some hardware that
claims to be 0.95 compliant halts the control endpoint anyway.
When a babble is detected on a control endpoint, check the hardware's
output endpoint context to see if the endpoint is marked as halted. If
the control endpoint is halted, a reset endpoint command must be issued
and the transfer ring dequeue pointer needs to be moved past the stopped
transfer. Basically, we treat it as if the control endpoint had stalled.
Handle bulk babbling endpoints as if we got a completion event with a
stall completion code.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This Fresco Logic xHCI host controller chip revision puts bad data into
the output endpoint context after a Reset Endpoint command. It needs a
Configure Endpoint command (instead of a Set TR Dequeue Pointer command)
after the reset endpoint command.
Set up the input context before issuing the Reset Endpoint command so we
don't copy bad data from the output endpoint context. The HW also can't
handle two commands queued at once, so submit the TRB for the Configure
Endpoint command in the event handler for the Reset Endpoint command.
Devices that stall on control endpoints before a configuration is selected
will not work under this Fresco Logic xHCI host controller revision.
This patch is for prototype hardware that will be given to other companies
for evaluation purposes only, and should not reach consumer hands. Fresco
Logic's next chip rev should have this bug fixed.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When a control endpoint stalls, the next control transfer will clear the
stall. The USB core doesn't call down to the host controller driver's
endpoint_reset() method when control endpoints stall, so the xHCI driver
has to do all its stall handling for internal state in its interrupt handler.
When the host stalls on a control endpoint, it may stop on the data phase
or status phase of the control transfer. Like other stalled endpoints,
the xHCI driver needs to queue a Reset Endpoint command and move the
hardware's control endpoint ring dequeue pointer past the failed control
transfer (with a Set TR Dequeue Pointer or a Configure Endpoint command).
Since the USB core doesn't call usb_hcd_reset_endpoint() for control
endpoints, we need to do this in interrupt context when we get notified of
the stalled transfer. URBs may be queued to the hardware before these two
commands complete. The endpoint queue will be restarted once both
commands complete.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Full speed devices have varying max packet sizes (8, 16, 32, or 64) for
endpoint 0. The xHCI hardware needs to know the real max packet size
that the USB core discovers after it fetches the first 8 bytes of the
device descriptor.
In order to fix this without adding a new hook to host controller drivers,
the xHCI driver looks for an updated max packet size for control
endpoints. If it finds an updated size, it issues an evaluate context
command and waits for that command to finish. This should only happen in
the initialization and device descriptor fetching steps in the khubd
thread, so blocking should be fine.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>