Nokia S60 phones expose two ACM channels. The first is a modem and is picked
up by the standard AT-command interface information in the CDC-ACM driver. The
second is marked as having a vendor-specific protocol. Normally, we don't
expose those as ttys. (On some other devices, they may be claimed by the
rndis_host driver and used as a network interface).
But on S60 this second ACM channel is the way that third-party S60 application
developers are expected to communicate over USB. It acts as a serial device
at the S60 end, and so it should on Linux too.
The list of devices is largely derived from:
http://wiki.forum.nokia.com/index.php/S60_Platform_and_device_identification_codes
http://wiki.forum.nokia.com/index.php/Nokia_USB_Product_IDs
and includes only the S60 3rd Edition+ devices documented there.
There are many devices for which the USB device ID is not documented,
including:
Nokia 6290
Nokia E63
Nokia 5630 XpressMusic
Nokia 5730 XpressMusic
Nokia 6710 Navigator
Nokia 6720 classic
Nokia 6730 Classic
Nokia 6760 slide
Nokia 6790 slide
Nokia 6790 Surge
Nokia E52
Nokia E55
Nokia E71x (AT&T)
Nokia E72
Nokia E75
Nokia E75 US+LTA variant
Nokia N79
Nokia N86 8MP
Nokia 5230 (RM-588)
Nokia 5230 (RM-594)
Nokia 5530 XpressMusic
Nokia 5530 XpressMusic (china)
Nokia 5800 XM
Nokia N97 (RM-506)
Nokia N97 mini
Nokia X6
It would be good to add those subsequently.
Signed-off-by: Adrian Taylor <aat@realvnc.com> Acked-by: Oliver Neukum <oliver@neukum.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add the USB IDs needed to support the B&B USOPTL4-4P, USO9ML2-2P, and
USO9ML2-4P. This patch expands and corrects a typo in the patch sent
on 08-31-2010.
Signed-off-by: Dave Ludlow <dave.ludlow@bay.ws> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is the cable between an H3000 navigation unit and a multi-function display.
http://www.bandg.com/en/Products/H3000/Spares-and-Accessories/Cables/H3000-CPU-USB-Cable-Pack/
Signed-off-by: Jason Detring <jason.detring@navico.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The iounmap(ehci->ohci_hcctrl_reg); should be the first thing we do
because the ioremap() was the last thing we did. Also if we hit any of
the goto statements in the original code then it would have led to a
NULL dereference of "ehci". This bug was introduced in: 796bcae7361c
"USB: powerpc: Workaround for the PPC440EPX USBH_23 errata [take 3]"
I modified the few lines in front a little so that my code didn't
obscure the return success code path.
Signed-off-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
For local mounts, ocfs2_read_locked_inode() calls ocfs2_read_blocks_sync() to
read the inode off the disk. The latter first checks to see if that block is
cached in the journal, and, if so, returns that block. That is ok.
But ocfs2_read_locked_inode() goes wrong when it tries to validate the checksum
of such blocks. Blocks that are cached in the journal may not have had their
checksum computed as yet. We should not validate the checksums of such blocks.
The 5 GHz CTL indexes were not being read for all hardware
devices due to the masking out through the CTL_MODE_M mask
being one bit too short. Without this the calibrated regulatory
maximum values were not being picked up when devices operate
on 5 GHz in HT40 mode. The final output power used for Atheros
devices is the minimum between the calibrated CTL values and
what CRDA provides.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Bartly reported that fuse can hang in fuse_get_req_nofail() when
the connection to the filesystem server is no longer active.
If bg_queue is not empty then flush_bg_queue() called from
request_end() can put more requests on to the pending queue. If this
happens while ending requests on the processing queue then those
background requests will be queued to the pending list and never
ended.
Another problem is that fuse_dev_release() didn't wake up processes
sleeping on blocked_waitq.
Solve this by:
a) flushing the background queue before calling end_requests() on the
pending and processing queues
b) setting blocked = 0 and waking up processes waiting on
blocked_waitq()
Increased storvsc ringbuffer and max_io_requests. This now more
closely mimics the numbers on Hyper-V. And will allow more IO requests
to take place for the SCSI driver.
Max_IO is set to double from what it was before, Hyper-V allows it and
we have had appliance builder requests to see if it was a problem to
increase the number.
Ringbuffer size for storvsc is now increased because I have seen A few buffer
problems on extremely busy systems. They were Set pretty low before.
And since max_io_requests is increased I Really needed to increase the buffer
as well.
Fixed the value of the 64bit-hole inside ring buffer, this
caused a problem on Hyper-V when running checked Windows builds.
Checked builds of Windows are used internally and given to external
system integrators at times. They are builds that for example that all
elements in a structure follow the definition of that Structure. The bug
this fixed was for a field that we did not fill in at all (Because we do
Not use it on the Linux side), and the checked build of windows gives
errors on it internally to the Windows logs.
Fixed bounce offset kmap problem by using correct index.
The symptom of the problem is that in some NAS appliances this problem
represents Itself by a unresponsive VM under a load with many clients writing
small files.
Fix missing functions for net_device_ops.
It's a bug when porting the drivers from 2.6.27 to 2.6.32. In 2.6.27,
the default functions for Ethernet, like eth_change_mtu(), were assigned
by ether_setup(). But in 2.6.32, these function pointers moved to
net_device_ops structure and no longer be assigned in ether_setup(). So
we need to set these functions in our driver code. It will ensure the
MTU won't be set beyond 1500. Otherwise, this can cause an error on the
server side, because the HyperV linux driver doesn't support jumbo frame
yet.
Mike Galbraith [Thu, 26 Aug 2010 03:29:16 +0000 (05:29 +0200)]
sched: revert stable c6fc81a sched: Fix a race between ttwu() and migrate_task()
This commit does not appear to have been meant for 32-stable, and causes ltp's
cpusets testcases to fail, revert it.
Original commit text:
sched: Fix a race between ttwu() and migrate_task()
Based on commit e2912009fb7b715728311b0d8fe327a1432b3f79 upstream, but
done differently as this issue is not present in .33 or .34 kernels due
to rework in this area.
If a task is in the TASK_WAITING state, then try_to_wake_up() is working
on it, and it will place it on the correct cpu.
This commit ensures that neither migrate_task() nor __migrate_task()
calls set_task_cpu(p) while p is in the TASK_WAKING state. Otherwise,
there could be two concurrent calls to set_task_cpu(p), resulting in
the task's cfs_rq being inconsistent with its cpu.
Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Amit Arora [Wed, 19 May 2010 09:05:57 +0000 (14:35 +0530)]
sched: kill migration thread in CPU_POST_DEAD instead of CPU_DEAD
[Fixed in a different manner upstream, due to rewrites in this area]
Problem : In a stress test where some heavy tests were running along with
regular CPU offlining and onlining, a hang was observed. The system seems to
be hung at a point where migration_call() tries to kill the migration_thread
of the dying CPU, which just got moved to the current CPU. This migration
thread does not get a chance to run (and die) since rt_throttled is set to 1
on current, and it doesn't get cleared as the hrtimer which is supposed to
reset the rt bandwidth (sched_rt_period_timer) is tied to the CPU which we just
marked dead!
Solution : This patch pushes the killing of migration thread to "CPU_POST_DEAD"
event. By then all the timers (including sched_rt_period_timer) should have got
migrated (along with other callbacks).
Signed-off-by: Amit Arora <amitarora@in.ibm.com> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 2ca1af9aa3285c6a5f103ed31ad09f7399fc65d7 "PCI: MSI: Remove
unsafe and unnecessary hardware access" changed read_msi_msg_desc() to
return the last MSI message written instead of reading it from the
device, since it may be called while the device is in a reduced
power state.
However, the pSeries platform code really does need to read messages
from the device, since they are initially written by firmware.
Therefore:
- Restore the previous behaviour of read_msi_msg_desc()
- Add new functions get_cached_msi_msg{,_desc}() which return the
last MSI message written
- Use the new functions where appropriate
Acked-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
During suspend on an SMP system, {read,write}_msi_msg_desc() may be
called to mask and unmask interrupts on a device that is already in a
reduced power state. At this point memory-mapped registers including
MSI-X tables are not accessible, and config space may not be fully
functional either.
While a device is in a reduced power state its interrupts are
effectively masked and its MSI(-X) state will be restored when it is
brought back to D0. Therefore these functions can simply read and
write msi_desc::msg for devices not in D0.
Further, read_msi_msg_desc() should only ever be used to update a
previously written message, so it can always read msi_desc::msg
and never needs to touch the hardware.
TSC's get reset after suspend/resume (even on cpu's with invariant TSC
which runs at a constant rate across ACPI P-, C- and T-states). And in
some systems BIOS seem to reinit TSC to arbitrary large value (still
sync'd across cpu's) during resume.
This leads to a scenario of scheduler rq->clock (sched_clock_cpu()) less
than rq->age_stamp (introduced in 2.6.32). This leads to a big value
returned by scale_rt_power() and the resulting big group power set by the
update_group_power() is causing improper load balancing between busy and
idle cpu's after suspend/resume.
This resulted in multi-threaded workloads (like kernel-compilation) go
slower after suspend/resume cycle on core i5 laptops.
Fix this by recomputing cyc2ns_offset's during resume, so that
sched_clock() continues from the point where it was left off during
suspend.
Fix DSM/TRIM commands in sata_mv (v2).
These need to be issued using old-school "BM DMA",
rather than via the EDMA host queue.
Since the chips don't have proper BM DMA status,
we need to be more careful with setting the ATA_DMA_INTR bit,
since DSM/TRIM often has a long delay between "DMA complete"
and "command complete".
GEN_I chips don't have BM DMA, so no TRIM for them.
Signed-off-by: Mark Lord <mlord@pobox.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The non-standard name "iMic" makes PulseAudio ignore the microphone. BugLink: https://launchpad.net/bugs/605101 Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
IPIs and VIRQs are inherently per-cpu event types, so treat them as such:
- use a specific percpu irq_chip implementation, and
- handle them with handle_percpu_irq
This makes the path for delivering these interrupts more efficient
(no masking/unmasking, no locks), and it avoid problems with attempts
to migrate them.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Xen events are logically edge triggered, as Xen only calls the event
upcall when an event is newly set, but not continuously as it remains set.
As a result, use handle_edge_irq rather than handle_level_irq.
This has the important side-effect of fixing a long-standing bug of
events getting lost if:
- an event's interrupt handler is running
- the event is migrated to a different vcpu
- the event is re-triggered
The most noticable symptom of these lost events is occasional lockups
of blkfront.
Many thanks to Tom Kopec and Daniel Stodden in tracking this down.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Tom Kopec <tek@acm.org> Cc: Daniel Stodden <daniel.stodden@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit 8bf0223ed515be24de0c671eedaff49e78bebc9c (hwmon, k8temp: Fix
temperature reporting for ASB1 processor revisions) fixed temperature
reporting for ASB1 CPUs. But those CPU models (model 0x6b, 0x6f, 0x7f)
were packaged both as AM2 (desktop) and ASB1 (mobile). Thus the commit
leads to wrong temperature reporting for AM2 CPU parts.
The solution is to determine the package type for models 0x6b, 0x6f,
0x7f.
This is done using BrandId from CPUID Fn8000_0001_EBX[15:0]. See
"Constructing the processor Name String" in "Revision Guide for AMD
NPT Family 0Fh Processors" (Rev. 3.46).
Cc: Rudolf Marek <r.marek@assembler.cz> Reported-by: Vladislav Guberinic <neosisani@gmail.com> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When the SMP kernel decides to crash_kexec() the local APICs may have
pending interrupts in their vector tables.
The setup routine for the local APIC has a deficient mechanism for
clearing these interrupts, it only handles interrupts that has already
been dispatched to the local core for servicing (the ISR register) safely,
it doesn't consider lower prioritized queued interrupts stored in the IRR
register.
If you have more than one pending interrupt within the same 32 bit word in
the LAPIC vector table registers you may find yourself entering the IO
APIC setup with pending interrupts left in the LAPIC. This is a situation
for wich the IO APIC setup is not prepared. Depending of what/which
interrupt vector/vectors are stuck in the APIC tables your system may show
various degrees of malfunctioning. That was the reason why the
check_timer() failed in our system, the timer interrupts was blocked by
pending interrupts from the old kernel when routed trough the IO APIC.
Additional comment from Jiri Bohac:
==============
If this should go into stable release,
I'd add some kind of limit on the number of iterations, just to be safe from
hard to debug lock-ups:
with MAX_LOOPS something like 1E9 this would leave plenty of time for the
pending IRQs to be cleared and would and still cause at most a second of delay
if the loop were to lock-up for whatever reason.
[trenn@suse.de:
V2: Use tsc if avail to bail out after 1 sec due to possible virtual
apic_read calls which may take rather long (suggested by: Avi Kivity
<avi@redhat.com>) If no tsc is available bail out quickly after
cpu_khz, if we broke out too early and still have irqs pending (which
should never happen?) we still get a WARN_ON...
V3: - Fixed indentation -> checkpatch clean
- max_loops must be signed
V4: - Fix typo, mixed up tsc and ntsc in first rdtscll() call
V5: Adjust WARN_ON() condition to also catch error in cpu_has_tsc case]
Cc: <jbohac@novell.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Kerstin Jonsson <kerstin.jonsson@ericsson.com> Cc: Avi Kivity <avi@redhat.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Thomas Renninger <trenn@suse.de>
LKML-Reference: <201005241913.o4OJDGWM010865@imap1.linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Thomas Renninger <trenn@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add ftdi product ID for Lenz LI-USB, a model train interface. This
was NOT tested against 2.6.35, but a similar patch was tested with the
CentOS 2.6.18-194.11.1.el5 kernel. It wasn't clear to me what
ordering is being used in ftdi_sio.c, so I inserted the ID after another
model train entry(SPROG_II).
The code to increment the TRB pointer has a slight ambiguity that could
lead to a bug on different compilers. The ANSI C specification does not
specify the precedence of the assignment operator over the postfix
operator. gcc 4.4 produced the correct code (increment the pointer and
assign the value), but a MIPS compiler that one of John's clients used
assigned the old (unincremented) value.
Remove the unnecessary assignment to make all compilers produce the
correct assembly.
Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If we can't read the firmware for a device from the disk, and yet the
device already has a valid firmware image in it, we don't want to
replace the firmware with something invalid. So check the version
number to be less than the current one to verify this is the correct
thing to do.
Reported-by: Chris Beauchamp <chris@chillibean.tv> Tested-by: Chris Beauchamp <chris@chillibean.tv> Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add support for the Zeagle N2iTiON3 dive computer interface. Since
Zeagle devices are actually manufactured by Seiko, this patch will
support other Seiko based models as well.
>Try the navman driver instead. You can either add the device id to the
> driver and rebuild it, or do this before you plug the device in:
> modprobe navman
> echo -n "0x0df7 0x0900" > /sys/bus/usb-serial/drivers/navman/new_id
>
> and then plug your device in and see if that works.
I can confirm that the navman driver works with the right device IDs on
my i-gotU GT-600, which has the same device IDs. Attached is a patch
adding the IDs.
From: Ross Burton <ross@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Userspace controls the amount of memory to be allocate, so it can
get the ioctl to allocate more memory than the kernel uses, and get
access to kernel stack. This can only be done for processes authenticated
to the X server for DRI access, and if the user has DRI access.
Fix is to just memset the data to 0 if the user doesn't copy into
it in the first place.
net/compat/wext: send different messages to compat tasks
we had a race condition when setting and then
restoring frag_list. Eric attempted to fix it,
but the fix created even worse problems.
However, the original motivation I had when I
added the code that turned out to be racy is
no longer clear to me, since we only copy up
to skb->len to userspace, which doesn't include
the frag_list length. As a result, not doing
any frag_list clearing and restoring avoids
the race condition, while not introducing any
other problems.
Additionally, while preparing this patch I found
that since none of the remaining netlink code is
really aware of the frag_list, we need to use the
original skb's information for packet information
and credentials. This fixes, for example, the
group information received by compat tasks.
Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
BugLink: https://bugs.launchpad.net/bugs/619439
This ThinkPad model needs External Amplifier muted for audible playback,
so set the inv_eapd quirk for it.
Reported-and-tested-by: Dennis Bell <dennis.bell@parkerg.co.uk> Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It doesn't like pattern and explicit rules to be on the same line,
and it seems to be more picky when matching file (or really directory)
names with different numbers of trailing slashes.
Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Sam Ravnborg <sam@ravnborg.org>
Andrew Benton <b3nton@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Newer Intel processors identifying themselves as model 30 are not recognized by
oprofile.
<cpuinfo snippet>
model : 30
model name : Intel(R) Xeon(R) CPU X3470 @ 2.93GHz
</cpuinfo snippet>
Running oprofile on these machines gives the following:
+ opcontrol --init
+ opcontrol --list-events
oprofile: available events for CPU type "Intel Architectural Perfmon"
See Intel 64 and IA-32 Architectures Software Developer's Manual
Volume 3B (Document 253669) Chapter 18 for architectural perfmon events
This is a limited set of fallback events because oprofile doesn't know your CPU
CPU_CLK_UNHALTED: (counter: all)
Clock cycles when not halted (min count: 6000)
INST_RETIRED: (counter: all)
number of instructions retired (min count: 6000)
LLC_MISSES: (counter: all)
Last level cache demand requests from this core that missed the LLC
(min count: 6000)
Unit masks (default 0x41)
----------
0x41: No unit mask
LLC_REFS: (counter: all)
Last level cache demand requests from this core (min count: 6000)
Unit masks (default 0x4f)
----------
0x4f: No unit mask
BR_MISS_PRED_RETIRED: (counter: all)
number of mispredicted branches retired (precise) (min count: 500)
+ opcontrol --shutdown
Back when the patch was submitted for "Add Xeon 7500 series support to
oprofile", Robert Richter had asked for a followon patch that
converted all the CPU ID values to hex.
I have done that here for the "i386/core_i7" and "i386/atom" class
processors in the ppro_init() function and also added some comments on
where to find documentation on the Intel processors.
Signed-off-by: John L. Villalovos <john.l.villalovos@intel.com> Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There are duplicate macro definitions of in_range() in mballoc.h and
balloc.c. This consolidates these two definitions into ext4.h, and
changes extents.c to use in_range() as well.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger@sun.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
NR_IRQS may be as low as 16, causing a (harmless?) buffer overflow in
pcmcia_setup_isa_irq():
static u8 pcmcia_used_irq[NR_IRQS];
...
if ((try < 32) && pcmcia_used_irq[irq])
continue;
This is read-only, so if this address would be non-zero, it would just
mean we would not attempt an IRQ >= NR_IRQS -- which would fail anyway!
And as request_irq() fails for an irq >= NR_IRQS, the setting code path:
pcmcia_used_irq[irq]++;
is never reached as well.
Reported-by: Christoph Fritz <chf.fritz@googlemail.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix "system goes unresponsive under memory pressure and lots of
dirty/writeback pages" bug.
http://lkml.org/lkml/2010/4/4/86
In the above thread, Andreas Mohr described that
Invoking any command locked up for minutes (note that I'm
talking about attempted additional I/O to the _other_,
_unaffected_ main system HDD - such as loading some shell
binaries -, NOT the external SSD18M!!).
This happens when the two conditions are both meet:
- under memory pressure
- writing heavily to a slow device
OOM also happens in Andreas' system. The OOM trace shows that 3 processes
are stuck in wait_on_page_writeback() in the direct reclaim path. One in
do_fork() and the other two in unix_stream_sendmsg(). They are blocked on
this condition:
(sc->order && priority < DEF_PRIORITY - 2)
which was introduced in commit 78dc583d (vmscan: low order lumpy reclaim
also should use PAGEOUT_IO_SYNC) one year ago. That condition may be too
permissive. In Andreas' case, 512MB/1024 = 512KB. If the direct reclaim
for the order-1 fork() allocation runs into a range of 512KB
hard-to-reclaim LRU pages, it will be stalled.
It's a severe problem in three ways.
Firstly, it can easily happen in daily desktop usage. vmscan priority can
easily go below (DEF_PRIORITY - 2) on _local_ memory pressure. Even if
the system has 50% globally reclaimable pages, it still has good
opportunity to have 0.1% sized hard-to-reclaim ranges. For example, a
simple dd can easily create a big range (up to 20%) of dirty pages in the
LRU lists. And order-1 to order-3 allocations are more than common with
SLUB. Try "grep -v '1 :' /proc/slabinfo" to get the list of high order
slab caches. For example, the order-1 radix_tree_node slab cache may
stall applications at swap-in time; the order-3 inode cache on most
filesystems may stall applications when trying to read some file; the
order-2 proc_inode_cache may stall applications when trying to open a
/proc file.
Secondly, once triggered, it will stall unrelated processes (not doing IO
at all) in the system. This "one slow USB device stalls the whole system"
avalanching effect is very bad.
Thirdly, once stalled, the stall time could be intolerable long for the
users. When there are 20MB queued writeback pages and USB 1.1 is writing
them in 1MB/s, wait_on_page_writeback() will stuck for up to 20 seconds.
Not to mention it may be called multiple times.
So raise the bar to only enable PAGEOUT_IO_SYNC when priority goes below
DEF_PRIORITY/3, or 6.25% LRU size. As the default dirty throttle ratio is
20%, it will hardly be triggered by pure dirty pages. We'd better treat
PAGEOUT_IO_SYNC as some last resort workaround -- its stall time is so
uncomfortably long (easily goes beyond 1s).
The bar is only raised for (order < PAGE_ALLOC_COSTLY_ORDER) allocations,
which are easy to satisfy in 1TB memory boxes. So, although 6.25% of
memory could be an awful lot of pages to scan on a system with 1TB of
memory, it won't really have to busy scan that much.
Andreas tested an older version of this patch and reported that it mostly
fixed his problem. Mel Gorman helped improve it and KOSAKI Motohiro will
fix it further in the next patch.
after updating the value of the ICMP payload, inet_proto_csum_replace4() should
be called with zero pseudohdr.
Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The main motivation of this patch changing strcpy() to strlcpy().
We strcpy() to copy a 48 byte buffers into a 49 byte buffers. So at
best the last byte has leaked information, or maybe there is an
overflow? Anyway, this patch closes the information leaks by zeroing
the memory and the calls to strlcpy() prevent overflows.
Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch adds a limit for nframes as the number of frames in TX_SETUP and
RX_SETUP are derived from a single byte multiplex value by default.
Use-cases that would require to send/filter more than 256 CAN frames should
be implemented in userspace for complexity reasons anyway.
Additionally the assignments of unsigned values from userspace to signed
values in kernelspace and vice versa are fixed by using unsigned values in
kernelspace consistently.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Reported-by: Ben Hawkes <hawkes@google.com> Acked-by: Urs Thuermann <urs.thuermann@volkswagen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
>Xin Xiaohui wrote:
> I looked into the code dev_gro_receive(), found the code here:
> if the frags[0] is pulled to 0, then the page will be released,
> and memmove() frags left.
> Is that right? I'm not sure if memmove do right or not, but
> frags[0].size is never set after memove at least. what I think
> a simple way is not to do anything if we found frags[0].size == 0.
> The patch is as followed.
...
This version of the patch fixes the bug directly in memmove.
Reported-by: "Xin, Xiaohui" <xiaohui.xin@intel.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
SunBlade-2500 has 'parallel' device node with compatible
property "pnpALI,1533,3" so add that to the ID table.
Reported-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
These just represent the secondary and further heads attached to the
card, and they have different sets of PCI bar registers to map.
So don't try to drive them in the main driver.
Reported-by: Frans van Berckel <fberckel@xs4all.nl> Tested-by: Frans van Berckel <fberckel@xs4all.nl> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch fixes alignment of slab objects in case CONFIG_DEBUG_PAGEALLOC is
active.
Before this spot in kmem_cache_create, we have this situation:
- align contains the required alignment of the object
- cachep->obj_offset is 0 or equals align in case of CONFIG_DEBUG_SLAB
- size equals the size of the object, or object plus trailing redzone in case
of CONFIG_DEBUG_SLAB
This spot tries to fill one page per object if the object is in certain size
limits, however setting obj_offset to PAGE_SIZE - size does break the object
alignment since size may not be aligned with the required alignment.
This patch simply adds an ALIGN(size, align) to the equation and fixes the
object size detection accordingly.
This code in drivers/s390/cio/qdio_setup_init has lead to incorrectly aligned
slab objects (sizeof(struct qdio_q) equals 1792):
qdio_q_cache = kmem_cache_create("qdio_q", sizeof(struct qdio_q),
256, 0, NULL);
Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The hibernate issues that got fixed in commit 985b823b9192 ("drm/i915:
fix hibernation since i915 self-reclaim fixes") turn out to have been
incomplete. Vefa Bicakci tested lots of hibernate cycles, and without
the __GFP_RECLAIMABLE flag the system eventually fails to resume.
With the flag added, Vefa can apparently hibernate forever (or until he
gets bored running his automated scripts, whichever comes first).
The reclaimable flag was there originally, and was one of the flags that
were dropped (unintentionally) by commit 4bdadb978569 ("drm/i915:
Selectively enable self-reclaim") that introduced all these problems,
but I didn't want to just blindly add back all the flags in commit 985b823b9192, and it looked like __GFP_RECLAIM wasn't necessary. It
clearly was.
I still suspect that there is some subtle reason we're missing that
causes the problems, but __GFP_RECLAIMABLE is certainly not wrong to use
in this context, and is what the code historically used. And we have no
idea what the causes the corruption without it.
Reported-and-tested-by: M. Vefa Bicakci <bicave@superonline.com> Cc: Dave Airlie <airlied@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since commit 4bdadb9785696439c6e2b3efe34aa76df1149c83 ("drm/i915:
Selectively enable self-reclaim"), we've been passing GFP_MOVABLE to the
i915 page allocator where we weren't before due to some over-eager
removal of the page mapping gfp_flags games the code used to play.
This caused hibernate on Intel hardware to result in a lot of memory
corruptions on resume. See for example
http://bugzilla.kernel.org/show_bug.cgi?id=13811
Reported-by: Evengi Golov (in bugzilla) Signed-off-by: Dave Airlie <airlied@redhat.com> Tested-by: M. Vefa Bicakci <bicave@superonline.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Like the mlock() change previously, this makes the stack guard check
code use vma->vm_prev to see what the mapping below the current stack
is, rather than have to look it up with find_vma().
Also, accept an abutting stack segment, since that happens naturally if
you split the stack with mlock or mprotect.
It's a really simple list, and several of the users want to go backwards
in it to find the previous vma. So rather than have to look up the
previous entry with 'find_vma_prev()' or something similar, just make it
doubly linked instead.
This patch changes dm_hash_remove_all() to release _hash_lock when
removing a device. After removing the device, dm_hash_remove_all()
takes _hash_lock and searches the hash from scratch again.
This patch is a preparation for the next patch, which changes device
deletion code to wait for md reference to be 0. Without this patch,
the wait in the next patch may cause AB-BA deadlock:
CPU0 CPU1
-----------------------------------------------------------------------
dm_hash_remove_all()
down_write(_hash_lock)
table_status()
md = find_device()
dm_get(md)
<increment md->holders>
dm_get_live_or_inactive_table()
dm_get_inactive_table()
down_write(_hash_lock)
<in the md deletion code>
<wait for md->holders to be 0>
Test on a PXA310 platform with Samsung K9F2G08X0B NAND flash,
with tCH=5 and clk is 156MHz, ns2cycle(5, 156000000) returns -1.
ns2cycle returns negtive value will break NDTR0_tXX macros.
After checking the commit log, I found the problem is introduced by
commit 5b0d4d7c8a67c5ba3d35e6ceb0c5530cc6846db7
"[MTD] [NAND] pxa3xx: convert from ns to clock ticks more accurately"
To get num of clock cycles, we use below equation:
num of clock cycles = time (ns) / one clock cycle (ns) + 1
We need to add 1 cycle here because integer division will truncate the result.
It is possible the developers set the Min values in SPEC for timing settings.
Thus the truncate may cause problem, and it is safe to add an extra cycle here.
The various fields in NDTR{01} are in units of clock ticks minus one,
thus we should subtract 1 cycle then.
Thus the correct equation should be:
num of clock cycles = time (ns) / one clock cycle (ns) + 1 - 1
= time (ns) / one clock cycle (ns)
Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Lei Wen <leiwen@marvell.com> Acked-by: Eric Miao <eric.y.miao@gmail.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Atheros PCIe wireless cards handled by ath5k do require L0s disabled.
For distributions shipping with CONFIG_PCIEASPM (this will be enabled
by default in the future in 2.6.36) this will also mean both L1 and L0s
will be disabled when a pre 1.1 PCIe device is detected. We do know L1
works correctly even for all ath5k pre 1.1 PCIe devices though but cannot
currently undue the effect of a blacklist, for details you can read
pcie_aspm_sanity_check() and see how it adjusts the device link
capability.
It may be possible in the future to implement some PCI API to allow
drivers to override blacklists for pre 1.1 PCIe but for now it is
best to accept that both L0s and L1 will be disabled completely for
distributions shipping with CONFIG_PCIEASPM rather than having this
issue present. Motivation for adding this new API will be to help
with power consumption for some of these devices.
Example of issues you'd see:
- On the Acer Aspire One (AOA150, Atheros Communications Inc. AR5001
Wireless Network Adapter [168c:001c] (rev 01)) doesn't work well
with ASPM enabled, the card will eventually stall on heavy traffic
with often 'unsupported jumbo' warnings appearing. Disabling
ASPM L0s in ath5k fixes these problems.
- On the same card you would see a storm of RXORN interrupts
even though medium is idle.
Credit for root causing and fixing the bug goes to Jussi Kivilinna.
Cc: David Quan <David.Quan@atheros.com> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Tim Gardner <tim.gardner@canonical.com> Cc: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It's not OK to call platform_device_add_resources() multiple times
in a row. Despite its name, this functions sets the resources, it
doesn't add them. So we have to prepare an array with all the
resources, and then call platform_device_add_resources() once.
Before this fix, only the last I/O resource would be actually
registered. The other I/O resources were leaked.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thanks a lot for all the review and comments so far;) I'd like to send
the improved (V4) version of this patch.
This patch fixes a deadlock in OCFS2 ACL. We found this bug in OCFS2
and Samba integration using scenario, the symptom is several smbd
processes will be hung under heavy workload. Finally we found out it
is the nested PR lock calling that leads to this deadlock:
node1 node2
gr PR
|
V
PR(EX)---> BAST:OCFS2_LOCK_BLOCKED
|
V
rq PR
|
V
wait=1
After requesting the 2nd PR lock, the process "smbd" went into D
state. It can only be woken up when the 1st PR lock's RO holder equals
zero. There should be an ocfs2_inode_unlock in the calling path later
on, which can decrement the RO holder. But since it has been in
uninterruptible sleep, the unlock function has no chance to be called.
When testing cpu hotplug code on 32-bit we kept hitting the "CPU%d:
Stuck ??" message due to multiple cores concurrently accessing the
cpu_callin_mask, among others.
Since these codepaths are not protected from concurrent access due to
the fact that there's no sane reason for making an already complex
code unnecessarily more complex - we hit the issue only when insanely
switching cores off- and online - serialize hotplugging cores on the
sysfs level and be done with it.
[ v2.1: fix !HOTPLUG_CPU build ]
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20100819181029.GC17171@aftab> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When we need to take both dlm_domain_lock and dlm->spinlock, we should take
them in order of: dlm_domain_lock then dlm->spinlock.
There is pathes disobey this order. That is calling dlm_lockres_put() with
dlm->spinlock held in dlm_run_purge_list. dlm_lockres_put() calls dlm_put() at
the ref and dlm_put() locks on dlm_domain_lock.
Fix:
Don't grab/put the dlm when the initialising/releasing lockres.
That grab is not required because we don't call dlm_unregister_domain()
based on refcount.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In the following situation, there remains an incorrect bit in refmap on the
recovery master. Finally the recovery master will fail at purging the lockres
due to the incorrect bit in refmap.
1) node A has no interest on lockres A any longer, so it is purging it.
2) the owner of lockres A is node B, so node A is sending de-ref message
to node B.
3) at this time, node B crashed. node C becomes the recovery master. it recovers
lockres A(because the master is the dead node B).
4) node A migrated lockres A to node C with a refbit there.
5) node A failed to send de-ref message to node B because it crashed. The failure
is ignored. no other action is done for lockres A any more.
For mormal, re-send the deref message to it to recovery master can fix it. Well,
ignoring the failure of deref to the original master and not recovering the lockres
to recovery master has the same effect. And the later is simpler.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Acked-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The refcount record calculation in ocfs2_calc_refcount_meta_credits
is too optimistic that we can always allocate contiguous clusters
and handle an already existed refcount rec as a whole. Actually
because of file system fragmentation, we may have the chance to split
a refcount record into 3 parts during the transaction. So consider
the worst case in record calculation.
Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch fixes two problems in dlm_run_purgelist
1. If a lockres is found to be in use, dlm_run_purgelist keeps trying to purge
the same lockres instead of trying the next lockres.
2. When a lockres is found unused, dlm_run_purgelist releases lockres spinlock
before setting DLM_LOCK_RES_DROPPING_REF and calls dlm_purge_lockres.
spinlock is reacquired but in this window lockres can get reused. This leads
to BUG.
This patch modifies dlm_run_purgelist to skip lockres if it's in use and purge
next lockres. It also sets DLM_LOCK_RES_DROPPING_REF before releasing the
lockres spinlock protecting it from getting reused.
When we have to take both dlm->master_lock and lockres->spinlock,
take them in order
lockres->spinlock and then dlm->master_lock.
The patch fixes a violation of the rule.
We can simply move taking dlm->master_lock to where we have dropped res->spinlock
since when we access res->state and free mle memory we don't need master_lock's
protection.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Setting the acl while creating a new inode depends on
the error codes of posix_acl_create_masq. This patch fix
a issue of overwriting the error codes of it.
Reported-by: Pawel Zawora <pzawora@gmail.com> Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I discovered tonight that ALSA no longer sets up a stream for the second ADC
provided by the Realtek ALC260 HDA codec. At some point alc_build_pcms()
started using stream_analog_alt_capture when constructing the second ADC
stream, but patch_alc260() was never updated accordingly. I have no idea
when this regression occurred. The trivial patch to patch_alc260() given
below fixes the problem as far as I can tell. The patch is against 2.6.35.
With some hardware combinations, the PCM interrupts are acknowledged
before the period boundary from the emu10k1 chip. The midlevel PCM code
gets confused and the playback stream is interrupted.
It seems that the interrupt processing shift by 2 samples is enough
to fix this issue. This default value does not harm other,
non-affected hardware.
The detection and loading of firmeware on riptide driver has been broken
due to rewrite of some codes, checking the presense wrongly.
This patch fixes the logic again.
mspro_block_remove() is called from detect thread that first calls the
mspro_block_stop(), which stops the request queue. If we call
del_gendisk() with the queue stopped we get a deadlock.
Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Alex Dubov <oakad@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This commit makes the stack guard page somewhat less visible to user
space. It does this by:
- not showing the guard page in /proc/<pid>/maps
It looks like lvm-tools will actually read /proc/self/maps to figure
out where all its mappings are, and effectively do a specialized
"mlockall()" in user space. By not showing the guard page as part of
the mapping (by just adding PAGE_SIZE to the start for grows-up
pages), lvm-tools ends up not being aware of it.
- by also teaching the _real_ mlock() functionality not to try to lock
the guard page.
That would just expand the mapping down to create a new guard page,
so there really is no point in trying to lock it in place.
It would perhaps be nice to show the guard page specially in
/proc/<pid>/maps (or at least mark grow-down segments some way), but
let's not open ourselves up to more breakage by user space from programs
that depends on the exact deails of the 'maps' file.
Special thanks to Henrique de Moraes Holschuh for diving into lvm-tools
source code to see what was going on with the whole new warning.
Reported-and-tested-by: François Valenduc <francois.valenduc@tvcablenet.be Reported-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We do in fact need to unmap the page table _before_ doing the whole
stack guard page logic, because if it is needed (mainly 32-bit x86 with
PAE and CONFIG_HIGHPTE, but other architectures may use it too) then it
will do a kmap_atomic/kunmap_atomic.
And those kmaps will create an atomic region that we cannot do
allocations in. However, the whole stack expand code will need to do
anon_vma_prepare() and vma_lock_anon_vma() and they cannot do that in an
atomic region.
Now, a better model might actually be to do the anon_vma_prepare() when
_creating_ a VM_GROWSDOWN segment, and not have to worry about any of
this at page fault time. But in the meantime, this is the
straightforward fix for the issue.
See https://bugzilla.kernel.org/show_bug.cgi?id=16588 for details.
Reported-by: Wylda <wylda@volny.cz> Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Reported-by: Mike Pagano <mpagano@gentoo.org> Reported-by: François Valenduc <francois.valenduc@tvcablenet.be> Tested-by: Ed Tomlinson <edt@aei.ca> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It's wrong for several reasons, but the most direct one is that the
fault may be for the stack accesses to set up a previous SIGBUS. When
we have a kernel exception, the kernel exception handler does all the
fixups, not some user-level signal handler.
Even apart from the nested SIGBUS issue, it's also wrong to give out
kernel fault addresses in the signal handler info block, or to send a
SIGBUS when a system call already returns EFAULT.
.. which didn't show up in my tests because it's a no-op on x86-64 and
most other architectures. But we enter the function with the last-level
page table mapped, and should unmap it at exit.
This is a rather minimally invasive patch to solve the problem of the
user stack growing into a memory mapped area below it. Whenever we fill
the first page of the stack segment, expand the segment down by one
page.
Now, admittedly some odd application might _want_ the stack to grow down
into the preceding memory mapping, and so we may at some point need to
make this a process tunable (some people might also want to have more
than a single page of guarding), but let's try the minimal approach
first.
Tested with trivial application that maps a single page just below the
stack, and then starts recursing. Without this, we will get a SIGSEGV
_after_ the stack has smashed the mapping. With this patch, we'll get a
nice SIGBUS just as the stack touches the page just above the mapping.
Since 2.6.31, swap_map[]'s refcounting was changed to show that a used
swap entry is just for swap-cache, can be reused. Then, while scanning
free entry in swap_map[], a swap entry may be able to be reclaimed and
reused. It was caused by commit c9e444103b5e7a5 ("mm: reuse unused swap
entry if necessary").
But this caused deta corruption at resume. The scenario is
- Assume a clean-swap cache, but mapped.
- at hibernation_snapshot[], clean-swap-cache is saved as
clean-swap-cache and swap_map[] is marked as SWAP_HAS_CACHE.
- then, save_image() is called. And reuse SWAP_HAS_CACHE entry to save
image, and break the contents.
After resume:
- the memory reclaim runs and finds clean-not-referenced-swap-cache and
discards it because it's marked as clean. But here, the contents on
disk and swap-cache is inconsistent.
Hance memory is corrupted.
This patch avoids the bug by not reclaiming swap-entry during hibernation.
This is a quick fix for backporting.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Reported-by: Ondreg Zary <linux@rainbow-software.org> Tested-by: Ondreg Zary <linux@rainbow-software.org> Tested-by: Andrea Gelmini <andrea.gelmini@gmail.com> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When a raid1 array is configured to support write-behind
on some devices, it normally only reads from other devices.
If all devices are write-behind (because the rest have failed)
it is possible for a read request to be serviced before a
behind-write request, which would appear as data corruption.
So when forced to read from a WriteMostly device, wait for any
write-behind to complete, and don't start any more behind-writes.
If a command times out resulting in EH getting invoked, we wait for the
aborted commands to come back after sending the abort. Shorten
the amount of time we wait for these responses, to ensure we don't
get stuck in EH for several minutes.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commands which are completed by the VIOS are placed on a CRQ
in kernel memory for the ibmvfc driver to process. Each CRQ
entry is 16 bytes. The ibmvfc driver reads the first 8 bytes
to check if the entry is valid, then reads the next 8 bytes to get
the handle, which is a pointer the completed command. This fixes
an issue seen on Power 7 where the processor reordered the
loads from memory, resulting in processing command completion
with a stale handle. This could result in command timeouts,
and also early completion of commands.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When removing several devices aic79xx will occasionally Oops
in ahd_handle_nonpkt_busfree during rescan. Looking at the
code I found that we're indeed not checking if the scb in
question is NULL. So check for it before accessing it.
Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>