Increased storvsc ringbuffer and max_io_requests. This now more
closely mimics the numbers on Hyper-V. And will allow more IO requests
to take place for the SCSI driver.
Max_IO is set to double from what it was before, Hyper-V allows it and
we have had appliance builder requests to see if it was a problem to
increase the number.
Ringbuffer size for storvsc is now increased because I have seen A few buffer
problems on extremely busy systems. They were Set pretty low before.
And since max_io_requests is increased I Really needed to increase the buffer
as well.
Fixed the value of the 64bit-hole inside ring buffer, this
caused a problem on Hyper-V when running checked Windows builds.
Checked builds of Windows are used internally and given to external
system integrators at times. They are builds that for example that all
elements in a structure follow the definition of that Structure. The bug
this fixed was for a field that we did not fill in at all (Because we do
Not use it on the Linux side), and the checked build of windows gives
errors on it internally to the Windows logs.
Fixed bounce offset kmap problem by using correct index.
The symptom of the problem is that in some NAS appliances this problem
represents Itself by a unresponsive VM under a load with many clients writing
small files.
Fix missing functions for net_device_ops.
It's a bug when porting the drivers from 2.6.27 to 2.6.32. In 2.6.27,
the default functions for Ethernet, like eth_change_mtu(), were assigned
by ether_setup(). But in 2.6.32, these function pointers moved to
net_device_ops structure and no longer be assigned in ether_setup(). So
we need to set these functions in our driver code. It will ensure the
MTU won't be set beyond 1500. Otherwise, this can cause an error on the
server side, because the HyperV linux driver doesn't support jumbo frame
yet.
Since ALC259/269 use the same parser of ALC268, the pin 0x1b was ignored
as an invalid widget. Just add this NID to handle properly.
This will add the missing mixer controls for some devices.
While the original problem only caused a slight disturbance on the
edge of the touchpad, the commit above to "fix" it completely breaks
operation on some other models such as mine.
We'll sort this out separately, revert the patch for now.
If we write into an unwritten extent using AIO we need to complete the AIO
request after the extent conversion has finished. Without that a read could
race to see see the extent still unwritten and return zeros. For synchronous
I/O we already take care of that by flushing the xfsconvertd workqueue (which
might be a bit of overkill).
To do that add iocb and result fields to struct xfs_ioend, so that we can
call aio_complete from xfs_end_io after the extent conversion has happened.
Note that we need a new result field as io_error is used for positive errno
values, while the AIO code can return negative error values and positive
transfer sizes.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch is to be applied upon Christoph's "direct-io: move aio_complete
into ->end_io" patch. It adds iocb and result fields to struct ext4_io_end_t,
so that we can call aio_complete from ext4_end_io_nolock() after the extent
conversion has finished.
I have verified with Christoph's aio-dio test that used to fail after a few
runs on an original kernel but now succeeds on the patched kernel.
See http://thread.gmane.org/gmane.comp.file-systems.ext4/19659 for details.
Filesystems with unwritten extent support must not complete an AIO request
until the transaction to convert the extent has been commited. That means
the aio_complete calls needs to be moved into the ->end_io callback so
that the filesystem can control when to call it exactly.
This makes a bit of a mess out of dio_complete and the ->end_io callback
prototype even more complicated.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Alex Elder <aelder@sgi.com> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 2ca1af9aa3285c6a5f103ed31ad09f7399fc65d7 "PCI: MSI: Remove
unsafe and unnecessary hardware access" changed read_msi_msg_desc() to
return the last MSI message written instead of reading it from the
device, since it may be called while the device is in a reduced
power state.
However, the pSeries platform code really does need to read messages
from the device, since they are initially written by firmware.
Therefore:
- Restore the previous behaviour of read_msi_msg_desc()
- Add new functions get_cached_msi_msg{,_desc}() which return the
last MSI message written
- Use the new functions where appropriate
Acked-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
During suspend on an SMP system, {read,write}_msi_msg_desc() may be
called to mask and unmask interrupts on a device that is already in a
reduced power state. At this point memory-mapped registers including
MSI-X tables are not accessible, and config space may not be fully
functional either.
While a device is in a reduced power state its interrupts are
effectively masked and its MSI(-X) state will be restored when it is
brought back to D0. Therefore these functions can simply read and
write msi_desc::msg for devices not in D0.
Further, read_msi_msg_desc() should only ever be used to update a
previously written message, so it can always read msi_desc::msg
and never needs to touch the hardware.
commit f3c5c1bfd430858d3a05436f82c51e53104feb6b
(netfilter: xtables: make ip_tables reentrant) forgot to
also compute the jumpstack size in the compat handlers.
Result is that "iptables -I INPUT -j userchain" turns into -j DROP.
Reported by Sebastian Roesner on #netfilter, closes
http://bugzilla.netfilter.org/show_bug.cgi?id=669.
Note: arptables change is compile-tested only.
Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
TSC's get reset after suspend/resume (even on cpu's with invariant TSC
which runs at a constant rate across ACPI P-, C- and T-states). And in
some systems BIOS seem to reinit TSC to arbitrary large value (still
sync'd across cpu's) during resume.
This leads to a scenario of scheduler rq->clock (sched_clock_cpu()) less
than rq->age_stamp (introduced in 2.6.32). This leads to a big value
returned by scale_rt_power() and the resulting big group power set by the
update_group_power() is causing improper load balancing between busy and
idle cpu's after suspend/resume.
This resulted in multi-threaded workloads (like kernel-compilation) go
slower after suspend/resume cycle on core i5 laptops.
Fix this by recomputing cyc2ns_offset's during resume, so that
sched_clock() continues from the point where it was left off during
suspend.
Writeback is not terminating in background writeback if ->writepage is
returning with wbc->nr_to_write == 0, resulting in sub-optimal single page
writeback on XFS.
Fix the write_cache_pages loop to terminate correctly when this situation
occurs and so prevent this sub-optimal background writeback pattern. This
improves sustained sequential buffered write performance from around
250MB/s to 750MB/s for a 100GB file on an XFS filesystem on my 8p test VM.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Wu Fengguang <fengguang.wu@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit d62f5576 (pata_cmd64x: fix handling of address setup timings)
incorrectly called ata_timing_compute() on UDMA mode on 0 @UT leading
to devide by zero fault. Revert it until better fix is available.
This is reported in bko#16607 by Milan Kocian who also root caused it.
https://bugzilla.kernel.org/show_bug.cgi?id=16607
Signed-off-by: Tejun Heo <tj@kernel.org> Reported-and-root-caused-by: Milan Kocian <milan.kocian@wq.cz> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix DSM/TRIM commands in sata_mv (v2).
These need to be issued using old-school "BM DMA",
rather than via the EDMA host queue.
Since the chips don't have proper BM DMA status,
we need to be more careful with setting the ATA_DMA_INTR bit,
since DSM/TRIM often has a long delay between "DMA complete"
and "command complete".
GEN_I chips don't have BM DMA, so no TRIM for them.
Signed-off-by: Mark Lord <mlord@pobox.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove harmful BUG_ON() from ata_bmdma_qc_issue(),
as it casts too wide of a net and breaks sata_mv.
It also crashes the kernel while doing the BUG_ON().
There's already a WARN_ON_ONCE() further down to catch
the case of POLLING for a BMDMA operation.
Signed-off-by: Mark Lord <mlord@pobox.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The non-standard name "iMic" makes PulseAudio ignore the microphone. BugLink: https://launchpad.net/bugs/605101 Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Two users report model=auto is needed to make the internal mic work properly. BugLink: https://bugs.launchpad.net/bugs/495134 Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Under heavy load parallel metadata loads (e.g. dbench), we can fail
to mark all the inodes in a cluster being freed as XFS_ISTALE as we
skip inodes we cannot get the XFS_ILOCK_EXCL or the flush lock on.
When this happens and the inode cluster buffer has already been
marked stale and freed, inode reclaim can try to write the inode out
as it is dirty and not marked stale. This can result in writing th
metadata to an freed extent, or in the case it has already
been overwritten trigger a magic number check failure and return an
EUCLEAN error such as:
Filesystem "ram0": inode 0x442ba1 background reclaim flush failed with 117
Fix this by ensuring that we hoover up all in memory inodes in the
cluster and mark them XFS_ISTALE when freeing the cluster.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit 7124fe0a5b619d65b739477b3b55a20bf805b06d ("xfs: validate untrusted inode
numbers during lookup") changes the inode lookup code to do btree lookups for
untrusted inode numbers. This change made an invalid assumption about the
alignment of inodes and hence incorrectly calculated the first inode in the
cluster. As a result, some inode numbers were being incorrectly considered
invalid when they were actually valid.
The issue was not picked up by the xfstests suite because it always runs fsr
and dump (the two utilities that utilise the bulkstat interface) on cache hot
inodes and hence the lookup code in the cold cache path was not sufficiently
exercised to uncover this intermittent problem.
Fix the issue by relaxing the btree lookup criteria and then checking if the
record returned contains the inode number we are lookup for. If it we get an
incorrect record, then the inode number is invalid.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
IPIs and VIRQs are inherently per-cpu event types, so treat them as such:
- use a specific percpu irq_chip implementation, and
- handle them with handle_percpu_irq
This makes the path for delivering these interrupts more efficient
(no masking/unmasking, no locks), and it avoid problems with attempts
to migrate them.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Xen events are logically edge triggered, as Xen only calls the event
upcall when an event is newly set, but not continuously as it remains set.
As a result, use handle_edge_irq rather than handle_level_irq.
This has the important side-effect of fixing a long-standing bug of
events getting lost if:
- an event's interrupt handler is running
- the event is migrated to a different vcpu
- the event is re-triggered
The most noticable symptom of these lost events is occasional lockups
of blkfront.
Many thanks to Tom Kopec and Daniel Stodden in tracking this down.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Tom Kopec <tek@acm.org> Cc: Daniel Stodden <daniel.stodden@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit 8bf0223ed515be24de0c671eedaff49e78bebc9c (hwmon, k8temp: Fix
temperature reporting for ASB1 processor revisions) fixed temperature
reporting for ASB1 CPUs. But those CPU models (model 0x6b, 0x6f, 0x7f)
were packaged both as AM2 (desktop) and ASB1 (mobile). Thus the commit
leads to wrong temperature reporting for AM2 CPU parts.
The solution is to determine the package type for models 0x6b, 0x6f,
0x7f.
This is done using BrandId from CPUID Fn8000_0001_EBX[15:0]. See
"Constructing the processor Name String" in "Revision Guide for AMD
NPT Family 0Fh Processors" (Rev. 3.46).
Cc: Rudolf Marek <r.marek@assembler.cz> Reported-by: Vladislav Guberinic <neosisani@gmail.com> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Unfortunately, the current timer tracing is not very useful: the
actual timer function is not recorded in the trace at the start
of timer execution.
Although this is recorded for timer "start" time (when it gets
armed), this is not useful; most timers get started early, and a
tracer like PowerTOP will never see this event, but will only
see the actual running of the timer.
This patch just adds the function to the timer tracing; I've
verified with PowerTOP that now it can get useful information
about timers.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: xiaoguangrong@cn.fujitsu.com Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4C6C5FA9.3000405@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is a scalability issue for current implementation of optimistic
mutex spin in the kernel. It is found on a 8 node 64 core Nehalem-EX
system (HT mode).
The intention of the optimistic mutex spin is to busy wait and spin on a
mutex if the owner of the mutex is running, in the hope that the mutex
will be released soon and be acquired, without the thread trying to
acquire mutex going to sleep. However, when we have a large number of
threads, contending for the mutex, we could have the mutex grabbed by
other thread, and then another ……, and we will keep spinning, wasting cpu
cycles and adding to the contention. One possible fix is to quit
spinning and put the current thread on wait-list if mutex lock switch to
a new owner while we spin, indicating heavy contention (see the patch
included).
I did some testing on a 8 socket Nehalem-EX system with a total of 64
cores. Using Ingo's test-mutex program that creates/delete files with 256
threads (http://lkml.org/lkml/2006/1/8/50) , I see the following speed up
after putting in the mutex spin fix:
./mutex-test V 256 10
Ops/sec
2.6.34 62864
With fix 197200
Repeating the test with Aim7 fserver workload, again there is a speed up
with the fix:
Jobs/min
2.6.34 91657
With fix 149325
To look at the impact on the distribution of mutex acquisition time, I
collected the mutex acquisition time on Aim7 fserver workload with some
instrumentation. The average acquisition time is reduced by 48% and
number of contentions reduced by 32%.
#contentions Time to acquire mutex (cycles)
2.6.34 72973 44765791
With fix 49210 23067129
The histogram of mutex acquisition time is listed below. The acquisition
time is in 2^bin cycles. We see that without the fix, the acquisition
time is mostly around 2^26 cycles. With the fix, we the distribution get
spread out a lot more towards the lower cycles, starting from 2^13.
However, there is an increase of the tail distribution with the fix at
2^28 and 2^29 cycles. It seems a small price to pay for the reduced
average acquisition time and also getting the cpu to do useful work.
I've done similar experiments with 2.6.35 kernel on smaller boxes as
well. One is on a dual-socket Westmere box (12 cores total, with HT).
Another experiment is on an old dual-socket Core 2 box (4 cores total, no
HT)
On the 12-core Westmere box, I see a 250% increase for Ingo's mutex-test
program with my mutex patch but no significant difference in aim7's
fserver workload.
On the 4-core Core 2 box, I see the difference with the patch for both
mutex-test and aim7 fserver are negligible.
So far, it seems like the patch has not caused regression on smaller
systems.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1282168827.9542.72.camel@schen9-DESK> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add ftdi product ID for Lenz LI-USB, a model train interface. This
was NOT tested against 2.6.35, but a similar patch was tested with the
CentOS 2.6.18-194.11.1.el5 kernel. It wasn't clear to me what
ordering is being used in ftdi_sio.c, so I inserted the ID after another
model train entry(SPROG_II).
The code to increment the TRB pointer has a slight ambiguity that could
lead to a bug on different compilers. The ANSI C specification does not
specify the precedence of the assignment operator over the postfix
operator. gcc 4.4 produced the correct code (increment the pointer and
assign the value), but a MIPS compiler that one of John's clients used
assigned the old (unincremented) value.
Remove the unnecessary assignment to make all compilers produce the
correct assembly.
Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If we can't read the firmware for a device from the disk, and yet the
device already has a valid firmware image in it, we don't want to
replace the firmware with something invalid. So check the version
number to be less than the current one to verify this is the correct
thing to do.
Reported-by: Chris Beauchamp <chris@chillibean.tv> Tested-by: Chris Beauchamp <chris@chillibean.tv> Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add support for the Zeagle N2iTiON3 dive computer interface. Since
Zeagle devices are actually manufactured by Seiko, this patch will
support other Seiko based models as well.
>Try the navman driver instead. You can either add the device id to the
> driver and rebuild it, or do this before you plug the device in:
> modprobe navman
> echo -n "0x0df7 0x0900" > /sys/bus/usb-serial/drivers/navman/new_id
>
> and then plug your device in and see if that works.
I can confirm that the navman driver works with the right device IDs on
my i-gotU GT-600, which has the same device IDs. Attached is a patch
adding the IDs.
From: Ross Burton <ross@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Each net_device in a system will automatically managed as a possible
batman_if and holds different informations like a buffer with a prepared
originator messages. To reduce the memory usage, the packet_buff will
only be allocated when the interface is really added/enabled for
batman-adv.
The function to update the hw address information inside the packet_buff
just assumes that the packet_buff is always initialised and thus the
kernel will just oops when we try to change the hw address of a not
already fully enabled interface.
We must always check if the packet_buff is allocated before we try to
change information inside of it.
Reported-by: Tim Glaremin <Tim.Glaremin@web.de> Reported-by: Kazuki Shimada <zukky@bb.banban.jp> Signed-off-by: Sven Eckelmann <sven.eckelmann@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
dev_put allows a device to be freed when all its references are dropped.
After that we are not allowed to access that information anymore. Access
to the data structure of a net_device must be surrounded a dev_hold
and ended using dev_put.
batman-adv adds a device to its own management structure in
hardif_add_interface and will release it in hardif_remove_interface.
Thus it must hold a reference all the time between those functions to
prevent any access to the already released net_device structure.
Reported-by: Tim Glaremin <Tim.Glaremin@web.de> Signed-off-by: Sven Eckelmann <sven.eckelmann@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We try to get all events for all net_devices to be able to add special
sysfs folders for the batman-adv configuration. This also includes such
events like NETDEV_POST_INIT which has no valid kobject according to v2.6.32-rc3-13-g7ffbe3f. This would create an oops in that situation.
It is enough to create the batman_if only on NETDEV_REGISTER events
because we will also receive those events for devices which already
existed when we registered the notifier call.
Signed-off-by: Sven Eckelmann <sven.eckelmann@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The orig_hash_lock spinlock always has to be locked with IRQs being
disabled to avoid deadlocks between code that is being executed in
IRQ context and code that is being executed in non-IRQ context.
Reported-by: Sven Eckelmann <sven.eckelmann@gmx.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Sven Eckelmann <sven.eckelmann@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Userspace controls the amount of memory to be allocate, so it can
get the ioctl to allocate more memory than the kernel uses, and get
access to kernel stack. This can only be done for processes authenticated
to the X server for DRI access, and if the user has DRI access.
Fix is to just memset the data to 0 if the user doesn't copy into
it in the first place.
The pins for ddc and aux are shared so you need to switch the
mode when doing ddc. The ProcessAuxChannel table already sets
the pin mode to DP. This should fix unreliable ddc issues
on DP ports using non-DP monitors.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
net/compat/wext: send different messages to compat tasks
we had a race condition when setting and then
restoring frag_list. Eric attempted to fix it,
but the fix created even worse problems.
However, the original motivation I had when I
added the code that turned out to be racy is
no longer clear to me, since we only copy up
to skb->len to userspace, which doesn't include
the frag_list length. As a result, not doing
any frag_list clearing and restoring avoids
the race condition, while not introducing any
other problems.
Additionally, while preparing this patch I found
that since none of the remaining netlink code is
really aware of the frag_list, we need to use the
original skb's information for packet information
and credentials. This fixes, for example, the
group information received by compat tasks.
Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
BugLink: https://bugs.launchpad.net/bugs/619439
This ThinkPad model needs External Amplifier muted for audible playback,
so set the inv_eapd quirk for it.
Reported-and-tested-by: Dennis Bell <dennis.bell@parkerg.co.uk> Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It doesn't like pattern and explicit rules to be on the same line,
and it seems to be more picky when matching file (or really directory)
names with different numbers of trailing slashes.
Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Sam Ravnborg <sam@ravnborg.org>
Andrew Benton <b3nton@gmail.com> Signed-off-by: Michal Marek <mmarek@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
broke 3945 under some unknown circumstances, as
reported by Alex.
Since I want to keep the direct application of
filter flags on iwlagn, duplicate the code into
both 3945 and agn and remove committing the
RXON that broke things from the 3945 version.
Reported-by: Alex Romosan <romosan@sycorax.lbl.gov> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The alternate MAC address feature is only supported by 80003ES2LAN and
82571 LOMs as well as a couple 82571 mezzanine cards. Checking for an
alternate MAC address on other parts can fail leading to the driver not
able to load. This patch limits the check for an alternate MAC address
to be done only for parts that support the feature.
This issue has been around since support for the feature was introduced
to the e1000e driver in 2.6.34.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Reported-by: Fabio Varesano <fax8@users.sourceforge.net> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On the e1000-devel mailing list, Nils Faerber reported latency issues with
the 82573 LOM on a ThinkPad X60. It was found to be caused by ASPM L1;
disabling it resolves the latency. The issue is present in kernels back
to 2.6.34 and possibly 2.6.33.
Reported-by: Nils Faerber <nils.faerber@kernelconcepts.de> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch separates the device deletion code from dm_put()
to make sure the deletion happens in the process context.
By this patch, device deletion always occurs in an ioctl (process)
context and dm_put() can be called in interrupt context.
As a result, the request-based dm's bad dm_put() usage pointed out
by Mikulas below disappears.
http://marc.info/?l=dm-devel&m=126699981019735&w=2
Without this patch, I confirmed there is a case to crash the system:
dm_put() => dm_table_destroy() => vfree() => BUG_ON(in_interrupt())
Some more backgrounds and details:
In request-based dm, a device opener can remove a mapped_device
while the last request is still completing, because bios in the last
request complete first and then the device opener can close and remove
the mapped_device before the last request completes:
CPU0 CPU1
=================================================================
<<INTERRUPT>>
blk_end_request_all(clone_rq)
blk_update_request(clone_rq)
bio_endio(clone_bio) == end_clone_bio
blk_update_request(orig_rq)
bio_endio(orig_bio)
<<I/O completed>>
dm_blk_close()
dev_remove()
dm_put(md)
<<Free md>>
blk_finish_request(clone_rq)
....
dm_end_request(clone_rq)
free_rq_clone(clone_rq)
blk_end_request_all(orig_rq)
rq_completed(md)
So request-based dm used dm_get()/dm_put() to hold md for each I/O
until its request completion handling is fully done.
However, the final dm_put() can call the device deletion code which
must not be run in interrupt context and may cause kernel panic.
To solve the problem, this patch moves the device deletion code,
dm_destroy(), to predetermined places that is actually deleting
the mapped_device in ioctl (process) context, and changes dm_put()
just to decrement the reference count of the mapped_device.
By this change, dm_put() can be used in any context and the symmetric
model below is introduced:
dm_create(): create a mapped_device
dm_destroy(): destroy a mapped_device
dm_get(): increment the reference count of a mapped_device
dm_put(): decrement the reference count of a mapped_device
dm_destroy() waits for all references of the mapped_device to disappear,
then deletes the mapped_device.
dm_destroy() uses active waiting with msleep(1), since deleting
the mapped_device isn't performance-critical task.
And since at this point, nobody opens the mapped_device and no new
reference will be taken, the pending counts are just for racing
completing activity and will eventually decrease to zero.
For the unlikely case of the forced module unload, dm_destroy_immediate(),
which doesn't wait and forcibly deletes the mapped_device, is also
introduced and used in dm_hash_remove_all(). Otherwise, "rmmod -f"
may be stuck and never return.
And now, because the mapped_device is deleted at this point, subsequent
accesses to the mapped_device may cause NULL pointer references.
Currently the driver will try to protect all frames,
which leads to a lot of odd things like sending an
RTS with a zeroed RA before multicast frames, which
is clearly bogus.
In order to fix all of this, we need to take a step
back and see what we need to achieve:
* we need RTS/CTS protection if requested by
the AP for the BSS, mac80211 tells us this
* in that case, CTS-to-self should only be
enabled when mac80211 tells us
* additionally, as a hardware workaround, on
some devices we have to protect aggregated
frames with RTS
To achieve the first two items, set up the RXON
accordingly and set the protection required flag
in the transmit command when mac80211 requests
protection for the frame.
To achieve the last item, set the rate-control
RTS-requested flag for all stations that we have
aggregation sessions with, and set the protection
required flag when sending aggregated frames (on
those devices where this is required).
Since otherwise bugs can occur, do not allow the
user to override the RTS-for-aggregation setting
from sysfs any more.
Finally, also clean up the way all these flags get
set in the driver and move everything into the
device-specific functions.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit fc6055a5ba31e2c14e36e8939f9bf2b6d586a7f5 (net: Introduce
skb_orphan_try()) allows an early orphan of the skb and takes care on
tx timestamping, which needs the sk-reference in the skb on driver level.
So does the can-raw socket, which has not been taken into account here.
The patch below adds a 'prevent_sk_orphan' bit in the skb tx shared info,
which fixes the problem discovered by Matthias Fuchs here:
Even if it's not a primary tx timestamp topic it fits well into some skb
shared tx context. Or should be find a different place for the information to
protect the sk reference until it reaches the driver level?
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The tv_nsec is a long and when added to the shifted interval it can wrap
and become negative which later causes looping problems in the
getrawmonotonic(). The edge case occurs when the system has slept for
a short period of time of ~2 seconds.
A trace printk of the values in this patch illustrate the problem:
The kernel starts looping at 46.349925 in the getrawmonotonic() due to
the negative value from adding the raw value to tv_nsec.
A simple solution is to accumulate into a u64, and then normalize it
to a timespec_t.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
[ Reworked variable names and simplified some of the code. - John ] Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Some BIOSes will claim a large chunk of stolen space. Unless we
reclaim it, our aperture for remapping buffer objects will be
constrained. So clamp the stolen space to 32M and ignore the rest.
Fixes https://bugzilla.kernel.org/show_bug.cgi?id=15469 among others.
Adding the ignored stolen memory back into the general pool using the
memory hotplug code is left as an exercise for the reader.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Simon Farnsworth <simon.farnsworth@onelan.com> Tested-by: Artem S. Tashkinov <t.artem@mailcity.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Newer Intel processors identifying themselves as model 30 are not recognized by
oprofile.
<cpuinfo snippet>
model : 30
model name : Intel(R) Xeon(R) CPU X3470 @ 2.93GHz
</cpuinfo snippet>
Running oprofile on these machines gives the following:
+ opcontrol --init
+ opcontrol --list-events
oprofile: available events for CPU type "Intel Architectural Perfmon"
See Intel 64 and IA-32 Architectures Software Developer's Manual
Volume 3B (Document 253669) Chapter 18 for architectural perfmon events
This is a limited set of fallback events because oprofile doesn't know your CPU
CPU_CLK_UNHALTED: (counter: all)
Clock cycles when not halted (min count: 6000)
INST_RETIRED: (counter: all)
number of instructions retired (min count: 6000)
LLC_MISSES: (counter: all)
Last level cache demand requests from this core that missed the LLC
(min count: 6000)
Unit masks (default 0x41)
----------
0x41: No unit mask
LLC_REFS: (counter: all)
Last level cache demand requests from this core (min count: 6000)
Unit masks (default 0x4f)
----------
0x4f: No unit mask
BR_MISS_PRED_RETIRED: (counter: all)
number of mispredicted branches retired (precise) (min count: 500)
+ opcontrol --shutdown
Back when the patch was submitted for "Add Xeon 7500 series support to
oprofile", Robert Richter had asked for a followon patch that
converted all the CPU ID values to hex.
I have done that here for the "i386/core_i7" and "i386/atom" class
processors in the ppro_init() function and also added some comments on
where to find documentation on the Intel processors.
Signed-off-by: John L. Villalovos <john.l.villalovos@intel.com> Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We should unlock here. This is the only place where we return from the
function with the lock held. The caller isn't expecting it.
Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Downgrade some error messages which occur frequently during
normal operation to debug messages.
Impact: logging Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Make /dev/console get initialised before any initialisation routine that
invokes modprobe because if modprobe fails, it's going to want to open
/dev/console, presumably to write an error message to.
The problem with that is that if the /dev/console driver is not yet
initialised, the chardev handler will call request_module() to invoke
modprobe, which will fail, because we never compile /dev/console as a
module.
This will lead to a modprobe loop, showing the following in the kernel
log:
This can happen, for example, when the built in md5 module can't find
the built in cryptomgr module (because the latter fails to initialise).
The md5 module comes before the call to tty_init(), presumably because
'crypto' comes before 'drivers' alphabetically.
Fix this by calling tty_init() from chrdev_init().
pskb_may_pull() may change skb pointers, so adjust icmph after pskb_may_pull().
Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Although netif_rx() isn't expected to be called in process context with
preemption enabled, it'd better handle this case. And this is why get_cpu()
is used in the non-RPS #ifdef branch. If tree RCU is selected,
rcu_read_lock() won't disable preemption, so preempt_disable() should be
called explictly.
Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since there was added ->tcf_chain() method without ->bind_tcf() to
sch_sfq class options, there is oops when a filter is added with
the classid parameter.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Reported-by: Franchoze Eric <franchoze@yandex.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
>Xin Xiaohui wrote:
> I looked into the code dev_gro_receive(), found the code here:
> if the frags[0] is pulled to 0, then the page will be released,
> and memmove() frags left.
> Is that right? I'm not sure if memmove do right or not, but
> frags[0].size is never set after memove at least. what I think
> a simple way is not to do anything if we found frags[0].size == 0.
> The patch is as followed.
...
This version of the patch fixes the bug directly in memmove.
Reported-by: "Xin, Xiaohui" <xiaohui.xin@intel.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The netpoll_rx_on() check in __napi_gro_receive() skips part of the
"common" GRO_NORMAL path, especially "pull:" in dev_gro_receive(),
where at least eth header should be copied for entirely paged skbs.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The main motivation of this patch changing strcpy() to strlcpy().
We strcpy() to copy a 48 byte buffers into a 49 byte buffers. So at
best the last byte has leaked information, or maybe there is an
overflow? Anyway, this patch closes the information leaks by zeroing
the memory and the calls to strlcpy() prevent overflows.
Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch adds a limit for nframes as the number of frames in TX_SETUP and
RX_SETUP are derived from a single byte multiplex value by default.
Use-cases that would require to send/filter more than 256 CAN frames should
be implemented in userspace for complexity reasons anyway.
Additionally the assignments of unsigned values from userspace to signed
values in kernelspace and vice versa are fixed by using unsigned values in
kernelspace consistently.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Reported-by: Ben Hawkes <hawkes@google.com> Acked-by: Urs Thuermann <urs.thuermann@volkswagen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
after updating the value of the ICMP payload, inet_proto_csum_replace4() should
be called with zero pseudohdr.
Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On the bridge TX path we're leaking an skb when br_multicast_rcv
returns an error.
Reported-by: David Lamparter <equinox@diac24.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is a bug in do_tcp_setsockopt(net/ipv4/tcp.c),
TCP_COOKIE_TRANSACTIONS case.
In some cases (when tp->cookie_values == NULL) new tcp_cookie_values
structure can be allocated (at cvp), but not bound to
tp->cookie_values. So a memory leak occurs.
Signed-off-by: Dmitry Popov <dp@highloadlab.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Long ago, when bridge was converted to RCU, rcu lock was equivalent
to having preempt disabled. RCU has changed a lot since then and
bridge code was still assuming the since transmit was called with
bottom half disabled, it was RCU safe.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Tested-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If a video head and keyboard are hooked up, specifying "console=ttyS0"
or similar to use a serial console will not work properly.
The key issue is that we must register all serial console capable
devices with register_console(), otherwise the command line specified
device won't be found. The sun serial drivers would only register
themselves as console devices if the OpenFirmware specified console
device node matched. To fix this part we now unconditionally get
the serial console register by setting serial_drv->cons always.
Secondarily we must not add_preferred_console() using the firmware
provided console setting if the user gaven an override on the kernel
command line using "console=" The "primary framebuffer" matching
logic was always triggering o n openfirmware device node match, make
it not when a command line override was given.
Reported-by: Frans Pop <elendil@planet.nl> Tested-by: Frans Pop <elendil@planet.nl> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
SunBlade-2500 has 'parallel' device node with compatible
property "pnpALI,1533,3" so add that to the ID table.
Reported-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch fixes alignment of slab objects in case CONFIG_DEBUG_PAGEALLOC is
active.
Before this spot in kmem_cache_create, we have this situation:
- align contains the required alignment of the object
- cachep->obj_offset is 0 or equals align in case of CONFIG_DEBUG_SLAB
- size equals the size of the object, or object plus trailing redzone in case
of CONFIG_DEBUG_SLAB
This spot tries to fill one page per object if the object is in certain size
limits, however setting obj_offset to PAGE_SIZE - size does break the object
alignment since size may not be aligned with the required alignment.
This patch simply adds an ALIGN(size, align) to the equation and fixes the
object size detection accordingly.
This code in drivers/s390/cio/qdio_setup_init has lead to incorrectly aligned
slab objects (sizeof(struct qdio_q) equals 1792):
qdio_q_cache = kmem_cache_create("qdio_q", sizeof(struct qdio_q),
256, 0, NULL);
Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Clean up and simplify set_64bit(). This code is quite old (1.3.11)
and contains a fair bit of auxilliary machinery that current versions
of gcc handle just fine automatically. Worse, the auxilliary
machinery can actually cause an unnecessary spill to memory.
Furthermore, the loading of the old value inside the loop in the
32-bit case is unnecessary: if the value doesn't match, the CMPXCHG8B
instruction will already have loaded the "new previous" value for us.
Clean up the comment, too, and remove page references to obsolete
versions of the Intel SDM.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <tip-*@vger.kernel.org> Tested-by: Mark Stanovich <mrktimber@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Like the mlock() change previously, this makes the stack guard check
code use vma->vm_prev to see what the mapping below the current stack
is, rather than have to look it up with find_vma().
Also, accept an abutting stack segment, since that happens naturally if
you split the stack with mlock or mprotect.
It's a really simple list, and several of the users want to go backwards
in it to find the previous vma. So rather than have to look up the
previous entry with 'find_vma_prev()' or something similar, just make it
doubly linked instead.
This patch changes dm_hash_remove_all() to release _hash_lock when
removing a device. After removing the device, dm_hash_remove_all()
takes _hash_lock and searches the hash from scratch again.
This patch is a preparation for the next patch, which changes device
deletion code to wait for md reference to be 0. Without this patch,
the wait in the next patch may cause AB-BA deadlock:
CPU0 CPU1
-----------------------------------------------------------------------
dm_hash_remove_all()
down_write(_hash_lock)
table_status()
md = find_device()
dm_get(md)
<increment md->holders>
dm_get_live_or_inactive_table()
dm_get_inactive_table()
down_write(_hash_lock)
<in the md deletion code>
<wait for md->holders to be 0>
This patch prevents access to mapped_device which is being deleted.
Currently, even after a mapped_device has been removed from the hash,
it could be accessed through idr_find() using minor number.
That could cause a race and NULL pointer reference below:
CPU0 CPU1
------------------------------------------------------------------
dev_remove(param)
down_write(_hash_lock)
dm_lock_for_deletion(md)
spin_lock(_minor_lock)
set_bit(DMF_DELETING)
spin_unlock(_minor_lock)
__hash_remove(hc)
up_write(_hash_lock)
dev_status(param)
md = find_device(param)
down_read(_hash_lock)
__find_device_hash_cell(param)
dm_get_md(param->dev)
md = dm_find_md(dev)
spin_lock(_minor_lock)
md = idr_find(MINOR(dev))
spin_unlock(_minor_lock)
dm_put(md)
free_dev(md)
dm_get(md)
up_read(_hash_lock)
__dev_status(md, param)
dm_put(md)
Validate chunk size against both origin and snapshot sector size
Don't allow chunk size smaller than either origin or snapshot logical
sector size. Reading or writing data not aligned to sector size is not
allowed and causes immediate errors.
This requires us to open the origin before initialising the
exception store and to export dm_snap_origin.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
iterate_devices method should call the callback for all the devices where
the bio may be remapped. Thus, snapshot_iterate_devices should call the callback
for both snapshot and origin underlying devices because it remaps some bios
to the snapshot and some to the origin.
snapshot_iterate_devices called the callback only for the origin device.
This led to badly calculated device limits if snapshot and origin were placed
on different types of disks.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
My i855GM suffers from a 80k/s interrupt storm without this.
So add 2nd gen to the list of things that don't like more than
one outstanding pageflip request.
Furthermore I've changed the busy loop into a ringbuffer wait.
Busy-loops that don't check whether the chip died are simply evil.
And performance should actually improve, because there's usually
a decent amount of rendering queued on the gpu, hopefully rendering
that MI_WAIT into a noop by the time it's executed.
The current code holds dev->struct_mutex while executing this loop,
hence stalling all other gem activity anyway.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[anholt: resolved against conflict] Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add a new path for 2nd gen chips that uses the commands for i81x
chips (where public docs do exist) augmented with the plane bits
from i915. It seems to work and doesn't result in a black screen
like before.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[anholt: resolved against conflict] Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>