Mike Galbraith [Thu, 11 Mar 2010 16:17:13 +0000 (17:17 +0100)]
sched: Rate-limit nohz
Entering nohz code on every micro-idle is costing ~10% throughput for netperf
TCP_RR when scheduling cross-cpu. Rate limiting entry fixes this, but raises
ticks a bit. On my Q6600, an idle box goes from ~85 interrupts/sec to 128.
The higher the context switch rate, the more nohz entry costs. With this patch
and some cycle recovery patches in my tree, max cross cpu context switch rate is
improved by ~16%, a large portion of which of which is this ratelimiting.
Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1268301003.6785.28.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Lucas De Marchi [Thu, 11 Mar 2010 02:37:45 +0000 (23:37 -0300)]
sched: Implement group scheduler statistics in one struct
Put all statistic fields of sched_entity in one struct, sched_statistics,
and embed it into sched_entity.
This change allows to memset the sched_statistics to 0 when needed (for
instance when forking), avoiding bugs of non initialized fields.
Signed-off-by: Lucas De Marchi <lucas.de.marchi@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1268275065-18542-1-git-send-email-lucas.de.marchi@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86: Fix sched_clock_cpu for systems with unsynchronized TSC
On UV systems, the TSC is not synchronized across blades. The
sched_clock_cpu() function is returning values that can go
backwards (I've seen as much as 8 seconds) when switching
between cpus.
As each cpu comes up, early_init_intel() will currently set the
sched_clock_stable flag true. When mark_tsc_unstable() runs, it
clears the flag, but this only occurs once (the first time a cpu
comes up whose TSC is not synchronized with cpu 0). After this,
early_init_intel() will set the flag again as the next cpu comes
up.
Only set sched_clock_stable if tsc has not been marked unstable.
Linus Torvalds [Mon, 1 Mar 2010 21:05:40 +0000 (13:05 -0800)]
Merge branch 'davinci-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-davinci
* 'davinci-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-davinci: (40 commits)
DaVinci DM365: Adding support for SPI EEPROM
DaVinci DM365: Adding DM365 SPI support
DaVinci DM355: Modifications to DM355 SPI support
DaVinci: SPI: Adding header file for SPI support.
davinci: dm646x: CDCE clocks: davinci_clk converted to clk_lookup
davinci: clkdev cleanup: remove clk_lookup wrapper, use clkdev_add_table()
DaVinci: DM365: Voice codec support for the DM365 SoC
davinci: clock: let clk->set_rate function sleep
Add SDA and SCL pin numbers to i2c platform data
davinci: da8xx/omap-l1xx: Add EDMA platform data for da850/omap-l138
davinci: build list of unused EDMA events dynamically
davinci: Fix edma_alloc_channel api for EDMA_CHANNEL_ANY case
davinci: Keep count of channel controllers on a platform
davinci: Correct return value of edma_alloc_channel api
davinci: add CDCE949 support on DM6467 EVM
davinci: add support for CDCE949 clock synthesizer
davinci: da850/omap-l138 EVM: register for suspend support
davinci: da850/omap-l138: add support for SoC suspend
davinci: add power management support
DaVinci: DM365: Changing default queue for DM365.
...
Linus Torvalds [Mon, 1 Mar 2010 21:04:58 +0000 (13:04 -0800)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (38 commits)
sata_via: Delay on vt6420 when starting ATAPI DMA write
ata: Detect Delkin Devices compact flash
pata_efar: Enable parallel scanning
pata_atiixp: enable parallel scan
[libata] pata_atiixp: add locking for parallel scanning
[libata] pata_efar: add locking for parallel scanning
libata: Pass host flags into the pci helper
[libata] pata_marvell: CONFIG_AHCI is really CONFIG_SATA_AHCI
libata: Allow pata_legacy to be built on non-ISA but PCI systems
pata_pdc202xx_old: fix UDMA mode for PDC2026x chipsets
pata_pdc202xx_old: fix UDMA mode for Promise UDMA33 cards
[libata] pata_at91: fix backslash-continued string
pata_via: store UDMA masks in via_isa_bridges table
pata_via: fix address setup timings underlocking
pata_serverworks: fix error message
pata_serverworks: fix PIO setup for the second channel
pata_efar: fix secondary port support
pata_cypress: fix PIO timings underclocking
pata_cs5535: use correct values for PIO1 and PIO2 data timings
pata_cmd64x: remove unused definitions
...
Bart Hartgers [Sun, 14 Feb 2010 12:04:50 +0000 (13:04 +0100)]
sata_via: Delay on vt6420 when starting ATAPI DMA write
When writing a disc on certain lite-on dvd-writers (also rebadged
as optiarc/LG/...) connected to a vt6420, the ATAPI CDB ends
up in the datastream and on the disc, causing silent corruption.
Delaying between sending the CDB and starting DMA seems to
prevent this.
I do not know if there are burners that do not suffer from
this, but the patch should be safe for those as well.
There are many reports of this issue, but AFAICT no solution was
found before. For example:
http://lkml.indiana.edu/hypermail/linux/kernel/0802.3/0561.html
Signed-off-by: Bart Hartgers <bart.hartgers@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Ben Gardner [Tue, 23 Feb 2010 18:41:22 +0000 (12:41 -0600)]
ata: Detect Delkin Devices compact flash
I have a Delkin Devices compact flash card that isn't being recognized using the
SATA/PATA drivers.
The card is recognized and works with the deprecated ATA drivers.
The error I am seeing is:
ata1.00: failed to IDENTIFY (device reports invalid type, err_mask=0x0)
I tracked it down to ata_id_is_cfa() in include/linux/ata.h.
The Delkin card has id[0] set to 0x844a and id[83] set to 0.
This isn't what the kernel expects and is probably incorrect.
The simplest work-around is to add a check for 0x844a to ata_id_is_cfa().
Signed-off-by: Ben Gardner <gardner.ben@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
[libata] pata_atiixp: add locking for parallel scanning
This is similar change as commit 60c3be3 for ata_piix host driver
and while pata_atiixp doesn't enable parallel scan yet the race
could probably also be triggered by requesting re-scanning of both
ports at the same time using SCSI sysfs interface.
[Ported to current tree without other patch dependancies by Alan Cox]
Original is Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This one is Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
[libata] pata_efar: add locking for parallel scanning
Add clearing of UDMA enable bit also for PIO modes and then add
extra locking for parallel scanning.
This is similar change as commit 60c3be3 for ata_piix host driver
and while pata_efar doesn't enable parallel scan yet the race could
probably also be triggered by requesting re-scanning of both ports
at the same time using SCSI sysfs interface.
[Ported to current kernel without other patch dependancies by
Alan Cox]
Original is Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This one is Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Alan Cox [Tue, 23 Feb 2010 07:26:06 +0000 (02:26 -0500)]
libata: Pass host flags into the pci helper
This allows parallel scan and the like to be set without having to stop
using the existing full helper functions. This patch merely adds the argument
and fixes up the callers. It doesn't undo the special cases already in the
tree or add any new parallel callers.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Christoph Egger [Fri, 5 Feb 2010 15:26:35 +0000 (16:26 +0100)]
[libata] pata_marvell: CONFIG_AHCI is really CONFIG_SATA_AHCI
The marvell driver comtains a fallback to ahci for the sata ports
which is incorrectly checked as CONFIG_AHCI while the only AHCI config
item is actually called SATA_AHCI (which also sounds sensible
considering it's a fallback for the sata ports).
Signed-off-by: Christoph Egger <siccegge@stud.informatik.uni-erlangen.de> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Alan Cox [Mon, 8 Feb 2010 10:04:54 +0000 (10:04 +0000)]
libata: Allow pata_legacy to be built on non-ISA but PCI systems
This is needed for some unsupported hardware setups on strange 64bit
mainboards where crazy stuff has been done like putting flash ata adapters
on the LPC bus, or where the real hardware is hidden/confused.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
pata_pdc202xx_old: fix UDMA mode for Promise UDMA33 cards
On Monday 04 January 2010 02:30:24 pm Russell King wrote:
> Found the problem - getting rid of the read of the alt status register
> after the command has been written fixes the UDMA CRC errors on write:
>
> @@ -676,7 +676,8 @@ void ata_sff_exec_command(struct ata_port *ap, const struct
> ata_taskfile *tf)
> DPRINTK("ata%u: cmd 0x%X\n", ap->print_id, tf->command);
>
> iowrite8(tf->command, ap->ioaddr.command_addr);
> - ata_sff_pause(ap);
> + ndelay(400);
> +// ata_sff_pause(ap);
> }
> EXPORT_SYMBOL_GPL(ata_sff_exec_command);
>
>
> This rather makes sense. The PDC20247 handles the UDMA part of the
> protocol. It has no way to tell the PDC20246 to wait while it suspends
> UDMA, so that a normal register access can take place - the 246 ploughs
> on with the register access without any regard to the state of the 247.
>
> If the drive immediately starts the UDMA protocol after a write to the
> command register (as it probably will for the DMA WRITE command), then
> we'll be accessing the taskfile in the middle of the UDMA setup, which
> can't be good. It's certainly a violation of the ATA specs.
Fix it by adding custom ->sff_exec_command method for UDMA33 chipsets.
Debugged-by: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
pata_cs5535: use correct values for PIO1 and PIO2 data timings
There shouldn't be any problems with it as IDE cs5535 host driver
has been using those values for years and they match values given
in the (publicly available) datasheet.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Bart Hartgers [Sat, 16 Jan 2010 23:56:54 +0000 (00:56 +0100)]
sata_via: Correctly setup PIO/DMA for pata slave on vt6421.
Before only the timings for master were set. Datasheet can be found
here: ftp://ftp.vtbridge.org/Docs/Storage/DS_VT6421A_100_CCPL.PDF
Surprisingly, a slave drive works without this patch. According to the
datasheet, the controller by default derives the DMA mode from the
Set Features command issued to a drive. Not sure about the PIO
timings, though. The real problem is that the timings for the master
effectively are the ones tuned for the slave. If these support
different UDMA-settings, there is trouble, especially when the slave
supports a higher UDMA than the master.
Anyhow, using the same mechanism for both master and slave seems like
a good idea.
Tejun Heo [Tue, 19 Jan 2010 01:49:19 +0000 (10:49 +0900)]
libata: implement spurious irq handling for SFF and apply it to piix
Traditional IDE interface sucks in that it doesn't have a reliable IRQ
pending bit, so if the controller raises IRQ while the driver is
expecting it not to, the IRQ won't be cleared and eventually the IRQ
line will be killed by interrupt subsystem. Some controllers have
non-standard mechanism to indicate IRQ pending so that this condition
can be detected and worked around.
This patch adds an optional operation ->sff_irq_check() which will be
called for each port from the ata_sff_interrupt() if an unexpected
interrupt is received. If the operation returns %true,
->sff_check_status() and ->sff_irq_clear() will be cleared for the
port. Note that this doesn't mark the interrupt as handled so it
won't prevent IRQ subsystem from killing the IRQ if this mechanism
fails to clear the spurious IRQ.
This patch also implements ->sff_irq_check() for ata_piix. Note that
this adds slight overhead to shared IRQ operation as IRQs which are
destined for other controllers will trigger extra register accesses to
check whether IDE interrupt is pending but this solves rare screaming
IRQ cases and for some curious reason also helps weird BIOS related
glitch on Samsung n130 as reported in bko#14314.
http://bugzilla.kernel.org/show_bug.cgi?id=14314
* piix_base_ops dropped as suggested by Sergei.
* Spurious IRQ detection doesn't kick in anymore if polling qc is in
progress. This provides less protection but some controllers have
possible data corruption issues if the wrong register is accessed
while a command is in progress.
Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Johannes Stezenbach <js@sig21.net> Reported-by: Hans Werner <hwerner4@gmx.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Tue, 19 Jan 2010 01:46:32 +0000 (10:46 +0900)]
libata: cleanup ata_sff_interrupt()
host->ports[i] is never NULL if i < host->n_ports and non-NULL return
from ata_qc_from_tag() guarantees that the returned qc is active.
Drop unnecessary tests.
Superflous () dropped as suggested by Sergei.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Describe UDMA timing bits 18-20 and 21 separately; add a note to bit
31 about it being meaningful for PIO only. Reformat the whole comment,
while at it...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Sergei Shtylyov [Mon, 7 Dec 2009 19:30:06 +0000 (23:30 +0400)]
pata_hpt{37x|3x2n}: unify mode programming
As these drivers' set_piomode() and set_dmamode() methods are almost
identical, factor out the common hpt{37x|3x2n}_set_mode() function
to be called by both of them, the same as in 'pata_hpt366' driver.
This results in ~5% decrease in the 'pata_hpt37x' driver binary
size and in ~4% decrease in the 'pata_hpt3x2n' driver binary size
(as measured on x86-32).
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Sergei Shtylyov [Mon, 7 Dec 2009 19:25:52 +0000 (23:25 +0400)]
pata_hpt3x2n: always stretch UltraDMA timing
The UltraDMA Tss timing must be stretched with ATA clock of 66 MHz, but the
driver only does this when PCI clock is 66 MHz, whereas it always programs
DPLL clock (which is used as the ATA clock) to 66 MHz.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: <stable@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Linus Torvalds [Mon, 1 Mar 2010 18:38:09 +0000 (10:38 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (62 commits)
Input: atkbd - release previously reserved keycodes 248 - 254
Input: add KEY_WPS_BUTTON definition
Input: ads7846 - add regulator support
Input: winbond-cir - fix suspend/resume
Input: gamecon - use pr_err() and friends
Input: gamecon - constify some of the setup structures
Input: gamecon - simplify pad type handling
Input: gamecon - simplify coordinate calculation for PSX
Input: gamecon - fix some formatting issues
Input: gamecon - add rumble support for N64 pads
Input: wacom - add device type to device name string
Input: s3c24xx_ts - report touch only when stylus is down
Input: s3c24xx_ts - re-enable IRQ on resume
Input: wacom - constify product features data
Input: wacom - use per-device instance of wacom_features
Input: sh_keysc - enable building on SH-Mobile ARM
Input: wacom - get features from driver info
Input: rotary-encoder - set gpio direction for each requested gpio
Input: sh_keysc - update the driver with mode 6
Input: sh_keysc - switch to using bitmaps
...
Linus Torvalds [Mon, 1 Mar 2010 18:36:22 +0000 (10:36 -0800)]
Merge branch 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI: replace acpi_integer by u64
ACPICA: Update version to 20100121.
ACPICA: Remove unused uint32_struct type
ACPICA: Disassembler: Remove obsolete "Integer64" field in parse object
ACPICA: Remove obsolete ACPI_INTEGER (acpi_integer) type
ACPICA: Predefined name repair: fix NULL package elements
ACPICA: AcpiGetDevices: Eliminate unnecessary _STA calls
ACPICA: Update all ACPICA copyrights and signons to 2010
ACPICA: Update for new gcc-4 warning options
Sandeep Paulraj [Mon, 1 Feb 2010 14:51:15 +0000 (09:51 -0500)]
DaVinci DM355: Modifications to DM355 SPI support
This patch does the following
1) Minor change to the SPI clocks making it
similar to DM365.
2) Changing the interrupt used by SPI0
3) Adding EDMA resources that can be used by SPI0
4) Adding platform specific data.
Signed-off-by: Sandeep Paulraj <s-paulraj@ti.com> Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Linus Torvalds [Mon, 1 Mar 2010 18:14:46 +0000 (10:14 -0800)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] wrong attribute of HUB chip written in uv_setup()
[IA64] remove trailing space in messages
[IA64] use asm-generic/scatterlist.h
[IA64] build arch/ia64/kernel/acpi-ext.o when CONFIG_ACPI
[IA64] Only build arch/ia64/kernel/acpi.o when CONFIG_ACPI
[IA64] Remove COMPAT_IA32 support
Linus Torvalds [Mon, 1 Mar 2010 17:00:29 +0000 (09:00 -0800)]
Merge branch 'for-2.6.34' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.34' of git://git.kernel.dk/linux-2.6-block: (38 commits)
block: don't access jiffies when initialising io_context
cfq: remove 8 bytes of padding from cfq_rb_root on 64 bit builds
block: fix for "Consolidate phys_segment and hw_segment limits"
cfq-iosched: quantum check tweak
blktrace: perform cleanup after setup error
blkdev: fix merge_bvec_fn return value checks
cfq-iosched: requests "in flight" vs "in driver" clarification
cciss: Fix problem with scatter gather elements in the scsi half of the driver
cciss: eliminate unnecessary pointer use in cciss scsi code
cciss: do not use void pointer for scsi hba data
cciss: factor out scatter gather chain block mapping code
cciss: fix scatter gather chain block dma direction kludge
cciss: simplify scatter gather code
cciss: factor out scatter gather chain block allocation and freeing
cciss: detect bad alignment of scsi commands at build time
cciss: clarify command list padding calculation
cfq-iosched: rethink seeky detection for SSDs
cfq-iosched: rework seeky detection
block: remove padding from io_context on 64bit builds
block: Consolidate phys_segment and hw_segment limits
...
Linus Torvalds [Mon, 1 Mar 2010 16:58:44 +0000 (08:58 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: (252 commits)
ASoC: Check progress when reporting periods from i.MX FIQ handler
ASoC: Remove a unused variables from i.MX FIQ runtime data
ALSA: hda - Add/fix ALC269 FSC and Quanta models
ALSA: hda - Add ALC670 codec support
OMAP4: PMIC: Add support for twl6030 codec
ALSA: hda - remove unnecessary msleep on power state transitions
usb/gadget/{f_audio,gmidi}.c: follow recent changes in audio.h
ASoC: fsi: Modify over/under run error settlement
ASoC: OMAP4: Add McPDM platform driver
ASoC: OMAP4: Add support for McPDM
ASoC: OMAP: data_type and sync_mode configurable in audio dma
ALSA: hda - Add missing description in HD-Audio-Models.txt
ALSA: add support for Macbook Air 2,1 internal speaker
ALSA: usbaudio: consolidate header files
ALSA: usbmixer: bail out early when parsing audio class v2 descriptors
ALSA: usbaudio: implement basic set of class v2.0 parser
ALSA: usbaudio: introduce new types for audio class v2
ALSA: usbaudio: parse USB descriptors with structs
ALSA: hda - enable snoop for Intel Cougar Point
ALSA: hda - Remove identical definitions for macmini3 model
...
Linus Torvalds [Mon, 1 Mar 2010 16:51:52 +0000 (08:51 -0800)]
Merge branches 'futexes-for-linus', 'irq-core-for-linus' and 'bkl-drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
futex: Protect pid lookup in compat code with RCU
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Fix documentation of default chip disable()
* 'bkl-drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
nvram: Drop the BKL from nvram_open()
Randy Dunlap [Mon, 1 Mar 2010 01:32:45 +0000 (17:32 -0800)]
scsi.c: add missing kernel-doc notation for new VPD parameters
Add missing kernel-doc notation for new function parameters:
Warning(drivers/scsi/scsi.c:1031): No description found for parameter 'buf'
Warning(drivers/scsi/scsi.c:1031): No description found for parameter 'buf_len'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yinghai Lu [Sun, 28 Feb 2010 23:49:39 +0000 (15:49 -0800)]
pci: don't reassign to ROM res if it is not going to be enabled
A ROM resource that doesn't fit should not cause us to try to re-assign
all the bus resources. Nobody generally cares, and re-assigning is
going to just cause way more troubles than it tries to solve.
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Richard Kennedy [Mon, 1 Mar 2010 09:57:22 +0000 (10:57 +0100)]
block: don't access jiffies when initialising io_context
As the comment says the initial value of last_waited is never used, so
there is no need to initialise it with the current jiffies. Jiffies is
hot enough without accessing it for no reason.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Shaohua Li [Mon, 1 Mar 2010 08:20:54 +0000 (09:20 +0100)]
cfq-iosched: quantum check tweak
Currently a queue can only dispatch up to 4 requests if there are other queues.
This isn't optimal, device can handle more requests, for example, AHCI can
handle 31 requests. I can understand the limit is for fairness, but we could
do a tweak: if the queue still has a lot of slice left, sounds we could
ignore the limit. Test shows this boost my workload (two thread randread of
a SSD) from 78m/s to 100m/s.
Thanks for suggestions from Corrado and Vivek for the patch.
Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Linus Torvalds [Sun, 28 Feb 2010 19:00:55 +0000 (11:00 -0800)]
Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, uv: Remove recursion in uv_heartbeat_enable()
x86, uv: uv_global_gru_mmr_address() macro fix
x86, uv: Add serial number parameter to uv_bios_get_sn_info()
Linus Torvalds [Sun, 28 Feb 2010 18:59:18 +0000 (10:59 -0800)]
Merge branch 'x86-pci-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-pci-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Enable NMI on all cpus on UV
vgaarb: Add user selectability of the number of GPUS in a system
vgaarb: Fix VGA arbiter to accept PCI domains other than 0
x86, uv: Update UV arch to target Legacy VGA I/O correctly.
pci: Update pci_set_vga_state() to call arch functions
Dmitry Monakhov [Sat, 27 Feb 2010 17:35:13 +0000 (20:35 +0300)]
blktrace: perform cleanup after setup error
Currently even if BLKTRACESETUP ioctl has failed user must call
BLKTRACETEARDOWN to be shure what all staff was cleaned, which
is contr-intuitive.
Let's setup ioctl make necessery cleanup by it self.
Dmitry Monakhov [Sat, 27 Feb 2010 17:35:12 +0000 (20:35 +0300)]
blkdev: fix merge_bvec_fn return value checks
merge_bvec_fn() returns bvec->bv_len on success. So we have to check
against this value. But in case of fs_optimization merge we compare
with wrong value. This patch must be included in b428cd6da7e6559aca69aa2e3a526037d3f20403
But accidentally i've forgot to add this in the initial patch.
To make things straight let's replace all such checks.
In fact this makes code easy to understand.
Corrado Zoccolo [Sun, 28 Feb 2010 18:45:05 +0000 (19:45 +0100)]
cfq-iosched: requests "in flight" vs "in driver" clarification
Counters for requests "in flight" and "in driver" are used asymmetrically
in cfq_may_dispatch, and have slightly different meaning.
We split the rq_in_flight counter (was sync_flight) to count both sync
and async requests, in order to use this one, which is more accurate in
some corner cases.
The rq_in_driver counter is coalesced, since individual sync/async counts
are not used any more.
Linus Torvalds [Sun, 28 Feb 2010 18:43:53 +0000 (10:43 -0800)]
Merge branch 'x86-setup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-setup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, setup: Don't skip mode setting for the standard VGA modes
x86-64, setup: Inhibit decompressor output if video info is invalid
x86, setup: When restoring the screen, update boot_params.screen_info
cciss: Fix problem with scatter gather elements in the scsi half of the driver
cciss: Fix problem with scatter gather elements in the scsi half of the driver
When support for more than 31 scatter gather elements was added to the block
half of the driver, the SCSI half of the driver was not addressed, and the bump
from 31 to 32 scatter gather elements in the command block itself (not chained)
actually broke the SCSI half of the driver, so that any transfer requiring 32
scatter gather elements wouldn't work. This fix also increases the max transfer
size and size of the scatter gather table to the limit supported by the controller
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: do not use void pointer for scsi hba data
and get rid of related unnecessary type casting
and delete some superfluous and misleading comments nearby.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: fix scatter gather chain block dma direction kludge
cciss: fix scatter gather chain block dma direction kludge
The data direction for the chained block of scatter gather
elements should always be PCI_DMA_TODEVICE, but was mistakenly
set to the direction of the data transfer, then a kludge to
fix it was added, in which pci_dma_sync_single_for_device or
pci_dma_sync_single_for_cpu was called. If the correct direction
is used in the first place, the kludge isn't needed.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: simplify scatter gather code.
Instead of allocating an array of pointers to a structure
containing an SGDescriptor structure, and two other elements
that aren't really used, just allocate SGDescriptor structs.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Linus Torvalds [Sun, 28 Feb 2010 18:41:35 +0000 (10:41 -0800)]
Merge branch 'x86-rwsem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-rwsem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-64, rwsem: Avoid store forwarding hazard in __downgrade_write
x86-64, rwsem: 64-bit xadd rwsem implementation
x86: Fix breakage of UML from the changes in the rwsem system
x86-64: support native xadd rwsem implementation
x86: clean up rwsem type system
Corrado Zoccolo [Sat, 27 Feb 2010 18:45:40 +0000 (19:45 +0100)]
cfq-iosched: rethink seeky detection for SSDs
CFQ currently applies the same logic of detecting seeky queues and
grouping them together for rotational disks as well as SSDs.
For SSDs, the time to complete a request doesn't depend on the
request location, but only on the size.
This patch therefore changes the criterion to group queues by
request size in case of SSDs, in order to achieve better fairness.
Corrado Zoccolo [Sat, 27 Feb 2010 18:45:39 +0000 (19:45 +0100)]
cfq-iosched: rework seeky detection
Current seeky detection is based on average seek lenght.
This is suboptimal, since the average will not distinguish between:
* a process doing medium sized seeks
* a process doing some sequential requests interleaved with larger seeks
and even a medium seek can take lot of time, if the requested sector
happens to be behind the disk head in the rotation (50% probability).
Therefore, we change the seeky queue detection to work as follows:
* each request can be classified as sequential if it is very close to
the current head position, i.e. it is likely in the disk cache (disks
usually read more data than requested, and put it in cache for
subsequent reads). Otherwise, the request is classified as seeky.
* an history window of the last 32 requests is kept, storing the
classification result.
* A queue is marked as seeky if more than 1/8 of the last 32 requests
were seeky.
This patch fixes a regression reported by Yanmin, on mmap 64k random
reads.
Linus Torvalds [Sun, 28 Feb 2010 18:39:36 +0000 (10:39 -0800)]
Merge branch 'x86-numa-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-numa-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, numa: Remove configurable node size support for numa emulation
x86, numa: Add fixed node size option for numa emulation
x86, numa: Fix numa emulation calculation of big nodes
x86, acpi: Map hotadded cpu to correct node.
Linus Torvalds [Sun, 28 Feb 2010 18:39:16 +0000 (10:39 -0800)]
Merge branch 'x86-mtrr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mtrr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Convert set_atomicity_lock to raw_spinlock
x86, mtrr: Kill over the top warn
x86, mtrr: Constify struct mtrr_ops
The bug is in virtio-pci: we use msix_vector as array index to get irq
entry, but some vqs do not have a dedicated vector so this causes an out
of bounds access. By chance, we seem to often get 0 value, which
results in this error.
Fix by verifying that vector is legal before using it as index.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Anthony Liguori <aliguori@us.ibm.com> Acked-by: Shirley Ma <xma@us.ibm.com> Acked-by: Amit Shah <amit.shah@redhat.com>
Linus Torvalds [Sun, 28 Feb 2010 18:38:45 +0000 (10:38 -0800)]
Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, mm: Unify kernel_physical_mapping_init() API
x86, mm: Allow highmem user page tables to be disabled at boot time
x86: Do not reserve brk for DMI if it's not going to be used
x86: Convert tlbstate_lock to raw_spinlock
x86: Use the generic page_is_ram()
x86: Remove BIOS data range from e820
Move page_is_ram() declaration to mm.h
Generic page_is_ram: use __weak
resources: introduce generic page_is_ram()
Linus Torvalds [Sun, 28 Feb 2010 18:37:06 +0000 (10:37 -0800)]
Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, cacheinfo: Enable L3 CID only on AMD
x86, cacheinfo: Remove NUMA dependency, fix for AMD Fam10h rev D1
x86, cpu: Print AMD virtualization features in /proc/cpuinfo
x86, cacheinfo: Calculate L3 indices
x86, cacheinfo: Add cache index disable sysfs attrs only to L3 caches
x86, cacheinfo: Fix disabling of L3 cache indices
intel-agp: Switch to wbinvd_on_all_cpus
x86, lib: Add wbinvd smp helpers
Linus Torvalds [Sun, 28 Feb 2010 18:31:01 +0000 (10:31 -0800)]
Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits)
sched: Fix SCHED_MC regression caused by change in sched cpu_power
sched: Don't use possibly stale sched_class
kthread, sched: Remove reference to kthread_create_on_cpu
sched: cpuacct: Use bigger percpu counter batch values for stats counters
percpu_counter: Make __percpu_counter_add an inline function on UP
sched: Remove member rt_se from struct rt_rq
sched: Change usage of rt_rq->rt_se to rt_rq->tg->rt_se[cpu]
sched: Remove unused update_shares_locked()
sched: Use for_each_bit
sched: Queue a deboosted task to the head of the RT prio queue
sched: Implement head queueing for sched_rt
sched: Extend enqueue_task to allow head queueing
sched: Remove USER_SCHED
sched: Fix the place where group powers are updated
sched: Assume *balance is valid
sched: Remove load_balance_newidle()
sched: Unify load_balance{,_newidle}()
sched: Add a lock break for PREEMPT=y
sched: Remove from fwd decls
sched: Remove rq_iterator from move_one_task
...