It apparently causes problems with partition table read-ahead
on archs with large page sizes. Until that problem is diagnosed
further, just drop the readpages support on block devices.
blk_queue_bounce_limit() is more than a wrapper about the request queue
limits.bounce_pfn variable. Introduce blk_queue_bounce_pfn() which can
be called by stacking drivers that wish to set the bounce limit
explicitly.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Andrew Morton [Tue, 2 Jun 2009 12:51:30 +0000 (14:51 +0200)]
cciss: use schedule_timeout_interruptible()
Use schedule_timeout_interruptible() instead of open-coding the set and
schedule parts.
Cc: Mike Miller <mikem@beardog.cca.cpqcorp.net> Cc: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Add sysfs entries to the cciss driver needed for the dm/multipath tools.
A file for vendor, model, rev, and unique_id is added for each logical
drive under directory /sys/bus/pci/devices/<dev>/ccissX/cXdY. Where X =
the controller (or host) number and Y is the logical drive number.
A link from /sys/bus/pci/devices/<dev>/ccissX/cXdY/block:cciss!cXdY to
/sys/block/cciss!cXdY/device is also created. A bus is created in
/sys/bus/cciss. A link is created from the pci ccissX entry to
/sys/bus/cciss/devices/ccissX. Please consider this for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com> Cc: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Fix the SCSI reset error handler to send a working, properly addressed
reset message to the target device and add code to wait for the target
device to become ready by polling it with Test Unit Ready.
The existing reset code was broken in that it didn't bother to set the
8-byte LUN address to anything besides zero, so the command was addressed
to the controller, which pretended to the driver that the command
succeeded, while doing nothing. Ages ago I tested this code, but
unbeknownst to me, my test was flawed, and what I thought was a tape drive
getting reset was actually nothing of the sort. Unfortunately, there is
still lots of Smartarray firmware that doesn't handle doing target resets
right, and this code won't help in those cases, but it also shouldn't make
things worse in those cases than they already are.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net> Cc: Mike Miller <mikem@beardog.cca.cpqcorp.net> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
cciss: factor out core of sendcmd() for a more sane interface
Factor out the core of sendcmd() to provide a simpler interface which
exposes all the error information to the caller and make the original
sendcmd use this new function. Rationale: The SCSI error handling
routines need to send commands with interrupts turned off, but they also
need access to the full error information.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net> Cc: Mike Miller <mikem@beardog.cca.cpqcorp.net> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Kiyoshi Ueda [Tue, 2 Jun 2009 06:44:01 +0000 (08:44 +0200)]
block: fix a possible oops on elv_abort_queue()
I found one more mis-conversion to the 'request is always dequeued
when completing' model in elv_abort_queue() during code inspection.
Although I haven't hit any problem caused by this mis-conversion yet
and just done compile/boot test, please apply if you have no problem.
Request must be dequeued when it completes.
However, elv_abort_queue() completes requests without dequeueing.
This will cause oops in the __blk_end_request_all().
This patch fixes the oops.
James Bottomley [Sat, 30 May 2009 04:43:49 +0000 (06:43 +0200)]
block: fix an oops on BLKPREP_KILL
Doing a bit of torture testing, I ran across a BUG in the block
subsystem (at blk-core.c:2048): the test for if the request is queued.
It turns out the trigger was a BLKPREP_KILL coming out of the SCSI prep
function. Currently for BLKPREP_KILL requests, we send them straight
into __blk_end_request_all() with an error, but they've never been
dequeued, so they trip the bug. Fix this by starting requests before
killing them.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Kiyoshi Ueda [Wed, 27 May 2009 12:50:02 +0000 (14:50 +0200)]
block: fix no diskstat problem
The commit below in 2.6-block/for-2.6.31 causes no diskstat problem
because the blk_discard_rq() check was added with '&&'.
It should be 'blk_fs_request() || blk_discard_rq()'.
This patch does it and fixes the no diskstat problem.
Please review and apply.
We currently don't do merging on discard requests, but we potentially
could. If we do, then we need to include discard requests in the IO
accounting, or merging would end up decrementing in_flight IO counters
for an IO which never incremented them.
block: implement and enforce request peek/start/fetch
Added a BUG_ON(blk_queued_rq(req)) to the top of blk_finish_req().
Unfortunately, this checks whether req->queuelist is empty. This list
is doing double duty both as the queue list and the tag list, so tagged
requests come in here with this not empty and boom (the tag list is
emptied by blk_queue_end_tag() lower down).
Fix this by moving the BUG_ON to below the end tag we also seem
vulnerable to this in blk_requeue_request() as well. I think all uses
of blk_queued_rq() need auditing because the check is clearly wrong in
the tagged case.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
block: Export I/O topology for block devices and partitions
To support devices with physical block sizes bigger than 512 bytes we
need to ensure proper alignment. This patch adds support for exposing
I/O topology characteristics as devices are stacked.
logical_block_size is the smallest unit the device can address.
physical_block_size indicates the smallest I/O the device can write
without incurring a read-modify-write penalty.
The io_min parameter is the smallest preferred I/O size reported by
the device. In many cases this is the same as the physical block
size. However, the io_min parameter can be scaled up when stacking
(RAID5 chunk size > physical block size).
The io_opt characteristic indicates the optimal I/O size reported by
the device. This is usually the stripe width for arrays.
The alignment_offset parameter indicates the number of bytes the start
of the device/partition is offset from the device's natural alignment.
Partition tools and MD/DM utilities can use this to pad their offsets
so filesystems start on proper boundaries.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Currently stacking devices do not have a queue directory in sysfs.
However, many of the I/O characteristics like sector size, maximum
request size, etc. are queue properties.
This patch enables the queue directory for MD/DM devices. The elevator
code has been modified to deal with queues that do not have an I/O
scheduler.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Until now we have had a 1:1 mapping between storage device physical
block size and the logical block sized used when addressing the device.
With SATA 4KB drives coming out that will no longer be the case. The
sector size will be 4KB but the logical block size will remain
512-bytes. Hence we need to distinguish between the physical block size
and the logical ditto.
This patch renames hardsect_size to logical_block_size.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Linus Torvalds [Fri, 22 May 2009 14:37:42 +0000 (07:37 -0700)]
Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
MIPS: IP32: Remove unnecessary if not even harmful volatile keywords.
MIPS: IP32: Fix build error due to uninitialized variable.
MIPS: Fix sparse warning in incompatiable argument type of clear_user.
Linus Torvalds [Fri, 22 May 2009 14:33:38 +0000 (07:33 -0700)]
Merge branch 'sh/for-2.6.30' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* 'sh/for-2.6.30' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
video: stop sh_mobile_lcdcfb only if started
sh: ap325 camera without i2c driver fix
Corey Minyard [Wed, 20 May 2009 18:36:17 +0000 (13:36 -0500)]
ipmi: fix ipmi_si modprobe hang
Instead of queuing IPMB messages before channel initialization, just
throw them away. Nobody will be listening for them at this point,
anyway, and they will clog up the queue and nothing will be delivered
if we queue them.
Also set the current channel to the number of channels, as this value
is used to tell if the channel information has been initialized.
Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: Ferenc Wagner <wferi@niif.hu> Cc: Dan Frazier <dannf@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It is possible for ide-cd to ignore ide_error()'s return value under
some circumstances. Workaround it in ide_intr() and ide_timer_expiry()
by checking if there is a device/port reset pending currently.
It turns out that such devices lack cable detection altogether
(which in turn results in incorrect detection of 40-wire cables
by our current cable detection strategy) so always handle them
by trusting host-side cable detection only.
v2:
Model detection fixup from Martin.
Reported-and-tested-by: Martin Lottermoser <Martin.Lottermoser@t-online.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Ralf Baechle [Fri, 22 May 2009 09:48:17 +0000 (10:48 +0100)]
MIPS: IP32: Fix build error due to uninitialized variable.
CC arch/mips/sgi-ip32/ip32-reset.o
cc1: warnings being treated as errors
arch/mips/sgi-ip32/ip32-reset.c: In function 'debounce':
arch/mips/sgi-ip32/ip32-reset.c:97: error: 'reg_a' is used uninitialized in this function
The issues is old but due to the volatile keyword gcc older than 4.4 did
not warn about this obvious bug.
Wu Zhangjin [Wed, 20 May 2009 21:50:01 +0000 (05:50 +0800)]
MIPS: Fix sparse warning in incompatiable argument type of clear_user.
The type of the second argument of access_ok should be (void __user *).
The unnecessary conversion of the clear_user address argument was causing
sparse to emit warnings on the __chk_user_ptr check.
Ryusuke Konishi [Fri, 22 May 2009 11:36:21 +0000 (20:36 +0900)]
nilfs2: fix memory leak in nilfs_ioctl_clean_segments
This fixes a new memory leak problem in garbage collection. The
problem was brought by the bugfix patch ("nilfs2: fix lock order
reversal in nilfs_clean_segments ioctl").
Thanks to Kentaro Suzuki for finding this problem.
Roel Kluin [Fri, 22 May 2009 07:25:32 +0000 (09:25 +0200)]
xen-blkfront: beyond ARRAY_SIZE of info->shadow
Do not go beyond ARRAY_SIZE of info->shadow Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Magnus Damm [Wed, 20 May 2009 14:34:43 +0000 (14:34 +0000)]
video: stop sh_mobile_lcdcfb only if started
This patch fixes the LCDC driver to avoid calling the
function sh_mobile_lcdc_start_stop(priv, 0) unless the
same function has been called before to start the LCDC
hardware.
Triggered when sh_mobile_lcdcfb.c failed to probe() due to
missing MSTP clocks.
Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Magnus Damm [Wed, 20 May 2009 14:30:06 +0000 (14:30 +0000)]
sh: ap325 camera without i2c driver fix
This patch fixes the ap325rxa ncm03j camera code to handle
the case where no i2c driver is present. Without this fix
i2c_transfer() may be passed NULL as adapter which results
in a crash.
Triggered when i2c-sh_mobile.c failed to probe() due to
missing MSTP clocks.
Signed-off-by: Magnus Damm <damm@igel.co.jp> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Steve French [Thu, 21 May 2009 22:21:53 +0000 (22:21 +0000)]
[CIFS] fix posix open regression
Posix open code was not properly adding the file to the
list of open files. Fix allocating cifsFileInfo
more than once, and adding twice to flist and tlist.
Also fix mode setting to be done in one place in these
paths.
Signed-off-by: Steve French <sfrench@us.ibm.com> Reviewed-by: Shirish Pargaonkar <shirishp@us.ibm.com> Tested-by: Jeff Layton <jlayton@redhat.com> Tested-by: Luca Tettamanti <kronos.it@gmail.com>
Linus Torvalds [Wed, 20 May 2009 23:40:24 +0000 (16:40 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/drm-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/drm-2.6:
drm: Copy back ioctl data to userspace regardless of return code.
drm: Round size of SHM maps to PAGE_SIZE
Linus Torvalds [Wed, 20 May 2009 23:32:19 +0000 (16:32 -0700)]
Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
MIPS: 64-bit: Fix system lockup.
MIPS: IP28: Change to build with -mr10k-cache-barrier=store
MIPS: IP22: Fix hang in power button interrupt handler
MIPS: IP32: Fix hang on shutdown in power button interrupt handler.
The second argument of the probe method points to the amba_id
structure, so it's better passed with the correct type. None of the
current in-tree drivers uses the pointer, so they have only been
checked for a clean compile.
Change suggested by Russell King.
Signed-off-by: Alessandro Rubini <rubini@unipv.it> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Michel Dänzer [Wed, 20 May 2009 11:32:00 +0000 (13:32 +0200)]
drm: Copy back ioctl data to userspace regardless of return code.
Fixes a regression from commit 9d5b3ffc42f7820e8ee07705496955e4c2c38dd9
('drm: fixup some of the ioctl function exit paths'): The vblank ioctl
needs to update the userspace parameters when interrupted by a signal,
which was prevented by the return code check. This could cause the X
server to hang in drmWaitVBlank().
Signed-off-by: Michel Dänzer <daenzer@vmware.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Greg Ungerer [Wed, 20 May 2009 06:12:32 +0000 (16:12 +1000)]
MIPS: 64-bit: Fix system lockup.
The address range size calculation inside local_flush_tlb_kernel_range()
is being truncated by a too small size variable holder on 64-bit systems.
The truncated size can result in an erroneous tlbsize check that means we
sit spinning inside a loop trying to flush a hige number of TLB entries.
This is for all intents and purposes a system hang. Fix by using an
appropriately sized valiable to hold the size.
[Ralf: Greg's original patch submission identified the issue and fixed one
instance in tlb-r4k.c but there there were several more. For consistency
I also modified tlb-r3k.c even though that file is only used on 32-bit.]
peter fuerst [Sun, 17 May 2009 21:49:45 +0000 (23:49 +0200)]
MIPS: IP28: Change to build with -mr10k-cache-barrier=store
Richard Sandiford's new code for inserting the cache-barriers, for GCC
4.3 and above and already incorporated in the current GCC-release, uses
a slightly different option-syntax.
Signed-off-by: peter fuerst <post@pfrst.de> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Sat, 16 May 2009 11:23:45 +0000 (12:23 +0100)]
MIPS: IP22: Fix hang in power button interrupt handler
The hang was caused by the use of disable_irq() from the interrupt handler
itself. Fixed by the use of disable_irq_nosync(). The issue was
triggered by:
MIPS: IP32: Fix hang on shutdown in power button interrupt handler.
The hang was caused by the use of disable_irq() from the interrupt handler
itself. Fixed by the use of disable_irq_nosync(). The issue was
triggered by:
Linus Torvalds [Wed, 20 May 2009 15:56:10 +0000 (08:56 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
cdrom: beyond ARRAY_SIZE of viocd_diskinfo
xen/blkfront: fix warning when deleting gendisk on unplug/shutdown
xen/blkfront: allow xenbus state transition to Closing->Closed when not Connected
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
ipv4: make default for INET_LRO consistent with help text
net: fix skb_seq_read returning wrong offset/length for page frag data
pkt_sched: gen_estimator: use 64 bit intermediate counters for bps
be2net: add two new pci device ids to pci device table
sch_teql: should not dereference skb after ndo_start_xmit()
tcp: fix MSG_PEEK race check
Doc: fixed descriptions on /proc/sys/net/core/* and /proc/sys/net/unix/*
Neterion: *FIFO1_DMA_ERR set twice, should 2nd be *FIFO2_DMA_ERR?
mv643xx_eth: fix PPC DMA breakage
bonding: fix link down handling in 802.3ad mode
bridge: fix initial packet flood if !STP
bridge: relay bridge multicast pkgs if !STP
NET: Meth: Fix unsafe mix of irq and non-irq spinlocks.
mlx4_en: Fix not deleted napi structures
ipconfig: handle case of delayed DHCP server
netpoll: don't dereference NULL dev from np
wimax/i2400m: fix device crash: fix optimization in _roq_queue_update_ws
Linus Torvalds [Wed, 20 May 2009 01:42:45 +0000 (18:42 -0700)]
Merge branch 'core/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
futex: setup writeable mapping for futex ops which modify user space data
Currently, userspace can fail to obtain the SAREA mapping (among other
reasons) if it passes SAREA_MAX to drmAddMap without aligning it to the
page size. This breaks for example on PowerPC with 64K pages and radeon
despite the kernel radeon actually doing the right rouding in the first
place.
The way SAREA_MAX is defined with a bunch of ifdef's and duplicated
between libdrm and the X server is gross, ultimately it should be
retrieved by userspace from the kernel, but in the meantime, we have
plenty of existing userspace built with bad values that need to work.
This patch works around broken userspace by rounding the requested size
in drm_addmap_core() of any SHM map to the page size. Since the backing
memory for SHM maps is also allocated within addmap_core, there is no
danger of adjacent memory being exposed due to the increased map size.
The only side effect is that drivers that previously tried to create or
access SHM maps using a size < PAGE_SIZE and failed (getting -EINVAL),
will now succeed at the cost of a little bit more memory used if that
happens to be when the map is created.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Eric Paris [Wed, 13 May 2009 16:50:40 +0000 (12:50 -0400)]
TPM: get_event_name stack corruption
get_event_name uses sprintf to fill a buffer declared on the stack. It fills
the buffer 2 bytes at a time. What the code doesn't take into account is that
sprintf(buf, "%02x", data) actually writes 3 bytes. 2 bytes for the data and
then it nul terminates the string. Since we declare buf to be 40 characters
long and then we write 40 bytes of data into buf sprintf is going to write 41
characters. The fix is to leave room in buf for the nul terminator.
Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
PCI PM: Fix initialization and kexec breakage for some devices
Recent PCI PM changes introduced a bug that causes some devices to be
mishandled after kexec and during early initialization. The failure
scenario in the kexec case is the following:
* Assume a PCI device is not power-manageable by the platform and has
PCI_PM_CTRL_NO_SOFT_RESET set in PMCSR.
* The device is put into D3 before kexec (using the native PCI PM).
* After kexec, pci_setup_device() sets the device's power state to
PCI_UNKNOWN.
* pci_set_power_state(dev, PCI_D0) is called by the device's driver.
* __pci_start_power_transition(dev, PCI_D0) is called and since the
device is not power-manageable by the platform, it causes
pci_update_current_state(dev, PCI_D0) to be called. As a result
the device's current_state field is updated to PCI_D3, in
accordance with the contents of its PCI PM registers.
* pci_raw_set_power_state() is called and it changes the device power
state to D0. *However*, it should also call pci_restore_bars() to
reinitialize the device, but it doesn't, because the device's
current_state field has been modified earlier.
To prevent this from happening, modify pci_platform_power_transition()
so that it doesn't use pci_update_current_state() to update the
current_state field for devices that aren't power-manageable by the
platform. Instead, this field should be updated directly for devices
that don't support the native PCI PM.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Thomas Gleixner [Mon, 18 May 2009 19:20:10 +0000 (21:20 +0200)]
futex: setup writeable mapping for futex ops which modify user space data
The futex code installs a read only mapping via get_user_pages_fast()
even if the futex op function has to modify user space data. The
eventual fault was fixed up by futex_handle_fault() which walked the
VMA with mmap_sem held.
After the cleanup patches which removed the mmap_sem dependency of the
futex code commit 4dc5b7a36a49eff97050894cf1b3a9a02523717 (futex:
clean up fault logic) removed the private VMA walk logic from the
futex code. This change results in a stale RO mapping which is not
fixed up.
Instead of reintroducing the previous fault logic we set up the
mapping in get_user_pages_fast() read/write for all operations which
modify user space data. Also handle private futexes in the same way
and make the current unconditional access_ok(VERIFY_WRITE) depend on
the futex op.
Reported-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> CC: stable@kernel.org
Mark Brown [Thu, 30 Apr 2009 13:48:36 +0000 (14:48 +0100)]
mfd: Keep a cache of WM8350 volatile values
Due to the way that the WM8350 audio driver handles CODEC_ENA many of
the WM8350 audio registers are marked as volatile when they aren't
actually so. Allow the audio driver to see a cache of these values for
inspection during interrupt context.
To do this we need to stop satisfying any bits from volatile registers
from cache - there's no real benefit from doing so anyway, we did the
read already.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Linus Torvalds [Tue, 19 May 2009 18:31:56 +0000 (11:31 -0700)]
Merge branch 'fixes-for-linus' of git://git.monstr.eu/linux-2.6-microblaze
* 'fixes-for-linus' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Fix kind-of-intr checking against number of interrupts
microblaze: Update Microblaze defconfig
Linus Torvalds [Tue, 19 May 2009 18:25:35 +0000 (11:25 -0700)]
Avoid ICE in get_random_int() with gcc-3.4.5
Martin Knoblauch reports that trying to build 2.6.30-rc6-git3 with
RHEL4.3 userspace (gcc (GCC) 3.4.5 20051201 (Red Hat 3.4.5-2)) causes an
internal compiler error (ICE):
and after some debugging it turns out that it's due to the code trying
to figure out the rough value of the current stack pointer by taking an
address of an uninitialized variable and casting that to an integer.
This is clearly a compiler bug, but it's not worth fighting - while the
current stack kernel pointer might be somewhat hard to predict in user
space, it's also not generally going to change for a lot of the call
chains for a particular process.
So just drop it, and mumble some incoherent curses at the compiler.
Tested-by: Martin Knoblauch <spamtrap@knobisoft.de> Cc: Matt Mackall <mpm@selenic.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Boaz Harrosh [Tue, 19 May 2009 17:54:09 +0000 (19:54 +0200)]
scsi_lib: remove unused variable
The last request completion cleanup in scsi_lib left an unused
this_count variable in scsi_io_completion().
(It was used before in a code segment that now uses blk_end_request_all())
Jeff Layton [Tue, 19 May 2009 13:57:03 +0000 (09:57 -0400)]
cifs: fix pointer initialization and checks in cifs_follow_symlink (try #4)
This is the third respin of the patch posted yesterday to fix the error
handling in cifs_follow_symlink. It also includes a fix for a bogus NULL
pointer check in CIFSSMBQueryUnixSymLink that Jeff Moyer spotted.
It's possible for CIFSSMBQueryUnixSymLink to return without setting
target_path to a valid pointer. If that happens then the current value
to which we're initializing this pointer could cause an oops when it's
kfree'd.
This patch is a little more comprehensive than the last patches. It
reorganizes cifs_follow_link a bit for (hopefully) better readability.
It should also eliminate the uneeded allocation of full_path on servers
without unix extensions (assuming they can get to this point anyway, of
which I'm not convinced).
On a side note, I'm not sure I agree with the logic of enabling this
query even when unix extensions are disabled on the client. It seems
like that should disable this as well. But, changing that is outside the
scope of this fix, so I've left it alone for now.
Reported-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Christoph Hellwig <hch@inraded.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
Boaz Harrosh [Sun, 17 May 2009 16:00:01 +0000 (19:00 +0300)]
block: Un-export blk_rq_append_bio
OSD was the last in-tree user of blk_rq_append_bio(). Now
that it is fixed blk_rq_append_bio is un-exported and
is only used internally by block layer.
Miklos Szeredi [Tue, 19 May 2009 09:37:46 +0000 (11:37 +0200)]
splice: fix kmaps in default_file_splice_write()
Unfortunately multiple kmap() within a single thread are deadlockable,
so writing out multiple buffers with writev() isn't possible.
Change the implementation so that it does a separate write() for each
buffer. This actually simplifies the code a lot since the
splice_from_pipe() helper can be used.
This limitation is caused by HIGHMEM pages, and so only affects a
subset of architectures and configurations. In the future it may be
worth to implement default_file_splice_write() in a more efficient way
on configs that allow it.
Tejun Heo [Tue, 19 May 2009 09:33:06 +0000 (18:33 +0900)]
bio: always copy back data for copied kernel requests
When a read bio_copy_kern() request fails, the content of the bounce
buffer is not copied back. However, as request failure doesn't
necessarily mean complete failure, the buffer state can be useful.
This behavior is also inconsistent with the user map counterpart and
causes the subtle difference between bounced and unbounced IO causes
confusion.
This patch makes bio_copy_kern_endio() ignore @err and always copy
back data on request completion.
Tejun Heo [Tue, 19 May 2009 09:33:05 +0000 (18:33 +0900)]
block: set rq->resid_len to blk_rq_bytes() on issue
In commit c3a4d78c580de4edc9ef0f7c59812fb02ceb037f, while introducing
rq->resid_len, the default value of residue count was changed from
full count to zero. The conversion was done under the assumption that
when a request fails residue count wasn't defined. However, Boaz and
James pointed out that this wasn't true and the residue count should
be preserved for failed requests too.
This patchset restores the original behavior by setting rq->resid_len
to blk_rq_bytes(rq) on request start and restoring explicit clearing
in affected drivers. While at it, take advantage of the fact that
rq->resid_len is set to full count where applicable.
* ide-cd: rq->resid_len cleared on pc success
* mptsas: req->resid_len cleared on success
* sas_expander: rsp/req->resid_len cleared on success
* mpt2sas_transport: req->resid_len cleared on success
* ide-cd, ide-tape, mptsas, sas_host_smp, mpt2sas_transport, ub: take
advantage of initial full count to simplify code
Boaz Harrosh spotted bug in resid_len initialization. Fixed as
suggested.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Borislav Petkov <petkovbb@googlemail.com> Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Tue, 19 May 2009 09:33:04 +0000 (18:33 +0900)]
ub: use __blk_end_request_all()
ub_end_rq() always tries to complete full request. The @cmd_len
parameter was there because rq->data_len used to be overwritten with
residue count. Drop @cmd_len and use __blk_end_request_all().
We can fix this by calling del_gendisk() later in blkfront_closing, after
releasing blkif_io_lock. Since the queue is stopped during the interrupts
disabled phase I don't think there is any danger of an event occuring between
releasing the blkif_io_lock and deleting the disk.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Ian Campbell [Tue, 19 May 2009 06:25:48 +0000 (08:25 +0200)]
xen/blkfront: allow xenbus state transition to Closing->Closed when not Connected
This situation can occur when attempting to attach a block device whose
backend is an empty physical CD-ROM driver. The backend in this case
will go directly from the Initialising state to Closing->Closed.
Previously this would result in a NULL pointer deref on info->gd
(xenbus_dev_fatal does not return as a1a15ac5 seems to expect)
Cc: stable@kernel.org Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Frans Pop [Tue, 19 May 2009 04:48:38 +0000 (21:48 -0700)]
ipv4: make default for INET_LRO consistent with help text
Commit e81963b1 ("ipv4: Make INET_LRO a bool instead of tristate.")
changed this config from tristate to bool. Add default so that it is
consistent with the help text.
Signed-off-by: Frans Pop <elendil@planet.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Chenault [Tue, 19 May 2009 04:43:27 +0000 (21:43 -0700)]
net: fix skb_seq_read returning wrong offset/length for page frag data
When called with a consumed value that is less than skb_headlen(skb)
bytes into a page frag, skb_seq_read() incorrectly returns an
offset/length relative to skb->data. Ensure that data which should come
from a page frag does.
Signed-off-by: Thomas Chenault <thomas_chenault@dell.com> Tested-by: Shyam Iyer <shyam_iyer@dell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Frank Filz [Mon, 18 May 2009 21:41:40 +0000 (17:41 -0400)]
nfs: Fix NFS v4 client handling of MAY_EXEC in nfs_permission.
The problem is that permission checking is skipped if atomic open is
possible, but when exec opens a file, it just opens it O_READONLY which
means EXEC permission will not be checked at that time.
This problem is observed by the following sequence (executed as root):
mount -t nfs4 server:/ /mnt4
echo "ls" >/mnt4/foo
chmod 744 /mnt4/foo
su guest -c "mnt4/foo"
Eric Dumazet [Tue, 19 May 2009 02:26:37 +0000 (19:26 -0700)]
pkt_sched: gen_estimator: use 64 bit intermediate counters for bps
gen_estimator can overflow bps (bytes per second) with Gb links, while
it was designed with a u32 API, with a theorical limit of 34360Mbit
(2^32 bytes)
Using 64 bit intermediate avbps/brate counters can allow us to reach
this theorical limit.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Sun, 10 May 2009 20:32:34 +0000 (20:32 +0000)]
tcp: fix MSG_PEEK race check
Commit 518a09ef11 (tcp: Fix recvmsg MSG_PEEK influence of
blocking behavior) lets the loop run longer than the race check
did previously expect, so we need to be more careful with this
check and consider the work we have been doing.
I tried my best to deal with urg hole madness too which happens
here:
if (!sock_flag(sk, SOCK_URGINLINE)) {
++*seq;
...
by using additional offset by one but I certainly have very
little interest in testing that part.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: Frans Pop <elendil@planet.nl> Tested-by: Ian Zimmermann <itz@buug.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 18 May 2009 17:22:04 +0000 (10:22 -0700)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Explicit alignment for .data.cacheline_aligned
powerpc/ps3: Update ps3_defconfig
powerpc/ftrace: Fix constraint to be early clobber
powerpc/ftrace: Use pr_devel() in ftrace.c
powerpc: Do not assert pte_locked for hugepage PTE entries
Linus Torvalds [Mon, 18 May 2009 17:11:06 +0000 (10:11 -0700)]
Merge branches 'sched-fixes-for-linus-2' and 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix fallback sched_clock()'s offset when using jiffies
* 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
lockdep: increase MAX_LOCKDEP_ENTRIES and MAX_LOCKDEP_CHAINS
Linus Torvalds [Mon, 18 May 2009 16:15:41 +0000 (09:15 -0700)]
Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: Append prompt in /debug/tracing/README file
x86/function-graph: fix constraint for recording old return value
David Woodhouse [Mon, 18 May 2009 12:07:35 +0000 (13:07 +0100)]
Fix oops on close of hot-unplugged FTDI serial converter
Commit c45d6320 ("fix reference counting of ftdi_private") stopped
ftdi_sio_port_remove() from directly freeing the port-private data, with
the intention if the port was still open, it would be freed when
ftdi_close() is eventually called and releases the last refcount on the
structure.
That's all very well, but ftdi_sio_port_remove() still contains a call
to usb_set_serial_port_data(port, NULL) -- so by the time we get to
ftdi_close() for the port which was unplugged, it _still_ oopses on
dereferencing that NULL pointer, as it did before (and does in 2.6.29).
The fix is just not to clear the private data in ftdi_sio_port_remove().
Then the refcount is properly reduced to zero when the final kref_put()
happens in ftdi_close().
Remove a bogus comment too, while we're at it. And stop doing things
inside "if (priv)" -- it must _always_ be there.
Based loosely on an earlier patch by Daniel Mack, and suggestions by
Alan Stern.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Tested-by: Daniel Mack <daniel@caiaq.de> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Korsgaard [Mon, 18 May 2009 10:13:54 +0000 (11:13 +0100)]
mtd_dataflash: unbreak erase support
Commit 5b7f3a50 (fix dataflash 64-bit divisions) unfortunately
introduced a typo. Erase addr and len were swapped in the pageaddr
calculation, causing the wrong sectors to get erased.
Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Acked-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>