Tejun Heo [Wed, 8 Dec 2010 19:57:42 +0000 (20:57 +0100)]
sd: implement sd_check_events()
Replace sd_media_change() with sd_check_events(). sd used to set the
changed state whenever the device is not ready, which can cause event
loop while the device is not ready. Media presence handling code is
changed such that the changed state is set iff the media presence
actually changes. UA still always sets the changed state and
NOT_READY always (at least where it used to set ->changed) clears
media presence, so no event is lost.
Tejun Heo [Thu, 16 Dec 2010 16:52:17 +0000 (17:52 +0100)]
sr: implement sr_check_events()
Replace sr_media_change() with sr_check_events(). It normally only
uses GET_EVENT_STATUS_NOTIFICATION to check both media change and
eject request. If @clearing includes DISK_EVENT_MEDIA_CHANGE, it
issues TUR and compares whether media presence has changed. The SCSI
specific media change uevent is kept for compatibility.
sr_media_change() was doing both media change check and revalidation.
The revalidation part is split into sr_block_revalidate_disk().
Tejun Heo [Wed, 8 Dec 2010 19:57:40 +0000 (20:57 +0100)]
scsi: replace sr_test_unit_ready() with scsi_test_unit_ready()
The usage of TUR has been confusing involving several different
commits updating different parts over time. Currently, the only
differences between scsi_test_unit_ready() and sr_test_unit_ready()
are,
* scsi_test_unit_ready() also sets sdev->changed on NOT_READY.
* scsi_test_unit_ready() returns 0 if TUR ended with UNIT_ATTENTION or
NOT_READY.
Due to the above two differences, sr is using its own
sr_test_unit_ready(), but sd - the sole user of the above extra
handling - doesn't even need them.
Where scsi_test_unit_ready() is used in sd_media_changed(), the code
is looking for device ready w/ media present state which is true iff
TUR succeeds w/o sense data or UA, and when the device is not ready
for whatever reason sd_media_changed() explicitly marks media as
missing so there's no reason to set sdev->changed automatically from
scsi_test_unit_ready() on NOT_READY.
Drop both special handlings from scsi_test_unit_ready(), which makes
it equivalant to sr_test_unit_ready(), and replace
sr_test_unit_ready() with scsi_test_unit_ready(). Also, drop the
unnecessary explicit NOT_READY check from sd_media_changed().
Checking return value is enough for testing device readiness.
Tejun Heo [Thu, 9 Dec 2010 10:18:42 +0000 (11:18 +0100)]
scsi: fix TUR error handling in sr_media_change()
sr_test_unit_ready() returns 0 iff TUR succeeded - IOW, when media is
present and the device is actually ready, so the return value wouldn't
be zero when TUR ends with sense data. sr_media_change() incorrectly
tests (retval || (scsi_sense_valid(sshdr)...)) when it tries to test
whether TUR failed without sense data or with sense data indicating
media-not-present.
Fix the test using scsi_status_is_good() and update comments.
Tejun Heo [Wed, 8 Dec 2010 19:57:38 +0000 (20:57 +0100)]
cdrom: add ->check_events() support
In principle, cdrom just needs to pass through ->check_events() but
CDROM_MEDIA_CHANGED ioctl makes things a bit more complex. Just as
with ->media_changed() support, cdrom code needs to buffer the events
and serve them to ioctl and vfs as requested.
As the code has to deal with both ->check_events() and
->media_changed(), and vfs and ioctl event buffering, this patch adds
check_events caching on top of the existing cdi->mc_flags buffering.
It may be a good idea to deprecate CDROM_MEDIA_CHANGED ioctl and
remove all this mess.
Tejun Heo [Wed, 8 Dec 2010 19:57:37 +0000 (20:57 +0100)]
implement in-kernel gendisk events handling
Currently, media presence polling for removeable block devices is done
from userland. There are several issues with this.
* Polling is done by periodically opening the device. For SCSI
devices, the command sequence generated by such action involves a
few different commands including TEST_UNIT_READY. This behavior,
while perfectly legal, is different from Windows which only issues
single command, GET_EVENT_STATUS_NOTIFICATION. Unfortunately, some
ATAPI devices lock up after being periodically queried such command
sequences.
* There is no reliable and unintrusive way for a userland program to
tell whether the target device is safe for media presence polling.
For example, polling for media presence during an on-going burning
session can make it fail. The polling program can avoid this by
opening the device with O_EXCL but then it risks making a valid
exclusive user of the device fail w/ -EBUSY.
* Userland polling is unnecessarily heavy and in-kernel implementation
is lighter and better coordinated (workqueue, timer slack).
This patch implements framework for in-kernel disk event handling,
which includes media presence polling.
* bdops->check_events() is added, which supercedes ->media_changed().
It should check whether there's any pending event and return if so.
Currently, two events are defined - DISK_EVENT_MEDIA_CHANGE and
DISK_EVENT_EJECT_REQUEST. ->check_events() is guaranteed not to be
called parallelly.
* gendisk->events and ->async_events are added. These should be
initialized by block driver before passing the device to add_disk().
The former contains the mask of all supported events and the latter
the mask of all events which the device can report without polling.
/sys/block/*/events[_async] export these to userland.
* Kernel parameter block.events_dfl_poll_msecs controls the system
polling interval (default is 0 which means disable) and
/sys/block/*/events_poll_msecs control polling intervals for
individual devices (default is -1 meaning use system setting). Note
that if a device can report all supported events asynchronously and
its polling interval isn't explicitly set, the device won't be
polled regardless of the system polling interval.
* If a device is opened exclusively with write access, event checking
is automatically disabled until all write exclusive accesses are
released.
* There are event 'clearing' events. For example, both of currently
defined events are cleared after the device has been successfully
opened. This information is passed to ->check_events() callback
using @clearing argument as a hint.
* Event checking is always performed from system_nrt_wq and timer
slack is set to 25% for polling.
* Nothing changes for drivers which implement ->media_changed() but
not ->check_events(). Going forward, all drivers will be converted
to ->check_events() and ->media_change() will be dropped.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Tejun Heo [Wed, 8 Dec 2010 19:57:36 +0000 (20:57 +0100)]
block: move register_disk() and del_gendisk() to block/genhd.c
There's no reason for register_disk() and del_gendisk() to be in
fs/partitions/check.c. Move both to genhd.c. While at it, collapse
unlink_gendisk(), which was artificially in a separate function due to
genhd.c / check.c split, into del_gendisk().
block cfq: select new workload if priority changed
If priority is changed, continuing to check workload_expires and service tree
count of the previous workload does not make sense. We should always choose
the workload with lowest key of new priority in such case.
Signed-off-by: Shaohua Li <shaohua.li@intel.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Eric Paris [Mon, 15 Nov 2010 23:36:29 +0000 (18:36 -0500)]
capabilities/syslog: open code cap_syslog logic to fix build failure
The addition of CONFIG_SECURITY_DMESG_RESTRICT resulted in a build
failure when CONFIG_PRINTK=n. This is because the capabilities code
which used the new option was built even though the variable in question
didn't exist.
The patch here fixes this by moving the capabilities checks out of the
LSM and into the caller. All (known) LSMs should have been calling the
capabilities hook already so it actually makes the code organization
better to eliminate the hook altogether.
Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 15 Nov 2010 22:06:11 +0000 (14:06 -0800)]
Merge branch 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
arm: omap1: devices: need to return with a value
OMAP1: camera.h: add missing include
omap: dma: Add read-back to DMA interrupt handler to avoid spuriousinterrupts
OMAP2: Devkit8000: Fix mmc regulator failure
Linus Torvalds [Mon, 15 Nov 2010 22:05:44 +0000 (14:05 -0800)]
Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
hwmon: (w83795) Check for BEEP pin availability
hwmon: (w83795) Clear intrusion alarm immediately
hwmon: (w83795) Read the intrusion state properly
hwmon: (w83795) Print the actual temperature channels as sources
hwmon: (w83795) List all usable temperature sources
hwmon: (w83795) Expose fan control method
hwmon: (w83795) Fix fan control mode attributes
hwmon: (lm95241) Check validity of input values
hwmon: Change mail address of Hans J. Koch
Linus Torvalds [Mon, 15 Nov 2010 22:01:33 +0000 (14:01 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: sysfs: fix printk warnings
PCI: fix pci_bus_alloc_resource() hang, prefer positive decode
PCI: read current power state at enable time
PCI: fix size checks for mmap() on /proc/bus/pci files
x86/PCI: coalesce overlapping host bridge windows
PCI hotplug: ibmphp: Add check to prevent reading beyond mapped area
Jean Delvare [Mon, 15 Nov 2010 21:40:38 +0000 (22:40 +0100)]
i2c: Delete unused adapter IDs
Delete unused I2C adapter IDs. Special cases are:
* I2C_HW_B_RIVA was still set in driver rivafb, however no other
driver is ever looking for this value, so we can safely remove it.
* I2C_HW_B_HDPVR is used in staging driver lirc_zilog, however no
adapter ID is ever set to this value, so the code in question never
runs. As the code additionally expects that I2C_HW_B_HDPVR may not
be defined, we can delete it now and let the lirc_zilog driver
maintainer rewrite this piece of code.
Big thanks for Hans Verkuil for doing all the hard work :)
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Jarod Wilson <jarod@redhat.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Hans Verkuil <hverkuil@xs4all.nl>
Wolfram Sang [Mon, 15 Nov 2010 21:40:38 +0000 (22:40 +0100)]
i2c: Remove obsolete cleanup for clientdata
A few new i2c-drivers came into the kernel which clear the clientdata-pointer
on exit. This is obsolete meanwhile, so fix it and hope the word will spread.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Acked-by: Alan Cox <alan@linux.intel.com> Acked-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Jean Delvare <khali@linux-fr.org>
Linus Torvalds [Mon, 15 Nov 2010 21:37:37 +0000 (13:37 -0800)]
include/linux/kernel.h: Move logging bits to include/linux/printk.h
Move the logging bits from kernel.h into printk.h so that
there is a bit more logical separation of the generic from
the printk logging specific parts.
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The fix in commit 6b4e81db2552 ("i8k: Tell gcc that *regs gets
clobbered") to work around the gcc miscompiling i8k.c to add "+m
(*regs)" caused register pressure problems and a build failure.
Changing the 'asm' statement to 'asm volatile' instead should prevent
that and works around the gcc bug as well, so we can remove the "+m".
[ Background on the gcc bug: a memory clobber fails to mark the function
the asm resides in as non-pure (aka "__attribute__((const))"), so if
the function does nothing else that triggers the non-pure logic, gcc
will think that that function has no side effects at all. As a result,
callers will be mis-compiled.
Adding the "+m" made gcc see that it's not a pure function, and so
does "asm volatile". The problem was never really the need to mark
"*regs" as changed, since the memory clobber did that part - the
problem was just a bug in the gcc "pure" function analysis - Linus ]
Signed-off-by: Jim Bos <jim876@xs4all.nl> Acked-by: Jakub Jelinek <jakub@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jean Delvare [Mon, 15 Nov 2010 20:38:57 +0000 (21:38 +0100)]
hwmon: (w83795) Check for BEEP pin availability
On the W83795ADG, there's a single pin for BEEP and OVT#, so you
can't have both. Check the configuration and don't create beep
attributes when BEEP pin is not available.
The W83795G has a dedicated BEEP pin so the functionality is always
available there.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Jean Delvare [Mon, 15 Nov 2010 20:38:57 +0000 (21:38 +0100)]
hwmon: (w83795) Clear intrusion alarm immediately
When asked to clear the intrusion alarm, do so immediately. We have to
invalidate the cache to make sure the new status will be read. But we
also have to read from the status register once to clear the pending
alarm, as writing to CLR_CHS surprising won't clear it automatically.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Jean Delvare [Mon, 15 Nov 2010 20:38:56 +0000 (21:38 +0100)]
hwmon: (w83795) Read the intrusion state properly
We can't read the intrusion state from the real-time alarm registers
as we do for all other alarm flags, because real-time alarm bits don't
stick (by definition) and the intrusion state has to stick until
explicitly cleared (otherwise it has little value.)
So we have to use the interrupt status register instead, which is read
from the same address but with a configuration bit flipped in another
register.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Jean Delvare [Mon, 15 Nov 2010 20:38:56 +0000 (21:38 +0100)]
hwmon: (w83795) List all usable temperature sources
Temperature sources are not correlated directly with temperature
channels. A look-up table is required to find out which temperature
sources can be used depending on which temperature channels (both
analog and digital) are enabled.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Jean Delvare [Mon, 15 Nov 2010 20:38:56 +0000 (21:38 +0100)]
hwmon: (w83795) Expose fan control method
Expose fan control method (DC vs. PWM) using the standard sysfs
attributes. I've made it read-only as the board should be wired for
a given mode, the BIOS should have set up the chip for this mode, and
you shouldn't have to change it. But it would be easy enough to make
it changeable if someone comes up with a use case.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Jean Delvare [Mon, 15 Nov 2010 20:38:56 +0000 (21:38 +0100)]
hwmon: (w83795) Fix fan control mode attributes
There were two bugs:
* Speed cruise mode was improperly reported for all fans but fan1.
* Fan control method (PWM vs. DC) was mixed with the control mode.
It will be added back as a separate attribute, as per the standard
sysfs interface.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Guenter Roeck <guenter.roeck@ericsson.com>
Jean Delvare [Mon, 15 Nov 2010 20:38:56 +0000 (21:38 +0100)]
hwmon: (lm95241) Check validity of input values
This clears the following build-time warnings I was seeing:
drivers/hwmon/lm95241.c: In function "set_interval":
drivers/hwmon/lm95241.c:132:15: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_max2":
drivers/hwmon/lm95241.c:278:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_max1":
drivers/hwmon/lm95241.c:277:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_min2":
drivers/hwmon/lm95241.c:249:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_min1":
drivers/hwmon/lm95241.c:248:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_type2":
drivers/hwmon/lm95241.c:220:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
drivers/hwmon/lm95241.c: In function "set_type1":
drivers/hwmon/lm95241.c:219:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
This also fixes a small race in set_interval() as a side effect: by
working with a temporary local variable we prevent data->interval from
being accessed at a time it contains the interval value in the wrong
unit.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Davide Rizzo <elpa.rizzo@gmail.com>
Vivek Goyal [Mon, 15 Nov 2010 18:37:36 +0000 (19:37 +0100)]
blk-cgroup: Allow creation of hierarchical cgroups
o Allow hierarchical cgroup creation for blkio controller
o Currently we disallow it as both the io controller policies (throttling
as well as proportion bandwidth) do not support hierarhical accounting
and control. But the flip side is that blkio controller can not be used with
libvirt as libvirt creates a cgroup hierarchy deeper than 1 level.
o So this patch will allow creation of cgroup hierarhcy but at the backend
everything will be treated as flat. So if somebody created a an hierarchy
like as follows.
root
/ \
test1 test2
|
test3
CFQ and throttling will practically treat all groups at same level.
pivot
/ | \ \
root test1 test2 test3
o Once we have actual support for hierarchical accounting and control
then we can introduce another cgroup tunable file "blkio.use_hierarchy"
which will be 0 by default but if user wants to enforce hierarhical
control then it can be set to 1. This way there should not be any
ABI problems down the line.
o The only not so pretty part is introduction of extra file "use_hierarchy"
down the line. Kame-san had mentioned that hierarhical accounting is
expensive in memory controller hence they keep it off by default. I
suspect same will be the case for IO controller also as for each IO
completion we shall have to account IO through hierarchy up to the root.
if yes, then it probably is not a very bad idea to introduce this extra
file so that it will be used only when somebody needs it and some people
might enable hierarchy only in part of the hierarchy.
o This is how basically memory controller also uses "use_hierarhcy" and
they also allowed creation of hierarchies when actual backend support
was not available.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Reviewed-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Reviewed-by: Ciju Rajan K <ciju@linux.vnet.ibm.com> Tested-by: Ciju Rajan K <ciju@linux.vnet.ibm.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Randy Dunlap [Sat, 13 Nov 2010 16:44:33 +0000 (08:44 -0800)]
PCI: sysfs: fix printk warnings
Cast pci_resource_start() and pci_resource_len() to u64 for printk.
drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 9 has type 'resource_size_t'
drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 10 has type 'resource_size_t'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Linus Torvalds [Mon, 15 Nov 2010 16:42:07 +0000 (08:42 -0800)]
Merge branch 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6
* 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6:
fsl-diu-fb: drop dead ioctl define
MAINTAINERS: Add an fbdev git tree entry.
OMAP: DSS: Fix documentation regarding 'vram' kernel parameter
OMAP: VRAM: Fix boot-time memory allocation
OMAP: VRAM: improve VRAM error prints
sisfb: limit POST memory test according to PCI resource length
fbdev: sh_mobile_lcdc: use correct number of modes, when using the default
fbdev: sh_mobile_lcdc: use the standard CEA-861 720p timing
fbdev: sh_mobile_hdmi: properly clean up modedb on monitor unplug
Linus Torvalds [Mon, 15 Nov 2010 16:41:30 +0000 (08:41 -0800)]
Merge branches 'sh-fixes-for-linus' and 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* 'sh-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: intc: Fix up build failure introduced by radix tree changes.
MAINTAINERS: update the sh git tree entry.
sh: clkfwk: fix up compiler warnings.
sh: intc: Fix up initializers for gcc 4.5.
rtc: rtc-sh - fix a memory leak
* 'rmobile-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
ARM: mach-shmobile: ap4evb: add fsib 44100Hz rate
MAINTAINERS: update the ARM SH-Mobile git tree entry.
ARM: mach-shmobile: ap4evb: Mark NOR boot loader partitions read-only.
ARM: mach-shmobile: intc-sh7372: fix interrupt number
This area of the code has always been a bit delicate due to the
subtleties of lock ordering. The problem is that for "normal"
alloc/dealloc, we always grab the inode locks first and the rgrp lock
later.
In order to ensure no races in looking up the unlinked, but still
allocated inodes, we need to hold the rgrp lock when we do the lookup,
which means that we can't take the inode glock.
The solution is to borrow the technique already used by NFS to solve
what is essentially the same problem (given an inode number, look up
the inode carefully, checking that it really is in the expected
state).
We cannot do that directly from the allocation code (lock ordering
again) so we give the job to the pre-existing delete workqueue and
carry on with the allocation as normal.
If we find there is no space, we do a journal flush (required anyway
if space from a deallocation is to be released) which should block
against the pending deallocations, so we should always get the space
back.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
The fix is to:
a) grab rcu lock in sys_ioprio_{set,get}() and
b) avoid grabbing tasklist_lock.
Discussion in: http://marc.info/?l=linux-kernel&m=128951324702889
Signed-off-by: Greg Thelen <gthelen@google.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Modified by Jens to remove the now redundant inner rcu lock and
unlock since they are now protected by the outer lock.
Paul Mundt [Mon, 15 Nov 2010 05:30:30 +0000 (14:30 +0900)]
sh: intc: Fix up build failure introduced by radix tree changes.
The radix tree retry logic got a bit of an overhaul and subsequently
broke the virtual IRQ subgroup build. Simply switch over to
radix_tree_deref_retry() as per the filemap changes, which the virq
lookup logic was modelled after in the first place.
Pavel Emelyanov [Thu, 28 Oct 2010 09:50:37 +0000 (13:50 +0400)]
slub: Fix slub_lock down/up imbalance
There are two places, that do not release the slub_lock.
Respective bugs were introduced by sysfs changes ab4d5ed5 (slub: Enable
sysfs support for !CONFIG_SLUB_DEBUG) and 2bce6485 ( slub: Allow removal
of slab caches during boot).
Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Pekka Enberg <penberg@kernel.org>
Jim Bos [Sat, 13 Nov 2010 11:13:53 +0000 (12:13 +0100)]
i8k: Tell gcc that *regs gets clobbered
More recent GCC caused the i8k driver to stop working, on Slackware
compiler was upgraded from gcc-4.4.4 to gcc-4.5.1 after which it didn't
work anymore, meaning the driver didn't load or gave total nonsensical
output.
As it turned out the asm(..) statement forgot to mention it modifies the
*regs variable.
Credits to Andi Kleen and Andreas Schwab for providing the fix.
Signed-off-by: Jim Bos <jim876@xs4all.nl> Cc: Andi Kleen <andi@firstfloor.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: stable@kernel.org (for 2.6.36) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tao Ma [Sat, 13 Nov 2010 08:22:02 +0000 (16:22 +0800)]
ocfs2: Change some lock status member in ocfs2_lock_res to char.
Commit 83fd9c7 changes l_level, l_requested and l_blocking of
ocfs2_lock_res from int to unsigned char. But actually it is
initially as -1(ocfs2_lock_res_init_common) which
correspoding to 255 for unsigned char. So the whole dlm lock
mechanism doesn't work now which means a disaster to ocfs2.
Cc: Goldwyn Rodrigues <rgoldwyn@suse.de> Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
Tejun Heo [Sat, 13 Nov 2010 10:55:18 +0000 (11:55 +0100)]
block: clean up blkdev_get() wrappers and their users
After recent blkdev_get() modifications, open_by_devnum() and
open_bdev_exclusive() are simple wrappers around blkdev_get().
Replace them with blkdev_get_by_dev() and blkdev_get_by_path().
blkdev_get_by_dev() is identical to open_by_devnum().
blkdev_get_by_path() is slightly different in that it doesn't
automatically add %FMODE_EXCL to @mode.
All users are converted. Most conversions are mechanical and don't
introduce any behavior difference. There are several exceptions.
* btrfs now sets FMODE_EXCL in btrfs_device->mode, so there's no
reason to OR it explicitly on blkdev_put().
* gfs2, nilfs2 and the generic mount_bdev() now set FMODE_EXCL in
sb->s_mode.
* With the above changes, sb->s_mode now always should contain
FMODE_EXCL. WARN_ON_ONCE() added to kill_block_super() to detect
errors.
The new blkdev_get_*() functions are with proper docbook comments.
While at it, add function description to blkdev_get() too.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Neil Brown <neilb@suse.de> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Joern Engel <joern@lazybastard.org> Cc: Chris Mason <chris.mason@oracle.com> Cc: Jan Kara <jack@suse.cz> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: reiserfs-devel@vger.kernel.org Cc: xfs-masters@oss.sgi.com Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Tejun Heo [Sat, 13 Nov 2010 10:55:17 +0000 (11:55 +0100)]
block: check bdev_read_only() from blkdev_get()
bdev read-only status can be queried using bdev_read_only() and may
change while the device is being opened. Enforce it by checking it
from blkdev_get() after open succeeds.
This makes bdev_read_only() check in open_bdev_exclusive() and
fsg_lun_open() unnecessary. Drop them.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: David Brownell <dbrownell@users.sourceforge.net> Cc: linux-usb@vger.kernel.org
Tejun Heo [Sat, 13 Nov 2010 10:55:17 +0000 (11:55 +0100)]
block: reorganize claim/release implementation
With claim/release rolled into blkdev_get/put(), there's no reason to
keep bd_abort/finish_claim(), __bd_claim() and bd_release() as
separate functions. It only makes the code difficult to follow.
Collapse them into blkdev_get/put(). This will ease future changes
around claim/release.
Tejun Heo [Sat, 13 Nov 2010 10:55:17 +0000 (11:55 +0100)]
block: make blkdev_get/put() handle exclusive access
Over time, block layer has accumulated a set of APIs dealing with bdev
open, close, claim and release.
* blkdev_get/put() are the primary open and close functions.
* bd_claim/release() deal with exclusive open.
* open/close_bdev_exclusive() are combination of open and claim and
the other way around, respectively.
* bd_link/unlink_disk_holder() to create and remove holder/slave
symlinks.
* open_by_devnum() wraps bdget() + blkdev_get().
The interface is a bit confusing and the decoupling of open and claim
makes it impossible to properly guarantee exclusive access as
in-kernel open + claim sequence can disturb the existing exclusive
open even before the block layer knows the current open if for another
exclusive access. Reorganize the interface such that,
* blkdev_get() is extended to include exclusive access management.
@holder argument is added and, if is @FMODE_EXCL specified, it will
gain exclusive access atomically w.r.t. other exclusive accesses.
* blkdev_put() is similarly extended. It now takes @mode argument and
if @FMODE_EXCL is set, it releases an exclusive access. Also, when
the last exclusive claim is released, the holder/slave symlinks are
removed automatically.
* bd_claim/release() and close_bdev_exclusive() are no longer
necessary and either made static or removed.
* bd_link_disk_holder() remains the same but bd_unlink_disk_holder()
is no longer necessary and removed.
* open_bdev_exclusive() becomes a simple wrapper around lookup_bdev()
and blkdev_get(). It also has an unexpected extra bdev_read_only()
test which probably should be moved into blkdev_get().
* open_by_devnum() is modified to take @holder argument and pass it to
blkdev_get().
Most of bdev open/close operations are unified into blkdev_get/put()
and most exclusive accesses are tested atomically at the open time (as
it should). This cleans up code and removes some, both valid and
invalid, but unnecessary all the same, corner cases.
open_bdev_exclusive() and open_by_devnum() can use further cleanup -
rename to blkdev_get_by_path() and blkdev_get_by_devt() and drop
special features. Well, let's leave them for another day.
Most conversions are straight-forward. drbd conversion is a bit more
involved as there was some reordering, but the logic should stay the
same.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Neil Brown <neilb@suse.de> Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Philipp Reisner <philipp.reisner@linbit.com> Cc: Peter Osterlund <petero2@telia.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jan Kara <jack@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Alex Elder <aelder@sgi.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: dm-devel@redhat.com Cc: drbd-dev@lists.linbit.com Cc: Leo Chen <leochen@broadcom.com> Cc: Scott Branden <sbranden@broadcom.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: Joern Engel <joern@logfs.org> Cc: reiserfs-devel@vger.kernel.org Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Tejun Heo [Sat, 13 Nov 2010 10:55:17 +0000 (11:55 +0100)]
block: simplify holder symlink handling
Code to manage symlinks in /sys/block/*/{holders|slaves} are overly
complex with multiple holder considerations, redundant extra
references to all involved kobjects, unused generic kobject holder
support and unnecessary mixup with bd_claim/release functionalities.
Strip it down to what's necessary (single gendisk holder) and make it
use a separate interface. This is a step for cleaning up
bd_claim/release. This patch makes dm-table slightly more complex but
it will be simplified again with further changes.
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Neil Brown <neilb@suse.de> Acked-by: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com
Tejun Heo [Sat, 13 Nov 2010 10:55:17 +0000 (11:55 +0100)]
btrfs: close_bdev_exclusive() should use the same @flags as the matching open_bdev_exclusive()
In the failure path of __btrfs_open_devices(), close_bdev_exclusive()
is called with @flags which doesn't match the one used during
open_bdev_exclusive(). Fix it.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
[kgene.kim@samsung.com: Added fix same warning(mach-s3c64xx/Kconfig)] Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (66 commits)
can-bcm: fix minor heap overflow
gianfar: Do not call device_set_wakeup_enable() under a spinlock
ipv6: Warn users if maximum number of routes is reached.
docs: Add neigh/gc_thresh3 and route/max_size documentation.
axnet_cs: fix resume problem for some Ax88790 chip
ipv6: addrconf: don't remove address state on ifdown if the address is being kept
tcp: Don't change unlocked socket state in tcp_v4_err().
x25: Prevent crashing when parsing bad X.25 facilities
cxgb4vf: add call to Firmware to reset VF State.
cxgb4vf: Fail open if link_start() fails.
cxgb4vf: flesh out PCI Device ID Table ...
cxgb4vf: fix some errors in Gather List to skb conversion
cxgb4vf: fix bug in Generic Receive Offload
cxgb4vf: don't implement trivial (and incorrect) ndo_select_queue()
ixgbe: Look inside vlan when determining offload protocol.
bnx2x: Look inside vlan when determining checksum proto.
vlan: Add function to retrieve EtherType from vlan packets.
virtio-net: init link state correctly
ucc_geth: Fix deadlock
ucc_geth: Do not bring the whole IF down when TX failure.
...
Linus Torvalds [Sat, 13 Nov 2010 01:13:28 +0000 (17:13 -0800)]
Merge branch 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (28 commits)
Revert "USB: xhci: Use GFP_ATOMIC under spin_lock"
USB: ohci-jz4740: Fix spelling in MODULE_ALIAS
UWB: Return UWB_RSV_ALLOC_NOT_FOUND rather than crashing on NULL dereference if kzalloc fails
usb: core: fix information leak to userland
usb: misc: iowarrior: fix information leak to userland
usb: misc: sisusbvga: fix information leak to userland
usb: subtle increased memory usage in u_serial
USB: option: fix when the driver is loaded incorrectly for some Huawei devices.
USB: xhci: Use GFP_ATOMIC under spin_lock
usb: gadget: goku_udc: add registered flag bit, fixing build
USB: ehci/mxc: compile fix
USB: Fix FSL USB driver on non Open Firmware systems
USB: the development of the usb tree is now in git
usb: musb: fail unaligned DMA transfers on v1.8 and above
USB: ftdi_sio: add device IDs for Milkymist One JTAG/serial
usb.h: fix ioctl kernel-doc info
usb: musb: gadget: kill duplicate code in musb_gadget_queue()
usb: musb: Fix handling of spurious SESSREQ
usb: musb: fix kernel oops when loading musb_hdrc module for the 2nd time
USB: musb: blackfin: push clkin value to platform resources
...
Linus Torvalds [Sat, 13 Nov 2010 00:02:30 +0000 (16:02 -0800)]
Merge branch 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
* 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
n_gsm: Fix length handling
n_gsm: Copy n2 over when configuring via ioctl interface
serial: bfin_5xx: grab port lock before making port termios changes
serial: bfin_5xx: disable CON_PRINTBUFFER for consoles
serial: bfin_5xx: remove redundant SSYNC to improve TX speed
serial: bfin_5xx: always include DMA headers
vcs: make proper usage of the poll flags
amiserial: Remove unused variable icount
8250: Fix tcsetattr to avoid ioctl(TIOCMIWAIT) hang
tty_ldisc: Fix BUG() on hangup
TTY: restore tty_ldisc_wait_idle
SERIAL: blacklist si3052 chip
drivers/serial/bfin_5xx.c: Fix line continuation defects
tty: prevent DOS in the flush_to_ldisc
8250: add support for Kouwell KW-L221N-2
nozomi: Fix warning from the previous TIOCGCOUNT changes
tty: fix warning in synclink driver
tty: Fix formatting in tty.h
tty: the development tree is now done in git
Linus Torvalds [Sat, 13 Nov 2010 00:01:55 +0000 (16:01 -0800)]
Merge branch 'upstream/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
* 'upstream/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
xen: do not release any memory under 1M in domain 0
xen: events: do not unmask event channels on resume
xen: correct size of level2_kernel_pgt
Linus Torvalds [Fri, 12 Nov 2010 23:54:39 +0000 (15:54 -0800)]
Merge branch 'stable/xen-pcifront-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/xen-pcifront-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
MAINTAINERS: Mark XEN lists as moderated
xen-pcifront: fix PCI reference leak
xen-pcifront: Remove duplicate inclusion of headers.
xen: fix memory leak in Xen PCI MSI/MSI-X allocator.
MAINTAINERS: Update mailing list name for Xen pieces.
Tejun Heo [Mon, 1 Nov 2010 10:39:19 +0000 (11:39 +0100)]
libata: fix NULL sdev dereference race in atapi_qc_complete()
SCSI commands may be issued between __scsi_add_device() and dev->sdev
assignment, so it's unsafe for ata_qc_complete() to dereference
dev->sdev->locked without checking whether it's NULL or not. Fix it.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: stable@kernel.org Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Reported-by: Dan Rosenberg <drosenberg@vsecurity.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> CC: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
gianfar: Do not call device_set_wakeup_enable() under a spinlock
The gianfar driver calls device_set_wakeup_enable() under a spinlock,
which causes a problem to happen after the recent core power
management changes, because this function can sleep now. Fix this
by moving the device_set_wakeup_enable() call out of the
spinlock-protected area.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
Lorenzo Colitti [Wed, 27 Oct 2010 18:16:49 +0000 (18:16 +0000)]
ipv6: addrconf: don't remove address state on ifdown if the address is being kept
Currently, addrconf_ifdown does not delete statically configured IPv6
addresses when the interface is brought down. The intent is that when
the interface comes back up the address will be usable again. However,
this doesn't actually work, because the system stops listening on the
corresponding solicited-node multicast address, so the address cannot
respond to neighbor solicitations and thus receive traffic. Also, the
code notifies the rest of the system that the address is being deleted
(e.g, RTM_DELADDR), even though it is not. Fix it so that none of this
state is updated if the address is being kept on the interface.
Tested: Added a statically configured IPv6 address to an interface,
started ping, brought link down, brought link up again. When link came
up ping kept on going and "ip -6 maddr" showed that the host was still
subscribed to there
Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 12 Nov 2010 21:35:00 +0000 (13:35 -0800)]
tcp: Don't change unlocked socket state in tcp_v4_err().
Alexey Kuznetsov noticed a regression introduced by
commit f1ecd5d9e7366609d640ff4040304ea197fbc618
("Revert Backoff [v3]: Revert RTO on ICMP destination unreachable")
The RTO and timer modification code added to tcp_v4_err()
doesn't check sock_owned_by_user(), which if true means we
don't have exclusive access to the socket and therefore cannot
modify it's critical state.
Just skip this new code block if sock_owned_by_user() is true
and eliminate the now superfluous sock_owned_by_user() code
block contained within.
Reported-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net> CC: Damian Lukowski <damian@tvk.rwth-aachen.de> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Dan Rosenberg [Fri, 12 Nov 2010 20:44:42 +0000 (12:44 -0800)]
x25: Prevent crashing when parsing bad X.25 facilities
Now with improved comma support.
On parsing malformed X.25 facilities, decrementing the remaining length
may cause it to underflow. Since the length is an unsigned integer,
this will result in the loop continuing until the kernel crashes.
This patch adds checks to ensure decrementing the remaining length does
not cause it to wrap around.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Casey Leedom [Thu, 11 Nov 2010 09:06:50 +0000 (09:06 +0000)]
cxgb4vf: fix some errors in Gather List to skb conversion
There were some errors in the way that internal Gather Lists were being
translated into skb's. This also makes the VF Driver look more like the PF
Driver to facilitate easier comarison.
Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Don't implement (struct net_device_ops *)->ndo_select_queue() with simple
call to skb_tx_hash(). This leads to non-persistent TX queue selection in
the Linux dev_pick_tx() routine for TCP connections.
Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>