Jeff Moyer [Tue, 9 Aug 2011 18:32:09 +0000 (20:32 +0200)]
allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH
blk_insert_flush has the following check:
    /*
     * If there's data but flush is not necessary, the request can be
     * processed directly without going through flush machinery. Queue
     * for normal execution.
     */
    if ((policy & REQ_FSEQ_DATA) &&
        !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
            list_add_tail(&rq->queuelist, &q->queue_head);
            return;
    }
However, blk_flush_policy will not return with policy set to only
REQ_FSEQ_DATA:
    static unsigned int blk_flush_policy(unsigned int fflags, struct request *rq)
    {
            unsigned int policy = 0;
            if (fflags & REQ_FLUSH) {
                    if (rq->cmd_flags & REQ_FLUSH)
                            policy |= REQ_FSEQ_PREFLUSH;
                    if (blk_rq_sectors(rq))
                            policy |= REQ_FSEQ_DATA;
                    if (!(fflags & REQ_FUA) && (rq->cmd_flags & REQ_FUA))
                            policy |= REQ_FSEQ_POSTFLUSH;
            }
            return policy;
    }
Notice that REQ_FSEQ_DATA is only set if REQ_FLUSH is set. Fix this
mismatch by moving the setting of REQ_FSEQ_DATA outside of the REQ_FLUSH
check.
Tejun notes:
Hmmm... yes, this can become a correctness issue if (and only if)
blk_queue_flush() is called to change q->flush_flags while requests
are in-flight; otherwise, requests wouldn't reach the function at all.
Also, I think it would be a generally good idea to always set
FSEQ_DATA if the request has data.
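The fix described above can be sketched in plain C as follows. This is a minimal userspace model, not the kernel patch itself: the `DEMO_*` flag values, `struct demo_request` and `demo_flush_policy()` are hypothetical stand-ins for the kernel's flags, `struct request` and `blk_flush_policy()`.

```c
/* Hypothetical stand-in values; the kernel's real flag bits differ. */
enum {
	DEMO_REQ_FLUSH = 1 << 0,
	DEMO_REQ_FUA   = 1 << 1,
};

enum {
	DEMO_FSEQ_PREFLUSH  = 1 << 0,
	DEMO_FSEQ_DATA      = 1 << 1,
	DEMO_FSEQ_POSTFLUSH = 1 << 2,
};

/* Minimal model of a request: its flags and its size in sectors. */
struct demo_request {
	unsigned int cmd_flags;
	unsigned int sectors;
};

unsigned int demo_flush_policy(unsigned int fflags,
			       const struct demo_request *rq)
{
	unsigned int policy = 0;

	/* Data is flagged whenever the request carries any, flush or not. */
	if (rq->sectors)
		policy |= DEMO_FSEQ_DATA;

	if (fflags & DEMO_REQ_FLUSH) {
		if (rq->cmd_flags & DEMO_REQ_FLUSH)
			policy |= DEMO_FSEQ_PREFLUSH;
		if (!(fflags & DEMO_REQ_FUA) && (rq->cmd_flags & DEMO_REQ_FUA))
			policy |= DEMO_FSEQ_POSTFLUSH;
	}
	return policy;
}
```

With the data check hoisted out of the flush branch, a queue that advertises no flush support yields a policy of only the data bit, which is the case the check in blk_insert_flush expects to see.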
Vivek Goyal [Fri, 5 Aug 2011 07:42:20 +0000 (09:42 +0200)]
cfq-iosched: Add documentation about idling
There are always questions about why CFQ is idling on various conditions.
A recent one is Christoph asking again why CFQ idles on REQ_NOIDLE. His
point is that XFS relies more and more on workqueues, and he is
concerned that CFQ idling on IO from every workqueue will impact
XFS badly.
So he suggested that I add some more documentation about CFQ idling,
which can provide more clarity on the topic and also gives an
opportunity to poke holes in the theory and lead to improvements.
So here is my attempt at that. Any comments are welcome.
Tao Ma [Fri, 5 Aug 2011 07:37:10 +0000 (09:37 +0200)]
block: Make rq_affinity = 1 work as expected
Commit 5757a6d76c introduced a new rq_affinity = 2 so as to make
the request complete on the __make_request cpu. But it makes the
old rq_affinity = 1 not work any more. The root cause is that
if 'cpu' and 'req->cpu' are in the same group and cpu != req->cpu,
ccpu will be the same as group_cpu, so the completion will be
executed on 'cpu', not 'group_cpu'.
This patch fixes the problem by simply removing group_cpu, and the code
is more explicit now. If ccpu == cpu, we complete on cpu; otherwise
we raise_blk_irq to ccpu.
Cc: Christoph Hellwig <hch@infradead.org> Cc: Roland Dreier <roland@purestorage.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: Tao Ma <boyu.mt@taobao.com> Reviewed-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
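The selection rule in this message can be modelled in a few lines of C. This is a sketch based only on the description above, not on the kernel source: `demo_cpu_to_group()` and `demo_completion_cpu()` are hypothetical names, the pairing of CPUs into groups is an invented topology, and treating a request CPU of -1 as "unknown" is an assumption.

```c
/*
 * Hypothetical topology: CPUs are grouped in pairs and the even-numbered
 * CPU of each pair represents the group.
 */
static int demo_cpu_to_group(int cpu)
{
	return cpu & ~1;
}

/*
 * Returns the CPU on which the completion should run.
 *   cpu        - CPU currently handling the completion
 *   req_cpu    - CPU that submitted the request, or -1 if unknown (assumed)
 *   same_force - non-zero models rq_affinity = 2, zero models rq_affinity = 1
 */
int demo_completion_cpu(int cpu, int req_cpu, int same_force)
{
	if (req_cpu == -1)
		return cpu;
	/* rq_affinity = 2: exact submitting CPU; rq_affinity = 1: its group CPU. */
	return same_force ? req_cpu : demo_cpu_to_group(req_cpu);
}
```

A caller would complete locally when the returned value equals `cpu` and otherwise hand the completion to the returned CPU, which is the single comparison the message describes.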
Herbert Poetzl [Tue, 2 Aug 2011 10:43:50 +0000 (12:43 +0200)]
block/genhd.c: remove useless cast in diskstats_show()
Remove the (unsigned long long) cast in diskstats_show() and adjust the
seq_printf() format string to 'unsigned long'.
diskstats_show() uses part_stat_read() to get the stats, which either
accesses the specified field in the struct disk_stats directly (non SMP)
or sums up the per CPU values in a variable of the same type as the field,
so in any case the result will have the same type and range as the
specified field, which for all disk_stats entries is unsigned long.
Also, for unsigned long ranges the output of %lu should be identical to
the one of %llu, so no change in the actual proc entry contents.
Signed-off-by: Herbert Poetzl <herbert@13thfloor.at> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
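The claim that %lu and %llu produce identical text for unsigned long values can be checked in userspace. The helper below is a small sketch using snprintf() rather than the kernel's seq_printf(); `demo_same_output()` is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 when both format strings render the value identically. */
int demo_same_output(unsigned long value)
{
	char with_lu[32];
	char with_llu[32];

	snprintf(with_lu, sizeof(with_lu), "%lu", value);
	snprintf(with_llu, sizeof(with_llu), "%llu", (unsigned long long)value);
	return strcmp(with_lu, with_llu) == 0;
}
```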
drivers/block/drbd/drbd_nl.c: use bitmap_parse instead of __bitmap_parse
The buffer 'sc.cpu_mask' is a kernel buffer. If bitmap_parse is used
instead of __bitmap_parse the extra parameter that indicates a kernel
buffer is not needed.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Jens Axboe [Tue, 2 Aug 2011 08:43:35 +0000 (10:43 +0200)]
bsg-lib: add module.h include
Due to conflicts with the moduleh tree in linux-next, we
run into an include file mess. We really need export.h
in that tree, but if we add module.h locally then the
issue is easier to resolve.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Vivek Goyal [Tue, 2 Aug 2011 07:24:09 +0000 (09:24 +0200)]
cfq-iosched: Reduce linked group count upon group destruction
CFQ keeps track of the number of groups which are linked on blkcg->blkg_list.
This is useful to avoid races between queue exit and cgroup exit code
paths. So if at request queue exit time the linked group count is not
zero, that means there are some groups out there which are yet to be
deleted under an rcu read period, and the queue exit code should wait for
one rcu period.
In my previous patch I forgot to decrease the group count.
So in the current form nr_blkcg_linked_grps is always non-zero and
we will always wait one rcu period (if BLK_CGROUP=y). The side effect
of this is that it can increase boot time. I am surprised nobody has
complained so far.
Kay Sievers [Sun, 31 Jul 2011 20:21:35 +0000 (22:21 +0200)]
loop: fix deadlock when sysfs and LOOP_CLR_FD race against each other
LOOP_CLR_FD takes lo->lo_ctl_mutex and tries to remove the loop sysfs
files. Sysfs calls show() and waits for lo->lo_ctl_mutex. LOOP_CLR_FD
waits for show() to finish to remove the sysfs file.
Kay Sievers [Sun, 31 Jul 2011 20:08:04 +0000 (22:08 +0200)]
loop: add BLK_DEV_LOOP_MIN_COUNT=%i to allow distros 0 pre-allocated loop devices
Instead of unconditionally creating a fixed number of dead loop
devices which need to be investigated by storage handling services,
even when they are never used, we allow distros to start with 0
loop devices and have losetup(8) and similar tools switch to the dynamic
/dev/loop-control interface instead of searching /dev/loop%i for free
devices.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Kay Sievers [Sun, 31 Jul 2011 20:08:04 +0000 (22:08 +0200)]
loop: add management interface for on-demand device allocation
Loop devices today have a fixed pre-allocated number of usually 8.
The number can only be changed at module init time. To find a free
device to use, /dev/loop%i needs to be scanned, and all devices need
to be opened until a free one is possibly found.
This adds a new /dev/loop-control device node, that allows to
dynamically find or allocate a free device, and to add and remove loop
devices from the running system:
LOOP_CTL_ADD adds a specific device. Arg is the number
of the device. It returns the device i or a negative
error code.
LOOP_CTL_REMOVE removes a specific device. Arg is the
number of the device. It returns the device i or a negative
error code.
LOOP_CTL_GET_FREE finds the next unbound device or allocates
a new one. No arg is given. It returns the device i or a
negative error code.
The loop kernel module gets automatically loaded when
/dev/loop-control is accessed the first time. The alias
specified in the module, instructs udev to create this
'dead' device node, even when the module is not loaded.
Example:
cfd = open("/dev/loop-control", O_RDWR);
/* add a new specific loop device */
err = ioctl(cfd, LOOP_CTL_ADD, devnr);
/* remove a specific loop device */
err = ioctl(cfd, LOOP_CTL_REMOVE, devnr);
/* find or allocate a free loop device to use */
devnr = ioctl(cfd, LOOP_CTL_GET_FREE);
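A slightly fuller userspace sketch of the find-or-allocate path, with error handling, might look like this. It assumes a Linux system whose <linux/loop.h> provides LOOP_CTL_GET_FREE; the helper names `demo_loop_path()` and `demo_get_free_loop()` are hypothetical and not part of any library.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/loop.h>

/* Formats "/dev/loop<N>" into buf; returns 0 on success, -1 if it does not fit. */
int demo_loop_path(int devnr, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/dev/loop%d", devnr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Asks /dev/loop-control for a free device; returns its number or -1. */
int demo_get_free_loop(void)
{
	int cfd = open("/dev/loop-control", O_RDWR);
	int devnr;

	if (cfd < 0)
		return -1;	/* node missing or insufficient permission */
	devnr = ioctl(cfd, LOOP_CTL_GET_FREE);
	close(cfd);
	return devnr < 0 ? -1 : devnr;
}
```

A caller would pass the number returned by `demo_get_free_loop()` to `demo_loop_path()` and then open that node to bind a backing file, instead of scanning /dev/loop%i.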
Mike Christie [Sun, 31 Jul 2011 20:05:09 +0000 (22:05 +0200)]
block: add bsg helper library
This moves the FC classes bsg code to the block layer and
makes it a lib so that other classes like iscsi and SAS can use it.
It is helpful because working with the request queue, bios,
creating scatterlists, etc are a pain that the LLD does not
have to worry about with normal IOs and should not have to
worry about for bsg requests.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* git://git.infradead.org/battery-2.6:
gpio-charger: Fix checking return value of request_any_context_irq
power_supply: MAX17042: Support additional properties
max8903_charger: Allow platform data to be __initdata
power_supply: Add charger driver for MAX8998/LP3974
power_supply: Add charger driver for MAX8997/8966
max17042_battery: Remove obsolete cleanup for clientdata
twl4030_charger: Fix warnings
wm831x_power: Support multiple instances
wm831x_backup: Support multiple instances
apm_power: Fix style error in macros
s3c_adc_battery: Fix annotation for s3c_adc_battery_probe()
bq20z75: Enable detection after registering
bq20z75: Add support for external notification
* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/cpupowerutils:
cpupower: Do detect IDA (opportunistic processor performance) via cpuid
cpupower: Show Intel turbo ratio support via ./cpupower frequency-info
cpupowerutils: increase MAX_LINE_LEN
cpupower: Rename package from cpupowerutils to cpupower
cpupowerutils: Rename: libcpufreq->libcpupower
cpupowerutils: use kernel version-derived version string
cpupowerutils: utils - ConfigStyle bugfixes
cpupowerutils: helpers - ConfigStyle bugfixes
cpupowerutils: idle_monitor - ConfigStyle bugfixes
cpupowerutils: lib - ConfigStyle bugfixes
cpupowerutils: bench - ConfigStyle bugfixes
cpupowerutils: do not update po files on each and every compile
cpupowerutils: remove ccdv, use kernel quiet/verbose mechanism
cpupowerutils: use COPYING, CREDITS from top-level directory
cpupowerutils - cpufrequtils extended with quite some features
* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
smc91c92_cs.c: fix bogus compiler warning
orinoco_cs: be more careful when matching cards with ID 0x0156:0x0002
hostap_cs: support cards with "Version 01.02" as third product ID
pcmcia: add PCMCIA_DEVICE_MANF_CARD_PROD_ID3
pxa2xx pcmcia - stargate 2 use gpio array.
pcmcia: pxa2xx: remove empty socket_init / socket_resume functions.
drivers:pcmcia:soc_common: make socket_init and socket_suspend optional
Fred Isaman [Sun, 31 Jul 2011 00:52:54 +0000 (20:52 -0400)]
pnfsblock: bl_write_pagelist
Note: When upper layer's read/write request cannot be fulfilled, the block
layout driver shouldn't silently mark the page as error. It should do
what can be done and leave the rest to the upper layer. To do so, we
should set rdata/wdata->res.count properly.
When the upper layer re-sends the read/write request to finish the rest
of the request, pgbase is the position where we should start.
[pnfsblock: bl_write_pagelist support functions]
[pnfsblock: bl_write_pagelist adjust for missing PG_USE_PNFS] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: handle errors when read or write pagelist.] Signed-off-by: Zhang Jingwang <yyalone@gmail.com>
[pnfs-block: use new write_pagelist api] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu>
[SQUASHME: pnfsblock: mds_offset is set in the generic layer] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com>
[pnfsblock: mark IO error with NFS_LAYOUT_{RW|RO}_FAILED] Signed-off-by: Peng Tao <peng_tao@emc.com>
[pnfsblock: SQUASHME: adjust to API change] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: fixup blksize alignment in bl_setup_layoutcommit] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com>
[pnfsblock: bl_write_pagelist adjust for missing PG_USE_PNFS] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: handle errors when read or write pagelist.] Signed-off-by: Zhang Jingwang <yyalone@gmail.com>
[pnfs-block: use new write_pagelist api] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fred Isaman [Sun, 31 Jul 2011 00:52:53 +0000 (20:52 -0400)]
pnfsblock: bl_read_pagelist
Note: When upper layer's read/write request cannot be fulfilled, the block
layout driver shouldn't silently mark the page as error. It should do
what can be done and leave the rest to the upper layer. To do so, we
should set rdata/wdata->res.count properly.
When the upper layer re-sends the read/write request to finish the rest
of the request, pgbase is the position where we should start.
[pnfsblock: mark IO error with NFS_LAYOUT_{RW|RO}_FAILED] Signed-off-by: Peng Tao <peng_tao@emc.com>
[pnfsblock: read path error handling] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: handle errors when read or write pagelist.] Signed-off-by: Zhang Jingwang <yyalone@gmail.com>
[pnfs-block: use new read_pagelist api] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fred Isaman [Sun, 31 Jul 2011 00:52:52 +0000 (20:52 -0400)]
pnfsblock: cleanup_layoutcommit
In the blocklayout driver, two things happen
during layoutcommit/cleanup:
1. the modified extents are encoded.
2. On cleanup the extents are put back on the layout rw
extents list, for reads.
In the new system where actual xdr encoding is done in
encode_layoutcommit() directly into xdr buffer, these are
the new commit stages:
1. On setup_layoutcommit, the range is adjusted as before
and a structure is allocated for communication with
bl_encode_layoutcommit && bl_cleanup_layoutcommit
(Generic layer provides a void-star to hang it on)
2. bl_encode_layoutcommit is called to do the actual
encoding directly into xdr. The commit-extent-list is not
freed and is stored on above structure.
FIXME: The code is not yet converted to the new XDR cleanup
3. On cleanup the commit-extent-list is put back by a call
to set_to_rw() as before, but with no need for XDR decoding
of the list as before. And the commit-extent-list is freed.
Finally allocated structure is freed.
[rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()] Signed-off-by: Jim Rees <rees@umich.edu>
[pnfsblock: introduce bl_committing list] Signed-off-by: Peng Tao <peng_tao@emc.com>
[pnfsblock: SQUASHME: adjust to API change] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[blocklayout: encode_layoutcommit implementation] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
[pnfsblock: fix bug setting up layoutcommit.] Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
[pnfsblock: cleanup_layoutcommit wants a status parameter] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fred Isaman [Sun, 31 Jul 2011 00:52:51 +0000 (20:52 -0400)]
pnfsblock: encode_layoutcommit
In the blocklayout driver, two things happen
during layoutcommit/cleanup:
1. the modified extents are encoded.
2. On cleanup the extents are put back on the layout rw
extents list, for reads.
In the new system where actual xdr encoding is done in
encode_layoutcommit() directly into xdr buffer, these are
the new commit stages:
1. On setup_layoutcommit, the range is adjusted as before
and a structure is allocated for communication with
bl_encode_layoutcommit && bl_cleanup_layoutcommit
(Generic layer provides a void-star to hang it on)
2. bl_encode_layoutcommit is called to do the actual
encoding directly into xdr. The commit-extent-list is not
freed and is stored on above structure.
FIXME: The code is not yet converted to the new XDR cleanup
3. On cleanup the commit-extent-list is put back by a call
to set_to_rw() as before, but with no need for XDR decoding
of the list as before. And the commit-extent-list is freed.
Finally allocated structure is freed.
[rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()]
[pnfsblock: get rid of deprecated xdr macros] Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[blocklayout: encode_layoutcommit implementation] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
[pnfsblock: fix bug setting up layoutcommit.] Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
[pnfsblock: prevent commit list corruption]
[pnfsblock: fix layoutcommit with an empty opaque] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fred Isaman [Sun, 31 Jul 2011 00:52:49 +0000 (20:52 -0400)]
pnfsblock: add extent manipulation functions
Adds working implementations of various support functions
to handle INVAL extents, needed by writes, such as
bl_mark_sectors_init and bl_is_sector_init.
Fred Isaman [Sun, 31 Jul 2011 00:52:48 +0000 (20:52 -0400)]
pnfsblock: bl_find_get_extent
Implement bl_find_get_extent(), one of the core extent manipulation
routines.
[pnfsblock: Lookup list entry of layouts and tags in reverse order] Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Jim Rees <rees@umich.edu>
pnfsblock: fix print format warnings for sector_t and size_t
gcc spews warnings about these on x86_64, e.g.:
fs/nfs/blocklayout/blocklayout.c:74: warning: format ‘%Lu’ expects type ‘long long unsigned int’, but argument 2 has type ‘sector_t’
fs/nfs/blocklayout/blocklayout.c:388: warning: format ‘%d’ expects type ‘int’, but argument 5 has type ‘size_t’
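The commit's actual changes are not shown here. One common way to silence warnings of this kind is to cast the sector value to a fixed-width type and to use the size_t length modifier, as in the userspace sketch below; `demo_sector_t` and `demo_format()` are hypothetical stand-ins.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in: the width of the kernel's sector_t depends on the configuration. */
typedef unsigned long demo_sector_t;

/* Formats a sector number and a byte count without format-mismatch warnings. */
int demo_format(char *buf, size_t len, demo_sector_t sector, size_t count)
{
	/* Cast to a fixed type for %llu; %zu matches size_t directly. */
	return snprintf(buf, len, "sector=%llu count=%zu",
			(unsigned long long)sector, count);
}
```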
Fred Isaman [Sun, 31 Jul 2011 00:52:46 +0000 (20:52 -0400)]
pnfsblock: call and parse getdevicelist
Call GETDEVICELIST during mount, then call and parse GETDEVICEINFO
for each device returned.
[pnfsblock: get rid of deprecated xdr macros] Signed-off-by: Jim Rees <rees@umich.edu>
[pnfsblock: fix pnfs_deviceid references] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: fix print format warnings for sector_t and size_t]
[pnfs-block: #include <linux/vmalloc.h>]
[pnfsblock: no PNFS_NFS_SERVER] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[pnfsblock: fix bug determining size of striped volume]
[pnfsblock: fix oops when using multiple devices] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com>
[pnfsblock: get rid of vmap and deviceid->area structure] Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Peng Tao [Sun, 31 Jul 2011 00:52:34 +0000 (20:52 -0400)]
pnfs: use lwb as layoutcommit length
Using NFS4_MAX_UINT64 will break current protocol.
[Needed in v3.0] CC: Stable Tree <stable@kernel.org> Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Peng Tao [Sun, 31 Jul 2011 00:52:31 +0000 (20:52 -0400)]
pnfs: save layoutcommit lwb at layout header
No need to save it for every lseg.
[Needed in v3.0] CC: Stable Tree <stable@kernel.org> Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
don't busy retry the inode on failed grab_super_passive()
This fixes a soft lockup on conditions
a) the flusher is working on a work by __bdi_start_writeback(), while
b) someone else calls writeback_inodes_sb*() or sync_inodes_sb(), which
grab sb->s_umount and enqueue a new work for the flusher to execute
The s_umount grabbed by (b) will fail the grab_super_passive() in (a).
Then if the inode is requeued, wb_writeback() will busy retry on it.
As a result, wb_writeback() loops for ever without releasing
wb->list_lock, which further blocks other tasks.
Fix the busy loop by redirtying the inode. This may undesirably delay
the writeback of the inode, however most likely it will be picked up
soon by the queued work by writeback_inodes_sb*(), sync_inodes_sb() or
even writeback_inodes_wb().
Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging: (24 commits)
hwmon: (lm90) Refactor reading of config2 register
hwmon: (lm90) Make SA56004 detection more robust
hwmon: (lm90) Simplify handling of extended local temp register
hwmon: (pmbus) Add client driver for LM25066, LM5064, and LM5066
hwmon: (max34440) Add support for peak attributes
hwmon: (max8688) Add support for peak attributes
hwmon: (max16064) Add support for peak attributes
hwmon: (adm1275) Add support for peak attributes
hwmon: (pmbus) Add support for peak attributes
hwmon: Add new attributes to sysfs ABI
hwmon: (pmbus) Strengthen check for status register existence
hwmon: (pmbus) Add support for virtual pages
hwmon: (pmbus) Support reading and writing of word registers in device specific code
hwmon: (pmbus) Increase attribute name size
hwmon: (pmbus) Add ADP4000, NCP4200 and NCP4208 to list of supported devices
hwmon: (pmbus) Add support for VID output voltage mode
hwmon: (pmbus) Move PMBus drivers to drivers/hwmon/pmbus
hwmon: (coretemp) Add core/pkg threshold support to Coretemp
hwmon: (lm95241) Add support for LM95231
hwmon: LM95245 driver
...
shm_lock() does a lookup of shm segment in shm_ids(ns).ipcs_idr, which
is redundant as we already know shmid_kernel address. An actual lock is
also not required for reads until we really want to destroy the segment.
exit_shm() and shm_destroy_orphaned() may avoid the loop by checking
whether there is at least one segment in current ipc_namespace.
The check of nsproxy and ipc_ns against NULL is redundant as exit_shm()
is called from do_exit() before the call to exit_notify(), so the
dereferencing current->nsproxy->ipc_ns is guaranteed to be safe.
shm_try_destroy_orphaned() and shm_try_destroy_current() didn't handle
the case of separate PID namespaces, but a single IPC namespace. If
there are tasks with the same PID values using the same shmem object,
the wrong destroy decision could be reached.
On shm segment creation store the pointer to the creator task in
shmid_kernel->shm_creator field and zero it on task exit. Then
use the ->shm_creator instead of shm_cprid in both functions. As
shmid_kernel object is already locked at this stage, no additional
locking is needed.
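Why comparing creator pointers is more robust than comparing PIDs can be shown with a toy model. Everything below is a hypothetical userspace sketch: `struct demo_task` and `struct demo_shm` only mimic the roles of the task and of shmid_kernel with its shm_cprid and shm_creator fields.

```c
/* A task with the PID it has inside its own PID namespace. */
struct demo_task {
	int pid;
};

/* A shared memory segment remembering its creator in two ways. */
struct demo_shm {
	int cprid;				/* creator PID */
	const struct demo_task *creator;	/* creator task, NULL once it exits */
};

/* PID comparison: can match an unrelated task from another PID namespace. */
int demo_created_by_pid(const struct demo_shm *shp, const struct demo_task *t)
{
	return shp->cprid == t->pid;
}

/* Pointer comparison: matches only the task that created the segment. */
int demo_created_by_task(const struct demo_shm *shp, const struct demo_task *t)
{
	return shp->creator == t;
}
```

Two tasks in different PID namespaces can both have PID 1 while sharing one IPC namespace; the PID check then reports both as the creator, whereas the pointer check does not.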
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (71 commits)
[SCSI] fcoe: cleanup cpu selection for incoming requests
[SCSI] fcoe: add fip retry to avoid missing critical keep alive
[SCSI] libfc: fix warn on in lport retry
[SCSI] libfc: Remove the reference to FCP packet from scsi_cmnd in case of error
[SCSI] libfc: cleanup sending SRR request
[SCSI] libfc: two minor changes in comments
[SCSI] libfc, fcoe: ignore rx frame with wrong xid info
[SCSI] libfc: release exchg cache
[SCSI] libfc: use FC_MAX_ERROR_CNT
[SCSI] fcoe: remove unused ptype field in fcoe_rcv_info
[SCSI] bnx2fc: Update copyright and bump version to 1.0.4
[SCSI] bnx2fc: Tx BDs cache in write tasks
[SCSI] bnx2fc: Do not arm CQ when there are no CQEs
[SCSI] bnx2fc: hold tgt lock when calling cmd_release
[SCSI] bnx2fc: Enable support for sequence level error recovery
[SCSI] bnx2fc: HSI changes for tape
[SCSI] bnx2fc: Handle REC_TOV error code from firmware
[SCSI] bnx2fc: REC/SRR link service request and response handling
[SCSI] bnx2fc: Support 'sequence cleanup' task
[SCSI] dh_rdac: Associate HBA and storage in rdac_controller to support partitions in storage
...
If the directory contents change, then we have to accept that the
file->f_pos value may shrink if we do a 'search-by-cookie'. In that
case, we should turn off the loop detection and let the NFS client
try to recover.
The patch also fixes a second loop detection bug by ensuring
that after turning on the ctx->duped flag, we read at least one new
cookie into ctx->dir_cookie before attempting to match with
ctx->dup_cookie.
Merge branch 'slub/lockless' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'slub/lockless' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: (21 commits)
slub: When allocating a new slab also prep the first object
slub: disable interrupts in cmpxchg_double_slab when falling back to pagelock
Avoid duplicate _count variables in page_struct
Revert "SLUB: Fix build breakage in linux/mm_types.h"
SLUB: Fix build breakage in linux/mm_types.h
slub: slabinfo update for cmpxchg handling
slub: Not necessary to check for empty slab on load_freelist
slub: fast release on full slab
slub: Add statistics for the case that the current slab does not match the node
slub: Get rid of the another_slab label
slub: Avoid disabling interrupts in free slowpath
slub: Disable interrupts in free_debug processing
slub: Invert locking and avoid slab lock
slub: Rework allocator fastpaths
slub: Pass kmem_cache struct to lock and freeze slab
slub: explicit list_lock taking
slub: Add cmpxchg_double_slab()
mm: Rearrange struct page
slub: Move page->frozen handling near where the page->freelist handling occurs
slub: Do not use frozen page flag but a bit in the page counters
...
Merge branch 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6: (25 commits)
kconfig: Introduce IS_ENABLED(), IS_BUILTIN() and IS_MODULE()
xconfig: Abort close if configuration cannot be saved
kconfig: fix missing "0x" prefix from S_HEX symbol in autoconf.h
kconfig/nconf: remove useless conditionnal
kconfig/nconf: prevent segfault on empty menu
kconfig/nconf: use the generic menu_get_ext_help()
nconfig: Avoid Wunused-but-set warning
kconfig/conf: mark xfgets() private
kconfig: remove pending prototypes for kconfig_load()
kconfig/conf: add command line options' description
kconfig/conf: reduce the scope of `defconfig_file'
kconfig: use calloc() for expr allocation
kconfig: introduce specialized printer
kconfig: do not overwrite symbol direct dependency in assignment
kconfig/gconf: silent missing prototype warnings
kconfig/gconf: kill deadcode
kconfig: nuke LKC_DIRECT_LINK cruft
kconfig: nuke reference to SWIG
kconfig: add missing <stdlib.h> inclusion
kconfig: add missing <ctype.h> inclusion
...
Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (430 commits)
[media] ir-mce_kbd-decoder: include module.h for its facilities
[media] ov5642: include module.h for its facilities
[media] em28xx: Fix DVB-C maxsize for em2884
[media] tda18271c2dd: Fix saw filter configuration for DVB-C @6MHz
[media] v4l: mt9v032: Fix Bayer pattern
[media] V4L: mt9m111: rewrite set_pixfmt
[media] V4L: mt9m111: fix missing return value check mt9m111_reg_clear
[media] V4L: initial driver for ov5642 CMOS sensor
[media] V4L: sh_mobile_ceu_camera: fix Oops when USERPTR mapping fails
[media] V4L: soc-camera: remove soc-camera bus and devices on it
[media] V4L: soc-camera: un-export the soc-camera bus
[media] V4L: sh_mobile_csi2: switch away from using the soc-camera bus notifier
[media] V4L: add media bus configuration subdev operations
[media] V4L: soc-camera: group struct field initialisations together
[media] V4L: soc-camera: remove now unused soc-camera specific PM hooks
[media] V4L: pxa-camera: switch to using standard PM hooks
[media] NetUP Dual DVB-T/C CI RF: force card hardware revision by module param
[media] Don't OOPS if videobuf_dvb_get_frontend return NULL
[media] NetUP Dual DVB-T/C CI RF: load firmware according card revision
[media] omap3isp: Support configurable HS/VS polarities
...
Fix up conflicts:
- arch/arm/mach-omap2/board-rx51-peripherals.c:
cleanup regulator supply definitions in mach-omap2
vs
OMAP3: RX-51: define vdds_csib regulator supply
- drivers/staging/tm6000/tm6000-alsa.c (trivial)
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6:
ecryptfs: Make inode bdi consistent with superblock bdi
eCryptfs: Unlock keys needed by ecryptfsd
James Bottomley [Fri, 29 Jul 2011 13:11:32 +0000 (17:11 +0400)]
ramoops: fix compile failure on parisc
Fixes this:
drivers/char/ramoops.c: In function 'ramoops_init':
drivers/char/ramoops.c:221: error: implicit declaration of function 'IS_ERR'
drivers/char/ramoops.c:222: error: implicit declaration of function 'PTR_ERR'
If it actually builds on other platforms, it's probably getting
linux/err.h via some other #include.
Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
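For readers unfamiliar with the two helpers named in the build error, the idiom they support is to encode a small negative errno value in a pointer. The block below is a userspace model of that idiom, not the contents of linux/err.h; the `demo_*` names are hypothetical and the 4095 limit is an assumption modelled on the kernel's MAX_ERRNO.

```c
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encodes a negative errno value as a pointer. */
void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recovers the errno value from an error pointer. */
long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True when the pointer lies in the top DEMO_MAX_ERRNO addresses. */
int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}
```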
Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: remove printks about disabled bridge windows
PCI: fold pci_calc_resource_flags() into decode_bar()
PCI: treat mem BAR type "11" (reserved) as 32-bit, not 64-bit, BAR
PCI: correct pcie_set_readrq write size
PCI: pciehp: change wait time for valid configuration access
x86/PCI: Preserve existing pci=bfsort whitelist for Dell systems
PCI: ARI is a PCIe v2 feature
x86/PCI: quirks: Use pci_dev->revision
PCI: Make the struct pci_dev * argument of pci_fixup_irqs const.
PCI hotplug: cpqphp: use pci_dev->vendor
PCI hotplug: cpqphp: use pci_dev->subsystem_{vendor|device}
x86/PCI: config space accessor functions should not ignore the segment argument
PCI: Assign values to 'pci_obff_signal_type' enumeration constants
x86/PCI: reduce severity of host bridge window conflict warnings
PCI: enumerate the PCI device only removed out PCI hieratchy of OS when re-scanning PCI
PCI: PCIe AER: add aer_recover_queue
x86/PCI: select direct access mode for mmconfig option
PCI hotplug: Rename is_ejectable which also exists in dock.c
Merge branch 'at91/cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc
* 'at91/cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc:
at91: add arch specific ioremap support
at91: factorize sram init
at91: move register clocks to soc generic init
at91: move clock subsystem init to soc generic init
at91: use structure to store the current soc
at91: remove AT91_DBGU offset from dbgu register macro
at91: factorize at91 interrupts init to soc
at91: introduce commom AT91_BASE_SYS
Merge branch 'next/dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc
* 'next/dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc: (21 commits)
arm/dt: tegra devicetree support
arm/versatile: Add device tree support
dt/irq: add irq_domain_generate_simple() helper
irq: add irq_domain translation infrastructure
dmaengine: imx-sdma: add device tree probe support
dmaengine: imx-sdma: sdma_get_firmware does not need to copy fw_name
dmaengine: imx-sdma: use platform_device_id to identify sdma version
mmc: sdhci-esdhc-imx: add device tree probe support
mmc: sdhci-pltfm: dt device does not pass parent to sdhci_alloc_host
mmc: sdhci-esdhc-imx: get rid of the uses of cpu_is_mx()
mmc: sdhci-esdhc-imx: do not reference platform data after probe
mmc: sdhci-esdhc-imx: extend card_detect and write_protect support for mx5
net/fec: add device tree probe support
net: ibm_newemac: convert it to use of_get_phy_mode
dt/net: add helper function of_get_phy_mode
net/fec: gasket needs to be enabled for some i.mx
serial/imx: add device tree probe support
serial/imx: get rid of the uses of cpu_is_mx1()
arm/dt: Add dtb make rule
arm/dt: Add skeleton dtsi file
...
When commit 4e34e719e457 ("fs: take the ACL checks to common code")
changed the xyz_check_acl() functions into the more natural
xyz_get_acl() interface, we grew two copies of the
#define ext2_get_acl NULL
define for the non-acl case.
Remove the extra one.
Reported-by: Marco Stornelli <marco.stornelli@gmail.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Marek [Wed, 20 Jul 2011 15:38:57 +0000 (17:38 +0200)]
kconfig: Introduce IS_ENABLED(), IS_BUILTIN() and IS_MODULE()
Replace the config_is_*() macros with a variant that allows for grepping
for usage of CONFIG_* options in the code. Usage:
if (IS_ENABLED(CONFIG_NUMA))
or
#if IS_ENABLED(CONFIG_NUMA)
The IS_ENABLED() macro evaluates to 1 if the argument is set (to either 'y'
or 'm'), IS_BUILTIN() tests if the option is 'y' and IS_MODULE() tests if
the option is 'm'. Only boolean and tristate options are supported.
Reviewed-by: Arnaud Lacombe <lacombar@gmail.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Michal Marek <mmarek@suse.cz>
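The semantics described in this commit message can be tried out in userspace. The following is a minimal sketch, not the implementation from this commit: it borrows the placeholder-argument preprocessor trick that later kernels use in include/linux/kconfig.h. The CONFIG_DEMO_* options and all DEMO_*/demo_* names are hypothetical. It assumes, as in a kernel build, that an option set to 'y' defines CONFIG_FOO as 1 and an option set to 'm' defines CONFIG_FOO_MODULE as 1.

```c
#include <assert.h>

/* Hypothetical options; a real build takes these from the generated config header. */
#define CONFIG_DEMO_BUILTIN 1      /* option set to 'y' */
#define CONFIG_DEMO_MOD_MODULE 1   /* option set to 'm' */
/* CONFIG_DEMO_OFF is deliberately left undefined (option not set). */

/*
 * If the argument expands to 1, token pasting yields DEMO_PLACEHOLDER_1,
 * which expands to "0," and shifts a literal 1 into the second argument
 * slot. Any other argument leaves one junk token in the first slot, so
 * the second slot holds 0.
 */
#define DEMO_PLACEHOLDER_1 0,
#define DEMO_SECOND(ignored, val, ...) val
#define DEMO_DEFINED_2(arg1_or_junk) DEMO_SECOND(arg1_or_junk 1, 0, 0)
#define DEMO_DEFINED_1(val) DEMO_DEFINED_2(DEMO_PLACEHOLDER_##val)
#define DEMO_DEFINED(x) DEMO_DEFINED_1(x)

#define IS_BUILTIN(option) DEMO_DEFINED(option)
#define IS_MODULE(option) DEMO_DEFINED(option##_MODULE)
#define IS_ENABLED(option) (IS_BUILTIN(option) || IS_MODULE(option))

/* Preprocessor form: only the selected branch is compiled at all. */
#if IS_ENABLED(CONFIG_DEMO_MOD)
const char *demo_mod_state(void) { return "enabled"; }
#else
const char *demo_mod_state(void) { return "disabled"; }
#endif

/* C expression form: both branches are parsed, the dead one is discarded. */
int demo_off_enabled(void)
{
    if (IS_ENABLED(CONFIG_DEMO_OFF))
        return 1;
    return 0;
}

/* Usage: each macro evaluates to a plain 0 or 1. */
void demo_kconfig_usage(void)
{
    assert(IS_BUILTIN(CONFIG_DEMO_BUILTIN) == 1);
    assert(IS_MODULE(CONFIG_DEMO_BUILTIN) == 0);
    assert(IS_MODULE(CONFIG_DEMO_MOD) == 1);
    assert(IS_ENABLED(CONFIG_DEMO_OFF) == 0);
}
```

Because the option name appears literally in the source in both forms, a grep for CONFIG_DEMO_MOD finds every use, which is the property the commit message asks for.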
Thomas Renninger [Thu, 21 Jul 2011 09:54:54 +0000 (11:54 +0200)]
cpupower: Do detect IDA (opportunistic processor performance) via cpuid
IA32-Intel Devel guide Volume 3A - 14.3.2.1
-------------------------------------------
...
Opportunistic processor performance operation can be disabled by setting bit 38 of
IA32_MISC_ENABLES. This mechanism is intended for BIOS only. If
IA32_MISC_ENABLES[38] is set, CPUID.06H:EAX[1] will return 0.
It is better to detect this via cpuid: doing so cleans up the code a bit,
and the MSR parts were not working correctly anyway.
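The CPUID.06H:EAX[1] check quoted above can be sketched in userspace as follows. This is not the cpupower code; it is a minimal illustration that assumes a GCC or Clang toolchain, whose <cpuid.h> header provides __get_cpuid(). All DEMO_*/demo_* names are hypothetical.

```c
#include <assert.h>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper header, x86 only */
#endif

#define DEMO_CPUID_LEAF_POWER 0x06u        /* CPUID leaf 06H */
#define DEMO_CPUID_EAX_IDA    (1u << 1)    /* EAX bit 1: IDA reported */

/* Decode the IDA bit from an EAX value returned by CPUID leaf 06H. */
int demo_ida_from_eax(unsigned int eax)
{
    return (eax & DEMO_CPUID_EAX_IDA) != 0;
}

/*
 * Returns 1 if the CPU reports IDA, 0 if it does not, and -1 if the
 * leaf cannot be queried (non-x86 build or leaf not supported).
 */
int demo_cpu_has_ida(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    if (!__get_cpuid(DEMO_CPUID_LEAF_POWER, &eax, &ebx, &ecx, &edx))
        return -1;
    return demo_ida_from_eax(eax);
#else
    return -1;
#endif
}

/* Usage: only bit 1 of EAX matters for this check. */
void demo_ida_usage(void)
{
    assert(demo_ida_from_eax(0x2u) == 1);
    assert(demo_ida_from_eax(0x5u) == 0);
}
```

If the BIOS has set IA32_MISC_ENABLES[38], the quoted manual text says this bit reads as 0, so the cpuid result already reflects the disabled state without any MSR access.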
Thomas Renninger [Thu, 21 Jul 2011 09:54:53 +0000 (11:54 +0200)]
cpupower: Show Intel turbo ratio support via ./cpupower frequency-info
This adds the last piece missing from turbostat (if called with -v).
It shows on Intel machines supporting Turbo Boost how many cores
have to be active/idle to enter which boost mode (frequency).
Whether the HW really enters these boost modes can be verified via
./cpupower monitor.
cpupowerutils - cpufrequtils extended with quite some features
CPU power consumption vs performance tuning is no longer
limited to CPU frequency switching: deep sleep states,
traditional dynamic frequency scaling and hidden turbo/boost
frequencies are closely tied together and depend on each other.
The first two exist on different architectures like PPC, Itanium and
ARM; the last (so far) only on X86. On X86 the APU (CPU+GPU) will
only run most efficiently if CPU and GPU have proper power management
in place.
Users and developers want to have *one* tool to get an overview of what
their system supports and to monitor and debug CPU power management
in detail. The tool should compile and work on as many architectures
as possible.
Once this tool stabilizes a bit, it is intended to replace the
Intel-specific tools in tools/power/x86.
CC [M] drivers/net/pcmcia/smc91c92_cs.o
drivers/net/pcmcia/smc91c92_cs.c: In function ‘smc91c92_probe’:
drivers/net/pcmcia/smc91c92_cs.c:812:12: warning: ‘j’ may be used uninitialized in this function
However, "j" is only used in a branch which has the same condition as
a previous branch, where j is set, e.g.
	int j;
	if (CONDITION)
		j = VALUE;
	...
	if (CONDITION)
		printk(j);
Still, avoid this warning, as it is easy to circumvent.
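The message does not say how the warning was silenced. One common way, shown here as a hedged sketch and not as the actual change to smc91c92_cs.c, is to give the variable an explicit initial value so that no path reads it unset. The function name and the values are hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in for the probe logic; names and values are illustrative. */
int demo_pick_index(int condition)
{
    int j = 0;   /* explicit initial value: no path can read j unset */

    if (condition)
        j = 7;

    /* ... unrelated work that does not change condition ... */

    if (condition)
        return j;   /* same condition as above, so j was assigned */
    return -1;
}

/* Usage: the result is identical to the uninitialized version on every path. */
void demo_pick_index_usage(void)
{
    assert(demo_pick_index(1) == 7);
    assert(demo_pick_index(0) == -1);
}
```

The initializer never influences the result here, because j is only read under the same condition that assigns it; it only removes the path the compiler cannot prove away.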
Pavel Roskin [Tue, 26 Jul 2011 22:52:41 +0000 (18:52 -0400)]
hostap_cs: support cards with "Version 01.02" as third product ID
Cards with numeric ID 0x0156:0x0002 and third ID "Version 01.02" can be
assumed to have Intersil firmware. Cards with Agere firmware use
"Version 01.01".
Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Pavel Roskin [Tue, 26 Jul 2011 22:52:35 +0000 (18:52 -0400)]
pcmcia: add PCMCIA_DEVICE_MANF_CARD_PROD_ID3
This is needed to match wireless cards with Intersil firmware that have
ID 0x0156:0x0002 and the third ID "Version 01.02". Such cards are
currently matched by orinoco_cs, which doesn't support WPA. They should
be matched by hostap_cs.
The first and the second product ID vary widely, so any particular
combination of IDs has few users. Of those, very few can submit a patch
for hostap_cs or write a useful bug report. It's still important to
support their hardware properly.
With PCMCIA_DEVICE_MANF_CARD_PROD_ID3, it should be possible to cover
the remaining Intersil based designs that kept the numeric ID and the
"version" of the reference design.
Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
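The matching rule described above (manufacturer ID, card ID and the third product ID string, with the first two strings ignored) can be illustrated with a small userspace sketch. The structures and the DEMO_MANF_CARD_PROD_ID3 macro below are hypothetical stand-ins, not the kernel's pcmcia_device_id or its macro, and the sketch compares the string directly while leaving out everything else the PCMCIA core does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of an inserted card. */
struct demo_card {
    unsigned short manf_id;
    unsigned short card_id;
    const char *prod_id[4];   /* product ID strings; index 2 is the third */
};

/* Hypothetical match rule: numeric IDs plus the third product ID string. */
struct demo_rule {
    unsigned short manf_id;
    unsigned short card_id;
    const char *prod_id3;
};

#define DEMO_MANF_CARD_PROD_ID3(manf, card, v3) { (manf), (card), (v3) }

const struct demo_rule demo_intersil_rule =
    DEMO_MANF_CARD_PROD_ID3(0x0156, 0x0002, "Version 01.02");

/* Two cards sharing the numeric ID; the vendor strings are made up. */
const struct demo_card demo_intersil_card =
    { 0x0156, 0x0002, { "Vendor A", "Card A", "Version 01.02", NULL } };
const struct demo_card demo_agere_card =
    { 0x0156, 0x0002, { "Vendor B", "Card B", "Version 01.01", NULL } };

/* Returns 1 when both numeric IDs and the third product ID string match. */
int demo_rule_matches(const struct demo_rule *rule, const struct demo_card *card)
{
    const char *id3 = card->prod_id[2];

    if (card->manf_id != rule->manf_id || card->card_id != rule->card_id)
        return 0;
    return id3 != NULL && strcmp(id3, rule->prod_id3) == 0;
}

/* Usage: the numeric IDs alone cannot tell the two cards apart. */
void demo_prod_id3_usage(void)
{
    assert(demo_rule_matches(&demo_intersil_rule, &demo_intersil_card) == 1);
    assert(demo_rule_matches(&demo_intersil_rule, &demo_agere_card) == 0);
}
```

The first and second product ID strings are deliberately not consulted, which is what lets one rule cover designs whose vendor strings vary widely.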
Stephen Rothwell [Fri, 29 Jul 2011 05:41:45 +0000 (15:41 +1000)]
[media] ir-mce_kbd-decoder: include module.h for its facilities
drivers/media/rc/ir-mce_kbd-decoder.c:446:16: error: expected declaration specifiers or '...' before string constant
drivers/media/rc/ir-mce_kbd-decoder.c:446:1: warning: data definition has no type or storage class
drivers/media/rc/ir-mce_kbd-decoder.c:446:1: warning: type defaults to 'int' in declaration of 'MODULE_LICENSE'
drivers/media/rc/ir-mce_kbd-decoder.c:446:16: warning: function declaration isn't a prototype
drivers/media/rc/ir-mce_kbd-decoder.c:447:15: error: expected declaration specifiers or '...' before string constant
drivers/media/rc/ir-mce_kbd-decoder.c:447:1: warning: data definition has no type or storage class
drivers/media/rc/ir-mce_kbd-decoder.c:447:1: warning: type defaults to 'int' in declaration of 'MODULE_AUTHOR'
drivers/media/rc/ir-mce_kbd-decoder.c:447:15: warning: function declaration isn't a prototype
drivers/media/rc/ir-mce_kbd-decoder.c:448:20: error: expected declaration specifiers or '...' before string constant
drivers/media/rc/ir-mce_kbd-decoder.c:448:1: warning: data definition has no type or storage class
drivers/media/rc/ir-mce_kbd-decoder.c:448:1: warning: type defaults to 'int' in declaration of 'MODULE_DESCRIPTION'
drivers/media/rc/ir-mce_kbd-decoder.c:448:20: warning: function declaration isn't a prototype
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Stephen Rothwell [Fri, 29 Jul 2011 05:30:00 +0000 (15:30 +1000)]
[media] ov5642: include module.h for its facilities
drivers/media/video/ov5642.c:985:1: warning: data definition has no type or storage class
drivers/media/video/ov5642.c:985:1: warning: type defaults to 'int' in declaration of 'MODULE_DEVICE_TABLE'
drivers/media/video/ov5642.c:985:1: warning: parameter names (without types) in function declaration
drivers/media/video/ov5642.c: In function 'ov5642_mod_init':
drivers/media/video/ov5642.c:998:9: error: 'THIS_MODULE' undeclared (first use in this function)
drivers/media/video/ov5642.c:998:9: note: each undeclared identifier is reported only once for each function it appears in
drivers/media/video/ov5642.c: At top level:
drivers/media/video/ov5642.c:1009:20: error: expected declaration specifiers or '...' before string constant
drivers/media/video/ov5642.c:1009:1: warning: data definition has no type or storage class
drivers/media/video/ov5642.c:1009:1: warning: type defaults to 'int' in declaration of 'MODULE_DESCRIPTION'
drivers/media/video/ov5642.c:1009:20: warning: function declaration isn't a prototype
drivers/media/video/ov5642.c:1010:15: error: expected declaration specifiers or '...' before string constant
drivers/media/video/ov5642.c:1010:1: warning: data definition has no type or storage class
drivers/media/video/ov5642.c:1010:1: warning: type defaults to 'int' in declaration of 'MODULE_AUTHOR'
drivers/media/video/ov5642.c:1010:15: warning: function declaration isn't a prototype
drivers/media/video/ov5642.c:1011:16: error: expected declaration specifiers or '...' before string constant
drivers/media/video/ov5642.c:1011:1: warning: data definition has no type or storage class
drivers/media/video/ov5642.c:1011:1: warning: type defaults to 'int' in declaration of 'MODULE_LICENSE'
drivers/media/video/ov5642.c:1011:16: warning: function declaration isn't a prototype
drivers/media/video/ov5642.c: In function 'ov5642_mod_init':
drivers/media/video/ov5642.c:999:1: warning: control reaches end of non-void function
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Kernel keyring keys containing eCryptfs authentication tokens should not
be write locked when calling out to ecryptfsd to wrap and unwrap file
encryption keys. The eCryptfs kernel code cannot hold the key's write
lock because ecryptfsd needs to request the key after receiving such a
request from the kernel.
Without this fix, all file opens and creates will time out and fail when
using the eCryptfs PKI infrastructure. This is not an issue when using
passphrase-based mount keys, which is the most widely deployed eCryptfs
configuration.