rcu: improve kerneldoc for rcu_read_lock(), call_rcu(), and synchronize_rcu()
Make it explicit that new RCU read-side critical sections that start
after call_rcu() and synchronize_rcu() start might still be running
after the end of the relevant grace period.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Wed, 30 Jun 2010 18:43:52 +0000 (11:43 -0700)]
rcu: add boot parameter to suppress RCU CPU stall warning messages
Although the RCU CPU stall warning messages are a very good way to alert
people to a problem, once alerted, it is sometimes helpful to shut them
off in order to avoid obscuring other messages that might be being used
to track down the problem. Although you can rebuild the kernel with
CONFIG_RCU_CPU_STALL_DETECTOR=n, this is sometimes inconvenient. This
commit therefore adds a boot parameter named "rcu_cpu_stall_suppress"
that shuts these messages off without requiring a rebuild (though a
reboot might be needed for those not brave enough to patch their kernel
while it is running).
This message-suppression was already in place for the panic case, so this
commit need only rename the variable and export it via module_param().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Also set the default to 60 seconds, up from the previous hard-coded timeout
of 10 seconds. This allows people who care to set short timeouts, while
avoiding people with unusual configurations (make randconfig!!!) from being
bothered with spurious CPU stall warnings.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Tetsuo Handa [Fri, 25 Jun 2010 16:08:19 +0000 (01:08 +0900)]
Add RCU check for find_task_by_vpid().
find_task_by_vpid() says "Must be called under rcu_read_lock().". But due to
commit 3120438 "rcu: Disable lockdep checking in RCU list-traversal primitives",
we are currently unable to catch "find_task_by_vpid() with tasklist_lock held
but RCU lock not held" errors due to the RCU-lockdep checks being
suppressed in the RCU variants of the struct list_head traversals.
This commit therefore places an explicit check for being in an RCU
read-side critical section in find_task_by_pid_ns().
Lai Jiangshan [Mon, 28 Jun 2010 08:25:04 +0000 (16:25 +0800)]
rcu: simplify the usage of percpu data
&percpu_data is compatible with allocated percpu data.
And we use it and remove the "->rda[NR_CPUS]" array, saving significant
storage on systems with large numbers of CPUs. This does add an additional
level of indirection and thus an additional cache line referenced, but
because ->rda is not used on the read side, this is OK.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Reviewed-by: Tejun Heo <tj@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Lai Jiangshan [Mon, 21 Jun 2010 08:57:42 +0000 (16:57 +0800)]
rcutorture: add random preemption
Add random preemption to help we to torture the preemptable rcu.
srcu_read_delay() also calls rcu_read_delay() for shorter delays.
Added comment to preempt_schedule() call indicating that no quiescent
states happen if preemption is disabled.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Wed, 16 Jun 2010 23:48:13 +0000 (16:48 -0700)]
rcu: add shiny new debug assists to Documentation/RCU/checklist.txt
Add a section describing PROVE_RCU, DEBUG_OBJECTS_RCU_HEAD, and
the __rcu sparse checking to the RCU checklist.
Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Arnd Bergmann [Thu, 25 Feb 2010 15:55:13 +0000 (16:55 +0100)]
rculist: avoid __rcu annotations
This avoids warnings from missing __rcu annotations
in the rculist implementation, making it possible to
use the same lists in both RCU and non-RCU cases.
We can add rculist annotations later, together with
lockdep support for rculist, which is missing as well,
but that may involve changing all the users.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Wed, 28 Apr 2010 21:39:09 +0000 (14:39 -0700)]
rcu: define __rcu address space modifier for sparse
This commit provides definitions for the __rcu annotation defined earlier.
This annotation permits sparse to check for correct use of RCU-protected
pointers. If a pointer that is annotated with __rcu is accessed
directly (as opposed to via rcu_dereference(), rcu_assign_pointer(),
or one of their variants), sparse can be made to complain. To enable
such complaints, use the new default-disabled CONFIG_SPARSE_RCU_POINTER
kernel configuration option. Please note that these sparse complaints are
intended to be a debugging aid, -not- a code-style-enforcement mechanism.
There are special rcu_dereference_protected() and rcu_access_pointer()
accessors for use when RCU read-side protection is not required, for
example, when no other CPU has access to the data structure in question
or while the current CPU hold the update-side lock.
This patch also updates a number of docbook comments that were showing
their age.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Christopher Li <sparse@chrisli.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Paul E. McKenney [Tue, 15 Jun 2010 00:06:21 +0000 (17:06 -0700)]
net: convert to rcu_dereference_index_check()
The task_cls_classid() function applies rcu_dereference() to integers,
which does not work with the shiny new sparse-based checking in
rcu_dereference(). This commit therefore moves to the new RCU API
rcu_dereference_index_check().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Linus Torvalds [Wed, 18 Aug 2010 22:45:23 +0000 (15:45 -0700)]
Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: Fix an Oops in the NFSv4 atomic open code
NFS: Fix the selection of security flavours in Kconfig
NFS: fix the return value of nfs_file_fsync()
rpcrdma: Fix SQ size calculation when memreg is FRMR
xprtrdma: Do not truncate iova_start values in frmr registrations.
nfs: Remove redundant NULL check upon kfree()
nfs: Add "lookupcache" to displayed mount options
NFS: allow close-to-open cache semantics to apply to root of NFS filesystem
SUNRPC: fix NFS client over TCP hangs due to packet loss (Bug 16494)
Linus Torvalds [Wed, 18 Aug 2010 22:29:38 +0000 (15:29 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
USB HID: Add ID for eGalax Multitouch used in JooJoo tablet
HID: hiddev: fix memory corruption due to invalid intfdata
HID: hiddev: protect against disconnect/NULL-dereference race
HID: picolcd: correct ordering of framebuffer freeing
HID: picolcd: testing the wrong variable
Missed the declaration of sys_execve in the ia64 asm/unistd.h (perhaps
because there is no reason for it to be there ... it might be a left over
from the COMPAT code?). Just delete the conflicting version.
Linus Torvalds [Wed, 18 Aug 2010 16:35:08 +0000 (09:35 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
fs: brlock vfsmount_lock
fs: scale files_lock
lglock: introduce special lglock and brlock spin locks
tty: fix fu_list abuse
fs: cleanup files_lock locking
fs: remove extra lookup in __lookup_hash
fs: fs_struct rwlock to spinlock
apparmor: use task path helpers
fs: dentry allocation consolidation
fs: fix do_lookup false negative
mbcache: Limit the maximum number of cache entries
hostfs ->follow_link() braino
hostfs: dumb (and usually harmless) tpyo - strncpy instead of strlcpy
remove SWRITE* I/O types
kill BH_Ordered flag
vfs: update ctime when changing the file's permission by setfacl
cramfs: only unlock new inodes
fix reiserfs_evict_inode end_writeback second call
Linus Torvalds [Wed, 18 Aug 2010 16:32:13 +0000 (09:32 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Fix build on POSIX shells
latencytop: Fix kconfig dependency warnings
perf annotate tui: Fix exit and RIGHT keys handling
tracing: Sanitize value returned from write(trace_marker, "...", len)
tracing/events: Convert format output to seq_file
tracing: Extend recordmcount to better support Blackfin mcount
tracing: Fix ring_buffer_read_page reading out of page boundary
tracing: Fix an unallocated memory access in function_graph
Linus Torvalds [Wed, 18 Aug 2010 16:27:10 +0000 (09:27 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68knommu: include sched.h in ColdFire/SPI driver
m68knommu: formatting of pointers in printk()
m68knommu: arch/m68k/include/asm/ide.h fix for nommu
Linus Torvalds [Wed, 18 Aug 2010 16:26:42 +0000 (09:26 -0700)]
Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
md raid-1/10 Fix bio_rw bit manipulations again
md: provide appropriate return value for spare_active functions.
md: Notify sysfs when RAID1/5/10 disk is In_sync.
Update recovery_offset even when external metadata is used.
Linus Torvalds [Wed, 18 Aug 2010 16:26:17 +0000 (09:26 -0700)]
Merge branch 'merge-devicetree' of git://git.secretlab.ca/git/linux-2.6
* 'merge-devicetree' of git://git.secretlab.ca/git/linux-2.6:
spi.h: missing kernel-doc notation, please fix
of: fix missing headers for of_address_to_resource() in MTD and SysACE drivers
of: Fix missing includes
ata: update for of_device to platform_device replacement
microblaze: Fix of: eliminate of_device->node and dev_archdata->{of,prom}_node
microblaze: Fix of/address: Merge all of the bus translation code
booting-without-of: Remove nonexistent chapters from TOC, fix numbering
Jaroslav Kysela [Wed, 18 Aug 2010 12:08:17 +0000 (14:08 +0200)]
ALSA: emu10k1 - delay the PCM interrupts (add pcm_irq_delay parameter)
With some hardware combinations, the PCM interrupts are acknowledged
before the period boundary from the emu10k1 chip. The midlevel PCM code
gets confused and the playback stream is interrupted.
It seems that the interrupt processing shift by 2 samples is enough
to fix this issue. This default value does not harm other,
non-affected hardware.
More information: Kernel bugzilla bug#16300
[A copmile warning fixed by tiwai]
Signed-off-by: Jaroslav Kysela <perex@perex.cz> Cc: <stable@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Nick Piggin [Tue, 17 Aug 2010 18:37:39 +0000 (04:37 +1000)]
fs: brlock vfsmount_lock
fs: brlock vfsmount_lock
Use a brlock for the vfsmount lock. It must be taken for write whenever
modifying the mount hash or associated fields, and may be taken for read when
performing mount hash lookups.
A new lock is added for the mnt-id allocator, so it doesn't need to take
the heavy vfsmount write-lock.
The number of atomics should remain the same for fastpath rlock cases, though
code would be slightly slower due to per-cpu access. Scalability is not not be
much improved in common cases yet, due to other locks (ie. dcache_lock) getting
in the way. However path lookups crossing mountpoints should be one case where
scalability is improved (currently requiring the global lock).
The slowpath is slower due to use of brlock. On a 64 core, 64 socket, 32 node
Altix system (high latency to remote nodes), a simple umount microbenchmark
(mount --bind mnt mnt2 ; umount mnt2 loop 1000 times), before this patch it
took 6.8s, afterwards took 7.1s, about 5% slower.
Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Nick Piggin [Tue, 17 Aug 2010 18:37:38 +0000 (04:37 +1000)]
fs: scale files_lock
fs: scale files_lock
Improve scalability of files_lock by adding per-cpu, per-sb files lists,
protected with an lglock. The lglock provides fast access to the per-cpu lists
to add and remove files. It also provides a snapshot of all the per-cpu lists
(although this is very slow).
One difficulty with this approach is that a file can be removed from the list
by another CPU. We must track which per-cpu list the file is on with a new
variale in the file struct (packed into a hole on 64-bit archs). Scalability
could suffer if files are frequently removed from different cpu's list.
However loads with frequent removal of files imply short interval between
adding and removing the files, and the scheduler attempts to avoid moving
processes too far away. Also, even in the case of cross-CPU removal, the
hardware has much more opportunity to parallelise cacheline transfers with N
cachelines than with 1.
A worst-case test of 1 CPU allocating files subsequently being freed by N CPUs
degenerates to contending on a single lock, which is no worse than before. When
more than one CPU are allocating files, even if they are always freed by
different CPUs, there will be more parallelism than the single-lock case.
Testing results:
On a 2 socket, 8 core opteron, I measure the number of times the lock is taken
to remove the file, the number of times it is removed by the same CPU that
added it, and the number of times it is removed by the same node that added it.
So a file is removed from the same CPU it was added by over 90% of the time.
It remains within the same node 95% of the time.
Tim Chen ran some numbers for a 64 thread Nehalem system performing a compile.
throughput
2.6.34-rc2 24.5
+patch 24.9
us sys idle IO wait (in %)
2.6.34-rc2 51.25 28.25 17.25 3.25
+patch 53.75 18.5 19 8.75
So significantly less CPU time spent in kernel code, higher idle time and
slightly higher throughput.
Single threaded performance difference was within the noise of microbenchmarks.
That is not to say penalty does not exist, the code is larger and more memory
accesses required so it will be slightly slower.
Cc: linux-kernel@vger.kernel.org Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Nick Piggin [Tue, 17 Aug 2010 18:37:37 +0000 (04:37 +1000)]
lglock: introduce special lglock and brlock spin locks
lglock: introduce special lglock and brlock spin locks
This patch introduces "local-global" locks (lglocks). These can be used to:
- Provide fast exclusive access to per-CPU data, with exclusive access to
another CPU's data allowed but possibly subject to contention, and to provide
very slow exclusive access to all per-CPU data.
- Or to provide very fast and scalable read serialisation, and to provide
very slow exclusive serialisation of data (not necessarily per-CPU data).
Brlocks are also implemented as a short-hand notation for the latter use
case.
Thanks to Paul for local/global naming convention.
Cc: linux-kernel@vger.kernel.org Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Nick Piggin [Tue, 17 Aug 2010 18:37:36 +0000 (04:37 +1000)]
tty: fix fu_list abuse
tty: fix fu_list abuse
tty code abuses fu_list, which causes a bug in remount,ro handling.
If a tty device node is opened on a filesystem, then the last link to the inode
removed, the filesystem will be allowed to be remounted readonly. This is
because fs_may_remount_ro does not find the 0 link tty inode on the file sb
list (because the tty code incorrectly removed it to use for its own purpose).
This can result in a filesystem with errors after it is marked "clean".
Taking idea from Christoph's initial patch, allocate a tty private struct
at file->private_data and put our required list fields in there, linking
file and tty. This makes tty nodes behave the same way as other device nodes
and avoid meddling with the vfs, and avoids this bug.
The error handling is not trivial in the tty code, so for this bugfix, I take
the simple approach of using __GFP_NOFAIL and don't worry about memory errors.
This is not a problem because our allocator doesn't fail small allocs as a rule
anyway. So proper error handling is left as an exercise for tty hackers.
[ Arguably filesystem's device inode would ideally be divorced from the
driver's pseudo inode when it is opened, but in practice it's not clear whether
that will ever be worth implementing. ]
Cc: linux-kernel@vger.kernel.org Cc: Christoph Hellwig <hch@infradead.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Nick Piggin [Tue, 17 Aug 2010 18:37:34 +0000 (04:37 +1000)]
fs: remove extra lookup in __lookup_hash
fs: remove extra lookup in __lookup_hash
Optimize lookup for create operations, where no dentry should often be
common-case. In cases where it is not, such as unlink, the added overhead
is much smaller than the removed.
Also, move comments about __d_lookup racyness to the __d_lookup call site.
d_lookup is intuitive; __d_lookup is what needs commenting. So in that same
vein, add kerneldoc comments to __d_lookup and clean up some of the comments:
- We are interested in how the RCU lookup works here, particularly with
renames. Make that explicit, and point to the document where it is explained
in more detail.
- RCU is pretty standard now, and macros make implementations pretty mindless.
If we want to know about RCU barrier details, we look in RCU code.
- Delete some boring legacy comments because we don't care much about how the
code used to work, more about the interesting parts of how it works now. So
comments about lazy LRU may be interesting, but would better be done in the
LRU or refcount management code.
Signed-off-by: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Nick Piggin [Tue, 17 Aug 2010 18:37:33 +0000 (04:37 +1000)]
fs: fs_struct rwlock to spinlock
fs: fs_struct rwlock to spinlock
struct fs_struct.lock is an rwlock with the read-side used to protect root and
pwd members while taking references to them. Taking a reference to a path
typically requires just 2 atomic ops, so the critical section is very small.
Parallel read-side operations would have cacheline contention on the lock, the
dentry, and the vfsmount cachelines, so the rwlock is unlikely to ever give a
real parallelism increase.
Replace it with a spinlock to avoid one or two atomic operations in typical
path lookup fastpath.
Signed-off-by: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Nick Piggin [Tue, 17 Aug 2010 18:37:30 +0000 (04:37 +1000)]
fs: fix do_lookup false negative
fs: fix do_lookup false negative
In do_lookup, if we initially find no dentry, we take the directory i_mutex and
re-check the lookup. If we find a dentry there, then we revalidate it if
needed. However if that revalidate asks for the dentry to be invalidated, we
return -ENOENT from do_lookup. What should happen instead is an attempt to
allocate and lookup a new dentry.
This is probably not noticed because it is rare. It is only reached if a
concurrent create races in first (in which case, the dentry probably won't be
invalidated anyway), or if the racy __d_lookup has failed due to a
false-negative (which is very rare).
Fix this by removing code and have it use the normal reval path.
Signed-off-by: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
mbcache: Limit the maximum number of cache entries
Limit the maximum number of mb_cache entries depending on the number of
hash buckets: if the only limit to the number of cache entries is the
available memory the hash chains can grow very long, taking a long time
to search.
At least partially solves https://bugzilla.lustre.org/show_bug.cgi?id=22771.
Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
NeilBrown [Wed, 18 Aug 2010 06:16:05 +0000 (16:16 +1000)]
md raid-1/10 Fix bio_rw bit manipulations again
commit 7b6d91daee5cac6402186ff224c3af39d79f4a0e changed the behaviour
of a few variables in raid1 and raid10 from flags to bit-sets, but
left them as type 'bool' so they did not work.
These flags aren't real I/O types, but tell ll_rw_block to always
lock the buffer instead of giving up on a failed trylock.
Instead add a new write_dirty_buffer helper that implements this semantic
and use it from the existing SWRITE* callers. Note that the ll_rw_block
code had a bug where it didn't promote WRITE_SYNC_PLUG properly, which
this patch fixes.
In the ufs code clean up the helper that used to call ll_rw_block
to mirror sync_dirty_buffer, which is the function it implements for
compound buffers.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Using the coldfire qspi driver, I get the following error:
drivers/spi/coldfire_qspi.c: In function 'mcfqspi_irq_handler':
drivers/spi/coldfire_qspi.c:166: error: 'TASK_NORMAL' undeclared (first use in this function)
drivers/spi/coldfire_qspi.c:166: error: (Each undeclared identifier is reported only once
It is solved by adding the following include to coldfire_sqpi.c:
#include <linux/sched.h>
Fix suggested by Jate Sujjavanich <jsujjavanich@syntech-fuelmaster.com>
m68knommu: arch/m68k/include/asm/ide.h fix for nommu
The arch/m68k/include/asm/ide.h produces errors when the IDE driver is compiled for my 523x uClinux system under kernel. The header makes some redefines of operators not defined in the arch/m68k/include/asm/io_no.h header. There are no separate mmio and iospace defines.
NeilBrown [Wed, 18 Aug 2010 01:56:59 +0000 (11:56 +1000)]
md: provide appropriate return value for spare_active functions.
md_check_recovery expects ->spare_active to return 'true' if any
spares were activated, but none of them do, so the consequent change
in 'degraded' is not notified through sysfs.
So count the number of spares activated, subtract it from 'degraded'
just once, and return it.
Reported-by: Adrian Drzewiecki <adriand@vmware.com> Signed-off-by: NeilBrown <neilb@suse.de>
When RAID1 is done syncing disks, it'll update the state
of synced rdevs to In_sync. But it neglected to notify
sysfs that the attribute changed. So any programs that
are waiting for an rdev's state to change will not be
woken.
(raid5/raid10 added by neilb)
Signed-off-by: Adrian Drzewiecki <adriand@vmware.com> Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 16 Aug 2010 08:09:31 +0000 (18:09 +1000)]
Update recovery_offset even when external metadata is used.
The update of ->recovery_offset in sync_sbs is appropriate even then external
metadata is in use. However sync_sbs is only called when native
metadata is used.
So move that update in to the top of md_update_sb (which is the only
caller of sync_sbs) before the test on ->external.
This moves the update out of ->write_lock protection, but those fields
only need ->reconfig_mutex protection which they still have.
Also move the test on ->persistent up to where ->external is set as
for metadata update purposes they are the same.
Clear MD_CHANGE_DEVS and MD_CHANGE_CLEAN as they can only be confusing
if ->external is set or ->persistent isn't.
Finally move the update of ->utime down as it is only relevent (like
the ->events update) for native metadata.
Linus Torvalds [Wed, 18 Aug 2010 01:36:19 +0000 (18:36 -0700)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
vt,console,kdb: preserve console_blanked while in kdb
vt: fix regression warnings from KMS merge
arm,kgdb: fix GDB_MAX_REGS no longer used
kgdb: add missing __percpu markup in arch/x86/kernel/kgdb.c
kdb: fix compile error without CONFIG_KALLSYMS
Linus Torvalds [Wed, 18 Aug 2010 01:36:01 +0000 (18:36 -0700)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86:
platform/x86: move rfkill for Dell Mini 1012 to compal-laptop
thinkpad-acpi: Add KEY_CAMERA (Fn-F6) for Lenovo keyboards
thinkpad-acpi: add support for model-specific keymaps
thinkpad-acpi: lock down size of hotkey keymap
thinkpad-acpi: untangle ACPI/vendor backlight selection
thinkpad-acpi: find ACPI video device by synthetic HID
intel_ips: potential null dereference
drivers/platform/x86: Adjust confusing if indentation
x86: intel_ips: do not use PCI resources before pci_enable_device()
Linus Torvalds [Wed, 18 Aug 2010 01:35:39 +0000 (18:35 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: fix false warning saying one of two super blocks is broken
nilfs2: fix list corruption after ifile creation failure
Hugh Dickins [Tue, 17 Aug 2010 22:23:56 +0000 (15:23 -0700)]
shmem: put_super must percpu_counter_destroy
list_add() corruption messages reported from shmem_fill_super()'s recently
introduced percpu_counter_init(): shmem_put_super() needs to remember to
percpu_counter_destroy(). And also check error from percpu_counter_init().
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc64: Fix atomic64_t routine return values.
sparc64: Fix rwsem constant bug leading to hangs.
sparc: Hook up new fanotify and prlimit64 syscalls.
sparc: Really fix "console=" for serial consoles.
Linus Torvalds [Wed, 18 Aug 2010 01:11:49 +0000 (18:11 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
VIDEO: amba clcd: don't disable an already disabled clock
ARM: Tighten check for allowable CPSR values
ARM: 6329/1: wire up sys_accept4() on ARM
ARM: 6328/1: Build with -fno-dwarf2-cfi-asm
ARM: 6326/1: kgdb: fix GDB_MAX_REGS no longer used
id = fork();
if (id == 0) { sleep(1); exit(0); }
kill(id, SIGSTOP);
alarm(1);
waitid(P_PID, id, &infop, WCONTINUED);
return 0;
}
to call waitid() on a stopped process results in access to the child task's
credentials without the RCU read lock being held - which may be replaced in the
meantime - eliciting the following warning:
This is fixed by holding the RCU read lock in wait_task_continued() to ensure
that the task's current credentials aren't destroyed between us reading the
cred pointer and us reading the UID from those credentials.
Furthermore, protect wait_task_stopped() in the same way.
We don't need to keep holding the RCU read lock once we've read the UID from
the credentials as holding the RCU read lock doesn't stop the target task from
changing its creds under us - so the credentials may be outdated immediately
after we've read the pointer, lock or no lock.
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Tue, 17 Aug 2010 22:52:56 +0000 (23:52 +0100)]
Make do_execve() take a const filename pointer
Make do_execve() take a const filename pointer so that kernel_execve() compiles
correctly on ARM:
arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type
This also requires the argv and envp arguments to be consted twice, once for
the pointer array and once for the strings the array points to. This is
because do_execve() passes a pointer to the filename (now const) to
copy_strings_kernel(). A simpler alternative would be to cast the filename
pointer in do_execve() when it's passed to copy_strings_kernel().
do_execve() may not change any of the strings it is passed as part of the argv
or envp lists as they are some of them in .rodata, so marking these strings as
const should be fine.
Further kernel_execve() and sys_execve() need to be changed to match.
This has been test built on x86_64, frv, arm and mips.
Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jarek Poplawski [Wed, 11 Aug 2010 02:02:10 +0000 (02:02 +0000)]
net: Fix a memmove bug in dev_gro_receive()
>Xin Xiaohui wrote:
> I looked into the code dev_gro_receive(), found the code here:
> if the frags[0] is pulled to 0, then the page will be released,
> and memmove() frags left.
> Is that right? I'm not sure if memmove do right or not, but
> frags[0].size is never set after memove at least. what I think
> a simple way is not to do anything if we found frags[0].size == 0.
> The patch is as followed.
...
This version of the patch fixes the bug directly in memmove.
Reported-by: "Xin, Xiaohui" <xiaohui.xin@intel.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
bottom half more than necessary), lockdep can raise a warning
because we attempt to lock a spinlock with BH enabled, while
the same lock is usually locked by another cpu in a softirq context.
Disable again BH to avoid these lockdep warnings.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Diagnosed-by: David S. Miller <davem@davemloft.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: J. Bruce Fields <bfields@fieldses.org> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Russell King [Tue, 17 Aug 2010 21:13:22 +0000 (22:13 +0100)]
VIDEO: amba clcd: don't disable an already disabled clock
Fix the clock enable/disable tracking in the AMBA CLCD driver so
that the driver doesn't try to disable an already disabled clock,
thereby causing the clock (if shared) to become unbalanced.
This resolves a problem with CLCD on LPC32xx ARM platforms.
Reported-by: Kevin Wells <wellsk40@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Graeme Smecher [Tue, 17 Aug 2010 17:13:44 +0000 (10:13 -0700)]
of: fix missing headers for of_address_to_resource() in MTD and SysACE drivers
The drivers for Xilinx' SystemACE and physically mapped MTDs were missing
prototypes for of_address_to_resource(). This patch adds the necessary
headers.
Signed-off-by: Graeme Smecher <graeme.smecher@mail.mcgill.ca> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
POSIX sh does not specify the brace expansion, so fix it by replacing the
global $(shell ...) lines quite at the top creating the output directories with
real rules.
Cc: Ingo Molnar <mingo@elte.hu> Cc: Kusanagi Kouichi <slash@ac.auone-net.jp> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1282046280.5822.4.camel@thorin> Signed-off-by: Bernd Petrovitsch <bernd@sysprog.at> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Randy Dunlap [Thu, 12 Aug 2010 19:31:21 +0000 (12:31 -0700)]
latencytop: Fix kconfig dependency warnings
warning: (LATENCYTOP && HAVE_LATENCYTOP_SUPPORT) selects
SCHED_DEBUG which has unmet direct dependencies (DEBUG_KERNEL &&
PROC_FS) warning: (LATENCYTOP && HAVE_LATENCYTOP_SUPPORT) selects
SCHEDSTATS which has unmet direct dependencies (DEBUG_KERNEL && PROC_FS)
Add depends on STACKTRACE_SUPPORT for 'select STACKTRACE'.
Add depends on PROC_FS since that is where the output goes.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Arjan van de Ven <arjan@linux.intel.com>
LKML-Reference: <20100812123121.a7c99cde.randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Grant Likely [Tue, 17 Aug 2010 05:44:49 +0000 (23:44 -0600)]
of: Fix missing includes
This patch fixes missing includes from a number of .c files because
the code (wrongfully) depended on prom.h including them. The include
of linux/of_address.h was removed in microblaze prom.h in commit
"of/address: Clean up function declarations" (sha1 id 22ae782f8), but
not fixed in some callers. This patch fixes them up.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Tested-by: Michal Simek <monstr@monstr.eu>
Jiri Slaby [Wed, 11 Aug 2010 09:28:02 +0000 (11:28 +0200)]
AppArmor: fix task_setrlimit prototype
After rlimits tree was merged we get the following errors:
security/apparmor/lsm.c:663:2: warning: initialization from incompatible pointer type
It is because AppArmor was merged in the meantime, but uses the old
prototype. So fix it by adding struct task_struct as a first parameter
of apparmor_task_setrlimit.
NOTE that this is ONLY a compilation warning fix (and crashes caused
by that). It needs proper handling in AppArmor depending on who is the
'task'.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: James Morris <jmorris@namei.org>
Jason Wessel [Mon, 16 Aug 2010 20:58:31 +0000 (15:58 -0500)]
vt,console,kdb: preserve console_blanked while in kdb
Commit b45cfba4e9005d64d419718e7ff7f7cab44c1994 (vt,console,kdb:
implement atomic console enter/leave functions) introduced the ability
to atomically change the console mode with kernel mode setting but did
not preserve the state of the console_blanked variable.
The console_blanked variable must be restored when executing the
con_debug_leave() or further kernel mode set changes (such as using
chvt X) will fail to correctly set the state of console.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> CC: Andrew Morton <akpm@linux-foundation.org>
Jason Wessel [Mon, 16 Aug 2010 20:58:30 +0000 (15:58 -0500)]
vt: fix regression warnings from KMS merge
Fix the following new sparse warnings in vt.c introduced by the commit b45cfba4e9005d64d419718e7ff7f7cab44c1994 (vt,console,kdb: implement
atomic console enter/leave functions):
drivers/char/vt.c:197:5: warning: symbol 'saved_fg_console' was not declared. Should it be static?
drivers/char/vt.c:198:5: warning: symbol 'saved_last_console' was not declared. Should it be static?
drivers/char/vt.c:199:5: warning: symbol 'saved_want_console' was not declared. Should it be static?
drivers/char/vt.c:200:5: warning: symbol 'saved_vc_mode' was not declared. Should it be static?
Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> CC: Andrew Morton <akpm@linux-foundation.org>
Jason Wessel [Mon, 16 Aug 2010 20:58:29 +0000 (15:58 -0500)]
kdb: fix compile error without CONFIG_KALLSYMS
If CONFIG_KGDB_KDB is set and CONFIG_KALLSYMS is not set the kernel
will fail to build with the error:
kernel/built-in.o: In function `kallsyms_symbol_next':
kernel/debug/kdb/kdb_support.c:237: undefined reference to `kdb_walk_kallsyms'
kernel/built-in.o: In function `kallsyms_symbol_complete':
kernel/debug/kdb/kdb_support.c:193: undefined reference to `kdb_walk_kallsyms'
The kdb_walk_kallsyms needs a #ifdef proper header to match the C
implementation. This patch also fixes the compiler warnings in
kdb_support.c when compiling without CONFIG_KALLSYMS set. The
compiler warnings are a result of the kallsyms_lookup() macro not
initializing the two of the pass by reference variables.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Reported-by: Michal Simek <monstr@monstr.eu>