Akinobu Mita [Wed, 5 Oct 2011 00:42:35 +0000 (11:42 +1100)]
ext4: use proper little-endian bitops
ext4_{set,clear}_bit() is defined as __test_and_{set,clear}_bit_le() for
ext4. Only two ext4_{set,clear}_bit() calls check the return value. The
rest of calls ignore the return value and they can be replaced with
__{set,clear}_bit_le().
This changes ext4_{set,clear}_bit() from __test_and_{set,clear}_bit_le()
to __{set,clear}_bit_le() and introduces ext4_test_and_{set,clear}_bit()
for the two places where old bit needs to be returned.
This ext4_{set,clear}_bit() change is considered safe, because if someone
uses these macros without noticing the change, new ext4_{set,clear}_bit
don't have return value and causes compiler errors where the return value
is used.
This also removes unused ext4_find_first_zero_bit().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Christine Chan [Wed, 5 Oct 2011 00:42:35 +0000 (11:42 +1100)]
kernel/timer.c: use debugobjects to catch deletion of uninitialized timers
del_timer_sync() calls debug_object_assert_init() to assert that a timer
has been initialized before calling lock_timer_base(). lock_timer_base()
would spin forever on a NULL(uninit-ed) base. The check is added to
del_timer() to prevent silent failure, even though it would not get stuck
in an infinite loop.
Signed-off-by: Christine Chan <cschan@codeaurora.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In file included from drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_param.c:22:
drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h:24:1: warning: "pr_fmt" redefined
In file included from include/linux/kernel.h:20,
from include/linux/cache.h:4,
from include/linux/time.h:7,
from include/linux/stat.h:60,
from include/linux/module.h:10,
from drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_param.c:21:
include/linux/printk.h:152:1: warning: this is the location of the previous definition
Cc: Tomoya <tomoya-linux@dsn.okisemi.com> Cc: Toshiharu Okada <toshiharu-linux@dsn.okisemi.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Amerigo Wang <amwang@redhat.com>
ERROR: Macros with complex values should be enclosed in parenthesis
#87: FILE: include/linux/ipc_namespace.h:126:
+#define DFLT_MSGSIZEMAX 1024*1024
ERROR: Macros with complex values should be enclosed in parenthesis
#88: FILE: include/linux/ipc_namespace.h:127:
+#define HARD_MSGSIZEMAX 16*1024*1024
total: 2 errors, 0 warnings, 75 lines checked
./patches/ipc-mqueue-update-maximums-for-the-mqueue-subsystem.patch has style problems, please review.
If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
Please run checkpatch prior to sending patches
Cc: Doug Ledford <dledford@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Doug Ledford [Wed, 5 Oct 2011 00:42:33 +0000 (11:42 +1100)]
ipc/mqueue: update maximums for the mqueue subsystem
Commit b231cca4381ee ("message queues: increase range limits") changed the
maximum size of a message in a message queue from INT_MAX to 8192*128.
Unfortunately, we had customers that relied on a size much larger than
8192*128 on their production systems. After reviewing POSIX, we found
that it is silent on the maximum message size. We did find a couple other
areas in which it was not silent. Fix up the mqueue maximums so that the
customer's system can continue to work, and document both the POSIX and
real world requirements in ipc_namespace.h so that we don't have this
issue crop back up.
Also, commit 9cf18e1dd74c ("ipc: HARD_MSGMAX should be higher not lower on
64bit") fiddled with HARD_MSGMAX without realizing that the number was
intentionally in place to limit the msg queue depth to one that was small
enough to kmalloc an array of pointers (hence why we divided 128k by
sizeof(long)). If we wish to meet POSIX requirements, we have no choice
but to change our allocation to a vmalloc instead (at least for the large
queue size case). With that, it's possible to increase our allowed
maximum to the POSIX requirements (or more if we choose).
Signed-off-by: Doug Ledford <dledford@redhat.com> Cc: Amerigo Wang <amwang@redhat.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: Joe Korty <joe.korty@ccur.com> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Doug Ledford [Wed, 5 Oct 2011 00:42:32 +0000 (11:42 +1100)]
ipc/mqueue: enforce hard limits
In two places we don't enforce the hard limits for CAP_SYS_RESOURCE apps.
In preparation for making more reasonable hard limits, start enforcing
them even on CAP_SYS_RESOURCE.
Signed-off-by: Doug Ledford <dledford@redhat.com> Cc: Amerigo Wang <amwang@redhat.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: Joe Korty <joe.korty@ccur.com> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Doug Ledford [Wed, 5 Oct 2011 00:42:32 +0000 (11:42 +1100)]
ipc/mqueue: switch back to using non-max values on create
Commit b231cca4381ee15e ("message queues: increase range limits") changed
how we create a queue that does not include an attr struct passed to open
so that it creates the queue with whatever the maximum values are.
However, if the admin has set the maximums to allow flexibility in
creating a queue (aka, both a large size and large queue are allowed, but
combined they create a queue too large for the RLIMIT_MSGQUEUE of the
user), then attempts to create a queue without an attr struct will fail.
Switch back to using acceptable defaults regardless of what the maximums
are.
Signed-off-by: Doug Ledford <dledford@redhat.com> Cc: Amerigo Wang <amwang@redhat.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: Joe Korty <joe.korty@ccur.com> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Doug Ledford [Wed, 5 Oct 2011 00:42:31 +0000 (11:42 +1100)]
ipc/mqueue: cleanup definition names and locations
We had a customer come up with a problem while trying to upgrade from our
2.6.18 kernel to our 2.6.32 kernel. In diagnosing their problem, it was
determined that when commit b231cca4 ("message queues: increase range
limits") changed the msg size max from INT_MAX to 8192*128, that's what
broke their setup.
While fixing this problem, testing showed that if you increase the max
values of a msg queue, then attempt to create one without an attr struct
passed in to the open call, it could fail because it sets the queue size
to the max of both the msg size and queue size. If these are large
enough, they over run the default RLIMIT_MSGQUEUE. This change was also
introduced in the b231cca4 ("message queues: increase range limits")
commit.
We then found that the msg queue limits were not all being enforced on
CAP_SYS_RESOURCE apps.
Finally, we found that commit 9cf18e1d ("ipc: HARD_MSGMAX should be higher
not lower on 64bit") fiddled with HARD_MSGMAX without realizing that the
reason it was set to what it was, was to avoid trying to kmalloc a chunk
larger than 128K.
So this series of patches cleans up the various defines, takes us back to
having a larger HARD_MSGSIZEMAX, goes back to using a separate define for
the case where a user doesn't pass in an attr struct in case the maxes
have been raised too large for RLIMIT_MSGQUEUE, enforces the maximums on
CAP_SYS_RESOURCE apps, uses vmalloc instead of kmalloc when the msg
pointer array is too large, and documents all of this so it shouldn't
happen again.
This patch:
The various defines for minimums and maximums of the sysctl controllable
mqueue values are scattered amongst different files and named
inconsistently. Move them all into ipc_namespace.h and make them have
consistent names. Additionally, make the number of queues per namespace
also have a minimum and maximum and use the same sysctl function as the
other two settable variables.
Signed-off-by: Doug Ledford <dledford@redhat.com> Cc: Amerigo Wang <amwang@redhat.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: Joe Korty <joe.korty@ccur.com> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andi Kleen [Wed, 5 Oct 2011 00:42:31 +0000 (11:42 +1100)]
brlocks/lglocks: clean up code
lglocks and brlocks are currently generated with some complicated macros
in lglock.h. But there's no reason I can see to not just use common
utility functions that get pointers to the lglock.
Since there are at least two users it makes sense to share this code in a
library.
This will also make it later possible to dynamically allocate lglocks.
In general the users now look more like normal function calls with
pointers, not magic macros.
The patch is rather large because I move over all users in one go to keep
it bisectable. This impacts the VFS somewhat in terms of lines changed.
But no actual behaviour change.
Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Nick Piggin <npiggin@kernel.dk> Signed-off-by: Andrew Morton <akpm@google.com>
Shaohui Xie [Wed, 5 Oct 2011 00:42:29 +0000 (11:42 +1100)]
drivers/edac/mpc85xx_edac.c: fix memory controller compatible for edac
compatible in dts has been changed, so the driver needs to be updated
accordingly.
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jesper Juhl [Wed, 5 Oct 2011 00:42:29 +0000 (11:42 +1100)]
audit: always follow va_copy() with va_end()
A call to va_copy() should always be followed by a call to va_end() in the
same function. In kernel/autit.c::audit_log_vformat() this is not always
done. This patch makes sure va_end() is always called.
Signed-off-by: Jesper Juhl <jj@chaosbits.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Mathias Krause [Wed, 5 Oct 2011 00:42:28 +0000 (11:42 +1100)]
arm, exec: remove redundant set_fs(USER_DS)
The address limit is already set in flush_old_exec() so this
set_fs(USER_DS) is redundant.
Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Shaohua Li [Wed, 5 Oct 2011 00:42:28 +0000 (11:42 +1100)]
x86: tlb flush avoid superflous leave_mm()
If just one page VA tlb is required to be flushed and current task is in
lazy TLB state, doing leave_mm() is superfluous because it flushes the
whole TLB. This can reduce some TLB miss.
Signed-off-by: Shaohua Li <shaohua.li@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
arch/x86/kernel/e820.c: quiet sparse noise about plain integer as NULL pointer
The last parameter to sort() is a pointer to the function used to swap
items. This parameter should be NULL, not 0, when not used. This quiets
the following sparse warning:
warning: Using plain integer as NULL pointer
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
William Douglas [Wed, 5 Oct 2011 00:42:26 +0000 (11:42 +1100)]
x86,mrst: add mapping for bma023
There is now an upstream bma023 driver so instead of submitting ours we
use that one. The defaults are just fine so it's a simple mapping entry.
(Thanks go to Erik Andersson for incorporating the changes we needed into his
version)
Signed-off-by: William Douglas <william.douglas@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrew Morton [Wed, 5 Oct 2011 00:42:26 +0000 (11:42 +1100)]
drivers/power/intel_mid_battery.c: fix build
Seems that nobody's even trying any more.
Cc: Nithish Mahalingam <nithish.mahalingam@intel.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Anton Vorontsov <cbouatmailru@gmail.com> Cc: Major Lee <major_lee@wistron.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Feng Tang [Wed, 5 Oct 2011 00:42:25 +0000 (11:42 +1100)]
vrtc: change its year offset from 1960 to 1972
Real world year equals the value in vrtc YEAR register plus an offset. We
used 1960 for original developepment as the offset to make leap year
consistent, but for a device's first use, its YEAR register is 0 and the
system year will be parsed as 1960 which is not a valid UNIX time and will
cause many applications to fail mysteriously. Devices use 1972 instead to
fix this issue.
Updated patch which adds a sanity check suggested by Mathias
Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ludwig Nussel [Wed, 5 Oct 2011 00:42:24 +0000 (11:42 +1100)]
x86: fix mmap random address range
On x86_32 casting the unsigned int result of get_random_int() to long may
result in a negative value. On x86_32 the range of mmap_rnd() therefore
was -255 to 255. The 32bit mode on x86_64 used 0 to 255 as intended.
The bug was introduced by 675a081 ("x86: unify mmap_{32|64}.c") in January
2008.
Signed-off-by: Ludwig Nussel <ludwig.nussel@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Harvey Harrison <harvey.harrison@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Shérab [Wed, 5 Oct 2011 00:42:23 +0000 (11:42 +1100)]
arch/x86/platform/iris/iris.c: register a platform device and a platform driver
This makes the iris driver use the platform API, so it is properly exposed
in /sys.
[akpm@linux-foundation.org: remove commented-out code, add missing space to printk, clean up code layout] Signed-off-by: Shérab <Sebastien.Hinderer@ens-lyon.org> Cc: Len Brown <lenb@kernel.org> Cc: Matthew Garrett <mjg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The problem is that in copy_page_range() we turn lazy mode on, and then in
swap_entry_free() we call swap_count_continued() which ends up in:
map = kmap_atomic(page, KM_USER0) + offset;
and then later we touch *map.
Since we are running in batched mode (lazy) we don't actually set up the
PTE mappings and the kmap_atomic is not done synchronously and ends up
trying to dereference a page that has not been set.
Looking at kmap_atomic_prot_pfn(), it uses 'arch_flush_lazy_mmu_mode' and
doing the same in kmap_atomic_prot() and __kunmap_atomic() makes the problem
go away.
Interestingly, commit b8bcfe997e4615 ("x86/paravirt: remove lazy mode in
interrupts") removed part of this to fix an interrupt issue - but it went
to far and did not consider this scenario.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@google.com>
Andy Whitcroft [Wed, 5 Oct 2011 00:42:22 +0000 (11:42 +1100)]
readlinkat: ensure we return ENOENT for the empty pathname for normal lookups
Since the commit below which added O_PATH support to the *at() calls, the
error return for readlink/readlinkat for the empty pathname has switched
from ENOENT to EINVAL:
readlinkat(), fchownat() and fstatat() with empty relative pathnames
This is both unexpected for userspace and makes readlink/readlinkat
inconsistant with all other interfaces; and inconsistant with our stated
return for these pathnames.
As the readlinkat call does not have a flags parameter we cannot use the
AT_EMPTY_PATH approach used in the other calls. Therefore expose whether
the original path is infact entry via a new user_path_at_empty() path
lookup function. Use this to determine whether to default to EINVAL or
ENOENT for failures.
BugLink: http://bugs.launchpad.net/bugs/817187 Signed-off-by: Andy Whitcroft <apw@canonical.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Bligh [Wed, 5 Oct 2011 00:42:21 +0000 (11:42 +1100)]
net/netfilter/nf_conntrack_netlink.c: fix Oops on container destroy
Problem:
A repeatable Oops can be caused if a container with networking
unshared is destroyed when it has nf_conntrack entries yet to expire.
A copy of the oops follows below. A perl program generating the oops
repeatably is attached inline below.
Analysis:
The oops is called from cleanup_net when the namespace is
destroyed. conntrack iterates through outstanding events and calls
death_by_timeout on each of them, which in turn produces a call to
ctnetlink_conntrack_event. This calls nf_netlink_has_listeners, which
oopses because net->nfnl is NULL.
The perl program generates the container through fork() then
clone(NS_NEWNET). I does not explicitly set up netlink
explicitly set up netlink, but I presume it was set up else net->nfnl
would have been NULL earlier (i.e. when an earlier connection
timed out). This would thus suggest that net->nfnl is made NULL
during the destruction of the container, which I think is done by
nfnetlink_net_exit_batch.
I can see that the various subsystems are deinitialised in the opposite
order to which the relevant register_pernet_subsys calls are called,
and both nf_conntrack and nfnetlink_net_ops register their relevant
subsystems. If nfnetlink_net_ops registered later than nfconntrack,
then its exit routine would have been called first, which would cause
the oops described. I am not sure there is anything to prevent this
happening in a container environment.
Whilst there's perhaps a more complex problem revolving around ordering
of subsystem deinit, it seems to me that missing a netlink event on a
container that is dying is not a disaster. An early check for net->nfnl
being non-NULL in ctnetlink_conntrack_event appears to fix this. There
may remain a potential race condition if it becomes NULL immediately
after being checked (I am not sure any lock is held at this point or
how synchronisation for subsystem deinitialization works).
Patch:
The patch attached should apply on everything from 2.6.26 (if not before)
onwards; it appears to be a problem on all kernels. This was taken against
Ubuntu-3.0.0-11.17 which is very close to 3.0.4. I have torture-tested it
with the above perl script for 15 minutes or so; the perl script hung the
machine within 20 seconds without this patch.
Applicability:
If this is the right solution, it should be applied to all stable kernels
as well as head. Apart from the minor overhead of checking one variable
against NULL, it can never 'do the wrong thing', because if net->nfnl
is NULL, an oops will inevitably result. Therefore, checking is a reasonable
thing to do unless it can be proven than net->nfnl will never be NULL.
Check net->nfnl for NULL in ctnetlink_conntrack_event to avoid Oops on
container destroy
Signed-off-by: Alex Bligh <alex@alex.org.uk> Cc: Patrick McHardy <kaber@trash.net> Cc: David Miller <davem@davemloft.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrew Morton [Wed, 5 Oct 2011 00:42:20 +0000 (11:42 +1100)]
drivers/net/ethernet/i825xx/3c505.c: fix build with dynamic debug
The `#define filename' screws up the expansion of
DEFINE_DYNAMIC_DEBUG_METADATA:
drivers/net/ethernet/i825xx/3c505.c: In function 'send_pcb':
drivers/net/ethernet/i825xx/3c505.c:390: error: expected identifier before string constant
drivers/net/ethernet/i825xx/3c505.c:390: error: expected '}' before '.' token
drivers/net/ethernet/i825xx/3c505.c:436: error: expected identifier before string constant
drivers/net/ethernet/i825xx/3c505.c:435: error: expected '}' before '.' token
drivers/net/ethernet/i825xx/3c505.c: In function 'start_receive':
drivers/net/ethernet/i825xx/3c505.c:557: error: expected identifier before string constant
drivers/net/ethernet/i825xx/3c505.c:557: error: expected '}' before '.' token
drivers/net/ethernet/i825xx/3c505.c: In function 'receive_packet':
drivers/net/ethernet/i825xx/3c505.c:629: error: expected identifier before string constant
etc
Cc: Philip Blundell <philb@gnu.org> Cc: David Miller <davem@davemloft.net> Cc: Jason Baron <jbaron@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In file included from arch/x86/kernel/pci-dma.c:3:
include/linux/dmar.h:248: warning: 'struct acpi_dmar_header' declared inside parameter list
include/linux/dmar.h:248: warning: its scope is only this definition or declaration, which is probably not what you want