Ed Cashin [Fri, 7 Sep 2012 00:25:12 +0000 (10:25 +1000)]
aoe: associate frames with the AoE storage target
In the driver code, "target" and aoetgt refer to a particular remote
interface on the AoE storage target. The latter is identified by its AoE
major and minor addresses. Commands that are being sent to an AoE storage
target {major, minor} can be sent or retransmitted to any of the remote
MAC addresses associated with the AoE storage target.
That is, frames are naturally associated not with an aoetgt (AoE major,
AoE minor, remote MAC address) but with an aoedev (AoE major, AoE minor).
Making the code reflect that reality simplifies the driver, especially
when the path to a remote MAC address becomes unusable.
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ed Cashin [Fri, 7 Sep 2012 00:25:11 +0000 (10:25 +1000)]
aoe: disallow unsupported AoE minor addresses
A guard is inserted to prevent AoE minor addresses (slot addresses) higher
than 15 from being used, as they are not yet supported by the driver.
There is a change coming that will allow the aoe driver to overcome this
limit by using system device minor numbers dynamically, but until then,
this guard prevents unexpected targets from being used by the driver when
AoE targets with high minor numbers are on the AoE network.
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ed Cashin [Fri, 7 Sep 2012 00:25:11 +0000 (10:25 +1000)]
aoe: do revalidation steps in order
The discovery process begins with an optional AoE config query command and
an AoE config query response. Normally when an aoe device is already
open, the config query response does not trigger an ATA identify device
command to be sent out, since the response contains storage capacity
information that, if changed, could surprise the user of the device.
The userland "aoe-revalidate" tool uses a character device to trigger an
AoE config query for a particular AoE storage target and an ATA device
identify command, even when the device is open.
This change causes the config query to go out first, reflecting the normal
discovery sequence. The responses could come back in any order, so this
change is fairly cosmetic.
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ed Cashin [Fri, 7 Sep 2012 00:25:11 +0000 (10:25 +1000)]
aoe: failover remote interface based on aoe_deadsecs parameter
The aoe_deadsecs module parameter allows the user to specify a hard limit
on the number of seconds an AoE command can be retransmitted before the
AoE block device is considered to have failed.
Using aoe_deadsecs to determine the time we try using a different remote
interface helps to ensure that the hard limit is not reached before we've
tried to recover by sending to a different remote port.
As a data storage target, the AoE target is unambiguously identified by
its {major, minor} AoE address tuple, and an AoE target can have multiple
MAC addresses. However, note that "target" in the driver code and
comments means a {major, minor, MAC address} tuple, as in "somewhere to
send packets".
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ed Cashin [Fri, 7 Sep 2012 00:25:10 +0000 (10:25 +1000)]
aoe: use packets that work with the smallest-MTU local interface
Users with several network interfaces dedicated to AoE generally do not
configure them to support different-sized AoE data payloads on purpose.
For a given AoE target, there will be a set of local network interfaces
that can reach it. Using only the payload that will fit in the
smallest-sized MTU of all those local interfaces greatly simplifies the
driver, especially in failure scenarios.
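As a rough illustration of the smallest-MTU rule described above, the
following userspace sketch picks the smallest MTU from a set of local
interfaces and derives how many 512-byte sectors fit in one frame. The
function name payload_sectors and the hdr_bytes parameter are hypothetical;
the real per-frame header overhead is whatever the driver subtracts, and is
not taken from this patch.

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SIZE 512

/*
 * Return how many whole 512-byte sectors fit in one frame on the
 * smallest-MTU interface. hdr_bytes is the per-frame header overhead,
 * supplied by the caller as an assumption of this sketch.
 */
static int payload_sectors(const int *mtus, size_t n, int hdr_bytes)
{
	int min_mtu;
	size_t i;

	assert(mtus != NULL && n > 0);

	/* Find the smallest MTU among the interfaces that reach the target. */
	min_mtu = mtus[0];
	for (i = 1; i < n; i++)
		if (mtus[i] < min_mtu)
			min_mtu = mtus[i];

	if (min_mtu <= hdr_bytes)
		return 0;
	return (min_mtu - hdr_bytes) / SECTOR_SIZE;
}
```

Because every frame is sized for the most constrained interface, a frame
can be retransmitted through any local interface without being rebuilt.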
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ed Cashin [Fri, 7 Sep 2012 00:25:10 +0000 (10:25 +1000)]
aoe: use a kernel thread for transmissions
The dev_queue_xmit function needs to have interrupts enabled, so the
simplest way to get the locking right but still fulfill that requirement is
to use a process that can call dev_queue_xmit serially over queued
transmissions.
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ed Cashin [Fri, 7 Sep 2012 00:25:10 +0000 (10:25 +1000)]
aoe: become I/O request queue handler for increased user control
To allow users to choose an elevator algorithm for their particular
workloads, change from a make_request-style driver to an
I/O-request-queue-handler-style driver.
We have to do a couple of things that might be surprising. We manipulate
the page _count directly on the assumption that we still have no guarantee
that users of the block layer are prohibited from submitting bios
containing pages with zero reference counts.[1] If such a prohibition now
exists, I can get rid of the _count manipulation.
Just as before this patch, we still keep track of the sk_buffs that the
network layer has not yet finished with, and cap the resources we use with
a "pool" of skbs.[2]
Now that the block layer maintains the disk stats, the aoe driver's
diskstats function can go away.
Ed Cashin [Fri, 7 Sep 2012 00:25:10 +0000 (10:25 +1000)]
aoe: kernel thread handles I/O completions for simple locking
Make the frames the aoe driver uses to track the relationship between bios
and packets more flexible and detached, so that they can be passed to an
"aoe_ktio" thread for completion of I/O.
The frames are handled much like skbs, with a capped amount of
preallocation so that real-world use cases are likely to run smoothly and
degenerate gracefully even under memory pressure.
Decoupling I/O completion from the receive path and serializing it in a
process makes it easier to think about the correctness of the locking in
the driver, especially in the case of a remote MAC address becoming
unusable.
Signed-off-by: Ed Cashin <ecashin@coraid.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ed Cashin [Fri, 7 Sep 2012 00:25:09 +0000 (10:25 +1000)]
aoe: for performance support larger packet payloads
Add the ability to work with large packets composed of a number of
segments, using the scatter gather feature of the block layer (biovecs)
and the network layer (skb frag array). The motivation is the performance
gained by using a packet data payload greater than a page size and by
using the network card's scatter gather feature.
Users of the out-of-tree aoe driver already had these changes, but since
early 2011, they have complained of increased memory utilization and
higher CPU utilization during heavy writes.[1] The commit below appears
related, as it disables scatter gather on non-IP protocols inside the
harmonize_features function, even when the NIC supports sg.
net offloading: Generalize netif_get_vlan_features().
With that regression in place, transmits always linearize sg AoE packets,
but in-kernel users did not have this patch. Before 2.6.38, though, these
changes were working to allow sg to increase performance.
Paul Clements [Fri, 7 Sep 2012 00:25:09 +0000 (10:25 +1000)]
nbd: handle discard requests
Add discard support to nbd. If the nbd-server supports discard, it will
send NBD_FLAG_SEND_TRIM to the client. The client will then set the flag
in the kernel via NBD_SET_FLAGS, which tells the kernel to enable discards
for the device (QUEUE_FLAG_DISCARD).
If discard support is enabled, then when the nbd client system receives a
discard request, this will be passed along to the nbd-server. When the
discard request is received by the nbd-server, it will perform:
fallocate(.. FALLOC_FL_PUNCH_HOLE ..)
to punch a hole in the backend storage, releasing the range that is no
longer needed.
Signed-off-by: Paul Clements <paul.clements@steeleye.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Paul Clements [Fri, 7 Sep 2012 00:25:09 +0000 (10:25 +1000)]
nbd: add set flags ioctl
Add a set-flags ioctl, allowing various option flags to be set on an nbd
device. This allows the nbd-client to set the device flags (to enable
read-only mode, or enable discard support, etc.).
Flags are typically specified by the nbd-server. During the negotiation
phase of the nbd connection, the server sends its flags to the client.
The client then uses NBD_SET_FLAGS to inform the kernel of the options.
Also included is a one-line fix to debug output for the set-timeout ioctl.
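A minimal sketch of how the flags passed via NBD_SET_FLAGS might be
interpreted is shown below. The flag values are local copies that are
assumed to mirror <linux/nbd.h>, and struct nbd_opts and nbd_apply_flags
are hypothetical names for this illustration, not the driver's code.

```c
#include <stdbool.h>

/* Local copies for illustration; assumed to mirror <linux/nbd.h>. */
#define NBD_FLAG_READ_ONLY	(1u << 1)
#define NBD_FLAG_SEND_TRIM	(1u << 5)

/* Hypothetical per-device options derived from the server's flags. */
struct nbd_opts {
	bool read_only;	/* device should be opened read-only */
	bool discard;	/* discard requests are passed to the server */
};

static struct nbd_opts nbd_apply_flags(unsigned int flags)
{
	struct nbd_opts o;

	o.read_only = (flags & NBD_FLAG_READ_ONLY) != 0;
	o.discard = (flags & NBD_FLAG_SEND_TRIM) != 0;
	return o;
}
```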
Signed-off-by: Paul Clements <paul.clements@steeleye.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ipc/sem.c uses a custom wakeup scheme that relies on preempt_disable().
On -RT, this causes increased latencies and debug warnings.
The patch adds two additional schemes:
- one built around a completion - could be better for -RT kernels
- one built around a spinlock - unfortunately it's broken
- and the current one
My preferred solution would be the spinlock implementation: RT would use
preemptible spinlocks, mainline normal spinlocks. Thus both get the
optimal implementation without any special code in ipc/sem.c.
Unfortunately, I don't see how it could be fixed.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
yan [Fri, 7 Sep 2012 00:25:06 +0000 (10:25 +1000)]
proc: no need to initialize proc_inode->fd in proc_get_inode()
proc_get_inode() obtains the inode via a call to iget_locked().
iget_locked() calls alloc_inode() which will call proc_alloc_inode() which
clears proc_inode.fd, so there is no need to clear this field in
proc_get_inode().
If iget_locked() instead found the inode via find_inode_fast(), that inode
will not have I_NEW set so this change has no effect.
Signed-off-by: yan <clouds.yan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
yan [Fri, 7 Sep 2012 00:25:06 +0000 (10:25 +1000)]
proc: return -ENOMEM when inode allocation failed
If proc_get_inode() returns NULL then presumably it encountered memory
exhaustion. proc_lookup_de() should return -ENOMEM in this case, not
-EINVAL.
Signed-off-by: yan <clouds.yan@gmail.com> Cc: Ryan Mallon <rmallon@gmail.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Kelly [Fri, 7 Sep 2012 00:25:06 +0000 (10:25 +1000)]
coredump: update coredump-related headers
Create a new header file, fs/coredump.h, which contains functions only
used by the new coredump.c. It also moves do_coredump to the
include/linux/coredump.h header file, for consistency.
Signed-off-by: Alex Kelly <alex.page.kelly@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Kelly [Fri, 7 Sep 2012 00:25:05 +0000 (10:25 +1000)]
fs: amended coredump-related sysctl functions
This fixes an error introduced by the coredump-header patch in the
coredump removal series I submitted earlier. It should be squashed into
that patch series so that the Kconfig option to remove coredump doesn't
cause compile-time errors.
Signed-off-by: Alex Kelly <alex.page.kelly@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Kelly [Fri, 7 Sep 2012 00:25:05 +0000 (10:25 +1000)]
coredump: make core dump functionality optional
Adds an expert Kconfig option, CONFIG_COREDUMP, which allows disabling of
core dump. This saves approximately 2.6k in the compiled kernel, and
complements CONFIG_ELF_CORE, which now depends on it.
CONFIG_COREDUMP also disables coredump-related sysctls, except for
suid_dumpable and related functions, which are necessary for ptrace.
Signed-off-by: Alex Kelly <alex.page.kelly@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
device_cgroup: convert device_cgroup internally to policy + exceptions
The original model of device_cgroup is a whitelist in which all the
allowed devices are listed. The problem with this approach is that it is
impossible to allow everything but a few devices.
The reason for that lies in the way the whitelist is handled internally:
since there's only a whitelist, the "all devices" entry would have to be
removed and replaced by the entire list of possible devices but the ones
that are being denied. Since dev_t is 32 bits long, representing the allowed
devices as a bitfield is not memory efficient.
This patch replaces the "whitelist" with an "exceptions" list, and the default
policy is kept in a "deny_all" variable in the dev_cgroup structure.
The current interface determines that whenever "a" is written to devices.allow
or devices.deny, the entry masking all devices will be added or removed,
respectively. This behavior is kept and it's what will determine the default
policy:
# cat devices.list
a *:* rwm
# echo a >devices.deny
# cat devices.list
# echo a >devices.allow
# cat devices.list
a *:* rwm
The interface is also preserved. For example, if one wants to block only access
to /dev/null:
# ls -l /dev/null
crw-rw-rw- 1 root root 1, 3 Jul 24 16:17 /dev/null
# echo a >devices.allow
# echo "c 1:3 rwm" >devices.deny
# cat /dev/null
cat: /dev/null: Operation not permitted
# echo >/dev/null
bash: /dev/null: Operation not permitted
# mknod /tmp/null c 1 3
mknod: `/tmp/null': Operation not permitted
# echo "c 1:3 r" >devices.allow
# cat /dev/null
# echo >/dev/null
bash: /dev/null: Operation not permitted
# mknod /tmp/null c 1 3
mknod: `/tmp/null': Operation not permitted
# echo "c 1:3 rw" >devices.allow
# echo >/dev/null
# cat /dev/null
# mknod /tmp/null c 1 3
mknod: `/tmp/null': Operation not permitted
# echo "c 1:3 rwm" >devices.allow
# echo >/dev/null
# cat /dev/null
# mknod /tmp/null c 1 3
#
Note that I didn't rename the functions/variables in this patch, but in the
next one to make reviewing easier.
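The policy-plus-exceptions model can be sketched in userspace as follows.
This is an illustration only: the struct layouts, the may_access helper
and the ACC_* bit values are assumptions made for this example and are not
taken from the patch. With a default of "allow", an exception lists what
is denied; with a default of "deny", an exception lists what is allowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Access bits; the values are chosen for this sketch. */
#define ACC_MKNOD 1
#define ACC_READ  2
#define ACC_WRITE 4

struct dev_exception {
	char type;			/* 'c' or 'b' */
	unsigned int major, minor;
	unsigned int access;		/* ACC_* bits covered by the exception */
};

struct dev_policy {
	bool deny_all;			/* default policy */
	const struct dev_exception *ex;	/* exceptions to that default */
	size_t nex;
};

static bool may_access(const struct dev_policy *p, char type,
		       unsigned int major, unsigned int minor,
		       unsigned int want)
{
	unsigned int covered = 0;
	size_t i;

	assert(p != NULL);

	/* Collect the access bits that exceptions list for this device. */
	for (i = 0; i < p->nex; i++) {
		const struct dev_exception *e = &p->ex[i];

		if (e->type == type && e->major == major && e->minor == minor)
			covered |= e->access;
	}

	if (p->deny_all)
		return (covered & want) == want; /* every bit must be excepted */
	return (covered & want) == 0;		 /* no bit may be excepted */
}
```

In the /dev/null session above, the state after writing "c 1:3 r" to
devices.allow corresponds to a default of "allow" with one exception
covering write and mknod on c 1:3, so reads succeed while writes and mknod
fail.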
Signed-off-by: Aristeu Rozanski <aris@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: James Morris <jmorris@namei.org> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This does the following:
1: Splits the arguments of a function call to stop it
from exceeding 80 characters
2: Re-indents the arguments of another function call
to prevent the splitting of a quoted string.
Maintain an index of directory inodes by starting cluster, so that
fat_get_parent() can return the proper cached inode rather than inventing
one that cannot be traced back to the filesystem root.
Add a new msdos/vfat binary mount option "nfs" so that FAT filesystems
that are _not_ exported via NFS are not saddled with maintenance of an
index they will never use.
Finally, simplify NFS file handle generation and lookups. An
ext2-congruent implementation is adequate for FAT needs.
Signed-off-by: Steven J. Magnani <steve@digidescorp.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Under memory pressure, the system may evict dentries from cache. When the
FAT driver receives a NFS request involving an evicted dentry, it is
unable to reconnect it to the filesystem root. This causes the request to
fail, often with ENOENT.
This is partially due to ineffectiveness of the current FAT NFS
implementation, and partially due to an unimplemented fh_to_parent method.
The latter can cause file accesses to fail on shares exported with
subtree_check.
This patch set provides the FAT driver with the ability to
reconnect dentries. NFS file handle generation and lookups are simplified
and made congruent with ext2.
Testing has involved a memory-starved virtual machine running 3.5-rc5 that
exports a ~2 GB vfat filesystem containing a kernel tree (~770 MB, ~40000
files, 9 levels). Both 'cp -r' and 'ls -lR' operations were performed
from a client, some overlapping, some consecutive. Exports with
'subtree_check' and 'no_subtree_check' have been tested.
Note that while this patch set improves FAT's NFS support, it does not
eliminate ESTALE errors completely.
The following should be considered for NFS clients who are sensitive to ESTALE:
* Mounting with lookupcache=none
Unfortunately this can degrade performance severely, particularly for deep
filesystems.
* Incorporating VFS patches to retry ESTALE failures on the client-side,
such as https://lkml.org/lkml/2012/6/29/381
* Handling ESTALE errors in client application code
This patch:
Move NFS-related code into its own C file. No functional changes.
Signed-off-by: Steven J. Magnani <steve@digidescorp.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Julia Lawall [Fri, 7 Sep 2012 00:24:58 +0000 (10:24 +1000)]
drivers/rtc/rtc-coh901331.c: use clk_prepare_enable() and clk_disable_unprepare()
clk_prepare_enable and clk_disable_unprepare combine clk_prepare and
clk_enable, and clk_disable and clk_unprepare. They make the code more
concise, and ensure that clk_unprepare is called when clk_enable fails.
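The error-path behaviour described above can be illustrated with a
userspace mock. struct fake_clk and the fake_clk_* functions are stand-ins
invented for this sketch; they only mimic the prepare/enable ordering and
are not the kernel clk API itself.

```c
#include <errno.h>

/* Stand-in for struct clk; the counters exist only for this sketch. */
struct fake_clk {
	int prepared;
	int enabled;
	int fail_enable;	/* force the enable step to fail */
};

static int fake_clk_prepare(struct fake_clk *c) { c->prepared++; return 0; }
static void fake_clk_unprepare(struct fake_clk *c) { c->prepared--; }
static void fake_clk_disable(struct fake_clk *c) { c->enabled--; }

static int fake_clk_enable(struct fake_clk *c)
{
	if (c->fail_enable)
		return -EIO;
	c->enabled++;
	return 0;
}

/* Prepare, then enable; undo the prepare if the enable step fails. */
static int fake_clk_prepare_enable(struct fake_clk *c)
{
	int ret = fake_clk_prepare(c);

	if (ret)
		return ret;
	ret = fake_clk_enable(c);
	if (ret)
		fake_clk_unprepare(c);
	return ret;
}

/* Disable, then unprepare: the reverse order of the setup. */
static void fake_clk_disable_unprepare(struct fake_clk *c)
{
	fake_clk_disable(c);
	fake_clk_unprepare(c);
}
```

A caller that open-codes the two steps has to remember the unprepare on
the failure path; the combined helper makes that path impossible to
forget.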
A simplified version of the semantic patch that introduces calls to these
functions is as follows: (http://coccinelle.lip6.fr/)
Add an RTC driver for the RTC device on Ricoh MFD Rc5t583. Ricoh RTC has
3 types of alarms. The current patch adds support for the Y-Alarm of
RC5t583 RTC.
Signed-off-by: Venu Byravarasu <vbyravarasu@nvidia.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
There are several comparisons of an unsigned int to less than zero in the
spear RTC driver. Such a check will always be false. In all these cases a
signed int is assigned to the unsigned variable before it is checked.
So the right fix is to make the checked variable signed as well. In one
case the check can be dropped completely, because all it does is return
'err' if 'err' is less than zero, and 0 otherwise. Since in this
particular case 'err' is always either 0 or less, this is the same as just
returning 'err'.
The issue has been found using the following coccinelle semantic patch:
//<smpl>
@@
type T;
unsigned T i;
@@
(
*i < 0
|
*i >= 0
)
//</smpl>
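The bug class matched by the semantic patch above can be reproduced in a
few lines of plain C. fake_op, check_unsigned and check_signed are
hypothetical names for this sketch; fake_op stands in for any call that
returns 0 or a negative errno value.

```c
#include <errno.h>

/* Hypothetical helper standing in for a call that returns 0 or -errno. */
static int fake_op(int fail)
{
	return fail ? -EIO : 0;
}

/* Buggy pattern: the negative error code is stored in an unsigned variable. */
static int check_unsigned(int fail)
{
	unsigned int err = fake_op(fail);

	if (err < 0)		/* never true for an unsigned value */
		return -1;
	return 0;
}

/* Fixed pattern: keep the result in a signed variable. */
static int check_signed(int fail)
{
	int err = fake_op(fail);

	if (err < 0)
		return -1;
	return 0;
}
```

With the unsigned variable the failure is silently reported as success;
making the variable signed restores the intended error path.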
The irq field of the jz4740_irc struct is unsigned. Yet we assign the
result of platform_get_irq() to it. platform_get_irq() may return a
negative error code and the code checks for this condition by checking if
'irq' is less than zero. But since 'irq' is unsigned this test will
always be false. Fix it by making 'irq' signed.
The issue was found using the following coccinelle semantic patch:
//<smpl>
@@
type T;
unsigned T i;
@@
(
*i < 0
|
*i >= 0
)
//</smpl>
Stephen Warren [Fri, 7 Sep 2012 00:24:57 +0000 (10:24 +1000)]
rtc: add MAX8907 RTC driver
The MAX8907 is an I2C-based power-management IC containing voltage
regulators, a reset controller, a real-time clock, and a touch-screen
controller.
The driver is based on an original written or fixed by:
* Tom Cherry
* Prashant Gaikwad
* Joseph Yoon
During upstreaming, I (swarren):
* Converted to regmap.
* Fixed handling of RTC_HOUR register containing 12.
* Fixed handling of RTC_WEEKDAY register.
* General cleanup.
Signed-off-by: Stephen Warren <swarren@nvidia.com> Cc: Tom Cherry <tcherry@nvidia.com> Cc: Prashant Gaikwad <pgaikwad@nvidia.com> Cc: Joseph Yoon <tyoon@nvidia.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
rtc: tps65910: add RTC driver for TPS65910 PMIC RTC
The TPS65910 PMIC is an MFD with an RTC as one of its devices. Add an RTC
driver to support the RTC device present inside the TPS65910 PMIC.
Only support for RTC alarm is implemented as part of this patch.
Signed-off-by: Venu Byravarasu <vbyravarasu@nvidia.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Vincent Palatin [Fri, 7 Sep 2012 00:24:56 +0000 (10:24 +1000)]
rtc: recycle id when unloading a rtc driver
When calling rtc_device_unregister, we are not freeing the id used by the
driver. So when doing an unload/load cycle for an RTC driver (e.g. rmmod
rtc_cmos && modprobe rtc_cmos), its id is incremented by one. As a
consequence, we no longer have either an rtc0 driver or a
/proc/driver/rtc (as it only exists for the first driver).
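The effect of releasing the id on unregister can be sketched with a tiny
lowest-free-id allocator. This is a userspace illustration with
hypothetical names (id_get, id_put, used_ids); the kernel uses its own id
allocation helpers rather than this bitmap.

```c
#include <assert.h>

#define MAX_IDS 32

static unsigned int used_ids;	/* bit n set means id n is taken */

/* Hand out the lowest free id, or -1 if all are in use. */
static int id_get(void)
{
	int id;

	for (id = 0; id < MAX_IDS; id++) {
		if (!(used_ids & (1u << id))) {
			used_ids |= 1u << id;
			return id;
		}
	}
	return -1;
}

/* Release an id so that the next registration can reuse it. */
static void id_put(int id)
{
	assert(id >= 0 && id < MAX_IDS);
	used_ids &= ~(1u << id);
}
```

If the unregister path skips the id_put step, a reload gets id 1 instead
of id 0, which is the symptom described in this commit message.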
Signed-off-by: Vincent Palatin <vpalatin@chromium.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
rtc: snvs: change timeout to use a fixed number of loops
Andrew Morton <akpm@linux-foundation.org> wrote:
> The timeout code here is fragile. If acquiring the spinlock takes more
> than a millisecond or if this thread gets interrupted or preempted then
> we could easily execute that loop just a single time, and fail.
>
> It would be better to retry a fixed number of times, say 1000? That
> would take around 1 millisecond, but might be overkill.
Take Andrew's suggestion to change the timeout code to retry 1000
times.
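A bounded retry loop of the kind suggested above might look like the
following userspace sketch. wait_bounded, countdown_ready and MAX_TRIES are
hypothetical names; the real driver polls a hardware register rather than
a callback.

```c
#include <errno.h>

#define MAX_TRIES 1000

/*
 * Call ready() up to MAX_TRIES times. The bound is a count of attempts,
 * so preemption or a slow lock acquisition cannot use up the budget
 * the way a wall-clock deadline could.
 */
static int wait_bounded(int (*ready)(void *), void *ctx)
{
	int tries;

	for (tries = 0; tries < MAX_TRIES; tries++)
		if (ready(ctx))
			return 0;
	return -ETIMEDOUT;
}

/* Example callback: reports ready once the counter reaches zero. */
static int countdown_ready(void *ctx)
{
	int *left = ctx;

	if (*left == 0)
		return 1;
	(*left)--;
	return 0;
}
```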
Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Kim Phillips <kim.phillips@freescale.com> Cc: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kim, Milo [Fri, 7 Sep 2012 00:24:54 +0000 (10:24 +1000)]
rtc-proc: permit the /proc/driver/rtc device to use other devices
To get time information via /proc/driver/rtc, only the first device (rtc0)
is used. If rtcN (e.g. rtc1 or rtc2) is used for the system clock,
there is no way to get information about rtcN via /proc/driver/rtc. With
this patch, the time data can be retrieved from the system clock RTC.
If the RTC_HCTOSYS_DEVICE is not defined, then rtc0 is used by default.
Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Ben Gardner [Fri, 7 Sep 2012 00:24:54 +0000 (10:24 +1000)]
drivers/rtc/rtc-isl1208.c: add support for the ISL1218
The ISL1218 chip is identical to the ISL1208, except that it has 6
additional user-storage registers. This patch does not enable access to
those additional registers, but only adds the chip name to the list.
Signed-off-by: Ben Gardner <gardner.ben@gmail.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Paton J. Lewis [Fri, 7 Sep 2012 00:24:53 +0000 (10:24 +1000)]
epoll: support for disabling items, and a self-test app
Enhanced epoll_ctl to support EPOLL_CTL_DISABLE, which disables an epoll
item. If epoll_ctl doesn't return -EBUSY in this case, it is then safe to
delete the epoll item in a multi-threaded environment. Also added a new
test_epoll self-test app to both demonstrate the need for this feature
and test it.
Signed-off-by: Paton J. Lewis <palewis@adobe.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jason Baron <jbaron@redhat.com> Cc: Paul Holland <pholland@adobe.com> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Alex Elder [Fri, 7 Sep 2012 00:24:51 +0000 (10:24 +1000)]
lib/parser.c: avoid overflow in match_number()
The result of converting an integer value to another signed integer type
that's unable to represent the original value is implementation defined.
(See notes in section 6.3.1.3 of the C standard.)
In match_number(), the result of simple_strtol() (which returns type long)
is assigned to a value of type int.
Instead, handle the result of simple_strtol() in a well-defined way, and
return -ERANGE if the result won't fit in the int variable used to hold
the parsed result.
No current callers pay attention to the particular error value returned,
so this additional return code shouldn't do any harm.
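The range check described above can be sketched in userspace with
strtol() standing in for the kernel's simple_strtol(). parse_int is a
hypothetical name for this illustration and is not the patched
match_number() itself.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse s as an integer in the given base. Store the value in *result and
 * return 0, or return -EINVAL for malformed input and -ERANGE when the
 * value does not fit in an int. *result is untouched on failure.
 */
static int parse_int(const char *s, int base, int *result)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(s, &end, base);
	if (end == s || *end != '\0')
		return -EINVAL;
	/* Reject values that long can hold but int cannot. */
	if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
		return -ERANGE;
	*result = (int)val;
	return 0;
}
```

Checking the long value before narrowing avoids relying on the
implementation-defined result of an out-of-range conversion.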
Signed-off-by: Alex Elder <elder@inktank.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This driver was for the ProGear webpad device, which was produced in
2000/2001 and is no longer available on the market. I no longer have this
hardware, so I cannot even check how Linux works on it.
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This driver is a general version for the LM3639 backlight + flash driver
chip from TI.
LM3639:
The LM3639 is a single chip LCD Display Backlight driver + white LED
Camera driver. Programming is done over an I2C compatible interface.
www.ti.com
Signed-off-by: G.Shark Jeong <gshark.jeong@gmail.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Daniel Jeong <daniel.jeong@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This driver is a general version for the LM3630 backlight driver chip from TI.
LM3630 :
The LM3630 is a current mode boost converter which supplies the power
and controls the current in two strings of up to 10 LEDs per string.
Programming is done over an I2C compatible interface.
www.ti.com
Signed-off-by: G.Shark Jeong <gshark.jeong@gmail.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Daniel Jeong <daniel.jeong@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Fri, 7 Sep 2012 00:24:49 +0000 (10:24 +1000)]
drivers/video/backlight/kb3886_bl.c: use usleep_range() instead of msleep() for small sleeps
Since msleep() might not sleep for the desired amount when less than 20ms,
use usleep_range().
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Claudio Nieder <private@claudio.ch> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Sachin Kamat <sachin.kamat@linaro.org> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Fri, 7 Sep 2012 00:24:49 +0000 (10:24 +1000)]
drivers/video/backlight/ltv350qv.c: use usleep_range() instead of msleep() for small sleeps
Since msleep() might not sleep for the desired amount when less than 20ms,
use usleep_range().
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Sachin Kamat <sachin.kamat@linaro.org> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jingoo Han [Fri, 7 Sep 2012 00:24:48 +0000 (10:24 +1000)]
drivers/video/backlight/da9052_bl.c: use usleep_range() instead of msleep() for small sleeps
Since msleep() might not sleep for the desired amount when less than 20ms,
use usleep_range().
Signed-off-by: Jingoo Han <jg1.han@samsung.com> Cc: Ashish Jangam <ashish.jangam@kpitcummins.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Sachin Kamat <sachin.kamat@linaro.org> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
drivers/video/backlight/pwm_bl.c: add device tree support for Low Threshold Brightness
Low Threshold Brightness should be configured to have a linear relation in
brightness scale. This patch adds device tree support for low threshold
brightness as optional one for pwm_backlight.
Signed-off-by: Philip, Avinash <avinashphilip@ti.com> Cc: Grant Likely <grant.likely@secretlab.ca> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joe Perches [Fri, 7 Sep 2012 00:24:44 +0000 (10:24 +1000)]
MAINTAINERS: Update gianfar_ptp after renaming
commit ec21e2ec36769 ("freescale: Move the Freescale drivers")
moved the files, update the pattern.
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andy Shevchenko [Fri, 7 Sep 2012 00:24:29 +0000 (10:24 +1000)]
lib/vsprintf: update documentation to cover all of %p[Mm][FR]
Acked-by: Andrei Emeltchenko <andrei.emeltchenko@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
kernel/sys.c: call disable_nonboot_cpus() in kernel_restart()
As kernel_power_off() calls disable_nonboot_cpus(), we may also want to
have kernel_restart() call disable_nonboot_cpus(). Doing so can help
machines that require the boot cpu to be the last cpu alive during reboot
to survive a kernel restart.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Historically, the top three bytes of personality have been used for things
such as ADDR_NO_RANDOMIZE, which made sense only for specific
architectures.
We now however have a flag there that is general no matter the
architecture (UNAME26); generally we have to be careful to preserve the
personality flags across exec().
This patch fixes the tile architecture so that it does not forcefully
overwrite personality flags during exec().
In addition to that, we fix two other things along the way:
- exec_domain switching is fixed -- set_personality() should always
be used instead of directly assigning to current->personality.
- as pointed out by Arnd Bergmann, PER_LINUX_32BIT is not used anywhere
by tile, so let's just drop that in favor of PER_LINUX
Signed-off-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
cross-arch: don't corrupt personality flags upon exec()
Historically, the top three bytes of personality have been used for things
such as ADDR_NO_RANDOMIZE, which made sense only for specific
architectures.
We now however have a flag there that is general no matter the
architecture (UNAME26); generally we have to be careful to preserve the
personality flags across exec().
This patch tries to fix all architectures that forcefully overwrite
personality flags during exec() (ppc32 and s390 have been fixed recently
by commits f9783ec86 and 59e4c3a2f in a similar way already).
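The difference between overwriting and preserving the flags can be shown
with plain bit arithmetic. The constants below are local copies assumed to
mirror <linux/personality.h>, and exec_overwrite and exec_preserve are
hypothetical names for this sketch.

```c
/* Local copies for illustration; assumed to mirror <linux/personality.h>. */
#define PER_MASK		0x00ffu
#define PER_LINUX		0x0000u
#define UNAME26			0x0020000u
#define ADDR_NO_RANDOMIZE	0x0040000u

/* Buggy pattern: every flag in the upper bytes is lost across exec(). */
static unsigned int exec_overwrite(unsigned int old)
{
	(void)old;
	return PER_LINUX;
}

/* Fixed pattern: replace only the low byte and keep the flag bits. */
static unsigned int exec_preserve(unsigned int old)
{
	return PER_LINUX | (old & ~PER_MASK);
}
```

In the kernel the preserved value would be handed to set_personality()
rather than returned, so that exec_domain switching is also handled.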
Signed-off-by: Jiri Kosina <jkosina@suse.cz> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Mark Salter <msalter@redhat.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Commit 6afe1a1fe8ff83f6a ("PM: Remove legacy PM") removed the
initialization of retval, causing:
arch/frv/kernel/pm.c: In function 'sysctl_pm_do_suspend':
arch/frv/kernel/pm.c:165:5: warning: 'retval' may be used uninitialized in this function [-Wuninitialized]
Remove the variable completely to fix this, and convert to a proper
switch (...) { ... } construct to improve readability.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Haggai Eran [Fri, 7 Sep 2012 00:24:25 +0000 (10:24 +1000)]
mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end
In order to allow sleeping during invalidate_page mmu notifier calls, we
need to avoid calling them while holding the PT lock. In addition to its direct
calls, invalidate_page can also be called as a substitute for a change_pte
call, in case the notifier client hasn't implemented change_pte.
This patch drops the invalidate_page call from change_pte, and instead
wraps all calls to change_pte with invalidate_range_start and
invalidate_range_end calls.
Note that change_pte still cannot sleep after this patch, and that clients
implementing change_pte should not take action on it in case the number of
outstanding invalidate_range_start calls is larger than one, otherwise
they might miss a later invalidation.
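The rule for change_pte clients can be sketched in userspace C as follows. This is a minimal model under stated assumptions, not the kernel API: struct client and the client_*() helpers are hypothetical, and a single pfn stands in for a secondary page table entry. A client counts outstanding range invalidations and only installs the new translation when the wrapping start/end pair is the only one outstanding.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical secondary-MMU client state; not a kernel structure. */
struct client {
	int range_active;		/* outstanding range_start calls */
	unsigned long mapped_pfn;	/* stand-in for one secondary PTE */
	bool mapped_valid;
};

static void client_range_start(struct client *c)
{
	c->range_active++;
	c->mapped_valid = false;	/* drop the stale translation */
}

static void client_range_end(struct client *c)
{
	c->range_active--;
}

/*
 * Returns true if the new translation was installed. With more than
 * one outstanding range_start, another invalidation is in flight, so
 * the client leaves the entry invalid rather than risk missing it.
 */
static bool client_change_pte(struct client *c, unsigned long new_pfn)
{
	if (c->range_active > 1)
		return false;
	c->mapped_pfn = new_pfn;
	c->mapped_valid = true;
	return true;
}
```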
Signed-off-by: Haggai Eran <haggaie@mellanox.com> Cc: Andrea Arcangeli <andrea@qumranet.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Haggai Eran <haggaie@mellanox.com> Cc: Shachar Raindel <raindel@mellanox.com> Cc: Liran Liss <liranl@mellanox.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm: move all mmu notifier invocations to be done outside the PT lock
In order to allow sleeping during mmu notifier calls, we need to avoid
invoking them under the page table spinlock. This patch solves the
problem by calling the invalidate_page notification after releasing the
lock (but before freeing the page itself), or by wrapping the page
invalidation with calls to invalidate_range_start and invalidate_range_end.
To prevent accidental changes to the invalidate_range_end arguments after
the call to invalidate_range_start, the patch introduces a convention of
saving the arguments in consistently named locals:
	unsigned long mmun_start;	/* For mmu_notifiers */
	unsigned long mmun_end;		/* For mmu_notifiers */
The patch changes code to use this convention for all calls to
mmu_notifier_invalidate_range_start/end, except those where the calls are
close enough so that anyone who glances at the code can see the values
aren't changing.
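A userspace sketch of the convention, assuming stub notifier functions: notify_range_start(), notify_range_end(), unmap_pages() and DEMO_PAGE_SIZE are hypothetical names for illustration and are not kernel interfaces. The loop is free to advance its cursor because the range handed to the notifiers lives in the dedicated mmun_start/mmun_end locals.

```c
#include <assert.h>

#define DEMO_PAGE_SIZE 4096UL	/* assumed page size for the sketch */

static unsigned long last_start, last_end;
static int depth;

/* Stub notifier: records the range it was given. */
static void notify_range_start(unsigned long start, unsigned long end)
{
	last_start = start;
	last_end = end;
	depth++;
}

/* Stub notifier: checks that the range matches the paired start call. */
static void notify_range_end(unsigned long start, unsigned long end)
{
	assert(start == last_start && end == last_end);
	depth--;
}

static int unmap_pages(unsigned long address, unsigned long nr_pages)
{
	unsigned long mmun_start;	/* For mmu_notifiers */
	unsigned long mmun_end;		/* For mmu_notifiers */
	int unmapped = 0;

	mmun_start = address;
	mmun_end = address + nr_pages * DEMO_PAGE_SIZE;
	notify_range_start(mmun_start, mmun_end);

	/* 'address' is modified here; the saved range is not. */
	for (; address < mmun_end; address += DEMO_PAGE_SIZE)
		unmapped++;

	notify_range_end(mmun_start, mmun_end);
	return unmapped;
}
```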
This patchset is a preliminary step towards an on-demand paging design to
be added to the RDMA stack.
Why do we want on-demand paging for Infiniband?
Applications register memory with an RDMA adapter using system calls,
and subsequently post IO operations that refer to the corresponding
virtual addresses directly to HW. Until now, this was achieved by
pinning the memory during the registration calls. The goal of on-demand
paging is to avoid pinning the pages of registered memory regions (MRs).
This will allow users the same flexibility they get when swapping any
other part of their processes' address spaces. Instead of requiring the
entire MR to fit in physical memory, we can allow the MR to be larger,
and only fit the current working set in physical memory.
Why should anyone care? What problems are users currently experiencing?
This can make programming with RDMA much simpler. Today, developers
that are working with more data than their RAM can hold need either to
deregister and reregister memory regions throughout their process's
life, or keep a single memory region and copy the data to it. On-demand
paging will allow these developers to register a single MR at the
beginning of their process's life, and let the operating system manage
which pages need to be fetched at a given time. In the future, we
might be able to provide a single memory access key for each process
that would provide the entire process's address space as one large memory
region, and the developers wouldn't need to register memory regions at
all.
Is there any prospect that any other subsystems will utilise these
infrastructural changes? If so, which and how, etc?
As for other subsystems, I understand that XPMEM wanted to sleep in
MMU notifiers, as Christoph Lameter wrote at
http://lkml.indiana.edu/hypermail/linux/kernel/0802.1/0460.html and
perhaps Andrea knows about other use cases.
Scheduling in mmu notifications is required since we need to sync the
hardware with changes to the secondary page tables. A TLB flush of an IO
device is inherently slower than a CPU TLB flush, so our design works by
sending the invalidation request to the device, and waiting for an
interrupt before exiting the mmu notifier handler.
Avi said:
kvm may be a buyer. kvm::mmu_lock, which serializes guest page
faults, also protects long operations such as destroying large ranges.
It would be good to convert it into a spinlock, but as it is used inside
mmu notifiers, this cannot be done.
(there are alternatives, such as keeping the spinlock and using a
generation counter to do the teardown in O(1), which is what the "may"
is doing up there).
Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Haggai Eran <haggaie@mellanox.com> Cc: Shachar Raindel <raindel@mellanox.com> Cc: Liran Liss <liranl@mellanox.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrew Morton [Fri, 7 Sep 2012 00:23:56 +0000 (10:23 +1000)]
mm-support-migrate_discard-fix
whitespace fixlet
Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Minchan Kim [Fri, 7 Sep 2012 00:23:56 +0000 (10:23 +1000)]
mm: support MIGRATE_DISCARD
Introduce MIGRATE_DISCARD mode in migration. It drops *clean cache pages*
instead of migrating them, so migration latency can be reduced by
avoiding memcpy and page remapping. It's useful for CMA because migration
latency matters more there than evicting the working set of background
processes. In addition, it needs fewer free pages as migration targets,
so it can avoid memory reclaim to obtain free pages, which is another
factor that increases latency.
Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Minchan Kim [Fri, 7 Sep 2012 00:23:55 +0000 (10:23 +1000)]
mm: change enum migrate_mode with bitwise type
Change the migrate_mode type to a bitwise type because the next patch will
add MIGRATE_DISCARD, which can be ORed with other attributes, so a
bitwise type is a better fit.
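Why a bitwise type helps can be sketched as below. This is an assumption-laden illustration rather than the patch itself: the numeric values are made up, a plain unsigned int stands in for whatever annotated type the kernel uses, and mode_is_sync() and mode_discards() are hypothetical helpers. With one bit per attribute, a caller can combine a sync mode with the discard attribute, which sequential enum values cannot express.

```c
#include <assert.h>
#include <stdbool.h>

/* Plain unsigned stand-in for the bitwise type (assumption). */
typedef unsigned int migrate_mode_t;

/* Illustrative values only: one distinct bit per attribute. */
#define MIGRATE_ASYNC		((migrate_mode_t)0x1)
#define MIGRATE_SYNC_LIGHT	((migrate_mode_t)0x2)
#define MIGRATE_SYNC		((migrate_mode_t)0x4)
#define MIGRATE_DISCARD		((migrate_mode_t)0x8)

/* Hypothetical helper: is full synchronous migration requested? */
static bool mode_is_sync(migrate_mode_t mode)
{
	return (mode & MIGRATE_SYNC) != 0;
}

/* Hypothetical helper: may clean cache pages be dropped? */
static bool mode_discards(migrate_mode_t mode)
{
	return (mode & MIGRATE_DISCARD) != 0;
}
```

A caller would then pass something like MIGRATE_SYNC | MIGRATE_DISCARD and each attribute can be tested independently.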
Suggested by Michal Nazarewicz.
Signed-off-by: Minchan Kim <minchan@kernel.org> Acked-by: Michal Nazarewicz <mina86@mina86.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
memory-hotplug: build zonelists when offlining pages
online_pages() does build_all_zonelists() and zone_pcp_update(); I think
offline_pages() should do so too.
When the zone has no memory to allocate, remove it from other nodes'
zonelists. zone_batchsize() depends on the zone's present pages, so if the
zone's present pages change, the zone's pcp should be updated.
rbtree: move augmented rbtree functionality to rbtree_augmented.h
Provide rb_insert_augmented() and rb_erase_augmented() through a new
rbtree_augmented.h include file. rb_erase_augmented() is defined there as
an __always_inline function, in order to allow inlining of augmented
rbtree callbacks into it. Since this generates a relatively large
function, each augmented rbtree user should make sure to have a single
call site.
Signed-off-by: Michel Lespinasse <walken@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
kmemleak uses a tree where each node represents an allocated memory object
in order to quickly find out what object a given address is part of.
However, the objects don't overlap, so rbtrees are a better choice than
prio tree for this use. They are both faster and have lower memory
overhead.
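The lookup this relies on can be sketched as follows. This is a simplified userspace model, not kmemleak's code: struct object, insert_object() and lookup_object() are hypothetical, and a plain unbalanced binary search tree stands in for the kernel rbtree. Because objects never overlap, ordering nodes by start address is enough to find the object containing any address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tracked allocation covering [pointer, pointer + size). */
struct object {
	unsigned long pointer;
	size_t size;
	struct object *left, *right;
};

/* Insert by start address; no rebalancing in this sketch. */
static struct object *insert_object(struct object *root, struct object *obj)
{
	struct object **link = &root;

	while (*link) {
		if (obj->pointer < (*link)->pointer)
			link = &(*link)->left;
		else
			link = &(*link)->right;
	}
	obj->left = obj->right = NULL;
	*link = obj;
	return root;
}

/* Return the object containing addr, or NULL if none does. */
static struct object *lookup_object(struct object *root, unsigned long addr)
{
	while (root) {
		if (addr < root->pointer)
			root = root->left;
		else if (addr >= root->pointer + root->size)
			root = root->right;
		else
			return root;
	}
	return NULL;
}
```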
Tested by booting a kernel with kmemleak enabled, loading the
kmemleak_test module, and looking for the expected messages.
Signed-off-by: Michel Lespinasse <walken@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>