Sarah Sharp [Thu, 27 Aug 2009 21:36:24 +0000 (14:36 -0700)]
USB: xhci: Check URB's actual transfer buffer size.
Make sure that the amount of data the xHC says was transmitted is less
than or equal to the size of the requested transfer buffer. Before, if
the host controller erroneously reported that the number of bytes
untransferred was bigger than the buffer in the URB, urb->actual_length
could be set to a very large size.
Make sure urb->actual_length <= urb->transfer_buffer_length.
Sarah Sharp [Thu, 27 Aug 2009 21:36:14 +0000 (14:36 -0700)]
USB: xhci: Don't touch xhci_td after it's freed.
On a successful transfer, urb->td is freed before the URB is ready to be
given back to the driver. Don't touch urb->td after it's freed. This bug
would have only shown up when xHCI debugging was turned on, and the freed
memory was quickly reused for something else.
Sarah Sharp [Thu, 27 Aug 2009 21:36:03 +0000 (14:36 -0700)]
USB: xhci: Handle babbling endpoints correctly.
The 0.95 xHCI spec says that non-control endpoints will be halted if a
babble is detected on a transfer. The 0.96 xHCI spec says all types of
endpoints will be halted when a babble is detected. Some hardware that
claims to be 0.95 compliant halts the control endpoint anyway.
When a babble is detected on a control endpoint, check the hardware's
output endpoint context to see if the endpoint is marked as halted. If
the control endpoint is halted, a reset endpoint command must be issued
and the transfer ring dequeue pointer needs to be moved past the stopped
transfer. Basically, we treat it as if the control endpoint had stalled.
Handle bulk babbling endpoints as if we got a completion event with a
stall completion code.
Sarah Sharp [Fri, 7 Aug 2009 21:04:55 +0000 (14:04 -0700)]
USB: xhci: Add quirk for Fresco Logic xHCI hardware.
This Fresco Logic xHCI host controller chip revision puts bad data into
the output endpoint context after a Reset Endpoint command. It needs a
Configure Endpoint command (instead of a Set TR Dequeue Pointer command)
after the reset endpoint command.
Set up the input context before issuing the Reset Endpoint command so we
don't copy bad data from the output endpoint context. The HW also can't
handle two commands queued at once, so submit the TRB for the Configure
Endpoint command in the event handler for the Reset Endpoint command.
Devices that stall on control endpoints before a configuration is selected
will not work under this Fresco Logic xHCI host controller revision.
This patch is for prototype hardware that will be given to other companies
for evaluation purposes only, and should not reach consumer hands. Fresco
Logic's next chip rev should have this bug fixed.
Sarah Sharp [Fri, 7 Aug 2009 21:04:52 +0000 (14:04 -0700)]
USB: xhci: Handle stalled control endpoints.
When a control endpoint stalls, the next control transfer will clear the
stall. The USB core doesn't call down to the host controller driver's
endpoint_reset() method when control endpoints stall, so the xHCI driver
has to do all its stall handling for internal state in its interrupt handler.
When the host stalls on a control endpoint, it may stop on the data phase
or status phase of the control transfer. Like other stalled endpoints,
the xHCI driver needs to queue a Reset Endpoint command and move the
hardware's control endpoint ring dequeue pointer past the failed control
transfer (with a Set TR Dequeue Pointer or a Configure Endpoint command).
Since the USB core doesn't call usb_hcd_reset_endpoint() for control
endpoints, we need to do this in interrupt context when we get notified of
the stalled transfer. URBs may be queued to the hardware before these two
commands complete. The endpoint queue will be restarted once both
commands complete.
Sarah Sharp [Fri, 7 Aug 2009 21:04:49 +0000 (14:04 -0700)]
USB: xhci: Support full speed devices.
Full speed devices have varying max packet sizes (8, 16, 32, or 64) for
endpoint 0. The xHCI hardware needs to know the real max packet size
that the USB core discovers after it fetches the first 8 bytes of the
device descriptor.
In order to fix this without adding a new hook to host controller drivers,
the xHCI driver looks for an updated max packet size for control
endpoints. If it finds an updated size, it issues an evaluate context
command and waits for that command to finish. This should only happen in
the initialization and device descriptor fetching steps in the khubd
thread, so blocking should be fine.
Sarah Sharp [Fri, 7 Aug 2009 21:04:46 +0000 (14:04 -0700)]
USB: xhci: Set correct max packet size for HS/FS control endpoints.
Set the max packet size for the default control endpoint on high speed
devices to be 64 bytes. High speed devices always have a max packet size
of 64 bytes. There's no use setting it to eight for the initial 8 byte
descriptor fetch and then issuing (and waiting for) an evaluate context
command to update it to 64 bytes for the subsequent control transfers.
The USB core guesses that the max packet size on a full speed control
endpoint is 64 bytes, and then updates it after the first 8-byte
descriptor fetch. Change the initial setup for the xHCI internal
representation of the full speed device to have a 64 byte max packet size.
Sarah Sharp [Fri, 7 Aug 2009 21:04:43 +0000 (14:04 -0700)]
USB: xhci: Configure endpoint code refactoring.
Refactor out the code issue, wait for, and parse the event completion code
for a configure endpoint command. Modify it to support the evaluate
context command, which has a very similar submission process. Add
functions to copy parts of the output context into the input context
(which will be used in the evaluate context command).
Sarah Sharp [Fri, 7 Aug 2009 21:04:36 +0000 (14:04 -0700)]
USB: xhci: Work around for chain bit in link TRBs.
Different sections of the xHCI 0.95 specification had opposing
requirements for the chain bit in a link transaction request buffer (TRB).
The chain bit is used to designate that adjacent TRBs are all part of the
same scatter gather list that should be sent to the device. Link TRBs can
be in the middle, or at the beginning or end of these chained TRBs.
Sections 4.11.5.1 and 6.4.4.1 both stated the link TRB "shall have the
chain bit set to 1", meaning it is always chained to the next TRB.
However, section 4.6.9 on the stop endpoint command has specific cases for
what the hardware must do for a link TRB with the chain bit set to 0. The
0.96 specification errata later cleared up this issue by fixing the
4.11.5.1 and 6.4.4.1 sections to state that a link TRB can have the chain
bit set to 1 or 0.
The problem is that the xHCI cancellation code depends on the chain bit of
the link TRB being cleared when it's at the end of a TD, and some 0.95
xHCI hardware simply stops processing the ring when it encounters a link
TRB with the chain bit cleared.
Allow users who are testing 0.95 xHCI prototypes to set a module parameter
(link_quirk) to turn on this link TRB work around. Cancellation may not
work if the ring is stopped exactly on a link TRB with chain bit set, but
cancellation should be a relatively uncommon case.
SL811 Device detected after removal used to be working in linux-2.6.22
but then broke somewhere between 2.6.22 and 2.6.28. Current
hub_port_connect_change() in drivers/usb/core/hub.c won't call
usb_disconnect() in case the SL811 driver sets portstatus
USB_PORT_FEAT_CONNECTION upon removal.
AFAIK the SL811 has only a combined Device Insert/Remove
detection bit, therefore use a count to distinguish insert or remove.
Some devices from the OpenDCC project are missing in the list
of the FTDI PIDs. These PIDs are listed at
http://www.opendcc.de/elektronik/usb/opendcc_usb.html
(Sorry for the german only page.)
This patch adds the three missing devices.
Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf_event, powerpc: Fix compilation after big perf_counter rename
Merge branch 'for-2.6.32' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.32' of git://linux-nfs.org/~bfields/linux: (68 commits)
nfsd4: nfsv4 clients should cross mountpoints
nfsd: revise 4.1 status documentation
sunrpc/cache: avoid variable over-loading in cache_defer_req
sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req
nfsd: return success for non-NFS4 nfs4_state_start
nfsd41: Refactor create_client()
nfsd41: modify nfsd4.1 backchannel to use new xprt class
nfsd41: Backchannel: Implement cb_recall over NFSv4.1
nfsd41: Backchannel: cb_sequence callback
nfsd41: Backchannel: Setup sequence information
nfsd41: Backchannel: Server backchannel RPC wait queue
nfsd41: Backchannel: Add sequence arguments to callback RPC arguments
nfsd41: Backchannel: callback infrastructure
nfsd4: use common rpc_cred for all callbacks
nfsd4: allow nfs4 state startup to fail
SUNRPC: Defer the auth_gss upcall when the RPC call is asynchronous
nfsd4: fix null dereference creating nfsv4 callback client
nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition
nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel
sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked.
...
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
trivial: fix typo in aic7xxx comment
trivial: fix comment typo in drivers/ata/pata_hpt37x.c
trivial: typo in kernel-parameters.txt
trivial: fix typo in tracing documentation
trivial: add __init/__exit macros in drivers/gpio/bt8xxgpio.c
trivial: add __init macro/ fix of __exit macro location in ipmi_poweroff.c
trivial: remove unnecessary semicolons
trivial: Fix duplicated word "options" in comment
trivial: kbuild: remove extraneous blank line after declaration of usage()
trivial: improve help text for mm debug config options
trivial: doc: hpfall: accept disk device to unload as argument
trivial: doc: hpfall: reduce risk that hpfall can do harm
trivial: SubmittingPatches: Fix reference to renumbered step
trivial: fix typos "man[ae]g?ment" -> "management"
trivial: media/video/cx88: add __init/__exit macros to cx88 drivers
trivial: fix typo in CONFIG_DEBUG_FS in gcov doc
trivial: fix missing printk space in amd_k7_smp_check
trivial: fix typo s/ketymap/keymap/ in comment
trivial: fix typo "to to" in multiple files
trivial: fix typos in comments s/DGBU/DBGU/
...
Henrik Rydberg [Tue, 22 Sep 2009 00:04:50 +0000 (17:04 -0700)]
hwmon: applesmc: restore accelerometer and keyboard backlight on resume
On resume from suspend, the driver currently resets the logical state as
if it was brought up from halt. This patch uses the
dev_pm_ops.resume/restore methods to synchronize the hardware with the
memorized logical state, in effect bringing back the accelerometer and
backlight to the state prior to suspend. Works for both suspend to ram
and hibernation. The patch has zero effect on the running state.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Cc: Nicolas Boichat <nicolas@boichat.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michael Abbott [Tue, 22 Sep 2009 00:04:47 +0000 (17:04 -0700)]
drivers/hwmon/adm1021.c: add low_power support for adm1021 driver
Occasionally it is helpful to be able to turn a temperature sensor off
(for example if it's making unwanted electrical noise). This patch
adds a sysfs node to put any adm1021 compatible device into low power mode.
Signed-off-by: Michael Abbott <michael.abbott@diamond.ac.uk> Cc: Jean Delvare <khali@linux-fr.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michael Abbott [Tue, 22 Sep 2009 00:04:46 +0000 (17:04 -0700)]
drivers/hwmon/adm1021.c: support high precision ADM1023 remote sensor
The ADM1023 temperature sensor supports higher resolution for its external
sensor (sensitivity of 1/8 deg C). This patch makes this higher
resolution available through the appropriate temperature sysfs nodes.
Curiously, this functionality was available in the 2.4 kernel driver (but
formatted in a less helpful manner).
Cc: Jean Delvare <khali@linux-fr.org> Signed-off-by: Michael Abbott <michael.abbott@diamond.ac.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Mack [Tue, 22 Sep 2009 00:04:45 +0000 (17:04 -0700)]
lis3_spi: code cleanups
Signed-off-by: Daniel Mack <daniel@caiaq.de> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Eric Piel <eric.piel@tremplin-utc.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Mack [Tue, 22 Sep 2009 00:04:44 +0000 (17:04 -0700)]
lis3: add power management functions
This enabled power management functions for the SPI transport layer of the
lis3 devices. The device's suspend mode is only entered in case no wakeup
threshold has been given. In this case, the device is supposed to wake up
the system and must thus not be put to deep sleep.
[randy.dunlap@oracle.com: fix lis3-spi for CONFIG_PM=n] Signed-off-by: Daniel Mack <daniel@caiaq.de> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Eric Piel <eric.piel@tremplin-utc.net> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Mack [Tue, 22 Sep 2009 00:04:43 +0000 (17:04 -0700)]
lis3: add free-fall/wakeup function via platform_data
This offers a way for platforms to define flags and thresholds for the
free-fall/wakeup functions of the lis302d chips.
More registers needed to be seperated as they are specific to the
Signed-off-by: Daniel Mack <daniel@caiaq.de> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Eric Piel <eric.piel@tremplin-utc.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Mack [Tue, 22 Sep 2009 00:04:42 +0000 (17:04 -0700)]
lis3: fix typo
Bit 0x80 in CTRL_REG3 is an ACTIVE_LOW rather than an ACTIVE_HIGH
function, I got that wrong during my last change.
Signed-off-by: Daniel Mack <daniel@caiaq.de> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Eric Piel <eric.piel@tremplin-utc.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michael Riepe [Tue, 22 Sep 2009 00:04:41 +0000 (17:04 -0700)]
drivers/hwmon/coretemp.c: enable the Intel Atom
Enable the coretemp driver on an Intel Atom.
I'm not sure if the readings are correct, however - on my 330, the driver
reports values between 27 and 41 °C (with core1 being about 8°C hotter
than core0, given the same load). Maybe the maximum temperature of 100 °C
is wrong for Atom CPUs.
Cc: Arjan van de Ven <arjan@infradead.org> Cc: Rudolf Marek <r.marek@assembler.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Tue, 22 Sep 2009 00:04:40 +0000 (17:04 -0700)]
checkpatch: add some common Blackfin checks
Add checks for Blackfin-specific issues that seem to crop up from time to
time. In particular, we have helper macros to break a 32bit address into
the hi/lo parts, and we want to make sure people use the csync/ssync
variant that includes fun anomaly workarounds.
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Bryan Wu <cooloney@kernel.org> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Whitcroft [Tue, 22 Sep 2009 00:04:38 +0000 (17:04 -0700)]
checkpatch: limit sN/uN matches to actual bit sizes
Limit our type matcher to the s/u/le/be etc sizes that actually exist to
prevent miss categorising s2 as a type. Fix up the spelling of the error
also.
Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hannes Eder [Tue, 22 Sep 2009 00:04:37 +0000 (17:04 -0700)]
checkpatch: make -f alias --file, add --help, more verbose help message
Impact:
- More verbose help/usage message.
- Make the option -f an alias for --file.
- On -h, --help, and --version display help message and exit(0).
- With no FILE(s) given, exit(1) with "no input files".
- On invalid options display help/usage and exit(1).
Based on a patch by Pavel Machek.
Signed-off-by: Hannes Eder <hannes@hanneseder.net> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes the sanitation process in checkpatch.pl so that it blocks out
the text after a C99 style comment the same way it does with block style
comments. This prevents the text from getting processed as regular code.
Signed-off-by: Daniel Walker <dwalker@fifo99.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Tue, 22 Sep 2009 00:04:33 +0000 (17:04 -0700)]
flex_array: add missing kerneldoc annotations
Add kerneldoc annotations for function formals of type struct flex_array
and gfp_t which are currently lacking.
Signed-off-by: David Rientjes <rientjes@google.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Tue, 22 Sep 2009 00:04:33 +0000 (17:04 -0700)]
flex_array: introduce DEFINE_FLEX_ARRAY
FLEX_ARRAY_INIT(element_size, total_nr_elements) cannot determine if
either parameter is valid, so flex arrays which are statically allocated
with this interface can easily become corrupted or reference beyond its
allocated memory.
This removes FLEX_ARRAY_INIT() as a struct flex_array initializer since no
initializer may perform the required checking. Instead, the array is now
defined with a new interface:
This may be prefixed with `static' for file scope.
This interface includes compile-time checking of the parameters to ensure
they are valid. Since the validity of both element_size and
total_nr_elements depend on FLEX_ARRAY_BASE_SIZE and FLEX_ARRAY_PART_SIZE,
the kernel build will fail if either of these predefined values changes
such that the array parameters are no longer valid.
Since BUILD_BUG_ON() requires compile time constants, several of the
static inline functions that were once local to lib/flex_array.c had to be
moved to include/linux/flex_array.h.
Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Tue, 22 Sep 2009 00:04:31 +0000 (17:04 -0700)]
flex_array: add flex_array_shrink function
Add a new function to the flex_array API:
int flex_array_shrink(struct flex_array *fa)
This function will free all unused second-level pages. Since elements are
now poisoned if they are not allocated with __GFP_ZERO, it's possible to
identify parts that consist solely of unused elements.
flex_array_shrink() returns the number of pages freed.
Signed-off-by: David Rientjes <rientjes@google.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Tue, 22 Sep 2009 00:04:31 +0000 (17:04 -0700)]
flex_array: poison free elements
Newly initialized flex_array's and/or flex_array_part's are now poisoned
with a new poison value, FLEX_ARRAY_FREE. It's value is similar to
POISON_FREE used in the various slab allocators, but is different to
distinguish between flex array's poisoned kmem and slab allocator poisoned
kmem.
This will allow us to identify flex_array_part's that only contain free
elements (and free them with an addition to the flex_array API). This
could also be extended in the future to identify `get' uses on elements
that have not been `put'.
If __GFP_ZERO is passed for a part's gfp mask, the poisoning is avoided.
These elements are considered to be in-use since they have been
initialized.
Signed-off-by: David Rientjes <rientjes@google.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Tue, 22 Sep 2009 00:04:30 +0000 (17:04 -0700)]
flex_array: add flex_array_clear function
Add a new function to the flex_array API:
int flex_array_clear(struct flex_array *fa,
unsigned int element_nr)
This function will zero the element at element_nr in the flex_array.
Although this is equivalent to using flex_array_put() and passing a
pointer to zero'd memory, flex_array_clear() does not require such a
pointer to memory that would most likely need to be allocated on the
caller's stack which could be significantly large depending on
element_size.
Signed-off-by: David Rientjes <rientjes@google.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 22 Sep 2009 00:04:27 +0000 (17:04 -0700)]
MAINTAINERS: move ARM lists to infradead
Signed-off-by: Joe Perches <joe@perches.com> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Krzysztof Halasa <khc@pm.waw.pl> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 22 Sep 2009 00:04:26 +0000 (17:04 -0700)]
MAINTAINERS: integrate P:/M: lines
A couple of new uses of separate "P: name" "M: address" lines are
converted to single line "M: name <address>"
Signed-off-by: Joe Perches <joe@perches.com> Cc: Anil Ravindranath <anil_ravindranath@pmc-sierra.com> Cc: Kalle Valo <kalle.valo@nokia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 22 Sep 2009 00:04:24 +0000 (17:04 -0700)]
scripts/get_maintainer.pl: add maintainers in order listed in matched section
Previous behavior was "bottom-up" in each section from the pattern "F:"
entry that matched. Now information is entered into the various lists in
the "as entered" order for each matched section.
This also allows the F: entry to be put anywhere in a section, not just as
the last entries in the section.
And a couple of improvements:
Don't alphabetically sort before outputting the matched scm, status,
subsystem and web sections.
Ignore content after a single email address so these entries are acceptable
M: name <address> whatever other comment
And a fix:
Make an M: entry without a name again use the name from an immediately
preceding P: line if it exists.
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 22 Sep 2009 00:04:21 +0000 (17:04 -0700)]
scripts/get_maintainer.pl: add .mailmap use, shell and email cleanups
Add reading and using .mailmap file if it exists
Convert address entries in .mailmap to first encountered address
Don't terminate shell commands with \n
Strip characters found after sign-offs by: name <address> [stripped]
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 22 Sep 2009 00:04:17 +0000 (17:04 -0700)]
scripts/get_maintainer.pl: add --pattern-depth
--pattern-depth is used to control how many levels of directory traversal
should be performed to find maintainers. default is 0 (all directory levels).
For instance:
MAINTAINERS currently has multiple M: and F: entries that match
net/netfilter/ipvs/ip_vs_app.c
NETFILTER/IPTABLES/IPCHAINS
[...]
M: Patrick McHardy <kaber@trash.net>
[...]
F: net/netfilter/
NETWORKING [GENERAL]
M: "David S. Miller" <davem@davemloft.net>
[...]
F: net/
THE REST
M: Linus Torvalds <torvalds@linux-foundation.org>
[...]
F: */
Using this command will return all of those maintainers:
(except Linus unless --git-chief-maintainers is specified)
$ ./scripts/get_maintainer.pl --nogit -nol \
-f net/netfilter/ipvs/ip_vs_app.c
Julian Anastasov <ja@ssi.bg>
Simon Horman <horms@verge.net.au>
Wensong Zhang <wensong@linux-vs.org>
Patrick McHardy <kaber@trash.net>
David S. Miller <davem@davemloft.net>
Adding --pattern-depth=1 will match at the deepest level
$ ./scripts/get_maintainer.pl --nogit -nol --pattern-depth=1 \
-f net/netfilter/ipvs/ip_vs_app.c
Julian Anastasov <ja@ssi.bg>
Simon Horman <horms@verge.net.au>
Wensong Zhang <wensong@linux-vs.org>
Adding --pattern-depth=2 will match at the deepest level and 1 higher
$ ./scripts/get_maintainer.pl --nogit -nol --pattern-depth=2 \
-f net/netfilter/ipvs/ip_vs_app.c
Julian Anastasov <ja@ssi.bg>
Simon Horman <horms@verge.net.au>
Wensong Zhang <wensong@linux-vs.org>
Patrick McHardy <kaber@trash.net>
and so on.
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 22 Sep 2009 00:04:13 +0000 (17:04 -0700)]
scripts/get_maintainer.pl: add --git-blame
Julia Lawall suggested that get_maintainers.pl should have the
ability to include signatories of commits that are modified by
a particular patch.
Vegard Nossum did something similar once.
http://lkml.org/lkml/2008/5/29/449
The modified script looks the commits for all lines in the
patch, and includes the "-by:" signatories for those commits.
It uses the same git-min-percent, git-max-maintainers, and
git-min-signatures options. git-since is ignored.
It can be used independently from the --git default, so
./scripts/get_maintainers.pl --nogit --git-blame <patch>
or
./scripts/get_maintainers.pl --nogit --git-blame -f <file>
is acceptable.
If used with -f <file>, all lines/commits for the file are
checked.
--git-blame can be slow if used with -f <file>
--git-blame does not work with -f <directory>
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hannes Eder [Tue, 22 Sep 2009 00:04:12 +0000 (17:04 -0700)]
MAINTAINERS: add IPVS include files
Signed-off-by: Hannes Eder <heder@google.com> Cc: Joe Perches <joe@perches.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move the state residency accounting and statistics computation off the hot
exit path.
On exit, the need to recompute statistics is recorded, and new statistics
will be computed when menu_select is called again.
The expected effect is to reduce processor wakeup latency from sleep
(C-states). We are speaking of few hundreds of cycles reduction out of a
several microseconds latency (determined by the hardware transition), so
it is difficult to measure.
Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Adam Belay <abelay@novell.com Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arjan van de Ven [Tue, 22 Sep 2009 00:04:08 +0000 (17:04 -0700)]
cpuidle: fix the menu governor to boost IO performance
Fix the menu idle governor which balances power savings, energy efficiency
and performance impact.
The reason for a reworked governor is that there have been serious
performance issues reported with the existing code on Nehalem server
systems.
To show this I'm sure Andrew wants to see benchmark results:
(benchmark is "fio", "no cstates" is using "idle=poll")
no cstates current linux new algorithm
1 disk 107 Mb/s 85 Mb/s 105 Mb/s
2 disks 215 Mb/s 123 Mb/s 209 Mb/s
12 disks 590 Mb/s 320 Mb/s 585 Mb/s
In various power benchmark measurements, no degredation was found by our
measurement&diagnostics team. Obviously a small percentage more power was
used in the "fio" benchmark, due to the much higher performance.
While it would be a novel idea to describe the new algorithm in this
commit message, I cheaped out and described it in comments in the code
instead.
[changes since first post: spelling fixes from akpm, review feedback,
folded menu-tng into menu.c]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
john stultz [Tue, 22 Sep 2009 00:04:05 +0000 (17:04 -0700)]
m68k: convert to use arch_gettimeoffset()
Convert m68k to use GENERIC_TIME via the arch_getoffset() infrastructure,
reducing the amount of arch specific code we need to maintain.
I've taken my best swing at converting this, but I'm not 100% confident
I got it right. My cross-compiler is now out of date (gcc4.2) so I
wasn't able to check if it compiled. Any assistance from arch
maintainers or testers to get this merged would be great.
Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
john stultz [Tue, 22 Sep 2009 00:04:04 +0000 (17:04 -0700)]
m32r: convert to use arch_gettimeoffset()
Convert m32r to use GENERIC_TIME via the arch_getoffset() infrastructure,
reducing the amount of arch specific code we need to maintain.
I also noted that m32r doesn't seem to be taking the xtime write lock
before calling do_timer()! That looks like a pretty bad bug to me. If
folks agree, let me know and I can move the lock grab to the correct spot.
Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Marcin Slusarz [Tue, 22 Sep 2009 00:04:01 +0000 (17:04 -0700)]
alpha: use printk_once
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
john stultz [Tue, 22 Sep 2009 00:04:00 +0000 (17:04 -0700)]
alpha: convert to use arch_gettimeoffset()
Converts alpha to use GENERIC_TIME via the arch_getoffset()
infrastructure, reducing the amount of arch specific code we need to
maintain.
I suspect the alpha arch could even be further improved to provide and
rpcc() based clocksource, but not having the hardware, I don't feel
comfortable attempting the more complicated conversion (but I'd be glad to
help if anyone else is interested).
[akpm@linux-foundation.org: fix build] Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
pcmcia: cleanup/fixup patch for sa1100_jornada_pcmcia driver
Clean up the /drivers/pcmcia/sa1100_jornada.c file with respect to
formatting. It also changes a build warning into a code comment (since
its a pain to watch every build and havent seen any problems with driver
in 3.5years).
Mike Frysinger [Tue, 22 Sep 2009 00:03:53 +0000 (17:03 -0700)]
pcmcia: yenta: add missing __devexit marking
The remove member of the pci_driver yenta_cardbus_driver uses
__devexit_p(), so the remove function itself should be marked with
__devexit. Even more so considering the probe function is marked with
__devinit.
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Daniel Ritz <daniel.ritz-ml@swissonline.ch> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the mm being switched to matches the active mm, we don't need to
increment and then drop the mm count. In a simple benchmark this happens
in about 50% of time. Making that conditional reduces contention on that
cacheline on SMP systems.
Acked-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anyone who wants to do copy to/from user from a kernel thread, needs
use_mm (like what fs/aio has). Move that into mm/, to make reusing and
exporting easier down the line, and make aio use it. Next intended user,
besides aio, will be vhost-net.
Acked-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pekka Enberg [Tue, 22 Sep 2009 00:03:50 +0000 (17:03 -0700)]
shmem: initialize struct shmem_sb_info to zero
Fixes the following kmemcheck false positive (the compiler is using
a 32-bit mov to load the 16-bit sbinfo->mode in shmem_fill_super):
[ 0.337000] Total of 1 processors activated (3088.38 BogoMIPS).
[ 0.352000] CPU0 attaching NULL sched-domain.
[ 0.360000] WARNING: kmemcheck: Caught 32-bit read from uninitialized
memory (9f8020fc)
[ 0.361000] a44240820000000041f6998100000000000000000000000000000000ff030000
[ 0.368000] i i i i i i i i i i i i i i i i u u u u i i i i i i i i i i u
u
[ 0.375000] ^
[ 0.376000]
[ 0.377000] Pid: 9, comm: khelper Not tainted (2.6.31-tip #206) P4DC6
[ 0.378000] EIP: 0060:[<810a3a95>] EFLAGS: 00010246 CPU: 0
[ 0.379000] EIP is at shmem_fill_super+0xb5/0x120
[ 0.380000] EAX: 00000000 EBX: 9f845400 ECX: 824042a4 EDX: 8199f641
[ 0.381000] ESI: 9f8020c0 EDI: 9f845400 EBP: 9f81af68 ESP: 81cd6eec
[ 0.382000] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 0.383000] CR0: 8005003b CR2: 9f806200 CR3: 01ccd000 CR4: 000006d0
[ 0.384000] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 0.385000] DR6: ffff4ff0 DR7: 00000400
[ 0.386000] [<810c25fc>] get_sb_nodev+0x3c/0x80
[ 0.388000] [<810a3514>] shmem_get_sb+0x14/0x20
[ 0.390000] [<810c207f>] vfs_kern_mount+0x4f/0x120
[ 0.392000] [<81b2849e>] init_tmpfs+0x7e/0xb0
[ 0.394000] [<81b11597>] do_basic_setup+0x17/0x30
[ 0.396000] [<81b11907>] kernel_init+0x57/0xa0
[ 0.398000] [<810039b7>] kernel_thread_helper+0x7/0x10
[ 0.400000] [<ffffffff>] 0xffffffff
[ 0.402000] khelper used greatest stack depth: 2820 bytes left
[ 0.407000] calling init_mmap_min_addr+0x0/0x10 @ 1
[ 0.408000] initcall init_mmap_min_addr+0x0/0x10 returned 0 after 0 usecs
Eric B Munson [Tue, 22 Sep 2009 00:03:47 +0000 (17:03 -0700)]
hugetlb: add MAP_HUGETLB for mmaping pseudo-anonymous huge page regions
Add a flag for mmap that will be used to request a huge page region that
will look like anonymous memory to userspace. This is accomplished by
using a file on the internal vfsmount. MAP_HUGETLB is a modifier of
MAP_ANONYMOUS and so must be specified with it. The region will behave
the same as a MAP_ANONYMOUS region using small pages.
[akpm@linux-foundation.org: fix arch definitions of MAP_HUGETLB] Signed-off-by: Eric B Munson <ebmunson@us.ibm.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Adam Litke <agl@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm: add MAP_HUGETLB for mmaping pseudo-anonymous huge page regions
Add a flag for mmap that will be used to request a huge page region that
will look like anonymous memory to user space. This is accomplished by
using a file on the internal vfsmount. MAP_HUGETLB is a modifier of
MAP_ANONYMOUS and so must be specified with it. The region will behave
the same as a MAP_ANONYMOUS region using small pages.
The patch also adds the MAP_STACK flag, which was previously defined only
on some architectures but not on others. Since MAP_STACK is meant to be a
hint only, architectures can define it without assigning a specific
meaning to it.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Eric B Munson <ebmunson@us.ibm.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: David Rientjes <rientjes@google.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric B Munson [Tue, 22 Sep 2009 00:03:43 +0000 (17:03 -0700)]
hugetlbfs: allow the creation of files suitable for MAP_PRIVATE on the vfs internal mount
This patchset adds a flag to mmap that allows the user to request that an
anonymous mapping be backed with huge pages. This mapping will borrow
functionality from the huge page shm code to create a file on the kernel
internal mount and use it to approximate an anonymous mapping. The
MAP_HUGETLB flag is a modifier to MAP_ANONYMOUS and will not work without
both flags being preset.
A new flag is necessary because there is no other way to hook into huge
pages without creating a file on a hugetlbfs mount which wouldn't be
MAP_ANONYMOUS.
To userspace, this mapping will behave just like an anonymous mapping
because the file is not accessible outside of the kernel.
This patchset is meant to simplify the programming model. Presently there
is a large chunk of boiler platecode, contained in libhugetlbfs, required
to create private, hugepage backed mappings. This patch set would allow
use of hugepages without linking to libhugetlbfs or having hugetblfs
mounted.
Unification of the VM code would provide these same benefits, but it has
been resisted each time that it has been suggested for several reasons: it
would break PAGE_SIZE assumptions across the kernel, it makes page-table
abstractions really expensive, and it does not provide any benefit on
architectures that do not support huge pages, incurring fast path
penalties without providing any benefit on these architectures.
This patch:
There are two means of creating mappings backed by huge pages:
1. mmap() a file created on hugetlbfs
2. Use shm which creates a file on an internal mount which essentially
maps it MAP_SHARED
The internal mount is only used for shared mappings but there is very
little that stops it being used for private mappings. This patch extends
hugetlbfs_file_setup() to deal with the creation of files that will be
mapped MAP_PRIVATE on the internal hugetlbfs mount. This extended API is
used in a subsequent patch to implement the MAP_HUGETLB mmap() flag.
Signed-off-by: Eric Munson <ebmunson@us.ibm.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Adam Litke <agl@us.ibm.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lee Schermerhorn [Tue, 22 Sep 2009 00:03:40 +0000 (17:03 -0700)]
mmap: avoid unnecessary anon_vma lock acquisition in vma_adjust()
We noticed very erratic behavior [throughput] with the AIM7 shared
workload running on recent distro [SLES11] and mainline kernels on an
8-socket, 32-core, 256GB x86_64 platform. On the SLES11 kernel
[2.6.27.19+] with Barcelona processors, as we increased the load [10s of
thousands of tasks], the throughput would vary between two "plateaus"--one
at ~65K jobs per minute and one at ~130K jpm. The simple patch below
causes the results to smooth out at the ~130k plateau.
But wait, there's more:
We do not see this behavior on smaller platforms--e.g., 4 socket/8 core.
This could be the result of the larger number of cpus on the larger
platform--a scalability issue--or it could be the result of the larger
number of interconnect "hops" between some nodes in this platform and how
the tasks for a given load end up distributed over the nodes' cpus and
memories--a stochastic NUMA effect.
The variability in the results are less pronounced [on the same platform]
with Shanghai processors and with mainline kernels. With 31-rc6 on
Shanghai processors and 288 file systems on 288 fibre attached storage
volumes, the curves [jpm vs load] are both quite flat with the patched
kernel consistently producing ~3.9% better throughput [~80K jpm vs ~77K
jpm] than the unpatched kernel.
Profiling indicated that the "slow" runs were incurring high[er]
contention on an anon_vma lock in vma_adjust(), apparently called from the
sbrk() system call.
The patch:
A comment in mm/mmap.c:vma_adjust() suggests that we don't really need the
anon_vma lock when we're only adjusting the end of a vma, as is the case
for brk(). The comment questions whether it's worth while to optimize for
this case. Apparently, on the newer, larger x86_64 platforms, with
interesting NUMA topologies, it is worth while--especially considering
that the patch [if correct!] is quite simple.
We can detect this condition--no overlap with next vma--by noting a NULL
"importer". The anon_vma pointer will also be NULL in this case, so
simply avoid loading vma->anon_vma to avoid the lock.
However, we DO need to take the anon_vma lock when we're inserting a vma
['insert' non-NULL] even when we have no overlap [NULL "importer"], so we
need to check for 'insert', as well. And Hugh points out that we should
also take it when adjusting vm_start (so that rmap.c can rely upon
vma_address() while it holds the anon_vma lock).
akpm: Zhang Yanmin reprts a 150% throughput improvement with aim7, so it
might be -stable material even though thiss isn't a regression: "this
issue is not clear on dual socket Nehalem machine (2*4*2 cpu), but is
severe on large machine (4*8*2 cpu)"
[hugh.dickins@tiscali.co.uk: test vma start too] Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Nick Piggin <npiggin@suse.de> Cc: Eric Whitney <eric.whitney@hp.com> Tested-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CONFIG_SHMEM off gives you (ramfs masquerading as) tmpfs, even when
CONFIG_TMPFS is off: that's a little anomalous, and I'd intended to make
more sense of it by removing CONFIG_TMPFS altogether, always enabling its
code when CONFIG_SHMEM; but so many defconfigs have CONFIG_SHMEM on
CONFIG_TMPFS off that we'd better leave that as is.
But there is no point in asking for CONFIG_TMPFS if CONFIG_SHMEM is off:
make TMPFS depend on SHMEM, which also prevents TMPFS_POSIX_ACL
shmem_acl.o being pointlessly built into the kernel when SHMEM is off.
And a selfish change, to prevent the world from being rebuilt when I
switch between CONFIG_SHMEM on and off: the only CONFIG_SHMEM in the
header files is mm.h shmem_lock() - give that a shmem.c stub instead.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move highest_memmap_pfn __read_mostly from page_alloc.c next to zero_pfn
__read_mostly in memory.c: to help them share a cacheline, since they're
very often tested together in vm_normal_page().
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Rik van Riel <riel@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reinstate anonymous use of ZERO_PAGE to all architectures, not just to
those which __HAVE_ARCH_PTE_SPECIAL: as suggested by Nick Piggin.
Contrary to how I'd imagined it, there's nothing ugly about this, just a
zero_pfn test built into one or another block of vm_normal_page().
But the MIPS ZERO_PAGE-of-many-colours case demands is_zero_pfn() and
my_zero_pfn() inlines. Reinstate its mremap move_pte() shuffling of
ZERO_PAGEs we did from 2.6.17 to 2.6.19? Not unless someone shouts for
that: it would have to take vm_flags to weed out some cases.
I'm still reluctant to clutter __get_user_pages() with another flag, just
to avoid touching ZERO_PAGE count in mlock(); though we can add that later
if it shows up as an issue in practice.
But when mlocking, we can test page->mapping slightly earlier, to avoid
the potentially bouncy rescheduling of lock_page on ZERO_PAGE - mlock
didn't lock_page in olden ZERO_PAGE days, so we might have regressed.
And when munlocking, it turns out that FOLL_DUMP coincidentally does
what's needed to avoid all updates to ZERO_PAGE, so use that here also.
Plus add comment suggested by KAMEZAWA Hiroyuki.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Rik van Riel <riel@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Nick Piggin <npiggin@suse.de> Acked-by: Mel Gorman <mel@csn.ul.ie> Cc: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__get_user_pages() has been taking its own GUP flags, then processing
them into FOLL flags for follow_page(). Though oddly named, the FOLL
flags are more widely used, so pass them to __get_user_pages() now.
Sorry, VM flags, VM_FAULT flags and FAULT_FLAGs are still distinct.
(The patch to __get_user_pages() looks peculiar, with both gup_flags
and foll_flags: the gup_flags remain constant; but as before there's
an exceptional case, out of scope of the patch, in which foll_flags
per page have FOLL_WRITE masked off.)
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Rik van Riel <riel@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>