Wolfram Sang [Wed, 23 Nov 2011 14:57:06 +0000 (15:57 +0100)]
mtd: gpmi: add missing include 'module.h'
Fixes:
drivers/mtd/nand/gpmi-nand/gpmi-nand.c: In function 'gpmi_nfc_init':
drivers/mtd/nand/gpmi-nand/gpmi-nand.c:1475:16: error: 'THIS_MODULE' undeclared (first use in this function)
drivers/mtd/nand/gpmi-nand/gpmi-nand.c:1475:16: note: each undeclared identifier is reported only once for each function it appears in
drivers/mtd/nand/gpmi-nand/gpmi-nand.c: At top level:
drivers/mtd/nand/gpmi-nand/gpmi-nand.c:1617:15: error: expected declaration specifiers or '...' before string constant
drivers/mtd/nand/gpmi-nand/gpmi-nand.c:1617:1: warning: data definition has no type or storage class
drivers/mtd/nand/gpmi-nand/gpmi-nand.c:1617:1: warning: type defaults to 'int' in declaration of 'MODULE_AUTHOR'
Paul Moore [Tue, 29 Nov 2011 10:10:54 +0000 (10:10 +0000)]
netlabel: Fix build problems when IPv6 is not enabled
A recent fix to the the NetLabel code caused build problem with
configurations that did not have IPv6 enabled; see below:
netlabel_kapi.c: In function 'netlbl_cfg_unlbl_map_add':
netlabel_kapi.c:165:4:
error: implicit declaration of function 'netlbl_af6list_add'
This patch fixes this problem by making the IPv6 specific code conditional
on the IPv6 configuration flags as we done in the rest of NetLabel and the
network stack as a whole. We have to move some variable declarations
around as a result so things may not be quite as pretty, but at least it
builds cleanly now.
Some additional IPv6 conditionals were added to the NetLabel code as well
for the sake of consistency.
Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Paul Moore <pmoore@redhat.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 29 Nov 2011 21:31:23 +0000 (22:31 +0100)]
IB: Fix RCU lockdep splats
Commit f2c31e32b37 ("net: fix NULL dereferences in check_peer_redir()")
forgot to take care of infiniband uses of dst neighbours.
Many thanks to Marc Aurele who provided a nice bug report and feedback.
Reported-by: Marc Aurele La France <tsi@ualberta.ca> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: David Miller <davem@davemloft.net> CC: Roland Dreier <roland@kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
The while loop at the bottom of ipoib_mcast_join_finish() will attempt
to send queued multicast packets in mcast->pkt_queue and eventually
end up in ipoib_mcast_send():
if (!mcast->ah) {
if (skb_queue_len(&mcast->pkt_queue) < IPOIB_MAX_MCAST_QUEUE)
skb_queue_tail(&mcast->pkt_queue, skb);
else {
++dev->stats.tx_dropped;
dev_kfree_skb_any(skb);
}
My read is that the code will requeue the packet and return to the
ipoib_mcast_join_finish() while loop and the stage is set for the
"hung" task diagnostic as the while loop never sees a non-NULL ah, and
will do nothing to resolve.
There are GFP_ATOMIC allocates in the provider routines, so this is
possible and should be dealt with.
The test that induced the failure is associated with a host SM on the
same server during a shutdown.
This patch causes ipoib_mcast_join_finish() to exit with an error
which will flush the queued mcast packets. Nothing is done to unwind
the QP attached state so that subsequent sends from above will retry
the join.
Reviewed-by: Ram Vepa <ram.vepa@qlogic.com> Reviewed-by: Gary Leshner <gary.leshner@qlogic.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
Xi Wang [Tue, 29 Nov 2011 09:26:30 +0000 (09:26 +0000)]
sctp: better integer overflow check in sctp_auth_create_key()
The check from commit 30c2235c is incomplete and cannot prevent
cases like key_len = 0x80000000 (INT_MAX + 1). In that case, the
left-hand side of the check (INT_MAX - key_len), which is unsigned,
becomes 0xffffffff (UINT_MAX) and bypasses the check.
However this shouldn't be a security issue. The function is called
from the following two code paths:
1) setsockopt()
2) sctp_auth_asoc_set_secret()
In case (1), sca_keylength is never going to exceed 65535 since it's
bounded by a u16 from the user API. As such, the key length will
never overflow.
In case (2), sca_keylength is computed based on the user key (1 short)
and 2 * key_vector (3 shorts) for a total of 7 * USHRT_MAX, which still
will not overflow.
In other words, this overflow check is not really necessary. Just
make it more correct.
Signed-off-by: Xi Wang <xi.wang@gmail.com> Cc: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
With Dmitry fsstress updates I've seen very reproducible crashes in
xfs_attr_shortform_remove because xfs_attr_shortform_bytesfit claims that
the attributes would not fit inline into the inode after removing an
attribute. It turns out that we were operating on an inode with lots
of delalloc extents, and thus an if_bytes values for the data fork that
is larger than biggest possible on-disk storage for it which utterly
confuses the code near the end of xfs_attr_shortform_bytesfit.
Fix this by always allowing the current attribute fork, like we already
do for the attr1 format, given that delalloc conversion will take care
for moving either the data or attribute area out of line if it doesn't
fit at that point - or making the point moot by merging extents at this
point.
Also document the function better, and clean up some loose bits.
Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
Eric Dumazet [Mon, 28 Nov 2011 22:41:47 +0000 (22:41 +0000)]
tcp: avoid frag allocation for small frames
tcp_sendmsg() uses select_size() helper to choose skb head size when a
new skb must be allocated.
If GSO is enabled for the socket, current strategy is to force all
payload data to be outside of headroom, in PAGE fragments.
This strategy is not welcome for small packets, wasting memory.
Experiments show that best results are obtained when using 2048 bytes
for skb head (This includes the skb overhead and various headers)
This patch provides better len/truesize ratios for packets sent to
loopback device, and reduce memory needs for in-flight loopback packets,
particularly on arches with big pages.
If a sender sends many 1-byte packets to an unresponsive application,
receiver rmem_alloc will grow faster and will stop queuing these packets
sooner, or will collapse its receive queue to free excess memory.
netperf -t TCP_RR results are improved by ~4 %, and many workloads are
improved as well (tbench, mysql...)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 28 Nov 2011 20:30:35 +0000 (20:30 +0000)]
flow_dissector: use a 64bit load/store
Le lundi 28 novembre 2011 à 19:06 -0500, David Miller a écrit :
> From: Dimitris Michailidis <dm@chelsio.com>
> Date: Mon, 28 Nov 2011 08:25:39 -0800
>
> >> +bool skb_flow_dissect(const struct sk_buff *skb, struct flow_keys
> >> *flow)
> >> +{
> >> + int poff, nhoff = skb_network_offset(skb);
> >> + u8 ip_proto;
> >> + u16 proto = skb->protocol;
> >
> > __be16 instead of u16 for proto?
>
> I'll take care of this when I apply these patches.
( CC trimmed )
Thanks David !
Here is a small patch to use one 64bit load/store on x86_64 instead of
two 32bit load/stores.
[PATCH net-next] flow_dissector: use a 64bit load/store
gcc compiler is smart enough to use a single load/store if we
memcpy(dptr, sptr, 8) on x86_64, regardless of
CONFIG_CC_OPTIMIZE_FOR_SIZE
In IP header, daddr immediately follows saddr, this wont change in the
future. We only need to make sure our flow_keys (src,dst) fields wont
break the rule.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
xfs: force buffer writeback before blocking on the ilock in inode reclaim
If we are doing synchronous inode reclaim we block the VM from making
progress in memory reclaim. So if we encouter a flush locked inode
promote it in the delwri list and wake up xfsbufd to write it out now.
Without this we can get hangs of up to 30 seconds during workloads hitting
synchronous inode reclaim.
The scheme is copied from what we do for dquot reclaims.
Reported-by: Simon Kirby <sim@hostway.ca> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Simon Kirby <sim@hostway.ca> Signed-off-by: Ben Myers <bpm@sgi.com>
Tom Herbert [Mon, 28 Nov 2011 16:33:09 +0000 (16:33 +0000)]
bql: Byte queue limits
Networking stack support for byte queue limits, uses dynamic queue
limits library. Byte queue limits are maintained per transmit queue,
and a dql structure has been added to netdev_queue structure for this
purpose.
Configuration of bql is in the tx-<n> sysfs directory for the queue
under the byte_queue_limits directory. Configuration includes:
limit_min, bql minimum limit
limit_max, bql maximum limit
hold_time, bql slack hold time
Also under the directory are:
limit, current byte limit
inflight, current number of bytes on the queue
Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Mon, 28 Nov 2011 16:33:02 +0000 (16:33 +0000)]
xps: Add xps_queue_release function
This patch moves the xps specific parts in netdev_queue_release into
its own function which netdev_queue_release can call. This allows
netdev_queue_release to be more generic (for adding new attributes
to tx queues).
Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Mon, 28 Nov 2011 16:32:52 +0000 (16:32 +0000)]
net: Add netdev interfaces for recording sends/comp
Add interfaces for drivers to call for recording number of packets and
bytes at send time and transmit completion. Also, added a function to
"reset" a queue. These will be used by Byte Queue Limits.
Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Mon, 28 Nov 2011 16:32:44 +0000 (16:32 +0000)]
net: Add queue state xoff flag for stack
Create separate queue state flags so that either the stack or drivers
can turn on XOFF. Added a set of functions used in the stack to determine
if a queue is really stopped (either by stack or driver)
Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Mon, 28 Nov 2011 16:32:35 +0000 (16:32 +0000)]
dql: Dynamic queue limits
Implementation of dynamic queue limits (dql). This is a libary which
allows a queue limit to be dynamically managed. The goal of dql is
to set the queue limit, number of objects to the queue, to be minimized
without allowing the queue to be starved.
dql would be used with a queue which has these properties:
1) Objects are queued up to some limit which can be expressed as a
count of objects.
2) Periodically a completion process executes which retires consumed
objects.
3) Starvation occurs when limit has been reached, all queued data has
actually been consumed but completion processing has not yet run,
so queuing new data is blocked.
4) Minimizing the amount of queued data is desirable.
A canonical example of such a queue would be a NIC HW transmit queue.
The queue limit is dynamic, it will increase or decrease over time
depending on the workload. The queue limit is recalculated each time
completion processing is done. Increases occur when the queue is
starved and can exponentially increase over successive intervals.
Decreases occur when more data is being maintained in the queue than
needed to prevent starvation. The number of extra objects, or "slack",
is measured over successive intervals, and to avoid hysteresis the
limit is only reduced by the miminum slack seen over a configurable
time period.
dql API provides routines to manage the queue:
- dql_init is called to intialize the dql structure
- dql_reset is called to reset dynamic values
- dql_queued called when objects are being enqueued
- dql_avail returns availability in the queue
- dql_completed is called when objects have be consumed in the queue
Configuration consists of:
- max_limit, maximum limit
- min_limit, minimum limit
- slack_hold_time, time to measure instances of slack before reducing
queue limit
Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Prasad Joshi [Sat, 26 Nov 2011 05:30:47 +0000 (11:00 +0530)]
logfs: update page reference count for pined pages
LogFS sets PG_private flag to indicate a pined page. We assumed that
marking a page as private is enough to ensure its existence. But
instead it is necessary to hold a reference count to the page.
The change resolves the following BUG
BUG: Bad page state in process flush-253:16 pfn:6a6d0
page flags: 0x100000000000808(uptodate|private)
Suggested-and-Acked-by: Joern Engel <joern@logfs.org> Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
Heiko Carstens [Tue, 29 Nov 2011 08:33:37 +0000 (09:33 +0100)]
[S390] topology: increase poll frequency if change is anticipated
Increase cpu topology change poll frequency if a change is anticipated.
Otherwise a user might be a bit confused to have to wait up to a minute
in order to see a change this should be visible immediatly.
However there is no guarantee that the change will happen during the
time frame the poll frequency is increased.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Another round of cleanup for entry[64].S, in particular the program check
handler looks more reasonable now. The code size for the 31 bit kernel
has been reduced by 616 byte and by 528 byte for the 64 bit version.
Even better the code is a bit faster as well.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[S390] kvm: move cmf host id constant out of lowcore
There is no reason for the cpu-measurement-facility host id constant to
reside in the lowcore where space is precious. Use an entry in the literal
pool in HANDLE_SIE_INTERCEPT and a stack slot in sie64a.
While we are at it replace the id -1 with 0 to indicate host execution.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Carsten Otte [Tue, 29 Nov 2011 08:33:32 +0000 (09:33 +0100)]
[S390] disable MACHINE_IS_VM check for pfault
This patch disables the check for MACHINE_IS_VM when initializing the
pfault infrastructure. The code checks for successful completion of
diag 258 anyway, thus it's safe to try initialization on LPAR anyway.
This is needed to use pfault on kvm
Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Tue, 29 Nov 2011 08:33:30 +0000 (09:33 +0100)]
[S390] topology: get rid of ifdefs
Remove all ifdefs from topology code and also only compile it for the
CONFIG_SCHED_BOOK case. The new code selects SCHED_MC if SCHED_BOOK is
selected. SCHED_MC without SCHED_BOOK is not possible anymore.
Furthermore various sysfs attributes are not available anymore for the
!SCHED_BOOK case. In particular all attributes that correspond to
CPU polarization.
But since all real world kernels have SCHED_BOOK selected anyway this
doesn't matter too much.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>