This patch series implement the Tail loss probe (TLP) algorithm described
in http://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01. The
first patch implements the basic algorithm.
TLP's goal is to reduce tail latency of short transactions. It achieves
this by converting retransmission timeouts (RTOs) occuring due
to tail losses (losses at end of transactions) into fast recovery.
TLP transmits one packet in two round-trips when a connection is in
Open state and isn't receiving any ACKs. The transmitted packet, aka
loss probe, can be either new or a retransmission. When there is tail
loss, the ACK from a loss probe triggers FACK/early-retransmit based
fast recovery, thus avoiding a costly RTO. In the absence of loss,
there is no change in the connection state.
PTO stands for probe timeout. It is a timer event indicating
that an ACK is overdue and triggers a loss probe packet. The PTO value
is set to max(2*SRTT, 10ms) and is adjusted to account for delayed
ACK timer when there is only one oustanding packet.
TLP Algorithm
On transmission of new data in Open state:
-> packets_out > 1: schedule PTO in max(2*SRTT, 10ms).
-> packets_out == 1: schedule PTO in max(2*RTT, 1.5*RTT + 200ms)
-> PTO = min(PTO, RTO)
Conditions for scheduling PTO:
-> Connection is in Open state.
-> Connection is either cwnd limited or no new data to send.
-> Number of probes per tail loss episode is limited to one.
-> Connection is SACK enabled.
When PTO fires:
new_segment_exists:
-> transmit new segment.
-> packets_out++. cwnd remains same.
no_new_packet:
-> retransmit the last segment.
Its ACK triggers FACK or early retransmit based recovery.
ACK path:
-> rearm RTO at start of ACK processing.
-> reschedule PTO if need be.
In addition, the patch includes a small variation to the Early Retransmit
(ER) algorithm, such that ER and TLP together can in principle recover any
N-degree of tail loss through fast recovery. TLP is controlled by the same
sysctl as ER, tcp_early_retrans sysctl.
tcp_early_retrans==0; disables TLP and ER.
==1; enables RFC5827 ER.
==2; delayed ER.
==3; TLP and delayed ER. [DEFAULT]
==4; TLP only.
The TLP patch series have been extensively tested on Google Web servers.
It is most effective for short Web trasactions, where it reduced RTOs by 15%
and improved HTTP response time (average by 6%, 99th percentile by 10%).
The transmitted probes account for <0.5% of the overall transmissions.
Signed-off-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Mon, 11 Mar 2013 04:40:14 +0000 (04:40 +0000)]
bnx2x: use list_move instead of list_del/list_add
Using list_move() instead of list_del() + list_add().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Mon, 11 Mar 2013 05:17:46 +0000 (05:17 +0000)]
bnx2x: Control number of vfs dynamically
1. Support sysfs interface for getting the maximal number of virtual functions
of a given physical function.
2. Support sysfs interface for getting and setting the current number of
virtual functions.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Mon, 11 Mar 2013 05:17:45 +0000 (05:17 +0000)]
bnx2x: Add iproute2 support for vfs
This patch adds support for iproute2 callbacks allowing querying a physical
function as to its child virtual functions, and setting the macs and vlans
of said virtual functions.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Mon, 11 Mar 2013 05:17:44 +0000 (05:17 +0000)]
bnx2x: Prevent "Unknown MF" print in SF mode
When using a chip operating in Single Function mode, when the chip is probed
the bnx2x would print a message warning of an unknown Multi Function mode.
This patch prevents said message.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Mon, 11 Mar 2013 05:17:42 +0000 (05:17 +0000)]
bnx2x: Set ethtool ops for vfs
Virtual functions don't have access to HW registers, therefore most ethtool ops
are forbidden to them. Instead of checking in each op whether the device being
driven is a virtual function or a physical function, this patch creates a
separate ethtool ops struct for virtual functions and uses it to initialize
the ethtool ops of the driver in case it is driving a virtual function device.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ethernet: amd: use PTR_RET instead of IS_ERR + PTR_ERR
This uses PTR_RET instead of IS_ERR and PTR_ERR in order to increase
readability.
Signed-off-by: Silviu-Mihai Popescu <silviupopescu1990@gmail.com> Acked-by: <Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Hector Palacios [Sun, 10 Mar 2013 22:50:03 +0000 (22:50 +0000)]
phy/micrel: move flag handling to function for common use
The flag MICREL_PHY_50MHZ_CLK is not of exclusive use of KSZ8051
model. At least KSZ8021 and KSZ8031 models also use it.
This patch moves the handling of this and future flags to a
separate function so that the different PHY models can call it on
their init function, if needed.
Signed-off-by: Hector Palacios <hector.palacios@digi.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hector Palacios [Sun, 10 Mar 2013 22:50:02 +0000 (22:50 +0000)]
phy/micrel: Add support for KSZ8031
Micrel PHY KSZ8031 is similar to KSZ8021 and also requires the special
initialization of "Operation Mode Strap Override" in reg 0x16
introduced in 212ea99 (phy/micrel: Implement support for KSZ8021).
Signed-off-by: Hector Palacios <hector.palacios@digi.com> Reviewed-by: Marek Vasut <marex@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
The bridge multicast fast leave feature was added sufficient space
was not reserved in the netlink message. This means the flag may be
lost in netlink events and results of queries.
Found by observation while looking up some netlink stuff for discussion with Vlad.
Problem introduced by commit c2d3babfafbb9f6629cfb47139758e59a5eb0d80
Author: David S. Miller <davem@davemloft.net>
Date: Wed Dec 5 16:24:45 2012 -0500
bridge: implement multicast fast leave
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David Ward [Mon, 11 Mar 2013 10:43:39 +0000 (10:43 +0000)]
net/ipv4: Ensure that location of timestamp option is stored
This is needed in order to detect if the timestamp option appears
more than once in a packet, to remove the option if the packet is
fragmented, etc. My previous change neglected to store the option
location when the router addresses were prespecified and Pointer >
Length. But now the option location is also stored when Flag is an
unrecognized value, to ensure these option handling behaviors are
still performed.
Signed-off-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb4: Allow for backward compatibility with new VPD scheme.
New scheme calls for 3rd party VPD at offset 0x0 and Chelsio VPD at offset
0x400 of the function. If no 3rd party VPD is present, then a copy of
Chelsio's VPD will be at offset 0x0 to keep in line with PCI spec which
requires the VPD to be present at offset 0x0.
Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 12 Mar 2013 09:14:44 +0000 (05:14 -0400)]
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next
Ben Hutchings says:
====================
1. Merge sfc changes (only) accepted for 3.9.
2. PTP improvements from Laurence Evans.
3. Overhaul of RX buffer management:
- Always allocate pages, and enable scattering where possible
- Fit as many buffers as will fit into a page, rather than limiting to 2
- Introduce recycle rings to reduce the need for IOMMU mapping and
unmapping
4. PCI error recovery (AER and EEH) implementation.
5. Fix a bug in RX filter replacement.
6. Fix configuration with 1 RX queue in the PF and multiple RX queues in
VFs.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Dalton [Mon, 11 Mar 2013 06:52:28 +0000 (06:52 +0000)]
flow_dissector: support L2 GRE
Add support for L2 GRE tunnels, so that RPS can be more effective.
Signed-off-by: Michael Dalton <mwdalton@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Lindner [Mon, 4 Mar 2013 02:39:49 +0000 (10:39 +0800)]
batman-adv: verify tt len does not exceed packet len
batadv_iv_ogm_process() accesses the packet using the tt_num_changes
attribute regardless of the real packet len (assuming the length check
was done before). Therefore a length check is needed to avoid reading
random memory.
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Linus Torvalds [Mon, 11 Mar 2013 14:54:29 +0000 (07:54 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Misc minor fixes mostly related to tracing"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
s390: Fix a header dependencies related build error
tracing: update documentation of snapshot utility
tracing: Do not return EINVAL in snapshot when not allocated
tracing: Add help of snapshot feature when snapshot is empty
ftrace: Update the kconfig for DYNAMIC_FTRACE
1) Missing cancel of work items in mac80211 MLME, from Ben Greear.
2) Fix DMA mapping handling in iwlwifi by using coherent DMA for
command headers, from Johannes Berg.
3) Decrease the amount of pressure on the page allocator by using order
1 pages less in iwlwifi, from Emmanuel Grumbach.
4) Fix mesh PS broadcast OOPS in mac80211, from Marco Porsch.
5) Don't forget to recalculate idle state in mac80211 monitor
interface, from Felix Fietkau.
6) Fix varargs in netfilter conntrack handler, from Joe Perches.
7) Need to reset entire chip when command queue fills up in iwlwifi,
from Emmanuel Grumbach.
8) The TX antenna value must be valid when calibrations are performed
in iwlwifi, fix from Dor Shaish.
9) Don't generate netfilter audit log entries when audit is disabled,
from Gao Feng.
10) Deal with DMA unit hang on e1000e during power state transitions,
from Bruce Allan.
11) Remove BUILD_BUG_ON check from igb driver, from Alexander Duyck.
12) Fix lockdep warning on i2c handling of igb driver, from Carolyn
Wyborny.
13) Fix several TTY handling issues in IRDA ircomm tty driver, from
Peter Hurley.
14) Several QFQ packet scheduler fixes from Paolo Valente.
15) When VXLAN encapsulates on transmit, we have to reset the netfilter
state. From Zang MingJie.
16) Fix jiffie check in net_rx_action() so that we really cap the
processing at 2HZ. From Eric Dumazet.
17) Fix erroneous trigger of IP option space exhaustion, when routers
are pre-specified and we are looking to see if we can insert a
timestamp, we will have the space. From David Ward.
18) Fix various issues in benet driver wrt waiting for firmware to
finish POST after resets or errors. From Gavin Shan and Sathya
Perla.
19) Fix TX locking in SFC driver, from Ben Hutchings.
20) Like the VXLAN fix above, when we encap in a TUN device we have to
reset the netfilter state. This should fix several strange crashes
reported by Dave Jones and others. From Eric Dumazet.
21) Don't forget to clean up MAC address resources when shutting down a
port in mlx4 driver, from Yan Burman.
22) Fix divide by zero in vmxnet3 driver, from Bhavesh Davda.
23) Fix device statistic regression in tg3 when the driver is using
phylib, from Nithin Sujir.
24) Fix info leak in several netlink handlers, from Mathias Krause.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (79 commits)
6lowpan: Fix endianness issue in is_addr_link_local().
rrunner.c: fix possible memory leak in rr_init_one()
dcbnl: fix various netlink info leaks
rtnl: fix info leak on RTM_GETLINK request for VF devices
bridge: fix mdb info leaks
tg3: Update link_up flag for phylib devices
ipv6: stop multicast forwarding to process interface scoped addresses
bridging: fix rx_handlers return code
netlabel: fix build problems when CONFIG_IPV6=n
drivers/isdn: checkng length to be sure not memory overflow
net/rds: zero last byte for strncpy
bnx2x: Fix SFP+ misconfiguration in iSCSI boot scenario
bnx2x: Fix intermittent long KR2 link up time
macvlan: Set IFF_UNICAST_FLT flag to prevent unnecessary promisc mode.
team: unsyc the devices addresses when port is removed
bridge: add missing vid to br_mdb_get()
Fix: sparse warning in inet_csk_prepare_forced_close
afkey: fix a typo
MAINTAINERS: Update qlcnic maintainers list
netlabel: correctly list all the static label mappings
...
Linus Torvalds [Mon, 11 Mar 2013 14:50:39 +0000 (07:50 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML fixes from Richard Weinberger:
"This update brings various fixes.
Nothing special...
In my local queue I have some more fixes which will be sent later to
you. 3.9 uncovered strange UML issues. :("
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: Use tty_port in SIGWINCH handler
um: Use tty_port_operations->destruct
um: fix build failure due to mess-up of sig_info protorype
um: add missing declaration of 'getrlimit()' and friends
net : enable tx time stamping in the vde driver.
hostfs: fix a not needed double check
Linus Torvalds [Mon, 11 Mar 2013 14:49:37 +0000 (07:49 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
"Except for the largish change to the ALPS driver adding "Dolphin V1"
support and Wacom getting a new signature of yet another device, the
rest are straightforward driver fixes."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: mms114 - Fix regulator enable and disable paths
Input: ads7864 - check return value of regulator enable
Input: tc3589x-keypad - fix keymap size
Input: wacom - add support for 0x10d
Input: ALPS - update documentation for recent touchpad driver mods
Input: ALPS - add "Dolphin V1" touchpad support
Input: ALPS - remove unused argument to alps_enter_command_mode()
Input: cypress_ps2 - fix trackpadi found in Dell XPS12
Valentin Ilie [Sun, 10 Mar 2013 11:15:13 +0000 (11:15 +0000)]
net: can: af_can.c: Fix checkpatch warnings
Replace printk(KERN_ERR with pr_err
Add space before {
Removed OOM messages
Signed-off-by: Valentin Ilie <valentin.ilie@gmail.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
In file included from arch/s390/include/asm/perf_event.h:9:0,
from include/linux/perf_event.h:24,
from kernel/events/ring_buffer.c:12:
arch/s390/include/asm/cpu_mf.h: In function 'qctri':
arch/s390/include/asm/cpu_mf.h:61:12: error: 'EINVAL' undeclared (first use in this function)
cpu_mf.h had an implicit errno.h dependency, which was added
indirectly via cgroups.h but not anymore. Add it explicitly.
um: fix build failure due to mess-up of sig_info protorype
arch/um/os-Linux/signal.c:18:8: error: conflicting types for 'sig_info'
In file included from /home/slyfox/linux-2.6/arch/um/os-Linux/signal.c:12:0:
arch/um/include/shared/as-layout.h:64:15: note: previous declaration of 'sig_info' was here
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> CC: Jeff Dike <jdike@addtoit.com> CC: Richard Weinberger <richard@nod.at> CC: "Martin Pärtel" <martin.partel@gmail.com> CC: Al Viro <viro@zeniv.linux.org.uk> CC: user-mode-linux-devel@lists.sourceforge.net CC: user-mode-linux-user@lists.sourceforge.net CC: linux-kernel@vger.kernel.org Signed-off-by: Richard Weinberger <richard@nod.at>
um: add missing declaration of 'getrlimit()' and friends
arch/um/os-Linux/start_up.c: In function 'check_coredump_limit':
arch/um/os-Linux/start_up.c:338:16: error: storage size of 'lim' isn't known
arch/um/os-Linux/start_up.c:339:2: error: implicit declaration of function 'getrlimit' [-Werror=implicit-function-declaration]
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> CC: Jeff Dike <jdike@addtoit.com> CC: Richard Weinberger <richard@nod.at> CC: Al Viro <viro@zeniv.linux.org.uk> CC: user-mode-linux-devel@lists.sourceforge.net CC: user-mode-linux-user@lists.sourceforge.net CC: linux-kernel@vger.kernel.org Signed-off-by: Richard Weinberger <richard@nod.at>
Paul Chavent [Thu, 6 Dec 2012 15:25:05 +0000 (16:25 +0100)]
net : enable tx time stamping in the vde driver.
This new version moves the skb_tx_timestamp in the main uml
driver. This should avoid the need to call this function in each
transport (vde, slirp, tuntap, ...). It also add support for ethtool
get_ts_info.
Signed-off-by: Paul Chavent <paul.chavent@onera.fr> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at>
Marco Stornelli [Sat, 20 Oct 2012 10:02:59 +0000 (12:02 +0200)]
hostfs: fix a not needed double check
With the commit 3be2be0a32c18b0fd6d623cda63174a332ca0de1 we removed vmtruncate,
but actaully there is no need to call inode_newsize_ok() because the checks are
already done in inode_change_ok() at the begin of the function.
Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at>
Mark Brown [Mon, 4 Mar 2013 04:21:30 +0000 (20:21 -0800)]
Input: mms114 - Fix regulator enable and disable paths
When it uses regulators the mms114 driver checks to see if it managed to
acquire regulators and ignores errors. This is not the intended usage and
not great style in general.
Since the driver already refuses to probe if it fails to allocate the
regulators simply make the enable and disable calls unconditional and
add appropriate error handling, including adding cleanup of the
regulators if setup_reg() fails.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Joonyoung Shim <jy0922.shim@samsung.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Rabin Vincent [Sun, 10 Mar 2013 00:17:20 +0000 (16:17 -0800)]
Input: tc3589x-keypad - fix keymap size
The keymap size used by tc3589x is too low, leading to the driver
overwriting other people's memory. Fix this by making the driver
use the automatically allocated keymap provided by
matrix_keypad_build_keymap() instead of allocating one on its own.
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Christoph Paasch [Sun, 10 Mar 2013 05:18:39 +0000 (05:18 +0000)]
tcp: Remove unused tw_cookie_values from tcp_timewait_sock
tw_cookie_values is never used in the TCP-stack.
It was added by 435cf559f (TCPCT part 1d: define TCP cookie option,
extend existing struct's), but already at that time it was not used at
all, nor mentioned in the commit-message.
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Sat, 9 Mar 2013 23:00:39 +0000 (23:00 +0000)]
ipv6: introduce ip6tunnel_xmit() helper
Similar to iptunnel_xmit(), group these operations into a
helper function.
This by the way fixes the missing u64_stats_update_begin()
and u64_stats_update_end() for 32 bit arch.
Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Pravin B Shelar <pshelar@nicira.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mathias Krause [Sat, 9 Mar 2013 05:52:21 +0000 (05:52 +0000)]
dcbnl: fix various netlink info leaks
The dcb netlink interface leaks stack memory in various places:
* perm_addr[] buffer is only filled at max with 12 of the 32 bytes but
copied completely,
* no in-kernel driver fills all fields of an IEEE 802.1Qaz subcommand,
so we're leaking up to 58 bytes for ieee_ets structs, up to 136 bytes
for ieee_pfc structs, etc.,
* the same is true for CEE -- no in-kernel driver fills the whole
struct,
Prevent all of the above stack info leaks by properly initializing the
buffers/structures involved.
Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mathias Krause [Sat, 9 Mar 2013 05:52:20 +0000 (05:52 +0000)]
rtnl: fix info leak on RTM_GETLINK request for VF devices
Initialize the mac address buffer with 0 as the driver specific function
will probably not fill the whole buffer. In fact, all in-kernel drivers
fill only ETH_ALEN of the MAX_ADDR_LEN bytes, i.e. 6 of the 32 possible
bytes. Therefore we currently leak 26 bytes of stack memory to userland
via the netlink interface.
Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mathias Krause [Sat, 9 Mar 2013 05:52:19 +0000 (05:52 +0000)]
bridge: fix mdb info leaks
The bridging code discloses heap and stack bytes via the RTM_GETMDB
netlink interface and via the notify messages send to group RTNLGRP_MDB
afer a successful add/del.
Fix both cases by initializing all unset members/padding bytes with
memset(0).
Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Sat, 9 Mar 2013 16:38:39 +0000 (16:38 +0000)]
tunnel: use iptunnel_xmit() again
With recent patches from Pravin, most tunnels can't use iptunnel_xmit()
any more, due to ip_select_ident() and skb->ip_summed. But we can just
move these operations out of iptunnel_xmit(), so that tunnels can
use it again.
This by the way fixes a bug in vxlan (missing nf_reset()) for net-next.
Cc: Pravin B Shelar <pshelar@nicira.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 10 Mar 2013 00:51:13 +0000 (16:51 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace bugfixes from Eric Biederman:
"This is three simple fixes against 3.9-rc1. I have tested each of
these fixes and verified they work correctly.
The userns oops in key_change_session_keyring and the BUG_ON triggered
by proc_ns_follow_link were found by Dave Jones.
I am including the enhancement for mount to only trigger requests of
filesystem modules here instead of delaying this for the 3.10 merge
window because it is both trivial and the kind of change that tends to
bit-rot if left untouched for two months."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
proc: Use nd_jump_link in proc_ns_follow_link
fs: Limit sys_mount to only request filesystem modules (Part 2).
fs: Limit sys_mount to only request filesystem modules.
userns: Stop oopsing in key_change_session_keyring
o Add support for LED test on 83xx series adapters
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Shahed Shaikh [Fri, 8 Mar 2013 09:53:49 +0000 (09:53 +0000)]
qlcnic: Fix endian issues in 83xx driver
o Split mailbox structure elements on boundary of adapter
register size i.e. 32bit.
o Shuffle the position of structure elements based on CPU endianness.
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
The function is declared to take u32 but definition uses enum.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Rasesh Mody <rmody@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pravin B Shelar [Thu, 7 Mar 2013 13:22:36 +0000 (13:22 +0000)]
VXLAN: Use UDP Tunnel segmention.
Enable TSO for VXLAN devices and use UDP_TUNNEL to offload vxlan
segmentation.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pravin B Shelar [Thu, 7 Mar 2013 13:21:51 +0000 (13:21 +0000)]
tunneling: Add generic Tunnel segmentation.
Adds generic tunneling offloading support for IPv4-UDP based
tunnels.
GSO type is added to request this offload for a skb.
netdev feature NETIF_F_UDP_TUNNEL is added for hardware offloaded
udp-tunnel support. Currently no device supports this feature,
software offload is used.
This can be used by tunneling protocols like VXLAN.
CC: Jesse Gross <jesse@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pravin B Shelar [Thu, 7 Mar 2013 13:21:46 +0000 (13:21 +0000)]
tunneling: Capture inner mac header during encapsulation.
This patch adds inner mac header. This will be used in next patch
to find tunner header length. Header len is required to copy tunnel
header to each gso segment.
This patch does not change any functionality.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
This function will be used in next VXLAN_GSO patch. This patch does
not change any functionality.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pravin B Shelar [Thu, 7 Mar 2013 09:28:01 +0000 (09:28 +0000)]
net: Kill link between CSUM and SG features.
Earlier SG was unset if CSUM was not available for given device to
force skb copy to avoid sending inconsistent csum.
Commit c9af6db4c11c (net: Fix possible wrong checksum generation)
added explicit flag to force copy to fix this issue. Therefore
there is no need to link SG and CSUM, following patch kills this
link between there two features.
This patch is also required following patch in series.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 9 Mar 2013 18:31:01 +0000 (10:31 -0800)]
Atmel MXT touchscreen: increase reset timeouts
There is a more complete atmel patch-series out by Nick Dyer that fixes
this and other things, but in the meantime this is the minimal thing to
get the touchscreen going on (at least my) Pixel Chromebook.
Not that I want my dirty fingers near that beautiful screen, but it
seems that a non-initialized touchscreen will also end up being a
constant wakeup source, so you have to disable it to go to sleep. And
it's easier to just fix the initialization sequence.
Update proc_ns_follow_link to use nd_jump_link instead of just
manually updating nd.path.dentry.
This fixes the BUG_ON(nd->inode != parent->d_inode) reported by Dave
Jones and reproduced trivially with mkdir /proc/self/ns/uts/a.
Sigh it looks like the VFS change to require use of nd_jump_link
happend while proc_ns_follow_link was baking and since the common case
of proc_ns_follow_link continued to work without problems the need for
making this change was overlooked.
Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Linus Torvalds [Sat, 9 Mar 2013 01:33:20 +0000 (17:33 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"These are scattered fixes and one performance improvement. The
biggest functional change is in how we throttle metadata changes. The
new code bumps our average file creation rate up by ~13% in fs_mark,
and lowers CPU usage.
Stefan bisected out a regression in our allocation code that made
balance loop on extents larger than 256MB."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: improve the delayed inode throttling
Btrfs: fix a mismerge in btrfs_balance()
Btrfs: enforce min_bytes parameter during extent allocation
Btrfs: allow running defrag in parallel to administrative tasks
Btrfs: avoid deadlock on transaction waiting list
Btrfs: do not BUG_ON on aborted situation
Btrfs: do not BUG_ON in prepare_to_reloc
Btrfs: free all recorded tree blocks on error
Btrfs: build up error handling for merge_reloc_roots
Btrfs: check for NULL pointer in updating reloc roots
Btrfs: fix unclosed transaction handler when the async transaction commitment fails
Btrfs: fix wrong handle at error path of create_snapshot() when the commit fails
Btrfs: use set_nlink if our i_nlink is 0
Benson Leung [Fri, 8 Mar 2013 03:43:34 +0000 (19:43 -0800)]
Platform: x86: chromeos_laptop : Add basic platform data for atmel devices
Add basic platform data to get the current upstream driver working
with the 224s touchpad and 1664s touchscreen.
We will be using NULL config so we will use the settings from the
devices' NVRAMs.
Daniel Kurtz [Fri, 8 Mar 2013 03:43:33 +0000 (19:43 -0800)]
Input: atmel_mxt_ts - Support for touchpad variant
This same driver can be used by atmel based touchscreens and touchpads
(buttonpads). Platform data may specify a device is a touchpad
using the is_tp flag.
This will cause the driver to perform some touchpad specific
initializations, such as:
* register input device name "Atmel maXTouch Touchpad" instead of
Touchscreen.
* register BTN_LEFT & BTN_TOOL_* event types.
* register axis resolution (as a fixed constant, for now)
* register BUTTONPAD property
* process GPIO buttons using reportid T19
Input event GPIO mapping is done by the platform data key_map array.
key_map[x] should contain the KEY or BTN code to send when processing
GPIOx from T19. To specify a GPIO as not an input source, populate
with KEY_RESERVED, or 0.
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Benson Leung <bleung@chromium.org> Signed-off-by: Nick Dyer <nick.dyer@itdev.co.uk> Tested-by: Olof Johansson <olof@lixom.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 8 Mar 2013 23:22:08 +0000 (15:22 -0800)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French:
"A small set of cifs fixes which includes one for a recent regression
in the write path (pointed out by Anton), some fixes for rename
problems and as promised for 3.9 removing the obsolete sockopt mount
option (and the accompanying deprecation warning)."
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
CIFS: Fix missing of oplock_read value in smb30_values structure
cifs: don't try to unlock pagecache page after releasing it
cifs: remove the sockopt= mount option
cifs: Check server capability before attempting silly rename
cifs: Fix bug when checking error condition in cifs_rename_pending_delete()
Linus Torvalds [Fri, 8 Mar 2013 23:05:42 +0000 (15:05 -0800)]
Merge branch 'akpm' (fixes from Andrew)
Merge misc fixes from Andrew Morton.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
alpha: boot: fix build breakage introduced by system.h disintegration
memcg: initialize kmem-cache destroying work earlier
Randy has moved
ksm: fix m68k build: only NUMA needs pfn_to_nid
dmi_scan: fix missing check for _DMI_ signature in smbios_present()
Revert parts of "hlist: drop the node parameter from iterators"
idr: remove WARN_ON_ONCE() on negative IDs
mm/mempolicy.c: fix sp_node_init() argument ordering
mm/mempolicy.c: fix wrong sp_node insertion
ipc: don't allocate a copy larger than max
ipc: fix potential oops when src msg > 4k w/ MSG_COPY
Will Deacon [Fri, 8 Mar 2013 20:43:37 +0000 (12:43 -0800)]
alpha: boot: fix build breakage introduced by system.h disintegration
Commit ec2212088c42 ("Disintegrate asm/system.h for Alpha") removed the
system.h include from boot/head.S, which puts the PAL_* asm constants
out of scope.
Include <asm/pal.h> so we can get building again.
Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: David Rusling <david.rusling@linaro.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memcg: initialize kmem-cache destroying work earlier
Fix a warning from lockdep caused by calling cancel_work_sync() for
uninitialized struct work. This path has been triggered by destructon
kmem-cache hierarchy via destroying its root kmem-cache.
cache ffff88003c072d80
obj ffff88003b410000 cache ffff88003c072d80
obj ffff88003b924000 cache ffff88003c20bd40
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
Pid: 2825, comm: insmod Tainted: G O 3.9.0-rc1-next-20130307+ #611
Call Trace:
__lock_acquire+0x16a2/0x1cb0
lock_acquire+0x8a/0x120
flush_work+0x38/0x2a0
__cancel_work_timer+0x89/0xf0
cancel_work_sync+0xb/0x10
kmem_cache_destroy_memcg_children+0x81/0xb0
kmem_cache_destroy+0xf/0xe0
init_module+0xcb/0x1000 [kmem_test]
do_one_initcall+0x11a/0x170
load_module+0x19b0/0x2320
SyS_init_module+0xc6/0xf0
system_call_fastpath+0x16/0x1b
Hugh Dickins [Fri, 8 Mar 2013 20:43:34 +0000 (12:43 -0800)]
ksm: fix m68k build: only NUMA needs pfn_to_nid
A CONFIG_DISCONTIGMEM=y m68k config gave
mm/ksm.c: In function `get_kpfn_nid':
mm/ksm.c:492: error: implicit declaration of function `pfn_to_nid'
linux/mmzone.h declares it for CONFIG_SPARSEMEM and CONFIG_FLATMEM, but
expects the arch's asm/mmzone.h to declare it for CONFIG_DISCONTIGMEM
(see arch/mips/include/asm/mmzone.h for example).
Or perhaps it is only expected when CONFIG_NUMA=y: too much of a maze,
and m68k got away without it so far, so fix the build in mm/ksm.c.
Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Petr Holasek <pholasek@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Hutchings [Fri, 8 Mar 2013 20:43:32 +0000 (12:43 -0800)]
dmi_scan: fix missing check for _DMI_ signature in smbios_present()
Commit 9f9c9cbb6057 ("drivers/firmware/dmi_scan.c: fetch dmi version
from SMBIOS if it exists") hoisted the check for "_DMI_" into
dmi_scan_machine(), which means that we don't bother to check for
"_DMI_" at offset 16 in an SMBIOS entry. smbios_present() may also call
dmi_present() for an address where we found "_SM_", if it failed further
validation.
Check for "_DMI_" in smbios_present() before calling dmi_present().
[akpm@linux-foundation.org: fix build] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Reported-by: Tim McGrath <tmhikaru@gmail.com> Tested-by: Tim Mcgrath <tmhikaru@gmail.com> Cc: Zhenzhong Duan <zhenzhong.duan@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Fri, 8 Mar 2013 20:43:31 +0000 (12:43 -0800)]
Revert parts of "hlist: drop the node parameter from iterators"
Commit b67bfe0d42ca ("hlist: drop the node parameter from iterators")
did a lot of nice changes but also contains two small hunks that seem to
have slipped in accidentally and have no apparent connection to the
intent of the patch.
This reverts the two extraneous changes.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Peter Senna Tschudin <peter.senna@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tejun Heo [Fri, 8 Mar 2013 20:43:30 +0000 (12:43 -0800)]
idr: remove WARN_ON_ONCE() on negative IDs
idr_find(), idr_remove() and idr_replace() used to silently ignore the
sign bit and perform lookup with the rest of the bits. The weird behavior
has been changed such that negative IDs are treated as invalid. As the
behavior change was subtle, WARN_ON_ONCE() was added in the hope of
determining who's calling idr functions with negative IDs so that they can
be examined for problems.
Up until now, all two reported cases are ID number coming directly from
userland and getting fed into idr_find() and the warnings seem to cause
more problems than being helpful. Drop the WARN_ON_ONCE()s.
Peter Hurley [Fri, 8 Mar 2013 20:43:27 +0000 (12:43 -0800)]
ipc: don't allocate a copy larger than max
When MSG_COPY is set, a duplicate message must be allocated for the copy
before locking the queue. However, the copy could not be larger than was
sent which is limited to msg_ctlmax.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>