git.karo-electronics.de Git - linux-beck.git/log

Martin Schwidefsky [Mon, 10 Sep 2012 11:00:09 +0000 (13:00 +0200)]

s390/mm,tlb: race of lazy TLB flush vs. recreation of TLB entries

Git commit 050eef364ad70059 "[S390] fix tlb flushing vs. concurrent
/proc accesses" introduced the attach counter to avoid using the
mm_users value to decide between IPTE for every PTE and lazy TLB
flushing with IDTE. That fixed the problem with mm_users but it
introduced another subtle race, fortunately one that is very hard
to hit.

The background is the requirement of the architecture that a valid
PTE may not be changed while it can be used concurrently by another
cpu. The decision between IPTE and lazy TLB flushing needs to be
done while the PTE is still valid. Now if the virtual cpu is
temporarily stopped after the decision to use lazy TLB flushing but
before the invalid bit of the PTE has been set, another cpu can attach
the mm, find that flush_mm is set, do the IDTE, return to userspace,
and recreate a TLB that uses the PTE in question. When the first,
stopped cpu continues it will change the PTE while it is attached on
another cpu. The first cpu will do another IDTE shortly after the
modification of the PTE which makes the race window quite short.

To fix this race the CPU that wants to attach the address space of a
user space thread needs to wait for the end of the PTE modification.
The number of concurrent TLB flushers for an mm is tracked in the
upper 16 bits of the attach_count and finish_arch_post_lock_switch
is used to wait for the end of the flush operation if required.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
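
Illustrative note (not part of the commit): the counter layout described above can be sketched in portable C11 roughly as follows. Only the name attach_count and the "upper 16 bits count flushers" split come from the commit message; the struct, the helper names and the use of <stdatomic.h> are assumptions made for the sketch, and the kernel code uses its own atomic primitives.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the per-mm state. */
struct mm_sketch {
    /* low 16 bits: CPUs attached to the mm, high 16 bits: active TLB flushers */
    atomic_uint attach_count;
};

#define FLUSHER_UNIT 0x10000u

/* A flusher announces itself before it starts modifying the PTE ... */
static void flush_begin(struct mm_sketch *mm)
{
    atomic_fetch_add(&mm->attach_count, FLUSHER_UNIT);
}

/* ... and retracts the announcement once the flush is complete. */
static void flush_end(struct mm_sketch *mm)
{
    atomic_fetch_sub(&mm->attach_count, FLUSHER_UNIT);
}

static unsigned int flushers(struct mm_sketch *mm)
{
    return atomic_load(&mm->attach_count) >> 16;
}

static unsigned int attached(struct mm_sketch *mm)
{
    return atomic_load(&mm->attach_count) & 0xffffu;
}

/* What a CPU attaching the mm would do: wait until no flush is in progress. */
static void wait_for_flushers(struct mm_sketch *mm)
{
    while (flushers(mm) != 0)
        ; /* busy-wait; the kernel would relax the CPU inside this loop */
}
```

Packing both counts into one word lets a single atomic read see the attach state and the flusher state together.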

Martin Schwidefsky [Fri, 26 Oct 2012 15:17:44 +0000 (17:17 +0200)]

sched/mm: call finish_arch_post_lock_switch in idle_task_exit and use_mm

The finish_arch_post_lock_switch is called at the end of the task
switch after all locks have been released. In concept it is paired
with the switch_mm function, but the current code only does the
call in finish_task_switch. Add the call to idle_task_exit and
use_mm. One use case for the additional calls is s390 which will
use finish_arch_post_lock_switch to wait for the completion of
TLB flush operations.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Heiko Carstens [Mon, 27 Jan 2014 09:25:39 +0000 (10:25 +0100)]

s390/uaccess: introduce 'uaccesspt' kernel parameter

The uaccesspt kernel parameter enforces use of the uaccess page
table walk variant. This is mainly for debugging purposes, so this mode
can also be enabled on machines which support the mvcos instruction.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Heiko Carstens [Mon, 27 Jan 2014 09:09:11 +0000 (10:09 +0100)]

s390/uaccess: remove dead kernel parameter 'user_mode='

Remove another leftover from the time when we supported running
user space in either home or primary address space.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Heiko Carstens [Fri, 24 Jan 2014 12:03:42 +0000 (13:03 +0100)]

s390/setup: get rid of MACHINE_HAS_MVCOS machine flag

MACHINE_HAS_MVCOS is used exactly once when the machine is brought up.
There is no need to cache the flag in the machine_flags.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Heiko Carstens [Fri, 24 Jan 2014 11:51:27 +0000 (12:51 +0100)]

s390/uaccess: consistent types

The types 'size_t' and 'unsigned long' have been used randomly for the
uaccess functions. This looks rather confusing.
So let's change all functions to use unsigned long instead and get rid
of size_t in order to have a consistent interface.

The only exception is strncpy_from_user() which uses 'long' since it
may return a signed value (-EFAULT).

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Heiko Carstens [Thu, 23 Jan 2014 10:18:36 +0000 (11:18 +0100)]

s390/uaccess: get rid of indirect function calls

There are only two uaccess variants on s390 left: the version that is used
if the mvcos instruction is available, and the page table walk variant.
So there is no need for expensive indirect function calls.

By default the mvcos variant will be called. If the mvcos instruction is not
available it will call the page table walk variant.

For minimal performance impact the "if (mvcos_is_available)" is implemented
with a jump label, which will be a six byte nop on machines with mvcos.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
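
Illustrative note (not part of the commit): the change from indirect to direct calls can be pictured as below. This is a userspace sketch under stated assumptions: the function names are hypothetical, memcpy() stands in for both copy variants, and a plain boolean replaces the jump label, which has no portable userspace equivalent.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag; the kernel patches a jump label instead of testing a variable. */
static bool mvcos_is_available = true;

static unsigned long copy_mvcos_sketch(void *to, const void *from, unsigned long n)
{
    memcpy(to, from, n);   /* placeholder for the mvcos-based copy */
    return 0;              /* number of bytes NOT copied, as copy_to_user() reports */
}

static unsigned long copy_pt_sketch(void *to, const void *from, unsigned long n)
{
    memcpy(to, from, n);   /* placeholder for the page table walk copy */
    return 0;
}

/* Direct calls selected by one test, instead of a call through a function pointer. */
static unsigned long copy_to_user_sketch(void *to, const void *from, unsigned long n)
{
    if (mvcos_is_available)
        return copy_mvcos_sketch(to, from, n);
    return copy_pt_sketch(to, from, n);
}
```

With only two variants, a single predictable branch is cheaper than an indirect call, and the jump label mentioned in the commit removes even the runtime test.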

Heiko Carstens [Wed, 22 Jan 2014 13:49:30 +0000 (14:49 +0100)]

s390/uaccess: normalize order of parameters of indirect uaccess function calls

For some unknown reason the indirect uaccess functions on s390 implement a
different parameter order than what is usual.

e.g.:

unsigned long copy_to_user(void *to, const void *from, unsigned long n);
vs.
size_t (*copy_to_user)(size_t n, void __user * to, const void *from);

Let's get rid of this confusing parameter reordering.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Sebastian Ott [Mon, 27 Jan 2014 12:29:15 +0000 (13:29 +0100)]

s390/cio: fix irq stats for early interrupts on ccw consoles

Interrupts which happen on ccw consoles prior to their registration
with the driver core are not accounted to the respective device
driver. Fix this by setting the proper interrupt class during
initialization of ccw consoles.

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Sebastian Ott [Mon, 27 Jan 2014 12:28:10 +0000 (13:28 +0100)]

s390/cio: reorder initialization of ccw consoles

Drivers for ccw consoles use ccw_device_probe_console to receive
an initialized ccw device which is already enabled for interrupts.
After that the device driver does the initialization of its private
data. This can race with unsolicited interrupts which can happen
once the device is enabled for interrupts.

Split ccw_device_probe_console into ccw_device_create_console and
ccw_device_enable_console and reorder the initialization of the ccw
console drivers.

While at it mark these functions as __init.

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
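
Illustrative note (not part of the commit): the reordering amounts to "create, initialize private data, then enable interrupts". The sketch below uses hypothetical types and names. The refusal to enable an uninitialized device is an assumption added to make the ordering checkable, not something the commit describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a console device; field names are illustrative. */
struct console_dev_sketch {
    bool irq_enabled;
    void *private_data;   /* driver state an interrupt handler would rely on */
};

/* Step 1: obtain the device with interrupts still disabled. */
static void create_console_sketch(struct console_dev_sketch *cdev)
{
    cdev->irq_enabled = false;
    cdev->private_data = NULL;
}

/* Step 3: enable interrupts only after the driver has set up its state (step 2). */
static int enable_console_sketch(struct console_dev_sketch *cdev)
{
    if (cdev->private_data == NULL)
        return -1;        /* a handler would otherwise see uninitialized state */
    cdev->irq_enabled = true;
    return 0;
}
```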

Sebastian Ott [Mon, 27 Jan 2014 12:26:10 +0000 (13:26 +0100)]

s390/cio: fix driver callback initialization for ccw consoles

ccw consoles are in use before they can be properly registered with
the driver core. For devices which are in use by a device driver we
rely on the ccw_device's pointer to the driver callbacks to be valid.
For ccw consoles this pointer is NULL until they are registered later
during boot and we dereferenced this pointer. This worked by
chance on 64 bit builds (cdev->drv was NULL but the optional callback
cdev->drv->path_event was also NULL by coincidence) and was unnoticed
until we received reports about boot failures on 31 bit systems.
Fix it by initializing the driver pointer for ccw consoles.

Cc: <stable@vger.kernel.org> # 3.10+
Reported-by: Mike Frysinger <vapier@gentoo.org>
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
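
Illustrative note (not part of the commit): a minimal sketch of why the driver pointer has to be valid before an optional callback is checked. The struct and function names are hypothetical; only the drv and path_event member names are taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

struct ccw_driver_sketch {
    void (*path_event)(void);   /* optional callback, may be NULL */
};

struct ccw_device_sketch {
    struct ccw_driver_sketch *drv;
};

/* A console driver without a path_event callback. */
static struct ccw_driver_sketch console_driver_sketch = { .path_event = NULL };

/* The fix in spirit: give the console device a valid driver pointer early. */
static void console_init_sketch(struct ccw_device_sketch *cdev)
{
    cdev->drv = &console_driver_sketch;
}

/* Reading cdev->drv->path_event is only defined once cdev->drv is valid. */
static int notify_path_event_sketch(struct ccw_device_sketch *cdev)
{
    if (cdev->drv->path_event == NULL)
        return 0;               /* optional callback not provided */
    cdev->drv->path_event();
    return 1;
}
```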

Linus Torvalds [Wed, 19 Feb 2014 00:36:07 +0000 (16:36 -0800)]

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"Lots of little small things, nothing too major: nouveau regression
  fixes, vmware fixes for the new hw support, memory leaks in error path
  fixes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (31 commits)
  drm/radeon/ni: fix typo in dpm sq ramping setup
  drm/radeon/si: fix typo in dpm sq ramping setup
  drm/radeon: fix CP semaphores on CIK
  drm/radeon: delete a stray tab
  drm/radeon: fix display tiling setup on SI
  drm/radeon/dpm: reduce r7xx vblank mclk threshold to 200
  drm/radeon: fill in DRM_CAPs for cursor size
  drm: add DRM_CAPs for cursor size
  drm/radeon: unify bpc handling
  drm/ttm: Fix memory leak in ttm_agp_backend.c
  drm/ttm: declare 'struct device' in ttm_page_alloc.h
  drm/nouveau: fix TTM_PL_TT memtype on pre-nv50
  drm/nv50/disp: use correct register to determine DP display bpp
  drm/nouveau/fb: use correct ram oclass for nv1a hardware
  drm/nv50/gr: add missing nv_error parameter priv
  drm/nouveau: fix ENG_RUNLIST register address
  drm/nv4c/bios: disallow retrieving from prom on nv4x igp's
  drm/nv4c/vga: decode register is in a different place on nv4x igp's
  drm/nv4c/mc: nv4x igp's have a different msi rearm register
  drm/nouveau: set irq_enabled manually
  ...

Linus Torvalds [Wed, 19 Feb 2014 00:29:46 +0000 (16:29 -0800)]

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid

Pull HID update from Jiri Kosina:

- fixes for several bugs in incorrect allocations of buffers by David
   Herrmann and Benjamin Tissoires.

- support for a few new device IDs by Archana Patni, Benjamin
   Tissoires, Huei-Horng Yo, Reyad Attiyat and Yufeng Shen

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: hyperv: make sure input buffer is big enough
  HID: Bluetooth: hidp: make sure input buffers are big enough
  HID: hid-sensor-hub: quirk for STM Sensor hub
  HID: apple: add Apple wireless keyboard 2011 JIS model support
  HID: fix buffer allocations
  HID: multitouch: add FocalTech FTxxxx support
  HID: microsoft: Add ID's for Surface Type/Touch Cover 2
  HID: usbhid: quirk for CY-TM75 75 inch Touch Overlay

Linus Torvalds [Tue, 18 Feb 2014 23:52:43 +0000 (15:52 -0800)]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) kvaser CAN driver has fixed limits for some of its tables, validate
    that we won't exceed those limits at probe time.  Fix from Olivier
    Sobrie.

2) Fix rtl8192ce disabling interrupts for too long, from Olivier
    Langlois.

3) Fix botched shift in ath5k driver, from Dan Carpenter.

4) Fix corruption of deferred packets in TIPC, from Erik Hugne.

5) Fix newlink error path in macvlan driver, from Cong Wang.

6) Fix netpoll deadlock in bonding, from Ding Tianhong.

7) Handle GSO packets properly in forwarding path when fragmentation is
    necessary on egress, from Florian Westphal.

8) Fix axienet build errors, from Michal Simek.

9) Fix refcounting of ubufs on tx in vhost net driver, from Michael S
    Tsirkin.

10) Carrier status isn't set properly in hyperv driver, from Haiyang
    Zhang.

11) Missing pci_disable_device() in tulip_remove_one(), from Ingo Molnar.

12) AF_PACKET qdisc bypass mode doesn't adhere to driver provided TX
    queue selection method.  Add a fallback method mechanism to fix this
    bug, from Daniel Borkmann.

13) Fix regression in link local route handling on GRE tunnels, from
    Nicolas Dichtel.

14) Bonding can assign dup aggregator IDs in some sequences of
    configuration, fix by making the allocation counter per-bond instead
    of global.  From Jiri Bohac.

15) sctp_connectx() needs compat translations, from Daniel Borkmann.

16) Fix of_mdio PHY interrupt parsing, from Ben Dooks

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits)
  MAINTAINERS: add entry for the PHY library
  of_mdio: fix phy interrupt passing
  net: ethernet: update dependency and help text of mvneta
  NET: fec: only enable napi if we are successful
  af_packet: remove a stray tab in packet_set_ring()
  net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode
  ipv4: fix counter in_slow_tot
  irtty-sir.c: Do not set_termios() on irtty_close()
  bonding: 802.3ad: make aggregator_identifier bond-private
  usbnet: remove generic hard_header_len check
  gre: add link local route when local addr is any
  batman-adv: fix potential kernel paging error for unicast transmissions
  batman-adv: avoid double free when orig_node initialization fails
  batman-adv: free skb on TVLV parsing success
  batman-adv: fix TT CRC computation by ensuring byte order
  batman-adv: fix potential orig_node reference leak
  batman-adv: avoid potential race condition when adding a new neighbour
  batman-adv: properly check pskb_may_pull return value
  batman-adv: release vlan object after checking the CRC
  batman-adv: fix TT-TVLV parsing on OGM reception
  ...

Linus Torvalds [Tue, 18 Feb 2014 23:49:58 +0000 (15:49 -0800)]

Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:
"A range of ARM fixes.  Biggest change is the stage-2 attributes used
  for hyp mode which were wrong.  I've killed some bits in a couple
  of DT files which turned out not to be required, and a few other
  fixes.

  One fix touches code outside of arch/arm, which is related to sorting
  out the DMA masks correctly.  There is a long standing issue with the
  conversion from PFNs to addresses where people assume that shifting an
  unsigned long left by PAGE_SHIFT results in a correct address.  This
  is not the case with C: the integer promotion happens at assignment
  after evaluation.  This fixes the recently introduced dma_max_pfn()
  function, but there's a number of other places where we try this
  directly on an unsigned long in the mm code"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 7957/1: add DSB after icache flush in __flush_icache_all()
  Fix uses of dma_max_pfn() when converting to a limiting address
  ARM: 7955/1: spinlock: ensure we have a compiler barrier before sev
  ARM: 7953/1: mm: ensure TLB invalidation is complete before enabling MMU
  ARM: 7952/1: mm: Fix the memblock allocation for LPAE machines
  ARM: 7950/1: mm: Fix stage-2 device memory attributes
  ARM: dts: fix spdif pinmux configuration
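
Illustrative note (not part of the pull message): the PFN-to-address pitfall described above can be reproduced in a few lines. In this sketch uint32_t stands in for a 32-bit unsigned long so that the effect shows on any host where int is 32 bits wide; the function names and the page shift of 12 are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12

/* The shift is evaluated in 32 bits; widening happens only on return. */
static uint64_t pfn_to_addr_truncating(uint32_t pfn)
{
    return pfn << PAGE_SHIFT_SKETCH;
}

/* Widen first, then shift, so the high bits survive. */
static uint64_t pfn_to_addr_widened(uint32_t pfn)
{
    return (uint64_t)pfn << PAGE_SHIFT_SKETCH;
}
```

For a PFN of 0x100000 the first variant yields 0, because bit 32 is lost before the conversion, while the second yields 0x100000000.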

Linus Torvalds [Tue, 18 Feb 2014 23:49:40 +0000 (15:49 -0800)]

Merge tag 'jfs-3.14-rc4' of git://github.com/kleikamp/linux-shaggy

Pull jfs fix from David Kleikamp:
"Another ACL regression. This one more subtle"

* tag 'jfs-3.14-rc4' of git://github.com/kleikamp/linux-shaggy:
  jfs: set i_ctime when setting ACL

Florian Fainelli [Tue, 18 Feb 2014 17:47:49 +0000 (09:47 -0800)]

MAINTAINERS: add entry for the PHY library

The PHY library has been subject to some changes, new drivers and DT
interactions over the past few months. Add myself as a maintainer for
the core PHY library parts and drivers. Make sure the PHY library entry
also covers the Device Tree files which have a close interaction with
the MDIO bus, PHY connection and Ethernet PHY mode parsing.

CC: Grant Likely <grant.likely@linaro.org>
CC: Shaohui Xie <shaohui.xie@freescale.com>
CC: Andy Fleming <afleming@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Ben Dooks [Tue, 18 Feb 2014 12:16:58 +0000 (12:16 +0000)]

of_mdio: fix phy interrupt passing

The of_mdiobus_register_phy() is not setting phy->irq thus causing
some drivers to incorrectly assume that the PHY does not have an
IRQ associated with it. Not only do some drivers report no IRQ
they do not install an interrupt handler for the PHY.

Simplify the code that sets the irq and set phy->irq at the same
time. The following cases should cover everything the code will
encounter:

- Set phy->irq if node has irq property and mdio->irq is NULL
- Set phy->irq if node has no irq and mdio->irq is not NULL
- Leave phy->irq as PHY_POLL default if none of the above

This fixes the issue:
net eth0: attached PHY 1 (IRQ -1) to driver Micrel KSZ8041RNLI

to the correct:
net eth0: attached PHY 1 (IRQ 416) to driver Micrel KSZ8041RNLI

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
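
Illustrative note (not part of the commit): one simplified reading of the three cases listed above, written as a standalone function. The function name, its parameters and the value -1 for the polling default (chosen to match the "IRQ -1" in the log line above) are assumptions; the real code works on the phy and mdio structures directly.

```c
#include <assert.h>
#include <stddef.h>

#define PHY_POLL_SKETCH (-1)   /* assumed polling default */

/*
 * node_irq: IRQ parsed from the device tree node, or <= 0 if the node has none.
 * bus_irqs: per-address IRQ table of the MDIO bus, or NULL if the bus has none.
 */
static int choose_phy_irq_sketch(int node_irq, const int *bus_irqs, int addr)
{
    if (node_irq > 0)
        return node_irq;        /* the node describes an interrupt */
    if (bus_irqs != NULL)
        return bus_irqs[addr];  /* fall back to the bus table */
    return PHY_POLL_SKETCH;     /* neither source exists: keep polling */
}
```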

Thomas Petazzoni [Tue, 18 Feb 2014 13:18:11 +0000 (14:18 +0100)]

net: ethernet: update dependency and help text of mvneta

With the introduction of the support for Armada 375 and Armada 38x,
the hidden Kconfig option MACH_ARMADA_370_XP is being renamed to
MACH_MVEBU_V7. Therefore, the dependency that was used for the mvneta
driver can no longer work. This commit replaces this dependency by a
dependency on PLAT_ORION, which is used similarly for the mv643xx_eth
driver.

In addition to this, it takes this opportunity to adjust the
description and help text to indicate that the driver is also used
for Armada 38x. Note that Armada 375 cannot use this driver as it has
a completely different networking unit, which will require a separate
driver.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

Russell King [Tue, 18 Feb 2014 12:55:42 +0000 (12:55 +0000)]

NET: fec: only enable napi if we are successful

If napi is left enabled after a failed attempt to bring the interface
up, we BUG:

fec 2188000.ethernet eth0: no PHY, assuming direct connection to switch
libphy: PHY fixed-0:00 not found
fec 2188000.ethernet eth0: could not attach to PHY
------------[ cut here ]------------
kernel BUG at include/linux/netdevice.h:502!
Internal error: Oops - BUG: 0 [#1] SMP ARM
...
PC is at fec_enet_open+0x4d0/0x500
LR is at __dev_open+0xa4/0xfc

Only enable napi after we are past all the failure paths.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
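
Illustrative note (not part of the commit): the ordering rule "enable napi only after all failure paths" can be sketched as follows. The struct and function are hypothetical, and a boolean flag stands in for the napi state that the kernel tracks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the device state. */
struct netdev_sketch {
    bool phy_present;
    bool napi_enabled;
};

static int open_sketch(struct netdev_sketch *dev)
{
    if (!dev->phy_present)
        return -1;             /* failure path: napi was never enabled */

    dev->napi_enabled = true;  /* last step, nothing after this can fail */
    return 0;
}
```

Because the enable is the final step, a failed open leaves nothing to undo, so a later attempt starts from a clean state.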

Dan Carpenter [Tue, 18 Feb 2014 12:20:51 +0000 (15:20 +0300)]

af_packet: remove a stray tab in packet_set_ring()

At first glance it looks like there is a missing curly brace but
actually the code works the same either way. I have adjusted the
indenting but left the code the same.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Dave Airlie [Tue, 18 Feb 2014 22:21:26 +0000 (08:21 +1000)]

Merge tag 'ttm-fixes-3.14-2014-02-18' of git://people.freedesktop.org/~thomash/linux into drm-fixes

Pull request of 2014-02-18

One compile fix and one memory leak.

* tag 'ttm-fixes-3.14-2014-02-18' of git://people.freedesktop.org/~thomash/linux:
  drm/ttm: Fix memory leak in ttm_agp_backend.c
  drm/ttm: declare 'struct device' in ttm_page_alloc.h

Dave Airlie [Tue, 18 Feb 2014 22:21:02 +0000 (08:21 +1000)]

Merge tag 'vmwgfx-fixes-3.14-2014-02-18' of git://people.freedesktop.org/~thomash/linux into drm-fixes

Pull request of 2014-02-18.

Nothing special. The biggest change is adding a couple of command defines and
packing the command data correctly.

* tag 'vmwgfx-fixes-3.14-2014-02-18' of git://people.freedesktop.org/~thomash/linux:
  drm/vmwgfx: Fix command defines and checks
  drm/vmwgfx: Fix possible integer overflow
  drm/vmwgfx: Remove stray const
  drm/vmwgfx: unlock on error path in vmw_execbuf_process()
  drm/vmwgfx: Get maximum mob size from register SVGA_REG_MOB_MAX_SIZE
  drm/vmwgfx: Fix a couple of sparse warnings and errors

Dave Airlie [Tue, 18 Feb 2014 22:20:14 +0000 (08:20 +1000)]

Merge branch 'drm-fixes-3.14' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

Fix for 128x128 cursors, along with some misc fixes.

* 'drm-fixes-3.14' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon/ni: fix typo in dpm sq ramping setup
  drm/radeon/si: fix typo in dpm sq ramping setup
  drm/radeon: fix CP semaphores on CIK
  drm/radeon: delete a stray tab
  drm/radeon: fix display tiling setup on SI
  drm/radeon/dpm: reduce r7xx vblank mclk threshold to 200
  drm/radeon: fill in DRM_CAPs for cursor size
  drm: add DRM_CAPs for cursor size
  drm/radeon: unify bpc handling

David S. Miller [Tue, 18 Feb 2014 21:57:42 +0000 (16:57 -0500)]

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless

John W. Linville says:

====================
Please pull this batch of fixes intended for the 3.14 stream...

For the iwlwifi one, Emmanuel says:

"As explicitly written in the commit message, we prefer to disable Tx
AMPDU on NICs supported by iwldvm. This feature gives a big boost in
Tx performance, but the firmware is buggy and we can't rely on it.
Our hope is that most of the users out there want wifi to surf on
the web which means that they care more for Rx traffic than for Tx.
People who want to enable it can do so with the help of a module
parameter."

On top of that...

Dan Carpenter fixes a typo/thinko in ath5k.

Olivier Langlois fixes a couple of rtlwifi issues, one which leaves
IRQs disabled too long (causing a variety of problems elsewhere),
and one which fixes an incorrect return code when failing to enable
the NIC.

Russell King fixes a NULL pointer dereference in hostap.

Stanislaw Gruszka fixes a DMA coherence issue in the rtl8187 driver.

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Daniel Borkmann [Mon, 17 Feb 2014 11:11:11 +0000 (12:11 +0100)]

net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode

SCTP's sctp_connectx() abi breaks for 64bit kernels compiled with 32bit
emulation (e.g. ia32 emulation or x86_x32). Due to internal usage of
'struct sctp_getaddrs_old' which includes a struct sockaddr pointer,
sizeof(param) check will always fail in kernel as the structure in
64bit kernel space is 4bytes larger than for user binaries compiled
in 32bit mode. Thus, applications making use of sctp_connectx() won't
be able to run under such circumstances.

Introduce a compat interface in the kernel to deal with such
situations by using a 'struct compat_sctp_getaddrs_old' structure
where user data is copied into it, and then successively transformed
into a 'struct sctp_getaddrs_old' structure with the help of
compat_ptr(). That fixes sctp_connectx() abi without any changes
needed in user space, and lets the SCTP test suite pass when compiled
in 32bit and run on 64bit kernels.

Fixes: f9c67811ebc0 ("sctp: Fix regression introduced by new sctp_connectx api")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
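
Illustrative note (not part of the commit): the size mismatch and the conversion step can be sketched in userspace as below. The struct layouts are modeled on the description above (two 32-bit integers followed by a pointer), all names carry a _sketch suffix to mark them as hypothetical, and compat_ptr_sketch() only imitates what the kernel's compat_ptr() does.

```c
#include <assert.h>
#include <stdint.h>

/* Layout a 32-bit binary passes in: the pointer is a 32-bit value (12 bytes). */
struct compat_getaddrs_sketch {
    int32_t  assoc_id;
    int32_t  addr_num;
    uint32_t addrs;        /* user pointer as seen by 32-bit code */
};

/* Native layout: with 8-byte pointers this is 16 bytes, i.e. 4 bytes larger. */
struct native_getaddrs_sketch {
    int32_t assoc_id;
    int32_t addr_num;
    void   *addrs;
};

/* Userspace stand-in for compat_ptr(): widen a 32-bit user address. */
static void *compat_ptr_sketch(uint32_t uptr)
{
    return (void *)(uintptr_t)uptr;
}

/* Copy field by field instead of comparing or copying raw struct sizes. */
static struct native_getaddrs_sketch
from_compat_sketch(const struct compat_getaddrs_sketch *c)
{
    struct native_getaddrs_sketch n = {
        .assoc_id = c->assoc_id,
        .addr_num = c->addr_num,
        .addrs    = compat_ptr_sketch(c->addrs),
    };
    return n;
}
```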

David S. Miller [Tue, 18 Feb 2014 20:40:50 +0000 (15:40 -0500)]

Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:
- fix soft-interface MTU computation
- fix bogus pointer mangling when parsing the TT-TVLV
  container. This bug led to a wrong memory access.
- fix memory leak by properly releasing the VLAN object
  after CRC check
- properly check pskb_may_pull() return value
- avoid potential race condition while adding new neighbour
- fix potential memory leak by removing all the references
  to the orig_node object in case of initialization failure
- fix the TT CRC computation by ensuring that every node uses
  the same byte order when hosts with different endianness are
  part of the same network
- fix severe memory leak by freeing skb after a successful
  TVLV parsing
- avoid potential double free when orig_node initialization
  fails
- fix potential kernel paging error caused by the usage of
  the old value of skb->data after skb reallocation

Signed-off-by: David S. Miller <davem@davemloft.net>
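
Illustrative note (not part of the merge message): the byte-order item in the list above comes down to serializing multi-byte fields in one fixed order before they are checksummed. In the sketch below the 16-bit field named vid, the helper names and the toy checksum are assumptions; the toy checksum is not the CRC that batman-adv uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Serialize in big-endian order so every host feeds identical bytes to the checksum. */
static void put_be16_sketch(uint8_t out[2], uint16_t v)
{
    out[0] = (uint8_t)(v >> 8);
    out[1] = (uint8_t)(v & 0xffu);
}

/* Toy checksum standing in for the real CRC; only the byte order matters here. */
static uint32_t toy_sum_sketch(const uint8_t *p, size_t len)
{
    uint32_t s = 0;
    for (size_t i = 0; i < len; i++)
        s = s * 31u + p[i];
    return s;
}

static uint32_t checksum_vid_sketch(uint16_t vid)
{
    uint8_t buf[2];
    put_be16_sketch(buf, vid);
    return toy_sum_sketch(buf, sizeof(buf));
}
```

Checksumming the in-memory representation directly would give different results on little-endian and big-endian hosts, which is the mismatch the fix avoids.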

Alex Deucher [Tue, 18 Feb 2014 15:16:28 +0000 (10:16 -0500)]

drm/radeon/ni: fix typo in dpm sq ramping setup

inverted logic.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

Alex Deucher [Tue, 18 Feb 2014 15:14:46 +0000 (10:14 -0500)]

drm/radeon/si: fix typo in dpm sq ramping setup

inverted logic.

Noticed-by: Sylvain BERTRAND <sylware@legeek.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

Christian König [Tue, 18 Feb 2014 10:37:20 +0000 (11:37 +0100)]

drm/radeon: fix CP semaphores on CIK

The CP semaphore queue on CIK has a bug that triggers if uncompleted
waits use the same address while a signal is still pending. Work around
this by using different addresses for each sync.

Signed-off-by: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Dan Carpenter [Mon, 17 Feb 2014 20:01:30 +0000 (23:01 +0300)]

drm/radeon: delete a stray tab

Static checkers complain that probably curly braces were intended here,
but actually it makes more sense to remove the extra tab.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Alex Deucher [Mon, 17 Feb 2014 19:16:31 +0000 (14:16 -0500)]

drm/radeon: fix display tiling setup on SI

Apply the same logic as CI to SI for setting up the
display tiling parameters. The num banks may vary
per tiling index just like CI.

Bugs:
https://bugs.freedesktop.org/show_bug.cgi?id=71488
https://bugs.freedesktop.org/show_bug.cgi?id=73946
https://bugs.freedesktop.org/show_bug.cgi?id=74927

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

Alex Deucher [Mon, 17 Feb 2014 17:50:13 +0000 (12:50 -0500)]

drm/radeon/dpm: reduce r7xx vblank mclk threshold to 200

Most laptops seem to have a vblank period of less than
300 and mclk switching works fine. Drop the quirk and
set the default threshold to 200.

bug:
https://bugzilla.kernel.org/show_bug.cgi?id=70701

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Alex Deucher [Wed, 12 Feb 2014 17:56:53 +0000 (12:56 -0500)]

drm/radeon: fill in DRM_CAPs for cursor size

CIK parts are 128x128, older parts are 64x64.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Alex Deucher [Wed, 12 Feb 2014 17:48:23 +0000 (12:48 -0500)]

drm: add DRM_CAPs for cursor size

Some hardware may not support standard 64x64 cursors. Add
a drm cap to query the cursor size from the kernel. Some examples
include radeon CIK parts (128x128 cursors) and armada (32x64 or 64x32).
This allows things like device specific ddxes to remove asics specific
logic and also allows xf86-video-modesetting to work properly with hw
cursors on this hardware. Default to 64 if the driver doesn't specify
a size.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
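
Illustrative note (not part of the commit): the "default to 64 if the driver doesn't specify a size" rule can be sketched as a simple fallback. The struct, its field names and the convention that 0 means "not set" are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define DEFAULT_CURSOR_SIZE_SKETCH 64u

/* Hypothetical per-device configuration; 0 means the driver did not set a size. */
struct mode_config_sketch {
    uint32_t cursor_width;
    uint32_t cursor_height;
};

static uint32_t cap_cursor_width_sketch(const struct mode_config_sketch *cfg)
{
    return cfg->cursor_width ? cfg->cursor_width : DEFAULT_CURSOR_SIZE_SKETCH;
}

static uint32_t cap_cursor_height_sketch(const struct mode_config_sketch *cfg)
{
    return cfg->cursor_height ? cfg->cursor_height : DEFAULT_CURSOR_SIZE_SKETCH;
}
```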

Alex Deucher [Mon, 3 Feb 2014 20:53:25 +0000 (15:53 -0500)]

drm/radeon: unify bpc handling

We were already storing the bpc (bits per color) information
in radeon_crtc, so just use that everywhere rather than
calculating it everywhere we use it. This also allows us
to change it in one place if we ever want to override it.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Linus Torvalds [Tue, 18 Feb 2014 18:04:09 +0000 (10:04 -0800)]

Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
"Miscellaneous ext4 bug fixes for v3.14"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  jbd2: fix use after free in jbd2_journal_start_reserved()
  ext4: don't leave i_crtime.tv_sec uninitialized
  ext4: fix online resize with a non-standard blocks per group setting
  ext4: fix online resize with very large inode tables
  ext4: don't try to modify s_flags if the the file system is read-only
  ext4: fix error paths in swap_inode_boot_loader()
  ext4: fix xfstest generic/299 block validity failures

Masanari Iida [Wed, 12 Feb 2014 13:46:25 +0000 (22:46 +0900)]

drm/ttm: Fix memory leak in ttm_agp_backend.c

This patch fixes a memory leak found by cppcheck.
[drivers/gpu/drm/ttm/ttm_agp_backend.c:129]:
(error) Memory leak: agp_be

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>

Alexandre Courbot [Sun, 9 Feb 2014 09:43:18 +0000 (18:43 +0900)]

drm/ttm: declare 'struct device' in ttm_page_alloc.h

Declare 'struct device' explicitly in ttm_page_alloc.h as this file
does not include any file declaring it. This removes the following
warning:

warning: 'struct device' declared inside parameter list

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
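
Illustrative note (not part of the commit): a forward declaration is enough whenever a header only passes pointers to a struct around. The function below is a hypothetical example, not a declaration from ttm_page_alloc.h.

```c
#include <assert.h>
#include <stddef.h>

struct device;   /* forward declaration: no definition needed to use a pointer */

/*
 * Without the declaration above, 'struct device' would be declared inside this
 * parameter list only, which is what the compiler warning complains about.
 */
static int pool_init_sketch(struct device *dev)
{
    return dev == NULL ? -1 : 0;
}
```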

Dave Airlie [Tue, 18 Feb 2014 06:22:40 +0000 (16:22 +1000)]

Merge branch 'drm-nouveau-next' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes

Nothing too exciting, mostly fixes for ancient boards, but a pretty important fix for DP on some systems.

Thanks,
* 'drm-nouveau-next' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
  drm/nouveau: fix TTM_PL_TT memtype on pre-nv50
  drm/nv50/disp: use correct register to determine DP display bpp
  drm/nouveau/fb: use correct ram oclass for nv1a hardware
  drm/nv50/gr: add missing nv_error parameter priv
  drm/nouveau: fix ENG_RUNLIST register address
  drm/nv4c/bios: disallow retrieving from prom on nv4x igp's
  drm/nv4c/vga: decode register is in a different place on nv4x igp's
  drm/nv4c/mc: nv4x igp's have a different msi rearm register
  drm/nouveau: set irq_enabled manually

Dave Airlie [Tue, 18 Feb 2014 06:21:49 +0000 (16:21 +1000)]

Merge tag 'drm-intel-fixes-2014-02-14' of ssh://git.freedesktop.org/git/drm-intel into drm-fixes

3 fixes plus 1 prep patch, all four cc: stable. Jani will take over from
here and the plan is that he'll do 3.14-fixes for the entire release just
to work things out a bit.

* tag 'drm-intel-fixes-2014-02-14' of ssh://git.freedesktop.org/git/drm-intel:
  drm/i915/dp: add native aux defer retry limit
  drm/i915/dp: increase native aux defer retry timeout
  drm/i915: Prevent MI_DISPLAY_FLIP straddling two cachelines on IVB
  drm/i915: Add intel_ring_cachline_align()

Dave Airlie [Tue, 18 Feb 2014 06:20:17 +0000 (16:20 +1000)]

Merge branch 'tda998x-fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-cubox into drm-fixes

fix for leak in tda998x

* 'tda998x-fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-cubox:
  drm/i2c: tda998x: Fix memory leak in tda998x_encoder_init error path.

Dan Carpenter [Tue, 18 Feb 2014 01:33:01 +0000 (20:33 -0500)]

jbd2: fix use after free in jbd2_journal_start_reserved()

If start_this_handle() fails then it leads to a use after free of
"handle".

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@vger.kernel.org
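
Illustrative note (not part of the commit): the general shape of this kind of bug and its fix, in a userspace sketch. All names are hypothetical and malloc()/free() stand in for the journal's own allocation. The point is only that the function returns right after releasing the handle on the failure path.

```c
#include <assert.h>
#include <stdlib.h>

struct handle_sketch {
    int type;
};

/* Stand-in for the step that may fail. */
static int start_sketch(struct handle_sketch *h, int should_fail)
{
    (void)h;
    return should_fail ? -1 : 0;
}

/* Takes ownership of h; on failure h is freed and must not be touched again. */
static int start_reserved_sketch(struct handle_sketch *h, int type, int should_fail)
{
    int ret = start_sketch(h, should_fail);

    if (ret < 0) {
        free(h);
        return ret;    /* return immediately, before any further use of h */
    }
    h->type = type;    /* only reached while h is still valid */
    return 0;
}
```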

Ilia Mirkin [Sun, 16 Feb 2014 04:27:01 +0000 (23:27 -0500)]

drm/nouveau: fix TTM_PL_TT memtype on pre-nv50

Commit a55409066 ("drm/nv50-: map TTM_PL_SYSTEM through a BAR for CPU
access") made it possible to work with tiled memory. However
mem->mm_node is not a nouveau_mem for AGP-using pre-NV50 cards, but a
drm_mm_node, as created by the ttm_bo_manager_func. As such, extend the
untiled check to explicitly include all pre-nv50 cards.

Reported-by: Ronald <ronald645@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74613
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Ronald Uitermark <ronald645@gmail.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

commit | commitdiff | tree

Ilia Mirkin [Fri, 14 Feb 2014 02:57:15 +0000 (21:57 -0500)]

drm/nv50/disp: use correct register to determine DP display bpp

Commit 0a0afd282f ("drm/nv50-/disp: move DP link training to core and
train from supervisor") added code that uses the wrong register for
computing the display bpp, used for bandwidth calculation. Adjust to use
the same register as used by exec_clkcmp and nv50_disp_intr_unk20_2_dp.

Reported-by: Torsten Wagner <torsten.wagner@gmail.com>
Reported-by: Michael Gulick <mgulick@mathworks.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67628
Cc: stable@vger.kernel.org # 3.9+
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

commit | commitdiff | tree

Emil Velikov [Wed, 12 Feb 2014 01:41:42 +0000 (01:41 +0000)]

drm/nouveau/fb: use correct ram oclass for nv1a hardware

commit 8613e7314ac254fdd67ed46192f021d76141e4c9
Author: Ben Skeggs <bskeggs@redhat.com>
Date: Mon Oct 21 08:50:25 2013 +1000

drm/nouveau/fb: remove ram oclass argument from base fb constructor

Introduced an unfortunate regression by using the nv10 ram oclass for nv1a
hardware, causing corruption and eventually a system lockup.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74866
Reported-by: John F. Godfrey <jfgodfrey@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: stable@vger.kernel.org # 3.13+
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

commit | commitdiff | tree

Ilia Mirkin [Sun, 9 Feb 2014 03:35:13 +0000 (22:35 -0500)]

drm/nv50/gr: add missing nv_error parameter priv

Commit ea7dce901 ("drm/nv50/gr: print mpc trap name when it's not an mp
trap") added an nv_error call that was missing the priv parameter. This
causes GPFs if the error is ever hit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

commit | commitdiff | tree

Alexandre Courbot [Fri, 7 Feb 2014 13:22:57 +0000 (22:22 +0900)]

drm/nouveau: fix ENG_RUNLIST register address

Address of the ENG_RUNLIST register should be 0x002284 + (engine * 8),
not 0x002284 + (engine * 4).

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
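
The corrected address arithmetic can be sketched as follows. This is a minimal illustration only: the macro and function names are hypothetical and are not taken from the nouveau driver; only the base offset and the stride of 8 come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Base offset quoted in the commit message. */
#define ENG_RUNLIST_BASE   0x002284u
/* Corrected per-engine stride in bytes (the buggy code used 4). */
#define ENG_RUNLIST_STRIDE 8u

/* Hypothetical helper: register offset of ENG_RUNLIST for one engine. */
static uint32_t eng_runlist_addr(uint32_t engine)
{
	return ENG_RUNLIST_BASE + engine * ENG_RUNLIST_STRIDE;
}

/* Usage: with a stride of 4, engine 1 would wrongly map to 0x002288. */
static void eng_runlist_example(void)
{
	assert(eng_runlist_addr(0) == 0x002284u);
	assert(eng_runlist_addr(1) == 0x00228cu);
}
```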

commit | commitdiff | tree

Ilia Mirkin [Wed, 5 Feb 2014 19:33:04 +0000 (14:33 -0500)]

drm/nv4c/bios: disallow retrieving from prom on nv4x igp's

Suggested-by: Marcin Kościelnicki <koriakin@0x04.net>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

commit | commitdiff | tree

Ilia Mirkin [Wed, 5 Feb 2014 19:33:03 +0000 (14:33 -0500)]

drm/nv4c/vga: decode register is in a different place on nv4x igp's

Suggested-by: Marcin Kościelnicki <koriakin@0x04.net>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

commit | commitdiff | tree

Ilia Mirkin [Wed, 5 Feb 2014 19:33:02 +0000 (14:33 -0500)]

drm/nv4c/mc: nv4x igp's have a different msi rearm register

See https://bugs.freedesktop.org/show_bug.cgi?id=74492

Reported-by: Ronald <ronald645@gmail.com>
Suggested-by: Marcin Kościelnicki <koriakin@0x04.net>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

commit | commitdiff | tree

Ilia Mirkin [Thu, 30 Jan 2014 00:53:00 +0000 (19:53 -0500)]

drm/nouveau: set irq_enabled manually

Since commit 0fa9061ae8c ("drm/nouveau/mc: handle irq-related setup
ourselves"), drm_device->irq_enabled remained unset. This is needed in
order to properly wait for a vblank event in the generic drm code.

See https://bugs.freedesktop.org/show_bug.cgi?id=74195

Reported-by: Jan Janecek <janjanjanx@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

commit | commitdiff | tree

Vinayak Kale [Wed, 12 Feb 2014 06:30:01 +0000 (07:30 +0100)]

ARM: 7957/1: add DSB after icache flush in __flush_icache_all()

Add DSB after icache flush to complete the cache maintenance operation.

Signed-off-by: Vinayak Kale <vkale@apm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

commit | commitdiff | tree

Russell King [Tue, 11 Feb 2014 17:11:04 +0000 (17:11 +0000)]

Fix uses of dma_max_pfn() when converting to a limiting address

We must use a 64-bit type for this, otherwise the overflowed bits get lost, and
that can result in a lower than intended value being set.

Fixes: 8e0cb8a1f6ac ("ARM: 7797/1: mmc: Use dma_max_pfn(dev) helper for bounce_limit calculations")
Fixes: 7d35496dd982 ("ARM: 7796/1: scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations")
Tested-Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
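
The overflow described above can be sketched in plain C. This is an illustration under stated assumptions, not the patched kernel code: PAGE_SHIFT is assumed to be 12, the page frame number is modelled as a 32-bit value, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumed 4 KiB pages, for illustration only */

/* Buggy pattern: the shift happens in 32 bits, so the high bits are lost. */
static uint64_t limit_truncated(uint32_t max_pfn)
{
	return (uint32_t)(max_pfn << PAGE_SHIFT);
}

/* Fixed pattern: widen to 64 bits before shifting. */
static uint64_t limit_widened(uint32_t max_pfn)
{
	return (uint64_t)max_pfn << PAGE_SHIFT;
}

/* Usage: 0x100000 pages of 4 KiB is exactly 4 GiB, which needs 33 bits. */
static void dma_limit_example(void)
{
	assert(limit_truncated(0x100000u) == 0);
	assert(limit_widened(0x100000u) == 0x100000000ull);
}
```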

commit | commitdiff | tree

Duan Jiong [Mon, 17 Feb 2014 07:23:43 +0000 (15:23 +0800)]

ipv4: fix counter in_slow_tot

Since commit 89aef8921bf ("ipv4: Delete routing cache."), the counter
in_slow_tot has not worked correctly.

The counter in_slow_tot is increased by one when fib_lookup() returns successfully
in ip_route_input_slow(), but the dst struct may not actually be created and
cached at that point, so increase in_slow_tot only after the dst struct is created.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Linus Torvalds [Mon, 17 Feb 2014 21:51:00 +0000 (13:51 -0800)]

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

Pull Ceph fixes from Sage Weil:
"We have some patches fixing up ACL support issues from Zheng and
  Guangliang and a mount option to enable/disable this support.  (These
  fixes were somewhat delayed by the Chinese holiday.)

  There is also a small fix for cached readdir handling when directories
  are fragmented"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: fix __dcache_readdir()
  ceph: add acl, noacl options for cephfs mount
  ceph: make ceph_forget_all_cached_acls() static inline
  ceph: add missing init_acl() for mkdir() and atomic_open()
  ceph: fix ceph_set_acl()
  ceph: fix ceph_removexattr()
  ceph: remove xattr when null value is given to setxattr()
  ceph: properly handle XATTR_CREATE and XATTR_REPLACE

commit | commitdiff | tree

Linus Torvalds [Mon, 17 Feb 2014 21:50:11 +0000 (13:50 -0800)]

Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6

Pull CIFS fixes from Steve French:
"Three cifs fixes, the most important fixing the problem with passing
  bogus pointers with writev (CVE-2014-0069).

  Two additional cifs fixes are still in review (including the fix for
  an append problem which Al also discovered)"

* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
  CIFS: Fix too big maxBuf size for SMB3 mounts
  cifs: ensure that uncached writes handle unmapped areas correctly
  [CIFS] Fix cifsacl mounts over smb2 to not call cifs

commit | commitdiff | tree

David Howells [Mon, 17 Feb 2014 15:01:47 +0000 (15:01 +0000)]

FS-Cache: Handle removal of unadded object to the fscache_object_list rb tree

When FS-Cache allocates an object, the following sequence of events can
occur:

-->fscache_alloc_object()
    -->cachefiles_alloc_object() [via cache->ops->alloc_object]
    <--[returns new object]
    -->fscache_attach_object()
    <--[failed]
    -->cachefiles_put_object() [via cache->ops->put_object]
       -->fscache_object_destroy()
          -->fscache_objlist_remove()
             -->rb_erase() to remove the object from fscache_object_list.

resulting in a crash in the rbtree code.

The problem is that the object is only added to fscache_object_list on
the success path of fscache_attach_object() where it calls
fscache_objlist_add().

So if fscache_attach_object() fails, the object won't have been added to
the objlist rbtree.  We do, however, unconditionally try to remove the
object from the tree.

Thanks to NeilBrown for finding this and suggesting this solution.

Reported-by: NeilBrown <neilb@suse.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: (a customer of) NeilBrown <neilb@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
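
The general idea of making removal safe for an object that was never added can be sketched with a plain linked list. This is an analogy only: it uses a hypothetical doubly linked list rather than the kernel rbtree, and none of the names below come from FS-Cache.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node; NULL links mean "never added to a list". */
struct node {
	struct node *prev, *next;
};

static void node_init(struct node *n)
{
	n->prev = n->next = NULL;
}

static int node_linked(const struct node *n)
{
	return n->next != NULL;
}

/* Insert n right after the sentinel head of a circular list. */
static void list_add(struct node *head, struct node *n)
{
	assert(!node_linked(n));
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Safe on the error path: does nothing if n was never added. */
static void list_remove(struct node *n)
{
	if (!node_linked(n))
		return;
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}
```

With the guard in place, a destroy path can call list_remove() unconditionally, whether or not the attach step succeeded.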

commit | commitdiff | tree

Dave Jones [Mon, 17 Feb 2014 21:21:24 +0000 (16:21 -0500)]

reiserfs: fix utterly brain-damaged indentation.

This has been this way for years, and every time I stumble across it I
lose my lunch. After coming across it for the nth time in the Coverity
results, I had to overcome the bystander effect and do something about
it.

This ignores the 79 column limit in favor of making it look like C
instead of gibberish.

The correct thing to do here would be to lose some of the indentation by
breaking this function up into several smaller ones. I might do that at
some point if I have the stomach to look at this again.

(Also some of those overlong ternary operations would likely be more
readable as regular if's)

Signed-off-by: Dave Jones <davej@fedoraproject.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Tommie Gannert [Mon, 17 Feb 2014 20:46:04 +0000 (20:46 +0000)]

irtty-sir.c: Do not set_termios() on irtty_close()

Issuing set_termios() from irtty_close() causes kernel Oops for
unplugged usb-serial devices.

Since no other tty_ldisc calls set_termios() on close and no tty driver
seems to check if tty->device_data is NULL or not on entry to set_termios(),
the only solution I can come up with is to remove the irtty_stop_receiver()
call, which only updates termios.

Signed-off-by: Tommie Gannert <tommie@gannert.se>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

John W. Linville [Mon, 17 Feb 2014 20:54:31 +0000 (15:54 -0500)]

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem

commit | commitdiff | tree

Linus Torvalds [Mon, 17 Feb 2014 20:42:45 +0000 (12:42 -0800)]

Merge tag 'dma-buf-for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf

Pull dma-buf fix from Sumit Semwal:
"Just some debugfs output updates.

  There's another patch related to dma-buf, but it'll get upstreamed via
  Greg KH's pull request"

* tag 'dma-buf-for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf:
  dma-buf: update debugfs output

commit | commitdiff | tree

Linus Torvalds [Mon, 17 Feb 2014 20:40:36 +0000 (12:40 -0800)]

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32

Pull AVR32 fixes from Hans-Christian Egtvedt.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
  avr32: add generic vga.h to Kbuild
  avr32: add generic ioremap_wc() definition in io.h
  avr32: Makefile: add '-D__linux__' flag for gcc-4.4.7 use
  avr32: fix missing module.h causing build failure in mimc200/fram.c

commit | commitdiff | tree

Yan, Zheng [Thu, 13 Feb 2014 11:40:26 +0000 (19:40 +0800)]

ceph: fix __dcache_readdir()

If a directory is fragmented, readdir() reads its dirfrags one by one.
After reading all dirfrags, the corresponding dentries are sorted in
(frag_t, off) order in the dcache. If the dentries of a directory are all
cached, __dcache_readdir() can use the cached dentries to satisfy the
readdir syscall. But when checking whether a given dentry is after the
position of readdir, __dcache_readdir() compares the numerical value of
frag_t directly. This is wrong; it should use ceph_frag_compare().

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
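
Why a plain numerical comparison of frag_t values can give the wrong order may be easier to see in a sketch. It assumes the frag encoding keeps the split-bit count in the top 8 bits and the value in the low 24 bits, and that ordering is by value first and bit count second; frag_compare() below is a hypothetical stand-in, not the kernel's ceph_frag_compare().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding: bits in the top 8 bits, value in the low 24 bits. */
static uint32_t frag_bits(uint32_t f)  { return f >> 24; }
static uint32_t frag_value(uint32_t f) { return f & 0xffffffu; }

/* Order by value first, then by number of bits. */
static int frag_compare(uint32_t a, uint32_t b)
{
	if (frag_value(a) != frag_value(b))
		return frag_value(a) < frag_value(b) ? -1 : 1;
	if (frag_bits(a) != frag_bits(b))
		return frag_bits(a) < frag_bits(b) ? -1 : 1;
	return 0;
}

/* Usage: a is numerically larger than b, yet sorts before it. */
static void frag_compare_example(void)
{
	uint32_t a = (2u << 24) | 0x000000u; /* bits = 2, value = 0        */
	uint32_t b = (1u << 24) | 0x800000u; /* bits = 1, value = 0x800000 */

	assert(a > b);
	assert(frag_compare(a, b) < 0);
}
```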

commit | commitdiff | tree

Sage Weil [Sun, 16 Feb 2014 18:05:29 +0000 (10:05 -0800)]

ceph: add acl, noacl options for cephfs mount

Make the 'acl' option dependent on having ACL support compiled in. Make
the 'noacl' option work even without it so that one can always ask it to
be off and not error out on mount when it is not supported.

Signed-off-by: Guangliang Zhao <lucienchao@gmail.com>
Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Guangliang Zhao [Sun, 16 Feb 2014 16:35:52 +0000 (08:35 -0800)]

ceph: make ceph_forget_all_cached_acls() static inline

Signed-off-by: Guangliang Zhao <lucienchao@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Sage Weil <sage@inktank.com>

commit | commitdiff | tree

Yan, Zheng [Tue, 11 Feb 2014 04:55:05 +0000 (12:55 +0800)]

ceph: add missing init_acl() for mkdir() and atomic_open()

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

commit | commitdiff | tree

Yan, Zheng [Tue, 11 Feb 2014 05:08:51 +0000 (13:08 +0800)]

ceph: fix ceph_set_acl()

If the acl is equivalent to the file mode permission bits, ceph_set_acl()
needs to remove any existing acl xattr. Use __ceph_setxattr() to
handle both the setting and the removing cases; it doesn't return
-ENODATA when there is no acl xattr.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

commit | commitdiff | tree

Yan, Zheng [Tue, 11 Feb 2014 05:23:09 +0000 (13:23 +0800)]

ceph: fix ceph_removexattr()

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

commit | commitdiff | tree

Yan, Zheng [Tue, 11 Feb 2014 05:04:19 +0000 (13:04 +0800)]

ceph: remove xattr when null value is given to setxattr()

For the setxattr request, introduce a new flag CEPH_XATTR_REMOVE
to distinguish null value case from the zero-length value case.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>

commit | commitdiff | tree

Yan, Zheng [Tue, 11 Feb 2014 05:01:19 +0000 (13:01 +0800)]

ceph: properly handle XATTR_CREATE and XATTR_REPLACE

Return -EEXIST if XATTR_CREATE is set and the xattr already exists.
Return -ENODATA if XATTR_REPLACE is set but the xattr does not exist.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
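
The two rules above can be sketched as a small flag check. This is a minimal illustration, not the ceph code: check_xattr_flags() is a hypothetical helper, and the flag values are defined locally with the values Linux uses for setxattr(2).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Flag values as used by setxattr(2) on Linux, defined locally here. */
#define XATTR_CREATE  0x1 /* fail if the attribute already exists */
#define XATTR_REPLACE 0x2 /* fail if the attribute does not exist */

/* Hypothetical helper: returns 0 if the set may proceed, else -errno. */
static int check_xattr_flags(bool exists, int flags)
{
	if ((flags & XATTR_CREATE) && exists)
		return -EEXIST;
	if ((flags & XATTR_REPLACE) && !exists)
		return -ENODATA;
	return 0;
}

/* Usage: each flag only rejects the one case it is meant to guard. */
static void xattr_flags_example(void)
{
	assert(check_xattr_flags(true, XATTR_CREATE) == -EEXIST);
	assert(check_xattr_flags(false, XATTR_REPLACE) == -ENODATA);
	assert(check_xattr_flags(false, XATTR_CREATE) == 0);
}
```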

commit | commitdiff | tree

Linus Torvalds [Mon, 17 Feb 2014 20:36:49 +0000 (12:36 -0800)]

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

Pull powerpc fixes from Ben Herrenschmidt:
"Here are some more powerpc fixes for 3.14

  The main one is a nasty issue with the NUMA balancing support which
  requires a small generic change and the addition of a new accessor to
  set _PAGE_NUMA.  Both have been reviewed and acked by Mel and Rik.

  The changelog should have plenty of details but basically, without
  this fix, we get random user segfaults and/or corruptions due to
  missing TLB/hash flushes.  Aneesh's series of 3 patches fixes it.

  We have some vDSO vs. perf fixes from Anton, some small EEH fixes
  from Gavin, a ppc32 regression vs the stack overflow detector, and a
  fix for the way we handle PCIe host bridge speed settings on pseries
  (which is needed for proper operations of AMD graphics cards on
  Power8)"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/eeh: Disable EEH on reboot
  powerpc/eeh: Cleanup on eeh_subsystem_enabled
  powerpc/powernv: Rework EEH reset
  powerpc: Use unstripped VDSO image for more accurate profiling data
  powerpc: Link VDSOs at 0x0
  mm: Use ptep/pmdp_set_numa() for updating _PAGE_NUMA bit
  mm: Dirty accountable change only apply to non prot numa case
  powerpc/mm: Add new "set" flag argument to pte/pmd update function
  powerpc/pseries: Add Gen3 definitions for PCIE link speed
  powerpc/pseries: Fix regression on PCI link speed
  powerpc: Set the correct ksp_limit on ppc32 when switching to irq stack

commit | commitdiff | tree

Linus Torvalds [Mon, 17 Feb 2014 20:24:45 +0000 (12:24 -0800)]

printk: fix syslog() overflowing user buffer

This is not a buffer overflow in the traditional sense: we don't
overflow any *kernel* buffers, but we do mis-count the amount of data we
copy back to user space for the SYSLOG_ACTION_READ_ALL case.

In particular, if the user buffer is too small to hold everything, and
*if* there is a continuation line at just the right place, we can end up
giving the user more data than he asked for.

The reason is that we first count up the number of bytes all the log
records contain, then we walk the records again until we've skipped the
records at the beginning that won't fit, and then we walk the rest of
the records and copy them to the user space buffer.

And in between that "skip the initial records that won't fit" and the
"copy the records that *will* fit to user space", we reset the 'prev'
variable that contained the record information for the last record not
copied. That meant that when we started copying to user space, we now
had a different character count than what we had originally calculated
in the first record walk-through.

The fix is to simply not clear the 'prev' flags value (in both cases
where we had the same logic: syslog_print_all and kmsg_dump_get_buffer:
the latter is used for pstore-like dumping)

Reported-and-tested-by: Debabrata Banerjee <dbanerje@akamai.com>
Acked-by: Kay Sievers <kay@vrfy.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

David Herrmann [Thu, 19 Dec 2013 11:32:24 +0000 (12:32 +0100)]

HID: hyperv: make sure input buffer is big enough

We need at least HID_MAX_BUFFER_SIZE (4096) bytes as input buffer. HID
core depends on this as it requires every input report to be at least as
big as advertised.

Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

commit | commitdiff | tree

David Herrmann [Thu, 19 Dec 2013 11:09:32 +0000 (12:09 +0100)]

HID: Bluetooth: hidp: make sure input buffers are big enough

HID core expects the input buffers to be at least of size 4096
(HID_MAX_BUFFER_SIZE). Other sizes will result in buffer-overflows if an
input-report is smaller than advertised. We could, like i2c, compute the
biggest report-size instead of using HID_MAX_BUFFER_SIZE, but this will
blow up if report-descriptors are changed after ->start() has been called.
So let's be safe and just use the biggest buffer we have.

Note that this adds an additional copy to the HIDP input path. If there is
a way to make sure the skb-buf is big enough, we should use that instead.

The best way would be to make hid-core honor the @size argument, though
that sounds easier than it is. So let's just fix the buffer-overflows for
now and afterwards look for a faster way for all transport drivers.

Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
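
The extra copy mentioned above can be sketched as follows: the received report is copied into a zero-padded staging buffer of the maximum size, so a consumer that reads the advertised report length never runs past the allocation. This is a minimal sketch, not the HIDP code; stage_input_report() is a hypothetical name and only the 4096-byte size comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HID_MAX_BUFFER_SIZE 4096 /* value quoted in the commit message */

/*
 * Copy a received report into a staging buffer that is always
 * HID_MAX_BUFFER_SIZE bytes long. Bytes past the report are zeroed.
 * Returns the number of report bytes actually copied.
 */
static size_t stage_input_report(unsigned char *staging,
				 const unsigned char *data, size_t len)
{
	assert(staging != NULL && data != NULL);

	if (len > HID_MAX_BUFFER_SIZE)
		len = HID_MAX_BUFFER_SIZE;

	memset(staging, 0, HID_MAX_BUFFER_SIZE);
	memcpy(staging, data, len);
	return len;
}
```

The cost is one additional memcpy() per report, which is the trade-off the commit message acknowledges.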

commit | commitdiff | tree

Jiri Bohac [Fri, 14 Feb 2014 17:13:50 +0000 (18:13 +0100)]

bonding: 802.3ad: make aggregator_identifier bond-private

aggregator_identifier is used to assign unique aggregator identifiers
to aggregators of a bond during device enslaving.

aggregator_identifier is currently a global variable that is zeroed in
bond_3ad_initialize().

This sequence will lead to duplicate aggregator identifiers for eth1 and eth3:

create bond0
change bond0 mode to 802.3ad
enslave eth0 to bond0 //eth0 gets agg id 1
enslave eth1 to bond0 //eth1 gets agg id 2
create bond1
change bond1 mode to 802.3ad
enslave eth2 to bond1 //aggregator_identifier is reset to 0
//eth2 gets agg id 1
enslave eth3 to bond0 //eth3 gets agg id 2

Fix this by making aggregator_identifier private to the bond.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
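
The effect of moving the counter into the bond can be sketched by replaying the sequence above. This is an illustration only: struct demo_bond and the demo_* functions are hypothetical and much simpler than the bonding driver's real structures.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bond with its own aggregator id counter. */
struct demo_bond {
	uint16_t aggregator_identifier;
};

/* Initialising one bond no longer disturbs any other bond. */
static void demo_bond_3ad_initialize(struct demo_bond *bond)
{
	bond->aggregator_identifier = 0;
}

/* Enslaving a device hands out the next id of that bond only. */
static uint16_t demo_enslave(struct demo_bond *bond)
{
	return ++bond->aggregator_identifier;
}

/* Replays the sequence from the commit message with per-bond counters. */
static void demo_sequence(void)
{
	struct demo_bond bond0, bond1;

	demo_bond_3ad_initialize(&bond0);
	uint16_t eth0 = demo_enslave(&bond0);
	uint16_t eth1 = demo_enslave(&bond0);
	demo_bond_3ad_initialize(&bond1); /* resets bond1 only */
	uint16_t eth2 = demo_enslave(&bond1);
	uint16_t eth3 = demo_enslave(&bond0);

	/* eth3 now gets id 3 and no longer collides with eth1. */
	assert(eth0 == 1 && eth1 == 2 && eth2 == 1 && eth3 == 3);
}
```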

commit | commitdiff | tree

Emil Goode [Thu, 13 Feb 2014 16:50:19 +0000 (17:50 +0100)]

usbnet: remove generic hard_header_len check

This patch removes a generic hard_header_len check from the usbnet
module that is causing dropped packets under certain circumstances
for devices that send rx packets that cross urb boundaries.

One example is the AX88772B, which occasionally sends rx packets that
cross urb boundaries where the remaining partial packet is sent with
no hardware header. When the buffer with a partial packet contains fewer
octets than the value of hard_header_len, the buffer is
discarded by the usbnet module.

With AX88772B this can be reproduced by using ping with a packet
size between 1965-1976.

The bug has been reported here:

https://bugzilla.kernel.org/show_bug.cgi?id=29082

This patch introduces the following changes:
- Removes the generic hard_header_len check in the rx_complete
  function in the usbnet module.
- Introduces an ETH_HLEN check for skbs that are not cloned from
  within a rx_fixup callback.
- For safety a hard_header_len check is added to each rx_fixup
  callback function that could be affected by this change.
  These extra checks could possibly be removed by someone
  who has the hardware to test.
- Removes a call to dev_kfree_skb_any() and instead utilizes the
  dev->done list to queue skbs for cleanup.

The changes place full responsibility on the rx_fixup callback
functions that clone skbs to only pass valid skbs to the
usbnet_skb_return function.

Signed-off-by: Emil Goode <emilgoode@gmail.com>
Reported-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Nicolas Dichtel [Mon, 17 Feb 2014 13:22:21 +0000 (14:22 +0100)]

gre: add link local route when local addr is any

This bug was reported by Steinar H. Gunderson and was introduced by commit
f7cb8886335d ("sit/gre6: don't try to add the same route two times").

root@morgental:~# ip tunnel add foo mode gre remote 1.2.3.4 ttl 64
root@morgental:~# ip link set foo up mtu 1468
root@morgental:~# ip -6 route show dev foo
fe80::/64 proto kernel metric 256

but after the above commit, no such route shows up.

There is no link local route because dev->dev_addr is 0 (because local ipv4
address is 0), hence no link local address is configured.

In this scenario, the link local address is added manually: 'ip -6 addr add
fe80::1 dev foo' and because prefix is /128, no link local route is added by the
kernel.

Even if the right thing to do is to add the link local address with a /64
prefix, we need to restore the previous behavior to avoid breaking userspace.

Reported-by: Steinar H. Gunderson <sesse@samfundet.no>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Antonio Quartulli [Sat, 15 Feb 2014 20:50:37 +0000 (21:50 +0100)]

batman-adv: fix potential kernel paging error for unicast transmissions

batadv_send_skb_prepare_unicast(_4addr) might reallocate the
skb's data. If it does then our ethhdr pointer is not valid
anymore in batadv_send_skb_unicast(), resulting in a kernel
paging error.

Fix this by refetching the ethhdr pointer after the
potential reallocation.

Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
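
The same class of bug exists in userspace with realloc(), which makes for a compact sketch of the fix. This is an analogy under stated assumptions: struct buf and both functions are hypothetical and stand in for the skb and the "prepare" step; they are not batman-adv code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet buffer. */
struct buf {
	unsigned char *data;
	size_t len;
};

/* Stand-in for a "prepare" step: prepends hdr_len zero bytes.
 * May move the data, so older pointers into it become invalid. */
static int buf_push_header(struct buf *b, size_t hdr_len)
{
	unsigned char *n = realloc(b->data, b->len + hdr_len);

	if (!n)
		return -1;
	memmove(n + hdr_len, n, b->len);
	memset(n, 0, hdr_len);
	b->data = n;
	b->len += hdr_len;
	return 0;
}

/* Returns the first payload byte after pushing a header, or -1. */
static int first_payload_byte_after_push(struct buf *b, size_t hdr_len)
{
	assert(b != NULL && b->data != NULL && b->len > 0);

	/* A pointer taken before the push could dangle afterwards. */
	if (buf_push_header(b, hdr_len) < 0)
		return -1;

	/* Refetch from b->data only after the potential reallocation. */
	const unsigned char *payload = b->data + hdr_len;
	return payload[0];
}
```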

commit | commitdiff | tree

Antonio Quartulli [Sat, 15 Feb 2014 01:17:20 +0000 (02:17 +0100)]

batman-adv: avoid double free when orig_node initialization fails

In the failure path of the orig_node initialization routine
the orig_node->bat_iv.bcast_own field is free'd twice: first
in batadv_iv_ogm_orig_get() and then later in
batadv_orig_node_free_rcu().

Fix it by removing the kfree in batadv_iv_ogm_orig_get().

Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>

commit | commitdiff | tree

Antonio Quartulli [Tue, 11 Feb 2014 16:05:07 +0000 (17:05 +0100)]

batman-adv: free skb on TVLV parsing success

When the TVLV parsing routine succeeds, the skb is left
untouched, thus leading to a memory leak.

Fix this by consuming the skb in case of success.

Introduced by ef26157747d42254453f6b3ac2bd8bd3c53339c3
("batman-adv: tvlv - basic infrastructure")

Reported-by: Russell Senior <russell@personaltelco.net>
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Tested-by: Russell Senior <russell@personaltelco.net>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>

commit | commitdiff | tree

Antonio Quartulli [Tue, 11 Feb 2014 16:05:06 +0000 (17:05 +0100)]

batman-adv: fix TT CRC computation by ensuring byte order

When computing the CRC on a 2-byte variable the order of
the bytes obviously alters the final result. This means
that computing the CRC over the same value on two archs
having different endianness leads to different numbers.

The global and local translation table CRC computation
routine makes this mistake while processing the clients'
VIDs. The result is a continuous CRC mismatch between
nodes having different endianness.

Fix this by converting the VID to Network Order before
processing it. This guarantees that every node uses the same
byte order.

Introduced by 7ea7b4a142758deaf46c1af0ca9ceca6dd55138b
("batman-adv: make the TT CRC logic VLAN specific")

Reported-by: Russell Senior <russell@personaltelco.net>
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Tested-by: Russell Senior <russell@personaltelco.net>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
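
The byte-order point can be sketched with a toy checksum. Assumptions are labelled loudly: toy_checksum() is a deliberately simple stand-in and is not the CRC that batman-adv uses, and the explicit shifts below play the role that a conversion to network order would play in kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy order-sensitive checksum; NOT the CRC used by batman-adv. */
static uint32_t toy_checksum(const unsigned char *p, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum = sum * 31u + p[i];
	return sum;
}

/* Serialise the VID most significant byte first (network order), so
 * every host feeds the same two bytes into the checksum. */
static uint32_t checksum_vid(uint16_t vid)
{
	unsigned char be[2] = {
		(unsigned char)(vid >> 8),
		(unsigned char)(vid & 0xff),
	};

	return toy_checksum(be, sizeof(be));
}

/* Usage: swapping the two bytes changes the result, which is why
 * hashing the host representation differs between endiannesses. */
static void checksum_vid_example(void)
{
	assert(checksum_vid(0x0102) == 33u); /* 1 * 31 + 2 */
	assert(checksum_vid(0x0201) == 63u); /* 2 * 31 + 1 */
}
```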

commit | commitdiff | tree

Simon Wunderlich [Sat, 8 Feb 2014 15:45:06 +0000 (16:45 +0100)]

batman-adv: fix potential orig_node reference leak

Since batadv_orig_node_new() sets the refcount to two, assuming that
the calling function will use a reference for putting the orig_node into
a hash or similar, both references must be freed if initialization of
the orig_node fails. Otherwise that object may be leaked in that error
case.

Reported-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>

commit | commitdiff | tree

Antonio Quartulli [Wed, 29 Jan 2014 10:25:12 +0000 (11:25 +0100)]

batman-adv: avoid potential race condition when adding a new neighbour

When adding a new neighbour it is important to atomically
perform the following:
- check if the neighbour already exists
- append the neighbour to the proper list

If the two operations are not performed in an atomic context
it is possible that two concurrent insertions add the same
neighbour twice.

Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>

commit | commitdiff | tree

Antonio Quartulli [Wed, 29 Jan 2014 23:12:24 +0000 (00:12 +0100)]

batman-adv: properly check pskb_may_pull return value

pskb_may_pull() returns 1 on success and 0 in case of failure,
therefore checking for the return value being negative does
not make sense at all.

This way if the function fails we will probably read beyond the current
skb data buffer. Fix this by doing the proper check.

Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
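
The difference between the two checks can be sketched with a stand-in that follows the same return convention. This is an illustration only: may_pull() is a hypothetical userspace substitute for pskb_may_pull() that models nothing beyond "1 on success, 0 on failure".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: 1 if enough data is available, 0 otherwise. */
static int may_pull(size_t available, size_t wanted)
{
	return available >= wanted;
}

/* Buggy check: a 0/1 result is never negative, so failure is missed. */
static int parse_buggy(size_t available, size_t hdr_len)
{
	if (may_pull(available, hdr_len) < 0)
		return -1;
	return 0; /* continues even when the data is too short */
}

/* Fixed check: treat a zero return as failure. */
static int parse_fixed(size_t available, size_t hdr_len)
{
	if (!may_pull(available, hdr_len))
		return -1;
	return 0;
}

/* Usage: only the fixed variant rejects a 4-byte buffer for an
 * 8-byte header. */
static void may_pull_example(void)
{
	assert(parse_buggy(4, 8) == 0);
	assert(parse_fixed(4, 8) == -1);
	assert(parse_fixed(8, 8) == 0);
}
```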

commit | commitdiff | tree

Antonio Quartulli [Tue, 28 Jan 2014 01:06:47 +0000 (02:06 +0100)]

batman-adv: release vlan object after checking the CRC

There is a refcounter imbalance in the CRC checking routine
invoked on OGM reception. A vlan object is retrieved (thus
its refcounter is increased by one) but it is never properly
released. This leads to a memleak because the vlan object
will never be free'd.

Fix this by releasing the vlan object after having read the
CRC.

Reported-by: Russell Senior <russell@personaltelco.net>
Reported-by: Daniel <daniel@makrotopia.org>
Reported-by: cmsv <cmsv@wirelesspt.net>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>

commit | commitdiff | tree

Antonio Quartulli [Mon, 27 Jan 2014 11:23:28 +0000 (12:23 +0100)]

batman-adv: fix TT-TVLV parsing on OGM reception

When accessing a TT-TVLV container in the OGM RX path
the variable pointing to the list of changes to apply is
altered by mistake.

This makes the TT component read data at the wrong position
in the OGM packet buffer.

Fix it by removing the bogus pointer alteration.

Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>

commit | commitdiff | tree

Antonio Quartulli [Tue, 21 Jan 2014 10:22:05 +0000 (11:22 +0100)]

batman-adv: fix soft-interface MTU computation

The current MTU computation always returns a value
smaller than 1500 bytes even if the real interfaces
have an MTU large enough to compensate for the batman-adv
overhead.

Fix the computation by properly returning the highest
admitted value.

Introduced by a19d3d85e1b854e4a483a55d740a42458085560d
("batman-adv: limit local translation table max size")

Reported-by: Russell Senior <russell@personaltelco.net>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>

commit | commitdiff | tree

Archana Patni [Mon, 3 Feb 2014 07:14:16 +0000 (12:44 +0530)]

HID: hid-sensor-hub: quirk for STM Sensor hub

Added the STM sensor hub vendor id to HID_SENSOR_HUB_ENUM_QUIRK to
fix report descriptors. These devices use old FW which uses
logical 0 as minimum. In these, HID reports are not using proper
collection classes, so we need to fix the report descriptors for
such devices. This will not have any impact if the FW uses
logical 1 as minimum.

We look for usage id for "power and report state", and modify
logical minimum value to 1.

This is a follow-up patch to commit id 875e36f8.

Signed-off-by: Archana Patni <archana.patni@linux.intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

commit | commitdiff | tree

Chen Gang [Sun, 16 Feb 2014 11:36:06 +0000 (19:36 +0800)]

avr32: add generic vga.h to Kbuild

The generic "vga.h" needs to be added, otherwise the allmodconfig build fails.
The related error:

    CC [M]  drivers/gpu/drm/drm_irq.o
  In file included from include/linux/vgaarb.h:34,
                   from drivers/gpu/drm/drm_irq.c:42:
  include/video/vga.h:22:21: error: asm/vga.h: No such file or directory

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Hans-Christian Egtvedt <hegtvedt@cisco.com>

commit | commitdiff | tree

Chen Gang [Sun, 16 Feb 2014 11:39:30 +0000 (19:39 +0800)]

avr32: add generic ioremap_wc() definition in io.h

A generic ioremap_wc() is needed, otherwise compiling with allmodconfig fails.
The related error:

    CC [M]  drivers/gpu/drm/drm_bufs.o
  drivers/gpu/drm/drm_bufs.c: In function 'drm_addmap_core':
  drivers/gpu/drm/drm_bufs.c:217: error: implicit declaration of function 'ioremap_wc'
  drivers/gpu/drm/drm_bufs.c:218: warning: assignment makes pointer from integer without a cast

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Hans-Christian Egtvedt <hegtvedt@cisco.com>

commit | commitdiff | tree

Chen Gang [Sat, 1 Feb 2014 12:35:54 +0000 (20:35 +0800)]

avr32: Makefile: add '-D__linux__' flag for gcc-4.4.7 use

The avr32 cross compiler does not define '__linux__' internally, which
causes build issues with allmodconfig.

The related error:

    CC [M]  fs/coda/psdev.o
  In file included from include/linux/coda.h:64,
                   from fs/coda/psdev.c:45:
  include/uapi/linux/coda.h:221: error: expected specifier-qualifier-list before 'u_quad_t'

The related toolchain version (downloaded as a prebuilt binary, not recompiled):

  [root@gchen linux-next]# /upstream/toolchain/download/avr32-gnu-toolchain-linux_x86/bin/avr32-gcc -v
  Using built-in specs.
  Target: avr32
  Configured with: /data2/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/src/gcc/configure --target=avr32 --host=i686-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --enable-languages=c,c++ --disable-nls --disable-libssp --disable-libstdcxx-pch --with-dwarf2 --enable-version-specific-runtime-libs --disable-shared --enable-doc --with-mpfr-lib=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86/lib --with-mpfr-include=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86/include --with-gmp=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --with-mpc=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --enable-__cxa_atexit --disable-shared --with-newlib --with-pkgversion=AVR_32_bit_GNU_Toolchain_3.4.2_435 --with-bugurl=http://www.atmel.com/avr
  Thread model: single
  gcc version 4.4.7 (AVR_32_bit_GNU_Toolchain_3.4.2_435)

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Hans-Christian Egtvedt <hegtvedt@cisco.com>
Cc: stable@vger.kernel.org

commit | commitdiff | tree

Paul Gortmaker [Fri, 10 Jan 2014 14:29:39 +0000 (09:29 -0500)]

avr32: fix missing module.h causing build failure in mimc200/fram.c

Causing this:

In file included from arch/avr32/boards/mimc200/fram.c:13:
include/linux/miscdevice.h:51: error: field 'list' has incomplete type
include/linux/miscdevice.h:55: error: expected specifier-qualifier-list before 'mode_t'
arch/avr32/boards/mimc200/fram.c:42: error: 'THIS_MODULE' undeclared here (not in a function)

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: stable@vger.kernel.org

commit | commitdiff | tree

Daniel Borkmann [Sun, 16 Feb 2014 14:55:22 +0000 (15:55 +0100)]

packet: check for ndo_select_queue during queue selection

Mathias reported that on an AMD Geode LX embedded board (ALiX)
with ath9k driver PACKET_QDISC_BYPASS, introduced in commit
d346a3fae3ff ("packet: introduce PACKET_QDISC_BYPASS socket
option"), triggers a WARN_ON() coming from the driver itself
via 066dae93bdf ("ath9k: rework tx queue selection and fix
queue stopping/waking").

The reason why this happened is that ndo_select_queue() is not
invoked from the direct xmit path. The ieee80211 subsystem relies
on that call to set the queue and the TID (similar to an 802.1d
tag) which is put into the frame through 802.11e (WMM, QoS). If
that is not set, the pending frame counter of e.g. ath9k can get
messed up.

So the WARN_ON() in ath9k is absolutely legitimate. Generally,
the hw queue selection in ieee80211 depends on the type of
traffic, and priorities are set according to ieee80211_ac_numbers
mapping; working in a similar way as DiffServ only on a lower
layer, so that the AP can favour frames that have "real-time"
requirements like voice or video data frames.

Therefore, check for the presence of ndo_select_queue() in netdev
ops and, if available, invoke it with a fallback handler to
__packet_pick_tx_queue(), so that drivers such as bnx2x, ixgbe,
or mlx4 can still select a hw queue for transmission in
relation to the current CPU while e.g. the ieee80211 subsystem
can make its own choices.
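
The selection order described above can be sketched in plain C. This is
only an illustration with stand-in types, not the code of this patch: every
stub_* name, the current_cpu field and the priority rule in the example hook
are assumptions made up for the sketch. It shows the driver hook being
preferred when present, receiving the CPU-based picker as its fallback, and
the result being capped to the number of real tx queues.

```c
/* Stand-ins for struct sk_buff and struct net_device (hypothetical). */
struct stub_skb { int priority; };
struct stub_dev;

typedef unsigned short (*stub_fallback_t)(struct stub_dev *dev,
					  struct stub_skb *skb);

struct stub_dev {
	unsigned int real_num_tx_queues;
	unsigned int current_cpu;	/* stand-in for the transmitting CPU */
	/* Optional driver hook; 0 when the driver has no own selection. */
	unsigned short (*select_queue)(struct stub_dev *dev,
				       struct stub_skb *skb,
				       stub_fallback_t fallback);
};

/* Fallback picker: choose a queue in relation to the current CPU. */
unsigned short stub_pick_tx_queue(struct stub_dev *dev, struct stub_skb *skb)
{
	(void)skb;
	return (unsigned short)(dev->current_cpu % dev->real_num_tx_queues);
}

/* Keep a driver-provided index inside the valid range. */
unsigned short stub_cap_txqueue(struct stub_dev *dev, unsigned short queue)
{
	return queue >= dev->real_num_tx_queues ? 0 : queue;
}

/* Prefer the driver hook if present, otherwise use the fallback directly. */
unsigned short stub_select(struct stub_dev *dev, struct stub_skb *skb)
{
	if (dev->select_queue)
		return stub_cap_txqueue(dev,
			dev->select_queue(dev, skb, stub_pick_tx_queue));
	return stub_pick_tx_queue(dev, skb);
}

/* Example hook: prioritised frames go to queue 0, the rest fall back. */
unsigned short stub_wifi_select(struct stub_dev *dev, struct stub_skb *skb,
				stub_fallback_t fallback)
{
	if (skb->priority > 0)
		return 0;
	return fallback(dev, skb);
}
```

A device without a hook on CPU 5 with 4 queues ends up on queue 1 in this
sketch, while the example hook sends a prioritised frame to queue 0.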

Reported-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Daniel Borkmann [Sun, 16 Feb 2014 14:55:21 +0000 (15:55 +0100)]

netdevice: move netdev_cap_txqueue for shared usage to header

In order to allow users to invoke netdev_cap_txqueue, it needs to
be moved into netdevice.h header file. While at it, also add kernel
doc header to document the API.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Daniel Borkmann [Sun, 16 Feb 2014 14:55:20 +0000 (15:55 +0100)]

netdevice: add queue selection fallback handler for ndo_select_queue

Add a new argument for ndo_select_queue() callback that passes a
fallback handler. This gets invoked through netdev_pick_tx();
fallback handler is currently __netdev_pick_tx() as most drivers
invoke this function within their customized implementation for
skbs that don't need any special handling. This fallback
handler can then be replaced on other call-sites with different
queue selection methods (e.g. in packet sockets, pktgen etc).

This also has the nice side-effect that __netdev_pick_tx() is
then only invoked from netdev_pick_tx() and export of that
function to modules can be undone.

Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Ingo Molnar [Fri, 14 Feb 2014 14:32:20 +0000 (15:32 +0100)]

drivers/net: tulip_remove_one needs to call pci_disable_device()

Otherwise the device is not completely shut down.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Matija Glavinic Pecotic [Fri, 14 Feb 2014 13:51:18 +0000 (14:51 +0100)]

net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer

The implementation of the (a_)rwnd calculation might lead to severe performance
issues and to associations stalling completely. These problems are described
below, and a solution is proposed which improves lksctp's robustness in
congestion state.

1) Sudden drop of a_rwnd and incomplete window recovery afterwards

Data accounted in sctp_assoc_rwnd_decrease takes only the payload size (sctp
data) into account, but the size of the sk_buff, which is charged against the
receiver buffer, is not accounted in rwnd. Theoretically, this should not be a
problem, as the actual size of the buffer is double the amount requested on the
socket (SO_RCVBUF). The problem is that this scales badly for data which is
smaller than sizeof(struct sk_buff). E.g. in 4G (LTE) networks, the link
interfacing the radio side carries a large portion of traffic of this size
(less than 100B).

An example of sudden drop and incomplete window recovery is given below. Node B
exhibits problematic behavior. Node A initiates association and B is configured
to advertise rwnd of 10000. A sends messages of size 43B (size of typical sctp
message in 4G (LTE) network). On B data is left in buffer by not reading socket
in userspace.

Let's examine when we will hit pressure state and declare rwnd to be 0 for the
scenario with the parameters stated above (rwnd == 10000, chunk size == 43, each
chunk is sent in a separate sctp packet).

The logic is implemented in sctp_assoc_rwnd_decrease:

socket_buffer (see below) is the maximum size which can be held in the socket
buffer (sk_rcvbuf). currently_alloced is the amount of data currently allocated
(rx_count).

A simple expression follows, from which we derive after how many packets we
enter pressure state for the parameters stated above.

We start with the condition which has to be met in order to enter pressure state:

socket_buffer < currently_alloced;

currently_alloced is represented as size of sctp packets received so far and not
yet delivered to userspace. x is the number of chunks/packets (since there is no
bundling, and each chunk is delivered in separate packet, we can observe each
chunk also as sctp packet, and what is important here, having its own sk_buff):

socket_buffer < x*each_sctp_packet;

each_sctp_packet is sctp chunk size + sizeof(struct sk_buff). socket_buffer is
twice the amount of initially requested size of socket buffer, which is in case
of sctp, twice the a_rwnd requested:

2*rwnd < x*(payload+sizeof(struct sk_buff));

sizeof(struct sk_buff) is 190 (3.13.0-rc4+). Above is stated that rwnd is 10000
and each payload size is 43

20000 < x(43+190);

x > 20000/233;

x ~> 84;
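
The inequality above can be checked with a few lines of C. This is only a
sketch of the arithmetic in this message, not kernel code; the function name
packets_until_pressure is made up for illustration. With the exact figures
given (rwnd 10000, payload 43, sizeof(struct sk_buff) 190) the inequality
first holds for x = 86, slightly above the rounded ~84 quoted here.

```c
/*
 * Smallest packet count x for which
 *     2*rwnd < x*(payload + skb_overhead)
 * holds, i.e. the packet that pushes the socket into pressure state
 * under the simple model used in this message.
 */
int packets_until_pressure(int rwnd, int payload, int skb_overhead)
{
	int socket_buffer = 2 * rwnd;			/* 20000 */
	int per_packet = payload + skb_overhead;	/* 43 + 190 == 233 */

	/* Integer division rounds down, so one more packet is needed. */
	return socket_buffer / per_packet + 1;
}
```

In this model only 86*43B = 3698B of sctp data has been received when
pressure is hit.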

After ~84 messages, pressure state is entered and 0 rwnd is advertised, although
only 84*43B = 3612B of sctp data has been received. This is why an external
observer notices a sudden drop from 6474 to 0, as shown in the following example:

IP A.34340 > B.12345: sctp (1) [INIT] [init tag: 1875509148] [rwnd: 81920] [OS: 10] [MIS: 65535] [init TSN: 1096057017]
IP B.12345 > A.34340: sctp (1) [INIT ACK] [init tag: 3198966556] [rwnd: 10000] [OS: 10] [MIS: 10] [init TSN: 902132839]
IP A.34340 > B.12345: sctp (1) [COOKIE ECHO]
IP B.12345 > A.34340: sctp (1) [COOKIE ACK]
IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057017] [SID: 0] [SSEQ 0] [PPID 0x18]
IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057017] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0]
IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057018] [SID: 0] [SSEQ 1] [PPID 0x18]
IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057018] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0]
IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057019] [SID: 0] [SSEQ 2] [PPID 0x18]
IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057019] [a_rwnd 9914] [#gap acks 0] [#dup tsns 0]
<...>
IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057098] [SID: 0] [SSEQ 81] [PPID 0x18]
IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057098] [a_rwnd 6517] [#gap acks 0] [#dup tsns 0]
IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057099] [SID: 0] [SSEQ 82] [PPID 0x18]
IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057099] [a_rwnd 6474] [#gap acks 0] [#dup tsns 0]
IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057100] [SID: 0] [SSEQ 83] [PPID 0x18]

--> Sudden drop

IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 0] [#gap acks 0] [#dup tsns 0]

At this point, rwnd_press stores the current rwnd value so it can be restored
later in sctp_assoc_rwnd_increase. This however doesn't happen, as the condition
to start slowly increasing rwnd until rwnd_press has been returned to rwnd is
never met. The condition is not met because rwnd, after it hit 0, must first
reach rwnd_press by adding the amount which is read from userspace. Let us
observe the values in the above example. The initial a_rwnd is 10000, pressure
was hit when rwnd was ~6500, and the amount of actual sctp data currently
waiting to be delivered to userspace is ~3500. When userspace starts to read,
sctp_assoc_rwnd_increase will be credited only with the sctp data, which is
~3500. The condition is never met, and when userspace has read all data, rwnd
stays at 3569.

IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 1505] [#gap acks 0] [#dup tsns 0]
IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 3010] [#gap acks 0] [#dup tsns 0]
IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057101] [SID: 0] [SSEQ 84] [PPID 0x18]
IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057101] [a_rwnd 3569] [#gap acks 0] [#dup tsns 0]

--> At this point userspace read everything, rwnd recovered only to 3569

IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057102] [SID: 0] [SSEQ 85] [PPID 0x18]
IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057102] [a_rwnd 3569] [#gap acks 0] [#dup tsns 0]

Reproduction is straightforward: it is enough for the sender to send packets of
size less than sizeof(struct sk_buff) and for the receiver to keep them in its
buffers.
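
The stall described in 1) can be reproduced with a toy model. The sketch
below is not the kernel implementation: the toy_* names are hypothetical,
the per-packet charge is assumed to be exactly payload + 190, the recovery
step size TOY_STEP is an arbitrary assumption, and a window smaller than
one chunk is simply clamped to 0.

```c
struct toy_assoc {
	int rwnd;	/* window currently advertised */
	int rwnd_press;	/* window saved when pressure was hit */
	int rx_count;	/* bytes charged against the socket buffer */
};

enum { TOY_OVERHEAD = 190, TOY_STEP = 1500 };	/* assumed values */

/* One chunk of len bytes arrives and stays queued in the buffer. */
void toy_rwnd_decrease(struct toy_assoc *a, int rcvbuf, int len)
{
	a->rx_count += len + TOY_OVERHEAD;
	a->rwnd = a->rwnd >= len ? a->rwnd - len : 0;
	if (rcvbuf < a->rx_count) {	/* pressure state */
		a->rwnd_press += a->rwnd;
		a->rwnd = 0;
	}
}

/* Userspace reads one chunk of len bytes. */
void toy_rwnd_increase(struct toy_assoc *a, int len)
{
	a->rx_count -= len + TOY_OVERHEAD;
	a->rwnd += len;
	/* Slow recovery starts only once rwnd has reached rwnd_press. */
	if (a->rwnd_press && a->rwnd >= a->rwnd_press) {
		int change = a->rwnd_press < TOY_STEP ? a->rwnd_press : TOY_STEP;

		a->rwnd += change;
		a->rwnd_press -= change;
	}
}

/* Queue chunks until rwnd hits 0, then read them all; returns final rwnd. */
int toy_fill_and_drain(struct toy_assoc *a, int rcvbuf, int len)
{
	int queued = 0;

	while (a->rwnd > 0) {
		toy_rwnd_decrease(a, rcvbuf, len);
		queued++;
	}
	while (queued-- > 0)
		toy_rwnd_increase(a, len);
	return a->rwnd;
}
```

Starting from rwnd 10000 with a socket buffer of 20000 and 43B chunks, the
model saves 6302 in rwnd_press and ends with rwnd stuck at 3698 once
everything is read, since 3698 never reaches 6302. The trace above stalls at
3569, a few packets earlier than this simplified model.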

2) Minute size window for associations sharing the same socket buffer

In case multiple associations share the same socket, and same socket buffer
(sctp.rcvbuf_policy == 0), different scenarios exist in which congestion on one
of the associations can permanently drop rwnd of other association(s).

The situation is typically observed as one association suddenly having its rwnd
dropped to the size of the last packet received and never recovering beyond that
point. Different scenarios lead to it, but all have in common that one of the
associations (let it be the association from 1)) has nearly depleted the socket
buffer, and the other association charges the socket buffer with just enough
data to trigger pressure. This association will enter pressure state, set
rwnd_press and announce 0 rwnd.
When data is read by userspace, a situation similar to 1) will occur: rwnd will
increase just by the size read by userspace, but rwnd_press will be so high
that the association doesn't have enough credit to reach rwnd_press and restore
its previous state. This case is a special case of 1), and a worse one, as in
the worst case there is only one packet in the buffer by whose size rwnd will be
increased. The consequence is an association which has a very low maximum rwnd
('minute size', in our case down to 43B - the size of the packet which caused
pressure) and as such is unusable.

The scenario happened frequently in the field and in labs after congestion state
(link breaks, different probabilities of packet drop, packet reordering) and
with scenario 1) preceding. A deterministic scenario for reproduction is given
here:

From node A establish two associations on the same socket, with rcvbuf_policy
being set to share one common buffer (sctp.rcvbuf_policy == 0). On association 1
repeat the scenario from 1), that is, bring it down to 0 and restore it. Observe
scenario 1). Use a small payload size (here we use 43). Once rwnd is 'recovered',
bring it down close to 0, so that just one more packet would close it. As a
consequence, association number 2 is able to receive (at least) one more
packet, which will bring it into pressure state. E.g. if association 2 had an
rwnd of 10000, the packet received was 43, and we enter pressure at this point,
rwnd_press will be 9957. Once the payload is delivered to userspace, rwnd will
increase by 43, but the conditions to restore rwnd to its original state, just
as in 1), will never be satisfied.

--> Association 1, between A.y and B.12345

IP A.55915 > B.12345: sctp (1) [INIT] [init tag: 836880897] [rwnd: 10000] [OS: 10] [MIS: 65535] [init TSN: 4032536569]
IP B.12345 > A.55915: sctp (1) [INIT ACK] [init tag: 2873310749] [rwnd: 81920] [OS: 10] [MIS: 10] [init TSN: 3799315613]
IP A.55915 > B.12345: sctp (1) [COOKIE ECHO]
IP B.12345 > A.55915: sctp (1) [COOKIE ACK]

--> Association 2, between A.z and B.12346

IP A.55915 > B.12346: sctp (1) [INIT] [init tag: 534798321] [rwnd: 10000] [OS: 10] [MIS: 65535] [init TSN: 2099285173]
IP B.12346 > A.55915: sctp (1) [INIT ACK] [init tag: 516668823] [rwnd: 81920] [OS: 10] [MIS: 10] [init TSN: 3676403240]
IP A.55915 > B.12346: sctp (1) [COOKIE ECHO]
IP B.12346 > A.55915: sctp (1) [COOKIE ACK]

--> Deplete socket buffer by sending messages of size 43B over association 1

IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315613] [SID: 0] [SSEQ 0] [PPID 0x18]
IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315613] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0]

<...>

IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315696] [a_rwnd 6388] [#gap acks 0] [#dup tsns 0]
IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315697] [SID: 0] [SSEQ 84] [PPID 0x18]
IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315697] [a_rwnd 6345] [#gap acks 0] [#dup tsns 0]

--> Sudden drop on 1

IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315698] [SID: 0] [SSEQ 85] [PPID 0x18]
IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315698] [a_rwnd 0] [#gap acks 0] [#dup tsns 0]

--> Here userspace read, rwnd 'recovered' to 3698, now deplete again using
    association 1 so there is place in buffer for only one more packet

IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315799] [SID: 0] [SSEQ 186] [PPID 0x18]
IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315799] [a_rwnd 86] [#gap acks 0] [#dup tsns 0]
IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315800] [SID: 0] [SSEQ 187] [PPID 0x18]
IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 43] [#gap acks 0] [#dup tsns 0]

--> Socket buffer is almost depleted, but there is space for one more packet,
    send them over association 2, size 43B

IP B.12346 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3676403240] [SID: 0] [SSEQ 0] [PPID 0x18]
IP A.55915 > B.12346: sctp (1) [SACK] [cum ack 3676403240] [a_rwnd 0] [#gap acks 0] [#dup tsns 0]

--> Immediate drop

IP A.60995 > B.12346: sctp (1) [SACK] [cum ack 387491510] [a_rwnd 0] [#gap acks 0] [#dup tsns 0]

--> Read everything from the socket, both associations recover up to the maximum
    rwnd they are capable of reaching, note that association 1 recovered up to
    3698, and association 2 recovered only to 43

IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 1548] [#gap acks 0] [#dup tsns 0]
IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 3053] [#gap acks 0] [#dup tsns 0]
IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315801] [SID: 0] [SSEQ 188] [PPID 0x18]
IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315801] [a_rwnd 3698] [#gap acks 0] [#dup tsns 0]
IP B.12346 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3676403241] [SID: 0] [SSEQ 1] [PPID 0x18]
IP A.55915 > B.12346: sctp (1) [SACK] [cum ack 3676403241] [a_rwnd 43] [#gap acks 0] [#dup tsns 0]

A careful reader might wonder why it is necessary to reproduce 1) prior to
reproducing 2). It simply makes it easier to observe when to send the packet
over association 2 which will push that association into the pressure state.
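
The deterministic scenario for 2) can also be replayed in a toy model with
one buffer shared by two associations. As before, this is a sketch and not
kernel code: the shr_* names are hypothetical, each packet is assumed to be
charged as 43 + 190 bytes, and the recovery step SHR_STEP is an arbitrary
assumption.

```c
struct shr_sock  { int rcvbuf; int rmem; };	/* buffer shared by both */
struct shr_assoc { int rwnd; int rwnd_press; };

enum { SHR_CHUNK = 43, SHR_OVERHEAD = 190, SHR_STEP = 1500 };

/* One chunk arrives on association a and is charged to the shared buffer. */
void shr_receive(struct shr_sock *sk, struct shr_assoc *a)
{
	sk->rmem += SHR_CHUNK + SHR_OVERHEAD;
	a->rwnd = a->rwnd >= SHR_CHUNK ? a->rwnd - SHR_CHUNK : 0;
	if (sk->rcvbuf < sk->rmem) {	/* pressure state */
		a->rwnd_press += a->rwnd;
		a->rwnd = 0;
	}
}

/* Userspace reads the given number of chunks from association a. */
void shr_read(struct shr_sock *sk, struct shr_assoc *a, int chunks)
{
	while (chunks-- > 0) {
		sk->rmem -= SHR_CHUNK + SHR_OVERHEAD;
		a->rwnd += SHR_CHUNK;
		if (a->rwnd_press && a->rwnd >= a->rwnd_press) {
			int change = a->rwnd_press < SHR_STEP ?
				     a->rwnd_press : SHR_STEP;

			a->rwnd += change;
			a->rwnd_press -= change;
		}
	}
}

void shr_scenario(struct shr_sock *sk, struct shr_assoc *a1,
		  struct shr_assoc *a2)
{
	int queued = 0;

	/* Repeat 1) on association 1: fill until rwnd is 0, read it all. */
	while (a1->rwnd > 0) {
		shr_receive(sk, a1);
		queued++;
	}
	shr_read(sk, a1, queued);

	/* Deplete again until the window holds just one more chunk. */
	for (queued = 0; a1->rwnd > SHR_CHUNK; queued++)
		shr_receive(sk, a1);

	/* One chunk on association 2 tips the shared buffer into pressure. */
	shr_receive(sk, a2);

	/* Userspace reads everything from both associations. */
	shr_read(sk, a1, queued);
	shr_read(sk, a2, 1);
}
```

With a shared buffer of 20000 and both windows starting at 10000, the model
ends with association 1 at 3698 and association 2 at 43 with rwnd_press
9957, the same figures as in the trace.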

Proposed solution:

Both problems share the same root cause, and that is improper scaling of the
socket buffer with rwnd. A solution in which sizeof(sk_buff) is taken into
account while calculating rwnd is not possible due to the fact that there is no
linear relationship between the amount of data charged in increase/decrease and
the IP packet in which the payload arrived. Even if such a solution were
followed, the complexity of the code would increase. Due to the nature of the
current rwnd handling, the slow increase (in sctp_assoc_rwnd_increase) of rwnd
after pressure state is entered is rational, but it gives the sender a false
representation of the current buffer space. Furthermore, it implements an
additional congestion control mechanism which is defined by the implementation,
and not by the standard.

The proposed solution simplifies the whole algorithm, keeping in mind the
definition from the RFC:

o  Receiver Window (rwnd): This gives the sender an indication of the space
   available in the receiver's inbound buffer.

Core of the proposed solution is given with these lines:

sctp_assoc_rwnd_update:
    if ((asoc->base.sk->sk_rcvbuf - rx_count) > 0)
        asoc->rwnd = (asoc->base.sk->sk_rcvbuf - rx_count) >> 1;
    else
        asoc->rwnd = 0;

We advertise to the sender (half of) the actual space we have. 'Half' is in
parentheses because it depends on whether you regard the size of the socket
buffer as the SO_RCVBUF value, i.e. the size visible from userspace, or as twice
that amount, i.e. the size visible from kernelspace.
In this way the sender is given a good approximation of our buffer space,
regardless of the buffer policy - we always advertise what we have. The proposed
solution fixes the described problems and removes the need for the rwnd
restoration algorithm. Finally, as the proposed solution is a simplification,
some lines of code, along with some bytes in struct sctp_association, are saved.
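
The core lines quoted above can be tried out as a standalone function. This
is a sketch for experimenting with the numbers from this message, not the
patched kernel function; the name sketch_rwnd_update and the plain int
parameters are assumptions.

```c
/*
 * Window to advertise for a given socket buffer size and the amount
 * currently charged against it: half of the free space, or 0 when
 * nothing is free.
 */
int sketch_rwnd_update(int sk_rcvbuf, int rx_count)
{
	if (sk_rcvbuf - rx_count > 0)
		return (sk_rcvbuf - rx_count) >> 1;
	return 0;
}
```

With a socket buffer of 20000, an empty buffer is advertised as 10000, one
queued 233B packet leaves 9883, and an overfull buffer is advertised as 0.
Once the buffer has been drained the result is 10000 again, with no saved
state to restore.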

Version 2 of the patch addressed comments from Vlad. Name of the function is set
to be more descriptive, and two parts of code are changed, in one removing the
superfluous call to sctp_assoc_rwnd_update since call would not result in update
of rwnd, and the other being reordering of the code in a way that call to
sctp_assoc_rwnd_update updates rwnd. Version 3 corrected change introduced in v2
in a way that existing function is not reordered/copied in line, but it is
correctly called. Thanks Vlad for suggesting.

Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@nsn.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Duan Jiong [Fri, 14 Feb 2014 10:26:22 +0000 (18:26 +0800)]

ipv4: distinguish EHOSTUNREACH from the ENETUNREACH

Since commit 251da413 ("ipv4: Cache ip_error() routes even when not forwarding."),
the counter IPSTATS_MIB_INADDRERRORS can't work correctly, because the value of
err was always set to ENETUNREACH.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Haiyang Zhang [Thu, 13 Feb 2014 00:54:27 +0000 (16:54 -0800)]

hyperv: Fix the carrier status setting

Without this patch, the "cat /sys/class/net/ethN/operstate" shows
"unknown", and "ethtool ethN" shows "Link detected: yes", when VM
boots up with or without vNIC connected.

This patch fixes the problem.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Beck SC1x5 Kernel