git.karo-electronics.de Git - linux-beck.git/log

inet: frag: make sure forced eviction removes all frags

Quoting Alexander Aring:
  While fragmentation and unloading of 6lowpan module I got this kernel Oops
  after few seconds:

  BUG: unable to handle kernel paging request at f88bbc30
  [..]
  Modules linked in: ipv6 [last unloaded: 6lowpan]
  Call Trace:
   [<c012af4c>] ? call_timer_fn+0x54/0xb3
   [<c012aef8>] ? process_timeout+0xa/0xa
   [<c012b66b>] run_timer_softirq+0x140/0x15f

Problem is that incomplete frags are still around after unload; when
their frag expire timer fires, we get crash.

When a netns is removed (also done when unloading module), inet_frag
calls the evictor with 'force' argument to purge remaining frags.

The evictor loop terminates when accounted memory ('work') drops to 0
or the lru-list becomes empty.  However, the mem accounting is done
via percpu counters and may not be accurate, i.e. loop may terminate
prematurely.

Alter evictor to only stop once the lru list is empty when force is
requested.

Reported-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
Reported-by: Alexander Aring <alex.aring@gmail.com>
Tested-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tipc'

Eric Hugne says:

====================
tipc: refcount and memory leak fixes

v3: Remove error logging from data path completely. Rebased on top of
    latest net merge.

v2: Drop specific -ENOMEM logging in patch #1 (tipc: allow connection
    shutdown callback to be invoked in advance) And add a general error
    message if an internal server tries to send a message on a
    closed/nonexisting connection.

In addition to the fix for refcount leak and memory leak during
module removal, we also fix a problem where the topology server
listening socket where unexpectedly closed. We also eliminate an
unnecessary context switch during accept()/recvmsg() for nonblocking
sockets.

It might be good to include this patchset in stable aswell. After the
v3 rebase on latest merge from net all patches apply cleanly on that
tree.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: don't log disabled tasklet handler errors

Failure to schedule a TIPC tasklet with tipc_k_signal because the
tasklet handler is disabled is not an error. It means TIPC is
currently in the process of shutting down. We remove the error
logging in this case.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: fix memory leak during module removal

When the TIPC module is removed, the tasklet handler is disabled
before all other subsystems. This will cause lingering publications
in the name table because the node_down tasklets responsible to
clean up publications from an unreachable node will never run.
When the name table is shut down, these publications are detected
and an error message is logged:
tipc: nametbl_stop(): orphaned hash chain detected
This is actually a memory leak, introduced with commit
993b858e37b3120ee76d9957a901cca22312ffaa ("tipc: correct the order
of stopping services at rmmod")

Instead of just logging an error and leaking memory, we free
the orphaned entries during nametable shutdown.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: drop subscriber connection id invalidation

When a topology server subscriber is disconnected, the associated
connection id is set to zero. A check vs zero is then done in the
subscription timeout function to see if the subscriber have been
shut down. This is unnecessary, because all subscription timers
will be cancelled when a subscriber terminates. Setting the
connection id to zero is actually harmful because id zero is the
identity of the topology server listening socket, and can cause a
race that leads to this socket being closed instead.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: avoid to unnecessary process switch under non-block mode

When messages are received via tipc socket under non-block mode,
schedule_timeout() is called in tipc_wait_for_rcvmsg(), that is,
the process of receiving messages will be scheduled once although
timeout value passed to schedule_timeout() is 0. The same issue
exists in accept()/wait_for_accept(). To avoid this unnecessary
process switch, we only call schedule_timeout() if the timeout
value is non-zero.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: fix connection refcount leak

When tipc_conn_sendmsg() calls tipc_conn_lookup() to query a
connection instance, its reference count value is increased if
it's found. But subsequently if it's found that the connection is
closed, the work of sending message is not queued into its server
send workqueue, and the connection reference count is not decreased.
This will cause a reference count leak. To reproduce this problem,
an application would need to open and closes topology server
connections with high intensity.

We fix this by immediately decrementing the connection reference
count if a send fails due to the connection being closed.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: allow connection shutdown callback to be invoked in advance

Currently connection shutdown callback function is called when
connection instance is released in tipc_conn_kref_release(), and
receiving packets and sending packets are running in different
threads. Even if connection is closed by the thread of receiving
packets, its shutdown callback may not be called immediately as
the connection reference count is non-zero at that moment. So,
although the connection is shut down by the thread of receiving
packets, the thread of sending packets doesn't know it. Before
its shutdown callback is invoked to tell the sending thread its
connection has been closed, the sending thread may deliver
messages by tipc_conn_sendmsg(), this is why the following error
information appears:

"Sending subscription event failed, no memory"

To eliminate it, allow connection shutdown callback function to
be called before connection id is removed in tipc_close_conn(),
which makes the sending thread know the truth in time that its
socket is closed so that it doesn't send message to it. We also
remove the "Sending XXX failed..." error reporting for topology
and config services.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: fix userspace reception on plain L2TP sockets

As pppol2tp_recv() never queues up packets to plain L2TP sockets,
pppol2tp_recvmsg() never returns data to userspace, thus making
the recv*() system calls unusable.

Instead of dropping packets when the L2TP socket isn't bound to a PPP
channel, this patch adds them to its reception queue.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: fix manual sequencing (de)activation in L2TPv2

Commit e0d4435f "l2tp: Update PPP-over-L2TP driver to work over L2TPv3"
broke the PPPOL2TP_SO_SENDSEQ setsockopt. The L2TP header length was
previously computed by pppol2tp_l2t_header_len() before each call to
l2tp_xmit_skb(). Now that header length is retrieved from the hdr_len
session field, this field must be updated every time the L2TP header
format is modified, or l2tp_xmit_skb() won't push the right amount of
data for the L2TP header.

This patch uses l2tp_session_set_header_len() to adjust hdr_len every
time sequencing is (de)activated from userspace (either by the
PPPOL2TP_SO_SENDSEQ setsockopt or the L2TP_ATTR_SEND_SEQ netlink
attribute).

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

hyperv: Move state setting for link query

It moves the state setting for query into rndis_filter_receive_response().
All callbacks including query-complete and status-callback are synchronized
by channel->inbound_lock. This prevents pentential race between them.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: macb: DMA-unmap full rx-buffer

When allocating RX buffers a fixed size is used, while freeing is based
on actually received bytes, resulting in the following kernel warning
when CONFIG_DMA_API_DEBUG is enabled:
WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:1051 check_unmap+0x258/0x894()
macb e000b000.ethernet: DMA-API: device driver frees DMA memory with different size [device address=0x000000002d170040] [map size=1536 bytes] [unmap size=60 bytes]
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.0-rc3-xilinx-00220-g49f84081ce4f #65
[<c001516c>] (unwind_backtrace) from [<c0011df8>] (show_stack+0x10/0x14)
[<c0011df8>] (show_stack) from [<c03c775c>] (dump_stack+0x7c/0xc8)
[<c03c775c>] (dump_stack) from [<c00245cc>] (warn_slowpath_common+0x60/0x84)
[<c00245cc>] (warn_slowpath_common) from [<c0024670>] (warn_slowpath_fmt+0x2c/0x3c)
[<c0024670>] (warn_slowpath_fmt) from [<c0227d44>] (check_unmap+0x258/0x894)
[<c0227d44>] (check_unmap) from [<c0228588>] (debug_dma_unmap_page+0x64/0x70)
[<c0228588>] (debug_dma_unmap_page) from [<c02ab78c>] (gem_rx+0x118/0x170)
[<c02ab78c>] (gem_rx) from [<c02ac4d4>] (macb_poll+0x24/0x94)
[<c02ac4d4>] (macb_poll) from [<c031222c>] (net_rx_action+0x6c/0x188)
[<c031222c>] (net_rx_action) from [<c0028a28>] (__do_softirq+0x108/0x280)
[<c0028a28>] (__do_softirq) from [<c0028e8c>] (irq_exit+0x84/0xf8)
[<c0028e8c>] (irq_exit) from [<c000f360>] (handle_IRQ+0x68/0x8c)
[<c000f360>] (handle_IRQ) from [<c0008528>] (gic_handle_irq+0x3c/0x60)
[<c0008528>] (gic_handle_irq) from [<c0012904>] (__irq_svc+0x44/0x78)
Exception stack(0xc056df20 to 0xc056df68)
df20: 00000001 c0577430 00000000 c0577430 04ce8e0d 00000002 edfce238 00000000
df40: 04e20f78 00000002 c05981f4 00000000 00000008 c056df68 c0064008 c02d7658
df60: 20000013 ffffffff
[<c0012904>] (__irq_svc) from [<c02d7658>] (cpuidle_enter_state+0x54/0xf8)
[<c02d7658>] (cpuidle_enter_state) from [<c02d77dc>] (cpuidle_idle_call+0xe0/0x138)
[<c02d77dc>] (cpuidle_idle_call) from [<c000f660>] (arch_cpu_idle+0x8/0x3c)
[<c000f660>] (arch_cpu_idle) from [<c006bec4>] (cpu_startup_entry+0xbc/0x124)
[<c006bec4>] (cpu_startup_entry) from [<c053daec>] (start_kernel+0x350/0x3b0)
---[ end trace d5fdc38641bd3a11 ]---
Mapped at:
  [<c0227184>] debug_dma_map_page+0x48/0x11c
  [<c02ab32c>] gem_rx_refill+0x154/0x1f8
  [<c02ac7b4>] macb_open+0x270/0x3e0
  [<c03152e0>] __dev_open+0x7c/0xfc
  [<c031554c>] __dev_change_flags+0x8c/0x140

Fixing this by passing the same size which is passed during mapping the
memory to the unmap function as well.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Reviewed-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: macb: Check DMA mappings for error

With CONFIG_DMA_API_DEBUG enabled the following warning is printed:
WARNING: CPU: 0 PID: 619 at lib/dma-debug.c:1101 check_unmap+0x758/0x894()
macb e000b000.ethernet: DMA-API: device driver failed to check map error[device address=0x000000002d171c02] [size=322 bytes] [mapped as single]
Modules linked in:
CPU: 0 PID: 619 Comm: udhcpc Not tainted 3.14.0-rc3-xilinx-00219-gd158fc7f36a2 #63
[<c001516c>] (unwind_backtrace) from [<c0011df8>] (show_stack+0x10/0x14)
[<c0011df8>] (show_stack) from [<c03c7714>] (dump_stack+0x7c/0xc8)
[<c03c7714>] (dump_stack) from [<c00245cc>] (warn_slowpath_common+0x60/0x84)
[<c00245cc>] (warn_slowpath_common) from [<c0024670>] (warn_slowpath_fmt+0x2c/0x3c)
[<c0024670>] (warn_slowpath_fmt) from [<c0228244>] (check_unmap+0x758/0x894)
[<c0228244>] (check_unmap) from [<c0228588>] (debug_dma_unmap_page+0x64/0x70)
[<c0228588>] (debug_dma_unmap_page) from [<c02aba64>] (macb_interrupt+0x1f8/0x2dc)
[<c02aba64>] (macb_interrupt) from [<c006c6e4>] (handle_irq_event_percpu+0x2c/0x178)
[<c006c6e4>] (handle_irq_event_percpu) from [<c006c86c>] (handle_irq_event+0x3c/0x5c)
[<c006c86c>] (handle_irq_event) from [<c006f548>] (handle_fasteoi_irq+0xb8/0x100)
[<c006f548>] (handle_fasteoi_irq) from [<c006c148>] (generic_handle_irq+0x20/0x30)
[<c006c148>] (generic_handle_irq) from [<c000f35c>] (handle_IRQ+0x64/0x8c)
[<c000f35c>] (handle_IRQ) from [<c0008528>] (gic_handle_irq+0x3c/0x60)
[<c0008528>] (gic_handle_irq) from [<c0012904>] (__irq_svc+0x44/0x78)
Exception stack(0xed197f60 to 0xed197fa8)
7f60: 00000134 60000013 bd94362e bd94362e be96b37c 00000014 fffffd72 00000122
7f80: c000ebe4 ed196000 00000000 00000011 c032c0d8 ed197fa8 c0064008 c000ea20
7fa0: 60000013 ffffffff
[<c0012904>] (__irq_svc) from [<c000ea20>] (ret_fast_syscall+0x0/0x48)
---[ end trace 478f921d0d542d1e ]---
Mapped at:
  [<c0227184>] debug_dma_map_page+0x48/0x11c
  [<c02aaca0>] macb_start_xmit+0x184/0x2a8
  [<c03143c0>] dev_hard_start_xmit+0x334/0x470
  [<c032c09c>] sch_direct_xmit+0x78/0x2f8
  [<c0314814>] __dev_queue_xmit+0x318/0x708

due to missing checks of the dma mapping. Add the appropriate checks to fix
this.

Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Reviewed-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sctp: fix skb leakage in COOKIE ECHO path of chunk->auth_chunk

While working on ec0223ec48a9 ("net: sctp: fix sctp_sf_do_5_1D_ce to
verify if we/peer is AUTH capable"), we noticed that there's a skb
memory leakage in the error path.

Running the same reproducer as in ec0223ec48a9 and by unconditionally
jumping to the error label (to simulate an error condition) in
sctp_sf_do_5_1D_ce() receive path lets kmemleak detector bark about
the unfreed chunk->auth_chunk skb clone:

Unreferenced object 0xffff8800b8f3a000 (size 256):
  comm "softirq", pid 0, jiffies 4294769856 (age 110.757s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    89 ab 75 5e d4 01 58 13 00 00 00 00 00 00 00 00  ..u^..X.........
  backtrace:
    [<ffffffff816660be>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff8119f328>] kmem_cache_alloc+0xc8/0x210
    [<ffffffff81566929>] skb_clone+0x49/0xb0
    [<ffffffffa0467459>] sctp_endpoint_bh_rcv+0x1d9/0x230 [sctp]
    [<ffffffffa046fdbc>] sctp_inq_push+0x4c/0x70 [sctp]
    [<ffffffffa047e8de>] sctp_rcv+0x82e/0x9a0 [sctp]
    [<ffffffff815abd38>] ip_local_deliver_finish+0xa8/0x210
    [<ffffffff815a64af>] nf_reinject+0xbf/0x180
    [<ffffffffa04b4762>] nfqnl_recv_verdict+0x1d2/0x2b0 [nfnetlink_queue]
    [<ffffffffa04aa40b>] nfnetlink_rcv_msg+0x14b/0x250 [nfnetlink]
    [<ffffffff815a3269>] netlink_rcv_skb+0xa9/0xc0
    [<ffffffffa04aa7cf>] nfnetlink_rcv+0x23f/0x408 [nfnetlink]
    [<ffffffff815a2bd8>] netlink_unicast+0x168/0x250
    [<ffffffff815a2fa1>] netlink_sendmsg+0x2e1/0x3f0
    [<ffffffff8155cc6b>] sock_sendmsg+0x8b/0xc0
    [<ffffffff8155d449>] ___sys_sendmsg+0x369/0x380

What happens is that commit bbd0d59809f9 clones the skb containing
the AUTH chunk in sctp_endpoint_bh_rcv() when having the edge case
that an endpoint requires COOKIE-ECHO chunks to be authenticated:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  ------------------ AUTH; COOKIE-ECHO ---------------->
  <-------------------- COOKIE-ACK ---------------------

When we enter sctp_sf_do_5_1D_ce() and before we actually get to
the point where we process (and subsequently free) a non-NULL
chunk->auth_chunk, we could hit the "goto nomem_init" path from
an error condition and thus leave the cloned skb around w/o
freeing it.

The fix is to centrally free such clones in sctp_chunk_destroy()
handler that is invoked from sctp_chunk_free() after all refs have
dropped; and also move both kfree_skb(chunk->auth_chunk) there,
so that chunk->auth_chunk is either NULL (since sctp_chunkify()
allocs new chunks through kmem_cache_zalloc()) or non-NULL with
a valid skb pointer. chunk->skb and chunk->auth_chunk are the
only skbs in the sctp_chunk structure that need to be handeled.

While at it, we should use consume_skb() for both. It is the same
as dev_kfree_skb() but more appropriately named as we are not
a device but a protocol. Also, this effectively replaces the
kfree_skb() from both invocations into consume_skb(). Functions
are the same only that kfree_skb() assumes that the frame was
being dropped after a failure (e.g. for tools like drop monitor),
usage of consume_skb() seems more appropriate in function
sctp_chunk_destroy() though.

Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <yasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: disable the ECM mode

There are known issues for switching the drivers between ECM mode and
vendor mode. The interrup transfer may become abnormal. The hardware
may have the opportunity to die if you change the configuration without
unloading the current driver first, because all the control transfers
of the current driver would fail after the command of switching the
configuration.

Although to use the ecm driver and vendor driver independently is fine,
it may have problems to change the driver from one to the other by
switching the configuration. Additionally, now the vendor mode driver
is more powerful than the ECM driver. Thus, disable the ECM mode driver,
and let r8152 to set the configuration to vendor mode and reset the
device automatically.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx4: Support shutdown() interface

In kexec scenario, we failed to load the mlx4 driver in the
second kernel because the ownership bit was hold by the first
kernel without release correctly.

The patch adds shutdown() interface so that the ownership can
be released correctly in the first kernel. It also helps avoiding
EEH error happened during boot stage of the second kernel because
of undesired traffic, which can't be handled by hardware during
that stage on Power platform.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Tested-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: multicast: add sanity check for query source addresses

MLD queries are supposed to have an IPv6 link-local source address
according to RFC2710, section 4 and RFC3810, section 5.1.14. This patch
adds a sanity check to ignore such broken MLD queries.

Without this check, such malformed MLD queries can result in a
denial of service: The queries are ignored by any MLD listener
therefore they will not respond with an MLD report. However,
without this patch these malformed MLD queries would enable the
snooping part in the bridge code, potentially shutting down the
according ports towards these hosts for multicast traffic as the
bridge did not learn about these listeners.

Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fix for a race condition in the inet frag code

I stumbled upon this very serious bug while hunting for another one,
it's a very subtle race condition between inet_frag_evictor,
inet_frag_intern and the IPv4/6 frag_queue and expire functions
(basically the users of inet_frag_kill/inet_frag_put).

What happens is that after a fragment has been added to the hash chain
but before it's been added to the lru_list (inet_frag_lru_add) in
inet_frag_intern, it may get deleted (either by an expired timer if
the system load is high or the timer sufficiently low, or by the
fraq_queue function for different reasons) before it's added to the
lru_list, then after it gets added it's a matter of time for the
evictor to get to a piece of memory which has been freed leading to a
number of different bugs depending on what's left there.

I've been able to trigger this on both IPv4 and IPv6 (which is normal
as the frag code is the same), but it's been much more difficult to
trigger on IPv4 due to the protocol differences about how fragments
are treated.

The setup I used to reproduce this is: 2 machines with 4 x 10G bonded
in a RR bond, so the same flow can be seen on multiple cards at the
same time. Then I used multiple instances of ping/ping6 to generate
fragmented packets and flood the machines with them while running
other processes to load the attacked machine.

*It is very important to have the _same flow_ coming in on multiple CPUs
concurrently. Usually the attacked machine would die in less than 30
minutes, if configured properly to have many evictor calls and timeouts
it could happen in 10 minutes or so.

An important point to make is that any caller (frag_queue or timer) of
inet_frag_kill will remove both the timer refcount and the
original/guarding refcount thus removing everything that's keeping the
frag from being freed at the next inet_frag_put. All of this could
happen before the frag was ever added to the LRU list, then it gets
added and the evictor uses a freed fragment.

An example for IPv6 would be if a fragment is being added and is at
the stage of being inserted in the hash after the hash lock is
released, but before inet_frag_lru_add executes (or is able to obtain
the lru lock) another overlapping fragment for the same flow arrives
at a different CPU which finds it in the hash, but since it's
overlapping it drops it invoking inet_frag_kill and thus removing all
guarding refcounts, and afterwards freeing it by invoking
inet_frag_put which removes the last refcount added previously by
inet_frag_find, then inet_frag_lru_add gets executed by
inet_frag_intern and we have a freed fragment in the lru_list.

The fix is simple, just move the lru_add under the hash chain locked
region so when a removing function is called it'll have to wait for
the fragment to be added to the lru_list, and then it'll remove it (it
works because the hash chain removal is done before the lru_list one
and there's no window between the two list adds when the frag can get
dropped). With this fix applied I couldn't kill the same machine in 24
hours with the same setup.

Fixes: 3ef0eb0db4bf ("net: frag, move LRU list maintenance outside of
rwlock")

CC: Florian Westphal <fw@strlen.de>
CC: Jesper Dangaard Brouer <brouer@redhat.com>
CC: David S. Miller <davem@davemloft.net>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix memory leak in ieee80211_prep_connection(), sta_info leaked on
    error.  From Eytan Lifshitz.

2) Unintentional switch case fallthrough in nft_reject_inet_eval(),
    from Patrick McHardy.

3) Must check if payload lenth is a power of 2 in
    nft_payload_select_ops(), from Nikolay Aleksandrov.

4) Fix mis-checksumming in xen-netfront driver, ip_hdr() is not in the
    correct place when we invoke skb_checksum_setup().  From Wei Liu.

5) TUN driver should not advertise HW vlan offload features in
    vlan_features.  Fix from Fernando Luis Vazquez Cao.

6) IPV6_VTI needs to select NET_IPV_TUNNEL to avoid build errors, fix
    from Steffen Klassert.

7) Add missing locking in xfrm_migrade_state_find(), we must hold the
    per-namespace xfrm_state_lock while traversing the lists.  Fix from
    Steffen Klassert.

8) Missing locking in ath9k driver, access to tid->sched must be done
    under ath_txq_lock().  Fix from Stanislaw Gruszka.

9) Fix two bugs in TCP fastopen.  First respect the size argument given
    to tcp_sendmsg() in the fastopen path, and secondly prevent
    tcp_send_syn_data() from potentially using order-5 allocations.
    From Eric Dumazet.

10) Fix handling of default neigh garbage collection params, from Jiri
    Pirko.

11) Fix cwnd bloat and over-inflation of RTT when transmit segmentation
    is in use.  From Eric Dumazet.

12) Missing initialization of Realtek r8169 driver's statistics
    seqlocks.  Fix from Kyle McMartin.

13) Fix RTNL assertion failures in 802.3ad and AB ARP monitor of bonding
    driver, from Ding Tianhong.

14) Bonding slave release race can cause divide by zero, fix from
    Nikolay Aleksandrov.

15) Overzealous return from neigh_periodic_work() causes reachability
    time to not be computed.  Fix from Duain Jiong.

16) Fix regression in ipv6_find_hdr(), it should not return -ENOENT when
    a specific target is specified and found.  From Hans Schillstrom.

17) Fix VLAN tag stripping regression in BNA driver, from Ivan Vecera.

18) Tail loss probe can calculate bogus RTTs due to missing packet
    marking on retransmit.  Fix from Yuchung Cheng.

19) We cannot do skb_dst_drop() in iptunnel_pull_header() because
    multicast loopback detection in later code paths need access to
    skb_rtable().  Fix from Xin Long.

20) The macvlan driver regresses in that it propagates lower device
    offload support disables into itself, causing severe slowdowns when
    running over a bridge.  Provide the software offloads always on
    macvlan devices to deal with this and the regression is gone.  From
    Vlad Yasevich.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits)
  macvlan: Add support for 'always_on' offload features
  net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable
  ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer
  net: cpsw: fix cpdma rx descriptor leak on down interface
  be2net: isolate TX workarounds not applicable to Skyhawk-R
  be2net: Fix skb double free in be_xmit_wrokarounds() failure path
  be2net: clear promiscuous bits in adapter->flags while disabling promiscuous mode
  be2net: Fix to reset transparent vlan tagging
  qlcnic: dcb: a couple off by one bugs
  tcp: fix bogus RTT on special retransmission
  hsr: off by one sanity check in hsr_register_frame_in()
  can: remove CAN FD compatibility for CAN 2.0 sockets
  can: flexcan: factor out soft reset into seperate funtion
  can: flexcan: flexcan_remove(): add missing netif_napi_del()
  can: flexcan: fix transition from and to freeze mode in chip_{,un}freeze
  can: flexcan: factor out transceiver {en,dis}able into seperate functions
  can: flexcan: fix transition from and to low power mode in chip_{en,dis}able
  can: flexcan: flexcan_open(): fix error path if flexcan_chip_start() fails
  can: flexcan: fix shutdown: first disable chip, then all interrupts
  USB AX88179/178A: Support D-Link DUB-1312
  ...

Merge tag 'regulator-v3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
"A couple of fixes here which ensure that regulators using the core
  support for GPIO enables work in all cases by ensuring that helpers
  are used consistently rather than open coding in places and hence not
  having GPIO support in some of them"

* tag 'regulator-v3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: core: Replace direct ops->disable usage
  regulator: core: Replace direct ops->enable usage

Merge branch 'akpm' (patches from Andrew Morton)

Merge misc fixes from Andrew Morton.

* emailed patches from Andrew Morton akpm@linux-foundation.org>:
  mm: page_alloc: exempt GFP_THISNODE allocations from zone fairness
  mm: numa: bugfix for LAST_CPUPID_NOT_IN_PAGE_FLAGS
  MAINTAINERS: add and correct types of some "T:" entries
  MAINTAINERS: use tab for separator
  rapidio/tsi721: fix tasklet termination in dma channel release
  hfsplus: fix remount issue
  zram: avoid null access when fail to alloc meta
  sh: prefix sh-specific "CCR" and "CCR2" by "SH_"
  ocfs2: fix quota file corruption
  drivers/rtc/rtc-s3c.c: fix incorrect way of save/restore of S3C2410_TICNT for TYPE_S3C64XX
  kallsyms: fix absolute addresses for kASLR
  scripts/gen_initramfs_list.sh: fix flags for initramfs LZ4 compression
  mm: include VM_MIXEDMAP flag in the VM_SPECIAL list to avoid m(un)locking
  memcg: reparent charges of children before processing parent
  memcg: fix endless loop in __mem_cgroup_iter_next()
  lib/radix-tree.c: swapoff tmpfs radix_tree: remember to rcu_read_unlock
  dma debug: account for cachelines and read-only mappings in overlap tracking
  mm: close PageTail race
  MAINTAINERS: EDAC: add Mauro and Borislav as interim patch collectors

mm: page_alloc: exempt GFP_THISNODE allocations from zone fairness

Jan Stancek reports manual page migration encountering allocation
failures after some pages when there is still plenty of memory free, and
bisected the problem down to commit 81c0a2bb515f ("mm: page_alloc: fair
zone allocator policy").

The problem is that GFP_THISNODE obeys the zone fairness allocation
batches on one hand, but doesn't reset them and wake kswapd on the other
hand. After a few of those allocations, the batches are exhausted and
the allocations fail.

Fixing this means either having GFP_THISNODE wake up kswapd, or
GFP_THISNODE not participating in zone fairness at all. The latter
seems safer as an acute bugfix, we can clean up later.

Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@kernel.org> [3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: numa: bugfix for LAST_CPUPID_NOT_IN_PAGE_FLAGS

When doing some numa tests on powerpc, I triggered an oops bug.  I find
it is caused by using page->_last_cpupid.  It should be initialized as
"-1 & LAST_CPUPID_MASK", but not "-1".  Otherwise, in task_numa_fault(),
we will miss the checking (last_cpupid == (-1 & LAST_CPUPID_MASK)).  And
finally cause an oops bug in task_numa_group(), since the online cpu is
less than possible cpu.  This happen with CONFIG_SPARSE_VMEMMAP disabled

Call trace:

  SMP NR_CPUS=64 NUMA PowerNV
  Modules linked in:
  CPU: 24 PID: 804 Comm: systemd-udevd Not tainted3.13.0-rc1+ #32
  task: c000001e2746aa80 ti: c000001e32c50000 task.ti:c000001e32c50000
  REGS: c000001e32c53510 TRAP: 0300   Not tainted(3.13.0-rc1+)
  MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR:28024424  XER: 20000000
  CFAR: c000000000009324 DAR: 7265717569726857 DSISR:40000000 SOFTE: 1
  NIP  .task_numa_fault+0x1470/0x2370
  LR  .task_numa_fault+0x1468/0x2370
  Call Trace:
   .task_numa_fault+0x1468/0x2370 (unreliable)
   .do_numa_page+0x480/0x4a0
   .handle_mm_fault+0x4ec/0xc90
   .do_page_fault+0x3a8/0x890
   handle_page_fault+0x10/0x30
  Instruction dump:
  3c82fefb 3884b138 48d9cff1 60000000 48000574 3c62fefb3863af78 3c82fefb
  3884b138 48d9cfd5 60000000 e93f0100 <812902e4> 7d2907b45529063e 7d2a07b4
  ---[ end trace 15f2510da5ae07cf ]---

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: add and correct types of some "T:" entries

Tree location entries should start with the appropriate type.

Add git to some, hg to another.

Neaten tree type description.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: use tab for separator

Convert whitespace to single tab for separators.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rapidio/tsi721: fix tasklet termination in dma channel release

This patch is a modification of the patch originally proposed by
Xiaotian Feng <xtfeng@gmail.com>: https://lkml.org/lkml/2012/11/5/413
This new version disables DMA channel interrupts and ensures that the
tasklet wil not be scheduled again before calling tasklet_kill().

Unfortunately the updated patch was not released at that time due to
planned rework of Tsi721 mport driver to use threaded interrupts (which
has yet to happen).  Recently the issue was reported again:
https://lkml.org/lkml/2014/2/19/762.

Description from the original Xiaotian's patch:

"Some drivers use tasklet_disable in device remove/release process,
  tasklet_disable will inc tasklet->count and return.  If the tasklet is
  not handled yet under some softirq pressure, the tasklet will be
  placed on the tasklet_vec, never have a chance to be excuted.  This
  might lead to a heavy loaded ksoftirqd, wakeup with pending_softirq,
  but tasklet is disabled.  tasklet_kill should be used in this case."

This patch is applicable to kernel versions starting from v3.5.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Xiaotian Feng <xtfeng@gmail.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Galbraith <bitbucket@online.de>
Cc: <stable@vger.kernel.org> [3.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

hfsplus: fix remount issue

Current implementation of HFS+ driver has small issue with remount
option.  Namely, for example, you are unable to remount from RO mode
into RW mode by means of command "mount -o remount,rw /dev/loop0
/mnt/hfsplus".  Trying to execute sequence of commands results in an
error message:

  mount /dev/loop0 /mnt/hfsplus
  mount -o remount,ro /dev/loop0 /mnt/hfsplus
  mount -o remount,rw /dev/loop0 /mnt/hfsplus

  mount: you must specify the filesystem type

  mount -t hfsplus -o remount,rw /dev/loop0 /mnt/hfsplus

  mount: /mnt/hfsplus not mounted or bad option

The reason of such issue is failure of mount syscall:

  mount("/dev/loop0", "/mnt/hfsplus", 0x2282a60, MS_MGC_VAL|MS_REMOUNT, NULL) = -1 EINVAL (Invalid argument)

Namely, hfsplus_parse_options_remount() method receives empty "input"
argument and return false in such case.  As a result, hfsplus_remount()
returns -EINVAL error code.

This patch fixes the issue by means of return true for the case of empty
"input" argument in hfsplus_parse_options_remount() method.

Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

zram: avoid null access when fail to alloc meta

zram_meta_alloc could fail so caller should check it. Otherwise, your
system will hang.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sh: prefix sh-specific "CCR" and "CCR2" by "SH_"

Commit bcf24e1daa94 ("mmc: omap_hsmmc: use the generic config for
omap2plus devices"), enabled the build for other platforms for compile
testing.

sh-allmodconfig now fails with:

    include/linux/omap-dma.h:171:8: error: expected identifier before numeric constant
    make[4]: *** [drivers/mmc/host/omap_hsmmc.o] Error 1

This happens because SuperH #defines "CCR", which is one of the enum
values in include/linux/omap-dma.h.  There's a similar issue with "CCR2"
on sh2a.

As "CCR" and "CCR2" are too generic names for global #defines, prefix
them with "SH_" to fix this.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: fix quota file corruption

Global quota files are accessed from different nodes. Thus we cannot
cache offset of quota structure in the quota file after we drop our node
reference count to it because after that moment quota structure may be
freed and reallocated elsewhere by a different node resulting in
corruption of quota file.

Fix the problem by clearing dq_off when we are releasing dquot structure.
We also remove the DB_READ_B handling because it is useless -
DQ_ACTIVE_B is set iff DQ_READ_B is set.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/rtc/rtc-s3c.c: fix incorrect way of save/restore of S3C2410_TICNT for TYPE_S3C64XX

On exynos5250, exynos5420 and exynos5260 it was observed that, after 1
cycle of S2R, the rtc-tick occurs at a very fast rate as compared to the
rtc-tick occuring before S2R.

This patch fixes the above issue by correcting the wrong way of
save/restore of S3C2410_TICNT for TYPE_S3C64XX.

Signed-off-by: Vikas Sajjan <vikas.sajjan@samsung.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kallsyms: fix absolute addresses for kASLR

Currently symbols that are absolute addresses are incorrectly displayed
in /proc/kallsyms if the kernel is loaded with kASLR.

The problem was that the scripts/kallsyms.c file which generates the
array of symbol names and addresses uses an relocatable value for all
symbols, even absolute symbols.  This patch fixes that.

Several kallsyms output in different boot states for comparison:

  $ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.nokaslr
  0000000000000000 D __per_cpu_start
  0000000000014280 D __per_cpu_end
  ffffffff810001c8 T _stext
  $ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.kaslr1
  000000001f200000 D __per_cpu_start
  000000001f214280 D __per_cpu_end
  ffffffffa02001c8 T _stext
  $ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.kaslr2
  000000000d400000 D __per_cpu_start
  000000000d414280 D __per_cpu_end
  ffffffff8e4001c8 T _stext
  $ egrep '_(stext|_per_cpu_(start|end))' /root/kallsyms.kaslr-fixed
  0000000000000000 D __per_cpu_start
  0000000000014280 D __per_cpu_end
  ffffffffadc001c8 T _stext

Signed-off-by: Andy Honig <ahonig@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

scripts/gen_initramfs_list.sh: fix flags for initramfs LZ4 compression

LZ4 as implemented in the kernel differs from the default method now
used by the reference implementation of LZ4.  Until the in-kernel method
is updated to support the new default, passing the legacy flag (-l) to
the compressor is necessary.  Without this flag the kernel-generated,
LZ4-compressed initramfs is junk.

Kyungsik said:

: It seems that lz4 supports legacy format with the same option as lz4c
: does.  Just looking at the first few bytes of lz4 compressed image, we can
: see whether it is new format or not.
:
: It shows new format magic number without this patch.  New format magic
: number is 0x184d2204.
:
: $ hexdump -C ./initramfs_data.cpio.lz4 |more
: 00000000  04 22 4d 18 64 70 b9 69 (Little Endian)
: ...
:
: Currently kernel supports legacy format only.

Signed-off-by: Daniel M. Weeks <dan@danweeks.net>
Cc: Michal Marek <mmarek@suse.cz>
Acked-by: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: include VM_MIXEDMAP flag in the VM_SPECIAL list to avoid m(un)locking

Daniel Borkmann reported a VM_BUG_ON assertion failing:

  ------------[ cut here ]------------
  kernel BUG at mm/mlock.c:528!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: ccm arc4 iwldvm [...]
   video
  CPU: 3 PID: 2266 Comm: netsniff-ng Not tainted 3.14.0-rc2+ #8
  Hardware name: LENOVO 2429BP3/2429BP3, BIOS G4ET37WW (1.12 ) 05/29/2012
  task: ffff8801f87f9820 ti: ffff88002cb44000 task.ti: ffff88002cb44000
  RIP: 0010:[<ffffffff81171ad0>]  [<ffffffff81171ad0>] munlock_vma_pages_range+0x2e0/0x2f0
  Call Trace:
    do_munmap+0x18f/0x3b0
    vm_munmap+0x41/0x60
    SyS_munmap+0x22/0x30
    system_call_fastpath+0x1a/0x1f
  RIP   munlock_vma_pages_range+0x2e0/0x2f0
  ---[ end trace a0088dcf07ae10f2 ]---

because munlock_vma_pages_range() thinks it's unexpectedly in the middle
of a THP page.  This can be reproduced with default config since 3.11
kernels.  A reproducer can be found in the kernel's selftest directory
for networking by running ./psock_tpacket.

The problem is that an order=2 compound page (allocated by
alloc_one_pg_vec_page() is part of the munlocked VM_MIXEDMAP vma (mapped
by packet_mmap()) and mistaken for a THP page and assumed to be order=9.

The checks for THP in munlock came with commit ff6a6da60b89 ("mm:
accelerate munlock() treatment of THP pages"), i.e.  since 3.9, but did
not trigger a bug.  It just makes munlock_vma_pages_range() skip such
compound pages until the next 512-pages-aligned page, when it encounters
a head page.  This is however not a problem for vma's where mlocking has
no effect anyway, but it can distort the accounting.

Since commit 7225522bb429 ("mm: munlock: batch non-THP page isolation
and munlock+putback using pagevec") this can trigger a VM_BUG_ON in
PageTransHuge() check.

This patch fixes the issue by adding VM_MIXEDMAP flag to VM_SPECIAL, a
list of flags that make vma's non-mlockable and non-mergeable.  The
reasoning is that VM_MIXEDMAP vma's are similar to VM_PFNMAP, which is
already on the VM_SPECIAL list, and both are intended for non-LRU pages
where mlocking makes no sense anyway.  Related Lkml discussion can be
found in [2].

[1] tools/testing/selftests/net/psock_tpacket
[2] https://lkml.org/lkml/2014/1/10/427

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Reported-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: John David Anglin <dave.anglin@bell.net>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Jared Hulbert <jaredeh@gmail.com>
Tested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [3.11.x+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: reparent charges of children before processing parent

Sometimes the cleanup after memcg hierarchy testing gets stuck in
mem_cgroup_reparent_charges(), unable to bring non-kmem usage down to 0.

There may turn out to be several causes, but a major cause is this: the
workitem to offline parent can get run before workitem to offline child;
parent's mem_cgroup_reparent_charges() circles around waiting for the
child's pages to be reparented to its lrus, but it's holding
cgroup_mutex which prevents the child from reaching its
mem_cgroup_reparent_charges().

Further testing showed that an ordered workqueue for cgroup_destroy_wq
is not always good enough: percpu_ref_kill_and_confirm's call_rcu_sched
stage on the way can mess up the order before reaching the workqueue.

Instead, when offlining a memcg, call mem_cgroup_reparent_charges() on
all its children (and grandchildren, in the correct order) to have their
charges reparented first.

Fixes: e5fca243abae ("cgroup: use a dedicated workqueue for cgroup destruction")
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org> [v3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: fix endless loop in __mem_cgroup_iter_next()

Commit 0eef615665ed ("memcg: fix css reference leak and endless loop in
mem_cgroup_iter") got the interaction with the commit a few before it
d8ad30559715 ("mm/memcg: iteration skip memcgs not yet fully
initialized") slightly wrong, and we didn't notice at the time.

It's elusive, and harder to get than the original, but for a couple of
days before rc1, I several times saw a endless loop similar to that
supposedly being fixed.

This time it was a tighter loop in __mem_cgroup_iter_next(): because we
can get here when our root has already been offlined, and the ordering
of conditions was such that we then just cycled around forever.

Fixes: 0eef615665ed ("memcg: fix css reference leak and endless loop in mem_cgroup_iter").
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: <stable@vger.kernel.org> [3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib/radix-tree.c: swapoff tmpfs radix_tree: remember to rcu_read_unlock

Running fsx on tmpfs with concurrent memhog-swapoff-swapon, lots of

  BUG: sleeping function called from invalid context at kernel/fork.c:606
  in_atomic(): 0, irqs_disabled(): 0, pid: 1394, name: swapoff
  1 lock held by swapoff/1394:
   #0:  (rcu_read_lock){.+.+.+}, at: [<ffffffff812520a1>] radix_tree_locate_item+0x1f/0x2b6

followed by

  ================================================
  [ BUG: lock held when returning to user space! ]
  3.14.0-rc1 #3 Not tainted
  ------------------------------------------------
  swapoff/1394 is leaving the kernel with locks still held!
  1 lock held by swapoff/1394:
   #0:  (rcu_read_lock){.+.+.+}, at: [<ffffffff812520a1>] radix_tree_locate_item+0x1f/0x2b6

after which the system recovered nicely.

Whoops, I long ago forgot the rcu_read_unlock() on one unlikely branch.

Fixes e504f3fdd63d ("tmpfs radix_tree: locate_item to speed up swapoff")

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dma debug: account for cachelines and read-only mappings in overlap tracking

While debug_dma_assert_idle() checks if a given *page* is actively
undergoing dma the valid granularity of a dma mapping is a *cacheline*.
Sander's testing shows that the warning message "DMA-API: exceeded 7
overlapping mappings of pfn..." is falsely triggering.  The test is
simply mapping multiple cachelines in a given page.

Ultimately we want overlap tracking to be valid as it is a real api
violation, so we need to track active mappings by cachelines.  Update
the active dma tracking to use the page-frame-relative cacheline of the
mapping as the key, and update debug_dma_assert_idle() to check for all
possible mapped cachelines for a given page.

However, the need to track active mappings is only relevant when the
dma-mapping is writable by the device.  In fact it is fairly standard
for read-only mappings to have hundreds or thousands of overlapping
mappings at once.  Limiting the overlap tracking to writable
(!DMA_TO_DEVICE) eliminates this class of false-positive overlap
reports.

Note, the radix gang lookup is sub-optimal.  It would be best if it
stopped fetching entries once the search passed a page boundary.
Nevertheless, this implementation does not perturb the original net_dma
failing case.  That is to say the extra overhead does not show up in
terms of making the failing case pass due to a timing change.

References:
  http://marc.info/?l=linux-netdev&m=139232263419315&w=2
  http://marc.info/?l=linux-netdev&m=139217088107122&w=2

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Dave Jones <davej@redhat.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: close PageTail race

Commit bf6bddf1924e ("mm: introduce compaction and migration for
ballooned pages") introduces page_count(page) into memory compaction
which dereferences page->first_page if PageTail(page).

This results in a very rare NULL pointer dereference on the
aforementioned page_count(page).  Indeed, anything that does
compound_head(), including page_count() is susceptible to racing with
prep_compound_page() and seeing a NULL or dangling page->first_page
pointer.

This patch uses Andrea's implementation of compound_trans_head() that
deals with such a race and makes it the default compound_head()
implementation.  This includes a read memory barrier that ensures that
if PageTail(head) is true that we return a head page that is neither
NULL nor dangling.  The patch then adds a store memory barrier to
prep_compound_page() to ensure page->first_page is set.

This is the safest way to ensure we see the head page that we are
expecting, PageTail(page) is already in the unlikely() path and the
memory barriers are unfortunately required.

Hugetlbfs is the exception, we don't enforce a store memory barrier
during init since no race is possible.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Holger Kiehl <Holger.Kiehl@dwd.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: EDAC: add Mauro and Borislav as interim patch collectors

We're more or less collecting EDAC patches already anyway so let's hold it
down so that get_maintainer sees it too.

Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Cc: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

macvlan: Add support for 'always_on' offload features

Macvlan currently inherits all of its features from the lower
device.  When lower device disables offload support, this causes
macvlan to disable offload support as well.  This causes
performance regression when using macvlan/macvtap in bridge
mode.

It can be easily demonstrated by creating 2 namespaces using
macvlan in bridge mode and running netperf between them:

MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

87380  16384  16384    20.00    1204.61

To restore the performance, we add software offload features
to the list of "always_on" features for macvlan.  This way
when a namespace or a guest using macvtap initially sends a
packet, this packet will not be segmented at macvlan level.
It will only be segmented when macvlan sends the packet
to the lower device.

MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

87380  16384  16384    20.00    5507.35

Fixes: 6acf54f1cf0a6747bac9fea26f34cfc5a9029523 (macvtap: Add support of packet capture on macvtap device.)
Fixes: 797f87f83b60685ff8a13fa0572d2f10393c50d3 (macvlan: fix netdev feature propagation from lower device)
CC: Florian Westphal <fw@strlen.de>
CC: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Jason Wang <jasowang@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless

John W. Linville says:

====================
Please pull this batch of fixes intended for the 3.14 stream...

For the mac80211 bits, Johannes says:

"This time I have a fix to get out of an 'infinite error state' in case
regulatory domain updates failed and two fixes for VHT associations: one
to not disconnect immediately when the AP uses more bandwidth than the
new regdomain would allow after a change due to association country
information getting used, and one for an issue in the code where
mac80211 doesn't correctly ignore a reserved field and then uses an HT
instead of VHT association."

For the iwlwifi bits, Emmanuel says:

"Johannes fixes a long standing bug in the AMPDU status reporting.
Max fixes the listen time which was way too long and causes trouble
to several APs."

Along with those, Bing Zhao marks the mwifiex_usb driver as _not_
supporting USB autosuspend after a number of problems with that have
been reported.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable

RFC4895 introduced AUTH chunks for SCTP; during the SCTP
handshake RANDOM; CHUNKS; HMAC-ALGO are negotiated (CHUNKS
being optional though):

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------

A special case is when an endpoint requires COOKIE-ECHO
chunks to be authenticated:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  ------------------ AUTH; COOKIE-ECHO ---------------->
  <-------------------- COOKIE-ACK ---------------------

RFC4895, section 6.3. Receiving Authenticated Chunks says:

  The receiver MUST use the HMAC algorithm indicated in
  the HMAC Identifier field. If this algorithm was not
  specified by the receiver in the HMAC-ALGO parameter in
  the INIT or INIT-ACK chunk during association setup, the
  AUTH chunk and all the chunks after it MUST be discarded
  and an ERROR chunk SHOULD be sent with the error cause
  defined in Section 4.1. [...] If no endpoint pair shared
  key has been configured for that Shared Key Identifier,
  all authenticated chunks MUST be silently discarded. [...]

  When an endpoint requires COOKIE-ECHO chunks to be
  authenticated, some special procedures have to be followed
  because the reception of a COOKIE-ECHO chunk might result
  in the creation of an SCTP association. If a packet arrives
  containing an AUTH chunk as a first chunk, a COOKIE-ECHO
  chunk as the second chunk, and possibly more chunks after
  them, and the receiver does not have an STCB for that
  packet, then authentication is based on the contents of
  the COOKIE-ECHO chunk. In this situation, the receiver MUST
  authenticate the chunks in the packet by using the RANDOM
  parameters, CHUNKS parameters and HMAC_ALGO parameters
  obtained from the COOKIE-ECHO chunk, and possibly a local
  shared secret as inputs to the authentication procedure
  specified in Section 6.3. If authentication fails, then
  the packet is discarded. If the authentication is successful,
  the COOKIE-ECHO and all the chunks after the COOKIE-ECHO
  MUST be processed. If the receiver has an STCB, it MUST
  process the AUTH chunk as described above using the STCB
  from the existing association to authenticate the
  COOKIE-ECHO chunk and all the chunks after it. [...]

Commit bbd0d59809f9 introduced the possibility to receive
and verification of AUTH chunk, including the edge case for
authenticated COOKIE-ECHO. On reception of COOKIE-ECHO,
the function sctp_sf_do_5_1D_ce() handles processing,
unpacks and creates a new association if it passed sanity
checks and also tests for authentication chunks being
present. After a new association has been processed, it
invokes sctp_process_init() on the new association and
walks through the parameter list it received from the INIT
chunk. It checks SCTP_PARAM_RANDOM, SCTP_PARAM_HMAC_ALGO
and SCTP_PARAM_CHUNKS, and copies them into asoc->peer
meta data (peer_random, peer_hmacs, peer_chunks) in case
sysctl -w net.sctp.auth_enable=1 is set. If in INIT's
SCTP_PARAM_SUPPORTED_EXT parameter SCTP_CID_AUTH is set,
peer_random != NULL and peer_hmacs != NULL the peer is to be
assumed asoc->peer.auth_capable=1, in any other case
asoc->peer.auth_capable=0.

Now, if in sctp_sf_do_5_1D_ce() chunk->auth_chunk is
available, we set up a fake auth chunk and pass that on to
sctp_sf_authenticate(), which at latest in
sctp_auth_calculate_hmac() reliably dereferences a NULL pointer
at position 0..0008 when setting up the crypto key in
crypto_hash_setkey() by using asoc->asoc_shared_key that is
NULL as condition key_id == asoc->active_key_id is true if
the AUTH chunk was injected correctly from remote. This
happens no matter what net.sctp.auth_enable sysctl says.

The fix is to check for net->sctp.auth_enable and for
asoc->peer.auth_capable before doing any operations like
sctp_sf_authenticate() as no key is activated in
sctp_auth_asoc_init_active_key() for each case.

Now as RFC4895 section 6.3 states that if the used HMAC-ALGO
passed from the INIT chunk was not used in the AUTH chunk, we
SHOULD send an error; however in this case it would be better
to just silently discard such a maliciously prepared handshake
as we didn't even receive a parameter at all. Also, as our
endpoint has no shared key configured, section 6.3 says that
MUST silently discard, which we are doing from now onwards.

Before calling sctp_sf_pdiscard(), we need not only to free
the association, but also the chunk->auth_chunk skb, as
commit bbd0d59809f9 created a skb clone in that case.

I have tested this locally by using netfilter's nfqueue and
re-injecting packets into the local stack after maliciously
modifying the INIT chunk (removing RANDOM; HMAC-ALGO param)
and the SCTP packet containing the COOKIE_ECHO (injecting
AUTH chunk before COOKIE_ECHO). Fixed with this patch applied.

Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <yasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'linux-can-fixes-for-3.14-20140303' of git://gitorious.org/linux-can/linux-can

linux-can-fixes-for-3.14-20140303

Marc Kleine-Budde says:

====================
this is a pull request of 8 patches. Oliver Hartkopp contributes a patch which
removes the CAN FD compatibility for CAN 2.0 sockets, as it turns out that this
compatibility has some conceptual cornercases. The remaining 7 patches are by
me, they address a problem in the flexcan driver. When shutting down the
interface ("ifconfig can0 down") under heavy network load the whole system will
hang. This series reworks the actual sequence in close() and the transition
from and to the low power modes of the CAN controller.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer

when ip_tunnel process multicast packets, it may check if the packet is looped
back packet though 'rt_is_output_route(skb_rtable(skb))' in ip_tunnel_rcv(),
but before that , skb->_skb_refdst has been dropped in iptunnel_pull_header(),
so which leads to a panic.

fix the bug: https://bugzilla.kernel.org/show_bug.cgi?id=70681

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cpsw: fix cpdma rx descriptor leak on down interface

This patch fixes a CPDMA RX Descriptor leak that occurs after taking
the interface down when the CPSW is in Dual MAC mode. Previously
the CPSW_ALE port was left open up which causes packets to be received
and processed by the RX interrupt handler and were passed to the
non active network interface where they were ignored.

The fix is for the slave_stop function of the selected interface
to disable the respective CPSW_ALE Port from forwarding packets. This
blocks traffic from being received on the inactive interface.

Signed-off-by: Schuyler Patton <spatton@ti.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: isolate TX workarounds not applicable to Skyhawk-R

Some of TX workarounds in be_xmit_workarounds() routine
are not applicable (and result in HW errors) to Skyhawk-R chip.
Isolate BE3-R/Lancer specific workarounds to a separate routine.

Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Fix skb double free in be_xmit_wrokarounds() failure path

skb_padto(), skb_share_check() and __vlan_put_tag() routines free
skb when they return an error. This patch fixes be_xmit_workarounds()
to not free skb again in such cases.

Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: clear promiscuous bits in adapter->flags while disabling promiscuous mode

We should clear promiscuous bits in adapter->flags while disabling promiscuous
mode. Else we will not put interface back into VLAN promisc mode if the vlans
already added exceeds the maximum limit.

Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Fix to reset transparent vlan tagging

For disabling transparent tagging issue SET_HSW_CONFIG with pvid_valid=1
and pvid=0xFFFF and not with the default pvid as this case would fail in Lancer.
Hence removing the get_hsw_config call from be_vf_setup() as it's
only use of getting default pvid is no longer needed.

Also do proper housekeeping only if the FW command succeeds.

Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com>
Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: dcb: a couple off by one bugs

The ->tc_cfg[] array has QLC_DCB_MAX_TC (8) elements so the check is
off by one. These functions are always called with valid values though
so it doesn't affect how the code works.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fix bogus RTT on special retransmission

RTT may be bogus with tall loss probe (TLP) when a packet
is retransmitted and latter (s)acked without TCPCB_SACKED_RETRANS flag.

For example, TLP calls __tcp_retransmit_skb() instead of
tcp_retransmit_skb(). The skb timestamps are updated but the sacked
flag is not marked with TCPCB_SACKED_RETRANS. As a result we'll
get bogus RTT in tcp_clean_rtx_queue() or in tcp_sacktag_one() on
spurious retransmission.

The fix is to apply the sticky flag TCP_EVER_RETRANS to enforce Karn's
check on RTT sampling. However this will disable F-RTO if timeout occurs
after TLP, by resetting undo_marker in tcp_enter_loss(). We relax this
check to only if any pending retransmists are still in-flight.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

hsr: off by one sanity check in hsr_register_frame_in()

This is a sanity check and we never pass invalid values so this patch
doesn't change anything. However the node->time_in[] array has
HSR_MAX_SLAVE (2) elements and not HSR_MAX_DEV (3).

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
"Misc fixes, most of them SCHED_DEADLINE fallout"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/deadline: Prevent rt_time growth to infinity
  sched/deadline: Switch CPU's presence test order
  sched/deadline: Cleanup RT leftovers from {inc/dec}_dl_migration
  sched: Fix double normalization of vruntime

Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull liblockdep fixes from Ingo Molnar:
"A handful of build fixes for liblockdep"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools/liblockdep: Use realpath for srctree and objtree
  tools/liblockdep: Add a stub for new rcu_is_watching
  tools/liblockdep: Mark runtests.sh as executable
  tools/liblockdep: Add include directory to allow tests to compile
  tools/liblockdep: Fix include of asm/hash.h
  tools/liblockdep: Fix initialization code path

Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux

Pull clk framework fixes from Mike Turquette:
"Clock framework and driver fixes, all of which fix user-visible
  regressions.

  There is a single framework fix that prevents dereferencing a NULL
  pointer when calling clk_get.  The range of fixes for clock driver
  regressions spans memory leak fixes, touching the wrong registers that
  cause things to explode, misconfigured clock rates that result in
  non-responsive devices and even some boot failures.  The most benign
  fix is DT binding doc typo.  It is a stable ABI exposed from the
  kernel that was introduced in -rc1, so best to fix it now"

* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux: (25 commits)
  clk:at91: Fix memory leak in of_at91_clk_master_setup()
  clk: nomadik: fix multiplatform problem
  clk: Correct handling of NULL clk in __clk_{get, put}
  clk: shmobile: Fix typo in MSTP clock DT bindings
  clk: shmobile: rcar-gen2: Fix qspi divisor
  clk: shmobile: rcar-gen2: Fix clock parent for all non-PLL clocks
  clk: tegra124: remove gr2d and gr3d clocks
  clk: tegra: Fix vic03 mux index
  clk: shmobile: rcar-gen2: Fix qspi divisor
  clk: shmobile: rcar-gen2: Fix clock parent all non-PLL clocks
  clk: tegra: use max divider if divider overflows
  clk: tegra: cclk_lp has a pllx/2 divider
  clk: tegra: fix sdmmc clks on Tegra1x4
  clk: tegra: fix host1x clock on Tegra124
  clk: tegra: PLLD2 fixes for hdmi
  clk: tegra: Fix PLLD mnp table
  clk: tegra: Fix PLLP rate table
  clk: tegra: Correct clock number for UARTE
  clk: tegra: Add missing Tegra20 fuse clks
  ARM: keystone: dts: fix clkvcp3 control register address
  ...

can: remove CAN FD compatibility for CAN 2.0 sockets

In commit e2d265d3b587 (canfd: add support for CAN FD in CAN_RAW sockets)
CAN FD frames with a payload length up to 8 byte are passed to legacy
sockets where the CAN FD support was not enabled by the application.

After some discussions with developers at a fair this well meant feature
leads to confusion as no clean switch for CAN / CAN FD is provided to the
application programmer. Additionally a compatibility like this for legacy
CAN_RAW sockets requires some compatibility handling for the sending, e.g.
make CAN2.0 frames a CAN FD frame with BRS at transmission time (?!?).

This will become a mess when people start to develop applications with
real CAN FD hardware. This patch reverts the bad compatibility code
together with the documentation describing the removed feature.

Acked-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: factor out soft reset into seperate funtion

This patch moves the soft reset into a seperate function.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: flexcan_remove(): add missing netif_napi_del()

This patch adds the missing netif_napi_del() to the flexcan_remove() function.

Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: fix transition from and to freeze mode in chip_{,un}freeze

This patch factors out freeze and unfreeze of the CAN core into seperate
functions. Experiments have shown that the transition from and to freeze mode
may take several microseconds, especially the time entering the freeze mode
depends on the current bitrate.

This patch adds a while loop which polls the Freeze Mode ACK bit (FRZ_ACK) that
indicates a successfull mode change. If the function runs into a timeout a
error value is returned.

Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: factor out transceiver {en,dis}able into seperate functions

This patch moves the transceiver enable and disable into seperate functions,
where the NULL pointer check is hidden.

Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: fix transition from and to low power mode in chip_{en,dis}able

In flexcan_chip_enable() and flexcan_chip_disable() fixed delays are used.
Experiments have shown that the transition from and to low power mode may take
several microseconds.

This patch adds a while loop which polls the Low Power Mode ACK bit (LPM_ACK)
that indicates a successfull mode change. If the function runs into a timeout a
error value is returned.

Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: flexcan_open(): fix error path if flexcan_chip_start() fails

If flexcan_chip_start() in flexcan_open() fails, the interrupt is not freed,
this patch adds the missing cleanup.

Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: fix shutdown: first disable chip, then all interrupts

When shutting down the CAN interface (ifconfig canX down) during high CAN bus
loads, the CAN core might hang and freeze the whole CPU.

This patch fixes the shutdown sequence by first disabling the CAN core then
disabling all interrupts.

Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

Linux 3.14-rc5

USB AX88179/178A: Support D-Link DUB-1312

Add the USB device ID for the D-Link DUB-1312 USB 3.0 to Gigabit Ethernet
Adapter to the AX88179/178A driver.

Signed-off-by: Gerry Demaret <gerry@tigron.be>
Signed-off-by: David S. Miller <davem@davemloft.net>

b44: always set duplex mode why phy changes

Without this patch b44_check_phy() was called when the phy called the
adjust callback. This method only change the mac duplex mode when the
carrier was off. When the phy changed the duplex mode after the carrier
was on the mac was not changed. This happened when an external phy was
used.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b44: add calls to phy_{start,stop}

When support for external phys was added to b44, the calls to start and
stop the phy were missing in the mac driver. This adds the calls to
phy_start() and phy_stop().

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"Not a huge amount happening, some MAINTAINERS updates, radeon, vmwgfx
  and tegra fixes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/vmwgfx: avoid null pointer dereference at failure paths
  drm/vmwgfx: Make sure backing mobs are cleared when allocated. Update driver date.
  drm/vmwgfx: Remove some unused surface formats
  drm/radeon: enable speaker allocation setup on dce3.2
  drm/radeon: change audio enable logic
  drm/radeon: fix audio disable on dce6+
  drm/radeon: free uvd ring on unload
  drm/radeon: disable pll sharing for DP on DCE4.1
  drm/radeon: fix missing bo reservation
  drm/radeon: print the supported atpx function mask
  MAINTAINERS: update drm git tree entry
  MAINTAINERS: add entry for drm radeon driver
  drm/tegra: Add guard to avoid double disable/enable of RGB outputs
  gpu: host1x: do not check previously handled gathers
  drm/tegra: fix typo 'CONFIG_TEGRA_DRM_FBDEV'

Merge tag 'usb-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are 2 USB patches for 3.14-rc5, one a new device id, and the
  other fixes a reported problem with threaded irqs and the USB EHCI
  driver"

* tag 'usb-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: ehci: fix deadlock when threadirqs option is used
  USB: ftdi_sio: add Cressi Leonardo PID

Merge tag 'driver-core-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull sysfs fix from Greg KH:
"Here is a single sysfs fix for 3.14-rc5.  It fixes a reported problem
  with the namespace code in sysfs"

* tag 'driver-core-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  sysfs: fix namespace refcnt leak

Merge tag 'staging-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging tree fixes from Greg KH:
"Here are a few IIO fixes, and a new device id for a staging driver for
  3.14-rc5.  All have been in linux-next for a while, I did a final
  merge to get the IIO fixes into this tree, they were incorrectly in
  the char-misc tree for a few weeks, and I forgot to tell you to pull
  them from there.  This makes it a single pull request for you"

* tag 'staging-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: r8188eu: Add new device ID
  staging:iio:adc:MXS:LRADC: fix touchscreen statemachine
  iio:gyro: bug on L3GD20H gyroscope support
  iio: cm32181: Change cm32181 ambient light sensor driver
  iio: cm36651: Fix read/write integration time function.

Merge branch 'drm-fixes-3.14' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

more radeon fixes

* 'drm-fixes-3.14' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: enable speaker allocation setup on dce3.2
  drm/radeon: change audio enable logic
  drm/radeon: fix audio disable on dce6+
  drm/radeon: free uvd ring on unload
  drm/radeon: disable pll sharing for DP on DCE4.1
  drm/radeon: fix missing bo reservation
  drm/radeon: print the supported atpx function mask

Merge iio fixes into staging-linus

These I forgot about before, but need to get into 3.14-final.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"Misc fixes, most of them on the tooling side"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools: Fix strict alias issue for find_first_bit
  perf tools: fix BFD detection on opensuse
  perf: Fix hotplug splat
  perf/x86: Fix event scheduling
  perf symbols: Destroy unused symsrcs
  perf annotate: Check availability of annotate when processing samples

Merge tag 'vmwgfx-fixes-3.14-2014-03-02' of git://people.freedesktop.org/~thomash/linux into drm-fixes

A couple of minor fixes.

Pull request of 2014-03-02

* tag 'vmwgfx-fixes-3.14-2014-03-02' of git://people.freedesktop.org/~thomash/linux:
  drm/vmwgfx: avoid null pointer dereference at failure paths
  drm/vmwgfx: Make sure backing mobs are cleared when allocated. Update driver date.
  drm/vmwgfx: Remove some unused surface formats

drm/vmwgfx: avoid null pointer dereference at failure paths

vmw_takedown_otable_base() and vmw_mob_unbind() check for
potential vmw_fifo_reserve() failure and print error message,
but then immediately dereference NULL pointer.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>

drm/vmwgfx: Make sure backing mobs are cleared when allocated. Update driver date.

Backing mob contents is propagated to user-space, so make sure backing
mobs are cleared when allocated. This also accidently fix rendering errors
with celestia when emulating legacy mode.

Also update driver date.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

drm/vmwgfx: Remove some unused surface formats

These formats are deprecated.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Peter Anvin:
"The VMCOREINFO patch I'll pushing for this release to avoid having a
  release with kASLR and but without that information.

  I was hoping to include the FPU patches from Suresh, but ran into a
  problem (see other thread); will try to make them happen next week"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, kaslr: add missed "static" declarations
  x86, kaslr: export offset in VMCOREINFO ELF notes

Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending

Pull SCSI target fixes from Nicholas Bellinger:
"The bulk of the series are bugfixes for qla2xxx target NPIV support
  that went in for v3.14-rc1.  Also included are a few DIF related
  fixes, a qla2xxx fix (Cc'ed to stable) from Greg W., and vhost/scsi
  protocol version related fix from Venkatesh.

  Also just a heads up that a series to address a number of issues with
  iser-target active I/O reset/shutdown is still being tested, and will
  be included in a separate -rc6 PULL request"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  vhost/scsi: Check LUN structure byte 0 is set to 1, per spec
  qla2xxx: Fix kernel panic on selective retransmission request
  Target/sbc: Don't use sg as iterator in sbc_verify_read
  target: Add DIF sense codes in transport_generic_request_failure
  target/sbc: Fix sbc_dif_copy_prot addr offset bug
  tcm_qla2xxx: Fix NAA formatted name for NPIV WWPNs
  tcm_qla2xxx: Perform configfs depend/undepend for base_tpg
  tcm_qla2xxx: Add NPIV specific enable/disable attribute logic
  qla2xxx: Check + fail when npiv_vports_inuse exists in shutdown
  qla2xxx: Fix qlt_lport_register base_vha callback race

Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma

Pull slave-dma fixes from Vinod Koul:
"This request brings you two small fixes.  First one for fixing
  dereference of freed descriptor and second for fixing sdma bindings
  for it to work for imx25.

  I was planning to send this about 10days ago but then I had to proceed
  on my paternity leave and didnt get chance to send this.  Now got a
  bit of time from dady duties :)"

* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
  dma: sdma: Add imx25 compatible
  dma: ste_dma40: don't dereference free:d descriptor

Merge tag 'pm+acpi-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management fixes from Rafael Wysocki:
"These three commits fix a recent intel_pstate regression and two old
  bugs that should be fixed in -stable too, one in the ACPI processor
  driver and one in the firmare loader.

  Specifics:

   - One of the recent intel_pstate driver fixes introduced a rounding
     error that on some systems causes the frequency to be stuck at the
     lowest level forever.  Fix from Dirk Brandewie.

   - The firmware_class driver's PM notifier doesn't handle the
     PM_RESTORE_PREPARE event during hibernation image restore and that
     leads to a deadlock on umhelper_sem in __usermodehelper_disable().
     Fix from Sebastian Capella.

   - acpi_processor_set_throttling() abuses set_cpus_allowed_ptr() in a
     nasty way which triggers the WARN_ON_ONCE() in wq_worker_waking_up()
     among other things.  Fix from Lan Tianyu"

* tag 'pm+acpi-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / processor: Rework processor throttling with work_on_cpu()
  PM / hibernate: Fix restore hang in freeze_processes()
  intel_pstate: Change busy calculation to use fixed point math.

Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent build fixes for certain distro environments, from Arnaldo Carvalho de Melo:

  * Problem on recent gcc on x86-32 related to strict alias issue for
    find_first_bit (Jiri Olsa).

  * OpenSuSE: BFD detection problems related to not explicitely listing all
    required libraries (Andi Kleen)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'fixes-for-3.14d' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus

Jonathan writes:

Fourth set of IIO fixes for the 3.14 kernel.

A single line patch fixing a regression that was introduced in 3.13 in the
reworking of the mxs touch screen and ADC drivers to be interrupt rather
than polling driven. It resulted in a stray double reporting of the release
coordinate in the touch screen driver. The bug lay in the adc side
of the driver which left the statemachine in the wrong state.

MAINTAINERS: add maintainer entry for Armada DRM driver

Add a maintainers entry for the Armada DRM driver.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bna: fix vlan tag stripping and implement its toggling

The recent commit "fe1624c bna: RX Filter Enhancements" disables
VLAN tag stripping if the NIC is in promiscuous mode. This causes
__vlan_hwaccel_put_tag() is called when the stripping is disabled.
Because of this VLAN over bna does not work and causes BUGs in conjunction
with openvswitch like this:
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

tg3: Don't check undefined error bits in RXBD

Redefine the RXD_ERR_MASK to include only relevant error bits. This fixes
a customer reported issue of randomly dropping packets on the 5719.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'dm-3.14-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:
"A few dm-cache fixes, an invalid ioctl handling fix for dm multipath,
  a couple immutable biovec fixups for dm mirror, and a few dm-thin
  fixes.

  There will likely be additional dm-thin metadata and data resize fixes
  to include in 3.14-rc6 next week.

  Note to stable-minded folks: Immutable biovecs were introduced in
  3.14, so the related fixups for dm mirror are not needed in stable@
  kernels"

* tag 'dm-3.14-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm cache: fix truncation bug when mapping I/O to >2TB fast device
  dm thin: allow metadata space larger than supported to go unused
  dm mpath: fix stalls when handling invalid ioctls
  dm thin: fix the error path for the thin device constructor
  dm raid1: fix immutable biovec related BUG when retrying read bio
  dm io: fix I/O to multiple destinations
  dm thin: avoid metadata commit if a pool's thin devices haven't changed
  dm cache: do not add migration to completed list before unhooking bio
  dm cache: move hook_info into common portion of per_bio_data structure

Merge tag 'sound-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"It's a bad habit to get a higher volume of fixes often lately, but
  things happen again.

  All commits found here are real bug fixes, and are mostly trivial.
  Most of changes in ASoC are the fixes for enum items due to the wrong
  API usages, in addition to a few DAPM mutex deadlock and other fixes.
  In HD-audio, only fixups for HP laptops.  Although diffstat shows
  much, the changes are simple: there are just so many different device
  entries there"

* tag 'sound-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: sta32x: Fix wrong enum for limiter2 release rate
  ASoC: da732x: Mark DC offset control registers volatile
  ALSA: hda/realtek - Add more entry for enable HP mute led
  ALSA: hda - Add a fixup for HP Folio 13 mute LED
  ASoC: wm8958-dsp: Fix firmware block loading
  ASoC: sta32x: Fix cache sync
  ALSA: hda/realtek - Add more entry for enable HP mute led
  ASoC: dapm: Add locking to snd_soc_dapm_xxxx_pin functions
  Input - arizona-haptics: Fix double lock of dapm_mutex
  ASoC: wm8400: Fix the wrong number of enum items
  ASoC: isabelle: Fix the wrong number of items in enum ctls
  ASoC: ad1980: Fix wrong number of items for capture source
  ASoC: wm8994: Fix the wrong number of enum items
  ASoC: wm8900: Fix the wrong number of enum items
  ASoC: wm8770: Fix wrong number of enum items
  ASoC: sta32x: Fix array access overflow
  ASoC: dapm: Correct regulator bypass error messages

Merge tag 'edac_fixes_for_3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC fixes from Borislav Petkov:
"Two fixes below for PCI devices disappearing when a reference count
  underflow happens after a couple of insmod/rmmod cycles in succession"

* tag 'edac_fixes_for_3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  i7300_edac: Fix device reference count
  i7core_edac: Fix PCI device reference count

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
"Three x86 fixes and one for ARM/ARM64.

  In particular, nested virtualization on Intel is broken in 3.13 and
  fixed by this pull request"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm, vmx: Really fix lazy FPU on nested guest
  kvm: x86: fix emulator buffer overflow (CVE-2014-0049)
  arm/arm64: KVM: detect CPU reset on CPU_PM_EXIT
  KVM: MMU: drop read-only large sptes when creating lower level sptes

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull ARM64 fixes from Catalin Marinas:
- !CONFIG_SMP build fix
- pte bit testing macros conversion fix (int truncates top bits of
   long)
- stack unwinding PC calculation fix

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Fix !CONFIG_SMP kernel build
  arm64: mm: Add double logical invert to pte accessors
  ARM64: unwind: Fix PC calculation

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

Pull powerpc fixes from Ben Herrenschmidt:
"Here are a few more powerpc fixes for 3.14.

  Most of these are also CC'ed to stable and fix bugs in new
  functionality introduced in the last 2 or 3 versions"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/powernv: Fix indirect XSCOM unmangling
  powerpc/powernv: Fix opal_xscom_{read,write} prototype
  powerpc/powernv: Refactor PHB diag-data dump
  powerpc/powernv: Dump PHB diag-data immediately
  powerpc: Increase stack redzone for 64-bit userspace to 512 bytes
  powerpc/ftrace: bugfix for test_24bit_addr
  powerpc/crashdump : Fix page frame number check in copy_oldmem_page
  powerpc/le: Ensure that the 'stop-self' RTAS token is handled correctly

mwifiex: do not advertise usb autosuspend support

As many Surface Pro I & II users have found out, the mwifiex_usb
doesn't support usb autosuspend, and it has caused some system
stability issues.

Bug 69661 - mwifiex_usb on MS Surface Pro 1 is unstable
Bug 60815 - Interface hangs in mwifiex_usb
Bug 64111 - mwifiex_usb USB8797 crash failed to get signal
information

USB autosuspend get triggered when Surface Pro's AC power is
removed or powertop enables power saving on USB8797 device.
Driver's suspend handler is called here, but resume handler
won't be called until the AC power is put back on or powertop
disables power saving for USB8797.

We need to refactor the suspend/resume handlers to support
usb autosuspend properly. For now let's just remove it.

Cc: <stable@vger.kernel.org> # 3.5+
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes

arm64: Fix !CONFIG_SMP kernel build

Commit fb4a96029c8a (arm64: kernel: fix per-cpu offset restore on
resume) uses per_cpu_offset() unconditionally during CPU wakeup,
however, this is only defined for the SMP case.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Dave P Martin <Dave.Martin@arm.com>

arm64: mm: Add double logical invert to pte accessors

Page table entries on ARM64 are 64 bits, and some pte functions such as
pte_dirty return a bitwise-and of a flag with the pte value. If the
flag to be tested resides in the upper 32 bits of the pte, then we run
into the danger of the result being dropped if downcast.

For example:
gather_stats(page, md, pte_dirty(*pte), 1);
where pte_dirty(*pte) is downcast to an int.

This patch adds a double logical invert to all the pte_ accessors to
ensure predictable downcasting.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>