git.karo-electronics.de Git - linux-beck.git/log

]> git.karo-electronics.de Git - linux-beck.git/log

projects / linux-beck.git / log

commit | commitdiff | tree

Roland Dreier [Thu, 23 Jan 2014 07:24:21 +0000 (23:24 -0800)]

Merge branch 'ip-roce' into for-next

Conflicts:
drivers/infiniband/hw/mlx4/main.c

commit | commitdiff | tree

Roland Dreier [Thu, 23 Jan 2014 07:24:13 +0000 (23:24 -0800)]

Merge branches 'cma', 'cxgb4', 'flowsteer', 'ipoib', 'misc', 'mlx4', 'mlx5', 'ocrdma', 'qib', 'srp' and 'usnic' into for-next

commit | commitdiff | tree

Eli Cohen [Wed, 15 Jan 2014 12:56:44 +0000 (14:56 +0200)]

IB/mlx5: Verify reserved fields are cleared

Verify that reserved fields in struct mlx5_ib_resize_cq are cleared
before continuing execution of the verb. This is required to allow
making use of this area in future revisions.

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Eli Cohen [Tue, 14 Jan 2014 15:45:23 +0000 (17:45 +0200)]

IB/mlx5: Remove old field for create mkey mailbox

Match firmware specification.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Eli Cohen [Tue, 14 Jan 2014 15:45:22 +0000 (17:45 +0200)]

IB/mlx5: Abort driver cleanup if teardown hca fails

Do not continue with cleanup flow. If this ever happens we can check which
resources remained open.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Eli Cohen [Tue, 14 Jan 2014 15:45:21 +0000 (17:45 +0200)]

IB/mlx5: Allow creation of QPs with zero-length work queues

The current code attmepts to call ib_umem_get() even if the length is
zero, which causes a failure. Since the spec allows zero length work
queues, change the code so we don't call ib_umem_get() in those cases.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Eli Cohen [Tue, 14 Jan 2014 15:45:20 +0000 (17:45 +0200)]

mlx5_core: Fix PowerPC support

1. Fix derivation of sub-page index from the dma address in free_4k.
2. Fix the DMA address passed to dma_unmap_page by masking it properly.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Eli Cohen [Tue, 14 Jan 2014 15:45:19 +0000 (17:45 +0200)]

mlx5_core: Improve debugfs readability

Use strings to display transport service or state of QPs. Use numeric
value for MTU of a QP.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Eli Cohen [Tue, 14 Jan 2014 15:45:18 +0000 (17:45 +0200)]

IB/mlx5: Add support for resize CQ

Implement resize CQ which is a mandatory verb in mlx5.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Eli Cohen [Tue, 14 Jan 2014 15:45:17 +0000 (17:45 +0200)]

IB/mlx5: Implement modify CQ

Modify CQ is used by ULPs like IPoIB to change moderation parameters. This
patch adds support in mlx5.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Eli Cohen [Tue, 14 Jan 2014 15:45:16 +0000 (17:45 +0200)]

IB/mlx5: Make sure doorbell record is visible before doorbell

Put a wmb() to make sure the doorbell record is visible to the HCA before we
hit doorbell.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Eli Cohen [Tue, 14 Jan 2014 15:45:15 +0000 (17:45 +0200)]

mlx5_core: Use mlx5 core style warning

Use mlx5_core_warn(), which is the standard warning emitter function, instead
of pr_warn().

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Eli Cohen [Tue, 14 Jan 2014 15:45:14 +0000 (17:45 +0200)]

IB/mlx5: Clear out struct before create QP command

Output structs are expected by firmware to be cleared when a command is called.
Clear the "out" struct instead of "dout" which is used only later.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Haggai Eran [Tue, 14 Jan 2014 15:45:13 +0000 (17:45 +0200)]

mlx5_core: Fix out arg size in access_register command

The output size should be the sum of the core access reg output struct
plus the size of the specific register data provided by the caller.

Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Ding Tianhong [Wed, 8 Jan 2014 02:53:39 +0000 (10:53 +0800)]

RDMA/nes: Slight optimization of Ethernet address compare

Use the possibly more efficient ether_addr_equal() instead of memcmp().

Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Ira Weiny [Wed, 18 Dec 2013 16:41:37 +0000 (08:41 -0800)]

IB/qib: Fix QP check when looping back to/from QP1

The GSI QP type is compatible with and should be allowed to send data
to/from any UD QP. This was found when testing ibacm on the same node
as an SA.

Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Paul Bolle [Thu, 9 Jan 2014 10:53:27 +0000 (11:53 +0100)]

RDMA/cxgb4: Fix gcc warning on 32-bit arch

Building mem.o for 32 bits x86 triggers a GCC warning:

    drivers/infiniband/hw/cxgb4/mem.c: In function '_c4iw_write_mem_dma_aligned':
    drivers/infiniband/hw/cxgb4/mem.c:79:25: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]

Silence that warning by casting "&wr_wait" to unsigned long before
casting it to __be64.  That's what _c4iw_write_mem_inline() already does.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Wei Yongjun [Sat, 21 Dec 2013 09:25:35 +0000 (17:25 +0800)]

IB/usnic: Remove unused includes of <linux/version.h>

Remove including <linux/version.h> that don't need it.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Svetlana Mavrina [Sun, 12 Jan 2014 11:56:09 +0000 (11:56 +0000)]

RDMA/amso1100: Add check if cache memory was allocated before freeing it

There is a path in handle_vq() where kmem_cache_free() can be called
with pointer to a local variable. It can happen if vq_repbuf_alloc()
failed to allocate memory from cache and req is NULL.

The patch adds check if cache memory was allocated before freeing it.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Svetlana Mavrina <another.karnil@gmail.com>
Reviewed-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Michal Schmidt [Fri, 17 Jan 2014 18:47:25 +0000 (19:47 +0100)]

IPoIB: Report operstate consistently when brought up without a link

After booting without a working link, "ip link" shows:

5: mlx4_ib1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 2044 qdisc pfifo_fast
state DOWN qlen 256
    ...
7: mlx4_ib1.8003@mlx4_ib1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 2044 qdisc
pfifo_fast state DOWN qlen 256
    ...

Then after connecting and disconnecting the link, which should result
in exactly the same state as before, it shows:

5: mlx4_ib1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 2044 qdisc pfifo_fast
state DOWN qlen 256
    ...
7: mlx4_ib1.8003@mlx4_ib1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 2044 qdisc
pfifo_fast state LOWERLAYERDOWN qlen 256
    ...

Notice the (now correct) LOWERLAYERDOWN operstate shown for the
mlx4_ib1.8003 interface. Ideally the identical state would be shown
right after boot.

The problem is related to the calling of netif_carrier_off() in
network drivers.  For a long time it was known that doing
netif_carrier_off() before registering the netdevice would result in
the interface's operstate being shown as UNKNOWN if the device was
brought up without a working link. This problem was fixed in commit
8f4cccbbd92 ('net: Set device operstate at registration time'), but
still there remains the minor inconsistency demonstrated above.

This patch fixes it by moving ipoib's call to netif_carrier_off() into
the .ndo_open method, which is where network drivers ordinarily do it.
With the patch when doing the same test as above, the operstate of
mlx4_ib1.8003 is shown as LOWERLAYERDOWN right after boot.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Or Gerlitz [Sun, 3 Nov 2013 08:20:43 +0000 (10:20 +0200)]

IB/core: Fix unused variable warning

Fix the below "make W=1" build warning:

drivers/infiniband/core/iwcm.c: In function ‘destroy_cm_id’:
drivers/infiniband/core/iwcm.c:330: warning: variable ‘ret’ set but not used

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Somnath Kotur [Sat, 16 Nov 2013 04:00:01 +0000 (09:30 +0530)]

RDMA/cma: Handle global/non-linklocal IPv6 addresses in cma_check_linklocal()

If addr is not a linklocal address, the code incorrectly fails to
return and ends up assigning the scope ID to the scope id of the
address, which is wrong. Fix by checking if it's a link local address
first, and immediately return 0 if not.

Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Dan Carpenter [Mon, 20 Jan 2014 10:32:49 +0000 (13:32 +0300)]

IB/usnic: Use GFP_ATOMIC under spinlock

This is called from qp_grp_and_vf_bind() and we are holding the
vf->lock so the allocation can't sleep.

Fixes: e3cf00d0a87f ('IB/usnic: Add Cisco VIC low-level hardware driver')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Bart Van Assche [Mon, 13 Jan 2014 11:39:35 +0000 (12:39 +0100)]

scsi_transport_srp: Fix kernel-doc warnings

The following command has been used to verify that the kernel-doc
tool no longer complains about undocumented fields:

scripts/kernel-doc -html drivers/scsi/scsi_transport_srp.c \
include/scsi/scsi_transport_srp.h >srp-transport-doc.html

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Sebastian Riemer <sebastian.riemer@profitbricks.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Bart Van Assche [Wed, 11 Dec 2013 16:08:43 +0000 (17:08 +0100)]

scsi_transport_srp: Add rport state diagram

Add a diagram in Documentation/scsi/scsi_transport_srp that
illustrates the rport state transitions.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Bart Van Assche [Wed, 11 Dec 2013 16:06:14 +0000 (17:06 +0100)]

scsi_transport_srp: Fix a race condition

The rport timers must be stopped before the SRP initiator destroys the
resources associated with the SCSI host. This is necessary because
otherwise the callback functions invoked from the SRP transport layer
could trigger a use-after-free. Stopping the rport timers before
invoking scsi_remove_host() can trigger long delays in the SCSI error
handler if a transport layer failure occurs while scsi_remove_host()
is in progress. Hence move the code for stopping the rport timers from
srp_rport_release() into a new function and invoke that function after
scsi_remove_host() has finished. This patch fixes the following
sporadic kernel crash:

     kernel BUG at include/asm-generic/dma-mapping-common.h:64!
     invalid opcode: 0000 [#1] SMP
     RIP: 0010:[<ffffffffa03b20b1>]  [<ffffffffa03b20b1>] srp_unmap_data+0x121/0x130 [ib_srp]
     Call Trace:
     [<ffffffffa03b20fc>] srp_free_req+0x3c/0x80 [ib_srp]
     [<ffffffffa03b2188>] srp_finish_req+0x48/0x70 [ib_srp]
     [<ffffffffa03b21fb>] srp_terminate_io+0x4b/0x60 [ib_srp]
     [<ffffffffa03a6fb5>] __rport_fail_io_fast+0x75/0x80 [scsi_transport_srp]
     [<ffffffffa03a7438>] rport_fast_io_fail_timedout+0x88/0xc0 [scsi_transport_srp]
     [<ffffffff8108b370>] worker_thread+0x170/0x2a0
     [<ffffffff81090876>] kthread+0x96/0xa0
     [<ffffffff8100c0ca>] child_rip+0xa/0x20

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Bart Van Assche [Wed, 11 Dec 2013 16:05:22 +0000 (17:05 +0100)]

scsi_transport_srp: Block rport upon TL error even with fast_io_fail_tmo = off

The current behavior of the SRP transport layer when a transport layer
error is encountered is to block SCSI command processing only if
fast_io_fail_tmo != off. The current behavior of the FC transport
layer when a transport layer error is encountered is to block SCSI
command processing no matter which value fast_io_fail_tmo has been set
to. Make the behavior of the SRP transport layer consistent with that
of the FC transport layer to avoid confusion.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

David Dillow [Fri, 3 Jan 2014 21:16:10 +0000 (16:16 -0500)]

MAINTAINERS: Pass the torch of SRP submaintainership

Today was my last day at ORNL, and my future endeavors will leave even
less time to maintain the SRP initiator.

My thanks especially go to Bart, for keeping the pressure to improve
alive, and for driving so many of those improvements.

[ Add Bart as new submaintainer. - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Roland Dreier [Sun, 19 Jan 2014 23:18:49 +0000 (15:18 -0800)]

IB/mlx4: Use IS_ENABLED(CONFIG_IPV6)

...instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)

Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Roland Dreier [Sun, 19 Jan 2014 23:16:23 +0000 (15:16 -0800)]

RDMA/ocrdma: Add dependency on INET

Now that ocrdma supports IP-based addressing, we need to depend on
INET, since ocrdma registers itself for net device events.

Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Roland Dreier [Sun, 19 Jan 2014 23:13:44 +0000 (15:13 -0800)]

RDMA/ocrdma: Move ocrdma_inetaddr_event outside of "#if CONFIG_IPV6"

This fixes the build if IPV6 isn't enabled.

Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Matan Barak [Thu, 16 Jan 2014 15:16:48 +0000 (17:16 +0200)]

IB/mlx4: Add dependency INET

Since mlx4_ib supports IP based addressing, a dependency on INET needs
to be added, since mlx4_ib registers itself for net device events.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Wei Yongjun [Thu, 16 Jan 2014 08:28:13 +0000 (16:28 +0800)]

IB/cm: Fix missing unlock on error in cm_init_qp_rtr_attr()

Add the missing unlock before return from function cm_init_qp_rtr_attr()
in the error handling case.

Fixes: dd5f03beb4f7 ("IB/core: Ethernet L2 attributes in verbs/cm structures")
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Matan Barak [Thu, 16 Jan 2014 15:16:47 +0000 (17:16 +0200)]

IB/core: Make ib_addr a core IB module

IP based addressing introduces the usage of rdma_addr_find_dmac_by_grh()
within ib_core. Since this function is declared in ib_addr, ib_addr
should be a part of the core INFINIBAND modules, rather than
INFINIBAND_ADDR_TRANS.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Or Gerlitz [Thu, 12 Dec 2013 16:03:17 +0000 (18:03 +0200)]

IB/core: Resolve Ethernet L2 addresses when modifying QP

Existing user space applications provide only IBoE L3 address
attributes to the kernel when they issue a modify QP modify. To work
with them and let such apps (plus kernel consumers which don't use the
RDMA-CM) keep working transparently under the IBoE GID IP addressing
changes, add an Eth L2 address resolution helper.

Signed-off-by: Moni Shoua <monis@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Moni Shoua [Thu, 12 Dec 2013 16:03:16 +0000 (18:03 +0200)]

RDMA/ocrdma: Populate GID table with IP based gids

This patch is similar in spirit to the "IB/mlx4: Use IBoE (RoCE) IP
based GIDs in the port GID table" patch.

Changes to inet4 and inet6 addresses for the host are monitored and if
the address is associated with an ocrdma device then a gid is added or
deleted from the device's gid table. The gid format will be a IPv4 to
IPv6 mapped or the IPv6 address.

Cc: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Moni Shoua [Thu, 12 Dec 2013 16:03:15 +0000 (18:03 +0200)]

RDMA/ocrdma: Handle Ethernet L2 parameters for IP based GID addressing

This patch is similar in spirit to the "IB/mlx4: Handle Ethernet L2
parameters for IP based GID addressing". It handles the fact that IP
based RoCE gids don't store Ethernet L2 parameters, MAC and VLAN.

When building an address handle, instead of parsing the dgid to
get the MAC and VLAN, take them from the address handle attributes.

Cc: Naresh Gottumukkala <bgottumukkala@emulex.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Moni Shoua [Thu, 12 Dec 2013 16:03:14 +0000 (18:03 +0200)]

IB/mlx4: Handle Ethernet L2 parameters for IP based GID addressing

IP based RoCE gids don't store Ethernet L2 parameters, MAC and VLAN.

Therefore, we need to extract them from the CQE and place them in
struct ib_wc (to be used for cases were they were taken from the gid).

Also, when modifying a QP or building address handle, instead of
parsing the dgid to get the MAC and VLAN, take them from the address
handle attributes.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Moni Shoua [Thu, 12 Dec 2013 16:03:13 +0000 (18:03 +0200)]

IB/mlx4: Use IBoE (RoCE) IP based GIDs in the port GID table

Currently, the mlx4 driver set IBoE (RoCE) gids to encode related
Ethernet netdevice interface MAC address and possibly VLAN id.

Change this scheme such that gids encode interface IP addresses (both
IP4 and IPv6).

This requires learning the IP addresses which are of use by a
netdevice associated with the HCA port, formatting them to gids and
adding them to the port gid table. Furthermore, events of add and
delete address are caught to maintain the gid table accordingly.

Associated IP addresses may belong to a master of an Ethernet
netdevice on top of that port so this should be considered when
building and maintaining the gid table.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Moni Shoua [Thu, 12 Dec 2013 16:03:12 +0000 (18:03 +0200)]

IB/cma: IBoE (RoCE) IP-based GID addressing

Currently, the IB core and specifically the RDMA-CM assumes that IBoE
(RoCE) gids encode related Ethernet netdevice interface MAC address
and possibly VLAN id.

Change GIDs to be treated as they encode interface IP address.

Since Ethernet layer 2 address parameters are not longer encoded
within gids, we have to extend the Infiniband address structures (e.g.
ib_ah_attr) with layer 2 address parameters, namely mac and vlan.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Julia Lawall [Sun, 29 Dec 2013 22:47:20 +0000 (23:47 +0100)]

IB/mlx4: Fix error return code

Set the return variable to an error code as done elsewhere in the function.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
{ ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
    when != &ret
*if(...)
{
  ... when != ret = e2
      when forall
return ret;
}

// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Wei Yongjun [Wed, 15 Jan 2014 05:52:55 +0000 (13:52 +0800)]

IB/usnic: Remove unused variable in usnic_debugfs_exit()

The variable qp_grp is initialized but never used otherwise, so remove
the unused variable.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 16 Jan 2014 01:02:43 +0000 (17:02 -0800)]

IB/usnic: Set userspace/kernel ABI ver to 4

usNIC userspace/kernel ABI should be set to 4 instead of 3.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 16 Jan 2014 01:02:37 +0000 (17:02 -0800)]

IB/usnic: Advertise usNIC devices as RDMA_NODE_USNIC_UDP

usNIC default transport is UDP. Hence, advertise RDMA_NODE_USNIC_UDP
by default for usNIC devices.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 16 Jan 2014 01:02:36 +0000 (17:02 -0800)]

IB/core: Add support for RDMA_NODE_USNIC_UDP

Add the complementary RDMA_NODE_USNIC_UDP for RDMA_TRANSPORT_USNIC_UDP.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 16 Jan 2014 01:02:21 +0000 (17:02 -0800)]

IB/usnic: Add dependency on CONFIG_INET

usNIC needs inet notifiers to function correctly, so add a Kconfig
dependency on CONFIG_INET.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 16 Jan 2014 01:02:20 +0000 (17:02 -0800)]

IB/usnic: Fix endianness-related warnings

Fix sparse endianness related warnings.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Matan Barak [Thu, 12 Dec 2013 16:03:11 +0000 (18:03 +0200)]

IB/core: Ethernet L2 attributes in verbs/cm structures

This patch add the support for Ethernet L2 attributes in the
verbs/cm/cma structures.

When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority
in a similar manner that the IB L2 (and the L4 PKEY) attributes are used.

Thus, those attributes were added to the following structures:

* ib_ah_attr - added dmac
* ib_qp_attr - added smac and vlan_id, (sl remains vlan priority)
* ib_wc - added smac, vlan_id
* ib_sa_path_rec - added smac, dmac, vlan_id
* cm_av - added smac and vlan_id

For the path record structure, extra care was taken to avoid the new
fields when packing it into wire format, so we don't break the IB CM
and SA wire protocol.

On the active side, the CM fills. its internal structures from the
path provided by the ULP.  We add there taking the ETH L2 attributes
and placing them into the CM Address Handle (struct cm_av).

On the passive side, the CM fills its internal structures from the WC
associated with the REQ message.  We add there taking the ETH L2
attributes from the WC.

When the HW driver provides the required ETH L2 attributes in the WC,
they set the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core
code checks for the presence of these flags, and in their absence does
address resolution from the ib_init_ah_from_wc() helper function.

ib_modify_qp_is_ok is also updated to consider the link layer. Some
parameters are mandatory for Ethernet link layer, while they are
irrelevant for IB.  Vendor drivers are modified to support the new
function signature.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Matan Barak [Thu, 7 Nov 2013 13:25:17 +0000 (15:25 +0200)]

IB/mlx4: Add support for steerable IB UD QPs

This patch adds support for steerable (NETIF) QP creation. When we
create the device, we allocate a range of steerable QPs.

Afterward when a QP is created with the NETIF flag, it's allocated
from this range. Allocation is managed by bitmap allocator.

Internal steering rules for those QPs is automatically generated on
their creation.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Matan Barak [Thu, 7 Nov 2013 13:25:16 +0000 (15:25 +0200)]

IB/mlx4: Add mechanism to support flow steering over IB links

The mlx4 device requires adding IB flow spec to rules that apply over
infiniband link layer. This patch adds a mechanism to add such a rule.

If higher levels e.g. IP/UDP/TCP flow specs are provided, the device
requires us to add an empty wild-carded IB rule. Furthermore, the device
requires the QPN to be put in the rule.

Add here specific parsing support for IB empty rules and the ability
to self-generate missing specs based on existing ones.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Matan Barak [Thu, 7 Nov 2013 13:25:15 +0000 (15:25 +0200)]

IB/mlx4: Enable device-managed steering support for IB ports too

Up until now, flow steering wasn't supported when using IB ports.

This patch enables support for flow steering if all hardware ports
support that, for example the new MLX4_DEV_CAP_FLAG2_DMFS_IPOIB mlx4
device capability.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Matan Barak [Thu, 7 Nov 2013 13:25:14 +0000 (15:25 +0200)]

mlx4_core: Add support for steerable IB UD QPs

This patch adds support for allocating IB UD QPs that we can steer
traffic from. We introduce a new firmware command FLOW_STEERING_IB_UC_QP_RANGE
and a capability bit.

This command isn't supported for VFs.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Matan Barak [Thu, 7 Nov 2013 13:25:13 +0000 (15:25 +0200)]

IB/core: Add support for IB L2 device-managed steering

This patch adds preliminary support for IB L2 device-managed steering,
currently exposed only in the kernel.

This flow spec can be used by low-level drivers that need to indicate
the link layer type when creating device-managed flow rules.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Matan Barak [Thu, 7 Nov 2013 13:25:12 +0000 (15:25 +0200)]

IB/core: Add flow steering support for IPoIB UD traffic

When creating an IPoIB UD QP, provide a hint to the low level driver
that the QP should support flow-steering. This means that privileged
user space applications can steer TCP/IP IPoIB traffic from the
network stack, in a similar manner done with Ethernet RAW_PACKET QPs.

The hint is provided through new QP creation flag called NETIF_QP.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Eli Cohen [Tue, 14 Jan 2014 15:45:12 +0000 (17:45 +0200)]

IB/mlx5: Fix micro UAR allocator

The micro UAR (uuar) allocator had a bug which resulted from the fact
that in each UAR we only have two micro UARs avaialable, those at
index 0 and 1. This patch defines iterators to aid in traversing the
list of available micro UARs when allocating a uuar.

In addition, change the logic in create_user_qp() so that if high
class allocation fails (high class means lower latency), we revert to
medium class and not to the low class.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Dan Carpenter [Tue, 14 Jan 2014 15:45:11 +0000 (17:45 +0200)]

mlx5_core: Remove dead code

Remove leftover of debug code.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Eli Cohen [Tue, 14 Jan 2014 15:45:10 +0000 (17:45 +0200)]

IB/mlx5: Remove unused code in mr.c

The variable start in struct mlx5_ib_mr is never used. Remove it.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Fri, 20 Dec 2013 21:41:23 +0000 (21:41 +0000)]

IB/usnic: Append documentation to usnic_transport.h and cleanup

Add comment describing usnic_transport_rsrv port and remove
extraneous space from usnic_transport.c.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Roland Dreier [Mon, 13 Jan 2014 21:12:48 +0000 (13:12 -0800)]

IB/usnic: Fix typo "Ignorning" -> "Ignoring"

Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 23:40:58 +0000 (15:40 -0800)]

IB/usnic: Expose flows via debugfs

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 22:48:45 +0000 (14:48 -0800)]

IB/usnic: Use for_each_sg instead of a for-loop

Use for_each_sg() instead of an explicit for-loop to iterate over
scatter-gather list.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 22:48:36 +0000 (14:48 -0800)]

IB/usnic: Remove superflous parentheses

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 22:48:19 +0000 (14:48 -0800)]

IB/core: Add RDMA_TRANSPORT_USNIC_UDP

Add RDMA_TRANSPORT_USNIC_UDP which will be used by usNIC.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 22:48:18 +0000 (14:48 -0800)]

IB/usnic: Add UDP support in usnic_ib_qp_grp.[hc]

UDP support for qp_grps/qps.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 22:48:17 +0000 (14:48 -0800)]

IB/usnic: Add UDP support in u*verbs.c, u*main.c and u*util.h

Add supports for:
1) Parsing the socket file descriptor pass down from userspace.
2) IP notifiers
3) Encoding the IP in the GID
4) Other aux. changes to support UDP

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 22:48:16 +0000 (14:48 -0800)]

IB:usnic: Add UDP support to usnic_transport.[hc]

This patch provides API for rest of usNIC code to increment or decrement
socket's reference count. Auxiliary socket APIs are also provided.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 22:48:15 +0000 (14:48 -0800)]

IB/usnic: Add UDP support to usnic_fwd.[hc]

Add *ip field* to *struct usnic_fwd_dev* as well as new *functions* to
manipulate the *ip field.* Furthermore, add new functions for
programming UDP flows in the forwarding device.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 22:48:14 +0000 (14:48 -0800)]

IB/usnic: Update ABI and Version file for UDP support

Expand the kernel/userspace interface so userspace may push down
a socket file descriptor to usNIC. Also, bump up the abi and version
numbers.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 22:48:10 +0000 (14:48 -0800)]

IB/usnic: Port over sysfs to new usnic_fwd.h

This patch ports usnic_ib_sysfs.c to the new interface of
usnic_fwd.h.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 22:48:09 +0000 (14:48 -0800)]

IB/usnic: Port over usnic_ib_qp_grp.[hc] to new usnic_fwd.h

This patch ports usnic_ib_qp_grp.[hc] to the new interface
of usnic_fwd.h.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 22:48:08 +0000 (14:48 -0800)]

IB/usnic: Port over main.c and verbs.c to the usnic_fwd.h

This patch ports usnic_ib_main.c, usnic_ib_verbs.c and usnic_ib.h
to the new interface of usnic_fwd.h.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 22:48:07 +0000 (14:48 -0800)]

IB/usnic: Push all forwarding state to usnic_fwd.[hc]

Push all of the usnic device forwarding state - such as mtu, mac - to
usnic_fwd_dev. Furthermore, usnic_fwd.h exposes a improved interface
for rest of the usnic code. The primary improvement is that
usnic_fwd.h's flow management interface takes in high-level *filter*
and *action* structures now, instead of low-level paramaters such as
vnic_idx, rq_idx.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 22:48:06 +0000 (14:48 -0800)]

IB/usnic: Add struct usnic_transport_spec

Add *struct usnic_transport_spec* for passing around transport
specifications.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Thu, 9 Jan 2014 22:47:33 +0000 (14:47 -0800)]

IB/usnic: Change WARN_ON to lockdep_assert_held

usNIC calls WARN_ON(spin_is_locked..) at few places. In some of these
instances, the call is made while holding a spinlock. Change
all WARN_ON(spin_is_locked...) calls in usNIC to
lockdep_assert_held to make it fool-proof bc the latter can be
called while holding a spinlock and unlike spin_is_locked,
lockdep_assert_held also works correctly on UP.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Upinder Malhi [Tue, 10 Sep 2013 03:38:16 +0000 (03:38 +0000)]

IB/usnic: Add Cisco VIC low-level hardware driver

This adds a driver that allows userspace to use UD-like QPs over a
proprietary Cisco transport with Cisco's Virtual Interface Cards (VICs),
including VIC 1240 and 1280 cards.

Signed-off-by: Upinder Malhi <umalhi@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Devesh Sharma [Thu, 5 Dec 2013 04:16:07 +0000 (09:46 +0530)]

RDMA/ocrdma: Fix OCRDMA_GEN2_FAMILY macro definition

OCRDMA_GEN2_FAMILY is wrongly defined as 0x02 -- it should be 0x0F.

Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Devesh Sharma [Thu, 5 Dec 2013 10:18:01 +0000 (15:48 +0530)]

RDMA/ocrdma: Fix AV_VALID bit position

Fix ah->av->valid bit position and big endian portability.

Signed-off-by: Devesh Sharma <devesh.sharma@emulex.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>

commit | commitdiff | tree

Linus Torvalds [Sun, 12 Jan 2014 10:04:18 +0000 (17:04 +0700)]

Linux 3.13-rc8

commit | commitdiff | tree

Steven Rostedt [Fri, 10 Jan 2014 02:46:34 +0000 (21:46 -0500)]

SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()

While running stress tests on adding and deleting ftrace instances I hit
this bug:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  IP: selinux_inode_permission+0x85/0x160
  PGD 63681067 PUD 7ddbe067 PMD 0
  Oops: 0000 [#1] PREEMPT
  CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20
  Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
  task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000
  RIP: 0010:[<ffffffff812d8bc5>]  [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
  RSP: 0018:ffff88007ddb1c48  EFLAGS: 00010246
  RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840
  RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000
  RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54
  R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000
  R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000
  FS:  00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M
  CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0
  Call Trace:
    security_inode_permission+0x1c/0x30
    __inode_permission+0x41/0xa0
    inode_permission+0x18/0x50
    link_path_walk+0x66/0x920
    path_openat+0xa6/0x6c0
    do_filp_open+0x43/0xa0
    do_sys_open+0x146/0x240
    SyS_open+0x1e/0x20
    system_call_fastpath+0x16/0x1b
  Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff
  RIP  selinux_inode_permission+0x85/0x160
  CR2: 0000000000000020

Investigating, I found that the inode->i_security was NULL, and the
dereference of it caused the oops.

in selinux_inode_permission():

isec = inode->i_security;

rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);

Note, the crash came from stressing the deletion and reading of debugfs
files.  I was not able to recreate this via normal files.  But I'm not
sure they are safe.  It may just be that the race window is much harder
to hit.

What seems to have happened (and what I have traced), is the file is
being opened at the same time the file or directory is being deleted.
As the dentry and inode locks are not held during the path walk, nor is
the inodes ref counts being incremented, there is nothing saving these
structures from being discarded except for an rcu_read_lock().

The rcu_read_lock() protects against freeing of the inode, but it does
not protect freeing of the inode_security_struct.  Now if the freeing of
the i_security happens with a call_rcu(), and the i_security field of
the inode is not changed (it gets freed as the inode gets freed) then
there will be no issue here.  (Linus Torvalds suggested not setting the
field to NULL such that we do not need to check if it is NULL in the
permission check).

Note, this is a hack, but it fixes the problem at hand.  A real fix is
to restructure the destroy_inode() to call all the destructor handlers
from the RCU callback.  But that is a major job to do, and requires a
lot of work.  For now, we just band-aid this bug with this fix (it
works), and work on a more maintainable solution in the future.

Link: http://lkml.kernel.org/r/20140109101932.0508dec7@gandalf.local.home
Link: http://lkml.kernel.org/r/20140109182756.17abaaa8@gandalf.local.home
Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Hugh Dickins [Sun, 12 Jan 2014 09:25:21 +0000 (01:25 -0800)]

thp: fix copy_page_rep GPF by testing is_huge_zero_pmd once only

We see General Protection Fault on RSI in copy_page_rep: that RSI is
what you get from a NULL struct page pointer.

  RIP: 0010:[<ffffffff81154955>]  [<ffffffff81154955>] copy_page_rep+0x5/0x10
  RSP: 0000:ffff880136e15c00  EFLAGS: 00010286
  RAX: ffff880000000000 RBX: ffff880136e14000 RCX: 0000000000000200
  RDX: 6db6db6db6db6db7 RSI: db73880000000000 RDI: ffff880dd0c00000
  RBP: ffff880136e15c18 R08: 0000000000000200 R09: 000000000005987c
  R10: 000000000005987c R11: 0000000000000200 R12: 0000000000000001
  R13: ffffea00305aa000 R14: 0000000000000000 R15: 0000000000000000
  FS:  00007f195752f700(0000) GS:ffff880c7fc20000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000093010000 CR3: 00000001458e1000 CR4: 00000000000027e0
  Call Trace:
    copy_user_huge_page+0x93/0xab
    do_huge_pmd_wp_page+0x710/0x815
    handle_mm_fault+0x15d8/0x1d70
    __do_page_fault+0x14d/0x840
    do_page_fault+0x2f/0x90
    page_fault+0x22/0x30

do_huge_pmd_wp_page() tests is_huge_zero_pmd(orig_pmd) four times: but
since shrink_huge_zero_page() can free the huge_zero_page, and we have
no hold of our own on it here (except where the fourth test holds
page_table_lock and has checked pmd_same), it's possible for it to
answer yes the first time, but no to the second or third test.  Change
all those last three to tests for NULL page.

(Note: this is not the same issue as trinity's DEBUG_PAGEALLOC BUG
in copy_page_rep with RSI: ffff88009c422000, reported by Sasha Levin
in https://lkml.org/lkml/2013/3/29/103.  I believe that one is due
to the source page being split, and a tail page freed, while copy
is in progress; and not a problem without DEBUG_PAGEALLOC, since
the pmd_same check will prevent a miscopy from being made visible.)

Fixes: 97ae17497e99 ("thp: implement refcounting for huge zero page")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org # v3.10 v3.11 v3.12
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Ming Lei [Thu, 26 Dec 2013 13:31:37 +0000 (21:31 +0800)]

block: null_blk: fix queue leak inside removing device

When queue_mode is NULL_Q_MQ and null_blk is being removed,
blk_cleanup_queue() isn't called to cleanup queue, so the queue
allocated won't be freed.

This patch calls blk_cleanup_queue() for MQ to drain all pending
requests first and release the reference counter of queue kobject, then
blk_mq_free_queue() will be called in queue kobject's release handler
when queue kobject's reference counter drops to zero.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit | commitdiff | tree

Linus Torvalds [Fri, 10 Jan 2014 23:37:11 +0000 (06:37 +0700)]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
"Famouse last words: "final pull request" :-)

  I'm sending this because Jason Wang's fixes are pretty important

   1) Add missing per-cpu stats initialization to ip6_vti.  Otherwise
      lockdep spits out a call trace.  From Li RongQing.

   2) Fix NULL oops in wireless hwsim, from Javier Lopez

   3) TIPC deferred packet queue unlink must NULL out skb->next to avoid
      crashes.  From Erik Hugne

   4) Fix access to uninitialized buffer in nf_nat netfilter code, from
      Daniel Borkmann

   5) Fix lifetime of ipv6 loopback and SIT tunnel addresses, otherwise
      they basically timeout immediately.  From Hannes Frederic Sowa

   6) Fix DMA unmapping of TSO packets in bnx2x driver, from Michal
      Schmidt

   7) Do not allow L2 forwarding offload via macvtap device, the way
      things are now it will not end up being forwaded at all.  From
      Jason Wang

   8) Fix transmit queue selection via ndo_dfwd_start_xmit(), fixing
      things like applying NETIF_F_LLTX to the wrong device (!!) and
      eliding the proper transmit watchdog handling

   9) qlcnic driver was not updating tx statistics at all, from Manish
      Chopra"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  qlcnic: Fix ethtool statistics length calculation
  qlcnic: Fix bug in TX statistics
  net: core: explicitly select a txq before doing l2 forwarding
  macvlan: forbid L2 fowarding offload for macvtap
  bnx2x: fix DMA unmapping of TSO split BDs
  ipv6: add link-local, sit and loopback address with INFINITY_LIFE_TIME
  bnx2x: prevent WARN during driver unload
  tipc: correctly unlink packets from deferred packet queue
  ipv6: pcpu_tstats.syncp should be initialised in ip6_vti.c
  netfilter: only warn once on wrong seqadj usage
  netfilter: nf_nat: fix access to uninitialized buffer in IRC NAT helper
  NFC: Fix target mode p2p link establishment
  iwlwifi: add new devices for 7265 series
  mac80211: move "bufferable MMPDU" check to fix AP mode scan
  mac80211_hwsim: Fix NULL pointer dereference

commit | commitdiff | tree

Linus Torvalds [Fri, 10 Jan 2014 23:33:03 +0000 (06:33 +0700)]

Merge tag 'xfs-for-linus-v3.13-rc8' of git://oss.sgi.com/xfs/xfs

Pull xfs bugfixes from Ben Myers:
"Here we have a bugfix for an off-by-one in the remote attribute
  verifier that results in a forced shutdown which you can hit with v5
  superblock by creating a 64k xattr, and a fix for a missing
  destroy_work_on_stack() in the allocation worker.

  It's a bit late, but they are both fairly straightforward"

* tag 'xfs-for-linus-v3.13-rc8' of git://oss.sgi.com/xfs/xfs:
  xfs: Calling destroy_work_on_stack() to pair with INIT_WORK_ONSTACK()
  xfs: fix off-by-one error in xfs_attr3_rmt_verify

commit | commitdiff | tree

Linus Torvalds [Fri, 10 Jan 2014 23:26:27 +0000 (06:26 +0700)]

Merge branch 'leds-fixes-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds

Pull LED fix from Bryan Wu:
"Pali Rohár and Pavel Machek reported the LED of Nokia N900 doesn't
work with our latest 3.13-rc6 kernel. Milo fixed the regression here"

* 'leds-fixes-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds:
leds: lp5521/5523: Remove duplicate mutex

commit | commitdiff | tree

Linus Torvalds [Fri, 10 Jan 2014 23:25:02 +0000 (06:25 +0700)]

Merge tag 'pm+acpi-3.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management fixes from Rafael Wysocki:

- Recent commits modifying the lists of C-states in the intel_idle
   driver introduced bugs leading to crashes on some systems.  Two fixes
   from Jiang Liu.

- The ACPI AC driver should receive all types of notifications, but
   recent change made it ignore some of them.  Fix from Alexander Mezin.

- intel_pstate's validity checks for MSRs it depends on are not
   sufficient to catch the lack of support in nested KVM setups, so they
   are extended to cover that case.  From Dirk Brandewie.

- NEC LZ750/LS has a botched up _BIX method in its ACPI tables, so our
   ACPI battery driver needs a quirk for it.  From Lan Tianyu.

- The tpm_ppi driver sometimes leaks memory allocated by
   acpi_get_name().  Fix from Jiang Liu.

* tag 'pm+acpi-3.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  intel_idle: close avn_cstates array with correct marker
  Revert "intel_idle: mark states tables with __initdata tag"
  ACPI / Battery: Add a _BIX quirk for NEC LZ750/LS
  intel_pstate: Add X86_FEATURE_APERFMPERF to cpu match parameters.
  ACPI / TPM: fix memory leak when walking ACPI namespace
  ACPI / AC: change notification handler type to ACPI_ALL_NOTIFY

commit | commitdiff | tree

Linus Torvalds [Fri, 10 Jan 2014 23:23:57 +0000 (06:23 +0700)]

Merge tag 'mfd-fixes-3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes

Pull MFD fix from Samuel Ortiz:
"This is the 2nd MFD pull request for 3.13

  It only contains one fix for the rtsx_pcr driver.  Without it we see a
  kernel panic on some machines, when resuming from suspend to RAM"

* tag 'mfd-fixes-3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes:
  mfd: rtsx_pcr: Disable interrupts before cancelling delayed works

commit | commitdiff | tree

Milo Kim [Tue, 3 Dec 2013 01:21:44 +0000 (17:21 -0800)]

leds: lp5521/5523: Remove duplicate mutex

It can be a problem when a pattern is loaded via the firmware interface.
LP55xx common driver has already locked the mutex in 'lp55xx_firmware_loaded()'.
So it should be deleted.

On the other hand, locks are required in store_engine_load()
on updating program memory.

Reported-by: Pali Rohár <pali.rohar@gmail.com>
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Milo Kim <milo.kim@ti.com>
Signed-off-by: Bryan Wu <cooloney@gmail.com>
Cc: <stable@vger.kernel.org>

commit | commitdiff | tree

Chuansheng Liu [Tue, 7 Jan 2014 08:53:34 +0000 (16:53 +0800)]

xfs: Calling destroy_work_on_stack() to pair with INIT_WORK_ONSTACK()

In case CONFIG_DEBUG_OBJECTS_WORK is defined, it is needed to
call destroy_work_on_stack() which frees the debug object to pair
with INIT_WORK_ONSTACK().

Signed-off-by: Liu, Chuansheng <chuansheng.liu@intel.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 6f96b3063cdd473c68664a190524ed966ac0cd92)

commit | commitdiff | tree

Jie Liu [Wed, 1 Jan 2014 11:28:03 +0000 (19:28 +0800)]

xfs: fix off-by-one error in xfs_attr3_rmt_verify

With CRC check is enabled, if trying to set an attributes value just
equal to the maximum size of XATTR_SIZE_MAX would cause the v3 remote
attr write verification procedure failure, which would yield the back
trace like below:

<snip>
XFS (sda7): Internal error xfs_attr3_rmt_write_verify at line 191 of file fs/xfs/xfs_attr_remote.c
<snip>
Call Trace:
[<ffffffff816f0042>] dump_stack+0x45/0x56
[<ffffffffa0d99c8b>] xfs_error_report+0x3b/0x40 [xfs]
[<ffffffffa0d96edd>] ? _xfs_buf_ioapply+0x6d/0x390 [xfs]
[<ffffffffa0d99ce5>] xfs_corruption_error+0x55/0x80 [xfs]
[<ffffffffa0dbef6b>] xfs_attr3_rmt_write_verify+0x14b/0x1a0 [xfs]
[<ffffffffa0d96edd>] ? _xfs_buf_ioapply+0x6d/0x390 [xfs]
[<ffffffffa0d97315>] ? xfs_bdstrat_cb+0x55/0xb0 [xfs]
[<ffffffffa0d96edd>] _xfs_buf_ioapply+0x6d/0x390 [xfs]
[<ffffffff81184cda>] ? vm_map_ram+0x31a/0x460
[<ffffffff81097230>] ? wake_up_state+0x20/0x20
[<ffffffffa0d97315>] ? xfs_bdstrat_cb+0x55/0xb0 [xfs]
[<ffffffffa0d9726b>] xfs_buf_iorequest+0x6b/0xc0 [xfs]
[<ffffffffa0d97315>] xfs_bdstrat_cb+0x55/0xb0 [xfs]
[<ffffffffa0d97906>] xfs_bwrite+0x46/0x80 [xfs]
[<ffffffffa0dbfa94>] xfs_attr_rmtval_set+0x334/0x490 [xfs]
[<ffffffffa0db84aa>] xfs_attr_leaf_addname+0x24a/0x410 [xfs]
[<ffffffffa0db8893>] xfs_attr_set_int+0x223/0x470 [xfs]
[<ffffffffa0db8b76>] xfs_attr_set+0x96/0xb0 [xfs]
[<ffffffffa0db13b2>] xfs_xattr_set+0x42/0x70 [xfs]
[<ffffffff811df9b2>] generic_setxattr+0x62/0x80
[<ffffffff811e0213>] __vfs_setxattr_noperm+0x63/0x1b0
[<ffffffff81307afe>] ? evm_inode_setxattr+0xe/0x10
[<ffffffff811e0415>] vfs_setxattr+0xb5/0xc0
[<ffffffff811e054e>] setxattr+0x12e/0x1c0
[<ffffffff811c6e82>] ? final_putname+0x22/0x50
[<ffffffff811c708b>] ? putname+0x2b/0x40
[<ffffffff811cc4bf>] ? user_path_at_empty+0x5f/0x90
[<ffffffff811bdfd9>] ? __sb_start_write+0x49/0xe0
[<ffffffff81168589>] ? vm_mmap_pgoff+0x99/0xc0
[<ffffffff811e07df>] SyS_setxattr+0x8f/0xe0
[<ffffffff81700c2d>] system_call_fastpath+0x1a/0x1f

Tests:
setfattr -n user.longxattr -v `perl -e 'print "A"x65536'` testfile

This patch fix it to check the remote EA size is greater than the
XATTR_SIZE_MAX rather than more than or equal to it, because it's
valid if the specified EA value size is equal to the limitation as
per VFS setxattr interface.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
(cherry picked from commit 85dd0707f0cad26d60f2dc574d17a5ab948d10f7)

commit | commitdiff | tree

Shahed Shaikh [Thu, 9 Jan 2014 17:41:05 +0000 (12:41 -0500)]

qlcnic: Fix ethtool statistics length calculation

o Consider number of Tx queues while calculating the length of
Tx statistics as part of ethtool stats.
o Calculate statistics lenght properly for 82xx and 83xx adapter

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Manish Chopra [Thu, 9 Jan 2014 17:41:04 +0000 (12:41 -0500)]

qlcnic: Fix bug in TX statistics

o Driver was not updating TX stats so it was not populating
statistics in `ifconfig` command output.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jason Wang [Fri, 10 Jan 2014 08:18:26 +0000 (16:18 +0800)]

net: core: explicitly select a txq before doing l2 forwarding

Currently, the tx queue were selected implicitly in ndo_dfwd_start_xmit(). The
will cause several issues:

- NETIF_F_LLTX were removed for macvlan, so txq lock were done for macvlan
  instead of lower device which misses the necessary txq synchronization for
  lower device such as txq stopping or frozen required by dev watchdog or
  control path.
- dev_hard_start_xmit() was called with NULL txq which bypasses the net device
  watchdog.
- dev_hard_start_xmit() does not check txq everywhere which will lead a crash
  when tso is disabled for lower device.

Fix this by explicitly introducing a new param for .ndo_select_queue() for just
selecting queues in the case of l2 forwarding offload. netdev_pick_tx() was also
extended to accept this parameter and dev_queue_xmit_accel() was used to do l2
forwarding transmission.

With this fixes, NETIF_F_LLTX could be preserved for macvlan and there's no need
to check txq against NULL in dev_hard_start_xmit(). Also there's no need to keep
a dedicated ndo_dfwd_start_xmit() and we can just reuse the code of
dev_queue_xmit() to do the transmission.

In the future, it was also required for macvtap l2 forwarding support since it
provides a necessary synchronization method.

Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: e1000-devel@lists.sourceforge.net
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Jason Wang [Fri, 10 Jan 2014 08:18:25 +0000 (16:18 +0800)]

macvlan: forbid L2 fowarding offload for macvtap

L2 fowarding offload will bypass the rx handler of real device. This will make
the packet could not be forwarded to macvtap device. Another problem is the
dev_hard_start_xmit() called for macvtap does not have any synchronization.

Fix this by forbidding L2 forwarding for macvtap.

Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

David S. Miller [Fri, 10 Jan 2014 18:21:22 +0000 (13:21 -0500)]

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless

John W. Linville says:

====================
For the mac80211 bits, Johannes says:

"I have a fix from Javier for mac80211_hwsim when used with wmediumd
userspace, and a fix from Felix for buffering in AP mode."

For the NFC bits, Samuel says:

"This pull request only contains one fix for a regression introduced with
commit e29a9e2ae165620d. Without this fix, we can not establish a p2p link
in target mode. Only initiator mode works."

For the iwlwifi bits, Emmanuel says:

"It only includes new device IDs so it's not vital. If you have a pull
request to net.git anyway, I'd happy to have this in."
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Michal Schmidt [Thu, 9 Jan 2014 13:36:27 +0000 (14:36 +0100)]

bnx2x: fix DMA unmapping of TSO split BDs

bnx2x triggers warnings with CONFIG_DMA_API_DEBUG=y:

  WARNING: CPU: 0 PID: 2253 at lib/dma-debug.c:887 check_unmap+0xf8/0x920()
  bnx2x 0000:28:00.0: DMA-API: device driver frees DMA memory with
  different size [device address=0x00000000da2b389e] [map size=1490 bytes]
  [unmap size=66 bytes]

The reason is that bnx2x splits a TSO BD into two BDs (headers + data)
using one DMA mapping for both, but it uses only the length of the first
BD when unmapping.

This patch fixes the bug by unmapping the whole length of the two BDs.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Linus Torvalds [Fri, 10 Jan 2014 08:57:23 +0000 (15:57 +0700)]

Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux

Pull clock fixes from Mike Turquette:
"Late fixes for clock drivers.  All of these fixes are for user-visible
  regressions, typically boot failures or other unsafe system
  configuration that causes badness"

* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux:
  clk: clk-divider: fix divisor > 255 bug
  clk: exynos: File scope reg_save array should depend on PM_SLEEP
  clk: samsung: exynos5250: Add CLK_IGNORE_UNUSED flag for the sysreg clock
  ARM: dts: exynos5250: Fix MDMA0 clock number
  clk: samsung: exynos5250: Add MDMA0 clocks
  clk: samsung: exynos5250: Fix ACP gate register offset
  clk: exynos5250: fix sysmmu_mfc{l,r} gate clocks
  clk: samsung: exynos4: Correct SRC_MFC register

commit | commitdiff | tree

Linus Torvalds [Fri, 10 Jan 2014 08:54:49 +0000 (15:54 +0700)]

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
"A few fixes for Renesas platforms to fixup DMA masks (this started
  causing errors once the DMA API added checks for valid masks in 3.13)"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: shmobile: mackerel: Fix coherent DMA mask
  ARM: shmobile: kzm9g: Fix coherent DMA mask
  ARM: shmobile: armadillo: Fix coherent DMA mask

commit | commitdiff | tree

Hannes Frederic Sowa [Wed, 8 Jan 2014 14:43:22 +0000 (15:43 +0100)]

ipv6: add link-local, sit and loopback address with INFINITY_LIFE_TIME

In the past the IFA_PERMANENT flag indicated, that the valid and preferred
lifetime where ignored. Since change fad8da3e085ddf ("ipv6 addrconf: fix
preferred lifetime state-changing behavior while valid_lft is infinity")
we honour at least the preferred lifetime on those addresses. As such
the valid lifetime gets recalculated and updated to 0.

If loopback address is added manually this problem does not occur.
Also if NetworkManager manages IPv6, those addresses will get added via
inet6_rtm_newaddr and thus will have a correct lifetime, too.

Reported-by: François-Xavier Le Bail <fx.lebail@yahoo.com>
Reported-by: Damien Wyart <damien.wyart@gmail.com>
Fixes: fad8da3e085ddf ("ipv6 addrconf: fix preferred lifetime state-changing behavior while valid_lft is infinity")
Cc: Yasushi Asano <yasushi.asano@jp.fujitsu.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Yuval Mintz [Tue, 7 Jan 2014 10:07:41 +0000 (12:07 +0200)]

bnx2x: prevent WARN during driver unload

Starting with commit 80c33dd "net: add might_sleep() call to napi_disable"
bnx2x fails the might_sleep tests causing a stack trace to appear whenever
the driver is unloaded, as local_bh_disable() is being called before
napi_disable().

This changes the locking schematics related to CONFIG_NET_RX_BUSY_POLL,
preventing the need for calling local_bh_disable() and thus eliminating
the issue.

Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

commit | commitdiff | tree

Rafael J. Wysocki [Fri, 10 Jan 2014 02:08:58 +0000 (03:08 +0100)]

Merge branch 'pm-cpuidle'

* pm-cpuidle:
intel_idle: close avn_cstates array with correct marker
Revert "intel_idle: mark states tables with __initdata tag"

Beck SC1x5 Kernel

RSS Atom