Mitko Haralanov [Tue, 8 Mar 2016 19:14:48 +0000 (11:14 -0800)]
IB/hfi1: Notify remove MMU/RB callback of calling context
Tell the remove MMU/RB callback if it's being called as
part of a memory invalidation or not. This can be important
in preventing a deadlock if the remove callback attempts to
take the map_sem semaphore because the kernel's MMU
invalidation functions have already taken it.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:14:42 +0000 (11:14 -0800)]
IB/hfi1: Remove the use of add/remove RB function pointers
The usage of function pointers for RB node insertion
and removal in the expected receive code path was
meant to be a small performance optimization. However,
maintaining it, especially with the new MMU API, would
become more troublesome as the API is extended.
Since the performance optimization is minor, remove the
function pointers and replace with direct calls.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:14:36 +0000 (11:14 -0800)]
IB/hfi1: Allow remove MMU callbacks to free nodes
In order to allow the remove MMU callbacks to free the
RB nodes, it is necessary to prevent any references to
the nodes after the remove callback has been called.
Therefore, remove the node from the tree prior to calling
the callback. In other words, the MMU/RB API now guarantees
that all RB node operations it performs will be done prior
to calling the remove callback and that the RB node will
not be touched afterwards.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:14:25 +0000 (11:14 -0800)]
IB/hfi1: Allow MMU function execution in IRQ context
Future users of the MMU/RB functions might be searching or
manipulating the MMU RB trees in interrupt context. Therefore,
the MMU/RB functions need to be able to run in interrupt
context. This requires that we use the IRQ-aware API for
spin locks.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Mitko Haralanov [Tue, 8 Mar 2016 19:14:20 +0000 (11:14 -0800)]
IB/hfi1: Re-factor MMU notification code
The MMU notification code added to the
expected receive side has been re-factored and
split into it's own file. This was done in
order to make the code more general and, therefore,
usable by other parts of the driver.
The caching behavior remains the same. However,
the handling of the RB tree (insertion, deletions,
and searching) as well as the MMU invalidation
processing is now handled by functions in the
mmu_rb.[ch] files.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
The adaptive pio heuristic missed a case that causes a corrupted
packet on the wire.
The case is if SDMA egress had been chosen for a pio-able packet and
then encountered a ring space wait, the packet is queued. The sge
cursor had been incremented as part of the packet build out for SDMA.
After the send engine restart, the heuristic might now chose pio based
on the sdma count being zero and start the mmio copy using the already
incremented sge cursor.
Fix this by forcing SDMA egress when the SDMA descriptor has already
been built.
Additionally, the code to wait for a QPs pio count to zero when
switching to SDMA was missing. Add it.
There is also an issue with UD QPs, in that the different SLs can pick
a different egress send context. For now, just insure the UD/GSI
always go through SDMA.
There is a timing hole if there had been greater than
PIO_WAIT_BATCH_SIZE waiters. This code will dispatch the first
batch but leave the others in the queue. If the restarted waiters
don't in turn wait on a buffer, there is a hang.
Fix by forcing a return when the QP queue is non-empty.
The current LED beaconing code is unclear and uses the timer handler to
turn off the timer. This patch simplifies the code by removing the
special semantics of timeon = timeoff = 0 being interpreted as a request
to turn off the beaconing.
Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Kaike Wan [Sat, 5 Mar 2016 16:50:49 +0000 (08:50 -0800)]
IB/hfi1: Don't call cond_resched in atomic mode when sending packets
This patch fixed the problem where the driver might reschedule in atomic
mode when sending packets. This is due to the fact that the call to
cond_resched() in hfi1_do_send() might occur in atomic mode and a check is
required to avoid the warning message:
"kernel: BUG: scheduling while atomic: swapper/2/0/0x10000100."
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:50:43 +0000 (08:50 -0800)]
IB/hfi1: Add adaptive cacheless verbs copy
The kernel memcpy is faster than a cacheless copy. However,
if too much of the L3 cache is overwritten by one-time copies
then overall bandwidth suffers. Implement an adaptive scheme
where full page copies are tracked and if the number of unique
entries are larger than a threshold, verbs will use a cacheless
copy. Tracked entries are gradually cleaned, allowing memcpy to
resume once the larger copies have stopped.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Sat, 5 Mar 2016 16:50:38 +0000 (08:50 -0800)]
IB/hfi1: Handle host handshake timeout
Host handshake timeout can occur during the verify capability
state. This is a LNI related failure and should be
handled in the same way as other LNI failures.
Reviewed-by: Dean Luick <dean.luick@intel.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:50:27 +0000 (08:50 -0800)]
IB/hfi1: Hold i2c resource across debugfs open/close
External i2c firmware updates are done in multiple steps and
cannot have other things done in between. For debugfs files,
acquire the resource on open and release it on close.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:50:06 +0000 (08:50 -0800)]
IB/hfi1: Change QSFP functions to use resource reservation
Remove the mutex guarding each operation in favor the ASIC
resource acquire/release. Push the resource acquire/release,
above each operation call to allow exclusive access across
multiple operations.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:49:50 +0000 (08:49 -0800)]
IB/hfi1: Add ASIC resource reservation functions
The ASIC block is a shared hardware resource between two devices
on the chip. Add functions to acquire and release these resources
in a way that is safe for both multiple users on the same OS
and multiple users on different OSes, while holding the hardware
mutex as little as possible.
Reservations are noted in a scratch register in the shared region.
There are two types of reservations: per-HFI dynamic and permanent.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Sat, 5 Mar 2016 16:49:39 +0000 (08:49 -0800)]
IB/hfi1: Remove ASIC block clear
The ASIC block is shared between two HFIs. Individual devices
should not initialize registers there. Retain the power-on values.
Individual users set registers as needed with one exception.
Clear sbus fast mode on "slow" calls.
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Harish Chegondi [Sat, 5 Mar 2016 16:49:29 +0000 (08:49 -0800)]
IB/hfi1: Move constant to the right in bitwise operations
Implement changes recommended by the Coccinelle tool to move constant to
the right in bitwise operations
-bash-4.2$ make coccicheck MODE=report M=drivers/infiniband/hw/hfi1/
drivers/infiniband/hw/hfi1/pio.c:765:4-16: Move constant to right.
drivers/infiniband/hw/hfi1/rc.c:2503:19-29: Move constant to right.
drivers/infiniband/hw/hfi1/chip.c:9813:11-22: Move constant to right.
drivers/infiniband/hw/hfi1/chip.c:14468:29-40: Move constant to right.
Reviewed-by: Jubin John <jubin.john@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Harish Chegondi [Sat, 5 Mar 2016 16:49:24 +0000 (08:49 -0800)]
IB/hfi1: Add the break statement that was removed in an earlier patch
The break statement was unintentionally removed in this patch
commit 41ca419abc0ca7ee65d765408cdc1a7fed2897a3
("staging/rdma/hfi1: Remove hfi1 MR and hfi1 specific qp type")
Jubin John [Fri, 26 Feb 2016 21:33:33 +0000 (13:33 -0800)]
staging/rdma/hfi1: Fix memory leaks
Fix 3 memory leaks reported by the LeakCheck tool in the KEDR framework.
The following resources were allocated memory during their respective
initializations but not freed during cleanup:
1. SDMA map elements
2. PIO map elements
3. HW send context to SW index map
This patch fixes the memory leaks by freeing the allocated memory in the
cleanup path.
Reviewed-by: Dean Luick <dean.luick@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Easwar Hariharan [Fri, 26 Feb 2016 21:33:28 +0000 (13:33 -0800)]
staging/rdma/hfi1: Fix reporting of LED status in Get(LedInfo) and Get(PortInfo)
The LedInfo SMA attribute is redefined to control the LED beaconing
state machine instead of the LED directly. In accordance, we now
return the state of LED beaconing, represented by whether the beaconing
timer is active, instead of the state of the LED itself for SMA queries
Get(LedInfo) and Get(PortInfo). While we are at it, we fix the beaconing
timer control code so that the state of the timer is accurately updated.
Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch tests the interrupt registers when the driver has no access to
its upstream component. In this case, it is highly likely that it is
running in a virtual machine (eg, Qemu-kvm guest). If the interrupt
registers are not mapped properly by the virtual machine monitor, an
error message will be printed and the probing will be terminated. This
will help the user identify the issue. On the other hand, if the driver
is running in a host or has access to its upstream component in some
other VM, it will do nothing.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Kaike Wan [Fri, 26 Feb 2016 21:33:18 +0000 (13:33 -0800)]
staging/rdma/hfi1: Avoid using upstream component if it is not accessible
When the hfi1 device is assigned to a VM (eg KVM), the hfi1 driver has
no access to the upstream component and therefore cannot use it to perform
some operations, such as secondary bus reset. As a result, the hfi1 driver
cannot perform the pcie Gen3 transition. Instead, those operation should
be done in the host environment, preferrably done during the Option ROM
initialization. Similarly, the hfi1 driver cannot support ASPM and tune
the pcie capability under this circumstance.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jianxin Xiong [Fri, 26 Feb 2016 21:33:13 +0000 (13:33 -0800)]
staging/rdma/hfi1: Fix header size calculation for RC/UC QPs with GRH enabled
There is a header size counter in both the QP struture and the txreq
structure. The counter in the txreq structure is not updated properly
for RC and UC queue pairs with GRH enabled, and thus causing SDMA
send to fail. This patch fixes the RC and UC path.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Dean Luick [Thu, 18 Feb 2016 19:12:25 +0000 (11:12 -0800)]
staging/rdma/hfi1: Fix debugfs access race
Debugfs access races with the driver being ready. Make sure the
driver is ready before debugfs files appear and debufs files are
gone before the driver starts tearing down.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Thu, 18 Feb 2016 19:11:59 +0000 (11:11 -0800)]
staging/rdma/hfi1: fix 0-day syntax error
Setting CONFIG_HFI1_DEBUG_SDMA_ORDER causes a syntax error:
sdma.c: In function ‘complete_tx’:
sdma.c:370: error: ‘txp’ undeclared (first use in
this function)
sdma.c:370: error: (Each undeclared identifier is reported only once
sdma.c:370: error: for each function it appears in.)
Adjust code under ifdef to reference the tx properly.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:22:09 +0000 (20:22 -0800)]
staging/rdma/hfi1: Remove else after break
Remove else after break to fix checkpatch warning:
WARNING: else is not generally useful after a break or return
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:22:00 +0000 (20:22 -0800)]
staging/rdma/hfi1: Add braces on all arms of statement
Add braces on all arms of statements to fix checkpatch check:
CHECK: braces {} should be used on all arms of this statement
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:21:52 +0000 (20:21 -0800)]
staging/rdma/hfi1: Fix code alignment
Fix code alignment to fix checkpatch check:
CHECK: Alignment should match open parenthesis
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:21:43 +0000 (20:21 -0800)]
staging/rdma/hfi1: Fix block comments
Fix block comments with proper formatting to fix checkpatch warnings:
WARNING: Block comments use * on subsequent lines
WARNING: Block comments use a trailing */ on a separate line
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:21:34 +0000 (20:21 -0800)]
staging/rdma/hfi1: Add comment for spinlock_t definition
Add comments describing the spinlock for spinlock_t definitions to
fix checkpatch check:
CHECK: spinlock_t definition without comment
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:21:26 +0000 (20:21 -0800)]
staging/rdma/hfi1: Remove void function return statement
Remove return statement at the end of a void function to fix
checkpatch warning:
WARNING: void function return statements are not generally useful
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:21:16 +0000 (20:21 -0800)]
staging/rdma/hfi1: Use pointer instead of struct name
Use sizeof(*p) instead of sizeof(struct foo) to fix checkpatch check:
CHECK: Prefer alloc(sizeof(*p)...) over alloc(sizeof(struct foo)...)
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:21:07 +0000 (20:21 -0800)]
staging/rdma/hfi1: Remove CamelCase
Remove CamelCase to fix checkpatch check:
CHECK: Avoid CamelCase: <PLS_CONFIGPHY_ESTcOMM_LOCAL_COMPLETE>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:20:58 +0000 (20:20 -0800)]
staging/rdma/hfi1: Fix misspellings
Fix misspelled word based on checkpatch check:
CHECK: 'ffoo' may be misspelled - perhaps 'foo'?
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:20:50 +0000 (20:20 -0800)]
staging/rdma/hfi1: Split multiple assignments
Split multiple assignments into individual assignments to fix
checkpatch check:
CHECK: multiple assignments should be avoided
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:20:42 +0000 (20:20 -0800)]
staging/rdma/hfi1: Use BIT_ULL macro
Use BIT_ULL macro to fix checkpatch check:
CHECK: Prefer using the BIT_ULL macro
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:20:33 +0000 (20:20 -0800)]
staging/rdma/hfi1: Remove unnecessary parentheses
Remove unnecessary parentheses around addressof single $Lvals to fix
checkpatch check:
CHECK: Unnecessary parentheses around $var
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:20:25 +0000 (20:20 -0800)]
staging/rdma/hfi1: Add blank link after declarations
Add blank line after declarations to fix checkpatch check:
CHECK: Please use a blank line after function/struct/union/enum
declarations
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:20:15 +0000 (20:20 -0800)]
staging/rdma/hfi1: Fix logical continuations
Move logical continuations to previous line to fix checkpatch check:
CHECK: Logical continuations should be on the previous line
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:20:06 +0000 (20:20 -0800)]
staging/rdma/hfi1: Remove blank line before close brace
Remove extra blank line before close brace to fix checkpatch check:
CHECK: Blank lines aren't necessary before a close brace '}'
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:19:58 +0000 (20:19 -0800)]
staging/rdma/hfi1: Remove blank line after an open brace
Remove blank line after an open brace to fix checkpatch check:
CHECK: Blank lines aren't necessary after an open brace '{'
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:19:49 +0000 (20:19 -0800)]
staging/rdma/hfi1: Fix comparison to NULL
Convert pointer comparisons to NULL to !pointer
to fix checkpatch check:
CHECK: Comparison to NULL could be written "!pointer"
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:19:41 +0000 (20:19 -0800)]
staging/rdma/hfi1: Remove space after cast
Remove the space after a cast to fix checkpatch check:
CHECK: No space is necessary after a cast
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:19:32 +0000 (20:19 -0800)]
staging/rdma/hfi1: Remove multiple blank lines
Remove multiple blank lines to fix checkpatch check:
CHECK: Please don't use multiple blank lines
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Jubin John [Mon, 15 Feb 2016 04:19:24 +0000 (20:19 -0800)]
staging/rdma/hfi1: Add spaces around binary operators
Add spaces around binary operators.
Fixes checkpatch check:
CHECK: spaces preferred around that 'x'
where x is a binary operator
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Sun, 14 Feb 2016 20:45:53 +0000 (12:45 -0800)]
staging/rdma/hfi: fix CQ completion order issue
The current implementation of the sdma_wait variable
has a timing hole that can cause a completion Q entry
to be returned from a pio send prior to an older
sdma packets completion queue entry.
The sdma_wait variable used to be decremented prior to
calling the packet complete routine. The hole is between decrement
and the verbs completion where send engine using pio could return
a out of order completion in that window.
This patch closes the hole by allowing an API option to
specify an sdma_drained callback. The atomic dec
is positioned after the complete callback to avoid the
window as long as the pio path doesn't execute when
there is a non-zero sdma count.
Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
The non-rdamvt versions of qib and hfi1 allow for a differing
heuristic to override a schedule progress in favor of a direct
call the the progress routine.
This patch adds that for both drivers and rdmavt.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Sun, 14 Feb 2016 20:45:36 +0000 (12:45 -0800)]
staging/rdma/hfi1: Adaptive PIO for short messages
The change requires a new pio_busy field in the iowait structure to
track the number of outstanding pios. The new counter together
with the sdma counter serve as the basis for a packet by packet decision
as to which egress mechanism to use. Since packets given to different
egress mechanisms are not ordered, this scheme will preserve the order.
The iowait drain/wait mechanisms are extended for a pio case. An
additional qp wait flag is added for the PIO drain wait case.
Currently the only pio wait is for buffers, so the no_bufs_available()
routine name is changed to pio_wait() and a third argument is passed
with one of the two pio wait flags to generalize the routine. A module
parameter is added to hold a configurable threshold. For now, the
module parameter is zero.
A heuristic routine is added to return the func pointer of the proper
egress routine to use.
The heuristic is as follows:
- SMI always uses pio
- GSI,UD qps <= threshold use pio
- UD qps > threadhold use sdma
o No coordination with sdma is required because order is not required
and this qp pio count is not maintained for UD
- RC/UC ONLY packets <= threshold chose as follows:
o If sdmas pending, use SDMA
o Otherwise use pio and enable the pio tracking count at
the time the pio buffer is allocated
- RC/UC ONLY packets > threshold use SDMA
o If pio's are pending the pio_wait with the new wait flag is
called to delay for pios to drain
The threshold is potentially reduced by the QP's mtu.
The sc_buffer_alloc() has two additional args (a callback, a void *)
which are exploited by the RC/UC cases to pass a new complete routine
and a qp *.
When the shadow ring completes the credit associated with a packet,
the new complete routine is called. The verbs_pio_complete() will then
decrement the busy count and trigger any drain waiters in qp destroy
or reset.
Reviewed-by: Jubin John <jubin.john@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Sun, 14 Feb 2016 20:45:18 +0000 (12:45 -0800)]
staging/rdma/hfi1: fix panic in send engine
The send engine wasn't correctly handling
pre-built packets, and worse, the pointer to
a packet state's txreq wasn't initialized correctly.
To fix:
- all waiters need to save any prebuilt packets
(smda waits already did)
- the progress routine needs to handle a QPs prebuilt packet
and initialize the txreq pointer properly
To keep SDMA working, the dma send code needs to see if
a packet has been built already. If not the code will build
it.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
staging/rdma/hfi1: Remove header memcpy from sdma send path.
Instead of writing the header into a buffer then copying it into another
buffer to be sent, remove that memcpy and instead build the header directly
into the tx request that will be sent.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Vennila Megavannan <vennila.megavannan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Sun, 14 Feb 2016 20:44:34 +0000 (12:44 -0800)]
staging/rdma/hfi1: move txreq header code
The patch separates the txreq defines into new files, one for
verbs and one for sdma.
The verbs_txreq implementation handles the setup and teardown
of the txreq cache, so the register routine is changed to call
the new init/exit routines.
This patch allows for followup patches enhance the send engine.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Harish Chegondi [Sun, 14 Feb 2016 20:11:28 +0000 (12:11 -0800)]
IB/qib: Destroy SMI AH before de-allocating the protection domain
If SMI AH is not destroyed before de-allocating the PD, it would result in
non-zero PD use count when de-allocating the PD, triggering a WARN_ON() at
drivers/infiniband/core/verbs.c:284 ib_dealloc_pd+0x69/0xb0 [ib_core]()
when unloading the qib driver on systems with dual-port card.
This problem has always been there in qib and was detected only after the
commit 7dd78647a2c2 ("IB/core: Make ib_dealloc_pd return void") introduced
a WARN_ON in ib_dealloc_pd() that triggers if a PD's use count is non-zero
before de-allocating the PD.
Remove exported functions which are no longer required as the
functionality has moved into rdmavt. This also requires re-ordering some
of the functions since their prototype no longer appears in a header
file. Rather than add forward declarations it is just cleaner to
re-order some of the functions.
Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Initially it was intended that rdmavt would support some signaling
between the underlying driver and itself. However this turned out to be
unnecessary for qib and hfi1. If we need to add something like this in
later to support another driver we should do it then. As of now this
essentially dead code so remove it.
Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
While hfi1 and qib were still supporting bits and pieces of core verbs
components there needed to be a way to convey if rdmavt should handle
allocation and initialize of resources like the queue pair table. Now
that all of this is moved into rdmavt there is no need for these flags.
They are no longer used in the drivers.
Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
IB/qib: Setup notify free/create mad agent callbacks for rdmavt
Qib needs to be notified when mad agents are created and freed, there is
some counter maintenance that needs to be performed. Add those callbacks at
registration time with rdmavt.
Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
For each verb validate that all requirements for driver callbacks are met.
If a function is called without checking for a valid pointer, it is a
required function. Also document what each callback function does.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
IB/rdmavt: Clean up comments and add more documentation
Add, remove, and otherwise clean up existing comments that are leftover
from the initial code postings of rdmavt. Many of the comments were added
to provide an idea on the direction we were thinking of going. Now that the
design is solidified make a pass over and clean everything up. Also add
details where lacking.
Ensure all non static functions have nano comments.
Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Kaike Wan [Sun, 14 Feb 2016 20:10:20 +0000 (12:10 -0800)]
staging/rdma/hfi1: Put QPs into error state after SL->SC table changes
If an SL->SC mapping table change occurs after an RC/UC QP is created,
there is no mechanism to change the SC nor the VL for that QP. The fix
is to place the QP into error state so that ULP can recreate the QP with
the new SL->SC mapping.
Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Sun, 14 Feb 2016 20:10:04 +0000 (12:10 -0800)]
IB/qib, staging/rdma/hfi1: add s_hlock for use in post send
This patch adds an additional lock to reduce contention on the s_lock.
This lock is used in post_send() so that the post_send is not
serialized with the send engine and other send related processing.
To do this the s_next_psn is now maintained on post_send() while
post_send() related fields are moved to a new cache line. There is
an s_avail maintained for the post_send() to mitigate trading cache
lines with the send engine. The lock is released/acquired around
releasing the just built packet to the egress mechanism.
Reviewed-by: Jubin John <jubin.john@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Harish Chegondi [Sun, 14 Feb 2016 20:09:55 +0000 (12:09 -0800)]
IB/qib: Rename several functions by adding a "qib_" prefix
This would avoid conflict with the functions in hfi1 that have similar
names when both qib and hfi1 drivers are configured to be built into
the kernel. This issue came up in the 0-day build report.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
IB/rdmavt, staging/rdma/hfi1: use qps to dynamically scale timeout value
A busy_jiffies variable is maintained and updated when rc qps are
created and deleted. busy_jiffies is a scaled value of the number
of rc qps in the device. busy_jiffies is incremented every rc qp
scaling interval. busy_jiffies is added to the rc timeout
in add_retry_timer and mod_retry_timer. The rc qp scaling interval
is selected based on extensive performance evaluation of targeted
workloads.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Vennila Megavannan <vennila.megavannan@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
staging/rdma/hfi1: actually use new RNR timer API in loopback path
The patch series which added a new API for the RNR timer did not include an
updated call in the loopback path. RC/UC RNR loopback would be broken
without this.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
staging/rdma/hfi1: Tune for unknown channel if configuration file is absent
Currently, the driver fails to tune the SerDes and therefore prevents
link up if the configuration file is missing or fails parsing or
validation. This patch adds a fallback option so that the 8051 is asked
to tune for an unknown channel and possibly get the link up if tuning
succeeds. It also adds a user-friendly message to update the
configuration file if it is out-of-date.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>