scsi: storvsc: Prefer kcalloc over kzalloc with multiply
Use kcalloc for allocating an array instead of kzalloc with multiply,
kcalloc is the preferred API.
Found with checkpatch.
Signed-off-by: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:16:02 +0000 (14:16 -0500)]
scsi: cxlflash: Introduce hardware queue steering
As an enhancement to distribute requests to multiple hardware queues, add the
infrastructure to hash a SCSI command into a particular hardware queue.
Support the following scenarios when deriving which queue to use: single
queue, tagging when SCSI-MQ enabled, and simple hash via CPU ID when SCSI-MQ
is disabled. Rather than altering the existing send API, the derived hardware
queue is stored in the AFU command where it can be used for sending a command
to the chosen hardware queue.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:15:53 +0000 (14:15 -0500)]
scsi: cxlflash: Add hardware queues attribute
As staging for supporting multiple hardware queues, add an attribute to show
and set the current number of hardware queues for the host. Support specifying
a hard limit or a CPU affinitized value. This will allow the number of
hardware queues to be tuned by a system administrator.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Uma Krishnan [Wed, 12 Apr 2017 19:15:42 +0000 (14:15 -0500)]
scsi: cxlflash: Support multiple hardware queues
Introduce multiple hardware queues to improve legacy I/O path performance.
Each hardware queue is comprised of a master context and associated I/O
resources. The hardware queues are initially implemented as a static array
embedded in the AFU. This will be transitioned to a dynamic allocation in a
later series to improve the memory footprint of the driver.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The method used to decode asynchronous interrupts involves unnecessary loops
to match up bits that are set with corresponding entries in the asynchronous
interrupt information table. This algorithm is wasteful and does not scale
well as new status bits are supported.
As an improvement, use the for_each_set_bit() service to iterate over the
asynchronous status bits and refactor the information table such that it can
be indexed by bit position.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:15:20 +0000 (14:15 -0500)]
scsi: cxlflash: Fix warnings/errors
As a general cleanup, address all reasonable checkpatch warnings and
errors. These include enforcement of comment styles and including named
identifiers in function prototypes.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:15:11 +0000 (14:15 -0500)]
scsi: cxlflash: Fix power-of-two validations
Validation statements to enforce assumptions about specific defines are not
being evaluated by the compiler due to the fact that they reside in a routine
that is not used. To activate them, call the routine as part of module
initialization. As an additional, related cleanup, remove the now-defunct
CXLFLASH_NUM_CMDS.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:15:02 +0000 (14:15 -0500)]
scsi: cxlflash: Remove unnecessary DMA mapping
Devices supported by the cxlflash driver are fully coherent and do not require
a bus address mapping. Avoid unnecessary path length by using the virtual
address and length already present in the scatter-gather entry.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:14:51 +0000 (14:14 -0500)]
scsi: cxlflash: Fence EEH during probe
An EEH during probe can lead to a crash as the recovery thread races with the
probe thread. To avoid this issue, introduce new states to fence out EEH
recovery until probe has completed. Also ensure the reset wait queue is
flushed during device removal to avoid orphaned threads.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:14:41 +0000 (14:14 -0500)]
scsi: cxlflash: Support up to 4 ports
Update the driver to allow for future cards with 4 ports.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:14:28 +0000 (14:14 -0500)]
scsi: cxlflash: SISlite updates to support 4 ports
Update the SISlite header to support 4 ports as outlined in the SISlite
specification. Address fallout from structure renames and refreshed
organization throughout the driver. Determine the number of ports supported by
a card from the global port selection mask register reset value.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:14:17 +0000 (14:14 -0500)]
scsi: cxlflash: Hide FC internals behind common access routine
As staging to support FC-related updates to the SISlite specification,
introduce helper routines to obtain references to FC resources that exist
within the global map. This will allow changes to the underlying global map
structure without impacting existing code paths.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:14:05 +0000 (14:14 -0500)]
scsi: cxlflash: Remove port configuration assumptions
At present, the cxlflash driver only supports hardware with two FC ports. The
code was initially designed with this assumption and is dependent on having
two FC ports - adding more ports will break logic within the driver.
To mitigate this issue, remove the existing port assumptions and transition
the code to support more than two ports. As a side effect, clarify the
interpretation of the DK_CXLFLASH_ALL_PORTS_ACTIVE flag.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:13:50 +0000 (14:13 -0500)]
scsi: cxlflash: Support dynamic number of FC ports
Transition from a static number of FC ports to a value that is derived during
probe. For now, a static value is used but this will later be based on the
type of card being configured.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:13:34 +0000 (14:13 -0500)]
scsi: cxlflash: Update sysfs helper routines to pass config structure
As staging for future function, pass the config pointer instead of the AFU
pointer for port-related sysfs helper routines.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:13:20 +0000 (14:13 -0500)]
scsi: cxlflash: Implement IRQ polling for RRQ processing
Currently, RRQ processing takes place on hardware interrupt context. This can
be a heavy burden in some environments due to the overhead encountered while
completing RRQ entries. In an effort to improve system performance, use the
IRQ polling API to schedule this processing on softirq context.
This function will be disabled by default until starting values can be
established for the hardware supported by this driver.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:12:55 +0000 (14:12 -0500)]
scsi: cxlflash: Serialize RRQ access and support offlevel processing
As further staging to support processing the HRRQ by other means, access to
the HRRQ needs to be serialized by a disabled lock. This will allow safe
access in other non-hardware interrupt contexts. In an effort to minimize the
period where interrupts are disabled, support is added to queue up commands
harvested from the RRQ such that they can be processed with hardware
interrupts enabled. While this doesn't offer any improvement with processing
on a hardware interrupt it will help when IRQ polling is supported and the
command completions can execute on softirq context.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Wed, 12 Apr 2017 19:11:44 +0000 (14:11 -0500)]
scsi: cxlflash: Separate RRQ processing from the RRQ interrupt handler
In order to support processing the HRRQ by other means (e.g. polling), the
processing portion of the current RRQ interrupt handler needs to be broken out
into a separate routine. This will allow RRQ processing from places other than
the RRQ hardware interrupt handler.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: hisi_sas: controller reset for multi-bits ECC and AXI fatal errors
For 1 bit ECC errors, those errors can be recovered by hw. But for
multi-bits ECC and AXI errors, there are something wrong with whole
module or system, so try reset the controller to recover those errors
instead of calling panic().
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Mon, 10 Apr 2017 13:22:00 +0000 (21:22 +0800)]
scsi: hisi_sas: fix NULL deference when TMF timeouts
If a TMF timeouts (maybe due to unlikely scenario of an expander being
unplugged when TMF for remote device is active), when we eventually try
to free the slot, we crash as we dereference the slot's task, which has
already been released.
As a fix, add checks in the slot release code for a NULL task.
Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This patch is a workaround for a SoC bug where an internal abort command
may timeout. In v2 hw, the channel should become idle in order to finish
abort process. If the target side has been sending HOLD, host side
channel failed to complete the frame to send, and can not enter the idle
state. Then internal abort command will timeout.
As this issue is only in v2 hw, we deal with it in the hw layer. Our
workaround solution is: If abort is not finished within a certain period
of time, we will check HOLD status. If HOLD has been sending, we will
send break command.
Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiaofei Tan [Mon, 10 Apr 2017 13:21:58 +0000 (21:21 +0800)]
scsi: hisi_sas: workaround SoC about abort timeout bug
This patch adds a workaround solution for a SoC bug which may cause SoC
logic fatal error when disabling a PHY. Then we find internal abort IO
timeout may occur, and the controller IO breakpoint may be corrupted.
We work around this SoC bug by optimizing the flow of disabling a PHY.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiaofei Tan [Mon, 10 Apr 2017 13:21:57 +0000 (21:21 +0800)]
scsi: hisi_sas: workaround a SoC SATA IO processing bug
This patch provides a workaround a SoC bug where SATA IPTTs for
different devices may conflict.
The workaround solution requests the following:
1. SATA device id must be even and not equal to SAS IPTT.
2. SATA device can not share the same IPTT with other SAS or
SATA device.
Besides we shall consider IPTT value 0 is reserved for another SoC bug
(STP device open link at firstly after SAS controller reset).
To sum up, the solution is: Each SATA device uses independent and
continuous 32 even IPTT from 64 to 4094, then v2 hw can only support 63
SATA devices. All SAS device(SSP/SMP devices) share odd IPTT value from
1 to 4095.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiaofei Tan [Mon, 10 Apr 2017 13:21:56 +0000 (21:21 +0800)]
scsi: hisi_sas: workaround STP link SoC bug
After resetting the controller, the process of scanning SATA disks
attached to an expander may fail occasionally. The issue is that the
controller can't close the STP link created by target if the max link
time is 0.
To workaround this issue, we reject STP link after resetting the
controller, and change the corresponding PHY to accept STP link only
after receiving data.
We do this check in cq interrupt handler. In order not to reduce
efficiency, we use an variable to control whether we should check and
change PHY to accept STP link.
The function phys_reject_stp_links_v2_hw() should be called after
resetting the controller.
The solution of another SoC bug "SATA IO timeout", that also uses the
same register to control STP link, is not effective before the PHY
accepts STP link.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Directly call ELS request handler functions in fc_lport_recv_els_req
instead of saving the pointer to the handler's receive function and then
later dereferencing this pointer.
This makes the code a bit more obvious.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 7 Apr 2017 07:34:17 +0000 (09:34 +0200)]
scsi: sg: close race condition in sg_remove_sfp_usercontext()
sg_remove_sfp_usercontext() is clearing any sg requests, but needs to
take 'rq_list_lock' when modifying the list.
Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 7 Apr 2017 07:34:16 +0000 (09:34 +0200)]
scsi: sg: use standard lists for sg_requests
'Sg_request' is using a private list implementation; convert it to
standard lists.
Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: sg: check for valid direction before starting the request
Check for a valid direction before starting the request, otherwise we
risk running into an assertion in the scsi midlayer checking for valid
requests.
[mkp: fixed typo]
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Link: http://www.spinics.net/lists/linux-scsi/msg104400.html Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 7 Apr 2017 07:34:14 +0000 (09:34 +0200)]
scsi: sg: protect accesses to 'reserved' page array
The 'reserved' page array is used as a short-cut for mapping data,
saving us to allocate pages per request. However, the 'reserved' array
is only capable of holding one request, so this patch introduces a mutex
for protect 'sg_fd' against concurrent accesses.
Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 7 Apr 2017 07:34:13 +0000 (09:34 +0200)]
scsi: sg: remove 'save_scat_len'
Unused.
Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Fri, 7 Apr 2017 07:34:12 +0000 (09:34 +0200)]
scsi: sg: disable SET_FORCE_LOW_DMA
The ioctl SET_FORCE_LOW_DMA has never worked since the initial git
check-in, and the respective setting is nowadays handled correctly. So
disable it entirely.
Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Colin Ian King [Thu, 6 Apr 2017 11:19:57 +0000 (12:19 +0100)]
scsi: qla2xxx: remove some redundant pointer assignments
There are several local or function parameter pointers that are being
assigned NULL after a kfree where and these have no effect and hence can
be removed.
Fixes various cppcheck warnings:
"Assignment of function parameter has no effect outside the
function. Did you forget dereferencing it"
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The redundant init_completion() here seems to be a cut&past error as
struct scsi_qla_host only has 4 completion elements to initialize, thus
the duplicate init_completion(disable_acb_comp) is simply removed.
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Thu, 6 Apr 2017 13:36:35 +0000 (15:36 +0200)]
scsi: make asynchronous aborts mandatory
There hasn't been any reports for HBAs where asynchronous abort
would not work, so we should make it mandatory and remove
the fallback.
Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Thu, 6 Apr 2017 13:36:34 +0000 (15:36 +0200)]
scsi: make scsi_eh_scmd_add() always succeed
scsi_eh_scmd_add() currently only will fail if no
error handler thread is started (which will never be the
case) or if the state machine encounters an illegal transition.
But if we're encountering an invalid state transition
chances is we cannot fixup things with the error handler.
So better add a WARN_ON for illegal host states and
make scsi_dh_scmd_add() a void function.
Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Thu, 6 Apr 2017 13:36:33 +0000 (15:36 +0200)]
scsi: make eh_eflags persistent
If a failed command is retried and fails again we need
to enter SCSI EH, otherwise we will never be able to
recover the command.
To detect this situation we must not clear scmd->eh_eflags
when EH finishes but rather make it persistent throughout
the lifetime of the command.
Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We now first try to call ->eh_abort_handler from a work queue, but libsas
was always failing that for no good reason. Allow async aborts.
Reviewed-by: Johannes Thumshirn <jth@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Thu, 6 Apr 2017 13:36:31 +0000 (15:36 +0200)]
scsi: always send command aborts
When a command has timed out we always should be sending an
abort; with the previous code a failed abort might signal
SCSI EH to start, and all other timed out commands will
never be aborted, even though they might belong to a
different ITL nexus.
Cc: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Thu, 6 Apr 2017 13:36:30 +0000 (15:36 +0200)]
scsi: sd: Return SUCCESS in sd_eh_action() after device offline
If sd_eh_action() decides to take the device offline there is
no point in returning FAILED, as taking the device offline
is the ultimate step in SCSI EH anyway.
So further escalation via SCSI EH is not likely to make a
difference and we can as well return SUCCESS.
Cc: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Thu, 6 Apr 2017 13:36:29 +0000 (15:36 +0200)]
scsi: scsi_error: count medium access timeout only once per EH run
The current medium access timeout counter will be increased for
each command, so if there are enough failed commands we'll hit
the medium access timeout for even a single device failure and
the following kernel message is displayed:
sd H:C:T:L: [sdXY] Medium access timeout failure. Offlining disk!
Fix this by making the timeout per EH run, ie the counter will
only be increased once per device and EH run.
Fixes: 18a4d0a ("[SCSI] Handle disk devices which can not process medium access commands") Cc: Ewan Milne <emilne@redhat.com> Cc: Lawrence Obermann <loberman@redhat.com> Cc: Benjamin Block <bblock@linux.vnet.ibm.com> Cc: Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: ses: don't get power status of SES device slot on probe
The commit 08024885a2a3 ("ses: Add power_status to SES device slot")
introduced the 'power_status' attribute to enclosure components and
the associated callbacks.
There are 2 callbacks available to get the power status of a device:
1) ses_get_power_status() for 'struct enclosure_component_callbacks'
2) get_component_power_status() for the sysfs device attribute
(these are available for kernel-space and user-space, respectively.)
However, despite both methods being available to get power status
on demand, that commit also introduced a call to get power status
in ses_enclosure_data_process().
This dramatically increased the total probe time for SCSI devices
on larger configurations, because ses_enclosure_data_process() is
called several times during the SCSI devices probe and loops over
the component devices (but that is another problem, another patch).
That results in a tremendous continuous hammering of SCSI Receive
Diagnostics commands to the enclosure-services device, which does
delay the total probe time for the SCSI devices __significantly__:
Originally, ~34 minutes on a system attached to ~170 disks:
Back to the commit discussion.. on the ses_get_power_status() call
introduced in ses_enclosure_data_process(): impact of removing it.
That may possibly be in place to initialize the power status value
on device probe. However, those 2 functions available to retrieve
that value _do_ automatically refresh/update it. So the potential
benefit would be a direct access of the 'power_status' field which
does not use the callbacks...
But the only reader of 'struct enclosure_component::power_status'
is the get_component_power_status() callback for sysfs attribute,
and it _does_ check for and call the .get_power_status callback,
(which indeed is defined and implemented by that commit), so the
power status value is, again, automatically updated.
So, the remaining potential for a direct/non-callback access to
the power_status attribute would be out-of-tree modules -- well,
for those, if they are for whatever reason interested in values
that are set during device probe and not up-to-date by the time
they need it.. well, that would be curious.
Well, to handle that more properly, set the initial power state
value to '-1' (i.e., uninitialized) instead of '1' (power 'on'),
and check for it in that callback which may do an direct access
to the field value _if_ a callback function is not defined.
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Fixes: 08024885a2a3 ("ses: Add power_status to SES device slot") Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Song Liu <songliubraving@fb.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: Make checking the scsi_device_get() return value mandatory
Now that all scsi_device_get() callers check the return value of this
function, make checking that return value mandatory.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
sas_domain_release_transport is unused since at least v3.13, remove it.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Milan P Gandhi [Fri, 31 Mar 2017 21:37:04 +0000 (14:37 -0700)]
scsi: qla2xxx: Fix typo in driver
Signed-off-by: Milan P Gandhi <mgandhi@redhat.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Arnd Bergmann [Thu, 23 Mar 2017 15:02:18 +0000 (16:02 +0100)]
scsi: advansys: fix uninitialized data access
gcc-7.0.1 now warns about a previously unnoticed access of uninitialized
struct members:
drivers/scsi/advansys.c: In function 'AscMsgOutSDTR':
drivers/scsi/advansys.c:3860:26: error: '*((void *)&sdtr_buf+5)' may be used uninitialized in this function [-Werror=maybe-uninitialized]
((ushort)s_buffer[i + 1] << 8) | s_buffer[i]);
^
drivers/scsi/advansys.c:3860:26: error: '*((void *)&sdtr_buf+7)' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/scsi/advansys.c:3860:26: error: '*((void *)&sdtr_buf+5)' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/scsi/advansys.c:3860:26: error: '*((void *)&sdtr_buf+7)' may be used uninitialized in this function [-Werror=maybe-uninitialized]
The code has existed in this exact form at least since v2.6.12, and the
warning seems correct. This uses named initializers to ensure we
initialize all members of the structure.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The patch introduced redundant query retries as we already had such
mechanism provided with _retry functions. Both ufshcd_read_desc and
ufshcd_read_unit_desc_param functions call ufshcd_query_descriptor_retry
wrapper.
Signed-off-by: Szymon Mielczarek <szymonx.mielczarek@intel.com> Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Tue, 28 Mar 2017 21:40:20 +0000 (16:40 -0500)]
scsi: hpsa: change driver version
Reviewed-by: Gerry Morong <gerry.morong@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace [Tue, 28 Mar 2017 21:40:13 +0000 (16:40 -0500)]
scsi: hpsa: update pci ids
Reviewed-by: Gerry Morong <gerry.morong@microsemi.com> Reviewed-by: Scott Teel <scott.teel@microsemi.com> Signed-off-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Arnd Bergmann [Tue, 28 Mar 2017 14:22:03 +0000 (16:22 +0200)]
scsi: hisi_sas: fix SATA dependency
Removing the 'select SCSI_SAS_LIBSAS' statement in Kconfig resulted in a
link failure in configurations that have hisi_sas built-in but libsas as
a loadable module:
drivers/scsi/built-in.o: In function `hisi_sas_scan_finished':
hisi_sas_main.c:(.text+0x37ce9): undefined reference to `sas_drain_work'
drivers/scsi/built-in.o: In function `hisi_sas_slave_configure':
hisi_sas_main.c:(.text+0x37d17): undefined reference to `sas_slave_configure'
hisi_sas_main.c:(.text+0x37d40): undefined reference to `sas_change_queue_depth'
drivers/scsi/built-in.o: In function `hisi_sas_remove':
All other libsas users have the 'select' statement, so we should do the
same here for consistency. For all I can tell, the patch that added the
sata softreset does not actually introduce a dependency on SCSI_SAS_ATA
but instead adds calls into libata itself, so we can express that with a
more specific dependency.
We cannot have 'select SCSI_SAS_LIBSAS; depends on SCSI_SAS_ATA' as that
would cause a dependency loop.
Fixes: 7c594f0407de ("scsi: hisi_sas: add softreset function for SATA disk") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Tomohiro Kusumi [Tue, 28 Mar 2017 13:49:27 +0000 (16:49 +0300)]
scsi: ufs: add missing macros for register bits from UFSHCI spec
Add macros for register bits that can be found in JESD223C (v2.1).
Not all registers are defined in ufshci.h (i.e. some are unused whether
macros are defined or undefined), but all the bits for those registers
that are already defined should appear here.
Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com> Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Colin Ian King [Tue, 28 Mar 2017 11:12:22 +0000 (12:12 +0100)]
scsi: hisi_sas: add missing break in switch statement
It appears that a break in the TRANS_TX_OPEN_CNX_ERR_NO_DESTINATION case
got accidentally removed in an earlier commit, as it stands, the
ts->stat and ts->open_rej_reason are being updated twice for this case
which looks incorrect. Fix this by adding in the missing break
statement.
Detected by CoverityScan, CID#1422110 ("Missing break in switch")
Fixes: 634a9585f49c7 ("scsi: hisi_sas: process error codes according to their priority") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Jitendra Bhivare [Fri, 24 Mar 2017 08:41:49 +0000 (14:11 +0530)]
scsi: be2iscsi: Update driver version
Version 11.4.0.0
Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Jitendra Bhivare [Fri, 24 Mar 2017 08:41:47 +0000 (14:11 +0530)]
scsi: be2iscsi: Check size before copying ASYNC handle
Data in buffers are gathered into a single buffer before giving to iSCSI
layer. Though less likely to have payload more than 8K in ASYNC PDU, the
data length is provide by FW and check is missing for overrun.
Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Jitendra Bhivare [Fri, 24 Mar 2017 08:41:46 +0000 (14:11 +0530)]
scsi: be2iscsi: Remove free_list for ASYNC handles
With previous patch adding ASYNC Rx buffers to free_list is not
required. Remove all free_list related operations.
Add in_use to track if buffer posted is being processed by driver and
purge all buffers received for connection if found so.
Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Jitendra Bhivare [Fri, 24 Mar 2017 08:41:45 +0000 (14:11 +0530)]
scsi: be2iscsi: Use num_cons field in Rx CQE
FW runs out of buffer if buffers are not posted back soon. ASYNC Rx CQE
indicates that FW has consumed 8 RQEs. Use it to post back buffers
instead of waiting for buffers to be processed and freed by driver.
Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Jitendra Bhivare [Fri, 24 Mar 2017 08:41:44 +0000 (14:11 +0530)]
scsi: be2iscsi: Increase HDQ default queue size
Currently, ASYNC PDU default queue size is set to max connections. This
leaves only one buffer per connection for any ASYNC PDUs from targets.
Double the size of the default queue.
Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Jitendra Bhivare [Fri, 24 Mar 2017 08:41:43 +0000 (14:11 +0530)]
scsi: scsi_transport_iscsi: Use flush_work in iscsi_remove_session
scsi_flush_work flushes workqueue for the Scsi_Host. In iSCSI offload
enabled host, this would wait for all other sessions under the host.
Use flush_work for the session being removed instead.
Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Jitendra Bhivare [Fri, 24 Mar 2017 08:41:42 +0000 (14:11 +0530)]
scsi: be2iscsi: Replace spin_unlock_bh with spin_lock
spin_unlock_bh back_lock is used in beiscsi_eh_device_reset instead of
spin_lock.
Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Jitendra Bhivare [Fri, 24 Mar 2017 08:41:41 +0000 (14:11 +0530)]
scsi: be2iscsi: Fix closing of connection
CID needs to be freed even when invalidate or upload connection fails.
Attempt to close connection 3 times before freeing CID.
Set cleanup_type to INVALIDATE instead of force TCP_RST. This
unnecessarily is terminating connection with reset instead of gracefully
closing it.
Set save_cfg to 0 - session not to be saved on flash.
Add delay and process CQ before uploading connection.
Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
beiscsi_mccq_compl_wait gets called even when MCC tag allocation failed
for mgmt_invalidate_connection. mcc_wait is not initialized for tag 0
so causes crash in prepare_to_wait.
Signed-off-by: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Dan Carpenter [Thu, 23 Mar 2017 10:41:42 +0000 (13:41 +0300)]
scsi: osd_uld: remove an unneeded NULL check
We don't call the remove() function unless probe() succeeds so "oud"
can't be NULL here. Plus, if it were NULL, we dereference it on the
next line so it would crash anyway.
[mkp: applied by hand]
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Boaz Harrosh <ooo@electrozaur.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Brian King [Wed, 15 Mar 2017 21:58:42 +0000 (16:58 -0500)]
scsi: ipr: Driver version 2.6.4
Bump driver version
Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Tested-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Brian King [Wed, 15 Mar 2017 21:58:41 +0000 (16:58 -0500)]
scsi: ipr: Fix SATA EH hang
This patch fixes a hang that can occur in ATA EH with ipr. With ipr's
usage of libata, commands should never end up on ap->eh_done_q. The
timeout function we use for ipr, even for SATA devices, is
scsi_times_out, so ATA_QCFLAG_EH_SCHEDULED never gets set for ipr and EH
is driven completely by ipr and SCSI. The SCSI EH thread ends up calling
ipr's eh_device_reset_handler, which then calls
ata_std_error_handler. This ends up calling ipr_sata_reset, which issues
a reset to the device. This should result in all pending commands
getting failed back and having ata_qc_complete called for them, which
should end up clearing ATA_QCFLAG_FAILED as qc->flags gets zeroed in
ata_qc_free. This ensures that when we end up in ata_eh_finish, we
don't do anything more with the command.
On adapters that only support a single interrupt and when running with
two MSI-X vectors or less, the adapter firmware guarantees that
responses to all outstanding commands are sent back prior to sending the
response to the SATA reset command. On newer adapters supporting
multiple HRRQs, however, this can no longer be guaranteed, since the
command responses and reset response may be processed on different
HRRQs.
If ipr returns from ipr_sata_reset before the outstanding command was
returned, this sends us down the path of __ata_eh_qc_complete which then
moves the associated scsi_cmd from the work_q in
scsi_eh_bus_device_reset to ap->eh_done_q, which then will sit there
forever and we will be wedged.
This patch fixes this up by ensuring that any outstanding commands are
flushed before returning from eh_device_reset_handler for a SATA device.
Reported-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Tested-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Brian King [Wed, 15 Mar 2017 21:58:39 +0000 (16:58 -0500)]
scsi: ipr: Error path locking fixes
This patch closes up some potential race conditions observed in the
error handling paths in ipr while debugging an issue resulting in a hang
with SATA error handling. These patches ensure we are holding the
correct lock when adding and removing commands from the free and pending
queues in some error scenarios.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Tested-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Brian King [Wed, 15 Mar 2017 21:58:39 +0000 (16:58 -0500)]
scsi: ipr: Fix abort path race condition
This fixes a race condition in the error handlomg paths of ipr. While a
command is outstanding to the adapter, it is placed on a pending queue
for the hrrq it is associated with, while holding the HRRQ lock. When a
command is completed, it is removed from the pending queue, under HRRQ
lock, and placed on a local list. This list is then iterated through
without any locks and each command's done function is invoked, inside of
which, the command gets returned to the free list while grabbing the
HRRQ lock. This fixes two race conditions when commands have been
removed from the pending list but have not yet been added to the free
list. Both of these changes fix race conditions that could result in
returning success from eh_abort_handler and then later calling scsi_done
for the same request.
The first race condition is in ipr_cancel_op. It looks through each
pending queue to see if the command to be aborted is still outstanding
or not. Rather than looking on the pending queue, reverse the logic to
check to look for commands that are NOT on the free queue. The second
race condition can occur when in ipr_wait_for_ops where we are waiting
for responses for commands we've aborted.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Tested-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Brian King [Wed, 15 Mar 2017 21:58:37 +0000 (16:58 -0500)]
scsi: ipr: Remove redundant initialization
Removes some code in __ipr_eh_dev_reset which was modifying the ipr_cmd
done function. This should have already been setup at command allocation
time and if its since been changed, it means we are in the ipr_erp*
functions and need to wait for them to complete and don't want to
override that here.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Tested-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Brian King [Wed, 15 Mar 2017 21:58:36 +0000 (16:58 -0500)]
scsi: ipr: Fix missed EH wakeup
Following a command abort or device reset, ipr's EH handlers wait for
the commands getting aborted to get sent back from the adapter prior to
returning from the EH handler. This fixes up some cases where the
completion handler was not getting called, which would have resulted in
the EH thread waiting until it timed out, greatly extending EH time.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Tested-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiaofei Tan [Wed, 22 Mar 2017 17:25:39 +0000 (01:25 +0800)]
scsi: hisi_sas: add is_sata_phy_v2_hw()
Add helper function is_sata_phy_v2_hw() to judge whether the attached
device is SATA disk for a root PHY.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 22 Mar 2017 17:25:38 +0000 (01:25 +0800)]
scsi: hisi_sas: use dev_is_sata to identify SATA or SAS disk
When SMP IO is sent, sas_protocol_ata couldn't judge whether the disk is
SATA or SAS disk. So use dev_is_sata to identify SATA or SAS disk.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unless we actually get some sort of failure in hisi_sas_lu_reset(),
don't print a message.
Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 22 Mar 2017 17:25:36 +0000 (01:25 +0800)]
scsi: hisi_sas: release SMP slot in lldd_abort_task
When an SMP task timeouts, it will call lldd_abort_task to release the
associated slot, and then will release the sas_task.
Currently in lldd_abort_task, if we fail to internally abort IO, then
the slot of SMP IO is not released, but sas_task will still be later
released, so the slot's sas_task is NULL, which will cause NULL pointer
when hisi_sas_slot_task_free happens later.
To resolve, check the return value of internal abort, and release the
slot if it failed.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Wed, 22 Mar 2017 17:25:35 +0000 (01:25 +0800)]
scsi: hisi_sas: add hisi_sas_clear_nexus_ha()
Add function for upper-layer to reset controller when all else fails.
Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Handle the situation that PHY UP and DOWN irq happen simultaneously.
There is no mechanism of SoC HW to ensure this situation will never
happen. So, we add this handle just in case.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 22 Mar 2017 17:25:31 +0000 (01:25 +0800)]
scsi: hisi_sas: process error codes according to their priority
There are some rules to decide which error code has the high priority
when errors happen together:
(1) Error phase of CQ decides the error happens on RX or TX;
(2) For TX error, when DMA/TRANS TX error happen simultaneously, the
priority of DMA TX error is higher than TRANS TX error, so for the
priority of TX error: DW2 (DMA TX part) > DW0;
(3) For RX error, when TRANS/DMA/SIPC RX error happen simultaneously,
the priority of TRANS RX error is higher than DMA and SIPC RX error,
and we should also keep the rules (the priority of DW3 > DW2), so
for the priority of RX error: DW1 > DW3 > DW2(SIPC RX part);
(4) There are also a priority we should keep in the same error type.
So, modify slot error code to handle this.
In addition to this, some some error codes are modified according to
recommendation from SoC designer.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Wed, 22 Mar 2017 17:25:29 +0000 (01:25 +0800)]
scsi: hisi_sas: fix some sas_task.task_state_lock locking
Some more locking needs to be added/modified for when
read-modify-writing sas_task.task_state_flags.
Note: since we can attempt to grab this lock in interrupt
context we should use irq variant of spin_lock.
Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 22 Mar 2017 17:25:28 +0000 (01:25 +0800)]
scsi: hisi_sas: free slots after hardreset
After hardreset, we clear up IOs of remote disks, so we need to free
those slots in LLDD.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Wed, 22 Mar 2017 17:25:27 +0000 (01:25 +0800)]
scsi: hisi_sas: check for SAS_TASK_STATE_ABORTED in slot complete
Check in slot_complete_v2_hw() for whether a task has already been
completed by upper layer.
Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Wed, 22 Mar 2017 17:25:26 +0000 (01:25 +0800)]
scsi: hisi_sas: hardreset for SATA disk in LU reset
When issuing an LU reset for a SATA target, issue an internal abort and
a hard reset.
Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Wed, 22 Mar 2017 17:25:25 +0000 (01:25 +0800)]
scsi: hisi_sas: modify hisi_sas_abort_task() for SSP
Currently an internal abort is executed regardless of the result of the
TMF. We should also check the result of the internal abort to see if we
should free the slot.
So change the status code STAT_IO_COMPLETE to TMF_RESP_FUNC_SUCC,
meaning the slot has been successfully aborted.
Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 22 Mar 2017 17:25:24 +0000 (01:25 +0800)]
scsi: hisi_sas: modify error handling for v2 hw
For error codes which need abort-and-retry, simulate IO timeout and let
SCSI+ATA layers process those errors.
Previously for SSP, we should try to abort the IO in the LLDD and then
pass back to upper layer, but sometimes this would also error. So
Instead of adding special error handling for this scenario in the LLDD,
allow the upper layer to handle completely.
No performance hit is seen by taking this approach.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Wed, 22 Mar 2017 17:25:23 +0000 (01:25 +0800)]
scsi: hisi_sas: only reset link for PHY_FUNC_LINK_RESET
We currently do a hard reset for a link reset. Change this to do a link
reset only.
Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Wed, 22 Mar 2017 17:25:22 +0000 (01:25 +0800)]
scsi: hisi_sas: error hisi_sas_task_prep() when port down
When sas_port is NULL, then return SAS_PHY_DOWN.
In addition, when the sas_dev is gone then explicitly return
SAS_PHY_DOWN.
Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Wed, 22 Mar 2017 17:25:21 +0000 (01:25 +0800)]
scsi: hisi_sas: remove hisi_sas_port_deformed()
Currently when a root PHY is deformed from a asd_sas_port we try to
release the slots in the LLDD, and fail.
Regardless, it is not right to release this early.
This patch removes the deformed function. As it was before, port
deformation is still done in hisi_sas_phy_down().
It would be nice to actually remove the hisi_sas_port_{de}formed() pair,
however we cannot as we need to know the asd_sas_port index libsas has
associated with an asd_sas_phy.
The hw does actually generate a port id for a PHY, but this seems to a
random number, so ignored for this purpose.
This patch also changes the code to link slots to the hisi_sas_device,
and not hisi_sas_port.
Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 22 Mar 2017 17:25:20 +0000 (01:25 +0800)]
scsi: hisi_sas: add softreset function for SATA disk
Add softreset to clear IO after internal abort device for SATA disk.
The SATA error handling for the controller is based on device internal
abort and softreset function.
The controller does not support internal abort for single IO, so we need
to execute internal abort for device.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Wed, 22 Mar 2017 17:25:19 +0000 (01:25 +0800)]
scsi: hisi_sas: move PHY init to hisi_sas_scan_start()
Relocate the PHY init code from LLDD hw init path to
hisi_sas_scan_start().
Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>