]> git.karo-electronics.de Git - mv-sheeva.git/log
mv-sheeva.git
12 years ago[SCSI] isci: fix interpretation of "hard" reset
Dan Williams [Wed, 30 Nov 2011 19:57:34 +0000 (11:57 -0800)]
[SCSI] isci: fix interpretation of "hard" reset

A hard reset to isci in the direct-attached case is one where the driver
internally manages debouncing the link.  In the sas-expander-attached
case a hard reset is one that clears affiliations.  The driver should
not be prematurely dropping affiliations at run time, that decision (to
force expander hard resets to ata devices) is left to userspace to
manage.  So, arrange for I_T_nexus resets to be sas-link-resets in the
expander-attached case and isci-hard-resets in the direct-attached case.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] isci: kill isci_port->status
Dan Williams [Wed, 4 Jan 2012 07:26:15 +0000 (23:26 -0800)]
[SCSI] isci: kill isci_port->status

It only tracks whether the port is stopping in order to gate new devices
being discovered while the port is stopping.  However, since the check
and subsequent handling is unlocked there is nothing to stop the port
from going down immediately after the check.

Driver is already prepared to handle devices arriving on stale ports,
and those will be cleaned up by an eventual ->lldd_dev_gone()
notification.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] isci: kill iphy->isci_port lookups
Dan Williams [Wed, 4 Jan 2012 07:26:08 +0000 (23:26 -0800)]
[SCSI] isci: kill iphy->isci_port lookups

This field is a holdover from the OS abstraction conversion.  The stable
phy to port lookups are done via iphy->ownining_port under scic_lock.
After this conversion to use port->lldd_port the only volatile lookup is
the initial lookup in isci_port_formed().  After that point any lookup
via a successfully notified domain_device is guaranteed to be valid
until the domain_device is destroyed.

Delete ->start_complete as it is only set once and is set as a
consequence of the port going link up, by definition of getting a port
formed event the port is "ready".

While we are correcting port lookups also move the asd_sas_port table
out from under the isci_port.  This is to preclude any temptation to use
container_of() to convert an asd_sas_port to an isci_port, the
association is dynamic and under libsas control.

Tested-by: Maciej Trela <maciej.trela@intel.com>
[dmilburn@redhat.com: fix i686 compile error]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: don't recover 'gone' devices in sas_ata_hard_reset()
Dan Williams [Thu, 22 Dec 2011 22:58:24 +0000 (14:58 -0800)]
[SCSI] libsas: don't recover 'gone' devices in sas_ata_hard_reset()

The commands that timeout when a disk is forcibly removed may trigger
libata to attempt recovery of the device.  If libsas has decided to
remove the device don't permit ata to continue to issue resets to its
last known phy.

The primary motivation for this patch is hotplug testing by writing 0 to
/sys/class/sas_phy/phyX/enable.  Without this check this test leads to
libata issuing a reset and re-enabling the device that wants to be torn
down.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: fix sas_find_local_phy(), take phy references
Dan Williams [Thu, 22 Dec 2011 05:33:17 +0000 (21:33 -0800)]
[SCSI] libsas: fix sas_find_local_phy(), take phy references

In the direct-attached case this routine returns the phy on which this
device was first discovered.  Which is broken if we want to support
wide-targets, as this phy reference can become stale even though the
port is still active.

In the expander-attached case this routine tries to lookup the phy by
scanning the attached sas addresses of the parent expander, and BUG_ONs
if it can't find it.  However since eh and the libsas workqueue run
independently we can still be attempting device recovery via eh after
libsas has recorded the device as detached.  This is even easier to hit
now that eh is blocked while device domain rediscovery takes place, and
that libata is fed more timed out commands increasing the chances that
it will try to recover the ata device.

Arrange for dev->phy to always point to a last known good phy, it may be
stale after the port is torn down, but it will catch up for wide port
reconfigurations, and never be NULL.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: check for 'gone' expanders in smp_execute_task()
Dan Williams [Wed, 21 Dec 2011 23:19:56 +0000 (15:19 -0800)]
[SCSI] libsas: check for 'gone' expanders in smp_execute_task()

No sense in issuing or retrying commands to an expander that has been
removed.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: don't mark expanders as gone when a child device is removed
Dan Williams [Wed, 21 Dec 2011 22:55:38 +0000 (14:55 -0800)]
[SCSI] libsas: don't mark expanders as gone when a child device is removed

Commit 56dd2c06 "[SCSI] libsas: Don't issue commands to devices that
have been hot-removed" marked the parent device of an end-device as gone
when all the phys to the end device have been deleted.

The expander device is still present until its parent is removed.  This
is a benign change until the smp_execute_task() path is taught to check
->gone.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: poll for ata device readiness after reset
Dan Williams [Fri, 18 Nov 2011 01:59:54 +0000 (17:59 -0800)]
[SCSI] libsas: poll for ata device readiness after reset

Use ata_wait_after_reset() to poll for link recovery after a reset.
This combined with sas_ha->eh_mutex prevents expander rediscovery from
probing phys in an intermediate state.  Local discovery does not have a
mechanism to filter link status changes during this timeout, so it
remains the responsibility of lldds to prevent premature port teardown.
Although once all lldd's support ->lldd_ata_check_ready() that could be
used as a gate to local port teardown.

The signature fis is re-transmitted when the link comes back so we
should be revalidating the ata device class, but that is left to a future
patch.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: async ata-eh
Dan Williams [Sun, 4 Dec 2011 09:06:24 +0000 (01:06 -0800)]
[SCSI] libsas: async ata-eh

Once sas_ata_hard_reset() starts honoring the 'deadline' parameter a
pathological configuration could take 25 seconds per ata device
(serialized) to recover.  Run per-port recoveries in parallel.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: add mutex for SMP task execution
Jeff Skirvin [Wed, 16 Nov 2011 09:44:13 +0000 (09:44 +0000)]
[SCSI] libsas: add mutex for SMP task execution

SAS does not tag SMP requests, and at least one lldd (isci) does not permit
more than one in-flight request at a time.

[jejb: fix sas_init_dev tab issues while we're at it]
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: Remove redundant phy state notification calls.
Jeff Skirvin [Fri, 16 Dec 2011 08:21:21 +0000 (08:21 +0000)]
[SCSI] libsas: Remove redundant phy state notification calls.

In the case of an explicit sas_phy_enable call to disable a phy,
the LLDD provides the calls to sas_phy_disconnected and the
PHYE_LOSS_OF_SIGNAL event.

NOTE: This assumes that the lldd(s) generate the notification, which
appears to be the case, but only verfied on isci.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: sas_phy_enable via transport_sas_phy_reset
Dan Williams [Sun, 4 Dec 2011 08:06:57 +0000 (00:06 -0800)]
[SCSI] libsas: sas_phy_enable via transport_sas_phy_reset

Execute the link-reset triggered by sas_phy_enable via
transport_sas_phy_reset so that it can be managed by libata.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: execute transport link resets with libata-eh via host workqueue
Dan Williams [Sat, 3 Dec 2011 00:07:01 +0000 (16:07 -0800)]
[SCSI] libsas: execute transport link resets with libata-eh via host workqueue

Link resets leave ata affiliations intact, so arrange for libsas to make
an effort to avoid dropping the device due to a slow-to-recover link.
Towards this end carry out reset in the host workqueue so that it can
check for ata devices and kick the reset request to libata.  Hard
resets, in contrast, bypass libata since they are meant for associating
an ata device with another initiator in the domain (tears down
affiliations).

Need to add a new transport_sas_phy_reset() since the current
sas_phy_reset() is a utility function to libsas lldds.  They are not
prepared for it to loop back into eh.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: perform sas-transport resets in shost->workq context
Dan Williams [Tue, 20 Dec 2011 09:03:48 +0000 (01:03 -0800)]
[SCSI] libsas: perform sas-transport resets in shost->workq context

Extend the sas transport class to allow transport users to attach extra
data to a sas_phy (->hostdata).  Use this area in libsas to move resets
to workq context in preparation for scheduling ata device resets through
libata-eh.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: use libata-eh-reset for sata rediscovery fis transmit failures
Dan Williams [Thu, 1 Dec 2011 07:23:33 +0000 (23:23 -0800)]
[SCSI] libsas: use libata-eh-reset for sata rediscovery fis transmit failures

Since sata devices can take several seconds to recover the link on reset
the 0.5 seconds that libsas currently waits may not be enough.  Instead
if we are rediscovering a phy that was previously attached to a sata
device let libata handle any resets to encourage the device to transmit
the initial fis.

Once sas_ata_hard_reset() and lldds learn how to honor 'deadline' libsas
should stop encountering phys in an intermediate state, until then this
will loop until the fis is transmitted or ->attached_sas_addr gets
cleared, but in the more likely initial discovery case we keep existing
behavior.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: defer SAS_TASK_NEED_DEV_RESET commands to libata
Dan Williams [Tue, 29 Nov 2011 22:54:28 +0000 (14:54 -0800)]
[SCSI] libsas: defer SAS_TASK_NEED_DEV_RESET commands to libata

lldds use the SAS_TASK_NEED_DEV_RESET interface to request that eh
perform a reset.  In the sata device case defer the commands that
triggered the reset to libata-eh context so it can perform its pre and
post reset management.

In the sas_ata_post_internal() case the reset request is falling on deaf
ears as the sas_task is immediately destroyed without any reset action.
Since it is currently a nop, and likely superfluous given the conversion
to new-style libata-eh, just drop the request.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: let libata handle command timeouts
Dan Williams [Tue, 29 Nov 2011 20:08:50 +0000 (12:08 -0800)]
[SCSI] libsas: let libata handle command timeouts

libsas-eh if it successfully aborts an ata command will hide the timeout
condition (AC_ERR_TIMEOUT) from libata.  The command likely completes
with the all-zero task->task_status it started with.  Instead, interpret
a TMF_RESP_FUNC_COMPLETE as the end of the sas_task but keep the scmd
around for libata-eh to handle.

Tested-by: Andrzej Jakowski <andrzej.jakowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: fix timeout vs completion race
Dan Williams [Mon, 28 Nov 2011 19:29:20 +0000 (11:29 -0800)]
[SCSI] libsas: fix timeout vs completion race

Until we have told the lldd to forget a task a timed out operation can
return from the hardware at any time.  Since completion frees the task
we need to make sure that no tasks run their normal completion handler
once eh has decided to manage the task.  Similar to
ata_scsi_cmd_error_handler() freeze completions to let eh judge the
outcome of the race.

Task collector mode is problematic because it presents a situation where
a task can be timed out and aborted before the lldd has even seen it.
For this case we need to guarantee that a task that an lldd has been
told to forget does not get queued after the lldd says "never seen it".
With sas_scsi_timed_out we achieve this with the ->task_queue_flush
mutex, rather than adding more time.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: prevent double completion of scmds from eh
Dan Williams [Wed, 7 Dec 2011 07:24:42 +0000 (23:24 -0800)]
[SCSI] libsas: prevent double completion of scmds from eh

We invoke task->task_done() to free the task in the eh case, but at this
point we are prepared for scsi_eh_flush_done_q() to finish off the scmd.

Introduce sas_end_task() to capture the final response status from the
lldd and free the task.

Also take the opportunity to kill this warning.
drivers/scsi/libsas/sas_scsi_host.c: In function ‘sas_end_task’:
drivers/scsi/libsas/sas_scsi_host.c:102:3: warning: case value ‘2’ not in enumerated type ‘enum exec_status’ [-Wswitch]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: close error handling vs sas_ata_task_done() race
Dan Williams [Mon, 28 Nov 2011 20:08:22 +0000 (12:08 -0800)]
[SCSI] libsas: close error handling vs sas_ata_task_done() race

Since sas_ata does not implement ->freeze(), completions for scmds and
internal commands can still arrive concurrent with
ata_scsi_cmd_error_handler() and sas_ata_post_internal() respectively.
By the time either of those is called libata has committed to completing
the qc, and the ATA_PFLAG_FROZEN flag tells sas_ata_task_done() it has
lost the race.

In the sas_ata_post_internal() case we take on the additional
responsibility of freeing the sas_task to close the race with
sas_ata_task_done() freeing the the task while sas_ata_post_internal()
is in the process of invoking ->lldd_abort_task().

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: kill invocation of scsi_eh_finish_cmd from sas_ata_task_done
Dan Williams [Tue, 29 Nov 2011 01:11:33 +0000 (17:11 -0800)]
[SCSI] libsas: kill invocation of scsi_eh_finish_cmd from sas_ata_task_done

Prior to the conversion to the new-style libata-eh sas_ata_task_done()
may have been the last opportunity to clean up the scmd, but now
libata-eh explicitly handles this case.  It also races against sas-eh.
If a lldd completes a task after SAS_TASK_STATE_ABORTED is set it could
trigger a spurious decrement of shost->host_failed.  Current lldds have
the band-aid of checking SAS_TASK_STATE_ABORTED before calling
->task_done(), but better to just let the scmds escalate to libata for
race free cleanup.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: use ->set_dmamode to notify lldds of NCQ parameters
Dan Williams [Fri, 18 Nov 2011 01:59:52 +0000 (17:59 -0800)]
[SCSI] libsas: use ->set_dmamode to notify lldds of NCQ parameters

sas_discover_sata() notifies lldds of sata devices twice.  Once to allow
the 'identify' to be sent, and a second time to allow aic94xx (the only
libsas driver that cares about sata_dev.identify) to setup NCQ
parameters before the device becomes known to the midlayer.  Replace
this double notification and intervening 'identify' with an explicit
->lldd_ata_set_dmamode notification.  With this change all ata internal
commands are issued by libata, so we no longer need sas_issue_ata_cmd().

The data from the identify command only needs to be cached in one
location so ata_device.id replaces domain_device.sata_dev.identify.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: prevent domain rediscovery competing with ata error handling
Dan Williams [Fri, 18 Nov 2011 01:59:51 +0000 (17:59 -0800)]
[SCSI] libsas: prevent domain rediscovery competing with ata error handling

libata error handling provides for a timeout for link recovery.  libsas
must not rescan for previously known devices in this interval otherwise
it may remove a device that is simply waiting for its link to recover.
Let libata-eh make the determination of when the link is stable and
prevent libsas (host workqueue) from taking action while this
determination is pending.

Using a mutex (ha->disco_mutex) to flush and disable revalidation while
eh is running requires any discovery action that may block on eh be
moved to its own context outside the lock.  Probing ATA devices
explicitly waits on ata-eh and the cache-flush-io issued during device
removal may also pend awaiting eh completion.  Essentially any rphy
add/remove activity needs to run outside the lock.

This adds two new cleanup states for sas_unregister_domain_devices()
'allocated-but-not-probed', and 'flagged-for-destruction'.  In the
'allocated-but-not-probed' state  dev->rphy points to a rphy that is
known to have not been through a sas_rphy_add() event.  At domain
teardown check if this device is still pending probe and cleanup
accordingly.  Similarly if a device has already been queued for removal
then sas_unregister_domain_devices has nothing to do.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: convert dev->gone to flags
Dan Williams [Sat, 7 Jan 2012 08:52:39 +0000 (08:52 +0000)]
[SCSI] libsas: convert dev->gone to flags

In preparation for adding tracking of another device state "destroy".

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: remove ata_port.lock management duties from lldds
Dan Williams [Fri, 18 Nov 2011 01:59:50 +0000 (17:59 -0800)]
[SCSI] libsas: remove ata_port.lock management duties from lldds

Each libsas driver (mvsas, pm8001, and isci) has invented a different
method for managing the ap->lock.  The lock is held by the ata
->queuecommand() path.  mvsas drops it prior to acquiring any internal
locks which allows it to hold its internal lock across calls to
task->task_done().  This capability is important as it is the only way
the driver can flush task->task_done() instances to guarantee that it no
longer has any in-flight references to a domain_device at
->lldd_dev_gone() time.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: introduce sas_drain_work()
Dan Williams [Tue, 20 Dec 2011 00:42:34 +0000 (16:42 -0800)]
[SCSI] libsas: introduce sas_drain_work()

When an lldd invokes ->notify_port_event() it can trigger a chain of libsas
events to:

  1/ form the port and find the direct attached device

  2/ if the attached device is an expander perform domain discovery

A call to flush_workqueue() will only flush the initial port formation work.
Currently libsas users need to call scsi_flush_work() up to the max depth of
chain (which will grow from 2 to 3 when ata discovery is moved to its own
discovery event).  Instead of open coding multiple calls switch to use
drain_workqueue() to flush sas work.

drain_workqueue() does not handle new work submitted during the drain so
libsas needs a bit of infrastructure to hold off unchained work submissions
while a drain is in flight.  A lldd ->notify() event is considered 'unchained'
while a sas_discover_event() is 'chained'.  As Tejun notes:

  "For now, I think it would be best to add private wrapper in libsas to
   support deferring unchained work items while draining."

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: convert ha->state to flags
Dan Williams [Tue, 20 Dec 2011 01:02:25 +0000 (17:02 -0800)]
[SCSI] libsas: convert ha->state to flags

In preparation for adding new states (SAS_HA_DRAINING, SAS_HA_FROZEN),
convert ha->state into a set of flags.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: replace event locks with atomic bitops
Dan Williams [Fri, 18 Nov 2011 01:59:49 +0000 (17:59 -0800)]
[SCSI] libsas: replace event locks with atomic bitops

The locks only served to make sure the pending event bitmask was updated
consistently.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: fix leak of dev->sata_dev.identify_[packet_]device
Dan Williams [Fri, 18 Nov 2011 01:59:48 +0000 (17:59 -0800)]
[SCSI] libsas: fix leak of dev->sata_dev.identify_[packet_]device

These are never freed in the nominal path.  A domain_device has a
different lifetime than a sas_rphy we need a dev->rphy independent way
of identifying sata devices.

Reviewed-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: fix domain_device leak
Dan Williams [Fri, 18 Nov 2011 01:59:47 +0000 (17:59 -0800)]
[SCSI] libsas: fix domain_device leak

Arrange for the deallocation of a struct domain_device object when it no
longer has:
1/ any children
2/ references by any scsi_targets
3/ references by a lldd

The comment about domain_device lifetime in
Documentation/scsi/libsas.txt is stale as it appears mainline never had
a version of a struct domain_device that was registered as a kobject.
We now manage domain_device reference counts on behalf of external
agents.

Reviewed-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: kill sas_slave_destroy
Dan Williams [Fri, 18 Nov 2011 01:59:46 +0000 (17:59 -0800)]
[SCSI] libsas: kill sas_slave_destroy

Per commit 3e4ec344 "libata: kill ATA_FLAG_DISABLED" needing to set
ATA_DEV_NONE is a holdover from before libsas converted to the
"new-style" ata-eh.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libsas: remove unused ata_task_resp fields
Dan Williams [Fri, 18 Nov 2011 01:59:45 +0000 (17:59 -0800)]
[SCSI] libsas: remove unused ata_task_resp fields

Commit 1e34c838 "[SCSI] libsas: remove spurious sata control register
read/write" removed the routines to fake the presence of the sata
control registers, now remove the unused data structure fields to kill
any remaining confusion.

Acked-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] Handle disk devices which can not process medium access commands
Martin K. Petersen [Thu, 9 Feb 2012 18:48:53 +0000 (13:48 -0500)]
[SCSI] Handle disk devices which can not process medium access commands

We have experienced several devices which fail in a fashion we do not
currently handle gracefully in SCSI. After a failure these devices will
respond to the SCSI primary command set (INQUIRY, TEST UNIT READY, etc.)
but any command accessing the storage medium will time out.

The following patch adds an callback that can be used by upper level
drivers to inspect the results of an error handling command. This in
turn has been used to implement additional checking in the SCSI disk
driver.

If a medium access command fails twice but TEST UNIT READY succeeds both
times in the subsequent error handling we will offline the device. The
maximum number of failed commands required to take a device offline can
be tweaked in sysfs.

Also add a new error flag to scsi_debug which allows this scenario to be
easily reproduced.

[jejb: fix up integer parsing to use kstrtouint]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] mpt2sas: spell "primitive" correctly in function prototype
Andrew Morton [Wed, 8 Feb 2012 20:52:22 +0000 (12:52 -0800)]
[SCSI] mpt2sas: spell "primitive" correctly in function prototype

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] virtio-scsi: SCSI driver for QEMU based virtual machines
Paolo Bonzini [Sun, 5 Feb 2012 11:16:00 +0000 (12:16 +0100)]
[SCSI] virtio-scsi: SCSI driver for QEMU based virtual machines

The virtio-scsi HBA is the basis of an alternative storage stack
for QEMU-based virtual machines (including KVM).  Compared to
virtio-blk it is more scalable, because it supports many LUNs
on a single PCI slot), more powerful (it more easily supports
passthrough of host devices to the guest) and more easily
extensible (new SCSI features implemented by QEMU should not
require updating the driver in the guest).

Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] hpsa: add some older controllers to the kdump blacklist
Tomas Henzl [Tue, 14 Feb 2012 17:07:59 +0000 (18:07 +0100)]
[SCSI] hpsa: add some older controllers to the kdump blacklist

Some other older controllers also do have problems to perform a kdump.
Adding controllers to this list means that the driver will signal
this non-ability via a resettable flag correctly.
The unsupported list was created after a consultation with HP.

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] scsi_error: classify some ILLEGAL_REQUEST sense as a permanent TARGET_ERROR
Mike Snitzer [Mon, 13 Feb 2012 23:35:11 +0000 (18:35 -0500)]
[SCSI] scsi_error: classify some ILLEGAL_REQUEST sense as a permanent TARGET_ERROR

Permanent target failures are non-retryable and should be classified as
TARGET_ERROR; otherwise dm-multipath will retry an IO request that will
always fail at the target.

A SCSI command that fails with ILLEGAL_REQUEST sense and Additional
sense 0x20, 0x21, 0x24 or 0x26 represents a permanent TARGET_ERROR.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] sd: Make sure provisioning mode is reported correctly
Martin K. Petersen [Mon, 13 Feb 2012 20:39:00 +0000 (15:39 -0500)]
[SCSI] sd: Make sure provisioning mode is reported correctly

The provisioning_mode parameter in sysfs did not get updated in the
SD_LBP_DISABLE case. Make sure the provisioning mode is always set
correctly.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] Ensure discard failure gets treated as a target problem
Martin K. Petersen [Mon, 13 Feb 2012 20:38:22 +0000 (15:38 -0500)]
[SCSI] Ensure discard failure gets treated as a target problem

The error reported up the stack for a discard failure did not clearly
indicate that the command was processed and subsequently failed by the
target device.

Return -EREMOTEIO so multipathing does not classify this condition as a
path failure.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] mpt2sas: add missing allocation check
Tomas Henzl [Mon, 13 Feb 2012 17:29:58 +0000 (18:29 +0100)]
[SCSI] mpt2sas: add missing allocation check

The __get_free_pages can fail, so the return value should be checked.
Spotted thanks to Stanislaw.

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: "Nandigama, Nagalakshmi" <Nagalakshmi.Nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla4xxx: Update driver version to 5.02.00-k14
Vikas Chaudhary [Mon, 13 Feb 2012 13:00:50 +0000 (18:30 +0530)]
[SCSI] qla4xxx: Update driver version to 5.02.00-k14

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla4xxx: Added ping support
Vikas Chaudhary [Mon, 13 Feb 2012 13:00:49 +0000 (18:30 +0530)]
[SCSI] qla4xxx: Added ping support

Added ping support for network connection diagnostics.

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] iscsi_transport: Added Ping support
Vikas Chaudhary [Mon, 13 Feb 2012 13:00:48 +0000 (18:30 +0530)]
[SCSI] iscsi_transport: Added Ping support

Added ping support for iscsi adapter, application can use this
interface for diagnostic network connection.

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla4xxx: added support for host event
Vikas Chaudhary [Mon, 29 Aug 2011 18:13:02 +0000 (23:43 +0530)]
[SCSI] qla4xxx: added support for host event

Added support to post kernel host event to application using
netlink interface.

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] scsi_transport_iscsi: added support for host event
Vikas Chaudhary [Mon, 13 Feb 2012 13:00:46 +0000 (18:30 +0530)]
[SCSI] scsi_transport_iscsi: added support for host event

Added support to post kernel host event to application using
netlink interface.

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla4xxx: Proper detection of firmware abort error code for ISP82xx
Vikas Chaudhary [Mon, 13 Feb 2012 13:00:45 +0000 (18:30 +0530)]
[SCSI] qla4xxx: Proper detection of firmware abort error code for ISP82xx

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla4xxx: Remove un-necessary print statment
Lalit Chandivade [Mon, 13 Feb 2012 13:00:44 +0000 (18:30 +0530)]
[SCSI] qla4xxx: Remove un-necessary print statment

On ROM lock acquiring timeout failure, driver spews lot of warning
messages in a for loop, remove the unwanted warning message to reduce
kernel messages clutter.

Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla4xxx: Modified debug log messages for boot info.
Manish Rangankar [Mon, 13 Feb 2012 13:00:43 +0000 (18:30 +0530)]
[SCSI] qla4xxx: Modified debug log messages for boot info.

In some configurations user may not have boot targets configured.
In such cases the debug messages printed out by driver look like
some kind of failure happening. However this could be a valid
case, so modified the messages to appear as warning messages
versus failure messages.

Signed-off-by: Manish Rangankar <manish.rangankar@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla4xxx: Fix verify boot idx correctly
Lalit Chandivade [Mon, 13 Feb 2012 13:00:42 +0000 (18:30 +0530)]
[SCSI] qla4xxx: Fix verify boot idx correctly

qla4xxx_verify_boot_idx can falsely report a DDB to be boot target
if ha->pri_ddb_idx and ha->sec_ddb_idx are not initialized correctly.
What this could cause is if there is DDB entry in FLash at index 0, then
qla4xxx_verify_boot_idx would return wrong result as ha->pri_ddb_idx is not
set correctly. Fixed the qla4xxx_get_boot_info to set the ha->pri_ddb_idx and
ha->sec_ddb_idx correctly.

Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla4xxx: Fix un-necessary delay on invalid DDB
Lalit Chandivade [Mon, 13 Feb 2012 13:00:41 +0000 (18:30 +0530)]
[SCSI] qla4xxx: Fix un-necessary delay on invalid DDB

Fix the un-necessary wait for completion of a sendtarget on an
invalid DDB entry. The state of an invalid DDB entry is 0 (unassigned)

This will also avoid the delays during system boot.

Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla4xxx: Remove unused code
Vikas Chaudhary [Mon, 13 Feb 2012 13:00:40 +0000 (18:30 +0530)]
[SCSI] qla4xxx: Remove unused code

This code initially added for FW debugging, we don't need this
code now so taking it out.

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libfc: Handle discovery failure during ctlr link down
Bhanu Prakash Gollapudi [Sat, 11 Feb 2012 01:18:57 +0000 (17:18 -0800)]
[SCSI] libfc: Handle discovery failure during ctlr link down

While we wait for GPN_FT response, if the ctlr link goes down, the stack
generates a completion for GPN_FT with error FC_EXCH_CLOSED, and reports a
discovery error. Discovery is not retried in this case, and rightly so.
However, the 'pending' flag stays set, which does not allow subsequent
discovery to succeed as GPN_FT will never be issued. Fix it by clearing the
pending flag when the discovery fails due to GPN_FT failure.

Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libfc: Fix panic in fc_exch_recv
Bhanu Prakash Gollapudi [Sat, 11 Feb 2012 01:18:51 +0000 (17:18 -0800)]
[SCSI] libfc: Fix panic in fc_exch_recv

Adding and removing the host into the zone causes this panic.

BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
IP: [<ffffffffa0491707>] fc_exch_recv+0xc57/0xe70 [libfc]
Call Trace:
[<ffffffffa050e04b>] bnx2fc_l2_rcv_thread+0x37b/0x430 [bnx2fc]
[<ffffffffa050dcd0>] ? bnx2fc_l2_rcv_thread+0x0/0x430 [bnx2fc]
[<ffffffff81090886>] kthread+0x96/0xa0
[<ffffffff8100c14a>] child_rip+0xa/0x20
[<ffffffff810907f0>] ? kthread+0x0/0xa0
[<ffffffff8100c140>] ? child_rip+0x0/0x20

During fc_exch_reset, the active exchanges are aborted and the exch is deleted.
As part of processing ABTS response, due to 'ep' being NULL, any access to ep in
fc_exch_recv_bls() causes this panic. Fixed to access 'ep' only if non-NULL.

Reviewed-by: Neerav Parikh <neerav.parikh@intel.com>
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] fcoe: Remove reference counting on 'stuct fcoe_interface'
Robert Love [Sat, 11 Feb 2012 01:18:46 +0000 (17:18 -0800)]
[SCSI] fcoe: Remove reference counting on 'stuct fcoe_interface'

The reference counting was necessary on these instances
because it was possible for NPIV ports to be destroyed
after the N_Port. A previous patch ensures that all NPIV
ports are destroyed before the N_Port making the need to
track references on the interface unnecessary.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] fcoe: Do not switch context in vport_delete callback
Robert Love [Sat, 11 Feb 2012 01:18:41 +0000 (17:18 -0800)]
[SCSI] fcoe: Do not switch context in vport_delete callback

Currently all port deletion is routed though the FCoE
workqueue (fcoe_wq). When fc_remove_host is called on
an N_Port (for example, from fcoe_destroy) the vports
are queued into a FC Transport workqueue. fc_remove_host
flushes that queue and each vport is passed to fcoe's
fcoe_vport_destroy, which simply queues the associated
fcoe_ports for later deletion. This queue cannot be
flushed within the N_Ports destroy path because of
circular locking issues. The result is that the NPIV
ports are destroyed after the N_Port, which is reverse
of how they are created.

This quirk causes fcoe to keep references on the
fcoe_interface shared by each of these ports (N_Port
and NPIV). Changing the ordering such that NPIV ports
are destroyed before the N_Port will allow us to remove
reference counting on the fcoe_interface instances.

This patch simply allows fcoe_vport_destory to destroy
NPIV ports without deferring them to a workqueue context.
This ensures that when fc_remove_host is called the
NPIV ports will be destroyed first before the N_Port and
allows reference counting on the fcoe's fcoe_interface
to be remove in a later patch.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] fcoe: Rename out_nomod label to out_putmod
Robert Love [Sat, 11 Feb 2012 01:18:36 +0000 (17:18 -0800)]
[SCSI] fcoe: Rename out_nomod label to out_putmod

The label implies that it should be called when
there is 'nomod.' I read that to mean that the
module reference 'get' failed. However, it's only
called when the module reference 'get' succeeded.

I think it makes more sense to name the label,
'out_putmod' since it should be called when we
need to 'put' the module reference taken in the
routine before returning.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] fcoe: Allow exposing FDMI attributes via sysfs
Neerav Parikh [Sat, 11 Feb 2012 01:18:31 +0000 (17:18 -0800)]
[SCSI] fcoe: Allow exposing FDMI attributes via sysfs

Allow FDMI attributes to be exposed via the fc_host
class object for the fcoe driver.

Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libfcoe: Don't KERN_ERR on netdev notification
Robert Love [Sat, 11 Feb 2012 01:17:59 +0000 (17:17 -0800)]
[SCSI] libfcoe: Don't KERN_ERR on netdev notification

This is more of a debug statement. As a KERN_ERR we generate
log entries anytime any netdev goes up or down, so when booting
there are notification log entries for all system interfaces
including 'lo'. This is too much. Let's just log when necessary.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] isci: T10 DIF support
Dave Jiang [Fri, 10 Feb 2012 09:18:34 +0000 (01:18 -0800)]
[SCSI] isci: T10 DIF support

This allows the controller to do WRITE_INSERT and READ_STRIP for SAS
disks that support protection information. SAS disks must be formatted
with protection information to use this feature via sg_format.

  sg3_utils-1.32 -- sg_format version 1.19 20110730
  sg_format usage:
  sg_format --format --verbose --pinfo /dev/sda

Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Avoid invalid request queue dereference for bad response packets.
Arun Easi [Thu, 9 Feb 2012 19:16:01 +0000 (11:16 -0800)]
[SCSI] qla2xxx: Avoid invalid request queue dereference for bad response packets.

Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Stop iteration after first failure in *_id functions.
Arun Easi [Thu, 9 Feb 2012 19:16:00 +0000 (11:16 -0800)]
[SCSI] qla2xxx: Stop iteration after first failure in *_id functions.

Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Fix incorrect register access in qla2x00_start_iocbs().
Arun Easi [Thu, 9 Feb 2012 19:15:59 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Fix incorrect register access in qla2x00_start_iocbs().

Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Handle device mapping changes due to device logout.
Arun Easi [Thu, 9 Feb 2012 19:15:58 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Handle device mapping changes due to device logout.

A device logout sent in the delete path of a fcport would clear the
port handle binding inside the firmware. This could lead to queued
work items for the fcport, if any, getting incorrect results. This
patch fixes the issue by checking for device name changes after a
call to get port database.

Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Add ha->max_fibre_devices to keep track of the maximum number of...
Chad Dupuis [Thu, 9 Feb 2012 19:15:57 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Add ha->max_fibre_devices to keep track of the maximum number of targets.

Add a field to the qla_hw_data struct to allow us to set the maximum number of
fabric devices on a per adapter basis based on ISP type.

[jejb: fix up missing rval = QLA_SUCCESS to prevent uninit var warning]
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Cache swl during fabric discovery.
Andrew Vasquez [Thu, 9 Feb 2012 19:15:56 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Cache swl during fabric discovery.

Rather than continuously allocating and freeing swl within the discovery
process, simply pre-allocate it the first time that it's needed, cache it
through the rest of the lifecycle of the driver and free it at module unload.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Remove EDC sysfs interface.
Joe Carnuccio [Thu, 9 Feb 2012 19:15:55 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Remove EDC sysfs interface.

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Update LICENSE.qla2xxx.
Chad Dupuis [Thu, 9 Feb 2012 19:15:54 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Update LICENSE.qla2xxx.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Perform firmware dump procedure on mailbox command timeout.
Chad Dupuis [Thu, 9 Feb 2012 19:15:53 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Perform firmware dump procedure on mailbox command timeout.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Change the log message when previous dump is available to retrieve...
Giridhar Malavali [Thu, 9 Feb 2012 19:15:52 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Change the log message when previous dump is available to retrieve for ISP82xx.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Log messages to use correct vha.
Arun Easi [Thu, 9 Feb 2012 19:15:51 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Log messages to use correct vha.

Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Add new message when a new loopid is assigned.
Arun Easi [Thu, 9 Feb 2012 19:15:50 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Add new message when a new loopid is assigned.

Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Fix ql_dbg arguments.
Arun Easi [Thu, 9 Feb 2012 19:15:49 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Fix ql_dbg arguments.

Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Use ql_log* #define's in ql_log() and ql_log_pci().
Chad Dupuis [Thu, 9 Feb 2012 19:15:48 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Use ql_log* #define's in ql_log() and ql_log_pci().

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Convert remaining printk's to ql_log format.
Chad Dupuis [Thu, 9 Feb 2012 19:15:47 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Convert remaining printk's to ql_log format.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Print mailbox command opcode and return code when a command times...
Chad Dupuis [Thu, 9 Feb 2012 19:15:46 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Print mailbox command opcode and return code when a command times out.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Reduce mbx-command timeout for Login/Logout requests.
Andrew Vasquez [Thu, 9 Feb 2012 19:15:45 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Reduce mbx-command timeout for Login/Logout requests.

Don't use default 30 second mailbox-command timeout for these
serial requests, instead, limit the TMO to the standard 2*RATOV
plus some fudge-factor.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Prep zero-length BSG data-transfer requests.
Andrew Vasquez [Thu, 9 Feb 2012 19:15:44 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Prep zero-length BSG data-transfer requests.

During command failure/non-recognition, the upper-layer
FC-transport expects the drivers to set
job-reply->reply_payload_rcv_len.  Do this in a consistent manner
to avoid duplication.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Perform implicit logout during rport tear-down.
Andrew Vasquez [Thu, 9 Feb 2012 19:15:43 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Perform implicit logout during rport tear-down.

During rport tear-down, make sure we do an implicit LOGO of the fcport in our
firmware to try to clear any residual commands associated with that fcport.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Handle failure cases during fabric_login
Chad Dupuis [Thu, 9 Feb 2012 19:15:42 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Handle failure cases during fabric_login

Make sure that all calls to ha->isp_ops->fabric_login() check the
return value for failure.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Increase speed of flash access in ISP82xx adapters to improve firmwar...
Chad Dupuis [Thu, 9 Feb 2012 19:15:41 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Increase speed of flash access in ISP82xx adapters to improve firmware load speed.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Return blank sysfs string on initial get thermal failure.
Joe Carnuccio [Thu, 9 Feb 2012 19:15:40 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Return blank sysfs string on initial get thermal failure.

When thermal temperature initially fails, return a blank string to the
sysfs interface.  This fixes the initial display of 0.00 followed by
subsequent display of blank line; the initial 0.00 should have not
displayed for cards that do not support thermal temperature.

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Handle change notifications based on switch scan results.
Arun Easi [Thu, 9 Feb 2012 19:15:39 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Handle change notifications based on switch scan results.

Instead of processing each RSCN individually, use only the name server results
from the switch to tell the existance of a given fcport.

Signed-off-by: Arun Easi <arun.easi@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Correct print format for edc ql_log() calls.
Joe Carnuccio [Thu, 9 Feb 2012 19:15:38 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Correct print format for edc ql_log() calls.

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Use consistent DL mask for ELS/CT passthru requests.
Andrew Vasquez [Thu, 9 Feb 2012 19:15:37 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Use consistent DL mask for ELS/CT passthru requests.

The driver is logging a slew of 'good' status requests for ELS/CT passthrough
commands.  Change some log messages from:

     * ql_log() -> ql_dbg()
     * ql_log_info -> ql_dbg_user

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Consolidation of SRB processing.
Giridhar Malavali [Thu, 9 Feb 2012 19:15:36 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Consolidation of SRB processing.

Rework the structures related to SRB processing to minimize the memory
allocations per I/O and manage resources associated with and completions
from common routines.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Use a valid enode-mac if none defined.
Andrew Vasquez [Thu, 9 Feb 2012 19:15:35 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Use a valid enode-mac if none defined.

Original 'defaults' were not OUI valid.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Enhancements to support ISP83xx.
Giridhar Malavali [Thu, 9 Feb 2012 19:15:34 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Enhancements to support ISP83xx.

Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Harish Zunjarrao <harish.zunjarrao@qlogic.com>
Signed-off-by: Nigel Kirkland <nigel.kirkland@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] qla2xxx: Enhanced the dump routines to capture multiple request and response...
Giridhar Malavali [Thu, 9 Feb 2012 19:15:33 +0000 (11:15 -0800)]
[SCSI] qla2xxx: Enhanced the dump routines to capture multiple request and response queues.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] aacraid: Added Sync.mode to support series 7/8/9 controllers
Mahesh Rajashekhara [Thu, 9 Feb 2012 06:51:04 +0000 (22:51 -0800)]
[SCSI] aacraid: Added Sync.mode to support series 7/8/9 controllers

Added Sync. mode to support Series 7/8/9 controller families: This is a
compatibility mode for all these controller families. The Async. (Performance)
mode can be changed in the future.  First Async. mode version added for Series
7; Controller parameter aac_sync_mode added

Signed-off-by: Mahesh Rajashekhara <aacraid@pmc-sierra.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] bnx2i: Fixed the override of the error_mask module param
Eddie Wai [Thu, 2 Feb 2012 23:22:00 +0000 (15:22 -0800)]
[SCSI] bnx2i: Fixed the override of the error_mask module param

The error_mask module param overrides has a bug which prevented
the new module param values to take effect.

Also changed the type attribute of the error_mask1/2 module params
from int to uint to allow the MSB to be set.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Acked-by: Anil Veerabhadrappa <anilgv@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] bnx2i: use kthread_create_on_node()
Eric Dumazet [Thu, 2 Feb 2012 13:03:22 +0000 (14:03 +0100)]
[SCSI] bnx2i: use kthread_create_on_node()

bnx2i_percpu_thread_create() create per cpu kthread, and should use
proper NUMA aware API.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] bfa: add readme file
Jing Huang [Sat, 28 Jan 2012 00:51:51 +0000 (16:51 -0800)]
[SCSI] bfa: add readme file

This patch add bfa driver readme file to Documentation/scsi

Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] bfa: don't leak mem in bfad_im_bsg_els_ct_request()
Jesper Juhl [Fri, 27 Jan 2012 23:23:41 +0000 (00:23 +0100)]
[SCSI] bfa: don't leak mem in bfad_im_bsg_els_ct_request()

If 'drv_fcxp = kzalloc(sizeof(struct bfad_fcxp), GFP_KERNEL);' fails
and returns NULL, then we'll leak the memory allocated to 'bsg_fcpt'
when we jump to 'out:' and the variable subsequently goes out of
scope.

Also remove the cast of the kzalloc() return value. kzalloc() returns
a void* which is implicitly converted, so the explicit cast is
pointless.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Acked-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] isci: enable clock gating
Marcin Tomczak [Fri, 27 Jan 2012 19:14:50 +0000 (11:14 -0800)]
[SCSI] isci: enable clock gating

Enabling clock gating for power savings on entry to controller ready
state. Disable SCU clock gating for power savings on exit from the
controller ready state.

The gating is fully automated by silicon after setting the mode.

Signed-off-by: Marcin Tomczak <marcin.tomczak@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libiscsi: fix cmd timeout/completion race
Mike Christie [Fri, 27 Jan 2012 03:13:11 +0000 (21:13 -0600)]
[SCSI] libiscsi: fix cmd timeout/completion race

If the driver/lib has called scsi_done and cleaned up internally but
scsi layer has not yet called blk_mark_rq_complete when the command
times out we hit a problem if the timeout code calls blk_mark_rq_complete first.
When the time out code calls into the driver we were returning
BLK_EH_RESET_TIMER and that causes the timeout code to just call
us again later.

We need to be calling BLK_EH_HANDLED so the timeout code can complete
the completion process because it had called blk_mark_rq_complete
on the command and now owns its processing.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] libiscsi_tcp: fix max_r2t manipulation
Mike Christie [Fri, 27 Jan 2012 03:13:10 +0000 (21:13 -0600)]
[SCSI] libiscsi_tcp: fix max_r2t manipulation

Problem description from Xi Wang:
A large max_r2t could lead to integer overflow in subsequent call to
iscsi_tcp_r2tpool_alloc(), allocating a smaller buffer than expected
and leading to out-of-bounds write.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] iscsi: fix setting of pid from netlink skb
Mike Christie [Fri, 27 Jan 2012 03:13:09 +0000 (21:13 -0600)]
[SCSI] iscsi: fix setting of pid from netlink skb

NETLINK_CREDS's pid now returns 0, so I guess we are supposed to
be using NETLINK_CB. This changed while the patch to export the
pid was getting merged upstream, so it was not noticed until both
the network and iscsi changes were in the same tree.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] iscsi: don't hang in endless loop if no targets present
Sasha Levin [Thu, 26 Jan 2012 03:16:16 +0000 (22:16 -0500)]
[SCSI] iscsi: don't hang in endless loop if no targets present

iscsi_if_send_reply() may return -ESRCH if there were no targets to send
data to. Currently we're ignoring this value and looping in attempt to do it
over and over, which will usually lead in a hung task like this one:

[ 4920.817298] INFO: task trinity:9074 blocked for more than 120 seconds.
[ 4920.818527] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4920.819982] trinity         D 0000000000000000  5504  9074   2756 0x00000004
[ 4920.825374]  ffff880003961a98 0000000000000086 ffff8800001aa000 ffff8800001aa000
[ 4920.826791]  00000000001d4340 ffff880003961fd8 ffff880003960000 00000000001d4340
[ 4920.828241]  00000000001d4340 00000000001d4340 ffff880003961fd8 00000000001d4340
[ 4920.833231]
[ 4920.833519] Call Trace:
[ 4920.834010]  [<ffffffff826363fa>] schedule+0x3a/0x50
[ 4920.834953]  [<ffffffff82634ac9>] __mutex_lock_common+0x209/0x5b0
[ 4920.836226]  [<ffffffff81af805d>] ? iscsi_if_rx+0x2d/0x990
[ 4920.837281]  [<ffffffff81053943>] ? sched_clock+0x13/0x20
[ 4920.838305]  [<ffffffff81af805d>] ? iscsi_if_rx+0x2d/0x990
[ 4920.839336]  [<ffffffff82634eb0>] mutex_lock_nested+0x40/0x50
[ 4920.840423]  [<ffffffff81af805d>] iscsi_if_rx+0x2d/0x990
[ 4920.841434]  [<ffffffff810dffed>] ? sub_preempt_count+0x9d/0xd0
[ 4920.842548]  [<ffffffff82637bb0>] ? _raw_read_unlock+0x30/0x60
[ 4920.843666]  [<ffffffff821f71de>] netlink_unicast+0x1ae/0x1f0
[ 4920.844751]  [<ffffffff821f7997>] netlink_sendmsg+0x227/0x350
[ 4920.845850]  [<ffffffff821857bd>] ? sock_update_netprioidx+0xdd/0x1b0
[ 4920.847060]  [<ffffffff82185732>] ? sock_update_netprioidx+0x52/0x1b0
[ 4920.848276]  [<ffffffff8217f226>] sock_aio_write+0x166/0x180
[ 4920.849348]  [<ffffffff810dfe41>] ? get_parent_ip+0x11/0x50
[ 4920.850428]  [<ffffffff811d0d9a>] do_sync_write+0xda/0x120
[ 4920.851465]  [<ffffffff810dffed>] ? sub_preempt_count+0x9d/0xd0
[ 4920.852579]  [<ffffffff810dfe41>] ? get_parent_ip+0x11/0x50
[ 4920.853608]  [<ffffffff81791887>] ? security_file_permission+0x27/0xb0
[ 4920.854821]  [<ffffffff811d0f4c>] vfs_write+0x16c/0x180
[ 4920.855781]  [<ffffffff811d104f>] sys_write+0x4f/0xa0
[ 4920.856798]  [<ffffffff82638e79>] system_call_fastpath+0x16/0x1b
[ 4920.877487] 1 lock held by trinity/9074:
[ 4920.878239]  #0:  (rx_queue_mutex){+.+...}, at: [<ffffffff81af805d>] iscsi_if_rx+0x2d/0x990
[ 4920.880005] Kernel panic - not syncing: hung_task: blocked tasks

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] fix the new host byte settings (DID_TARGET_FAILURE and DID_NEXUS_FAILURE)
Moger, Babu [Tue, 24 Jan 2012 20:38:46 +0000 (20:38 +0000)]
[SCSI] fix the new host byte settings (DID_TARGET_FAILURE and DID_NEXUS_FAILURE)

This patch fixes the host byte settings DID_TARGET_FAILURE and
DID_NEXUS_FAILURE.  The function __scsi_error_from_host_byte, tries to reset
the host byte to DID_OK. But that does not happen because of the OR operation.

Here is the flow.

scsi_softirq_done-> scsi_decide_disposition -> __scsi_error_from_host_byte

Let's take an example with DID_NEXUS_FAILURE. In scsi_decide_disposition,
result will be set as DID_NEXUS_FAILURE (=0x11). Then in
__scsi_error_from_host_byte, when we do OR with DID_OK.  Purpose is to reset
it back to DID_OK. But that does not happen.  This patch fixes this issue.

Signed-off-by: Babu Moger <babu.moger@netapp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
12 years ago[SCSI] Correctly set the scsi host/msg/status bytes
Moger, Babu [Tue, 24 Jan 2012 20:38:42 +0000 (20:38 +0000)]
[SCSI] Correctly set the scsi host/msg/status bytes

Resubmitting as my previous post had format issues and did not go llinux-scsi.
This patch changes the function to set_msg_byte, set_host_byte and
set_driver_byte to correctly set the corresponding bytes appropriately.

It will reset the original setting and correctly set it to the new value.  The
previous OR operation does not always set it back to new value. Look at patch
2/2 for an example.

Signed-off-by: Babu Moger <babu.moger@netapp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>