Jeff Skirvin [Wed, 14 Mar 2012 00:15:11 +0000 (17:15 -0700)]
isci: End the RNC resumption wait when the RNC is destroyed.
While the RNC is suspended for I/O cleanup, the remote device can be
stopped and the RNC setup for destruction. These changes accomodate that
case in the abort path.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Wed, 14 Mar 2012 00:03:00 +0000 (17:03 -0700)]
isci: Fixed RNC bug that lost the suspension or resumption during destroy
This fix corrects the saving of resume parameters when the destruction
of the RNC has already been directed, and makes sure not to overwrite
the RNC destruction callbacks.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The RNC state machine would incorrectly transition from
SCI_RNC_AWAIT_SUSPENSION directly to SCI_RNC_INVALIDATING when a destruct
request was made. This would skip the increment of the suspension count
and the abort of pending TCs (although the invalidating state would at
least cleanup outstanding TCs).
Instead, the RNC will transition to SCI_RNC_SUSPENDED and then start the
destruction process.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Tue, 13 Mar 2012 00:29:51 +0000 (17:29 -0700)]
isci: Manage the IREQ_NO_AUTO_FREE_TAG under scic_lock.
Since there is a possibilty of a timeout waiting for the RNC suspension,
handle the exit case from the task termination under scic_lock, and leave
the tag allocated if the termination timed-out.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Sun, 4 Mar 2012 12:44:53 +0000 (12:44 +0000)]
isci: Remove obviated host callback list.
Since the callbacks to libsas now occur under scic_lock, there is no
longer any reason to save the completed requests in a separate list
for completion to libsas.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Sat, 10 Mar 2012 05:46:46 +0000 (05:46 +0000)]
isci: Check IDEV_GONE before performing abort path operations.
In the link fail path, set IDEV_GONE for every device on the domain
when the last link in the port fails.
In the abort path functions like isci_reset_device, make sure that
there has not already been a detected domain failure with the device
by checking IDEV_GONE, before performing any kind of hard reset, SMP
phy control, or TMF operation.
The check for IDEV_GONE makes sure that the device in the abort path
really has control of the port with which it is associated. This
prevents starting hard resets at incorrect times and scheduling
unnecessary LUN resets for SATA devices.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:42:07 +0000 (22:42 -0800)]
isci: Change the phy control and link reset interface for HW reasons.
There is an apparent HW lockup caused when the PE is disabled while there
is an outstanding TC in progress. This change puts the link into OOB to
force the TC to end before the PE is disabled.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:42:06 +0000 (22:42 -0800)]
isci: Added timeouts to RNC suspensions in the abort path.
This change adds timeouts to the RNC suspension wait. It makes the
suspend and resume timeouts the same.
The previous resume timeout of 5 ms was too short, and timeouts were
seen in resumptions of devices in the abort task/LUN reset path - which
would receive an RNC resumed message within a tenth of a second later.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:42:06 +0000 (22:42 -0800)]
isci: Add protocol indicator for TMF requests.
Requests contructed as task management requests need to have the protocol
indicator set so the completion decode can observe any RNC suspension
conditions.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:42:05 +0000 (22:42 -0800)]
isci: Directly control IREQ_ABORT_PATH_ACTIVE when completing TMFs.
TMF requests, unlike normal I/O requests, need to handle I/O management
conditions in the completion function because TMFs are not handled in the
completion tasklet.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:42:04 +0000 (22:42 -0800)]
isci: Wait for RNC resumption before leaving the abort path.
In the case of TMF execution, or device resets, wait for the RNC to fully
resume before returning to the caller. This ensures that the remote
device will not fail I/O requests while waiting for the RNC resumption to
complete.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:42:04 +0000 (22:42 -0800)]
isci: Fix RNC suspend call for SCI_RESUMING state.
Instead of immediately transitioning to the SCI_RNC_AWAIT_SUSPENSION
state, handle the SCI_RNC_RESUMING suspend transition from the
SCI_RNC_READY state like the SCI_RNC_INVALIDATING --> SCI_RNC_POSTING
transitions do now, by setting the destination state for the entry
into the READY state.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:42:03 +0000 (22:42 -0800)]
isci: Callbacks to libsas occur under scic_lock and are synchronized.
This patch changes the callback mechanism to libsas to only occur while
the scic_lock is held; the abort path cleanup of I/Os also checks to make
sure IREQ_ABORT_PATH_ACTIVE is clear before proceding.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:42:02 +0000 (22:42 -0800)]
isci: When in the abort path, defeat other resume calls until done.
Completion of I/Os during the one of the abort path interface calls
from libsas can drive remote device state changes and the resumption
of the device RNC. This is a problem when the abort path is
attempting to cleanup outstanding I/O at the same time - the resumption
can prevent the termination from occuring correctly.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:42:01 +0000 (22:42 -0800)]
isci: Implement waiting for suspend in the abort path.
In order to prevent a device from receiving an I/O request while still
in an RNC suspending or resuming state (and therefore failing that
I/O back to libsas with a reset required status) wait for the RNC state
change before proceding in the abort path.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:42:00 +0000 (22:42 -0800)]
isci: Make sure all TCs are terminated and cleaned in LUN reset.
In the libsas error path, SATA disks require extra handling in
libata to recover operation. However, libsas expects to be able
to immediately recover all outstanding I/O once the error handler
escalation stops. This patch fixes the condition where the libata
error handler is scheduled for operation but libsas has already
deleted the outstanding sas_tasks.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:41:59 +0000 (22:41 -0800)]
isci: Save the suspension hint for upcoming suspensions.
In the case of a suspend call while in SCI_RNC_POSTING or INVALIDATING
states, the LLHANG detect needed to be saved so the upcoming suspension
would enable it correctly. The unused suspend callback parameters were
removed.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:41:58 +0000 (22:41 -0800)]
isci: Fix the terminated I/O to not call sas_task_abort().
This addresses a regression from the commit "isci: Redesign
device suspension, abort, cleanup." in which the sas_task end
condition for terminated I/Os was made to call back on
sas_task_abort()".
This commit will be rolled into the original.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:41:54 +0000 (22:41 -0800)]
isci: Add suspension cases for RNC INVALIDATING, POSTING states.
The RNC can be any of the states in the loop from suspended to
ready when the API "suspend" or "resume" are called. This change
adds destination states parameters that control the suspension /
resumption action of the RNC statemachine for those transition states.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:41:54 +0000 (22:41 -0800)]
isci: Redesign device suspension, abort, cleanup.
This commit changes the means by which outstanding I/Os are handled
for cleanup.
The likelihood is that this commit will be broken into smaller pieces,
however that will be a later revision. Among the changes:
- All completion structures have been removed from the tmf and
abort paths.
- Now using one completed I/O list, with the I/O completed in host bit being
used to select error or normal callback paths.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:41:51 +0000 (22:41 -0800)]
isci: Manage device suspensions during TC terminations.
TCs must be terminated only while the RNC is suspended. This commit
adds remote device suspensions and resumptions in the abort, reset and
termination paths.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:41:50 +0000 (22:41 -0800)]
isci: Handle all suspending TC completions
Add comprehensive decode for all TC completions that generate RNC
suspensions.
Note that this commit also removes unconditional resumptions of ATAPI
devices when in the SCI_STP_DEV_ATAPI_ERROR state, and STP devices
when in the SCI_STP_DEV_IDLE state. This is because the SCI_STP_DEV_IDLE
and SCI_STP_DEV_ATAPI state entry functions manage the RNC resumption.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Fri, 9 Mar 2012 06:41:48 +0000 (22:41 -0800)]
isci: Manage the link layer hang detect timer for RNC suspensions.
For STP devices under certain protocol conditions, an RNC will not
suspend until the current transfer state is broken with a SYNC/ESC
sequence from the SCU. The SYNC/ESC driven by expiration of the
SCU link layer hang detect timer, which has too small a dynamic
range to support slow SATA devices, so normally it is disabled.
This change enables the timer with the minimum period at the point
when the suspension is requested.
Note that there is potential collateral damage to other open
connections to slow SATA devices on the same port, since there
is no alternative but to enable the LLHANG timer on every phy in
the port for the current suspension request - there is no way to
tell on which phy the RNC in question is currently active.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Maciej Trela [Mon, 12 Mar 2012 23:29:30 +0000 (23:29 +0000)]
isci: enable BCN in sci_port_add_phy()
Ensure we enable receiving BCN's from the
hardware when adding phy to isci_port.
Otherwise if we get BCN before the port is
created we won't see any BCN
Signed-off-by: Maciej Trela <maciej.trela@intel.com> Reported-by: Richard Boyd <richard.g.boyd@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Artur Wojcik [Fri, 24 Feb 2012 23:42:45 +0000 (23:42 +0000)]
isci: implement suspend/resume support
Provide a "simple-dev-pm-ops" implementation that shuts down the domain
and the device on suspend, and resumes the device and the domain on
resume. All of the mechanics of restoring domain connectivity are
handled by libsas once isci has notified libsas that all links should be
back up. libsas is in charge of handling links that did not resume, or
resumed out of order.
Signed-off-by: Artur Wojcik <artur.wojcik@intel.com> Signed-off-by: Jacek Danecki <jacek.danecki@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Fri, 2 Mar 2012 01:06:24 +0000 (17:06 -0800)]
isci: fix interrupt disable
There is a (dubious?) lost irq workaround in sci_controller_isr() that
effectively nullifies attempts to disable interrupts. Until the
workaround can be re-evaluated add some infrastructure to prevent the
interrupt handler from inadvertantly re-enabling interrupts.
The failure mode was interrupts continuing to run after the driver had
been removed and its iomappings torn down.
Reported-by: Jacek Danecki <jacek.danecki@intel.com> Tested-by: Jacek Danecki <jacek.danecki@intel.com>
[richard: clear remaining interrupts at the end of reset] Acked-by: Richard Boyd <richard.g.boyd@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Wed, 29 Feb 2012 09:07:56 +0000 (01:07 -0800)]
isci: fix 'link-up' events occur after 'start-complete'
The call to wait_for_start() is meant to ensure that all links have been
given a chance to come up before letting the kernel proceed with
probing. However, the implementation is not correctly syncing with the
port configuration agent. In the MPC case the ports are hard-coded, in
the APC case we need to wait for the port-configuration to form ports
from the started phys.
Towards that end increase the timeout for the APC agent to form ports,
and delay start complete until all phys are out of link-training.
Cc: <stable@vger.kernel.org> Cc: Richard Boyd <richard.g.boyd@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Wed, 15 Feb 2012 21:58:42 +0000 (13:58 -0800)]
isci: refactor initialization for S3/S4
Based on an original implementation by Ed Nadolski and Artur Wojcik
In preparation for S3/S4 support refactor initialization so that
driver-load and resume-from-suspend can share the common init path of
isci_host_init(). Organize the initialization into objects that are
self-contained to the driver (initialized by isci_host_init) versus
those that have some upward registration (initialized at allocation time
asd_sas_phy, asd_sas_port, dma allocations). The largest change is
moving the the validation of the oem and module parameters from
isci_host_init() to isci_host_alloc().
The S3/S4 approach being taken is that libsas will be tasked with
remembering the state of the domain and the lldd is free to be
forgetful. In the case of isci we'll just re-init using a subset of the
normal driver load path.
[clean up some unused / mis-indented function definitions in host.h]
Signed-off-by: Ed Nadolski <edmund.nadolski@intel.com> Signed-off-by: Artur Wojcik <artur.wojcik@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Tom Jackson [Fri, 24 Feb 2012 09:38:49 +0000 (09:38 +0000)]
isci: Don't filter BROADCAST CHANGE primitives
Per the SAS spec, several types of BROADCAST CHANGE primitives
must cause re-discovery of the originating expander.
Only the standard BROADCAST CHANGE primitive was being
sent to the LIBSAS layer. The other BC primitives have been
added to the sci_phy_event_handler()
Signed-off-by: Tom Jackson <thomas.p.jackson@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Fri, 10 Feb 2012 09:05:43 +0000 (01:05 -0800)]
isci: improve 'invalid state' warnings
Convert controller state machine warnings to emit the state number (it
missed the number to string conversion, but since these error rarely
happen not much motivation to go further).
Fix up the rnc warnings to use the state name.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Fri, 6 Apr 2012 00:29:12 +0000 (17:29 -0700)]
scsi: cleanup setting task state in scsi_error_handler()
Reading scsi_error_handler() one could easily come away with the
impression that it does its wakeup event check while the task state is
TASK_RUNNING. In fact it sets TASK_INTERRUPTIBLE at the bottom of the
loop, but that is ~50 lines down.
Just set TASK_INTERRUPTIBLE at the top of loop and be done.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Thu, 22 Mar 2012 04:09:12 +0000 (21:09 -0700)]
libsas: suspend / resume support
libsas power management routines to suspend and recover the sas domain
based on a model where the lldd is allowed and expected to be
"forgetful".
sas_suspend_ha - disable event processing allowing the lldd to take down
links without concern for causing hotplug events.
Regardless of whether the lldd actually posts link down
messages libsas notifies the lldd that all
domain_devices are gone.
sas_prep_resume_ha - on the way back up before the lldd starts link
training clean out any spurious events that were
generated on the way down, and re-enable event
processing
sas_resume_ha - after the lldd has started and decided that all phys
have posted link-up events this routine is called to let
libsas start it's own timeout of any phys that did not
resume. After the timeout an lldd can cancel the
phy teardown by posting a link-up event.
Storage for ex_change_count (u16) and phy_change_count (u8) are changed
to int so they can be set to -1 to indicate 'invalidated'.
Cc: Alan Stern <stern@rowland.harvard.edu> Reviewed-by: Jacek Danecki <jacek.danecki@intel.com> Tested-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Thu, 22 Mar 2012 04:09:11 +0000 (21:09 -0700)]
scsi, sd: limit the scope of the async probe domain
sd injects and synchronizes probe work on the global kernel-wide domain.
This runs into conflict with PM that wants to perform resume actions in
async context:
Provide a 'scsi_sd_probe_domain' so that async probe actions actions can
be flushed without regard for the state of PM, and allow for the resume
path to handle devices that have transitioned from SDEV_QUIESCE to
SDEV_DEL prior to resume.
Acked-by: Alan Stern <stern@rowland.harvard.edu>
[alan: uplevel scsi_sd_probe_domain, clarify scsi_device_resume] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Thu, 22 Mar 2012 04:09:10 +0000 (21:09 -0700)]
libsas: drop sata port multiplier infrastructure
On the way to add a new sata_device field, noticed that libsas is
carrying port multiplier infrastructure that is explicitly disabled by
sas_discover_sata(). The aic94xx touches the unused port_no, so leave
that field in case there was some use for it.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Thu, 22 Mar 2012 04:09:08 +0000 (21:09 -0700)]
libata: export ata_port suspend/resume infrastructure for sas
Reuse ata_port_{suspend|resume}_common for sas. This path is chosen
over adding coordination between ata-tranport and sas-transport because
libsas wants to revalidate the domain at resume-time at the host level.
It can not validate links have resumed properly until libata has had a
chance to perform its revalidation, and any sane placing of an ata_port
in the sas-transport model would delay it's resumption until after the
host.
Export the common portion of port suspend/resume (bypass pm_runtime),
and allow sas to perform these operations asynchronously (similar to the
libsas async-ata probe implmentation). Async operation is determined by
having an external, rather than stack based, location for storing the
result of the operation.
Reviewed-by: Jacek Danecki <jacek.danecki@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Tue, 20 Mar 2012 17:58:35 +0000 (10:58 -0700)]
libata: reset once
Hotplug testing with libsas currently encounters a 55 second wait for
link recovery to give up. In the case where the user trusts the
response time of their devices permit the recovery attempts to be
limited to one.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jeff Skirvin [Sat, 10 Mar 2012 05:55:50 +0000 (05:55 +0000)]
libsas: sas_rediscover_dev did not look at the SMP exec status.
The discovery function "sas_rediscover_dev" had two bugs: 1) it did
not pay attention to the return status from the SMP task execution;
2) the stack variable used for the returned SAS address was compared
against 0 without being initialized.
Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
[djbw: todo sanitize smp_execute_task return values (see: residue)] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Thu, 9 Feb 2012 04:52:40 +0000 (20:52 -0800)]
libsas: trim sas_task of slow path infrastructure
The timer and the completion are only used for slow path tasks (smp, and
lldd tmfs), yet we incur the allocation space and cpu setup time for
every fast path task.
Cc: Christoph Hellwig <hch@lst.de> Acked-by: Jack Wang <jack_wang@usish.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Wed, 1 Feb 2012 09:12:23 +0000 (01:12 -0800)]
libsas: use ->lldd_I_T_nexus_reset for ->eh_bus_reset_handler
sas_eh_bus_reset_handler() amounts to sas_phy_reset() without
notification of the reset to the lldd. If this is triggered from
eh-cmnd recovery there may be sas_tasks for the lldd to terminate, so
->lldd_I_T_nexus_reset is warranted.
Cc: Xiangliang Yu <yuxiangl@marvell.com> Cc: Luben Tuikov <ltuikov@yahoo.com> Cc: Jack Wang <jack_wang@usish.com> Reviewed-by: Jacek Danecki <jacek.danecki@intel.com>
[jacek: modify pm8001_I_T_nexus_reset to return -ENODEV] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Tue, 31 Jan 2012 07:39:11 +0000 (23:39 -0800)]
libsas: enforce eh strategy handlers only in eh context
The strategy handlers may be called in places that are problematic for
libsas (i.e. sata resets outside of domain revalidation filtering /
libata link recovery), or problematic for userspace (non-blocking ioctl
to sleeping reset functions). However, these routines are also called
for eh escalations and recovery of scsi_eh_prep_cmnd(), so permit them
as long as we are running in the host's error handler, otherwise arrange
for them to be triggered in eh_context.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Fri, 13 Apr 2012 04:18:38 +0000 (21:18 -0700)]
isci: fix oem parameter validation on single controller skus
OEM parameters [1] are parsed from the platform option-rom / efi
driver. By default the driver was validating the parameters for the
dual-controller case, but in single-controller case only the first set
of parameters may be valid.
Limit the validation to the number of actual controllers detected
otherwise the driver may fail to parse the valid parameters leading to
driver-load or runtime failures.
[1] the platform specific set of phy address, configuration,and analog
tuning values
Reported-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
...teach sas_rphy_remove to wait for async scanning to quiesce before
removing the end_device. It seems this is a more general problem [1],
but this patch only addresses sas transport.
[1]: 23edb6e [SCSI] mpt2sas: Do not set sas_device->starget to NULL from
the slave_destroy callback when all the LUNS have been deleted
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Normalize phy->attached_sas_addr to return a zero-address in the case
when device-type == NO_DEVICE or the linkrate is invalid to handle
expanders that put non-zero sas addresses in the discovery response:
Dan Williams [Fri, 6 Apr 2012 23:35:36 +0000 (16:35 -0700)]
scsi: fix eh wakeup (scsi_schedule_eh vs scsi_restart_operations)
Rapid ata hotplug on a libsas controller results in cases where libsas
is waiting indefinitely on eh to perform an ata probe.
A race exists between scsi_schedule_eh() and scsi_restart_operations()
in the case when scsi_restart_operations() issues i/o to other devices
in the sas domain. When this happens the host state transitions from
SHOST_RECOVERY (set by scsi_schedule_eh) back to SHOST_RUNNING and
->host_busy is non-zero so we put the eh thread to sleep even though
->host_eh_scheduled is active.
Before putting the error handler to sleep we need to check if the
host_state needs to return to SHOST_RECOVERY for another trip through
eh.
Cc: Tejun Heo <tj@kernel.org> Reported-by: Tom Jackson <thomas.p.jackson@intel.com> Tested-by: Tom Jackson <thomas.p.jackson@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Thu, 22 Mar 2012 04:09:07 +0000 (21:09 -0700)]
libsas, libata: fix start of life for a sas ata_port
This changes the ordering of initialization and probing events from:
1/ allocate rphy in PORTE_BYTES_DMAED, DISCE_REVALIDATE_DOMAIN
2/ allocate ata_port and schedule port probe in DISCE_PROBE
...to:
1/ allocate ata_port in PORTE_BYTES_DMAED, DISCE_REVALIDATE_DOMAIN
2/ allocate rphy in PORTE_BYTES_DMAED, DISCE_REVALIDATE_DOMAIN
3/ schedule port probe in DISCE_PROBE
This ordering prevents PHYE_SIGNAL_LOSS_EVENTS from sneaking in to
destrory ata devices before they have been fully initialized:
Dan Williams [Thu, 22 Mar 2012 04:09:05 +0000 (21:09 -0700)]
libata: make ata_print_id atomic
This variable is incremented from multiple contexts (module_init via
libata-lldds and the libsas discovery thread). Make it atomic to head
off any chance of libsas and libata creating duplicate ids.
Acked-by: Jacek Danecki <jacek.danecki@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Tue, 20 Mar 2012 20:24:29 +0000 (13:24 -0700)]
libsas: fix ata_eh clobbering ex_phys via smp_ata_check_ready
The check_ready implementation in the expander-attached ata device case
polls on sas_ex_phy_discover(). The effect is that the ex_phy fields
(critically ->attached_sas_addr) can change. When ata_eh ends and
libsas comes along to revalidate the domain
sas_unregister_devs_sas_addr() can fail to lookup devices to remove, or
fail to re-add an ata device that ata_eh marked as disabled. So change
the code to skip the sas_address and change count updates when ata_eh is
active.
Cc: Jack Wang <jack_wang@usish.com> Tested-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Tested-by: Bartek Nowakowski <bartek.nowakowski@intel.com> Tested-by: Jacek Danecki <jacek.danecki@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Tom Jackson <thomas.p.jackson@intel.com> Tested-by: Tom Jackson <thomas.p.jackson@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams [Mon, 12 Mar 2012 18:38:26 +0000 (11:38 -0700)]
libsas: fix sas_get_port_device regression
Commit 899fcf4 "[SCSI] libsas: set attached device type and target
protocols for local phys" setup 'phy' to be dereferenced after
list_for_each_entry(phy, &port->phy_list, port_phy_el) (i.e. phy ==
&port->phy_list) resulting in reports like:
BUG: unable to handle kernel NULL pointer dereference at 00000000000002b0
IP: [<ffffffffa00ce948>] sas_discover_domain+0x29e/0x4fb [libsas]
...fix by deferring sas_phy_set_target() to the end of
sas_get_port_device().
Reported-by: Tom Jackson <thomas.p.jackson@intel.com> Tested-by: Tom Jackson <thomas.p.jackson@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Thomas Jackson [Sat, 18 Feb 2012 02:33:10 +0000 (18:33 -0800)]
libsas: fix sas_find_bcast_phy() in the presence of 'vacant' phys
If an expander reports 'PHY VACANT' for a phy index prior to the one
that generated a BCN libsas fails rediscovery. Since a vacant phy is
defined as a valid phy index that will never have an attached device
just continue the search.
Cc: <stable@vger.kernel.org> Signed-off-by: Thomas Jackson <thomas.p.jackson@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Merge tag 'regmap-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull two more small regmap fixes from Mark Brown:
- Now we have users for it that aren't running Android it turns out
that regcache_sync_region() is much more useful to drivers if it's
exported for use by modules. Who knew?
- Make sure we don't divide by zero when doing debugfs dumps of
rbtrees, not visible up until now because everything was providing at
least some cache on startup.
* tag 'regmap-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: prevent division by zero in rbtree_show
regmap: Export regcache_sync_region()
Merge branch 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull a few KVM fixes from Avi Kivity:
"A bunch of powerpc KVM fixes, a guest and a host RCU fix (unrelated),
and a small build fix."
* 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Resolve RCU vs. async page fault problem
KVM: VMX: vmx_set_cr0 expects kvm->srcu locked
KVM: PMU: Fix integer constant is too large warning in kvm_pmu_set_msr()
KVM: PPC: Book3S: PR: Fix preemption
KVM: PPC: Save/Restore CR over vcpu_run
KVM: PPC: Book3S HV: Save and restore CR in __kvmppc_vcore_entry
KVM: PPC: Book3S HV: Fix kvm_alloc_linear in case where no linears exist
KVM: PPC: Book3S: Compile fix for ppc32 in HIOR access code
Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh
Pull SuperH fixes from Paul Mundt.
* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
sh: fix clock-sh7757 for the latest sh_mobile_sdhi driver
serial: sh-sci: use serial_port_in/out vs sci_in/out.
sh: vsyscall: Fix up .eh_frame generation.
sh: dma: Fix up device attribute mismatch from sysdev fallout.
sh: dwarf unwinder depends on SHcompact.
sh: fix up fallout from system.h disintegration.
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull ACPI & Power Management patches from Len Brown:
"Two fixes for cpuidle merge-window changes, plus a URL fix in
MAINTAINERS"
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
MAINTAINERS: Update git url for ACPI
cpuidle: Fix panic in CPU off-lining with no idle driver
ACPI processor: Use safe_halt() rather than halt() in acpi_idle_play_dead()
Merge branch '3.4-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull target fixes from Nicholas Bellinger:
"Pull two tcm_fc fabric related fixes for -rc2:
Note that both have been CC'ed to stable, and patch #1 is the
important one that addresses a memory corruption bug related to FC
exchange timeouts + command abort.
Thanks again to MDR for tracking down this issue!"
* '3.4-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
tcm_fc: Do not free tpg structure during wq allocation failure
tcm_fc: Add abort flag for gracefully handling exchange timeout
Mark Rustad [Tue, 3 Apr 2012 17:24:52 +0000 (10:24 -0700)]
tcm_fc: Do not free tpg structure during wq allocation failure
Avoid freeing a registered tpg structure if an alloc_workqueue call
fails. This fixes a bug where the failure was leaking memory associated
with se_portal_group setup during the original core_tpg_register() call.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Acked-by: Kiran Patil <Kiran.patil@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Mark Rustad [Tue, 3 Apr 2012 17:24:41 +0000 (10:24 -0700)]
tcm_fc: Add abort flag for gracefully handling exchange timeout
Add abort flag and use it to terminate processing when an exchange
is timed out or is reset. The abort flag is used in place of the
transport_generic_free_cmd function call in the reset and timeout
cases, because calling that function in that context would free
memory that was in use. The aborted flag allows the lifetime to
be managed in a more normal way, while truncating the processing.
This change eliminates a source of memory corruption which
manifested in a variety of ugly ways.
(nab: Drop unused struct fc_exch *ep in ft_recv_seq)
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Acked-by: Kiran Patil <Kiran.patil@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Pull arch/tile bug fixes from Chris Metcalf:
"This includes Paul Gortmaker's change to fix the <asm/system.h>
disintegration issues on tile, a fix to unbreak the tilepro ethernet
driver, and a backlog of bugfix-only changes from internal Tilera
development over the last few months.
They have all been to LKML and on linux-next for the last few days.
The EDAC change to MAINTAINERS is an oddity but discussion on the
linux-edac list suggested I ask you to pull that change through my
tree since they don't have a tree to pull edac changes from at the
moment."
* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: (39 commits)
drivers/net/ethernet/tile: fix netdev_alloc_skb() bombing
MAINTAINERS: update EDAC information
tilepro ethernet driver: fix a few minor issues
tile-srom.c driver: minor code cleanup
edac: say "TILEGx" not "TILEPro" for the tilegx edac driver
arch/tile: avoid accidentally unmasking NMI-type interrupt accidentally
arch/tile: remove bogus performance optimization
arch/tile: return SIGBUS for addresses that are unaligned AND invalid
arch/tile: fix finv_buffer_remote() for tilegx
arch/tile: use atomic exchange in arch_write_unlock()
arch/tile: stop mentioning the "kvm" subdirectory
arch/tile: export the page_home() function.
arch/tile: fix pointer cast in cacheflush.c
arch/tile: fix single-stepping over swint1 instructions on tilegx
arch/tile: implement panic_smp_self_stop()
arch/tile: add "nop" after "nap" to help GX idle power draw
arch/tile: use proper memparse() for "maxmem" options
arch/tile: fix up locking in pgtable.c slightly
arch/tile: don't leak kernel memory when we unload modules
arch/tile: fix bug in delay_backoff()
...
Merge tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull xen fixes from Konrad Rzeszutek Wilk:
"Two fixes for regressions:
* one is a workaround that will be removed in v3.5 with proper fix in
the tip/x86 tree,
* the other is to fix drivers to load on PV (a previous patch made
them only load in PVonHVM mode).
The rest are just minor fixes in the various drivers and some cleanup
in the core code."
* tag 'stable/for-linus-3.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pcifront: avoid pci_frontend_enable_msix() falsely returning success
xen/pciback: fix XEN_PCI_OP_enable_msix result
xen/smp: Remove unnecessary call to smp_processor_id()
xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries'
xen: only check xen_platform_pci_unplug if hvm
Merge tag 'mmc-fixes-for-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
Pull MMC fixes from Chris Ball:
- Disable use of MSI in sdhci-pci, which caused multiple chipsets to
stop working in 3.4-rc1. I'll wait to turn this on again until we
have a chipset whitelist for it.
- Fix a libertas SDIO powered-resume regression introduced in 3.3;
thanks to Neil Brown and Rafael Wysocki for this fix.
- Fix module reloading on omap_hsmmc.
- Stop trusting the spec/card's specified maximum data timeout length,
and use three seconds instead. Previously we used 300ms.
Also cleanups and fixes for s3c, atmel, sh_mmcif and omap_hsmmc.
* tag 'mmc-fixes-for-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (28 commits)
mmc: use really long write timeout to deal with crappy cards
mmc: sdhci-dove: Fix compile error by including module.h
mmc: Prevent 1.8V switch for SD hosts that don't support UHS modes.
Revert "mmc: sdhci-pci: Add MSI support"
Revert "mmc: sdhci-pci: add quirks for broken MSI on O2Micro controllers"
mmc: core: fix power class selection
mmc: omap_hsmmc: fix module re-insertion
mmc: omap_hsmmc: convert to module_platform_driver
mmc: omap_hsmmc: make it behave well as a module
mmc: omap_hsmmc: trivial cleanups
mmc: omap_hsmmc: context save after enabling runtime pm
mmc: omap_hsmmc: use runtime put sync in probe error patch
mmc: sdio: Use empty system suspend/resume callbacks at the bus level
mmc: bus: print bus speed mode of UHS-I card
mmc: sdhci-pci: add quirks for broken MSI on O2Micro controllers
mmc: sh_mmcif: Simplify calculation of mmc->f_min
mmc: sh_mmcif: mmc->f_max should be half of the bus clock
mmc: sh_mmcif: double clock speed
mmc: block: Remove use of mmc_blk_set_blksize
mmc: atmel-mci: add support for odd clock dividers
...
Make the "word-at-a-time" helper functions more commonly usable
I have a new optimized x86 "strncpy_from_user()" that will use these
same helper functions for all the same reasons the name lookup code uses
them. This is preparation for that.
This moves them into an architecture-specific header file. It's
architecture-specific for two reasons:
- some of the functions are likely to want architecture-specific
implementations. Even if the current code happens to be "generic" in
the sense that it should work on any little-endian machine, it's
likely that the "multiply by a big constant and shift" implementation
is less than optimal for an architecture that has a guaranteed fast
bit count instruction, for example.
- I expect that if architectures like sparc want to start playing
around with this, we'll need to abstract out a few more details (in
particular the actual unaligned accesses). So we're likely to have
more architecture-specific stuff if non-x86 architectures start using
this.
(and if it turns out that non-x86 architectures don't start using
this, then having it in an architecture-specific header is still the
right thing to do, of course)
cpuidle: Fix panic in CPU off-lining with no idle driver
Fix a NULL pointer dereference panic in cpuidle_play_dead() during
CPU off-lining when no cpuidle driver is registered. A cpuidle
driver may be registered at boot-time based on CPU type. This patch
allows an off-lined CPU to enter HLT-based idle in this condition.
Signed-off-by: Toshi Kani <toshi.kani@hp.com> Cc: Boris Ostrovsky <boris.ostrovsky@amd.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Tested-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Len Brown <len.brown@intel.com>
6) Link handling and firmware fixes in bnx2x driver from Yaniv Rosner
and Yuval Mintz.
7) mlx4 erroneously allocates 4 pages at a time, regardless of page
size, fix from Thadeu Lima de Souza Cascardo.
8) SCTP socket option wasn't extended in a backwards compatible way,
fix from Thomas Graf.
9) Add missing address change event emissions to bonding, from Shlomo
Pongratz.
10) /proc/net/dev regressed because it uses a private offset to track
where we are in the hash table, but this doesn't track the offset
pullback that the seq_file code does resulting in some entries being
missed in large dumps.
Fix from Eric Dumazet.
11) do_tcp_sendpage() unloads the send queue way too fast, because it
invokes tcp_push() when it shouldn't. Let the natural sequence
generated by the splice paths, and the assosciated MSG_MORE
settings, guide the tcp_push() calls.
Otherwise what goes out of TCP is spaghetti and doesn't batch
effectively into GSO/TSO clusters.
From Eric Dumazet.
12) Once we put a SKB into either the netlink receiver's queue or a
socket error queue, it can be consumed and freed up, therefore we
cannot touch it after queueing it like that.
Fixes from Eric Dumazet.
13) PPP has this annoying behavior in that for every transmit call it
immediately stops the TX queue, then calls down into the next layer
to transmit the PPP frame.
But if that next layer can take it immediately, it just un-stops the
TX queue right before returning from the transmit method.
Besides being useless work, it makes several facilities unusable, in
particular things like the equalizers. Well behaved devices should
only stop the TX queue when they really are full, and in PPP's case
when it gets backlogged to the downstream device.
David Woodhouse therefore fixed PPP to not stop the TX queue until
it's downstream can't take data any more.
14) IFF_UNICAST_FLT got accidently lost in some recent stmmac driver
changes, re-add. From Marc Kleine-Budde.
15) Fix link flaps in ixgbe, from Eric W. Multanen.
16) Descriptor writeback fixes in e1000e from Matthew Vick.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
net: fix a race in sock_queue_err_skb()
netlink: fix races after skb queueing
doc, net: Update ndo_start_xmit return type and values
doc, net: Remove instruction to set net_device::trans_start
doc, net: Update netdev operation names
doc, net: Update documentation of synchronisation for TX multiqueue
doc, net: Remove obsolete reference to dev->poll
ethtool: Remove exception to the requirement of holding RTNL lock
MAINTAINERS: update for Marvell Ethernet drivers
bonding: properly unset current_arp_slave on slave link up
phonet: Check input from user before allocating
tcp: tcp_sendpages() should call tcp_push() once
ipv6: fix array index in ip6_mc_add_src()
mlx4: allocate just enough pages instead of always 4 pages
stmmac: re-add IFF_UNICAST_FLT for dwmac1000
bnx2x: Clear MDC/MDIO warning message
bnx2x: Fix BCM57711+BCM84823 link issue
bnx2x: Clear BCM84833 LED after fan failure
bnx2x: Fix BCM84833 PHY FW version presentation
bnx2x: Fix link issue for BCM8727 boards.
...
The original XenoLinux code has always had things this way, and for
compatibility reasons (in particular with a subsequent pciback
adjustment) upstream Linux should behave the same way (allowing for two
distinct error indications to be returned by the backend).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Mon, 2 Apr 2012 14:32:22 +0000 (15:32 +0100)]
xen/pciback: fix XEN_PCI_OP_enable_msix result
Prior to 2.6.19 and as of 2.6.31, pci_enable_msix() can return a
positive value to indicate the number of vectors (less than the amount
requested) that can be set up for a given device. Returning this as an
operation value (secondary result) is fine, but (primary) operation
results are expected to be negative (error) or zero (success) according
to the protocol. With the frontend fixed to match the XenoLinux
behavior, the backend can now validly return zero (success) here,
passing the upper limit on the number of vectors in op->value.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The reason we are dying is b/c the call acpi_get_override_irq() is used,
which returns the polarity and trigger for the IRQs. That function calls
mp_find_ioapics to get the 'struct ioapic' structure - which along with the
mp_irq[x] is used to figure out the default values and the polarity/trigger
overrides. Since the mp_find_ioapics now returns -1 [b/c the IOAPIC is filled
with 0xffffffff], the acpi_get_override_irq() stops trying to lookup in the
mp_irq[x] the proper INT_SRV_OVR and we can't install the SCI interrupt.
The proper fix for this is going in v3.5 and adds an x86_io_apic_ops
struct so that platforms can override it. But for v3.4 lets carry this
work-around. This patch does that by providing a slightly different variant
of the fake IOAPIC entries.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Igor Mammedov [Tue, 27 Mar 2012 17:31:08 +0000 (19:31 +0200)]
xen: only check xen_platform_pci_unplug if hvm
commit b9136d207f08
xen: initialize platform-pci even if xen_emul_unplug=never
breaks blkfront/netfront by not loading them because of
xen_platform_pci_unplug=0 and it is never set for PV guest.
Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Eric Dumazet [Fri, 6 Apr 2012 08:49:10 +0000 (10:49 +0200)]
net: fix a race in sock_queue_err_skb()
As soon as an skb is queued into socket error queue, another thread
can consume it, so we are not allowed to reference skb anymore, or risk
use after free.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>