git.karo-electronics.de Git - linux-beck.git/log

[SCSI] scsi_debug: Runtime-configurable sector size

Make scsi_debug sector size configurable at load time instead of being
a #define. Handy for testing 4KB sectors.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] bsg: fix bsg_mutex hang with device removal

We don't need to hold bsg_mutex during bsg_complete_all_commands(). It
leads to a problem that we block bsg_unregister_queue during
bsg_complete_all_commands (untill all the outstanding commands
complete).

Thanks to Pete Wyckoff for finding the bug and testing the patch.

The detailed bug report is:

http://marc.info/?l=linux-scsi&m=121182137132145&w=2

Tested-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iscsi_tcp: Enable any size command

Let through upto the largest command of 260 defined by the scsi standard.
iscsi core supports this already. Now that the scsi-ml supports it we can
start using large commands.

[jejb:rejections fixed up]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] scsi: use get_unaligned_* helpers

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] Replace __FUNCTION__ with __func__ in iscsi_tcp.

__FUNCTION__ is gcc-specific, use __func__

(Update diff by Mike Christie)

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] libiscsi, iser, tcp: remove recv_lock

The recv lock was defined so the iscsi layer could block
the recv path from processing IO during recovery. It
turns out iser just set a lock to that pointer which was pointless.

We now disconnect the transport connection before doing recovery
so we do not need the recv lock. For iscsi_tcp we still stop
the recv path incase older tools are being used.

This patch also has iscsi_itt_to_ctask user grab the session lock
and has the caller access the task with the lock or get a ref
to it in case the target is broken and sends a tmf success response
then sends data or a response for the command that was supposed to
be affected bty the tmf.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] libiscsi: fix cmds_max setting

Drivers expect that the cmds_max value they pass to the iscsi layer
is the max scsi commands + mgmt tasks. This patch implements that
and fixes some checks for nr cmd limits.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iscsi class: Add session initiatorname and ifacename sysfs attrs.

This adds two new attrs used for creating initiator ports and
binding sessions to hardware.

The session level initiatorname:

Since bnx2i does a scsi_host per host device, we need to add the
iface initiator port settings on the session, so we can create
multiple initiator ports (each with different inames) per device/scsi_host.

The current iname reflects that qla4xxx can have one iname per hba, and we are
allocating a host per session for software. The iname on the host will
remain so we can export and set the hba level qla4xxx setting.

The ifacename attr:

To bind a session to a some peice of hardware in userspace we maintain
some mappings, but during boot or iscsid restart (iscsid contains the user
space part of the driver) we need to be able to figure out which of those
host mappings abstractions maps to certain sessions. This patch adds
a ifacename attr, which userspace can set to id the host side of the
endpoint across pivot_roots and iscsid restarts.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iscsi_tcp: hook iscsi_tcp into iscsi_endpoint code

iscsi_tcp creates its ep in userspace using sockets because
it is virtual, so we just check if we are sent a ep and fail
if we are.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iser: Modify iser to take a iscsi_endpoint struct in ep callouts and session setup

This hooks iser into the iscsi endpoint code. Previously it handled the
lookup and allocation. This has been made generic so bnx2i and iser can
share it. It also allows us to pass iser the leading conn's ep, so we
know the ib_deivce being used and can set it as the scsi_host's parent.
And that allows scsi-ml to set the dma_mask based on those values.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iscsi class: add endpoint class

Add sysfs representation for the endpoint, so userspace can match the
host and session to the endpoint. This will allow us to set the host's
parent correctly at host creation time.

The next patches will convert tcp and iser, and fix iser's dma_mask
bug.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iscsi class: user device_for_each_child instead of duplicating session list

Currently we duplicate the list of sessions, because we were using the
test for if a session was on the host list to indicate if the session
was bound or unbound. We can instead use the target_id and fix up
the class so that drivers like bnx2i do not have to manage the target id
space.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iser: handle iscsi_cmd_task rename

This handles the iscsi_cmd_task rename and renames
the iser cmd task to iser task.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iscsi_tcp: handle iscsi_cmd_task rename

This converts iscsi_tcp to use the iscsi_task name.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] libiscsi: rename iscsi_cmd_task to iscsi_task

This is the second part of the iscsi task merging, and
all it does it rename iscsi_cmd_task to iscsi_task and
mtask/ctask to just task.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iser: convert ib_iser to support merged tasks

Convert ib_iser to support merged tasks.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iscsi_tcp: convert iscsi_tcp to support merged tasks

Convert iscsi_tcp to support merged tasks.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] libiscsi: merge iscsi_mgmt_task and iscsi_cmd_task

There is no need to have the mgmt and cmd tasks separate
structs. It used to save a lot of memory when we overprealocated
memory for tasks, but the next patches will set up the
driver so in the future they can use a mempool or some other
common scsi command allocator and common tagging.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] libiscsi: modify libiscsi so it supports offloaded data paths

This patch modifies libiscsi, so drivers like bnx2i and iser can execute
a command from queuecommand/send_pdu instead of having to be queued to
be run in a workq.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] libiscsi, iscsi_tcp, iser: add session cmds array accessor

Currently to get a ctask from the session cmd array, you have to
know to use the itt modifier. To make this easier on LLDs and
so in the future we can easilly kill the session array and use
the host shared map instead, this patch adds a nice wrapper
to strip the itt into a session->cmds index and return a ctask.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iser: fix handling of scsi cmnds during recovery.

After the stop_conn callback has returned the LLD should not
touch the scsi cmds. iscsi_tcp and libiscsi use the
conn->recv_lock and suspend_rx field to halt recv path
processing, but iser does not have any protection.

This patch modifies iser so that userspace can just
call the ep_disconnect callback, which will halt
all recv IO, before calling the stop_conn callback so
we do not have to worry about the conn->recv_lock and
suspend rx field. iser just needs to stop the send side
from accessing the ib conn.

Fixup to handle when the ep poll fails and ep disconnect
is called from Erez.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iscsi: modify iscsi printk so it can take driver data pointers

Some drivers want to be able to just pass in the driver data pointers
to the iscsi objects. To enable this we need the iscsi printk macro
to cast the object.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iscsi: remove session/conn_data_size from iscsi_transport

This removes the session and conn data_size fields from the iscsi_transport.
Just pass in the value like with host allocation. This patch also makes
it so the LLD iscsi_conn data is allocated with the iscsi_cls_conn.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iscsi: add iscsi host helpers

This finishes the host/session unbinding, by adding some helpers
to add and remove hosts and the session they manage.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iscsi: remove session and host binding in libiscsi

bnx2i allocates a host per netdevice but will use libiscsi,
so this unbinds the session from the host in that code.

This will also be useful for the iser parent device dma settings
fixes.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iscsi class: rename iscsi_host to iscsi_cls_host

This renames the iscsi_host to iscsi_cls_host to match the other
structs, because libiscsi wants to use the iscsi_host name in
the future.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iscsi class, iscsi drivers: remove unused iscsi_transport attrs

max_cmd_len and max_conn are not really used. max_cmd_len is
always 16 and can be set by the LLD. max_conn is always one
since we do not support MCS.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] iscsi class, iscsi_tcp/iser: add host arg to session creation

iscsi offload (bnx2i and qla4xx) allocate a scsi host per hba,
so the session creation path needs a shost/host_no argument.
Software iscsi/iser will follow the same behabior as before
where it allcoates a host per session, but in the future iser
will probably look more like bnx2i where the host's parent is
the hardware (rnic for iser and for bnx2i it is the nic), because
it does not use a socket layer like how iscsi_tcp does.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] mpt fusion : Adding FAULT Reset polling work

When the firmware is in Fault state it will be identifed only when the next time
the driver access the IOC state.
This patch includes a polling function in the driver which will be executed in
regular interval to check the status of the firmware and if it is in Fault
state, then the firmware will be reset by the driver.

Signed-off-by: Sathya Prakash <sathya.prakash@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] mpt fusion : Setting intial period to 0xFF instead of 0xA

The initial period is set to 0xFF

Signed-off-by: Sathya Prakash <sathya.prakash@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] mpt fusion : Updated copyright statment with 2008 included

Updating copyright statement to include the year 2008

Signed-off-by: Sathya Prakash <sathya.prakash@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] mpt fusion: Driver version upgrade to 3.04.07

Updating driver version to 3.04.07 from 3.04.06

Signed-off-by: Sathya Prakash <sathya.prakash@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] zfcp: Fix sparse warning by providing new entry in dbf

drivers/s390/scsi/zfcp_dbf.c:692:2: warning: context imbalance in
'zfcp_rec_dbf_event_thread' - different lock contexts for basic block

Replace the parameter indicating if the lock is held with a new entry
function that only acquires the lock. This makes the lock handling
more visible and removes the sparse warning.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] zfcp: remove some __attribute__ ((packed))

There is no need to pack data structures which describe the
contents of records in the new recovery trace.

lcrash currently depends on the binary format for the other traces,
removing the packed attribute from all traces would break trace
debugging with lcrash.

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] zfcp: Refine trace levels of some recovery related events.

This change better spreads trace levels of recovery related events.
There was an overlap of traces for some recovery triggers and the
processing of recovery actions.

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] zfcp: Add information about interrupt to trace.

Store the index of the buffer in the inbound queue used to report
request completion in trace record for request coompletion.
This piece of information allows to better compare qdio and zfcp traces.

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] zfcp: Rename sbal_curr to sbal_last.

sbal_last is more appropriate, because it matches sbal_first.

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] zfcp: Rename sbal_last.

sbal_last is confusing, as it is not the last one actually used,
but just a limit. sbal_limit is a better name.

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] zfcp: Remove field sbal_last from trace record.

This field is not needed, because it designates an index with a fix offset
from sbal_first. It's name is confusing anyway.

Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] zfcp: Remove some sparse warnings

Remove some sparse warnings by telling sparse that zfcp_req_create
acquires the lock for the request queue.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] zfcp: Fix fsf_status_read return code handling

If allocation of a status buffer failed the function incorrectly
returned 0 instead of -ENOMEM.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] zfcp: Fix mempool pointer for GID_PN request allocation

When allocating memory for GID_PN nameserver requests, the allocation
function stores the pointer to the mempool, but then overwrites the
pointer via memset. Later, the wrong function to free the memory will
be called, since this is based on the stored pointer.

Fix this by first initializing the struct and then storing the pointer.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Peschke <mp3@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] zfcp: receiving an unsolicted status can lead to I/O stall

Processing of an unsolicted status request can lead to a locking race
of the request_queue's queue_lock during the recreation of the
used up status read request while still in interrupt context
of the response handler.

Detaching the 'refill' of the long running status read requests from
the handler to a scheduled work is solving this issue.

In addition, each refill-run is trying to re-establish the full amount
of status read requests, which might have failed in earlier runs.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] mpt fusion: make struct mpt_proc_root_dir static

This patch makes the needlessly global struct mpt_proc_root_dir static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: "Prakash, Sathya" <Sathya.Prakash@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] make use of the residue value

USB sometimes doesn't return an error but instead returns a residue
value indicating part (or all) of the command wasn't completed. So if
the driver _done() error processing indicates the command was fully
processed, subtract off the residue so that this USB error gets
propagated.

Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] SCSI: remove dev->power.power_state from mesh driver

power.power_state is scheduled for removal. This patch (as1055)
removes all uses of that field from the SCSI mesh driver.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Paul Mackerras <paulus@au.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] aacraid: linit.c make aac_show_serial_number static

drivers/scsi/aacraid/linit.c:865:9: warning: symbol 'aac_show_serial_number' was not declared. Should it be static?

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Mark Salyzyn <Mark_Salyzyn@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] aacraid: Add Power Management cards to documentation

Update the documented list of products supported by the aacraid driver.

Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] zfcp: sysfs attributes for fabric and channel latencies

The latency information is provided on a SCSI device level (LUN)
which can be found at the following location

/sys/class/scsi_device/<H:C:T:L>/device/cmd_latency
/sys/class/scsi_device/<H:C:T:L>/device/read_latency
/sys/class/scsi_device/<H:C:T:L>/device/write_latency

Each sysfs attribute provides the available data: min, max and sum for
fabric and channel latency and the number of requests processed.

An overrun of the variables is neither detected nor treated. The file
has to be read twice to make a meaningful statement, because only the
differences of the values between the two reads can be used. A reset
of the values can be achieved by writing to the attribute.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] zfcp: Track fabric and channel latencies provided by FCP adapter

Add the infrastructure to retrieve the fabric and channel latencies
from FSF commands for each SCSI command that has been processed. For
each unit, the sum, min, max and number of requests is tracked.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] scsi_dh: Remove hardware handler infrastructure from dm

This patch just removes infrastructure that provided support for hardware
handlers in the dm layer as it is not needed anymore.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] scsi_dh: Remove hardware handlers from dm

This patch removes the 3 hardware handlers that currently exist
under dm as the functionality is moved to SCSI layer in the earlier
patches.

[jejb: removed more makefile hunks and rejection fixes]
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] scsi_dh: Remove dm_pg_init_complete

This patch just removes the dm layer's path initialization completion
routine. This is separated from the other patch(scsi_dh: Use SCSI
device handler in dm-multipath) Just to make that patch more readable.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] scsi_dh: Add a single threaded workqueue for initializing paths

Before this patch set (SCSI hardware handlers), initialization of a
path was done asynchronously. Doing that requires a workqueue in each
device/hardware handler module and leads to unneccessary complication
in the device handler code, making it difficult to read the code and
follow the state diagram.

Moving that workqueue to this level makes the device handler code simpler.
Hence, the workqueue is moved to dm level.

A new workqueue is added instead of adding it to the existing workqueue
(kmpathd) for the following reasons:
1. Device activation has to happen faster, stacking them along
   with the other workqueue might lead to unnecessary delay
   in the activation of the path.
2. The effect could be felt the other way too. i.e the current
   events that are handled by the existing workqueue might get
   a delayed response.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] scsi_dh: Use SCSI device handler in dm-multipath

This patch converts dm-mpath to use scsi device handlers instead of
dm's hardware handlers.

This patch does not add any new functionality. Old behaviors remain and
userspace tools work as is except that arguments supplied with hardware
handler are ignored.

One behavioral exception is: Activation of a path is synchronous in this
patch, opposed to the older behavior of being asynchronous (changed in
patch 07: scsi_dh: Add a single threaded workqueue for initializing a path)

Note: There is no need to get a reference for the device handler module
(as it was done in the dm hardware handler case) here as the reference
is held when the device was first found. Instead we check and make sure
that support for the specified device is present at table load time.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] scsi_dh: add EMC Clariion device handler

This adds support for EMC Clariions. This patch has the features that
currently exists in mainline and advanced features from Ed's patches.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] scsi_dh: add hp sw device handler

This patch provides the device handler to support the older hp boxes
which cannot be upgraded.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] scsi_dh: add lsi rdac device handler

This patch provides the device handler to support the LSI RDAC SCSI
based storage devices.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

[SCSI] scsi_dh: add infrastructure for SCSI Device Handlers

Some of the storage devices (that can be accessed through multiple paths),
do need some special handling for
1. Activating the passive path of the storage access.
2. Decode and handle the special sense codes returned by the devices.
3. Handle the I/Os being sent to the passive path, especially
during the device probe time.
when accessed through multiple paths.

As of today this special device handling is done at the dm-multipath
layer using dm-handlers. That works well for (1); for (2) to be handled
at dm layer, scsi sense information need to be exported from SCSI to dm-layer,
which is not very attractive; (3) cannot be done at all at the dm layer.

Device handler has been moved to SCSI mainly to handle (2) and (3) properly.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

Linux 2.6.26-rc5

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (56 commits)
  l2tp: Fix possible oops if transmitting or receiving when tunnel goes down
  tcp: Fix for race due to temporary drop of the socket lock in skb_splice_bits.
  tcp: Increment OUTRSTS in tcp_send_active_reset()
  raw: Raw socket leak.
  lt2p: Fix possible WARN_ON from socket code when UDP socket is closed
  USB ID for Philips CPWUA054/00 Wireless USB Adapter 11g
  ssb: Fix context assertion in ssb_pcicore_dev_irqvecs_enable
  libertas: fix command size for CMD_802_11_SUBSCRIBE_EVENT
  ipw2200: expire and use oldest BSS on adhoc create
  airo warning fix
  b43legacy: Fix controller restart crash
  sctp: Fix ECN markings for IPv6
  sctp: Flush the queue only once during fast retransmit.
  sctp: Start T3-RTX timer when fast retransmitting lowest TSN
  sctp: Correctly implement Fast Recovery cwnd manipulations.
  sctp: Move sctp_v4_dst_saddr out of loop
  sctp: retran_path update bug fix
  tcp: fix skb vs fack_count out-of-sync condition
  sunhme: Cleanup use of deprecated calls to save_and_cli and restore_flags.
  xfrm: xfrm_algo: correct usage of RIPEMD-160
  ...

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc: switch /proc/led to seq_file
sparc64: IO accessors fix

l2tp: Fix possible oops if transmitting or receiving when tunnel goes down

Some problems have been experienced in the field which cause an oops
in the pppol2tp driver if L2TP tunnels fail while passing data.

The pppol2tp driver uses private data that is referenced via the
sk->sk_user_data of its UDP and PPPoL2TP sockets. This patch makes
sure that the driver uses sock_hold() when it holds a reference to the
sk pointer. This affects its sendmsg(), recvmsg(), getname(),
[gs]etsockopt() and ioctl() handlers.

Tested by ISP where problem was seen. System has been up 10 days with
no oops since running this patch. Without the patch, an oops would
occur every 1-2 days.

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: Fix for race due to temporary drop of the socket lock in skb_splice_bits.

skb_splice_bits temporary drops the socket lock while iterating over
the socket queue in order to break a reverse locking condition which
happens with sendfile. This, however, opens a window of opportunity
for tcp_collapse() to aggregate skbs and thus potentially free the
current skb used in skb_splice_bits and tcp_read_sock.

This patch fixes the problem by (re-)getting the same "logical skb"
after the lock has been temporary dropped.

Based on idea and initial patch from Evgeniy Polyakov.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: Increment OUTRSTS in tcp_send_active_reset()

TCP "resets sent" counter is not incremented when a TCP Reset is
sent via tcp_send_active_reset().

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

raw: Raw socket leak.

The program below just leaks the raw kernel socket

int main() {
        int fd = socket(PF_INET, SOCK_RAW, IPPROTO_UDP);
        struct sockaddr_in addr;

        memset(&addr, 0, sizeof(addr));
        inet_aton("127.0.0.1", &addr.sin_addr);
        addr.sin_family = AF_INET;
        addr.sin_port = htons(2048);
        sendto(fd,  "a", 1, MSG_MORE, &addr, sizeof(addr));
        return 0;
}

Corked packet is allocated via sock_wmalloc which holds the owner socket,
so one should uncork it and flush all pending data on close. Do this in the
same way as in UDP.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

lt2p: Fix possible WARN_ON from socket code when UDP socket is closed

If an L2TP daemon closes a tunnel socket while packets are queued in
the tunnel's reorder queue, a kernel warning is logged because the
socket is closed while skbs are still referencing it. The fix is to
purge the queue in the socket's release handler.

WARNING: at include/net/sock.h:351 udp_lib_unhash+0x41/0x68()
Pid: 12998, comm: openl2tpd Not tainted 2.6.25 #8
[<c0423c58>] warn_on_slowpath+0x41/0x51
[<c05d33a7>] udp_lib_unhash+0x41/0x68
[<c059424d>] sk_common_release+0x23/0x90
[<c05d16be>] udp_lib_close+0x8/0xa
[<c05d8684>] inet_release+0x42/0x48
[<c0592599>] sock_release+0x14/0x60
[<c059299f>] sock_close+0x29/0x30
[<c046ef52>] __fput+0xad/0x15b
[<c046f1d9>] fput+0x17/0x19
[<c046c8c4>] filp_close+0x50/0x5a
[<c046da06>] sys_close+0x69/0x9f
[<c04048ce>] syscall_call+0x7/0xb

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6

USB ID for Philips CPWUA054/00 Wireless USB Adapter 11g

Enable the Philips CPWUA054/00 in p54usb.

Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

ssb: Fix context assertion in ssb_pcicore_dev_irqvecs_enable

This fixes a context assertion in ssb that makes b44 print
out warnings on resume.

This fixes the following kernel oops:
http://www.kerneloops.org/oops.php?number=12732
http://www.kerneloops.org/oops.php?number=11410

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

libertas: fix command size for CMD_802_11_SUBSCRIBE_EVENT

The size was two small by two bytes.

Signed-off-by: Holger Schurig
Signed-off-by: John W. Linville <linville@tuxdriver.com>

ipw2200: expire and use oldest BSS on adhoc create

If there are no networks on the free list, expire the oldest one when
creating a new adhoc network. Because ipw2200 and the ieee80211 stack
don't actually cull old networks and place them back on the free list
unless they are needed for new probe responses, over time the free list
would become empty and creating an adhoc network would fail due to the !
list_empty(...) check.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

airo warning fix

WARNING: space prohibited between function name and open parenthesis '('
#22: FILE: drivers/net/wireless/airo.c:2907:
+ while ((IN4500 (ai, COMMAND) & COMMAND_BUSY) && (delay < 10000)) {

total: 0 errors, 1 warnings, 8 lines checked

./patches/wireless-airo-waitbusy-wont-delay.patch has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Dan Williams <dcbw@redhat.com>
Cc: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

b43legacy: Fix controller restart crash

This fixes a kernel crash on rmmod, in the case where the controller
was restarted before doing the rmmod.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

sctp: Fix ECN markings for IPv6

Commit e9df2e8fd8fbc95c57dbd1d33dada66c4627b44c ("[IPV6]: Use
appropriate sock tclass setting for routing lookup.") also changed the
way that ECN capable transports mark this capability in IPv6. As a
result, SCTP was not marking ECN capablity because the traffic class
was never set. This patch brings back the markings for IPv6 traffic.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: Flush the queue only once during fast retransmit.

When fast retransmit is triggered by a sack, we should flush the queue
only once so that only 1 retransmit happens. Also, since we could
potentially have non-fast-rtx chunks on the retransmit queue, we need
make sure any chunks eligable for fast retransmit are sent first
during fast retransmission.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: Start T3-RTX timer when fast retransmitting lowest TSN

When we are trying to fast retransmit the lowest outstanding TSN, we
need to restart the T3-RTX timer, so that subsequent timeouts will
correctly tag all the packets necessary for retransmissions.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: Correctly implement Fast Recovery cwnd manipulations.

Correctly keep track of Fast Recovery state and do not reduce
congestion window multiple times during sucht state.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Tested-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: Move sctp_v4_dst_saddr out of loop

There's no need to execute sctp_v4_dst_saddr() for each
iteration, just move it out of loop.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: retran_path update bug fix

If the current retran_path is the only active one, it should
update it to the the next inactive one.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-2.6-misc-20080605a' of git://git.linux-ipv6.org/gitroot/yoshfuji/linux-2.6-fix

tcp: fix skb vs fack_count out-of-sync condition

This bug is able to corrupt fackets_out in very rare cases.
In order for this to cause corruption:
  1) DSACK in the middle of previous SACK block must be generated.
  2) In order to take that particular branch, part or all of the
     DSACKed segment must already be SACKed so that we have that
     in cache in the first place.
  3) The new info must be top enough so that fackets_out will be
     updated on this iteration.
...then fack_count is updated while skb wasn't, then we walk again
that particular segment thus updating fack_count twice for
a single skb and finally that value is assigned to fackets_out
by tcp_sacktag_one.

It is safe to call tcp_sacktag_one just once for a segment (at
DSACK), no need to call again for plain SACK.

Potential problem of the miscount are limited to premature entry
to recovery and to inflated reordering metric (which could even
cancel each other out in the most the luckiest scenarios :-)).
Both are quite insignificant in worst case too and there exists
also code to reset them (fackets_out once sacked_out becomes zero
and reordering metric on RTO).

This has been reported by a number of people, because it occurred
quite rarely, it has been very evasive. Andy Furniss was able to
get it to occur couple of times so that a bit more info was
collected about the problem using a debug patch, though it still
required lot of checking around. Thanks also to others who have
tried to help here.

This is listed as Bugzilla #10346. The bug was introduced by
me in commit 68f8353b48 ([TCP]: Rewrite SACK block processing &
sack_recv_cache use), I probably thought back then that there's
need to scan that entry twice or didn't dare to make it go
through it just once there. Going through twice would have
required restoring fack_count after the walk but as noted above,
I chose to drop the additional walk step altogether here.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

sunhme: Cleanup use of deprecated calls to save_and_cli and restore_flags.

Make use of local_irq_save and local_irq_restore rather then the
deprecated save_and_cli and restore_flags calls.

Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

xfrm: xfrm_algo: correct usage of RIPEMD-160

This patch fixes the usage of RIPEMD-160 in xfrm_algo which in turn
allows hmac(rmd160) to be used as authentication mechanism in IPsec
ESP and AH (see RFC 2857).

Signed-off-by: Adrian-Ken Rueegsegger <rueegsegger@swiss-it.ch>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV6]: Do not change protocol for UDPv6 sockets with pending sent data.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6]: inet_sk(sk)->cork.opt leak

IPv6 UDP sockets wth IPv4 mapped address use udp_sendmsg to send the data
actually. In this case ip_flush_pending_frames should be called instead
of ip6_flush_pending_frames.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6]: Do not change protocol for raw IPv6 sockets.

It is not allowed to change underlying protocol for
int fd = socket(PF_INET6, SOCK_RAW, IPPROTO_UDP);

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6] NETNS: Handle ancillary data in appropriate namespace.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6]: Check outgoing interface even if source address is unspecified.

The outgoing interface index (ipi6_ifindex) in IPV6_PKTINFO
ancillary data, is not checked if the source address (ipi6_addr)
is unspecified. If the ipi6_ifindex is the not-exist interface,
it should be fail.

Based on patch from Shan Wei <shanwei@cn.fujitsu.com> and
Brian Haley <brian.haley@hp.com>.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6]: Fix the data length of get destination options with short length

If get destination options with length which is not enough for that
option,getsockopt() will still return the real length of the option,
which is larger then the buffer space.
This is because ipv6_getsockopt_sticky() returns the real length of
the option.

This patch fix this problem.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6]: Fix the return value of get destination options with NULL data pointer

If we pass NULL data buffer to getsockopt(), it will return 0,
and the option length is set to -EFAULT:
getsockopt(sk, IPPROTO_IPV6, IPV6_DSTOPTS, NULL, &len);

This is because ipv6_getsockopt_sticky() will return -EFAULT or
-EINVAL if some error occur.

This patch fix this problem.

Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6] ADDRCONF: Allow longer lifetime on 64bit archs.

- Allow longer lifetimes (>= 0x7fffffff/HZ) on 64bit archs
  by using unsigned long.
- Shadow this arithmetic overflow workaround by introducing
  helper functions: addrconf_timeout_fixup() and
  addrconf_finite_timeout().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV4] TUNNEL4: Fix incoming packet length check for inter-protocol tunnel.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6] TUNNEL6: Fix incoming packet length check for inter-protocol tunnel.

I discover a strange behavior in [ipv4 in ipv6] tunnel. When IPv6 tunnel
payload is less than 40(0x28), packet can be sent to network, received in
physical interface, but not seen in IP tunnel interface. No counter increase
in tunnel interface.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6] ADDRCONF: Check range of prefix length

As of now, the prefix length is not vaildated when adding or deleting
addresses. The value is passed directly into the inet6_ifaddr structure
and later passed on to memcmp() as length indicator which relies on
the value never to exceed 128 (bits).

Due to the missing check, the currently code allows for any 8 bit
value to be passed on as prefix length while using the netlink
interface, and any 32 bit value while using the ioctl interface.

[Use unsigned int instead to generate better code - yoshfuji]

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[IPV6] UDP: Possible dst leak in udpv6_sendmsg.

ip6_sk_dst_lookup returns held dst entry. It should be released
on all paths beyond this point. Add missed release when up->pending
is set.

Bug report and initial patch by Denis V. Lunev <den@openvz.org>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Denis V. Lunev <den@openvz.org>

[SCTP]: Fix NULL dereference of asoc.

Commit 7cbca67c073263c179f605bdbbdc565ab29d801d ("[IPV6]: Support
Source Address Selection API (RFC5014)") introduced NULL dereference
of asoc to sctp_v6_get_saddr in net/sctp/ipv6.c.
Pointed out by Johann Felix Soden <johfel@users.sourceforge.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

Merge branch 'davem-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6

tcp: Fix inconsistency source (CA_Open only when !tcp_left_out(tp))

It is possible that this skip path causes TCP to end up into an
invalid state where ca_state was left to CA_Open while some
segments already came into sacked_out. If next valid ACK doesn't
contain new SACK information TCP fails to enter into
tcp_fastretrans_alert(). Thus at least high_seq is set
incorrectly to a too high seqno because some new data segments
could be sent in between (and also, limited transmit is not
being correctly invoked there). Reordering in both directions
can easily cause this situation to occur.

I guess we would want to use tcp_moderate_cwnd(tp) there as well
as it may be possible to use this to trigger oversized burst to
network by sending an old ACK with huge amount of SACK info, but
I'm a bit unsure about its effects (mainly to FlightSize), so to
be on the safe side I just currently fixed it minimally to keep
TCP's state consistent (obviously, such nasty ACKs have been
possible this far). Though it seems that FlightSize is already
underestimated by some amount, so probably on the long term we
might want to trigger recovery there too, if appropriate, to make
FlightSize calculation to resemble reality at the time when the
losses where discovered (but such change scares me too much now
and requires some more thinking anyway how to do that as it
likely involves some code shuffling).

This bug was found by Brian Vowell while running my TCP debug
patch to find cause of another TCP issue (fackets_out
miscount).

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

Fix uart_set_ldisc() function type

Commit 64e9159f5d2c4edf5fa6425031e556f8fddaf7e6 ("serial_core:
uart_set_ldisc infrastructure") introduced the ability for low-level
serial drivers to be informed when the tty ldisc changes.

However, the actual tty-layer function that does this callback for
serial devices was declared with the wrong type, having a spurious and
unused 'ldisc' argument.

This fixed the resulting compiler warning by just removing it.

Acked-by: Blithering Idiot <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>