Mike Christie [Wed, 13 May 2009 22:57:44 +0000 (17:57 -0500)]
[SCSI] libiscsi_tcp: update recv tracking for each skb instead of iscsi pdu
Everytime we read in a pdu libiscsi will update a tracking field.
It uses this to decide when to check if the transport might be bad.
If we have not got data in recv_timeout seconds then we will
send a iscsi ping/nop.
If we are on a slow link then it could take a while to read in all
the data for a data_in. In that case we might send a ping/nop when
we do not need to or we might drop a session thinking it is bad
when the lower layer is making forward progress on it.
This patch has libiscsi_tcp update the recv tracking for each skb
(basically network packet from our point of view) instead of the
entire iscsi pdu+data, so we account for these cases where data is
coming in slowly.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Mike Christie [Wed, 13 May 2009 22:57:43 +0000 (17:57 -0500)]
[SCSI] libiscsi: fix nop response/reply and session cleanup race
If we are responding to a nop from the target by sending our nop,
and the session is getting torn down, then iscsi_start_session_recovery
could set the conn stop bits while the recv path is sending the nop
response and we will hit the bug ons in __iscsi_conn_send_pdu.
This has us check the state in __iscsi_conn_send_pdu and fail all
incoming mgmt IO if we are not logged in and if the pdu is not login
related. It also changes the ordering of the setting of conn stop state
bits so they are set after the session state is set (both are set under
the session lock).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Mike Christie [Wed, 13 May 2009 22:57:42 +0000 (17:57 -0500)]
[SCSI] libiscsi: have iscsi_data_in_rsp call iscsi_update_cmdsn
This has iscsi_data_in_rsp call iscsi_update_cmdsn when a pdu is
completed like is done for other pdu's that are don.
For libiscsi_tcp, this means that it calls iscsi_update_cmdsn when
it is handling the pdu internally to only transfer data, but if there is
status then it does not need to call it since the completion handling
will do it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Mike Christie [Wed, 13 May 2009 22:57:41 +0000 (17:57 -0500)]
[SCSI] libiscsi: export iscsi_itt_to_task for bnx2i
bnx2i needs to be able to look up mgmt task like login and nop, because
it does some processing of them on the completion path. This exports
iscsi_itt_to_task so it can look up the task.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Mike Christie [Wed, 13 May 2009 22:57:40 +0000 (17:57 -0500)]
[SCSI] libiscsi: handle param allocation failures
If we could not allocate the initiator name or some other id like
the hwaddress or netdev, then userspace could deal with the failure
by just running in a dregraded mode.
Now we want to be able to switch values for the params and we
want some feedback, so this patch will check if a string like
the initiatorname could not be allocated and return an error.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Mike Christie [Wed, 13 May 2009 22:57:38 +0000 (17:57 -0500)]
[SCSI] iscsi: pass ep connect shost
When we create the tcp/ip connection by calling ep_connect, we currently
just go by the routing table info.
I think there are two problems with this.
1. Some drivers do not have access to a routing table. Some drivers like
qla4xxx do not even know about other ports.
2. If you have two initiator ports on the same subnet, the user may have
set things up so that session1 was supposed to be run through port1. and
session2 was supposed to be run through port2. It looks like we could
end with both sessions going through one of the ports.
Fixes for cxgb3i from Karen Xie.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Zhenwen Xu [Tue, 12 May 2009 20:29:13 +0000 (13:29 -0700)]
[SCSI] NCR_D700: fix IRQ handler return type
drivers/scsi/NCR_D700.c: In function `NCR_D700_probe':
drivers/scsi/NCR_D700.c:322: warning: passing argument 2 of `request_irq' from incompatible pointer type
drivers/scsi/NCR_D700.c:322: warning: passing argument 2 of `request_irq' from incompatible pointer type
drivers/scsi/NCR_D700.c:322: warning: passing argument 2 of `request_irq' from incompatible pointer type
Signed-off-by: Zhenwen Xu <helight.xu@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Andy Yan [Mon, 11 May 2009 14:19:25 +0000 (22:19 +0800)]
[SCSI] mvsas: performance improvement using domain_device->lldd_dev
Using sticky field to improve retrieve performance by eliminating some
lookups in . Remove some spurious casts.
Signed-off-by: Ying Chu <jasonchu@marvell.com> Signed-off-by: Andy Yan <ayan@marvell.com> Signed-off-by: Ke Wei <kewei@marvell.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Andy Yan [Mon, 11 May 2009 13:56:31 +0000 (21:56 +0800)]
[SCSI] mvsas: correct bit map usage
Utilize DECLARE_BITMAP to define the tags array.
Signed-off-by: Ying Chu <jasonchu@marvell.com> Signed-off-by: Andy Yan <ayan@marvell.com> Signed-off-by: Ke Wei <kewei@marvell.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Andy Yan [Mon, 11 May 2009 13:49:52 +0000 (21:49 +0800)]
[SCSI] mvsas: bug fix, null pointer may be used
Null pointer check to avoid corruption.
Signed-off-by: Ying Chu <jasonchu@marvell.com> Signed-off-by: Andy Yan <ayan@marvell.com> Signed-off-by: Ke Wei <kewei@marvell.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Andy Yan [Mon, 11 May 2009 12:05:26 +0000 (20:05 +0800)]
[SCSI] mvsas: bug fix of dead lock
TMF task should be issued with Interrupt Disabled, or Deadlock may take place.
Clean-up unused parameters and conditonal lock.
Signed-off-by: Ying Chu <jasonchu@marvell.com> Signed-off-by: Andy Yan <ayan@marvell.com> Signed-off-by: Ke Wei <kewei@marvell.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Andy Yan [Mon, 11 May 2009 12:01:55 +0000 (20:01 +0800)]
[SCSI] mvsas: bug fix with setting task management frame type
Correct frame type setting according to parameter.
Signed-off-by: Ying Chu <jasonchu@marvell.com> Signed-off-by: Andy Yan <ayan@marvell.com> Signed-off-by: Ke Wei <kewei@marvell.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Kleber S. Souza [Mon, 4 May 2009 13:41:02 +0000 (10:41 -0300)]
[SCSI] ipr: fix PCI permanent error handler
The ipr driver can hang if it encounters enough PCI errors
to trigger the permanent error handler. The driver will attempt
to initiate a "bringdown" of the adapter and fail all pending
ops back. However, this bringdown is unlike any other bringdown
of the adapter in the code as the driver. In this code path we
end up failing back ops with allow_cmds still set to 1. This results
in some commands, the HCAM commands in particular, getting immediately
re-issued to the adapter on the done call, which results in
an infinite loop in ipr_fail_all_ops. Fix this by setting allow_cmds
to zero in this path.
Signed-off-by: Kleber S. Souza <klebers@linux.vnet.ibm.com>
[brking@linux.vnet.ibm.com: alternate patch substituted] Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Eric Piel [Mon, 4 May 2009 10:43:02 +0000 (12:43 +0200)]
[SCSI] Update wording of CONFIG_SCSI_MULTI_LUN help
I had to set CONFIG_SCSI_MULTI_LUN to y in order to get my SE W595
working when plugging it as a mass storage. Looking at SCSI option to
get a phone behaving correctly was convoluted to say the least. There
are quite a few other reports about USB card readers needing this option
as well. This patch improves the help text to make the use of the option
more obvious.
Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Roel Kluin [Sat, 2 May 2009 20:14:54 +0000 (22:14 +0200)]
[SCSI] ibmvscsi: Remove redundant test on unsigned.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Kai Makisara [Sat, 2 May 2009 05:49:34 +0000 (08:49 +0300)]
[SCSI] st: fix gcc 4.4 warning
This patch fixes the GCC 4.4 warning reported by David Binderman and Sergey
Senozhatsky. The old version was working correctly but was not easy to read.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
[SCSI] limit state transitions in scsi_internal_device_unblock
scsi timeout on two or more devices may cause extremely long execution
time for user applications because SDEV_OFFLINE state is changed to
SDEV_RUNNING state during scsi error recovery procedures triggered by
a bus reset or a host reset of scsi LLD, and scsi timeout can happens
on the same devices many times.
This happens because scsi_internal_device_unblock() changes device's
state to SDEV_RUNNING even if a device in other states than SDEV_BLOCK,
while the following two transitions are required in this function.
Signed-off-by: Takahiro Yasui <tyasui@redhat.com>
[matthew@wil.cx: supplied rewritten base for patch] Signed-off-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Randy Dunlap [Tue, 28 Apr 2009 04:49:31 +0000 (21:49 -0700)]
[SCSI] fcoe, libfc: fix function declarations to be ANSI-compliant
Fix function declarations:
drivers/scsi/fcoe/fcoe.c:1356:28: warning: non-ANSI function declaration of function 'fcoe_dev_setup'
drivers/scsi/libfc/fc_rport.c:1293:20: warning: non-ANSI function declaration of function 'fc_setup_rport'
drivers/scsi/libfc/fc_rport.c:1302:23: warning: non-ANSI function declaration of function 'fc_destroy_rport'
[jejb: fixed wrong doc in comment noticed during inspection] Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Marking the ipr clean up function ipr_remove() as __devexit and using
__devexit_p() macro in its address reference.
Signed-off-by: Kleber Sacilotto de Souza <kleber@linux.vnet.ibm.com> Reported-by: Breno Leitao <leitao@linux.vnet.ibm.com> Acked-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
[SCSI] scsi_dh_rdac: Retry for NOT_READY(02/04/01) in rdac device handler
During device discovery read capacity fails with 0x020401 and sets the
device size to 0. As a reason any I/O submitted to this path gets
killed at sd_prep_fn with BLKPREP_KILL. This patch is to retry for
0x020401. NEED_RETRY in scsi_decide_disposition does not give
sufficient time for the device to become ready.
Signed-off-by: Vijay Chauhan <vijay.chauhan@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Make the sym53c8xx_2 driver slave_alloc/destroy less unsafe. References
to the destroyed LCB are cleared from the target structure (instead of
leaving a dangling pointer), and when the last LCB for the target is
destroyed the reference to the upper layer target data is cleared. The
host lock is used to prevent a race with the interrupt handler. Also
user commands are prevented for targets with all LCBs destroyed.
Signed-off-by: Aaro Koskinen <Aaro.Koskinen@nokia.com> Tested-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Aaro Koskinen [Tue, 14 Apr 2009 20:46:59 +0000 (15:46 -0500)]
[SCSI] sym53c8xx_2: lun to_clear flag not re-initialized (2.6.27.5)
(Resent with proper formatting)
Fix for the sym53c8xx_2 driver to initialize lun's to_clear flag after
a bus reset (a failed clear can trigger a bus reset and it should not
be attemped again after that).
Signed-off-by: Aaro Koskinen <Aaro.Koskinen@nokia.com> Tested-by: Tony Battersby <tonyb@cybernetics.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Michael Reed [Wed, 8 Apr 2009 19:34:33 +0000 (14:34 -0500)]
[SCSI] qla1280: error recovery rewrite
The driver now waits for the scsi commands associated with a
particular error recovery step to be returned to the mid-layer,
and returns the appropriate SUCCESS or FAILED status. Removes
unneeded polling of chip for interrupts.
This patch also bumps the driver version number.
Signed-off-by: Michael Reed <mdr@sgi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Michael Reed [Wed, 8 Apr 2009 19:33:48 +0000 (14:33 -0500)]
[SCSI] qla1280: driver clean up
Remove some unneeded, inactive and unused code, make some trivial
corrections to comments and a printk, and return a proper status
in qla1280_queuecommand. No fundamental logic changes are made.
Signed-off-by: Michael Reed <mdr@sgi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Alan Stern [Thu, 12 Mar 2009 15:08:51 +0000 (11:08 -0400)]
[SCSI] Increase default timeout for INQUIRY
This patch (as1224) changes the default timeout for INQUIRY commands
from 3 seconds to 20 seconds, which is the value used by Windows for
USB Mass-Storage devices. Some of these devices, like the Corsair
Flash Voyager (see Bugzilla #12188) really do need a long timeout.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Andy Yan [Fri, 8 May 2009 21:46:40 +0000 (17:46 -0400)]
[SCSI] mvsas: add support for 94xx; layout change; bug fixes
This version contains following main changes
- Switch to new layout to support more types of ASIC.
- SSP TMF supported and related Error Handing enhanced.
- Support flash feature with delay 2*HZ when PHY changed.
- Support Marvell 94xx series ASIC for 6G SAS/SATA, which has 2
88SE64xx chips but any different register description.
- Support SPI flash for HBA-related configuration info.
- Other patch enhanced from kernel side such as increasing PHY type
[jejb: fold back in DMA_BIT_MASK changes] Signed-off-by: Ying Chu <jasonchu@marvell.com> Signed-off-by: Andy Yan <ayan@marvell.com> Signed-off-by: Ke Wei <kewei@marvell.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Jeff Garzik [Fri, 8 May 2009 20:35:37 +0000 (16:35 -0400)]
[SCSI] mvsas: move into new directory drivers/scsi/mvsas/
Zero functional changes, just file movement.
This commit prepares for the upcoming integration of the
Marvell-provided driver update that splits the driver into support
for both 64xx and 94xx chip families.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
[SCSI] qla2xxx: Use port number to compute nvram/vpd parameter offsets.
Read adapter's physical port number from interrupt pin register
and use it instead of pci function number to offset into the
nvram to obtain the port's configuration parameters.
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Michael Reed [Tue, 7 Apr 2009 05:33:47 +0000 (22:33 -0700)]
[SCSI] qla2xxx: Conditionally disable automatic queue full tracking.
Changing a lun's queue depth (/sys/block/sdX/device/queue_depth)
isn't sticky when the device is connected via a QLogic fibre
channel adapter.
The QLogic qla2xxx fibre channel driver dynamically adjusts a
lun's queue depth. If a user has a specific need to limit the
number of commands issued to a lun (say a tape drive, or a shared
raid where the total commands issued to all luns is limited at
the controller level, for example) and writes a limiting value to
/sys/block/sdXX/device/queue_depth, the qla2xxx driver will
silently and gradually increase the queue depth back to the
driver limit of ql2xmaxqdepth. While reducing this value (module
parameter) or increasing the interval between ramp ups
(ql2xqfullrampup) offers the potential for a work around it would
be better to have the option of just disabling the dynamic
adjustment of queue depth.
This patch implements an "off switch" as a module parameter.
Signed-off-by: Michael Reed <mdr@sgi.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Joe Carnuccio [Tue, 7 Apr 2009 05:33:46 +0000 (22:33 -0700)]
[SCSI] qla2xxx: Perform an implicit login to the Management Server.
Set the conditional plogi option bit whenever logging in the
fabric management server (if it is already logged in, it does not
need an explicit login; an implicit login suffices).
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
[SCSI] qla2xxx: Remove reference to request queue from scsi request block.
srbs used to maintain a reference to the request queue on which
it was enqueued. This is no longer required as the request queue
pointer is now maintained in the scsi host that issues the srb.
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Set the module parameter ql2xmultique_tag to 1 to enable this
feature. In this mode, the total number of response queues
created is equal to the number of online cpus. Turning the block
layer's rq_affinity mode on enables requests to be routed to the
proper cpu and at the same time it enables completion of the IO
in a response queue that is affined to the cpu in the request
path.
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Set the number of request queues to the module paramater
ql2xmaxqueues. Each vport gets a request queue. The QoS value
set to the request queues determines priority control for queued
IOs. If QoS value is not specified, the vports use the default
queue 0.
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Seokmann Ju [Tue, 7 Apr 2009 05:33:37 +0000 (22:33 -0700)]
[SCSI] qla2xxx: Correct bus-reset behaviour with recent ISPs.
The short-circuit to skip the non-applicable 'full-login-lip'
process on 81xx ISPs was nested too deeply in the 'bus-reset'
routine, as the code in qla2x00_loop_reset() should skip the
whole enable_lip_full_login process. The original code could
cause device tear-down due to the qla2x00_wait_for_loop_ready()
call taking a large amount of time.
Signed-off-by: Seokmann Ju <seokmann.ju@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Linus Torvalds [Tue, 19 May 2009 18:31:56 +0000 (11:31 -0700)]
Merge branch 'fixes-for-linus' of git://git.monstr.eu/linux-2.6-microblaze
* 'fixes-for-linus' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Fix kind-of-intr checking against number of interrupts
microblaze: Update Microblaze defconfig
Linus Torvalds [Tue, 19 May 2009 18:25:35 +0000 (11:25 -0700)]
Avoid ICE in get_random_int() with gcc-3.4.5
Martin Knoblauch reports that trying to build 2.6.30-rc6-git3 with
RHEL4.3 userspace (gcc (GCC) 3.4.5 20051201 (Red Hat 3.4.5-2)) causes an
internal compiler error (ICE):
and after some debugging it turns out that it's due to the code trying
to figure out the rough value of the current stack pointer by taking an
address of an uninitialized variable and casting that to an integer.
This is clearly a compiler bug, but it's not worth fighting - while the
current stack kernel pointer might be somewhat hard to predict in user
space, it's also not generally going to change for a lot of the call
chains for a particular process.
So just drop it, and mumble some incoherent curses at the compiler.
Tested-by: Martin Knoblauch <spamtrap@knobisoft.de> Cc: Matt Mackall <mpm@selenic.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Frank Filz [Mon, 18 May 2009 21:41:40 +0000 (17:41 -0400)]
nfs: Fix NFS v4 client handling of MAY_EXEC in nfs_permission.
The problem is that permission checking is skipped if atomic open is
possible, but when exec opens a file, it just opens it O_READONLY which
means EXEC permission will not be checked at that time.
This problem is observed by the following sequence (executed as root):
mount -t nfs4 server:/ /mnt4
echo "ls" >/mnt4/foo
chmod 744 /mnt4/foo
su guest -c "mnt4/foo"
[SCSI] mpt2sas : bump driver version to 01.100.02.00
The MPT2SAS_MAJOR_VERSION didn't get bumped from 00 to 01 so
applications will see it incorrectly as 00.100.02.00 driver instead of
01.100.02.00. Fix by making MPT2SAS_MAJOR_VERSION match the major
number in MPT2SAS_DRIVER_VERSION
Signed-off-by: Eric Moore <eric.moore@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Linus Torvalds [Mon, 18 May 2009 17:22:04 +0000 (10:22 -0700)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Explicit alignment for .data.cacheline_aligned
powerpc/ps3: Update ps3_defconfig
powerpc/ftrace: Fix constraint to be early clobber
powerpc/ftrace: Use pr_devel() in ftrace.c
powerpc: Do not assert pte_locked for hugepage PTE entries
Linus Torvalds [Mon, 18 May 2009 17:11:06 +0000 (10:11 -0700)]
Merge branches 'sched-fixes-for-linus-2' and 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix fallback sched_clock()'s offset when using jiffies
* 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
lockdep: increase MAX_LOCKDEP_ENTRIES and MAX_LOCKDEP_CHAINS
Linus Torvalds [Mon, 18 May 2009 16:15:41 +0000 (09:15 -0700)]
Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: Append prompt in /debug/tracing/README file
x86/function-graph: fix constraint for recording old return value
David Woodhouse [Mon, 18 May 2009 12:07:35 +0000 (13:07 +0100)]
Fix oops on close of hot-unplugged FTDI serial converter
Commit c45d6320 ("fix reference counting of ftdi_private") stopped
ftdi_sio_port_remove() from directly freeing the port-private data, with
the intention if the port was still open, it would be freed when
ftdi_close() is eventually called and releases the last refcount on the
structure.
That's all very well, but ftdi_sio_port_remove() still contains a call
to usb_set_serial_port_data(port, NULL) -- so by the time we get to
ftdi_close() for the port which was unplugged, it _still_ oopses on
dereferencing that NULL pointer, as it did before (and does in 2.6.29).
The fix is just not to clear the private data in ftdi_sio_port_remove().
Then the refcount is properly reduced to zero when the final kref_put()
happens in ftdi_close().
Remove a bogus comment too, while we're at it. And stop doing things
inside "if (priv)" -- it must _always_ be there.
Based loosely on an earlier patch by Daniel Mack, and suggestions by
Alan Stern.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Tested-by: Daniel Mack <daniel@caiaq.de> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Korsgaard [Mon, 18 May 2009 10:13:54 +0000 (11:13 +0100)]
mtd_dataflash: unbreak erase support
Commit 5b7f3a50 (fix dataflash 64-bit divisions) unfortunately
introduced a typo. Erase addr and len were swapped in the pageaddr
calculation, causing the wrong sectors to get erased.
Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Acked-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven Rostedt [Fri, 15 May 2009 04:33:54 +0000 (04:33 +0000)]
powerpc/ftrace: Fix constraint to be early clobber
After upgrading my distcc boxes from gcc 4.2.2 to 4.4.0, the function
graph tracer broke. This was discovered on my x86 boxes.
The issue is that gcc used the same register for an output as it did for
an input in an asm statement. I first thought this was a bug in gcc and
reported it. I was notified that gcc was correct and that the output had
to be flagged as an "early clobber".
I noticed that powerpc had the same issue and this patch fixes it.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Wed, 13 May 2009 20:30:24 +0000 (20:30 +0000)]
powerpc/ftrace: Use pr_devel() in ftrace.c
pr_debug() can now result in code being generated even when #DEBUG
is not defined. That's not really desirable in the ftrace code
which we want to be snappy.
With CONFIG_DYNAMIC_DEBUG=y:
size before:
text data bss dec hex filename
3334 672 4 4010 faa arch/powerpc/kernel/ftrace.o
size after:
text data bss dec hex filename
2616 360 4 2980 ba4 arch/powerpc/kernel/ftrace.o
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc: Do not assert pte_locked for hugepage PTE entries
With CONFIG_DEBUG_VM, an assertion is made when changing the protection
flags of a PTE that the PTE is locked. Huge pages use a different pagetable
format and the assertion is bogus and will always trigger with a bug looking
something like
page-writeback: fix the calculation of the oldest_jif in wb_kupdate()
wb_kupdate() function has a bug on linux-2.6.30-rc5. This bug causes
generic_sync_sb_inodes() to start to write inodes back much earlier than
our expectations because it miscalculates oldest_jif in wb_kupdate().
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: padlock - Revert aes-all alias to aes
crypto: api - Fix algorithm module auto-loading
crypto: eseqiv - Fix IV generation for sync algorithms
crypto: ixp4xx - check firmware for crypto support
Jeff Mahoney [Sun, 17 May 2009 05:02:02 +0000 (01:02 -0400)]
reiserfs: deal with NULL xattr root w/ xattrs disabled
This avoids an Oops in open_xa_root that can occur when deleting a file
with xattrs disabled. It assumes that the xattr root will be there, and
that is not guaranteed.
Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Mahoney [Sun, 17 May 2009 05:02:01 +0000 (01:02 -0400)]
reiserfs: clean up ifdefs
With xattr cleanup even with xattrs disabled, much of the initial setup
is still performed. Some #ifdefs are just not needed since the options
they protect wouldn't be available anyway.
This cleans those up.
Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 16 May 2009 20:41:28 +0000 (13:41 -0700)]
Fix caller information for warn_slowpath_null
Ian Campbell noticed that since "Eliminate thousands of warnings with
gcc 3.2 build" (commit 57adc4d2dbf968fdbe516359688094eef4d46581) all
WARN_ON()'s currently appear to come from warn_slowpath_null(), eg:
WARNING: at kernel/softirq.c:143 warn_slowpath_null+0x1c/0x20()
because now that warn_slowpath_null() is in the call path, the
__builtin_return_address(0) returns that, rather than the place that
caused the warning.
Fix this by splitting up the warn_slowpath_null/fmt cases differently,
using a common helper function, and getting the return address in the
right place. This also happens to avoid the unnecessary stack usage for
the non-stdargs case, and just generally cleans things up.
Make the function name printout use %pS while at it.
Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 16 May 2009 19:47:11 +0000 (12:47 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
piix: The Sony TZ90 needs the cable type hardcoding
icside: register second channel of version 6 PCB
ide-tape: remove back-to-back REQUEST_SENSE detection
Alan Cox [Sat, 16 May 2009 17:03:36 +0000 (19:03 +0200)]
piix: The Sony TZ90 needs the cable type hardcoding
The Sony TZ90 needs the cable type hardcoding. See bug #12734
Signed-off-by: Alan Cox <alan@linux.intel.com> Reported-by: Jonathan E. Snow <jesnow@uh.edu>
[bart: port it from ata_piix to piix and give reporter the proper credit] Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Sergei Shtylyov [Sat, 16 May 2009 17:03:36 +0000 (19:03 +0200)]
icside: register second channel of version 6 PCB
The second IDE channel of version 6 PCB is not being registered anymore since
the commit 48c3c1072651922ed153bcf0a33ea82cf20df390 (ide: add struct ide_host
(take 3)).
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
ide_tape_issue_pc() assumed drive->pc isn't NULL on invocation when
checking for back-to-back request sense issues but drive->pc can be
NULL and even when it's not NULL, it's not safe to dereference it once
the previous command is complete because pc could have been freed or
was on stack. Kill back-to-back REQUEST_SENSE detection.
Len Brown [Fri, 15 May 2009 05:29:31 +0000 (01:29 -0400)]
ACPI: Idle C-states disabled by max_cstate should not disable the TSC
Processor idle power states C2 and C3 stop the TSC on many machines.
Linux recognizes this situation and marks the TSC as unstable:
Marking TSC unstable due to TSC halts in idle
But if those same machines are booted with "processor.max_cstate=1",
then there is no need to validate C2 and C3, and no need to
disable the TSC, which can be reliably used as a clocksource.
Signed-off-by: Len Brown <len.brown@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
Len Brown [Thu, 14 May 2009 21:27:38 +0000 (17:27 -0400)]
ACPI: idle: fix init-time TSC check regression
A previous 2.6.30 patch, a71e4917dc0ebbcb5a0ecb7ca3486643c1c9a6e2,
(ACPI: idle: mark_tsc_unstable() at init-time, not run-time)
erroneously disabled the TSC on systems that did not actually
have valid deep C-states.
Move the check after the deep-C-states are validated,
via new helper, tsc_check_state(), hich replaces tsc_halts_in_c().
Signed-off-by: Len Brown <len.brown@intel.com> Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Frans Pop <elendil@planet.nl>
But we discovered that some BIOS do not restore BM_RLD
after suspend, causing device errors on C3 and C4
after resume. So now the kernel restores BM_RLD.
Linus Torvalds [Fri, 15 May 2009 23:47:55 +0000 (16:47 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI MSI: Fix MSI-X with NIU cards
PCI: Fix pci-e port driver slot_reset bad default return value
Bjorn Helgaas [Fri, 15 May 2009 21:30:50 +0000 (23:30 +0200)]
PM: check sysdev_suspend(PMSG_FREEZE) return value
Check the return value of sysdev_suspend(). I think this was a typo.
Without this change, the following "if" check is always false.
I also changed the error message so it's distinguishable from the
similar message a few lines above.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Linus Torvalds [Fri, 15 May 2009 20:22:11 +0000 (13:22 -0700)]
Merge branch 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel
* 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
drm/i915: Add new GET_PIPE_FROM_CRTC_ID ioctl.
drm/i915: Set HDMI hot plug interrupt enable for only the output in question.
drm/i915: Include 965GME pci ID in IS_I965GM(dev) to match UMS.
drm/i915: Use the GM45 VGA hotplug workaround on G45 as well.
drm/i915: ignore LVDS on intel graphics systems that lie about having it
drm/i915: sanity check IER at wait_request time
drm/i915: workaround IGD i2c bus issue in kernel side (v2)
drm/i915: Don't allow binding objects into the last page of the aperture.
drm/i915: save/restore fence registers across suspend/resume
drm/i915: x86 always has writeq. Add I915_READ64 for symmetry.
Linus Torvalds [Fri, 15 May 2009 19:04:37 +0000 (12:04 -0700)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of ssh://master.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: Media rotation rate and form factor heuristics
libata: Report disk alignment and physical block size
sata_fsl: Fix the command description of FSL SATA controller
sata_fsl: Fix compile warnings
[libata] sata_sx4: fixup interrupt handling
[libata] sata_sx4: convert to new exception handling methods
* git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6:
iwlwifi: fix device id registration for 6000 series 2x2 devices
ath5k: update channel in sw state after stopping RX and TX
rtl8187: use DMA-aware buffers with usb_control_msg
mac80211: avoid NULL ptr deref when finding max_rates in PID and minstrel
airo: airo_get_encode{,ext} potential buffer overflow
Pulled directly by Linus because Davem is off playing shuffle-board at
some Alaskan cruise, and the NULL ptr deref issue hits people and should
get merged sooner rather than later.
David - make us proud on the shuffle-board tournament!
libata: Media rotation rate and form factor heuristics
This patch provides new heuristics for parsing both the form factor and
media rotation rate ATA IDENFITY words.
The reported ATA version must be 7 or greater and the device must return
values defined as valid in the standard. Only then are the
characteristics reported to SCSI via the VPD B1 page.
This seems like a reasonable compromise to me considering that we have
been shipping several kernel releases that key off the rotation rate bit
without any version checking whatsoever. With no complaints so far.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
libata: Report disk alignment and physical block size
For disks with 4KB sectors, report the correct block size and alignment
when filling out the READ CAPACITY(16) response.
This patch is based upon code from Matthew Wilcox' 4KB ATA tree. I
fixed the bug I reported a while back caused by ATA and SCSI using
different approaches to describing the alignment.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Dave Liu [Thu, 14 May 2009 14:47:07 +0000 (09:47 -0500)]
sata_fsl: Fix the command description of FSL SATA controller
The bit 11 of command description is reserved bit in Freescale
SATA controller and needs to be set to '1'. This is needed to
make sure the last write from the controller to the buffer
descriptor is seen before an interrupt is raised.
Signed-off-by: Dave Liu <daveliu@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Kumar Gala [Thu, 14 May 2009 03:10:50 +0000 (22:10 -0500)]
sata_fsl: Fix compile warnings
We we build with dma_addr_t as a 64-bit quantity we get:
drivers/ata/sata_fsl.c: In function 'sata_fsl_fill_sg':
drivers/ata/sata_fsl.c:340: warning: format '%x' expects type 'unsigned int', but argument 4 has type 'dma_addr_t'
Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The extra call/return in the spinlock path is somehow
causing an increase in the cycles/instruction of somewhere around 2-7%
(seems to vary quite a lot from test to test). The working theory is
that the CPU's pipeline is getting upset about the
call->call->locked-op->return->return, and seems to be failing to
speculate (though I haven't seen anything definitive about the precise
reasons). This doesn't entirely make sense, because the performance
hit is also visible on unlock and other operations which don't involve
locked instructions. But spinlock operations clearly swamp all the
other pvops operations, even though I can't imagine that they're
nearly as common (there's only a .05% increase in instructions
executed).
If I disable just the pv-spinlock calls, my tests show that pvops is
identical to non-pvops performance on native (my measurements show that
it is actually about .1% faster, but Xiaohui shows a .05% slowdown).
Summary of results, averaging 10 runs of the "mmperf" test, using a
no-pvops build as baseline:
The clear effect here is that the 2% increase in CPI is
directly reflected in the final wallclock time.
(The other interesting effect is that the more ops are
out of line calls via pvops, the lower the cache access
and miss rates. Not too surprising, but it suggests that
the non-pvops kernel is over-inlined. On the flipside,
the branch misses go up correspondingly...)
So, what's the fix?
Paravirt patching turns all the pvops calls into direct calls, so
_spin_lock etc do end up having direct calls. For example, the compiler
generated code for paravirtualized _spin_lock is:
The indirect call will get patched to:
<_spin_lock+0>: mov %gs:0xb4c8,%rax
<_spin_lock+9>: incl 0xffffffffffffe044(%rax)
<_spin_lock+15>: callq <__ticket_spin_lock>
<_spin_lock+20>: nop; nop /* or whatever 2-byte nop */
<_spin_lock+22>: retq
One possibility is to inline _spin_lock, etc, when building an
optimised kernel (ie, when there's no spinlock/preempt
instrumentation/debugging enabled). That will remove the outer
call/return pair, returning the instruction stream to a single
call/return, which will presumably execute the same as the non-pvops
case. The downsides arel 1) it will replicate the
preempt_disable/enable code at eack lock/unlock callsite; this code is
fairly small, but not nothing; and 2) the spinlock definitions are
already a very heavily tangled mass of #ifdefs and other preprocessor
magic, and making any changes will be non-trivial.
The other obvious answer is to disable pv-spinlocks. Making them a
separate config option is fairly easy, and it would be trivial to
enable them only when Xen is enabled (as the only non-default user).
But it doesn't really address the common case of a distro build which
is going to have Xen support enabled, and leaves the open question of
whether the native performance cost of pv-spinlocks is worth the
performance improvement on a loaded Xen system (10% saving of overall
system CPU when guests block rather than spin). Still it is a
reasonable short-term workaround.
[ Impact: fix pvops performance regression when running native ]
Analysed-by: "Xin Xiaohui" <xiaohui.xin@intel.com> Analysed-by: "Li Xin" <xin.li@intel.com> Analysed-by: "Nakajima Jun" <jun.nakajima@intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Xen-devel <xen-devel@lists.xensource.com>
LKML-Reference: <4A0B62F7.5030802@goop.org>
[ fixed the help text ] Signed-off-by: Ingo Molnar <mingo@elte.hu>