git.karo-electronics.de Git - linux-beck.git/log

[SCSI] mvsas: add support for Task collector mode and fixed relative bugs

1. Add support for Task collector mode.
2. Fixed relative collector mode bug:
   - I/O failed when disks is on two ports
   - system hang when hotplug disk
   - system hang when unplug disk during run IO
3. Unlock ap->lock within .lldd_execute_task for direct mode to
   improve performance

Signed-off-by: Xiangliang Yu <yuxiangl@marvell.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] mvsas: add support for Marvell 88SE9445/88SE9485

This is support for Marvell 88SE9445/88SE9485 SAS/SATA HBA, which
is based on Marvell 88SE9480.

Signed-off-by: Xiangliang Yu <yuxiangl@marvell.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] scsi_dh_alua: Attach to UNAVAILABLE/OFFLINE AAS devices

The SCSI ALUA handler currently fails to attach to devices
reporting an UNAVAILABLE/OFFLINE AAS. But given that an
UNAVAILABLE/OFFLINE AAS can transition to other states
like ACTIVE/OPTIMIZED, ACTIVE/NON-OPTIMIZED, etc. as per
SPC4, this ALUA handler behavior should be rectified so
as to attach to devices which also report an
UNAVAILABLE/OFFLINE AAS.

Signed-off-by: Martin George <marting@netapp.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] mptfusion: Bump version 3.4.19

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] mptfusion: Adding inline data padding support for TAPE drive.

Adding support for inline data padding for TAPE drive when running U320.

[jejb: whitespace fixes]
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] mptfusion: Remove debug print from mptscsih_qcmd()

Remove debug print from mptscsih_qcmd function call.
This debug print cause flood of prints and difficult to debug other issues.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] bnx2fc: increase cleanup wait time

FW may take more time cleaning up IOs issued to multiple targets.

Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] bnx2fc: Do not use HBA_DBG macro when lport is not available

Use MISC_DBG instead.

Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] bnx2fc: call scsi_done if session goes to not ready from ready

If the session is not ready yet, we ask the SCSI-ml to retry. However, if the
session is just uploaded, we should not retry, but instead call scsi_done to
fail the IO.

Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] bnx2fc: Release the reference to hba only after the interface is destroyed

Prematurely decrementing the reference may lead to cmd_mgr becoming NULL with
the cmds are still active.

Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] scsi_transport_fc: Fix deadlock during fc_remove_host

Creating and destroying fcoe interface in a tight loop leads to a system
deadlock with the following call traces:

Call Trace:
[<ffffffff814f4b3d>] schedule_timeout+0x1fd/0x2c0
[<ffffffff814f469f>] ? wait_for_common+0x4f/0x190
[<ffffffff814f469f>] ? wait_for_common+0x4f/0x190
[<ffffffff814f4737>] wait_for_common+0xe7/0x190
[<ffffffff81042fa0>] ? default_wake_function+0x0/0x20
[<ffffffff81082c2d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff814f48bd>] wait_for_completion+0x1d/0x20
[<ffffffff81066d90>] flush_workqueue+0x290/0x5f0
[<ffffffff81066b00>] ? flush_workqueue+0x0/0x5f0
[<ffffffff81067148>] destroy_workqueue+0x38/0x340
[<ffffffffa0260289>] fc_remove_host+0x1b9/0x1f0 [scsi_transport_fc]
[<ffffffffa02ed195>] bnx2fc_if_destroy+0xc5/0x1f0 [bnx2fc]
[<ffffffffa02ed33a>] bnx2fc_destroy+0x7a/0x100 [bnx2fc]
[<ffffffffa02c789b>] fcoe_transport_destroy+0x9b/0x1b0 [libfcoe]
[<ffffffff81069ec2>] param_attr_store+0x52/0x80
[<ffffffff81069976>] module_attr_store+0x26/0x30
[<ffffffff8119e726>] sysfs_write_file+0xe6/0x170
[<ffffffff81134710>] vfs_write+0xd0/0x1a0
[<ffffffff811348e4>] sys_write+0x54/0xa0
[<ffffffff81002e02>] system_call_fastpath+0x16/0x1b
Call Trace:
[<ffffffff81074865>] async_synchronize_cookie_domain+0x75/0x120
[<ffffffff8106caa0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff81074925>] async_synchronize_cookie+0x15/0x20
[<ffffffff8107494c>] async_synchronize_full+0x1c/0x40
[<ffffffffa0057466>] sd_remove+0x36/0xc0 [sd_mod]
[<ffffffff81358a75>] __device_release_driver+0x75/0xe0
[<ffffffff81358bef>] device_release_driver+0x2f/0x50
[<ffffffff81357aee>] bus_remove_device+0xbe/0x120
[<ffffffff813553ef>] device_del+0x12f/0x1e0
[<ffffffff8137454d>] __scsi_remove_device+0xbd/0xc0
[<ffffffff81374585>] scsi_remove_device+0x35/0x50
[<ffffffff813746a7>] __scsi_remove_target+0xe7/0x110
[<ffffffff81374730>] ? __remove_child+0x0/0x30
[<ffffffff81374753>] __remove_child+0x23/0x30
[<ffffffff81354a2c>] device_for_each_child+0x4c/0x80
[<ffffffff81374703>] scsi_remove_target+0x33/0x60
[<ffffffffa02622c6>] fc_starget_delete+0x26/0x30 [scsi_transport_fc]
[<ffffffffa026271a>] fc_rport_final_delete+0xaa/0x200 [scsi_transport_fc]
[<ffffffff8106585a>] process_one_work+0x1aa/0x540
[<ffffffff810657eb>] ? process_one_work+0x13b/0x540
[<ffffffffa0262670>] ? fc_rport_final_delete+0x0/0x200 [scsi_transport_fc]
[<ffffffff81067ac9>] worker_thread+0x179/0x410
[<ffffffff81067950>] ? worker_thread+0x0/0x410
[<ffffffff8106c546>] kthread+0xb6/0xc0
[<ffffffff8103879b>] ? finish_task_switch+0x4b/0xe0
[<ffffffff81003ca4>] kernel_thread_helper+0x4/0x10
[<ffffffff814f7994>] ? restore_args+0x0/0x30
[<ffffffff8106c490>] ? kthread+0x0/0xc0
[<ffffffff81003ca0>] ? kernel_thread_helper+0x0/0x10

fc_remove_host() waits for flushing the workqueue, but it is stuck at flushing
the first work. The first work doesnt complete, because it is waiting for async
layer to complete the IOs. The async layer cannot complete the IO as the
terminate_rport_io for the second work was not called, which will be called
only when the first work completes. Hence the deadlock. To resolve this
deadlock, the workqueue allocation has been modified from
create_singlethread_workqueue() to alloc_workqueue().

In addition, fc_terminate_rport_io() should be called before the
scsi_flush_work() to avoid the similar deadlock as above.

scsi fc alloc queue. move terminate rport io before flush

Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] lpfc 8.3.23: Update driver version to 8.3.23

Update driver version to 8.3.22

Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] lpfc 8.3.23: BSG additions and fixes

- Fixed the mixed declarations and codes which violate ISO C90
(declarations in subsections that assign at declaration)
- Add BSG data transfer size protection in mailbox command pass-through path
- Invoke BSG job_done while holding spinlock to fix deadlock
- Added support for checking SLI_CONFIG subcommands
- Fixed bug in BSG mailbox size check to non-embedded external buffer

Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] lpfc 8.3.23: Fixes related to new hardware

Fixes related to new hardware

- Restrict driver to look at BAR2 or BAR4 only for if_type 0.
- Allow SLI4 with FCOE_MODE not set for new SLI4 FC adapters.
- Add Temporary RPI field to the ELS request WQE.
- Do not override CT field in issue_els_flogi for SLI4 IF type 2
- For RQ_CREATE_V2 mbx cmd: fill in the rqe_size and page_size for RQ_CREATE.

Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] lpfc 8.3.23: Miscellaneous fixes

Miscellaneous fixes

- Do not limit RPI Count to a minimum of 64
- Fix FCFI incorrect on received unsolicited frames.
- Save the FCFI returned in the REG_FCFI mailbox command if it was successful.
- Fixed Vports not sending FDISC after lips.
- Align based on the SLI4_PAGE_SIZE.
- Fixed double byte swap on received RRQ.
- Fixed mask size for the wq_id mask from 0x7F to 0x7FFF.
- Clear FC_FABRIC flag when NPIV LOGO completes (and add a log message).
- Modified driver to skip round robin only when ulpStatus==LOCAL_REJECT
and word4=SEQUENCE_TIMEOUT to prevent FLOGI to disconnected FCF.
- Don't add rport if driver unloading

Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] lpfc 8.3.23: Debugfs enhancements

Debugfs enhancements

- Added iDiag support for new adapters.
- Added queue entry access methods.
- Fix host/port index in decimal
- Added Doorbell register access methods.

Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] bfa: Move debugfs initialization before bfa init.

Move the initialization of debugfs before bfa init, to enable us to
collect driver/firmware traces if init fails. Also add a printk to
display message on bfa_init failure.

Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] bfa: firmware download fix

This patch includes fixes for two issues releated to firmware download
implementation: 1) Merged memory leak fix provided by Jesper Juhl
<jj@chaosbits.net>. Basically we need to call release_firmware() after
request_firmware(). 2) fixed issues with the firmware download interface
as pointed out by Rolf Eike Beer <eike@sf-mail.de> in linux-scsi. Rearranged
the code and fixed related function protypes.

Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] ipr: improve interrupt service routine performance

During performance testing on P7 machines it was observed that the interrupt
service routine was doing unnecessary MMIO operations.

This patch rearranges the logic of the routine and moves some of the code out
of the main routine.  The result is that there are now fewer MMIO operations in
the performance path of the code.

As a result of the above change, an existing condition was exposed where the
driver could get an "unexpected" hrrq interrupt.  The original code would flag
the interrupt as unexpected and then reset the adapter.  After further analysis
it was confirmed that this condition can occasionally occur and that the
interrupt can safely be ignored.  Additional code in this patch detects this
condition, clears the interrupt and allows the driver to continue without
resetting the adapter.

Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] scsi_dh_rdac : decide whether to send mode select based on operating mode

Based on the operating modes, handler decides whether to send mode
select or not. Purpose here is to reduce io-shipping as much as
possible whenever there is an option.

Signed-off-by: Babu Moger <babu.moger@lsi.com>
Reviewed-by: Yanling Qi <yanling.qi@lsi.com>
Reviewed-by: Sudhir Dachepalli <Sudhir.Dachepalli@lis.com>
Reviewed-by: Somasundaram Krishnasamy <Somasundaram.Krishnasamy@lsi.com>
Reviewed-by: Bob Stankey <Robert.Stankey@lsi.com>
Reviewed-by: Vijay Chauhan <Vijay.Chauhan@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] scsi_dh_rdac : Detect the different RDAC operating modes

This patch detects different operating RDAC modes during the
discovery. It also collects the information about the preferred path.

Signed-off-by: Babu Moger <babu.moger@lsi.com>
Reviewed-by: Yanling Qi <yanling.qi@lsi.com>
Reviewed-by: Sudhir Dachepalli <Sudhir.Dachepalli@lis.com>
Reviewed-by: Somasundaram Krishnasamy <Somasundaram.Krishnasamy@lsi.com>
Reviewed-by: Bob Stankey <Robert.Stankey@lsi.com>
Reviewed-by: Vijay Chauhan <Vijay.Chauhan@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] scsi_dh_rdac : Add definitions for different RDAC operating modes

This patch adds definitions to support for different operating modes
for LSI rdac storage. Currently, rdac support 3 operation modes.

1. RDAC mode(legacy)
2. AVT mode
3. IOSHIP mode

These definitions are used while activating the path(rdac_activate).

Signed-off-by: Babu Moger <babu.moger@lsi.com>
Reviewed-by: Yanling Qi <yanling.qi@lsi.com>
Reviewed-by: Sudhir Dachepalli <Sudhir.Dachepalli@lis.com>
Reviewed-by: Somasundaram Krishnasamy <Somasundaram.Krishnasamy@lsi.com>
Reviewed-by: Bob Stankey <Robert.Stankey@lsi.com>
Reviewed-by: Vijay Chauhan <Vijay.Chauhan@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] ipr: remove unneeded volatile declarations

This patch removes three volatile declarations based on some feedback and code
analysis.

Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] mpt2sas : WarpDrive New product SSS6200 support added

This patch has Support for the new solid state device product SSS6200
from LSI and relavent features w.r.t SSS6200.

The major feature added in this driver is supporting Direct-I/O to the
SSS6200 storage.There are some additional changes done to avoid exposing
the RAID member disks to the OS and hiding/exposing drives based on the
OEM Specific Flag in Manufacturing Page10 (this is required to handle
specific changes in the SSS6200 firmware).

Each and every changes are listed below.
1. Hiding IR related messages.
For SSS6200, the driver is modified not to print IR related events.
Even if the debugging is enabled the IR related messages will not be displayed.
In some places if there is a need to display a message related to IR the
string "IR" is replaced with string "DD" and the string "volume" is replaced
with "direct drive". But the function names are not changed hence there are
some places where the reference to volume can be seen if debug level is set.

2. Removed RAID transport support
In Linux the user can retrieve RAID volume information from the sysfs directory.
This support is removed for SSS6200.

3. Direct I/O support.
The driver tries to enable direct I/O when a volume is reported to the driver
by the firmware through IRCC events and the driver does this just before
reporting to the OS, hence all the OS issued I/O can go through direct path
if they can, The first validation is to see whether the manufacturing page10
flag is set to expose all drives always. If that is set, the driver will not
enable direct I/O and displays the message "DDIO" is disabled globally as
drives are exposed. The driver checks whether there is more than one volume
in the controller, if so the direct I/O will be disabled globally for all
volumes in the controller and the message displayed will be "DDIO is disabled
globally as number of drives > 1.
If retrieving number of PD is failed the driver will not enable direct I/O
and displays the message Failure in computing number of drives DDIO disabled.
If memory allocation for RAIDVolumePage0 is failed, the driver will not enable
direct I/O and displays the message Memory allocation failure for
RVPG0 DDIO disabled. If retrieving RAIDVolumePage0 is failed the driver will
not enable direct I/O and displays the message Failure in retrieving
RVPG0 DDIO disabled

If the number of PD in a volume is greater than 8, then the direct I/O will
be disabled.
If any of individual drives handle retrieval is failed then the DD-IO will
be disabled.
If the volume is not RAID0 or if the block size is not 512 then the DD-IO will
be disabled.
If the volume size is greater than 2TB then the DD-IO will be disabled.
If the driver is not able to find a valid stripe exponent using the configured
stripe size then the DD-IO will be disabled

When the DD-IO is enabled the driver will check every I/O request issued to
the storage and checks whether the request is either
READ6/WRITE6/READ10/WRITE10, if it is and if the complete I/O transfer
is within a stripe size then the I/O is redirected to
the drive directly instead of the volume.

On completion of every I/O, if the completion is failure means if the reply
is address reply with a reply frame associated with it, then the type of I/O
will be checked, if the I/O is direct then the I/O will be retried to
the volume once.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Reviewed-by: Eric Moore <eric.moore@lsi.com>
Reviewed-by: Sathya Prakash <sathya.prakash@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] fusion: do not check serial_number in the abort handler

The SCSI midlayer stops all command processing when in error handling, which
means there is no chance for command reuse when the abort handler is called.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] mpt2sas: do not check serial_number in the abort handler

The SCSI midlayer stops all command processing when in error handling, which
means there is no chance for command reuse when the abort handler is called.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: "Moore, Eric" <Eric.Moore@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] usb-storage: do not increment cmd->serial_number

The isd200 sub-driver increments the command serial number despite not
using it at all in it's routine for sending internal scsi commands.
Remove the increment to prepare for removing the serial_number field.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] remove cmd->serial_number litter

Stop using cmd->serial_number in printks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] fcoe: have fcoe log off and lport destroy before ndo_fcoe_disable

Currently fcoe interface cleanup is done after ndo_fcoe_disable
and that prevents logoff going out to the peer, so this patch
moves all netdev cleanup and its releasing inside
fcoe_interface_cleanup to have log off before ndo_fcoe_disable
disables the fcoe.

This patch also fixes asymmetric rtnl locking around fcoe_if_destroy,
as currently this function requires rtnl held by its caller
and then have this func drops the lock, instead now don't have
any processing under rtnl inside fcoe_if_destroy, this required
moving few func to get build working again.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] libfc: rec tov value and REC_TOV_CONST units usages is incorrect

Added REC_TOV_CONST intent was to have rec tov as e_d_tov + 1s
but currently it is e_d_tov + 1ms since e_d_tov is stored in ms
unit.

Also returned rec tov by get_fsp_rec_tov is in ms and this ms tov
is used as-is with fc_fcp_timer_set expecting jiffies tov.

Fixed this by having get_fsp_rec_tov return rec tov in jiffies
as e_d_tov + 1s and then use jiffies tov w/ fc_fcp_timer_set.

Also some cleanup, no need to cache get_fsp_rec_tov return value
in local rec_tov at various places.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] libfc: remove duplicate ema_list init

As ema_list is already initialized by libfc_host_alloc.

Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] libfcoe: fix wrong comment in fcoe_transport_detach

fix typo of '_attach' -> '_detach' in the comment.

Reported-by: Frank Zhang <frank_1.zhang@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] libfcoe: fix possible buffer overflow in fcoe_transport_show

possible buffer overflow in fcoe_transport_show when reaching the end of
buffer and crossing PAGE_SIZE boundary.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] libfcoe: clean up netdev mapping properly when the transport goes away

When rmmoving the underlying fcoe transport driver module by force when
it's attached and in use, the correspoding netdev mapping should be
cleaned up properly as well, otherwise the lookup for a given netdev
for the transport would still return non NULL pointer, causing "unable
to handle paging request" bug.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] libfc: Move host_lock usage into ramp_up/down routines

The host_lock is still used to protect the can_queue
value in the Scsi_Host, but it doesn't need to be held
and released by each caller. This patch moves the lock
usage into the fc_fcp_can_queue_ramp_up and
fc_fcp_can_queue_ramp_down routines.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] esp, scsi_tgt_lib, fcoe: use list_move() instead of list_del()/list_add() combination

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] fcoe: remove unnecessary module state check

The check of module state being MODULE_STATE_LIVE is no longer needed for the
individual fcoe transport driver, e.g., fcoe.ko, as sysfs entries now go to
libfcoe now, if it reaches fcoe.ko, it has to be already registered. The module
state check for libfcoe will guard the possible race condition of sysfs being
writable before module_init function is called and after module_exit.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] fcoe: Remove mutex_trylock/restart_syscall checks

These checks were initially added to avoid a lockdep
false positive when dealing with the s_active, rtnl
and fcoe_config_mutex mutexes. Recently the create,
destroy, enable and disable sysfs entries were moved
from fcoe.ko to libfcoe.ko. With this change the mutex
usage was shuffled around and the lockdep false
positive stopped happening. We can now remove these
checks.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] libfcoe: Remove mutex_trylock/restart_syscall checks

This code was incorrectly ported from fcoe.c when the
fcoe transport infrastructure was put into place. It
was originally needed in fcoe.c when dealing with
the rtnl mutex. In that code it was only needed to
avoid a lockdep false positive. In libfcoe we don't
deal with the rtnl mutex, we don't get the lockdep
false positive and therefore we don't need these
checks.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] bnx2fc: introduce missing kfree

Error handling code following a kmalloc should free the allocated data.

The semantic match that finds the problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@r exists@
local idexpression x;
statement S;
expression E;
identifier f,f1,l;
position p1,p2;
expression *ptr != NULL;
@@

x@p1 = $kmalloc\|kzalloc\|kcalloc$(...);
...
if (x == NULL) S
<... when != x
when != if (...) { <+...x...+> }
(
x->f1 = E
|
(x->f1 == NULL || ...)
|
f(...,x->f1,...)
)
...>
(
return $0\|<+...x...+>\|ptr$;
|
return@p2 ...;
)

@script:python@
p1 << r.p1;
p2 << r.p2;
@@

print "* file: %s kmalloc %s return %s" % (p1[0].file,p1[0].line,p2[0].line)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] ipr: fix synchronous request flags for better performance

In testing it was noticed that Extended Delay after Reset flag was being set
for gscsi and volume set devices. This had a negative effect on performance
for volume sets. The fix is to only set the flag for gscsi devices.

Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Update version number to 8.03.07.03-k.

A minor change in the versioning. We'll be attaching the "-k"
in the end.

Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Log fcport state transitions when debug messages are enabled.

Add the inline function qla2x00_set_port_state() so that when a fcport state
transition happens we can log the state transition if debug messages are
enabled.

Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Verify login-state has transitioned to PRLI-completed.

Before driver's own internal state is marked as PLOGI/PRLI
complete. This additional check closes a window seen with
dual-personality initiator/target devices where a driver's
PLOGI/PRLI request occurs within the window after the target's
PLOGI request has completed, but prior to the target's PRLI
arriving and processed by the firmware. Without this additional
check, the firmware will return port-information stating that the
port neither supports target nor initiator functions, causing the
driver to register the rport prematurely to the FC-transport
without the proper 'roles' being set.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Limit the logs in case device state does not change for ISP82xx.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Add the ql2xdontresethba module_param.

Also, change the ISP82xx code to only reset if this module_param is set
and reset is intended via the QLA82XX_DEV_NEED_RESET case.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Display hardware/firmware registers to get more information about the error for ISP82xx.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Add test for valid loop id to qla2x00_relogin().

If fabric device has invalid loop id (FC_NO_LOOP_ID) then call
qla2x00_find_new_loop_id() to attempt to obtain valid loop id.

Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Perform FCoE context reset before trying adapter reset for ISP82xx.

For certain failures, try to recover first by doing FCoE context reset before
attempting big hammer approach(adpater reset).

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Remove extra call to qla82xx_check_fw_alive().

The stanadlone call to qla82xx_check_fw_alive() in qla82xx_watchdog()
is a typo, so remove it.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Updated the reset sequence for ISP82xx.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Update copyright banner.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Update License file.

Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Free firmware PCB on logout request.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Include request queue ID in the upper 16-bits of the I/O handle for Abort I/O IOCBs.

The upper 16-bits of the handle for all I/O in multi-queue supported
drivers carries the ID of the request queue it was submitted on. When
using Abort I/O IOCB, the driver needs to also populate the upper
16-bits in the handle_to_abort field so the fw can correlate with the
actual I/O.

Signed-off-by: Mike Hernandez <michael.hernandez@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Remove extraneous setting of FCF_ASYNC_SENT during login-done completion.

The bit is already set upon entry.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Check for a match before attempting to set FCP-priority information.

Modifying qla24xx_get_fcp_prio() to return a 'found' status
allows the driver to short circuit the 'set FCP-priority' call
and reduce the amount of noise generated in the messages file:

scsi(5): Unable to activate fcp priority, ret=0x102
scsi(5): Unable to activate fcp priority, ret=0x102

Also make qla24xx_get_fcp_prio() static.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Correct calling contexts of qla2x00_mark_device_lost() in async paths.

The respective done() functions are called from process context,
so there's no reason to 'defer' the request.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] qla2xxx: Display PortID information during FCP command-status handling.

To provide a clearer translation of the command-status origin in
relation to the midlayer's standard SCSI nexus.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] be2iscsi: Fix for proper setting of FW

There was a bug in setting up type and dmsg for FW

Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] be2iscsi: check boot_kset is created before destroying it

Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] be2iscsi: Set a timeout to FW

Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] be2iscsi: Modifying Maintainer's emailid

- Modifying Maintainer's emailid to emulex as Emulex has
acquired Serverengines

Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] be2iscsi: change in copyright notice

- Modifying copyright year to 2011
- Replacing Serverengines with Emulex as Serverengines Corp
has been acquired by Emulex Corp

Signed-off-by: Jayamohan Kallickal <jayamohan.kallickal@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

[SCSI] Log thin provisioning threshold event

At least log the message that we received a THIN PROVISIONING SOFT
THRESHOLD REACHED Unit Attention. Also added it to unit attention
decodes.

Signed-off-by: Shyam Iyer <shyam_iyer@dell.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>

Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: only force kblockd unplugging from the schedule() path
  block: cleanup the block plug helper functions
  block, blk-sysfs: Use the variable directly instead of a function call
  block: move queue run on unplug to kblockd
  block: kill queue_sync_plugs()
  block: readd plug trace event
  block: add callback function for unplug notification
  block: add comment on why we save and disable interrupts in flush_plug_list()
  block: fixup block IO unplug trace call
  block: remove block_unplug_timer() trace point
  block: splice plug list to local context

Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6

* 'linux-next' of git://git.infradead.org/ubifs-2.6:
UBIFS: fix compilation warnings when compiling with gcc 4.5
UBIFS: fix oops when R/O file-system is fsync'ed

vfs: fix incorrect dentry_update_name_case() BUG_ON() test

The case we should be verifying when updating the dentry name is that
the _parent_ inode (the directory) semaphore is held, not the semaphore
for the dentry itself. It's the directory locking that rename and
readdir() etc all care about.

The comment just above even says so - but then the BUG_ON() still
checked the dentry inode itself.

Very few people noticed, because this helper function really isn't used
for very much, so you had to be using ncpfs to ever hit it.

I think I should just remove the BUG_ON (the function really has just
one user), but let's run with it fixed for a while before getting rid of
it entirely.

Reported-and-tested-by: Bongani Hlope <bonganih@bankservafrica.com>
Reported-and-tested-by: Bernd Feige <bernd.feige@uniklinik-freiburg.de>
Cc: Petr Vandrovec <petr@vandrovec.name>,
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

block: only force kblockd unplugging from the schedule() path

For the explicit unplugging, we'd prefer to kick things off
immediately and not pay the penalty of the latency to switch
to kblockd. So let blk_finish_plug() do the run inline, while
the implicit-on-schedule-out unplug will punt to kblockd.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

block: cleanup the block plug helper functions

It's a bit of a mess currently. task->plug is being cleared
and reset in __blk_finish_plug(), and blk_finish_plug() is
testing for a NULL plug which cannot happen even from schedule()
anymore since it uses blk_needs_flush_plug() to determine
whether to call into this function at all.

So get rid of some of the cruft.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/vapier/blackfin:
  Blackfin: SMP: fix cache flush loop
  Blackfin: time-ts: ack gptimer sooner to avoid missing short ints
  Blackfin: gptimers: fix thinko when disabling timers
  Blackfin: SMP: make all barriers handle cache issues

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: fix linger request requeueing

mm/thp: use conventional format for boolean attributes

The conventional format for boolean attributes in sysfs is numeric ("0" or
"1" followed by new-line). Any boolean attribute can then be read and
written using a generic function. Using the strings "yes [no]", "[yes]
no" (read), "yes" and "no" (write) will frustrate this.

[akpm@linux-foundation.org: use kstrtoul()]
[akpm@linux-foundation.org: test_bit() doesn't return 1/0, per Neil]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Tested-by: David Rientjes <rientjes@google.com>
Cc: NeilBrown <neilb@suse.de>
Cc: <stable@kernel.org> [2.6.38.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ramfs: fix memleak on no-mmu arch

On no-mmu arch, there is a memleak during shmem test.  The cause of this
memleak is ramfs_nommu_expand_for_mapping() added page refcount to 2
which makes iput() can't free that pages.

The simple test file is like this:

  int main(void)
  {
int i;
key_t k = ftok("/etc", 42);

for ( i=0; i<100; ++i) {
int id = shmget(k, 10000, 0644|IPC_CREAT);
if (id == -1) {
printf("shmget error\n");
}
if(shmctl(id, IPC_RMID, NULL ) == -1) {
printf("shm  rm error\n");
return -1;
}
}
printf("run ok...\n");
return 0;
  }

And the result:

  root:/> free
               total         used         free       shared      buffers
  Mem:         60320        17912        42408            0            0
  -/+ buffers:              17912        42408
  root:/> shmem
  run ok...
  root:/> free
               total         used         free       shared      buffers
  Mem:         60320        19096        41224            0            0
  -/+ buffers:              19096        41224
  root:/> shmem
  run ok...
  root:/> free
               total         used         free       shared      buffers
  Mem:         60320        20296        40024            0            0
  -/+ buffers:              20296        40024
  ...

After this patch the test result is:(no memleak anymore)

  root:/> free
               total         used         free       shared      buffers
  Mem:         60320        16668        43652            0            0
  -/+ buffers:              16668        43652
  root:/> shmem
  run ok...
  root:/> free
               total         used         free       shared      buffers
  Mem:         60320        16668        43652            0            0
  -/+ buffers:              16668        43652

Signed-off-by: Bob Liu <lliubbo@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

um: disable CONFIG_CMPXCHG_LOCAL

Commit 8a5ec0ba "Lockless (and preemptless) fastpaths for slub" makes use
of this_cpu_cmpxchg_double() which needs this_cpu_cmpxchg16b_emu() on
x86_64. Implementing cmpxchg16b emulation for UML would introduce too
much complexity. So just disable it.

Signed-off-by: Richard Weinberger <richard@nod.at>
Reported-by: Sergei Trofimovich <slyich@gmail.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

um: fix call tracer and bug handler

Commit 1de1502c ("x86, um: now we can get rid of trivial uml headers")
removed accidentally bug.h which broke UML's call tracer and bug
handler.

Without asm-generic/bug.h UML uses BUG() from arch/x86/ which makes use
of ud2. UML cannot use ud2, it raises SIGILL in user mode. As UML has
a different stack for handling signals the call trace will be cut off.

Signed-off-by: Richard Weinberger <richard@nod.at>
Reported-by: Sergei Trofimovich <slyich@gmail.com>
Tested-by: Sergei Trofimovich <slyich@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/fhandle.c: add <linux/personality.h> for ia64

force_o_largefile() on ia64 is defined in <asm/fcntl.h> and requires
<linux/personality.h>.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: change mail adress of Hans J. Koch

My old mail address doesn't exist anymore. This patch changes all
occurences in MAINTAINERS to my new address.

Signed-off-by: Hans J. Koch <hjk@hansjkoch.de>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

RapidIO/mpc85xx: fix possible mport registration problems

Fix a possible problem with mport registration left non-cleared after
fsl_rio_setup() exits on link error. Abort mport initialization if
registration failed.

This patch is applicable to 2.6.39-rc1 only. The problem does not exist
for earlier versions.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

RapidIO: add IDT CPS-1432 switch definitions

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

oom-kill: remove boost_dying_task_prio()

This is an almost-revert of commit 93b43fa ("oom: give the dying task a
higher priority").

That commit dramatically improved oom killer logic when a fork-bomb
occurs.  But I've found that it has nasty corner case.  Now cpu cgroup has
strange default RT runtime.  It's 0!  That said, if a process under cpu
cgroup promote RT scheduling class, the process never run at all.

If an admin inserts a !RT process into a cpu cgroup by setting
rtruntime=0, usually it runs perfectly because a !RT task isn't affected
by the rtruntime knob.  But if it promotes an RT task via an explicit
setscheduler() syscall or an OOM, the task can't run at all.  In short,
the oom killer doesn't work at all if admins are using cpu cgroup and don't
touch the rtruntime knob.

Eventually, kernel may hang up when oom kill occur.  I and the original
author Luis agreed to disable this logic.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Luis Claudio R. Goncalves <lclaudio@uudg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

vmscan: all_unreclaimable() use zone->all_unreclaimable as a name

all_unreclaimable check in direct reclaim has been introduced at 2.6.19
by following commit.

2006 Sep 25; commit 408d8544; oom: use unreclaimable info

And it went through strange history. firstly, following commit broke
the logic unintentionally.

2008 Apr 29; commit a41f24ea; page allocator: smarter retry of
      costly-order allocations

Two years later, I've found obvious meaningless code fragment and
restored original intention by following commit.

2010 Jun 04; commit bb21c7ce; vmscan: fix do_try_to_free_pages()
      return value when priority==0

But, the logic didn't works when 32bit highmem system goes hibernation
and Minchan slightly changed the algorithm and fixed it .

2010 Sep 22: commit d1908362: vmscan: check all_unreclaimable
      in direct reclaim path

But, recently, Andrey Vagin found the new corner case. Look,

struct zone {
  ..
        int                     all_unreclaimable;
  ..
        unsigned long           pages_scanned;
  ..
}

zone->all_unreclaimable and zone->pages_scanned are neigher atomic
variables nor protected by lock.  Therefore zones can become a state of
zone->page_scanned=0 and zone->all_unreclaimable=1.  In this case, current
all_unreclaimable() return false even though zone->all_unreclaimabe=1.

This resulted in the kernel hanging up when executing a loop of the form

1. fork
2. mmap
3. touch memory
4. read memory
5. munmmap

as described in
http://www.gossamer-threads.com/lists/linux/kernel/1348725#1348725

Is this ignorable minor issue?  No.  Unfortunately, x86 has very small dma
zone and it become zone->all_unreclamble=1 easily.  and if it become
all_unreclaimable=1, it never restore all_unreclaimable=0.  Why?  if
all_unreclaimable=1, vmscan only try DEF_PRIORITY reclaim and
a-few-lru-pages>>DEF_PRIORITY always makes 0.  that mean no page scan at
all!

Eventually, oom-killer never works on such systems.  That said, we can't
use zone->pages_scanned for this purpose.  This patch restore
all_unreclaimable() use zone->all_unreclaimable as old.  and in addition,
to add oom_killer_disabled check to avoid reintroduce the issue of commit
d1908362 ("vmscan: check all_unreclaimable in direct reclaim path").

Reported-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: check that we have the right vma in __access_remote_vm()

In __access_remote_vm() we need to check that we have found the right
vma, not the following vma before we try to access it.  Otherwise we
might call the vma's access routine with an address which does not fall
inside the vma.

It was discovered on a current kernel but with an unreleased driver,
from memory it was strace leading to a kernel bad access, but it
obviously depends on what the access implementation does.

Looking at other access implementations I only see:

  $ git grep -A 5 vm_operations|grep access
  arch/powerpc/platforms/cell/spufs/file.c- .access = spufs_mem_mmap_access,
  arch/x86/pci/i386.c- .access = generic_access_phys,
  drivers/char/mem.c- .access = generic_access_phys
  fs/sysfs/bin.c- .access = bin_access,

The spufs one looks like it might behave badly given the wrong vma, it
assumes vma->vm_file->private_data is a spu_context, and looks like it
would probably blow up pretty quickly if it wasn't.

generic_access_phys() only uses the vma to check vm_flags and get the
mm, and then walks page tables using the address.  So it should bail on
the vm_flags check, or at worst let you access some other VM_IO mapping.

And bin_access() just proxies to another access implementation.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

brk: COMPAT_BRK: fix detection of randomized brk

5520e89 ("brk: fix min_brk lower bound computation for COMPAT_BRK")
tried to get the whole logic of brk randomization for legacy
(libc5-based) applications finally right.

It turns out that the way to detect whether brk has actually been
randomized in the end or not introduced by that patch still doesn't work
for those binaries, as reported by Geert:

: /sbin/init from my old m68k ramdisk exists prematurely.
:
: Before the patch:
:
: | brk(0x80005c8e) = 0x80006000
:
: After the patch:
:
: | brk(0x80005c8e) = 0x80005c8e
:
: Old libc5 considers brk() to have failed if the return value is not
: identical to the requested value.

I don't like it, but currently see no better option than a bit flag in
task_struct to catch the CONFIG_COMPAT_BRK && randomize_va_space == 2
case.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/misc/sgi-gru/grufile.c: fix the wrong members of gru_chip

Fix the wrong members and the wrong function's definition, since the
irq_chip had changed.

Signed-off-by: Wanlong Gao <wanlong.gao@gmail.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

tmpfs: fix off-by-one in max_blocks checks

If you fill up a tmpfs, df was showing

  tmpfs                   460800         -         -   -  /tmp

because of an off-by-one in the max_blocks checks.  Fix it so df shows

  tmpfs                   460800    460800         0 100% /tmp

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: update STABLE BRANCH info

Drop Chris Wright from STABLE maintainers. He hasn't done STABLE release
work for quite some time.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Cc: Greg KH <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: add VM counters for transparent hugepages

I found it difficult to make sense of transparent huge pages without
having any counters for its actions. Add some counters to vmstat for
allocation of transparent hugepages and fallback to smaller pages.

Optional patch, but useful for development and understanding the system.

Contains improvements from Andrea Arcangeli and Johannes Weiner

[akpm@linux-foundation.org: coding-style fixes]
[hannes@cmpxchg.org: fix vmstat_text[] entries]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: update various tty patterns

Commits 4a6514e6d0 ("tty: move obsolete and broken tty drivers to
drivers/staging/tty/") and a6afd9f3e8 ("tty: move a number of tty drivers
from drivers/char/ to drivers/tty/") moved files around.

Update patterns and orphan some files that were moved to staging.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: update m68knommu patterns

Commit 66d857b08b ("m68k: merge m68k and m68knommu arch directories")
moved the files around.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: add ARM/ts78xx-setup platform maintainer

Signed-off-by: Alexander Clouter <alex@digriz.org.uk>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kstrtox: simpler code in _kstrtoull()

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kstrtox: fix compile warnings in test

Fix the following warnings:

    CC [M]  lib/test-kstrtox.o
  lib/test-kstrtox.c: In function 'test_kstrtou64_ok':
  lib/test-kstrtox.c:318: warning: this decimal constant is unsigned only in ISO C90
...

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

leds/leds-regulator.c: fix handling of already enabled regulators

Make the driver aware of the initial status of the regulator.

The leds-regulator driver was ignoring the initial status of the
regulator; this resulted in rdev->use_count being incremented to 2 after
calling regulator_led_set_value() in the .probe method when a regulator
was already enabled at insmod time, which made it impossible to ever
disable the regulator.

Signed-off-by: Antonio Ospite <ospite@studenti.unina.it>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Antonio Ospite <ospite@studenti.unina.it>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Liam Girdwood <lrg@slimlogic.co.uk>
Cc: Daniel Ribeiro <drwyrm@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

vmstat: update comment regarding stat_threshold

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/page_alloc.c: silence build_all_zonelists() section mismatch

The memory hotplug case involves calling to build_all_zonelists() which
in turns calls in to setup_zone_pageset(). The latter is marked
__meminit while build_all_zonelists() itself has no particular
annotation. build_all_zonelists() is only handed a non-NULL pointer in
the case of memory hotplug through an existing __meminit path, so the
setup_zone_pageset() reference is always safe.

The options as such are either to flag build_all_zonelists() as __ref (as
per __build_all_zonelists()), or to simply discard the __meminit
annotation from setup_zone_pageset().

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/partitions/ldm.c: fix oops caused by corrupted partition table

The kernel automatically evaluates partition tables of storage devices.
The code for evaluating LDM partitions (in fs/partitions/ldm.c) contains
a bug that causes a kernel oops on certain corrupted LDM partitions.
A kernel subsystem seems to crash, because, after the oops, the kernel no
longer recognizes newly connected storage devices.

The patch validates the value of vblk_size.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Timo Warns <warns@pre-sense.de>
Cc: Eugene Teo <eugeneteo@kernel.sg>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Richard Russon <rich@flatcap.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/rtc/rtc-mc13xxx.c: fix unterminated platform_device_id table

The platform_device_id table is supposed to be zero-terminated.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: optimize pfn calculation in online_page()

If CONFIG_FLATMEM is enabled pfn is calculated in online_page() more than
once. It is possible to optimize that and use value established at
beginning of that function.

Signed-off-by: Daniel Kiper <dkiper@net-space.pl>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: fix mem_cgroup_rotate_reclaimable_page()

commit 3f58a8294333 ("move memcg reclaimable page into tail of inactive
list") added inline keyword twice in its prototype.

    CC      arch/x86/kernel/asm-offsets.s
  In file included from include/linux/swap.h:8,
                   from include/linux/suspend.h:4,
                   from arch/x86/kernel/asm-offsets.c:12:
  include/linux/memcontrol.h:220: error: duplicate `inline'

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>