git.karo-electronics.de Git - karo-tx-linux.git/log

PCI: delay configuration of SRIOV capability

The SRIOV capability, namely page size and total_vfs of a device are
configured during enumeration phase of the device. This can potentially
interfere with the PCI operations of the platform, if the IOV capability
of the device is not enabled.

The following patch postpones the configuration of the IOV capability of
the device to a later point, when the IOV capability is explicitly
enabled by the device driver.

The patch is tested on x86 and power platform.

Tested-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: Only call pci_stop_bus_device() one time for child devices at remove

During debugging pcie hotplug with SRIOV with pcie switch, I found
pci_stop_bus_device() is called several times for some child devices.

So change original pci_remove_bus_device() to __pci_remove_bus_device(),
and make it only do remove work, and add a new pci_remove_bus_device
that calls pci_stop_bus_device() one time, and then call
__pci_remove_bus_device().

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

x86/PCI: amd: Kill misleading message about enablement of IO access to PCI ECS]

Commit 24d9b70b8c679264756a6980e668b96b3f964826 (x86: Use PCI method
for enabling AMD extended config space before MSR method) added a
message when IO access to PCI ECS was enabled via access to the NB_CFG
PCI register. This can lead to a bogus message like

[ 0.365177] Extended Config Space enabled on 0 nodes

which is misleading because IO ECS access is subsequently enabled for
AMD CPUs (that support this) by modifying the corresponding NB_CFG
MSR.

Furthermore it's not "Extended Config Space" that is enabled by this
register setting. It's the IO access that is enabled for extended
configruation space.

IMHO the ambiguous message needs to be cancelled.

Cc: Jan Beulich <jbeulich@novell.com>
Cc: Robert Richter <robert.richter@amd.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: latency timer doesn't apply to PCIe

The latency timer is read-only and hardwired to zero for all PCIe
devices, both Type 0 and Type 1, so don't bother trying to update it
and cluttering the dmesg log with meaningless "setting latency timer
to 64" messages.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: x86: use generic pcibios_set_master()

This patch removes x86's architecture-specific 'pcibios_set_master()'
routine and lets the default PCI core based implementation handle PCI
device 'latency timer' setup.

No functional change.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: sh: use generic pcibios_set_master()

This patch removes sh's architecture-specific 'pcibios_set_master()'
routine and lets the default PCI core based implementation handle PCI
device 'latency timer' setup.

No functional change.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: mn10300: use generic pcibios_set_master()

This patch removes mn10300's architecture-specific 'pcibios_set_master()'
routine for ASB2305 and lets the default PCI core based implementation
handle PCI device 'latency timer' setup.

No functional change.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: MIPS: use generic pcibios_set_master()

This patch removes MIPS' architecture-specific 'pcibios_set_master()'
routine and lets the default PCI core based implementation handle PCI
device 'latency timer' setup.

No functional change.

Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: frv: use generic pcibios_set_master()

This patch removes frv's architecture-specific 'pcibios_set_master()'
routine and lets the default PCI core based implementation handle PCI
device 'latency timer' setup.

No functional change.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: Pull PCI 'latency timer' setup up into the core

The 'latency timer' of PCI devices, both Type 0 and Type 1,
is setup in architecture-specific code [see: 'pcibios_set_master()'].
There are two approaches being taken by all the architectures - check
if the 'latency timer' is currently set between 16 and 255 and if not
bring it within bounds, or, do nothing (and then there is the
gratuitously different PA-RISC implementation).

There is nothing architecture-specific about PCI's 'latency timer' so
this patch pulls its setup functionality up into the PCI core by
creating a generic 'pcibios_set_master()' function using the '__weak'
attribute which can be used by all architectures as a default which,
if necessary, can then be over-ridden by architecture-specific code.

No functional change.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: Xtensa: convert pcibios_set_master() to a non-inlined function

This patch converts Xtensa's architecture-specific
'pcibios_set_master()' routine to a non-inlined function. This will
allow follow on patches to create a generic 'pcibios_set_master()'
function using the '__weak' attribute which can be used by all
architectures as a default which, if necessary, can then be over-
ridden by architecture-specific code.

Converting 'pci_bios_set_master()' to a non-inlined function will
allow Xtensa's 'pcibios_set_master()' implementation to remain
architecture-specific after the generic version is introduced and
thus, not change current behavior.

No functional change.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: UniCore: convert pcibios_set_master() to a non-inlined function

This patch converts UniCore's architecture-specific
'pcibios_set_master()' routine to a non-inlined function. This will
allow follow on patches to create a generic 'pcibios_set_master()'
function using the '__weak' attribute which can be used by all
architectures as a default which, if necessary, can then be over-
ridden by architecture-specific code.

Converting 'pci_bios_set_master()' to a non-inlined function will
allow UniCore's 'pcibios_set_master()' implementation to remain
architecture-specific after the generic version is introduced and
thus, not change current behavior.

No functional change.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: TILE: convert pcibios_set_master() to a non-inlined function

This patch converts TILE's architecture-specific 'pcibios_set_master()'
routine to a non-inlined function. This will allow follow on patches
to create a generic 'pcibios_set_master()' function using the '__weak'
attribute which can be used by all architectures as a default which,
if necessary, can then be over-ridden by architecture-specific code.

Converting 'pci_bios_set_master()' to a non-inlined function will
allow TILE's 'pcibios_set_master()' implementation to remain
architecture-specific after the generic version is introduced and
thus, not change current behavior.

No functional change.

Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: SPARC: convert pcibios_set_master() to a non-inlined function

This patch converts SPARC's architecture-specific
'pcibios_set_master()' routine to a non-inlined function. This will
allow follow on patches to create a generic 'pcibios_set_master()'
function using the '__weak' attribute which can be used by all
architectures as a default which, if necessary, can then be over-
ridden by architecture-specific code.

Converting 'pci_bios_set_master()' to a non-inlined function will
allow SPARC's 'pcibios_set_master()' implementation to remain
architecture-specific after the generic version is introduced and
thus, not change current behavior.

No functional change.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: PowerPC: convert pcibios_set_master() to a non-inlined function

This patch converts PowerPC's architecture-specific
'pcibios_set_master()' routine to a non-inlined function. This will
allow follow on patches to create a generic 'pcibios_set_master()'
function using the '__weak' attribute which can be used by all
architectures as a default which, if necessary, can then be over-
ridden by architecture-specific code.

Converting 'pci_bios_set_master()' to a non-inlined function will
allow PowerPC's 'pcibios_set_master()' implementation to remain
architecture-specific after the generic version is introduced and
thus, not change current behavior.

No functional change.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: MicroBlaze: convert pcibios_set_master() to a non-inlined function

This patch converts MicroBlaze's architecture-specific
'pcibios_set_master()' routine to a non-inlined function. This will
allow follow on patches to create a generic 'pcibios_set_master()'
function using the '__weak' attribute which can be used by all
architectures as a default which, if necessary, can then be over-
ridden by architecture-specific code.

Converting 'pci_bios_set_master()' to a non-inlined function will
allow MicroBlaze's 'pcibios_set_master()' implementation to remain
architecture-specific after the generic version is introduced and thus,
not change current behavior.

No functional change.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: IA64: convert pcibios_set_master() to a non-inlined function

This patch converts IA64's architecture-specific 'pcibios_set_master()'
routine to a non-inlined function. This will allow follow on
patches to create a generic 'pcibios_set_master()' function using the
'__weak' attribute which can be used by all architectures as a default
which, if necessary, can then be over-ridden by architecture-
specific code.

Converting 'pci_bios_set_master()' to a non-inlined function will allow
IA64's 'pcibios_set_master()' implementation to remain architecture-
specific after the generic version is introduced and thus, not change
current behavior.

No functional change.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: ARM: convert pcibios_set_master() to a non-inlined function

This patch converts ARM's architecture-specific inlined
'pcibios_set_master()' routine to a non-inlined function. This will
allow follow on patches to create a generic 'pcibios_set_master()'
function using the '__weak' attribute which can be used by all
architectures as a default which, if necessary, can then be over-
ridden by architecture-specific code.

Converting 'pci_bios_set_master()' to a non-inlined function will allow
ARM's 'pcibios_set_master()' implementation to remain architecture-
specific after the generic version is introduced and thus, not change
current behavior.

Note that ARM also has a non-inlined 'pcibios_set_master()' that is
used if CONFIG_PCI_HOST_ITE8152 is defined. This patch does not
change any behavior here either.

No functional change.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: add declaration for pcibios_set_master() to pci core

Currently, pcibios_set_master() is implemented in architecture-
specific code. There is nothing architecture-specific about PCI's
'latency timer'.

This patch adds a declaration for pcibios_set_master() to PCI's core
in preperation for pulling the function itself up into the core.
Without the addition of this declaration, subsequent patches that
remove inline definitions of pcibios_set_master() would be removing
the only declaration of such.

No functional change.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

uio: Convert uio_generic_pci to new intx masking API

The new PCI API provides both generic probing for 2.3 masking support
and check&mask in the interrupt handler.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: Introduce INTx check & mask API

These new PCI services allow to probe for 2.3-compliant INTx masking
support and then use the feature from PCI interrupt handlers. The
services are properly synchronized with concurrent config space access
via sysfs or on device reset.

This enables generic PCI device drivers like uio_pci_generic or KVM's
device assignment to implement the necessary kernel-side IRQ handling
without any knowledge about device-specific interrupt status and control
registers.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: Rework config space blocking services

pci_block_user_cfg_access was designed for the use case that a single
context, the IPR driver, temporarily delays user space accesses to the
config space via sysfs. This assumption became invalid by the time
pci_dev_reset was added as locking instance. Today, if you run two loops
in parallel that reset the same device via sysfs, you end up with a
kernel BUG as pci_block_user_cfg_access detect the broken assumption.

This reworks the pci_block_user_cfg_access to a sleeping service
pci_cfg_access_lock and an atomic-compatible variant called
pci_cfg_access_trylock. The former not only blocks user space access as
before but also waits if access was already locked. The latter service
just returns false in this case, allowing the caller to resolve the
conflict instead of raising a BUG.

Adaptions of the ipr driver were originally written by Brian King.

Acked-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

x86/PCI: Ignore CPU non-addressable _CRS reserved memory resources

This assures that a _CRS reserved host bridge window or window region is
not used if it is not addressable by the CPU. The new code either trims
the window to exclude the non-addressable portion or totally ignores the
window if the entire window is non-addressable.

The current code has been shown to be problematic with 32-bit non-PAE
kernels on systems where _CRS reserves resources above 4GB.

Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Renninger <trenn@novell.com>
Cc: linux-kernel@vger.kernel.org
Cc: stable@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: Fix PCI_EXP_TYPE_RC_EC value

Spec shows this as 1010b = 0xa

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: fix a brace coding style issue in probe.c

Fixed a brace coding style issue.

Signed-off-by: Zac Storer <zac.3.14159@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: pci_has_legacy_pm_support add driver and device to WARN

Include the driver name and device in warning when a pci driver
supports both legacy pm and new framework as just the stack trace
gives no way to identify the driver.

Signed-off-by: David Fries <David@Fries.net>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: msi: Disable msi interrupts when we initialize a pci device

I traced a nasty kexec on panic boot failure to the fact that we had
screaming msi interrupts and we were not disabling the msi messages at
kernel startup.  The booting kernel had not enabled those interupts so
was not prepared to handle them.

I can see no reason why we would ever want to leave the msi interrupts
enabled at boot if something else has enabled those interrupts.  The pci
spec specifies that msi interrupts should be off by default.  Drivers
are expected to enable the msi interrupts if they want to use them.  Our
interrupt handling code reprograms the interrupt handlers at boot and
will not be be able to do anything useful with an unexpected interrupt.

This patch applies cleanly all of the way back to 2.6.32 where I noticed
the problem.

Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI/ACPI/PM: Avoid resuming devices that don't signal PME

Modify pci_acpi_wake_dev() to avoid resuming PME-capable devices
whose PME Status bits are not set, which may happen currently if
several devices are associated with the same wakeup GPE and all
of them are notified whenever at least one of them signals PME.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI/ACPI: Make acpiphp ignore root bridges using SHPC native hotplug

If the kernel has requested control of the SHPC native hotplug
feature for a given root bridge, the acpiphp driver should not try
to handle that root bridge and it should leave it to shpchp.
Failing to do so causes problems to happen if shpchp is loaded
and unloaded before loading acpiphp (ACPI-based hotplug won't work
in that case anyway).

To address this issue make find_root_bridges() ignore PCI root
bridges with SHPC native hotplug enabled and make add_bridge()
return error code if SHPC native hotplug is enabled for the given
root bridge. This causes acpiphp to refuse to load if SHPC native
hotplug is enabled for all root bridges and to refuse binding to
the root bridges with SHPC native hotplug enabled.

Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: pciehp: Handle push button event asynchronously

Use non-ordered workqueue for attention button events.

Attention button events on each slot can be handled asynchronously. So
we should use non-ordered workqueue. This patch also removes ordered
workqueue in pciehp as a result.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: pciehp: Fix wrong workqueue cleanup

Fix improper workqueue cleanup.

In the current pciehp, pcied_cleanup() calls destroy_workqueue()
before calling pcie_port_service_unregister(). This causes kernel oops
because flush_workqueue() is called in the pcie_port_service_unregister()
code path after the workqueue was destroyed. So pcied_cleanup() must call
pcie_port_service_unregister() first before calling destroy_workqueue().

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: Rework ASPM disable code

Right now we forcibly clear ASPM state on all devices if the BIOS indicates
that the feature isn't supported. Based on the Microsoft presentation
"PCI Express In Depth for Windows Vista and Beyond", I'm starting to think
that this may be an error. The implication is that unless the platform
grants full control via _OSC, Windows will not touch any PCIe features -
including ASPM. In that case clearing ASPM state would be an error unless
the platform has granted us that control.

This patch reworks the ASPM disabling code such that the actual clearing
of state is triggered by a successful handoff of PCIe control to the OS.
The general ASPM code undergoes some changes in order to ensure that the
ability to clear the bits isn't overridden by ASPM having already been
disabled. Further, this theoretically now allows for situations where
only a subset of PCIe roots hand over control, leaving the others in the
BIOS state.

It's difficult to know for sure that this is the right thing to do -
there's zero public documentation on the interaction between all of these
components. But enough vendors enable ASPM on platforms and then set this
bit that it seems likely that they're expecting the OS to leave them alone.

Measured to save around 5W on an idle Thinkpad X220.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: Fix PRI and PASID consistency

These are extended capabilities, rename and move to proper
group for consistency.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI/sysfs: add per pci device msi[x] irq listing (v5)

This patch adds a per-pci-device subdirectory in sysfs called:
/sys/bus/pci/devices/<device>/msi_irqs

This sub-directory exports the set of msi vectors allocated by a given
pci device, by creating a numbered sub-directory for each vector beneath
msi_irqs. For each vector various attributes can be exported.
Currently the only attribute is called mode, which tracks the
operational mode of that vector (msi vs. msix)

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: fix ats compile failure

I get this compile failure on parisc:

drivers/pci/ats.c: In function 'ats_alloc_one':
drivers/pci/ats.c:29: error: implicit declaration of function 'kzalloc'
drivers/pci/ats.c:29: warning: assignment makes pointer from integer without a cast
drivers/pci/ats.c: In function 'ats_free_one':
drivers/pci/ats.c:45: error: implicit declaration of function 'kfree'

Because ats.c is missing linux/slab.h as an include. This patch fixes it

Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

PCI: defer enablement of SRIOV BARS

All the PCI BARs of a device are enabled when the device is enabled
using pci_enable_device(). This unnecessarily enables SRIOV BARs of the
device.

On some platforms, which do not support SRIOV as yet, the
pci_enable_device() fails to enable the device if its SRIOV BARs are not
allocated resources correctly.

The following patch fixes the above problem. The SRIOV BARs are now
enabled when IOV capability of the device is enabled in sriov_enable().

NOTE: Note, there is subtle change in the pci_enable_device() API. Any
driver that depends on SRIOV BARS to be enabled in pci_enable_device()
can fail.

The patch has been touch tested on power and x86 platform.

Tested-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

x86: Fix boot failures on older AMD CPU's

People with old AMD chips are getting hung boots, because commit
bcb80e53877c ("x86, microcode, AMD: Add microcode revision to
/proc/cpuinfo") moved the microcode detection too early into
"early_init_amd()".

At that point we are *so* early in the booth that the exception tables
haven't even been set up yet, so the whole

rdmsr_safe(MSR_AMD64_PATCH_LEVEL, &c->microcode, &dummy);

doesn't actually work: if the rdmsr does a GP fault (due to non-existant
MSR register on older CPU's), we can't fix it up yet, and the boot fails.

Fix it by simply moving the code to a slightly later point in the boot
(init_amd() instead of early_init_amd()), since the kernel itself
doesn't even really care about the microcode patchlevel at this point
(or really ever: it's made available to user space in /proc/cpuinfo, and
updated if you do a microcode load).

Reported-tested-and-bisected-by: Larry Finger <Larry.Finger@lwfinger.net>
Tested-by: Bob Tracy <rct@gherkin.frus.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

xen/pm_idle: Make pm_idle be default_idle under Xen.

The idea behind commit d91ee5863b71 ("cpuidle: replace xen access to x86
pm_idle and default_idle") was to have one call - disable_cpuidle()
which would make pm_idle not be molested by other code.  It disallows
cpuidle_idle_call to be set to pm_idle (which is excellent).

But in the select_idle_routine() and idle_setup(), the pm_idle can still
be set to either: amd_e400_idle, mwait_idle or default_idle.  This
depends on some CPU flags (MWAIT) and in AMD case on the type of CPU.

In case of mwait_idle we can hit some instances where the hypervisor
(Amazon EC2 specifically) sets the MWAIT and we get:

  Brought up 2 CPUs
  invalid opcode: 0000 [#1] SMP

  Pid: 0, comm: swapper Not tainted 3.1.0-0.rc6.git0.3.fc16.x86_64 #1
  RIP: e030:[<ffffffff81015d1d>]  [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
  ...
  Call Trace:
   [<ffffffff8100e2ed>] cpu_idle+0xae/0xe8
   [<ffffffff8149ee78>] cpu_bringup_and_idle+0xe/0x10
  RIP  [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
   RSP <ffff8801d28ddf10>

In the case of amd_e400_idle we don't get so spectacular crashes, but we
do end up making an MSR which is trapped in the hypervisor, and then
follow it up with a yield hypercall.  Meaning we end up going to
hypervisor twice instead of just once.

The previous behavior before v3.0 was that pm_idle was set to
default_idle regardless of select_idle_routine/idle_setup.

We want to do that, but only for one specific case: Xen.  This patch
does that.

Fixes RH BZ #739499 and Ubuntu #881076
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (21 commits)
  usb: ftdi_sio: add PID for Propox ISPcable III
  Revert "xHCI: reset-on-resume quirk for NEC uPD720200"
  xHCI: fix bug in xhci_clear_command_ring()
  usb: gadget: fsl_udc: fix dequeuing a request in progress
  usb: fsl_mxc_udc.c: Remove compile-time dependency of MX35 SoC type
  usb: fsl_mxc_udc.c: Fix build issue by including missing header file
  USB: fsl_udc_core: use usb_endpoint_xfer_isoc to judge ISO XFER
  usb: udc: Fix gadget driver's speed check in various UDC drivers
  usb: gadget: fix g_serial regression
  usb: renesas_usbhs: fixup driver speed
  usb: renesas_usbhs: fixup gadget.dev.driver when udc_stop.
  usb: renesas_usbhs: fixup signal the driver that cable was disconnected
  usb: renesas_usbhs: fixup device_register timing
  usb: musb: PM: fix context save/restore in suspend/resume path
  USB: linux-cdc-acm.inf: add support for the acm_ms gadget
  EHCI : Fix a regression in the ISO scheduler
  xHCI: reset-on-resume quirk for NEC uPD720200
  USB: whci-hcd: fix endian conversion in qset_clear()
  USB: usb-storage: unusual_devs entry for Kingston DT 101 G2
  usb: option: add SIMCom SIM5218
  ...

Merge branch 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

* 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  Staging: comedi: fix integer overflow in do_insnlist_ioctl()
  Revert "Staging: comedi: integer overflow in do_insnlist_ioctl()"
  Staging: comedi: integer overflow in do_insnlist_ioctl()
  Staging: comedi: fix signal handling in read and write
  Staging: comedi: fix mmap_count
  staging: comedi: fix oops for USB DAQ devices.
  staging: comedi: usbduxsigma: Fixed wrong range for the analogue channel.
  staging:rts_pstor:Complete scanning_done variable
  staging: usbip: bugfix for deadlock

Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs

* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: fix attr2 vs large data fork assert
  xfs: force buffer writeback before blocking on the ilock in inode reclaim
  xfs: validate acl count

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: Correct General touch PID

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  vmwgfx: integer overflow in vmw_kms_update_layout_ioctl()
  drm/radeon/kms: fix 2D tiling CS support on EG/CM
  drm/radeon/kms: fix scanout of 2D tiled buffers on EG/CM
  drm: Fix lack of CRTC disable for drm_crtc_helper_set_config(.fb=NULL)
  drm/radeon/kms: add some new pci ids
  drm/radeon/kms: Skip ACPI call to ATIF when possible
  drm/radeon/kms: Hide debugging message
  drm/radeon/kms: add some loop timeouts in pageflip code
  drm/nv50/disp: silence compiler warning
  drm/nouveau: fix oopses caused by clear being called on unpopulated ttms
  drm/nouveau: Keep RAMIN heap within the channel.
  drm/nvd0/disp: fix sor dpms typo, preventing dpms on in some situations
  drm/nvc0/gr: fix TP init for transform feedback offset queries
  drm/nouveau: add dumb ioctl support

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Fix S3/S4 problem on machines with VREF-pin mute-LED
  ALSA: hda_intel - revert a quirk that affect VIA chipsets
  ALSA: hda - Avoid touching mute-VREF pin for IDT codecs
  firmware: Sigma: Fix endianess issues
  firmware: Sigma: Skip header during CRC generation
  firmware: Sigma: Prevent out of bounds memory access
  ALSA: usb-audio - Support for Roland GAIA SH-01 Synthesizer
  ASoC: Supply dcs_codes for newer WM1811 revisions
  ASoC: Error out if we can't generate a LRCLK at all for WM8994
  ASoC: Correct name of Speyside Main Speaker widget
  ASoC: skip resume of soc-audio devices without codecs
  ASoC: cs42l51: Fix off-by-one for reg_cache_size
  ASoC: drop support for PlayPaq with WM8510
  ASoC: mpc8610: tell the CS4270 codec that it's the master
  ASoC: cs4720: use snd_soc_cache_sync()
  ASoC: SAMSUNG: Fix build error
  ASoC: max9877: Update register if either val or val2 is changed
  ASoC: Fix wrong define for AD1836_ADC_WORD_OFFSET

vmwgfx: integer overflow in vmw_kms_update_layout_ioctl()

There are two issues in vmw_kms_update_layout_ioctl(). First, the
for loop forgets to index rects and only checks the first element.
Second, there is a potential integer overflow if userspace passes
in a large arg->num_outputs. The call to kzalloc() would allocate
a small buffer, leading to out-of-bounds read.

Reported-by: Haogang Chen <haogangchen@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

drm/radeon/kms: fix 2D tiling CS support on EG/CM

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=43191

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

drm/radeon/kms: fix scanout of 2D tiled buffers on EG/CM

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=43191

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

drm: Fix lack of CRTC disable for drm_crtc_helper_set_config(.fb=NULL)

Disabling the CRTC by setting its framebuffer to NULL, as used by
drm_framebuffer_cleanup(), was failing to pass the current framebuffer
to the crtc_func->disable callback. This is because of the dance within
drm_crtc_helper_set_config to pass the new_fb (NULL in this case) to the
drm_crtc_helper_set_mode with the currently attached fb as a parameter.
drm_crtc_helper_set_mode treats this as a no-op and the encoder is still
enabled. And so the current fb is forgotten before the call to
drm_helper_disable_unused_functions.

This patch treats disabling the CRTC as a simple special case rather
than adding further complexity into the configuration logic.

This fixes a pin-leak of the fb bo on Xserver close.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
  netfilter: Remove ADVANCED dependency from NF_CONNTRACK_NETBIOS_NS
  ipv4: flush route cache after change accept_local
  sch_red: fix red_change
  Revert "udp: remove redundant variable"
  bridge: master device stuck in no-carrier state forever when in user-stp mode
  ipv4: Perform peer validation on cached route lookup.
  net/core: fix rollback handler in register_netdevice_notifier
  sch_red: fix red_calc_qavg_from_idle_time
  bonding: only use primary address for ARP
  ipv4: fix lockdep splat in rt_cache_seq_show
  sch_teql: fix lockdep splat
  net: fec: Select the FEC driver by default for i.MX SoCs
  isdn: avoid copying too long drvid
  isdn: make sure strings are null terminated
  netlabel: Fix build problems when IPv6 is not enabled
  sctp: better integer overflow check in sctp_auth_create_key()
  sctp: integer overflow in sctp_auth_create_key()
  ipv6: Set mcast_hops to IPV6_DEFAULT_MCASTHOPS when -1 was given.
  net: Fix corruption in /proc/*/net/dev_mcast
  mac80211: fix race between the AGG SM and the Tx data path
  ...

netfilter: Remove ADVANCED dependency from NF_CONNTRACK_NETBIOS_NS

firewalld in Fedora 16 needs this.

Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: flush route cache after change accept_local

After reset ipv4_devconf->data[IPV4_DEVCONF_ACCEPT_LOCAL] to 0,
we should flush route cache, or it will continue receive packets with local
source address, which should be dropped.

Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sch_red: fix red_change

Le mercredi 30 novembre 2011 à 14:36 -0800, Stephen Hemminger a écrit :

> (Almost) nobody uses RED because they can't figure it out.
> According to Wikipedia, VJ says that:
> "there are not one, but two bugs in classic RED."

RED is useful for high throughput routers, I doubt many linux machines
act as such devices.

I was considering adding Adaptative RED (Sally Floyd, Ramakrishna
Gummadi, Scott Shender), August 2001

In this version, maxp is dynamic (from 1% to 50%), and user only have to
setup min_th (target average queue size)
(max_th and wq (burst in linux RED) are automatically setup)

By the way it seems we have a small bug in red_change()

if (skb_queue_empty(&sch->q))
red_end_of_idle_period(&q->parms);

First, if queue is empty, we should call
red_start_of_idle_period(&q->parms);

Second, since we dont use anymore sch->q, but q->qdisc, the test is
meaningless.

Oh well...

[PATCH] sch_red: fix red_change()

Now RED is classful, we must check q->qdisc->q.qlen, and if queue is empty,
we start an idle period, not end it.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Linux 3.2-rc4

Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (31 commits)
  ocfs2: avoid unaligned access to dqc_bitmap
  ocfs2: Use filemap_write_and_wait() instead of write_inode_now()
  ocfs2: honor O_(D)SYNC flag in fallocate
  ocfs2: Add a missing journal credit in ocfs2_link_credits() -v2
  ocfs2: send correct UUID to cleancache initialization
  ocfs2: Commit transactions in error cases -v2
  ocfs2: make direntry invalid when deleting it
  fs/ocfs2/dlm/dlmlock.c: free kmem_cache_zalloc'd data using kmem_cache_free
  ocfs2: Avoid livelock in ocfs2_readpage()
  ocfs2: serialize unaligned aio
  ocfs2: Implement llseek()
  ocfs2: Fix ocfs2_page_mkwrite()
  ocfs2: Add comment about orphan scanning
  ocfs2: Clean up messages in the fs
  ocfs2/cluster: Cluster up now includes network connections too
  ocfs2/cluster: Add new function o2net_fill_node_map()
  ocfs2/cluster: Fix output in file elapsed_time_in_ms
  ocfs2/dlm: dlmlock_remote() needs to account for remastery
  ocfs2/dlm: Take inflight reference count for remotely mastered resources too
  ocfs2/dlm: Cleanup dlm_wait_for_node_death() and dlm_wait_for_node_recovery()
  ...

ocfs2: avoid unaligned access to dqc_bitmap

The dqc_bitmap field of struct ocfs2_local_disk_chunk is 32-bit aligned,
but not 64-bit aligned.  The dqc_bitmap is accessed by ocfs2_set_bit(),
ocfs2_clear_bit(), ocfs2_test_bit(), or ocfs2_find_next_zero_bit().  These
are wrapper macros for ext2_*_bit() which need to take an unsigned long
aligned address (though some architectures are able to handle unaligned
address correctly)

So some 64bit architectures may not be able to access the dqc_bitmap
correctly.

This avoids such unaligned access by using another wrapper functions for
ext2_*_bit().  The code is taken from fs/ext4/mballoc.c which also need to
handle unaligned bitmap access.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Joel Becker <jlbec@evilplan.org>

Merge branch 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm

* 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm:
  ARM: 7182/1: ARM cpu topology: fix warning
  ARM: 7181/1: Restrict kprobes probing SWP instructions to ARMv5 and below
  ARM: 7180/1: Change kprobes testcase with unpredictable STRD instruction
  ARM: 7177/1: GIC: avoid skipping non-existent PPIs in irq_start calculation
  ARM: 7176/1: cpu_pm: register GIC PM notifier only once
  ARM: 7175/1: add subname parameter to mfp_set_groupg callers
  ARM: 7174/1: Fix build error in kprobes test code on Thumb2 kernels
  ARM: 7172/1: dma: Drop GFP_COMP for DMA memory allocations
  ARM: 7171/1: unwind: add unwind directives to bitops assembly macros
  ARM: 7170/2: fix compilation breakage in entry-armv.S
  ARM: 7168/1: use cache type functions for arch_get_unmapped_area
  ARM: perf: check that we have a platform device when reserving PMU
  ARM: 7166/1: Use PMD_SHIFT instead of PGDIR_SHIFT in dma-consistent.c
  ARM: 7165/2: PL330: Fix typo in _prepare_ccr()
  ARM: 7163/2: PL330: Only register usable channels
  ARM: 7162/1: errata: tidy up Kconfig options for PL310 errata workarounds
  ARM: 7161/1: errata: no automatic store buffer drain
  ARM: perf: initialise used_mask for fake PMU during validation
  ARM: PMU: remove pmu_init declaration
  ARM: PMU: re-export release_pmu symbol to modules

Revert "udp: remove redundant variable"

This reverts commit 81d54ec8479a2c695760da81f05b5a9fb2dbe40a.

If we take the "try_again" goto, due to a checksum error,
the 'len' has already been truncated. So we won't compute
the same values as the original code did.

Reported-by: paul bilke <fsmail@conspiracy.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sarah/xhci into usb-linus

* 'for-usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sarah/xhci:
Revert "xHCI: reset-on-resume quirk for NEC uPD720200"
xHCI: fix bug in xhci_clear_command_ring()

bridge: master device stuck in no-carrier state forever when in user-stp mode

When in user-stp mode, bridge master do not follow state of its slaves, so
after the following sequence of events it can stuck forever in no-carrier
state:
1) turn stp off
2) put all slaves down - master device will follow their state and also go in
no-carrier state
3) turn stp on with bridge-stp script returning 0 (go to the user-stp mode)
Now bridge master won't follow slaves' state and will never reach running
state.

This patch solves the problem by making user-stp and kernel-stp behavior
similar regarding master following slaves' states.

Signed-off-by: Vitalii Demianets <vitas@nppfactor.kiev.ua>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

usb: ftdi_sio: add PID for Propox ISPcable III

Signed-off-by: Marcin Kościelnicki <koriakin@0x04.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Revert "xHCI: reset-on-resume quirk for NEC uPD720200"

This reverts commit df711fc9962b9491af2b92bd0d21ecbfefe4e5fa.

The commit added a reset-on-resume quirk because the NEC chipset stopped
responding to commands about 30 minutes after a system resume from
suspend. We thought it was a chipset issue, but it turns out that the
xHCI driver was zeroing out the link TRB after a successful context
restore during resume. The host controller would fall off the command
ring sometime later, causing it to not respond to new commands.

The link TRB issue has been fixed with commit
158886cd2cf4599e04f9b7e10cb767f5f39b14f1 "xHCI: fix bug in
xhci_clear_command_ring()", so revert the reset-on-resume quirk, as it's
not necessary.

Commit df711fc9962b9491af2b92bd0d21ecbfefe4e5fa was marked for stable
trees back to 2.6.37, but according to my mail, it has not made it into
Linus' tree or the stable trees yet.

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Tested-by: Julian Sikorski <belegdol@gmail.com>
Cc: Andiry Xu <andiry.xu@amd.com>

ipv4: Perform peer validation on cached route lookup.

Otherwise we won't notice the peer GENID change.

Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

xHCI: fix bug in xhci_clear_command_ring()

When system enters suspend, xHCI driver clears command ring by writing zero
to all the TRBs. However, this also writes zero to the Link TRB, and the ring
is mangled. This may cause driver accesses wrong memory address and the
result is unpredicted.

When clear the command ring, keep the last Link TRB intact, only clear its
cycle bit. This should fix the "command ring full" issue reported by Oliver
Neukum.

This should be backported to stable kernels as old as 2.6.37, since the
commit 89821320 "xhci: Fix command ring replay after resume" is merged.

Signed-off-by: Andiry Xu <andiry.xu@amd.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Reported-by: Oliver Neukum <oneukum@suse.de>

drm/radeon/kms: add some new pci ids

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix meta data raid-repair merge problem
  Btrfs: skip allocation attempt from empty cluster
  Btrfs: skip block groups without enough space for a cluster
  Btrfs: start search for new cluster at the beginning
  Btrfs: reset cluster's max_size when creating bitmap
  Btrfs: initialize new bitmaps' list
  Btrfs: fix oops when calling statfs on readonly device
  Btrfs: Don't error on resizing FS to same size
  Btrfs: fix deadlock on metadata reservation when evicting a inode
  Fix URL of btrfs-progs git repository in docs
  btrfs scrub: handle -ENOMEM from init_ipath()

Merge branch 'fix/asoc' into for-linus

Btrfs: fix meta data raid-repair merge problem

Commit 4a54c8c16 introduced raid-repair, killing the individual
readpage_io_failed_hook entries from inode.c and disk-io.c. Commit
4bb31e92 introduced new readahead code, adding a readpage_io_failed_hook to
disk-io.c.

The raid-repair commit had logic to disable raid-repair, if
readpage_io_failed_hook is set. Thus, the readahead commit effectively
disabled raid-repair for meta data.

This commit changes the logic to always attempt raid-repair when needed and
call the readpage_io_failed_hook in case raid-repair fails. This is much
more straight forward and should have been like that from the beginning.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Reported-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

ALSA: hda - Fix S3/S4 problem on machines with VREF-pin mute-LED

The verb command in stac92xx_post_suspend caused the audio to stop
working after resuming from S3 mode on HP laptops with the VREF-pin
mute-LED control. Removing relevant post_suspend registering.

Although removing D3 on AFG is no optimal solution, the impact should
be small in comparison with the broken S3/S4.

Signed-off-by: Charles Chin <Charles.Chin@idt.com>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

drm/radeon/kms: Skip ACPI call to ATIF when possible

I am under the impression that it only makes sense to call the ATIF
method if the graphics device has an ACPI handle attached. So we could
skip the call altogether if there is no such handle.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

drm/radeon/kms: Hide debugging message

Use the proper macro to issue the debugging message in
radeon_atif_call(). Otherwise we spam the log of many systems with a
message which looks like an error message of unknown origin, and could
thus confuse the user. Commit dc77de12dde95c8da39e4c417eb70c7d445cf84b
was a first step in this direction, but was not sufficient IMHO.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

drm/radeon/kms: add some loop timeouts in pageflip code

Avoid infinite loops waiting for surface updates if a GPU
reset happens while waiting for a page flip.

See:
https://bugs.freedesktop.org/show_bug.cgi?id=43191

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@kernel.org
Reviewed-by: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
Tested-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>

Merge branch 'drm-nouveau-fixes' of git://git.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes

* 'drm-nouveau-fixes' of git://git.freedesktop.org/git/nouveau/linux-2.6:
  drm/nv50/disp: silence compiler warning
  drm/nouveau: fix oopses caused by clear being called on unpopulated ttms
  drm/nouveau: Keep RAMIN heap within the channel.
  drm/nvd0/disp: fix sor dpms typo, preventing dpms on in some situations
  drm/nvc0/gr: fix TP init for transform feedback offset queries
  drm/nouveau: add dumb ioctl support

net/core: fix rollback handler in register_netdevice_notifier

Within nested statements, the break statement terminates only the
do, for, switch, or while statement that immediately encloses it,
So replace the break with goto.

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sch_red: fix red_calc_qavg_from_idle_time

Since commit a4a710c4a7490587 (pkt_sched: Change PSCHED_SHIFT from 10 to
6) it seems RED/GRED are broken.

red_calc_qavg_from_idle_time() computes a delay in us units, but this
delay is now 16 times bigger than real delay, so the final qavg result
smaller than expected.

Use standard kernel time services since there is no need to obfuscate
them.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: only use primary address for ARP

Only use the primary address of the bond device
for master_ip. This will prevent changing the ARP source
address in Active-Backup mode whenever a secondry address
is added to the bond device.

Signed-off-by: Henrik Saavedra Persson <henrik.e.persson@ericsson.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB: Fix RCU lockdep splats
  IB/ipoib: Prevent hung task or softlockup processing multicast response
  IB/qib: Fix over-scheduling of QSFP work
  RDMA/cxgb4: Fix retry with MPAv1 logic for MPAv2
  RDMA/cxgb4: Fix iw_cxgb4 count_rcqes() logic
  IB/qib: Don't use schedule_work()

Merge branch 'dt-for-linus' of git://sources.calxeda.com/kernel/linux

* 'dt-for-linus' of git://sources.calxeda.com/kernel/linux:
of: Add Silicon Image vendor prefix
of/irq: of_irq_init: add check for parent equal to child node

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: twl: fix twl4030 support for smps regulators
  regulator: fix use after free bug
  regulator: aat2870: Fix the logic of checking if no id is matched in aat2870_get_regulator

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (45 commits)
  ARM: ux500: update defconfig
  ARM: u300: update defconfig
  ARM: at91: enable additional boards in existing soc defconfig files
  ARM: at91: refresh soc defconfig files for 3.2
  ARM: at91: rename defconfig files appropriately
  ARM: OMAP2+: Fix Compilation error when omap_l3_noc built as module
  ARM: OMAP2+: Remove empty io.h
  ARM: OMAP2: select ARM_AMBA if OMAP3_EMU is defined
  ARM: OMAP: smartreflex: fix IRQ handling bug
  ARM: OMAP: PM: only register TWL with voltage layer when device is present
  ARM: OMAP: hwmod: Fix the addr space, irq, dma count APIs
  arm: mx28: fix bit operation in clock setting
  ARM: imx: export imx_ioremap
  ARM: imx/mm-imx3: conditionally compile i.MX31 and i.MX35 code
  ARM: mx5: Fix checkpatch warnings in cpu-imx5.c
  MAINTAINERS: Add missing directory
  ARM: imx: drop 'ARCH_MX31' and 'ARCH_MX35'
  ARM: imx6q: move clock register map to machine_desc.map_io
  ARM: pxa168/gplugd: add the correct SSP device
  ARM: Update mach-types to fix mxs build breakage
  ...

ARM: 7182/1: ARM cpu topology: fix warning

kernel/sched.c:7354:2: warning: initialization from incompatible pointer type

Align cpu_coregroup_mask prototype interface with sched_domain_mask_f typedef
use int cpu instead of unsigned int cpu

Cc: <stable@vger.kernel.org>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ARM: 7181/1: Restrict kprobes probing SWP instructions to ARMv5 and below

The SWP instruction is deprecated on ARMv6 and with ARMv7 it will be
UNDEFINED when CONFIG_SWP_EMULATE is selected. In this case, probing a
SWP instruction will cause an oops when the kprobes emulation code
executes an undefined instruction.

As the SWP instruction should be rare or non-existent in kernels for
ARMv6 and later, we can simply avoid these problems by not allowing
probing of these.

Reported-by: Leif Lindholm <leif.lindholm@arm.com>
Tested-by: Leif Lindholm <leif.lindholm@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Jon Medhurst <tixy@yxit.co.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ARM: 7180/1: Change kprobes testcase with unpredictable STRD instruction

There is a kprobes testcase for the instruction "strd r2, [r3], r4".
This has unpredictable behaviour as it uses r3 for register writeback
addressing and also stores it to memory.

On a cortex A9, this testcase would fail because the instruction writes
the updated value of r3 to memory, whereas the kprobes emulation code
writes the original value.

Fix this by changing testcase to used r5 instead of r3.

Reported-by: Leif Lindholm <leif.lindholm@arm.com>
Tested-by: Leif Lindholm <leif.lindholm@arm.com>
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Jon Medhurst <tixy@yxit.co.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ipv4: fix lockdep splat in rt_cache_seq_show

After commit f2c31e32b378 (fix NULL dereferences in check_peer_redir()),
dst_get_neighbour() should be guarded by rcu_read_lock() /
rcu_read_unlock() section.

Reported-by: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sch_teql: fix lockdep splat

We need rcu_read_lock() protection before using dst_get_neighbour(), and
we must cache its value (pass it to __teql_resolve())

teql_master_xmit() is called under rcu_read_lock_bh() protection, its
not enough.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec: Select the FEC driver by default for i.MX SoCs

Since commit 230dec6 (net/fec: add imx6q enet support) the FEC driver is no
longer built by default for i.MX SoCs.

Let the FEC driver be built by default again.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Suggested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem

Btrfs: skip allocation attempt from empty cluster

If we don't have a cluster, don't bother trying to allocate from it,
jumping right away to the attempt to allocate a new cluster.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: skip block groups without enough space for a cluster

We test whether a block group has enough free space to hold the
requested block, but when we're doing clustered allocation, we can
save some cycles by testing whether it has enough room for the cluster
upfront, otherwise we end up attempting to set up a cluster and
failing. Only in the NO_EMPTY_SIZE loop do we attempt an unclustered
allocation, and by then we'll have zeroed the cluster size, so this
patch won't stop us from using the block group as a last resort.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: start search for new cluster at the beginning

Instead of starting at zero (offset is always zero), request a cluster
starting at search_start, that denotes the beginning of the current
block group.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: reset cluster's max_size when creating bitmap

The field that indicates the size of the largest contiguous chunk of
free space in the cluster is not initialized when setting up bitmaps,
it's only increased when we find a larger contiguous chunk. We end up
retaining a larger value than appropriate for highly-fragmented
clusters, which may cause pointless searches for large contiguous
groups, and even cause clusters that do not meet the density
requirements to be set up.

Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: initialize new bitmaps' list

We're failing to create clusters with bitmaps because
setup_cluster_no_bitmap checks that the list is empty before inserting
the bitmap entry in the list for setup_cluster_bitmap, but the list
field is only initialized when it is restored from the on-disk free
space cache, or when it is written out to disk.

Besides a potential race condition due to the multiple use of the list
field, filesystem performance severely degrades over time: as we use
up all non-bitmap free extents, the try-to-set-up-cluster dance is
done at every metadata block allocation. For every block group, we
fail to set up a cluster, and after failing on them all up to twice,
we fall back to the much slower unclustered allocation.

To make matters worse, before the unclustered allocation, we try to
create new block groups until we reach the 1% threshold, which
introduces additional bitmaps and thus block groups that we'll iterate
over at each metadata block request.

Btrfs: fix oops when calling statfs on readonly device

To reproduce this bug:

  # dd if=/dev/zero of=img bs=1M count=256
  # mkfs.btrfs img
  # losetup -r /dev/loop1 img
  # mount /dev/loop1 /mnt
  OOPS!!

It triggered BUG_ON(!nr_devices) in btrfs_calc_avail_data_space().

To fix this, instead of checking write-only devices, we check all open
deivces:

  # df -h /dev/loop1
  Filesystem            Size  Used Avail Use% Mounted on
  /dev/loop1            250M   28K  238M   1% /mnt

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>

Btrfs: Don't error on resizing FS to same size

It seems overly harsh to fail a resize of a btrfs file system to the
same size when a shrink or grow would succeed. User app GParted trips
over this error. Allow it by bypassing the shrink or grow operation.

Signed-off-by: Mike Fleetwood <mike.fleetwood@googlemail.com>

Btrfs: fix deadlock on metadata reservation when evicting a inode

When I ran the xfstests, I found the test tasks was blocked on meta-data
reservation.

By debugging, I found the reason of this bug:
   start transaction
        |
v
   reserve meta-data space
|
v
   flush delay allocation -> iput inode -> evict inode
^ |
| v
   wait for delay allocation flush <- reserve meta-data space

And besides that, the flush on evicting inode will block the thread, which
is reclaiming the memory, and make oom happen easily.

Fix this bug by skipping the flush step when evicting inode.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>

Fix URL of btrfs-progs git repository in docs

The location of the btrfs-progs repository has been changed.
This patch updates the documentation accordingly.

Signed-off-by: Arnd Hannemann <arnd@arndnet.de>

btrfs scrub: handle -ENOMEM from init_ipath()

init_ipath() can return an ERR_PTR(-ENOMEM).

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

drm/nv50/disp: silence compiler warning

NFI why this only started appearing now. The use of the uninitialised var
can't actually happen, so perhaps my compiler just got stupider.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: fix oopses caused by clear being called on unpopulated ttms

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nouveau: Keep RAMIN heap within the channel.

The entire RAMIN is allocated to be 'size', but the heap is
specified as 'base' + 'size' inside RAMIN, so it will overflow
past RAMIN by 'base' bytes on NV50+ and clobber other allocatons
unless it's size is adjusted.

Signed-off-by: Younes Manton <younes.m@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

drm/nvd0/disp: fix sor dpms typo, preventing dpms on in some situations

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>