Ram Pai [Sun, 6 Nov 2011 02:33:57 +0000 (10:33 +0800)]
PCI: delay configuration of SRIOV capability
The SRIOV capability, namely page size and total_vfs of a device are
configured during enumeration phase of the device. This can potentially
interfere with the PCI operations of the platform, if the IOV capability
of the device is not enabled.
The following patch postpones the configuration of the IOV capability of
the device to a later point, when the IOV capability is explicitly
enabled by the device driver.
The patch is tested on x86 and power platform.
Tested-by: Donald Dutile <ddutile@redhat.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Wed, 23 Nov 2011 05:06:53 +0000 (21:06 -0800)]
PCI: Only call pci_stop_bus_device() one time for child devices at remove
During debugging pcie hotplug with SRIOV with pcie switch, I found
pci_stop_bus_device() is called several times for some child devices.
So change original pci_remove_bus_device() to __pci_remove_bus_device(),
and make it only do remove work, and add a new pci_remove_bus_device
that calls pci_stop_bus_device() one time, and then call
__pci_remove_bus_device().
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
x86/PCI: amd: Kill misleading message about enablement of IO access to PCI ECS]
Commit 24d9b70b8c679264756a6980e668b96b3f964826 (x86: Use PCI method
for enabling AMD extended config space before MSR method) added a
message when IO access to PCI ECS was enabled via access to the NB_CFG
PCI register. This can lead to a bogus message like
[ 0.365177] Extended Config Space enabled on 0 nodes
which is misleading because IO ECS access is subsequently enabled for
AMD CPUs (that support this) by modifying the corresponding NB_CFG
MSR.
Furthermore it's not "Extended Config Space" that is enabled by this
register setting. It's the IO access that is enabled for extended
configruation space.
IMHO the ambiguous message needs to be cancelled.
Cc: Jan Beulich <jbeulich@novell.com> Cc: Robert Richter <robert.richter@amd.com> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Myron Stowe [Fri, 28 Oct 2011 21:49:20 +0000 (15:49 -0600)]
PCI: latency timer doesn't apply to PCIe
The latency timer is read-only and hardwired to zero for all PCIe
devices, both Type 0 and Type 1, so don't bother trying to update it
and cluttering the dmesg log with meaningless "setting latency timer
to 64" messages.
Myron Stowe [Fri, 28 Oct 2011 21:48:59 +0000 (15:48 -0600)]
PCI: mn10300: use generic pcibios_set_master()
This patch removes mn10300's architecture-specific 'pcibios_set_master()'
routine for ASB2305 and lets the default PCI core based implementation
handle PCI device 'latency timer' setup.
Myron Stowe [Fri, 28 Oct 2011 21:48:38 +0000 (15:48 -0600)]
PCI: Pull PCI 'latency timer' setup up into the core
The 'latency timer' of PCI devices, both Type 0 and Type 1,
is setup in architecture-specific code [see: 'pcibios_set_master()'].
There are two approaches being taken by all the architectures - check
if the 'latency timer' is currently set between 16 and 255 and if not
bring it within bounds, or, do nothing (and then there is the
gratuitously different PA-RISC implementation).
There is nothing architecture-specific about PCI's 'latency timer' so
this patch pulls its setup functionality up into the PCI core by
creating a generic 'pcibios_set_master()' function using the '__weak'
attribute which can be used by all architectures as a default which,
if necessary, can then be over-ridden by architecture-specific code.
Myron Stowe [Fri, 28 Oct 2011 21:48:31 +0000 (15:48 -0600)]
PCI: Xtensa: convert pcibios_set_master() to a non-inlined function
This patch converts Xtensa's architecture-specific
'pcibios_set_master()' routine to a non-inlined function. This will
allow follow on patches to create a generic 'pcibios_set_master()'
function using the '__weak' attribute which can be used by all
architectures as a default which, if necessary, can then be over-
ridden by architecture-specific code.
Converting 'pci_bios_set_master()' to a non-inlined function will
allow Xtensa's 'pcibios_set_master()' implementation to remain
architecture-specific after the generic version is introduced and
thus, not change current behavior.
Myron Stowe [Fri, 28 Oct 2011 21:48:24 +0000 (15:48 -0600)]
PCI: UniCore: convert pcibios_set_master() to a non-inlined function
This patch converts UniCore's architecture-specific
'pcibios_set_master()' routine to a non-inlined function. This will
allow follow on patches to create a generic 'pcibios_set_master()'
function using the '__weak' attribute which can be used by all
architectures as a default which, if necessary, can then be over-
ridden by architecture-specific code.
Converting 'pci_bios_set_master()' to a non-inlined function will
allow UniCore's 'pcibios_set_master()' implementation to remain
architecture-specific after the generic version is introduced and
thus, not change current behavior.
Myron Stowe [Fri, 28 Oct 2011 21:48:17 +0000 (15:48 -0600)]
PCI: TILE: convert pcibios_set_master() to a non-inlined function
This patch converts TILE's architecture-specific 'pcibios_set_master()'
routine to a non-inlined function. This will allow follow on patches
to create a generic 'pcibios_set_master()' function using the '__weak'
attribute which can be used by all architectures as a default which,
if necessary, can then be over-ridden by architecture-specific code.
Converting 'pci_bios_set_master()' to a non-inlined function will
allow TILE's 'pcibios_set_master()' implementation to remain
architecture-specific after the generic version is introduced and
thus, not change current behavior.
Myron Stowe [Fri, 28 Oct 2011 21:48:10 +0000 (15:48 -0600)]
PCI: SPARC: convert pcibios_set_master() to a non-inlined function
This patch converts SPARC's architecture-specific
'pcibios_set_master()' routine to a non-inlined function. This will
allow follow on patches to create a generic 'pcibios_set_master()'
function using the '__weak' attribute which can be used by all
architectures as a default which, if necessary, can then be over-
ridden by architecture-specific code.
Converting 'pci_bios_set_master()' to a non-inlined function will
allow SPARC's 'pcibios_set_master()' implementation to remain
architecture-specific after the generic version is introduced and
thus, not change current behavior.
Myron Stowe [Fri, 28 Oct 2011 21:48:03 +0000 (15:48 -0600)]
PCI: PowerPC: convert pcibios_set_master() to a non-inlined function
This patch converts PowerPC's architecture-specific
'pcibios_set_master()' routine to a non-inlined function. This will
allow follow on patches to create a generic 'pcibios_set_master()'
function using the '__weak' attribute which can be used by all
architectures as a default which, if necessary, can then be over-
ridden by architecture-specific code.
Converting 'pci_bios_set_master()' to a non-inlined function will
allow PowerPC's 'pcibios_set_master()' implementation to remain
architecture-specific after the generic version is introduced and
thus, not change current behavior.
Myron Stowe [Fri, 28 Oct 2011 21:47:56 +0000 (15:47 -0600)]
PCI: MicroBlaze: convert pcibios_set_master() to a non-inlined function
This patch converts MicroBlaze's architecture-specific
'pcibios_set_master()' routine to a non-inlined function. This will
allow follow on patches to create a generic 'pcibios_set_master()'
function using the '__weak' attribute which can be used by all
architectures as a default which, if necessary, can then be over-
ridden by architecture-specific code.
Converting 'pci_bios_set_master()' to a non-inlined function will
allow MicroBlaze's 'pcibios_set_master()' implementation to remain
architecture-specific after the generic version is introduced and thus,
not change current behavior.
Myron Stowe [Fri, 28 Oct 2011 21:47:49 +0000 (15:47 -0600)]
PCI: IA64: convert pcibios_set_master() to a non-inlined function
This patch converts IA64's architecture-specific 'pcibios_set_master()'
routine to a non-inlined function. This will allow follow on
patches to create a generic 'pcibios_set_master()' function using the
'__weak' attribute which can be used by all architectures as a default
which, if necessary, can then be over-ridden by architecture-
specific code.
Converting 'pci_bios_set_master()' to a non-inlined function will allow
IA64's 'pcibios_set_master()' implementation to remain architecture-
specific after the generic version is introduced and thus, not change
current behavior.
Myron Stowe [Fri, 28 Oct 2011 21:47:42 +0000 (15:47 -0600)]
PCI: ARM: convert pcibios_set_master() to a non-inlined function
This patch converts ARM's architecture-specific inlined
'pcibios_set_master()' routine to a non-inlined function. This will
allow follow on patches to create a generic 'pcibios_set_master()'
function using the '__weak' attribute which can be used by all
architectures as a default which, if necessary, can then be over-
ridden by architecture-specific code.
Converting 'pci_bios_set_master()' to a non-inlined function will allow
ARM's 'pcibios_set_master()' implementation to remain architecture-
specific after the generic version is introduced and thus, not change
current behavior.
Note that ARM also has a non-inlined 'pcibios_set_master()' that is
used if CONFIG_PCI_HOST_ITE8152 is defined. This patch does not
change any behavior here either.
Myron Stowe [Fri, 28 Oct 2011 21:47:35 +0000 (15:47 -0600)]
PCI: add declaration for pcibios_set_master() to pci core
Currently, pcibios_set_master() is implemented in architecture-
specific code. There is nothing architecture-specific about PCI's
'latency timer'.
This patch adds a declaration for pcibios_set_master() to PCI's core
in preperation for pulling the function itself up into the core.
Without the addition of this declaration, subsequent patches that
remove inline definitions of pcibios_set_master() would be removing
the only declaration of such.
Jan Kiszka [Fri, 4 Nov 2011 08:46:01 +0000 (09:46 +0100)]
uio: Convert uio_generic_pci to new intx masking API
The new PCI API provides both generic probing for 2.3 masking support
and check&mask in the interrupt handler.
Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Jan Kiszka [Fri, 4 Nov 2011 08:46:00 +0000 (09:46 +0100)]
PCI: Introduce INTx check & mask API
These new PCI services allow to probe for 2.3-compliant INTx masking
support and then use the feature from PCI interrupt handlers. The
services are properly synchronized with concurrent config space access
via sysfs or on device reset.
This enables generic PCI device drivers like uio_pci_generic or KVM's
device assignment to implement the necessary kernel-side IRQ handling
without any knowledge about device-specific interrupt status and control
registers.
Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Jan Kiszka [Fri, 4 Nov 2011 08:45:59 +0000 (09:45 +0100)]
PCI: Rework config space blocking services
pci_block_user_cfg_access was designed for the use case that a single
context, the IPR driver, temporarily delays user space accesses to the
config space via sysfs. This assumption became invalid by the time
pci_dev_reset was added as locking instance. Today, if you run two loops
in parallel that reset the same device via sysfs, you end up with a
kernel BUG as pci_block_user_cfg_access detect the broken assumption.
This reworks the pci_block_user_cfg_access to a sleeping service
pci_cfg_access_lock and an atomic-compatible variant called
pci_cfg_access_trylock. The former not only blocks user space access as
before but also waits if access was already locked. The latter service
just returns false in this case, allowing the caller to resolve the
conflict instead of raising a BUG.
Adaptions of the ipr driver were originally written by Brian King.
Acked-by: Brian King <brking@linux.vnet.ibm.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Gary Hade [Mon, 14 Nov 2011 23:42:16 +0000 (15:42 -0800)]
x86/PCI: Ignore CPU non-addressable _CRS reserved memory resources
This assures that a _CRS reserved host bridge window or window region is
not used if it is not addressable by the CPU. The new code either trims
the window to exclude the non-addressable portion or totally ignores the
window if the entire window is non-addressable.
The current code has been shown to be problematic with 32-bit non-PAE
kernels on systems where _CRS reserves resources above 4GB.
Signed-off-by: Gary Hade <garyhade@us.ibm.com> Reviewed-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Thomas Renninger <trenn@novell.com> Cc: linux-kernel@vger.kernel.org Cc: stable@kernel.org Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
David Fries [Sun, 20 Nov 2011 21:29:46 +0000 (15:29 -0600)]
PCI: pci_has_legacy_pm_support add driver and device to WARN
Include the driver name and device in warning when a pci driver
supports both legacy pm and new framework as just the stack trace
gives no way to identify the driver.
Signed-off-by: David Fries <David@Fries.net> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PCI: msi: Disable msi interrupts when we initialize a pci device
I traced a nasty kexec on panic boot failure to the fact that we had
screaming msi interrupts and we were not disabling the msi messages at
kernel startup. The booting kernel had not enabled those interupts so
was not prepared to handle them.
I can see no reason why we would ever want to leave the msi interrupts
enabled at boot if something else has enabled those interrupts. The pci
spec specifies that msi interrupts should be off by default. Drivers
are expected to enable the msi interrupts if they want to use them. Our
interrupt handling code reprograms the interrupt handlers at boot and
will not be be able to do anything useful with an unexpected interrupt.
This patch applies cleanly all of the way back to 2.6.32 where I noticed
the problem.
Cc: stable@kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PCI/ACPI/PM: Avoid resuming devices that don't signal PME
Modify pci_acpi_wake_dev() to avoid resuming PME-capable devices
whose PME Status bits are not set, which may happen currently if
several devices are associated with the same wakeup GPE and all
of them are notified whenever at least one of them signals PME.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PCI/ACPI: Make acpiphp ignore root bridges using SHPC native hotplug
If the kernel has requested control of the SHPC native hotplug
feature for a given root bridge, the acpiphp driver should not try
to handle that root bridge and it should leave it to shpchp.
Failing to do so causes problems to happen if shpchp is loaded
and unloaded before loading acpiphp (ACPI-based hotplug won't work
in that case anyway).
To address this issue make find_root_bridges() ignore PCI root
bridges with SHPC native hotplug enabled and make add_bridge()
return error code if SHPC native hotplug is enabled for the given
root bridge. This causes acpiphp to refuse to load if SHPC native
hotplug is enabled for all root bridges and to refuse binding to
the root bridges with SHPC native hotplug enabled.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use non-ordered workqueue for attention button events.
Attention button events on each slot can be handled asynchronously. So
we should use non-ordered workqueue. This patch also removes ordered
workqueue in pciehp as a result.
Kenji Kaneshige [Mon, 7 Nov 2011 11:55:46 +0000 (20:55 +0900)]
PCI: pciehp: Fix wrong workqueue cleanup
Fix improper workqueue cleanup.
In the current pciehp, pcied_cleanup() calls destroy_workqueue()
before calling pcie_port_service_unregister(). This causes kernel oops
because flush_workqueue() is called in the pcie_port_service_unregister()
code path after the workqueue was destroyed. So pcied_cleanup() must call
pcie_port_service_unregister() first before calling destroy_workqueue().
Matthew Garrett [Thu, 10 Nov 2011 21:38:33 +0000 (16:38 -0500)]
PCI: Rework ASPM disable code
Right now we forcibly clear ASPM state on all devices if the BIOS indicates
that the feature isn't supported. Based on the Microsoft presentation
"PCI Express In Depth for Windows Vista and Beyond", I'm starting to think
that this may be an error. The implication is that unless the platform
grants full control via _OSC, Windows will not touch any PCIe features -
including ASPM. In that case clearing ASPM state would be an error unless
the platform has granted us that control.
This patch reworks the ASPM disabling code such that the actual clearing
of state is triggered by a successful handoff of PCIe control to the OS.
The general ASPM code undergoes some changes in order to ensure that the
ability to clear the bits isn't overridden by ASPM having already been
disabled. Further, this theoretically now allows for situations where
only a subset of PCIe roots hand over control, leaving the others in the
BIOS state.
It's difficult to know for sure that this is the right thing to do -
there's zero public documentation on the interaction between all of these
components. But enough vendors enable ASPM on platforms and then set this
bit that it seems likely that they're expecting the OS to leave them alone.
Measured to save around 5W on an idle Thinkpad X220.
Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Neil Horman [Thu, 6 Oct 2011 18:08:18 +0000 (14:08 -0400)]
PCI/sysfs: add per pci device msi[x] irq listing (v5)
This patch adds a per-pci-device subdirectory in sysfs called:
/sys/bus/pci/devices/<device>/msi_irqs
This sub-directory exports the set of msi vectors allocated by a given
pci device, by creating a numbered sub-directory for each vector beneath
msi_irqs. For each vector various attributes can be exported.
Currently the only attribute is called mode, which tracks the
operational mode of that vector (msi vs. msix)
James Bottomley [Tue, 29 Nov 2011 19:20:23 +0000 (19:20 +0000)]
PCI: fix ats compile failure
I get this compile failure on parisc:
drivers/pci/ats.c: In function 'ats_alloc_one':
drivers/pci/ats.c:29: error: implicit declaration of function 'kzalloc'
drivers/pci/ats.c:29: warning: assignment makes pointer from integer without a cast
drivers/pci/ats.c: In function 'ats_free_one':
drivers/pci/ats.c:45: error: implicit declaration of function 'kfree'
Because ats.c is missing linux/slab.h as an include. This patch fixes it
Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Ram Pai [Sun, 6 Nov 2011 02:33:10 +0000 (10:33 +0800)]
PCI: defer enablement of SRIOV BARS
All the PCI BARs of a device are enabled when the device is enabled
using pci_enable_device(). This unnecessarily enables SRIOV BARs of the
device.
On some platforms, which do not support SRIOV as yet, the
pci_enable_device() fails to enable the device if its SRIOV BARs are not
allocated resources correctly.
The following patch fixes the above problem. The SRIOV BARs are now
enabled when IOV capability of the device is enabled in sriov_enable().
NOTE: Note, there is subtle change in the pci_enable_device() API. Any
driver that depends on SRIOV BARS to be enabled in pci_enable_device()
can fail.
The patch has been touch tested on power and x86 platform.
Tested-by: Michael Wang <wangyun@linux.vnet.ibm.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Linus Torvalds [Sun, 4 Dec 2011 19:57:09 +0000 (11:57 -0800)]
x86: Fix boot failures on older AMD CPU's
People with old AMD chips are getting hung boots, because commit bcb80e53877c ("x86, microcode, AMD: Add microcode revision to
/proc/cpuinfo") moved the microcode detection too early into
"early_init_amd()".
At that point we are *so* early in the booth that the exception tables
haven't even been set up yet, so the whole
doesn't actually work: if the rdmsr does a GP fault (due to non-existant
MSR register on older CPU's), we can't fix it up yet, and the boot fails.
Fix it by simply moving the code to a slightly later point in the boot
(init_amd() instead of early_init_amd()), since the kernel itself
doesn't even really care about the microcode patchlevel at this point
(or really ever: it's made available to user space in /proc/cpuinfo, and
updated if you do a microcode load).
Reported-tested-and-bisected-by: Larry Finger <Larry.Finger@lwfinger.net> Tested-by: Bob Tracy <rct@gherkin.frus.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
xen/pm_idle: Make pm_idle be default_idle under Xen.
The idea behind commit d91ee5863b71 ("cpuidle: replace xen access to x86
pm_idle and default_idle") was to have one call - disable_cpuidle()
which would make pm_idle not be molested by other code. It disallows
cpuidle_idle_call to be set to pm_idle (which is excellent).
But in the select_idle_routine() and idle_setup(), the pm_idle can still
be set to either: amd_e400_idle, mwait_idle or default_idle. This
depends on some CPU flags (MWAIT) and in AMD case on the type of CPU.
In case of mwait_idle we can hit some instances where the hypervisor
(Amazon EC2 specifically) sets the MWAIT and we get:
In the case of amd_e400_idle we don't get so spectacular crashes, but we
do end up making an MSR which is trapped in the hypervisor, and then
follow it up with a yield hypercall. Meaning we end up going to
hypervisor twice instead of just once.
The previous behavior before v3.0 was that pm_idle was set to
default_idle regardless of select_idle_routine/idle_setup.
We want to do that, but only for one specific case: Xen. This patch
does that.
Fixes RH BZ #739499 and Ubuntu #881076 Reported-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 2 Dec 2011 21:30:58 +0000 (13:30 -0800)]
Merge branch 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (21 commits)
usb: ftdi_sio: add PID for Propox ISPcable III
Revert "xHCI: reset-on-resume quirk for NEC uPD720200"
xHCI: fix bug in xhci_clear_command_ring()
usb: gadget: fsl_udc: fix dequeuing a request in progress
usb: fsl_mxc_udc.c: Remove compile-time dependency of MX35 SoC type
usb: fsl_mxc_udc.c: Fix build issue by including missing header file
USB: fsl_udc_core: use usb_endpoint_xfer_isoc to judge ISO XFER
usb: udc: Fix gadget driver's speed check in various UDC drivers
usb: gadget: fix g_serial regression
usb: renesas_usbhs: fixup driver speed
usb: renesas_usbhs: fixup gadget.dev.driver when udc_stop.
usb: renesas_usbhs: fixup signal the driver that cable was disconnected
usb: renesas_usbhs: fixup device_register timing
usb: musb: PM: fix context save/restore in suspend/resume path
USB: linux-cdc-acm.inf: add support for the acm_ms gadget
EHCI : Fix a regression in the ISO scheduler
xHCI: reset-on-resume quirk for NEC uPD720200
USB: whci-hcd: fix endian conversion in qset_clear()
USB: usb-storage: unusual_devs entry for Kingston DT 101 G2
usb: option: add SIMCom SIM5218
...
Linus Torvalds [Fri, 2 Dec 2011 21:30:25 +0000 (13:30 -0800)]
Merge branch 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
* 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
Staging: comedi: fix integer overflow in do_insnlist_ioctl()
Revert "Staging: comedi: integer overflow in do_insnlist_ioctl()"
Staging: comedi: integer overflow in do_insnlist_ioctl()
Staging: comedi: fix signal handling in read and write
Staging: comedi: fix mmap_count
staging: comedi: fix oops for USB DAQ devices.
staging: comedi: usbduxsigma: Fixed wrong range for the analogue channel.
staging:rts_pstor:Complete scanning_done variable
staging: usbip: bugfix for deadlock
Linus Torvalds [Fri, 2 Dec 2011 18:38:20 +0000 (10:38 -0800)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: fix attr2 vs large data fork assert
xfs: force buffer writeback before blocking on the ilock in inode reclaim
xfs: validate acl count
Linus Torvalds [Fri, 2 Dec 2011 16:25:04 +0000 (08:25 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
vmwgfx: integer overflow in vmw_kms_update_layout_ioctl()
drm/radeon/kms: fix 2D tiling CS support on EG/CM
drm/radeon/kms: fix scanout of 2D tiled buffers on EG/CM
drm: Fix lack of CRTC disable for drm_crtc_helper_set_config(.fb=NULL)
drm/radeon/kms: add some new pci ids
drm/radeon/kms: Skip ACPI call to ATIF when possible
drm/radeon/kms: Hide debugging message
drm/radeon/kms: add some loop timeouts in pageflip code
drm/nv50/disp: silence compiler warning
drm/nouveau: fix oopses caused by clear being called on unpopulated ttms
drm/nouveau: Keep RAMIN heap within the channel.
drm/nvd0/disp: fix sor dpms typo, preventing dpms on in some situations
drm/nvc0/gr: fix TP init for transform feedback offset queries
drm/nouveau: add dumb ioctl support
Linus Torvalds [Fri, 2 Dec 2011 16:10:51 +0000 (08:10 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix S3/S4 problem on machines with VREF-pin mute-LED
ALSA: hda_intel - revert a quirk that affect VIA chipsets
ALSA: hda - Avoid touching mute-VREF pin for IDT codecs
firmware: Sigma: Fix endianess issues
firmware: Sigma: Skip header during CRC generation
firmware: Sigma: Prevent out of bounds memory access
ALSA: usb-audio - Support for Roland GAIA SH-01 Synthesizer
ASoC: Supply dcs_codes for newer WM1811 revisions
ASoC: Error out if we can't generate a LRCLK at all for WM8994
ASoC: Correct name of Speyside Main Speaker widget
ASoC: skip resume of soc-audio devices without codecs
ASoC: cs42l51: Fix off-by-one for reg_cache_size
ASoC: drop support for PlayPaq with WM8510
ASoC: mpc8610: tell the CS4270 codec that it's the master
ASoC: cs4720: use snd_soc_cache_sync()
ASoC: SAMSUNG: Fix build error
ASoC: max9877: Update register if either val or val2 is changed
ASoC: Fix wrong define for AD1836_ADC_WORD_OFFSET
Xi Wang [Mon, 28 Nov 2011 11:25:43 +0000 (12:25 +0100)]
vmwgfx: integer overflow in vmw_kms_update_layout_ioctl()
There are two issues in vmw_kms_update_layout_ioctl(). First, the
for loop forgets to index rects and only checks the first element.
Second, there is a potential integer overflow if userspace passes
in a large arg->num_outputs. The call to kzalloc() would allocate
a small buffer, leading to out-of-bounds read.
Reported-by: Haogang Chen <haogangchen@gmail.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Chris Wilson [Mon, 28 Nov 2011 21:10:05 +0000 (21:10 +0000)]
drm: Fix lack of CRTC disable for drm_crtc_helper_set_config(.fb=NULL)
Disabling the CRTC by setting its framebuffer to NULL, as used by
drm_framebuffer_cleanup(), was failing to pass the current framebuffer
to the crtc_func->disable callback. This is because of the dance within
drm_crtc_helper_set_config to pass the new_fb (NULL in this case) to the
drm_crtc_helper_set_mode with the currently attached fb as a parameter.
drm_crtc_helper_set_mode treats this as a no-op and the encoder is still
enabled. And so the current fb is forgotten before the call to
drm_helper_disable_unused_functions.
This patch treats disabling the CRTC as a simple special case rather
than adding further complexity into the configuration logic.
This fixes a pin-leak of the fb bo on Xserver close.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
netfilter: Remove ADVANCED dependency from NF_CONNTRACK_NETBIOS_NS
ipv4: flush route cache after change accept_local
sch_red: fix red_change
Revert "udp: remove redundant variable"
bridge: master device stuck in no-carrier state forever when in user-stp mode
ipv4: Perform peer validation on cached route lookup.
net/core: fix rollback handler in register_netdevice_notifier
sch_red: fix red_calc_qavg_from_idle_time
bonding: only use primary address for ARP
ipv4: fix lockdep splat in rt_cache_seq_show
sch_teql: fix lockdep splat
net: fec: Select the FEC driver by default for i.MX SoCs
isdn: avoid copying too long drvid
isdn: make sure strings are null terminated
netlabel: Fix build problems when IPv6 is not enabled
sctp: better integer overflow check in sctp_auth_create_key()
sctp: integer overflow in sctp_auth_create_key()
ipv6: Set mcast_hops to IPV6_DEFAULT_MCASTHOPS when -1 was given.
net: Fix corruption in /proc/*/net/dev_mcast
mac80211: fix race between the AGG SM and the Tx data path
...
Peter Pan(潘卫平) [Thu, 1 Dec 2011 15:47:06 +0000 (15:47 +0000)]
ipv4: flush route cache after change accept_local
After reset ipv4_devconf->data[IPV4_DEVCONF_ACCEPT_LOCAL] to 0,
we should flush route cache, or it will continue receive packets with local
source address, which should be dropped.
Signed-off-by: Weiping Pan <panweiping3@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 1 Dec 2011 11:06:34 +0000 (11:06 +0000)]
sch_red: fix red_change
Le mercredi 30 novembre 2011 à 14:36 -0800, Stephen Hemminger a écrit :
> (Almost) nobody uses RED because they can't figure it out.
> According to Wikipedia, VJ says that:
> "there are not one, but two bugs in classic RED."
RED is useful for high throughput routers, I doubt many linux machines
act as such devices.
I was considering adding Adaptative RED (Sally Floyd, Ramakrishna
Gummadi, Scott Shender), August 2001
In this version, maxp is dynamic (from 1% to 50%), and user only have to
setup min_th (target average queue size)
(max_th and wq (burst in linux RED) are automatically setup)
By the way it seems we have a small bug in red_change()
if (skb_queue_empty(&sch->q))
red_end_of_idle_period(&q->parms);
First, if queue is empty, we should call
red_start_of_idle_period(&q->parms);
Second, since we dont use anymore sch->q, but q->qdisc, the test is
meaningless.
Oh well...
[PATCH] sch_red: fix red_change()
Now RED is classful, we must check q->qdisc->q.qlen, and if queue is empty,
we start an idle period, not end it.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 1 Dec 2011 22:55:34 +0000 (14:55 -0800)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (31 commits)
ocfs2: avoid unaligned access to dqc_bitmap
ocfs2: Use filemap_write_and_wait() instead of write_inode_now()
ocfs2: honor O_(D)SYNC flag in fallocate
ocfs2: Add a missing journal credit in ocfs2_link_credits() -v2
ocfs2: send correct UUID to cleancache initialization
ocfs2: Commit transactions in error cases -v2
ocfs2: make direntry invalid when deleting it
fs/ocfs2/dlm/dlmlock.c: free kmem_cache_zalloc'd data using kmem_cache_free
ocfs2: Avoid livelock in ocfs2_readpage()
ocfs2: serialize unaligned aio
ocfs2: Implement llseek()
ocfs2: Fix ocfs2_page_mkwrite()
ocfs2: Add comment about orphan scanning
ocfs2: Clean up messages in the fs
ocfs2/cluster: Cluster up now includes network connections too
ocfs2/cluster: Add new function o2net_fill_node_map()
ocfs2/cluster: Fix output in file elapsed_time_in_ms
ocfs2/dlm: dlmlock_remote() needs to account for remastery
ocfs2/dlm: Take inflight reference count for remotely mastered resources too
ocfs2/dlm: Cleanup dlm_wait_for_node_death() and dlm_wait_for_node_recovery()
...
Akinobu Mita [Tue, 15 Nov 2011 22:56:34 +0000 (14:56 -0800)]
ocfs2: avoid unaligned access to dqc_bitmap
The dqc_bitmap field of struct ocfs2_local_disk_chunk is 32-bit aligned,
but not 64-bit aligned. The dqc_bitmap is accessed by ocfs2_set_bit(),
ocfs2_clear_bit(), ocfs2_test_bit(), or ocfs2_find_next_zero_bit(). These
are wrapper macros for ext2_*_bit() which need to take an unsigned long
aligned address (though some architectures are able to handle unaligned
address correctly)
So some 64bit architectures may not be able to access the dqc_bitmap
correctly.
This avoids such unaligned access by using another wrapper functions for
ext2_*_bit(). The code is taken from fs/ext4/mballoc.c which also need to
handle unaligned bitmap access.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Joel Becker <jlbec@evilplan.org>
Linus Torvalds [Thu, 1 Dec 2011 19:53:54 +0000 (11:53 -0800)]
Merge branch 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm
* 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm:
ARM: 7182/1: ARM cpu topology: fix warning
ARM: 7181/1: Restrict kprobes probing SWP instructions to ARMv5 and below
ARM: 7180/1: Change kprobes testcase with unpredictable STRD instruction
ARM: 7177/1: GIC: avoid skipping non-existent PPIs in irq_start calculation
ARM: 7176/1: cpu_pm: register GIC PM notifier only once
ARM: 7175/1: add subname parameter to mfp_set_groupg callers
ARM: 7174/1: Fix build error in kprobes test code on Thumb2 kernels
ARM: 7172/1: dma: Drop GFP_COMP for DMA memory allocations
ARM: 7171/1: unwind: add unwind directives to bitops assembly macros
ARM: 7170/2: fix compilation breakage in entry-armv.S
ARM: 7168/1: use cache type functions for arch_get_unmapped_area
ARM: perf: check that we have a platform device when reserving PMU
ARM: 7166/1: Use PMD_SHIFT instead of PGDIR_SHIFT in dma-consistent.c
ARM: 7165/2: PL330: Fix typo in _prepare_ccr()
ARM: 7163/2: PL330: Only register usable channels
ARM: 7162/1: errata: tidy up Kconfig options for PL310 errata workarounds
ARM: 7161/1: errata: no automatic store buffer drain
ARM: perf: initialise used_mask for fake PMU during validation
ARM: PMU: remove pmu_init declaration
ARM: PMU: re-export release_pmu symbol to modules
If we take the "try_again" goto, due to a checksum error,
the 'len' has already been truncated. So we won't compute
the same values as the original code did.
Reported-by: paul bilke <fsmail@conspiracy.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge branch 'for-usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sarah/xhci into usb-linus
* 'for-usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sarah/xhci:
Revert "xHCI: reset-on-resume quirk for NEC uPD720200"
xHCI: fix bug in xhci_clear_command_ring()
bridge: master device stuck in no-carrier state forever when in user-stp mode
When in user-stp mode, bridge master do not follow state of its slaves, so
after the following sequence of events it can stuck forever in no-carrier
state:
1) turn stp off
2) put all slaves down - master device will follow their state and also go in
no-carrier state
3) turn stp on with bridge-stp script returning 0 (go to the user-stp mode)
Now bridge master won't follow slaves' state and will never reach running
state.
This patch solves the problem by making user-stp and kernel-stp behavior
similar regarding master following slaves' states.
Signed-off-by: Vitalii Demianets <vitas@nppfactor.kiev.ua> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The commit added a reset-on-resume quirk because the NEC chipset stopped
responding to commands about 30 minutes after a system resume from
suspend. We thought it was a chipset issue, but it turns out that the
xHCI driver was zeroing out the link TRB after a successful context
restore during resume. The host controller would fall off the command
ring sometime later, causing it to not respond to new commands.
The link TRB issue has been fixed with commit 158886cd2cf4599e04f9b7e10cb767f5f39b14f1 "xHCI: fix bug in
xhci_clear_command_ring()", so revert the reset-on-resume quirk, as it's
not necessary.
Commit df711fc9962b9491af2b92bd0d21ecbfefe4e5fa was marked for stable
trees back to 2.6.37, but according to my mail, it has not made it into
Linus' tree or the stable trees yet.
Andiry Xu [Wed, 30 Nov 2011 08:37:41 +0000 (16:37 +0800)]
xHCI: fix bug in xhci_clear_command_ring()
When system enters suspend, xHCI driver clears command ring by writing zero
to all the TRBs. However, this also writes zero to the Link TRB, and the ring
is mangled. This may cause driver accesses wrong memory address and the
result is unpredicted.
When clear the command ring, keep the last Link TRB intact, only clear its
cycle bit. This should fix the "command ring full" issue reported by Oliver
Neukum.
This should be backported to stable kernels as old as 2.6.37, since the
commit 89821320 "xhci: Fix command ring replay after resume" is merged.
Signed-off-by: Andiry Xu <andiry.xu@amd.com> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Reported-by: Oliver Neukum <oneukum@suse.de>
Linus Torvalds [Thu, 1 Dec 2011 16:28:53 +0000 (08:28 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix meta data raid-repair merge problem
Btrfs: skip allocation attempt from empty cluster
Btrfs: skip block groups without enough space for a cluster
Btrfs: start search for new cluster at the beginning
Btrfs: reset cluster's max_size when creating bitmap
Btrfs: initialize new bitmaps' list
Btrfs: fix oops when calling statfs on readonly device
Btrfs: Don't error on resizing FS to same size
Btrfs: fix deadlock on metadata reservation when evicting a inode
Fix URL of btrfs-progs git repository in docs
btrfs scrub: handle -ENOMEM from init_ipath()
Jan Schmidt [Thu, 1 Dec 2011 14:30:36 +0000 (09:30 -0500)]
Btrfs: fix meta data raid-repair merge problem
Commit 4a54c8c16 introduced raid-repair, killing the individual
readpage_io_failed_hook entries from inode.c and disk-io.c. Commit 4bb31e92 introduced new readahead code, adding a readpage_io_failed_hook to
disk-io.c.
The raid-repair commit had logic to disable raid-repair, if
readpage_io_failed_hook is set. Thus, the readahead commit effectively
disabled raid-repair for meta data.
This commit changes the logic to always attempt raid-repair when needed and
call the readpage_io_failed_hook in case raid-repair fails. This is much
more straight forward and should have been like that from the beginning.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> Reported-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Charles Chin [Thu, 1 Dec 2011 10:21:00 +0000 (11:21 +0100)]
ALSA: hda - Fix S3/S4 problem on machines with VREF-pin mute-LED
The verb command in stac92xx_post_suspend caused the audio to stop
working after resuming from S3 mode on HP laptops with the VREF-pin
mute-LED control. Removing relevant post_suspend registering.
Although removing D3 on AFG is no optimal solution, the impact should
be small in comparison with the broken S3/S4.
Signed-off-by: Charles Chin <Charles.Chin@idt.com> Cc: <stable@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Jean Delvare [Wed, 30 Nov 2011 16:36:39 +0000 (17:36 +0100)]
drm/radeon/kms: Skip ACPI call to ATIF when possible
I am under the impression that it only makes sense to call the ATIF
method if the graphics device has an ACPI handle attached. So we could
skip the call altogether if there is no such handle.
Signed-off-by: Jean Delvare <jdelvare@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Jean Delvare [Wed, 30 Nov 2011 16:26:36 +0000 (17:26 +0100)]
drm/radeon/kms: Hide debugging message
Use the proper macro to issue the debugging message in
radeon_atif_call(). Otherwise we spam the log of many systems with a
message which looks like an error message of unknown origin, and could
thus confuse the user. Commit dc77de12dde95c8da39e4c417eb70c7d445cf84b
was a first step in this direction, but was not sufficient IMHO.
Signed-off-by: Jean Delvare <jdelvare@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@kernel.org Reviewed-by: Mario Kleiner <mario.kleiner@tuebingen.mpg.de> Tested-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 1 Dec 2011 09:01:55 +0000 (09:01 +0000)]
Merge branch 'drm-nouveau-fixes' of git://git.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes
* 'drm-nouveau-fixes' of git://git.freedesktop.org/git/nouveau/linux-2.6:
drm/nv50/disp: silence compiler warning
drm/nouveau: fix oopses caused by clear being called on unpopulated ttms
drm/nouveau: Keep RAMIN heap within the channel.
drm/nvd0/disp: fix sor dpms typo, preventing dpms on in some situations
drm/nvc0/gr: fix TP init for transform feedback offset queries
drm/nouveau: add dumb ioctl support
RongQing.Li [Thu, 1 Dec 2011 04:43:07 +0000 (23:43 -0500)]
net/core: fix rollback handler in register_netdevice_notifier
Within nested statements, the break statement terminates only the
do, for, switch, or while statement that immediately encloses it,
So replace the break with goto.
Signed-off-by: RongQing.Li <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 30 Nov 2011 12:10:53 +0000 (12:10 +0000)]
sch_red: fix red_calc_qavg_from_idle_time
Since commit a4a710c4a7490587 (pkt_sched: Change PSCHED_SHIFT from 10 to
6) it seems RED/GRED are broken.
red_calc_qavg_from_idle_time() computes a delay in us units, but this
delay is now 16 times bigger than real delay, so the final qavg result
smaller than expected.
Use standard kernel time services since there is no need to obfuscate
them.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Only use the primary address of the bond device
for master_ip. This will prevent changing the ARP source
address in Active-Backup mode whenever a secondry address
is added to the bond device.
Signed-off-by: Henrik Saavedra Persson <henrik.e.persson@ericsson.com> Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@drr.davemloft.net>
Linus Torvalds [Thu, 1 Dec 2011 00:24:24 +0000 (16:24 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: twl: fix twl4030 support for smps regulators
regulator: fix use after free bug
regulator: aat2870: Fix the logic of checking if no id is matched in aat2870_get_regulator
ARM: 7181/1: Restrict kprobes probing SWP instructions to ARMv5 and below
The SWP instruction is deprecated on ARMv6 and with ARMv7 it will be
UNDEFINED when CONFIG_SWP_EMULATE is selected. In this case, probing a
SWP instruction will cause an oops when the kprobes emulation code
executes an undefined instruction.
As the SWP instruction should be rare or non-existent in kernels for
ARMv6 and later, we can simply avoid these problems by not allowing
probing of these.
Reported-by: Leif Lindholm <leif.lindholm@arm.com> Tested-by: Leif Lindholm <leif.lindholm@arm.com> Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org> Signed-off-by: Jon Medhurst <tixy@yxit.co.uk> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ARM: 7180/1: Change kprobes testcase with unpredictable STRD instruction
There is a kprobes testcase for the instruction "strd r2, [r3], r4".
This has unpredictable behaviour as it uses r3 for register writeback
addressing and also stores it to memory.
On a cortex A9, this testcase would fail because the instruction writes
the updated value of r3 to memory, whereas the kprobes emulation code
writes the original value.
Fix this by changing testcase to used r5 instead of r3.
Reported-by: Leif Lindholm <leif.lindholm@arm.com> Tested-by: Leif Lindholm <leif.lindholm@arm.com> Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org> Signed-off-by: Jon Medhurst <tixy@yxit.co.uk> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Eric Dumazet [Tue, 29 Nov 2011 20:05:55 +0000 (20:05 +0000)]
ipv4: fix lockdep splat in rt_cache_seq_show
After commit f2c31e32b378 (fix NULL dereferences in check_peer_redir()),
dst_get_neighbour() should be guarded by rcu_read_lock() /
rcu_read_unlock() section.
Reported-by: Miles Lane <miles.lane@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandre Oliva [Wed, 30 Nov 2011 18:43:00 +0000 (13:43 -0500)]
Btrfs: skip block groups without enough space for a cluster
We test whether a block group has enough free space to hold the
requested block, but when we're doing clustered allocation, we can
save some cycles by testing whether it has enough room for the cluster
upfront, otherwise we end up attempting to set up a cluster and
failing. Only in the NO_EMPTY_SIZE loop do we attempt an unclustered
allocation, and by then we'll have zeroed the cluster size, so this
patch won't stop us from using the block group as a last resort.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Alexandre Oliva [Wed, 30 Nov 2011 18:43:00 +0000 (13:43 -0500)]
Btrfs: start search for new cluster at the beginning
Instead of starting at zero (offset is always zero), request a cluster
starting at search_start, that denotes the beginning of the current
block group.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Alexandre Oliva [Wed, 30 Nov 2011 18:43:00 +0000 (13:43 -0500)]
Btrfs: reset cluster's max_size when creating bitmap
The field that indicates the size of the largest contiguous chunk of
free space in the cluster is not initialized when setting up bitmaps,
it's only increased when we find a larger contiguous chunk. We end up
retaining a larger value than appropriate for highly-fragmented
clusters, which may cause pointless searches for large contiguous
groups, and even cause clusters that do not meet the density
requirements to be set up.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Alexandre Oliva [Mon, 28 Nov 2011 14:04:43 +0000 (12:04 -0200)]
Btrfs: initialize new bitmaps' list
We're failing to create clusters with bitmaps because
setup_cluster_no_bitmap checks that the list is empty before inserting
the bitmap entry in the list for setup_cluster_bitmap, but the list
field is only initialized when it is restored from the on-disk free
space cache, or when it is written out to disk.
Besides a potential race condition due to the multiple use of the list
field, filesystem performance severely degrades over time: as we use
up all non-bitmap free extents, the try-to-set-up-cluster dance is
done at every metadata block allocation. For every block group, we
fail to set up a cluster, and after failing on them all up to twice,
we fall back to the much slower unclustered allocation.
To make matters worse, before the unclustered allocation, we try to
create new block groups until we reach the 1% threshold, which
introduces additional bitmaps and thus block groups that we'll iterate
over at each metadata block request.
Mike Fleetwood [Fri, 18 Nov 2011 18:55:01 +0000 (18:55 +0000)]
Btrfs: Don't error on resizing FS to same size
It seems overly harsh to fail a resize of a btrfs file system to the
same size when a shrink or grow would succeed. User app GParted trips
over this error. Allow it by bypassing the shrink or grow operation.
Signed-off-by: Mike Fleetwood <mike.fleetwood@googlemail.com>
Miao Xie [Fri, 18 Nov 2011 09:43:00 +0000 (17:43 +0800)]
Btrfs: fix deadlock on metadata reservation when evicting a inode
When I ran the xfstests, I found the test tasks was blocked on meta-data
reservation.
By debugging, I found the reason of this bug:
start transaction
|
v
reserve meta-data space
|
v
flush delay allocation -> iput inode -> evict inode
^ |
| v
wait for delay allocation flush <- reserve meta-data space
And besides that, the flush on evicting inode will block the thread, which
is reclaiming the memory, and make oom happen easily.
Fix this bug by skipping the flush step when evicting inode.
Younes Manton [Tue, 22 Nov 2011 19:58:31 +0000 (14:58 -0500)]
drm/nouveau: Keep RAMIN heap within the channel.
The entire RAMIN is allocated to be 'size', but the heap is
specified as 'base' + 'size' inside RAMIN, so it will overflow
past RAMIN by 'base' bytes on NV50+ and clobber other allocatons
unless it's size is adjusted.
Signed-off-by: Younes Manton <younes.m@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>