Michal Miroslaw [Tue, 12 May 2009 20:49:26 +0000 (13:49 -0700)]
PCI quirk: unhide 'Overflow' device on i828{6,7}5P/PE chipsets
Some BIOSes hide 'overflow' device (dev #6) for i82875P/PE chipsets.
The same happens for i82865P/PE. Add a quirk to enable this device.
This allows i82875 EDAC driver to bind to chipset's dev #6 and not
dev #0 as the latter is used by AGP driver.
On my laptop (i82865P based) ACPI code is disabling this device
again in \_SB.PCI0._CRS method (called at least at PNP init time).
This can be easily worked around by patching DSDT.
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl> Acked-by: Doug Thompson <norsk5@yahoo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Addition of one unknown subsystem identifier to the quirks handler for
chipset i82855GM_HB on notebook Asus A6L. This exposes the otherwise
hidden SMBus controller within the south bridge ICH4-M.
Signed-off-by: Mats Erik Andersson <mats.andersson@gisladisker.se> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Andrew Patterson [Wed, 22 Apr 2009 22:52:09 +0000 (16:52 -0600)]
PCI: Add support for turning PCIe ECRC on or off
Adds support for PCI Express transaction layer end-to-end CRC checking
(ECRC). This patch will enable/disable ECRC checking by setting/clearing
the ECRC Check Enable and/or ECRC Generation Enable bits for devices that
support ECRC.
The ECRC setting is controlled by the "pci=ecrc=<policy>" command-line
option. If this option is not set or is set to 'bios", the enable and
generation bits are left in whatever state that firmware/BIOS set them to.
The "off" setting turns them off, and the "on" option turns them on (if the
device supports it).
Turning ECRC on or off can be a data integrity versus performance
tradeoff. In theory, turning it on will catch more data errors, turning
it off means possibly better performance since CRC does not need to be
calculated by the PCIe hardware and packet sizes are reduced.
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PCI PM: Follow PCI_PM_CTRL_NO_SOFT_RESET during transitions from D3
According to the PCI PM specification (PCI Bus Power Management
Interface Specification, Rev. 1.2, Section 5.4.1) we are supposed to
reinitialize devices that have PCI_PM_CTRL_NO_SOFT_RESET clear during
all transitions from PCI_D3hot to PCI_D0, but we only do it if the
device's current_state field is equal to PCI_UNKNOWN.
This may lead to problems if a device with PCI_PM_CTRL_NO_SOFT_RESET
unset is put into PCI_D3hot at run time by its driver and
pci_set_power_state() is used to put it back into PCI_D0, because in
that case the device will remain uninitialized after
pci_set_power_state() has returned. Prevent that from happening by
modifying pci_raw_set_power_state() to reinitialize devices with
PCI_PM_CTRL_NO_SOFT_RESET unset during all transitions from D3 to D0.
Cc: stable@kernel.org Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yu Zhao [Wed, 20 May 2009 09:11:57 +0000 (17:11 +0800)]
PCI: fix SR-IOV function dependency link problem
PCIe root complex integrated endpoint does not implement ARI, so this
kind of endpoint uses 3-bit function number. The function dependency
link of the integrated endpoint should be calculated using the device
number plus the value from function dependency link register.
Normal endpoint always implements ARI and the function dependency link
register contains 8-bit function number (i.e. `devfn' from software's
perspective).
PCI MSI: let drivers retry when not enough vectors
pci_enable_msix currently returns -EINVAL if you ask
for more vectors than supported by the device, which would
typically cause fallback to regular interrupts.
It's better to return the table size, making the driver retry
MSI-X with less vectors.
Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Shaohua Li [Mon, 8 Jun 2009 01:27:25 +0000 (09:27 +0800)]
PCI: disable ASPM on VIA root-port-under-bridge configurations
VIA has a strange chipset, it has root port under a bridge. Disable ASPM
for such strange chipset.
Cc: stable@kernel.org Tested-by: Wolfgang Denk <wd@denx.de> Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Alex Chiang [Tue, 31 Mar 2009 15:24:17 +0000 (09:24 -0600)]
PCI Hotplug: cpqphp: don't use pci_find_slot()
Convert uses of pci_find_slot to modern API.
In the conversion sites, we end up calling pci_dev_put() right away.
This may seem like it misses the entire point of doing something like
pci_get_bus_and_slot(), since we drop the reference so soon, but it turns
out we don't actually do much with the returned pci_dev.
I plan on untangling cpqphp further, but clearly cpqphp never worried too
much about a properly refcounted pci_dev anyway. For now, this conversion
seems reasonable, as it gets rid of the last in-tree caller of pci_find_slot.
Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Alex Chiang [Tue, 31 Mar 2009 15:24:07 +0000 (09:24 -0600)]
PCI Hotplug: cpqphp: eliminate dead code - PCI_ScanBusNonBridge
I have no clue what the original intent here was, but the code as
written is useless.
The old dbg() statement above the old callsite might lead one to think
that at one point, there was supposed to be some recursion, but any
sense of sanity here has been lost to the ravages of time.
Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Alex Chiang [Tue, 31 Mar 2009 15:24:02 +0000 (09:24 -0600)]
PCI Hotplug: cpqphp: clean up accesses to pcibios_get_irq_routing_table()
Instead of making multiple calls to pcibios_get_irq_routing_table, let's
just do it once and save the answer.
The reason we were making multiple calls is because we liked to calculate
its length and perform some loop over it. Instead of open-coding the length
calculation every time, provide it in an inline helper function.
Finally, since pci_print_IRQ_route() is used only for debug, let's only
do it when cpqhp_debug is set.
Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Fri, 24 Apr 2009 03:49:25 +0000 (20:49 -0700)]
PCI: improve resource allocation under transparent bridges
We could run out of space under under 4g, but devices under transparent
bridges can use 64bit resources, so keep trying on the parent bus until
we hit a non-transparent bridge.
Impact: better support for assigning unassigned resources
Reviewed-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Fri, 24 Apr 2009 03:48:32 +0000 (20:48 -0700)]
PCI/x86: don't assume prefetchable ranges are 64bit
We should not assign 64bit ranges to PCI devices that only take 32bit
prefetchable addresses.
Try to set IORESOURCE_MEM_64 in 64bit resource of pci_device/pci_bridge
and make the bus resource only have that bit set when all devices under
it support 64bit prefetchable memory. Use that flag to allocate
resources from that range.
Reported-by: Yannick <yannick.roehlly@free.fr> Reviewed-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PCIE: remove driver_data direct access of struct device
In the near future, the driver core is going to not allow direct access
to the driver_data pointer in struct device. Instead, the functions
dev_get_drvdata() and dev_set_drvdata() should be used. These functions
have been around since the beginning, so are backwards compatible with
all older kernel versions.
PCI MSI: Remove unused/obsolete macros and definitions
Impact: cleanup, spec compliance
This patch does:
- Remove unused msi/msix_enable/disable macros.
User should use msi/msix_set_enable() functions instead.
- Remove unused msix_mask/unmask/pending macros.
These macros are useless because they are not based on any of
the PCI Local Bus Specifications properly.
It seems that they were written based on a draft of PCI spec,
and that the draft was the MSI-X ECN that underwent membership
review in September 2002.
(* In the draft, the size of a entry in MSI-X table was 64bit,
containing 32bit message data and DWORD aligned lower address
plus a pending bit and a mask bit.(30+1+1bit) The higher
address was placed in MSI-X capability structure and shared
by all entries.)
- Remove PCI_MSIX_FLAGS_BITMASK.
This definition also come from the draft ECN.
Peter Botha [Wed, 10 Jun 2009 00:16:32 +0000 (17:16 -0700)]
char: mxser, fix ISA board lookup
There's a bug in the mxser kernel module that still appears in the
2.6.29.4 kernel.
mxser_get_ISA_conf takes a ioaddress as its first argument, by passing the
not of the ioaddr, you're effectively passing 0 which means it won't be
able to talk to an ISA card. I have tested this, and removing the !
fixes the problem.
Jan Kara [Tue, 9 Jun 2009 23:26:26 +0000 (16:26 -0700)]
jbd: fix race in buffer processing in commit code
In commit code, we scan buffers attached to a transaction. During this
scan, we sometimes have to drop j_list_lock and then we recheck whether
the journal buffer head didn't get freed by journal_try_to_free_buffers().
But checking for buffer_jbd(bh) isn't enough because a new journal head
could get attached to our buffer head. So add a check whether the journal
head remained the same and whether it's still at the same transaction and
list.
This is a nasty bug and can cause problems like memory corruption (use after
free) or trigger various assertions in JBD code (observed).
Signed-off-by: Jan Kara <jack@suse.cz> Cc: <stable@kernel.org> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ian Kent [Tue, 9 Jun 2009 23:26:24 +0000 (16:26 -0700)]
autofs4: remove hashed check in validate_wait()
The recent ->lookup() deadlock correction required the directory inode
mutex to be dropped while waiting for expire completion. We were
concerned about side effects from this change and one has been identified.
I saw several error messages.
They cause autofs to become quite confused and don't really point to the
actual problem.
Things like:
handle_packet_missing_direct:1376: can't find map entry for (43,1827932)
which is usually totally fatal (although in this case it wouldn't be
except that I treat is as such because it normally is).
do_mount_direct: direct trigger not valid or already mounted
/test/nested/g3c/s1/ss1
which is recoverable, however if this problem is at play it can cause
autofs to become quite confused as to the dependencies in the mount tree
because mount triggers end up mounted multiple times. It's hard to
accurately check for this over mounting case and automount shouldn't need
to if the kernel module is doing its job.
There was one other message, similar in consequence of this last one but I
can't locate a log example just now.
When checking if a mount has already completed prior to adding a new mount
request to the wait queue we check if the dentry is hashed and, if so, if
it is a mount point. But, if a mount successfully completed while we
slept on the wait queue mutex the dentry must exist for the mount to have
completed so the test is not really needed.
Mounts can also be done on top of a global root dentry, so for the above
case, where a mount request completes and the wait queue entry has already
been removed, the hashed test returning false can cause an incorrect
callback to the daemon. Also, d_mountpoint() is not sufficient to check
if a mount has completed for the multi-mount case when we don't have a
real mount at the base of the tree.
Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Tue, 9 Jun 2009 23:26:23 +0000 (16:26 -0700)]
shm: fix unused warnings on nommu
The massive nommu update (8feae131) resulted in these warnings:
ipc/shm.c: In function `sys_shmdt':
ipc/shm.c:974: warning: unused variable `size'
ipc/shm.c:972: warning: unused variable `next'
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
cls_cgroup: Fix oops when user send improperly 'tc filter add' request
r8169: fix crash when large packets are received
Linus Torvalds [Tue, 9 Jun 2009 15:41:22 +0000 (08:41 -0700)]
Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
md/raid5: fix bug in reshape code when chunk_size decreases.
md/raid5 - avoid deadlocks in get_active_stripe during reshape
md/raid5: use conf->raid_disks in preference to mddev->raid_disk
it turns out that we need clear cpus_hardware_enabled in that case.
Reported-and-tested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Minoru Usui [Tue, 9 Jun 2009 11:03:09 +0000 (04:03 -0700)]
cls_cgroup: Fix oops when user send improperly 'tc filter add' request
I found a bug in cls_cgroup_change() in cls_cgroup.c.
cls_cgroup_change() expected tca[TCA_OPTIONS] was set from user space properly,
but tc in iproute2-2.6.29-1 (which I used) didn't set it.
In the current source code of tc in git, it set tca[TCA_OPTIONS].
If we always use a newest iproute2 in git when we use cls_cgroup,
we don't face this oops probably.
But I think, kernel shouldn't panic regardless of use program's behaviour.
Signed-off-by: Minoru Usui <usui@mxm.nes.nec.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 9 Jun 2009 11:01:02 +0000 (04:01 -0700)]
r8169: fix crash when large packets are received
Michael Tokarev reported receiving a large packet could crash
a machine with RTL8169 NIC.
( original thread at http://lkml.org/lkml/2009/6/8/192 )
Problem is this driver tells that NIC frames up to 16383 bytes
can be received but provides skb to rx ring allocated with
smaller sizes (1536 bytes in case standard 1500 bytes MTU is used)
When a frame larger than what was allocated by driver is received,
dma transfert can occurs past the end of buffer and corrupt
kernel memory.
Fix is to tell to NIC what is the maximum size a frame can be.
This bug is very old, (before git introduction, linux-2.6.10), and
should be backported to stable versions.
Reported-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
NeilBrown [Tue, 9 Jun 2009 06:32:22 +0000 (16:32 +1000)]
md/raid5: fix bug in reshape code when chunk_size decreases.
Now that we support changing the chunksize, we calculate
"reshape_sectors" to be the max of number of sectors in old
and new chunk size.
However there is one please where we still use 'chunksize'
rather than 'reshape_sectors'.
This causes a reshape that reduces the size of chunks to freeze.
NeilBrown [Tue, 9 Jun 2009 04:39:59 +0000 (14:39 +1000)]
md/raid5 - avoid deadlocks in get_active_stripe during reshape
md has functionality to 'quiesce' and array so that all pending
IO completed and no new IO starts. This is used to achieve a
stable state before making internal changes.
Currently this quiescing applies equally to normal IO, resync
IO, and reshape IO.
However there is a problem with applying it to reshape IO.
Reshape can have multiple 'stripe_heads' that must be active together.
If the quiesce come between allocating the first and the last of
such a collection, then we deadlock, as the last will not be allocated
until the quiesce is lifted, the quiesce will not be lifted until the
first (which has been allocated) gets used, and that first cannot be
used until the last is allocated.
It is not necessary to inhibit reshape IO when a quiesce is
requested. Those places in the code that require a full quiesce will
ensure the reshape thread is not running at all.
So allow reshape requests to get access to new stripe_heads without
being blocked by a 'quiesce'.
This only affects in-place reshapes (i.e. where the array does not
grow or shrink) and these are only newly supported. So this patch is
not needed in earlier kernels.
Linus Torvalds [Mon, 8 Jun 2009 19:31:53 +0000 (12:31 -0700)]
async: Fix lack of boot-time console due to insufficient synchronization
Our async work synchronization was broken by "async: make sure
independent async domains can't accidentally entangle" (commit d5a877e8dd409d8c702986d06485c374b705d340), because it would report
the wrong lowest active async ID when there was both running and
pending async work.
This caused things like no being able to read the root filesystem,
resulting in missing console devices and inability to run 'init',
causing a boot-time panic.
This fixes it by properly returning the lowest pending async ID: if
there is any running async work, that will have a lower ID than any
pending work, and we should _not_ look at the pending work list.
There were alternative patches from Jaswinder and James, but this one
also cleans up the code by removing the pointless 'ret' variable and
the unnecesary testing for an empty list around 'for_each_entry()' (if
the list is empty, the for_each_entry() thing just won't execute).
Fixes-bug: http://bugzilla.kernel.org/show_bug.cgi?id=13474 Reported-and-tested-by: Chris Clayton <chris2553@googlemail.com> Cc: Jaswinder Singh Rajput <jaswinder@kernel.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 8 Jun 2009 16:22:53 +0000 (09:22 -0700)]
Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
MIPS: Outline udelay and fix a few issues.
MIPS: ioctl.h: Fix headers_check warnings
MIPS: Cobalt: PCI bus is always required to obtain the board ID
MIPS: Kconfig: Remove "Support for" from Cavium system type
MIPS: Sibyte: Honor CONFIG_CMDLINE
SSB: BCM47xx: Export ssb_watchdog_timer_set
Ralf Baechle [Sat, 28 Feb 2009 09:44:28 +0000 (09:44 +0000)]
MIPS: Outline udelay and fix a few issues.
Outlining fixes the issue were on certain CPUs such as the R10000 family
the delay loop would need an extra cycle if it overlaps a cacheline
boundary.
The rewrite also fixes build errors with GCC 4.4 which was changed in
way incompatible with the kernel's inline assembly.
Relying on pure C for computation of the delay value removes the need for
explicit. The price we pay is a slight slowdown of the computation - to
be fixed on another day.
Linus Torvalds [Mon, 8 Jun 2009 15:29:31 +0000 (08:29 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 5543/1: arm: serial amba: add missing declaration in serial.h
[ARM] pxa: fix pxa27x_udc default pullup GPIO
[ARM] pxa/imote2: fix UCAM sensor board ADC model number
mx[23]: don't put clock lookups in __initdata
fix oops when using console=ttymxcN with N > 0
[ARM] ARMv7 errata: only apply fixes when running on applicable CPU
[ARM] 5534/1: kmalloc must return a cache line aligned buffer
Linus Torvalds [Mon, 8 Jun 2009 14:53:59 +0000 (07:53 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
sdhci-of: Fix the wrong accessor to HOSTVER register
mvsdio: fix config failure with some high speed SDHC cards
mvsdio: ignore high speed timing requests from the core
mmc/omap: Use disable_irq_nosync() from within irq handlers.
sdhci-of: Add fsl,esdhc as a valid compatible to bind against
mvsdio: allow automatic loading when modular
mxcmmc: Fix missing return value checking in DMA setup code.
mxcmmc : Reset the SDHC hardware if software timeout occurs.
omap_hsmmc: Trivial fix for a typo in comment
mxcmmc: decrease minimum frequency to make MMC cards work
Avi Kivity [Sat, 6 Jun 2009 09:34:39 +0000 (12:34 +0300)]
KVM: Explicity initialize cpus_hardware_enabled
Under CONFIG_MAXSMP, cpus_hardware_enabled is allocated from the heap and
not statically initialized. This causes a crash on reboot when kvm thinks
vmx is enabled on random nonexistent cpus and accesses nonexistent percpu
lists.
Fix by explicitly clearing the variable.
Cc: stable@kernel.org Reported-and-tested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Avi Kivity <avi@redhat.com>
[ARM] 5543/1: arm: serial amba: add missing declaration in serial.h
This header is sometimes included in the uncompress stage to get
register values, but no <linux/amba/bus.h> can be included there.
So declare "struct amba_device" here before using it in a prototype.
Signed-off-by: Alessandro Rubini <rubini@unipv.it> Acked-by: Andrea Gallo <andrea.gallo@stericsson.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Sergei Shtylyov [Sun, 7 Jun 2009 11:52:50 +0000 (13:52 +0200)]
pdc202xx_old: fix resetproc() method
pdc202xx_reset() calls pdc202xx_reset_host() twice, for both channels, while
that function actually twiddles the single, shared software reset bit -- the
net effect is a duplicated reset and horrendous 4 second delay happening not
only on a channel reset but also when dma_lost_irq() and dma_clear() methods
are called. Fold pdc202xx_reset_host() into pdc202xx_reset(), fix printk(),
and move it before the actual reset...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Sergei Shtylyov [Sun, 7 Jun 2009 11:52:50 +0000 (13:52 +0200)]
pdc202xx_old: fix 'pdc20246_dma_ops'
Commit ac95beedf8bc97b24f9540d4da9952f07221c023 (ide: add struct ide_port_ops
(take 2)) erroneously converted the driver's dma_timeout() and dma_lost_irq()
methods to call the driver's resetproc() method regardless of whether it was
defined for this specific controller while it hadn't been defined and hence
called for PDC20246. So the dma_clear() method, the successor of dma_timeout(),
shouldn't exist and the dma_lost_irq() method should be standard for PDC20246.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Linus Torvalds [Sat, 6 Jun 2009 21:33:54 +0000 (14:33 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
x86/pci: fix mmconfig detection with 32bit near 4g
PCI: use fixed-up device class when configuring device
Hugh Dickins [Sat, 6 Jun 2009 20:18:09 +0000 (21:18 +0100)]
integrity: fix IMA inode leak
CONFIG_IMA=y inode activity leaks iint_cache and radix_tree_node objects
until the system runs out of memory. Nowhere is calling ima_inode_free()
a.k.a. ima_iint_delete(). Fix that by calling it from destroy_inode().
Linus Torvalds [Sat, 6 Jun 2009 19:18:14 +0000 (12:18 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
ext3/4 with synchronous writes gets wedged by Postfix
Fix nobh_truncate_page() to not pass stack garbage to get_block()
Al Viro [Wed, 13 May 2009 18:13:40 +0000 (19:13 +0100)]
ext3/4 with synchronous writes gets wedged by Postfix
OK, that's probably the easiest way to do that, as much as I don't like it...
Since iget() et.al. will not accept I_FREEING (will wait to go away
and restart), and since we'd better have serialization between new/free
on fs data structures anyway, we can afford simply skipping I_FREEING
et.al. in insert_inode_locked().
We do that from new_inode, so it won't race with free_inode in any interesting
ways and it won't race with iget (of any origin; nfsd or in case of fs
corruption a lookup) since both still will wait for I_LOCK.
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Jan Kara <jack@suse.cz> Tested-by: David Watson <dbwatson@ukfsn.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Theodore Ts'o [Tue, 12 May 2009 11:37:56 +0000 (07:37 -0400)]
Fix nobh_truncate_page() to not pass stack garbage to get_block()
The nobh_truncate_page() function is used by ext2, exofs, and jfs. Of
these three, only ext2 and jfs's get_block() function pays attention
to bh->b_size --- which is normally always the filesystem blocksize
except when the get_block() function is called by either
mpage_readpage(), mpage_readpages(), or the direct I/O routines in
fs/direct_io.c.
Unfortunately, nobh_truncate_page() does not initialize map_bh before
calling the filesystem-supplied get_block() function. So ext2 and jfs
will try to calculate the number of blocks to map by taking stack
garbage and shifting it left by inode->i_blkbits. This should be
*mostly* harmless (except the filesystem will do some unnneeded work)
unless the stack garbage is less than filesystem's blocksize, in which
case maxblocks will be zero, and the attempt to find out whether or
not the filesystem has a hole at a given logical block will fail, and
the page cache entry might not get zero'ed out.
Also if the stack garbage in in map_bh->state happens to have the
BH_Mapped bit set, there could be an attempt to call readpage() on a
non-existent page, which could cause nobh_truncate_page() to return an
error when it should not.
Fix this by initializing map_bh->state and map_bh->size.
Fortunately, it's probably fairly unlikely that ext2 and jfs users
mount with nobh these days.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Alan Cox [Wed, 13 May 2009 14:02:27 +0000 (15:02 +0100)]
[libata] pata_ali: Use IGN_SIMPLEX
Some ALi devices report simplex if they have been disabled and re-enabled, and
restoring the byte does not work. Ignore it - the needed supporting logic is
already present for the SATA ULi ports.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: Fix oops and use after free during space balancing
Btrfs: set device->total_disk_bytes when adding new device
Linus Torvalds [Fri, 5 Jun 2009 18:53:44 +0000 (11:53 -0700)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
ata_piix: Add HP Compaq nc6000 to the broken poweroff list
ahci: add warning messages for hp laptops with broken suspend
pata_efar: fix PIO2 underclocking
pata_legacy: wait for async probing
Tejun Heo [Sat, 30 May 2009 11:50:12 +0000 (20:50 +0900)]
ahci: add warning messages for hp laptops with broken suspend
Harddisks on HP dv[4-6] and HDX18 fail to come online after resume on
earlier BIOSen. Fortunately, HP recently released BIOS updates for
all machines to fix the issue. Detect old BIOSen, warn the user to
update BIOS on boot and suspend attempts and fail suspend.
Kudos to all the bug reporters.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: kernel.org@epperson.homelinux.net Cc: emisca@gmail.com Cc: Gadi Cohen <dragon@wastelands.net> Cc: Paul Swanson <paul@procursa.com> Cc: s@ourada.org Cc: Trevor Davenport <trevor.davenport@gmail.com> Cc: corruptor1972 <steven_tierney@yahoo.co.uk> Cc: Victoria Wilson <mail@vwilson.co.uk> Cc: khiraly <khiraly.list@gmail.com> Cc: Sean <wollombi@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Sergei Shtylyov [Mon, 1 Jun 2009 19:42:10 +0000 (22:42 +0300)]
pata_efar: fix PIO2 underclocking
Fix the PIO mode 2 using mode 0 timings -- this driver should enable the
fast timing bank starting with PIO2, just like the PIIX/ICH drivers do.
Also, fix/rephrase some comments while at it.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
James Bottomley [Fri, 5 Jun 2009 14:41:39 +0000 (10:41 -0400)]
pata_legacy: wait for async probing
The basic problem here that pata_legacy attaches the host, sees if it found
any devices and detaches it if none were found. With async probing, it's not
waiting until discovery is finished before deciding it has no devices and
trying the detach leading to this warning:
One way to fix it would be to put an async_synchronize_full() before looking
for devices, which this patch does. A better way might be to separate libata
into its own domain and only wait for that.
Reported-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Dave Jones [Fri, 5 Jun 2009 16:37:07 +0000 (12:37 -0400)]
[CPUFREQ] powernow-k8: check space_id of _PCT registers to be FFH
The powernow-k8 driver checks to see that the Performance Control/Status
Registers are declared as FFH (functional fixed hardware) by the BIOS.
However, this check got broken in the commit: 0e64a0c982c06a6b8f5e2a7f29eb108fdf257b2f
[CPUFREQ] checkpatch cleanups for powernow-k8
Fix based on an original patch from Naga Chumbalkar.
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Cc: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Dave Jones <davej@redhat.com>
Alan Cox [Fri, 5 Jun 2009 10:56:18 +0000 (11:56 +0100)]
ivtv: Fix PCI DMA direction
The ivtv stream buffers may be for receive or for send but the attached
sg handle is always destined cpu->device. We flush it correctly but the
allocation is wrongly done with the same type as the buffers.
See bug: http://bugzilla.kernel.org/show_bug.cgi?id=13385
(Note this doesn't close the bug - it fixes the ivtv part and in turn
the logging next shows up some rather alarming DMA sg list warnings in
libata)
Signed-off-by: Alan Cox <alan@linux.intel.com> Acked-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Thu, 4 Jun 2009 23:29:09 +0000 (16:29 -0700)]
ptrace: revert "ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic"
Commit 95a3540da9c81a5987be810e1d9a83640a366bd5 ("ptrace_detach: the wrong
wakeup breaks the ERESTARTxxx logic") removed the "extra"
wake_up_process() from ptrace_detach(), but as Jan pointed out this breaks
the compatibility.
I believe the changelog is right and this wake_up() is wrong in many
ways, but GDB assumes that ptrace(PTRACE_DETACH, child, 0, 0) always
wakes up the tracee.
Despite the fact this breaks SIGNAL_STOP_STOPPED/group_stop_count logic,
and despite the fact this wake_up_process() can break another
assumption: PTRACE_DETACH with SIGSTOP should leave the tracee in
TASK_STOPPED case. Because the untraced child can dequeue SIGSTOP and
call do_signal_stop() before ptrace_detach() calls wake_up_process().
Revert this change for now. We need some fixes even if we we want to keep
the current behaviour, but these fixes are not for 2.6.30.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Cc: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The "trace || CLONE_PTRACE" check in tracehook_report_clone() is not right,
- If the untraced task does clone(CLONE_PTRACE) the new child is not traced,
we must not queue SIGSTOP.
- If we forked the traced task, but the tracer exits and untraces both the
forking task and the new child (after copy_process() drops tasklist_lock),
we should not queue SIGSTOP too.
Change the code to check task_ptrace() != 0 instead. This is still racy, but
the race is harmless.
We can race with another tracer attaching to this child, or the tracer can
exit and detach in parallel. But giwen that we didn't do wake_up_new_task()
yet, the child must have the pending SIGSTOP anyway.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 4 Jun 2009 22:23:39 +0000 (15:23 -0700)]
Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm: ignore EDID with really tiny modes.
drm: don't associate _DRM_DRIVER maps with a master
drm/i915: intel_lvds.c fix section mismatch
drm: Hook up DPMS property handling in drm_crtc.c. Add drm_helper_connector_dpms.
drm: set permissions on edid file to 0444
drm: add newlines to text sysfs files
drm/radeon: fix ring free alignment calculations
drm: fix irq naming for kms drivers.
Salman Qazi [Thu, 4 Jun 2009 22:20:39 +0000 (15:20 -0700)]
drivers/char/mem.c: avoid OOM lockup during large reads from /dev/zero
While running 20 parallel instances of dd as follows:
#!/bin/bash
for i in `seq 1 20`; do
dd if=/dev/zero of=/export/hda3/dd_$i bs=1073741824 count=1 &
done
wait
on a 16G machine, we noticed that rather than just killing the processes,
the entire kernel went down. Stracing dd reveals that it first does an
mmap2, which makes 1GB worth of zero page mappings. Then it performs a
read on those pages from /dev/zero, and finally it performs a write.
The machine died during the reads. Looking at the code, it was noticed
that /dev/zero's read operation had been changed by 557ed1fa2620dc119adb86b34c614e152a629a80 ("remove ZERO_PAGE") from giving
zero page mappings to actually zeroing the page.
The zeroing of the pages causes physical pages to be allocated to the
process. But, when the process exhausts all the memory that it can, the
kernel cannot kill it, as it is still in the kernel mode allocating more
memory. Consequently, the kernel eventually crashes.
To fix this, I propose that when a fatal signal is pending during
/dev/zero read operation, we simply return and let the user process die.
Signed-off-by: Salman Qazi <sqazi@google.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Modified error return and comment trivially. - Linus] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Mason [Thu, 4 Jun 2009 19:34:51 +0000 (15:34 -0400)]
Btrfs: Fix oops and use after free during space balancing
The btrfs allocator uses list_for_each to walk the available block
groups when searching for free blocks. It starts off with a hint
to help find the best block group for a given allocation.
The hint is resolved into a block group, but we don't properly check
to make sure the block group we find isn't in the middle of being
freed due to filesystem shrinking or balancing. If it is being
freed, the list pointers in it are bogus and can't be trusted. But,
the code happily goes along and uses them in the list_for_each loop,
leading to all kinds of fun.
The fix used here is to check to make sure the block group we find really
is on the list before we use it. list_del_init is used when removing
it from the list, so we can do a proper check.
The allocation clustering code has a similar bug where it will trust
the block group in the current free space cluster. If our allocation
flags have changed (going from single spindle dup to raid1 for example)
because the drives in the FS have changed, we're not allowed to use
the old block group any more.
The fix used here is to check the current cluster against the
current allocation flags.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Rusty Russell [Wed, 3 Jun 2009 05:22:24 +0000 (14:52 +0930)]
lguest: fix 'unhandled trap 13' with CONFIG_CC_STACKPROTECTOR
We don't set up the canary; let's disable stack protector on boot.c so
we can get into lguest_init, then set it up. As a side effect,
switch_to_new_gdt() sets up %fs for us properly too.
Eric Anholt [Thu, 4 Jun 2009 11:18:14 +0000 (11:18 +0000)]
drm/i915: Remove a bad BUG_ON in the fence management code.
This could be triggered by a gtt mapping fault on 965 that decides to
remove the fence from another object that happens to be active currently.
Since the other object doesn't rely on the fence reg for its execution, we
don't wait for it to finish. We'll soon be not waiting on 915 most of the
time as well, so just drop the BUG_ON.
Yinghai Lu [Wed, 3 Jun 2009 07:13:13 +0000 (00:13 -0700)]
x86/pci: fix mmconfig detection with 32bit near 4g
Pascal reported and bisected a commit:
| x86/PCI: don't call e820_all_mapped with -1 in the mmconfig case
which broke one system system.
ACPI: Using IOAPIC for interrupt routing
PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 255
PCI: MCFG area at f0000000 reserved in ACPI motherboard resources
PCI: Using MMCONFIG for extended config space
it didn't have
PCI: updated MCFG configuration 0: base f0000000 segment 0 buses 0 - 63
anymore, and try to use 0xf000000 - 0xffffffff for mmconfig
For 32bit, mcfg_res->end could be 32bit only (if 64 resources aren't used)
So use end - 1 to pass the value in mcfg->end to avoid overflow.
We don't need to worry about the e820 path, they are always 64 bit.