Linus Torvalds [Wed, 4 Feb 2009 15:56:25 +0000 (07:56 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: (40 commits)
Blackfin arch: Remove outdated code
Blackfin arch: Fix udelay implementation
Blackfin arch: Update Copyright information
Blackfin arch: Add BF561 PPI POLS, POLC Masks
Blackfin arch: Update CM-BF527 kernel config
Blackfin arch: define bfin_memmap as static since it is only used here
Blackfin arch: cplb mananger: use a do...while loop rather than a for loop
Blackfin arch: fix bug - traps test case 19 for exception 0x2d fails
Blackfin arch: add platform device bfin_mii-bus and KSZ8893M switch driver platform resources to board files
Blackfin arch: build jtag tty driver as a module by default
Blackfin arch: fix 2 bugs related to debug
Blackfin arch: Add ANOMALY_05000380 to BF54x to kill the compile warning
Blackfin arch: Fix bug - 561 SMP kernel can't boot from jffs2
Blackfin arch: base SIC_IWR# programming on whether the MMR exists
Blackfin arch: read SYSCR on newer parts that mirror the bits of SWRST in it
Blackfin arch: fixup board init function name
Blackfin arch: drop CONFIG_I2C_BOARDINFO ifdefs
Blackfin arch: bfin_reset->_bfin_reset redirection no longer needed
Blackfin arch: sync reboot handler with version in u-boot
Blackfin arch: Faster Implementation of csum_tcpudp_nofold()
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc64: Kill bogus TPC/address truncation during 32-bit faults.
sparc: fixup for sparseirq changes
sparc64: Validate kernel generated fault addresses on sparc64.
sparc64: On non-Niagara, need to touch NMI watchdog in NOHZ mode.
sparc64: Implement NMI watchdog on capable cpus.
sparc: Probe PMU type and record in sparc_pmu_type.
sparc64: Move generic PCR support code to seperate file.
The removed version, with the loop registers saved on the stack, was
originally intended to work around the missing toolchain support for
LoopReg clobbers.
Since our toolchain now supports these, there is no point in keeping this
workaround. And since we don't touch LoopRegs anymore, we're no longer
subject to ANOMALY_05000312.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
Mike Frysinger [Wed, 4 Feb 2009 08:49:45 +0000 (16:49 +0800)]
Blackfin arch: cplb mananger: use a do...while loop rather than a for loop
use a do...while loop rather than a for loop to get slightly better
optimization and to avoid gcc "may be used uninitialized" warnings ...
we know that the [id]cplb_nr_bounds variables will never be 0, so this
is OK
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
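A minimal user-space sketch of the idea, not the actual CPLB manager code: when the element count is known to be at least one, a do...while loop makes it visible to the compiler that the result is assigned on every path. The names struct region and region_flags, and the table layout, are hypothetical.

```c
#include <stddef.h>

/* Hypothetical table entry: regions are sorted by ascending end address. */
struct region {
	unsigned long end;   /* first address past this region */
	unsigned long flags; /* attributes for addresses below 'end' */
};

/*
 * Return the flags of the first region whose end lies above addr, or the
 * flags of the last region if none does.
 *
 * Precondition: n >= 1. The loop body runs at least once, so 'flags' is
 * always assigned before the return. With a for loop the compiler has to
 * assume that n == 0 is possible and may warn about 'flags'.
 */
unsigned long region_flags(const struct region *tbl, size_t n,
			   unsigned long addr)
{
	unsigned long flags;
	size_t i = 0;

	do {
		flags = tbl[i].flags;
		if (addr < tbl[i].end)
			break;
	} while (++i < n);

	return flags;
}
```

The trade-off is that the precondition moves from the loop header into the caller's contract, which is why the commit message spells out that the bounds are never 0.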
Blackfin arch: Faster Implementation of csum_tcpudp_nofold()
Avoid conditional branch instructions during carry bit additions.
Special thanks to Bernd.
Simplify: Use ((len + proto) << 8) like every other __LITTLE_ENDIAN__ machine
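A portable C sketch of the technique described above, assuming a little-endian host. It is not the Blackfin assembly from the patch; add32_with_carry and csum_tcpudp_nofold_sketch are illustrative names. The carry out of each 32-bit addition is folded back in with a comparison instead of an if, which compilers typically turn into branch-free code.

```c
#include <stdint.h>

/*
 * 32-bit addition with end-around carry. (s < b) evaluates to 1 exactly
 * when a + b wrapped around, so the carry is added without a branch.
 * The second addition cannot wrap: after a wrap, s is at most 0xfffffffe.
 */
static inline uint32_t add32_with_carry(uint32_t a, uint32_t b)
{
	uint32_t s = a + b;

	return s + (s < b);
}

/*
 * Unfolded pseudo-header sum. saddr and daddr are expected in network
 * byte order as loaded from memory; len and proto are host-order values.
 * On little-endian hosts ((len + proto) << 8) lands both values where
 * the later 16-bit fold expects them.
 */
uint32_t csum_tcpudp_nofold_sketch(uint32_t saddr, uint32_t daddr,
				   uint16_t len, uint16_t proto,
				   uint32_t sum)
{
	sum = add32_with_carry(sum, saddr);
	sum = add32_with_carry(sum, daddr);
	sum = add32_with_carry(sum, ((uint32_t)len + proto) << 8);
	return sum;
}
```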
Sonic Zhang [Wed, 4 Feb 2009 08:49:45 +0000 (16:49 +0800)]
Blackfin arch: Fix bug - Run "reboot" hangs bf518-ezbrd
[Mike Frysinger <vapier.adi@gmail.com>:
- setup P_DEFAULT_BOOT_SPI_CS for every arch based on
the default bootrom behavior and convert all our boards
to it
- revert previous anomaly change ... bf51x is not affected
by anomaly 05000353]
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com> Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
Blackfin arch: explicit add a might sleep to gpio_free
According to the documentation, gpio_free should only be called from task
context. To make this more explicit, add a might_sleep() to all
implementations.
This patch changes the gpio_free implementations for the blackfin
architecture.
Signed-off-by: Uwe Kleine-Koenig <ukleinek@strlen.de> Cc: David Brownell <david-b@pacbell.net> Acked-by: Bryan Wu <cooloney@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Blackfin arch: reset POLAR setting when acquiring a gpio for the first time
when requesting a GPIO for the first time, the POLAR setting is not
set to a sane state. this can lead to indeterminate behavior that
cannot be resolved without an explicit write to the Blackfin port POLAR
register.
when requesting a GPIO for the first time via gpio_request(), the POLAR
setting for the GPIO in question should be set to sane state. this
should occur if the GPIO has not been allocated in any other way.
some examples:
- when doing something like "request_irq(); gpio_request();" on the
same GPIO, the POLAR setting should not be reset.
- when doing "gpio_request(); gpio_request();" on the same GPIO, the
POLAR setting should be reset only the first time and not the second.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
Blackfin arch: Fix Bug - Kernel does not boot if re-program clocks
On BF561, EBIU_SDGCTL bit 31 controls the SDRAM external data
path width, and is typically set to 0 for a 32-bit bus width. On other
Blackfin derivatives this bit should be set by default.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Bryan Wu <cooloney@kernel.org>
It made the following changes:
1. Decrement wbc->nr_to_write instead of nr_to_write
2. Decrement wbc->nr_to_write _only_ if wbc->sync_mode == WB_SYNC_NONE
3. If synced nr_to_write pages, stop only if wbc->sync_mode ==
WB_SYNC_NONE, otherwise keep going.
However, according to the commit message, the intention was to only make
change 3. Change 1 is a bug. Change 2 does not seem to be necessary,
and it breaks UBIFS expectations, so if needed, it should be done
separately later. And change 2 does not seem to be documented in the
commit message.
This patch does the following:
1. Undo changes 1 and 2
2. Add a comment explaining change 3 (it is very useful to have comments
in _code_, not only in the commit).
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Acked-by: Nick Piggin <npiggin@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
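A small user-space model of the behaviour this message describes, based only on my reading of the three numbered changes: the page budget is decremented in a local counter regardless of sync mode, and hitting zero ends the loop only for WB_SYNC_NONE. struct wbc_model and write_pages_model are hypothetical stand-ins, not the kernel's writeback code.

```c
/* Sync modes named in the commit message. */
enum sync_mode { WB_SYNC_NONE, WB_SYNC_ALL };

/* Hypothetical stand-in for the two writeback control fields discussed. */
struct wbc_model {
	long nr_to_write;
	enum sync_mode sync_mode;
};

/* Return how many of nr_dirty pages get written under this policy. */
long write_pages_model(const struct wbc_model *wbc, long nr_dirty)
{
	long nr_to_write = wbc->nr_to_write; /* local budget, as before change 1 */
	long written = 0;

	while (written < nr_dirty) {
		written++; /* stands in for writing one page */

		if (nr_to_write > 0) {
			nr_to_write--; /* decremented in every sync mode */
			/*
			 * Change 3: an exhausted budget only stops
			 * best-effort writeback. Data-integrity sync
			 * (WB_SYNC_ALL) must keep going.
			 */
			if (nr_to_write == 0 &&
			    wbc->sync_mode == WB_SYNC_NONE)
				break;
		}
	}
	return written;
}
```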
Linus Torvalds [Wed, 4 Feb 2009 00:52:44 +0000 (16:52 -0800)]
Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6
* 'linux-next' of git://git.infradead.org/ubifs-2.6:
UBIFS: remove fast unmounting
UBIFS: return sensible error codes
UBIFS: remount ro fixes
UBIFS: spelling fix 'date' -> 'data'
UBIFS: sync wbufs after syncing inodes and pages
UBIFS: fix LPT out-of-space bug (again)
UBIFS: fix no_chk_data_crc
UBIFS: fix assertions
UBIFS: ensure orphan area head is initialized
UBIFS: always clean up GC LEB space
UBIFS: add re-mount debugging checks
UBIFS: fix LEB list freeing
UBIFS: simplify locking
UBIFS: document dark_wm and dead_wm better
UBIFS: do not treat all data as short term
UBIFS: constify operations
UBIFS: do not commit twice
Linus Torvalds [Wed, 4 Feb 2009 00:50:20 +0000 (16:50 -0800)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
ocfs2: add quota call to ocfs2_remove_btree_range()
ocfs2: Wakeup the downconvert thread after a successful cancel convert
ocfs2: Access the xattr bucket only before modifying it.
configfs: Silence lockdep on mkdir(), rmdir() and configfs_depend_item()
ocfs2: Fix possible deadlock in ocfs2_write_dquot()
ocfs2: Push out dropping of dentry lock to ocfs2_wq
Linus Torvalds [Wed, 4 Feb 2009 00:49:54 +0000 (16:49 -0800)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
[XFS] Warn on transaction in flight on read-only remount
xfs: Check buffer lengths in log recovery
don't reallocate sxp variable passed into xfs_swapext
Upon further consideration, we actually should never see any
fault addresses for 32-bit tasks with the upper 32-bits set.
If it does ever happen, by definition it's a bug. Whatever
context created that fault would only have that fault satisfied
if we used the full 64-bit address. If we truncate it, we'll
always fault the wrong address and we'll always loop faulting
forever.
So catch such conditions and mark them as errors always. Log
the error and fail the fault.
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Keil [Tue, 3 Feb 2009 23:18:01 +0000 (15:18 -0800)]
e1000: Fix PCI enable to honor the need_ioport flag
On machines where no I/O ports are assigned, the call
to pci_enable_device() will fail even if need_ioport
is false; we need to use pci_enable_device_mem() here.
Signed-off-by: Karsten Keil <kkeil@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Dean Nelson [Tue, 3 Feb 2009 23:16:17 +0000 (15:16 -0800)]
sgi-xp: link XPNET's net_device_ops to its net_device structure
A recent patch by Stephen Hemminger to convert XPNET to use net_device_ops and
internal net_device_stats failed to link the net_device_ops structure to the
net_device structure. See commit e8ac9c55f28482f5b2f497a8e7eb90985db237c2
("xpnet: convert devices to new API").
Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Filip Aben [Tue, 3 Feb 2009 23:13:26 +0000 (15:13 -0800)]
hso: add new device id's
This patch adds a few device ID's. It also removes an ID that was used
in an internal engineering version of a device and will never see
commercial light. Even if this ID will be 'recycled' in the future,
which is very unlikely, we don't know what kind of device will be
behind it. Therefore it's safer to remove it.
Signed-off-by: Filip Aben <f.aben@option.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Blyakher [Fri, 23 Jan 2009 03:34:05 +0000 (21:34 -0600)]
[XFS] Warn on transaction in flight on read-only remount
Till VFS can correctly support read-only remount without racing,
use WARN_ON instead of BUG_ON on detecting transaction in flight
after quiescing filesystem.
Signed-off-by: Felix Blyakher <felixb@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
Dave Chinner [Thu, 22 Jan 2009 04:37:47 +0000 (15:37 +1100)]
xfs: Check buffer lengths in log recovery
Before trying to obtain, read or write a buffer,
check that the buffer length is actually valid. If
it is not valid, then something read in the recovery
process has been corrupted and we should abort
recovery.
Reported-by: Eric Sesterhenn <snakebyte@gmx.de> Tested-by: Eric Sesterhenn <snakebyte@gmx.de> Reviewed-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>
Linus Torvalds [Tue, 3 Feb 2009 15:39:55 +0000 (07:39 -0800)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: implement HORKAGE_1_5_GBPS and apply it to WD My Book
libata: add no penalty retry request for EH device handling routines
libata: improve probe failure handling
libata: add @spd_limit to sata_down_spd_limit()
libata: clear dev->ering in smarter way
libata: check onlineness before using SPD in sata_down_spd_limit()
libata: move ata_dev_disable() to libata-eh.c
libata: fix EH device failure handling
sata_nv: ck804 has borked hardreset too
ide/libata: fix ata_id_is_cfa() (take 4)
libata: fix kernel-doc warnings
ahci: add a module parameter to ignore the SSS flags for async scanning
sata_mv: Fix chip type for Hightpoint RocketRaid 1740/1742
[libata] sata_sil: Fix compilation error with libata debugging enabled
Change spin_locks to irqsave to prevent dead-locks.
Protect adding and deleting to/from dca_providers list.
Drop the lock during dca_sysfs_add_req() and dca_sysfs_remove_req() calls
as they might sleep (use GFP_KERNEL allocation).
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roel Kluin [Tue, 3 Feb 2009 07:19:50 +0000 (23:19 -0800)]
cassini/sungem: limit reaches -1, but 0 tested
    while (limit--)
        if (test())
            break;
    if (limit <= 0)
        goto test_failed;
In the last iteration, limit is decremented after the test to 0.
If just thereafter test() succeeds and a break occurs, the goto
still occurs because limit is 0.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
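The off-by-one can be reproduced outside the drivers. The sketch below keeps the pattern from the message in poll_buggy and shows one possible fix in poll_fixed, which tests for the -1 that the post-decrement leaves behind on exhaustion. All names here are hypothetical, and the actual patch may have resolved it differently.

```c
#include <stdbool.h>

/* Hypothetical poll callback type. */
typedef bool (*test_fn)(void *ctx);

/* Example callback: *ctx holds N, and the Nth call reports success. */
bool succeeds_on_nth_call(void *ctx)
{
	int *remaining = ctx;

	return --(*remaining) == 0;
}

/* Pattern from the commit message: success on the last try is missed. */
bool poll_buggy(int limit, test_fn test, void *ctx)
{
	while (limit--)
		if (test(ctx))
			break;
	if (limit <= 0)
		return false; /* also taken when the final attempt succeeded */
	return true;
}

/* One possible fix: only a fully exhausted loop leaves limit at -1. */
bool poll_fixed(int limit, test_fn test, void *ctx)
{
	while (limit--)
		if (test(ctx))
			break;
	if (limit < 0)
		return false;
	return true;
}
```

With a limit of 3 and a test that succeeds on the third call, poll_buggy reports failure while poll_fixed reports success; both agree when the test never succeeds.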
David S. Miller [Tue, 3 Feb 2009 06:08:15 +0000 (22:08 -0800)]
sparc64: Validate kernel generated fault addresses on sparc64.
In order to handle all of the cases of address calculation overflow
properly, we run sparc 32-bit processes in "address masking" mode
when running on a 64-bit kernel.
Address masking mode zeros out the top 32-bits of the address
calculated for every load and store instruction.
However, when we're in privileged mode we have to run with that
address masking mode disabled even when accessing userspace from
the kernel.
To "simulate" the address masking mode we clear the top-bits by
hand for 32-bit processes in the fault handler.
It is the responsibility of code in the compat layer to properly
zero extend addresses used to access userspace. If this isn't
followed properly we can get into a fault loop.
Say that the user address is 0xf0000000 but for whatever reason
the kernel code sign extends this to 64-bit, and then the kernel
tries to access the result.
In such a case we'll fault on address 0xfffffffff0000000 but the fault
handler will process that fault as if it were to address 0xf0000000.
We'll loop faulting forever because the fault never gets satisfied.
So add a check specifically for this case, when the kernel is faulting
on a user address access and the addresses don't match up.
This code path is sufficiently slow path, and this bug is sufficiently
painful to diagnose, that this kind of bug check is warranted.
Signed-off-by: David S. Miller <davem@davemloft.net>
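The 0xf0000000 example can be worked through in plain C. This is an illustrative sketch, not the sparc64 fault handler: mask_to_32bit models the by-hand masking for 32-bit processes, and kernel_fault_addr_ok is a hypothetical name for the added consistency check.

```c
#include <stdbool.h>
#include <stdint.h>

/* What the fault handler does for a 32-bit task: drop the top 32 bits. */
static inline uint64_t mask_to_32bit(uint64_t addr)
{
	return addr & 0xffffffffULL;
}

/* The compat-layer mistake described above: sign instead of zero extension. */
uint64_t sign_extend_32(uint32_t uaddr)
{
	return (uint64_t)(int64_t)(int32_t)uaddr;
}

/*
 * Hypothetical check for a kernel-mode access on behalf of a 32-bit
 * task: if masking changes the address, the handler would service a
 * different address than the one that faulted, and the fault would
 * repeat forever.
 */
bool kernel_fault_addr_ok(uint64_t fault_addr)
{
	return mask_to_32bit(fault_addr) == fault_addr;
}
```

Sign-extending 0xf0000000 yields 0xfffffffff0000000, which masks back to 0xf0000000; the mismatch is what the check reports.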
David S. Miller [Tue, 3 Feb 2009 05:57:48 +0000 (21:57 -0800)]
sparc64: On non-Niagara, need to touch NMI watchdog in NOHZ mode.
When we're idling in NOHZ mode, timer interrupts are not running.
Evidence of processing timer interrupts is what the NMI watchdog
uses to determine if the CPU is stuck.
On Niagara, we'll yield the cpu. This will make the cpu, at
worst, hang out in the hypervisor until an interrupt arrives.
This will prevent the NMI watchdog timer from firing.
However on non-Niagara we just loop executing instructions
which will cause the NMI watchdog to keep firing. It won't
see timer interrupts happening so it will think the cpu is
stuck.
Fix this by touching the NMI watchdog in the cpu idle loop
on non-Niagara machines.
Signed-off-by: David S. Miller <davem@davemloft.net>
Tejun Heo [Thu, 29 Jan 2009 11:31:36 +0000 (20:31 +0900)]
libata: implement HORKAGE_1_5_GBPS and apply it to WD My Book
3Gbps is often much more prone to transmission failures. It's usually
okay to let EH handle speed down after transmission failures, but some
WD My Book drives shut down completely after certain transmission
failures, and afterwards only power cycling can revive them. Combined
with the fact that external drives often end up with a cable assembly
which is longer than usual and more likely to have intervening gender,
this makes these drives very likely to shut down under certain
configurations, virtually rendering them unusable.
This patch implements HORKAGE_1_5_GBPS and applies it to WD My Book
such that 1.5Gbps is forced once the device is identified.
Please take a look at the following bz for related reports.
http://bugzilla.kernel.org/show_bug.cgi?id=9913
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Thu, 29 Jan 2009 11:31:35 +0000 (20:31 +0900)]
libata: add no penalty retry request for EH device handling routines
Let -EAGAIN from EH device handling routines trigger EH retry without
consuming its tries count. This will be used to implement link SPD
horkage which requires hardreset to adjust SPD without affecting other
EH decisions. As it bypasses the forward progress guarantee provided
by the tries count, the requester is responsible for ensuring forward
progress.
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Thu, 29 Jan 2009 11:31:34 +0000 (20:31 +0900)]
libata: improve probe failure handling
When link is flaky at high speed, it isn't uncommon for a device to
repeatedly fail probing sequence early after successfully negotiating
high link speed. This often leads to consecutive hotplug events
without successful probing.
This patch improves libata EH such that it remembers probing trials
and if there have been more than two unsuccessful trials in the past
60 seconds, slows down link speed to 1.5Gbps.
As link speed negotiation is the duty of the PHY layer proper, the
goal of this fallback mechanism is to provide the last resort when
everything else fails, which unfortunately happens not too
infrequently, so no fancy 6->3->1.5 speeding down or highest
successful transmission speed seen kind of logics (yet).
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
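A rough model of the "more than two unsuccessful trials in the past 60 seconds" rule, as a self-contained sketch. The ring size, the type and function names, and the choice to count the failure being recorded are assumptions made for illustration; libata's own bookkeeping lives in dev->ering.

```c
#include <stdbool.h>
#include <stddef.h>

#define PROBE_WINDOW_SECS 60 /* window from the commit message */
#define PROBE_MAX_FAILS   2  /* more than this many triggers the slowdown */
#define PROBE_LOG_SIZE    8  /* hypothetical ring size */

/* Zero-initialise before first use. */
struct probe_log {
	long when[PROBE_LOG_SIZE]; /* timestamps of failed trials, in seconds */
	size_t count;              /* total failures recorded so far */
};

/*
 * Record a failed probing trial at time 'now'. Returns true when the
 * link should be limited to 1.5Gbps.
 */
bool probe_failed(struct probe_log *log, long now)
{
	size_t i, n, recent = 0;

	/* Overwrite the oldest slot once the ring is full. */
	log->when[log->count % PROBE_LOG_SIZE] = now;
	log->count++;

	n = log->count < PROBE_LOG_SIZE ? log->count : PROBE_LOG_SIZE;
	for (i = 0; i < n; i++)
		if (now - log->when[i] < PROBE_WINDOW_SECS)
			recent++;

	return recent > PROBE_MAX_FAILS;
}
```

Failures at 0, 10 and 20 seconds trip the limit on the third call, while failures spaced 100 seconds apart never do.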
Tejun Heo [Thu, 29 Jan 2009 11:31:33 +0000 (20:31 +0900)]
libata: add @spd_limit to sata_down_spd_limit()
Add @spd_limit to sata_down_spd_limit() so that the caller can specify
the SPD limit it wants. This parameter doesn't get in the way even
when it's too low. The closest possible limit is applied.
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Thu, 29 Jan 2009 11:31:32 +0000 (20:31 +0900)]
libata: clear dev->ering in smarter way
dev->ering used to be cleared together with the rest of ata_device in
ata_dev_init() which is called whenever a probing event occurs.
dev->ering is about to be used to track probing failures so it needs
to remain persistent over multiple probing events. This patch
achieves this by doing the following.
* Instead of CLEAR_OFFSET, define CLEAR_BEGIN and CLEAR_END and only
clear between BEGIN and END. ering is moved after END. The split
of persistent area is to allow hotter items to remain at the head.
* ering is explicitly cleared on ata_dev_disable() and when device
attach succeeds. So, ering is persistent through a device's life
time (unless explicitly cleared of course) and also through periods
in between disablement of an attached device and successful detection
of the next one.
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
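The BEGIN/END clearing scheme can be sketched with offsetof() and memset(). The structure below is a made-up miniature, not struct ata_device: only the idea of wiping the middle of the structure while keeping fields on both sides is taken from the message.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of a device structure. */
struct dev_model {
	int devno;     /* persistent and hot: stays at the head */

	/* fields from here ... */
	int dev_class;
	int pio_mode;
	/* ... to here are wiped on every probing event */

	int ering[4];  /* persistent: placed after the cleared range */
};

#define DEV_CLEAR_BEGIN offsetof(struct dev_model, dev_class)
#define DEV_CLEAR_END   offsetof(struct dev_model, ering)

/* Reset only the bytes in [DEV_CLEAR_BEGIN, DEV_CLEAR_END). */
void dev_model_init(struct dev_model *dev)
{
	memset((char *)dev + DEV_CLEAR_BEGIN, 0,
	       DEV_CLEAR_END - DEV_CLEAR_BEGIN);
}

/* Explicit clear, for disable and for a successful attach. */
void dev_model_clear_ering(struct dev_model *dev)
{
	memset(dev->ering, 0, sizeof(dev->ering));
}
```

Using two markers rather than a single offset is what lets persistent data sit on either side of the cleared range.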
Tejun Heo [Thu, 29 Jan 2009 11:31:31 +0000 (20:31 +0900)]
libata: check onlineness before using SPD in sata_down_spd_limit()
sata_down_spd_limit() should check whether the link is online before
using the SPD value to determine how to limit the link speed. Factor
out onlineness test and test it from sata_down_spd_limit().
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Thu, 29 Jan 2009 11:31:29 +0000 (20:31 +0900)]
libata: fix EH device failure handling
The dev->pio_mode > XFER_PIO_0 test is there to avoid unnecessary
speed down warning messages but it accidentally disabled SATA link spd
down during configuration phase after reset where PIO mode is always
zero.
This patch fixes the problem by moving the test where it belongs.
This makes libata probing sequence behave better when the connection
is flaky at higher link speeds which isn't too uncommon for eSATA
devices.
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Sun, 1 Feb 2009 01:56:31 +0000 (10:56 +0900)]
sata_nv: ck804 has borked hardreset too
While playing with nvraid, I found out that rmmoding and insmoding
often trigger hardreset failure on the first port (the second one was
always okay). Seriously, how diverse can you get with hardreset
behaviors? Anyways, make ck804 use noclassify variant too.
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Sergei Shtylyov [Sun, 1 Feb 2009 16:46:39 +0000 (20:46 +0400)]
ide/libata: fix ata_id_is_cfa() (take 4)
When checking for the CFA feature set support, ata_id_is_cfa() tests bit 2 in
word 82 of the identify data instead of word 83; it also checks the ATA/PI
version support in word 80 (which the CompactFlash specifications have as
reserved), which has not the slightest chance of working on modern CF cards
that don't have 0x848A in word 0...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
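A sketch of a corrected check, written from the description above rather than copied from the patch: accept the traditional 0x848A signature in word 0, otherwise look at bit 2 of word 83. The extra test that bits 15:14 of word 83 read 01 (the usual ATA marker that the word is valid) is my assumption, as are the macro and function names.

```c
#include <stdbool.h>
#include <stdint.h>

#define ID_WORD_CONFIG        0  /* general configuration word */
#define ID_WORD_COMMAND_SET_2 83 /* command sets supported, second word */

/* 'id' points to the 256 words of IDENTIFY DEVICE data. */
bool id_is_cfa_sketch(const uint16_t *id)
{
	/* Traditional CompactFlash signature in word 0. */
	if (id[ID_WORD_CONFIG] == 0x848A)
		return true;

	/*
	 * Bit 2 of word 83 advertises the CFA feature set. Bits 15:14
	 * must read 01 for the word to be trusted (assumed here).
	 */
	return (id[ID_WORD_COMMAND_SET_2] & 0xC004) == 0x4004;
}
```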
Randy Dunlap [Wed, 21 Jan 2009 00:28:59 +0000 (16:28 -0800)]
libata: fix kernel-doc warnings
Fix libata kernel-doc warnings:
Warning(linux-next-20090120//drivers/ata/libata-core.c:4720): Excess function parameter 'dev' description in 'ata_qc_new'
Warning(linux-next-20090120//drivers/ata/libata-scsi.c:428): No description found for parameter 'ap'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Arjan van de Ven [Mon, 26 Jan 2009 10:05:44 +0000 (02:05 -0800)]
ahci: add a module parameter to ignore the SSS flags for async scanning
The SSS flag, which directs the OS to spin up one disk at a time
to not have the PSU blow out, sometimes gets set even when not needed.
The effect of this is a longer-than-needed boot time.
This patch adds a module parameter that makes the driver ignore SSS
at least as far as the parallel scan during boot is concerned...
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Pasi Kärkkäinen [Mon, 2 Feb 2009 19:47:14 +0000 (21:47 +0200)]
[libata] sata_sil: Fix compilation error with libata debugging enabled
I tried compiling 2.6.29-rc1 and 2.6.29-rc3 with libata debugging enabled
and got the following error:
CC [M] drivers/ata/sata_sil.o
drivers/ata/sata_sil.c: In function 'sil_fill_sg':
drivers/ata/sata_sil.c:327: error: 'pi' undeclared (first use in this function)
drivers/ata/sata_sil.c:327: error: (Each undeclared identifier is reported only once
drivers/ata/sata_sil.c:327: error: for each function it appears in.)
make[2]: *** [drivers/ata/sata_sil.o] Error 1
make[1]: *** [drivers/ata] Error 2
make: *** [drivers] Error 2
Linus Torvalds [Tue, 3 Feb 2009 03:28:58 +0000 (19:28 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI hotplug: Change link order of pciehp & acpiphp
PCI hotplug: fakephp: Allocate PCI resources before adding the device
PCI MSI: Fix undefined shift by 32
PCI PM: Do not wait for buses in B2 or B3 during resume
PCI PM: Power up devices before restoring their state
PCI PM: Fix hibernation breakage on EeePC 701
PCI: irq and pci_ids patch for Intel Tigerpoint DeviceIDs
PCI PM: Fix suspend error paths and testing facility breakage
Linus Torvalds [Tue, 3 Feb 2009 03:27:00 +0000 (19:27 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
slub: fix per cpu kmem_cache_cpu array memory leak
kmalloc: return NULL instead of link failure
Linus Torvalds [Tue, 3 Feb 2009 03:26:44 +0000 (19:26 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
fbdev/atyfb: Fix DSP config on some PowerMacs & PowerBooks
powerpc: Fix oops on some machines due to incorrect pr_debug()
powerpc/ps3: Printing fixups for l64 to ll64 convserion drivers/net
powerpc/5200: update device tree binding documentation
powerpc/5200: Bugfix for PCI mapping of memory and IMMR
powerpc/5200: update defconfigs
Linus Torvalds [Tue, 3 Feb 2009 03:26:29 +0000 (19:26 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched_rt: don't use first_cpu on cpumask created with cpumask_and
sched: fix buddie group latency
sched: clear buddies more aggressively
sched: symmetric sync vs avg_overlap
sched: fix sync wakeups
cpuset: fix possible deadlock in async_rebuild_sched_domains
Linus Torvalds [Tue, 3 Feb 2009 03:24:14 +0000 (19:24 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
pxamci: enable DMA for write ops after CMD/RESP
pxamci: replace #ifdef CONFIG_PXA27x with if (cpu_is_pxa27x())
ricoh_mmc: Use suspend_late/resume_early
mmci: Add support for ST Micro derivate
mmc: Add a MX2/MX3 specific SDHC driver
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
icside: fix PCB version 6 support (v2)
tx4939ide: typo fix and minor cleanup
ide: add CS5536 host driver (v3)
ide: Force VIA IDE legacy interrupts for AmigaOne boards
IDE: Unregister and disable devices if initialization fails.
ide: fix ide_register_port() failure handling
ide: struct device - replace bus_id with dev_name(), dev_set_name()
ide-cd: fix DMA for non bio-backed requests
Linus Torvalds [Tue, 3 Feb 2009 03:20:17 +0000 (19:20 -0800)]
Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/dvrabel/uwb
* 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/dvrabel/uwb:
uwb: lock rc->rsvs_lock with spin_lock_bh()
wusb: timeout when waiting for ASL/PZL updates in whci-hcd
uwb: remove unused #include <version.h>'s
wusb: return -ENOTCONN when resetting a port with no connected device
uwb: safely remove all reservations
Linus Torvalds [Tue, 3 Feb 2009 03:19:50 +0000 (19:19 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: add text file detailing queue/ sysfs files
bio.h: If they MUST be inlined, then use __always_inline
Fix misleading comment in bio.h
block: fix inconsistent parenthesisation of QUEUE_FLAG_DEFAULT
block: fix oops in blk_queue_io_stat()
Mark McLoughlin [Tue, 3 Feb 2009 03:03:53 +0000 (13:33 +1030)]
virtio-pci: do not oops on config change if driver not loaded
The host really shouldn't be notifying us of config changes
before the device status is VIRTIO_CONFIG_S_DRIVER or
VIRTIO_CONFIG_S_DRIVER_OK.
However, if we do happen to be interrupted while we're not
attached to a driver, we really shouldn't oops. Prevent
this simply by checking that device->driver is non-NULL
before trying to notify the driver of config changes.
Problem observed by doing a "set_link virtio.0 down" with
QEMU before the net driver had been loaded.
Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
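The guard described above amounts to a NULL check before the callback. The following is a user-space model with hypothetical types (struct vdev_model, struct vdrv_model), not the virtio-pci interrupt handler itself.

```c
#include <stddef.h>

struct vdev_model;

/* Hypothetical driver with an optional config-change callback. */
struct vdrv_model {
	void (*config_changed)(struct vdev_model *vdev);
};

/* Hypothetical device; 'driver' stays NULL until a driver binds. */
struct vdev_model {
	struct vdrv_model *driver;
	int changes; /* number of notifications delivered */
};

/* Example callback that just counts notifications. */
void count_config_change(struct vdev_model *vdev)
{
	vdev->changes++;
}

/* Config-change interrupt: ignore it if no driver is attached. */
void config_interrupt(struct vdev_model *vdev)
{
	struct vdrv_model *drv = vdev->driver;

	if (drv && drv->config_changed)
		drv->config_changed(vdev);
}
```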
Eric Dumazet [Tue, 3 Feb 2009 03:01:36 +0000 (13:31 +1030)]
modules: Use a better scheme for refcounting
Current refcounting for modules (done if CONFIG_MODULE_UNLOAD=y) is
using a lot of memory.
Each 'struct module' contains an [NR_CPUS] array of full cache lines.
This patch uses existing infrastructure (percpu_modalloc() &
percpu_modfree()) to allocate percpu space for the refcount storage.
Instead of wasting NR_CPUS*128 bytes (on i386), we now use
nr_cpu_ids*sizeof(local_t) bytes.
On a typical distro, where NR_CPUS=8, shipping 2000 modules, we reduce
the size of module files by about 2 Mbytes. (1 KB per module)
Instead of having all refcounters in the same memory node - with TLB misses
because of vmalloc() - this new implementation allows better
NUMA properties, since each CPU will use storage on its preferred node,
thanks to percpu storage.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
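The figures quoted above can be checked with a few lines of arithmetic. The helper names are made up for illustration, and the counter size is left as a parameter because sizeof(local_t) depends on the architecture.

```c
#include <stddef.h>

/* Per-CPU slot size quoted in the message for i386. */
#define REFCOUNT_SLOT_BYTES 128

/* Old scheme: one full slot per possible CPU, embedded in every module. */
size_t old_refcount_bytes(size_t nr_cpus)
{
	return nr_cpus * REFCOUNT_SLOT_BYTES;
}

/* New scheme: one counter per CPU id, taken from percpu storage. */
size_t new_refcount_bytes(size_t nr_cpu_ids, size_t counter_size)
{
	return nr_cpu_ids * counter_size;
}

/* Total embedded refcount space across a set of modules (old scheme). */
size_t old_total_bytes(size_t nr_cpus, size_t nr_modules)
{
	return nr_modules * old_refcount_bytes(nr_cpus);
}
```

With NR_CPUS=8 the old layout costs 8 * 128 = 1024 bytes per module, and 2000 modules come to 2,048,000 bytes, which matches the "about 2 Mbytes" in the message.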
Mark Fasheh [Thu, 29 Jan 2009 23:06:21 +0000 (15:06 -0800)]
ocfs2: add quota call to ocfs2_remove_btree_range()
We weren't reclaiming the clusters which get free'd from this function,
so any user punching holes in a file would still have those bytes accounted
against him/her. Add the call to vfs_dq_free_space_nodirty() to fix this.
Interestingly enough, the journal credits calculation already took this into
account.
Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: Jan Kara <jack@suse.cz>
Sunil Mushran [Fri, 30 Jan 2009 01:12:31 +0000 (17:12 -0800)]
ocfs2: Wakeup the downconvert thread after a successful cancel convert
When two nodes holding PR locks on a resource concurrently attempt to
upconvert the locks to EX, the master sends a BAST to one of the nodes. This
message tells that node to first cancel convert the upconvert request,
followed by downconvert to a NL. Only when this lock is downconverted to NL,
can the master upconvert the first node's lock to EX.
While the fs was doing the cancel convert, it was forgetting to wake up the
dc thread after a successful cancel, leading to a deadlock.
Reported-and-Tested-by: David Teigland <teigland@redhat.com> Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Tao Ma [Thu, 8 Jan 2009 00:21:43 +0000 (08:21 +0800)]
ocfs2: Access the xattr bucket only before modifying it.
In ocfs2_xattr_value_truncate, we may call b-tree code which will
extend the journal transaction. This has a potential problem: buffers
that were already accessed but not yet dirtied may be lost. So we'd
better access the bucket after we call ocfs2_xattr_value_truncate.
As for the root buffer for the xattr value, the b-tree code will
access and dirty it, so we don't need to worry about it.
Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Joel Becker [Wed, 17 Dec 2008 22:23:52 +0000 (14:23 -0800)]
configfs: Silence lockdep on mkdir(), rmdir() and configfs_depend_item()
When attaching default groups (subdirs) of a new group (in mkdir() or
in configfs_register()), configfs recursively takes inode's mutexes
along the path from the parent of the new group to the default
subdirs. This is needed to ensure that the VFS will not race with
operations on these sub-dirs. This is safe for the following reasons:
- the VFS allows one to lock first an inode and second one of its
children (The lock subclasses for this pattern are respectively
I_MUTEX_PARENT and I_MUTEX_CHILD);
- from this rule any inode path can be recursively locked in
descending order as long as it stays under a single mountpoint and
does not follow symlinks.
Unfortunately lockdep does not know (yet?) how to handle such
recursion.
I've tried to use Peter Zijlstra's lock_set_subclass() helper to
upgrade i_mutexes from I_MUTEX_CHILD to I_MUTEX_PARENT when we know
that we might recursively lock some of their descendant, but this
usage does not seem to fit the purpose of lock_set_subclass() because
it leads to several i_mutex locked with subclass I_MUTEX_PARENT by
the same task.
From inside configfs it is not possible to serialize those recursive
locking with a top-level one, because mkdir() and rmdir() are already
called with inodes locked by the VFS. So using some
mutex_lock_nest_lock() is not an option.
I am proposing two solutions:
1) one that wraps recursive mutex_lock()s with
lockdep_off()/lockdep_on().
2) (as suggested earlier by Peter Zijlstra) one that puts the
i_mutexes recursively locked in different classes based on their
depth from the top-level config_group created. This
induces an arbitrary limit (MAX_LOCK_DEPTH - 2 == 46) on the
nesting of configfs default groups whenever lockdep is activated
but this limit looks reasonably high. Unfortunately, this also
isolates VFS operations on configfs default groups from the others
and thus lowers the chances to detect locking issues.
This patch implements solution 1).
Solution 2) looks better from lockdep's point of view, but fails with
configfs_depend_item(). Fixing that requires reworking the locking
scheme of configfs_depend_item() by removing the variable lock recursion
depth, and I think that it's doable thanks to the configfs_dirent_lock.
For now, let's stick to solution 1).
Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Jan Kara [Mon, 12 Jan 2009 22:20:32 +0000 (23:20 +0100)]
ocfs2: Fix possible deadlock in ocfs2_write_dquot()
It could happen that some limit has been set via quotactl() and, in parallel,
->mark_dirty() is called from another thread doing e.g. dquot_alloc_space(). In
such a case ocfs2_write_dquot() must not try to sync the dquot, because that
needs the global quota lock, which ranks above transaction start.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Jan Kara [Mon, 12 Jan 2009 22:20:31 +0000 (23:20 +0100)]
ocfs2: Push out dropping of dentry lock to ocfs2_wq
Dropping of last reference to dentry lock is a complicated operation involving
dropping of reference to inode. This can get complicated and quota code in
particular needs to obtain some quota locks which leads to potential deadlock.
Thus we defer dropping of inode reference to ocfs2_wq.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>