git.karo-electronics.de Git - linux-beck.git/log

md: be more consistent about setting WriteMostly flag when adding a drive to an array

When a drive is added to an array using ADD_NEW_DISK, there are two
places we can get certain flags from: the metadata on the disk or the
flags passed through the IOCTL.

For the WriteMostly flag (aka MD_DISK_WRITEMOSTLY) we take the value
from either of those sources depending on if it is set (i.e. we
effectively 'or' the two sources together).

This makes it awkward to clear, and is at best inconsistent.

As documented code (in mdadm) requires that setting
MD_DISK_WRITEMOSTLY in the ioctl will be effective, we resolve the
inconsistency by always using the value for this flag from the ioctl,
and ignoring the value on disk.

Signed-off-by: NeilBrown <neilb@suse.de>

md: occasionally checkpoint drive recovery to reduce duplicate effort after a crash

Version 1.x metadata has the ability to record the status of a
partially completed drive recovery.
However we only update that record on a clean shutdown.
It would be nice to update it on unclean shutdowns too, particularly
when using a bitmap that removes much to the 'sync' effort after an
unclean shutdown.

One complication with checkpointing recovery is that we only know
where we are up to in terms of IO requests started, not which ones
have completed.  And we need to know what has completed to record
how much is recovered.  So occasionally pause the recovery until all
submitted requests are completed, then update the record of where
we are up to.

When we have a bitmap, we already do that pause occasionally to keep
the bitmap up-to-date.  So enhance that code to record the recovery
offset and schedule a superblock update.
And when there is no bitmap, just pause 16 times during the resync to
do a checkpoint.
'16' is a fairly arbitrary number.  But we don't really have any good
way to judge how often is acceptable, and it seems like a reasonable
number for now.

Signed-off-by: NeilBrown <neilb@suse.de>

md: move md_k.h from include/linux/raid/ to drivers/md/

It really is nicer to keep related code together..

Signed-off-by: NeilBrown <neilb@suse.de>

md: move lots of #include lines out of .h files and into .c

This makes the includes more explicit, and is preparation for moving
md_k.h to drivers/md/md.h

Remove include/raid/md.h as its only remaining use was to #include
other files.

Signed-off-by: NeilBrown <neilb@suse.de>

md: move most content from md.h to md_k.h

The extern function definitions are kernel-internal definitions, so
they belong in md_k.h

The MD_*_VERSION values could reasonably go in a number of places,
but md_u.h seems most reasonable.

This leaves almost nothing in md.h. It will go soon.

Signed-off-by: NeilBrown <neilb@suse.de>

md: move LEVEL_* definition from md_k.h to md_u.h

.. as they are part of the user-space interface.
Also move MdpMinorShift into there so we can remove duplication.

Lastly move mdp_major in. It is less obviously part of the user-space
interface, but do_mounts_md.c uses it, and it is acting a bit like
user-space.

Signed-off-by: NeilBrown <neilb@suse.de>

md: move headers out of include/linux/raid/

Move the headers with the local structures for the disciplines and
bitmap.h into drivers/md/ so that they are more easily grepable for
hacking and not far away. md.h is left where it is for now as there
are some uses from the outside.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: NeilBrown <neilb@suse.de>

cleanup drivers/md/Makefile

Use the -y variables instead of the old -objs so we can easily add
conditional objects to the modules. Also always use += to add
subobjects to avoid problems when placing additional objects in
some place in the file.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: NeilBrown <neilb@suse.de>

md: stop defining MAJOR_NR

MAJOR_NR was only required for magic in linux/blk.h in 2.4 or earlier
kernels, so no need to keep it around.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: NeilBrown <neilb@suse.de>

MD data integrity support

md: Add support for data integrity to MD

If all subdevices support the same protection format the MD device is
flagged as integrity capable.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md: write bitmap information to devices that are undergoing recovery.

When we add some spares to an array and start recovery, and we have
a bitmap which is stored 'internally' on all devices, we call
bitmap_write_all to make sure the bitmap is correct on the new
device(s).
However that doesn't work as write_sb_page only writes to
'In_sync' devices, and devices undergoing recovery are not
'In_sync' until recovery finishes.

So extend write_sb_page (actually next_active_rdev) to include devices
that are under recovery.

Signed-off-by: NeilBrown <neilb@suse.de>

md: never clear bit from the write-intent bitmap when the array is degraded.

It is safe to clear a bit from the write-intent bitmap for a raid1
if we know the data has been written to all devices, which is
what the current test does.

But it is not always safe to update the 'events_cleared' counter in
that case.  This is because one request could complete successfully
after some other request has partially failed.

So simply disable the clearing and updating of events_cleared whenever
the array is degraded.  This might end up not clearing some bits that
could safely be cleared, but it is safest approach.

Note that the bug fixed here did not risk corrupting data by letting
the array get out-of-sync.  Rather it meant that when a device is
removed and re-added to the array, it might incorrectly require a full
recovery rather than just recovering based on the bitmap.

Signed-off-by: NeilBrown <neilb@suse.de>

md: Allow write-intent bitmaps to have chunksize < PAGE_SIZE

md currently insists that the chunk size used for write-intent
bitmaps (the amount of data that corresponds to one chunk)
be at least one page.

The reason for this restriction is lost in the mists of time,
but a review of the code (and a vague memory) suggests that the only
problem would be related to resync. Resync tries very hard to
work in multiples of a page, but also needs to sync with units
of a bitmap_chunk too.

This connection comes out in the bitmap_start_sync call.

So change bitmap_start_sync to always work in multiples of a page.
If the bitmap chunk size is less that one page, we flag multiple
chunks as 'syncing' and generally make them all appear to the
resync routines like one chunk.

All other code either already works with data ranges that could
span multiple chunks, or explicitly only cares about a single chunk.

Signed-off-by: Neil Brown <neilb@suse.de>

md: Fix is_mddev_idle test (again).

There are two problems with is_mddev_idle.

1/ sync_io is 'atomic_t' and hence 'int'.  curr_events and all the
   rest are 'long'.
   So if sync_io were to wrap on a 64bit host, the value of
   curr_events would go very negative suddenly, and take a very
   long time to return to positive.

   So do all calculations as 'int'.  That gives us plenty of precision
   for what we need.

2/ To initialise rdev->last_events we simply call is_mddev_idle, on
   the assumption that it will make sure that last_events is in a
   suitable range.  It used to do this, but now it does not.
   So now we need to be more explicit about initialisation.

Signed-off-by: NeilBrown <neilb@suse.de>

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] Add p4-clockmod sysfs-ui removal to feature-removal schedule.
Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod."

copy_process: fix CLONE_PARENT && parent_exec_id interaction

CLONE_PARENT can fool the ->self_exec_id/parent_exec_id logic. If we
re-use the old parent, we must also re-use ->parent_exec_id to make
sure exit_notify() sees the right ->xxx_exec_id's when the CLONE_PARENT'ed
task exits.

Also, move down the "p->parent_exec_id = p->self_exec_id" thing, to place
two different cases together.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[CPUFREQ] Add p4-clockmod sysfs-ui removal to feature-removal schedule.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Dave Jones <davej@redhat.com>

Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod."

This reverts commit e088e4c9cdb618675874becb91b2fd581ee707e6.

Removing the sysfs interface for p4-clockmod was flagged as a
regression in bug 12826.

Course of action:
- Find out the remaining causes of overheating, and fix them
if possible. ACPI should be doing the right thing automatically.
If it isn't, we need to fix that.
- mark p4-clockmod ui as deprecated
- try again with the removal in six months.

It's not really feasible to printk about the deprecation, because
it needs to happen at all the sysfs entry points, which means adding
a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch.

Signed-off-by: Dave Jones <davej@redhat.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits)
  p54: fix race condition in memory management
  cfg80211: test before subtraction on unsigned
  iwlwifi: fix error flow in iwl*_pci_probe
  rt2x00 : more devices to rt73usb.c
  rt2x00 : more devices to rt2500usb.c
  bonding: Fix device passed into ->ndo_neigh_setup().
  vlan: Fix vlan-in-vlan crashes.
  net: Fix missing dev->neigh_setup in register_netdevice().
  tmspci: fix request_irq race
  pkt_sched: act_police: Fix a rate estimator test.
  tg3: Fix 5906 link problems
  SCTP: change sctp_ctl_sock_init() to try IPv4 if IPv6 fails
  IPv6: add "disable" module parameter support to ipv6.ko
  sungem: another error printed one too early
  aoe: error printed 1 too early
  net pcmcia: worklimit reaches -1
  net: more timeouts that reach -1
  net: fix tokenring license
  dm9601: new vendor/product IDs
  netlink: invert error code in netlink_set_err()
  ...

Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
lguest: fix for CONFIG_SPARSE_IRQ=y
lguest: fix crash 'unhandled trap 13 at <native_read_msr_safe>'

Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable

* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: fix spinlock assertions on UP systems

Btrfs: fix spinlock assertions on UP systems

btrfs_tree_locked was being used to make sure a given extent_buffer was
properly locked in a few places. But, it wasn't correct for UP compiled
kernels.

This switches it to using assert_spin_locked instead, and renames it to
btrfs_assert_tree_locked to better reflect how it was really being used.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Fix fixpoint divide exception in acct_update_integrals

Frans Pop reported the crash below when running an s390 kernel under Hercules:

  Kernel BUG at 000738b4  verbose debug info unavailable!
  fixpoint divide exception: 0009  #1! SMP
  Modules linked in: nfs lockd nfs_acl sunrpc ctcm fsm tape_34xx
     cu3088 tape ccwgroup tape_class ext3 jbd mbcache dm_mirror dm_log dm_snapshot
     dm_mod dasd_eckd_mod dasd_mod
  CPU: 0 Not tainted 2.6.27.19 #13
  Process awk (pid: 2069, task: 0f9ed9b8, ksp: 0f4f7d18)
  Krnl PSW : 070c1000 800738b4 (acct_update_integrals+0x4c/0x118)
             R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0
  Krnl GPRS: 00000000 000007d0 7fffffff fffff830
             00000000 ffffffff 00000002 0f9ed9b8
             00000000 00008ca0 00000000 0f9ed9b8
             0f9edda4 8007386e 0f4f7ec8 0f4f7e98
  Krnl Code: 800738aa: a71807d0         lhi     %r1,2000
             800738ae: 8c200001         srdl    %r2,1
             800738b2: 1d21             dr      %r2,%r1
            >800738b4: 5810d10e         l       %r1,270(%r13)
             800738b8: 1823             lr      %r2,%r3
             800738ba: 4130f060         la      %r3,96(%r15)
             800738be: 0de1             basr    %r14,%r1
             800738c0: 5800f060         l       %r0,96(%r15)
  Call Trace:
  ( <000000000004fdea>! blocking_notifier_call_chain+0x1e/0x2c)
    <0000000000038502>! do_exit+0x106/0x7c0
    <0000000000038c36>! do_group_exit+0x7a/0xb4
    <0000000000038c8e>! SyS_exit_group+0x1e/0x30
    <0000000000021c28>! sysc_do_restart+0x12/0x16
    <0000000077e7e924>! 0x77e7e924

Reason for this is that cpu time accounting usually only happens from
interrupt context, but acct_update_integrals gets also called from
process context with interrupts enabled.

So in acct_update_integrals we may end up with the following scenario:

Between reading tsk->stime/tsk->utime and tsk->acct_timexpd an interrupt
happens which updates accouting values.  This causes acct_timexpd to be
greater than the former stime + utime.  The subsequent calculation of

dtime = cputime_sub(time, tsk->acct_timexpd);

will be negative and the division performed by

cputime_to_jiffies(dtime)

will generate an exception since the result won't fit into a 32 bit
register.

In order to fix this just always disable interrupts while accessing any
of the accounting values.

Reported by: Frans Pop <elendil@planet.nl>
Tested by: Frans Pop <elendil@planet.nl>
Cc: stable@kernel.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lguest: fix for CONFIG_SPARSE_IRQ=y

Impact: remove lots of lguest boot WARN_ON() when CONFIG_SPARSE_IRQ=y

We now need to call irq_to_desc_alloc_cpu() before
set_irq_chip_and_handler_name(), but we can't do that from init_IRQ (no
kmalloc available).

So do it as we use interrupts instead. Also means we only alloc for
irqs we use, which was the intent of CONFIG_SPARSE_IRQ anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>

lguest: fix crash 'unhandled trap 13 at <native_read_msr_safe>'

Impact: fix lguest boot crash on modern Intel machines

The code in early_init_intel does:

if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
u64 misc_enable;

rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);

And that rdmsr faults (not allowed from non-0 PL).  We can get around
this by mugging the family ID part of the cpuid.  5 seems like a good
number.

Of course, this is a hack (how very lguest!).  We could just indicate
that we don't support MSRs, or implement lguest_rdmst.

Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Patrick McHardy <kaber@trash.net>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
mmc: fix data timeout for SEND_EXT_CSD

Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: increment quiescent state counter in ksoftirqd()

Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, pebs: correct qualifier passed to ds_write_config() from ds_request_pebs()
  x86, bts: remove bad warning
  x86: add Dell XPS710 reboot quirk
  x86, math-emu: fix init_fpu for task != current
  x86: EFI: Back efi_ioremap with init_memory_mapping instead of FIX_MAP
  x86: fix DMI on EFI

Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog

* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
  [WATCHDOG] orion5x_wdt.c: 'ORION5X_TCLK' undeclared
  [WATCHDOG] gef_wdt.c: fsl_get_sys_freq() failure not noticed
  [WATCHDOG] ks8695_wdt.c: 'CLOCK_TICK_RATE' undeclared
  [WATCHDOG] rc32434_wdt: fix sections
  [WATCHDOG] rc32434_wdt: fix watchdog driver

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix ext4_free_inode() vs. ext4_claim_inode() race

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: (28 commits)
  Blackfin arch: SPI_MMC is now mainlined MMC_SPI
  Blackfin arch: disable legacy /proc/scsi/ support by default
  Blackfin arch: remove duplicated ANOMALY_05000448 ifdef check
  Blackfin arch: add stubs for anomalies 447 and 448
  Blackfin arch: cleanup bfin_sport.h header and export it to userspace
  Blackfin arch: fix bug - gdb signull case make trunk kernel panic frequently
  Blackfin arch: remove spurious dash when dcache is off
  Blackfin arch: mark init_pda as __init as only __init funcs all it
  Blackfin arch: fix bug - On bf548-ezkit, ethernet fails to work after wakeup from "mem"
  Blackfin arch: Random read/write errors are a bad thing
  Blackfin arch: update default kernel config, select KSZ8893M driver for BF518
  Blackfin arch: Fix bug - KGDB single step into the middle of a 4 bytes instruction on bf561 after soft bp is hit
  Blackfin arch: Fix bug - make ksz8893m driver available when bfin_mac is enabled
  Blackfin arch: make sure people do not set the kernel load address too high
  Blackfin arch: fix bug - The SPORT_HYS bit is not set for BF561 0.5
  Blackfin arch: update anomaly sheets to match latest public info
  Blackfin arch: Fix BUG - kernel fails to build in pm.c when allow wakeup fromi standby by GPIO
  Blackfin arch: PM_BFIN_WAKE_GP: update help
  Blackfin arch: fix bug - kgdb fails to continue after setting breakpoint on bf561-ezkit kernel with smp patch
  Blackfin arch: Enable Write Back Cache on all Blackfin Boards
  ...

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
  dmatest: fix use after free in dmatest_exit
  ipu_idmac: fix spinlock type
  iop-adma, mv_xor: fix mem leak on self-test setup failure
  fsldma: fix off by one in dma_halt
  I/OAT: fail self-test if callback test reaches timeout
  I/OAT: update driver version and copyright dates
  I/OAT: list usage cleanup
  I/OAT: set tcp_dma_copybreak to 256k for I/OAT ver.3
  I/OAT: cancel watchdog before dma remove
  I/OAT: fail initialization on zero channels detection
  I/OAT: do not set DCACTRL_CMPL_WRITE_ENABLE for I/OAT ver.3
  I/OAT: add verification for proper APICID_TAG_MAP setting by BIOS
  dmaengine: update kerneldoc

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ata: add CFA specific identify data words
  remove stale comment from <linux/hdreg.h>
  AT91: initialize Compact Flash on AT91SAM9263 cpu
  ide: add at91_ide driver
  ide: allow to wrap interrupt handler
  ide-iops: fix odd-length ATAPI PIO transfers
  ide: NULL noise: drivers/ide/ide-*.c
  ide: expiry() returns int, negative expiry() return values won't be noticed

Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: Don't trust current capacity values in identify words 57-58
  libata: make sure port is thawed when skipping resets
  sata_nv: fix module parameter description
  ahci: Add the Device IDs for MCP89 and remove IDs of MCP7B to/from ahci.c
  libata: don't use on-stack sense buffer
  libata: align ap->sector_buf
  libata: fix dma_unmap_sg misuse
  libata: change drive ready wait after hard reset to 5s

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
  Squashfs: frag_size should be signed, as it can hold an error result
  Squashfs: fix documentation typo, Cramfs filesystem limit is 256 MiB
  Squashfs: Fix oops when reading fsfuzzer corrupted filesystems

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
smack: fixes for unlabeled host support

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: serio - fix protocol number for TouchIT213

Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] fix PCI DMA flag propagation on SN (Altix) with PICs

Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: fix missing bio back/front segment size setting in blk_recount_segments()
  loop: don't increment p->offset with (size_t) -EINVAL
  cciss: remove 30 second initial timeout on controller reset
  Fix kernel NULL pointer dereference in xen-blkfront

Merge branch 'fix/hda' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6

* 'fix/hda' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Fix headphone-detect regression with multiple HP jacks
ALSA: hda - Fix typos in slave controls in patch_sigmatel.c

MIPS: compat: Implement is_compat_task.

This is a build fix required after "x86-64: seccomp: fix 32/64 syscall
hole" (commit 5b1017404aea6d2e552e991b3fd814d839e9cd67). MIPS doesn't
have the issue that was fixed for x86-64 by that patch.

This also doesn't solve the N32 issue which is that N32 seccomp processes
will be treated as non-compat processes thus only have access to N64
syscalls.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: fix data timeout for SEND_EXT_CSD

Commit 0d3e0460f307e84904968aad6cff97bd688583d8
"MMC: CSD and CID timeout values" inadvertently broke
the timeout for the MMC command SEND_EXT_CSD.

This patch puts it back again.

Depending on the characteristics of the controller,
this bug may prevent the use of MMC cards.

Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>

Input: serio - fix protocol number for TouchIT213

Protocol 0x37 has been reserved for iNexio devices and Sahara
was supposed to get 0x38.

Reported-by: Claudio Nieder <private@claudio.ch>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>

p54: fix race condition in memory management

This patch fixes a number of race conditions in the driver.
Up until now, "entry" pointer was initialized before acquiring the right lock.

Signed-off-by: Christian Lamparter <chunkeey@web.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

cfg80211: test before subtraction on unsigned

freq_diff is unsigned, so test before subtraction

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

[IA64] fix PCI DMA flag propagation on SN (Altix) with PICs

We recently discovered a problem with passing of DMA attributes on SN
systems with the older PIC chips.

[akpm@linux-foundation.org: coding-style fixes]

Signed-off-by: Jeremy Higdon <jeremy@sgi.com>
Cc: <habeck@sgi.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>

x86, pebs: correct qualifier passed to ds_write_config() from ds_request_pebs()

ds_write_config() can write the BTS as well as the PEBS part of
the DS config. ds_request_pebs() passes the wrong qualifier, which
results in the wrong configuration to be written.

Reported-by: Stephane Eranian <eranian@googlemail.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090305085721.A22550@sedona.ch.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

x86, bts: remove bad warning

In case a ptraced task is reaped (while the tracer is still attached),
ds_exit_thread() is called before ptrace_exit(). The latter will
release the bts_tracer and remove the thread's ds_ctx.
The former will WARN() if the context is not NULL.

Oleg Nesterov submitted patches that move ptrace_exit() before
exit_thread() and thus reverse the order of the above calls.

Remove the bad warning. I will add it again when Oleg's changes are in.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
LKML-Reference: <20090305084954.A22000@sedona.ch.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ALSA: hda - Fix headphone-detect regression with multiple HP jacks

The recent changes over the DAC detection mechanism in patch_sigmatel.c
breaks the HP detection on the machines with multiple HP jacks.
It's basically because of the workaround to support the multi-channel
output. Since the HP detection is more important feature, disable
the HP-swap workaroud temporarily.

Reference: Novell bnc#482052
https://bugzilla.novell.com/show_bug.cgi?id=482052

Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: hda - Fix typos in slave controls in patch_sigmatel.c

"Headphone Playback ..." appears twice in slave_vols[] and slave_sws[].
They should be "Headphone Playback2 ..."

Signed-off-by: Takashi Iwai <tiwai@suse.de>

block: fix missing bio back/front segment size setting in blk_recount_segments()

Commit 1e42807918d17e8c93bf14fbb74be84b141334c1 introduced a bug where we
don't get front/back segment sizes in the bio in blk_recount_segments().
Fix this by tracking the back bio as well as the front bio in
__blk_recalc_rq_segments(), this also cleans up the interface by getting
rid of the segment size pointer passing.

Tested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

[WATCHDOG] orion5x_wdt.c: 'ORION5X_TCLK' undeclared

orion5x_wdt no longer compiled after the changes in commit
ebe35aff883496c07248df82c8576c3b6e84bbbe ("Orion: prepare for
runtime-determined timer tick rate").

The tick rate define (ORION5X_TCLK) was removed in favor of a runtime
detection. The quick fix is to add the define in the watchdog driver.
The fix is not correct for all supported orion5x platforms, but since
the supported platforms right now are 133 Mhz and 166 Mhz, it won't
be _that_ far off. ;-) A fix that uses the runtime-determined timer
tick rate will be applied later.

Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Cc: Kristof Provost <kristof@sigsegv.be>
Acked-by: Lennert Buytenhek <buytenh@wantstofly.org>
Cc: Nicolas Pitre <nico@cam.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>

iwlwifi: fix error flow in iwl*_pci_probe

Both the agn and 3945 drivers has some problems with dealing with
errors in their probe functions. Ensure that a goto will undo only
things that was done before the goto was called.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

rt2x00 : more devices to rt73usb.c

add more usb_dev to rt73usb.c . IDs 'stolen' from the
windows inf file(10/21/2008, 1.03.02.0000) plus some
from the Ralink linux driver(2009_0206_RT73_Linux_STA_Drv1.1.0.2.tar.bz2)

Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

rt2x00 : more devices to rt2500usb.c

add more usb_dev to rt2500usb.c . IDs 'stolen' from the
windows inf file(02/12/2009, 2.01.01.0015).

Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

Blackfin arch: SPI_MMC is now mainlined MMC_SPI

Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

Blackfin arch: disable legacy /proc/scsi/ support by default

Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

Blackfin arch: remove duplicated ANOMALY_05000448 ifdef check

Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

ata: add CFA specific identify data words

Declare CFA specific identify data words 162 and 163 for future use.

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
[bart: update patch summary/description]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

Blackfin arch: add stubs for anomalies 447 and 448

Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

remove stale comment from <linux/hdreg.h>

HDIO_GET_IDENTITY returns 256 words currently.

Noticed by Norman Diamond.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

AT91: initialize Compact Flash on AT91SAM9263 cpu

Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl>
Cc: Andrew Victor <avictor.za@gmail.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Acked-by: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide: add at91_ide driver

This is IDE host driver for AT91 (SAM9, CAP9, AT572D940HF) Static Memory
Controller with Compact Flash True IDE Mode logic.

Driver have to switch 8/16 bit bus width when accessing Task Tile or Data
Register. Moreover some extra things need to be done when setting PIO mode.
Only PIO mode is used, hardware have no DMA support. If interrupt line is
connected through GPIO extra quirk is needed to cope with fake interrupts.

Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl>
Cc: Andrew Victor <avictor.za@gmail.com>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide: allow to wrap interrupt handler

Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl>
Cc: Andrew Victor <linux@maxim.org.za>
[bart: minor checkpatch.pl / CodingStyle fixups]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide-iops: fix odd-length ATAPI PIO transfers

Commit 9567b349f7e7dd7e2483db99ee8e4a6fe0caca38 (ide: merge ->atapi_*put_bytes
and ->ata_*put_data methods) introduced a regression WRT the odd-length ATAPI
PIO transfers -- the final word didn't get written (causing command timeouts).

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide: NULL noise: drivers/ide/ide-*.c

Fix this sparse warnings:
  drivers/ide/ide-disk_proc.c:130:11: warning: Using plain integer as NULL pointer
  drivers/ide/ide-floppy_proc.c:32:11: warning: Using plain integer as NULL pointer
  drivers/ide/ide-proc.c:234:11: warning: Using plain integer as NULL pointer
  drivers/ide/ide-tape.c:2141:11: warning: Using plain integer as NULL pointer

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Cc: trivial@kernel.org
Cc: kernel-janitors@vger.kernel.org
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide: expiry() returns int, negative expiry() return values won't be noticed

bart:
It seems like the bug could cause insanely long timeouts for:
- ATA_DMA_ERR error in dma_timer_expiry()
- commands without ->expiry in tc86c001_timer_expiry()
(TC86C001 IDE controller only)

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
[bart: port it to the current tree]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

[WATCHDOG] gef_wdt.c: fsl_get_sys_freq() failure not noticed

fsl_get_sys_freq() may return -1 when 'soc' isn't found, but in
gef_wdt_probe() 'freq' is unsigned, so the test doesn't catch that.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

[WATCHDOG] ks8695_wdt.c: 'CLOCK_TICK_RATE' undeclared

On arm-acs5k_tiny:

drivers/watchdog/ks8695_wdt.c:68: error: 'CLOCK_TICK_RATE' undeclared
(first use in this function)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Cc: stable <stable@kernel.org>

libata: Don't trust current capacity values in identify words 57-58

Hanno Böck reported a problem where an old Conner CP30254 240MB hard drive
was reported as 1.1TB in capacity by libata:

http://lkml.org/lkml/2009/2/13/134

This was caused by libata trusting the drive's reported current capacity in
sectors in identify words 57 and 58 if the drive does not support LBA and the
current CHS translation values appear valid. Unfortunately it seems older
ATA specs were vague about what this field should contain and a number of drives
used values with wrong byte order or that were totally bogus. There's no
unique information that it conveys and so we can just calculate the number
of sectors from the reported current CHS values.

While we're at it, clean up this function to use named constants for the
identify word values.

Signed-off-by: Robert Hancock <hancockrwd@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

libata: make sure port is thawed when skipping resets

When SCR access is available and the link is offline, softreset is
skipped as it only wastes time and some controllers don't respond very
well.  However, the skip path forgot to thaw the port, which not only
blocks further event notification from the port but also causes
repeated EH invocations on the same event on drivers which rely on
->thaw() to clear events if the IRQ is shared with another device or
port.

This problem has always been there but is uncovered by recent sata_nv
nf2/3 change which dropped hardreset support while maintaining SCR
access.  nf2/3 doesn't clear hotplug event mask from the interrupt
handler but relies on ->thaw() to clear them.  When the hardreset was
there, the reset action was never skipped and the port was always
thawed but, with the hardreset gone, ->prereset() determines that
there's no need for softreset and both ->softreset() and ->thaw() are
skipped.  This leads to stuck hotplug event in the IRQ status register
triggering hotplug event whenever IRQ is delieverd on the same IRQ.
As the controller shares the same IRQ for both ports, this happens on
every IO if one port is occpupied and the other isn't.

This patch fixes the problem by making sure that the port is thawed on
reset-skip path.

bko#11615 reports this problem.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Robert Hancock <hancockrwd@gmail.com>
Reported-by: Dan Andresan <danyer@gmail.com>
Reported-by: Arne Woerner <arne_woerner@yahoo.com>
Reported-by: Stefan Lippers-Hollmann <s.L-H@gmx.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

sata_nv: fix module parameter description

Update MODULE_PARM_DESC for ADMA to reflect the fact that the
option is disabled by default.

Signed-off-by: Brandon Ehle <azverkan@yahoo.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

ahci: Add the Device IDs for MCP89 and remove IDs of MCP7B to/from ahci.c

Added the Device IDs for MCP89 AHCI controller.

Removed the IDs of MCP7B because this chipset had been cancelled.

Signed-off-by: Peer Chen <peerchen@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

libata: don't use on-stack sense buffer

sense_buffer is used as DMA target and shouldn't be allocated on
stack. Use ap->sector_buf instead. This problem is spotted by Chuck
Ebbert.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

libata: align ap->sector_buf

ap->sector_buf is used as DMA target and should at least be aligned on
cacheline. This caused problems on some embedded machines.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

libata: fix dma_unmap_sg misuse

libata passes the returned value of dma_map_sg() to
dma_unmap_sg(),which is the misuse of dma_unmap_sg().

DMA-mapping.txt says:

To unmap a scatterlist, just call:

pci_unmap_sg(pdev, sglist, nents, direction);

Again, make sure DMA activity has already finished.

PLEASE NOTE:  The 'nents' argument to the pci_unmap_sg call must be
              the _same_ one you passed into the pci_map_sg call,
      it should _NOT_ be the 'count' value _returned_ from the
              pci_map_sg call.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

libata: change drive ready wait after hard reset to 5s

This fixes problems during resume with drives that take longer than 1s to
be ready. The ATA-6 spec appears to allow 5 seconds for a drive to be
ready.

On one affected system, this patch changes "PM: resume devices took..."
message from 17 seconds to 4 seconds, and gets rid of a lot of ugly
timeout/error messages.

Without this patch, the libata code moves on after 1s, tries to send a
soft reset (which the drive doesn't see because it isn't ready) which also
times out, then an IDENTIFY command is sent to the drive which times out,
and finally the error handler will try to send another hard reset which
will finally get things working.

Signed-off-by: Stuart Hayes <stuart_hayes@dell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

Blackfin arch: cleanup bfin_sport.h header and export it to userspace

Signed-off-by: Alon Bar-Lev <alon.barlev@gmail.com>
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

loop: don't increment p->offset with (size_t) -EINVAL

Upon a 'transfer error block' size is set to -EINVAL, but this becomes positive
since size is unsigned: p->offset still gets incremented.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

cciss: remove 30 second initial timeout on controller reset

Commit 5e4c91c84b194b26cf592779e451f4b5be777cba forgot to remove the
initial sleep, get rid of it.

Thanks to Randy Dunlap <randy.dunlap@oracle.com> for spotting this error.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

Fix kernel NULL pointer dereference in xen-blkfront

When booting Xen Dom0 on a pre-release 3.2.1 hypervisor the system Oopses on a
"Unable to handle kernel NULL pointer dereference" in xenwatch.

From the backtrace it looks like backend_changed is calling bdget_disk
with a NULL pointer. Checking for NULL and returning ENODEV instead
allows the kernel to boot.

Blackfin arch: fix bug - gdb signull case make trunk kernel panic frequently

Use copy_to_user_page and copy_from_user_page instead of
memcpy. copy_to_user_page does cache flush when necessary.

Signed-off-by: Jie Zhang <jie.zhang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

Blackfin arch: remove spurious dash when dcache is off

Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

Blackfin arch: mark init_pda as __init as only __init funcs all it

Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

Blackfin arch: fix bug - On bf548-ezkit, ethernet fails to work after wakeup from "mem"

Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

Blackfin arch: Random read/write errors are a bad thing

Random read/write errors are a bad thing - so don't let anyone
(including the test bench) run on something we know is bad.

Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

bonding: Fix device passed into ->ndo_neigh_setup().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

Blackfin arch: update default kernel config, select KSZ8893M driver for BF518

Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

Blackfin arch: Fix bug - KGDB single step into the middle of a 4 bytes instruction on bf561 after soft bp is hit

Run IFLUSH twice to avoid loading wrong instruction
after invalidating icache and following sequence is met.

1) The one instruction address is cached in the icache.
2) This instruction in SDRAM is changed.
3) IFLASH[P0] is executed only once in lackfin_icache_flush_range().
4) This instruction is executed again, but not the changed new one.

Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

Blackfin arch: Fix bug - make ksz8893m driver available when bfin_mac is enabled

Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

Blackfin arch: make sure people do not set the kernel load address too high

Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

Merge branch 'master' of /home/davem/src/GIT/linux-2.6/

vlan: Fix vlan-in-vlan crashes.

As analyzed by Patrick McHardy, vlan needs to reset it's
netdev_ops pointer in it's ->init() function but this
leaves the compat method pointers stale.

Add a netdev_resync_ops() and call it from the vlan code.

Any other driver which changes ->netdev_ops after register_netdevice()
will need to call this new function after doing so too.

With help from Patrick McHardy.

Tested-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Fix missing dev->neigh_setup in register_netdevice().

Signed-off-by: David S. Miller <davem@davemloft.net>

Blackfin arch: fix bug - The SPORT_HYS bit is not set for BF561 0.5

IMHO the setting should depend on ANOMALY_05000305 which is about the
availability of the bit, not ANOMALY_05000265 which only describes the
SPORT sensitivity to noise (checked for BF561 only, though).

If that's not true for other BF variants, maybe the definition of
ANOMALY_05000265 for BF561 should be changed to '(1)' instead.

Signed-off-by: Enrik Berkhan <Enrik.Berkhan@ge.com>
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

tmspci: fix request_irq race

Currently, tmspci tokenring driver crashes on device initialization
because it requests its irq before initializing corresponding data
structures. Fix this by moving request_irq call to a safer place.

Signed-off-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>

pkt_sched: act_police: Fix a rate estimator test.

A commit c1b56878fb68e9c14070939ea4537ad4db79ffae "tc: policing requires
a rate estimator" introduced a test which invalidates previously working
configs, based on examples from iproute2: doc/actions/actions-general.
This is too rigorous: a rate estimator is needed only when police's
"avrate" option is used.

Reported-by: Joao Correia <joaomiguelcorreia@gmail.com>
Diagnosed-by: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Squashfs: frag_size should be signed, as it can hold an error result

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>

Squashfs: fix documentation typo, Cramfs filesystem limit is 256 MiB

Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>

Squashfs: Fix oops when reading fsfuzzer corrupted filesystems

This fixes a code regression caused by the recent mainlining changes.
The recent code changes call zlib_inflate repeatedly, decompressing into
separate 4K buffers, this code didn't check for the possibility that
zlib_inflate might ask for too many buffers when decompressing corrupted
data.

Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>