git.karo-electronics.de Git - karo-tx-linux.git/log

md: restart recovery cleanly after device failure.

When we get any IO error during a recovery (rebuilding a spare), we abort
the recovery and restart it.

For RAID6 (and multi-drive RAID1) it may not be best to restart at the
beginning: when multiple failures can be tolerated, the recovery may be
able to continue and re-doing all that has already been done doesn't make
sense.

We already have the infrastructure to record where a recovery is up to
and restart from there, but it is not being used properly.
This is because:
  - We sometimes abort with MD_RECOVERY_ERR rather than just MD_RECOVERY_INTR,
    which causes the recovery not be be checkpointed.
  - We remove spares and then re-added them which loses important state
    information.

The distinction between MD_RECOVERY_ERR and MD_RECOVERY_INTR really isn't
needed.  If there is an error, the relevant drive will be marked as
Faulty, and that is enough to ensure correct handling of the error.  So we
first remove MD_RECOVERY_ERR, changing some of the uses of it to
MD_RECOVERY_INTR.

Then we cause the attempt to remove a non-faulty device from an array to
fail (unless recovery is impossible as the array is too degraded).  Then
when remove_and_add_spares attempts to remove the devices on which
recovery can continue, it will fail, they will remain in place, and
recovery will continue on them as desired.

Issue:  If we are halfway through rebuilding a spare and another drive
fails, and a new spare is immediately available,  do we want to:
1/ complete the current rebuild, then go back and rebuild the new spare or
2/ restart the rebuild from the start and rebuild both devices in
    parallel.

Both options can be argued for.  The code currently takes option 2 as
  a/ this requires least code change
  b/ this results in a minimally-degraded array in minimal time.

Cc: "Eivind Sarto" <ivan@kasenna.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: allow parallel resync of md-devices.

In some configurations, a raid6 resync can be limited by CPU speed
(Calculating P and Q and moving data) rather than by device speed. In
these cases there is nothing to be gained byt serialising resync of arrays
that share a device, and doing the resync in parallel can provide benefit.
So add a sysfs tunable to flag an array as being allowed to resync in
parallel with other arrays that use (a different part of) the same device.

Signed-off-by: Bernd Schubert <bs@q-leap.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: notify userspace on 'stop' events

This additional notification to 'array_state' is needed to allow the
monitor application to learn about stop events via sysfs. The
sysfs_notify("sync_action") call that comes at the end of do_md_stop()
(via md_new_event) is insufficient since the 'sync_action' attribute has
been removed by this point.

(Seems like a sysfs-notify-on-removal patch is a better fix. Currently
removal updates the event count but does not wake up waiters)

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: notify userspace on 'write-pending' changes to array_state

When an array enters write pending, 'array_state' changes, so we must be
sure to sysfs_notify.

Also, when waiting for user-space to acknowledge 'write-pending' by
marking the metadata as dirty, we don't want to wait for MD_CHANGE_DEVS to
be cleared as that might not happen. So explicity test for the bits that
we are really interested in.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: raid1: Fix restoration of bio between failed read and write.

When performing a "recovery" or "check" pass on a RAID1 array, we read
from each device and possible, if there is a difference or a read error,
write back to some devices.

We use the same 'bio' for both read and write, resetting various fields
between the two operations.

We forgot to reset bv_offset and bv_len however. These are often left
unchanged, but in the case where there is an IO error one or two sectors
into a page, they are changed.

This results in correctable errors not being corrected properly. It does
not result in any data corruption.

Cc: "Fairbanks, David" <David.Fairbanks@stratus.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: md: raid5 rate limit error printk

Last night we had scsi problems and a hardware raid unit was offlined
during heavy i/o. While this happened we got for about 3 minutes a huge
number messages like these

Apr 12 03:36:07 pfs1n14 kernel: [197510.696595] raid5:md7: read error not correctable (sector 2993096568 on sdj2).

I guess the high error rate is responsible for not scheduling other events
- during this time the system was not pingable and in the end also other
devices run into scsi command timeouts causing problems on these unrelated
devices as well.

Signed-off-by: Bernd Schubert <bernd-schubert@gmx.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: kill file_path wrapper

Kill the trivial and rather pointless file_path wrapper around d_path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: proper extern for mdp_major

This patch adds a proper extern for mdp_major in include/linux/raid/md.h

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

md: fix possible oops when removing a bitmap from an active array

It is possible to add a write-intent bitmap to an active array, or remove
the bitmap that is there.

When we do with the 'quiesce' the array, which causes make_request to
block in "wait_barrier()".

However we are sampling the value of "mddev->bitmap" before the
wait_barrier call, and using it afterwards. This can result in using a
bitmap structure that has been freed.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: fix atomic_t overflow in vm

The atomic_t type is 32bit but a 64bit system can have more than 2^32
pages of virtual address space available. Without this we overflow on
ludicrously large mappings

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

types.h: don't expose struct ustat to userspace

<linux/types.h> can't be used together with <sys/ustat.h> because they
both define struct ustat:

    $ cat test.c
    #include <sys/ustat.h>
    #include <linux/types.h>
    $ gcc -c test.c
    In file included from test.c:2:
    /usr/include/linux/types.h:165: error: redefinition of 'struct ustat'

has been reported a while ago to debian, but seems to have been
lost in cat fighting: http://bugs.debian.org/429064

Signed-off-by: maximilian attems <max@stro.at>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

serial: support for InstaShield IS-400 four port RS-232 PCI card

Add support for the InstaShield IS-400 four port RS-232 PCI card.

Signed-off-by: Ignacio García Pérez <iggarpe@t2i.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fix parenthesis in include/asm-mips/mach-au1x00/au1000.h

Parenthesis fix in include/asm-mips/mach-au1x00/au1000.h

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fix parenthesis in include/asm-mips/gic.h

Parenthesis fix in include/asm-mips/gic.h

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ibmaem: new driver for power/energy/temp meters in IBM System X hardware

This driver reads IBM Active Energy Manager energy/temperature/power
sensors on IBM System X hardware.

[akpm@linux-foundation.org: fix printk warnings]
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Cc: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

i5k_amb: support Intel 5400 chipset

Minor rework to support the Intel 5400 chipset.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

hdaps: invert the axes for HDAPS on Lenovo R61i ThinkPads

Cc: "Mark M. Hoffman" <mhoffman@lightlink.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Jiri Kosina <jikos@jikos.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ntfs: le*_add_cpu conversion

replace all:
little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) +
expression_in_cpu_byteorder);
with:
leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder);
generated with semantic patch

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: don't drop a partial page in a zone's memory map size

In a zone's present pages number, account for all pages occupied by the
memory map, including a partial.

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: add util-linux-ng package

(akpm: we often deal with util-linux and I (at least) can never remember
where they hang out).

Signed-off-by: Karel Zak <kzak@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ecryptfs: fix missed mutex_unlock

Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fuse: fix bdi naming conflict

Fuse allocates a separate bdi for each filesystem, and registers them
in sysfs with "MAJOR:MINOR" of sb->s_dev (st_dev). This works fine for
anon devices normally used by fuse, but can conflict with an already
registered BDI for "fuseblk" filesystems, where sb->s_dev represents a
real block device. In particularl this happens if a non-partitioned
device is being mounted.

Fix by registering with a different name for "fuseblk" filesystems.

Thanks to Ioan Ionita for the bug report.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reported-by: Ioan Ionita <opslynx@gmail.com>
Tested-by: Ioan Ionita <opslynx@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: allow pfnmap ->fault()s

Take out an assertion to allow ->fault handlers to service PFNMAP regions.
This is required to reimplement .nopfn handlers with .fault handlers and
subsequently remove nopfn.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Jes Sorensen <jes@sgi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/mad: Fix kernel crash when .process_mad() returns SUCCESS|CONSUMED
  IPoIB: Test for NULL broadcast object in ipiob_mcast_join_finish()
  MAINTAINERS: Add cxgb3 and iw_cxgb3 NIC and iWARP driver entries
  IB/mlx4: Fix creation of kernel QP with max number of send s/g entries
  IB/mthca: Fix max_sge value returned by query_device
  RDMA/cxgb3: Fix uninitialized variable warning in iwch_post_send()
  IB/mlx4: Fix uninitialized-var warning in mlx4_ib_post_send()
  IB/ipath: Fix UC receive completion opcode for RDMA WRITE with immediate
  IB/ipath: Fix printk format for ipath_sdma_status

IB/mad: Fix kernel crash when .process_mad() returns SUCCESS|CONSUMED

If a low-level driver returns IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED,
handle_outgoing_dr_smp() doesn't clean up properly. The fix is to
kfree the local data and break, rather than falling through. This was
observed with the ipath driver, but could happen with any driver.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1027>.

Signed-off-by: Dave Olson <dave.olson@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] clarify license of freq_table.c
  [CPUFREQ] Remove documentation of removed ondemand tunable.
  [CPUFREQ] Crusoe: longrun cpufreq module reports false min freq
  [CPUFREQ] powernow-k8: improve error messages

remove debug printk from DRM suspend path

Not sure how this snuck upstream, but it really doesn't belong there. We
don't need a KERN_ERR printk in the suspend path to know what's going on (at
least not anymore).

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  [POWERPC] iSeries: Remove unused mail address
  [POWERPC] mpic: Fix use of uninitialized variable
  [POWERPC] Add kernstart_addr to list of allowed symbols in prom_init
  [POWERPC] Fix __set_fixmap() for STRICT_MM_TYPECHECKS
  [POWERPC] PS3: Fix memory hotplug

Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6

* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6:
  [XFS] Fix memory corruption with small buffer reads
  [XFS] Fix inode list allocation size in writeback.
  [XFS] Don't allow memory reclaim to wait on the filesystem in inode
  [XFS] Fix fsync() b0rkage.
  [XFS] Include linux/random.h in all builds, not just debug builds.

Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  stop_machine: make stop_machine_run more virtualization friendly
  doc: add a chapter about trylock functions [Bug 9011]
  modules: proper cleanup of kobject without CONFIG_SYSFS
  module loading ELF handling: use SELFMAG instead of numeric constant

fbdev: fix integer as NULL pointer warning

drivers/video/aty/atyfb_base.c:3359:26: warning: Using plain integer as NULL pointer
drivers/video/aty/radeon_base.c:2280:32: warning: Using plain integer as NULL pointer
drivers/video/matrox/matroxfb_base.h:203:25: warning: Using plain integer as NULL pointer
drivers/video/matrox/matroxfb_base.h:203:25: warning: Using plain integer as NULL pointer
drivers/video/sis/sis_main.c:5790:44: warning: Using plain integer as NULL pointer

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

scsi: fix integer as NULL pointer warning

drivers/scsi/aha152x.c:3585:60: warning: Using plain integer as NULL pointer
drivers/scsi/aha152x.c:3845:56: warning: Using plain integer as NULL pointer
drivers/scsi/qla1280.c:2814:37: warning: Using plain integer as NULL pointer
drivers/scsi/atp870u.c:750:47: warning: Using plain integer as NULL pointer
drivers/scsi/3w-9xxx.c:1281:36: warning: Using plain integer as NULL pointer
drivers/scsi/3w-9xxx.c:1293:36: warning: Using plain integer as NULL pointer
drivers/scsi/3w-9xxx.c:1301:35: warning: Using plain integer as NULL pointer
drivers/scsi/hptiop.c:447:10: warning: Using plain integer as NULL pointer
drivers/scsi/hptiop.c:457:10: warning: Using plain integer as NULL pointer
drivers/scsi/hptiop.c:479:24: warning: Using plain integer as NULL pointer
drivers/scsi/hptiop.c:483:22: warning: Using plain integer as NULL pointer
drivers/scsi/hptiop.c:1213:23: warning: Using plain integer as NULL pointer
drivers/scsi/hptiop.c:1214:23: warning: Using plain integer as NULL pointer

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

isdn: fix integer as NULL pointer warning

drivers/isdn/hysdn/hycapi.c:465:42: warning: Using plain integer as NULL pointer
drivers/isdn/hysdn/hycapi.c:467:44: warning: Using plain integer as NULL pointer
drivers/isdn/hysdn/hycapi.c:469:42: warning: Using plain integer as NULL pointer

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

acpi: fix integer as NULL pointer warning

drivers/acpi/dispatcher/dsmethod.c:568:50: warning: Using plain integer as NULL pointer
drivers/acpi/executer/exmutex.c:329:30: warning: Using plain integer as NULL pointer
drivers/acpi/executer/exmutex.c:466:31: warning: Using plain integer as NULL pointer

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

x86: fix integer as NULL pointer warning

arch/x86/boot/printf.c:59:10: warning: Using plain integer as NULL pointer

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[XFS] Fix memory corruption with small buffer reads

When we have multiple buffers in a single page for a blocksize == pagesize
filesystem we might overwrite the page contents if two callers hit it
shortly after each other. To prevent that we need to keep the page locked
until I/O is completed and the page marked uptodate.

Thanks to Eric Sandeen for triaging this bug and finding a reproducible
testcase and Dave Chinner for additional advice.

This should fix kernel.org bz #10421.

Tested-by: Eric Sandeen <sandeen@sandeen.net>
SGI-PV: 981813
SGI-Modid: xfs-linux-melb:xfs-kern:31173a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

[POWERPC] iSeries: Remove unused mail address

I don't use my IBM email address normally and people can find me in
CREDITS.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>

[POWERPC] mpic: Fix use of uninitialized variable

Compiling ppc64_defconfig with gcc 4.3 gives thes warnings:

arch/powerpc/sysdev/mpic.c: In function 'mpic_irq_get_priority':
arch/powerpc/sysdev/mpic.c:1351: warning: 'is_ipi' may be used uninitialized in this function
arch/powerpc/sysdev/mpic.c: In function 'mpic_irq_set_priority':
arch/powerpc/sysdev/mpic.c:1328: warning: 'is_ipi' may be used uninitialized in this function

It turns out that in the cases where is_ipi is uninitialized, another
variable (mpic) will be NULL and it is dereferenced. Protect against
this by returning if mpic is NULL in mpic_irq_set_priority, and removing
mpic_irq_get_priority completely as it has no in tree callers.

This has the nice side effect of making the warning go away.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>

[POWERPC] Add kernstart_addr to list of allowed symbols in prom_init

Since commit "85xx: Add support for relocatable kernel (and
booting at non-zero)" (37dd2badcfcec35f5e21a0926968d77a404f03c3),
PHYSICAL_START is #defined as kernstart_addr if RELOCATABLE
and FLATMEM is enabled.

PHYSICAL_START is used in prom_init.c and so kernstart_addr
needs to be added to the list of allowed symbols that
prom_init.c can access.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>

[POWERPC] Fix __set_fixmap() for STRICT_MM_TYPECHECKS

__set_fixmap() in pgtable_32.c currently fails to compile if
STRICT_MM_TYPECHECKS is defined. This fixes it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>

[POWERPC] PS3: Fix memory hotplug

A change was made to walk_memory_resource() in commit
4b119e21d0c66c22e8ca03df05d9de623d0eb50f that added a
check of find_lmb(). Add the coresponding lmb_add()
call to ps3_mm_add_memory() so that that check will
succeed.

This fixes the condition where the PS3 boots up with
only the 128 MiB of boot memory, and doesn't see the
other 128MiB that is available.

Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>

[XFS] Fix inode list allocation size in writeback.

We only need to allocate space for the number of inodes in the cluster
when writing back inodes, not every byte in the inode cluster. This
reduces the amount of memory needing to be allocated to 256 bytes instead
of 64k.

SGI-PV: 981949
SGI-Modid: xfs-linux-melb:xfs-kern:31182a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

[XFS] Don't allow memory reclaim to wait on the filesystem in inode
writeback

If we allow memory reclaim to wait on the pages under writeback in inode
cluster writeback we could deadlock because we are currently holding the
ILOCK on the initial writeback inode which is needed in data I/O
completion to change the file size or do unwritten extent conversion
before the pages are taken out of writeback state.

SGI-PV: 981091
SGI-Modid: xfs-linux-melb:xfs-kern:31015a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

[XFS] Fix fsync() b0rkage.

xfs_fsync() fails to wait for data I/O completion before checking if the
inode is dirty or clean to decide whether to log the inode or not. This
misses inode size updates when the data flushed by the fsync() is
extending the file.

Hence, like fdatasync(), we need to wait for I/o completion first, then
check the inode for cleanliness. Doing so makes the behaviour of
xfs_fsync() identical for fsync and fdatasync and we *always* use
synchronous semantics if the inode is dirty. Therefore also kill the
differences and remove the unused flags from the xfs_fsync function and
callers.

SGI-PV: 981296
SGI-Modid: xfs-linux-melb:xfs-kern:31033a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into for-linus

stop_machine: make stop_machine_run more virtualization friendly

On kvm I have seen some rare hangs in stop_machine when I used more guest
cpus than hosts cpus. e.g. 32 guest cpus on 1 host cpu triggered the
hang quite often. I could also reproduce the problem on a 4 way z/VM host with
a 64 way guest.

It turned out that the guest was consuming all available cpus mostly for
spinning on scheduler locks like rq->lock. This is expected as the threads are
calling yield all the time.
The problem is now, that the host scheduling decisings together with the guest
scheduling decisions and spinlocks not being fair managed to create an
interesting scenario similar to a live lock. (Sometimes the hang resolved
itself after some minutes)

Changing stop_machine to yield the cpu to the hypervisor when yielding inside
the guest fixed the problem for me. While I am not completely happy with this
patch, I think it causes no harm and it really improves the situation for me.

I used cpu_relax for yielding to the hypervisor, does that work on all
architectures?

p.s.: If you want to reproduce the problem, cpu hotplug and kprobes use
stop_machine_run and both triggered the problem after some retries.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

doc: add a chapter about trylock functions [Bug 9011]

Add a chapter about trylock functions.
http://bugzilla.kernel.org/show_bug.cgi?id=9011

Signed-off-by: Matti Linnanvuori <mattilinnanvuori@yahoo.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (removed down_trylock)

modules: proper cleanup of kobject without CONFIG_SYSFS

kobject: '<NULL>' (ffffffffa0104050): is not initialized, yet kobject_put() is being called.
------------[ cut here ]------------
WARNING: at /home/den/src/linux-netns26/lib/kobject.c:583 kobject_put+0x53/0x55()
Modules linked in: ipv6 nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs ide_cd_mod cdrom button [last unloaded: pktgen]
comm: rmmod Tainted: G        W 2.6.26-rc3 #585
Call Trace:
  [<ffffffff802359ab>] warn_on_slowpath+0x58/0x7a
  [<ffffffff80236aca>] ? printk+0x67/0x69
  [<ffffffff80236aca>] ? printk+0x67/0x69
  [<ffffffff80324289>] kobject_put+0x53/0x55
  [<ffffffff8025e2ee>] free_module+0x87/0xfa
  [<ffffffff8025fee5>] sys_delete_module+0x178/0x1e1
  [<ffffffff804b1e70>] ? lockdep_sys_exit_thunk+0x35/0x67
  [<ffffffff804b1dff>] ? trace_hardirqs_on_thunk+0x35/0x3a
  [<ffffffff8020c0bb>] system_call_after_swapgs+0x7b/0x80
---[ end trace 8f5aafa7f6406cf8 ]---

mod->mkobj.kobj is not initialized without CONFIG_SYSFS. Do not call
kobject_put in this case.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

module loading ELF handling: use SELFMAG instead of numeric constant

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

[CPUFREQ] clarify license of freq_table.c

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] Fix reversed memset arguments
  Adds username in the upcall key for unattended mounts with keytab
  [CIFS] Remove redundant NULL check

[CIFS] Fix reversed memset arguments

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

Adds username in the upcall key for unattended mounts with keytab

Signed-off-by: Igor Mammedov <niallain@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  net: The world is not perfect patch.
  tcp: Make prior_ssthresh a u32
  xfrm_user: Remove zero length key checks.
  net/ipv4/arp.c: Use common hex_asc helpers
  cassini: Only use chip checksum for ipv4 packets.
  tcp: TCP connection times out if ICMP frag needed is delayed
  netfilter: Move linux/types.h inclusions outside of #ifdef __KERNEL__
  af_key: Fix selector family initialization.
  libertas: Fix ethtool statistics
  mac80211: fix NULL pointer dereference in ieee80211_compatible_rates
  mac80211: don't claim iwspy support
  orinoco_cs: add ID for SpeedStream wireless adapters
  hostap_cs: add ID for Conceptronic CON11CPro
  rtl8187: resource leak in error case
  ath5k: Fix loop variable initializations

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc64: Prevent stack backtrace false positives on trap frames.
  sparc64: Fix stack tracing through trap frames.
  sparc64: Fix kernel thread stack termination.
  sunhv: Fix locking in non-paged I/O case.

sparc64: Prevent stack backtrace false positives on trap frames.

When we fully commit to returning back to kernel mode from
a trap, zero out the regs->magic value to prevent false
positives during stack backtraces.

Signed-off-by: David S. Miller <davem@davemloft.net>

[CIFS] Remove redundant NULL check

Noticed by Coverity checker.

Signed-off-by: Steve French <sfrench@us.ibm.com>

sparc64: Fix stack tracing through trap frames.

The offset to the pt_regs area was wrong, so we weren't
looking at the right location for the magic cookie.

A trap frame is composed of a "struct sparc_stackf" then
a "struct pt_regs", the code was using "struct reg_window"
instead of "struct sparc_stackf".

Signed-off-by: David S. Miller <davem@davemloft.net>

sparc64: Fix kernel thread stack termination.

Because of the silly way I set up the initial stack for
new kernel threads, there is a loop at the top of the
stack.

To fix this, properly add another stack frame that is copied
from the parent and terminate it in the child by setting
the frame pointer in that frame to zero.

Signed-off-by: David S. Miller <davem@davemloft.net>

net: The world is not perfect patch.

Unless there will be any objection here, I suggest consider the
following patch which simply removes the code for the
-DI_WISH_WORLD_WERE_PERFECT in the three methods which use it.

The compilation errors we get when using -DI_WISH_WORLD_WERE_PERFECT
show that this code was not built and not used for really a long time.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: Make prior_ssthresh a u32

If previous window was above representable values of u16,
strange things will happen if undo with the truncated value
is called for. Alternatively, this could be fixed by some
max trickery but that would limit undoing high-speed undos.

Adds 16-bit hole but there isn't anything to fill it with.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

xfrm_user: Remove zero length key checks.

The crypto layer will determine whether that is valid
or not.

Suggested by Herbert Xu, based upon a report and patch
by Martin Willi.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

net/ipv4/arp.c: Use common hex_asc helpers

Here the local hexbuf is a duplicate of global const char hex_asc from
lib/hexdump.c, except the hex letters' cases:

const char hexbuf[] = "0123456789ABCDEF";

const char hex_asc[] = "0123456789abcdef";

and here to print HW addresses, the hex cases are not significant.

Thanks to Harvey Harrison to introduce the hex_asc_hi/hex_asc_lo helpers.

Signed-off-by: Denis Cheng <crquan@gmail.com>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cassini: Only use chip checksum for ipv4 packets.

According to David Monro, at least with Natsemi Saturn chips the
cassini driver has some trouble with ipv6 checksums.

Until we have more information about what's going on here, only
use the chip checksums for ipv4.

This workaround was suggested and tested by David.

Update version and release date.

Signed-off-by: David S. Miller <davem@davemloft.net>

HTC_EGPIO is ARM-only

driver uses symbols defined only on ARM

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

provide out-of-line strcat() for m68k

Whether we sidestep it in init/main.c or not, such situations
will arise again; compiler does generate calls of strcat()
on optimizations, so we really ought to have an out-of-line
version...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

caiaq endianness fix

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MODULE_LICENSE expects "GPL v2", not "GPLv2"

... and we have few enough places using the latter to make it
simpler to do search and replace...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

msnd_* is ISA-only

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

missing dependencies on HAS_DMA

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2 endianness fixes

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

irda-usb endianness annotations and fixes

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sbus bpp: instances missed in s/dev_name/bpp_dev_name/

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ecryptfs fixes

memcpy() from userland pointer is a Bad Thing(tm)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

misc drivers/net endianness noise

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

thanks to net/mac80211 we need to pull drivers/leds/Kconfig on uml

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

missing export of csum_partial() on uml/amd64

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

uml: add missing exports for UML_RANDOM=m

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fix hppfs Makefile breakage

Fallout from commit 46d7b522ebf486edbd096965d534cc6465e9e309 ("uml: move
hppfs_kern.c to hppfs.c")

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fix include order in sys-i386/registers.c

We want sys/ptrace.h before any includes of linux/ptrace.h and
asm/user.h pulls the latter.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

missed kmalloc() in pcap_user.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

tcp: TCP connection times out if ICMP frag needed is delayed

We are seeing an issue with TCP in handling an ICMP frag needed
message that is received after net.ipv4.tcp_retries1 retransmits.
The default value of retries1 is 3. So if the path mtu changes
and ICMP frag needed is lost for the first 3 retransmits or if
it gets delayed until 3 retransmits are done, TCP doesn't update
MSS correctly and continues to retransmit the orginal message
until it timesout after tcp_retries2 retransmits.

I am seeing this issue even with the latest 2.6.25.4 kernel.

In tcp_retransmit_timer(), when retransmits counter exceeds
tcp_retries1 value, the dst cache entry of the socket is reset.
At this time, if we receive an ICMP frag needed message, the
dst entry gets updated with the new MTU, but the TCP sockets
dst_cache entry remains NULL.

So the next time when we try to retransmit after the ICMP frag
needed is received, tcp_retransmit_skb() gets called. Here the
cur_mss value is calculated at the start of the routine with
a NULL sk_dst_cache. Instead we should call tcp_current_mss after
the rebuild_header that caches the dst entry with the updated mtu.
Also the rebuild_header should be called before tcp_fragment
so that skb is fragmented if the mss goes down.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netfilter: Move linux/types.h inclusions outside of #ifdef __KERNEL__

Greg Steuck <greg@nest.cx> points out that some of the netfilter
headers can't be used in userspace without including linux/types.h
first. The headers include their own linux/types.h include statements,
these are stripped by make headers-install because they are inside
#ifdef __KERNEL__ however. Move them out to fix this.

Reported and Tested by Greg Steuck.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

af_key: Fix selector family initialization.

This propagates the xfrm_user fix made in commit
bcf0dda8d2408fe1c1040cdec5a98e5fcad2ac72 ("[XFRM]: xfrm_user: fix
selector family initialization")

Based upon a bug report from, and tested by, Alan Swanson.

Signed-off-by: Kazunori MIYAZAWA <kazunori@miyazawa.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

sunhv: Fix locking in non-paged I/O case.

This causes the lock to be taken twice, thus resulting in
a deadlock.

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (21 commits)
  [CIFS] Remove debug statement
  Fix possible access to undefined memory region.
  [CIFS] Enable DFS support for Windows query path info
  [CIFS] Enable DFS support for Unix query path info
  [CIFS] add missing seq_printf to cifs_show_options for hard mount option
  [CIFS] add more complete mount options to cifs_show_options
  [CIFS] Add missing defines for DFS
  CIFSGetDFSRefer cleanup + dfs_referral_level_3 fixed to conform REFERRAL_V3 the MS-DFSC spec.
  Fixed DFS code to work with new 'build_path_from_dentry', that returns full path if share in the dfs, now.
  [CIFS] enable parsing for transport encryption mount parm
  [CIFS] Finishup DFS code
  [CIFS] BKL-removal: convert CIFS over to unlocked_ioctl
  [CIFS] suppress duplicate warning
  [CIFS] Fix paths when share is in DFS to include proper prefix
  add function to convert access flags to legacy open mode
  clarify return value of cifs_convert_flags()
  [CIFS] don't explicitly do a FindClose on rewind when directory search has ended
  [CIFS] cleanup old checkpatch warnings
  [CIFS] CIFSSMBPosixLock should return -EINVAL on error
  fix memory leak in CIFSFindNext
  ...

[CIFS] Remove debug statement

Signed-off-by: Steve French <sfrench@us.ibm.com>

Fix possible access to undefined memory region.

Signed-off-by: Igor Mammedov <niallain@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6

libertas: Fix ethtool statistics

Fix various problems:
- We converted MESH_ACCESS to a direct command but missed this caller.
- We were trying to access mesh stats even on meshless firmware.
- We should really zero the buffer if something goes wrong.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

mac80211: fix NULL pointer dereference in ieee80211_compatible_rates

Fix a possible NULL pointer dereference in ieee80211_compatible_rates
introduced in the patch "mac80211: fix association with some APs". If no bss
is available just use all supported rates in the association request.

Signed-off-by: Helmut Schaa <hschaa@suse.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

Merge branch 'for-2.6.26' of git://linux-nfs.org/~bfields/linux

* 'for-2.6.26' of git://linux-nfs.org/~bfields/linux: (25 commits)
  svcrdma: Verify read-list fits within RPCSVC_MAXPAGES
  svcrdma: Change svc_rdma_send_error return type to void
  svcrdma: Copy transport address and arm CQ before calling rdma_accept
  svcrdma: Set rqstp transport address in rdma_read_complete function
  svcrdma: Use ib verbs version of dma_unmap
  svcrdma: Cleanup queued, but unprocessed I/O in svc_rdma_free
  svcrdma: Move the QP and cm_id destruction to svc_rdma_free
  svcrdma: Add reference for each SQ/RQ WR
  svcrdma: Move destroy to kernel thread
  svcrdma: Shrink scope of spinlock on RQ CQ
  svcrdma: Use standard Linux lists for context cache
  svcrdma: Simplify RDMA_READ deferral buffer management
  svcrdma: Remove unused READ_DONE context flags bit
  svcrdma: Return error from rdma_read_xdr so caller knows to free context
  svcrdma: Fix error handling during listening endpoint creation
  svcrdma: Free context on post_recv error in send_reply
  svcrdma: Free context on ib_post_recv error
  svcrdma: Add put of connection ESTABLISHED reference in rdma_cma_handler
  svcrdma: Fix return value in svc_rdma_send
  svcrdma: Fix race with dto_tasklet in svc_rdma_send
  ...

[CPUFREQ] Remove documentation of removed ondemand tunable.

sampling_down_factor was removed in ccb2fe209dac9ff67f6351e783e610073afaaeaf
back in June 2006.

Signed-off-by: Dave Jones <davej@redhat.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
  pktgen: make sure that pktgen_thread_worker has been executed
  [VLAN]: Propagate selected feature bits to VLAN devices
  drivers/atm/: remove CVS keywords
  vlan: Correctly handle device notifications for layered VLAN devices
  net: Fix call to ->change_rx_flags(dev, IFF_MULTICAST) in dev_change_flags()
  net_sched: cls_api: fix return value for non-existant classifiers
  ipsec: Use the correct ip_local_out function
  ipv6 addrconf: Allow infinite prefix lifetime.
  ipv6 route: Fix lifetime in netlink.
  ipv6 addrconf: Fix route lifetime setting in corner case.
  ndisc: Add missing strategies for per-device retrans timer/reachable time settings.
  ipv6: Move <linux/in6.h> from header-y to unifdef-y.
  l2tp: avoid skb truesize bug if headroom is increased
  wireless: Create 'device' symlink in sysfs
  wireless, airo: waitbusy() won't delay
  libertas: fix command timeout after firmware failure
  mac80211: Add RTNL version of ieee80211_iterate_active_interfaces
  mac80211 : Association with 11n hidden ssid ap.
  hostap: fix "registers" registration in procfs
  isdn/capi: Return proper errnos on module init.
  ...

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc64: Add global register dumping facility.
  sparc: remove CVS keywords
  sparc64: remove CVS keywords

Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  USB: CDC WDM driver
  USB: ehci-orion: the Orion EHCI root hub does have a Transaction Translator
  USB: serial: ch341: New VID/PID for CH341 USB-serial
  USB: build fix
  USB: pxa27x_udc - Fix Oops
  USB: OPTION: fix name of Onda MSA501HS HSDPA modem
  USB: add TELIT HDSPA UC864-E modem to option driver
  usb-serial: Use ftdi_sio driver for RATOC REX-USB60F

Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6:
  SCSI: fix race in device_create
  USB: Core: fix race in device_create
  USB: Phidget: fix race in device_create
  s390: fix race in device_create
  SOUND: fix race in device_create
  UIO: fix race in device_create
  Power Supply: fix race in device_create
  LEDS: fix race in device_create
  IB: fix race in device_create
  ide: fix race in device_create
  fbdev: fix race in device_create
  mm: bdi: fix race in bdi_class device creation
  Driver core: add device_create_vargs and device_create_drvdata

Merge branch 'from-tomtucker' into for-2.6.26

IPoIB: Test for NULL broadcast object in ipiob_mcast_join_finish()

We saw a kernel oops in our regression testing when a multicast "join
finish" occurred just after the interface was -- this is
<https://bugs.openfabrics.org/show_bug.cgi?id=1040>. The test
randomly causes the HCA physical port to go down then up.

The cause of this is that ipoib_mcast_join_finish() processing happen
just after ipoib_mcast_dev_flush() was invoked (in which case the
broadcast pointer is NULL). This patch tests for and handles the case
where priv->broadcast is NULL.

Cc: <stable@kernel.org>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

pktgen: make sure that pktgen_thread_worker has been executed

The following courruption can happen during pktgen stop:
list_del corruption. prev->next should be ffff81007e8a5e70, but was 6b6b6b6b6b6b6b6b
kernel BUG at lib/list_debug.c:67!
      :pktgen:pktgen_thread_worker+0x374/0x10b0
      ? autoremove_wake_function+0x0/0x40
      ? _spin_unlock_irqrestore+0x42/0x80
      ? :pktgen:pktgen_thread_worker+0x0/0x10b0
      kthread+0x4d/0x80
      child_rip+0xa/0x12
      ? restore_args+0x0/0x30
      ? kthread+0x0/0x80
      ? child_rip+0x0/0x12
RIP  list_del+0x48/0x70

The problem is that pktgen_thread_worker can not be executed if kthread_stop
has been called too early. Insert a completion on the normal initialization
path to make sure that pktgen_thread_worker will gain the control for sure.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>