git.karo-electronics.de Git - linux-beck.git/log

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (22 commits)
  Fix the race between capifs remount and node creation
  Fix races around the access to ->s_options
  switch ufs directories to ufs_sync_file()
  Switch open_exec() and sys_uselib() to do_open_filp()
  Make open_exec() and sys_uselib() use may_open(), instead of duplicating its parts
  Reduce path_lookup() abuses
  Make checkpatch.pl shut up on fs/inode.c
  NULL noise in fs/super.c:kill_bdev_super()
  romfs: cleanup romfs_fs.h
  ROMFS: romfs_dev_read() error ignored
  fs: dcache fix LRU ordering
  ocfs2: Use nd_set_link().
  Fix deadlock in ipathfs ->get_sb()
  Fix a leak in failure exit in 9p ->get_sb()
  Convert obvious places to deactivate_locked_super()
  New helper: deactivate_locked_super()
  reiserfs: remove privroot hiding in lookup
  reiserfs: dont associate security.* with xattr files
  reiserfs: fixup xattr_root caching
  Always lookup priv_root on reiserfs mount and keep it
  ...

Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
GFS2: Fix glock ref counting bug

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: hda - Fix line-in on Mac Mini Core2 Duo
  ALSA: Release v1.0.20
  sound: via82xx: fix DXS volume range
  sound: serial-u16550: fix buffer overflow
  ASoC: Fix errors in WM8990

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
  V4L/DVB (11680): cafe_ccic: use = instead of == for setting a value at a var
  V4L/DVB (11679): cafe_ccic: fix sensor detection
  V4L/DVB (11675): ivtv/radio: fix V4L2_TUNER_MODE/V4L2_TUNER_SUB confusion
  V4L/DVB (11674): ivtv: fix incorrect bit tests
  V4L/DVB (11669): uvc: fix compile warning
  V4L/DVB (11668): ivtv: fix compiler warning.
  V4L/DVB (11664): cx23885: Frontend wasn't locking on HVR-1500
  V4L/DVB (11662): v4l2-ioctl: Clear buffer type specific trailing fields/padding
  V4L/DVB (11661): v4l2-ioctl: Check buffer types using g_fmt instead of try_fmt
  V4L/DVB (11660): zoran: fix bug when enumerating format -1
  V4L/DVB (11575): uvcvideo: fix uvc resume failed

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits)
  bonding: fix panic if initialization fails
  IXP4xx: complete Ethernet netdev setup before calling register_netdev().
  IXP4xx: use "ENODEV" instead of "ENOSYS" in module initialization.
  ipvs: Fix IPv4 FWMARK virtual services
  ipv4: Make INET_LRO a bool instead of tristate.
  net: remove stale reference to fastroute from Kconfig help text
  net: update skb_recycle_check() for hardware timestamping changes
  bnx2: Fix panic in bnx2_poll_work().
  net-sched: fix bfifo default limit
  igb: resolve panic on shutdown when SR-IOV is enabled
  wimax: oops: wimax_dev_add() is the only one that can initialize the state
  wimax: fix oops if netlink fails to add attribute
  Bluetooth: Move dev_set_name() to a context that can sleep
  netfilter: ctnetlink: fix wrong message type in user updates
  netfilter: xt_cluster: fix use of cluster match with 32 nodes
  netfilter: ip6t_ipv6header: fix match on packets ending with NEXTHDR_NONE
  netfilter: add missing linux/types.h include to xt_LED.h
  mac80211: pid, fix memory corruption
  mac80211: minstrel, fix memory corruption
  cfg80211: fix comment on regulatory hint processing
  ...

Merge branch 'fix/asoc' into for-linus

* fix/asoc:
ASoC: Fix errors in WM8990

Merge branch 'fix/hda' into for-linus

* fix/hda:
ALSA: hda - Fix line-in on Mac Mini Core2 Duo

Merge branch 'topic/misc' into for-linus

* topic/misc:
ALSA: Release v1.0.20

Merge branch 'fix/misc' into for-linus

* fix/misc:
sound: via82xx: fix DXS volume range
sound: serial-u16550: fix buffer overflow

V4L/DVB (11680): cafe_ccic: use = instead of == for setting a value at a var

/home/v4l/master/v4l/cafe_ccic.c: In function 'cafe_cam_init':
/home/v4l/master/v4l/cafe_ccic.c:778: warning: statement with no effect

Cc: Hans Verkuil <hverkuil@xs4all.nl>
Cc: saeed bishara <saeed.bishara@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (11679): cafe_ccic: fix sensor detection

Due to an uninitialized chip.ident field the chip identification failed.

Thanks-to: Saeed Bishara <saeed.bishara@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (11675): ivtv/radio: fix V4L2_TUNER_MODE/V4L2_TUNER_SUB confusion

V4L2_TUNER_MODE_ was used in a few places where V4L2_TUNER_SUB_ should have
been used.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (11674): ivtv: fix incorrect bit tests

Found the coccinelle tool.

Thanks-to: Julia Lawall <julia@diku.dk>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (11669): uvc: fix compile warning

The 2.6.30 kernel generates this warning:

uvc_driver.c:1729: warning: 'ret' may be used uninitialized in this function

I guess some new warning flag must have been turned on since this warning
didn't appear with older kernels (gcc version 4.3.1). It's also a bogus
warning, but since this code didn't comply to the coding standard anyway
I've modified it to 1) remove the warning and 2) conform to the coding
standard.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (11668): ivtv: fix compiler warning.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (11664): cx23885: Frontend wasn't locking on HVR-1500

The boards control struct wasn't updated when (presumably) all of the
other drivers migrated from using scode_table to specifying the demod.

Signed-off-by: Steven Toth <stoth@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (11662): v4l2-ioctl: Clear buffer type specific trailing fields/padding

Some ioctls have structs that are a different size depending on what type
of buffer is being used. If the buffer type leaves a field unused or has
padding space at the end, this space should be zeroed out.

The problems with S_FMT and REQBUFS were original identified and patched by
Marton Nemeth <nm127@freemail.hu>.

Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (11661): v4l2-ioctl: Check buffer types using g_fmt instead of try_fmt

For a number of different ioctls, the v4l2-ioctl code checks that the
passed buffer type is supported by the driver.  It did this by checking
that the driver defined a method for the try_fmt handler for that buffer
type.  However, try_fmt is optional and a driver might not provide it even
though it does support that type.  So use g_fmt instead, since that isn't
optional.

This should fix a problem with VBI capture with saa7146.

Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (11660): zoran: fix bug when enumerating format -1

If someone requests a format at fmt->index == (unsigned)-1 and the first
format in the array doesn't have the requested type then num will still be
-1 when it's compared to fmt->index and there will appear to be a match.

Restructure the loop so this can't happen. It's simpler this way too. The
unnecessary check for (unsigned)fmt->index < 0 found by Roel Kluin
<roel.kluin@gmail.com> is removed this way too.

Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (11575): uvcvideo: fix uvc resume failed

Now urb buffers is not freed before suspend, so uvc_alloc_urb_buffers should
return packet counts allocated originally during uvc resume, instead of zero.

This version uses round down to return packet counts on Linus' suggestions,
or else may lead to buffer destructed if packet size is changed before
calling uvc_alloc_urb_buffers() in this kind of case.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Acked-by: Laurent Pinchart <laurent.pinchart@skynet.be>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

Merge branch 'net-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/chris/linux-2.6

bonding: fix panic if initialization fails

If module initialisation failed (e.g. because the bonding sysfs entry
cannot be created), kernel panics:
IP: [<ffffffff8024910a>] destroy_workqueue+0x2d/0x146
Call Trace:
[<ffffffff808268c4>] bond_destructor+0x28/0x78
[<ffffffff80b64471>] netdev_run_todo+0x231/0x25a
[<ffffffff80b6dbcd>] rtnl_unlock+0x9/0xb
[<ffffffff81567907>] bonding_init+0x83e/0x84a

Remove the calls to bond_work_cancel_all() and destroy_workqueue();
both are also called/scheduled via bond_free_all().

bond_destroy_sysfs is unecessary because the sysfs entry has
not been created in the error case.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Fix the race between capifs remount and node creation

we don't want to deal with half-updated config

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Fix races around the access to ->s_options

Put generic_show_options read access to s_options under rcu_read_lock,
split save_mount_options() into "we are setting it the first time"
(uses in foo_fill_super()) and "we are relacing and freeing the old one",
synchronize_rcu() before kfree() in the latter.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

switch ufs directories to ufs_sync_file()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Switch open_exec() and sys_uselib() to do_open_filp()

... and make path_lookup_open() static

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Make open_exec() and sys_uselib() use may_open(), instead of duplicating its parts

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Reduce path_lookup() abuses

... use kern_path() where possible

[folded a fix from rdd]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Make checkpatch.pl shut up on fs/inode.c

Code Quality According To Mingo(tm) has been vastly improved,
no code has been damaged^Wchanged^Wdamaged.

[commit message rewritten -- AV]

Signed-off-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

NULL noise in fs/super.c:kill_bdev_super()

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Subrata Modak <subrata@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

romfs: cleanup romfs_fs.h

There's no kernel-only content in it anymore, so move it to header-y
and remove the superflous #ifdef __KERNEL__.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ROMFS: romfs_dev_read() error ignored

romfs_dev_read() may return -EIO, but ret is unsigned, so the errorpath
isn't taken.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs: dcache fix LRU ordering

Fix ordering of LRU when moving referenced dentries to the head of the list
(they should go to the head of the list in the same order as they were found
from the tail, rather than reverse order).

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ocfs2: Use nd_set_link().

ocfs2 was hand-calling vfs_follow_link(), but there's no point to that.
Let's use page_follow_link_light() and nd_set_link().

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Fix deadlock in ipathfs ->get_sb()

forgot to unlock superblock before calling deactivate_super()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Fix a leak in failure exit in 9p ->get_sb()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Convert obvious places to deactivate_locked_super()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

New helper: deactivate_locked_super()

Does equivalent of up_write(&s->s_umount); deactivate_super(s);
However, it does not does not unlock it until it's all over.
As the result, it's safe to use to dispose of new superblock on ->get_sb()
failure exits - nobody will see the sucker until it's all over.
Equivalent using up_write/deactivate_super is safe for that purpose
if superblock is either safe to use or has NULL ->s_root when we unlock.
Normally filesystems take the required precautions, but
a) we do have bugs in that area in some of them.
b) up_write/deactivate_super sequence is extremely common,
so the helper makes sense anyway.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

reiserfs: remove privroot hiding in lookup

With Al Viro's patch to move privroot lookup to fs mount, there's no need
to have special code to hide the privroot in reiserfs_lookup.

I've also cleaned up the privroot hiding in reiserfs_readdir_dentry and
removed the last user of reiserfs_xattrs().

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

reiserfs: dont associate security.* with xattr files

The security.* xattrs are ignored for xattr files, so don't create them.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

reiserfs: fixup xattr_root caching

The xattr_root caching was broken from my previous patch set. It wouldn't
cause corruption, but could cause decreased performance due to allocating
a larger chunk of the journal (~ 27 blocks) than it would actually use.

This patch loads the xattr root dentry at xattr initialization and creates
it on-demand. Since we're using the cached dentry, there's no point
in keeping lookup_or_create_dir around, so that's removed.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Always lookup priv_root on reiserfs mount and keep it

... even if it's a negative dentry. That way we can set ->d_op on
root before anyone could race with us. Simplify d_compare(), while
we are at it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

reiserfs: Expand i_mutex to enclose lookup_one_len

2.6.30-rc3 introduced some sanity checks in the VFS code to avoid NFS
bugs by ensuring that lookup_one_len is always called under i_mutex.

This patch expands the i_mutex locking to enclose lookup_one_len. This was
always required, but not not enforced in the reiserfs code since it
does locking around the xattr interactions with the xattr_sem.

This is obvious enough, and it survived an overnight 50 thread ACL test.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: umount_begin BKL pushdown

Push BKL down into ->umount_begin()

Signed-off-by: Alessio Igor Bogani <abogani@texware.it>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

GFS2: Fix glock ref counting bug

Depending on the ordering of events as we go around the
glock shrinker loop, it is possible to drop the ref count
of a glock incorrectly. It doesn't happen very often. This
patch corrects the got_ref variable, fixing the problem.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>

IXP4xx: complete Ethernet netdev setup before calling register_netdev().

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>

IXP4xx: use "ENODEV" instead of "ENOSYS" in module initialization.
ENOSYS makes modutils complain about missing kernel module support.

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>

Linux 2.6.30-rc5

Merge git://git.infradead.org/mtd-2.6

* git://git.infradead.org/mtd-2.6:
  mtd: fix timeout in M25P80 driver
  mtd: Bug in m25p80.c during whole-chip erase
  mtd: expose subpage size via sysfs
  mtd: mtd in mtd_release is unused without CONFIG_MTD_CHAR

Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: MCE: make cmci_discover_lock irq-safe
  x86: xen, i386: reserve Xen pagetables
  x86, kexec: fix crashdump panic with CONFIG_KEXEC_JUMP
  x86-64: finish cleanup_highmaps()'s job wrt. _brk_end
  x86: fix boot hang in early_reserve_e820()
  x86: Fix a typo in a printk message
  x86, srat: do not register nodes beyond e820 map

Merge branch 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6

* 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
hwmon: (w83781d) Fix W83782D support (NULL pointer dereference)
hwmon: (asus_atk0110) Fix compiler warning

Merge branch 'fixes-for-linus' of git://git.monstr.eu/linux-2.6-microblaze

* 'fixes-for-linus' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Fix return value for sys_ipc
microblaze: Storage class should be before const qualifier

kprobes: fix to use text_mutex around arm/disarm kprobe

Fix kprobes to lock text_mutex around some arch_arm/disarm_kprobe() which
are newly added by commit de5bd88d5a5cce3cacea904d3503e5ebdb3852a2.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ipvs: Fix IPv4 FWMARK virtual services

This fixes the use of fwmarks to denote IPv4 virtual services
which was unfortunately broken as a result of the integration
of IPv6 support into IPVS, which was included in 2.6.28.

The problem arises because fwmarks are stored in the 4th octet
of a union nf_inet_addr .all, however in the case of IPv4 only
the first octet, corresponding to .ip, is assigned and compared.

In other words, using .all = { 0, 0, 0, htonl(svc->fwmark) always
results in a value of 0 (32bits) being stored for IPv4. This means
that one fwmark can be used, as it ends up being mapped to 0, but things
break down when multiple fwmarks are used, as they all end up being mapped
to 0.

As fwmarks are 32bits a reasonable fix seems to be to just store the fwmark
in .ip, and comparing and storing .ip when fwmarks are used.

This patch makes the assumption that in calls to ip_vs_ct_in_get()
and ip_vs_sched_persist() if the proto parameter is IPPROTO_IP then
we are dealing with an fwmark. I believe this is valid as ip_vs_in()
does fairly strict filtering on the protocol and IPPROTO_IP should
not be used in these calls unless explicitly passed when making
these calls for fwmarks in ip_vs_sched_persist().

Tested-by: Fabien Duchêne <fabien.duchene@student.uclouvain.be>
Cc: Joseph Mack NA3T <jmack@wm7d.net>
Cc: Julius Volz <julius.volz@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: Make INET_LRO a bool instead of tristate.

This code is used as a library by several device drivers,
which select INET_LRO.

If some are modules and some are statically built into the
kernel, we get build failures if INET_LRO is modular.

Signed-off-by: David S. Miller <davem@davemloft.net>

hwmon: (w83781d) Fix W83782D support (NULL pointer dereference)

Commit 360782dde00a2e6e7d9fd57535f90934707ab8a8 (hwmon: (w83781d) Stop
abusing struct i2c_client for ISA devices) broke W83782D support for
devices connected on the ISA bus. You will hit a NULL pointer
dereference as soon as you read any device attribute. Other devices,
and W83782D devices on the SMBus, aren't affected.

Reported-by: Michel Abraham
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Michel Abraham

hwmon: (asus_atk0110) Fix compiler warning

atk_sensor_type is only used when DEBUG is defined.

Signed-off-by: Luca Tettamanti <kronos.it@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>

mtd: fix timeout in M25P80 driver

Extend erase timeout in M25P80 SPI Flash driver.

The M25P80 drivers fails erasing sectors on a M25P128 because the ready
wait timeout is too short. Change the timeout from a simple loop count to a
suitable number of seconds.

Signed-off-by: Peter Horton <zero@colonel-panic.org>
Tested-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

x86: MCE: make cmci_discover_lock irq-safe

Lockdep reports the warning below when Li tries to offline one cpu:

[  110.835487] =================================
[  110.835616] [ INFO: inconsistent lock state ]
[  110.835688] 2.6.30-rc4-00336-g8c9ed89 #52
[  110.835757] ---------------------------------
[  110.835828] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[  110.835908] swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
[  110.835982]  (cmci_discover_lock){?.+...}, at: [<ffffffff80236dc0>] cmci_clear+0x30/0x9b

cmci_clear() can be called via smp_call_function_single().

It is better to disable interrupt while holding cmci_discover_lock,
to turn it into an irq-safe lock - we can deadlock otherwise.

[ Impact: fix possible deadlock in the MCE code ]

Reported-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A03ED38.8000700@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reported-by: Shaohua Li<shaohua.li@intel.com>

x86: xen, i386: reserve Xen pagetables

The Xen pagetables are no longer implicitly reserved as part of the other
i386_start_kernel reservations, so make sure we explicitly reserve them.
This prevents them from being released into the general kernel free page
pool and reused.

[ Impact: fix Xen guest crash ]

Also-Bisected-by: Bryan Donlan <bdonlan@gmail.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Xen-devel <xen-devel@lists.xensource.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4A032EEC.30509@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ALSA: hda - Fix line-in on Mac Mini Core2 Duo

BIOS on Mac Mini Core2 Duo sets both INPUT and OUTPUT pinctl bits to
the line-in jack, and it confuses the driver as if it's a valid input.
This patch adds the check of OUTPUT bit so that the driver fixes the
invalid pin setup.

Tested-by: Tino Keitel <tino.keitel@gmx.de>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

x86, kexec: fix crashdump panic with CONFIG_KEXEC_JUMP

Tim Starling reported that crashdump will panic with kernel compiled
with CONFIG_KEXEC_JUMP due to null pointer deference in
machine_kexec_32.c: machine_kexec(), when deferencing
kexec_image. Refering to:

http://bugzilla.kernel.org/show_bug.cgi?id=13265

This patch fixes the BUG via replacing global variable reference:
kexec_image in machine_kexec() with local variable reference: image,
which is more appropriate, and will not be null.

Same BUG is in machine_kexec_64.c too, so fixed too in the same way.

[ Impact: fix crash on kexec ]

Reported-by: Tim Starling <tstarling@wikimedia.org>
Signed-off-by: Huang Ying <ying.huang@intel.com>
LKML-Reference: <1241751101.6259.85.camel@yhuang-dev.sh.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

x86-64: finish cleanup_highmaps()'s job wrt. _brk_end

With the introduction of the .brk section, special care must be taken
that no unused page table entries remain if _brk_end and _end are
separated by a 2M page boundary. cleanup_highmap() runs very early and
hence cannot take care of that, hence potential entries needing to be
removed past _brk_end must be cleared once the brk allocator has done
its job.

[ Impact: avoids undesirable TLB aliases ]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

x86: fix boot hang in early_reserve_e820()

If the first non-reserved (sub-)range doesn't fit the size requested,
an endless loop will be entered. If a range returned from
find_e820_area_size() turns out insufficient in size, the range must
be skipped before calling the function again.

[ Impact: fixes boot hang on some platforms ]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (32 commits)
  [CIFS] Fix double list addition in cifs posix open code
  [CIFS] Allow raw ntlmssp code to be enabled with sec=ntlmssp
  [CIFS] Fix SMB uid in NTLMSSP authenticate request
  [CIFS] NTLMSSP reenabled after move from connect.c to sess.c
  [CIFS] Remove sparse warning
  [CIFS] remove checkpatch warning
  [CIFS] Fix final user of old string conversion code
  [CIFS] remove cifs_strfromUCS_le
  [CIFS] NTLMSSP support moving into new file, old dead code removed
  [CIFS] Fix endian conversion of vcnum field
  [CIFS] Remove trailing whitespace
  [CIFS] Remove sparse endian warnings
  [CIFS] Add remaining ntlmssp flags and standardize field names
  [CIFS] Fix build warning
  cifs: fix length handling in cifs_get_name_from_search_buf
  [CIFS] Remove unneeded QuerySymlink call and fix mapping for unmapped status
  [CIFS] rename cifs_strndup to cifs_strndup_from_ucs
  Added loop check when mounting DFS tree.
  Enable dfs submounts to handle remote referrals.
  [CIFS] Remove older session setup implementation
  ...

[CIFS] Fix double list addition in cifs posix open code

Remove adding open file entry twice to lists in the file
Do not fill file info twice in case of posix opens and creates

Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>

net: remove stale reference to fastroute from Kconfig help text

Signed-off-by: Ashish Karkare <akarkare@marvell.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

NOMMU: Don't check vm_region::vm_start is page aligned in add_nommu_region()

Don't check vm_region::vm_start is page aligned in add_nommu_region() because
the region may reflect some non-page-aligned mapped file, such as could be
obtained from RomFS XIP.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-linus' of git://neil.brown.name/md

* 'for-linus' of git://neil.brown.name/md:
  md: remove rd%d links immediately after stopping an array.
  md: remove ability to explicit set an inactive array to 'clean'.
  md: constify VFTs
  md: tidy up status_resync to handle large arrays.
  md: fix some (more) errors with bitmaps on devices larger than 2TB.
  md/raid10: don't clear bitmap during recovery if array will still be degraded.
  md: fix loading of out-of-date bitmap.

random: make get_random_int() more random

It's a really simple patch that basically just open-codes the current
"secure_ip_id()" call, but when open-coding it we now use a _static_
hashing area, so that it gets updated every time.

And to make sure somebody can't just start from the same original seed of
all-zeroes, and then do the "half_md4_transform()" over and over until
they get the same sequence as the kernel has, each iteration also mixes in
the same old "current->pid + jiffies" we used - so we should now have a
regular strong pseudo-number generator, but we also have one that doesn't
have a single seed.

Note: the "pid + jiffies" is just meant to be a tiny tiny bit of noise. It
has no real meaning. It could be anything. I just picked the previous
seed, it's just that now we keep the state in between calls and that will
feed into the next result, and that should make all the difference.

I made that hash be a per-cpu data just to avoid cache-line ping-pong:
having multiple CPU's write to the same data would be fine for randomness,
and add yet another layer of chaos to it, but since get_random_int() is
supposed to be a fast interface I did it that way instead. I considered
using "__raw_get_cpu_var()" to avoid any preemption overhead while still
getting the hash be _mostly_ ping-pong free, but in the end good taste won
out.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 5507/1: support R_ARM_MOVW_ABS_NC and MOVT_ABS relocation types
  [ARM] 5506/1: davinci: DMA_32BIT_MASK --> DMA_BIT_MASK(32)
  i.MX31: Disable CPU_32v6K in mx3_defconfig.
  mx3fb: Fix compilation with CONFIG_PM
  mx27ads: move PBC mapping out of vmalloc space
  MXC: remove BUG_ON in interrupt handler
  mx31: remove mx31moboard_defconfig
  ARM: ARCH_MXC should select HAVE_CLK
  mxc : BUG in imx_dma_request
  mxc : Clean up properly when imx_dma_free() used without imx_dma_disable()
  [ARM] mv78xx0: update defconfig
  [ARM] orion5x: update defconfig
  [ARM] Kirkwood: update defconfig
  [ARM] Kconfig typo fix:  "PXA930" -> "CPU_PXA930".
  [ARM] S3C2412: Add missing cache flush in suspend code
  [ARM] S3C: Add UDIVSLOT support for newer UARTS
  [ARM] S3C64XX: Add S3C64XX_PA_IIS{0,1} to <mach/map.h>

[ARM] 5507/1: support R_ARM_MOVW_ABS_NC and MOVT_ABS relocation types

From: Bruce Ashfield <bruce.ashfield@windriver.com>

To fully support the armv7-a instruction set/optimizations, support
for the R_ARM_MOVW_ABS_NC and R_ARM_MOVT_ABS relocation types is
required.

The MOVW and MOVT are both load-immediate instructions, MOVW loads 16
bits into the bottom half of a register, and MOVT loads 16 bits into the
top half of a register.

The relocation information for these instructions has a full 32 bit
value, plus an addend which is stored in the 16 immediate bits in the
instruction itself. The immediate bits in the instruction are not
contiguous (the register # splits it into a 4 bit and 12 bit value),
so the addend has to be extracted accordingly and added to the value.
The value is then split and put into the instruction; a MOVW uses the
bottom 16 bits of the value, and a MOVT uses the top 16 bits.

Signed-off-by: David Borman <david.borman@windriver.com>
Signed-off-by: Bruce Ashfield <bruce.ashfield@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

[ARM] 5506/1: davinci: DMA_32BIT_MASK --> DMA_BIT_MASK(32)

As per commit 284901a90a9e0b812ca3f5f852cbbfb60d10249d, use
DMA_BIT_MASK(n)

Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

md: remove rd%d links immediately after stopping an array.

md maintains link in sys/mdXX/md/ to identify which device has
which role in the array. e.g.
rd2 -> dev-sda

indicates that the device with role '2' in the array is sda.

These links are only present when the array is active. They are
created immediately after ->run is called, and so should be removed
immediately after ->stop is called.
However they are currently removed a little bit later, and it is
possible for ->run to be called again, thus adding these links, before
they are removed.

So move the removal earlier so they are consistently only present when
the array is active.

Signed-off-by: NeilBrown <neilb@suse.de>

md: remove ability to explicit set an inactive array to 'clean'.

Being able to write 'clean' to an 'array_state' of an inactive array
to activate it in 'clean' mode is both unnecessary and inconvenient.

It is unnecessary because the same can be achieved by writing
'active'.  This activates and array, but it still remains 'clean'
until the first write.

It is inconvenient because writing 'clean' is more often used to
cause an 'active' array to revert to 'clean' mode (thus blocking
any writes until a 'write-pending' is promoted to 'active').

Allowing 'clean' to both activate an array and mark an active array as
clean can lead to races:  One program writes 'clean' to mark the
active array as clean at the same time as another program writes
'inactive' to deactivate (stop) and active array.  Depending on which
writes first, the array could be deactivated and immediately
reactivated which isn't what was desired.

So just disable the use of 'clean' to activate an array.

This avoids a race that can be triggered with mdadm-3.0 and external
metadata, so it suitable for -stable.

Reported-by: Rafal Marszewski <rafal.marszewski@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: NeilBrown <neilb@suse.de>

md: constify VFTs

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: NeilBrown <neilb@suse.de>

md: tidy up status_resync to handle large arrays.

Two problems in status_resync.
1/ It still used Kilobytes as the basic block unit, while most code
   now uses sectors uniformly.
2/ It doesn't allow for the possibility that max_sectors exceeds
   the range of "unsigned long".

So
- change "max_blocks" to "max_sectors", and store sector numbers
   in there and in 'resync'
- Make 'rt' a 'sector_t' so it can temporarily hold the number of
   remaining sectors.
- use sector_div rather than normal division.
- change the magic '100' used to preserve precision to '32'.
   + making it a power of 2 makes division easier
   + it doesn't need to be as large as it was chosen when we averaged
     speed over the entire run.  Now we average speed over the last 30
     seconds or so.

Reported-by: "Mario 'BitKoenig' Holbe" <Mario.Holbe@TU-Ilmenau.DE>
Signed-off-by: NeilBrown <neilb@suse.de>

md: fix some (more) errors with bitmaps on devices larger than 2TB.

If a write intent bitmap covers more than 2TB, we sometimes work with
values beyond 32bit, so these need to be sector_t. This patches
add the required casts to some unsigned longs that are being shifted
up.

This will affect any raid10 larger than 2TB, or any raid1/4/5/6 with
member devices that are larger than 2TB.

Signed-off-by: NeilBrown <neilb@suse.de>
Reported-by: "Mario 'BitKoenig' Holbe" <Mario.Holbe@TU-Ilmenau.DE>
Cc: stable@kernel.org

md/raid10: don't clear bitmap during recovery if array will still be degraded.

If we have a raid10 with multiple missing devices, and we recover just
one of these to a spare, then we risk (depending on the bitmap and
array chunk size) clearing bits of the bitmap for which recovery isn't
complete (because a device is still missing).

This can lead to a subsequent "re-add" being recovered without
any IO happening, which would result in loss of data.

This patch takes the safe approach of not clearing bitmap bits
if the array will still be degraded.

This patch is suitable for all active -stable kernels.

Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>

md: fix loading of out-of-date bitmap.

When md is loading a bitmap which it knows is out of date, it fills
each page with 1s and writes it back out again.  However the
write_page call makes used of bitmap->file_pages and
bitmap->last_page_size which haven't been set correctly yet.  So this
can sometimes fail.

Move the setting of file_pages and last_page_size to before the call
to write_page.

This bug can cause the assembly on an array to fail, thus making the
data inaccessible.  Hence I think it is a suitable candidate for
-stable.

Cc: stable@kernel.org
Reported-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: NeilBrown <neilb@suse.de>

net: update skb_recycle_check() for hardware timestamping changes

Commit ac45f602ee3d1b6f326f68bc0c2591ceebf05ba4 ("net: infrastructure
for hardware time stamping") added two skb initialization actions to
__alloc_skb(), which need to be added to skb_recycle_check() as well.

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2: Fix panic in bnx2_poll_work().

Add barrier() to bnx2_get_hw_{tx|rx}_cons() to fix this issue:

http://bugzilla.kernel.org/show_bug.cgi?id=12698

This issue was reported by multiple i386 users.  Without barrier(),
the compiled code looks like the following where %eax contains the
address of the tx_cons or rx_cons in the DMA status block.  The
status block contents can change between the cmpb and the movzwl
instruction.  The driver would crash if the value was not 0xff during
the cmpb instruction, but changed to 0xff during the movzwl
instruction.

6828: 80 38 ff              cmpb   $0xff,(%eax)
682b: 0f b7 10              movzwl (%eax),%edx

With the added barrier(), the compiled code now looks correct:

683d: 0f b7 10              movzwl (%eax),%edx
6840: 0f b6 c2              movzbl %dl,%eax
6843: 3d ff 00 00 00        cmp    $0xff,%eax

Thanks to Pascal de Bruijn <pmjdebruijn@pcode.nl> for reporting the
problem and Holger Noefer <hnoefer@pironet-ndh.com> for patiently
testing test patches for us.

Also updated version to 2.0.1.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net-sched: fix bfifo default limit

When no limit is given, the bfifo uses a default of tx_queue_len * mtu.
Packets handled by qdiscs include the link layer header, so this should
be taken into account, similar to what other qdiscs do.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

igb: resolve panic on shutdown when SR-IOV is enabled

The setup_rctl call was making a call into the ring structure after it had
been freed. This was causing a panic on shutdown. This call wasn't
necessary since it is possible to get the needed index from
adapter->vfs_allocated_count.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'linux-2.6.30.y' of git://git.kernel.org/pub/scm/linux/kernel/git/inaky/wimax

drivers/base/iommu.c: add missing includes

Fix zillions of -mm x86_64 allmodconfig build errors - the file uses
EXPORT_SYMBOL() and kmalloc but misses the needed includes.

Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

initramfs: clean up messages related to initramfs unpacking

With the removal of duplicate unpack_to_rootfs() (commit
df52092f3c97788592ef72501a43fb7ac6a3cfe0) the messages displayed do not
actually correspond to what the kernel is doing. In addition, depending
if ramdisks are supported or not, the messages are not at all the same.

So keep the messages more in sync with what is really doing the kernel,
and only display a second message in case of failure. This also ensure
that the printk message cannot be split by other printk's.

Signed-off-by: Eric Piel <eric.piel@tremplin-utc.net>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

nommu: make the initial mmap allocation excess behaviour Kconfig configurable

NOMMU mmap() has an option controlled by a sysctl variable that determines
whether the allocations made by do_mmap_private() should have the excess
space trimmed off and returned to the allocator.  Make the initial setting
of this variable a Kconfig configuration option.

The reason there can be excess space is that the allocator only allocates
in power-of-2 size chunks, but mmap()'s can be made in sizes that aren't a
power of 2.

There are two alternatives:

(1) Keep the excess as dead space.  The dead space then remains unused for the
     lifetime of the mapping.  Mappings of shared objects such as libc, ld.so
     or busybox's text segment may retain their dead space forever.

(2) Return the excess to the allocator.  This means that the dead space is
     limited to less than a page per mapping, but it means that for a transient
     process, there's more chance of fragmentation as the excess space may be
     reused fairly quickly.

During the boot process, a lot of transient processes are created, and
this can cause a lot of fragmentation as the pagecache and various slabs
grow greatly during this time.

By turning off the trimming of excess space during boot and disabling
batching of frees, Coldfire can manage to boot.

A better way of doing things might be to have /sbin/init turn this option
off.  By that point libc, ld.so and init - which are all long-duration
processes - have all been loaded and trimmed.

Reported-by: Lanttor Guo <lanttor.guo@freescale.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Lanttor Guo <lanttor.guo@freescale.com>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

nommu: clamp zone_batchsize() to 0 under NOMMU conditions

Clamp zone_batchsize() to 0 under NOMMU conditions to stop
free_hot_cold_page() from queueing and batching frees.

The problem is that under NOMMU conditions it is really important to be
able to allocate large contiguous chunks of memory, but when munmap() or
exit_mmap() releases big stretches of memory, return of these to the buddy
allocator can be deferred, and when it does finally happen, it can be in
small chunks.

Whilst the fragmentation this incurs isn't so much of a problem under MMU
conditions as userspace VM is glued together from individual pages with
the aid of the MMU, it is a real problem if there isn't an MMU.

By clamping the page freeing queue size to 0, pages are returned to the
allocator immediately, and the buddy detector is more likely to be able to
glue them together into large chunks immediately, and fragmentation is
less likely to occur.

By disabling batching of frees, and by turning off the trimming of excess
space during boot, Coldfire can manage to boot.

Reported-by: Lanttor Guo <lanttor.guo@freescale.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Lanttor Guo <lanttor.guo@freescale.com>
Cc: Greg Ungerer <gerg@snapgear.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: use roundown_pow_of_two() in zone_batchsize()

Use roundown_pow_of_two(N) in zone_batchsize() rather than (1 <<
(fls(N)-1)) as they are equivalent, and with the former it is easier to
see what is going on.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Lanttor Guo <lanttor.guo@freescale.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

isl29003: fix resume functionality

The isl29003 does not interpret the return value of
i2c_smbus_write_byte_data() correctly and hence causes an error on system
resume.

Also introduce power_state_before_suspend and restore the chip's power
state upon wakeup.

Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fbdev: remove makefile reference to removed driver

The cyblafb driver is removed so remove its last trace in the makefile.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

alloc_vmap_area: fix memory leak

If alloc_vmap_area() fails the allocated struct vmap_area has to be freed.

Signed-off-by: Ralph Wuerthner <ralphw@linux.vnet.ibm.com>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

doc: small kernel-parameters updates

Change last "i386" to X86-32 as is used throughout the rest of the file.
Change combination of X86-32,X86-64 to just X86, as is done throughout the
rest of the file.

Add a note that hyphens and underscores are equivalent in parameter names,
with examples.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Jan Engelhardt <jengelh@medozas.de>
Cc: Christopher Sylvain <chris.sylvain@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fbdev: fix fillrect for 24bpp modes

The software fillrect routines do not work properly when the number of
pixels per machine word is not an integer.  To see that, run the following
command on a fbdev console with a 24bpp video mode, using a
non-accelerated driver such as (u)vesafb:

  reset ; echo -e '\e[41mtest\e[K'

The expected result is 'test' displayed on a line with red background.
Instead of that, 'test' has a red background, but the rest of the line
(rendered using fillrect()) contains a distored colorful pattern.

This patch fixes the problem by correctly computing rotation shifts.  It
has been tested in a 24bpp mode on 32- and 64-bit little-endian machines.

Signed-off-by: Michal Januszewski <spock@gentoo.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

oom: prevent livelock when oom_kill_allocating_task is set

When /proc/sys/vm/oom_kill_allocating_task is set for large systems that
want to avoid the lengthy tasklist scan, it's possible to livelock if
current is ineligible for oom kill. This normally happens when it is set
to OOM_DISABLE, but is also possible if any threads are sharing the same
->mm with a different tgid.

So change __out_of_memory() to fall back to the full task-list scan if it
was unable to kill `current'.

Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fiemap: fix problem with setting FIEMAP_EXTENT_LAST

Fix a problem where the generic block based fiemap stuff would not
properly set FIEMAP_EXTENT_LAST on the last extent. I've reworked things
to keep track if we go past the EOF, and mark the last extent properly.
The problem was reported by and tested by Eric Sandeen.

Tested-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: <xfs-masters@oss.sgi.com>
Cc: <linux-btrfs@vger.kernel.org>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <Joel.Becker@oracle.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Eliminate thousands of warnings with gcc 3.2 build

When building with gcc 3.2 I get thousands of warnings such as

include/linux/gfp.h: In function `allocflags_to_migratetype':
include/linux/gfp.h:105: warning: null format string

due to passing a NULL format string to warn_slowpath() in

#define __WARN() warn_slowpath(__FILE__, __LINE__, NULL)

Split this case out into a separate call.  This also shrinks the kernel
slightly:

          text    data     bss     dec     hex filename
       4802274  707668  712704 6222646  5ef336 vmlinux
          text    data     bss     dec     hex filename
       4799027  703572  712704 6215303  5ed687 vmlinux

due to removeing one argument from the commonly-called __WARN().

[akpm@linux-foundation.org: reduce scope of `empty']
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

doc: hashdist defaults on for 64bit

kernel boot parameter `hashdist' now defaults on for all 64bit NUMA.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

inotify: use GFP_NOFS in kernel_event() to work around a lockdep false-positive

There is what we believe to be a false positive reported by lockdep.

inotify_inode_queue_event() => take inotify_mutex => kernel_event() =>
kmalloc() => SLOB => alloc_pages_node() => page reclaim => slab reclaim =>
dcache reclaim => inotify_inode_is_dead => take inotify_mutex => deadlock

The plan is to fix this via lockdep annotation, but that is proving to be
quite involved.

The patch flips the allocation over to GFP_NFS to shut the warning up, for
the 2.6.30 release.

Hopefully we will fix this for real in 2.6.31.  I'll queue a patch in -mm
to switch it back to GFP_KERNEL so we don't forget.

  =================================
  [ INFO: inconsistent lock state ]
  2.6.30-rc2-next-20090417 #203
  ---------------------------------
  inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
  kswapd0/380 [HC0[0]:SC0[0]:HE1:SE1] takes:
   (&inode->inotify_mutex){+.+.?.}, at: [<ffffffff8112f1b5>] inotify_inode_is_dead+0x35/0xb0
  {RECLAIM_FS-ON-W} state was registered at:
    [<ffffffff81079188>] mark_held_locks+0x68/0x90
    [<ffffffff810792a5>] lockdep_trace_alloc+0xf5/0x100
    [<ffffffff810f5261>] __kmalloc_node+0x31/0x1e0
    [<ffffffff81130652>] kernel_event+0xe2/0x190
    [<ffffffff81130826>] inotify_dev_queue_event+0x126/0x230
    [<ffffffff8112f096>] inotify_inode_queue_event+0xc6/0x110
    [<ffffffff8110444d>] vfs_create+0xcd/0x140
    [<ffffffff8110825d>] do_filp_open+0x88d/0xa20
    [<ffffffff810f6b68>] do_sys_open+0x98/0x140
    [<ffffffff810f6c50>] sys_open+0x20/0x30
    [<ffffffff8100c272>] system_call_fastpath+0x16/0x1b
    [<ffffffffffffffff>] 0xffffffffffffffff
  irq event stamp: 690455
  hardirqs last  enabled at (690455): [<ffffffff81564fe4>] _spin_unlock_irqrestore+0x44/0x80
  hardirqs last disabled at (690454): [<ffffffff81565372>] _spin_lock_irqsave+0x32/0xa0
  softirqs last  enabled at (690178): [<ffffffff81052282>] __do_softirq+0x202/0x220
  softirqs last disabled at (690157): [<ffffffff8100d50c>] call_softirq+0x1c/0x50

  other info that might help us debug this:
  2 locks held by kswapd0/380:
   #0:  (shrinker_rwsem){++++..}, at: [<ffffffff810d0bd7>] shrink_slab+0x37/0x180
   #1:  (&type->s_umount_key#17){++++..}, at: [<ffffffff8110cfbf>] shrink_dcache_memory+0x11f/0x1e0

  stack backtrace:
  Pid: 380, comm: kswapd0 Not tainted 2.6.30-rc2-next-20090417 #203
  Call Trace:
   [<ffffffff810789ef>] print_usage_bug+0x19f/0x200
   [<ffffffff81018bff>] ? save_stack_trace+0x2f/0x50
   [<ffffffff81078f0b>] mark_lock+0x4bb/0x6d0
   [<ffffffff810799e0>] ? check_usage_forwards+0x0/0xc0
   [<ffffffff8107b142>] __lock_acquire+0xc62/0x1ae0
   [<ffffffff810f478c>] ? slob_free+0x10c/0x370
   [<ffffffff8107c0a1>] lock_acquire+0xe1/0x120
   [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0
   [<ffffffff81562d43>] mutex_lock_nested+0x63/0x420
   [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0
   [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0
   [<ffffffff81012fe9>] ? sched_clock+0x9/0x10
   [<ffffffff81077165>] ? lock_release_holdtime+0x35/0x1c0
   [<ffffffff8112f1b5>] inotify_inode_is_dead+0x35/0xb0
   [<ffffffff8110c9dc>] dentry_iput+0xbc/0xe0
   [<ffffffff8110cb23>] d_kill+0x33/0x60
   [<ffffffff8110ce23>] __shrink_dcache_sb+0x2d3/0x350
   [<ffffffff8110cffa>] shrink_dcache_memory+0x15a/0x1e0
   [<ffffffff810d0cc5>] shrink_slab+0x125/0x180
   [<ffffffff810d1540>] kswapd+0x560/0x7a0
   [<ffffffff810ce160>] ? isolate_pages_global+0x0/0x2c0
   [<ffffffff81065a30>] ? autoremove_wake_function+0x0/0x40
   [<ffffffff8107953d>] ? trace_hardirqs_on+0xd/0x10
   [<ffffffff810d0fe0>] ? kswapd+0x0/0x7a0
   [<ffffffff8106555b>] kthread+0x5b/0xa0
   [<ffffffff8100d40a>] child_rip+0xa/0x20
   [<ffffffff8100cdd0>] ? restore_args+0x0/0x30
   [<ffffffff81065500>] ? kthread+0x0/0xa0
   [<ffffffff8100d400>] ? child_rip+0x0/0x20

[eparis@redhat.com: fix audit too]
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>