git.karo-electronics.de Git - karo-tx-linux.git/log

RPC/RDMA: avoid an oops due to disconnect racing with async upcalls.

RDMA disconnects yield an upcall from the RDMA connection manager,
which can race with rpc transport close, e.g. on ^C of a mount.
Ensure any rdma cm_id and qp are fully destroyed before continuing.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

RPC/RDMA: maintain the RPC task bytes-sent statistic.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

RPC/RDMA: suppress retransmit on RPC/RDMA clients.

An RPC/RDMA client cannot retransmit on an unbroken connection,
doing so violates its flow control with the server.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

RPC/RDMA: fix connection IRD/ORD setting

This logic sets the connection parameter that configures the local device
and informs the remote peer how many concurrent incoming RDMA_READ
requests are supported. The original logic didn't really do what was
intended for two reasons:

- The max number supported by the device is typically smaller than
any one factor in the calculation used, and

- The field in the connection parameter structure where the value is
stored is a u8 and always overflows for the default settings.

So what really happens is the value requested for responder resources
is the left over 8 bits from the "desired value". If the desired value
happened to be a multiple of 256, the result was zero and it wouldn't
connect at all.

Given the above and the fact that max_requests is almost always larger
than the max responder resources supported by the adapter, this patch
simplifies this logic and simply requests the max supported by the device,
subject to a reasonable limit.

This bug was found by Jim Schutt at Sandia.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

RPC/RDMA: support FRMR client memory registration.

Configure, detect and use "fastreg" support from IB/iWARP verbs
layer to perform RPC/RDMA memory registration.

Make FRMR the default memreg mode (will fall back if not supported
by the selected RDMA adapter).

This allows full and optimal operation over the cxgb3 adapter, and others.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Acked-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

RPC/RDMA: check selected memory registration mode at runtime.

At transport creation, check for, and use, any local dma lkey.
Then, check that the selected memory registration mode is in fact
supported by the RDMA adapter selected for the mount. Fall back
to best alternative if not.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Acked-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

RPC/RDMA: add data types and new FRMR memory registration enum.

Internal RPC/RDMA structure updates in preparation for FRMR support.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Acked-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

RPC/RDMA: refactor the inline memory registration code.

Refactor the memory registration and deregistration routines.
This saves stack space, makes the code more readable and prepares
to add the new FRMR registration methods.

Signed-off-by: Tom Talpey <talpey@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: fix nfs_parse_ip_address() corner case

Bruce observed that nfs_parse_ip_address() will successfully parse an
IPv6 address that looks like this:

  "::1%"

A scope delimiter is present, but there is no scope ID following it.
This is harmless, as it would simply set the scope ID to zero.  However,
in some cases we would like to flag this as an improperly formed
address.

We are now also careful to reject addresses where garbage follows the
address (up to the length of the string), instead of ignoring the
non-address characters; and where the scope ID is nonsense (not a valid
device name, but also not numeric).  Before, both of these cases would
result in a harmless zero scope ID.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Cleanup nfs_set_port

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Fix attribute updates

This fixes a regression seen when running the Connectathon testsuite
against an ext3 filesystem. The reason was that the inode was constantly
being marked as 'just updated' by the jiffy wraparound test.
This again meant that newer GETATTR calls were failing to pass the
nfs_inode_attrs_need_update() test unless the changes caused a ctime update
on the server, since they were perceived as having been started before the
latest inode update.

Given that nfs_inode_attrs_need_update() already checks for wraparound
of nfsi->last_updated, we can drop the buggy "protection" in
nfs_update_inode().

Also make a slight micro-optimisation of nfs_inode_attrs_need_update(): we
are more often going to see time_after(fattr->time_start, nfsi->last_updated)
be true, rather than seeing an update of ctime/size, so put that test
first to ensure that we optimise away the ctime/size tests.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Save padding bytes in struct nfs4_setclientid

Peter Staubach suggested reducing NFS4_SETCLIENTID_NAMELEN by one byte so
as to avoid 7 bytes of unnecessary padding.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

sunrpc: fix oops in rpc_create when the mount namespace is unshared

On a system with nfs mounts, if a task unshares its mount namespace,
a oops can occur when the system is rebooted if the task is the last
to unreference the nfs mount. It will try to create a rpc request
using utsname() which has been invalidated by free_nsproxy().

The patch fixes the issue by using the global init_utsname() which is
always valid. the capability of identifying rpc clients per uts namespace
stills needs some extra work so this should not be a problem.

BUG: unable to handle kernel NULL pointer dereference at 00000004
IP: [<c024c9ab>] rpc_create+0x332/0x42f
Oops: 0000 [#1] DEBUG_PAGEALLOC

Pid: 1857, comm: uts-oops Not tainted (2.6.27-rc5-00319-g7686ad5 #4)
EIP: 0060:[<c024c9ab>] EFLAGS: 00210287 CPU: 0
EIP is at rpc_create+0x332/0x42f
EAX: 00000000 EBX: df26adf0 ECX: c0251887 EDX: 00000001
ESI: df26ae58 EDI: c02f293c EBP: dda0fc9c ESP: dda0fc2c
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process uts-oops (pid: 1857, ti=dda0e000 task=dd9a0778 task.ti=dda0e000)
Stack: c0104532 dda0fffc dda0fcac dda0e000 dda0e000 dd93b7f0 00000009 c02f2880
df26aefc dda0fc68 c01096b7 00000000 c0266ee0 c039a070 c039a070 dda0fc74
c012ca67 c039a064 dda0fc8c c012cb20 c03daf74 00000011 00000000 c0275c90
Call Trace:
[<c0104532>] ? dump_trace+0xc2/0xe2
[<c01096b7>] ? save_stack_trace+0x1c/0x3a
[<c012ca67>] ? save_trace+0x37/0x8c
[<c012cb20>] ? add_lock_to_list+0x64/0x96
[<c0256fc4>] ? rpcb_register_call+0x62/0xbb
[<c02570c8>] ? rpcb_register+0xab/0xb3
[<c0252f4d>] ? svc_register+0xb4/0x128
[<c0253114>] ? svc_destroy+0xec/0x103
[<c02531b2>] ? svc_exit_thread+0x87/0x8d
[<c01a75cd>] ? lockd_down+0x61/0x81
[<c01a577b>] ? nlmclnt_done+0xd/0xf
[<c01941fe>] ? nfs_destroy_server+0x14/0x16
[<c0194328>] ? nfs_free_server+0x4c/0xaa
[<c019a066>] ? nfs_kill_super+0x23/0x27
[<c0158585>] ? deactivate_super+0x3f/0x51
[<c01695d1>] ? mntput_no_expire+0x95/0xb4
[<c016965b>] ? release_mounts+0x6b/0x7a
[<c01696cc>] ? __put_mnt_ns+0x62/0x70
[<c0127501>] ? free_nsproxy+0x25/0x80
[<c012759a>] ? switch_task_namespaces+0x3e/0x43
[<c01275a9>] ? exit_task_namespaces+0xa/0xc
[<c0117fed>] ? do_exit+0x4fd/0x666
[<c01181b3>] ? do_group_exit+0x5d/0x83
[<c011fa8c>] ? get_signal_to_deliver+0x2c8/0x2e0
[<c0102630>] ? do_notify_resume+0x69/0x700
[<c011d85a>] ? do_sigaction+0x134/0x145
[<c0127205>] ? hrtimer_nanosleep+0x8f/0xce
[<c0126d1a>] ? hrtimer_wakeup+0x0/0x1c
[<c0103488>] ? work_notifysig+0x13/0x1b
=======================
Code: 70 20 68 cb c1 2c c0 e8 75 4e 01 00 8b 83 ac 00 00 00 59 3d 00 f0 ff ff 5f 77 63 eb 57 a1 00 80 2d c0 8b 80 a8 02 00 00 8d 73 68 <8b> 40 04 83 c0 45 e8 41 46 f7 ff ba 20 00 00 00 83 f8 21 0f 4c
EIP: [<c024c9ab>] rpc_create+0x332/0x42f SS:ESP 0068:dda0fc2c

Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "Serge E. Hallyn" <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Don't use range_cyclic for data integrity syncs

It is more efficient to write linearly starting from the beginning of the
file.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Client mounts hang when exported directory do not exist

This patch fixes a regression that was introduced by the string based mounts.

nfs_mount() statically returns -EACCES for every error returned
by the remote mounted. This is incorrect because -EACCES is
an non-fatal error to the mount.nfs command. This error causes
mount.nfs to retry the mount even in the case when the exported
directory does not exist.

This patch maps the errors returned by the remote mountd into
valid errno values, exactly how it was done pre-string based
mounts. By returning the correct errno enables mount.nfs
to do the right thing.

Signed-off-by: Steve Dickson <steved@redhat.com>
[Trond.Myklebust@netapp.com: nfs_stat_to_errno() now correctly returns
negative errors, so remove the sign change.]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Fix a memory leak in rpcb_getport_async

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Fix autobind on cloned rpc clients

Despite the fact that cloned rpc clients won't have the cl_autobind flag
set, they may still find themselves calling rpcb_getport_async(). For this
to happen, it suffices for a _parent_ rpc_clnt to use autobinding, in which
case any clone may find itself triggering the !xprt_bound() case in
call_bind().

The correct fix for this is to walk back up the tree of cloned rpc clients,
in order to find the parent that 'owns' the transport, either because it
has clnt->cl_autobind set, or because it originally created the
transport...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: SETCLIENTID truncates client ID and netid

The sc_name field is currently 56 bytes long.  This is not large enough
to hold a pair of IPv6 addresses, the authentication type, the protocol
name, and a uniquifier number.  The maximum possible size of the name
string using IPv6 addresses is just under 110 bytes, so I increased the
size of the sc_name field to accomodate this maximum.

In addition, the strings in the nfs4_setclientid structure are
constructed with scnprintf(), which wants to terminate its output with
'\0'.  The sc_netid field was large enough only for a three byte netid
string and a '\0' so inet6 netids were being truncated.  Perhaps we
don't need the overhead of scnprintf() to do a simple string copy, but
I fixed this by increasing the size of the buffer by one byte.

Since all three of the string buffers in nfs4_setclientid are
constructed with scnprintf(), I increased the size of all three by one
byte to document the requirement, although I don't think either the
universal address field or the name field will be so small that these
strings get truncated in this way.

The size of the Linux client's client ID on the wire will be larger
than before.  RFC 3530 suggests the size limit for client IDs is 1024,
and we are still well below that.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: remove 8 bytes of padding from struct nfs_fattr on 64 bit builds

remove 8 bytes of padding from struct nfs_fattr on 64 bit builds

This also removes padding from several nfs structures, including
16 bytes from nfs4_opendata, nfs4_createdata,nfs3_createdata
& 8 bytes from nfs_read_data,nfs_write_data,nfs_removeres,nfs4_closedata

This also reduces the reported stack usage of many nfs functions (30+).

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
----

This patch is against the latest git 2.6.27-rc4.
I've built & run this on my AMD64 desktop, & successfully run _simple_
tests with a 64 bit client => 32 bit server & 32 bit client to 64 bit
server.

On fedora with gcc (GCC) 4.3.0 20080428 (Red Hat 4.3.0-8) checkpatch
reports 33 functions with reduced stack usage.
e.g.
__nfs_revalidate_inode [nfs] 216 => 200
_nfs4_proc_access [nfs] 304 => 288
_nfs4_proc_link [nfs] 536 => 504
_nfs4_proc_remove [nfs] 304 => 288
_nfs4_proc_rename [nfs] 584 => 552
nfs3_proc_access [nfs] 272 => 256
nfs3_proc_getacl [nfs] 384 => 368
nfs3_proc_link [nfs] 496 => 464
etc
I can supply the complete list if anyone is interested.

regards
Richard
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs: Fix misparsing of nfsv4 fs_locations attribute

The code incorrectly assumes here that the server name (or ip address)
is null-terminated. This can cause referrals to fail in some cases.

Also support ipv6 addresses.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs: prepare to share nfs_set_port

We plan to use this function elsewhere.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs: replace while loop by for loops in nfs_follow_referral

Whoever wrote this had a bizarre allergy to for loops.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs: break up nfs_follow_referral

This function is a little longer and more deeply nested than necessary.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs: authenticated deep mounting

Allow mount to do authenticated mounts below the root of the exported tree.
The wording in RFC 2623, sec 2.3.2. allows fsinfo with UNIX authentication
on the root of the export. Mounts are not always done on the root
of the exported tree. Especially autoumounts often mount below the root of
the exported tree.
Some server implementations (justly) require full authentication for the
so-called deep mounts. The old code used AUTH_SYS only. This caused deep
mounts to fail on systems requiring stronger authentication..
The client should try both authentication types and use the first one that
succeeds.
This method was already partially implemented. This patch completes
the implementation for NFS2 and NFS3.
This patch was developed to allow Debian systems to automount home directories
on Solaris servers with krb5 authentication.

Tested on kernel 2.6.24-etchnhalf.1

Signed-off-by: E.G. Keizer <keie@few.vu.nl>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: missing nfs_fattr_init in nfs3_proc_getacl and nfs3_proc_setacls (resend #2)

The fattrs used in the NFSv3 getacl/setacl calls are not being properly
initialized. This occasionally causes nfs_update_inode to fall into
NFSv4 specific codepaths when handling post-op attrs from these calls.

Thanks to Cai Qian for noticing the spurious NFSv4 messages in debug
output from a v3 mount...

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs: remove an obsolete nfs_flock comment

We *do* now allow bsd flocks over nfs.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs: BUG_ON in nfs_follow_mountpoint

Unfortunately, BUG_ON(IS_ROOT(dentry)) can happen inside
nfs_follow_mountpoint with NFS running Fedora 8 using a
specific setup.
https://bugzilla.redhat.com/show_bug.cgi?id=458622

So, the situation should be handled on NFS client gracefully.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
CC: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

sunrpc: do not pin sunrpc module in the memory

Basically, try_module_get here are pretty useless. Any other module using
this API will pin sunrpc in memory due using exported symbols.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

nfs: ERR_PTR is expected on failure from nfs_do_clone_mount

Replace NULL with ERR_PTR(-EINVAL).

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

fix fs/nfs/nfsroot.c compilation

This patch fixes the following compile error caused by
commit f9247273cb69ba101877e946d2d83044409cc8c5
(UFS: add const to parser token tabl):

<--  snip  -->

...
  CC      fs/nfs/nfsroot.o
/home/bunk/linux/kernel-2.6/git/linux-2.6/fs/nfs/nfsroot.c:130: error: tokens causes a section type conflict
make[3]: *** [fs/nfs/nfsroot.o] Error 1

<--  snip  -->

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Allow concurrent inode revalidation

Currently, if two processes are both trying to revalidate metadata for the
same inode, they will find themselves being serialised. There is no good
justification for this now that we have improved our ability to detect
stale attribute data, so we should remove that serialisation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Fix up nfs_setattr_update_inode()

Ensure that it sets the inode metadata under the correct spinlock.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Don't clear nfsi->cache_validity in nfs_check_inode_attributes()

If we're merely checking the inode attributes because we suspect that the
'updated' attributes returned by the RPC call are stale, then we shouldn't
be doing weak cache consistency updates or clearing the cache_validity
flags.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Convert __nfs_revalidate_inode() to use nfs_refresh_inode()

In the case where there are parallel RPC calls to the same inode, we may
receive stale metadata due to the lack of ordering, hence the sanity
checking of metadata in nfs_refresh_inode().
Currently, __nfs_revalidate_inode() is calling nfs_update_inode() directly,
without any further sanity checks, and hence may end up setting the inode
up with stale metadata.

Fix is to use nfs_refresh_inode() instead of nfs_update_inode().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Fix nfs_post_op_update_inode_force_wcc()

If we believe that the attributes are old (see nfs_refresh_inode()), then
we shouldn't force an update.
Also ensure that we hold the inode->i_lock across attribute checks and the
call to nfs_refresh_inode_locked() to ensure that we don't race with other
attribute updates.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Fix the NFS attribute update

Currently nfs_refresh_inode() will only update the inode metadata if it
sees that the RPC call that returned the nfs_fattr was started
after the last update of the inode. This means that if we have parallel
RPC calls to the same inode (when sending WRITE calls, for instance), we
may often miss updates.

This patch attempts to recover those missed updates by also accepting
them if the ctime in the nfs_fattr is more recent than the inode's
cached ctime.
It also recovers the case where the file size has increased, but the
ctime has not been updated due to limited ctime resolution.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Clean up nfs_refresh_inode() and nfs_post_op_update_inode()

Try to avoid taking and dropping the inode->i_lock more than once. Do so by
moving the code in nfs_refresh_inode() that needs to be done under the
spinlock into a function nfs_refresh_inode_locked(), and then having both
nfs_refresh_inode() and nfs_post_op_update_inode() call it directly.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Add mount options for controlling the lookup cache

Add the following NFS-specific mount options to the parser.

    -o lookupcache=all          /* Default: cache positive & negative
                                   dentries */
    -o lookupcache=pos[itive]   /* Don't cache negative dentries */
    -o lookupcache=none         /* Strict revalidation of all dentries */

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Don't apply NFS_MOUNT_FLAGMASK to text-based mounts

The point of introducing text-based mounts was to allow us to add
functionality without having to worry about legacy binary mount formats.
The mask should be there in order to ensure that binary formats don't start
enabling features that they cannot support. There is no justification for
applying it to the text mount path.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Add options for finer control of the lookup cache

Add the flag NFS_MOUNT_LOOKUP_CACHE_NONEG to turn off the caching of
negative dentries. In reality what we do is to force
nfs_lookup_revalidate() to always discard negative dentries.

Add the flag NFS_MOUNT_LOOKUP_CACHE_NONE for enforcing stricter
revalidation of dentries. It forces the revalidate code to always do a
lookup instead of just checking the cached mtime of the parent directory.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Clean up nfs_sb_active/nfs_sb_deactive

Instead of causing umount requests to block on server->active_wq while the
asynchronous sillyrename deletes are executing, we can use the sb->s_active
counter to obtain a reference to the super_block, and then release that
reference in nfs_async_unlink_release().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

NFS: Fix nfs_file_llseek()

After the BKL removal patches were applied to the rest of the NFS code, the
BKL protection in nfs_file_llseek() is no longer sufficient to ensure that
inode->i_size is read safely in generic_file_llseek_unlocked().

In order to fix the situation, we either have to replace the naked read of
inode->i_size in generic_file_llseek_unlocked() with i_size_read(), or the
whole thing needs to be executed under the inode->i_lock;
In order to avoid disrupting other filesystems, avoid touching
generic_file_llseek_unlocked() for now...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

Linux 2.6.27-rc9

Marker depmod fix core kernel list

* Theodore Ts'o (tytso@mit.edu) wrote:
>
> I've been playing with adding some markers into ext4 to see if they
> could be useful in solving some problems along with Systemtap.  It
> appears, though, that as of 2.6.27-rc8, markers defined in code which is
> compiled directly into the kernel (i.e., not as modules) don't show up
> in Module.markers:
>
> kvm_trace_entryexit arch/x86/kvm/kvm-intel  %u %p %u %u %u %u %u %u
> kvm_trace_handler arch/x86/kvm/kvm-intel  %u %p %u %u %u %u %u %u
> kvm_trace_entryexit arch/x86/kvm/kvm-amd  %u %p %u %u %u %u %u %u
> kvm_trace_handler arch/x86/kvm/kvm-amd  %u %p %u %u %u %u %u %u
>
> (Note the lack of any of the kernel_sched_* markers, and the markers I
> added for ext4_* and jbd2_* are missing as wel.)
>
> Systemtap apparently depends on in-kernel trace_mark being recorded in
> Module.markers, and apparently it's been claimed that it used to be
> there.  Is this a bug in systemtap, or in how Module.markers is getting
> built?   And is there a file that contains the equivalent information
> for markers located in non-modules code?

I think the problem comes from "markers: fix duplicate modpost entry"
(commit d35cb360c29956510b2fe1a953bd4968536f7216)

Especially :

  -   add_marker(mod, marker, fmt);
  +   if (!mod->skip)
  +     add_marker(mod, marker, fmt);
    }
    return;
   fail:

Here is a fix that should take care if this problem.

Thanks for the bug report!

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Tested-by: "Theodore Ts'o" <tytso@mit.edu>
CC: Greg KH <greg@kroah.com>
CC: David Smith <dsmith@redhat.com>
CC: Roland McGrath <roland@redhat.com>
CC: Sam Ravnborg <sam@ravnborg.org>
CC: Wenji Huang <wenji.huang@oracle.com>
CC: Takashi Nishiie <t-nishiie@np.css.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
kgdb: call touch_softlockup_watchdog on resume
kgdb, x86: Avoid invoking kgdb_nmicallback twice per NMI

Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: gart iommu have direct mapping when agp is present too

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
  ide: workaround for bogus gcc warning in ide_sysfs_register_port()
  ide-cd: Optiarc DVD RW AD-7200A does play audio
  IDE: Fix platform device registration in Swarm IDE driver (v2)
  ide-dma: fix ide_build_dmatable() for TRM290
  ide-cd: temporary tray close fix

Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] IP27: Fix build errors if CONFIG_MAPPED_KERNEL=y
[MIPS] Fix CMP Kconfig configuration and mark as broken.

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (33 commits)
  V4L/DVB (9103): em28xx: HVR-900 B3C0 - fix audio clicking issue
  V4L/DVB (9099): em28xx: Add detection for K-WORLD DVB-T 310U
  V4L/DVB (9092): gspca: Bad init values for sonixj ov7660.
  V4L/DVB (9080): gspca: Add a delay after writing to the sonixj sensors.
  V4L/DVB (9075): gspca: Bad check of returned status in i2c_read() spca561.
  V4L/DVB (9053): fix buffer overflow in uvc-video
  V4L/DVB (9043): S5H1420: Fix size of shadow-array to avoid overflow
  V4L/DVB (9037): Fix support for Hauppauge Nova-S SE
  V4L/DVB (9029): Fix deadlock in demux code
  V4L/DVB (8979): sms1xxx: Add new USB product ID for Hauppauge WinTV MiniStick
  V4L/DVB (8978): sms1xxx: fix product name for Hauppauge WinTV MiniStick
  V4L/DVB (8967): Use correct XC3028L firmware for AMD ATI TV Wonder 600
  V4L/DVB (8963): s2255drv field count fix
  V4L/DVB (8961): zr36067: Fix RGBR pixel format
  V4L/DVB (8960): drivers/media/video/cafe_ccic.c needs mm.h
  V4L/DVB (8958): zr36067: Return proper bytes-per-line value
  V4L/DVB (8957): zr36067: Restore the default pixel format
  V4L/DVB (8955): bttv: Prevent NULL pointer dereference in radio_open
  V4L/DVB (8935): em28xx-cards: Remove duplicate entry (EM2800_BOARD_KWORLD_USB2800)
  V4L/DVB (8933): gspca: Disable light frquency for zc3xx cs2102 Kokom.
  ...

atmel-mci: Initialize BLKR before sending data transfer command

The atmel-mci driver sometimes fails data transfers like this:

   mmcblk0: error -5 transferring data
   end_request: I/O error, dev mmcblk0, sector 2749769
   end_request: I/O error, dev mmcblk0, sector 2749777

It turns out that this might be caused by the BLKR register (which
contains the block size and the number of blocks being transfered) being
initialized too late. This patch moves the initialization of BLKR so
that it contains the correct value before the block transfer command is
sent.

This error is difficult to reproduce, but if you insert a long delay
(mdelay(10) or thereabouts) between the calls to atmci_start_command()
and atmci_submit_data(), all transfers seem to fail without this patch,
while I haven't seen any failures with this patch.

Reported-by: Hein_Tibosch <hein_tibosch@yahoo.es>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kgdb: call touch_softlockup_watchdog on resume

The softlockup watchdog needs to be touched when resuming the from the
kgdb stopped state to avoid the printk that a CPU is stuck if the
debugger was active for longer than the softlockup threshold.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>

kgdb, x86: Avoid invoking kgdb_nmicallback twice per NMI

Stress-testing KVM's latest NMI support with kgdbts inside an SMP guest,
I came across spurious unhandled NMIs while running the singlestep test.
Looking closer at the code path each NMI takes when KGDB is enabled, I
noticed that kgdb_nmicallback is called twice per event: One time via
DIE_NMI_IPI notification, the second time on DIE_NMI. Removing the first
invocation cures the unhandled NMIs here.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>

x86 ACPI: Blacklist two HP machines with buggy BIOSes

There is a bug in the BIOSes of some HP boxes with AMD Turions which
connects IO-APIC pins with ACPI thermal trip points in such a way that
if the state of the IO-APIC is not as expected by the (buggy) BIOS, the
thermal trip points are set to insanely low values (usually all of them
become 16 degrees Celsius).  As a result, thermal throttling kicks in
and knock the system down to its shoes.

Unfortunately some of the recent IO-APIC changes made the bug show up.
To prevent this from happening, blacklist machines that are known to be
affected (nx6115 and 6715b in this particular case).

This fixes http://bugzilla.kernel.org/show_bug.cgi?id=11516 listed as
a regression from 2.6.26.

On my box it was caused by:

commit 691874fa96d6349a8b60f8ea9c2bae52ece79941
Author: Maciej W. Rozycki <macro@linux-mips.org>
Date:   Tue May 27 21:19:51 2008 +0100

    x86: I/O APIC: timer through 8259A second-chance

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
and the whole story is described in this (huge) thread:

    http://marc.info/?l=linux-kernel&m=121358440508410&w=4

Matthew Garrett told us about that happening on the nx6125:

    http://marc.info/?l=linux-kernel&m=121396307411930&w=4

and then Maciej analysed the breakage on the basis of a DSDT from the
nx6325:

    http://marc.info/?l=linux-kernel&m=121401068718826&w=4

As far as the Dmitry's and Jason's boxes are concerned, I recognized the
symptoms and asked them to verify that the blacklisting helped.

It appears that the buggy BIOS code has been copy-pasted to the entire
range of machines, for no good reason.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Tested-by: Jason Vas Dias <jason.vas.dias@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[MIPS] IP27: Fix build errors if CONFIG_MAPPED_KERNEL=y

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

[MIPS] Fix CMP Kconfig configuration and mark as broken.

Because sync-r4k.c doesn't build.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

ide: workaround for bogus gcc warning in ide_sysfs_register_port()

Reported-by: "Steven Noonan" <steven@uplinklabs.net>
Suggested-by: "Elias Oltmanns" <eo@nebensachen.de>
Cc: mingo@elte.hu
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide-cd: Optiarc DVD RW AD-7200A does play audio

The Optiarc DVD RW AD-7200A can play audio, but tells it could not.

Signed-off-by: Bodo Eggert <7eggert@gmx.de>
Tested-by: Nick Warne <nick@ukfsn.org>
Received-from: Borislav Petkov <petkovbb@googlemail.com>
[bart: keep "audio" quirks together]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

IDE: Fix platform device registration in Swarm IDE driver (v2)

The Swarm IDE driver uses a release method which is defined in the driver
itself thus potentially oopsable. The simple fix would be to just leak
the device but this patch goes the full length and moves the entire
handling of the platform device in the platform code and retains only
the platform driver code in drivers/ide/mips/swarm.c.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: "Maciej W. Rozycki" <macro@linux-mips.org>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
[bart: remove no longer needed BLK_DEV_IDE_SWARM from ide/Kconfig]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide-dma: fix ide_build_dmatable() for TRM290

Apparently, 'xcount' being 0 does not mean 0 bytes for TRM290; it means 4 bytes,
judging from the code immediately preceding this check. So, we must never try
to "split" the PRD for TRM290.

This is probably never hit anyway -- with the DMA buffers aligned to at least
512 bytes and ATAPI DMA not being used for non block I/O commands...

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

ide-cd: temporary tray close fix

This one fixes http://bugzilla.kernel.org/show_bug.cgi?id=11602.

A more generic fix for drives which cannot autoclose tray will follow.

Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
[bart: add an extra parentheses for consistency with the rest of kernel code]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

x86: gart iommu have direct mapping when agp is present too

move init_memory_mapping() out of init_k8_gatt.

for: http://bugzilla.kernel.org/show_bug.cgi?id=11676
2.6.27-rc2 to rc8, apgart fails, iommu=soft works, regression

This is needed because we need to map the GART aperture even
if the GATT is not initialized.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

V4L/DVB (9103): em28xx: HVR-900 B3C0 - fix audio clicking issue

Fixed audio clicking problem which could be heard when using analog tv or composite input

Signed-off-by: Wiktor Grebla <greblus@gmail.com>
Signed-off-by: Douglas Schilling Landgraf <dougsland@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (9099): em28xx: Add detection for K-WORLD DVB-T 310U

Correct firmware type to MTS
Correct audio routing for composite/s-video
Add DVB-T detection.

This patch uses the eeprom hash method for detection as the vendor/product
ids are also used for the DIGIVOX_AD. This may be a clone of the same
product. Explanatory text has been added prior to the hask look-up in
anticipation that it may help others.

The following has been tested to work:
Analogue TV (PAL-I)
Composite In
DVB-T (UK Crystal Palace)
USB AUDIO

The following has not been tested but probably works:
S-Video In

Signed-off-by: Darron Broad <darron@kewl.org>
Signed-off-by: Douglas Schilling Landgraf <dougsland@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (9092): gspca: Bad init values for sonixj ov7660.

Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (9080): gspca: Add a delay after writing to the sonixj sensors.

Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (9075): gspca: Bad check of returned status in i2c_read() spca561.

This makes auto gain functional on 04fc:0561.

Signed-off-by: Shane <gnome42@gmail.com>
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (9053): fix buffer overflow in uvc-video

There is a buffer overflow in drivers/media/video/uvc/uvc_ctrl.c:

INFO: 0xf2c5ce08-0xf2c5ce0b. First byte 0xa1 instead of 0xcc
INFO: Allocated in uvc_query_v4l2_ctrl+0x3c/0x239 [uvcvideo] age=13 cpu=1 pid=4975
...

A fixed size 8-byte buffer is allocated, and a variable size field is read
into it; there is no particular bound on the size of the field (it is
dependent on hardware and configuration) and it can overflow [also
verified by inserting printk's.]

The patch attempts to size the buffer to the correctly.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Laurent Pinchart <laurent.pinchart@skynet.be>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (9043): S5H1420: Fix size of shadow-array to avoid overflow

The array size of 'shadow' still needs to be fixed in order to not overflow when reading register 0x00.

Thanks to Oliver Endriss for pointing that out.

Signed-off-by: Patrick Boettcher <pb@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (9037): Fix support for Hauppauge Nova-S SE

Different backends have different input busses (saa7146, flexcop).
To reflect that a config-option to the s5h1420-driver was added which makes
the output mode selectable.

Furthermore the s5h1420-driver is now doing the same i2c-method as it was done
before adding support for other i2c-users.

This patch needs to go into the current release of the kernel, as this driver
is currently broken.

(Thanks to Eberhard Kaltenhaeuser for helping out to debug this issue.)

Signed-off-by: Patrick Boettcher <pb@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (9029): Fix deadlock in demux code

The functions dvb_dmxdev_section_callback, dvb_dmxdev_ts_callback,
dvb_dmx_swfilter_packet, dvb_dmx_swfilter_packets, dvb_dmx_swfilter and
dvb_dmx_swfilter_204 may be called from both interrupt and process
context. Therefore they need to be protected by spin_lock_irqsave()
instead of spin_lock().

This fixes a deadlock discovered by lockdep.

Signed-off-by: Andreas Oberritter <obi@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (8979): sms1xxx: Add new USB product ID for Hauppauge WinTV MiniStick

2040:5510 is the same hardware as 2040:5500

Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (8978): sms1xxx: fix product name for Hauppauge WinTV MiniStick

Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (8967): Use correct XC3028L firmware for AMD ATI TV Wonder 600

The AMD ATI TV Wonder 600 has an XC3028L and *not* an XC3028, so we need to
load the proper firmware to prevent the device from overheating.

Signed-off-by: Devin Heitmueller <devin.heitmueller@gmail.com>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (8963): s2255drv field count fix

Fixes videobuf field_count

Signed-off-by: Dean Anderson <dean@sensoray.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (8961): zr36067: Fix RGBR pixel format

The zr36067 driver is improperly declaring pixel format RGBP twice,
once as "16-bit RGB LE" and once as "16-bit RGB BE". The latter is
actually RGBR. Fix the code to properly map both pixel formats.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (8960): drivers/media/video/cafe_ccic.c needs mm.h

sparc32 allmodconfig:

drivers/media/video/cafe_ccic.c: In function 'cafe_setup_siobuf':
drivers/media/video/cafe_ccic.c:1192: error: implicit declaration of function 'PAGE_ALIGN'
drivers/media/video/cafe_ccic.c: At top level:
drivers/media/video/cafe_ccic.c:1430: error: variable 'cafe_v4l_vm_ops' has initializer but incomplete type
drivers/media/video/cafe_ccic.c:1431: error: unknown field 'open' specified in initializer
drivers/media/video/cafe_ccic.c:1431: warning: excess elements in struct initializer
drivers/media/video/cafe_ccic.c:1431: warning: (near initialization for 'cafe_v4l_vm_ops')
drivers/media/video/cafe_ccic.c:1432: error: unknown field 'close' specified in initializer
drivers/media/video/cafe_ccic.c:1433: warning: excess elements in struct initializer
drivers/media/video/cafe_ccic.c:1433: warning: (near initialization for 'cafe_v4l_vm_ops')
drivers/media/video/cafe_ccic.c: In function 'cafe_v4l_mmap':
drivers/media/video/cafe_ccic.c:1444: error: 'VM_WRITE' undeclared (first use in this function)
drivers/media/video/cafe_ccic.c:1444: error: (Each undeclared identifier is reported only once
drivers/media/video/cafe_ccic.c:1444: error: for each function it appears in.)
drivers/media/video/cafe_ccic.c:1444: error: 'VM_SHARED' undeclared (first use in this function)
drivers/media/video/cafe_ccic.c:1461: error: 'VM_DONTEXPAND' undeclared (first use in this function)

This build breakage is caused by some header file shuffle in linux-next. But
I suggest that this patch be merged ahead of linux-next to avoid bisection
breakage.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (8958): zr36067: Return proper bytes-per-line value

The zr36067 driver should return the actual bytes-per-line value when
queried with ioctl VIDIOC_G_FMT, instead of 0. Otherwise user-space
applications can get confused.

Likewise, with ioctl VIDIOC_S_FMT, we are supposed to fill the
bytes-per-line value. And we shouldn't fail if the caller sets the
initial value to something different from 0. This is perfectly valid
for applications to pre-fill this field with the value they expect.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (8957): zr36067: Restore the default pixel format

Restore the default pixel format to YUYV as it used to be before
kernel 2.6.23. It was accidentally changed to BGR3 by commit
603d6f2c8f9f3604f9c6c1f8903efc2df30a000f.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (8955): bttv: Prevent NULL pointer dereference in radio_open

Fix the following crash in the bttv driver:

BUG: unable to handle kernel NULL pointer dereference at 000000000000036c
IP: [<ffffffffa037860a>] radio_open+0x3a/0x170 [bttv]

This happens because radio_open assumes that all present bttv devices
have a radio function. If a bttv device without radio and one with
radio are installed on the same system, and the one without radio is
registered first, then radio_open checks for the radio device number
of a bttv device that has no radio function, and this breaks. All we
have to do to fix it is to skip bttv devices without a radio function.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (8935): em28xx-cards: Remove duplicate entry (EM2800_BOARD_KWORLD_USB2800)

Removed duplicated entry for EM2800_BOARD_KWORLD_USB2800

Signed-off-by: Douglas Schilling Landgraf <dougsland@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (8933): gspca: Disable light frquency for zc3xx cs2102 Kokom.

CS2102K stop streaming on setlightfreq (50Hz & 60Hz).
Disable it for now until a correct solution is found.

Signed-off-by: Costantino Leandro <le_costantino@pixartargentina.com.ar>
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (8926): gspca: Bad fix of leak memory (changeset 43d2ead315b1).

Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (8919): cx18: Fix tuner audio input for Compro H900 cards

Earlier fixes to get the tuner audio working correctly broke the audio
on the Compro VideoMate H900 cards. This is now fixed.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
clockevents: check broadcast tick device not the clock events device

Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86 setup: correct segfault in generation of 32-bit reloc kernel

clockevents: check broadcast tick device not the clock events device

Impact: jiffies increment too fast.

Hugh Dickins noted that with NOHZ=n and HIGHRES=n jiffies get
incremented too fast. The reason is a wrong check in the broadcast
enter/exit code, which keeps the local apic timer in periodic mode
when the switch happens.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
selinux: Fix an uninitialized variable BUG/panic in selinux_secattr_to_sid()

ACPI: Make /proc/acpi/wakeup interface handle PCI devices (again)

Make the ACPI /proc/acpi/wakeup interface set the appropriate wake-up bits
of physical devices corresponding to the ACPI devices and make those bits
be set initially for devices that are enabled to wake up by default. This
is needed to restore the 2.6.26 and earlier behavior for the PCI devices
that were previously handled correctly with the help of the
/proc/acpi/wakeup interface.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

leds-pca955x: add proper error handling and fix bogus memory handling

Check the return value of led_classdev_register and unregister all
registered devices, if registering one device fails.  Also the dynamic
memory handling is totally bogus.  You can't allocate multiple chunks via
kzalloc() and expect them to be in order later.  I wonder how this ever
worked.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Nate Case <ncase@xes-inc.com>
Tested-by: Nate Case <ncase@xes-inc.com>
Acked-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

leds-fsg: change order of initialization and deinitialization

On initialization, we first do the ioremap and then register the led devices.
On deinitialization, we do it in reverse order. This prevents someone calling
into the brightness_set functions with an invalid latch_address.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Rod Whitby <rod@whitby.id.au>
Acked-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dw_dmac: fix copy/paste bug in tasklet

The tasklet checks RAW.BLOCK twice, and does not check RAW.XFER. This is
obviously wrong, and could theoretically cause the driver to hang.

Reported-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Documentation/HOWTO: info about interface changes should CC linux-api@vger

The "Documentation" section of this file mentions that when an interface
change is made, I should be CCed with info about the change (so that
man-pages can document it). Additionally request that this info be CCed
to the new linux-api@vger.kernel.org list.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

SubmitChecklist: interfaces changes should CC linux-api@

Mention that patches that change the kernel-userland interface should
be CCed to the new list linux-api@vger.kernel.org.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: add mailing list for man-pages

Nowadays, man-pages has an associated mailing list. Mention that list
in MAINTAINERS.

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cpusets: remove pj from cpuset maintainers

Remove myself from the kernel MAINTAINERS file for cpusets.  I am leaving
SGI and probably will not be active in Linux kernel work.  I can be
reached at <pj@usa.net>.  Contact Derek Fults <dfults@sgi.com> for future
SGI+cpuset related issues.  I'm off to the next chapter of this good life.

Signed-off-by: Paul Jackson <pj@sgi.com>
Cc: Paul Menage <menage@google.com>
Cc: Derek Fults <dfults@sgi.com>
Cc: John Hesterberg <jh@sgi.com>
Cc: Paul Jackson <pj@usa.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

include/linux/stacktrace.h: declare struct task_struct

include/linux/stacktrace.h:13: warning:
'struct task_struct' declared inside parameter list

(This might be a hard error on sparc64, which uses this header and has
-Werror)

Reported-by: "Randy.Dunlap" <rdunlap@xenotime.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

orion_spi: fix handling of default transfer speed

Accept zero (the default!) as a per-transfer clock speed override.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fbdev: fix recursive notifier and locking when fbdev console is blanked

Fix infinite recursive notifier in the fbdev layer.  This causes recursive
locking.  Dmitry Baryshkov found the problem and confirmed that the patch
fixes the bug.

After doing
# echo 1 > /sys/class/graphics/fb0/blank
I got the following in my kernel log:

=============================================
[ INFO: possible recursive locking detected ]
2.6.27-rc6-00086-gda63874-dirty #97
---------------------------------------------
echo/1564 is trying to acquire lock:
((fb_notifier_list).rwsem){..--}, at: [<c005a384>] __blocking_notifier_call_chain+0x38/0x6c

but task is already holding lock:
((fb_notifier_list).rwsem){..--}, at: [<c005a384>] __blocking_notifier_call_chain+0x38/0x6c

other info that might help us debug this:
2 locks held by echo/1564:
#0:  (&buffer->mutex){--..}, at: [<c00ddde0>] sysfs_write_file+0x30/0x80
#1:  ((fb_notifier_list).rwsem){..--}, at: [<c005a384>] __blocking_notifier_call_chain+0x38/0x6c

stack backtrace:
[<c0029fe4>] (dump_stack+0x0/0x14) from [<c0060ce0>] (print_deadlock_bug+0xa4/0xd0)
[<c0060c3c>] (print_deadlock_bug+0x0/0xd0) from [<c0060e54>] (check_deadlock+0x148/0x17c)
r6:c397a1e0 r5:c397a530 r4:c04fcf98
[<c0060d0c>] (check_deadlock+0x0/0x17c) from [<c00637e8>] (validate_chain+0x3c4/0x4f0)
[<c0063424>] (validate_chain+0x0/0x4f0) from [<c0063efc>] (__lock_acquire+0x5e8/0x6b4)
[<c0063914>] (__lock_acquire+0x0/0x6b4) from [<c006402c>] (lock_acquire+0x64/0x78)
[<c0063fc8>] (lock_acquire+0x0/0x78) from [<c0316ca8>] (down_read+0x4c/0x60)
r7:00000009 r6:ffffffff r5:c0427a40 r4:c005a384
[<c0316c5c>] (down_read+0x0/0x60) from [<c005a384>] (__blocking_notifier_call_chain+0x38/0x6c)
r5:c0427a40 r4:c0427a74
[<c005a34c>] (__blocking_notifier_call_chain+0x0/0x6c) from [<c005a3d8>] (blocking_notifier_call_chain+0x20/0x28)
r8:00000009 r7:c086d640 r6:c3967940 r5:00000000 r4:c38984b8
[<c005a3b8>] (blocking_notifier_call_chain+0x0/0x28) from [<c014baa0>] (fb_notifier_call_chain+0x1c/0x24)
[<c014ba84>] (fb_notifier_call_chain+0x0/0x24) from [<c014c18c>] (fb_blank+0x64/0x70)
[<c014c128>] (fb_blank+0x0/0x70) from [<c0155978>] (fbcon_blank+0x114/0x1bc)
r5:00000001 r4:c38984b8
[<c0155864>] (fbcon_blank+0x0/0x1bc) from [<c0170ea8>] (do_blank_screen+0x1e0/0x2a0)
[<c0170cc8>] (do_blank_screen+0x0/0x2a0) from [<c0154024>] (fbcon_fb_blanked+0x74/0x94)
r5:c3967940 r4:00000001
[<c0153fb0>] (fbcon_fb_blanked+0x0/0x94) from [<c0154228>] (fbcon_event_notify+0x100/0x12c)
r5:fffffffe r4:c39bc194
[<c0154128>] (fbcon_event_notify+0x0/0x12c) from [<c005a0d4>] (notifier_call_chain+0x38/0x7c)
[<c005a09c>] (notifier_call_chain+0x0/0x7c) from [<c005a3a0>] (__blocking_notifier_call_chain+0x54/0x6c)
r8:c3b51ea0 r7:00000009 r6:ffffffff r5:c0427a40 r4:c0427a74
[<c005a34c>] (__blocking_notifier_call_chain+0x0/0x6c) from [<c005a3d8>] (blocking_notifier_call_chain+0x20/0x28)
r8:00000001 r7:c3a7e000 r6:00000000 r5:00000000 r4:c38984b8
[<c005a3b8>] (blocking_notifier_call_chain+0x0/0x28) from [<c014baa0>] (fb_notifier_call_chain+0x1c/0x24)
[<c014ba84>] (fb_notifier_call_chain+0x0/0x24) from [<c014c18c>] (fb_blank+0x64/0x70)
[<c014c128>] (fb_blank+0x0/0x70) from [<c014e450>] (store_blank+0x54/0x7c)
r5:c38984b8 r4:c3b51ec4
[<c014e3fc>] (store_blank+0x0/0x7c) from [<c017981c>] (dev_attr_store+0x28/0x2c)
r8:00000001 r7:c042bf80 r6:c39eba10 r5:c3967c30 r4:c38e0140
[<c01797f4>] (dev_attr_store+0x0/0x2c) from [<c00ddaac>] (flush_write_buffer+0x54/0x68)
[<c00dda58>] (flush_write_buffer+0x0/0x68) from [<c00dde08>] (sysfs_write_file+0x58/0x80)
r8:c3b51f78 r7:c3bcb070 r6:c39eba10 r5:00000001 r4:00000001
[<c00dddb0>] (sysfs_write_file+0x0/0x80) from [<c009de04>] (vfs_write+0xb8/0x148)
[<c009dd4c>] (vfs_write+0x0/0x148) from [<c009e384>] (sys_write+0x44/0x70)
r7:00000004 r6:c3bcb070 r5:00000000 r4:00000000
[<c009e340>] (sys_write+0x0/0x70) from [<c0025d00>] (ret_fast_syscall+0x0/0x2c)
r6:4001b000 r5:00000001 r4:401dc658

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Reported-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Testted-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rtc: fix kernel panic on second use of SIGIO nofitication

When userspace uses SIGIO notification and forgets to disable it before
closing file descriptor, rtc->async_queue contains stale pointer to struct
file.  When user space enables again SIGIO notification in different
process, kernel dereferences this (poisoned) pointer and crashes.

So disable SIGIO notification on close.

Kernel panic:
(second run of qemu (requires echo 1024 > /sys/class/rtc/rtc0/max_user_freq))

general protection fault: 0000 [1] PREEMPT
CPU 0
Modules linked in: af_packet snd_pcm_oss snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq usbhid tuner tea5767 tda8290 tuner_xc2028 xc5000 tda9887 tuner_simple tuner_types mt20xx tea5761 tda9875 uhci_hcd ehci_hcd usbcore bttv snd_via82xx snd_ac97_codec ac97_bus snd_pcm snd_timer ir_common compat_ioctl32 snd_page_alloc videodev v4l1_compat snd_mpu401_uart snd_rawmidi v4l2_common videobuf_dma_sg videobuf_core snd_seq_device snd btcx_risc soundcore tveeprom i2c_viapro
Pid: 5781, comm: qemu-system-x86 Not tainted 2.6.27-rc6 #363
RIP: 0010:[<ffffffff8024f891>]  [<ffffffff8024f891>] __lock_acquire+0x3db/0x73f
RSP: 0000:ffffffff80674cb8  EFLAGS: 00010002
RAX: ffff8800224c62f0 RBX: 0000000000000046 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800224c62f0
RBP: ffffffff80674d08 R08: 0000000000000002 R09: 0000000000000001
R10: ffffffff80238941 R11: 0000000000000001 R12: 0000000000000000
R13: 6b6b6b6b6b6b6b6b R14: ffff88003a450080 R15: 0000000000000000
FS:  00007f98b69516f0(0000) GS:ffffffff80623200(0000) knlGS:00000000f7cc86d0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000a87000 CR3: 0000000022598000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process qemu-system-x86 (pid: 5781, threadinfo ffff880028812000, task ffff88003a450080)
Stack:  ffffffff80674cf8 0000000180238440 0000000200000002 0000000000000000
ffff8800224c62f0 0000000000000046 0000000000000000 0000000000000002
0000000000000002 0000000000000000 ffffffff80674d68 ffffffff8024fc7a
Call Trace:
<IRQ>  [<ffffffff8024fc7a>] lock_acquire+0x85/0xa9
[<ffffffff8029cb62>] ? send_sigio+0x2a/0x184
[<ffffffff80491d1f>] _read_lock+0x3e/0x4a
[<ffffffff8029cb62>] ? send_sigio+0x2a/0x184
[<ffffffff8029cb62>] send_sigio+0x2a/0x184
[<ffffffff8024fb97>] ? __lock_acquire+0x6e1/0x73f
[<ffffffff8029cd4d>] ? kill_fasync+0x2c/0x4e
[<ffffffff8029cd10>] __kill_fasync+0x54/0x65
[<ffffffff8029cd5b>] kill_fasync+0x3a/0x4e
[<ffffffff80402896>] rtc_update_irq+0x9c/0xa5
[<ffffffff80404640>] cmos_interrupt+0xae/0xc0
[<ffffffff8025d1c1>] handle_IRQ_event+0x25/0x5a
[<ffffffff8025e5e4>] handle_edge_irq+0xdd/0x123
[<ffffffff8020da34>] do_IRQ+0xe4/0x144
[<ffffffff8020bad6>] ret_from_intr+0x0/0xf
<EOI>  [<ffffffff8026fdc2>] ? __alloc_pages_internal+0xe7/0x3ad
[<ffffffff8033fe67>] ? clear_page_c+0x7/0x10
[<ffffffff8026fc10>] ? get_page_from_freelist+0x385/0x450
[<ffffffff8026fdc2>] ? __alloc_pages_internal+0xe7/0x3ad
[<ffffffff80280aac>] ? anon_vma_prepare+0x2e/0xf6
[<ffffffff80279400>] ? handle_mm_fault+0x227/0x6a5
[<ffffffff80494716>] ? do_page_fault+0x494/0x83f
[<ffffffff8049251d>] ? error_exit+0x0/0xa9

Code: cc 41 39 45 28 74 24 e8 5e 1d 0f 00 85 c0 0f 84 6a 03 00 00 83 3d 8f a9 aa 00 00 be 47 03 00 00 0f 84 6a 02 00 00 e9 53 03 00 00 <41> ff 85 38 01 00 00 45 8b be 90 06 00 00 41 83 ff 2f 76 24 e8
RIP  [<ffffffff8024f891>] __lock_acquire+0x3db/0x73f
RSP <ffffffff80674cb8>
---[ end trace 431877d860448760 ]---
Kernel panic - not syncing: Aiee, killing interrupt handler!

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Alessandro Zummo <alessandro.zummo@towertech.it>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

selinux: Fix an uninitialized variable BUG/panic in selinux_secattr_to_sid()

At some point during the 2.6.27 development cycle two new fields were added
to the SELinux context structure, a string pointer and a length field. The
code in selinux_secattr_to_sid() was not modified and as a result these two
fields were left uninitialized which could result in erratic behavior,
including kernel panics, when NetLabel is used. This patch fixes the
problem by fully initializing the context in selinux_secattr_to_sid() before
use and reducing the level of direct context manipulation done to help
prevent future problems.

Please apply this to the 2.6.27-rcX release stream.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>