This fixes a TX hang reported by Jesper Dangaard Brouer.
When an architecture cannot provide fully functional 64-bit atomic readq/writeq, the driver must implement its own. This is because only the driver can say whether something like using two 32-bit reads to implement the full 64-bit read will actually work properly.
In particular one of the issues is whether the top 32-bits
or the bottom 32-bits of the 64-bit register should be read
first. There could be side effects, and in fact that is
exactly the problem here.
The TX_CS register has counters in the upper 32-bits and
state bits in the lower 32-bits. A read clears the state
bits.
We would read the counter half before the state bit half.
That first read would clear the state bits, and then the
driver thinks that no interrupts are pending because the
interrupt indication state bits are seen clear every time.
Fix this by reading the bottom half before the upper half.
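A minimal sketch of the driver-private accessor implied by the fix, assuming readl() and an illustrative register layout (the real niu offsets may differ):

static u64 niu_readq(void __iomem *reg)
{
	u32 lo = readl(reg);		/* lower half first: state bits, cleared on read */
	u32 hi = readl(reg + 0x4);	/* then the counter half */

	return ((u64)hi << 32) | lo;
}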
Tested-by: Jesper Dangaard Brouer <jdb@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Currently libata uses four methods to detect device presence.
1. PHY status if available.
2. TF register R/W test (only promotes presence, never demotes)
3. device signature after reset
4. IDENTIFY failure detection in SFF state machine
A combination of the above works well in most cases, but recently there have been a few reports where a phantom device causes an unnecessary delay during probe. In both cases, PHY status wasn't available. In one case, the phantom passed #2 and #3 and failed IDENTIFY with ATA_ERR, which didn't qualify as #4. The other failed #2, but as it passed #3 and #4, it still caused a failure.
In both cases, phantom device reported diagnostic failure, so these
cases can be safely worked around by considering any !ATA_DRQ IDENTIFY
failure as NODEV_HINT if diagnostic failure is set.
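A minimal sketch of the workaround's shape (the predicate below is hypothetical; the real check lives in the libata EH/SFF paths and the exact diagnostic-failure test may differ):

/* IDENTIFY failed without ever asserting DRQ, and the device reports
 * diagnostic failure: treat it as a no-device hint. */
static bool phantom_nodev_hint(u8 command, u8 status, bool diag_failed)
{
	return (command == ATA_CMD_ID_ATA || command == ATA_CMD_ID_ATAPI) &&
	       !(status & ATA_DRQ) && diag_failed;
}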
In ordered mode, if a file data buffer being dirtied exists in the committing transaction, we write the buffer to disk, move it from the committing transaction to the running transaction, and then dirty it. But we must not remove the buffer from the committing transaction when the buffer could not be written out; otherwise the committing transaction would miss the error and would not abort.
This patch adds an error check before removing the buffer from the
committing transaction.
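A minimal sketch of the added check, assuming jbd's usual buffer-head helpers (the surrounding refile logic is elided):

	if (buffer_dirty(bh))
		sync_dirty_buffer(bh);	/* write the buffer out */
	if (!buffer_uptodate(bh))
		goto out;	/* leave it in the committing transaction so
				 * the commit path sees the I/O error and aborts */
	/* only on success is the buffer refiled to the running transaction */
	__journal_refile_buffer(jh);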
Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We could run into an ENOSPC error on ext3 even when there are free blocks on the filesystem.
The problem is triggered when the goal block group has zero free blocks and the remaining block groups are skipped due to the "free_blocks < windowsz/2" check. The current code can fall back to non-reservation allocation to prevent early ENOSPC after examining all the block groups with reservation on, but this code was bypassed if the reservation window had already been turned off, which is true in this case.
This patch fixes two issues:
1) We don't need to turn off block reservation when the goal block group has zero free blocks left; we should continue searching the rest of the block groups. The intention of the current code is to turn off block reservation if the goal allocation group has a few free blocks left (not enough for the desired reservation window) and still try to allocate in the goal block group, to get better locality. But if the goal group has zero free blocks, it should leave block reservation on and continue the search in the next block groups, rather than turning block reservation off completely.
2) We don't need to check the window size if block reservation is off.
The problem was originally found and fixed in ext4.
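A minimal sketch of the corrected group-scan logic, assuming the usual ext3 balloc names (my_rsv is the per-file reservation, windowsz the desired window size):

	free_blocks = le16_to_cpu(desc->bg_free_blocks_count);
	/* skip this group (without loading its bitmap) if it is full */
	if (!free_blocks)
		continue;
	/* only with reservation on: skip groups too small for half a window */
	if (my_rsv && free_blocks < windowsz / 2)
		continue;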
We could run into an ENOSPC error on ext2 even when there are free blocks on the filesystem.
The problem is triggered when the goal block group has zero free blocks and the remaining block groups are skipped due to the "free_blocks < windowsz/2" check. The current code can fall back to non-reservation allocation to prevent early ENOSPC after examining all the block groups with reservation on, but this code was bypassed if the reservation window had already been turned off, which is true in this case.
This patch fixes two issues:
1) We don't need to turn off block reservation when the goal block group has zero free blocks left; we should continue searching the rest of the block groups. The intention of the current code is to turn off block reservation if the goal allocation group has a few free blocks left (not enough for the desired reservation window) and still try to allocate in the goal block group, to get better locality. But if the goal group has zero free blocks, it should leave block reservation on and continue the search in the next block groups, rather than turning block reservation off completely.
2) We don't need to check the window size if block reservation is off.
The problem was originally found and fixed in ext4.
When trying to resize an ext3 fs and you run out of reserved gdt blocks, you get an error that doesn't actually tell you what went wrong; it just says that the gdb it picked is not correct, which is the case since you don't have any reserved gdt blocks left. This patch adds a check to make sure you have reserved gdt blocks to use, and if not, prints out a more relevant error.
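A minimal sketch of the added check, assuming it sits in the group-add path (exact placement and message are illustrative):

	if (!le16_to_cpu(es->s_reserved_gdt_blocks)) {
		ext3_warning(sb, __func__,
			     "No reserved GDT blocks, can't resize");
		return -EPERM;
	}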
Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Andreas Dilger <adilger@sun.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix a regression caused by commit 6a897cf4, "ext3: fix ext3_dx_readdir hash collision handling", where deleting files in a large directory (requiring more than one getdents system call) results in some filenames being returned twice. This was caused by a failure to update info->curr_hash and info->curr_minor_hash, so that if the directory had been modified since the last getdents() system call (as would be the case if the user is running "rm -r" or "git clean"), a directory entry would get returned twice to userspace.
This patch fixes the bug reported by Markus Trippelsdorf at:
http://bugzilla.kernel.org/show_bug.cgi?id=11844
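The essence of the fix, sketched (field names follow the description above; the surrounding readdir loop is elided):

	/* after emitting each entry, remember the position so a later
	 * getdents() restarts from the right hash, even if the
	 * directory changed in between */
	info->curr_hash = fname->hash;
	info->curr_minor_hash = fname->minor_hash;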
When initializing an uninitialized block group in ext4_new_inode(),
its block group checksum must be re-calculated. This fixes a race
when several threads try to allocate a new inode in an UNINIT'd group.
There is some question whether we need to be initializing the block
bitmap in ext4_new_inode() at all, but for now, if we are going to
init the block group, let's eliminate the race.
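A minimal sketch of the checksum recalculation under the group lock, assuming the 2.6.27-era ext4 helpers:

	spin_lock(sb_bgl_lock(EXT4_SB(sb), group));
	if (gdp->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT)) {
		gdp->bg_flags &= cpu_to_le16(~EXT4_BG_BLOCK_UNINIT);
		gdp->bg_free_blocks_count =
			cpu_to_le16(ext4_free_blocks_after_init(sb, group, gdp));
		gdp->bg_checksum = ext4_group_desc_csum(EXT4_SB(sb), group, gdp);
	}
	spin_unlock(sb_bgl_lock(EXT4_SB(sb), group));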
This fixes a 2.6.27 regression which was introduced in commit a02908f1. We weren't passing the chunk parameter down to the two helper functions, ext4_indirect_trans_blocks() and ext4_ext_index_trans_blocks(), with the result that we massively overestimated the number of credits needed by ext4_da_writepages, especially in the non-extents case. This caused failures especially on /boot partitions, which tend to be small and non-extent-using since GRUB doesn't handle extents.
This patch fixes the bug reported by Joseph Fannin at:
http://bugzilla.kernel.org/show_bug.cgi?id=11964
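A sketch of the corrected dispatch, assuming the helper keeps this shape (parameter order illustrative):

static int ext4_index_trans_blocks(struct inode *inode, int nrblocks,
				   int chunk)
{
	if (!(EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL))
		return ext4_indirect_trans_blocks(inode, nrblocks, chunk);
	return ext4_ext_index_trans_blocks(inode, nrblocks, chunk);
}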
In ext4_sync_fs, we only wait for a commit to finish if we started it,
but there may be one already in progress which will not be synced.
In the case of a data=ordered umount with pending long symlinks which
are delayed due to a long list of other I/O on the backing block
device, this causes the buffer associated with the long symlinks to
not be moved to the inode dirty list in the second phase of
fsync_super. Then, before they can be dirtied again, kjournald exits,
seeing the UMOUNT flag and the dirty pages are never written to the
backing block device, causing long symlink corruption and exposing new
or previously freed block data to userspace.
To ensure all commits are synced, we flush all journal commits now
when sync_fs'ing ext4.
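A minimal sketch of the resulting ext4_sync_fs() shape, assuming the jbd2 helpers named here:

static int ext4_sync_fs(struct super_block *sb, int wait)
{
	tid_t target;

	sb->s_dirt = 0;
	/* flush any commit, whether or not we started it */
	if (jbd2_journal_start_commit(EXT4_SB(sb)->s_journal, &target)) {
		if (wait)
			jbd2_log_wait_commit(EXT4_SB(sb)->s_journal, target);
	}
	return 0;
}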
Signed-off-by: Arthur Jones <ajones@riverbed.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit 23f8b79e introduced a regression because it assumed that if
there were no transactions ready to be checkpointed, that no progress
could be made on making space available in the journal, and so the
journal should be aborted. This assumption is false; it could be the
case that simply calling jbd2_cleanup_journal_tail() will recover the
necessary space, or, for small journals, the currently committing
transaction could be responsible for chewing up the required space in
the log, so we need to wait for the currently committing transaction
to finish before trying to force a checkpoint operation.
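A sketch of the corrected decision logic in __jbd2_log_wait_for_space(), assuming 2.6.27-era jbd2 names (locking elided):

	if (chkpt) {
		jbd2_log_do_checkpoint(journal);
	} else if (jbd2_cleanup_journal_tail(journal) == 0) {
		/* cleaning the tail freed log space; retry the loop */
	} else if (tid) {
		/* wait for the committing transaction to release space */
		jbd2_log_wait_commit(journal, tid);
	} else {
		printk(KERN_ERR "%s: needed %d blocks and only had %d "
		       "space available\n", __func__, nblocks, space_left);
		jbd2_journal_abort(journal, 0);
	}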
This patch fixes a bug reported by Mihai Harpau at:
https://bugzilla.redhat.com/show_bug.cgi?id=469582
This patch fixes a bug reported by François Valenduc at:
http://bugzilla.kernel.org/show_bug.cgi?id=11840
Fix a regression caused by commit d0156417, "ext4: fix ext4_dx_readdir hash collision handling", where deleting files in a large directory (requiring more than one getdents system call) results in some filenames being returned twice. This was caused by a failure to update info->curr_hash and info->curr_minor_hash, so that if the directory had been modified since the last getdents() system call (as would be the case if the user is running "rm -r" or "git clean"), a directory entry would get returned twice to userspace.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This patch fixes the bug reported by Markus Trippelsdorf at:
http://bugzilla.kernel.org/show_bug.cgi?id=11844
During filesystem recovery we may be doing a truncate
which expects some of the mballoc data structures to
be initialized. So do ext4_mb_init before recovery.
ext4_xattr_set_handle() eventually ends up calling
ext4_mark_inode_dirty() which tries to expand the inode by shifting
the EAs. This leads to the xattr_sem being downed again and leading
to a deadlock.
This patch makes sure that if ext4_xattr_set_handle() is in the
call-chain, ext4_mark_inode_dirty() will not expand the inode.
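A sketch of the guard, using an inode state flag along the ext4 pattern (treat the exact flag usage as an assumption):

	/* ext4_xattr_set_handle() already holds xattr_sem, so tell
	 * ext4_mark_inode_dirty() not to expand the inode */
	EXT4_I(inode)->i_state |= EXT4_STATE_NO_EXPAND;
	error = ext4_mark_inode_dirty(handle, inode);
	EXT4_I(inode)->i_state &= ~EXT4_STATE_NO_EXPAND;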
Also make sure the buffer heads are marked clean before submitting bh
for writing. The previous code was marking the buffer head dirty,
which would have forced an unneeded write (and seek) to the journal
for no good reason.
The __jbd2_log_wait_for_space function sits in a loop checkpointing
transactions until there is sufficient space free in the journal.
However, if there are no transactions to be processed (e.g. because the
free space calculation is wrong due to a corrupted filesystem) it will
never progress.
Check for space being required when no transactions are outstanding and
abort the journal instead of endlessly looping.
This patch fixes the bug reported by Sami Liedes at:
http://bugzilla.kernel.org/show_bug.cgi?id=10976
Pick an ioctl number for EXT4_IOC_MIGRATE that won't conflict with
other ext4 ioctls. Since there haven't been any major userspace
users of this ioctl, we can afford to change this now, to avoid
potential problems later.
Also, reorder the ioctl numbers in ext4.h to avoid this sort of
mistake in the future.
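For illustration, an _IO-style definition out of the way of the existing 'f' numbers (the exact number chosen upstream may differ):

	#define EXT4_IOC_MIGRATE	_IO('f', 9)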
If the group descriptors are corrupted, we need to unlock the block group lock before returning from the function; else we will oops when freeing a spinlock which is still being held.
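The shape of the fix, sketched (the corruption predicate is hypothetical):

	spin_lock(sb_bgl_lock(EXT4_SB(sb), group));
	if (desc_corrupted(gdp)) {	/* hypothetical check */
		spin_unlock(sb_bgl_lock(EXT4_SB(sb), group));	/* was missing */
		ext4_error(sb, __func__, "bad group descriptor %u", group);
		return 0;
	}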
Some devices such as "cciss/c0d0p9" will cause jbd2 setup and teardown
failures when /proc filenames are created with embedded slashes. This
is a slimmed down version of commit 05496769, with the stack reduction
aspects of the patch omitted to meet the -stable criteria.
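A sketch of the sanitization this backport carries, assuming the j_devname field and the bdevname() helper:

	char *p;

	bdevname(journal->j_dev, journal->j_devname);
	p = strchr(journal->j_devname, '/');
	if (p)
		*p = '!';	/* an embedded '/' would break the /proc name */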
Prevent cifs_writepages() from skipping unwritten pages.
This fixes a data corruption under heavy stress in which pages could be left dirty after all open instances of an inode have been closed.
In order to write contiguous pages whenever possible, cifs_writepages()
asks pagevec_lookup_tag() for more pages than it may write at one time.
Normally, it then resets index just past the last page written before calling
pagevec_lookup_tag() again.
If cifs_writepages() couldn't write the first page returned, it wasn't resetting index, and the next call to pagevec_lookup_tag() resulted in skipping all of the pages it previously returned, even though cifs_writepages() did nothing with them. This can result in data loss when the file descriptor is about to be closed.
This patch ensures that index gets set back to the next returned page so
that none get skipped.
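A sketch of the index-reset pattern described above (the write-readiness predicate is hypothetical; the real loop has more cases):

	nr_pages = pagevec_lookup_tag(&pvec, mapping, &index,
				      PAGECACHE_TAG_DIRTY, tofind);
	for (i = 0; i < nr_pages; i++) {
		struct page *page = pvec.pages[i];

		if (!can_write_page(page)) {	/* hypothetical */
			index = page->index;	/* re-find this page next time */
			break;
		}
		/* ... write the page ... */
		index = page->index + 1;
	}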
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Acked-by: Jeff Layton <jlayton@redhat.com> Cc: Shirish S Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Cc: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Use a similar approach to the SMB session sharing. Add a list of tcons
attached to each SMB session. Move the refcount to non-atomic. Protect
all of the above with the cifs_tcp_ses_lock. Add functions to
properly find and put references to the tcons.
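Sketching the shape of the change (field names assumed, not verbatim):

	struct cifsTconInfo {
		struct list_head tcon_list;	/* linked into ses->tcon_list */
		int tc_count;			/* non-atomic; protected by
						 * cifs_tcp_ses_lock */
	};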
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Cc: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We do this by abandoning the global list of SMB sessions and instead
moving to a per-server list. This entails adding a new list head to the
TCP_Server_Info struct. The refcounting for the cifsSesInfo is moved to
a non-atomic variable. We have to protect it by a lock anyway, so there's
no benefit to making it an atomic. The list and refcount are protected
by the global cifs_tcp_ses_lock.
The patch also adds new routines to find and put SMB sessions, which properly take and put references under the lock.
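A sketch of the find/put pattern under the new lock (field names assumed):

static struct cifsSesInfo *
cifs_find_smb_ses(struct TCP_Server_Info *server, char *username)
{
	struct cifsSesInfo *ses;

	write_lock(&cifs_tcp_ses_lock);
	list_for_each_entry(ses, &server->smb_ses_list, smb_ses_list) {
		if (strncmp(ses->userName, username, MAX_USERNAME_SIZE))
			continue;
		++ses->ses_count;	/* non-atomic, safe under the lock */
		write_unlock(&cifs_tcp_ses_lock);
		return ses;
	}
	write_unlock(&cifs_tcp_ses_lock);
	return NULL;
}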
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Cc: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The code that allows these structs to be shared is extremely racy.
Disable the sharing of SMB and tcon structs for now until we can
come up with a way to do this that's race free.
We want to continue to share TCP sessions, however, since they are required for multiuser mounts. For that, implement a new (hopefully race-free) scheme. Add a new global list of TCP sessions, and take care to get a reference whenever we're dealing with one.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Cc: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We're currently declaring both a sockaddr_in and a sockaddr_in6 on the stack, but we really only need storage for one of them. Declare a sockaddr struct and cast it to the proper type. Also, eliminate the protocolType field in the TCP_Server_Info struct. It's redundant, since we have a sa_family field in the sockaddr anyway.
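A sketch of the single-storage idea (using sockaddr_storage here for illustration; the commit text says a sockaddr struct):

	struct sockaddr_storage addr;
	struct sockaddr_in *sin = (struct sockaddr_in *)&addr;
	struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&addr;

	if (addr.ss_family == AF_INET6)
		sin6->sin6_port = htons(port);
	else
		sin->sin_port = htons(port);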
We may need to revisit this if SCTP is ever implemented, but for now
this will simplify the code.
CIFS over IPv6 also has a number of problems currently. This fixes all
of them that I found. Eventually, it would be nice to move more of the
code to be protocol independent, but this is a start.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Cc: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Also adds two lines missing from the previous patch (for the need_reconnect flag in the /proc/fs/cifs/DebugData handling).
The new global_cifs_sock_list is added and initialized in init_cifs, but not used yet. Jeff Layton will be adding code to use that and to remove the GlobalTcon and GlobalSMBSession lists.
CC: Jeff Layton <jlayton@redhat.com> CC: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Cc: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In preparation for Jeff's big umount/mount fixes to remove the possibility of various races in cifs mount and linked-list handling of sessions, sockets and tree connections, this patch cleans up some repetitive code in cifs_mount and addresses a problem with ses->status and tcon->tidStatus, in which we were overloading the "need_reconnect" state with other status in those fields. The "need_reconnect" flag has therefore been broken out from the two state fields (need reconnect was not mutually exclusive with some of the other possible tid and ses states). In addition, a few exit cases in cifs_mount were cleaned up, and a problem was fixed in which a tcon flag (for lease support) was not being set consistently for the second mount of the same share.
CC: Jeff Layton <jlayton@redhat.com> CC: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is an implementation of David Miller's suggested fix in:
https://bugzilla.redhat.com/show_bug.cgi?id=470201
It has been updated to use wait_event() instead of
wait_event_interruptible().
Paraphrasing the description from the above report, it makes sendmsg()
block while UNIX garbage collection is in progress. This avoids a
situation where child processes continue to queue new FDs over an
AF_UNIX socket to a parent which is in the exit path and running
garbage collection on these FDs. This contention can result in soft
lockups and oom-killing of unrelated processes.
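A sketch of the blocking helper, based on the description (names follow the obvious unix GC pattern; treat them as assumptions):

	static bool gc_in_progress;
	static DECLARE_WAIT_QUEUE_HEAD(unix_gc_wait);

	/* called from the sendmsg path before queueing new fds */
	void wait_for_unix_gc(void)
	{
		wait_event(unix_gc_wait, !gc_in_progress);
	}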
Signed-off-by: dann frazier <dannf@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When resizing a CQ, MTTs associated with the old CQE buffer were not
freed. As a result, if any app used resize CQ repeatedly, all MTTs
were eventually exhausted, which led to all memory registration
operations failing until the driver is reloaded.
Once the RESIZE_CQ command returns successfully from FW, FW no longer
accesses the old CQ buffer, so it is safe to deallocate the MTT
entries used by the old CQ buffer.
Finally, if the RESIZE_CQ command fails, the MTTs allocated for the
new CQEs buffer also need to be de-allocated.
This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1416>.
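A sketch of the resulting ordering (mlx4_RESIZE_CQ stands in for the FW command wrapper; names assumed):

	err = mlx4_RESIZE_CQ(dev, cq, &new_mtt);
	if (err) {
		mlx4_mtt_cleanup(dev, &new_mtt);	/* new CQE buffer MTTs */
		return err;
	}
	/* FW no longer touches the old buffer; free its MTTs now */
	mlx4_mtt_cleanup(dev, &cq->buf.mtt);
	cq->buf.mtt = new_mtt;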
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add another model ID of a broken firmware to prevent early I/O errors
by accesses at the end of the disk. Reported at linux1394-user,
http://marc.info/?t=122670842900002
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add another model ID of a broken firmware to prevent early I/O errors
by accesses at the end of the disk. Reported at linux1394-user,
http://marc.info/?t=122670842900002
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
For some unknown reason, Steven Rostedt added in disabling of the SPE instruction generation for e500-based PPC cores in commit 6ec562328fda585be2d7f472cfac99d3b44d362a.
We are removing it because:
1. It generates e500 kernels that don't work
2. It's not the correct set of flags to do this
3. We handle this in the arch/powerpc/Makefile already
4. It's unknown, in talking to Steven, why he did this
Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Tested-and-Acked-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When the "hpwdt" module is loaded (even if the /dev/watchdog device is not
opened), then kdump does not work. The panic kernel either does not start at
all or crash in various places.
The problem is that hpwdt_pretimeout is registered with register_die_notifier()
with the highest possible priority. Because it returns NOTIFY_STOP, the
crash_nmi_callback which is also registered with register_die_notifier()
is never executed. This causes the shutdown of other CPUs to fail.
Reverting the order is no option: The crash_nmi_callback executes HLT
and so never returns normally. Because of that, it must be executed as
last notifier, which currently is done.
So, this patch returns NOTIFY_OK to keep crash_nmi_callback executed.
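The essence of the change, sketched:

static int hpwdt_pretimeout(struct notifier_block *nb,
			    unsigned long ulReason, void *data)
{
	/* panic handling elided */
	return NOTIFY_OK;	/* was NOTIFY_STOP, which starved
				 * crash_nmi_callback */
}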
Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Thomas Mingarelli <thomas.mingarelli@hp.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The address provided by the SMBIOS/DMI CRU information is mapped via ioremap() into the virtual address space. However, since the address is executed (i.e., call'd), we need to set those pages as executable. Without that, I get an oops on a HP ProLiant DL385 G2 machine with BIOS from 05/29/2008 when I trigger a crashdump.
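A sketch of the executable mapping, assuming set_memory_x() and illustrative variable names:

	cru_rom_addr = ioremap(physical_bios_base, physical_bios_offset);
	if (cru_rom_addr)
		set_memory_x((unsigned long)cru_rom_addr & PAGE_MASK,
			     (physical_bios_offset >> PAGE_SHIFT) + 1);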
A mutex_unlock(&gang->aff_mutex) in spufs_create_context() is missing
in case spufs_context_open() fails. As a result, spu_create syscall
and spu_get_idle() may block.
This patch adds the mutex_unlock.
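The shape of the fix, sketched (error-path details elided):

	ret = spufs_context_open(dentry, mnt);
	if (ret < 0) {
		/* error path: the unlock below was missing */
		mutex_unlock(&gang->aff_mutex);
		return ret;
	}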
Signed-off-by: Kou Ishizaki <kou.ishizaki@toshiba.co.jp> Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Acked-by: Andre Detsch <adetsch@br.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Currently, we can end up in an infinite loop if we get a signal while the kernel has faulted in spufs_ps_fault. E.g.:
alarm(1);
write(fd, some_spu_psmap_register_address, 4);
- the write's copy_from_user will fault on the ps mapping, and signal_pending will be non-zero. Because returning from the fault handler never clears TIF_SIGPENDING, we just keep faulting, resulting in an unkillable process using 100% of CPU.
This change returns VM_FAULT_SIGBUS if there's a fatal signal pending,
letting us escape the loop.
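The added check, sketched:

	/* in spufs_ps_fault(): bail out instead of re-faulting forever */
	if (fatal_signal_pending(current))
		return VM_FAULT_SIGBUS;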
Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[CIFS] Reduce number of socket retries in large write path
Under some heavy stress conditions, cifs could get EAGAIN repeatedly in smb_send2, which led to repeated retries and eventual failure of large writes, which could lead to data corruption.
There are three changes that were suggested by various network
developers:
1) convert cifs from non-blocking to blocking tcp sendmsg
(we left in the retry on failure)
2) change cifs to not set sendbuf and rcvbuf size for the socket
(let tcp autotune the buffer sizes since that works much better
in the TCP stack now)
3) if we have a partial frame sent in smb_send2, mark the tcp
session as invalid (close the socket and reconnect) so we do
not corrupt the remaining part of the SMB with the beginning
of the next SMB.
This does not appear to hurt performance measurably and has been run in various scenarios, but it definitely removes a corruption that we were seeing in some high-stress test cases.
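Point 3 above, sketched (status names are from cifs; the partial-send bookkeeping is simplified):

	/* a partially sent SMB poisons the stream: force a reconnect so
	 * the next SMB doesn't land in the middle of this one */
	if (rc < 0 && total_sent > 0)
		server->tcpStatus = CifsNeedReconnect;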
Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The Belkin F5D7050rev5000de (id 050d:705e) has the Realtek RTL8187B chip
and works with the 2.6.27 driver.
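The corresponding USB ID table entry would look like this (sketch; macro usage per the rtl8187 driver):

	/* Belkin F5D7050rev5000de */
	{USB_DEVICE(0x050d, 0x705e), .driver_info = DEVICE_RTL8187B},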
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Some recent Seagate hard drives have a firmware bug which causes FLUSH CACHE to time out under certain circumstances if NCQ is being used. This can be worked around by disabling NCQ and fixed by updating the firmware. Implement ATA_HORKAGE_FIRMWARE_UPDATE and blacklist these devices.
The wiki page has been updated to contain information on this issue.
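A sketch of what a blacklist entry for an affected drive looks like (model and firmware strings illustrative; the horkage flag name is as given above):

	/* in ata_device_blacklist[] */
	{ "ST31500341AS", "SD15", ATA_HORKAGE_NONCQ |
				  ATA_HORKAGE_FIRMWARE_UPDATE },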
Since dev->power.should_wakeup bit is used by the PCI core to
decide whether the device should wake up the system from sleep
states, set/unset this bit whenever WOL is enabled/disabled using
e1000_set_wol(). Accordingly, use device_can_wakeup() for checking
if wake-up is supported by the device.
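The core of the change, sketched with the 2.6.27-era PM helpers:

	/* in e1000_set_wol(), after validating wol->wolopts */
	adapter->wol = wol->wolopts;
	device_set_wakeup_enable(&adapter->pdev->dev, adapter->wol);

	/* capability is reported via device_can_wakeup(&adapter->pdev->dev) */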
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since dev->power.should_wakeup bit is used by the PCI core to
decide whether the device should wake up the system from sleep
states, set/unset this bit whenever WOL is enabled/disabled using
e1000_set_wol(). Accordingly, use device_can_wakeup() for checking
if wake-up is supported by the device.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since dev->power.should_wakeup bit is used by the PCI core to
decide whether the device should wake up the system from sleep
states, set/unset this bit whenever WOL is enabled/disabled using
igb_set_wol(). Accordingly, use device_can_wakeup() for checking
if wake-up is supported by the device.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Impact: make warning message disappear - functionality unchanged
Problems with bogus IRQ0 override of those laptops should be fixed
with commits
x86: SB600: skip IRQ0 override if it is not routed to INT2 of IOAPIC
x86: SB450: skip IRQ0 override if it is not routed to INT2 of IOAPIC
that introduce early-quirks based on chipset configuration.
For further information, see
http://bugzilla.kernel.org/show_bug.cgi?id=11516
Instead of removing the related dmi-quirks completely, we'd like to keep them for (at least) one kernel version -- to double-check whether the early-quirks really took effect. But the dmi-quirks need to be called after early-quirks are executed. With this patch, the calling sequence for the dmi-quirks is changed as follows:
The NUMA code on x86_32 creates special memory mapping that allows
each node's pgdat to be located in this node's memory. For this
purpose it allocates a memory area at the end of each node's memory
and maps this area so that it is accessible with virtual addresses
belonging to low memory. As a result, if there is high memory,
these NUMA-allocated areas are physically located in high memory,
although they are mapped to low memory addresses.
Our hibernation code does not take that into account and for this
reason hibernation fails on all x86_32 systems with CONFIG_NUMA=y and
with high memory present. Fix this by adding a special mapping for
the NUMA-allocated memory areas to the temporary page tables created
during the last phase of resume.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When reserving space for the hypervisor, the Xen paravirt backend adds an extra two pages (this was carried forward from the 2.6.18-xen tree, which had them "for safety"). Depending on various CONFIG options this can cause the boot-time fixmaps to span multiple PMDs, which is not supported and triggers a WARN in early_ioremap_init().
There is no real reason to reserve these two extra pages, and the fixmap already incorporates FIX_HOLE, which serves the same purpose. None of the other callers of reserve_top_address do this.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The bad_bios_dmi_table() quirk never triggered because we do DMI setup too late. Move it a bit earlier.
A workaround for AMD CPU family 11h erratum 311 might cause that the
P-state Status Register shows a "current P-state" which is larger than
the "current P-state limit" in P-state Current Limit Register. For the
wrong P-state value there is no ACPI _PSS object defined and
powernow-k8/cpufreq can't determine the proper CPU frequency for that
state.
As a consequence this can cause a panic during boot (potentially with
all recent kernel versions -- at least I have reproduced it with
various 2.6.27 kernels and with the current .28 series), as an
example:
In short, the aftereffect of the wrong P-state is that cpufreq_stats_update() uses "-1" as an index into some arrays in
cpufreq_stats_update (unsigned int cpu)
{
...
if (stat->time_in_state)
stat->time_in_state[stat->last_index] =
cputime64_add(stat->time_in_state[stat->last_index],
cputime_sub(cur_time, stat->last_time));
...
}
Fortunately, the wrong P-state value is returned only if the core is
in P-state 0. This fix solves the problem by detecting the
out-of-range P-state, ignoring it, and using "0" instead.
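A sketch of the workaround (variable names illustrative, per the powernow-k8 style):

	/* family 11h, erratum 311: the reported P-state can exceed the
	 * current limit; that only happens while the core is in P-state 0 */
	if (curr_pstate > curr_pstate_limit)
		curr_pstate = 0;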
Cc: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Now that the PCI core manages the 'name' for each individual
hotplug driver, and all drivers (except rpaphp) have been converted
to use hotplug_slot_name(), there is no need for the PCI hotplug
core to drag around its own copy of name either.
We no longer need to manage our version of hotplug_slot->name
since the PCI and hotplug core manage it on our behalf.
Update the sn_hp_slot_private_alloc() interface to fill in
the correct name for us, as that function already has all
the parameters needed to determine the name.
rpaphp tends to use slot->name directly everywhere, and doesn't ever need slot->hotplug_slot->name.
struct hotplug_slot->name is going away, so convert rpaphp to directly manipulate its own slot->name everywhere, and don't bother touching slot->hotplug_slot->name.
In preparation for cleaning up the various hotplug drivers
such that they don't have to manage their own 'name' parameters
anymore, we provide the following convenience functions:
pci_slot_name()
hotplug_slot_name()
These helpers will be used by individual hotplug drivers.
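These helpers are simple accessors; a sketch of their likely shape:

static inline const char *pci_slot_name(const struct pci_slot *slot)
{
	return kobject_name(&slot->kobj);
}

static inline const char *hotplug_slot_name(const struct hotplug_slot *slot)
{
	return pci_slot_name(slot->pci_slot);
}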
Prevent callers of pci_create_slot() from registering slots with
duplicate names. This condition occurs most often when PCI hotplug
drivers are loaded on platforms with broken firmware that assigns
identical names to multiple slots.
We now rename these duplicate slots on behalf of the user.
If firmware assigns the name N to multiple slots, then:
The first registered slot is assigned N
The second registered slot is assigned N-1
The third registered slot is assigned N-2
etc.
This is the permanent fix mentioned in earlier commits d6a9e9b4 and 167e782e (shpchp/pciehp: Rename duplicate slot name...).
We take advantage of the new 'hotplug' parameter in pci_create_slot()
to prevent a slot create/rename race between hotplug drivers and
detection drivers.
The hotplug driver creates the slot with its desired name, and then
releases the semaphore. Now, the detection driver tries to create
the same slot, but it already exists. We don't care about renaming,
so return the existing slot.
The detection driver creates the slot with name "X". Then the hotplug
driver tries to create the same slot, but wants the name "Y" instead.
We detect that we're trying to create the same slot and that we also
want a rename, so rename the slot to "Y" and return.
Two separate hotplug drivers are attempting to claim the slot and
are passing valid hotplug_slot args to pci_create_slot(). We detect
that the slot already has a ->hotplug callback, prevent a rename,
and return -EBUSY.
Slot detection drivers can co-exist with hotplug drivers. The names
of the detected/claimed slots may be different depending on module
load order.
For legacy reasons, we need to allow hotplug drivers to override
the slot name if a detection driver is loaded first (and they find
the same slots).
Creating and overriding slot names should be an atomic operation,
otherwise you get a locking nightmare as various drivers race to
call pci_create_slot().
pci_create_slot() is already serialized by grabbing the pci_bus_sem.
We update the API and add a 'hotplug' param, which is:
set if the caller is a hotplug driver
NULL if the caller is a detection driver
pci_create_slot() does not actually use the 'hotplug' parameter in this
patch. A later patch will add the logic that uses it.
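The updated prototype, sketched:

struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr,
				 const char *name,
				 struct hotplug_slot *hotplug);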
Update pci_hp_register() to take a const char *name parameter.
The motivation for this is to clean up the individual hotplug
drivers so that each one does not have to manage its own name.
The PCI core should be the place where we manage the name.
We update the interface and all callsites first, in a
"no functional change" manner, and clean up the drivers later.
After noticing that my Netgear FA411 (PCMCIA NIC) [1] stopped working with the release of the 2.6.25 kernel (sidux version), I checked the respective driver sources and noticed that the pcnet_cs driver bailed out with "use axnet_cs instead" for the Netgear FA411, but axnet_cs doesn't claim this ID.
I compiled a kernel with the PCMCIA ID for the Netgear card moved from pcnet_cs to axnet_cs, which worked. I then contacted sidux kernel maintainer Stefan Lippers-Hollmann, who turned the info into this patch and integrated it into the kernel:
This works for me and AFAIK there were no reports of any breakage for
other devices on sidux-support.
This looks like a trivial patch, but since I have very limited
experience with kernel modifications I might be woefully wrong there.
But if there are no side effects of this patch, is it possible to get it
into the official kernel?
I can provide more detailed information on the affected hardware if
necessary.
From: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Date: Sat, 1 Nov 2008 23:53:04 +0000
Subject: PCMCIA: move PCMCIA ID for Netgear FA411 from pcnet_cs to axnet_cs:
Since kernel 2.6.25, commit 61da96be07ec860e260ca4af0199b9d48d000b80
(pcnet_cs: if AX88190-based card, printk "use axnet_cs instead" message.),
pcnet_cs bails out with "use axnet_cs instead" for the Netgear FA411, but
axnet_cs doesn't claim this ID.
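The move amounts to transferring one table entry between the two drivers (the manufacturer/card IDs below are illustrative placeholders):

	/* axnet_cs.c, in the device ID table: */
	PCMCIA_DEVICE_MANF_CARD(0x0185, 0x0443),	/* Netgear FA411 */

	/* pcnet_cs.c: the corresponding entry is removed */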
Signed-off-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> Signed-off-by: Cord Walter <qord@cwalter.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We should tell the hardware it is capable of DMA'ing to us only what we asked dev_alloc_skb() for. Prior to this, it is possible a large RX'd frame could have corrupted DMA data for us, but we were saved only because we were previously also pci_map_single()'ing the same large value. The issue prior to this, though, was that we were unmapping a smaller amount, which the prior DMA patch fixed.
Signed-off-by: Bennyam Malavazi <Bennyam.Malavazi@atheros.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This should fix the SW-IOMMU bounce-buffer starvation seen in kernel.org bugzilla 11811:
http://bugzilla.kernel.org/show_bug.cgi?id=11811
Users on MacBook Pro 3.1/MacBook v2 would see something like:
DMA: Out of SW-IOMMU space for 4224 bytes at device 0000:0b:00.0
Unfortunately it's only easy to trigger on MacBook Pro 3.1/MacBook v2 so far, so it's difficult to debug (even with swiotlb=force).
We were pci_unmap_single()'ing fewer bytes than we had pci_map_single()'d, and as such we were starving the swiotlb of its 64MB of bounce buffers. We now stay consistent and always use sc->rxbufsize for RX. While at it, we update the beacon DMA maps as well to only use the data portion of the skb; previous to this we were pci_map_single()'ing more data for beaconing than what we tell the hardware it can use, therefore pushing more iotlb abuse.
Still not sure why this is so easily triggerable on MacBook Pro 3.1; it may be that the hardware configuration tends to use more memory above the 3GB mark for DMA.
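The invariant the fix restores, sketched with the old PCI DMA API:

	/* map and unmap with the same length */
	bf->skbaddr = pci_map_single(sc->pdev, skb->data,
				     sc->rxbufsize, PCI_DMA_FROMDEVICE);
	/* ... hardware owns the buffer ... */
	pci_unmap_single(sc->pdev, bf->skbaddr,
			 sc->rxbufsize, PCI_DMA_FROMDEVICE);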
Signed-off-by: Maciej Zenczykowski <zenczykowski@gmail.com> Signed-off-by: Bennyam Malavazi <Bennyam.Malavazi@atheros.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Impact: fix boot crash on AMD IOMMU if CONFIG_GART_IOMMU is off
Currently these macros evaluate to a no-op unless the kernel is compiled with GART or Calgary support. But we also need these macros when we have SWIOTLB, VT-d or AMD IOMMU in the kernel. Since we always compile at least with SWIOTLB, we can define these macros unconditionally.
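For reference, the macros in question are the PCI unmap-state helpers; defined unconditionally they look like this (2.6.27-era definitions, sketched):

	#define DECLARE_PCI_UNMAP_ADDR(ADDR_NAME)	dma_addr_t ADDR_NAME;
	#define DECLARE_PCI_UNMAP_LEN(LEN_NAME)		__u32 LEN_NAME;
	#define pci_unmap_addr(PTR, ADDR_NAME)		((PTR)->ADDR_NAME)
	#define pci_unmap_addr_set(PTR, ADDR_NAME, VAL)	(((PTR)->ADDR_NAME) = (VAL))
	#define pci_unmap_len(PTR, LEN_NAME)		((PTR)->LEN_NAME)
	#define pci_unmap_len_set(PTR, LEN_NAME, VAL)	(((PTR)->LEN_NAME) = (VAL))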
This patch is also for stable backport for the same reason the SWIOTLB
default selection patch is.
Impact: widen the reach of the low-memory-protect DMI quirk
Phoenix BIOSes variously identify their vendor as "Phoenix Technologies, LTD" or "Phoenix Technologies LTD" (without the comma).
This patch makes the identification string in the bad_bios_dmi_table
more general (following a suggestion by Ingo Molnar), so that both
versions are handled.
Again, the patched file compiles cleanly and the patch has been tested
successfully on my machine.
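A sketch of the loosened DMI match (the callback name follows the x86 setup code; treat it as an assumption). DMI_MATCH does substring matching, so the shorter string covers both vendor spellings:

	{
		.callback = dmi_low_memory_corruption,
		.ident = "Phoenix BIOS",
		.matches = {
			DMI_MATCH(DMI_BIOS_VENDOR, "Phoenix Technologies"),
		},
	},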
The netmos_9xx5_combo type assumes that the PCI SSID always provides the correct value for the number of parallel and serial ports, but there are indeed broken devices with wrong numbers, which may result in an Oops. This patch simply adds a check of the array range.
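The shape of the added check, sketched (structure and field names illustrative):

	if (num_serial > ARRAY_SIZE(card->addr))
		num_serial = ARRAY_SIZE(card->addr);	/* clamp bogus SSID data */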
The Zepto 6615WD laptop (rebranded Inventec Symphony system) needs a
key release quirk for its volume keys to work. The attached patch adds
the quirk to the atkbd driver.
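A sketch of the DMI quirk entry, using the atkbd forced-release infrastructure (the match strings and table names are assumptions):

	{
		.ident = "Zepto 6615WD",
		.matches = {
			DMI_MATCH(DMI_SYS_VENDOR, "Zepto"),
			DMI_MATCH(DMI_PRODUCT_NAME, "6615WD"),
		},
		.callback = atkbd_setup_forced_release,
		.driver_data = atkbd_volume_forced_release_keys,
	},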
This fixes a regression introduced by 2c6e6db41f01b6b4eb98809350827c9678996698, "Minimize per_cpu reservations." That patch incorrectly used information about which CPUs are possible before it had been initialized by ACPI. The end result was that per_cpu structures for offline CPUs were not initialized, causing a NULL pointer reference.
Since we cannot do the full acpi_boot_init() call any earlier, the simplest
fix is to just parse the MADT for SAPIC entries early to find the CPU
info. This should also allow for some cleanup of the code added by the
"Minimize per_cpu reservations". This patch just fixes the regressions, the
cleanup will come in a later patch.
Signed-off-by: Doug Chapman <doug.chapman@hp.com> Signed-off-by: Alex Chiang <achiang@hp.com> CC: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
To kick the watch out we need (in this order) inode->inotify_mutex and
ih->mutex. That's fine if we have a hold on inode; however, for all
other cases we need to make damn sure we don't race with umount. We can
*NOT* just grab a reference to a watch - inotify_unmount_inodes() will
happily sail past it and we'll end with reference to inode potentially
outliving its superblock.
Ideally we just want to grab an active reference to superblock if we
can; that will make sure we won't go into inotify_umount_inodes() until
we are done. Cleanup is just deactivate_super().
However, that leaves a messy case - what if we *are* racing with
umount() and active references to superblock can't be acquired anymore?
We can bump ->s_count, grab ->s_umount, which will almost certainly wait until the superblock is shut down and the watch in question is pining for fjords. That's fine, but there is a problem - we might have hit the window between ->s_active getting to 0 (->s_count going below S_BIAS, i.e. the moment when the superblock is past the point of no return and is heading for shutdown) and the moment when deactivate_super() acquires ->s_umount.
We could just do drop_super(), yield(), and retry, but that's rather antisocial and this stuff is luser-triggerable. OTOH, having grabbed
->s_umount and having found that we'd got there first (i.e. that
->s_root is non-NULL) we know that we won't race with
inotify_umount_inodes().
So we could grab a reference to watch and do the rest as above, just
with drop_super() instead of deactivate_super(), right? Wrong. We had
to drop ih->mutex before we could grab ->s_umount. So the watch
could've been gone already.
That still can be dealt with - we need to save watch->wd, do idr_find()
and compare its result with our pointer. If they match, we either have
the damn thing still alive or we'd lost not one but two races at once,
the watch had been killed and a new one got created with the same ->wd
at the same address. That couldn't have happened in inotify_destroy(),
but inotify_rm_wd() could run into that. Still, "new one got created"
is not a problem - we have every right to kill it or leave it alone,
whatever's more convenient.
So we can use idr_find(...) == watch && watch->inode->i_sb == sb as
"grab it and kill it" check. If it's been our original watch, we are
fine, if it's a newcomer - nevermind, just pretend that we'd won the
race and kill the fscker anyway; we are safe since we know that its
superblock won't be going away.
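The resulting "grab it and kill it" check, sketched (saved wd and helper names assumed):

	/* ih->mutex held again; sb known to stay alive */
	watch = idr_find(&ih->idr, wd);
	if (watch == saved_watch && watch->inode->i_sb == sb) {
		/* our watch, or an identical newcomer - either way,
		 * safe to remove */
		inotify_remove_watch_locked(ih, watch);
	}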
And yes, this is far beyond mere "not very pretty"; so's the entire
concept of inotify to start with.
It has been thought that the per-user file descriptor limit would also limit the resources that a normal user can request via the epoll interface. Vegard Nossum reported a very simple program (a modified version attached) that can make a normal user request a pretty large amount of kernel memory, well within its maximum number of fds. To solve this problem, default limits are now imposed, and /proc based configuration has been introduced. A new directory has been created, named /proc/sys/fs/epoll/, and inside it there are two configuration points:
max_user_instances = Maximum number of devices - per user
max_user_watches = Maximum number of "watched" fds - per user
The current default for "max_user_watches" limits the memory used by epoll to store "watches" to 1/32 of the amount of low RAM. As an example, a 256MB 32-bit machine will have "max_user_watches" set to roughly 90000. That should be enough to not break existing heavy epoll users. The default value for "max_user_instances" is 128, which should be enough too.
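A sketch of how the two knobs above are typically wired up (2.6.27-era sysctl table style; entry details assumed):

	static int zero;

	ctl_table epoll_table[] = {
		{
			.procname	= "max_user_instances",
			.data		= &max_user_instances,
			.maxlen		= sizeof(max_user_instances),
			.mode		= 0644,
			.proc_handler	= &proc_dointvec_minmax,
			.extra1		= &zero,
		},
		{
			.procname	= "max_user_watches",
			.data		= &max_user_watches,
			.maxlen		= sizeof(max_user_watches),
			.mode		= 0644,
			.proc_handler	= &proc_dointvec_minmax,
			.extra1		= &zero,
		},
		{ }
	};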
This also changes the userspace, because a new error code can now come out
from EPOLL_CTL_ADD (-ENOSPC). The EMFILE from epoll_create() was already
listed, so that should be ok.
Any user on existing parisc 32- and 64-bit kernels can easily crash the kernel and as such enforce a DoS.
A simple testcase is available here:
http://gsyprf10.external.hp.com/~deller/crash.tgz
The problem is introduced by the fact that the handle_interruption() crash handler calls show_regs(), which in turn tries to unwind the stack by calling parisc_show_stack(). Since the stack contains userspace addresses, trying to unwind the stack is dangerous and useless and leads to the crash.
The fix is trivial: for userspace processes,
a) avoid unwinding the stack, and
b) avoid resolving userspace addresses to kernel symbol names.
While touching this code, I converted print_symbol() to %pS
printk formats and made parisc_show_stack() static.
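The guard amounts to something like this (sketch; exact placement in show_regs()/the unwinder assumed):

	/* never try to unwind or symbolize a userspace context */
	if (user_mode(regs))
		return;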
An initial patch for this was written by Kyle McMartin back in August:
http://marc.info/?l=linux-parisc&m=121805168830283&w=2
Compile and run-tested with a 64bit parisc kernel.
Signed-off-by: Helge Deller <deller@gmx.de> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Matthew Wilcox <matthew@wil.cx> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Kyle McMartin <kyle@mcmartin.ca> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A problem was found while reviewing the code after Bugzilla bug
http://bugzilla.kernel.org/show_bug.cgi?id=11796.
In ipc_addid(), the newly allocated ipc structure is inserted into the
ipcs tree (i.e made visible to readers) without locking it. This is not
correct since its initialization continues after it has been inserted in
the tree.
This patch moves the ipc structure lock initialization + locking before
the actual insertion.
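The reordering, sketched (2.6.27-era idr API):

	spin_lock_init(&new->lock);
	new->deleted = 0;
	rcu_read_lock();
	spin_lock(&new->lock);	/* lock before insertion... */
	err = idr_get_new(&ids->ipcs_idr, new, &id);	/* ...then publish */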
kunmap() takes as its argument the struct page that originally got kmap()'d; however, the sg_miter_stop() function passed it the kernel virtual address instead, resulting in weird stuff.
Somehow I ended up fixing this bug by accident while looking for a bug in the same area.
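The one-line nature of the fix, sketched:

	/* in sg_miter_stop() */
	kunmap(miter->page);	/* was: kunmap(miter->addr) */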
Reported-by: kerneloops.org Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There are already various drivers having labels bigger than 12 bytes. Most of them fit well under 20 bytes, but make the column width exact so that oversized labels don't mess up output alignment.
Signed-off-by: Jarkko Nikula <jarkko.nikula@nokia.com> Acked-by: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When booting in a direct color mode, the penguin has dirty feet, i.e.,
some pixels have the wrong color. This is caused by
fb_set_logo_directpalette() which does not initialize the last 32 palette
entries.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fixes a data corruption bug in pxa2xx_spi.c when operating in full duplex
mode with DMA and using buffers that overlap.
SPI transmit and receive buffers are allowed to be the same or to overlap.
However, this driver fails if such overlap is attempted in DMA mode
because it maps the rx and tx buffers in the wrong order. By mapping
DMA_FROM_DEVICE (read) before DMA_TO_DEVICE (write), it invalidates the
cache before flushing it, thus discarding data which should have been
transmitted.
The patch corrects the order of mapping. This bug exists in all versions
of pxa2xx_spi.c; similar bugs are in the drivers for two other SPI
controllers (au1500, imx).
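The corrected order, sketched:

	/* map the write direction first (cache flush), then the read
	 * direction (cache invalidate), so overlapping buffers survive */
	tx_dma = dma_map_single(dev, drv_data->tx, len, DMA_TO_DEVICE);
	rx_dma = dma_map_single(dev, drv_data->rx, len, DMA_FROM_DEVICE);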
A version of this patch has been tested on kernel 2.6.20 using
verification of loopback data with: random transfer length, random
bits-per-word, random positive offsets (both larger and smaller than
transfer length) between the start of the rx and tx buffers, and varying
clock rates.
Signed-off-by: Ned Forrester <nforrester@whoi.edu> Cc: Vernon Sauder <vernoninhand@gmail.com> Cc: J. Scott Merritt <merrij3@rpi.edu> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I have received some reports of out-of-memory errors on some older AMD
architectures. These errors are what I would expect to see if
crypt_stat->key were split between two separate pages. eCryptfs should
not assume that any of the memory sent through virt_to_scatterlist() is
all contained in a single page, and so this patch allocates two
scatterlist structs instead of one when processing keys. I have received
confirmation from one person affected by this bug that this patch resolves
the issue for him, and so I am submitting it for inclusion in a future
stable release.
Note that virt_to_scatterlist() runs sg_init_table() on the scatterlist
structs passed to it, so the calls to sg_init_table() in
decrypt_passphrase_encrypted_session_key() are redundant.
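The shape of the change, sketched (virt_to_scatterlist() is eCryptfs's helper; two entries allow the key to straddle a page boundary):

	struct scatterlist dst_sg[2];
	struct scatterlist src_sg[2];

	rc = virt_to_scatterlist(crypt_stat->key, key_size, src_sg, 2);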
Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Reported-by: Paulo J. S. Silva <pjssilva@ime.usp.br> Cc: "Leon Woestenberg" <leon.woestenberg@gmail.com> Cc: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Impact: properly rebuild sched-domains on kmalloc() failure
When cpuset fails to generate sched domains due to a kmalloc() failure, the scheduler should fall back to the single partition 'fallback_doms' and rebuild sched domains, but currently it only destroys, and does not rebuild, the sched domains.
The regression was introduced by:
| commit dfb512ec4834116124da61d6c1ee10fd0aa32bd6
| Author: Max Krasnyansky <maxk@qualcomm.com>
| Date: Fri Aug 29 13:11:41 2008 -0700
|
| sched: arch_reinit_sched_domains() must destroy domains to force rebuild
After the above commit, partition_sched_domains(0, NULL, NULL) will
only destroy sched domains and partition_sched_domains(1, NULL, NULL)
will create the default sched domain.
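The fallback, sketched against the 2.6.27-era partition_sched_domains() (cpumask API of that time; details assumed):

	if (doms_new == NULL) {
		ndoms_new = 1;		/* rebuild a single fallback domain */
		doms_new = &fallback_doms;
		cpus_andnot(doms_new[0], cpu_online_map, cpu_isolated_map);
		dattr_new = NULL;
	}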
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>