Camcorders have a tendency to fail read requests to their config ROM and
write requests to their FCP command register with ack_busy_X. This has
become a problem with newer kernels and especially Panasonic camcorders,
causing AV/C in dvgrab and kino to fail. Dvgrab for example frequently
logs "send oops"; kino reports loss of AV/C control. I suspect that
lower CPU scheduling latencies in newer kernels made this issue more
prominent now.
According to
https://sourceforge.net/tracker/?func=detail&atid=114103&aid=2492640&group_id=14103
this can be fixed by configuring the FireWire controller for more
hardware retries for request transmission; these retries are evidently
more successful than libavc1394's own retry loop (typically 3 tries on
top of hardware retries).
Presumably the same issue has been reported at
https://bugzilla.redhat.com/show_bug.cgi?id=449252 and
https://bugzilla.redhat.com/show_bug.cgi?id=477279 .
Some devices trigger a DEVICE_CHECK on every evaluation of _STA. This
can also be seen in commit 8b59560a3baf2e7c24e0fb92ea5d09eca92805db
(ACPI: dock: avoid check _STA method). If an undock is processed, the
dock driver sends a uevent and userspace might read the show_docked
property in sysfs. This causes an evaluation of _STA of the particular
device which causes the dock driver to immediately dock again.
In any case, evaluation of _STA (show_docked) does not necessarily mean
that we are docked, so check with the internal device structure.
http://bugzilla.kernel.org/show_bug.cgi?id=12360
Signed-off-by: Holger Macht <hmacht@suse.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Bit 11 in the Intel PDC definitions is meant for OS capability to handle
hardware coordination of P-states. In Linux we have always supported
hardware coordination of P-states. Just let the BIOSes know that we
support it, by setting this bit.
Some BIOSes use this bit to choose between hardware and software coordination;
without this change, such BIOSes switch to software coordination, which
is not very optimal in terms of power consumption and extra wakeups from idle.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
which_dev() computes the device holding a given sector by shifting
down the sector number to a 32 bit range, dividing by the array
spacing and looking up the resulting index in the hash table of
the array.
Because the computed index might be slightly too small, a loop at
the end of which_dev() increases the index until the given sector
actually falls into the range of the device associated with that index.
The changes of the above mentioned commit caused this loop to check
whether the _index_ rather than the sector number is small enough,
effectively bypassing the loop and thus possibly returning the wrong
device.
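As a standalone sketch of the intended behaviour (illustrative only, with
made-up types; not md's actual hash table code), the loop has to advance on
the sector, not the index:

    /* Illustrative stand-ins, not md's real structures. */
    struct dev_range {
            unsigned long long start;       /* first sector of this device */
            unsigned long long size;        /* number of sectors */
    };

    static int which_dev_sketch(const struct dev_range *devs, int ndevs,
                                int hashed_idx, unsigned long long sector)
    {
            int i = hashed_idx;             /* may be slightly too small */

            /* Correct: advance while the *sector* is past this device;
             * the broken loop compared the index instead. */
            while (i < ndevs - 1 && sector >= devs[i].start + devs[i].size)
                    i++;
            return i;
    }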
As reported by Simon Kirby, this leads to errors such as
Each different metadata format supported by md supports a
different maximum number of devices.
We really should be enforcing this maximum in the kernel, but
we aren't quite doing that properly.
We currently only enforce it at the 'hot_add' point, which is an
older interface which is not used by current userspace.
We need to also enforce it at 'add_new_disk' time for active arrays
and at 'do_md_run' time when starting a new array.
So move the test from 'hot_add' into 'bind_rdev_to_array' which is
called from both 'hot_add' and 'add_new_disk', and add a new
test in 'analyse_sbs' which is called from 'do_md_run'.
This bug (or missing feature) has been around "forever" and so
the patch is suitable for any -stable that is currently maintained.
For audio devices that do not have proper audio descriptors (e.g.,
Edirol UA-20), we use hardcoded parameters from our quirks list.
However, we must still read the maximum packet size from the standard
endpoint descriptor; otherwise, we might use packets that are too big
and therefore rejected by the USB core.
Revert commit 0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f because it causes
(arguably poorly designed) existing userspace to spend interminable
periods closing billions of not-open file descriptors.
We could bring this back, with some sort of opt-in tunable in /proc, which
defaults to "off".
Peter's analysis follows:
: I spent several hours trying to get to the bottom of a serious
: performance issue that appeared on one of our servers after upgrading to
: 2.6.28. In the end it's what could be considered a userspace bug that
: was triggered by a change in 2.6.28. Since this might also affect other
: people I figured I'd at least document what I found here, and maybe we
: can even do something about it:
:
:
: So, I upgraded some of debian.org's machines to 2.6.28.1 and immediately
: the team maintaining our ftp archive complained that one of their
: scripts that previously ran in a few minutes still hadn't even come
: close to being done after an hour or so. Downgrading to 2.6.27 fixed
: that.
:
: Turns out that script is forking a lot and something in it or python or
: whereever closes all the file descriptors it doesn't want to pass on.
: That is, it starts at zero and goes up to ulimit -n/RLIMIT_NOFILE and
: closes them all with a few exceptions.
:
: Turns out that takes a long time when your limit -n is now 2^20 (1048576).
:
: With 2.6.27.* the ulimit -n was the standard 1024, but with 2.6.28 it is
: now a thousand times that.
:
: 2.6.28 included a patch titled "rlimit: permit setting RLIMIT_NOFILE to
: RLIM_INFINITY" (0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f)[1] that
: allows, as the title implies, to set the limit for number of files to
: infinity.
:
: Closer investigation showed that the broken default ulimit did not apply
: to "system" processes (like stuff started from init). In the end I
: could establish that all processes that passed through pam_limit at one
: point had the bad resource limit.
:
: Apparently the pam library in Debian etch (4.0) initializes the limits
: to some default values when it doesn't have any settings in limit.conf
: to override them. Turns out that for nofiles this is RLIM_INFINITY.
: Commenting out "case RLIMIT_NOFILE" in pam_limit.c:267 of our pam
: package version 0.79-5 fixes that - tho I'm not sure what side effects
: that has.
:
: Debian lenny (the upcoming 5.0 version) doesn't have this issue as it
: uses a different pam (version).
Reported-by: Peter Palfrader <weasel@debian.org> Cc: Adam Tkac <vonsch@gmail.com> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
shm_get_stat() assumes that the inode is a "struct shmem_inode_info",
which is incorrect for !CONFIG_SHMEM (see fs/ramfs/inode.c:
ramfs_get_inode() vs. mm/shmem.c: shmem_get_inode()).
This bad assumption can cause shmctl(SHM_INFO) to lockup when
shm_get_stat() tries to spin_lock(&info->lock). Users of !CONFIG_SHMEM
may encounter this lockup simply by invoking the 'ipcs' command.
Reported by Jiri Olsa back in February 2008:
http://lkml.org/lkml/2008/2/29/74
Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Cc: Jiri Kosina <jkosina@suse.cz> Reported-by: Jiri Olsa <olsajiri@gmail.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
With exclusive waiters, every process woken up through the wait queue must
ensure that the next waiter down the line is woken when it has finished.
Interruptible waiters don't do that when aborting due to a signal. And if
an aborting waiter is concurrently woken up through the waitqueue, no one
will ever wake up the next waiter.
This has been observed with __wait_on_bit_lock() used by
lock_page_killable(): the first contender on the queue was aborting when
the actual lock holder woke it up concurrently. The aborted contender
didn't acquire the lock and therefore never did an unlock followed by
waking up the next waiter.
Add abort_exclusive_wait() which removes the process' wait descriptor from
the waitqueue, iff still queued, or wakes up the next waiter otherwise.
It does so under the waitqueue lock. Racing with a wake up means the
aborting process is either already woken (removed from the queue) and will
wake up the next waiter, or it will remove itself from the queue and the
concurrent wake up will apply to the next waiter after it.
Use abort_exclusive_wait() in __wait_event_interruptible_exclusive() and
__wait_on_bit_lock() when they were interrupted by other means than a wake
up through the queue.
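A small standalone model of that ordering requirement (plain C, not the
kernel's waitqueue code; in the kernel both paths below run under the
waitqueue lock):

    #include <stdbool.h>
    #include <stddef.h>

    struct waiter {
            struct waiter *next;
            bool queued;            /* still on the exclusive wait list */
            bool woken;             /* a wake-up was delivered to us */
    };

    /* Deliver one exclusive wake-up in this toy model. */
    static void wake_one(struct waiter **head)
    {
            if (*head) {
                    (*head)->woken = true;
                    (*head)->queued = false;
                    *head = (*head)->next;
            }
    }

    /* Abort: remove ourselves if still queued; otherwise a wake-up raced
     * with the abort, so pass it on instead of swallowing it. */
    static void abort_exclusive_wait_sketch(struct waiter **head,
                                            struct waiter *w)
    {
            if (w->queued) {
                    while (*head && *head != w)
                            head = &(*head)->next;
                    if (*head)
                            *head = w->next;
                    w->queued = false;
            } else if (w->woken) {
                    wake_one(head);
            }
    }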
[akpm@linux-foundation.org: coding-style fixes] Reported-by: Chris Mason <chris.mason@oracle.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Mentored-by: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Chuck Lever <cel@citi.umich.edu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In the case where pfn_valid returns 0 for a pfn at the beginning of
do_wp_page and the mapping is not shared writable, the code branches to
label `gotten:' with old_page == NULL.
In case the vma is locked (vma->vm_flags & VM_LOCKED), lock_page,
clear_page_mlock, and unlock_page try to access the old_page.
This patch checks whether old_page is valid before it is dereferenced.
A missing type cast results in writing way beyond the end of a kzalloc()'d
memory segment, corrupting the slab. Rather than simply adding the missing
cast, the better solution seems to be to define ->recv_msg_slots as a
'void *' instead of a 'struct xpc_notify_mq_msg_uv *' and add the type cast.
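For illustration, a standalone program showing why the cast matters (the
struct and stride below are made up): pointer arithmetic on a typed pointer
scales by the element size, while the intended offset here is a byte offset
from a 'void *' base.

    #include <stdio.h>

    struct msg { char hdr[8]; char payload[56]; };  /* 64 bytes, made-up layout */

    int main(void)
    {
            size_t entry_size = 96;   /* per-slot stride, larger than the struct */
            size_t slot = 3;

            /* msg_slots + slot * entry_size on a 'struct msg *' scales by
             * sizeof(struct msg) on top of the intended byte stride: */
            size_t scaled_off = slot * entry_size * sizeof(struct msg);

            /* Byte arithmetic on a 'void *' (via a 'char *' cast) gives the
             * intended offset: */
            size_t byte_off = slot * entry_size;

            printf("typed-pointer offset: %zu bytes (far past the allocation)\n",
                   scaled_off);
            printf("byte offset:          %zu bytes (intended)\n", byte_off);
            return 0;
    }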
Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In a fully qualified namepath, allow multiple backslash prefixes.
This can happen because of the use of a double backslash in strings
(since backslash is the escape character), which causes confusion.
ACPICA BZ 739, Lin Ming.
Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: Thomas Renninger <trenn@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
dlm_posix_get fills out the relevant fields in the file_lock before
returning when there is a lock conflict, but doesn't clean out any of
the other fields in the file_lock.
When nfsd does an NFSv4 LOCKT call, it sets fl_lmops to
nfsd_posix_mng_ops before calling the lower fs. When the lock comes back
after testing a lock on GFS2, it still has that field set. This confuses
nfsd into thinking that the file_lock is a nfsd4 lock.
Fix this by making DLM reinitialize the file_lock before copying the
fields from the conflicting lock.
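The shape of the fix, as a standalone sketch (a stand-in struct, not the
kernel's struct file_lock or the actual DLM code): wipe the caller's lock
before filling in only the conflict fields.

    #include <string.h>

    struct flock_like {
            int   fl_type;
            int   fl_pid;
            long  fl_start;
            long  fl_end;
            void *fl_lmops;         /* the field that leaked through to nfsd */
    };

    static void report_conflict(struct flock_like *fl,
                                const struct flock_like *confl)
    {
            memset(fl, 0, sizeof(*fl)); /* reinitialize; stale fl_lmops is gone */
            fl->fl_type  = confl->fl_type;
            fl->fl_pid   = confl->fl_pid;
            fl->fl_start = confl->fl_start;
            fl->fl_end   = confl->fl_end;
    }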
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
According to the ACPI specification the SCI_EN flag is controlled by
the hardware, which sets this flag to inform the kernel that ACPI is
enabled. For this reason, we shouldn't try to modify SCI_EN
directly. Also, we don't need to do it in irqrouter_resume(), since
lower-level resume code takes care of enabling ACPI in case it hasn't
been enabled by the BIOS before passing control to the kernel (which
by the way is against the ACPI specification).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com> Cc: Thomas Renninger <trenn@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Always update props.brightness regardless of whether the backlight is
changed via procfs, hotkeys or sysfs.
Signed-off-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Matthew Garrett <mjg59@srcf.ucam.org> Signed-off-by: Len Brown <len.brown@intel.com> Cc: Thomas Renninger <trenn@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The Cx register address obtained from the _CST object is used as the MWAIT
hint if the register type is FFixedHW, and it is used to check whether the
Cx type is supported or not.
On some boxes the following Cx state package is obtained from _CST object:
{
    ResourceTemplate ()
    {
        Register (FFixedHW,
            0x01,                // Bit Width
            0x02,                // Bit Offset
            0x0000000000889759,  // Address
            0x03,                // Access Size
            )
    },
    0x03,
    0xF5,
    0x015E
}
In such a case we should use bits [7:4] of the Cx address to check whether
the Cx type is supported or not, and mask the MWAIT hint to avoid array
address overflow.
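As a standalone arithmetic illustration (macro names and masks are
illustrative, not the driver's), using the 0x...889759 address from the
package above:

    #include <stdio.h>

    #define CSTATE_TYPE_MASK  0xf0u   /* bits [7:4] of the Cx address */
    #define MWAIT_HINT_MASK   0xffu   /* keep the hint within table bounds */

    int main(void)
    {
            unsigned long long cx_address = 0x0000000000889759ull;
            unsigned int hint = (unsigned int)cx_address & MWAIT_HINT_MASK;
            unsigned int cstate_type = (hint & CSTATE_TYPE_MASK) >> 4;

            printf("MWAIT hint 0x%02x, C-state type %u\n", hint, cstate_type);
            return 0;
    }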
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Acked-by: Venki Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: Thomas Renninger <trenn@suse.de>
Add a decaying history of predicted idle time, instead of using only the last
early wakeup. This logic helps the menu governor do a better job of
predicting idle time.
With this change, we also measured noticeable (~8%) power savings on a DP
server system with CPUs supporting deep C-states when the system was lightly
loaded. There was no change to power or performance under other load
conditions.
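One common way to keep such a decaying history is an exponentially weighted
average; the sketch below is purely illustrative and does not use the
governor's actual weights or units.

    #include <stdio.h>

    #define DECAY_NUM 7     /* illustrative weights only */
    #define DECAY_DEN 8

    int main(void)
    {
            unsigned int predicted_us = 0;
            unsigned int measured_us[] = { 500, 480, 120, 510, 495 };
            unsigned int i;

            for (i = 0; i < sizeof(measured_us) / sizeof(measured_us[0]); i++) {
                    /* new prediction = 7/8 old prediction + 1/8 new sample */
                    predicted_us = (predicted_us * DECAY_NUM +
                                    measured_us[i] * (DECAY_DEN - DECAY_NUM)) /
                                   DECAY_DEN;
                    printf("sample %u us -> prediction %u us\n",
                           measured_us[i], predicted_us);
            }
            return 0;
    }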
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: Thomas Renninger <trenn@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch fixes the following WARNING (caused by rix_to_ndx):
WARNING: at net/mac80211/rc80211_minstrel.c:69 minstrel_rate_init+0xd2/0x33a [mac80211]()
[...]
Call Trace:
> warn_on_slowpath+0x51/0x75
> _format_mac_addr+0x4c/0x88
> minstrel_rate_init+0xd2/0x33a [mac80211]
> print_mac+0x16/0x1b
> schedule_hrtimeout_range+0xdc/0x107
> ieee80211_add_station+0x158/0x1bd [mac80211]
> nl80211_new_station+0x1b3/0x20b [cfg80211]
The reason is that I'm experimenting with "g"-only mode on an 802.11b/g card.
Therefore rate_lowest_index returns 4 (= 6 Mbit, instead of the usual 0 = 1 Mbit).
Since the mi->r array is initialized with zeros in minstrel_alloc_sta,
rix_to_ndx has a hard time finding the 6 Mbit entry and triggers the WARNING.
Signed-off-by: Christian Lamparter <chunkeey@web.de> Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern found several flaws in p54usb's implementation and annotated:
"usb_kill_urb() and similar routines do not expect an URB's completion
routine to deallocate it. This is almost obvious -- if the URB is deallocated
before the completion routine returns then there's no way for usb_kill_urb
to detect when the URB actually is complete."
This patch addresses all known limitations in the old implementation and fixes
khub's "use-after-free" hang when SLUB debug's poisoning option is enabled.
Signed-off-by: Christian Lamparter <chunkeey@web.de> Tested-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This fixes the checksum calculation for lm87 firmwares on big-endian
platforms: the device treats the data as an array of 32-bit little-endian
values, so the driver needs to do the same.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Christian Lamparter <chunkeey@web.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
The ieee80211_sta structure contains u64 supp_rates[IEEE80211_NUM_BANDS],
which is filled with all supported rates from assoc_resp. If we associate
with a G-band AP, only the G-band supp_rates will be set; the other band's
supp_rates will be 0. If the user types this command, mac80211 will switch
to the new channel. mac80211 does not disassociate when setting a new
channel, so the active band is now the A band. Then, while handling the new
ESSID, mac80211 kicks in the association steps, which involve sending a
disassociation frame. In that path mac80211 hits the WARN_ON on
sta->supp_rates[A_BAND] == 0.
This fixes:
http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=1822
http://www.kerneloops.org/searchweek.php?search=rs_get_rate
Signed-off-by: mohamed abbas <mohamed.abbas@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since nfsv4 allows LOCKT without an open, but the ->lock() method is a
file method, we fake up a struct file in the nfsv4 code with just the
fields we need initialized. But we forgot to initialize the file
operations, with the result that LOCKT never results in a call to the
filesystem's ->lock() method (if it exists).
We could just add that one more initialization. But this hack of faking
up a struct file with only some fields initialized seems like the kind of
thing that might cause more problems in the future. We should either do
an open and get a real struct file, or make lock-testing an inode (not a
file) method.
This patch does the former.
Reported-by: Marc Eshel <eshel@almaden.ibm.com> Tested-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
nfsd4_lockt does a search for a lockstateowner when building the lock
struct to test. If one is found, it'll set fl_owner to it. Regardless of
whether that happens, it'll also set fl_lmops. Given that this lock is
basically a "lightweight" lock that's just used for checking conflicts,
setting fl_lmops is probably not appropriate for it.
This behavior exposed a bug in DLM's GETLK implementation where it
wasn't clearing out the fields in the file_lock before filling in
conflicting lock info. While we were able to fix this in DLM, it
still seems pointless and dangerous to set the fl_lmops this way
when we may have a NULL lockstateowner.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@pig.fieldses.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch fixes the key repeat issue with the Fn+F? keys on the new
Samsung NC10 netbook, so that the keys can be defined and used within
ACPID correctly; otherwise the keys repeat indefinitely.
This solves part of http://bugzilla.kernel.org/show_bug.cgi?id=12021
Fix an off-by-two memory error in console selection.
The loop below goes from sel_start to sel_end (inclusive), so it writes
one more character. This one more character was added to the allocated
size (+1), but it was not multiplied by the UTF-8 multiplier.
This patch fixes a memory corruption when UTF-8 console is used and the
user selects a few characters, all of them 3-byte in UTF-8 (for example
a frame line).
When memory redzones are enabled, a redzone corruption is reported.
When they are not enabled, trashing of random memory occurs.
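The arithmetic, in standalone form (variable names are illustrative, not the
console driver's): with an inclusive range of N+1 characters and up to
3 bytes per character, the '+1' must be inside the multiplication.

    #include <stdio.h>

    int main(void)
    {
            unsigned int nr_chars = 20;  /* inclusive selection: N + 1 characters */
            unsigned int utf8_max = 3;   /* worst case here: 3-byte characters */

            unsigned int buggy = (nr_chars - 1) * utf8_max + 1; /* +1 outside */
            unsigned int fixed = nr_chars * utf8_max;           /* +1 inside  */

            printf("need %u bytes, buggy size gives %u (short by %u)\n",
                   fixed, buggy, fixed - buggy);
            return 0;
    }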
While playing with nvraid, I found out that rmmoding and insmoding
often trigger hardreset failure on the first port (the second one was
always okay). Seriously, how diverse can you get with hardreset
behaviors? Anyways, make ck804 use noclassify variant too.
The MCP5x family of controllers seems to share much more with nf2 as far
as the reset protocol is concerned. It requires hardreset to get the PHY
going, and the classification code's report after hardreset is unreliable.
Create a new board type MCP5x and use noclassify hardreset. SWNCQ is
modified to inherit from this new type.
This fixes hotplug regression reported in kernel bz#12351.
The SLAB kmalloc with a constant value isn't consistent with the other
implementations because it bails out with __you_cannot_kmalloc_that_much
rather than returning NULL and properly allowing the caller to fall back
to vmalloc or take other action. This doesn't happen with a non-constant
value or with SLOB or SLUB.
Starting with 2.6.28, I've been seeing build failures on s390x. This is
due to init_section_page_cgroup trying to allocate 2.5MB when the max size
for a kmalloc on s390x is 2MB.
It's failing because the value is constant. The workarounds at the call
site are ugly, and the caller shouldn't have to change behavior depending
on what the backend of the API is.
So, this patch eliminates the link failure and returns NULL like the other
implementations.
Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Matt Mackall <mpm@selenic.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since the complete re-write in 2.6.10, some PowerMacs (At least PowerMac 5500
and PowerMac G3 Beige rev A) with ATI Mach64 chip have suffered from unstable
columns in their framebuffer image. This seems to depend on a value (4) read
from PLL_EXT_CNTL register, which leads to incorrect DSP config parameters to
be written to the chip. This patch uses a value calculated by aty_init_pll_ct
instead, as a starting point.
There are questions as to whether this should be extended to other platforms
or maybe made dependent on specific chip types, but in the meantime, this has
been tested on various powermacs and works for them so let's commit it.
Signed-off-by: Risto Suominen <Risto.Suominen@gmail.com> Tested-by: Michael Pettersson <mike@it.uu.se> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Caused by a call to request_module() while holding nf_conntrack_lock.
Reported-and-tested-by: Kövesdi György <kgy@teledigit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
cifs_mount declares a struct sockaddr on the stack and then casts it
to the proper address type. The storage allocated is fine for IPv4,
but is too small for IPv6 addresses. Declare it as
"struct sockaddr_storage" instead of "struct sockaddr".
This bug was manifesting itself as oopses and address corruption when
mounting IPv6 addresses.
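A quick standalone check of the size mismatch, and the safe pattern of
declaring the generic storage type and casting:

    #include <stdio.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    int main(void)
    {
            printf("sockaddr:         %zu bytes\n", sizeof(struct sockaddr));
            printf("sockaddr_in6:     %zu bytes\n", sizeof(struct sockaddr_in6));
            printf("sockaddr_storage: %zu bytes\n", sizeof(struct sockaddr_storage));

            /* Big enough for any address family; cast as needed. */
            struct sockaddr_storage addr = { 0 };
            struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&addr;
            sin6->sin6_family = AF_INET6;
            return 0;
    }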
Signed-off-by: Jeff Layton <jlayton@redhat.com> Tested-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Impact: fix rare (but currently harmless) miscompile with certain configs and gcc versions
Hugh Dickins noticed that strncpy_from_user() was miscompiled
in some circumstances with gcc 4.3.
Thanks to Hugh's excellent analysis it was easy to track down.
Hugh writes:
> Try building an x86_64 defconfig 2.6.29-rc1 kernel tree,
> except not quite defconfig, switch CONFIG_PREEMPT_NONE=y
> and CONFIG_PREEMPT_VOLUNTARY off (because it expands a
> might_fault() there, which hides the issue): using a
> gcc 4.3.2 (I've checked both openSUSE 11.1 and Fedora 10).
>
> It generates the following:
>
> 0000000000000000 <__strncpy_from_user>:
> 0: 48 89 d1 mov %rdx,%rcx
> 3: 48 85 c9 test %rcx,%rcx
> 6: 74 0e je 16 <__strncpy_from_user+0x16>
> 8: ac lods %ds:(%rsi),%al
> 9: aa stos %al,%es:(%rdi)
> a: 84 c0 test %al,%al
> c: 74 05 je 13 <__strncpy_from_user+0x13>
> e: 48 ff c9 dec %rcx
> 11: 75 f5 jne 8 <__strncpy_from_user+0x8>
> 13: 48 29 c9 sub %rcx,%rcx
> 16: 48 89 c8 mov %rcx,%rax
> 19: c3 retq
>
> Observe that "sub %rcx,%rcx; mov %rcx,%rax", whereas gcc 4.2.1
> (and many other configs) say "sub %rcx,%rdx; mov %rdx,%rax".
> Isn't it returning 0 when it ought to be returning strlen?
The asm constraints for the strncpy_from_user() result were missing an
early clobber, which tells gcc that the last output arguments
are written before all input arguments are read.
Also add more early clobbers in the rest of the file and fix 32-bit
usercopy.c in the same way.
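A minimal x86 user-space illustration of an earlyclobber constraint
(unrelated to the kernel's actual usercopy asm): the output is written
before the second input is read, so without the '&' gcc may place them in
the same register.

    /* Build with gcc/clang on x86 or x86-64. */
    static long copy_then_add(long a, long b)
    {
            long out;

            asm("mov %1, %0\n\t"    /* %0 is written here ...           */
                "add %2, %0"        /* ... before %2 is consumed        */
                : "=&r" (out)       /* '&' marks an earlyclobber output */
                : "r" (a), "r" (b));
            return out;
    }

    int main(void)
    {
            return copy_then_add(1, 2) == 3 ? 0 : 1;
    }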
Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
[ since this API is rarely used and no in-kernel user relies on a 'len'
return value (they only rely on negative return values) this miscompile
was never noticed in the field. But it's worth fixing it nevertheless. ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
m68knommu does not set the Kconfig NO_DMA variable, but also does
not provide the required functions, resulting in the following
build error triggered by commit a40c24a13366e324bc0ff8c3bb107db89312c984
(net: Add SKB DMA mapping helper functions.):
<-- snip -->
..
LD vmlinux
net/built-in.o: In function `skb_dma_unmap':
(.text+0xac5e): undefined reference to `dma_unmap_single'
net/built-in.o: In function `skb_dma_unmap':
(.text+0xac7a): undefined reference to `dma_unmap_page'
net/built-in.o: In function `skb_dma_map':
(.text+0xacdc): undefined reference to `dma_map_single'
net/built-in.o: In function `skb_dma_map':
(.text+0xace8): undefined reference to `dma_mapping_error'
net/built-in.o: In function `skb_dma_map':
(.text+0xad10): undefined reference to `dma_map_page'
net/built-in.o: In function `skb_dma_map':
(.text+0xad82): undefined reference to `dma_unmap_page'
net/built-in.o: In function `skb_dma_map':
(.text+0xadc6): undefined reference to `dma_unmap_single'
make[1]: *** [vmlinux] Error 1
Fix a longstanding bug for the 8-port Marvell Sata controllers (508x/6081),
where accesses to the upper 4 ports would cause lost-interrupts / timeouts
for the lower 4-ports. With this patch, the 6081 boards should finally be
reliable enough for mainstream use with Linux.
Signed-off-by: Mark Lord <mlord@pobox.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
1: make "target_kb" only accept and produce a memory size in kilobytes.
2: add a second "target" file which produces output in bytes, and will accept
memparse input (scaled bytes)
This fixes the rather irritating problem that writing the same value
read back into target_kb would end up shrinking the domain by a factor
of 1024, with generally bad results.
It downgraded our mmap semaphore to a read-lock while mlocking pages, in
order to allow other threads (and external accesses like "ps" et al) to
walk the vma lists and take page faults etc. Which is a nice idea, but
the implementation does not work.
Because we cannot upgrade the lock back to a write lock without
releasing the mmap semaphore, the code had to release the lock entirely
and then re-take it as a writelock. However, that meant that the caller
possibly lost the vma chain that it was following, since now another
thread could come in and mmap/munmap the range.
The code tried to work around that by just looking up the vma again and
erroring out if that happened, but quite frankly, that was just a buggy
hack that doesn't actually protect against anything (the other thread
could just have replaced the vma with another one instead of totally
unmapping it).
The only way to downgrade to a read map _reliably_ is to do it at the
end, which is likely the right thing to do: do all the 'vma' operations
with the write-lock held, then downgrade to a read after completing them
all, and then do the "populate the newly mlocked regions" while holding
just the read lock. And then just drop the read-lock and return to user
space.
The (perhaps somewhat simpler) alternative is to just make all the
callers of mlock_vma_pages_range() know that the mmap lock got dropped,
and just re-grab the mmap semaphore if it needs to mlock more than one
vma region.
So we can do this "downgrade mmap sem while populating mlocked regions"
thing right, but the way it was done here was absolutely not correct.
Thus the revert, in the expectation that we will do it all correctly
some day.
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
register_pernet_gen_subsys omits mutex_unlock in one fail path.
Fix it.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In ASYNC_TX we track the dependencies between the descriptors
using the 'next' pointers of the structures. These pointers are
set to NULL as soon as the corresponding descriptor has been
submitted to the channel (in async_tx_run_dependencies()).
But the first 'next' in the chain still remains set, regardless of the fact
that tx->next has already been submitted. This may lead to multiple
submissions of the same descriptor. This patch fixes this.
Signed-off-by: Yuri Tikhonov <yur@emcraft.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This commit breaks mapping any RAM page through /dev/mem, as the
reserve_memtype() was not initializing the return attribute type and as such
corrupting the PTE entry that was setup with the return attribute type.
Because of this bug, application mapping this RAM page through /dev/mem
will die with "Corrupted page table at address xxxx" message in the kernel
log and also the kernel identity mapping which maps the underlying RAM
page gets converted to UC.
Fix this by initializing the return attribute type before calling
reserve_ram_pages_type()
Reported-by: PaX Team <pageexec@freemail.hu> Reported-and-tested-by: Beschorner Daniel <Daniel.Beschorner@facton.com> Tested-and-Acked-by: PaX Team <pageexec@freemail.hu> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thierry Vignaud reported:
> http://bugzilla.kernel.org/show_bug.cgi?id=12372
>
> On P4 with an SiS motherboard (video card is a SiS 651)
> X server fails to start with error:
> xf86MapVidMem: Could not mmap framebuffer (0x00000000,0x2000) (Invalid
> argument)
Here X is trying to map the first 8KB of memory using /dev/mem. Existing
code treats the first 0-4KB of memory as non-RAM and 4KB-8KB as RAM. Recent
code changes don't allow mapping memory with different attributes
at the same time.
Fix this by treating the first 1MB legacy region as special and always
tracking the attribute requests within this region using a linear linked
list (and don't bother whether the range is RAM, non-RAM, or mixed).
It's a valid use case to have null associated data in a ccm vector, but
this case isn't being handled properly right now.
The following ccm decryption/verification test vector, using the
rfc4309 implementation, regularly triggers a panic, as will any
other vector with null assoc data:
The above is from a RHEL5-based kernel, but upstream is susceptible too.
The fix is trivial: in crypto/ccm.c:crypto_ccm_auth(), pctx->ilen contains
whatever was in memory when pctx was allocated if assoclen is 0. The tested
fix is to simply add an else clause setting pctx->ilen to 0 for the
assoclen == 0 case, so that get_data_to_compute() doesn't try doing
things it's not supposed to.
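In sketch form, the control flow of that else clause (only the new branch is
spelled out; the non-zero path is elided and paraphrased):

    if (assoclen) {
            /* existing path: format the associated data and feed it to
             * get_data_to_compute() as before */
            ...
    } else {
            pctx->ilen = 0;  /* the fix: no stale value left from kmalloc() */
    }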
Signed-off-by: Jarod Wilson <jarod@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As it is if an algorithm with a zero-length IV is used (e.g.,
NULL encryption) with authenc, authenc may generate an SG entry
of length zero, which will trigger a BUG check in the hash layer.
This patch fixes it by skipping the IV SG generation if the IV
size is zero.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The reference NID for the analog outputs of STAC/IDT codecs is set
to a fixed value, 0x02. But this isn't always correct, and in many
codecs it points to a non-existent NID.
This patch fixes the initialization of the PCM reference NID by taking it
from the actually probed DAC list.
bsg.h in its current form is perfectly suitable for user-mode
consumption. It is needed together with scsi/sg.h for applications
that want to interface with the bsg driver.
Currently, the few projects that use it copy it into their own
trees, but that is not acceptable for projects that need
to provide source and devel packages for distros.
This should also be submitted to stable 2.6.28 and 2.6.27, since bsg has had
a stable API since those kernels, and distro users will need the header
for them as well.
The clearing of the msg->flags needs a barrier between it and the notify
of the channel threads that the messages are cleaned and ready for use.
Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If the bte copy fails, the attempt to retrieve payloads results in a
null pointer deref rather than returning NULL as was expected.
Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Linus suggested putting limits where the money is, and max_user_watches
already does that without the need for max_user_instances. That has the
advantage of mitigating the potential DoS while allowing pretty generous
default behavior.
Allowing top 4% of low memory (per user) to be allocated in epoll watches,
we have:
After reports of poor performance, a review of the latest vendor driver
(rtl8187_linux_26.1025.0328.2007) for RTL8187L devices was undertaken.
A difference was found in the code used to index the OFDM power tables. When
the Linux driver was changed, my unit works at a much greater range than
before. I think this fixes Bugzilla #12380 and has been tested by at least
two other users.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Tested-by: Martín Ernesto Barreyro <barreyromartin@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Make sure the rec_len field in the '..' entry is sane, lest we overrun
the directory block and cause a kernel oops on a purposefully
corrupted filesystem.
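An illustrative standalone version of the kind of bounds check involved
(simplified; not the actual ext2/3/4 directory code):

    #include <stdbool.h>

    /* Minimal directory-entry header, illustrative only. */
    struct dirent_hdr {
            unsigned int   inode;
            unsigned short rec_len;
            unsigned char  name_len;
            unsigned char  file_type;
    };

    /* '..' starts right after '.', so its rec_len must keep us inside
     * the directory block. */
    static bool dotdot_rec_len_ok(unsigned int dot_rec_len,
                                  unsigned int dotdot_rec_len,
                                  unsigned int blocksize)
    {
            unsigned int dotdot_off = dot_rec_len;

            return dotdot_rec_len >= sizeof(struct dirent_hdr) &&
                   dotdot_off + dotdot_rec_len <= blocksize;
    }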
This fixes a bug related to a bug originally reported by Sami Liedes
for ext4 at:
Don't dump the EEPROM when the bnx2x adapter is down. Without this change,
running ethtool -e while the device is down causes an EEH.
Signed-off-by: Paul Larson <pl@linux.vnet.ibm.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix OOPS in mmap_region() when merging adjacent VM_LOCKED file segments
This patch differs from the upstream commit de33c8db5910cda599899dd431cc30d7c1018cbf written by Linus, as it aims
only to prevent the oops from happening, not to change anything else.
The RTL8187 and RTL8187B devices can stall unless an explicit termination
packet is sent.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Impact: reduce false positives in iomem_map_sanity_check()
Some drivers (vesafb) only map/reserve a portion of a resource.
If then some other driver comes in and maps the whole resource,
the current code WARN_ON's. This is not the intent of the checks
in iomem_map_sanity_check(); rather these checks want to
warn when crossing *hardware* resources only.
This patch skips BUSY resources as suggested by Linus.
Note: having two drivers talk to the same hardware at the same
time is obviously not optimal behavior, but that's a separate story.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Kyle McMartin <kyle@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On alpha, we have to map some stuff in the VMALLOC space very early in the
boot process (to make SRM console callbacks work and so on, see
arch/alpha/mm/init.c). For old VM allocator, we just manually placed a
vm_struct onto the global vmlist and this worked for ages.
Unfortunately, the new allocator isn't aware of this, so it constantly
tries to allocate the VM space which is already in use, making vmalloc on
alpha defunct.
This patch forces KVA to import vmlist entries on init.
[akpm@linux-foundation.org: remove unneeded check (per Johannes)] Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Nick Piggin <npiggin@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
init_srm_irq() deals with irqs 16 and above, but the size of the irq_desc
array on nautilus and some other system types is 16. So gcc 4.3
complains that "array subscript is above array bounds", even though
this function is never called on those systems.
This adds a check for NR_IRQS <= 16, which effectively optimizes
init_srm_irq() code away on problematic platforms.
Thanks to Daniel Drake <dsd@gentoo.org> for detailed analysis
of the problem.
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tobias Klausmann <klausman@schwarzvogel.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1198) fixes a conceptual bug: Somewhere along the line
we managed to confuse USB class devices with USB char devices. As a
result, the code to send a disconnect signal to userspace would not be
built if both CONFIG_USB_DEVICE_CLASS and CONFIG_USB_DEVICEFS were
disabled.
The usb_fs_classdev_common_remove() routine has been renamed to
usbdev_remove() and it is now called whenever any USB device is
removed, not just when a class device is unregistered. The notifier
registration and unregistration calls are no longer conditionally
compiled. And since the common removal code will always be called as
part of the char device interface, there's no need to call it again as
part of the usbfs interface; thus the invocation of
usb_fs_classdev_common_remove() has been taken out of
usbfs_remove_device().
Running a 32-bit usbmon(8) on 2.6.28-rc9 produces the following:
ioctl32(usbmon:28563): Unknown cmd fd(3) cmd(400c9206){t:ffffff92;sz:12} arg(ffd3f458) on /dev/usbmon0
It happens because the compatibility mode was implemented for 2.6.18
and not updated for the fsops.compat_ioctl API.
This patch relocates the pieces from under #ifdef CONFIG_COMPAT into
compat_ioctl with no other changes except one new whitespace.
Signed-off-by: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1200) finishes some fixes that were left incomplete by
an earlier patch.
Although nobody has addressed this issue in the past, it turns out
that we need to distinguish between two different modes of disabling
and enabling endpoints. In one mode only the data structures in
usbcore are affected, and in the other mode the host controller and
device hardware states are affected as well.
The earlier patch added an extra argument to the routines in the
enable_endpoint pathways to reflect this difference. This patch adds
corresponding arguments to the disable_endpoint pathways. Without
this change, the endpoint toggle state can get out of sync between
the host and the device. The exact mechanism depends on the details
of the host controller (whether or not it stores its own copy of the
toggle values).
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Dan Streetman <ddstreet@ieee.org> Tested-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Impact: fix sporadic slowdowns and warning messages
This patch fixes a performance issue reported by Linus on his
Nehalem system. While Linus reverted the PAT patch (commit 58dab916dfb57328d50deb0aa9b3fc92efa248ff) which exposed the issue,
existing cpa() code can potentially still cause wrong (page attribute
corruption) behavior.
This patch also fixes the "WARNING: at arch/x86/mm/pageattr.c:560" that
various people reported.
In 64bit kernel, kernel identity mapping might have holes depending
on the available memory and how e820 reports the address range
covering the RAM, ACPI, PCI reserved regions. If there is a 2MB/1GB hole
in the address range that is not listed by e820 entries, kernel identity
mapping will have a corresponding hole in its 1-1 identity mapping.
If cpa() happens on the kernel identity mapping which falls into these holes,
existing code fails like this:
__change_page_attr_set_clr()
    __change_page_attr()
        returns 0 because of if (!kpte). But doesn't
        set cpa->numpages and cpa->pfn.
    cpa_process_alias()
        uses uninitialized cpa->pfn (random value)
        which can potentially lead to changing the page
        attribute of kernel text/data, kernel identity
        mapping of RAM pages etc. oops!
This bug was easily exposed by another PAT patch which was doing
cpa() more often on kernel identity mapping holes (the physical range between
max_low_pfn_mapped and 4GB), where it was setting the
cache disable attribute (PCD) for kernel identity mappings as well.
Fix cpa() to handle the kernel identity mapping holes. Retain
the WARN() for cpa() calls to other not-present address ranges
(kernel text/data, ioremap() addresses).
Some sysfs binary files don't like having 0 passed to them as a size.
Fix this up at the root by just returning to the vfs if userspace asks
us for a zero sized buffer.
Thanks to Pavel Roskin for pointing this out.
Reported-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit a1ed5b0cffe4b16a93a6a3390e8cee0fbef94f86
(klist: don't iterate over deleted entries) introduced use of the
low bit in a pointer to indicate whether the knode is dead or not,
assuming that this bit is always free.
This is not true for all architectures; CRIS, for example, may align data
on byte boundaries.
The result is a bunch of warnings on bootup, devices not being
added correctly etc, reported by Hinko Kocevar <hinko.kocevar@cetrtapot.si>:
------------[ cut here ]------------
WARNING: at lib/klist.c:62 ()
Modules linked in:
Wed, Jan 14, 2009 at 12:11:32AM +0100, Bastien ROUCARIES wrote:
> Perhaps using a pointerhackalign trick on this structure where
> #define pointerhackalign(x) __attribute__ ((aligned (x)))
> and declare
> struct klist_node {
> ...
> } pointerhackalign(2);
>
> Because __attribute__ ((aligned (x))) could only increase alignment
> it will safe to do that and serve as documentation purpose :)
That works, but we need to do it not for the struct klist_node,
but for the struct we insert into the void * in klist_node,
which is struct klist.
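A standalone demonstration of why the alignment attribute makes the low-bit
trick safe (the struct below is a stand-in, not the real struct klist):

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Force at least 2-byte alignment so bit 0 of a pointer to this struct
     * is guaranteed free, even on architectures (such as CRIS) that may
     * otherwise place data on odd byte boundaries. */
    struct klist_standin {
            char dummy;
    } __attribute__((aligned(2)));

    #define KNODE_DEAD 1UL

    int main(void)
    {
            struct klist_standin k;
            uintptr_t tagged = (uintptr_t)&k;

            assert((tagged & KNODE_DEAD) == 0); /* low bit free: aligned(2) */
            tagged |= KNODE_DEAD;               /* mark "dead" in the spare bit */

            printf("dead=%lu, real pointer=%p\n",
                   (unsigned long)(tagged & KNODE_DEAD),
                   (void *)(tagged & ~KNODE_DEAD));
            return 0;
    }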
On -rt we were seeing spurious bad page states like:
Bad page state in process 'firefox'
page:c1bc2380 flags:0x40000000 mapping:c1bc2390 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Pid: 503, comm: firefox Not tainted 2.6.26.8-rt13 #3
[<c043d0f3>] ? printk+0x14/0x19
[<c0272d4e>] bad_page+0x4e/0x79
[<c0273831>] free_hot_cold_page+0x5b/0x1d3
[<c02739f6>] free_hot_page+0xf/0x11
[<c0273a18>] __free_pages+0x20/0x2b
[<c027d170>] __pte_alloc+0x87/0x91
[<c027d25e>] handle_mm_fault+0xe4/0x733
[<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
[<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
[<c0218875>] do_page_fault+0x36f/0x88a
This is the case where a concurrent fault already installed the PTE and
we get to free the newly allocated one.
This is due to pgtable_page_ctor() doing the spin_lock_init(&page->ptl)
which is overlaid with the {private, mapping} struct.
    union {
        struct {
            unsigned long private;
            struct address_space *mapping;
        };
        spinlock_t ptl;
        struct kmem_cache *slab;
        struct page *first_page;
    };
Normally the spinlock is small enough to not stomp on page->mapping, but
PREEMPT_RT=y has huge 'spin'locks.
But lockdep kernels should also be able to trigger this splat, as the
lock tracking code grows the spinlock to cover page->mapping.
The obvious fix is calling pgtable_page_dtor() like the regular pte free
path __pte_free_tlb() does.
It seems all architectures except x86 and mn10300 already do this, and
mn10300 doesn't seem to use pgtable_page_ctor(), which suggests it
doesn't do SMP or simply doesn't do MMU at all or something.
If a fuse filesystem is unmounted but the device file descriptor
remains open and a new mount reuses the old device number, then the
mount fails with EEXIST and the following warning is printed in the
kernel log:
WARNING: at fs/sysfs/dir.c:462 sysfs_add_one+0x35/0x3d()
sysfs: duplicate filename '0:15' can not be created
The cause is that the bdi belonging to the fuse filesystem was
destroyed only after the device file was released. Fix this by
calling bdi_destroy() from fuse_put_super() instead.
If userspace supplies an invalid pointer to a read() of an inotify
instance, the inotify device's event list mutex is unlocked twice.
This causes an imbalance which effectively leaves the data structure
unprotected, and we can trigger oopses by accessing the inotify
instance from different tasks concurrently.
The best fix (contributed largely by Linus) is a total rewrite
of the function in question:
On Thu, Jan 22, 2009 at 7:05 AM, Linus Torvalds wrote:
> The thing to notice is that:
>
> - locking is done in just one place, and there is no question about it
> not having an unlock.
>
> - that whole double-while(1)-loop thing is gone.
>
> - use multiple functions to make nesting and error handling sane
>
> - do error testing after doing the things you always need to do, ie do
> this:
>
> mutex_lock(..)
> ret = function_call();
> mutex_unlock(..)
>
> .. test ret here ..
>
> instead of doing conditional exits with unlocking or freeing.
>
> So if the code is written in this way, it may still be buggy, but at least
> it's not buggy because of subtle "forgot to unlock" or "forgot to free"
> issues.
>
> This _always_ unlocks if it locked, and it always frees if it got a
> non-error kevent.
Cc: John McCutchan <ttb@tentacle.dhs.org> Cc: Robert Love <rlove@google.com> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch fixes mesh point operation (thanks to YanBo for pointing
out the problem): make mesh point interfaces start beaconing when
they come up and configure the RX filter in mesh mode so that mesh
beacons and action frames are received. Add mesh point to the check
in ath5k_add_interface. Tested with multiple AR5211 cards.
Signed-off-by: Andrey Yurovsky <andrey@cozybit.com> Acked-by: Nick Kossifidis <mickflemm@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Cc: Bob Copeland <me@bobcopeland.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
s_syncing livelock avoidance was breaking the data integrity guarantee of
sys_sync, by allowing sys_sync to skip writing or waiting for superblocks
if there is a concurrent sys_sync happening.
This livelock avoidance is much less important now that we don't have the
get_super_to_sync() call after every sb that we sync. This was replaced
by __put_super_and_need_restart.
Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix data integrity semantics required by sys_sync, by iterating over all
inodes and waiting for any writeback pages after the initial writeout.
Comments explain the exact problem.
Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove WB_SYNC_HOLD. The primary motivation is the design of my
anti-starvation code for fsync. It requires taking an inode lock over the
sync operation, so we could run into lock ordering problems with multiple
inodes. It is possible to take a single global lock to solve the ordering
problem, but then that would prevent a future nice implementation of "sync
multiple inodes" based on lock order via inode address.
Seems like a backward step to remove this, but actually it is busted
anyway: we can't use the inode lists for data integrity wait: an inode can
be taken off the dirty lists but still be under writeback. In order to
satisfy data integrity semantics, we should wait for it to finish
writeback, but if we only search the dirty lists, we'll miss it.
It would be possible to have a "writeback" list, for sys_sync, I suppose.
But why complicate things by optimising prematurely? For unmounting, we
could avoid the "livelock avoidance" code, which would be easier, but
again premature IMO.
Fixing the existing data integrity problem will come next.
Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Direct IO can invalidate and sync a lot of pagecache pages in the mapping.
A 4K direct IO will actually try to sync and/or invalidate the pagecache
of the entire file, for example (which might be many GB or TB large).
Improve this by doing range syncs. Also, memory no longer has to be
unmapped to catch the dirty bits for syncing, as dirty bits would remain
coherent due to dirty mmap accounting.
This fixes the immediate DM deadlocks when doing direct IO reads to block
device with a mounted filesystem, if only by papering over the problem
somewhat rather than addressing the fsync starvation cases.
Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Chris Mason noticed do_sync_mapping_range didn't actually ask for data
integrity writeout. Unfortunately, it is advertised as being usable for
data integrity operations.
This is a data integrity bug.
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Terminate the write_cache_pages loop upon encountering the first page past
end, without locking the page. Pages cannot have their index change when
we have a reference on them (truncate, eg truncate_inode_pages_range
performs the same check without the page lock).
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In write_cache_pages, if we get stuck behind another process that is
cleaning pages, we will be forced to wait for them to finish, then perform
our own writeout (if it was redirtied during the long wait), then wait for
that.
If a page under writeout is still clean, we can skip waiting for it (if
we're part of a data integrity sync, we'll be waiting for all writeout
pages afterwards, so we'll still be waiting for the other guy's write
that's cleaned the page).
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>