git.karo-electronics.de Git - karo-tx-linux.git/log

gcov: fix null-pointer dereference for certain module types

commit 85a0fdfd0f967507f3903e8419bc7e408f5a59de upstream.

The gcov-kernel infrastructure expects that each object file is loaded
only once.  This may not be true, e.g.  when loading multiple kernel
modules which are linked to the same object file.  As a result, loading
such kernel modules will result in incorrect gcov results while unloading
will cause a null-pointer dereference.

This patch fixes these problems by changing the gcov-kernel infrastructure
so that multiple profiling data sets can be associated with one debugfs
entry.  It applies to 2.6.36-rc1.

Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Reported-by: Werner Spies <werner.spies@thalesgroup.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

irda: off by one

commit cf9b94f88bdbe8a02015fc30d7c232b2d262d4ad upstream.

This is an off by one. We would go past the end when we NUL terminate
the "value" string at end of the function. The "value" buffer is
allocated in irlan_client_parse_response() or
irlan_provider_parse_command().

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

tracing: t_start: reset FTRACE_ITER_HASH in case of seek/pread

commit df09162550fbb53354f0c88e85b5d0e6129ee9cc upstream.

Be sure to avoid entering t_show() with FTRACE_ITER_HASH set without
having properly started the iterator to iterate the hash.  This case is
degenerate and, as discovered by Robert Swiecki, can cause t_hash_show()
to misuse a pointer.  This causes a NULL ptr deref with possible security
implications.  Tracked as CVE-2010-3079.

Cc: Robert Swiecki <swiecki@google.com>
Cc: Eugene Teo <eugene@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

tracing: Do not allow llseek to set_ftrace_filter

commit 9c55cb12c1c172e2d51e85fbb5a4796ca86b77e7 upstream.

Reading the file set_ftrace_filter does three things.

1) shows whether or not filters are set for the function tracer
2) shows what functions are set for the function tracer
3) shows what triggers are set on any functions

3 is independent from 1 and 2.

The way this file currently works is that it is a state machine,
and as you read it, it may change state. But this assumption breaks
when you use lseek() on the file. The state machine gets out of sync
and the t_show() may use the wrong pointer and cause a kernel oops.

Luckily, this will only kill the app that does the lseek, but the app
dies while holding a mutex. This prevents anyone else from using the
set_ftrace_filter file (or any other function tracing file for that matter).

A real fix for this is to rewrite the code, but that is too much for
a -rc release or stable. This patch simply disables llseek on the
set_ftrace_filter() file for now, and we can do the proper fix for the
next major release.

Reported-by: Robert Swiecki <swiecki@google.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Tavis Ormandy <taviso@google.com>
Cc: Eugene Teo <eugene@redhat.com>
Cc: vendor-sec@lst.de
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

tracing: Fix a race in function profile

commit 3aaba20f26f58843e8f20611e5c0b1c06954310f upstream.

While we are reading trace_stat/functionX and someone just
disabled function_profile at that time, we can trigger this:

divide error: 0000 [#1] PREEMPT SMP
...
EIP is at function_stat_show+0x90/0x230
...

This fix just takes the ftrace_profile_lock and checks if
rec->counter is 0. If it's 0, we know the profile buffer
has been reset.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4C723644.4040708@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

libata: skip EH autopsy and recovery during suspend

commit e2f3d75fc0e4a0d03c61872bad39ffa2e74a04ff upstream.

For some mysterious reason, certain hardware reacts badly to usual EH
actions while the system is going for suspend. As the devices won't
be needed until the system is resumed, ask EH to skip usual autopsy
and recovery and proceed directly to suspend.

Signed-off-by: Tejun Heo <tj@kernel.org>
Tested-by: Stephan Diestelhorst <stephan.diestelhorst@amd.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

oprofile, x86: fix init_sysfs() function stub

commit 269f45c25028c75fe10e6d9be86e7202ab461fbc upstream.

The use of the return value of init_sysfs() with commit

10f0412 oprofile, x86: fix init_sysfs error handling

discovered the following build error for !CONFIG_PM:

.../linux/arch/x86/oprofile/nmi_int.c: In function ‘op_nmi_init’:
.../linux/arch/x86/oprofile/nmi_int.c:784: error: expected expression before ‘do’
make[2]: *** [arch/x86/oprofile/nmi_int.o] Error 1
make[1]: *** [arch/x86/oprofile] Error 2

This patch fixes this.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

oprofile, x86: fix init_sysfs error handling

commit 10f0412f57f2a76a90eff4376f59cbb0a39e4e18 upstream.

On failure init_sysfs() might not properly free resources. The error
code of the function is not checked. And, when reinitializing the exit
function might be called twice. This patch fixes all this.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

oprofile: fix crash when accessing freed task structs

commit 750d857c682f4db60d14722d430c7ccc35070962 upstream.

This patch fixes a crash during shutdown reported below. The crash is
caused by accessing already freed task structs. The fix changes the
order for registering and unregistering notifier callbacks.

All notifiers must be initialized before buffers start working. To
stop buffer synchronization we cancel all workqueues, unregister the
notifier callback and then flush all buffers. After all of this we
finally can free all tasks listed.

This should avoid accessing freed tasks.

On 22.07.10 01:14:40, Benjamin Herrenschmidt wrote:

> So the initial observation is a spinlock bad magic followed by a crash
> in the spinlock debug code:
>
> [ 1541.586531] BUG: spinlock bad magic on CPU#5, events/5/136
> [ 1541.597564] Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6d03
>
> Backtrace looks like:
>
>       spin_bug+0x74/0xd4
>       ._raw_spin_lock+0x48/0x184
>       ._spin_lock+0x10/0x24
>       .get_task_mm+0x28/0x8c
>       .sync_buffer+0x1b4/0x598
>       .wq_sync_buffer+0xa0/0xdc
>       .worker_thread+0x1d8/0x2a8
>       .kthread+0xa8/0xb4
>       .kernel_thread+0x54/0x70
>
> So we are accessing a freed task struct in the work queue when
> processing the samples.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

sysfs: checking for NULL instead of ERR_PTR

commit 57f9bdac2510cd7fda58e4a111d250861eb1ebeb upstream.

d_path() returns an ERR_PTR and it doesn't return NULL.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ALSA: seq/oss - Fix double-free at error path of snd_seq_oss_open()

commit 27f7ad53829f79e799a253285318bff79ece15bd upstream.

The error handling in snd_seq_oss_open() has several bad codes that
do dereferecing released pointers and double-free of kmalloc'ed data.
The object dp is release in free_devinfo() that is called via
private_free callback. The rest shouldn't touch this object any more.

The patch changes delete_port() to call kfree() in any case, and gets
rid of unnecessary calls of destructors in snd_seq_oss_open().

Fixes CVE-2010-3080.

Reported-and-tested-by: Tavis Ormandy <taviso@cmpxchg8b.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

USB: cdc-acm: Fixing crash when ACM probing interfaces with no endpoint descriptors.

commit 577045c0a76e34294f902a7d5d60e90b04d094d0 upstream.

Certain USB devices, such as the Nokia X6 mobile phone, don't expose any
endpoint descriptors on some of their interfaces. If the ACM driver is forced
to probe all interfaces on a device the a NULL pointer dereference will occur
when the ACM driver attempts to use the endpoint of the alternative settings.
One way to get the ACM driver to probe all the interfaces is by using the
/sys/bus/usb/drivers/cdc_acm/new_id interface.

This patch checks that the endpoint pointer for the current alternate settings
is non-NULL before using it.

Signed-off-by: Toby Gray <toby.gray@realvnc.com>
Cc: Oliver Neukum <oliver@neukum.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

USB: cdc-acm: Add pseudo modem without AT command capabilities

commit 5b239f0aebd4dd6f85b13decf5e18e86e35d57f0 upstream.

cdc-acm.c : Manage pseudo-modem without AT commands capabilities
  Enable to drive electronic simple gadgets based on microcontrolers.
  The Interface descriptor is like this:
    bInterfaceClass         2 Communications
    bInterfaceSubClass      2 Abstract (modem)
    bInterfaceProtocol      0 None

Signed-off-by: Philippe Corbes <philippe.corbes@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

USB: cdc-acm: Adding second ACM channel support for various Nokia and one Samsung phones

commit 4035e45632c2a8bb4edae83c20447051bd9a9604 upstream.

S60 phones from Nokia and Samsung expose two ACM channels. The first is a modem
with a standard AT-command interface, which is picked up correctly by CDC-ACM.

The second ACM port is marked as having a vendor-specific protocol. This means
that the ACM driver will not claim the second channel by default.

This adds support for the second ACM channel for the following devices:
    Nokia E63
    Nokia E75
    Nokia 6760 Slide
    Nokia E52
    Nokia E55
    Nokia E72
    Nokia X6
    Nokia N97 Mini
    Nokia 5800 Xpressmusic
    Nokia E90
    Samsung GTi8510 (INNOV8)

Signed-off-by: Toby Gray <toby.gray@realvnc.com>
Cc: Oliver Neukum <oliver@neukum.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

USB: Expose vendor-specific ACM channel on Nokia 5230

commit 83a4eae9aeed4a69e89e323a105e653ae06e7c1f upstream.

Nokia S60 phones expose two ACM channels. The first is
a modem, the second is 'vendor-specific' but is treated
as a serial device at the S60 end, so we want to expose
it on Linux too.

Signed-off-by: Przemo Firszt <przemo@firszt.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

usb: serial: mos7840: Add USB IDs to support more B&B USB/RS485 converters.

commit 870408c8291015872a7a0b583673a9e56b3e73f4 upstream.

Add the USB IDs needed to support the B&B USOPTL4-4P, USO9ML2-2P, and
USO9ML2-4P. This patch expands and corrects a typo in the patch sent
on 08-31-2010.

Signed-off-by: Dave Ludlow <dave.ludlow@bay.ws>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

usb: serial: mos7840: Add USB ID to support the B&B Electronics USOPTL4-2P.

commit caf3a636a9f809fdca5fa746e6687096457accb1 upstream.

Add the USB ID needed to support B&B Electronic's 2-port, optically-isolated,
powered, USB to RS485 converter.

Signed-off-by: Dave Ludlow <dave.ludlow@bay.ws>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

USB: ftdi_sio: Added custom PIDs for ChamSys products

commit 657373883417b2618023fd4135d251ba06a2c30a upstream.

Added the 0xDAF8 to 0xDAFF PID range for ChamSys limited USB interface/wing products

Signed-off-by: Luke Lowrey <luke@chamsys.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

USB: cp210x: Add B&G H3000 link cable ID

commit 0bf7a81c5d447c21db434be35363c44c0a30f598 upstream.

This is the cable between an H3000 navigation unit and a multi-function display.
http://www.bandg.com/en/Products/H3000/Spares-and-Accessories/Cables/H3000-CPU-USB-Cable-Pack/

Signed-off-by: Jason Detring <jason.detring@navico.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

USB: CP210x Add new device ID

commit 541e05ec3add5ab5bcf238d60161b53480280b20 upstream.

New device ID added for Balluff RFID reader.

Signed-off-by: Craig Shelley <craig@microtron.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

USB: Fix kernel oops with g_ether and Windows

commit 037d3656adbd7e8cb848f01cf5dec423ed76bbe7 upstream.

Please find attached patch for
https://bugzilla.kernel.org/show_bug.cgi?id=16023 problem.

Signed-off-by: Maxim Osipov <maxim.osipov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

USB: ehci-ppc-of: problems in unwind

commit 08a3b3b1c2e622e378d9086aee9e2e42ce37591d upstream.

The iounmap(ehci->ohci_hcctrl_reg); should be the first thing we do
because the ioremap() was the last thing we did. Also if we hit any of
the goto statements in the original code then it would have led to a
NULL dereference of "ehci". This bug was introduced in: 796bcae7361c
"USB: powerpc: Workaround for the PPC440EPX USBH_23 errata [take 3]"

I modified the few lines in front a little so that my code didn't
obscure the return success code path.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ocfs2: Fix incorrect checksum validation error

commit f5ce5a08a40f2086435858ddc80cb40394b082eb upstream.

For local mounts, ocfs2_read_locked_inode() calls ocfs2_read_blocks_sync() to
read the inode off the disk. The latter first checks to see if that block is
cached in the journal, and, if so, returns that block. That is ok.

But ocfs2_read_locked_inode() goes wrong when it tries to validate the checksum
of such blocks. Blocks that are cached in the journal may not have had their
checksum computed as yet. We should not validate the checksums of such blocks.

Fixes ossbz#1282
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1282

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Singed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ath9k_hw: fix parsing of HT40 5 GHz CTLs

commit 904879748d7439a6dabdc6be9aad983e216b027d upstream.

The 5 GHz CTL indexes were not being read for all hardware
devices due to the masking out through the CTL_MODE_M mask
being one bit too short. Without this the calibrated regulatory
maximum values were not being picked up when devices operate
on 5 GHz in HT40 mode. The final output power used for Atheros
devices is the minimum between the calibrated CTL values and
what CRDA provides.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

fuse: flush background queue on connection close

commit 595afaf9e6ee1b48e13ec4b8bcc8c7dee888161a upstream.

David Bartly reported that fuse can hang in fuse_get_req_nofail() when
the connection to the filesystem server is no longer active.

If bg_queue is not empty then flush_bg_queue() called from
request_end() can put more requests on to the pending queue.  If this
happens while ending requests on the processing queue then those
background requests will be queued to the pending list and never
ended.

Another problem is that fuse_dev_release() didn't wake up processes
sleeping on blocked_waitq.

Solve this by:

a) flushing the background queue before calling end_requests() on the
    pending and processing queues

b) setting blocked = 0 and waking up processes waiting on
    blocked_waitq()

Thanks to David for an excellent bug report.

Reported-by: David Bartley <andareed@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

staging: hv: Fixed lockup problem with bounce_buffer scatter list

commit 77c5ceaff31645ea049c6706b99e699eae81fb88 upstream.

Fixed lockup problem with bounce_buffer scatter list which caused
crashes in heavy loads. And minor code indentation cleanup in effected
area.

Removed whitespace and noted minor indentation changes in description as
pointed out by Joe Perches. (Thanks for reviewing Joe)

Signed-off-by: Hank Janssen <hjanssen@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

staging: hv: Increased storvsc ringbuffer and max_io_requests

commit 15dd1c9f53b31cdc84b8072a88c23fa09527c596 upstream.

Increased storvsc ringbuffer and max_io_requests. This now more
closely mimics the numbers on Hyper-V. And will allow more IO requests
to take place for the SCSI driver.

Max_IO is set to double from what it was before, Hyper-V allows it and
we have had appliance builder requests to see if it was a problem to
increase the number.

Ringbuffer size for storvsc is now increased because I have seen A few buffer
problems on extremely busy systems. They were Set pretty low before.
And since max_io_requests is increased I Really needed to increase the buffer
as well.

Signed-off-by: Hank Janssen <hjanssen@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

staging: hv: Fixed the value of the 64bit-hole inside ring buffer

commit e5fa721d1c2a54261a37eb59686e18dee34b6af6 upstream.

Fixed the value of the 64bit-hole inside ring buffer, this
caused a problem on Hyper-V when running checked Windows builds.

Checked builds of Windows are used internally and given to external
system integrators at times. They are builds that for example that all
elements in a structure follow the definition of that Structure. The bug
this fixed was for a field that we did not fill in at all (Because we do
Not use it on the Linux side), and the checked build of windows gives
errors on it internally to the Windows logs.

This fixes that error.

Signed-off-by:Hank Janssen <hjanssen@microsoft.com>
Signed-off-by:Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

staging: hv: Fixed bounce kmap problem by using correct index

commit 0c47a70a9a8a6d1ec37a53d2f9cb82f8b8ef8aa2 upstream.

Fixed bounce offset kmap problem by using correct index.
The symptom of the problem is that in some NAS appliances this problem
represents Itself by a unresponsive VM under a load with many clients writing
small files.

Signed-off-by:Hank Janssen <hjanssen@microsoft.com>
Signed-off-by:Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

staging: hv: Fix missing functions for net_device_ops

commit b681b5886bb5d1f5b6750a0ed7c62846da7ccea4 upstream.

Fix missing functions for net_device_ops.
It's a bug when porting the drivers from 2.6.27 to 2.6.32. In 2.6.27,
the default functions for Ethernet, like eth_change_mtu(), were assigned
by ether_setup(). But in 2.6.32, these function pointers moved to
net_device_ops structure and no longer be assigned in ether_setup(). So
we need to set these functions in our driver code. It will ensure the
MTU won't be set beyond 1500. Otherwise, this can cause an error on the
server side, because the HyperV linux driver doesn't support jumbo frame
yet.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Hank Janssen <hjanssen@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

PCI: MSI: Restore read_msi_msg_desc(); add get_cached_msi_msg_desc()

commit 30da55242818a8ca08583188ebcbaccd283ad4d9 upstream.

commit 2ca1af9aa3285c6a5f103ed31ad09f7399fc65d7 "PCI: MSI: Remove
unsafe and unnecessary hardware access" changed read_msi_msg_desc() to
return the last MSI message written instead of reading it from the
device, since it may be called while the device is in a reduced
power state.

However, the pSeries platform code really does need to read messages
from the device, since they are initially written by firmware.
Therefore:
- Restore the previous behaviour of read_msi_msg_desc()
- Add new functions get_cached_msi_msg{,_desc}() which return the
last MSI message written
- Use the new functions where appropriate

Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

PCI: MSI: Remove unsafe and unnecessary hardware access

commit fcd097f31a6ee207cc0c3da9cccd2a86d4334785 upstream.

During suspend on an SMP system, {read,write}_msi_msg_desc() may be
called to mask and unmask interrupts on a device that is already in a
reduced power state. At this point memory-mapped registers including
MSI-X tables are not accessible, and config space may not be fully
functional either.

While a device is in a reduced power state its interrupts are
effectively masked and its MSI(-X) state will be restored when it is
brought back to D0. Therefore these functions can simply read and
write msi_desc::msg for devices not in D0.

Further, read_msi_msg_desc() should only ever be used to update a
previously written message, so it can always read msi_desc::msg
and never needs to touch the hardware.

Tested-by: "Michael Chan" <mchan@broadcom.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

x86, tsc, sched: Recompute cyc2ns_offset's during resume from sleep states

commit cd7240c0b900eb6d690ccee088a6c9b46dae815a upstream.

TSC's get reset after suspend/resume (even on cpu's with invariant TSC
which runs at a constant rate across ACPI P-, C- and T-states). And in
some systems BIOS seem to reinit TSC to arbitrary large value (still
sync'd across cpu's) during resume.

This leads to a scenario of scheduler rq->clock (sched_clock_cpu()) less
than rq->age_stamp (introduced in 2.6.32). This leads to a big value
returned by scale_rt_power() and the resulting big group power set by the
update_group_power() is causing improper load balancing between busy and
idle cpu's after suspend/resume.

This resulted in multi-threaded workloads (like kernel-compilation) go
slower after suspend/resume cycle on core i5 laptops.

Fix this by recomputing cyc2ns_offset's during resume, so that
sched_clock() continues from the point where it was left off during
suspend.

Reported-by: Florian Pritz <flo@xssn.at>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1282262618.2675.24.camel@sbsiddha-MOBL3.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

sata_mv: fix broken DSM/TRIM support (v2)

commit 44b733809a5aba7f6b15a548d31a56d25bf3851c upstream.

Fix DSM/TRIM commands in sata_mv (v2).
These need to be issued using old-school "BM DMA",
rather than via the EDMA host queue.

Since the chips don't have proper BM DMA status,
we need to be more careful with setting the ATA_DMA_INTR bit,
since DSM/TRIM often has a long delay between "DMA complete"
and "command complete".

GEN_I chips don't have BM DMA, so no TRIM for them.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ALSA: hda - Rename iMic to Int Mic on Lenovo NB0763

commit 150b432f448281d5518f5229d240923f9a9c5459 upstream.

The non-standard name "iMic" makes PulseAudio ignore the microphone.
BugLink: https://launchpad.net/bugs/605101
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

xen: use percpu interrupts for IPIs and VIRQs

commit aaca49642b92c8a57d3ca5029a5a94019c7af69f upstream.

IPIs and VIRQs are inherently per-cpu event types, so treat them as such:
- use a specific percpu irq_chip implementation, and
- handle them with handle_percpu_irq

This makes the path for delivering these interrupts more efficient
(no masking/unmasking, no locks), and it avoid problems with attempts
to migrate them.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

xen: handle events as edge-triggered

commit dffe2e1e1a1ddb566a76266136c312801c66dcf7 upstream.

Xen events are logically edge triggered, as Xen only calls the event
upcall when an event is newly set, but not continuously as it remains set.
As a result, use handle_edge_irq rather than handle_level_irq.

This has the important side-effect of fixing a long-standing bug of
events getting lost if:
- an event's interrupt handler is running
- the event is migrated to a different vcpu
- the event is re-triggered

The most noticable symptom of these lost events is occasional lockups
of blkfront.

Many thanks to Tom Kopec and Daniel Stodden in tracking this down.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Tom Kopec <tek@acm.org>
Cc: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

hwmon: (k8temp) Differentiate between AM2 and ASB1

commit a05e93f3b3fc2f53c1d0de3b17019e207c482349 upstream.

Commit 8bf0223ed515be24de0c671eedaff49e78bebc9c (hwmon, k8temp: Fix
temperature reporting for ASB1 processor revisions) fixed temperature
reporting for ASB1 CPUs. But those CPU models (model 0x6b, 0x6f, 0x7f)
were packaged both as AM2 (desktop) and ASB1 (mobile). Thus the commit
leads to wrong temperature reporting for AM2 CPU parts.

The solution is to determine the package type for models 0x6b, 0x6f,
0x7f.

This is done using BrandId from CPUID Fn8000_0001_EBX[15:0]. See
"Constructing the processor Name String" in "Revision Guide for AMD
NPT Family 0Fh Processors" (Rev. 3.46).

Cc: Rudolf Marek <r.marek@assembler.cz>
Reported-by: Vladislav Guberinic <neosisani@gmail.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: fix freeze deadlock under IO

commit 437f88cc031ffe7f37f3e705367f4fe1f4be8b0f upstream.
[The 6b0310fb below references the mainline version of what
has also been cherry picked into this 34-stable branch]

Commit 6b0310fbf087ad6 caused a regression resulting in deadlocks
when freezing a filesystem which had active IO; the vfs_check_frozen
level (SB_FREEZE_WRITE) did not let the freeze-related IO syncing
through.  Duh.

Changing the test to FREEZE_TRANS should let the normal freeze
syncing get through the fs, but still block any transactions from
starting once the fs is completely frozen.

I tested this by running fsstress in the background while periodically
snapshotting the fs and running fsck on the result.  I ran into
occasional deadlocks, but different ones.  I think this is a
fine fix for the problem at hand, and the other deadlocky things
will need more investigation.

Reported-by: Phillip Susi <psusi@cfl.rr.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

CIFS: Remove __exit mark from cifs_exit_dns_resolver()

commit 51c20fcced5badee0e2021c6c89f44aa3cbd72aa upstream.

Remove the __exit mark from cifs_exit_dns_resolver() as it's called by the
module init routine in case of error, and so may have been discarded during
linkage.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: Make fsync sync new parent directories in no-journal mode

commit 14ece1028b3ed53ffec1b1213ffc6acaf79ad77c upstream.

Add a new ext4 state to tell us when a file has been newly created; use
that state in ext4_sync_file in no-journal mode to tell us when we need
to sync the parent directory as well as the inode and data itself. This
fixes a problem in which a panic or power failure may lose the entire
file even when using fsync, since the parent directory entry is lost.

Addresses-Google-Bug: #2480057

Signed-off-by: Frank Mayhar <fmayhar@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: Fix compat EXT4_IOC_ADD_GROUP

commit 4d92dc0f00a775dc2e1267b0e00befb783902fe7 upstream.

struct ext4_new_group_input needs to be converted because u64 has
only 32-bit alignment on some 32-bit architectures, notably i386.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: Conditionally define compat ioctl numbers

commit 899ad0cea6ad7ff4ba24b16318edbc3cbbe03fad upstream.

It is unnecessary, and in general impossible, to define the compat
ioctl numbers except when building the filesystem with CONFIG_COMPAT
defined.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: restart ext4_ext_remove_space() after transaction restart

commit 0617b83fa239db9743a18ce6cc0e556f4d0fd567 upstream.

If i_data_sem was internally dropped due to transaction restart, it is
necessary to restart path look-up because extents tree was possibly
modified by ext4_get_block().

https://bugzilla.kernel.org/show_bug.cgi?id=15827

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: Clear the EXT4_EOFBLOCKS_FL flag only when warranted

commit 786ec7915e530936b9eb2e3d12274145cab7aa7d upstream.

Dimitry Monakhov discovered an edge case where it was possible for the
EXT4_EOFBLOCKS_FL flag could get cleared unnecessarily.  This is true;
I have a test case that can be exercised via downloading and
decompressing the file:

wget ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/ext4-testcases/eofblocks-fl-test-case.img.bz2
bunzip2 eofblocks-fl-test-case.img
dd if=/dev/zero of=eofblocks-fl-test-case.img bs=1k seek=17925 bs=1k count=1 conv=notrunc

However, triggering it in real life is highly unlikely since it
requires an extremely fragmented sparse file with a hole in exactly
the right place in the extent tree.  (It actually took quite a bit of
work to generate this test case.)  Still, it's nice to get even
extreme corner cases to be correct, so this patch makes sure that we
don't clear the EXT4_EOFBLOCKS_FL incorrectly even in this corner
case.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: Avoid crashing on NULL ptr dereference on a filesystem error

commit f70f362b4a6fe47c239dbfb3efc0cc2c10e4f09c upstream.

If the EOFBLOCK_FL flag is set when it should not be and the inode is
zero length, then eh_entries is zero, and ex is NULL, so dereferencing
ex to print ex->ee_block causes a kernel OOPS in
ext4_ext_map_blocks().

On top of that, the error message which is printed isn't very helpful.
So we fix this by printing something more explanatory which doesn't
involve trying to print ex->ee_block.

Addresses-Google-Bug: #2655740

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: Use bitops to read/modify i_flags in struct ext4_inode_info

commit 12e9b892002d9af057655d35b44db8ee9243b0dc upstream.

At several places we modify EXT4_I(inode)->i_flags without holding
i_mutex (ext4_do_update_inode, ...). These modifications are racy and
we can lose updates to i_flags. So convert handling of i_flags to use
bitops which are atomic.

https://bugzilla.kernel.org/show_bug.cgi?id=15792

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: Show journal_checksum option

commit 39a4bade8c1826b658316d66ee81c09b0a4d7d42 upstream.

We failed to show journal_checksum option in /proc/mounts. Fix it.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: check for a good block group before loading buddy pages

commit 8a57d9d61a6e361c7bb159dda797672c1df1a691 upstream.

This adds a new field in ext4_group_info to cache the largest available
block range in a block group; and don't load the buddy pages until *after*
we've done a sanity check on the block group.

With large allocation requests (e.g., fallocate(), 8MiB) and relatively full
partitions, it's easy to have no block groups with a block extent large
enough to satisfy the input request length. This currently causes the loop
during cr == 0 in ext4_mb_regular_allocator() to load the buddy bitmap pages
for EVERY block group. That can be a lot of pages. The patch below allows
us to call ext4_mb_good_group() BEFORE we load the buddy pages (although we
have check again after we lock the block group).

Addresses-Google-Bug: #2578108
Addresses-Google-Bug: #2704453

Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: Prevent creation of files larger than RLIMIT_FSIZE using fallocate

commit 6d19c42b7cf81c39632b6d4dbc514e8449bcd346 upstream.

Currently using posix_fallocate one can bypass an RLIMIT_FSIZE limit
and create a file larger than the limit. Add a check for that.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Amit Arora <aarora@in.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: Remove extraneous newlines in ext4_msg() calls

commit fbe845ddf368f77f86aa7500f8fd2690f54c66a8 upstream.

Addresses-Google-Bug: #2562325

Signed-off-by: Curt Wohlgemuth <curtw@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: init statistics after journal recovery

commit 84061e07c5fbbbf9dc8aef8fb750fc3a2dfc31f3 upstream.

Currently block/inode/dir counters initialized before journal was
recovered. In fact after journal recovery this info will probably
change. And freeblocks it critical for correct delalloc mode
accounting.

https://bugzilla.kernel.org/show_bug.cgi?id=15768

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: clean up inode bitmaps manipulation in ext4_free_inode

commit d17413c08cd2b1dd2bf2cfdbb0f7b736b2b2b15c upstream.

- Reorganize locking scheme to batch two atomic operation in to one.
  This also allow us to state what healthy group must obey following rule
  ext4_free_inodes_count(sb, gdp) == ext4_count_free(inode_bitmap, NUM);
- Fix possible undefined pointer dereference.
- Even if group descriptor stats aren't accessible we have to update
  inode bitmaps.
- Move non-group members update out of group_lock.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: Do not zero out uninitialized extents beyond i_size

commit 21ca087a3891efab4d45488db8febee474d26c68 upstream.

The extents code will sometimes zero out blocks and mark them as
initialized instead of splitting an extent into several smaller ones.
This optimization however, causes problems if the extent is beyond
i_size because fsck will complain if there are uninitialized blocks
after i_size as this can not be distinguished from an inode that has
an incorrect i_size field.

https://bugzilla.kernel.org/show_bug.cgi?id=15742

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: don't scan/accumulate more pages than mballoc will allocate

commit c445e3e0a5c2804524dec6e55f66d63f6bc5bc3e upstream.

There was a bug reported on RHEL5 that a 10G dd on a 12G box
had a very, very slow sync after that.

At issue was the loop in write_cache_pages scanning all the way
to the end of the 10G file, even though the subsequent call
to mpage_da_submit_io would only actually write a smallish amt; then
we went back to the write_cache_pages loop ... wasting tons of time
in calling __mpage_da_writepage for thousands of pages we would
just revisit (many times) later.

Upstream it's not such a big issue for sys_sync because we get
to the loop with a much smaller nr_to_write, which limits the loop.

However, talking with Aneesh he realized that fsync upstream still
gets here with a very large nr_to_write and we face the same problem.

This patch makes mpage_add_bh_to_extent stop the loop after we've
accumulated 2048 pages, by setting mpd->io_done = 1; which ultimately
causes the write_cache_pages loop to break.

Repeating the test with a dirty_ratio of 80 (to leave something for
fsync to do), I don't see huge IO performance gains, but the reduction
in cpu usage is striking: 80% usage with stock, and 2% with the
below patch. Instrumenting the loop in write_cache_pages clearly
shows that we are wasting time here.

Eventually we need to change mpage_da_map_pages() also submit its I/O
to the block layer, subsuming mpage_da_submit_io(), and then change it
call ext4_get_blocks() multiple times.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: stop issuing discards if not supported by device

commit a30eec2a8650a77f754e84b2e15f062fe652baa7 upstream.

Turn off issuance of discard requests if the device does
not support it - similar to the action we take for barriers.
This will save a little computation time if a non-discardable
device is mounted with -o discard, and also makes it obvious
that it's not doing what was asked at mount time ...

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: don't return to userspace after freezing the fs with a mutex held

commit 6b0310fbf087ad6e9e3b8392adca97cd77184084 upstream.

ext4_freeze() used jbd2_journal_lock_updates() which takes
the j_barrier mutex, and then returns to userspace. The
kernel does not like this:

================================================
[ BUG: lock held when returning to user space! ]
------------------------------------------------
lvcreate/1075 is leaving the kernel with locks still held!
1 lock held by lvcreate/1075:
#0: (&journal->j_barrier){+.+...}, at: [<ffffffff811c6214>]
jbd2_journal_lock_updates+0xe1/0xf0

Use vfs_check_frozen() added to ext4_journal_start_sb() and
ext4_force_commit() instead.

Addresses-Red-Hat-Bugzilla: #568503

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: fix quota accounting in case of fallocate

commit 35121c9860316d7799cea0fbc359a9186e7c2747 upstream.

allocated_meta_data is already included in 'used' variable.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: allow defrag (EXT4_IOC_MOVE_EXT) in 32bit compat mode

commit b684b2ee9409f2890a8b3aea98525bbe5f84e276 upstream.

I have an x86_64 kernel with i386 userspace. e4defrag fails on the
EXT4_IOC_MOVE_EXT ioctl because it is not wired up for the compat
case. It seems that struct move_extent is compat save, only types
with fixed widths are used:
{
        __u32 reserved;         /* should be zero */
        __u32 donor_fd;         /* donor file descriptor */
        __u64 orig_start;       /* logical start offset in block for orig */
        __u64 donor_start;      /* logical start offset in block for donor */
        __u64 len;              /* block length to be moved */
        __u64 moved_len;        /* moved block length */
};

Lets just wire up EXT4_IOC_MOVE_EXT for the compat case.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
CC: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: rename ext4_mb_release_desc() to ext4_mb_unload_buddy()

commit e39e07fdfd98be8650385f12a7b81d6adc547510 upstream.

This function cleans up after ext4_mb_load_buddy(), so the renaming
makes the code clearer.

Signed-off-by: Jing Zhang <zj.barak@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: Remove unnecessary call to ext4_get_group_desc() in mballoc

commit 62e823a2cba18509ee826d775270e8ef9071b5bc upstream.

Signed-off-by: Jing Zhang <zj.barak@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: fix memory leaks in error path handling of ext4_ext_zeroout()

commit b720303df7352d4a7a1f61e467e0a124913c0d41 upstream.

When EIO occurs after bio is submitted, there is no memory free
operation for bio, which results in memory leakage. And there is also
no check against bio_alloc() for bio.

Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Jing Zhang <zj.barak@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ext4: check missed return value in ext4_sync_file()

commit 0671e704658b9f26f85e78d51176daa861f955c7 upstream.

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

ath5k: drop warning on jumbo frames

commit 9637e516d16a58b13f6098cfe899e22963132be3 upstream.

Jumbo frames are not supported, and if they are seen it is likely
a bogus frame so just silently discard them instead of warning on
them all time. Also, instead of dropping them immediately though
move the check *after* we check for all sort of frame errors. This
should enable us to discard these frames if the hardware picks
other bogus items first. Lets see if we still get those jumbo
counters increasing still with this.

Jumbo frames would happen if we tell hardware we can support
a small 802.11 chunks of DMA'd frame, hardware would split RX'd
frames into parts and we'd have to reconstruct them in software.
This is done with USB due to the bulk size but with ath5k we
already provide a good limit to hardware and this should not be
happening.

This is reported quite often and if it fills the logs then this
needs to be addressed and to avoid spurious reports.

Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

KEYS: Return more accurate error codes

commit 4d09ec0f705cf88a12add029c058b53f288cfaa2 upstream.

We were using the wrong variable here so the error codes weren't being returned
properly. The original code returns -ENOKEY.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

sctp: fix append error cause to ERROR chunk correctly

commit 2e3219b5c8a2e44e0b83ae6e04f52f20a82ac0f2 upstream.

commit 5fa782c2f5ef6c2e4f04d3e228412c9b4a4c8809
sctp: Fix skb_over_panic resulting from multiple invalid \
parameter errors (CVE-2010-1173) (v4)

cause 'error cause' never be add the the ERROR chunk due to
some typo when check valid length in sctp_init_cause_fixed().

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

Linux 2.6.34.7

Revert "USB delay init quirk for logitech Harmony 700-series devices"

This reverts commit 631b2d37894bb2a891d8897e1861362a23dde4d9.

It was found to cause a number of USB devices to not work properly
because we call usb_disable_autosuspend too soon. This is not an issue
with any other kernel version.

Reported-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Phil Dibowitz <phil@ipom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Linux 2.6.34.6

x86, apic: ack all pending irqs when crashed/on kexec

commit 8c3ba8d049247dc06b6dcee1711a11b26647aa44 upstream.

When the SMP kernel decides to crash_kexec() the local APICs may have
pending interrupts in their vector tables.

The setup routine for the local APIC has a deficient mechanism for
clearing these interrupts, it only handles interrupts that has already
been dispatched to the local core for servicing (the ISR register) safely,
it doesn't consider lower prioritized queued interrupts stored in the IRR
register.

If you have more than one pending interrupt within the same 32 bit word in
the LAPIC vector table registers you may find yourself entering the IO
APIC setup with pending interrupts left in the LAPIC.  This is a situation
for wich the IO APIC setup is not prepared.  Depending of what/which
interrupt vector/vectors are stuck in the APIC tables your system may show
various degrees of malfunctioning.  That was the reason why the
check_timer() failed in our system, the timer interrupts was blocked by
pending interrupts from the old kernel when routed trough the IO APIC.

Additional comment from Jiri Bohac:
==============
If this should go into stable release,
I'd add some kind of limit on the number of iterations, just to be safe from
hard to debug lock-ups:

+if (loops++  > MAX_LOOPS) {
+        printk("LAPIC pending clean-up")
+        break;
+}
while (queued);

with MAX_LOOPS something like 1E9 this would leave plenty of time for the
pending IRQs to be cleared and would and still cause at most a second of delay
if the loop were to lock-up for whatever reason.

[trenn@suse.de:

V2: Use tsc if avail to bail out after 1 sec due to possible virtual
    apic_read calls which may take rather long (suggested by: Avi Kivity
    <avi@redhat.com>) If no tsc is available bail out quickly after
    cpu_khz, if we broke out too early and still have irqs pending (which
    should never happen?) we still get a WARN_ON...

V3: - Fixed indentation -> checkpatch clean
    - max_loops must be signed

V4: - Fix typo, mixed up tsc and ntsc in first rdtscll() call

V5: Adjust WARN_ON() condition to also catch error in cpu_has_tsc case]

Cc: <jbohac@novell.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Kerstin Jonsson <kerstin.jonsson@ericsson.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
LKML-Reference: <201005241913.o4OJDGWM010865@imap1.linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Thomas Renninger <trenn@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

tracing: Fix timer tracing

commit ede1b4290781ae82ccf0f2ecc6dada8d3dd35779 upstream.

PowerTOP would like to be able to trace timers.

Unfortunately, the current timer tracing is not very useful: the
actual timer function is not recorded in the trace at the start
of timer execution.

Although this is recorded for timer "start" time (when it gets
armed), this is not useful; most timers get started early, and a
tracer like PowerTOP will never see this event, but will only
see the actual running of the timer.

This patch just adds the function to the timer tracing; I've
verified with PowerTOP that now it can get useful information
about timers.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: xiaoguangrong@cn.fujitsu.com
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4C6C5FA9.3000405@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: ftdi_sio: add product ID for Lenz LI-USB

commit ea233f805537f5da16c2b34d85b6c5cf88a0f9aa upstream.

Add ftdi product ID for Lenz LI-USB, a model train interface. This
was NOT tested against 2.6.35, but a similar patch was tested with the
CentOS 2.6.18-194.11.1.el5 kernel. It wasn't clear to me what
ordering is being used in ftdi_sio.c, so I inserted the ID after another
model train entry(SPROG_II).

Signed-off-by: Galen Seitz <galens@seitzassoc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: ftdi_sio: Add ID for Ionics PlugComputer

commit 666cc076d284e32d11bfc5ea2fbfc50434cff051 upstream.

Add the ID for the Ionics PlugComputer (<http://ionicsplug.com/>).

Signed-off-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: xhci: Remove buggy assignment in next_trb()

commit a1669b2c64a9c8b031e0ac5cbf2692337a577f7c upstream.

The code to increment the TRB pointer has a slight ambiguity that could
lead to a bug on different compilers. The ANSI C specification does not
specify the precedence of the assignment operator over the postfix
operator. gcc 4.4 produced the correct code (increment the pointer and
assign the value), but a MIPS compiler that one of John's clients used
assigned the old (unincremented) value.

Remove the unnecessary assignment to make all compilers produce the
correct assembly.

Signed-off-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: io_ti: check firmware version before updating

commit 0827a9ff2bbcbb03c33f1a6eb283fe051059482c upstream.

If we can't read the firmware for a device from the disk, and yet the
device already has a valid firmware image in it, we don't want to
replace the firmware with something invalid. So check the version
number to be less than the current one to verify this is the correct
thing to do.

Reported-by: Chris Beauchamp <chris@chillibean.tv>
Tested-by: Chris Beauchamp <chris@chillibean.tv>
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: ftdi_sio: fix endianess of max packet size

commit d1ab903d2552b2362339b19203c7f01c797cb316 upstream.

The USB max packet size (always little-endian) was not being byte
swapped on big-endian systems.

Applicable since [USB: ftdi_sio: fix hi-speed device packet size calculation] approx 2.6.31

Signed-off-by: Michael Wileczka <mikewileczka@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: CP210x Fix Break On/Off

commit 72916791cbeb9cc607ae620cfba207dea481cd76 upstream.

The definitions for BREAK_ON and BREAK_OFF are inverted, causing break
requests to fail. This patch sets BREAK_ON and BREAK_OFF to the correct
values.

Signed-off-by: Craig Shelley <craig@microtron.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: pl2303: New vendor and product id

commit f36ecd5de93e4c85a9e3d25100c6e233155b12e5 upstream.

Add support for the Zeagle N2iTiON3 dive computer interface. Since
Zeagle devices are actually manufactured by Seiko, this patch will
support other Seiko based models as well.

Signed-off-by: Jef Driesen <jefdriesen@telenet.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: add device IDs for igotu to navman

commit 0eee6a2b2a52e17066a572d30ad2805d3ebc7508 upstream.

I recently bought a i-gotU USB GPS, and whilst hunting around for linux
support discovered this post by you back in 2009:

http://kerneltrap.org/mailarchive/linux-usb/2009/3/12/5148644

>Try the navman driver instead. You can either add the device id to the
> driver and rebuild it, or do this before you plug the device in:
> modprobe navman
> echo -n "0x0df7 0x0900" > /sys/bus/usb-serial/drivers/navman/new_id
>
> and then plug your device in and see if that works.

I can confirm that the navman driver works with the right device IDs on
my i-gotU GT-600, which has the same device IDs. Attached is a patch
adding the IDs.

From: Ross Burton <ross@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: option: add Celot CT-650

commit 76078dc4fc389185fe467d33428f259ea9e69807 upstream.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

powerpc: Fix typo in uImage target

commit c686ecf5040d287a68d4fca7f1948472f556a6d3 upstream.

Commit e32e78c5ee8aadef020fbaecbe6fb741ed9029fd
(powerpc: fix build with make 3.82) introduced a
typo in uImage target and broke building uImage:

make: *** No rule to make target `uImage'. Stop.

Signed-off-by: Anatolij Gustschin <agust@denx.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm: stop information leak of old kernel stack.

commit b9f0aee83335db1f3915f4e42a5e21b351740afd upstream.

non-critical issue, CVE-2010-2803

Userspace controls the amount of memory to be allocate, so it can
get the ioctl to allocate more memory than the kernel uses, and get
access to kernel stack. This can only be done for processes authenticated
to the X server for DRI access, and if the user has DRI access.

Fix is to just memset the data to 0 if the user doesn't copy into
it in the first place.

Reported-by: Kees Cook <kees@ubuntu.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm/radeon/kms: fix GTT/VRAM overlapping test

commit 2cbeb4efc2b9739fe6019b613ae658bd2119a3eb upstream.

GTT/VRAM overlapping test had a typo which leaded to not
detecting case when vram_end > gtt_end. This patch fix the
logic and should fix #16574

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm/radeon/kms: fix sideport detection on newer rs880 boards

commit 4b80d954a7e54c13a5063af18d01719ad6a0daf3 upstream.

The meaning of ucMemoryType changed on recent boards, however,
ulBootUpSidePortClock should be set properly across all boards.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm/radeon/kms/DCE3+: switch pads to ddc mode when going i2c

commit 5786e2c5a3f519647c50bbc276e45d36a704415a upstream.

The pins for ddc and aux are shared so you need to switch the
mode when doing ddc. The ProcessAuxChannel table already sets
the pin mode to DP. This should fix unreliable ddc issues
on DP ports using non-DP monitors.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm/radeon/kms: fix typo in radeon_compute_pll_gain

commit 0537398b211b4f040564beec458e23571042d335 upstream.

Looks like this got copied from the ddx wrong.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

drm/radeon/kms: don't enable MSIs on AGP boards

commit da7be684c55dbaeebfc1a048d5faf52d52cb3c1f upstream.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=29327

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

netlink: fix compat recvmsg

commit 68d6ac6d2740b6a55f3ae92a4e0be6d881904b32 upstream.

Since
commit 1dacc76d0014a034b8aca14237c127d7c19d7726
Author: Johannes Berg <johannes@sipsolutions.net>
Date: Wed Jul 1 11:26:02 2009 +0000

net/compat/wext: send different messages to compat tasks

we had a race condition when setting and then
restoring frag_list. Eric attempted to fix it,
but the fix created even worse problems.

However, the original motivation I had when I
added the code that turned out to be racy is
no longer clear to me, since we only copy up
to skb->len to userspace, which doesn't include
the frag_list length. As a result, not doing
any frag_list clearing and restoring avoids
the race condition, while not introducing any
other problems.

Additionally, while preparing this patch I found
that since none of the remaining netlink code is
really aware of the frag_list, we need to use the
original skb's information for packet information
and credentials. This fixes, for example, the
group information received by compat tasks.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

ALSA: intel8x0: Mute External Amplifier by default for ThinkPad X31

commit 9c77b846ec8b4e0c7107dd7f820172462dc84a61 upstream.

BugLink: https://bugs.launchpad.net/bugs/619439
This ThinkPad model needs External Amplifier muted for audible playback,
so set the inv_eapd quirk for it.

Reported-and-tested-by: Dennis Bell <dennis.bell@parkerg.co.uk>
Signed-off-by: Daniel T Chen <crimsun@ubuntu.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

fixes for using make 3.82

commit 3c955b407a084810f57260d61548cc92c14bc627 upstream.

It doesn't like pattern and explicit rules to be on the same line,
and it seems to be more picky when matching file (or really directory)
names with different numbers of trailing slashes.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Andrew Benton <b3nton@gmail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

e1000e: don't check for alternate MAC addr on parts that don't support it

commit 1aef70ef125165e0114a8e475636eff242a52030 upstream.

From: Bruce Allan <bruce.w.allan@intel.com>

The alternate MAC address feature is only supported by 80003ES2LAN and
82571 LOMs as well as a couple 82571 mezzanine cards. Checking for an
alternate MAC address on other parts can fail leading to the driver not
able to load. This patch limits the check for an alternate MAC address
to be done only for parts that support the feature.

This issue has been around since support for the feature was introduced
to the e1000e driver in 2.6.34.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Reported-by: Fabio Varesano <fax8@users.sourceforge.net>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

e1000e: disable ASPM L1 on 82573

commit 19833b5dffe2f2e92a1b377f9aae9d5f32239512 upstream.

On the e1000-devel mailing list, Nils Faerber reported latency issues with
the 82573 LOM on a ThinkPad X60. It was found to be caused by ASPM L1;
disabling it resolves the latency. The issue is present in kernels back
to 2.6.34 and possibly 2.6.33.

Reported-by: Nils Faerber <nils.faerber@kernelconcepts.de>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

oprofile: add support for Intel processor model 30

commit a7c55cbee0c1bae9bf5a15a08300e91d88706e45 upstream.

Newer Intel processors identifying themselves as model 30 are not recognized by
oprofile.

<cpuinfo snippet>
model           : 30
model name      : Intel(R) Xeon(R) CPU           X3470  @ 2.93GHz
</cpuinfo snippet>

Running oprofile on these machines gives the following:
+ opcontrol --init
+ opcontrol --list-events
oprofile: available events for CPU type "Intel Architectural Perfmon"

See Intel 64 and IA-32 Architectures Software Developer's Manual
Volume 3B (Document 253669) Chapter 18 for architectural perfmon events
This is a limited set of fallback events because oprofile doesn't know your CPU
CPU_CLK_UNHALTED: (counter: all)
        Clock cycles when not halted (min count: 6000)
INST_RETIRED: (counter: all)
        number of instructions retired (min count: 6000)
LLC_MISSES: (counter: all)
        Last level cache demand requests from this core that missed the LLC
(min count: 6000)
        Unit masks (default 0x41)
        ----------
        0x41: No unit mask
LLC_REFS: (counter: all)
        Last level cache demand requests from this core (min count: 6000)
        Unit masks (default 0x4f)
        ----------
        0x4f: No unit mask
BR_MISS_PRED_RETIRED: (counter: all)
        number of mispredicted branches retired (precise) (min count: 500)
+ opcontrol --shutdown

Tested using oprofile 0.9.6.

Signed-off-by: Josh Hunt <johunt@akamai.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Oprofile: Change CPUIDS from decimal to hex, and add some comments

commit 45c34e05c4e3d36e7c44e790241ea11a1d90d54e upstream.

Back when the patch was submitted for "Add Xeon 7500 series support to
oprofile", Robert Richter had asked for a followon patch that
converted all the CPU ID values to hex.

I have done that here for the "i386/core_i7" and "i386/atom" class
processors in the ppro_init() function and also added some comments on
where to find documentation on the Intel processors.

Signed-off-by: John L. Villalovos <john.l.villalovos@intel.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

isdn: gigaset: add missing unlock

commit 7e27a0aeb98d53539bdc38384eee899d6db62617 upstream.

We should unlock here. This is the only place where we return from the
function with the lock held. The caller isn't expecting it.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

isdn/gigaset: reduce syslog spam

commit 7d060ed2877ff6d00e7238226edbaf91493d6d0b upstream.

Downgrade some error messages which occur frequently during
normal operation to debug messages.

Impact: logging
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

pcmcia: avoid buffer overflow in pcmcia_setup_isa_irq

commit 127c03cdbad9bd5af5d7f33bd31a1015a90cb77f upstream.

NR_IRQS may be as low as 16, causing a (harmless?) buffer overflow in
pcmcia_setup_isa_irq():

static u8 pcmcia_used_irq[NR_IRQS];

...

if ((try < 32) && pcmcia_used_irq[irq])
continue;

This is read-only, so if this address would be non-zero, it would just
mean we would not attempt an IRQ >= NR_IRQS -- which would fail anyway!
And as request_irq() fails for an irq >= NR_IRQS, the setting code path:

pcmcia_used_irq[irq]++;

is never reached as well.

Reported-by: Christoph Fritz <chf.fritz@googlemail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

vmscan: raise the bar to PAGEOUT_IO_SYNC stalls

commit e31f3698cd3499e676f6b0ea12e3528f569c4fa3 upstream.

Fix "system goes unresponsive under memory pressure and lots of
dirty/writeback pages" bug.

http://lkml.org/lkml/2010/4/4/86

In the above thread, Andreas Mohr described that

Invoking any command locked up for minutes (note that I'm
talking about attempted additional I/O to the _other_,
_unaffected_ main system HDD - such as loading some shell
binaries -, NOT the external SSD18M!!).

This happens when the two conditions are both meet:
- under memory pressure
- writing heavily to a slow device

OOM also happens in Andreas' system.  The OOM trace shows that 3 processes
are stuck in wait_on_page_writeback() in the direct reclaim path.  One in
do_fork() and the other two in unix_stream_sendmsg().  They are blocked on
this condition:

(sc->order && priority < DEF_PRIORITY - 2)

which was introduced in commit 78dc583d (vmscan: low order lumpy reclaim
also should use PAGEOUT_IO_SYNC) one year ago.  That condition may be too
permissive.  In Andreas' case, 512MB/1024 = 512KB.  If the direct reclaim
for the order-1 fork() allocation runs into a range of 512KB
hard-to-reclaim LRU pages, it will be stalled.

It's a severe problem in three ways.

Firstly, it can easily happen in daily desktop usage.  vmscan priority can
easily go below (DEF_PRIORITY - 2) on _local_ memory pressure.  Even if
the system has 50% globally reclaimable pages, it still has good
opportunity to have 0.1% sized hard-to-reclaim ranges.  For example, a
simple dd can easily create a big range (up to 20%) of dirty pages in the
LRU lists.  And order-1 to order-3 allocations are more than common with
SLUB.  Try "grep -v '1 :' /proc/slabinfo" to get the list of high order
slab caches.  For example, the order-1 radix_tree_node slab cache may
stall applications at swap-in time; the order-3 inode cache on most
filesystems may stall applications when trying to read some file; the
order-2 proc_inode_cache may stall applications when trying to open a
/proc file.

Secondly, once triggered, it will stall unrelated processes (not doing IO
at all) in the system.  This "one slow USB device stalls the whole system"
avalanching effect is very bad.

Thirdly, once stalled, the stall time could be intolerable long for the
users.  When there are 20MB queued writeback pages and USB 1.1 is writing
them in 1MB/s, wait_on_page_writeback() will stuck for up to 20 seconds.
Not to mention it may be called multiple times.

So raise the bar to only enable PAGEOUT_IO_SYNC when priority goes below
DEF_PRIORITY/3, or 6.25% LRU size.  As the default dirty throttle ratio is
20%, it will hardly be triggered by pure dirty pages.  We'd better treat
PAGEOUT_IO_SYNC as some last resort workaround -- its stall time is so
uncomfortably long (easily goes beyond 1s).

The bar is only raised for (order < PAGE_ALLOC_COSTLY_ORDER) allocations,
which are easy to satisfy in 1TB memory boxes.  So, although 6.25% of
memory could be an awful lot of pages to scan on a system with 1TB of
memory, it won't really have to busy scan that much.

Andreas tested an older version of this patch and reported that it mostly
fixed his problem.  Mel Gorman helped improve it and KOSAKI Motohiro will
fix it further in the next patch.

Reported-by: Andreas Mohr <andi@lisas.de>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

pkt_sched: Fix sch_sfq vs tc_modify_qdisc oops

[ Upstream commit 41065fba846e795b31b17e4dec01cb904d56c6cd ]

sch_sfq as a classful qdisc needs the .leaf handler. Otherwise, there
is an oops possible in tc_modify_qdisc()/check_loop().

Fixes commit 7d2681a6ff4f9ab5e48d02550b4c6338f1638998

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

pkt_sched: Fix sch_sfq vs tcf_bind_filter oops

[ Upstream commit eb4a5527b1f0d581ac217c80ef3278ed5e38693c ]

Since there was added ->tcf_chain() method without ->bind_tcf() to
sch_sfq class options, there is oops when a filter is added with
the classid parameter.

Fixes commit 7d2681a6ff4f9ab5e48d02550b4c6338f1638998
netdev thread: null pointer at cls_api.c

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Reported-by: Franchoze Eric <franchoze@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>