We can deadlock if we have a write oplock and two processes
use the same file handle. In this case the first process can't
unlock its lock if the second process has blocked on the lock
at the same time.
Fix it by using posix_lock_file rather than posix_lock_file_wait
under cinode->lock_mutex. If we request a blocking lock and
posix_lock_file indicates that there is another lock that prevents
us, wait until that lock is released and restart our call.
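A minimal sketch of the resulting retry loop (hedged: the surrounding cifs_setlk() context and error handling are assumed; only posix_lock_file() and the wait come from the description above):

	for (;;) {
		mutex_lock(&cinode->lock_mutex);
		rc = posix_lock_file(file, flock, NULL);
		mutex_unlock(&cinode->lock_mutex);
		if (rc != FILE_LOCK_DEFERRED)
			break;		/* lock granted, or hard failure */
		/* a conflicting lock blocks us: wait outside lock_mutex */
		rc = wait_event_interruptible(flock->fl_wait, !flock->fl_next);
		if (!rc)
			continue;	/* blocker went away: retry the lock */
		posix_unblock_lock(file, flock);
		break;
	}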
Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There has long been a limitation using software breakpoints with a
kernel compiled with CONFIG_DEBUG_RODATA going back to 2.6.26. For
this particular patch, it will apply cleanly and has been tested all
the way back to 2.6.36.
The kprobes code uses the text_poke() function which accommodates
writing a breakpoint into a read-only page. The x86 kgdb code can
solve the problem similarly by overriding the default breakpoint
set/remove routines and using text_poke() directly.
The x86 kgdb code will first attempt to use the traditional
probe_kernel_write(), and next try using the text_poke() function.
The breakpoint install method is tracked such that the correct
breakpoint removal routine will get called later on.
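A condensed sketch of the override (hedged: saving of the original instruction and the error paths are omitted; BP_POKE_BREAKPOINT is the tracking value implied by the description):

	static int kgdb_arch_set_breakpoint(struct kgdb_bpt *bpt)
	{
		/* first attempt the traditional path */
		if (!probe_kernel_write((char *)bpt->bpt_addr,
					arch_kgdb_ops.gdb_bpt_instr,
					BREAK_INSTR_SIZE)) {
			bpt->type = BP_BREAKPOINT;
			return 0;
		}
		/* read-only page: fall back to text_poke() */
		text_poke((void *)bpt->bpt_addr, arch_kgdb_ops.gdb_bpt_instr,
			  BREAK_INSTR_SIZE);
		bpt->type = BP_POKE_BREAKPOINT;	/* remembered for removal */
		return 0;
	}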
Cc: x86@kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Inspired-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The do_fork and sys_open tests have never worked properly on anything
other than a UP configuration with the kgdb test suite. This is
because the test suite did not fully implement the behavior of a real
debugger. A real debugger tracks the state of what thread it asked to
single step and can correctly continue other threads of execution or
conditionally stop while waiting for the original thread single step
request to return.
Below is a simple method to cause a fatal kernel oops with the kgdb
test suite on a 2 processor ARM system:
while [ 1 ] ; do ls > /dev/null 2> /dev/null; done&
while [ 1 ] ; do ls > /dev/null 2> /dev/null; done&
echo V1I1F100 > /sys/module/kgdbts/parameters/kgdbts
Very soon after starting the test the kernel will start warning with
messages like:
kgdbts: BP mismatch c002487c expected c0024878
------------[ cut here ]------------
WARNING: at drivers/misc/kgdbts.c:317 check_and_rewind_pc+0x9c/0xc4()
[<c01f6520>] (check_and_rewind_pc+0x9c/0xc4)
[<c01f595c>] (validate_simple_test+0x3c/0xc4)
[<c01f60d4>] (run_simple_test+0x1e8/0x274)
The kernel will eventually recover, but the test suite has completely
failed to test anything useful.
This patch implements behavior similar to a real debugger that does
not rely on hardware single stepping by using only software planted
breakpoints.
In order to mimic a real debugger, the kgdb test suite now tracks the
most recent thread that was continued (cont_thread_id), with the
intent to single step just this thread. When the response to the
single step request stops in a different thread that hit the original
break point that thread will now get continued, while the debugger
waits for the thread with the single step pending. Here is a high
level description of the sequence of events.
cont_instead_of_sstep = 0;
1) set breakpoint at do_fork
2) continue
3) Save the thread id where we stop to cont_thread_id
4) Remove breakpoint at do_fork
5) Reset the PC if needed depending on kernel exception type
6) soft single step
7) Check where we stopped
if current thread != cont_thread_id {
if (here for more than 2 times for the same thread) {
### must be a really busy system, start test again ###
goto step 1
}
goto step 5
} else {
cont_instead_of_sstep = 0;
}
8) clean up and run test again if needed
9) Clear out any threads that were waiting on a break point at the
point in time the test is ended with get_cont_catch(). This
happens sometimes because breakpoints are used in place of single
stepping and some threads could have been in the debugger exception
handling queue because breakpoints were hit concurrently on
different CPUs. This also means we wait at least one second before
unplumbing the debugger connection at the very end, so as to respond
to any debug threads waiting to be serviced.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The do_fork and sys_open tests have never worked properly on anything
other than a UP configuration with the kgdb test suite. This is
because the test suite did not fully implement the behavior of a real
debugger. A real debugger tracks the state of what thread it asked to
single step and can correctly continue other threads of execution or
conditionally stop while waiting for the original thread single step
request to return.
Below is a simple method to cause a fatal kernel oops with the kgdb
test suite on a 4 processor x86 system:
while [ 1 ] ; do ls > /dev/null 2> /dev/null; done&
while [ 1 ] ; do ls > /dev/null 2> /dev/null; done&
while [ 1 ] ; do ls > /dev/null 2> /dev/null; done&
while [ 1 ] ; do ls > /dev/null 2> /dev/null; done&
echo V1I1F1000 > /sys/module/kgdbts/parameters/kgdbts
Very soon after starting the test the kernel will oops with a message like:
The warning will turn into a hard kernel crash shortly after that,
because the pc will not get properly rewound to the right value
after hitting a breakpoint, leading to a hard lockup.
This change is broken up into 2 pieces because archs that have hw
single stepping (2.6.26 and up) need different changes than archs that
do not have hw single stepping (3.0 and up). This change implements
the correct behavior for an arch that supports hw single stepping.
A minor defect was also fixed where sys_open should have been
do_sys_open for the sys_open breakpoint test. This solves the problem
of running a 64-bit kernel with a 32-bit user space: sys_open() never
gets called from a 32-bit user space in the kgdb test suite, because
the 32-bit binaries invoke the compat_sys_open() call, leading to the
test never completing.
In order to mimic a real debugger, the kgdb test suite now tracks the
most recent thread that was continued (cont_thread_id), with the
intent to single step just this thread. When the response to the
single step request stops in a different thread that hit the original
break point that thread will now get continued, while the debugger
waits for the thread with the single step pending. Here is a high
level description of the sequence of events.
cont_instead_of_sstep = 0;
1) set breakpoint at do_fork
2) continue
3) Save the thread id where we stop to cont_thread_id
4) Remove breakpoint at do_fork
5) Reset the PC if needed depending on kernel exception type
6) if (cont_instead_of_sstep) { continue } else { single step }
7) Check where we stopped
if current thread != cont_thread_id {
cont_instead_of_sstep = 1;
goto step 5
} else {
cont_instead_of_sstep = 0;
}
8) clean up and run test again if needed
Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On x86 the kgdb test suite will oops when the kernel is compiled with
CONFIG_DEBUG_RODATA and you run the tests after boot time. This
regression has existed since 2.6.26, introduced by commit: b33cb815
(kgdbts: Use HW breakpoints with CONFIG_DEBUG_RODATA).
The test suite can use hw breakpoints for all the tests, but it has to
execute the hardware breakpoint specific tests first in order to
determine that the hw breakpoints actually work. Specifically the
very first test causes an oops:
# echo V1I1 > /sys/module/kgdbts/parameters/kgdbts
kgdb: Registered I/O driver kgdbts.
kgdbts:RUN plant and detach test
Entering kdb (current=0xffff880017aa9320, pid 1078) on processor 0 due to Keyboard Entry
[0]kdb> kgdbts: ERROR PUT: end of test buffer on 'plant_and_detach_test' line 1 expected OK got $E14#aa
WARNING: at drivers/misc/kgdbts.c:730 run_simple_test+0x151/0x2c0()
[...oops clipped...]
This commit re-orders the running of the tests and puts the RODATA
check into its own function so as to correctly avoid the kernel oops
by detecting and using the hw breakpoints.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is extra state information that needs to be exposed in the
kgdb_bpt structure for tracking how a breakpoint was installed. The
debug_core only uses probe_kernel_write() to install breakpoints, but
this is not enough for all the archs. Some archs, such as x86, need
to use text_poke() in order to install a breakpoint into a read-only
page.
Passing the kgdb_bpt structure to kgdb_arch_set_breakpoint() and
kgdb_arch_remove_breakpoint() allows other archs to set the type
variable which indicates how the breakpoint was installed.
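Sketch of the exposed state (hedged: field order and the exact enum names in include/linux/kgdb.h may differ):

	struct kgdb_bpt {
		unsigned long		bpt_addr;	/* breakpoint address */
		unsigned char		saved_instr[BREAK_INSTR_SIZE];
		enum kgdb_bpstate	state;		/* set / removed / ... */
		enum kgdb_bptype	type;		/* how it was installed */
	};

	extern int kgdb_arch_set_breakpoint(struct kgdb_bpt *bpt);
	extern int kgdb_arch_remove_breakpoint(struct kgdb_bpt *bpt);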
Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes a bug in target-core where unsupported WRITE_SAME ops
from a target_check_write_same_discard() failure were incorrectly
returning CHECK_CONDITION w/ TCM_INVALID_CDB_FIELD sense data.
This was causing some clients to not properly fall back, so go ahead
and use the correct TCM_UNSUPPORTED_SCSI_OPCODE sense for this case.
Reported-by: Martin Svec <martin.svec@zoner.cz> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With runtime PM, if the ethernet cable is disconnected, the device is
transitioned to D3 state to conserve energy. If the system is shutdown
in this state, any register accesses in rtl_shutdown are dropped on
the floor. As the device was programmed by .runtime_suspend() to wake
on link changes, it is thus brought back up as soon as the link recovers.
Resuming every suspended device through the driver core would slow things
down and it is not clear how many devices really need it now.
Original report and D0 transition patch by Sameer Nanda. The patch has
been changed to comply with advice from Rafael J. Wysocki and the PM folks.
Reported-by: Sameer Nanda <snanda@chromium.org> Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Hayes Wang <hayeswang@realtek.com> Cc: Alan Stern <stern@rowland.harvard.edu> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
i915_drm_thaw was not locking the mode_config lock when calling
drm_helper_resume_force_mode. When there were multiple wake sources,
this caused FDI training failure on SNB which in turn corrupted the
display.
Signed-off-by: Sean Paul <seanpaul@chromium.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
These bits are used to delay the frame start signal that is sent to
the display planes. Care must be taken to ensure that there are enough
lines during VBLANK to support this setting.
An instance of the BIOS leaving these bits set was found in the wild,
where it caused our modesetting to go all squiffy and skewiff.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47271 Reported-and-tested-by: Eva Wang <evawang@linpus.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43012 Reported-and-tested-by: Carl Richell <carl@system76.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mplayer -vo fbdev tries to create a screen that is twice as tall as the
allocated framebuffer for "doublebuffering". By default, and for all
in-tree users, only sufficient memory is allocated and mapped to satisfy
the smallest framebuffer, and the virtual size is no larger than the actual one.
For these users, we should therefore reject any userspace request to
create a screen that requires a buffer larger than the framebuffer
originally allocated.
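A sketch of the kind of check this implies, in the style of drm_fb_helper_check_var() (hedged: variable names assumed):

	/* reject virtual screens larger than the allocated framebuffer */
	if (var->xres_virtual > fb->width || var->yres_virtual > fb->height)
		return -EINVAL;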
References: https://bugs.freedesktop.org/show_bug.cgi?id=38138 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In https://bugzilla.redhat.com/show_bug.cgi?id=770207, slowdowns of driver
rtl8192ce are reported. One fix (commit a9b89e2) has already been applied,
and it helped, but the maximum RX speed would still drop to 1 Mbps. As in
the previous fix, the initial gain was determined to be the problem; however,
the problem arises from a setting of the gain when scans are started.
Driver rtl8192de also has the same code structure - this one is fixed as well.
Reported-and-Tested-by: Ivan Pesin <ivan.pesin@gmail.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It is possible that we will arm the tid_rx->reorder_timer after
del_timer_sync() in ___ieee80211_stop_rx_ba_session(). We need to stop
the timer after the RCU grace period finishes, so move it to
ieee80211_free_tid_rx(). The timer will not be armed again, as
rcu_dereference(sta->ampdu_mlme.tid_rx[tid]) will return NULL.
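Sketch of the resulting RCU callback (hedged: structure layout condensed from the description):

	static void ieee80211_free_tid_rx(struct rcu_head *h)
	{
		struct tid_ampdu_rx *tid_rx =
			container_of(h, struct tid_ampdu_rx, rcu_head);

		/* safe now: the RX path sees NULL via rcu_dereference(),
		 * so the reorder timer cannot be re-armed */
		del_timer_sync(&tid_rx->reorder_timer);
		kfree(tid_rx);
	}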
The debug object code detected the problem with the following warning:
ODEBUG: free active (active state 0) object type: timer_list hint: sta_rx_agg_reorder_timer_expired+0x0/0xf0 [mac80211]
Bug report (with all warning messages):
https://bugzilla.redhat.com/show_bug.cgi?id=804007
Reported-by: "jan p. springer" <jsd@igroup.org> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When reading the trace file, the records of each of the per_cpu buffers
are examined to find the next event to print out. At the point of looking
at the event, the size of the event is recorded. But if the first event is
chosen, the other events in the other CPU buffers will reset the event size
that is stored in the iterator descriptor, causing the event size passed to
the output functions to be incorrect.
In most cases this is not a problem, but for the case of stack traces, it
is. With the change to the stack tracing to record a dynamic number of
back traces, the output depends on the size of the entry instead of the
fixed 8 back traces. When the entry size is not correct, the back traces
would not be fully printed.
Note, reading from the per-cpu trace files was not affected.
Reported-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
irq_move_masked_irq() checks the return code of
chip->irq_set_affinity() only for 0, but IRQ_SET_MASK_OK_NOCOPY is
also a valid return code, which is there to avoid a redundant copy of
the cpumask. But in case of IRQ_SET_MASK_OK_NOCOPY we not only avoid
the redundant copy, we also fail to adjust the thread affinity of an
eventually threaded interrupt handler.
Handle IRQ_SET_MASK_OK (== 0) and IRQ_SET_MASK_OK_NOCOPY (== 1) return
values correctly by checking the valid return values separately.
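The fix, sketched against kernel/irq/migration.c: both return values are treated as success, and only the copy is skipped for NOCOPY:

	switch (chip->irq_set_affinity(&desc->irq_data,
				       desc->pending_mask, false)) {
	case IRQ_SET_MASK_OK:
		cpumask_copy(desc->irq_data.affinity, desc->pending_mask);
		/* fall through */
	case IRQ_SET_MASK_OK_NOCOPY:
		irq_set_thread_affinity(desc);
	}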
Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Jiang Liu <liuj97@gmail.com> Cc: Keping Chen <chenkeping@huawei.com> Link: http://lkml.kernel.org/r/1333120296-13563-2-git-send-email-jiang.liu@huawei.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This was lacking a comma between two strings that were supposed to be separate.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Michal Marek <mmarek@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64b3db22c04586997ab4be46dd5a5b99f8a2d390 (2.6.39),
"Remove use of unreliable FADT revision field", causes a regression
for old P4 systems because now cst_control and other fields are
not reset to 0.
The effect is that acpi_processor_power_init will notice
cst_control != 0 and a write to the CST_CNT register is performed
that should not happen. As a result, the system oopses after the
"No _CST, giving up" message, sometimes in acpi_ns_internalize_name,
sometimes in acpi_ns_get_type, usually at random places, possibly
during migration to CPU 1 in acpi_processor_get_throttling.
Each of these settings helps to avoid this problem:
- acpi=off
- processor.nocst=1
- maxcpus=1
The fix is to update acpi_gbl_FADT.header.length after
the original value is used to check for old revisions.
Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: Jonathan Nieder <jrnieder@gmail.com> Cc: Josh Boyer <jwboyer@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During testing of PCI root bus removal, I found that some root bus
bridges are not freed. If booting with pnpacpi=off, those host bridges
could be freed without problem. It turns out that some device
references are not released during acpi_pnp_match; the match function
should not hold a device reference after every call. Add a
put_device() call before returning.
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On Intel CPUs the processor typically uses the highest frequency
set by any logical CPU. When the system overheats
Linux first forces the frequency to the lowest available one
to lower the temperature.
However this was done only per logical CPU, which means all
logical CPUs in a package would need to go through this before
the frequency is actually lowered.
Worse, this delay actually prevents real throttling, because
the real throttle code only proceeds when the lowest frequency
is already reached.
So when a throttle event happens force the lowest frequency
for all CPUs in the package where it happened. The per CPU
state is now kept per package, not per logical CPU. An alternative
would be to do it per cpufreq unit, but since we want to bring
down the temperature of the complete chip it's better
to do it for all.
In principle it may even make sense to do it for all CPUs,
but I kept it on the package for now.
With this change the frequency is actually lowered, which
in turn also allows real throttling to proceed.
I also removed an unnecessary per cpu variable initialization.
v2: Fix package mapping
Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The writebufsize concept was introduced by commit
"0e4ca7e mtd: add writebufsize field to mtd_info struct" and it represents
the maximum amount of data the device writes to the media at a time. This is
an important parameter for UBIFS which is used during recovery and which
basically defines how big a corruption caused by a power cut can be.
Set writebufsize to 4 because this driver writes at most 4 bytes at a time.
The writebufsize concept was introduced by commit
"0e4ca7e mtd: add writebufsize field to mtd_info struct" and it represents
the maximum amount of data the device writes to the media at a time. This is
an important parameter for UBIFS which is used during recovery and which
basically defines how big a corruption caused by a power cut can be.
However, we forgot to set this parameter for block2mtd. Set it to PAGE_SIZE
because this is actually the amount of data we write at a time.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Acked-by: Joern Engel <joern@lazybastard.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The writebufsize concept was introduced by commit
"0e4ca7e mtd: add writebufsize field to mtd_info struct" and it represents
the maximum amount of data the device writes to the media at a time. This is
an important parameter for UBIFS which is used during recovery and which
basically defines how big a corruption caused by a power cut can be.
Set writebufsize to the flash page size because it is the maximum amount of
data it writes at a time.
This flag has meanwhile been moved from .options to .bbt_options, so
the code currently checks for something totally different
(NAND_OWN_BUFFERS) and decides according to that.
Artem Bityutskiy: the options were moved in a40f734 mtd: nand: consolidate redundant flash-based BBT flags
Artem Bityutskiy: CCing -stable
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Acked-by: Huang Shijie <b32955@freescale.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In commit "c797533 mtd: abstract last MTD partition parser argument" the
third argument of "mtd_device_parse_register()" changed from start address
of the MTD device to a pointer to a struct.
The "ixp4xx_flash_probe()" function was not converted properly, causing
an oops during boot.
This patch fixes the problem by filling the needed information into a
"struct mtd_part_parser_data" and passing it to
"mtd_device_parse_register()".
Signed-off-by: Marc Kleine-Budde <mkl@blackshift.org> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The merge done in commit b26e478f undid the bug fix of commit c3e072f8
("net: fsl_pq_mdio: fix non tbi phy access"), with the result that
non-TBI (e.g. MDIO) PHYs cannot be accessed.
Signed-off-by: Kenth Eriksson <kenth.eriksson@transmode.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make CDC EEM recalculate the hard_mtu after adjusting the
hard_header_len.
Without this, usbnet adjusts the MTU down to 1494 bytes, and the host is
unable to receive standard 1500-byte frames from the device.
Tested with the Linux USB Ethernet gadget.
Cc: Oliver Neukum <oliver@neukum.name> Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If both addresses are equal, nothing needs to be done. If the device is
down, then we simply copy the new address to dev->dev_addr. If the device is up,
then we add another loopback device with the new address, and if that does
not fail, we remove the loopback device with the old address. And only
then, we update the dev->dev_addr.
Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch corrects a bug in function sky2_open() of the Marvell Yukon 2 driver
in which the settings for PHY quick link are overwritten.
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Acked-by: Stephen Hemminger <shemminger@vyattta.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If port 0 of a 5717 serdes device powers down, it hides the phy from
port 1. This patch works around the problem by keeping port 0's phy
powered up.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When K >= 0xFFFF0000, AND needs the two least significant bytes of K as
its operand, but EMIT2() gives it the least significant byte of K and
0x2. EMIT() should be used here to replace EMIT2().
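Illustration with the x86 BPF JIT emit macros (EMIT(bytes, len) writes len bytes of the value, while EMIT2(b1, b2) writes the two single bytes b1 and b2):

	EMIT2(0x66, 0x25);	/* 'and ax, imm16' opcode with size prefix */
	EMIT2(K, 2);		/* BUG: emits (K & 0xff) then the byte 0x02 */
	EMIT(K, 2);		/* FIX: emits the two low bytes of K */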
Signed-off-by: Feiran Zhuang <zhuangfeiran@ict.ac.cn> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since 3.2.12 and 3.3, some systems are failing to boot with a BUG_ON.
Some other systems using the pata_jmicron driver fail to boot because no
disks are detected. Passing pcie_aspm=force on the kernel command line
works around it.
The cause: commit 4949be16822e ("PCI: ignore pre-1.1 ASPM quirking when
ASPM is disabled") changed the behaviour of pcie_aspm_sanity_check() to
always return 0 if aspm is disabled, in order to avoid cases where we
changed ASPM state on pre-PCIe 1.1 devices.
This skipped the secondary function of pcie_aspm_sanity_check which was
to avoid us enabling ASPM on devices that had non-PCIe children, causing
trouble later on. Move the aspm_disabled check so we continue to honour
that scenario.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=42979 and
http://bugs.debian.org/665420
Reported-by: Romain Francoise <romain@orebokech.com> # kernel panic Reported-by: Chris Holland <bandidoirlandes@gmail.com> # disk detection trouble Signed-off-by: Matthew Garrett <mjg@redhat.com> Tested-by: Hatem Masmoudi <hatem.masmoudi@gmail.com> # Dell Latitude E5520 Tested-by: janek <jan0x6c@gmail.com> # pata_jmicron with JMB362/JMB363
[jn: with more symptoms in log message] Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When DMA is enabled, sh-sci transfer begins with
uart_start()
sci_start_tx()
if (cookie_tx < 0) schedule_work()
Then, starts DMA when wq scheduled, -- (A)
process_one_work()
work_fn_rx()
cookie_tx = desc->submit_tx()
And finishes when DMA transfer ends, -- (B)
sci_dma_tx_complete()
async_tx_ack()
cookie_tx = -EINVAL
(possible another schedule_work())
This A to B sequence is not reentrant, since the controlling variables
(for example, cookie_tx above) are not queues nor lists. So, they
must be invoked as A B A B...; otherwise it results in a kernel crash.
To ensure the sequence, sci_start_tx() tests whether cookie_tx < 0
(which represents "not used") before calling schedule_work().
But cookie_tx will not be set (to a cookie, which also means "used")
until the middle of the work queue function work_fn_tx().
This gap between the test and set allows the breakage of the sequence
under very frequent calls of uart_start().
Another gap between async_tx_ack() and another schedule_work() results
in the same issue, too.
This patch introduces a new condition "cookie_tx == 0" just to mark
it as "busy", and assigns it within a spin-locked region to fill the gaps.
There is no point in passing a zero-length string here, and quite a
few of the cache_parse() implementations will oops if count is
zero.
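The guard this implies, sketched (hedged: the exact location in the sunrpc cache write path is assumed):

	if (count == 0)
		return -EINVAL;	/* never hand a zero-length buffer to cache_parse() */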
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Richard Weinberger noticed that on some RTC hardware that
doesn't support UIE mode, due to coarse granular alarms
(like 1minute resolution), the current virtualized RTC
support doesn't properly error out when UIE is enabled.
Instead the current code queues an alarm for the next second,
but it won't fire until up to a minute later.
This patch provides a generic way to flag this sort of hardware
and fixes the issue on the mpc5121 where Richard noticed the
problem.
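Sketch of the generic flag and its use (the uie_unsupported name follows the description; the surrounding code is condensed):

	/* in the affected RTC driver's probe(): */
	rtc->uie_unsupported = 1;	/* alarms have 1-minute resolution */

	/* and in the generic rtc_update_irq_enable(): */
	if (rtc->uie_unsupported)
		return -EINVAL;		/* error out instead of a late alarm */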
Reported-by: Richard Weinberger <richard@nod.at> Tested-by: Richard Weinberger <richard@nod.at> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
<asm-generic/unistd.h> was set up to use sys_sendfile() for the 32-bit
compat API instead of sys_sendfile64(), but in fact the right thing to
do is to use sys_sendfile64() in all cases. The 32-bit sendfile64() API
in glibc uses the sendfile64 syscall, so it has to be capable of doing
full 64-bit operations. But the sys_sendfile() kernel implementation
has a MAX_NON_LFS test in it which explicitly limits the offset to 2^32.
So, we need to use the sys_sendfile64() implementation in the kernel
for this case.
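Sketch of the resulting wiring in <asm-generic/unistd.h> (hedged: the syscall number and wrapper macros are illustrative):

	/* before, a __SC_3264()-style wrapper picked sys_sendfile for
	 * 32-bit kernels; now the slot points at sys_sendfile64 always */
	#define __NR3264_sendfile 71
	__SYSCALL(__NR3264_sendfile, sys_sendfile64)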
While running the latest Linux as guest under VMware in highly
over-committed situations, we have seen cases when the refined TSC
algorithm fails to get a valid tsc_start value in
tsc_refine_calibration_work from multiple attempts. As a result the
kernel keeps on scheduling the tsc_irqwork task for later. Subsequently
after several attempts when it gets a valid start value it goes through
the refined calibration and either bails out or uses the new results.
Given that the kernel originally read the TSC frequency from the
platform, which is the best it can get, I don't think there is much
value in refining it.
So for systems which get the TSC frequency from the platform we
should skip the refined tsc algorithm.
We can use the TSC_RELIABLE cpu cap flag to detect this; right now it is
set only on VMware and for Moorestown Penwell, both of which have their
own TSC calibration methods.
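Sketch of the resulting early-out (the flag is real; the surrounding tsc calibration code is condensed):

	if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE)) {
		/* frequency came from the platform: skip refinement */
		return;
	}
	schedule_delayed_work(&tsc_irqwork, 0);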
Signed-off-by: Alok N Kataria <akataria@vmware.com> Cc: John Stultz <johnstul@us.ibm.com> Cc: Dirk Brandewie <dirk.brandewie@gmail.com> Cc: Alan Cox <alan@linux.intel.com>
[jstultz: Reworked to simply not schedule the refining work,
rather then scheduling the work and bombing out later] Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We call the wrong replay notify function when we use ESN replay
handling. This leads to the fact that we don't send notifications
if we use ESN. Fix this by calling the registered callbacks instead
of xfrm_replay_notify().
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some BIOS's don't setup power management correctly (what else is
new) and don't allow use of PCI Express power control. Add a special
exception module parameter to allow working around this issue.
Based on slightly different patch by Knut Petersen.
Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
no socket layer outputs a message for this error and neither should rds.
Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
napi->skb is allocated in napi_get_frags() using
netdev_alloc_skb_ip_align(), with a reserve of NET_SKB_PAD +
NET_IP_ALIGN bytes.
However, when such skb is recycled in napi_reuse_skb(), it ends with a
reserve of NET_IP_ALIGN which is suboptimal.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit f2c31e32b378 (net: fix NULL dereferences in check_peer_redir() )
added a regression in rt6_fill_node(), leading to rcu_read_lock()
imbalance.
That's because NLA_PUT() can make a jump to the nla_put_failure label.
Fix this by using nla_put().
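Sketch of the difference (hedged: attribute and field names condensed from rt6_fill_node()):

	rcu_read_lock();
	n = dst_get_neighbour_noref(&rt->dst);
	if (n) {
		if (nla_put(skb, RTA_GATEWAY, 16, &n->primary_key) < 0) {
			rcu_read_unlock();	/* explicit unlock... */
			goto nla_put_failure;	/* ...before bailing out */
		}
	}
	rcu_read_unlock();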
Many thanks to Ben Greear for his help
Reported-by: Ben Greear <greearb@candelatech.com> Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thanks to Indan Zupancic for bringing this issue back up.
Reported-by: Matt Evans <matt@ozlabs.org> Reported-by: Indan Zupancic <indan@nul.nu> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit 299b0767 ("ipv6: Fix IPsec slowpath fragmentation problem"),
in ip6_append_data(), after the call to skb_put(skb, fraglen + dst_exthdrlen),
skb->len includes dst_exthdrlen, and we never subtract dst_exthdrlen
afterwards. This makes fraggap > 0 in the next iteration of the while
loop and causes the size of the skb to be incorrect.
Fix this by reserving headroom for dst_exthdrlen instead.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While testing L2TP functionality, I came across a bug in getsockname(). The
IP address returned within the pppol2tp_addr's addr member was not being
set to the IP address in use. This bug is caused by using inet_sk() on the
wrong socket (the L2TP socket rather than the underlying UDP socket), and was
likely introduced during the addition of L2TPv3 support.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Looking at hibernate overwriting I thought it looked like a cursor,
so I tracked down this missing piece to stop the cursor blink
timer. I've no idea if this is sufficient to fix the hibernate
problems people are seeing, but please test it.
Both radeon and nouveau have done this for a long time.
I've run this personally all night hib/resume cycles with no fails.
Reviewed-by: Keith Packard <keithp@keithp.com> Reported-by: Petr Tesarik <kernel@tesarici.cz> Reported-by: Stanislaw Gruszka <sgruszka@redhat.com> Reported-by: Lots of misc segfaults after hibernate across the world.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=37142 Tested-by: Dave Airlie <airlied@redhat.com> Tested-by: Bojan Smojver <bojan@rexursive.com> Tested-by: Andreas Hartmann <andihartmann@01019freenet.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For high-speed/super-speed isochronous endpoints, the bInterval
value is used as an exponent, 2^(bInterval-1). Luckily we have the
usb_fill_int_urb() function that handles it correctly. So we just
call this function to fill in the RX URB.
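Sketch of the call (hedged: device and buffer names are illustrative):

	/* usb_fill_int_urb() converts bInterval into the 2^(bInterval-1)
	 * exponent form for high-/super-speed devices internally */
	usb_fill_int_urb(urb, usbdev, pipe, buf, buf_len,
			 rx_complete, priv, ep->desc.bInterval);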
Firstly, task->tk_status will always return negative error values,
so the current tests for 'NFS4ERR_DELEG_REVOKED' etc. are all being
ignored.
Secondly, clean up the code so that we only need to test
task->tk_status once!
We can currently loop forever in nfs4_lookup_root() and in
nfs41_proc_secinfo_no_name(), if the first iteration returns a
NFS4ERR_DELAY or something else that causes exception.retry to get
set.
sysfs_slab_add() calls various sysfs functions that actually may
end up in userspace doing all sorts of things.
Release the slub_lock after adding the kmem_cache structure to the list.
At that point the address of the kmem_cache is not known, so we are
guaranteed exclusive access to the following modifications to the
kmem_cache structure.
If the sysfs_slab_add fails then reacquire the slub_lock to
remove the kmem_cache structure from the list.
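Condensed sketch of the reordering in mm/slub.c (error path details assumed):

	list_add(&s->list, &slab_caches);
	up_write(&slub_lock);			/* drop before sysfs work */

	if (sysfs_slab_add(s)) {
		down_write(&slub_lock);		/* reacquire to undo */
		list_del(&s->list);
		up_write(&slub_lock);
		goto err;
	}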
Reported-by: Sasha Levin <levinsasha928@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When an IO error happens during inode deletion run from
xlog_recover_process_iunlinks(), the filesystem gets shut down. Thus any
subsequent attempt to read buffers fails. The code in
xlog_recover_process_iunlinks() does not account for the fact that a read
of a buffer which was read a while ago can really fail, which results in
an oops on
agi = XFS_BUF_TO_AGI(agibp);
Fix the problem by cleaning up the buffer handling in
xlog_recover_process_iunlinks() as suggested by Dave Chinner. We release buffer
lock but keep buffer reference to AG buffer. That is enough for buffer to stay
pinned in memory and we don't have to call xfs_read_agi() all the time.
Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Avoid using the bi_next field for the holder of a cell when deferring
bios because a stacked device below might change it. Store the
holder in a new field in struct cell instead.
When a cell is created, the bio that triggered creation (the holder) was
added to the same bio list as subsequent bios. In some cases we pass
this holder bio directly to devices underneath. If those devices use
the bi_next field there will be trouble...
This also simplifies some code that had to work out which bio was the
holder.
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we remove an entry from a node we sometimes rebalance it with its
two neighbours. This wasn't being done correctly; in some cases
entries have to move all the way from the right neighbour to the left
neighbour, or vice versa. This patch pretty much re-writes the
balancing code to fix it.
This code is barely used currently; only when you delete a thin
device, and then only if you have hundreds of them in the same pool.
Once we have discard support, which removes mappings, this will be used
much more heavily.
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Always set io->error to -EIO when an error is detected in dm-crypt.
There were cases where an error code would be set only if we finish
processing the last sector. If there were other encryption operations in
flight, the error would be ignored and bio would be returned with
success as if no error happened.
This bug is present in kcryptd_crypt_write_convert, kcryptd_crypt_read_convert
and kcryptd_async_done.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes a possible deadlock in dm-crypt's mempool use.
Currently, dm-crypt reserves a mempool of MIN_BIO_PAGES reserved pages.
It allocates the first MIN_BIO_PAGES with a non-failing allocation (the
allocation cannot fail and waits until the mempool is refilled). Further
pages are allocated with different gfp flags that allow failing.
Because allocations may be done in parallel, this code can deadlock. Example:
There are two processes, each tries to allocate MIN_BIO_PAGES and the processes
run simultaneously.
It may end up in a situation where each process allocates (MIN_BIO_PAGES / 2)
pages. The mempool is exhausted. Each process waits for more pages to be freed
to the mempool, which never happens.
To avoid this deadlock scenario, this patch changes the code so that only
the first page is allocated with non-failing gfp mask. Allocation of further
pages may fail.
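Sketch of the allocation loop after the change (hedged: names condensed from drivers/md/dm-crypt.c):

	for (i = 0; i < nr_iovecs; i++) {
		/* only the very first page may sleep on the mempool */
		page = mempool_alloc(cc->page_pool,
				     i ? gfp_mask & ~__GFP_WAIT
				       : gfp_mask | __GFP_WAIT);
		if (!page)
			break;	/* caller proceeds with a partial bio */
		/* ... attach page to the clone bio ... */
	}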
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Unbanked GPIO IRQ handling code made a copy of just
the irq_chip structure for GPIO IRQ lines which caused
problems after the generic IRQ chip conversion because
there was no valid irq_chip_type structure with the
right "regs" populated. irq_gc_mask_set_bit() was
therefore accessing random addresses.
Fix it by making a copy of irq_chip_type structure
instead. This will ensure sane register offsets.
Reported-by: Jon Povey <Jon.Povey@racelogic.co.uk> Tested-by: Jon Povey <Jon.Povey@racelogic.co.uk> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This function should be capable of both enabling and disabling interrupts
based upon the *enable* parameter. Right now the function only enables
the interrupt and *enable* is not used at all. So add the interrupt
disable capability also using the parameter.
Signed-off-by: Tarun Kanti DebBarma <tarun.kanti@ti.com> Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Acked-by: Felipe Balbi <balbi@ti.com> Reviewed-by: Kevin Hilman <khilman@ti.com> Signed-off-by: Kevin Hilman <khilman@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
udf_release_file() can be called from munmap() path with mmap_sem held. Thus
we cannot take i_mutex there because that ranks above mmap_sem. Luckily,
i_mutex is not needed in udf_release_file() anymore since protection by
i_data_sem is enough to protect from races with write and truncate.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Reviewed-by: Namjae Jeon <linkinjeon@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The ARM IP revisions in Tegra are:
Tegra20: CPU r1p1, PL310 r2p0
Tegra30: CPU A01=r2p7/>=A02=r2p9, NEON r2p3-50, PL310 r3p1-50
Based on work by Olof Johansson, although the actual list of errata is
somewhat different here, since I added a bunch more and removed one PL310
erratum that doesn't seem applicable.
Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In d_materialise_unique() there are 3 subcases to the 'aliased dentry'
case; in two subcases the inode i_lock is properly released but this
does not occur in the -ELOOP subcase.
This seems to have been introduced by commit 1836750115f2 ("fix loop
checks in d_materialise_unique()").
Signed-off-by: Michel Lespinasse <walken@google.com>
[ Added a comment, and moved the unlock to where we generate the -ELOOP,
which seems to be more natural.
You probably can't actually trigger this without a buggy network file
server - d_materialize_unique() is for finding aliases on non-local
filesystems, and the d_ancestor() case is for a hardlinked directory
loop.
But we should be robust in the case of such buggy servers anyway. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Explicitly test for an extent whose length is zero, and flag that as a
corrupted extent.
This avoids a kernel BUG_ON assertion failure.
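Sketch of the added check (hedged: condensed from the fs/ext4/extents.c validity helper):

	static int ext4_valid_extent(struct inode *inode,
				     struct ext4_extent *ext)
	{
		ext4_fsblk_t block = ext4_ext_pblock(ext);
		int len = ext4_ext_get_actual_len(ext);

		if (len == 0)	/* zero-length extent == corruption */
			return 0;
		return ext4_data_block_valid(EXT4_SB(inode->i_sb), block, len);
	}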
Tested: Without this patch, the file system image found in
tests/f_ext_zero_len/image.gz in the latest e2fsprogs sources causes a
kernel panic. With this patch, an ext4 file system error is noted
instead, and the file system is marked as being corrupted.
This is using the aio-stress program from the xfstests test suite.
That particular command line tells aio-stress to do random writes to
20 files from 20 threads (one thread per file). The files are NOT
preallocated, so you will get writes to random offsets within the
file, thus creating holes and extending i_size. It also opens the
file with O_DIRECT and O_SYNC.
On to the problem. When an I/O requires unwritten extent conversion,
it is queued onto the completed_io_list for the ext4 inode. Two code
paths will pull work items from this list. The first is the
ext4_end_io_work routine, and the second is ext4_flush_completed_IO,
which is called via the fsync path (and O_SYNC handling, as well).
There are two issues I've found in these code paths. First, if the
fsync path beats the work routine to a particular I/O, the work
routine will free the io_end structure! It does not take into account
the fact that the io_end may still be in use by the fsync path. I've
fixed this issue by adding yet another IO_END flag, indicating that
the io_end is being processed by the fsync path.
The second problem is that the work routine will make an assignment to
io->flag outside of the lock. I have witnessed this result in a hang
at umount. Moving the flag setting inside the lock resolved that
problem.
The problem was introduced by commit b82e384c7b ("ext4: optimize
locking for end_io extent conversion"), which first appeared in 3.2.
As such, the fix should be backported to that release (probably along
with the unwritten extent conversion race fix).
The following comment in ext4_end_io_dio caught my attention:
/* XXX: probably should move into the real I/O completion handler */
inode_dio_done(inode);
The truncate code takes i_mutex, then calls inode_dio_wait. Because the
ext4 code path above will end up dropping the mutex before it is
reacquired by the worker thread that does the extent conversion, it
seems to me that the truncate can happen out of order. Jan Kara
mentioned that this might result in error messages in the system logs,
but that should be the extent of the "damage."
The fix is pretty straight-forward: don't call inode_dio_done until the
extent conversion is complete.
Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ext4 does not support data journalling with delayed allocation enabled.
We do not even allow mounting the file system with delayed allocation
and data journalling enabled; however, the flag can be set via
FS_IOC_SETFLAGS, so we can hit an inode with EXT4_INODE_JOURNAL_DATA set
even on a file system mounted with delayed allocation (the default), and
that's where the problem arises. The easiest way to reproduce this
problem is with the following set of commands:
Additionally it can be reproduced quite reliably with xfstests 272 and
269. In fact the above reproducer is a part of test 272.
To fix this we should ignore the EXT4_INODE_JOURNAL_DATA inode flag if
the file system is mounted with delayed allocation. This can be easily
done by fixing the ext4_should_*_data() functions to ignore the data
journal flag when delalloc is set (suggested by Ted). We also have to
set the appropriate address space operations for the inode (again,
ignoring the data journal flag if delalloc is enabled).
Additionally this commit introduces the ext4_inode_journal_mode()
function, because the ext4_should_*_data() functions already shared a
lot of common code; this change puts it all into one function so it is
easier to read.
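A condensed sketch of the helper (hedged: the real function in fs/ext4 also covers corner cases, and the mode names may differ):

	static inline int ext4_inode_journal_mode(struct inode *inode)
	{
		if (EXT4_JOURNAL(inode) == NULL)
			return EXT4_INODE_WRITEBACK_DATA_MODE;
		if (!S_ISREG(inode->i_mode) ||
		    test_opt(inode->i_sb, DATA_FLAGS) == EXT4_MOUNT_JOURNAL_DATA)
			return EXT4_INODE_JOURNAL_DATA_MODE;
		/* the key change: honour the inode flag only without delalloc */
		if (ext4_test_inode_flag(inode, EXT4_INODE_JOURNAL_DATA) &&
		    !test_opt(inode->i_sb, DELALLOC))
			return EXT4_INODE_JOURNAL_DATA_MODE;
		if (test_opt(inode->i_sb, DATA_FLAGS) == EXT4_MOUNT_ORDERED_DATA)
			return EXT4_INODE_ORDERED_DATA_MODE;
		return EXT4_INODE_WRITEBACK_DATA_MODE;
	}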
Successfully tested with xfstests in the following configurations:
journal_unmap_buffer()'s zap_buffer: code clears a lot of buffer head
state ala discard_buffer(), but does not touch _Delay or _Unwritten as
discard_buffer() does.
This can be problematic in some areas of the ext4 code which assume
that if they have found a buffer marked unwritten or delay, then it's
a live one. Perhaps those spots should check whether it is mapped
as well, but if jbd2 is going to tear down a buffer, let's really
tear it down completely.
Without this I get some fsx failures on sub-page-block filesystems
up until v3.2, at which point 4e96b2dbbf1d7e81f22047a50f862555a6cb87cb
and 189e868fa8fdca702eb9db9d8afc46b5cb9144c9 make the failures go
away, because buried within that large change is some more flag
clearing. I still think it's worth doing in jbd2, since
->invalidatepage leads here directly, and it's the right place
to clear away these flags.
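Sketch of the extra clearing at the zap_buffer: label (hedged: the pre-existing clears are abbreviated):

	zap_buffer:
		clear_buffer_dirty(bh);
		clear_buffer_mapped(bh);
		clear_buffer_new(bh);
		clear_buffer_delay(bh);		/* previously left set */
		clear_buffer_unwritten(bh);	/* previously left set */
		bh->b_bdev = NULL;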
If create_basic_memory_bitmaps() fails, usermodehelpers are not re-enabled
before returning. Fix this. And while at it, reword the goto labels so that
they look more meaningful.
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The D1F5 revision of the WinTV HVR-1900 uses a tda18271c2 tuner
instead of a tda18271c1 tuner as used in revision D1E9. To
account for this, we must hardcode the frontend configuration
to use the same IF frequency configuration for both revisions
of the device.
6MHz DVB-T is unaffected by this issue, as the recommended
IF Frequency configuration for 6MHz DVB-T is the same on both
c1 and c2 revisions of the tda18271 tuner.
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Cc: Mike Isely <isely@pobox.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The error handling in lgdt3303_read_status() and lgdt330x_read_ucblocks()
doesn't work, because i2c_read_demod_bytes() returns a u8 and (err < 0)
is always false.
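The shape of the bug and fix, sketched (hedged: buffer names are illustrative, and the helper is assumed to be converted to return a negative errno):

	/* before: */
	u8 err = i2c_read_demod_bytes(state, buf, sizeof(buf));
	if (err < 0)	/* always false: a u8 is never negative */
		goto fail;

	/* after: a signed type lets a negative errno survive */
	int err = i2c_read_demod_bytes(state, buf, sizeof(buf));
	if (err < 0)
		goto fail;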
Protect code accessing ctl_table by grabbing the header with grab_header()
and releasing it afterwards with sysctl_head_finish(). This is needed if
poll() is called on entries created by modules: currently only hostname and
domainname support poll(), but this bug may be triggered when/if modules
use it, and if a user calls poll() on a file that doesn't support it.
Dave Jones reported the following when using a syscall fuzzer while
hibernating/resuming:
If an entry goes away while we are polling() it, ctl_table may not exist
anymore.
Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix the following section warning in drivers/iommu/amd_iommu.c :
WARNING: vmlinux.o(.text+0x526e77): Section mismatch in reference from the function prealloc_protection_domains() to the function .init.text:alloc_passthrough_domain()
The function prealloc_protection_domains() references
the function __init alloc_passthrough_domain().
This is often because prealloc_protection_domains lacks a __init
annotation or the annotation of alloc_passthrough_domain is wrong.
The namespace cleanup path leaks a dentry which holds a reference count
on a network namespace, keeping that network namespace from being freed
when the last user goes away and leaving things like vlan devices in the
leaked network namespace.
If you use ip netns add for much real work this problem becomes apparent
pretty quickly. In light testing the problem hides, because frequently
you simply don't notice the leak.
Use d_set_d_op() so that DCACHE_OP_* flags are set correctly.
This issue exists back to 3.0.
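Sketch of the one-line change (hedged: the operations structure name is illustrative):

	/* before: DCACHE_OP_* flags stay unset, so the d_ops that drop
	 * the netns reference are never invoked */
	dentry->d_op = &ns_dentry_operations;

	/* after: */
	d_set_d_op(dentry, &ns_dentry_operations);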
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Reported-by: Justin Pettit <jpettit@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The problem occurs on !CONFIG_VM86 kernels [1] when a kernel-mode task
returns from a system call with a pending signal.
A real-life scenario is a child of 'khelper' returning from a failed
kernel_execve() in ____call_usermodehelper() [ kernel/kmod.c ].
kernel_execve() fails due to a pending SIGKILL, which is the result of
"kill -9 -1" (at least, busybox's init does it upon reboot).
The loop is as follows:
* syscall_exit_work:
- work_pending: // start_of_the_loop
- work_notify_sig:
- do_notify_resume()
- do_signal()
- if (!user_mode(regs)) return;
- resume_userspace // TIF_SIGPENDING is still set
- work_pending // so we call work_pending => goto
// start_of_the_loop
More information can be found in another LKML thread:
http://www.serverphorums.com/read.php?12,457826
[1] the problem was also seen on MIPS.
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Link: http://lkml.kernel.org/r/1332448765.2299.68.camel@dimm Cc: Oleg Nesterov <oleg@redhat.com> Cc: Roland McGrath <roland@hack.frob.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Based on the original patch submitted by Michael Wang
<wangyun@linux.vnet.ibm.com>.
Descriptors may not have been written back yet when we check for a TX
hang with the FLAG2_DMA_BURST flag on.
So when we detect a hang, we just flush the descriptors and detect
again once.
-v2 change 1 to true and 0 to false and remove extra ()
CC: Michael Wang <wangyun@linux.vnet.ibm.com> CC: Flavio Leitner <fbl@redhat.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
URB unlinking is always racing with its completion and tx_complete
may be called before or during running usb_unlink_urb, so tx_complete
must not clear urb->dev since it will be used in unlink path,
otherwise invalid memory accesses or usb device leak may be caused
inside usb_unlink_urb.
Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Oliver Neukum <oliver@neukum.org> Signed-off-by: Ming Lei <tom.leiming@gmail.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 4231d47e6fe69f061f96c98c30eaf9fb4c14b96d (net/usbnet: avoid
recursive locking in usbnet_stop()) fixes the recursive locking
problem by releasing the skb queue lock, but it makes usb_unlink_urb
race with defer_bh, and the URB being unlinked may be freed before
or during the call to usb_unlink_urb, so a use-after-free problem may
be triggered inside usb_unlink_urb.
The patch fixes the use-after-free problem by increasing URB
reference count with skb queue lock held before calling
usb_unlink_urb, so the URB won't be freed until return from
usb_unlink_urb.
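Sketch of the fixed unlink path (hedged: queue traversal details omitted):

	spin_lock_irqsave(&q->lock, flags);
	usb_get_urb(urb);		/* pin the URB under the queue lock */
	spin_unlock_irqrestore(&q->lock, flags);

	usb_unlink_urb(urb);		/* cannot be freed under us now */
	usb_put_urb(urb);		/* drop our reference */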
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Oliver Neukum <oliver@neukum.org> Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Ming Lei <tom.leiming@gmail.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The problem is that for the case of priority queues, we
have to assume that __rpc_remove_wait_queue_priority will move new
elements from the tk_wait.links lists into the queue->tasks[] list.
We therefore cannot use list_for_each_entry_safe() on queue->tasks[],
since that will skip these new tasks that __rpc_remove_wait_queue_priority
is adding.
Without this fix, rpc_wake_up and rpc_wake_up_status will both fail
to wake up all functions on priority wait queues, which can result
in some nasty hangs.
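Sketch of the fixed traversal in net/sunrpc/sched.c (condensed): always re-read the current first entry instead of caching a next pointer:

	head = &queue->tasks[queue->maxpriority];
	for (;;) {
		while (!list_empty(head)) {
			task = list_first_entry(head, struct rpc_task,
						u.tk_wait.list);
			rpc_wake_up_task_queue_locked(queue, task);
		}
		if (head == &queue->tasks[0])
			break;
		head--;
	}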
The 'find_wl_entry()' function expects the maximum difference as the second
argument, not the maximum absolute value. So the "unknown" eraseblock picking
was incorrect, as Shmulik Ladkani spotted. This patch fixes the issue.
Two bad things can happen in ubi_scan():
1. If kmem_cache_create() fails we jump to out_si and call
ubi_scan_destroy_si() which calls kmem_cache_destroy().
But si->scan_leb_slab is NULL.
2. If process_eb() fails we jump to out_vidh, call
kmem_cache_destroy() and ubi_scan_destroy_si(), which calls
kmem_cache_destroy() again.
Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes an issue when cifs_mount receives a
STATUS_BAD_NETWORK_NAME error during cifs_get_tcon but is able to
continue after a DFS ROOT referral. In this case, the return code
variable is not reset prior to trying to mount from the system referred
to. Thus, is_path_accessible is not executed and the final DFS referral
is not performed, causing a mount error.
Use case: In DNS, example.com resolves to the secondary AD server
ad2.example.com Our primary domain controller is ad1.example.com and has
a DFS redirection set up from \\ad1\share\Users to \\files\share\Users.
Mounting \\example.com\share\Users fails.
Reviewed-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Thomas Hadig <thomas@intapp.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some servers set this value to less than 50, which was hardcoded, and
we lost the connection when we exceeded this limit. Fix this by
respecting this value - not sending more than the server allows.
Reviewed-by: Jeff Layton <jlayton@samba.org> Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we get concurrent lookups of the same inode that is not in the
per-AG inode cache, there is a race condition that triggers warnings
in unlock_new_inode() indicating that we are initialising an inode
that isn't in the correct state for a new inode.
When we do an inode lookup via a file handle or a bulkstat, we don't
serialise lookups at a higher level through the dentry cache (i.e.
pathless lookup), and so we can get concurrent lookups of the same
inode.
The race condition is between the insertion of the inode into the
cache in the case of a cache miss and a concurrent lookup:
Thread 1                        Thread 2
xfs_iget()
xfs_iget_cache_miss()
    xfs_iread()
    lock radix tree
    radix_tree_insert()
                                rcu_read_lock
                                radix_tree_lookup
                                lock inode flags
                                XFS_INEW not set
                                igrab()
                                unlock inode flags
                                rcu_read_unlock
                                use uninitialised inode
                                .....
    lock inode flags
    set XFS_INEW
    unlock inode flags
    unlock radix tree
xfs_setup_inode()
    inode flags = I_NEW
unlock_new_inode()
    WARNING as inode flags != I_NEW
This can lead to inode corruption, inode list corruption, etc, and
is generally a bad thing to occur.
Fix this by setting XFS_INEW before inserting the inode into the
radix tree. This will ensure any concurrent lookup will find the new
inode with XFS_INEW set and that forces the lookup to wait until the
XFS_INEW flag is removed before allowing the lookup to succeed.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a setattr() fails because of an NFS4ERR_OPENMODE error, it is
probably due to us holding a read delegation. Ensure that the
recovery routines return that delegation in this case.