Modify uid check in do_coredump so as to not apply it in the case of
pipes.
This just got noticed in testing. The end of do_coredump validates the
uid of the inode for the created file against the uid of the crashing
process to ensure that no one can pre-create a core file with different
ownership and grab the information contained in the core when they
shouldn' tbe able to. This causes failures when using pipes for a core
dumps if the crashing process is not root, which is the uid of the pipe
when it is created.
The fix is simple. Since the check for matching uid's isn't relevant for
pipes (a process can't create a pipe that the uermodehelper code will open
anyway), we can just just skip it in the event ispipe is non-zero
Reverts a pipe-affecting change which was accidentally made in
A bug was found with Li Zefan's ftrace_stress_test that caused applications
to segfault during the test.
Placing a tracing_off() in the segfault code, and examining several
traces, I found that the following was always the case. The lock tracer
was enabled (lockdep being required) and userstack was enabled. Testing
this out, I just enabled the two, but that was not good enough. I needed
to run something else that could trigger it. Running a load like hackbench
did not work, but executing a new program would. The following would
trigger the segfault within seconds:
# echo 1 > /debug/tracing/options/userstacktrace
# echo 1 > /debug/tracing/events/lock/enable
# while :; do ls > /dev/null ; done
Enabling the function graph tracer and looking at what was happening
I finally noticed that all cashes happened just after an NMI.
1) | copy_user_handle_tail() {
1) | bad_area_nosemaphore() {
1) | __bad_area_nosemaphore() {
1) | no_context() {
1) | fixup_exception() {
1) 0.319 us | search_exception_tables();
1) 0.873 us | }
[...]
1) 0.314 us | __rcu_read_unlock();
1) 0.325 us | native_apic_mem_write();
1) 0.943 us | }
1) 0.304 us | rcu_nmi_exit();
[...]
1) 0.479 us | find_vma();
1) | bad_area() {
1) | __bad_area() {
After capturing several traces of failures, all of them happened
after an NMI. Curious about this, I added a trace_printk() to the NMI
handler to read the regs->ip to see where the NMI happened. In which I
found out it was here:
What was happening is that the NMI would happen at the place that a page
fault occurred. It would call rcu_read_lock() which was traced by
the lock events, and the user_stack_trace would run. This would trigger
a page fault inside the NMI. I do not see where the CR2 register is
saved or restored in NMI handling. This means that it would corrupt
the page fault handling that the NMI interrupted.
The reason the while loop of ls helped trigger the bug, was that
each execution of ls would cause lots of pages to be faulted in, and
increase the chances of the race happening.
The simple solution is to not allow user stack traces in NMI context.
After this patch, I ran the above "ls" test for a couple of hours
without any issues. Without this patch, the bug would trigger in less
than a minute.
Reported-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When the trace iterator is read, tracing_start() and tracing_stop()
is called to stop tracing while the iterator is processing the trace
output.
These functions disable both the standard buffer and the max latency
buffer. But if the wakeup tracer is running, it can switch these
buffers between the two disables:
buffer = global_trace.buffer;
if (buffer)
ring_buffer_record_disable(buffer);
<<<--------- swap happens here
buffer = max_tr.buffer;
if (buffer)
ring_buffer_record_disable(buffer);
What happens is that we disabled the same buffer twice. On tracing_start()
we can enable the same buffer twice. All ring_buffer_record_disable()
must be matched with a ring_buffer_record_enable() or the buffer
can be disable permanently, or enable prematurely, and cause a bug
where a reset happens while a trace is commiting.
This patch protects these two by taking the ftrace_max_lock to prevent
a switch from occurring.
Found with Li Zefan's ftrace_stress_test.
Reported-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In the ftrace code that resets the ring buffer it references the
buffer with a local variable, but then uses the tr->buffer as the
parameter to reset. If the wakeup tracer is running, which can
switch the tr->buffer with the max saved buffer, this can break
the requirement of disabling the buffer before the reset.
This warning in s_next() can be triggered by lseek():
[<c018b3f7>] ? s_next+0x77/0x80
[<c013e3c1>] warn_slowpath_common+0x81/0xa0
[<c018b3f7>] ? s_next+0x77/0x80
[<c013e3fa>] warn_slowpath_null+0x1a/0x20
[<c018b3f7>] s_next+0x77/0x80
[<c01efa77>] traverse+0x117/0x200
[<c01eff13>] seq_lseek+0xa3/0x120
[<c01efe70>] ? seq_lseek+0x0/0x120
[<c01d7081>] vfs_llseek+0x41/0x50
[<c01d8116>] sys_llseek+0x66/0xa0
[<c0102bd0>] sysenter_do_call+0x12/0x26
The iterator "leftover" variable is zeroed in the opening of the trace
file. But lseek can call s_start() which will call s_next() without
reseting the "leftover" variable back to zero, which might trigger
the WARN_ON_ONCE(iter->leftover) that is in s_next().
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B8CE06A.9090207@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If the graph tracer is active, and a task is forked but the allocating of
the processes graph stack fails, it can cause crash later on.
This is due to the temporary stack being NULL, but the curr_ret_stack
variable is copied from the parent. If it is not -1, then in
ftrace_graph_probe_sched_switch() the following:
for (index = next->curr_ret_stack; index >= 0; index--)
next->ret_stack[index].calltime += timestamp;
Will cause a kernel OOPS.
Found with Li Zefan's ftrace_stress_test.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We support event unthrottling in breakpoint events. It means
that if we have more than sysctl_perf_event_sample_rate/HZ,
perf will throttle, ignoring subsequent events until the next
tick.
So if ptrace exceeds this max rate, it will omit events, which
breaks the ptrace determinism that is supposed to report every
triggered breakpoints. This is likely to happen if we set
sysctl_perf_event_sample_rate to 1.
This patch removes support for unthrottling in breakpoint
events to break throttling and restore ptrace determinism.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: K.Prasad <prasad@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Callers of a stacktrace might pass bad frame pointers. Those
are usually checked for safety in stack walking helpers before
any dereferencing, but this is not the case when we need to go
through one more frame pointer that backlinks the irq stack to
the previous one, as we don't have any reliable address boudaries
to compare this frame pointer against.
This raises crashes when we record callchains for ftrace events
with perf because we don't use the right helpers to capture
registers there. We get wrong frame pointers as we call
task_pt_regs() even on kernel threads, which is a wrong thing
as it gives us the initial state of any kernel threads freshly
created. This is even not what we want for user tasks. What we want
is a hot snapshot of registers when the ftrace event triggers, not
the state before a task entered the kernel.
This requires more thoughts to do it correctly though.
So first put a guardian to ensure the given frame pointer
can be dereferenced to avoid crashes. We'll think about how to fix
the callers in a subsequent patch.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We currently enforce the !RW mapping for the kernel mapping that maps
holes between different text, rodata and data sections. However, kernel
identity mappings will have different RWX permissions to the pages mapping to
text and to the pages padding (which are freed) the text, rodata sections.
Hence kernel identity mappings will be broken to smaller pages. For 64-bit,
kernel text and kernel identity mappings are different, so we can enable
protection checks that come with CONFIG_DEBUG_RODATA, as well as retain 2MB
large page mappings for kernel text.
Konrad reported a boot failure with the Linux Xen paravirt guest because of
this. In this paravirt guest case, the kernel text mapping and the kernel
identity mapping share the same page-table pages. Thus forcing the !RW mapping
for some of the kernel mappings also cause the kernel identity mappings to be
read-only resulting in the boot failure. Linux Xen paravirt guest also
uses 4k mappings and don't use 2M mapping.
Fix this issue and retain large page performance advantage for native kernels
by not working hard and not enforcing !RW for the kernel text mapping,
if the current mapping is already using small page mapping.
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1266522700.2909.34.camel@sbs-t61.sc.intel.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The ring buffer resizing and resetting relies on a schedule RCU
action. The buffers are disabled, a synchronize_sched() is called
and then the resize or reset takes place.
But this only works if the disabling of the buffers are within the
preempt disabled section, otherwise a window exists that the buffers
can be written to while a reset or resize takes place.
Reported-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
LKML-Reference: <4B949E43.2010906@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The beacon sent gating doesn't seem to work with any combination
of flags. Thus, buffered frames tend to stay buffered forever,
using up tx descriptors.
Instead, use the DBA gating and hold transmission of the buffered
frames until 80% of the beacon interval has elapsed using the ready
time. This fixes the following error in AP mode:
ath5k phy0: no further txbuf available, dropping packet
Add a comment to acknowledge that this isn't the best solution.
Signed-off-by: Bob Copeland <me@bobcopeland.com> Acked-by: Nick Kossifidis <mickflemm@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When using the external sleep clock in AP mode, the
TSF increments too quickly, causing beacon interval
to be much lower than it is supposed to be, resulting
in lots of beacon-not-ready interrupts.
This fixes http://bugzilla.kernel.org/show_bug.cgi?id=14802.
Signed-off-by: Bob Copeland <me@bobcopeland.com> Acked-by: Nick Kossifidis <mickflemm@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I/Q calibration was completely broken, resulting in a high number of CRC errors
on received packets. before i could see around 10% to 20% CRC errors, with this
patch they are between 0% and 3%.
1.) the removal of the mask in commit "ath5k: Fix I/Q calibration
(f1cf2dbd0f798b71b1590e7aca6647f2caef1649)" resulted in no mask beeing used
when writing the I/Q values into the register. additional errors in the
calculation of the values (see 2.) resulted too high numbers, exceeding the
masks, so wrong values like 0xfffffffe were written. to be safe we should
always use the bitmask when writing parts of a register.
2.) using a (s32) cast for q_coff is a wrong conversion to signed, since we
convert to a signed value later by substracting 128. this resulted in too low
numbers for Q many times, which were limited to -16 by the boundary check later
on.
3.) checked everything against the HAL sources and took over comments and minor
optimizations from there.
4.) we can't use ENABLE_BITS when we want to write a number (the number can
contain zeros). also always write the correction values first and set ENABLE
bit last, like the HAL does.
Signed-off-by: Bruno Randolf <br1@einfach.org> Acked-by: Nick Kossifidis <mickflemm@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Experience has shown that the block buffer can only be used for SMBus
(not I2C) block transactions, even though the datasheet doesn't
mention this limitation.
Reported-by: Felix Rubinstein <felixru@gmail.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Oleg Ryjkov <oryjkov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Aaro Koskinen reported an issue in kernel.org bugzilla #15366, where
on non-GENERIC_TIME systems, accessing
/sys/devices/system/clocksource/clocksource0/current_clocksource
results in an oops.
It seems the timekeeper/clocksource rework missed initializing the
curr_clocksource value in the !GENERIC_TIME case.
Thanks to Aaro for reporting and diagnosing the issue as well as
testing the fix!
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
LKML-Reference: <1267475683.4216.61.camel@localhost.localdomain> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since alc_auto_create_input_ctls() doesn't set the elements for the
secondary ADCs, "Input Source" elemtns for these also get empty, resulting
in buggy outputs of alsactl like:
control.14 {
comment.access 'read write'
comment.type ENUMERATED
comment.count 1
iface MIXER
name 'Input Source'
index 1
value 0
}
This patch fixes alc_mux_enum_*() (and others) to fall back to the
first entry if the secondary input mux is empty.
without the following patch audio ssttuutteerrs on
ASUS M2N32-SLI PREMIUM ACPI BIOS Revision 1304
the sound device is:
00:0e.1 Audio device: nVidia Corporation MCP55 High Definition Audio (rev a2)
worked with 2.6.32
forgot to update tg3_poll_controller(), leading to intermittent crashes with
netpoll.
Fix this.
Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Handling HT configuration changes involved setting the channel
with the new HT parameters and then issuing a rate_update()
notification to the driver.
This behavior changed after the off-channel changes. Now, the channel
is not updated with the new HT params in enable_ht() - instead, it
is now done when the scan work terminates. This results in the driver
depending on stale information, defaulting to non-HT mode always.
Fix this by passing the new channel type to the driver.
Signed-off-by: Sujith <Sujith.Manoharan@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit 2552fc2 changed the way the decompressor decides if it is safe
to decompress the kernel directly to its final location. Unfortunately,
it took the top of the compressed data as being the stack pointer,
which it is for ROM=n cases. However, for ROM=y, the stack pointer
is not relevant, and results in the wrong answer.
Fix this by explicitly storing the end of the biggybacked data in the
decompressor, and use that to calculate the compressed image size.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The ARM kernel decompressor wants to be able to relocate r/w data
independently from the rest of the image, and we do this by ensuring that
r/w data has global visibility. Define STATIC_RW_DATA to be empty to
achieve this.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Alain Knaff <alain@knaff.lu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The few lines below the kfree of hdr_buf may go to the label err_free
which will also free hdr_buf. The most straightforward solution seems to
be to just move the kfree of hdr_buf after these gotos.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
Distros generally (I looked at Debian, RHEL5 and SLES11) seem to
enable CONFIG_HIGHPTE for any x86 configuration which has highmem
enabled. This means that the overhead applies even to machines which
have a fairly modest amount of high memory and which therefore do not
really benefit from allocating PTEs in high memory but still pay the
price of the additional mapping operations.
Running kernbench on a 4G box I found that with CONFIG_HIGHPTE=y but
no actual highptes being allocated there was a reduction in system
time used from 59.737s to 55.9s.
With CONFIG_HIGHPTE=y and highmem PTEs being allocated:
Average Optimal load -j 4 Run (std deviation):
Elapsed Time 175.396 (0.238914)
User Time 515.983 (5.85019)
System Time 59.737 (1.26727)
Percent CPU 263.8 (71.6796)
Context Switches 39989.7 (4672.64)
Sleeps 42617.7 (246.307)
With CONFIG_HIGHPTE=y but with no highmem PTEs being allocated:
Average Optimal load -j 4 Run (std deviation):
Elapsed Time 174.278 (0.831968)
User Time 515.659 (6.07012)
System Time 55.9 (1.07799)
Percent CPU 263.8 (71.266)
Context Switches 39929.6 (4485.13)
Sleeps 42583.7 (373.039)
This patch allows the user to control the allocation of PTEs in
highmem from the command line ("userpte=nohigh") but retains the
status-quo as the default.
It is possible that some simple heuristic could be developed which
allows auto-tuning of this option however I don't have a sufficiently
large machine available to me to perform any particularly meaningful
experiments. We could probably handwave up an argument for a threshold
at 16G of total RAM.
Assuming 768M of lowmem we have 196608 potential lowmem PTE
pages. Each page can map 2M of RAM in a PAE-enabled configuration,
meaning a maximum of 384G of RAM could potentially be mapped using
lowmem PTEs.
Even allowing generous factor of 10 to account for other required
lowmem allocations, generous slop to account for page sharing (which
reduces the total amount of RAM mappable by a given number of PT
pages) and other innacuracies in the estimations it would seem that
even a 32G machine would not have a particularly pressing need for
highmem PTEs. I think 32G could be considered to be at the upper bound
of what might be sensible on a 32 bit machine (although I think in
practice 64G is still supported).
It's seems questionable if HIGHPTE is even a win for any amount of RAM
you would sensibly run a 32 bit kernel on rather than going 64 bit.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
LKML-Reference: <1266403090-20162-1-git-send-email-ian.campbell@citrix.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
setscheduler() saves task->sched_class outside of the rq->lock held
region for a check after the setscheduler changes have become
effective. That might result in checking a stale value.
rtmutex_setprio() has the same problem, though it is protected by
p->pi_lock against setscheduler(), but for correctness sake (and to
avoid bad examples) it needs to be fixed as well.
Retrieve task->sched_class inside of the rq->lock held region.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix a SMT scheduler performance regression that is leading to a scenario
where SMT threads in one core are completely idle while both the SMT threads
in another core (on the same socket) are busy.
This is caused by this commit (with the problematic code highlighted)
On a SMT system, power of the HT logical cpu will be 589 and
the scheduler load imbalance (for scenarios like the one mentioned above)
can be approximately 1024 (SCHED_LOAD_SCALE). The above change of scaling
the weighted load with the power will result in "wl > imbalance" and
ultimately resulting in find_busiest_queue() return NULL, causing
load_balance() to think that the load is well balanced. But infact
one of the tasks can be moved to the idle core for optimal performance.
We don't need to use the weighted load (wl) scaled by the cpu power to
compare with imabalance. In that condition, we already know there is only a
single task "rq->nr_running == 1" and the comparison between imbalance,
wl is to make sure that we select the correct priority thread which matches
imbalance. So we really need to compare the imabalnce with the original
weighted load of the cpu and not the scaled load.
But in other conditions where we want the most hammered(busiest) cpu, we can
use scaled load to ensure that we consider the cpu power in addition to the
actual load on that cpu, so that we can move the load away from the
guy that is getting most hammered with respect to the actual capacity,
as compared with the rest of the cpu's in that busiest group.
Fix it.
Reported-by: Ma Ling <ling.ma@intel.com> Initial-Analysis-by: Zhang, Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1266023662.2808.118.camel@sbs-t61.sc.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix for sched_mc_powersavigs for pre-Nehalem platforms.
Child sched domain should clear SD_PREFER_SIBLING if parent will have
SD_POWERSAVINGS_BALANCE because they are contradicting.
Sets the flags correctly based on sched_mc_power_savings.
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100208100555.GD2931@dirshya.in.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We don't support these instructions, but guest can execute them even if the
feature('monitor') haven't been exposed in CPUID. So we would trap and inject
a #UD if guest try this way.
Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Free the dm_io structure before calling bio_endio() instead of after it,
to ensure that the io_pool containing it is not referenced after it is
freed.
This partially fixes a problem described here
https://www.redhat.com/archives/dm-devel/2010-February/msg00109.html
thread 1:
bio_endio(bio, io_error);
/* scheduling happens */
thread 2:
close the device
remove the device
thread 1:
free_io(md, io);
Thread 2, when removing the device, sees non-empty md->io_pool (because the
io hasn't been freed by thread 1 yet) and may crash with BUG in mempool_free.
Thread 1 may also crash, when freeing into a nonexisting mempool.
To fix this we must make sure that bio_endio() is the last call and
the md structure is not accessed afterwards.
There is another bio_endio in process_barrier, but it is called from the thread
and the thread is destroyed prior to freeing the mempools, so this call is
not affected by the bug.
A similar bug exists with module unloads - the module may be unloaded
immediately after bio_endio - but that is more difficult to fix.
The else part of the if statement is indented but does not have braces
around it. It clearly should since it uses clk_enable and clk_disable
which are supposed to balance.
If no platform_data was givin to the device it's going to use it's default
platform data struct which has all fields initialized to zero. As a
result the driver is going to try to request gpio0 both as write protect
and card detect pin. Which of course will fail and makes the driver
unusable
Previously to the introduction of no_wprotect and no_detect the behavior
was to assume that if no platform data was given there is no write protect
or card detect pin. This patch restores that behavior.
The 'struct svc_deferred_req's on the xpt_deferred queue do not
own a reference to the owning xprt. This is seen in svc_revisit
which is where things are added to this queue. dr->xprt is set to
NULL and the reference to the xprt it put.
So when this list is cleaned up in svc_delete_xprt, we mustn't
put the reference.
Also, replace the 'for' with a 'while' which is arguably
simpler and more likely to compile efficiently.
Cc: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This IBM system has a multi-function SDVO card that reports both VGA
and TV, but the system has no TV connector. The TV connector always
reported as connected, which would lead to poor modesetting choices.
Enable the SD-Card interface on multiple Option 3G sticks.
The unusual_devs.h entry is necessary because the device descriptor is
vendor-specific. That prevents usb-storage from binding to it as an interface
driver.
Signed-off-by: Jan Dumon <j.dumon@option.com> Signed-off-by: Phil Dibowitz <phil@ipom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
1. Open a USB device through devio.
2. Remove the hcd module in the host kernel.
3. Close the devio file descriptor.
The problem is that closing the file descriptor does usb_release_dev
as it is the last reference. usb_release_dev then tries to invoke
the hcd free_dev function (or rather dereferencing the hcd driver
struct). This causes an oops as the hcd driver has already been
unloaded so the struct is gone.
This patch tries to fix this by bringing the free_dev call earlier
and into usb_disconnect. I have verified that repeating the
above steps no longer crashes with this patch applied.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When hardware is removed on a Stratus, the system may crash like this:
ACPI: PCI interrupt for device 0000:7c:00.1 disabled
Trying to free nonexistent resource <00000000a8000000-00000000afffffff>
Trying to free nonexistent resource <00000000a4800000-00000000a480ffff>
uhci_hcd 0000:7e:1d.0: remove, state 1
usb usb2: USB disconnect, address 1
usb 2-1: USB disconnect, address 2
Unable to handle kernel paging request at 0000000000100100 RIP:
[<ffffffff88021950>] :uhci_hcd:uhci_scan_schedule+0xa2/0x89c
This occurs because an interrupt scans uhci->skelqh, which is
being freed. We do the right thing: disable the interrupts in the
device, and do not do any processing if the interrupt is shared
with other source, but it's possible that another CPU gets
delayed somewhere (e.g. loops) until we started freeing.
The agreed-upon solution is to wait for interrupts to play out
before proceeding. No other bareers are neceesary.
A backport of this patch was tested on a 2.6.18 based kernel.
Testing of 2.6.32-based kernels is under way, but it takes us
forever (months) to turn this around. So I think it's a good
patch and we should keep it.
Tracked in RH bz#516851
Signed-Off-By: Pete Zaitcev <zaitcev@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
According "5.3.6 Capability Parameters (HCCPARAMS)" of xHCI rev0.96 spec,
value of xECP register indicates a relative offset, in 32-bit words,
from Base to the beginning of the first extended capability.
The wrong calculation will cause BIOS handoff fail (not handoff from BIOS)
in some platform with BIOS USB legacy sup support.
Signed-off-by: Edward Shao <laface.tw@gmail.com> Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Renninger <trenn@suse.de> reported on IBM x3330
booting a latest kernel on this machine results in:
PCI: PCI BIOS revision 2.10 entry at 0xfd61c, last bus=1
PCI: Using configuration type 1 for base access bio: create slab <bio-0> at 0
ACPI: SCI (IRQ30) allocation failed
ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler (20090903/evevent-161)
ACPI: Unable to start the ACPI Interpreter
x86/pci: update pirq_enable_irq() to setup io apic routing
it turns out we need to set irq routing for the sci on ioapic1 early.
-v2: make it work without sparseirq too.
-v3: fix checkpatch.pl warning, and cc to stable
Reported-by: Thomas Renninger <trenn@suse.de> Bisected-by: Thomas Renninger <trenn@suse.de> Tested-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1265793639-15071-2-git-send-email-yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When two drivers are setting up MSI-X at the same time via
pci_enable_msix() there is a race. See this dmesg excerpt:
[ 85.170610] ixgbe 0000:02:00.1: irq 97 for MSI/MSI-X
[ 85.170611] alloc irq_desc for 99 on node -1
[ 85.170613] igb 0000:08:00.1: irq 98 for MSI/MSI-X
[ 85.170614] alloc kstat_irqs on node -1
[ 85.170616] alloc irq_2_iommu on node -1
[ 85.170617] alloc irq_desc for 100 on node -1
[ 85.170619] alloc kstat_irqs on node -1
[ 85.170621] alloc irq_2_iommu on node -1
[ 85.170625] ixgbe 0000:02:00.1: irq 99 for MSI/MSI-X
[ 85.170626] alloc irq_desc for 101 on node -1
[ 85.170628] igb 0000:08:00.1: irq 100 for MSI/MSI-X
[ 85.170630] alloc kstat_irqs on node -1
[ 85.170631] alloc irq_2_iommu on node -1
[ 85.170635] alloc irq_desc for 102 on node -1
[ 85.170636] alloc kstat_irqs on node -1
[ 85.170639] alloc irq_2_iommu on node -1
[ 85.170646] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000088
As you can see igb and ixgbe are both alternating on create_irq_nr()
via pci_enable_msix() in their probe function.
ixgbe: While looping through irq_desc_ptrs[] via create_irq_nr() ixgbe
choses irq_desc_ptrs[102] and exits the loop, drops vector_lock and
calls dynamic_irq_init. Then it sets irq_desc_ptrs[102]->chip_data =
NULL via dynamic_irq_init().
igb: Grabs the vector_lock now and starts looping over irq_desc_ptrs[]
via create_irq_nr(). It gets to irq_desc_ptrs[102] and does this:
cfg_new = irq_desc_ptrs[102]->chip_data;
if (cfg_new->vector != 0)
continue;
This hits the NULL deref.
Another possible race exists via pci_disable_msix() in a driver or in
the number of error paths that call free_msi_irqs():
There's a path in the pagefault code where the kernel deliberately
breaks its own locking rules by kmapping a high pte page without
holding the pagetable lock (in at least page_check_address). This
breaks Xen's ability to track the pinned/unpinned state of the
page. There does not appear to be a viable workaround for this
behaviour so simply disable HIGHPTE for all Xen guests.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
LKML-Reference: <1267204562-11844-1-git-send-email-ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Pasi Kärkkäinen <pasik@iki.fi> Cc: <xen-devel@lists.xensource.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Do not set current->mm->mmap to NULL in 32-bit emulation on 64-bit
load_aout_binary after flush_old_exec as it would destroy already
set brpm mapping with arguments.
Given the right combination of ThinkPad and X.org, just reading the
video output control state is enough to hard-crash X.org.
Until the day I somehow find out a model or BIOS cut date to not
provide this feature to ThinkPads that can do video switching through
X RandR, change permissions so that only processes with CAP_SYS_ADMIN
can access any sort of video output control state.
This bug could be considered a local DoS I suppose, as it allows any
non-privledged local user to cause some versions of X.org to
hard-crash some ThinkPads.
Studying the DSDTs of various thinkpads, it looks like bit 3 of the
argument to SBDC and SWAN is not "set radio to last state on resume".
Rather, it seems to be "if this bit is set, enable radio on resume,
otherwise disable it on resume".
So, the proper way to prepare the radios for S3 suspend is: disable
radio and clear bit 3 on the SBDC/SWAN call to to resume with radio
disabled, and enable radio and set bit 3 on the SBDC/SWAN call to
resume with the radio enabled.
Also, for persistent devices, the rfkill core does not restore state,
so we really need to get the firmware to do the right thing.
We don't sync the radio state on suspend, instead we trust the BIOS to
not do anything weird if we never touched the radio state since boot.
Time will tell if that's a wise way of doing things...
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Brightness notification does not work until the user writes to
hotkey_mask attribute. That's because the polling thread will only run
if hotkey_user_mask is set and someone is reading the input device or
if hotkey_driver_mask is set. In this second case, this condition is
not tested after the mask is changed, because the brightness and
volume drivers are started after the hotkey drivers.
Fix tpacpi_hotkey_driver_mask_set() to call hotkey_poll_setup(), so
that the poller kthread will be started when needed.
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Tested-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Event 0x3006 is used to help power management of the ODD in the
UltraBay. The EC generates this event when the ODD eject button is
pressed (even if the bay is powered down).
Normally, Linux doesn't need this as we keep the SATA link powered
up (which wastes power). The EC powers up the bay by itself when the
ODD eject button is pressed, and the SATA PHY reports the hotplug.
However, we could also power that SATA link down (and for that matter,
also power down the Ultrabay) if the ODD is left idle for a while with
no disk inside, and use event 0x3006 to know when we need that SATA link
powered back up.
For now, just stop asking for more information when event 0x3006 is
seen, there is no point in pestering users about it anymore.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If frames are transmitted on 4-addr ap vlan interfaces with no station,
they end up being transmitted unencrypted, even if the ap interface
uses WPA. This patch add some sanity checking to make sure that this
does not happen.
Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Intergraph bought 3D Labs and some XVR-500 chips have Intergraph's
vendor id.
Reported-by: Jurij Smakov <jurij@wooyd.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
e->index overflows e->stamps[] every ip_pkt_list_tot packets.
Consider the case when ip_pkt_list_tot==1; the first packet received is stored
in e->stamps[0] and e->index is initialized to 1. The next received packet
timestamp is then stored at e->stamps[1] in recent_entry_update(),
a buffer overflow because the maximum e->stamps[] index is 0.
Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If b43 or b43legacy are deauthenticated or disconnected, there is a
possibility that a reconnection is tried with the queues stopped in
mac80211. To prevent this, start the queues before setting
STAT_INITIALIZED.
In b43, a similar change has been in place (twice) in the
wireless_core_init() routine. Remove the duplicate and add similar
code to b43legacy.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The hardware needs to know what type of frames are being
sent in order to fill in various fields, for example the
timestamp in probe responses (before this patch, it was
always 0). Set it correctly when initializing the TX
descriptor.
Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
While ath9k does not support RIFS yet, the ability to receive RIFS
frames is currently enabled for most chipsets in the initvals.
This is causing baseband related issues on AR9160 and AR9130 based
chipsets, which can lock up under certain conditions.
This patch fixes these issues by overriding the initvals, effectively
disabling RIFS for all affected chipsets.
Signed-off-by: Felix Fietkau <nbd@openwrt.org> Acked-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When selecting the tx fallback rate, rc.c used a separate variable
'nrix' for storing the next rate index, however it did not use that as
reference for further rate index lowering. Because of that, it ended up
reusing the same rate for multiple multi-rate retry stages, thus
decreasing delivery probability under changing link conditions.
This patch removes the separate (unnecessary) variable and fixes
fallback the way it was intended to work.
This should result in increased throughput and better link stability.
Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In AP mode, ath_beacon_config_ap only restarts the timer if a TSF
restart is requested. Apparently this was added, because this function
unconditionally sets the flag for TSF reset.
The problem with this is, that ath9k_hw_reset() clobbers the timer
registers (specified in the initvals), thus effectively disabling the
SWBA interrupt whenever a card reset without TSF reset is issued
(happens in a few places in the code).
This patch fixes ath_beacon_config_ap to only issue the TSF reset flag
when necessary, but reinitialize the timer unconditionally. Tests show,
that this is enough to keep the SWBA interrupt going after a call to
ath_reset()
Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The newer single chip hardware family of chipsets have not been
experiencing issues with power saving set by default with recent
fixes merged (even into stable). The remaining issues are only
reported with AR5416 and since enabling PS by default can increase
power savings considerably best to take advantage of that feature
as this has been tested properly.
For more details on this issue see the bug report:
http://bugzilla.kernel.org/show_bug.cgi?id=14267
We leave AR5416 with PS disabled by default, that seems to require
some more work.
Cc: Peter Stuge <peter@stuge.se> Cc: Justin P. Mattock <justinmattock@gmail.com> Cc: Kristoffer Ericson <kristoffer.ericson@gmail.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In "wireless: remove WLAN_80211 and WLAN_PRE80211 from Kconfig" I
inadvertantly missed a line in include/linux/netdevice.h. I thereby
effectively reverted "net: Set LL_MAX_HEADER properly for wireless." by
accident. :-( Now we should check there for CONFIG_WLAN instead.
Signed-off-by: John W. Linville <linville@tuxdriver.com> Reported-by: Christoph Egger <siccegge@stud.informatik.uni-erlangen.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The alignment requirement for 64-bit load/store instructions on ARM is
implementation defined. Some CPUs (such as Marvell Feroceon) do not
generate an exception, if such an instruction is executed with an
address that is not 64 bit aligned. In such a case, the Feroceon
corrupts adjacent memory, which showed up in my tests as a crash in the
rx path of ath9k that only occured with CONFIG_XFRM set.
This crash happened, because the first field of the mac80211 rx status
info in the cb is an u64, and changing it corrupted the skb->sp field.
This patch also closes some potential pre-existing holes in the sk_buff
struct surrounding the cb[] area.
Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We use scm_send and scm_recv on both unix domain and
netlink sockets, but only unix domain sockets support
everything required for file descriptor passing,
so error if someone attempts to pass file descriptors
over netlink sockets.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The UltraDMA Tss timing must be stretched with ATA clock of 66 MHz, but the
driver only does this when PCI clock is 66 MHz, whereas it always programs
DPLL clock (which is used as the ATA clock) to 66 MHz.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Recent changes on the I2C front have left off-by-one array indexes in
3 hwmon drivers. Fix them.
Faulty commit: e5e9f44c2 i2c: Drop I2C_CLIENT_INSMOD_2 to 8
Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Andre Prendel <andre.prendel@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
An off-by-one error caused some inputs to not be created by the driver
when they should. TMP421 gets only one input instead of two, TMP422
gets two instead of three, etc. Fix the bug by listing explicitly the
number of inputs each device has.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Tested-by: Andre Prendel <andre.prendel@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The cs5535-gpio driver's get() function was returning the output value.
This means that the GPIO pins would never work as an input, even if
configured as an input.
The driver should return the READ_BACK value, which is the sensed line
value. To make that work when the direction is 'output', INPUT_ENABLE
needs to be set.
In addition, the driver was not disabling OUTPUT_ENABLE when the direction
is set to 'input'. That would cause the GPIO to continue to drive the pin
if the direction was ever set to output.
This issue was noticed when attempting to use the gpiolib driver to read
an external input. I had previously been using the char/cs5535-gpio
driver.
Signed-off-by: Ben Gardner <gardner.ben@gmail.com> Acked-by: Andres Salomon <dilinger@collabora.co.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Brownell <dbrownell@users.sourceforge.net> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
wm831x_gpio_direction_output() ignored the state passed into it.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch adds error-paths to handle pci_dma_mapping errors.
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A crash has been reported with sierra driver on disconnect with
Ubuntu/Lucid distribution based on kernel-2.6.32.
The cause of the crash was determined as "NULL tty pointer was being
referenced" and the NULL pointer was passed by sierra_indat_callback().
This patch modifies sierra_indat_callback() function to check for NULL
tty structure pointer. This modification prevents a crash from happening
when the device is disconnected.
This patch fixes the bug reported in Launchpad:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/511157
Signed-off-by: Elina Pasheva <epasheva@sierrawireless.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The platform code doesn't have to provide platform data to get sensible
default behaviour from the imx serial driver.
This patch does not handle NULL dereference in the IrDA case, which still
requires a valid platform data pointer (in imx_startup()/imx_shutdown()),
since I don't know whether there is a sensible default behaviour, or
should the operation just fail cleanly.
Signed-off-by: Baruch Siach <baruch@tkos.co.il> Cc: Baruch Siach <baruch@tkos.co.il> Cc: Alan Cox <alan@linux.intel.com> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Oskar Schirmer <os@emlix.com> Cc: Fabian Godehardt <fg@emlix.com> Cc: Daniel Glöckner <dg@emlix.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This was noticed by Matthias Urlichs and he proposed a fix. This patch
does the fixing a different way to avoid introducing several new race
conditions into the code.
The problem case is TTY_DRIVER_RESET_TERMIOS = 0. In that case while we
abort the ldisc change, the hangup processing has not cleaned up and restarted
the ldisc either.
We can't restart the ldisc stuff in the set_ldisc as we don't know what
the hangup did and may touch stuff we shouldn't as we are no longer
supposed to influence the tty at that point in case it has been re-opened
before we get rescheduled.
Instead do it the simple way. Always re-init the ldisc on the hangup, but
use TTY_DRIVER_RESET_TERMIOS to indicate that we should force N_TTY.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When sysfs_readdir stops short we now cache the next
sysfs_dirent to return to user space in filp->private_data.
There is no impact on the rest of sysfs by doing this and
in the common case it allows us to pick up exactly where
we left off with no seeking.
Additionally I drop and regrab the sysfs_mutex around
filldir to avoid a page fault abritrarily increasing the
hold time on the sysfs_mutex.
v2: Returned to using INT_MAX as the EOF condition.
seekdir is ambiguous unless all directory entries have
a unique f_pos value.
Before unlinking the inode, reset the current permissions of possible
references like hardlinks, so granted permissions can not be retained
across the device lifetime by creating hardlinks, in the unusual case
that there is a user-writable directory on the same filesystem.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The problem is that kobject_add_internal() first adds a kobject to the
kset and then try to create sysfs directory for it. If the creation
fails, it remove the kobject from the kset. get_device_parent()
accesses class_dirs kset while only holding class_dirs.list_lock to
see whether the cuse class dir exists. But when it exists, it may not
have finished initialization yet or may fail and get removed soon. In
the above case, the former happened so the second one ends up trying
to create subdirectory under NULL sysfs_dirent.
Fix it by grabbing a mutex in get_device_parent().
Don't touch the variable 'reg' to construct the value for the actual SPI
transport. This variable is again used to access the driver's register
cache, and so random memory is overwritten.
Compute the value in-place instead.
Signed-off-by: Daniel Mack <daniel@caiaq.de> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>