inotify_new_group() receives a get_uid-ed user_struct and saves the
reference on group->inotify_data.user. The problem is that free_uid() is
never called on it.
Issue seem to be introduced by 63c882a0 (inotify: reimplement inotify
using fsnotify) after 2.6.30.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Eric Paris <eparis@parisplace.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is a race in the inotify add/rm watch code. A task can find and
remove a mark which doesn't have all of it's references. This can
result in a use after free/double free situation.
Task A Task B
------------ -----------
inotify_new_watch()
allocate a mark (refcnt == 1)
add it to the idr
inotify_rm_watch()
inotify_remove_from_idr()
fsnotify_put_mark()
refcnt hits 0, free
take reference because we are on idr
[at this point it is a use after free]
[time goes on]
refcnt may hit 0 again, double free
The fix is to take the reference BEFORE the object can be found in the
idr.
Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit 65c3ac885ce9852852b895a4a62212f62cb5f2e9 in 2.6.33 accidentally
left out the initialization of the AC97 codec FMIC2MIC bit, which broke
recording from the front panel microphone.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Jaroslav Kysela <perex@perex.cz> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The capture source control of maya44 was wrongly coded with the bit
shift instead of the bit mask. Also, the slot for line-in was
wrongly assigned (slot 5 instead of 4).
After the "retry_open:" label, we first get the tty_mutex
and then the BKL. However a the end of tty_open, we jump
back to retry_open with the BKL still held. If we run into
this case, the tty_open function will be left with the BKL
still held.
The imx CTS trigger level is left at its reset value that is 32
chars. Since the RX FIFO has 32 entries, when CTS is raised, the
FIFO already is full. However, some serial port devices first empty
their TX FIFO before stopping when CTS is raised, resulting in lost
chars.
This patch sets the trigger level lower so that other chars arrive
after CTS is raised, there is still room for 16 of them.
When we made serverino the default, we trusted that the field sent by the
server in the "uniqueid" field was actually unique. It turns out that it
isn't reliably so.
Samba, in particular, will just put the st_ino in the uniqueid field when
unix extensions are enabled. When a share spans multiple filesystems, it's
quite possible that there will be collisions. This is a server bug, but
when the inodes in question are a directory (as is often the case) and
there is a collision with the root inode of the mount, the result is a
kernel panic on umount.
Fix this by checking explicitly for directory inodes with the same
uniqueid. If that is the case, then we can assume that using server inode
numbers will be a problem and that they should be disabled.
Fixes Samba bugzilla 7407
Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Anton Blanchard found that large POWER systems would occasionally
crash in the exception exit path when profiling with perf_events.
The symptom was that an interrupt would occur late in the exit path
when the MSR[RI] (recoverable interrupt) bit was clear. Interrupts
should be hard-disabled at this point but they were enabled. Because
the interrupt was not recoverable the system panicked.
The reason is that the exception exit path was calling
perf_event_do_pending after hard-disabling interrupts, and
perf_event_do_pending will re-enable interrupts.
The simplest and cleanest fix for this is to use the same mechanism
that 32-bit powerpc does, namely to cause a self-IPI by setting the
decrementer to 1. This means we can remove the tests in the exception
exit path and raw_local_irq_restore.
This also makes sure that the call to perf_event_do_pending from
timer_interrupt() happens within irq_enter/irq_exit. (Note that
calling perf_event_do_pending from timer_interrupt does not mean that
there is a possible 1/HZ latency; setting the decrementer to 1 ensures
that the timer interrupt will happen immediately, i.e. within one
timebase tick, which is a few nanoseconds or 10s of nanoseconds.)
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The various dasd_sleep_on functions use a global wait queue when
waiting for a cqr. The wait condition checks the status and devlist
fields of the cqr to determine if it is safe to continue. This
evaluation may return true, although the tasklet has not finished
processing of the cqr and the callback function has not been called
yet. When the callback is finally called, the data in the cqr may
already be invalid. The sleep_on wait condition needs a safe way to
determine if the tasklet has finished processing. Use the
callback_data field of the cqr to store a token, which is set by
the callback function itself.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
strace may change the system call number, so regs->gprs[2] must not
be read before tracehook_report_syscall_entry(). This fixes a bug
where "strace -f" will hang after a vfork().
My PIPE_CONTROL fix (just sent via Eric's tree) was buggy; I was
testing a whole set of patches together and missed a conversion to the
new HAS_PIPE_CONTROL macro, which will cause breakage on non-Ironlake
965 class chips. Fortunately, the fix is trivial and has been tested.
Be sure to use the HAS_PIPE_CONTROL macro in i915_get_gem_seqno, or
we'll end up reading the wrong graphics memory, likely causing hangs,
crashes, or worse.
Since 965, the hardware has supported the PIPE_CONTROL command, which
provides fine grained GPU cache flushing control. On recent chipsets,
this instruction is required for reliable interrupt and sequence number
reporting in the driver.
So add support for this instruction, including workarounds, on Ironlake
and Sandy Bridge hardware.
Disable data error interrupts while we are actually recording that there
is not such errors. This will prevent, in some cases, the warning message
printed at new request queuing (in atmci_start_request()).
The removing of an SD card in certain circumstances can lead to a kernel
oops if we do not make sure that the "data" field of the host structure is
valid. This patch adds a test in atmci_dma_cleanup() function and also
calls atmci_stop_dma() before throwing away the reference to data.
Originally, commit d899bf7b ("procfs: provide stack information for
threads") attempted to introduce a new feature for showing where the
threadstack was located and how many pages are being utilized by the
stack.
Commit c44972f1 ("procfs: disable per-task stack usage on NOMMU") was
applied to fix the NO_MMU case.
Commit 89240ba0 ("x86, fs: Fix x86 procfs stack information for threads on
64-bit") was applied to fix a bug in ia32 executables being loaded.
Commit 9ebd4eba7 ("procfs: fix /proc/<pid>/stat stack pointer for kernel
threads") was applied to fix a bug which had kernel threads printing a
userland stack address.
Commit 1306d603f ('proc: partially revert "procfs: provide stack
information for threads"') was then applied to revert the stack pages
being used to solve a significant performance regression.
This patch nearly undoes the effect of all these patches.
The reason for reverting these is it provides an unusable value in
field 28. For x86_64, a fork will result in the task->stack_start
value being updated to the current user top of stack and not the stack
start address. This unpredictability of the stack_start value makes
it worthless. That includes the intended use of showing how much stack
space a thread has.
Other architectures will get different values. As an example, ia64
gets 0. The do_fork() and copy_process() functions appear to treat the
stack_start and stack_size parameters as architecture specific.
I only partially reverted c44972f1 ("procfs: disable per-task stack usage
on NOMMU") . If I had completely reverted it, I would have had to change
mm/Makefile only build pagewalk.o when CONFIG_PROC_PAGE_MONITOR is
configured. Since I could not test the builds without significant effort,
I decided to not change mm/Makefile.
I only partially reverted 89240ba0 ("x86, fs: Fix x86 procfs stack
information for threads on 64-bit") . I left the KSTK_ESP() change in
place as that seemed worthwhile.
As it doesn't seem to be universally valid for all mainboard revisions of
the D945GCLF2 and breaks snd-hda-intel/ snd-hda-codec-realtek on the Intel
Corporation "D945GCLF2" (LF94510J.86A.0229.2009.0729.0209) mainboard.
00:1b.0 Audio device [0403]: Intel Corporation N10/ICH 7 Family High Definition Audio Controller [8086:27d8] (rev 01)
Ordinarily, application using hugetlbfs will create mappings with
reserves. For shared mappings, these pages are reserved before mmap()
returns success and for private mappings, the caller process is guaranteed
and a child process that cannot get the pages gets killed with sigbus.
An application that uses MAP_NORESERVE gets no reservations and mmap()
will always succeed at the risk the page will not be available at fault
time. This might be used for example on very large sparse mappings where
the developer is confident the necessary huge pages exist to satisfy all
faults even though the whole mapping cannot be backed by huge pages.
Unfortunately, if an allocation does fail, VM_FAULT_OOM is returned to the
fault handler which proceeds to trigger the OOM-killer. This is
unhelpful.
Even without hugetlbfs mounted, a user using mmap() can trivially trigger
the OOM-killer because VM_FAULT_OOM is returned (will provide example
program if desired - it's a whopping 24 lines long). It could be
considered a DOS available to an unprivileged user.
This patch alters hugetlbfs to kill a process that uses MAP_NORESERVE
where huge pages were not available with SIGBUS instead of triggering the
OOM killer.
This change affects hugetlb_cow() as well. I feel there is a failure case
in there, but I didn't create one. It would need a fairly specific target
in terms of the faulting application and the hugepage pool size. The
hugetlb_no_page() path is much easier to hit but both might as well be
closed.
Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: David Rientjes <rientjes@google.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The current allocation does not include the memory required for blanking
lines. So avoid memory corruption when multiple devices are using the DMA
memory near each other.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The work queue has to be flushed after the device has been made
inaccessible. The patch closes a window during which a work queue might
remain active after the device is removed and would then lead to ACPI
calls with undefined behavior.
Signed-off-by: Oliver Neukum <oneukum@suse.de> Acked-by: Eric Piel <eric.piel@tremplin-utc.net> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: Pavel Herrmann <morpheus.ibis@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 2783ef23 moved the initialisation of saddr and daddr after
pskb_may_pull() to avoid a potential data corruption. Unfortunately
also placing it after the short packet and bad checksum error paths,
where these variables are used for logging. The result is bogus
output like
[92238.389505] UDP: short packet: From 2.0.0.0:65535 23715/178 to 0.0.0.0:65535
Moving the saddr and daddr initialisation above the error paths, while still
keeping it after the pskb_may_pull() to keep the fix from commit 2783ef23.
Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Previously it was unconditionally used on all Sibyte family SOCs. The
M3 bug has to be handled in the TLB exception handler which is extremly
performance sensitive, so this modification is expected to deliver around
2-3% performance improvment. This is important as required changes to the
M3 workaround will make it more costly.
There's nastyness in the way we currently handle barriers (and
discards): They're effectively filesystem commands, but they get
processed as BLOCK_PC commands. Unfortunately BLOCK_PC commands are
taken by SCSI to be SG_IO commands and the issuer expects to see and
handle any returned errors, however trivial. This leads to a huge
problem, because the block layer doesn't expect this to happen and any
trivially retryable error on a barrier causes an immediate I/O error
to the filesystem.
The only real way to hack around this is to take the usual class of
offending errors (unit attentions) and make them all retryable in the
case of a REQ_HARDBARRIER. A correct fix would involve a rework of
the entire block and SCSI submit system, and so is out of scope for a
quick fix.
Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Some arrays are giving I/O errors with ext3 filesystems when
SYNCHRONIZE_CACHE gets a UNIT_ATTENTION. What is happening is that
these commands have no retries, so the UNIT_ATTENTION causes the
barrier to fail. We should be enable retries here to clear any
transient error and allow the barrier to succeed.
Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In the scsi_debug driver, the virtual_gb option ignores the
sector_size, implicitly assuming that is 512 bytes. So if
'virtual_gb=1 sector_size=4096' the result is an 8 GB (virtual) disk.
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[SCSI] libiscsi: don't increment cmdsn if cmd is not sent
in 2.6.32.
When I moved the hdr->cmdsn after init_task, I added
a bug when header digests are used. The problem is
that the LLD may calculate the header digest in init_task,
so if we then set the cmdsn after the init_task call we
change what the digest will be calculated by the target.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 672917dcc78 ("cpuidle: menu governor: reduce latency on exit")
added an optimization, where the analysis on the past idle period moved
from the end of idle, to the beginning of the new idle.
Unfortunately, this optimization had a bug where it zeroed one key
variable for new use, that is needed for the analysis. The fix is
simple, zero the variable after doing the work from the previous idle.
During the audit of the code that found this issue, another issue was
also found; the ->measured_us data structure member is never set, a
local variable is always used instead.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add Dell Studio models (1558, 1557, 1555) to the 'set_sci_en_on_resume'
list to fix hang on resume.
BugLink: http://bugs.launchpad.net/bugs/553498 Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Alex Chiang <achiang@canonical.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
acpi_device_class can only be 19 characters and a NULL terminator.
The current code has a buffer overflow in acpi_power_meter_add():
strcpy(acpi_device_class(device), ACPI_POWER_METER_CLASS);
Signed-off-by: Dan Carpenter <error27@gmail.com> Cc: Len Brown <lenb@kernel.org> Cc: "Darrick J. Wong" <djwong@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Multiple Lenovo ThinkPad models with Intel Core i5/i7 CPUs can
successfully suspend/resume once, and then hang on the second s/r
cycle.
We got confirmation that this was due to a BIOS defect. The BIOS
did not properly set SCI_EN coming out of S3. The BIOS guys
hinted that The Other Leading OS ignores the fact that hardware
owns the bit and sets it manually.
In any case, an existing DMI table exists for machines where this
defect is a known problem. Lenovo promise to fix their BIOS, but
for folks who either won't or can't upgrade their BIOS, allow
Linux to workaround the issue.
Confirmed by numerous testers in the launchpad bug that using
acpi_sleep=sci_force_enable fixes the issue. We add the machines
to acpisleep_dmi_table[] to automatically enable this workaround.
Cc: Colin King <colin.king@canonical.com> Signed-off-by: Alex Chiang <achiang@canonical.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix: Raid-6 was not trying to correct a read-error when in
singly-degraded state and was instead dropping one more device, going to
doubly-degraded state. This patch fixes this behaviour.
Tested-by: Janos Haar <janos.haar@netcenter.hu> Signed-off-by: Gabriele A. Trombetti <g.trombetti.lkrnl1213@logicschema.com> Reported-by: Janos Haar <janos.haar@netcenter.hu> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Some time ago we stopped the clean/active metadata updates
from being written to a 'spare' device in most cases so that
it could spin down and say spun down. Device failure/removal
etc are still recorded on spares.
However commit 51d5668cb2e3fd1827a55 broke this 50% of the time,
depending on whether the event count is even or odd.
The change log entry said:
This means that the alignment between 'odd/even' and
'clean/dirty' might take a little longer to attain,
how ever the code makes no attempt to create that alignment, so it
could take arbitrarily long.
So when we find that clean/dirty is not aligned with odd/even,
force a second metadata-update immediately. There are already cases
where a second metadata-update is needed immediately (e.g. when a
device fails during the metadata update). We just piggy-back on that.
There is a typo here. We should be testing "*dentry" instead of
"dentry". If "*dentry" is an ERR_PTR, it gets dereferenced in either
mkdir() or create() which would cause an OOPs.
Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If request_resource() fails, we leak the struct resource we
allocated to represent the IOMMU mapping area.
This actually happens on sun4v machines because the IOMEM area is only
reported sans the IOMMU region, unlike all previous systems. I'll
need to fix that at some point, but for now fix the leak.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If we are in an NMI then doing a plain raw_local_irq_disable() will
write PIL_NORMAL_MAX into %pil, which is lower than PIL_NMI, and thus
we'll re-enable NMIs and recurse.
Doing a simple:
%pil = %pil | PIL_NORMAL_MAX
does what we want, if we're already at PIL_NMI (15) we leave it at
that setting, else we set it to PIL_NORMAL_MAX (14).
This should get the function tracer working on sparc64.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
tx_queue is used as a temporary queue when not allowed to queue skb
directly to the hw device driver (which may sleep). Most paths flush
it before returning, but ppp_start() currently cannot. Make sure we
don't leave skbs pointing to a non-existent device.
Thanks to Michael Barkowski for reporting this problem.
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
After upgrading a L2TP server to 2.6.33 it started to fail, tunnels going up an
down, after the 10th tunnel came up. My modified rp-l2tp uses a global
unconnected socket bound to (INADDR_ANY, 1701) and one connected socket per
tunnel after parameter negotiation.
After ten sockets were open and due to mixed parameters to
udp[46]_lib_lookup2() kernel started to drop packets.
Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The following situation was observed in the field:
tap1 sends packets, tap2 does not consume them, as a result
tap1 can not be closed. This happens because
tun/tap devices can hang on to skbs undefinitely.
As noted by Herbert, possible solutions include a timeout followed by a
copy/change of ownership of the skb, or always copying/changing
ownership if we're going into a hostile device.
This patch implements the second approach.
Note: one issue still remaining is that since skbs
keep reference to tun socket and tun socket has a
reference to tun device, we won't flush backlog,
instead simply waiting for all skbs to get transmitted.
At least this is not user-triggerable, and
this was not reported in practice, my assumption is
other devices besides tap complete an skb
within finite time after it has been queued.
A possible solution for the second issue
would not to have socket reference the device,
instead, implement dev->destructor for tun, and
wait for all skbs to complete there, but this
needs some thought, probably too risky for 2.6.34.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Yan Vugenfirer <yvugenfi@redhat.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
What happens is that, when the tipc module in inserted it enters a standalone
node mode in which communication to its own address is allowed <0.0.0> but not
to other addresses, since the appropriate data structures have not been
allocated yet (specifically the tipc_net pointer). There is nothing stopping a
client from trying to send such a message however, and if that happens, we
attempt to dereference tipc_net.zones while the pointer is still NULL, and
explode. The fix is pretty straightforward. Since these oopses all arise from
the dereference of global pointers prior to their assignment to allocated
values, and since these allocations are small (about 2k total), lets convert
these pointers to static arrays of the appropriate size. All the accesses to
these bits consider 0/NULL to be a non match when searching, so all the lookups
still work properly, and there is no longer a chance of a bad dererence
anywhere. As a bonus, this lets us eliminate the setup/teardown routines for
those pointers, and elimnates the need to preform any locking around them to
prevent access while their being allocated/freed.
I've updated the tipc_net structure to behave this way to fix the exact reported
problem, and also fixed up the tipc_bearers and media_list arrays to fix an
obvious simmilar problem that arises from issuing tipc-config commands to
manipulate bearers/links prior to entering networked mode
I've tested this for a few hours by running the sanity tests and stress test
with the tipcutils suite, and nothing has fallen over. There have been a few
lockdep warnings, but those were there before, and can be addressed later, as
they didn't actually result in any deadlock.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Allan Stephens <allan.stephens@windriver.com> CC: David S. Miller <davem@davemloft.net> CC: tipc-discussion@lists.sourceforge.net Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
tcp_read_sock() can have a eat skbs without immediately advancing copied_seq.
This can cause a panic in tcp_collapse() if it is called as a result
of the recv_actor dropping the socket lock.
A userspace program that splices data from a socket to either another
socket or to a file can trigger this bug.
Signed-off-by: Steven J. Magnani <steve@digidescorp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When we finish processing ASCONF_ACK chunk, we try to send
the next queued ASCONF. This action runs the sctp state
machine recursively and it's not prepared to do so.
When calculating the INIT/INIT-ACK chunk length, we should not
only account the length of parameters, but also the parameters
zero padding length, such as AUTH HMACS parameter and CHUNKS
parameter. Without the parameters zero padding length we may get
following oops.
------------------------------------------------------------------
eth0 has addresses: 3ffe:501:ffff:100:20c:29ff:fe4d:f37e and 192.168.0.21
eth1 has addresses: 192.168.1.21
------------------------------------------------------------------
Reported-by: George Cheimonidis <gchimon@gmail.com> Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since the change of the atomics to percpu variables, we now
have to disable BH in process context when touching percpu variables.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When sctp attempts to update an assocition, it removes any
addresses that were not in the updated INITs. However, the loop
may attempt to refrence a transport with address after removing it.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
sk->sk_data_ready() of sctp socket can be called from both BH and non-BH
contexts, but the default sk->sk_data_ready(), sock_def_readable(), can
not be used in this case. Therefore, we have to make a new function
sctp_data_ready() to grab sk->sk_data_ready() with BH disabling.
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.33-rc6 #129
---------------------------------------------------------
sctp_darn/1517 just changed the state of lock:
(clock-AF_INET){++.?..}, at: [<c06aab60>] sock_def_readable+0x20/0x80
but this lock took another, SOFTIRQ-unsafe lock in the past:
(slock-AF_INET){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
1 lock held by sctp_darn/1517:
#0: (sk_lock-AF_INET){+.+.+.}, at: [<cdfe363d>] sctp_sendmsg+0x23d/0xc00 [sctp]
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
My recent patch to remove the open-coded checksum sequence in
tcp_v6_send_response broke it as we did not set the transport
header pointer on the new packet.
Actually, there is code there trying to set the transport
header properly, but it sets it for the wrong skb ('skb'
instead of 'buff').
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Trying to run izlisten (from lowpan-tools tests) on a device that does not
exists I got the oops below. The problem is that we are using get_dev_by_name
without checking if we really get a device back. We don't in this case and
writing to dev->type generates this oops.
[Oops code removed by Dmitry Eremin-Solenikov]
If possible this patch should be applied to the current -rc fixes branch.
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Autosuspend works until you bring the wwan interface up, then the
device does not enter autosuspend anymore.
The following patch fixes the problem by setting the .manage_power
field in the mbm_info struct to the same as in the cdc_info struct
(cdc_manager_power).
Signed-off-by: Torgny Johansson <torgny.johansson@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It has been reported that under certain heavy traffic conditions in MSI-X
mode, the driver can lose an MSI-X vector causing all packets in the
associated rx/tx ring pair to be dropped. The problem is caused by
the chip dropping the write to unmask the MSI-X vector by the kernel
(when migrating the IRQ for example).
This can be prevented by increasing the GRC timeout value for these
register read and write operations.
Thanks to Dell for helping us debug this problem.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Remove debug printks in pseries_mach_cpu_die(). These are
noisy at runtime. Traceevents can be added to instrument this
section of code.
The following KERN_INFO printks are removed:
cpu 62 (hwid 62) returned from cede.
Decrementer value = b2802fff Timebase value = 2fa8f95035f4a
cpu 62 (hwid 62) got prodded to go online
cpu 58 (hwid 58) ceding for offline with hint 2
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Rearrange condition checks for better code readability and
prevention of possible race conditions when
preferred_offline_state can potentially change during the
execution of pseries_mach_cpu_die(). The patch will make
pseries_mach_cpu_die() put cpu in one of the consistent states
and not hit the run over BUG()
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On low memory boxes or those with highmem, kernel can OOM before the
background reclaims inodes via xfssyncd. Add a shrinker to run inode
reclaim so that it inode reclaim is expedited when memory is low.
This is more complex than it needs to be because the VM folk don't
want a context added to the shrinker infrastructure. Hence we need
to add a global list of XFS mount structures so the shrinker can
traverse them.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Alex Elder <aelder@sgi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
MSI setup changes the value of irq_vec in struct tg3 *tp.
This attribute must be taken into account and restored before
we try to do a new request_irq for INTx fallback.
In powerpc, the original code was leading to an EINVAL return within
request_irq, because the driver was trying to use the disabled MSI
virtual irq number instead of tp->pdev->irq.
Signed-off-by: Andre Detsch <adetsch@br.ibm.com> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Brandon Philips <bphilips@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Further to the lsml thread titled:
"does scsi_io_completion need to dump sense data for ata pass through (ck_cond =
1) ?"
This is a patch to skip logging when the sense data is
associated with a SENSE_KEY of "RECOVERED_ERROR" and the
additional sense code is "ATA PASS-THROUGH INFORMATION
AVAILABLE". This only occurs with the SAT ATA PASS-THROUGH
commands when CK_COND=1 (in the cdb). It indicates that
the sense data contains ATA registers.
Smartmontools uses such commands on ATA disks connected via
SAT. Periodic checks such as those done by smartd cause
nuisance entries into logs that are:
- neither errors nor warnings
- pointless unless the cdb that caused them are also logged
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is quite similar to b39fe41f481d20c201012e4483e76c203802dda7
though said registers are not even documented as 64-bit registers
- as opposed to the initial TxDescStartAddress ones - but as single
bytes which must be combined into 32 bits at the MMIO read/write
level before being merged into a 64 bit logical entity.
Credits go to Ben Hutchings <ben@decadent.org.uk> for the MAR
registers (aka "multicast is broken for ages on ARM) and to
Timo Teräs <timo.teras@iki.fi> for the MAC registers.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
r8169 needs certain writes to be visible to other CPUs or the NIC before
touching the hardware, but was using smp_wmb() which is only required to
order cacheable memory access. Switch to wmb() which is required to
order both cacheable and non-cacheable memory.
Noticed by Catalin Marinas and Paul Mackerras.
Signed-off-by: David Dillow <dave@thedillows.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The bypassing of this test is a leftover from 2.4 vintage
kernels, and is no longer appropriate, or even used by KGDB.
Currently KGDB uses probe_kernel_write() for all access to
memory via the KGDB core, so it can simply be deleted.
This fixes CVE-2010-1446.
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> CC: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Wufei <fei.wu@windriver.com> Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
fix off by one error in the queue size check of p54_tx_qos_accounting_alloc()
Coverity CID: 13314
Signed-off-by: Darren Jenkins <darrenrjenkins@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Resizing the filesystem would result in an diAllocExt error in some
instances because changes in bmp->db_agsize would not get noticed if
goto extendBmap was called.
Signed-off-by: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Cc: jfs-discussion@lists.sourceforge.net Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
creds_are_invalid() reads both cred->usage and cred->subscribers and then
compares them to make sure the number of processes subscribed to a cred struct
never exceeds the refcount of that cred struct.
The problem is that this can cause a race with both copy_creds() and
exit_creds() as the two counters, whilst they are of atomic_t type, are only
atomic with respect to themselves, and not atomic with respect to each other.
This means that if creds_are_invalid() can read the values on one CPU whilst
they're being modified on another CPU, and so can observe an evolving state in
which the subscribers count now is greater than the usage count a moment
before.
Switching the order in which the counts are read cannot help, so the thing to
do is to remove that particular check.
I had considered rechecking the values to see if they're in flux if the test
fails, but I can't guarantee they won't appear the same, even if they've
changed several times in the meantime.
Note that this can only happen if CONFIG_DEBUG_CREDENTIALS is enabled.
The problem is only likely to occur with multithreaded programs, and can be
tested by the tst-eintr1 program from glibc's "make check". The symptoms look
like:
The unpack routine fails to handle the decompress_method() returning
unrecognised decompressor (compress_name == NULL). This results in the
routine looping eventually oopsing on an out of bounds memory access.
Note this bug is usually hidden, only triggering on trailing junk after
one or more correct compressed blocks. The case of the compressed archive
being complete junk is (by accident?) caught by the if (state != Reset)
check because state is initialised to Start, but not updated due to the
decompressor not having been called. Obviously if the junk is trailing a
correctly decompressed buffer, state == Reset from the previous call to
the decompressor.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk> Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ext4_fiemap() rounds the length of the requested range down to
blocksize, which is is not the true number of blocks that cover the
requested region. This problem is especially impressive if the user
requests only the first byte of a file: not a single extent will be
reported.
We fix this by calculating the last block of the region and then
subtract to find the number of blocks in the extents.
Most drives from Seagate, Hitachi, and possibly other brands,
do not allow LBA28 access to sector number 0x0fffffff (2^28 - 1).
So instead use LBA48 for such accesses.
This bug could bite a lot of systems, especially when the user has
taken care to align partitions to 4KB boundaries. On misaligned systems,
it is less likely to be encountered, since a 4KB read would end at
0x10000000 rather than at 0x0fffffff.
Signed-off-by: Mark Lord <mlord@pobox.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If the firmware puts a device back into D0 state at resume time, we'll
update its state in resume_noirq and thus skip the platform resume code.
Calling that code twice should be safe and we ought to avoid getting to
that point anyway, so remove the check and also allow the platform pci
code to be called for D0.
Fixes USB not being powered after resume on recent Lenovo machines.
Acked-by: Alex Chiang <achiang@canonical.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
BugLink: https://launchpad.net/bugs/549267
The OR verified that using the olpc-xo-1_5 model quirk allows the
headphones to be audible when inserted into the jack. Capture was
also verified to work correctly.
Reported-by: Richard Gagne Tested-by: Richard Gagne Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
BugLink: https://launchpad.net/bugs/573284
The OR verified that using the olpc-xo-1_5 model quirk allows the
headphones to be audible when inserted into the jack. Capture was
also verified to work correctly.
Reported-by: Andy Couldrake <acouldrake@googlemail.com> Tested-by: Andy Couldrake <acouldrake@googlemail.com> Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
BugLink: https://launchpad.net/bugs/541802
The OR's hardware distorts at PCM 100% because it does not correspond to
0 dB. Fix this in patch_cxt5045() for all Packard Bell models.
Reported-by: Valombre Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Ignore spurious HV interrupts during suspend / resume, this avoids
mistaking them for a mute button press. This is not very pretty but
it seems the only way to fix the master volume control gets muted
after suspend issue I'm seeing. Note that the es1968 driver is doing
exactly the same.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Without this quirk sound stops working after suspend resume. With this quirk,
one still needs to manually unmute the master volume control after a suspend /
/ resume cycle. That is fixed in another patch in this set.
Note that this patch was submitted to the alsa bug tracker a long time ago:
https://bugtrack.alsa-project.org/alsa-bug/view.php?id=4319
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
BugLink: https://launchpad.net/bugs/567494
The OR has verified that the existing model quirk, ALC880_UNIWILL,
is insufficient for audible playback and capture by default. Instead,
the ALC880_F1734 model quirk needs to be used.
This change is necessary for both 2.6.32.11 and 2.6.33.2.
BugLink: https://launchpad.net/bugs/568600
The OR has verified that the dell-m6 model quirk is necessary for audio
to be audible by default on the Dell Studio XPS 1645.
This change is necessary for 2.6.32.11 and 2.6.33.2 alike.
Reported-by: Andy Ross <andy@plausible.org> Tested-by: Andy Ross <andy@plausible.org> Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
BugLink: https://launchpad.net/bugs/553002
The OR has verified that the dell-m6 model quirk is necessary for audio
to be audible by default on the Dell Studio XPS 1645.
This change is necessary for 2.6.32.11 and 2.6.33.2 alike.
Reported-by: Robert Chambers Tested-by: Robert Chambers Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
BugLink: https://launchpad.net/bugs/459083
The OR has verified with 2.6.32.11 and the latest alsa-driver stable
daily snapshot that position_fix=1 is necessary for the external mic
to work and for PulseAudio not to crash constantly.
This patch is necessary also for 2.6.32.11 and 2.6.33.2.
Reported-by: <imwithid@yahoo.com> Tested-by: <imwithid@yahoo.com> Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
One side effect of it is that num_k8_northbridges
is not initialized anymore if not explicitly
called. This resulted in uninitialized pointers in
<arch/x86/kernel/cpu/intel_cacheinfo.c:amd_calc_l3_indices()>,
for example, which uses the num_k8_northbridges thing through
node_to_k8_nb_misc().
Fix that through an initcall that runs right after the PCI
subsystem and does all the scanning. Then, remove initialization
in gart_iommu_init() which is a rootfs_initcall and we're
running before that.
What is more, since num_k8_northbridges is being used in other
places beside GART IOMMU, include it whenever we add AMD CPU
support. The previous dependency chain in kconfig contained
K8_NB depends on AGP_AMD64|GART_IOMMU
which was clearly incorrect. The more natural way in terms of
hardware dependency should be
AGP_AMD64|GART_IOMMU depends on K8_NB depends on CPU_SUP_AMD &&
PCI. Make it so Number One!
Upstream PV guests fail to boot because of a NULL pointer in
irq_force_complete_move(). It is possible that xen guests have
irq_desc->chip_data = NULL.
Test for NULL chip_data pointer before attempting to complete an irq move.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
LKML-Reference: <20100427152434.16193.49104.sendpatchset@prarit.bos.redhat.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
"If software clears the PS (page size) bit in a present PDE (page
directory entry), that will cause linear addresses mapped through this
PDE to use 4-KByte pages instead of using a large page after old TLB
entries are invalidated. Due to this erratum, if a code fetch uses
this PDE before the TLB entry for the large page is invalidated then
it may fetch from a different physical address than specified by
either the old large page translation or the new 4-KByte page
translation. This erratum may also cause speculative code fetches from
incorrect addresses."
Where as commit 211b3d03c7400f48a781977a50104c9d12f4e229 seems to
workaround errata AAH41 (mixed 4K TLBs) it reduces the window of
opportunity for the bug to occur and does not totally remove it. This
patch disables mixed 4K/4MB page tables totally avoiding the page
splitting and not tripping this processor issue.
This is based on an original patch by Colin King.
Originally-by: Colin Ian King <colin.king@canonical.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
LKML-Reference: <1269271251-19775-1-git-send-email-colin.king@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When we do a thread switch, we clear the outgoing FS/GS base if the
corresponding selector is nonzero. This is taken by __switch_to() as
an entry invariant; it does not verify that it is true on entry.
However, copy_thread() doesn't enforce this constraint, which can
result in inconsistent results after fork().
Make copy_thread() match the behavior of __switch_to().
Reported-and-tested-by: Samuel Thibault <samuel.thibault@inria.fr> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <4BD1E061.8030605@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Hans de Goede identified a bug in p54p_check_tx_ring:
there are two ring indices. 1 => tx data and 3 => tx management.
But the old code had a constant "1" and this resulted in spurious
dma unmapping failures.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=583623 Bug-Identified-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Use correct bit positions in DM_SHARED_CTRL register for writes.
Michael Planes recently encountered a 'KY-RS9600 USB-LAN converter', which
came with a driver CD containing a Linux driver. This driver turns out to
be a copy of dm9601.c with symbols renamed and my copyright stripped.
That aside, it did contain 1 functional change in dm_write_shared_word(),
and after checking the datasheet the original value was indeed wrong
(read versus write bits).
On Michaels HW, this change bumps receive speed from ~30KB/s to ~900KB/s.
On other devices the difference is less spectacular, but still significant
(~30%).
Reported-by: Michael Planes <michael.planes@free.fr> Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It changes user-visible sysfs interfaces, and breaks some existing user
space applications which apparently rely on the fact that the output
does not contain the "0x" prefix.
blk_rq_timed_out_timer() relied on blk_add_timer() never returning a
timer value of zero, but commit 7838c15b8dd18e78a523513749e5b54bda07b0cb
removed the code that bumped this value when it was zero.
Therefore when jiffies is near wrap we could get unlucky & not set the
timeout value correctly.
This patch uses a flag to indicate that the timeout value was set and so
handles jiffies wrap correctly, and it keeps all the logic in one
function so should be easier to maintain in the future.