This patch (as1097) fixes a bug in the remote-wakeup handling in
ehci-hcd. The driver currently does not keep track of whether the
change-suspend feature is enabled for each port; the feature is
automatically reset the first time it is read. But recent changes to
the hub driver require that the feature be read at least twice in
order to work properly.
A bit-vector is added for storing the change-suspend feature values.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
My laptop thinks that it's a good idea to give -73C as the critical
CPU temperature.... which isn't the best thing since it causes a shutdown
right at bootup.
Temperatures below freezing are clearly invalid critical thresholds
so just reject these as such.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Current versions of gdb require a working implementation of
PTRACE_GETSIGINFO for proper watchpoint support. Since struct siginfo
contains pointers it must be converted when passed to a 32-bit debugger.
Signed-off-by: Andreas Schwab <schwab@suse.de> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Hardware encryption doesn't work yet so lets use software
encryption for now.
Changes-licensed-under: 3-Clause-BSD
Signed-off-by: Luis R. Rodriguez <mcgrof@winlab.rutgers.edu> Signed-off-by: John W. Linville <linville@tuxdriver.com> Cc: Jiri Benc <jbenc@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If acpi_install_notify_handler() for a bay device fails, the bay driver is
superfluous. Most likely, another driver (like libata) is already caring
about this device anyway. Furthermore,
register_hotplug_dock_device(acpi_handle) from the dock driver must not be
called twice with the same handler. This would result in an endless loop
consuming 100% of CPU. So clean up and exit.
Signed-off-by: Holger Macht <hmacht@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When using the HIDP or BNEP kernel support, the user-space needs to
know if the connection has been terminated for some reasons. Wake up
the application if that happens. Otherwise kernel and user-space are
no longer on the same page and weird behaviors can happen.
I received a complaint that some FAT formated medias (e.g. sd memory cards)
trigger a "unknown partition table" message even though there is no partition
table and they work correctly, while in general (when e.g. formated with
mkdosfs or even Windows Vista) this message is not shown.
Currently this seems only to happen when the medias get formatted with Windows
XP (and possibly Win 2000). Then the boot indicator byte contains garbage
(part of text message) and so do the other parts checked by msdos_paritition
which then later triggers this message.
References: novell bug #364365
Most fat formatted media without partition table contains zeros in the boot
indication and the other tested bytes and so falls through the checks in
msdos_partition, leading it to return with 1 (all is fine).
But some (e.g. WinXP formatted) fat fomated medias don't use boot_ind and so
the check fails and causes a "unkown partition table" warning eventhough there
is none and everything would be fine.
This additional check directly verifies if there is a fat formatted medium
without a partition table.
Signed-off-by: Frank Seidel <fseidel@suse.de> Cc: Andreas Dilger <adilger@sun.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The on-disk media specification field in FAT is only 8-bits, so testing for
<=0xff is pointless, and can generate a "comparison is always true due to
limited range of data type" warning.
While we're there, convert FAT_VALID_MEDIA() into a C function - the present
implementation is buggy: it generates either one or two references to its
argument.
Cc: Frank Seidel <fseidel@suse.de> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fujitsu Siemens Amilo Pro V2030 needs nomux table entry, in addition to
already existing entry for V2010 model (note that Fujitsu-Siemens changed
the capitalization in the DMI data for product).
This patch introduces i8042_dmi_nopnp_table to make it possible to perform
DMI matches for systems that need 'i8042.nopnp' to work correctly, and
introduces such an entry for Intel D845PESV -- this system doesn't
detect PS2 mouse reliably without this option, as reported by Robert
Lewis.
[dtor@mail.ru - make it compile if CONFIG_PNP is off - reported
by Randy Dunlap]
There are several cases where the running transaction can get buffers added to
its BJ_Metadata list which it never dirtied, which makes its t_nr_buffers
counter end up larger than its t_outstanding_credits counter.
This will cause issues when starting new transactions as while we are logging
buffers we decrement t_outstanding_buffers, so when t_outstanding_buffers goes
negative, we will report that we need less space in the journal than we
actually need, so transactions will be started even though there may not be
enough room for them. In the worst case scenario (which admittedly is almost
impossible to reproduce) this will result in the journal running out of space.
The fix is to only
refile buffers from the committing transaction to the running transactions
BJ_Modified list when b_modified is set on that journal, which is the only way
to be sure if the running transaction has modified that buffer.
This patch also fixes an accounting error in journal_forget, it is possible
that we can call journal_forget on a buffer without having modified it, only
gotten write access to it, so instead of freeing a credit, we only do so if
the buffer was modified. The assert will help catch if this problem occurs.
Without these two patches I could hit this assert within minutes of running
postmark, with them this issue no longer arises. Thank you,
Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Acked-by: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
journal_try_to_free_buffers() could race with jbd commit transaction when
the later is holding the buffer reference while waiting for the data
buffer to flush to disk. If the caller of journal_try_to_free_buffers()
request tries hard to release the buffers, it will treat the failure as
error and return back to the caller. We have seen the directo IO failed
due to this race. Some of the caller of releasepage() also expecting the
buffer to be dropped when passed with GFP_KERNEL mask to the
releasepage()->journal_try_to_free_buffers().
With this patch, if the caller is passing the __GFP_WAIT and __GFP_FS to
indicating this call could wait, in case of try_to_free_buffers() failed,
let's waiting for journal_commit_transaction() to finish commit the
current committing transaction, then try to free those buffers again.
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Mingming Cao <cmm@us.ibm.com> Reviewed-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Currently at the start of a journal commit we loop through all of the buffers
on the committing transaction and clear the b_modified flag (the flag that is
set when a transaction modifies the buffer) under the j_list_lock.
The problem is that everywhere else this flag is modified only under the jbd
lock buffer flag, so it will race with a running transaction who could
potentially set it, and have it unset by the committing transaction.
This is also a big waste, you can have several thousands of buffers that you
are clearing the modified flag on when you may not need to. This patch
removes this code and instead clears the b_modified flag upon entering
do_get_write_access/journal_get_create_access, so if that transaction does
indeed use the buffer then it will be accounted for properly, and if it does
not then we know we didn't use it.
That will be important for the next patch in this series. Tested thoroughly
by myself using postmark/iozone/bonnie++.
Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Acked-by: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In case when both EEXIST and EROFS would apply we used to
return the former in mkdir(2) and friends. Lest anyone suspects
us of being consistent, in the same situation knfsd gave clients
nfs_erofs...
ro-bind series had switched the syscall side of things to
returning -EROFS and immediately broke an application - namely,
mkdir -p. Patch restores the original behaviour...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Newer Dell CERC firmware (>= 6.62) implement a random deletion handling
compatible with the legacy megaraid driver. The legacy handling shifted
the target ID by 0x80 only for I/O commands (READ/WRITE/etc), whereas
megaraid_mbox shifts the target ID always if random deletion is supported.
The resulted in megaraid_mbox sending an INQUIRY to the wrong channel, and
not finding any devices, obviously.
So we disable the random deletion support if the offending firmware is
found.
Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Bo Yang <Bo.Yang@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We zero-fill them like we are supposed to, and that's all fine. It's
only an error if the 'romfs_copyfrom()' routine isn't able to fill the
data that is supposed to be there.
Most of the patch is really just re-organizing the code a bit, and using
separate variables for the error value and for how much of the page we
actually filled from the filesystem.
Reported-and-tested-by: Chris Fester <cfester@wms.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Matt Waddel <matt.waddel@freescale.com> Cc: Greg Ungerer <gerg@snapgear.com> Signed-of-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The iov_iter_advance() function would look at the iov->iov_len entry
even though it might have iterated over the whole array, and iov was
pointing past the end. This would cause DEBUG_PAGEALLOC to trigger a
kernel page fault if the allocation was at the end of a page, and the
next page was unallocated.
The quick fix is to just change the order of the tests: check that there
is any iovec data left before we check the iov entry itself.
Thanks to Alexey Dobriyan for finding this case, and testing the fix.
Reported-and-tested-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When a conntrack entry is destroyed in process context and destruction
is interrupted by packet processing and the packet is an attempt to
reopen a closed connection, TCP conntrack tries to kill the old entry
itself and returns NF_REPEAT to pass the packet through the hook
again. This may lead to an endless loop: TCP conntrack repeatedly
finds the old entry, but can not kill it itself since destruction
is already in progress, but destruction in process context can not
complete since TCP conntrack is keeping the CPU busy.
Drop the packet in TCP conntrack if we can't kill the connection
ourselves to avoid this.
Reported by: hemao77@gmail.com [ Kernel bugzilla #11058 ] Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Gibson [Fri, 18 Jul 2008 05:55:49 +0000 (15:55 +1000)]
Correct hash flushing from huge_ptep_set_wrprotect()
Correct hash flushing from huge_ptep_set_wrprotect() [stable tree version]
A fix for incorrect flushing of the hash page table at fork() for
hugepages was recently committed as 86df86424939d316b1f6cfac1b6204f0c7dee317. Without this fix, a process
can make a MAP_PRIVATE hugepage mapping, then fork() and have writes
to the mapping after the fork() pollute the child's version.
Unfortunately this bug also exists in the stable branch. In fact in
that case copy_hugetlb_page_range() from mm/hugetlb.c calls
ptep_set_wrprotect() directly, the hugepage variant hook
huge_ptep_set_wrprotect() doesn't even exist.
The patch below is a port of the fix to the stable25/master branch.
It introduces a huge_ptep_set_wrprotect() call, but this is #defined
to be equal to ptep_set_wrprotect() unless the arch defines its own
version and sets __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT.
This arch preprocessor flag is kind of nasty, but it seems the sanest
way to introduce this fix with minimum risk of breaking other archs
for whom prep_set_wprotect() is suitable for hugepages.
Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
MSI is a nice thing, but we cannot enable it without changing the
interrupt handler. If we do it, we break MSI capable hardware,
specifically AR5006 chipset.
Signed-off-by: Pavel Roskin <proski@gnu.org> Acked-by: Nick Kossifidis <mickflemm@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The mutex is released on a successful return, so it would seem that it
should be released on an error return as well.
The semantic patch finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@@
expression l;
@@
mutex_lock(l);
.. when != mutex_unlock(l)
when any
when strict
(
if (...) { ... when != mutex_unlock(l)
+ mutex_unlock(l);
return ...;
}
|
mutex_unlock(l);
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Ingo Molnar provided a fix to not call _PPC at processor driver
initialization time in "[PATCH] ACPI: fix cpufreq regression" (git
commit e4233dec749a3519069d9390561b5636a75c7579)
But it can still happen that _PPC is called at processor driver
initialization time.
This patch should make sure that this is not possible anymore.
Signed-off-by: Thomas Renninger <trenn@suse.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Len Brown <lenb@kernel.org> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Eric Sandeen [Tue, 29 Jul 2008 02:50:12 +0000 (02:50 +0000)]
eCryptfs: use page_alloc not kmalloc to get a page of memory
commit 7fcba054373d5dfc43d26e243a5c9b92069972ee upstream
Date: Mon, 28 Jul 2008 15:46:39 -0700
Subject: eCryptfs: use page_alloc not kmalloc to get a page of memory
With SLUB debugging turned on in 2.6.26, I was getting memory corruption
when testing eCryptfs. The root cause turned out to be that eCryptfs was
doing kmalloc(PAGE_CACHE_SIZE); virt_to_page() and treating that as a nice
page-aligned chunk of memory. But at least with SLUB debugging on, this
is not always true, and the page we get from virt_to_page does not
necessarily match the PAGE_CACHE_SIZE worth of memory we got from kmalloc.
My simple testcase was 2 loops doing "rm -f fileX; cp /tmp/fileX ." for 2
different multi-megabyte files. With this change I no longer see the
corruption.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Acked-by: Michael Halcrow <mhalcrow@us.ibm.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The ixgbe driver was untested with device ID 8086:10c8 but still advertises
support. Currently if this device is present in the system when the driver
is loaded, the system will panic.
Remove this device ID until full support can be tested with available
hardware. This patch is necessary for 2.6.24, 2.6.25 and 2.6.26
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Paul pointed out two incorrect read barriers in the marker handler code in
the path where multiple probes are connected. Those are ordering reads of
"ptype" (single or multi probe marker), "multi" array pointer, and "multi"
array data access.
It should be ordered like this :
read ptype
smp_rmb()
read multi array pointer
smp_read_barrier_depends()
access data referenced by multi array pointer
The code with a single probe connected (optimized case, does not have to
allocate an array) has correct memory ordering.
It applies to kernel 2.6.26.x, 2.6.25.x and linux-next.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The block transfer routine in the mpc52xx psc spi driver misinterpret
the datasheet. According to the processor datasheet the chipselect is
held as long as the EOF is not written.
Theoretically blocks of any sizes can be transferred in this way. The
old routine however writes an EOF after every word, which has the size
of size_of_word. This makes the transfer slow.
Also fixed some duplicate code.
Signed-off-by: Luotao Fu <l.fu@pengutronix.de> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
SuSE's insserve initscript ordering program hits kernel BUG at mm/shmem.c:814
on 2.6.26. It's using posix_fadvise on directories, and the shmem_readpage
method added in 2.6.23 is letting POSIX_FADV_WILLNEED allocate useless pages
to a tmpfs directory, incrementing i_blocks count but never decrementing it.
Fix this by assigning shmem_aops (pointing to readpage and writepage and
set_page_dirty) only when it's needed, on a regular file or a long symlink.
Many thanks to Kel for outstanding bugreport and steps to reproduce it.
Reported-by: Kel Modderman <kel@otaku42.de> Tested-by: Kel Modderman <kel@otaku42.de> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[PATCH] inode-diet: Eliminate i_blksize from the inode structure
caused the block size used by pseudo-filesystems to decrease from
PAGE_SIZE to 1024 leading to a doubling of the number of context switches
during a kernbench run.
Signed-off-by: Alex Nixon <Alex.Nixon@citrix.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ian Campbell <Ian.Campbell@eu.citrix.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Hugh Dickins <hugh@veritas.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I think that hda_verb array must have "terminator (empty array)".
But alc262_sony_unsol[] does not have it.
And it causes gcc-4.3's buggy behavior
with snd_hda_sequence_write().
arm's fls() is implemented as a macro, causing it to misbehave when passed
64-bit arguments. Fix.
Cc: Nickolay Vinogradov <nickolay@protei.ru> Tested-by: Krzysztof Halasa <khc@pm.waw.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When a kernel was rebuilt, the previous Module.markers was not cleared.
It caused markers with different format strings to appear as duplicates
when a markers was changed. This problem is present since
scripts/mod/modpost.c started to generate Module.markers, commit b2e3e658b344c6bcfb8fb694100ab2f2b5b2edb0
It therefore applies to 2.6.25, 2.6.26 and linux-next.
I merely merged the patches from Roland, Wenji and Takashi here.
Credits to
Roland McGrath <roland@redhat.com>
Wenji Huang <wenji.huang@oracle.com>
and
Takashi Nishiie <t-nishiie@np.css.fujitsu.com>
for providing the individual fixes.
- Changelog :
- Integrated Takashi's Makefile modification to clear Module.markers upon
make clean.
A couple of distributions (Fedora, Ubuntu) were having weird problems with the
ATI IXP series PATA controllers being reported as simplex. At the heart of
the problem is that both distros ignored the recommendations to load pata_acpi
and ata_generic *AFTER* specific host drivers.
The underlying cause however is that if you D3 and then D0 an ATI IXP it
helpfully throws away some configuration and won't let you rewrite it.
Add checks to ata_generic and pata_acpi to pin ATIIXP devices. Possibly the
real answer here is to quirk them and pin them, but right now we can't do that
before they've been pcim_enable()'d by a driver.
I'm indebted to David Gero for this. His bug report not only reported the
problem but identified the cause correctly and he had tested the right values
to prove what was going on
[If you backport this for 2.6.24 you will need to pull in the 2.6.25
removal of the bogus WARN_ON() in pcim_enagle]
Signed-off-by: Alan Cox <alan@redhat.com> Tested-by: David Gero <davidg@havidave.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
No straightforward fix exists because we want to
enable these IRQs and setup state atomically before
getting into the IRQ handler the first time.
What happens now is that we mark the VIRQ to not be
automatically enabled by request_irq(). Then we
make explicit enable_irq() calls when we grab the
LDC channel.
This way we don't need to call request_irq() illegally
under the LDC channel lock any more.
Bump LDC version and release date.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is based upon an excellent bug report from Eric Dumazet.
tcp_ack() should clear ->icsk_probes_out even if there are packets
outstanding. Otherwise if we get a sequence of ACKs while we do have
packets outstanding over and over again, we'll never clear the
probes_out value and eventually think the connection is too sick and
we'll reset it.
This appears to be some "optimization" added to tcp_ack() in the 2.4.x
timeframe. In 2.2.x, probes_out is pretty much always cleared by
tcp_ack().
Here is Eric's original report:
----------------------------------------
Apparently, we can in some situations reset TCP connections in a couple of seconds when some frames are lost.
In order to reproduce the problem, please try the following program on linux-2.6.25.*
Setup some iptables rules to allow two frames per second sent on loopback interface to tcp destination port 12000
iptables -N SLOWLO
iptables -A SLOWLO -m hashlimit --hashlimit 2 --hashlimit-burst 1 --hashlimit-mode dstip --hashlimit-name slow2 -j ACCEPT
iptables -A SLOWLO -j DROP
iptables -A OUTPUT -o lo -p tcp --dport 12000 -j SLOWLO
Then run the attached program and see the output :
# ./loop
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 40 127.0.0.1:54455 127.0.0.1:12000 timer:(persist,200ms,1)
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 40 127.0.0.1:54455 127.0.0.1:12000 timer:(persist,200ms,3)
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 40 127.0.0.1:54455 127.0.0.1:12000 timer:(persist,200ms,5)
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 40 127.0.0.1:54455 127.0.0.1:12000 timer:(persist,200ms,7)
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 40 127.0.0.1:54455 127.0.0.1:12000 timer:(persist,200ms,9)
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 40 127.0.0.1:54455 127.0.0.1:12000 timer:(persist,200ms,11)
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 40 127.0.0.1:54455 127.0.0.1:12000 timer:(persist,201ms,13)
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 40 127.0.0.1:54455 127.0.0.1:12000 timer:(persist,188ms,15)
write(): Connection timed out
wrote 890 bytes but was interrupted after 9 seconds
ESTAB 0 0 127.0.0.1:12000 127.0.0.1:54455
Exiting read() because no data available (4000 ms timeout).
read 860 bytes
While this tcp session makes progress (sending frames with 50 bytes of payload, every 500ms), linux tcp stack decides to reset it, when tcp_retries 2 is reached (default value : 15)
tcpdump :
15:30:28.856695 IP 127.0.0.1.56554 > 127.0.0.1.12000: S 33788768:33788768(0) win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
15:30:28.856711 IP 127.0.0.1.12000 > 127.0.0.1.56554: S 33899253:33899253(0) ack 33788769 win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
15:30:29.356947 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 1:61(60) ack 1 win 257
15:30:29.356966 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 61 win 257
15:30:29.866415 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 61:111(50) ack 1 win 257
15:30:29.866427 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 111 win 257
15:30:30.366516 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 111:161(50) ack 1 win 257
15:30:30.366527 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 161 win 257
15:30:30.876196 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 161:211(50) ack 1 win 257
15:30:30.876207 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 211 win 257
15:30:31.376282 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 211:261(50) ack 1 win 257
15:30:31.376290 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 261 win 257
15:30:31.885619 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 261:311(50) ack 1 win 257
15:30:31.885631 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 311 win 257
15:30:32.385705 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 311:361(50) ack 1 win 257
15:30:32.385715 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 361 win 257
15:30:32.895249 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 361:411(50) ack 1 win 257
15:30:32.895266 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 411 win 257
15:30:33.395341 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 411:461(50) ack 1 win 257
15:30:33.395351 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 461 win 257
15:30:33.918085 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 461:511(50) ack 1 win 257
15:30:33.918096 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 511 win 257
15:30:34.418163 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 511:561(50) ack 1 win 257
15:30:34.418172 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 561 win 257
15:30:34.927685 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 561:611(50) ack 1 win 257
15:30:34.927698 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 611 win 257
15:30:35.427757 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 611:661(50) ack 1 win 257
15:30:35.427766 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 661 win 257
15:30:35.937359 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 661:711(50) ack 1 win 257
15:30:35.937376 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 711 win 257
15:30:36.437451 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 711:761(50) ack 1 win 257
15:30:36.437464 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 761 win 257
15:30:36.947022 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 761:811(50) ack 1 win 257
15:30:36.947039 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 811 win 257
15:30:37.447135 IP 127.0.0.1.56554 > 127.0.0.1.12000: P 811:861(50) ack 1 win 257
15:30:37.447203 IP 127.0.0.1.12000 > 127.0.0.1.56554: . ack 861 win 257
15:30:41.448171 IP 127.0.0.1.12000 > 127.0.0.1.56554: F 1:1(0) ack 861 win 257
15:30:41.448189 IP 127.0.0.1.56554 > 127.0.0.1.12000: R 33789629:33789629(0) win 0
Source of program :
/*
* small producer/consumer program.
* setup a listener on 127.0.0.1:12000
* Forks a child
* child connect to 127.0.0.1, and sends 10 bytes on this tcp socket every 100 ms
* Father accepts connection, and read all data
*/
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <stdio.h>
#include <time.h>
#include <sys/poll.h>
int port = 12000;
char buffer[4096];
int main(int argc, char *argv[])
{
int lfd = socket(AF_INET, SOCK_STREAM, 0);
struct sockaddr_in socket_address;
time_t t0, t1;
int on = 1, sfd, res;
unsigned long total = 0;
socklen_t alen = sizeof(socket_address);
pid_t pid;
Due to the addition of __attribute__((__cold__)) to a few symbols
without adjusting the linker scripts, those symbols currently may end
up outside the [_stext,_etext) range, as they get placed in
.text.unlikely by (at least) gcc 4.3.0. This may confuse code not only
outside of the kernel, symbol_put_addr()'s BUG() could also trigger.
Hence we need to add .text.unlikely (and for future uses of
__attribute__((__hot__)) also .text.hot) to the TEXT_TEXT() macro.
Issue observed by Lukas Lipavsky.
Signed-off-by: Jan Beulich <jbeulich@novell.com> Tested-by: Lukas Lipavsky <llipavsky@suse.cz> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
currently if you use PTRACE_SINGLEBLOCK on AMD K6-3 (i586) it will crash.
Kernel now wrongly assumes existing DEBUGCTLMSR MSR register there.
Removed the assumption also for some other non-K6 CPUs but I am not sure there
(but it can only bring small inefficiency there if my assumption is wrong).
Based on info from Roland McGrath, Chuck Ebbert and Mikulas Patocka.
More info at:
https://bugzilla.redhat.com/show_bug.cgi?id=456175
Some iso9660 images contain files with rockridge data that is either
incorrect or incompletely parsed. Prior to commit f2966632a134e865db3c819346a1dc7d96e05309 ("[PATCH] rock: handle directory
overflows") (included with kernel 2.6.13) the kernel ignored the rockridge
data for these files, while still allowing the files to be accessed under
their non-rockridge names. That commit inadvertently changed things so
that files with invalid rockridge data could not be accessed at all. (I
ran across the problem when comparing some old CDs with hard disk copies I
had made long ago under kernel 2.4: a few of the files on the hard disk
copies were no longer visible on the CDs.)
This change reverts to the pre-2.6.13 behavior.
Signed-off-by: Adam Greenblatt <adam.greenblatt@gmail.com> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When quota structure is going to be dropped and it is dirty, quota code tries
to write it. If the write fails for some reason (e. g. transaction cannot
be started because the journal is aborted), we try writing again and again and
again... Fix the problem by clearing the dirty bit even if the write failed.
(akpm: for 2.6.27, 2.6.26.x and 2.6.25.x)
Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: dingdinghua <dingdinghua85@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch clamps the cscov setsockopt values to a maximum of 0xFFFF.
Setsockopt values greater than 0xffff can cause an unwanted
wrap-around. Further, IPv6 jumbograms are not supported (RFC 3838,
3.5), so that values greater than 0xffff are not even useful.
Further changes: fixed a typo in the documentation.
[ Add USHORT_MAX from upstream to linux/kernel.h -DaveM ]
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When generating the ip header for the transformed packet we just copy
the frag_off field of the ip header from the original packet to the ip
header of the new generated packet. If we receive a packet as a chain
of fragments, all but the last of the new generated packets have the
IP_MF flag set. We have to mask the frag_off field to only keep the
IP_DF flag from the original packet. This got lost with git commit 36cf9acf93e8561d9faec24849e57688a81eb9c5 ("[IPSEC]: Separate
inner/outer mode processing on output")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This trivial patch restores correct behavior, and applies to current
Linus tree (should also be applied to stable tree as well.)
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We need to unshare the skb first as otherwise pskb_may_pull may
write to a shared skb which could be bad.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The length field in the PPPOE header wasn't checked completely.
This patch causes all packets shorter than the declared length
to be dropped.
It also changes the memcpy_toiovec call to skb_copy_datagram_iovec
so that paged packets (rare for PPPOE) are handled properly.
Thanks to Ilja of the Netric Security Team for discovering and
reporting this bug, and Chris Wright for the total_len check.
[ Incorporate warning fix from Stephen Hemminger. -DaveM ]
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch fixes a potential memory corruption in
pppol2tp_recvmsg(). If skb->len is bigger than the caller's buffer
length, memcpy_toiovec() will go into unintialized data on the kernel
heap, interpret it as an iovec and start modifying memory.
The fix is to change the memcpy_toiovec() call to
skb_copy_datagram_iovec() so that paged packets (rare for PPPOL2TP)
are handled properly. Also check that the caller's buffer is big
enough for the data and set the MSG_TRUNC flag if it is not so.
Reported-by: Ilja <ilja@netric.org> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This fixes the bridge reference count problem and cleanups ipv6 FIB
timer management. Don't use expires field, because it is not a proper
way to test, instead use timer_pending().
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is a trivial patch against the hdlcdrv module that fixes its CRC
calculation. The finished CRC was overwriting the first two bytes of
each packet rather than being appended to the end.
I've tested this with 2.6.8 and 2.6.10-rc1, but hdlcdrv hasn't changed
much recently so it should work with many other kernel versions.
Signed-off-by: Micah Dowty <micah@navi.cx> Acked-by: Thomas Sailer <t.sailer@alumni.ethz.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Unfortunately this allows the task to migrate after setting up the softirq
and raising it. Since softirqs run a queue that is per-cpu we may raise the
softirq on the wrong CPU and this will keep the queued softirq task from
running.
To solve this issue, this patch disables preemption around the releasing
of the hrtimer lock and raising of the softirq.
Even the newer ENE controllers have bugs in their DMA engine that make
it too dangerous to use. Disable it until someone has figured out under
which conditions it corrupts data.
This has caused problems at least once, and can be found as bug report
10925 in the kernel bugzilla.
The pxa27x DMA controller defaults to 64-bit alignment. This caused
the SCR reads to fail (and, depending on card type, error out) when
card->raw_scr was not aligned on a 8-byte boundary.
For performance reasons all scatter-gather addresses passed to
pxamci_request should be aligned on 8-byte boundaries, but if
this can't be guaranteed, byte aligned DMA transfers in the
have to be enabled in the controller to get correct behaviour.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is dma_mask in of_device upon of_platform_device_create()
but we don't actually set coherent_dma_mask. This may cause weird
behavior of USB subsystem using of_device USB host drivers.
The problem here is that if the ioc faults too early in the bring up
sequence (as it usually does for an irq routing problem), ioc_reset gets
called before the scsi host is even allocated. This causes an oops when
it later schedules a renegotiation. Fix this by checking ioc->sh before
trying to renegotiate.
Cc: Eric Moore <Eric.Moore@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I found a very old patch for this that was Acked but did not get applied
https://lists.linux-foundation.org/pipermail/kernel-janitors/2006-September/016362.html
There looks to be a small leak in isdn_writebuf_stub() in isdn_common.c, when
copy_from_user() returns an un-copied data length (length != 0). The below
patch should be a minimally invasive fix.
Signed-off-by: Darren Jenkins <darrenrjenkins@gmail.com> Acked-by: Karsten Keil <kkeil@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch is a bugfix for how defio handles multiple processes manipulating
the same framebuffer.
Thanks to Bernard Blackham for identifying this bug.
It occurs when two applications mmap the same framebuffer and concurrently
write to the same page. Normally, this doesn't occur since only a single
process mmaps the framebuffer. The symptom of the bug is that the mapping
applications will hang. The cause is that defio incorrectly tries to add the
same page twice to the pagelist. The solution I have is to walk the pagelist
and check for a duplicate before adding. Since I needed to walk the pagelist,
I now also keep the pagelist in sorted order.
Signed-off-by: Jaya Kumar <jayakumar.lkml@gmail.com> Cc: Bernard Blackham <bernard@largestprime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I had 8250.nr_uarts=16 in the boot line of a test kernel and I had a weird
mysterious crash in sysfs. After taking an in-depth look I realized that
CONFIG_SERIAL_8250_NR_UARTS was set to 4 and I was walking off the end of
the serial8250_ports array.
Ouch!!!
Don't let this happen to someone else.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Cortland Setlow pointed out a bug in ov7670.c where the result from
ov7670_read() was just being checked for !0, rather than <0. This made me
realize that ov7670_read's semantics were rather confusing; it both fills
in 'value' with the result, and returns it. This is goes against general
kernel convention; so rather than fixing callers, let's fix the function.
This makes ov7670_read return <0 in the case of an error, and 0 upon
success. Thus, code like:
res = ov7670_read(...);
if (!res)
goto error;
.will work properly.
Signed-off-by: Cortland Setlow <csetlow@tower-research.com> Signed-off-by: Andres Salomon <dilinger@debian.org> Acked-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The current definition of wksidarr works fine on little endian arches
(since cpu_to_le32 is a no-op there), but on big-endian arches, it fails
to compile with this error:
error: braced-group within expression allowed only inside a function
The problem is that this static declaration has cpu_to_le32 embedded
within it, and that expands into a function macro. We need to use
__constant_cpu_to_le32() instead.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The IRQ rate reported back by the RTC is incorrect when HPET is enabled.
Newer hardware that has HPET to emulate the legacy RTC device gets this value
wrong since after it sets the rate, it returns before setting the variable
used to report the IRQ rate back to users of the device -- so the set rate and
the reported rate get out of sync.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Brownell <david-b@pacbell.net> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
"The register %ecx looks innocent but is very important here. The disassembly:
mov %edx,%ecx
shr $0x2,%ecx
rep stos %eax,%es:(%edi) <-- the fault
So %ecx has been loaded from %edx... which is 0x6b6b6b6b/POISON_FREE.
(0x6b6b6b6b >> 2 == 0x1adadada.)
%ecx is the counter for the memset, from here:
memset(object, 0, c->objsize);
i.e. %ecx was loaded from c->objsize, so "c" must have been freed.
Where did "c" come from? Uh-oh...
c = get_cpu_slab(s, smp_processor_id());
This looks like it has very much to do with CPU hotplug/unplug. Is
there a race between SLUB/hotplug since the CPU slab is used after it
has been freed?"
Good analysis.
Yeah, it's possible that a caller of kmem_cache_alloc() -> slab_alloc()
can be migrated on another CPU right after local_irq_restore() and
before memset(). The inital cpu can become offline in the mean time (or
a migration is a consequence of the CPU going offline) so its
'kmem_cache_cpu' structure gets freed ( slab_cpuup_callback).
At some point of time the caller continues on another CPU having an
obsolete pointer...
Kernel Bugzilla #11063 points out that on some architectures (e.g. x86_32)
exec'ing an ELF without a PT_GNU_STACK program header should default to an
executable stack; but this got broken by the unlimited argv feature because
stack vma is now created before the right personality has been established:
so breaking old binaries using nested function trampolines.
Therefore re-evaluate VM_STACK_FLAGS in setup_arg_pages, where stack
vm_flags used to be set, before the mprotect_fixup. Checking through
our existing VM_flags, none would have changed since insert_vm_struct:
so this seems safer than finding a way through the personality labyrinth.
I would like to inform you of our zd1211 based usb wifi adapter (AirTies
WUS-201), which works with the zd1211rw driver with the following device
id definition.
Signed-off-by: Daniel Drake <dsd@gentoo.org> Signed-off-by: John W. Linville <linville@tuxdriver.com> Cc: Peter Nixon <listuser@peternixon.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Lost connections was reported by Thomas Bätzler (running 2.6.25 kernel) on
the netfilter mailing list (see the thread "Weird nat/conntrack Problem
with PASV FTP upload"). He provided tcpdump recordings which helped to
find a long lingering bug in conntrack.
In TCP connection tracking, checking the lower bound of valid ACK could
lead to mark valid packets as INVALID because:
- We have got a "higher or equal" inequality, but the test checked
the "higher" condition only; fixed.
- If the packet contains a SACK option, it could occur that the ACK
value was before the left edge of our (S)ACK "window": if a previous
packet from the other party intersected the right edge of the window
of the receiver, we could move forward the window parameters beyond
accepting a valid ack. Therefore in this patch we check the rightmost
SACK edge instead of the ACK value in the lower bound of valid (S)ACK
test.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
The current logic has a bug which cannot find matching pattern, if the
pattern is matched from the first character of target string.
for example:
pattern=abc, string=abcdefg
pattern=a, string=abcdefg
Searching algorithm should return 0 for those things.
Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the dubious attempt to prefer 'compute' over 'read'. Not only is it
wrong given commit c337869d (md: do not compute parity unless it is on a failed
drive), but it can trigger a BUG_ON in handle_parity_checks5().
Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Even though the CAN netlayer only deals with CAN netdevices, the
netlayer interface to the userspace and to the device layer should
perform some sanity checks.
This patch adds several sanity checks that mainly prevent userspace apps
to send broken content into the system that may be misinterpreted by
some other userspace application.
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de> Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de> Acked-by: Andre Naujoks <nautsch@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As reported by Vipul Gandhi, the current serial_match_port() doesn't work
for tty-devices using dynamic major number allocation. Fix it.
It oopses if you suspend a serial port with _dynamic_ major number. ATM,
I think, there's only the drivers/serial/jsm/jsm_driver.c driver, that
does it in-tree.
This patch changes the way we determine the maximum number of outstanding
commands for each controller.
Most Smart Array controllers can support up to 1024 commands, the notable
exceptions are the E200 and E200i.
The next generation of controllers which were just added support a mode of
operation called Zero Memory Raid (ZMR). In this mode they only support
64 outstanding commands. In Full Function Raid (FFR) mode they support
1024.
We have been setting the queue depth by arbitrarily assigning some value
for each controller. We needed a better way to set the queue depth to
avoid lots of annoying "fifo full" messages. So we made the driver a
little smarter. We now read the config table and subtract 4 from the
returned value. The -4 is to allow some room for ioctl calls which are
not tracked the same way as io commands are tracked.
Please consider this for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
With the removal of struct file from the xattr code,
reiserfs_file_release() isn't used anymore, so the prealloc isn't
discarded. This causes hangs later down the line.
This patch adds it to reiserfs_delete_inode. In most cases it will be a
no-op due to it already having been called, but will avoid hangs with
xattrs.
The esp driver currently does hand rolled reference counting of its
target. It's much easier to do what it needs to do if it's plugged into
the mid-layer callbacks (target_alloc and target_destroy) which were
designed for this case, so do it this way and get rid of the internal
target reference count.
Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
OOPS reported by Friedrich Oslage <bluebird@porno-bullen.de>
The problem here is that tp->starget is set every time a lun
is allocated for a particular target so we can catch the
sdev_target parent value.
The reset handler uses the NULL'ness of this value to determine
which targets are active.
But esp_slave_destroy() does not NULL out this value when appropriate.
So for every target that doesn't respond, the SCSI bus scan causes
a stale pointer to be left here, with ensuing crashes like you're
seeing.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
-tip testing found that Johannes Berg's "softirq: remove irqs_disabled
warning from local_bh_enable" enhancement to lockdep triggers a new
warning on an old testbox that uses 3c59x vortex and netlogging:
----->
calling vortex_init+0x0/0xb0
PCI: Found IRQ 10 for device 0000:00:0b.0
PCI: Sharing IRQ 10 with 0000:00:0a.0
PCI: Sharing IRQ 10 with 0000:00:0b.1
3c59x: Donald Becker and others.
0000:00:0b.0: 3Com PCI 3c556 Laptop Tornado at e0800400.
PCI: Enabling bus mastering for device 0000:00:0b.0
initcall vortex_init+0x0/0xb0 returned 0 after 47 msecs
..
calling init_netconsole+0x0/0x1b0
netconsole: local port 4444
netconsole: local IP 10.0.1.9
netconsole: interface eth0
netconsole: remote port 4444
netconsole: remote IP 10.0.1.16
netconsole: remote ethernet address 00:19:xx:xx:xx:xx
netconsole: device eth0 not up yet, forcing it
eth0: setting half-duplex.
eth0: setting full-duplex.
------------[ cut here ]------------
WARNING: at kernel/softirq.c:137 local_bh_enable_ip+0xd1/0xe0()
Pid: 1, comm: swapper Not tainted 2.6.26-rc6-tip #2091
[<c0125ecf>] warn_on_slowpath+0x4f/0x70
[<c0126834>] ? release_console_sem+0x1b4/0x1d0
[<c0126d00>] ? vprintk+0x2a0/0x450
[<c012fde5>] ? __mod_timer+0xa5/0xc0
[<c046f7fd>] ? mdio_sync+0x3d/0x50
[<c0160ef6>] ? marker_probe_cb+0x46/0xa0
[<c0126ed7>] ? printk+0x27/0x50
[<c046f4c3>] ? vortex_set_duplex+0x43/0xc0
[<c046f521>] ? vortex_set_duplex+0xa1/0xc0
[<c0471b92>] ? vortex_timer+0xe2/0x3e0
[<c012b361>] local_bh_enable_ip+0xd1/0xe0
[<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
[<c0471b92>] vortex_timer+0xe2/0x3e0
[<c014743b>] ? trace_hardirqs_on+0xb/0x10
[<c0147358>] ? trace_hardirqs_on_caller+0x88/0x160
[<c012f8b2>] run_timer_softirq+0x162/0x1c0
[<c0471ab0>] ? vortex_timer+0x0/0x3e0
[<c012b361>] local_bh_enable_ip+0xd1/0xe0
[<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
[<c0471b92>] vortex_timer+0xe2/0x3e0
[<c014743b>] ? trace_hardirqs_on+0xb/0x10
[<c0147358>] ? trace_hardirqs_on_caller+0x88/0x160
[<c012f8b2>] run_timer_softirq+0x162/0x1c0
[<c0471ab0>] ? vortex_timer+0x0/0x3e0
[<c0471ab0>] ? vortex_timer+0x0/0x3e0
[<c012b60a>] __do_softirq+0x9a/0x160
[<c012b570>] ? __do_softirq+0x0/0x160
[<c0106775>] call_on_stack+0x15/0x30
[<c012b4f5>] ? irq_exit+0x55/0x60
[<c0106e85>] ? do_IRQ+0x85/0xd0
[<c0147391>] ? trace_hardirqs_on_caller+0xc1/0x160
[<c0104888>] ? common_interrupt+0x28/0x30
[<c08d8ac8>] ? mutex_unlock+0x8/0x10
[<c08d8180>] ? _cond_resched+0x10/0x30
[<c07a3be7>] ? netpoll_setup+0x117/0x390
[<c0cbfcfe>] ? init_netconsole+0x14e/0x1b0
[<c013d539>] ? ktime_get+0x19/0x40
[<c0c9bab2>] ? kernel_init+0x1b2/0x2c0
[<c0cbfbb0>] ? init_netconsole+0x0/0x1b0
[<c0396aa4>] ? trace_hardirqs_on_thunk+0xc/0x10
[<c0103f12>] ? restore_nocheck_notrace+0x0/0xe
[<c0c9b900>] ? kernel_init+0x0/0x2c0
[<c0c9b900>] ? kernel_init+0x0/0x2c0
[<c0104aa7>] ? kernel_thread_helper+0x7/0x10
=======================
---[ end trace 37f9c502aff112e0 ]---
console [netcon0] enabled
netconsole: network logging started
initcall init_netconsole+0x0/0x1b0 returned 0 after 2914 msecs
looking at the driver I think the bug is real and the fix actually
is trivial.
vp->lock is also taken in hardware IRQ context, so we _have_ to always
use irqsafe locking. As we run in a timer with IRQs disabled,
we can simply use spin_lock.