Pavel Machek [Sat, 8 Jul 2006 15:37:31 +0000 (17:37 +0200)]
pdflush: handle resume wakeups
2.6.16 needs this. It was merged into 2.6.18-rc1 in
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d616e09ab33aa4d013a93c9b393efd5cebf78521 .
pdflush is carefully designed to ensure that all wakeups have some
corresponding work to do - if a woken-up pdflush thread discovers that
it hasn't been given any work to do then this is considered an error.
That all broke when swsusp came along - because a timer-delivered
wakeup to a frozen pdflush thread will just get lost. This causes the
pdflush thread to get lost as well: the writeback timer is supposed to
be re-armed by pdflush in process context, but pdflush doesn't execute
the callout which does this.
Fix that up by ignoring the return value from try_to_freeze(): jsut
proceed, see if we have any work pending and only go back to sleep if
that is not the case.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Patrick McHardy [Sat, 8 Jul 2006 20:39:35 +0000 (13:39 -0700)]
Fix IPv4/DECnet routing rule dumping
When more rules are present than fit in a single skb, the remaining
rules are incorrectly skipped.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When found, it is obvious. nfds calculated when allocating fdsets is
rewritten by calculation of size of fdtable, and when we are unlucky, we
try to free fdsets of wrong size.
Found due to OpenVZ resource management (User Beancounters).
CONFIG_SND_FM801=y, CONFIG_SND_FM801_TEA575X=m resulted in the following
compile error:
<-- snip -->
...
LD vmlinux
sound/built-in.o: In function 'snd_fm801_free':
fm801.c:(.text+0x3c15b): undefined reference to 'snd_tea575x_exit'
sound/built-in.o: In function 'snd_card_fm801_probe':
fm801.c:(.text+0x3cfde): undefined reference to 'snd_tea575x_init'
make: *** [vmlinux] Error 1
<-- snip -->
This patch fixes kernel Bugzilla #6458.
Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jaroslav Kysela <perex@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Herbert Xu [Thu, 13 Jul 2006 09:11:01 +0000 (19:11 +1000)]
Add missing UFO initialisations
This bug was unknowingly fixed the GSO patches (or rather, its effect was
unknown at the time).
Thanks to Marco Berizzi's persistence which is documented in the thread
"ipsec tunnel asymmetrical mtu", we now know that it can have highly
non-obvious symptoms.
What happens is that uninitialised uso_size fields can cause packets to
be incorrectly identified as UFO, which means that it does not get
fragmented even if it's over the MTU.
The fix is simple enough.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The recent generic_file_write() deadlock fix caused
generic_file_buffered_write() to loop inifinitely when presented with a
zero-length iovec segment. Fix.
Note that this fix deliberately avoids calling ->prepare_write(),
->commit_write() etc with a zero-length write. This is because I don't trust
all filesystems to get that right.
This is a cautious approach, for 2.6.17.x. For 2.6.18 we should just go ahead
and call ->prepare_write() and ->commit_write() with the zero length and fix
any broken filesystems. So I'll make that change once this code is stabilised
and backported into 2.6.17.x.
The reason for preferring to call ->prepare_write() and ->commit_write() with
the zero-length segment: a zero-length segment _should_ be sufficiently
uncommon that this is the correct way of handling it. We don't want to
optimise for poorly-written userspace at the expense of well-written
userspace.
Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Neil Brown <neilb@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Greg KH <greg@kroah.com> Cc: <stable@kernel.org> Cc: walt <wa1ter@myrealbox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
generic_file_buffered_write(): deadlock on vectored write
generic_file_buffered_write() prefaults in user pages in order to avoid
deadlock on copying from the same page as write goes to.
However, it looks like there is a problem when write is vectored:
fault_in_pages_readable brings in current segment or its part (maxlen).
OTOH, filemap_copy_from_user_iovec is called to copy number of bytes
(bytes) which may exceed current segment, so filemap_copy_from_user_iovec
switches to the next segment which is not brought in yet. Pagefault is
generated. That causes the deadlock if pagefault is for the same page
write goes to: page being written is locked and not uptodate, pagefault
will deadlock trying to lock locked page.
[akpm@osdl.org: somewhat rewritten] Cc: Neil Brown <neilb@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Ian Abbott [Mon, 26 Jun 2006 11:59:17 +0000 (12:59 +0100)]
USB serial ftdi_sio: Prevent userspace DoS (CVE-2006-2936)
This patch limits the amount of outstanding 'write' data that can be
queued up for the ftdi_sio driver, to prevent userspace DoS attacks (or
simple accidents) that use up all the system memory by writing lots of
data to the serial port.
The original patch was by Guillaume Autran, who in turn based it on the
same mechanism implemented in the 'visor' driver. I (Ian Abbott)
re-targeted the patch to the latest sources, fixed a couple of errors,
renamed his new structure members, and updated the implementations of
the 'write_room' and 'chars_in_buffer' methods to take account of the
number of outstanding 'write' bytes. It seems to work fine, though at
low baud rates it is still possible to queue up an amount of data that
takes an age to shift (a job for another day!).
Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
- We cannot reliably block in link_pipe() while holding both input and output
mutexes. So do preparatory checks before locking down both mutexes and doing
the link.
- The ipipe->nrbufs vs i check was bad, because we could have dropped the
ipipe lock in-between. This causes us to potentially look at unknown
buffers if we were racing with someone else reading this pipe.
Dave Jones [Tue, 20 Jun 2006 03:11:32 +0000 (03:11 +0000)]
Make powernow-k7 work on SMP kernels.
[CPUFREQ] Make powernow-k7 work on SMP kernels.
Even though powernow-k7 doesn't work in SMP environments,
it can work on an SMP configured kernel if there's only
one CPU present, however recalibrate_cpu_khz was returning
-EINVAL on such kernels, so we failed to init the cpufreq driver.
Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Michael Krufky [Thu, 6 Jul 2006 18:26:45 +0000 (14:26 -0400)]
dvb-bt8xx: fix frontend detection for DViCO FusionHDTV DVB-T Lite rev 1.2
This patch adds support for the new revision of the DViCO
FusionHDTV DVB-T Lite, based on the zl10353 demod instead
of mt352.
Both mt352 and zl10353 revisions of this card have the
same PCI subsystem ID.
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Acked-by: Chris Pascoe <c.pascoe@itee.uq.edu.au> Acked-by: Manu Abraham <manu@linuxtv.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch prevents the stradis driver from breaking all
other saa7146 devices by removing the autodetection based
on PCI subsystem ID 0000:0000 (no eeprom). Users that
want to use the stradis driver will have to manually
insert the module, or specify it in modprobe.conf
Signed-off-by: Andrew de Quincey <adq_dvb@lidskialf.net> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Oliver Endriss [Thu, 6 Jul 2006 18:26:40 +0000 (14:26 -0400)]
v4l/dvb: Backport the budget driver DISEQC instability fix
Backport the budget driver DISEQC instability fix.
Signed-off-by: Oliver Endriss <o.endriss@gmx.de> Signed-off-by: Andrew de Quincey <adq_dvb@lidskialf.net> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
v4l/dvb: Backport the DISEQC regression fix to 2.6.17.x
Backport the DISEQC regression fix to 2.6.17.x
Signed-off-by: Andrew de Quincey <adq_dvb@lidskialf.net> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andrew de Quincey <adq_dvb@lidskialf.net> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Graf [Thu, 6 Jul 2006 03:58:51 +0000 (20:58 -0700)]
PKT_SCHED: Fix error handling while dumping actions
"return -err" and blindly inheriting the error code in the netlink
failure exception handler causes errors codes to be returned as
positive value therefore making them being ignored by the caller.
May lead to sending out incomplete netlink messages.
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Graf [Thu, 6 Jul 2006 03:58:23 +0000 (20:58 -0700)]
PKT_SCHED: Return ENOENT if action module is unavailable
Return ENOENT if action module is unavailable
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Graf [Thu, 6 Jul 2006 03:58:02 +0000 (20:58 -0700)]
PKT_SCHED: Fix illegal memory dereferences when dumping actions
The TCA_ACT_KIND attribute is used without checking its
availability when dumping actions therefore leading to a
value of 0x4 being dereferenced.
The use of strcmp() in tc_lookup_action_n() isn't safe
when fed with string from an attribute without enforcing
proper NUL termination.
Both bugs can be triggered with malformed netlink message
and don't require any privileges.
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Michael Krufky [Thu, 29 Jun 2006 05:03:46 +0000 (01:03 -0400)]
v4l/dvb: Kconfig: fix description and dependencies for saa7115 module
This Kconfig description is incorrect, due to a previous merge a while back.
CONFIG_SAA711X builds module saa7115, which is the newer v4l2 module, and is
not obsoleted.
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Bob Moore [Sun, 2 Jul 2006 10:53:01 +0000 (11:53 +0100)]
Reduce ACPI verbosity on null handle condition
As detailed at http://bugs.gentoo.org/131534 :
2.6.16 converted many ACPI debug messages into error or warning
messages. One extraneous message was incorrectly converted, resulting in
logs being flooded by "Handle is NULL and Pathname is relative" messages
on some systems.
This patch (part of a larger ACPICA commit) converts the message back to
debug level.
Signed-off-by: Daniel Drake <dsd@gentoo.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The original driver had a restriction that if a card as an saa7113 chip,
then it cannot have a CI interface. This is not the case.
Signed-off-by: Andrew de Quincey <adq_dvb@lidskialf.net> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
These cards do not need the tda10021 configuration change when data is
streamed through a CAM module. This disables it for these ones.
Signed-off-by: Andrew de Quincey <adq_dvb@lidskialf.net> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The budget-av needs this GPIO set low for most cards to work.
Signed-off-by: Andrew de Quincey <adq_dvb@lidskialf.net> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Yasunori Goto [Thu, 29 Jun 2006 09:24:27 +0000 (02:24 -0700)]
memory hotplug: solve config broken: undefined reference to `online_page'
Memory hotplug code of i386 adds memory to only highmem. So, if
CONFIG_HIGHMEM is not set, CONFIG_MEMORY_HOTPLUG shouldn't be set.
Otherwise, it causes compile error.
In addition, many architecture can't use memory hotplug feature yet. So, I
introduce CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG.
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Andi Kleen [Thu, 29 Jun 2006 18:54:26 +0000 (20:54 +0200)]
BLOCK: Fix bounce limit address check
This fixes some OOMs on 64bit systems with <4GB of RAM when accessing
the cdrom.
Do a safer check for when to enable DMA. Currently we enable ISA DMA
for cases that do not need it, resulting in OOM conditions when ZONE_DMA
runs out of space.
IB/mthca: restore missing PCI registers after reset
mthca does not restore the following PCI-X/PCI Express registers after reset:
PCI-X device: PCI-X command register
PCI-X bridge: upstream and downstream split transaction registers
PCI Express : PCI Express device control and link control registers
This causes instability and/or bad performance on systems where one of
these registers is set to a non-default value by BIOS.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Fix nused counter. It's currently getting set to -1 rather than getting
decremented by 1. Since nused never reaches 0, the "if (!free->hdr.nused)"
check in xfs_dir2_leafn_remove() fails every time and xfs_dir2_shrink_inode()
doesn't get called when it should. This causes extra blocks to be left on
an empty directory and the directory in unable to be converted back to
inline extent mode.
Signed-off-by: Mandy Kirkconnell <alkirkco@sgi.com> Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We have a bad interaction with both the kernel and user space being able
to change some of the /proc file status. This fixes the most obvious
part of it, but I expect we'll also make it harder for users to modify
even their "own" files in /proc.
fix prctl privilege escalation and suid_dumpable (CVE-2006-2451)
Based on a patch from Ernie Petrides
During security research, Red Hat discovered a behavioral flaw in core
dump handling. A local user could create a program that would cause a
core file to be dumped into a directory they would not normally have
permissions to write to. This could lead to a denial of service (disk
consumption), or allow the local user to gain root privileges.
Patrick McHardy [Fri, 30 Jun 2006 03:33:12 +0000 (05:33 +0200)]
[PATCH] NETFILTER: SCTP conntrack: fix crash triggered by packet without chunks [CVE-2006-2934]
When a packet without any chunks is received, the newconntrack variable
in sctp_packet contains an out of bounds value that is used to look up an
pointer from the array of timeouts, which is then dereferenced, resulting
in a crash. Make sure at least a single chunk is present.
Problem noticed by George A. Theall <theall@tenablesecurity.com>
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Turned out to be a race-condition and NULL ptr deref, here's my fix:
Users of the idr code are supposed to call idr_pre_get without locking, so the
idr code must serialize itself with respect to layer allocations. However, it
fails to do so in an error path in idr_get_new_above_int(). I added the
missing locking to fix this.
Signed-off-by: Sonny Rao <sonny@burdell.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Anton Blanchard [Sun, 25 Jun 2006 12:48:44 +0000 (05:48 -0700)]
[PATCH] Link error when futexes are disabled on 64bit architectures
From: Anton Blanchard <anton@samba.org>
If futexes are disabled we fail to link on ppc64.
Signed-off-by: Anton Blanchard <anton@samba.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Al Boldi [Mon, 26 Jun 2006 07:26:13 +0000 (00:26 -0700)]
[PATCH] ide-io: increase timeout value to allow for slave wakeup
During an STR resume cycle, the ide master disk times-out when there is
also a slave present (especially CD). Increasing the timeout in ide-io
from 10,000 to 100,000 fixes this problem.
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Robert Hancock [Fri, 23 Jun 2006 02:33:26 +0000 (20:33 -0600)]
[PATCH] ohci1394: Fix broken suspend/resume in ohci1394
I've been experimenting to track down the cause of suspend/resume
problems on my Compaq Presario X1050 laptop:
http://bugzilla.kernel.org/show_bug.cgi?id=6075
Essentially the ACPI Embedded Controller and keyboard controller would
get into a bizarre, confused state after resume.
I found that unloading the ohci1394 module before suspend and reloading
it after resume made the problem go away. Diffing the dmesg output from
resume, with and without the module loaded, I found that with the module
loaded I was missing these:
PM: Writing back config space on device 0000:02:00.0 at offset 1. (Was 2100080, writing 2100007)
PM: Writing back config space on device 0000:02:00.0 at offset 3. (Was
0, writing 8008)
PM: Writing back config space on device 0000:02:00.0 at offset 4. (Was
0, writing 90200000)
PM: Writing back config space on device 0000:02:00.0 at offset 5. (Was
1, writing 2401)
PM: Writing back config space on device 0000:02:00.0 at offset f. (Was 20000100, writing 2000010a)
The default PCI driver performs the pci_restore_state when no driver is
loaded for the device. When the ohci1394 driver is loaded, it is
supposed to do this, however it appears not to do so.
I created the patch below and tested it, and it appears to resolve the
suspend problems I was having with the module loaded. I only added in
the pci_save_state and pci_restore_state - however, though I know little
of this hardware, surely the driver should really be doing more than
this when suspending and resuming? Currently it does almost nothing,
what if there are commands in progress, etc?
Signed-off-by: Robert Hancock <hancockr@shaw.ca> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
[PATCH] IPV6 ADDRCONF: Fix default source address selection without CONFIG_IPV6_PRIVACY
We need to update hiscore.rule even if we don't enable CONFIG_IPV6_PRIVACY,
because we have more less significant rule; longest match.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Ćukasz Stelmach [Thu, 22 Jun 2006 08:39:05 +0000 (01:39 -0700)]
[PATCH] IPV6: Fix source address selection.
Two additional labels (RFC 3484, sec. 10.3) for IPv6 addreses
are defined to make a distinction between global unicast
addresses and Unique Local Addresses (fc00::/7, RFC 4193) and
Teredo (2001::/32, RFC 4380). It is necessary to avoid attempts
of connection that would either fail (eg. fec0:: to 2001:feed::)
or be sub-optimal (2001:0:: to 2001:feed::).
Signed-off-by: Ćukasz Stelmach <stlman@poczta.fm> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Michael Buesch [Sun, 18 Jun 2006 17:05:10 +0000 (19:05 +0200)]
[PATCH] bcm43xx: init fix for possible Machine Check
Place the Init-vs-IRQ workaround before any card register
access, because we might not have the wireless core mapped
at all times in init. So this will result in a Machine Check
caused by a bus error.
Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
[PATCH] NTFS: Critical bug fix (affects MIPS and possibly others)
It fixes a crash in NTFS on architectures where flush_dcache_page()
is a real function. I never noticed this as all my testing is done on
i386 where flush_dcache_page() is NULL.
http://bugzilla.kernel.org/show_bug.cgi?id=6700
Many thanks to Pauline Ng for the detailed bug report and analysis!
Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
David Miller [Tue, 20 Jun 2006 07:44:27 +0000 (00:44 -0700)]
[PATCH] SPARC32: Fix iommu_flush_iotlb end address
Fix the calculation of the end address when flushing iotlb entries to
ram. This bug has been a cause of esp dma errors, and it affects
HyperSPARC systems much worse than SuperSPARC systems.
Signed-off-by: Bob Breuer <breuerr@mc.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Herbert Xu [Tue, 20 Jun 2006 07:09:30 +0000 (00:09 -0700)]
[PATCH] ETHTOOL: Fix UFO typo
The function ethtool_get_ufo was referring to ETHTOOL_GTSO instead of
ETHTOOL_GUFO.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Neil Horman [Tue, 20 Jun 2006 07:09:03 +0000 (00:09 -0700)]
[PATCH] SCTP: Fix persistent slowdown in sctp when a gap ack consumes rx buffer.
In the event that our entire receive buffer is full with a series of
chunks that represent a single gap-ack, and then we accept a chunk
(or chunks) that fill in the gap between the ctsn and the first gap,
we renege chunks from the end of the buffer, which effectively does
nothing but move our gap to the end of our received tsn stream. This
does little but move our missing tsns down stream a little, and, if the
sender is sending sufficiently large retransmit frames, the result is a
perpetual slowdown which can never be recovered from, since the only
chunk that can be accepted to allow progress in the tsn stream necessitates
that a new gap be created to make room for it. This leads to a constant
need for retransmits, and subsequent receiver stalls. The fix I've come up
with is to deliver the frame without reneging if we have a full receive
buffer and the receiving sockets sk_receive_queue is empty(indicating that
the receive buffer is being blocked by a missing tsn).
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Tsutomu Fujii [Tue, 20 Jun 2006 07:08:29 +0000 (00:08 -0700)]
[PATCH] SCTP: Send only 1 window update SACK per message.
Right now, every time we increase our rwnd by more then MTU bytes, we
trigger a SACK. When processing large messages, this will generate a
SACK for almost every other SCTP fragment. However since we are freeing
the entire message at the same time, we might as well collapse the SACK
generation to 1.
Signed-off-by: Tsutomu Fujii <t-fujii@nb.jp.nec.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
David Miller [Tue, 20 Jun 2006 07:06:27 +0000 (00:06 -0700)]
[PATCH] SCTP: Reset rtt_in_progress for the chunk when processing its sack.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Vlad Yasevich [Tue, 20 Jun 2006 07:04:53 +0000 (00:04 -0700)]
[PATCH] SCTP: Reject sctp packets with broadcast addresses.
Make SCTP handle broadcast properly
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Vlad Yasevich [Tue, 20 Jun 2006 07:04:18 +0000 (00:04 -0700)]
[PATCH] SCTP: Limit association max_retrans setting in setsockopt.
When using ASSOCINFO socket option, we need to limit the number of
maximum association retransmissions to be no greater than the sum
of all the path retransmissions. This is specified in Section 7.1.2
of the SCTP socket API draft.
However, we only do this if the association has multiple paths. If
there is only one path, the protocol stack will use the
assoc_max_retrans setting when trying to retransmit packets.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
This patch fixes RTNLGRP_IPV6_IFINFO netlink notifications. Issue
pointed out by Patrick McHardy <kaber@trash.net>.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Arnd Bergmann [Thu, 15 Jun 2006 11:15:44 +0000 (21:15 +1000)]
[PATCH] powerpc: Fix 64k pages on non-partitioned machines
The page size encoding passed to tlbie is incorrect for new-style
large pages. This fixes it. This doesn't affect anything on older
machines because mmu_psize_defs[psize].penc (the page size encoding)
is 0 for 4k and 16M pages (the two are distinguished by a separate "is
a large page" bit).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov [Thu, 15 Jun 2006 16:12:02 +0000 (20:12 +0400)]
[PATCH] arm_timer: remove a racy and obsolete PF_EXITING check
arm_timer() checks PF_EXITING to prevent BUG_ON(->exit_state)
in run_posix_cpu_timers().
However, for some reason it does so only for CPUCLOCK_PERTHREAD
case (which is imho wrong).
Also, this check is not reliable, PF_EXITING could be set on
another cpu without any locks/barriers just after the check,
so it can't prevent from attaching the timer to the exiting
task.
Oleg Nesterov [Thu, 15 Jun 2006 16:11:43 +0000 (20:11 +0400)]
[PATCH] run_posix_cpu_timers: remove a bogus BUG_ON()
do_exit() clears ->it_##clock##_expires, but nothing prevents
another cpu to attach the timer to exiting process after that.
arm_timer() tries to protect against this race, but the check
is racy.
After exit_notify() does 'write_unlock_irq(&tasklist_lock)' and
before do_exit() calls 'schedule() local timer interrupt can find
tsk->exit_state != 0. If that state was EXIT_DEAD (or another cpu
does sys_wait4) interrupted task has ->signal == NULL.
At this moment exiting task has no pending cpu timers, they were
cleanuped in __exit_signal()->posix_cpu_timers_exit{,_group}(),
so we can just return from irq.
Oleg Nesterov [Thu, 15 Jun 2006 16:11:15 +0000 (20:11 +0400)]
[PATCH] check_process_timers: fix possible lockup
If the local timer interrupt happens just after do_exit() sets PF_EXITING
(and before it clears ->it_xxx_expires) run_posix_cpu_timers() will call
check_process_timers() with tasklist_lock + ->siglock held and
check_process_timers:
t = tsk;
do {
....
do {
t = next_thread(t);
} while (unlikely(t->flags & PF_EXITING));
} while (t != tsk);
the outer loop will never stop.
Actually, the window is bigger. Another process can attach the timer
after ->it_xxx_expires was cleared (see the next commit) and the 'if
(PF_EXITING)' check in arm_timer() is racy (see the one after that).
A couple of fixes that should prevent crashes when using netconsole and
suspend/resume. First, netconsole poll routine shouldn't run unless the
device is up; second, the NAPI poll should be disabled during suspend.
This is only an issue on sky2, because it has to have one NAPI poll
routine for both ports on dual port boards. Normal drivers use
netif_rx_schedule_prep and that checks for netif_running.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jens Axboe [Fri, 16 Jun 2006 11:02:29 +0000 (13:02 +0200)]
[PATCH] Fix missing ret assignment in __bio_map_user() error path
If get_user_pages() returns less pages than what we asked for, we jump
to out_unmap which will return ERR_PTR(ret). But ret can contain a
positive number just smaller than local_nr_pages, so be sure to set it
to -EFAULT always.
Problem found and diagnosed by Damien Le Moal <damien@sdl.hitachi.co.jp>
Jens Axboe [Fri, 16 Jun 2006 05:46:37 +0000 (07:46 +0200)]
[PATCH] fix cdrom open
Some time ago the cdrom open routine was changed so that we call the
driver's open routine before checking to see if it is read only. However,
if we discovered that a read write open was not possible and the open
flags required a writable open, we just returned -EROFS without calling
the driver's release routine. This seems to work for most cdrom drivers,
but breaks the Powerpc iSeries virtual cdrom rather badly.
This just inserts the release call in the error path to balance the call
to "->open()" done by "open_for_data()".
Jens Axboe [Wed, 14 Jun 2006 17:11:57 +0000 (19:11 +0200)]
[PATCH] cfq-iosched: fix crash in do_div()
We don't clear the seek stat values in cfq_alloc_io_context(), and if
->seek_mean is unlucky enough to be set to -36 by chance, the first
invocation of cfq_update_io_seektime() will oops with a divide by zero
in do_div().
Just memset the entire cic instead of filling invididual values
independently.
[PATCH] sky2: stop/start hardware idle timer on suspend/resume
The resume bug was caused not by an early interrupt but because the idle
timeout was not being stopped on suspend. Also disable hardware IRQ's
on suspend. Will need to revisit this with hotplug?
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The set power state function is cleaner if it doesn't return anything.
The only caller that could fail is in suspend() and it can check the argument
there.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Randy Dunlap [Mon, 12 Jun 2006 22:13:40 +0000 (15:13 -0700)]
[PATCH] alpha: generic hweight build fix
From: Randy Dunlap <rdunlap@xenotime.net>
According to include/asm-alpha/bitops.h, only ALPHA_EV67 has hardware
hweight support, so ALPHA_EV6 needs to use GENERIC_HWEIGHT.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Ernst Herzberg <earny@net4u.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sergey Vlasov [Mon, 12 Jun 2006 20:53:23 +0000 (21:53 +0100)]
[PATCH] tmpfs: Decrement i_nlink correctly in shmem_rmdir()
shmem_rmdir() must undo the increment of i_nlink done in
shmem_get_inode() for directories, otherwise at least
IN_DELETE_SELF inotify event generation is broken.
Robin H. Johnson [Mon, 12 Jun 2006 20:50:25 +0000 (21:50 +0100)]
[PATCH] tmpfs: time granularity fix for [acm]time going backwards
I noticed a strange behavior in a tmpfs file system the other day, while
building packages - occasionally, and seemingly at random, make decided to
rebuild a target. However, only on tmpfs.
A file would be created, and if checked, it had a sub-second timestamp.
However, after an utimes related call where sub-seconds should be set, they
were zeroed instead. In the case that a file was created, and utimes(...,NULL)
was used on it in the same second, the timestamp on the file moved backwards.
After some digging, I found that this was being caused by tmpfs not having a
time granularity set, thus inheriting the default 1 second granularity.
Hugh adds: yes, we missed tmpfs when the s_time_gran mods went into 2.6.11.
Unfortunately, the granularity of CURRENT_TIME, often used in filesystems,
does not match the default granularity set by alloc_super. A few more such
discrepancies have been found, but this is the most important to fix now.
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>