Alan Cox [Tue, 4 Oct 2005 12:09:19 +0000 (08:09 -0400)]
libata: bitmask based pci init functions for one or two ports
This redoes the n_ports logic I proposed before as a bitmask.
ata_pci_init_native_mode is now used with a mask allowing for mixed mode
stuff later on. ata_pci_init_legacy_port is called with port number and
does one port now not two. Instead it is called twice by the ata init
logic which cleans both of them up.
There are stil limits in the original code left over
- IRQ/port mapping for legacy mode should be arch specific values
- You can have one legacy mode IDE adapter per PCI root bridge on some systems
- Doesn't handle mixed mode devices yet (but is now a lot closer to it)
Tejun Heo [Sun, 2 Oct 2005 02:54:29 +0000 (11:54 +0900)]
[PATCH] libata: add ATA exceptions chapter to doc
Hello, Jeff.
This patch adds ATA errors & exceptions chapter to
Documentation/DocBook/libata.tmpl. As suggested, the chapter is
placed before low level driver specific chapters. Contents are
unchanged from the last posting.
Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
[PATCH] libata: Marvell SATA support (DMA mode) (resend: v0.22)
This is my libata compatible low level driver for the Marvell SATA
family. Currently it runs in DMA mode on a 6081 chip.
The 5xxx series parts are not yet DMA capable in this driver because
the registers have differences that haven't been accounted for yet.
Basically, I'm focused on the 6xxx series right now. I apologize for
those seeing problems on the 5xxx series, I've not had a chance to
look at those problems yet.
For those curious, the previous bug causing the SCSI timeout and
subsequent panics was caused by an improper clear of hc_irq_cause in
mv_host_intr().
This version is running well in my environment (6081 chips,
with/without SW raid1) and is showing equal or better performance
compared to the Marvell driver (mv_sata) in my initial tests (timed
dd's of reads/writes to/from memory/disk).
I still need to look at the causes of occasional problems such as this:
ata11: translating stat 0x35 err 0x00 to sense
ata11: status=0x35 { DeviceFault SeekComplete CorrectedError Error }
SCSI error : <10 0 0 0> return code = 0x8000002
Current sda: sense key Hardware Error
end_request: I/O error, dev sda, sector 3155010
and this, seen at init time:
ATA: abnormal status 0x80 on port 0xE093911C
but they aren't showstoppers.
Signed-off-by: Brett Russ <russb@emc.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Catalin Marinas [Sun, 2 Oct 2005 21:34:35 +0000 (22:34 +0100)]
[ARM] 2943/1: Clear the exclusive monitor in v6_early_abort
Patch from Catalin Marinas
Data abort caused by ldrex/strex can leave the exclusive monitor in an
unpredictable state. It is recommended that a clrex/strex is performed to
clear this state.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Sun, 2 Oct 2005 17:12:03 +0000 (18:12 +0100)]
[ARM] Fix init printk for EBSA110 network driver, and link timer
Arrange for the initialisation printks to happen after we've
registered the network interface, so we know what name the
device is. Also, check the link every 500ms (and use
msecs_to_jiffies.)
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Sven Henkel [Sat, 1 Oct 2005 22:30:33 +0000 (08:30 +1000)]
[PATCH] ppc32: Add new iBook 12" to PowerMac models table
This adds the new iBook G4 (manufactured after July 2005) to the
PowerMac models table. The model name (PowerBook6,7) is taken from a
12" iBook, I don't know if it also matches the 14" version. The patch
applies to a vanilla 2.6.13.2 kernel.
Signed-off-by: Sven Henkel <shenkel@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Sven Henkel [Sat, 1 Oct 2005 22:29:18 +0000 (08:29 +1000)]
[PATCH] pmac/radeonfb: Add suspend support for M11 chip in new iBook 12"
This adds suspend support for the Radeon M11 chip in 12" iBooks
manufactured after July 2005. I don't know if the new 14" iBooks also
have that chip, so they might also be supported.
The chip identifies itself as "RV350 NV" (pci id 0x4e56), revision 0x80.
Apple calls it "Snowy", xfree86 names it "ATI FireGL Mobility T2 (M11)
NV (AGP)". So, we seem to be lucky here: The suspend-code for the M10
(which also is a "RV350 NV") works flawless for that chip.
Signed-off-by: Sven Henkel <shenkel@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Sat, 1 Oct 2005 18:04:18 +0000 (11:04 -0700)]
Fix inequality comparison against "task->state"
We should always use bitmask ops, rather than depend on some ordering of
the different states. With the TASK_NONINTERACTIVE flag, the inequality
doesn't really work.
Oleg Nesterov argues (likely correctly) that this test is unnecessary in
the first place. However, the minimal fix for now is to at least make
it work in the presense of TASK_NONINTERACTIVE. Waiting for consensus
from Roland & co on potential bigger cleanups.
[PATCH] ARM: Fix IXP2000 serial port resource range. For real this time.
Serial port only needs 32 bytes of resource space but we are currently
asking for 64K.
Signed-off-by: Deepak Saxena <dsaxena@plexity.net>
[ diff went missing first time due to corrupted patch ] Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As an alternative, people can use the boot time 'debug' option and/or use
'ethtool -s ethX msglvl xyz'. The different messages are listed at:
http://www.zoreil.com/~romieu/r8169/doc/msglvl.txt
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The old 550Mhz titanium powerbook can switch to a lower frequency
(500Mhz). A user has been repeately reporting overtemp conditions on his
machine at high speed so this simple patch adds support to PowerMac
cpufreq for this machine. The difference in frequency isn't big but seem
enough to fix that user's problems. The patch has been around for some
time now and doesn't seem to cause any problem, so I suppose it could go
in now.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Alain RICHARD <alain.richard@equation.fr> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The tests Alok carried out on Petr's box confirmed that cpu_to_node[BP] is
not setup early enough by numa_init_array due to the x86_64 changes in
2.6.14-rc*, and unfortunately set wrongly by the work around code in
numa_init_array(). cpu_to_node[0] gets set with 1 early and later gets set
properly to 0 during identify_cpu() when all cpus are brought up, but
confusing the numa slab in the process.
Here is a quick fix for this. The right fix obviously is to have
cpu_to_node[bsp] setup early for numa_init_array(). The following patch
will fix the problem now, and the code can stay on even when
cpu_to_node{BP] gets fixed early correctly.
Thanks to Petr for access to his box.
Signed off by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix the BP node_to_cpumask. 2.6.14-rc* broke the boot cpu bit as the
cpu_to_node(0) is now not setup early enough for numa_init_array.
cpu_to_node[] is setup much later at srat_detect_node on acpi srat based
em64t machines. This seems like a problem on amd machines too, Tested on
em64t though. /sys/devices/system/node/node0/cpumap shows up sanely after
this patch.
Signed off by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] utilization of kprobe_mutex is incorrect on x86_64
The up()/down() orders are incorrect in arch/x86_64/kprobes.c file.
kprobe_mutext is used to protect the free kprobe instruction slot list.
arch_prepare_kprobe applies for a slot from the free list, and
arch_remove_kprobe returns a slot to the free list. The incorrect up()/down()
orders to operate on kprobe_mutex fail to protect the free list. If 2 threads
try to get/return kprobe instruction slot at the same time, the free slot list
might be broken, or a free slot might be applied by 2 threads.
[PATCH] eth1394: workaround limitation in rawiso routines
Work around limitation in rawiso routines. Required with 1394b cards on
architectures where PAGE_SIZE is 4096. Based on a previous patch by Ben
Collins.
Signed-off-by: Jody McIntyre <scjody@steamballoon.com> Cc: Ben Collins <bcollins@debian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] ieee1394: skip unnecessary pause when scanning config ROMs
Skip a superfluous pause that occured when the config ROM of a node was
scanned unsuccessfully. This also occurs if a node without link wrongly
enables its "link active" self ID flag. A GWCTech 6-port hub does this.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jody McIntyre <scjody@steamballoon.com> Cc: Ben Collins <bcollins@debian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] ieee1394: reorder activities after bus reset (fixes device detection)
Units were not detected if the local IRM performed a bus reset. ("The root
node is not cycle master capable; selecting a new root node and resetting...",
often seen with iPods and other SBP-2 devices). Rearrange the order of IRM
duties and node scanning. TODO: Audit the ROM caching and parsing code for
underlying issues.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jody McIntyre <scjody@steamballoon.com> Cc: Ben Collins <bcollins@debian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Set serialize_io=1 by default. This is safer and required by seemingly more
and more hardware. It causes little or no performance loss for S400 devices.
Performance of S800 1394b devices may drop by 25...30%. Therefore make the
parameter's description and dmesg message clearer about performance impact.
Update description of the max_speed parameter too. IEEE1394_SPEED_MAX is
currently S800.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jody McIntyre <scjody@steamballoon.com> Cc: Ben Collins <bcollins@debian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] sbp2: fix deadlocks and delays on device removal/rmmod
Fixes for deadlocks of the ieee1394 and scsi subsystems and long delays in
futile error recovery attempts when SBP-2 devices are removed or drivers are
unloaded.
- Complete commands quickly with DID_NO_CONNECT if the 1394 node is gone or if
the 1394 low-level driver was unloaded.
- Skip unnecessary work in the eh_abort_handler and eh_device_reset_handler if
the node or 1394 low-level driver is gone.
- Let scsi's high-level shut down gracefully when sbp2 is being unloaded or
detached from the 1394 unit. A call to scsi_remove_device is added for this
purpose, which requires us to store a scsi_device pointer.
- scsi_device pointer is obtained from slave_alloc hook and cleared by
slave_destroy. This avoids usage of the pointer after the scsi device was
deleted e.g. by the user via scsi_mod's sysfs interface.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Jody McIntyre <scjody@steamballoon.com> Cc: Ben Collins <bcollins@debian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This change removes a bogus error message from the IOC4 serial driver
interrupt handler.
This error message is bogus for two reasons. First, it can never occur
given that current code takes care to initialize IOC4 in such a way that
these "unknown" interrupts could never occur. Second, this code fails to
take into account that other drivers can share the IOC4 interrupt mechanism
through SA_SHIRQ, and thus this driver is not in-fact "all-knowing".
Finally, this error message triggers every time some "unknown" interrupt
occurs -- it's not rate limited or repetition limited in any way, thereby
effectively denying use of the console device. Given its bogosity in the
first place, it's best to just get rid of it entirely.
Acked-by: Pat Gefre <pfg@sgi.com> Signed-off-by: Brent Casavant <bcasavan@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Check O_DIRECT and return -EINVAL error in open. dentry_open() also checks
this but only after the open method is called. This patch optimizes away
the unnecessary upcalls in this case.
It could be a correctness issue too: if filesystem has open() with side
effect, then it should fail before doing the open, not after.
Calling truncate() on hostfs spits a kernel warning "Something isn't
implemented here", but it still works fine.
Indeed, hostfs i_op->truncate doesn't do anything. But hostfs_setattr() ->
set_attr() correctly detects ATTR_SIZE and calls truncate() on the host. So
we should be safe (using ftruncate() may be better, in case the file is
unlinked on the host, but we aren't sure to have the file open for writing,
and reopening it would cause the same races; plus nobody should expect UML to
be so careful).
So, the warning is wrong, because the current implementation is working. Al,
am I correct, and can the warning be therefore dropped?
CC: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
a) sysrq may be run when the scheduler is non-functioning
b) the warning I wanted to fix actually came from the fault handler run in
atomic context. But I fixed that not to take the semaphore in a separate
patch.
c) the fault handler is run because of a fault, and that fault was
unaffected by this patch.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] uml: clear SKAS0/3 flags when running in TT mode
SEGV_MAYBE_FIXABLE tests ptrace_faultinfo, and depends on it being 1 only in
SKAS3 mode, while currently when running with mode=tt it will be 1 anyway.
Fix this, and do the same for proc_mm.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I hadn't been running a SKAS3 host when testing the "uml: fix hang in TT mode
on fault" patch (commit 546fe1cbf91d4d62e3849517c31a2327c992e5c5), and I
didn't think enough to the missing trap_no in SKAS3 mode.
In fact, the resulting kernel doesn't work at all in SKAS3 mode.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Michael Krufky [Fri, 30 Sep 2005 18:58:58 +0000 (11:58 -0700)]
[PATCH] v4l: DViCO FusionHDTV5 Lite GPIO Fix
GPIO fix for the composite and tv mute states of bt8xx card #135: DViCO
FusionHDTV5 Lite. Without this patch, selecting one of these states could
produce unexpected behavior.
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@brturbo.com.br> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Fri, 30 Sep 2005 18:58:57 +0000 (11:58 -0700)]
[PATCH] x86: hw_irq.h warning fix
include/asm/hw_irq.h:70: warning: `struct hw_interrupt_type' declared inside parameter list
include/asm/hw_irq.h:70: warning: its scope is only this definition or declaration, which is probably not what you want
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Zach Brown [Fri, 30 Sep 2005 18:58:56 +0000 (11:58 -0700)]
[PATCH] aio: avoid extra aio_{read,write} call when ki_left == 0
Recently aio_p{read,write} changed to perform retries internally rather
than returning -EIOCBRETRY. This inadvertantly resulted in always calling
aio_{read,write} with ki_left at 0 which would in turn immediately return
0. Harmless, but we can avoid this call by checking in the caller.
Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Zach Brown [Fri, 30 Sep 2005 18:58:55 +0000 (11:58 -0700)]
[PATCH] aio: remove unlocked task_list test and resulting race
Only one of the run or kick path is supposed to put an iocb on the run
list. If both of them do it than one of them can end up referencing a
freed iocb. The kick path could delete the task_list item from the wait
queue before getting the ctx_lock and putting the iocb on the run list.
The run path was testing the task_list item outside the lock so that it
could catch ki_retry methods that return -EIOCBRETRY *without* putting the
iocb on a wait queue and promising to call kick_iocb. This unlocked check
could then race with the kick path to cause both to try and put the iocb on
the run list.
The patch stops the run path from testing task_list by requring that any
ki_retry that returns -EIOCBRETRY *must* guarantee that kick_iocb() will be
called in the future. aio_p{read,write}, the only in-tree -EIOCBRETRY
users, are updated.
Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Zach Brown [Fri, 30 Sep 2005 18:58:54 +0000 (11:58 -0700)]
[PATCH] aio: lock around kiocbTryKick()
Only one of the run or kick path is supposed to put an iocb on the run
list. If both of them do it than one of them can end up referencing a
freed iocb. The kick patch could set the Kicked bit before acquiring the
ctx_lock and putting the iocb on the run list. The run path, while holding
the ctx_lock, could see this partial kick and mistake it for a kick that
was deferred while it was doing work with the run_list NULLed out. It
would then race with the kick thread to add the iocb to the run list.
This patch moves the kick setting under the ctx_lock so that only one of
the kick or run path queues the iocb on the run list, as intended.
Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As requested by Thomas Gleixner <tglx@linutronix.de>:
"5d3d0f7704ed0bc7eaca0501eeae3e5da1ea6c87 breaks a couple of ARM
boards, which depend on the historical bootmem allocation order.
There is a cleaner solution around to remove the pgdat list
completely, but this is a topic for post 2.6.14
James Morris [Fri, 30 Sep 2005 18:24:34 +0000 (14:24 -0400)]
[PATCH] SELinux - fix SCTP socket bug and general IP protocol handling
The following patch updates the way SELinux classifies and handles IP
based protocols.
Currently, IP sockets are classified by SELinux as being either TCP, UDP
or 'Raw', the latter being a default for IP socket that is not TCP or UDP.
The classification code is out of date and uses only the socket type
parameter to socket(2) to determine the class of IP socket. So, any
socket created with SOCK_STREAM will be classified by SELinux as TCP, and
SOCK_DGRAM as UDP. Also, other socket types such as SOCK_SEQPACKET and
SOCK_DCCP are currently ignored by SELinux, which classifies them as
generic sockets, which means they don't even get basic IP level checking.
This patch changes the SELinux IP socket classification logic, so that
only an IPPROTO_IP protocol value passed to socket(2) classify the socket
as TCP or UDP. The patch also drops the check for SOCK_RAW and converts
it into a default, so that socket types like SOCK_DCCP and SOCK_SEQPACKET
are classified as SECCLASS_RAWIP_SOCKET (instead of generic sockets).
Note that protocol-specific support for SCTP, DCCP etc. is not addressed
here, we're just getting these protocols checked at the IP layer.
This fixes a reported problem where SCTP sockets were being recognized as
generic SELinux sockets yet still being passed in one case to an IP level
check, which then fails for generic sockets.
It will also fix bugs where any SOCK_STREAM socket is classified as TCP or
any SOCK_DGRAM socket is classified as UDP.
This patch also unifies the way IP sockets classes are determined in
selinux_socket_bind(), so we use the already calculated value instead of
trying to recalculate it.
Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nick Piggin [Fri, 30 Sep 2005 16:34:42 +0000 (02:34 +1000)]
[PATCH] i386: include linux/irq.h rather than asm/hw_irq.h
I need the following patch to compile -git8 here, otherwise these
files fail to compile (asm/hw_irq.h needs definitions from
linux/irq.h and that file provides the required include ordering).
I did not do a full audit, though there looks to be many other
places that should get the same treatment, if this is the right
way to do it.
Al Viro [Fri, 30 Sep 2005 02:36:50 +0000 (03:36 +0100)]
[PATCH] bogus BUILD_BUG_ON() in bpa_iommu
BUILD_BUG_ON(1) is asking for trouble (and getting it) when used in that
manner - dead code elimination happens after we parse it and invalid
type is invalid type, dead code or not.
It might be version-dependent, but at least 4.0.1 refuses to accept
that.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Gen FUKATSU [Fri, 30 Sep 2005 15:09:17 +0000 (16:09 +0100)]
[ARM] 2940/1: Fix BTB entry flush in arch/arm/mm/cache-v6.S
Patch from Gen FUKATSU
Invalidate BTB entry instruction flushes two instruction
at a time. Therefore this instruction should be done four
times after invalidate instruction cache line.
Signed-off-by: Gen Fukatsu Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[ARM] 2939/1: Fix compilation error in arch/arm/mm/flush.c
Patch from Catalin Marinas
When CONFIG_CPU_CACHE_VIPT is defined, the flush_pfn_alias() function is
implicitely declared and it later conflicts with its actual definition.
This patch moves the function definition to the beginning of the file.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
David S. Miller [Fri, 30 Sep 2005 01:50:34 +0000 (18:50 -0700)]
[SPARC64]: Fix several bugs in flush_ptrace_access().
1) Use cpudata cache line sizes, not magic constants.
2) Align start address in cheetah case so we do not get
unaligned address traps. (pgrep was good at triggering
this, via /proc/${pid}/cmdline accesses)
Signed-off-by: David S. Miller <davem@davemloft.net>
[TCP]: Don't over-clamp window in tcp_clamp_window()
From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Handle better the case where the sender sends full sized
frames initially, then moves to a mode where it trickles
out small amounts of data at a time.
This known problem is even mentioned in the comments
above tcp_grow_window() in tcp_input.c, specifically:
...
* The scheme does not work when sender sends good segments opening
* window and then starts to feed us spagetti. But it should work
* in common situations. Otherwise, we have to rely on queue collapsing.
...
When the sender gives full sized frames, the "struct sk_buff" overhead
from each packet is small. So we'll advertize a larger window.
If the sender moves to a mode where small segments are sent, this
ratio becomes tilted to the other extreme and we start overrunning
the socket buffer space.
tcp_clamp_window() tries to address this, but it's clamping of
tp->window_clamp is a wee bit too aggressive for this particular case.
Fix confirmed by Ion Badulescu.
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Kuznetsov has explained the situation as follows:
--------------------
I think the fix is incorrect. Look, the RFC function init_cwnd(mss) is
not continuous: f.e. for mss=1095 it needs initial window 1095*4, but
for mss=1096 it is 1096*3. We do not know exactly what mss sender used
for calculations. If we advertised 1096 (and calculate initial window
3*1096), the sender could limit it to some value < 1096 and then it
will need window his_mss*4 > 3*1096 to send initial burst.
See?
So, the honest function for inital rcv_wnd derived from
tcp_init_cwnd() is:
init_rcv_wnd(mss)=
min { init_cwnd(mss1)*mss1 for mss1 <= mss }
It is something sort of:
if (mss < 1096)
return mss*4;
if (mss < 1096*2)
return 1096*4;
return mss*2;
(I just scrablled a graph of piece of paper, it is difficult to see or
to explain without this)
I selected it differently giving more window than it is strictly
required. Initial receive window must be large enough to allow sender
following to the rfc (or just setting initial cwnd to 2) to send
initial burst. But besides that it is arbitrary, so I decided to give
slack space of one segment.
Actually, the logic was:
If mss is low/normal (<=ethernet), set window to receive more than
initial burst allowed by rfc under the worst conditions
i.e. mss*4. This gives slack space of 1 segment for ethernet frames.
For msses slighlty more than ethernet frame, take 3. Try to give slack
space of 1 frame again.
If mss is huge, force 2*mss. No slack space.
Value 1460*3 is really confusing. Minimal one is 1096*2, but besides
that it is an arbitrary value. It was meant to be ~4096. 1460*3 is
just the magic number from RFC, 1460*3 = 1095*4 is the magic :-), so
that I guess hands typed this themselves.
--------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
[ARM] 2941/1: Fix running legacy binaries from a soft-float root filesystem with CONFIG_IWMMXT.
Patch from Daniel Jacobowitz
Thread flags are inherited on fork(). In order for a binary which has
the iWMMXt coprocessor enabled to run a binary which needs the FPA
emulation, we need to explicitly clear TIF_USING_IWMMXT if we are not
going to set it.
Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The SMU driver has a small mistake in the locking of the interrupt code,
if polled access and interrupt access race, interrupt may take a lock
and return without releasing it. This fixes it. With that patch, the
driver is rock solid with my experimental thermal control (which bangs
it pretty hard) racing with real time clock and cpufreq handling.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] readv/writev syscalls are not checked by lsm
it seems that readv(2)/writev(2) syscalls do not call
file_permission callback. Looks like this is overlook.
I have filled the issue into redhat bugzilla as
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=169433
and got the recommendation to post this on lsm mailing list.
The following trivial patch solves the problem.
Signed-off-by: Kostik Belousov <kostikbel@gmail.com> Signed-off-by: Chris Wright <chrisw@osdl.org>
Mike Waychison [Thu, 29 Sep 2005 22:01:27 +0000 (00:01 +0200)]
[PATCH] x86_64: Fix mce_log
The attempt to fixup the lockless mce log buffer introduced an infinite loop
when trying to find a free entry.
And:
Using rcu_dereference() to load mcelog.next doesn't seem to be sufficient
enough to ensure that mcelog.next is loaded each time around the loop in
mce_log(). Instead, use an explicit rmb() to ensure that the compiler gets it
right.
AK: turned the smp_wmbs into true wmbs to make sure they are not
reordered by the compiler on UP.
Roland McGrath [Thu, 29 Sep 2005 21:54:42 +0000 (14:54 -0700)]
[PATCH] Fix task state testing properly in do_signal_stop()
Any tests using < TASK_STOPPED or the like are left over from the time
when the TASK_ZOMBIE and TASK_DEAD bits were in the same word, and it
served to check for "stopped or dead". I think this one in
do_signal_stop is the only such case. It has been buggy ever since
exit_state was separated, and isn't testing the exit_state value.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Roland points out that the flags end up having non-obvious dependencies
elsewhere, so revert aa55a08687059aa169d10a313c41f238c2070488 and add
some comments about why things are as they are.
We'll just have to fix up the broken comparisons. Roland has a patch.
[PATCH] fix TASK_STOPPED vs TASK_NONINTERACTIVE interaction
do_signal_stop:
for_each_thread(t) {
if (t->state < TASK_STOPPED)
++sig->group_stop_count;
}
However, TASK_NONINTERACTIVE > TASK_STOPPED, so this loop will not
count TASK_INTERRUPTIBLE | TASK_NONINTERACTIVE threads.
See also wait_task_stopped(), which checks ->state > TASK_STOPPED.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
[ We really probably should always use the appropriate bitmasks to test
task states, not do it like this. Using something like
and then doing "if (task->state & TASK_RUNNABLE)" or similar. But the
ordering of the task states is historical, and keeping the ordering
does make sense regardless. ]
[PATCH] intelfb: Fix regression (blank display) from ioremap patch
- Workaround for the ioremap patch that produces a blank display on some
chipsets
- Make hwcursor = 0 the default. The hardware cursor does not work with all
hardware.