Mitch Williams [Wed, 9 Nov 2005 18:36:57 +0000 (10:36 -0800)]
[PATCH] bonding: comments and changelog
Bonding source files still have changelogs in the comments. This, then,
is an update to that changelog.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mitch Williams [Wed, 9 Nov 2005 18:36:50 +0000 (10:36 -0800)]
[PATCH] bonding: spelling and whitespace corrections
Minor spelling and whitespace corrections.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mitch Williams [Wed, 9 Nov 2005 18:36:46 +0000 (10:36 -0800)]
[PATCH] bonding: version update
Update the version number for the bonding module. Since we've just
added a significant new feature (sysfs support), bump the major number.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mitch Williams [Wed, 9 Nov 2005 18:36:41 +0000 (10:36 -0800)]
[PATCH] bonding: add sysfs functionality to bonding (large)
This large patch adds sysfs functionality to the channel bonding module.
Bonds can be added, removed, and reconfigured at runtime without having
to reload the module. Multiple bonds with different configurations are
easily configured, and ifenslave is no longer required to configure bonds.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
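A minimal sketch of driving this interface from a program instead of a shell. It assumes the attribute files described in Documentation/networking/bonding.txt (/sys/class/net/bonding_masters to add or remove a bond, /sys/class/net/<bond>/bonding/slaves to enslave a device). The helper name write_attr is hypothetical and is not part of the patch.

```c
#include <stdio.h>

/* Hypothetical helper: write one newline-terminated value to an
 * attribute file. Returns 0 on success, -1 on failure. */
int write_attr(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%s\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Under those assumptions, write_attr("/sys/class/net/bonding_masters", "+bond1") would ask the module to create bond1, and write_attr("/sys/class/net/bond1/bonding/slaves", "+eth0") would enslave eth0 to it. Both need root and a loaded bonding module.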
Mitch Williams [Wed, 9 Nov 2005 18:36:25 +0000 (10:36 -0800)]
[PATCH] bonding: add ARP entries to /proc
Make the /proc files show which ARP targets are in use by each bond.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mitch Williams [Wed, 9 Nov 2005 18:36:19 +0000 (10:36 -0800)]
[PATCH] bonding: Allow ARP target table to have empty entries
With the sysfs interface, the user can remove entries from the ARP table
at runtime. The ARP monitor code now allows for empty entries in the
table.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
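The idea can be sketched as a scan that treats a zero slot as a hole rather than as the end of the table. This is an illustration only: the table size, the function name probe_arp_targets and the callback are assumptions, not the driver's actual code.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ARP_TARGETS 16	/* illustrative table size */

/* Visit every slot. A zero entry is a hole left by a removed target,
 * not the end of the list. Returns the number of targets visited. */
int probe_arp_targets(const uint32_t targets[MAX_ARP_TARGETS],
		      void (*probe)(uint32_t addr))
{
	int visited = 0;

	for (size_t i = 0; i < MAX_ARP_TARGETS; i++) {
		if (targets[i] == 0)
			continue;	/* skip the hole, keep scanning */
		if (probe)
			probe(targets[i]);
		visited++;
	}
	return visited;
}
```

A loop that stops at the first zero entry would silently ignore every target stored after a removed one.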
Mitch Williams [Wed, 9 Nov 2005 18:36:11 +0000 (10:36 -0800)]
[PATCH] bonding: make bond_init not __init
The sysfs interface can create bonds at runtime, and __init code goes away
after module init.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mitch Williams [Wed, 9 Nov 2005 18:36:04 +0000 (10:36 -0800)]
[PATCH] bonding: move bond creation into separate function
The sysfs interface can create bonds at runtime, so we need a separate
function to do this, instead of just doing it in the module init code.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mitch Williams [Wed, 9 Nov 2005 18:35:51 +0000 (10:35 -0800)]
[PATCH] bonding: make functions not static
The sysfs code needs access to these functions, so make them
not static, and move the prototypes to the header file.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mitch Williams [Wed, 9 Nov 2005 18:35:44 +0000 (10:35 -0800)]
[PATCH] bonding: expose some structs
The sysfs code needs to know what these structs look like, so make them
not static, and move the definition to the header.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mitch Williams [Wed, 9 Nov 2005 18:35:35 +0000 (10:35 -0800)]
[PATCH] bonding: explicitly clear RLB flag during ALB init
Explicitly clear RLB flag during ALB init. This is needed for sysfs
support, since the bond mode can be changed at runtime via sysfs.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mitch Williams [Wed, 9 Nov 2005 18:35:30 +0000 (10:35 -0800)]
[PATCH] bonding: move kmalloc out of spinlock in ALB init
Move memory allocations out of the spinlock during ALB init. This gets
rid of a sleeping-inside-spinlock warning and accompanying stack dump.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
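The pattern is: allocate first, then take the lock only to publish the result. The block below is a user-space analogue, not the driver code. It uses a C11 atomic_flag as a stand-in spinlock and calloc() in place of kmalloc(); all names are hypothetical.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct table_owner {
	atomic_flag lock;	/* stand-in for the bond's spinlock */
	int *hash_table;
	size_t entries;
};

static void spin_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;	/* busy-wait until the holder clears the flag */
}

static void spin_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Returns 0 on success, -1 if the allocation fails. */
int table_init(struct table_owner *o, size_t entries)
{
	/* Allocate before locking: the allocator may block. */
	int *tbl = calloc(entries, sizeof(*tbl));

	if (!tbl)
		return -1;

	spin_lock(&o->lock);
	o->hash_table = tbl;	/* only pointer stores happen under the lock */
	o->entries = entries;
	spin_unlock(&o->lock);
	return 0;
}
```

In the kernel the same reasoning applies because an allocation with GFP_KERNEL is allowed to sleep, which is not permitted while a spinlock is held.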
Mitch Williams [Wed, 9 Nov 2005 18:35:21 +0000 (10:35 -0800)]
[PATCH] bonding: get slave name from actual slave instead of param list
Take the primary slave name shown in /proc from the actual slave dev
instead of from the command-line parameter, which won't be present
if the bond is created via sysfs.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mitch Williams [Wed, 9 Nov 2005 18:35:13 +0000 (10:35 -0800)]
[PATCH] bonding: Add transmit policy to /proc
Adds information about the recently-added transmit policy setting to each
bond's /proc file.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mitch Williams [Wed, 9 Nov 2005 18:35:03 +0000 (10:35 -0800)]
[PATCH] bonding: expand module param descriptions
Expand and correct the parameter descriptions shown by modinfo.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mitch Williams [Wed, 9 Nov 2005 18:34:57 +0000 (10:34 -0800)]
[PATCH] bonding: add bond name to all error messages
Add the bond name to all error messages so we can tell which one is
complaining. Also reformats some error messages to be more consistent.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mitch Williams [Wed, 9 Nov 2005 18:34:45 +0000 (10:34 -0800)]
[PATCH] net: make dev_valid_name public
dev_valid_name() is a useful function. Make it public.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Mitch Williams [Wed, 9 Nov 2005 18:34:01 +0000 (10:34 -0800)]
[PATCH] net: allow newline terminated IP addresses in in_aton
in_aton() gives weird results if it sees a newline at the end of the
input. This patch makes it able to handle such input correctly.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Acked-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
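A sketch of newline-tolerant dotted-quad parsing, written as a stand-alone user-space function. The name parse_dotted_quad is hypothetical and, unlike in_aton(), the result is returned in host byte order. Input arriving through sysfs from "echo" typically ends in a newline, which is why the terminator matters.

```c
#include <stdint.h>

/* Parse "a.b.c.d", optionally followed by '\n', into a host-order
 * 32-bit value. No validation is done: the input is assumed to
 * contain only digits, dots and an optional trailing newline. */
uint32_t parse_dotted_quad(const char *s)
{
	uint32_t addr = 0;

	for (int i = 0; i < 4; i++) {
		uint32_t octet = 0;

		/* A newline ends the number just like '.' or NUL does. */
		while (*s != '\0' && *s != '.' && *s != '\n') {
			octet = octet * 10 + (uint32_t)(*s - '0');
			s++;
		}
		addr = (addr << 8) | (octet & 0xff);
		if (*s == '.')
			s++;	/* step over the separator */
	}
	return addr;
}
```

Without the '\n' check, the newline would be folded into the last octet as if it were a digit, which is one way to get the "weird results" mentioned above.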
Neil Horman [Sat, 12 Nov 2005 00:08:24 +0000 (16:08 -0800)]
[SCTP]: Include ulpevents in socket receive buffer accounting.
Also introduces a sysctl option to configure the receive buffer
accounting policy to be either at socket or association level.
The default is that all associations on the same socket share the
receive buffer.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[SCTP]: Fix ia64 NaT consumption fault with sctp_sideffect commands.
On ia64, it is possible to get NaT Consumption Fault and a kernel panic
when initializing sctp sideeffect commands arguments. The union
sctp_arg_t contains different sized elements and when loading a smaller
sized element (32 or 16 bits), it is possible for a speculative load to
fail and result in a NaT bit set which causes a kernel crash. The easy
way to get around it is to load the largest member of the union.
Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
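The workaround can be illustrated with a small union whose members differ in size. This is a sketch under assumptions: arg_t and arg_from_u16 are stand-ins for illustration, not the definitions in the SCTP headers, and the pointer is assumed to be the widest member.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for a union with members of different sizes. */
typedef union {
	void *ptr;	/* assumed to be the widest member */
	uint32_t u32;
	uint16_t u16;
} arg_t;

/* Store through the widest member first so that every byte of the
 * union is defined, then store the narrow value the caller wants. */
static inline arg_t arg_from_u16(uint16_t v)
{
	arg_t a;

	a.ptr = NULL;
	a.u16 = v;
	return a;
}
```

Returning the union by value copies the whole object, so leaving the upper bytes uninitialized is what exposes the full-width load to undefined register state.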
[SCTP]: Remove timeouts[] array from sctp_endpoint.
The socket level timeout values are maintained in sctp_sock and
association level timeouts are in sctp_association. So there is
no need for ep->timeouts.
Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[SCTP]: Fix potential NULL pointer dereference in sctp_v4_get_saddr
It is possible to get to sctp_v4_get_saddr() without a valid
association. This happens when processing OOTB packets and
the cached route entry is no longer valid.
However, when responding to OOTB packets we already properly
set the source address based on the information in the OOTB
packet. So, if we get to sctp_v4_get_saddr() without an
association we can simply return.
Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 11 Nov 2005 22:27:32 +0000 (14:27 -0800)]
ppc64: default build as the merged 'powerpc' architecture
After the last merge of the new unified 'powerpc' architecture, ppc64 no
longer compiles cleanly as a standalone architecture. Some bits and
pieces still exist as files under the old ppc64 hierarchy, but the old
"ARCH=ppc64" is dead.
So if "uname" says ppc64, that now implies that the default architecture
should be "powerpc".
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nicolas Pitre [Fri, 11 Nov 2005 21:51:49 +0000 (21:51 +0000)]
[ARM] 3152/1: make various assembly local labels actually local (the rest)
Patch from Nicolas Pitre
For assembly labels to actually be local they must start with ".L" and
not only "." otherwise they still remain visible in the final link and
clutter kallsyms needlessly, and possibly make for unclear symbolic
backtrace. This patch simply inserts a "L" where appropriate. The code
itself is unchanged.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Nicolas Pitre [Fri, 11 Nov 2005 21:51:48 +0000 (21:51 +0000)]
[ARM] 3151/1: make various assembly local labels actually local (io-*.S)
Patch from Nicolas Pitre
For assembly labels to actually be local they must start with ".L" and
not only "." otherwise they still remain visible in the final link and
clutter kallsyms needlessly, and possibly make for unclear symbolic
backtrace. This patch simply inserts a "L" where appropriate. The code
itself is unchanged.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Nicolas Pitre [Fri, 11 Nov 2005 21:51:47 +0000 (21:51 +0000)]
[ARM] 3150/1: make various assembly local labels actually local (uaccess.S)
Patch from Nicolas Pitre
For assembly labels to actually be local they must start with ".L" and
not only "." otherwise they still remain visible in the final link and
clutter kallsyms needlessly, and possibly make for unclear symbolic
backtrace. This patch simply inserts a "L" where appropriate. The code
itself is unchanged.
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
David S. Miller [Fri, 11 Nov 2005 20:48:56 +0000 (12:48 -0800)]
[SPARC64]: Restore 2.4.x /proc/cpuinfo behavior for "ncpus probed" field.
Noticed by Tom 'spot' Callaway.
Even on uniprocessor we always reported the number of physical
cpus in the system via /proc/cpuinfo. But when this got changed
to use num_possible_cpus() it always reads as "1" on uniprocessor.
This change was unintentional.
So scan the firmware device tree and count the number of cpu
nodes, and report that, as we always did.
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Maule [Fri, 11 Nov 2005 17:52:43 +0000 (11:52 -0600)]
[IA64-SGI] set altix preferred console
Fix default VGA console on SN platforms. Since SN firmware does not pass
enough ACPI information to identify VGA cards and the associated legacy IO/MEM
addresses, we rely on the EFI PCDP table. Since the linux pcdp driver is
optional (and overridden if console= directives are used) SN duplicates a
portion of the pcdp scan code to identify if there is a usable console VGA
adapter. Additionally, duplicate the necessary pcdp-related structs to avoid dragging
drivers/pcdp.h into a more public location.
Signed-off-by: Mark Maule <maule@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Robin Holt [Fri, 11 Nov 2005 15:35:43 +0000 (09:35 -0600)]
[IA64] 4-level page tables
This patch introduces 4-level page tables to ia64. I have run
some benchmarks and found nothing interesting. Performance has
consistently fallen within the noise range.
It also introduces a config option (setting the default to 3
levels). The config option prevents having 4 level page
tables with 64k base page size.
Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Ralf Baechle [Wed, 9 Nov 2005 17:10:05 +0000 (17:10 +0000)]
[PATCH] SAA9730: Driver overhaul
o Try to work around some of the undocumented "features" of the SAA9730
o Use netdev_priv() instead of the previous broken mechanism to allocate
the private data structure.
o Try to make sure we don't leak resources on exit.
o No more need to call SET_MODULE_OWNER in 2.6.
o Use pci_free_consistent instead of homegrown architecture-specific
allocation.
Ayaz Abdulla [Fri, 11 Nov 2005 13:30:38 +0000 (08:30 -0500)]
[netdrvr forcedeth] support for irq mitigation
This patch contains support for different modes of interrupt mitigation
of forcedeth. It includes changes based on Jeff's comments. Currently,
the modes are changed through module parameters since ethtool does not
support something similar.
Frank Pavlic [Thu, 10 Nov 2005 12:51:25 +0000 (13:51 +0100)]
[PATCH] s390: introduce guestLan sniffer support in qeth
[patch 6/7] s390: introduce guestLan sniffer support in qeth
From: Peter Tiedemann <ptiedem@de.ibm.com>
- introduce guestLan sniffer support in qeth.
This feature allows a Linux guest in a virtual machine
to become a network LAN sniffer,
monitoring and recording the network traffic
within an entire guestLan.
Frank Pavlic [Thu, 10 Nov 2005 12:51:17 +0000 (13:51 +0100)]
[PATCH] s390: fix recovery failure of non-guestLAN devices
[patch 5/7] s390: fix recovery failure of non-guestLAN devices
From: Frank Pavlic <fpavlic@de.ibm.com>
- Recovery of non-guestLAN Layer 2 device failed due to
trying to register the real MAC address we got from
the READ_MAC adapter parameters command.
We have to keep the "old" MAC address when we process
the reply of a READ_MAC.
Frank Pavlic [Thu, 10 Nov 2005 12:50:58 +0000 (13:50 +0100)]
[PATCH] s390: some more qeth fixes
[patch 4/7] s390: some more qeth fixes
From: Frank Pavlic <fpavlic@de.ibm.com>
From: Peter Tiedemann <ptiedem@de.ibm.com>
- possible race on list fixed by resetting
list processing after every operation
- traffic hang fixed
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
diffstat:
qeth_main.c | 11 +++++++----
1 files changed, 7 insertions(+), 4 deletions(-) Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
From: Klaus Dieter Wacker <kdwacker@de.ibm.com>
- when running in Layer 2 mode we don't have to register
the multicast IP address but only the group MAC address.
Therefore for Layer 2 devices it is enough to go
through dev->mc_list list and register these entries.
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
diffstat:
qeth_main.c | 106 +++++++++++++++++++++++++++++++++++++++++++++---------------
1 files changed, 80 insertions(+), 26 deletions(-) Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Frank Pavlic [Thu, 10 Nov 2005 12:49:15 +0000 (13:49 +0100)]
[PATCH] s390: minor modification in qeth layer2 code
[patch 2/7] s390: minor modification in qeth layer2 code
From: Frank Pavlic <fpavlic@de.ibm.com>
- use qeth_layer2_send_setdelvlan_cb to check
return code of a SET/DELVLAN IP Assist command.
It fits better in qeth's design and mechanism of IP Assist
command handling.
Signed-off-by: Frank Pavlic <fpavlic@de.ibm.com>
diffstat:
qeth_main.c | 40 ++++++++++++++++++++++++++--------------
1 files changed, 26 insertions(+), 14 deletions(-) Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Alan Cox [Tue, 8 Nov 2005 14:09:44 +0000 (14:09 +0000)]
[PATCH] libata: propagate host private data from probe function
This will let me chop the code size of several drivers right down. In
many cases the actual private data is very useful and constant for a
given host controller so being able to just pass it at probe time would
be very useful indeed (e.g. with the via driver we could pass the udma
clocking and reduce the code size, or with the AMD one the UDMA
multiplier and the offset)
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
* Merge PCMCIA card table with new Brodowski PCMCIA id table.
* Add missing entries to PCMCIA id table.
* Other tweaks to conform with Documentation/driver-changes.txt
(types, call request_region, etc)
* Fix size of requested IO region.
* Reduce printk verbosity.
* Remove EXPERIMENTAL
* tweak to association code - don't force shared key authentication
when wep in use.
Paul Mackerras [Fri, 11 Nov 2005 11:36:34 +0000 (22:36 +1100)]
powerpc: Fix reading and writing SPRs from xmon on 32-bit
When we created the instructions to read/write SPRs in xmon, we were
setting up a ppc64-style procedure descriptor and calling that, which
doesn't work in 32-bit. For 32-bit a function pointer just points
to the instructions of the function. This fixes it to do the right
thing for both 32-bit and 64-bit.
Paul Mackerras [Fri, 11 Nov 2005 11:34:43 +0000 (22:34 +1100)]
powerpc: Initialize secondary CPU setup for 32-bit SMP
32-bit SMP powermacs weren't booting with ARCH=powerpc because the
boot cpu wasn't saving away the state of various control registers,
but the secondary CPUs were loading them from the uninitialized
state. This adds the necessary save-state call.
[PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel
This patch moves the vdso's to arch/powerpc, adds support for the 32
bits vdso to the 32 bits kernel, renames systemcfg (finally!), and adds
some new (still untested) routines to both vdso's: clock_gettime() with
support for CLOCK_REALTIME and CLOCK_MONOTONIC, clock_getres() (same
clocks) and get_tbfreq() for glibc to retrieve the timebase frequency.
Tom,Steve: The implementation of get_tbfreq() I've done for 32 bits
returns a long long (r3, r4) not a long. This is such that if we ever
add support for >4GHz timebases on ppc32, the userland interface won't
have to change.
I have tested gettimeofday() using some glibc patches in both ppc32 and
ppc64 kernels using 32 bits userland (I haven't had a chance to test a
64 bits userland yet, but the implementation didn't change and was
tested earlier). I haven't yet tested the new functions.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
David Gibson [Fri, 11 Nov 2005 05:42:12 +0000 (16:42 +1100)]
[PATCH] powerpc: Move udbg code to arch/powerpc
Since the udbg code in ppc64 has no ppc32 equivalent, move it straight
over into arch/powerpc (and include/asm-powerpc for udbg.h). In time,
we probably want to meld the various bits and pieces of 32-bit early
debugging code into udbg, but for now only include it on
CONFIG_PPC64=y builds. The only change during the move is to
standardise the protecting #ifdef/#define in udbg.h, and move its
banner comment above the initial #ifdef (which seems to be normal
practice).
Built and booted on POWER5 LPAR (ARCH=powerpc and ARCH=ppc64). Built
for 32bit multiplatform (ARCH=powerpc).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
Anton Blanchard [Fri, 11 Nov 2005 04:02:03 +0000 (15:02 +1100)]
[PATCH] ppc64: Increase sparsemem defaults
The definitions in sparsemem.h aren't sufficient. We currently sell
machines with 2TB of RAM, and in order to give us room for a few years'
growth let's set it to 16TB.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Anton Blanchard [Fri, 11 Nov 2005 03:22:35 +0000 (14:22 +1100)]
[PATCH] ppc64: Convert NUMA to sparsemem (3)
Convert to sparsemem and remove all the discontigmem code in the
process. This has a few advantages:
- The old numa_memory_lookup_table can go away
- All the arch specific discontigmem magic can go away
We also remove the triple pass of memory properties and instead create a
list of per node extents that we iterate through. A final cleanup would
be to change our lmb code to store extents per node, then we can reuse
that information in the numa code.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Anton Blanchard [Fri, 11 Nov 2005 02:53:11 +0000 (13:53 +1100)]
[PATCH] ppc64: Quieten lparcfg
If we don't have permission to read some information from the hypervisor,
lparcfg outputs a warning on the console. Now that lparcfg is world
readable this is a problem.
Don't warn in the case of H_Authority, remove some unnecessary function
prototypes and fix whitespace damage in a structure as well.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Kumar Gala [Thu, 10 Nov 2005 16:34:33 +0000 (10:34 -0600)]
[PATCH] ppc32: fix PQ2 PCI DMA interrupt handling
The bit position in the status register corresponding to the
PCI DMA interrupt was incorrect. Additionally, we did not
have a define for the PCI DMA interrupt.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Andrew Morton [Fri, 11 Nov 2005 00:21:38 +0000 (16:21 -0800)]
[PATCH] libata.h needs dma-mapping.h
On Alpha:
include/linux/libata.h: In function `ata_pad_alloc':
include/linux/libata.h:785: warning: implicit declaration of function `dma_alloc_coherent'
include/linux/libata.h:786: warning: assignment makes pointer from integer without a cast
include/linux/libata.h: In function `ata_pad_free':
include/linux/libata.h:792: warning: implicit declaration of function `dma_free_coherent'
(I have a decouple-some-header-files cleanup in -mm, so it's causing some
fallout of this nature)
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Use "hints" to speed up the SACK processing. Various forms
of this have been used by TCP developers (Web100, STCP, BIC)
to avoid the 2x linear search of outstanding segments.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
John Heffner [Fri, 11 Nov 2005 01:11:48 +0000 (17:11 -0800)]
[TCP]: receive buffer growth limiting with mixed MTU
This is a patch for discussion addressing some receive buffer growing issues.
This is partially related to the thread "Possible BUG in IPv4 TCP window
handling..." last week.
Specifically it addresses the problem of an interaction between rcvbuf
moderation (receiver autotuning) and rcv_ssthresh. The problem occurs when
sending small packets to a receiver with a larger MTU. (A very common case I
have is a host with a 1500 byte MTU sending to a host with a 9k MTU.) In
such a case, the rcv_ssthresh code is targeting a window size corresponding
to filling up the current rcvbuf, not taking into account that the new rcvbuf
moderation may increase the rcvbuf size.
One hunk makes rcv_ssthresh use tcp_rmem[2] as the size target rather than
rcvbuf. The other changes the behavior when it overflows its memory bounds
with in-order data so that it tries to grow rcvbuf (the same as with
out-of-order data).
These changes should help my problem of mixed MTUs, and should also help the
case from last week's thread I think. (In both cases though you still need
tcp_rmem[2] to be set much larger than the TCP window.) One question is if
this is too aggressive at trying to increase rcvbuf if it's under memory
stress.
Originally-from: John Heffner <jheffner@psc.edu> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
This is an updated version of the RFC3465 ABC patch originally
for Linux 2.6.11-rc4 by Yee-Ting Li. ABC is a way of counting
bytes ack'd rather than packets when updating congestion control.
The original ABC described in the RFC applied to a Reno style
algorithm. For advanced congestion control there is little
change after leaving slow start.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
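A sketch of byte counting as RFC 3465 describes it for a Reno-style sender, with the window kept in bytes. This follows the RFC's description rather than the patch itself (Linux tracks the window in packets), and the struct and function names are hypothetical.

```c
#include <stdint.h>

struct cc_state {
	uint32_t cwnd;		/* congestion window, bytes */
	uint32_t ssthresh;	/* slow start threshold, bytes */
	uint32_t bytes_acked;	/* bytes acked since the last increase */
	uint32_t smss;		/* sender maximum segment size, bytes */
};

/* Apply one ACK covering `acked` new bytes. `limit` is the RFC's L:
 * the cap, in segments, on slow-start growth per ACK. */
void abc_on_ack(struct cc_state *s, uint32_t acked, uint32_t limit)
{
	if (s->cwnd < s->ssthresh) {
		/* Slow start: grow by the bytes acked, capped at L*SMSS. */
		uint32_t cap = limit * s->smss;

		s->cwnd += acked < cap ? acked : cap;
		return;
	}

	/* Congestion avoidance: one SMSS per cwnd's worth of acked bytes. */
	s->bytes_acked += acked;
	if (s->bytes_acked >= s->cwnd) {
		s->bytes_acked -= s->cwnd;
		s->cwnd += s->smss;
	}
}
```

Counting bytes rather than ACKs means a receiver that acknowledges every other segment, or splits its ACKs, no longer changes how fast the window grows.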
Simplify the code that computes microsecond rtt estimate used
by TCP Vegas. Move the callback out of the RTT sampler and into
the end of the ack cleanup.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[TCP]: fix congestion window update when using TSO deferral
TCP performance with TSO over networks with delay is awful.
On a 100Mbit link with 150ms delay, we get 4Mbits/sec with TSO and
50Mbits/sec without TSO.
The problem is that with TSO, we intentionally do not keep the maximum
number of packets in flight to fill the window; we hold out until
we can send a MSS chunk. But, we also don't update the congestion window
unless we have filled it, as per RFC2861.
This patch replaces the check for the congestion window being full
with something smarter that accounts for TSO.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
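To put the quoted figures in terms of window size, the helper below computes how many bytes must be in flight to sustain a given rate. It treats the 150ms as the round-trip time, which the message does not state, and the function name is hypothetical.

```c
#include <stdint.h>

/* Bytes that must be in flight to sustain rate_bps (bits per second)
 * over a round-trip time of rtt_ms milliseconds. */
uint64_t window_bytes(uint64_t rate_bps, uint64_t rtt_ms)
{
	return rate_bps * rtt_ms / 1000 / 8;
}
```

Under that assumption, filling the 100Mbit link needs window_bytes(100000000, 150) = 1875000 bytes in flight, while 4Mbits/sec corresponds to only 75000 bytes, so the congestion window was staying far below what the path could carry.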