[TCP]: Don't over-clamp window in tcp_clamp_window()
From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Handle better the case where the sender sends full sized
frames initially, then moves to a mode where it trickles
out small amounts of data at a time.
This known problem is even mentioned in the comments
above tcp_grow_window() in tcp_input.c, specifically:
...
* The scheme does not work when sender sends good segments opening
* window and then starts to feed us spagetti. But it should work
* in common situations. Otherwise, we have to rely on queue collapsing.
...
When the sender gives full sized frames, the "struct sk_buff" overhead
from each packet is small. So we'll advertize a larger window.
If the sender moves to a mode where small segments are sent, this
ratio becomes tilted to the other extreme and we start overrunning
the socket buffer space.
tcp_clamp_window() tries to address this, but it's clamping of
tp->window_clamp is a wee bit too aggressive for this particular case.
Fix confirmed by Ion Badulescu.
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Kuznetsov has explained the situation as follows:
--------------------
I think the fix is incorrect. Look, the RFC function init_cwnd(mss) is
not continuous: f.e. for mss=1095 it needs initial window 1095*4, but
for mss=1096 it is 1096*3. We do not know exactly what mss sender used
for calculations. If we advertised 1096 (and calculate initial window
3*1096), the sender could limit it to some value < 1096 and then it
will need window his_mss*4 > 3*1096 to send initial burst.
See?
So, the honest function for inital rcv_wnd derived from
tcp_init_cwnd() is:
init_rcv_wnd(mss)=
min { init_cwnd(mss1)*mss1 for mss1 <= mss }
It is something sort of:
if (mss < 1096)
return mss*4;
if (mss < 1096*2)
return 1096*4;
return mss*2;
(I just scrablled a graph of piece of paper, it is difficult to see or
to explain without this)
I selected it differently giving more window than it is strictly
required. Initial receive window must be large enough to allow sender
following to the rfc (or just setting initial cwnd to 2) to send
initial burst. But besides that it is arbitrary, so I decided to give
slack space of one segment.
Actually, the logic was:
If mss is low/normal (<=ethernet), set window to receive more than
initial burst allowed by rfc under the worst conditions
i.e. mss*4. This gives slack space of 1 segment for ethernet frames.
For msses slighlty more than ethernet frame, take 3. Try to give slack
space of 1 frame again.
If mss is huge, force 2*mss. No slack space.
Value 1460*3 is really confusing. Minimal one is 1096*2, but besides
that it is an arbitrary value. It was meant to be ~4096. 1460*3 is
just the magic number from RFC, 1460*3 = 1095*4 is the magic :-), so
that I guess hands typed this themselves.
--------------------
Signed-off-by: David S. Miller <davem@davemloft.net>
[PATCH] fix TASK_STOPPED vs TASK_NONINTERACTIVE interaction
do_signal_stop:
for_each_thread(t) {
if (t->state < TASK_STOPPED)
++sig->group_stop_count;
}
However, TASK_NONINTERACTIVE > TASK_STOPPED, so this loop will not
count TASK_INTERRUPTIBLE | TASK_NONINTERACTIVE threads.
See also wait_task_stopped(), which checks ->state > TASK_STOPPED.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
[ We really probably should always use the appropriate bitmasks to test
task states, not do it like this. Using something like
and then doing "if (task->state & TASK_RUNNABLE)" or similar. But the
ordering of the task states is historical, and keeping the ordering
does make sense regardless. ]
[PATCH] intelfb: Fix regression (blank display) from ioremap patch
- Workaround for the ioremap patch that produces a blank display on some
chipsets
- Make hwcursor = 0 the default. The hardware cursor does not work with all
hardware.
Al Viro [Wed, 28 Sep 2005 23:31:14 +0000 (00:31 +0100)]
[PATCH] ppc32 ld.script fix for building on ppc64
In arch/ppc/boot/ld.script we need OUTPUT_ARCH(powerpc:common) for the
same reasons why we need it in vmlinux.lds.S; when we build on ppc64
box, we need to be explicit about the target.
See http://linus.bkbits.net:8080/linux-2.5/cset@1.1784.8.10 for the
corresponding fix in vmlinux.lds.S.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Wed, 28 Sep 2005 21:27:23 +0000 (22:27 +0100)]
[PATCH] uml makefiles sanitized
UML makefiles sanitized:
- number of generated headers reduced to 2 (from user-offsets.c and
kernel-offsets.c resp.). The rest is made constant and simply
includes those two.
- mk_... helpers are gone now that we don't need to generate these
headers
- arch/um/include2 removed since everything under arch/um/include/sysdep
is constant now and symlink can point straight to source tree.
- dependencies seriously simplified.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Howells [Wed, 28 Sep 2005 16:03:15 +0000 (17:03 +0100)]
[PATCH] Keys: Add possessor permissions to keys [try #3]
The attached patch adds extra permission grants to keys for the possessor of a
key in addition to the owner, group and other permissions bits. This makes
SUID binaries easier to support without going as far as labelling keys and key
targets using the LSM facilities.
This patch adds a second "pointer type" to key structures (struct key_ref *)
that can have the bottom bit of the address set to indicate the possession of
a key. This is propagated through searches from the keyring to the discovered
key. It has been made a separate type so that the compiler can spot attempts
to dereference a potentially incorrect pointer.
The "possession" attribute can't be attached to a key structure directly as
it's not an intrinsic property of a key.
Pointers to keys have been replaced with struct key_ref *'s wherever
possession information needs to be passed through.
This does assume that the bottom bit of the pointer will always be zero on
return from kmem_cache_alloc().
The key reference type has been made into a typedef so that at least it can be
located in the sources, even though it's basically a pointer to an undefined
type. I've also renamed the accessor functions to be more useful, and all
reference variables should now end in "_ref".
Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In kmalloc_node we are checking if the allocation is for the same node when
interrupts are "on". This may lead to an allocation on another node than
intended.
This patch just shifts the check for the current node in __cache_alloc_node
when interrupts are disabled.
Signed-off-by: Alok N Kataria <alokk@calsoftinc.com> Acked-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
My previous patch fixing invalidation of huge PTEs wasn't good enough, we
still had an issue if a PTE invalidation batch contained both small and
large pages. This patch fixes this by making sure the batch is flushed if
the page size fed to it changes.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When creating a multipath device, if the queue_if_no_path parameter is
specified it gets ignored.
While the queue_if_no_path variable is correctly set to 1, the
saved_queue_if_no_path gets set to 0. When the device is subsequently made
live (resumed), the saved value (0) always overwrites the live value (1) so
the option *always* gets turned off.
The fix adds a parameter to the queue_if_no_path() function to indicate
whether the previous value should be preserved or not - if not, as when the
device is being set up, the saved value is set to the new value (1).
Signed-Off-By: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
goggin, edward [Wed, 28 Sep 2005 04:45:44 +0000 (21:45 -0700)]
[PATCH] device-mapper: Trigger an event when a table is deleted
If anything is waiting on a device's table when the device is removed, we
must first wake it up so it will release its reference. Otherwise the
table's reference count will not drop to zero and the table will not get
removed.
Signed-Off-By: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] swsusp: avoid problems if there are too many pages to save
The following patch makes swsusp avoid problems during resume if there are
too many pages to save on suspend. It adds a constant that allows us to
verify if we are going to save too many pages and implements the check
(this is done as early as we can tell that the check will trigger, which is
in swsusp_alloc()).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] orinoco: Fix flood of kernel log with stupid WE warnings
Latest wireless extensions moved a field from netdev -> wireless_handlers.
The WE core will now printk a warning on every call to get_wireless_stats()
on a driver that still uses the old field. This patch fixes orinoco.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Anton Blanchard [Wed, 28 Sep 2005 04:45:38 +0000 (21:45 -0700)]
[PATCH] ppc64: Add missing barrier() in kexec code
Mikey and I were testing kexec and hit a lockup. It turns out gcc 4.0
optimises the kexec_prepare_cpus loop so we avoid reloading paca.hw_cpu_id.
A gcc barrier() fixes the problem.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paul Jackson [Wed, 28 Sep 2005 04:45:37 +0000 (21:45 -0700)]
[PATCH] cpuset maintainers
Specify the cpuset maintainers.
Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Simon Derr <simon.derr@bull.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
john stultz [Wed, 28 Sep 2005 04:45:36 +0000 (21:45 -0700)]
[PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
This should resolve the issue seen in bugme bug #5105, where it is assumed
that dualcore x86_64 systems have synced TSCs. This is not the case, and
alternate timesources should be used instead.
For more details, see:
http://bugzilla.kernel.org/show_bug.cgi?id=5105
Andi's earlier concerns that the TSCs should be synced on dualcore systems
have been resolved by confirmation from AMD folks that they can be
unsynced.
Rusty Russell [Wed, 28 Sep 2005 04:45:34 +0000 (21:45 -0700)]
[PATCH] Ignore trailing whitespace on kernel parameters correctly
Dave Jones says:
... if the modprobe.conf has trailing whitespace, modules fail to load
with the following helpful message..
snd_intel8x0: Unknown parameter `'
Previous version truncated last argument.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Prevent swsusp from leaking some memory in case of an error in
read_pagedir(). It also prevents the BUG_ON() from triggering if there's
an error while reading swap.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The following patch removes some wrong code from the data_free() function
in swsusp.
This function could only be called if there's an error while writing the
suspend image to swap, so it is not triggered easily. However, if
triggered, it would probably corrupt some memory.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jon Burgess [Wed, 28 Sep 2005 04:45:26 +0000 (21:45 -0700)]
[PATCH] dvb: fix NULL pointer dereference when loading the budget-av module
Ralph Metzler wrote:
> AFAIR, there is a bug in tda10021.c in tda10021_readreg() which
> references state->frontend.dvb->num
> This is fatal if the frontend is not at the probed address and thus
> not yet registered (no dvb entry set yet -> NULL pointer ...).
The attached patch should get rid of the oops.
Signed-off-by: Jon Burgess <jburgess@uklinux.net> Cc: Johannes Stezenbach <js@linuxtv.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fid management cleanup. The patch attempts to fix the races in dentry's
fid management.
Dentries don't keep the opened fids anymore, they are moved to the file
structs. Ideally there should be no more than one fid with fidcreate equal
to zero in the dentry's list of fids.
v9fs_fid_create initializes the important fields (fid, fidcreated) before
v9fs_fid is added to the list. v9fs_fid_lookup returns only fids that are
not created by v9fs_create. v9fs_fid_get_created returns the fid created
by the same process by v9fs_create (if any) and removes it from dentry's
list
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Chris Sykes [Wed, 28 Sep 2005 04:45:23 +0000 (21:45 -0700)]
[PATCH] Fix ext3_new_inode() failure paths
Fix failure paths in ext3_new_inode() and clean up duplicated code: -
DQUOT_DROP() was not being called if ext3_init_security() failed.
Signed-off-by: Chris Sykes <chris@sigsegv.plus.com> Cc: Stephen Smalley <sds@epoch.ncsc.mil> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Chris Sykes [Wed, 28 Sep 2005 04:45:22 +0000 (21:45 -0700)]
[PATCH] Fix ext2_new_inode() failure paths
Fix failure paths in ext2_new_inode() and clean up duplicated code: -
DQUOT_DROP() was not being called if ext2_init_security() failed.
Signed-off-by: Chris Sykes <chris@sigsegv.plus.com> Cc: Stephen Smalley <sds@epoch.ncsc.mil> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add information about required version of the userspace library/utilities
to Documentation/Changes. Also add pointer to this and to FUSE
documentation from Kconfig.
Nick Piggin [Wed, 28 Sep 2005 04:45:18 +0000 (21:45 -0700)]
[PATCH] mm: move_pte to remap ZERO_PAGE
Move the ZERO_PAGE remapping complexity to the move_pte macro in
asm-generic, have it conditionally depend on
__HAVE_ARCH_MULTIPLE_ZERO_PAGE, which gets defined for MIPS.
For architectures without __HAVE_ARCH_MULTIPLE_ZERO_PAGE, move_pte becomes
a noop.
From: Hugh Dickins <hugh@veritas.com>
Fix nasty little bug we've missed in Nick's mremap move ZERO_PAGE patch.
The "pte" at that point may be a swap entry or a pte_file entry: we must
check pte_present before perhaps corrupting such an entry.
Patch below against 2.6.14-rc2-mm1, but the same bug is in 2.6.14-rc2's
mm/mremap.c, and more dangerous there since it's affecting all arches: I
think the safest course is to send Nick's patch and Yoichi's build fix and
this fix (build tested) on to Linus - so only MIPS can be affected.
Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David S. Miller [Wed, 28 Sep 2005 05:50:06 +0000 (22:50 -0700)]
[SPARC64]: Add missing IDs for newer cpus.
Also, the us3_cpufreq driver can work on Ultra-IV and IV+.
They use the SAFARI bus register to control the clock divider
just like Ultra-III and III+ do.
Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Dawid [Tue, 27 Sep 2005 23:11:29 +0000 (16:11 -0700)]
[APPLETALK]: Fix broadcast bug.
From: Oliver Dawid <oliver@helios.de>
we found a bug in net/appletalk/ddp.c concerning broadcast packets. In
kernel 2.4 it was working fine. The bug first occured 4 years ago when
switching to new SNAP layer handling. This bug can be splitted up into a
sending(1) and reception(2) problem:
Sending(1)
In kernel 2.4 broadcast packets were sent to a matching ethernet device
and atalk_rcv() was called to receive it as "loopback" (so loopback
packets were shortcutted and handled in DDP layer).
When switching to the new SNAP structure, this shortcut was removed and
the loopback packet was send to SNAP layer. The author forgot to replace
the remote device pointer by the loopback device pointer before sending
the packet to SNAP layer (by calling ddp_dl->request() ) therfor the
packet was not sent back by underlying layers to ddp's atalk_rcv().
Reception(2)
In atalk_rcv() a packet received by this loopback mechanism contains now
the (rigth) loopback device pointer (in Kernel 2.4 it was the (wrong)
remote ethernet device pointer) and therefor no matching socket will be
found to deliver this packet to. Because a broadcast packet should be
send to the first matching socket (as it is done in many other protocols
(?)), we removed the network comparison in broadcast case.
Below you will find a patch to correct this bug. Its diffed to kernel
2.6.14-rc1
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
We know the thing is at least 2-byte aligned, so take
advantage of that instead of invoking memcmp() which
results in truly horrifically inefficient code because
it can't assume anything about alignment.
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Dooks [Tue, 27 Sep 2005 22:59:43 +0000 (15:59 -0700)]
[NET]: Fix GCC4 compile error: sysctl in linux/if_ether.h
The following is generated when compiling a
recent (2.6.14-rc2-git5) kernel configured for
ARM, with GCC4.
CC init/main.o
In file included from include/linux/netdevice.h:29,
from include/net/sock.h:48,
from init/main.c:50:
include/linux/if_ether.h:114: error: array type has incomplete element type
It seems that if CONFIG_SYSCTL is not set, then
the compiler will throw an error due to the definition
of the ether_table[] array
Attached is a solution to the problem
Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 27 Sep 2005 22:24:13 +0000 (15:24 -0700)]
[NET]: Add Sun Cassini driver.
Written by Adrian Sun (asun@darksunrising.com).
Ported to 2.6.x by Tom 'spot' Callaway <tcallawa@redhat.com>.
Further cleaned up and integrated by David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
Frank Filz [Tue, 27 Sep 2005 22:23:38 +0000 (15:23 -0700)]
[NET]: Fix module reference counts for loadable protocol modules
I have been experimenting with loadable protocol modules, and ran into
several issues with module reference counting.
The first issue was that __module_get failed at the BUG_ON check at
the top of the routine (checking that my module reference count was
not zero) when I created the first socket. When sk_alloc() is called,
my module reference count was still 0. When I looked at why sctp
didn't have this problem, I discovered that sctp creates a control
socket during module init (when the module ref count is not 0), which
keeps the reference count non-zero. This section has been updated to
address the point Stephen raised about checking the return value of
try_module_get().
The next problem arose when my socket init routine returned an error.
This resulted in my module reference count being decremented below 0.
My socket ops->release routine was also being called. The issue here
is that sock_release() calls the ops->release routine and decrements
the ref count if sock->ops is not NULL. Since the socket probably
didn't get correctly initialized, this should not be done, so we will
set sock->ops to NULL because we will not call try_module_get().
While searching for another bug, I also noticed that sys_accept() has
a possibility of doing a module_put() when it did not do an
__module_get so I re-ordered the call to security_socket_accept().
Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[PATCH] Make POSIX message queue sys_mq_open() honor umask
We ignored umask when creating new queues via mq_open (when creating
with open() on mqueue fs it is ok of course). According to the
specification this a bug. This trivial patch fixes this.
Signed-off-by: Krzysztof Benedyczak <golbi@mat.uni.torun.pl> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Michael Chan [Tue, 27 Sep 2005 19:13:10 +0000 (12:13 -0700)]
[TG3]: misc. fixes
Fix interrupt test handler by adding check for IRQ assertion in
PCI_STATE register in addition to the status block updated bit.
Add test for valid ethernet address in tg3_set_mac_addr().
Add tg3_bus_string() to setup the PCI bus speed/width string for all
PCI/PCIX/PCI Express devices. This is used to print the bus type
during init_one().
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Tue, 27 Sep 2005 19:12:42 +0000 (12:12 -0700)]
[TG3]: 5780 PHY fixes
Fix 5780 PHY related problems:
1. MAC_RX_MODE reset must be done before setting up the MAC_MODE
register on 5705_PLUS chips or the chip will stop receiving after
a while. The MAC_RX_MODE reset is needed to prevent intermittently
losing the first receive packet on serdes chips.
2. Skip MAC loopback test on 5780 because of hardware errata. Normal
traffic including PHY loopback is not affected by the errata.
3. PHY loopback fails intermittently on 5708S and this is fixed by
putting the PHY in loopback mode first before programming the MAC
mode register. A MAC_RX_MODE reset is also added.
4. Return -EINVAL in tg3_nway_reset() if device is in TBI mode. Allow
nway_reset if 5780S is in parallel detect mode.
5. Add missing PHY IDs in KNOWN_PHY_ID() macro.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 27 Sep 2005 19:07:44 +0000 (12:07 -0700)]
[NEIGH]: Add debugging check when adding timers.
If we double-add a neighbour entry timer, which should be
impossible but has been reported, dump the current state of
the entry so that we can debug this.
Signed-off-by: David S. Miller <davem@davemloft.net>
[IB] mthca: Round up number of slots in HCA context memory table
When allocating a table for mem-free HCA context, don't assume that
obj_size * nobj is an even multiple of MTHCA_TABLE_CHUNK_SIZE. In
particular, make sure we allocate at least one slot even if the table
is smaller than MTHCA_TABLE_CHUNK_SIZE.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Al Viro [Mon, 26 Sep 2005 06:49:27 +0000 (07:49 +0100)]
[PATCH] missing dependency on arm O= builds
arm maketools needs include/asm-arm in place in the build tree.
On normal builds it's always there, of course, but on O= it's created
(by generic code) too late - when we get to asm-offset.h.
We used to get away with that by accident - creation of
include/asm-arm/arch symlink creates include/asm-arm and it happened
to go before maketools. However, we did not have such dependency,
so that luck didn't last - now maketools is picked first and we are screwed.
Both the symlink and maketools are prerequisites of the same
target (archprepare). This fix is obvious - make the latter explicitly
depend on the former and be done with that.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 26 Sep 2005 04:49:44 +0000 (05:49 +0100)]
[PATCH] useless includes of linux/irq.h in arch/i386
Most of these guys are simply not needed (pulled by other stuff
via asm-i386/hardirq.h). One that is not entirely useless is hilarious -
arch/i386/oprofile/nmi_timer_int.c includes linux/irq.h... as a way to
get linux/errno.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David S. Miller [Mon, 26 Sep 2005 23:12:18 +0000 (16:12 -0700)]
[SPARC64]: Do not do TLB pre-filling any more.
In order to do it correctly on UltraSPARC-III+ and later we'd
need to add some complicated code to set the TAG access extension
register before loading the TLB.
Since this optimization gives questionable gains, it's best to
just remove it for now instead of adding the fix for Ultra-III+
Signed-off-by: David S. Miller <davem@davemloft.net>
Harald Welte [Mon, 26 Sep 2005 22:25:11 +0000 (15:25 -0700)]
[NETFILTER]: Fix invalid module autoloading by splitting iptable_nat
When you've enabled conntrack and NAT as a module (standard case in all
distributions), and you've also enabled the new conntrack netlink
interface, loading ip_conntrack_netlink.ko will auto-load iptable_nat.ko.
This causes a huge performance penalty, since for every packet you iterate
the nat code, even if you don't want it.
This patch splits iptable_nat.ko into the NAT core (ip_nat.ko) and the
iptables frontend (iptable_nat.ko). Threfore, ip_conntrack_netlink.ko will
only pull ip_nat.ko, but not the frontend. ip_nat.ko will "only" allocate
some resources, but not affect runtime performance.
This separation is also a nice step in anticipation of new packet filters
(nf-hipac, ipset, pkttables) being able to use the NAT core.
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Mon, 26 Sep 2005 22:10:16 +0000 (15:10 -0700)]
[IPV6]: Fix [Bug 5306] Oops on IPv6 route lookup
> Steps to reproduce:
> 1. Boot Linux, do NOT setup any IPv6 routes
> 2. ip route get 2001::1 (or any unroutable address)
Well caught. We never set rt6i_idev on ip6_null_entry.
This patch should make the problem go away.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Williamson [Mon, 26 Sep 2005 21:28:02 +0000 (14:28 -0700)]
[NET]: Make sure ctl buffer is aligned properly in sys_sendmsg().
It's on the stack and declared as "unsigned char[]", but pointers
and similar can be in here thus we need to give it an explicit
alignment attribute.
Signed-off-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>