David S. Miller [Mon, 25 Apr 2005 02:12:33 +0000 (19:12 -0700)]
[TCP]: skb pcount with MTU discovery
The problem is that when doing MTU discovery, the too-large segments in
the write queue will be calculated as having a pcount of >1. When
tcp_write_xmit() is trying to send, tcp_snd_test() fails the cwnd test
when pcount > cwnd.
The segments are eventually transmitted one at a time by keepalive, but
this can take a long time.
This patch checks if TSO is enabled when setting pcount.
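The idea can be sketched roughly as follows. This is not the kernel function; segment_pcount() and its parameters are hypothetical names for illustration, and mss is assumed to be non-zero.

```c
#include <stddef.h>

/* Hypothetical helper: how many packets one queued segment is counted
 * as when it is tested against cwnd. */
unsigned int segment_pcount(size_t len, size_t mss, int tso_enabled)
{
    /* Without TSO, count the segment as a single packet, even if a
     * smaller MSS after MTU discovery makes it look oversized. */
    if (!tso_enabled || len <= mss)
        return 1;

    /* With TSO the count is len / mss, rounded up. */
    return (unsigned int)((len + mss - 1) / mss);
}
```

With TSO off, a 3000-byte segment against a 1000-byte MSS counts as 1 rather than 3, so it no longer fails the cwnd test on pcount alone.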
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
chas williams [Mon, 25 Apr 2005 01:55:35 +0000 (18:55 -0700)]
[ATM]: [he] Use the DMA_32BIT_MASK constant from dma-mapping.h
Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replacing the open coded equivalents and making ax25 look more like
a linux network protocol, i.e. more similar to inet.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Mon, 25 Apr 2005 01:41:38 +0000 (18:41 -0700)]
[NETFILTER]: Fix NAT sequence number adjustment
The NAT changes in 2.6.11 changed the position where helpers
are called and perform packet mangling. Before 2.6.11, a NAT
helper was called before the packet was NATed and had its
sequence number adjusted. Since 2.6.11, the helpers get packets
with already adjusted sequence numbers.
This breaks sequence number adjustment, adjust_tcp_sequence()
needs the original sequence number to determine whether
a packet was a retransmission and to store it for further
corrections. It can't be reconstructed without more information
than available, so this patch restores the old order by
calling helpers from a new conntrack hook two priorities
below ip_conntrack_confirm() and adjusting the sequence number
from a new NAT hook one priority below ip_conntrack_confirm().
Tracked down by Phil Oester <kernel@linuxace.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Sun, 24 Apr 2005 19:28:36 +0000 (12:28 -0700)]
[PATCH] mostek bogus sparse annotations fixed
void * __iomem foo is not a pointer to iomem - it's an iomem variable
containing void *. A pile of such guys in arch/sparc64/kernel/time.c,
drivers/sbus/char/rtc.c and include/asm-sparc64/mostek.h turned into
intended void __iomem *.
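A small sketch of the two declarations. The __iomem definition below is an approximation of how the kernel exposes it to sparse and is an assumption here; bogus_regs and mostek_regs are made-up names. The placement rule is the same as for const: a qualifier after the '*' applies to the variable, one before it applies to what is pointed at.

```c
/* Only sparse (__CHECKER__) sees the attributes; a normal compiler
 * sees an empty macro. Approximate definition, for illustration. */
#ifdef __CHECKER__
# define __iomem __attribute__((noderef, address_space(2)))
#else
# define __iomem
#endif

/* Bogus: the annotation follows the '*', so it applies to the variable
 * itself: an ordinary void pointer that happens to live in iomem. */
void * __iomem bogus_regs;

/* Intended: the annotation precedes the '*', so it applies to the
 * pointed-to memory: a pointer to iomem. */
void __iomem *mostek_regs;
```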
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sun, 24 Apr 2005 19:28:35 +0000 (12:28 -0700)]
[PATCH] broken dependency for floppy on ARM
(!ARCH_S390 && !M68K && !IA64 && !UML) is obviously always true on ARM.
Intended behaviour for ARM is "absent unless we are on RiscPC or
EBSA285". So what we want is added && !ARM in the first term - without
it the last part (|| ARCH_RPC || ARCH_EBSA285, that is) doesn't do
anything.
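The logic can be modelled in C as a sketch. Only the first term and the last two alternatives quoted above are modelled; struct cfg and both function names are hypothetical, and any other alternatives in the real dependency line are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant config symbols. */
struct cfg {
    bool arch_s390, m68k, ia64, uml, arm, arch_rpc, arch_ebsa285;
};

/* Before: on ARM the first term is always true, so the trailing
 * ARCH_RPC / ARCH_EBSA285 alternatives never matter. */
bool floppy_allowed_old(const struct cfg *c)
{
    return (!c->arch_s390 && !c->m68k && !c->ia64 && !c->uml)
        || c->arch_rpc || c->arch_ebsa285;
}

/* After: && !ARM makes the first term false on ARM, so only the
 * explicitly listed ARM platforms get the driver. */
bool floppy_allowed_new(const struct cfg *c)
{
    return (!c->arch_s390 && !c->m68k && !c->ia64 && !c->uml && !c->arm)
        || c->arch_rpc || c->arch_ebsa285;
}
```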
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sun, 24 Apr 2005 19:28:35 +0000 (12:28 -0700)]
[PATCH] __get_unaligned() turned into macro
Turns __get_unaligned() and __put_unaligned into macros. That is
definitely safe; leaving them as inlines breaks on e.g. alpha [try to
build ncpfs there and you'll get unresolved symbols since we end up
getting __get_unaligned() not inlined].
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sun, 24 Apr 2005 19:28:35 +0000 (12:28 -0700)]
[PATCH] broken dependency for I2C_MPC
All boards dealt with by I2C_MPC are 32bit. Moreover, driver simply
won't build on ppc64 - it uses ppc32-only types all over the place.
Dependency fixed - it's PPC32, not PPC.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sun, 24 Apr 2005 19:28:35 +0000 (12:28 -0700)]
[PATCH] missing dependency on sparc64
CONFIG_HW_CONSOLE selects vt.c; without the stuff pulled by CONFIG_VT it
will not build. Normally we get both in drivers/char/Kconfig and there
HW_CONSOLE depends on VT. sparc64 does not pull drivers/char/Kconfig
and has that stuff in arch/sparc64/Kconfig instead. However, it forgets
to add the same dependency. As a result, turning VT off [which is
possible] will end up with a broken build. For no good reason...
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sun, 24 Apr 2005 19:28:34 +0000 (12:28 -0700)]
[PATCH] SCSI GFP fixes
Somebody forgot that | has higher priority than ?:. As a result, the
allocation is done with bogus flags - instead of GFP_ATOMIC + possibly
GFP_DMA we always get GFP_DMA and no GFP_ATOMIC.
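A minimal sketch of the precedence trap. The EX_GFP_* values are placeholders rather than the kernel's flag values, and gfp_buggy() and gfp_fixed() are hypothetical names; the buggy form is written with the parentheses the compiler implies.

```c
/* Placeholder bit values, for illustration only. */
#define EX_GFP_ATOMIC 0x20u
#define EX_GFP_DMA    0x01u

/* What "EX_GFP_ATOMIC | need_dma ? EX_GFP_DMA : 0" actually means:
 * the | binds first, the condition is always non-zero, and the result
 * is always EX_GFP_DMA with no EX_GFP_ATOMIC. */
unsigned int gfp_buggy(unsigned int need_dma)
{
    return (EX_GFP_ATOMIC | need_dma) ? EX_GFP_DMA : 0;
}

/* Intended: EX_GFP_ATOMIC always, plus EX_GFP_DMA only when needed. */
unsigned int gfp_fixed(unsigned int need_dma)
{
    return EX_GFP_ATOMIC | (need_dma ? EX_GFP_DMA : 0);
}
```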
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch is required to support cpu removal for IPF systems. Existing code
just fakes the real offline by keeping the cpu running the idle thread and
polling for its bit to re-appear in the cpu_state to get out of the idle loop.
For the cpu-offline to work correctly, we need to pass control of this CPU
back to SAL so it can continue in the boot-rendez mode. This lets SAL
avoid picking this cpu as the monarch processor for global MCA
events, and in addition SAL does not wait for this cpu to check in
for global MCA events either. The handoff is implemented as documented in
SAL specification section 3.2.5.1 "OS_BOOT_RENDEZ to SAL return State".
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
[IA64] ia32_signal.c: erroneous use of memset/memcpy
Found by Alexander Nyberg, improved by Bjorn Helgaas.
- Fix the incorrect argument to sizeof()
- It looks like the memcpy() code path was derived from code that used
copy_from_user(). But in this case we are doing a kernel space
to kernel space copy, so memcpy is the right routine, but it
doesn't return an error code.
Signed-off-by: Arun Sharma <arun.sharma@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
David S. Miller [Fri, 22 Apr 2005 05:06:13 +0000 (22:06 -0700)]
[SPARC64]: In sunsu driver, make sure to fully init chip for kbd/ms
We were forgetting to call sunsu_change_speed(). The reason
that replugging in the mouse cable "fixes things" is that it
causes a BREAK interrupt, which in turn causes a call to
sunsu_change_speed(), which gets the chip set up properly.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 22 Apr 2005 04:42:34 +0000 (21:42 -0700)]
[SPARC]: Provide generic ioctls in Sparc RTC driver.
Provide support for drivers/char/rtc.c ioctls in the
Mostek rtc driver as well as the Sparc specific RTCGET
and RTCSET.
This allows userspace to be much less messy. Currently
util-linux and other spots jump through hoops trying
various ioctl variants until they hit one that the
driver actually in use supports.
Eventually all of this should move over to the genrtc.c
driver, but not today...
While we are here, fix up the register types for sparse.
Thanks to Frans Pop for helping point out this issue.
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 22 Apr 2005 00:13:59 +0000 (17:13 -0700)]
[TG3]: Add msi test
Add MSI test for chips that support MSI. If MSI test fails, it will
switch back to INTx mode and will print a message asking the user to
report the failure.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 22 Apr 2005 00:09:53 +0000 (17:09 -0700)]
[TG3]: Workaround 5752 A0 chip ID
The 5752 A0 chip ID is wrong in hardware. The simplest way to work around
it is to change it to the correct value in tp->pci_chip_rev_id. This
way, it is easier to check for the ASIC_REV_5752 in the rest of the
driver.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 22 Apr 2005 00:09:08 +0000 (17:09 -0700)]
[TG3]: Fix tg3_set_power_state()
Fix tg3_set_power_state to drive GPIOs properly based on the
TG3_FLAG_EEPROM_WRITE_PROT flag. Some delays are also added after D0
and D3 power state changes.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 22 Apr 2005 00:06:20 +0000 (17:06 -0700)]
[TG3]: Split tg3_phy_probe into 2 functions
Split the 1st half of tg3_phy_probe() into tg3_get_eeprom_hw_cfg() so
that the TG3_FLAG_EEPROM_WRITE_PROT can be determined before calling
tg3_set_power_state() in tg3_get_invariants(). This will allow
tg3_set_power_state() to drive the GPIOs correctly based on the
configuration information in the EEPROM.
On the 5752, there are no pull-up resistors on the GPIO pins and it is
necessary to drive the unused GPIOs as output.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Fri, 22 Apr 2005 00:03:52 +0000 (17:03 -0700)]
[TG3]: add support for bcm5752 rev a1
Replace existing ASIC_REV_5752 definition with ASIC_REV_5752_A0,
and add definition for ASIC_REV_5752_A1. Then, add ASIC_REV_5752_A1
to check for setting TG3_FLG2_5750_PLUS in tg3_get_invariants.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Thu, 21 Apr 2005 23:58:56 +0000 (16:58 -0700)]
[TG3]: add bcm5752 entry to pci_ids.h
Add proper entry for bcm5752 PCI ID to pci_ids.h, and use it in tg3.
I did this separately in case patches like this (i.e. new PCI IDs)
need to come from more "official" sources.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[AX25]: make ax25_queue_xmit take a net_device parameter
I.e. not using skb->dev as a way to pass the parameter used to fill...
skb->dev :-)
Also to get the _type_trans open coded sequence grouped, next changesets
will introduce ax25_type_trans.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
James Bottomley [Thu, 21 Apr 2005 23:20:35 +0000 (16:20 -0700)]
[PATCH] fix subarch breakage in amd dual core updates
The patch to arch/i386/kernel/cpu/amd.c relies on the variable
cpu_core_id which is defined in i386/kernel/smpboot.c. This means it is
only present if CONFIG_X86_SMP is defined, not CONFIG_SMP (alternative
SMP harnesses won't have it, which is why it breaks voyager).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix non-legacy multichannel ISO receive, broken by Parag Wardukar's
allocation fix. Multichannel ISO receive still sucks; it should be possible
to use both legacy and non-legacy modes at the same time, but with this
patch, things are no worse than they were in 2.6.11 and allocation is
still done at the correct time.
The ia64-version of fls() never worked as intended (the bitnumbering
was off by 1 and fls(0) was undefined). This patch fixes the problem
by using a popcnt-based fls(), which on McKinley-derived cores is
slightly faster than both ia64_fls() and generic_fls(). The resulting
code, however, is bigger (7-8 bundles instead of about 3 bundles).
Also switch ia64_popcnt() to __builtin_popcountl() for GCC v3.4 or
newer since the compiler can predicate that and schedule it better.
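A popcount-based fls() can be sketched as below. This is a generic illustration for a 32-bit unsigned int using GCC's __builtin_popcount(), not the ia64 code itself; fls_popcount() is a hypothetical name.

```c
/* Find the last (most significant) set bit, numbered from 1.
 * Returns 0 when no bit is set, so fls(0) is well defined. */
int fls_popcount(unsigned int x)
{
    /* Smear the highest set bit into every lower position... */
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;

    /* ...so the number of set bits equals that bit's 1-based index. */
    return __builtin_popcount(x);
}
```

The smear-and-count approach has no branches and no special case for zero, at the cost of a few more instructions than a single bit-scan.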
Thanks to Simon Derr and Matt Mackall for tracking down this bug.
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Thomas Graf [Wed, 20 Apr 2005 05:35:07 +0000 (22:35 -0700)]
[RTNETLINK]: Protocol family wildcard dumping for routing rules
Be kind to userspace and don't force them to hardcode protocol
families just to have it changed again once we support routing
rules for more than one protocol family.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 20 Apr 2005 05:32:22 +0000 (22:32 -0700)]
[IPV6]: Replace bogus instances of inet->recverr
While looking at this problem I noticed that IPv6 was sometimes
looking at inet->recverr which is bogus. Here is a patch to
correct that and use np->recverr.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
ARM26 define FIRST_USER_ADDRESS as PAGE_SIZE (beyond the machine vectors when
they are mapped low), and use that definition in place of locally defined
MIN_MAP_ADDR. Previously, ARM26 permitted user mappings at 0 if the machine
vectors were mapped high; but that's inconsistent with ARM, and
FIRST_USER_ADDRESS would then have to be determined at runtime. Let's fix it
at PAGE_SIZE throughout the architecture.
ARM define FIRST_USER_ADDRESS as PAGE_SIZE (beyond the machine vectors when
they are mapped low), and use that definition in place of locally defined
MIN_MAP_ADDR.
Remove use of FIRST_USER_PGD_NR from sys_mincore: it's inconsistent (no other
syscall refers to it), unnecessary (sys_mincore loops over vmas further down)
and incorrect (misses user addresses in ARM's first pgd).
[PATCH] freepgt: free_pgtables from FIRST_USER_ADDRESS
The patches to free_pgtables by vma left problems on any architectures which
leave some user address page table entries unencapsulated by vma. Andi has
fixed the 32-bit vDSO on x86_64 to use a vma. Now fix arm (and arm26), whose
first PAGE_SIZE is reserved (perhaps) for machine vectors.
Our calls to free_pgtables must not touch that area, and exit_mmap's
BUG_ON(nr_ptes) must allow that arm's get_pgd_slow may (or may not) have
allocated an extra page table, which its free_pgd_slow would free later.
FIRST_USER_PGD_NR has misled me and others: until all the arches define
FIRST_USER_ADDRESS instead, a hack in mmap.c to derive one from t'other. This
patch fixes the bugs, the remaining patches just clean it up.
Once we're strict about clearing away page tables, hugetlb_prefault can assume
there are no page tables left within its range. Since the other arches
continue if !pte_none here, let i386 do the same.
ia64 and sparc64 hurriedly had to introduce their own variants of
pgd_addr_end, to leapfrog over the holes in their virtual address spaces which
the final clear_page_range suddenly presented when converted from pgd_index to
pgd_addr_end. But now that free_pgtables respects the vma list, those holes
are never presented, and the arch variants can go.
ia64 and ppc64 had hugetlb_free_pgtables functions which were no longer being
called, and it wasn't obvious what to do about them.
The ppc64 case turns out to be easy: the associated tables are noted elsewhere
and freed later, safe to either skip its hugetlb areas or go through the
motions of freeing nothing. Since ia64 does need a special case, restore to
ppc64 the special case of skipping them.
The ia64 hugetlb case has been broken since pgd_addr_end went in, though it
probably appeared to work okay if you just had one such area; in fact it's
been broken much longer if you consider a long munmap spanning from another
region into the hugetlb region.
In the ia64 hugetlb region, more virtual address bits are available than in
the other regions, yet the page tables are structured the same way: the page
at the bottom is larger. Here we need to scale down each addr before passing
it to the standard free_pgd_range. Was about to write a hugely_scaled_down
macro, but found htlbpage_to_page already exists for just this purpose. Fixed
off-by-one in ia64 is_hugepage_only_range.
Uninline free_pgd_range to make it available to ia64. Make sure the
vma-gathering loop in free_pgtables cannot join a hugepage_only_range to any
other (safe to join huges? probably but don't bother).
There's only one usage of MM_VM_SIZE(mm) left, and it's a troublesome macro
because mm doesn't contain the (32-bit emulation?) info needed. But it too is
only needed because we ignore the end from the vma list.
We could make flush_pgtables return that end, or unmap_vmas. Choose the
latter, since it's a natural fit with unmap_mapping_range_vma needing to know
its restart addr. This does make more than minimal change, but if unmap_vmas
had returned the end before, this is how we'd have done it, rather than
storing the break_addr in zap_details.
unmap_vmas used to return count of vmas scanned, but that's just debug which
hasn't been useful in a while; and if we want the map_count 0 on exit check
back, it can easily come from the final remove_vm_struct loop.
Recent woes with some arches needing their own pgd_addr_end macro; and 4-level
clear_page_range regression since 2.6.10's clear_page_tables; and its
long-standing well-known inefficiency in searching throughout the higher-level
page tables for those few entries to clear and free: all can be blamed on
ignoring the list of vmas when we free page tables.
Replace exit_mmap's clear_page_range of the total user address space by
free_pgtables operating on the mm's vma list; unmap_region use it in the same
way, giving floor and ceiling beyond which it may not free tables. This
brings lmbench fork/exec/sh numbers back to 2.6.10 (unless preempt is enabled,
in which case latency fixes spoil unmap_vmas throughput).
Beware: the do_mmap_pgoff driver failure case must now use unmap_region
instead of zap_page_range, since a page table might have been allocated, and
can only be freed while it is touched by some vma.
Move free_pgtables from mmap.c to memory.c, where its lower levels are adapted
from the clear_page_range levels. (Most of free_pgtables' old code was
actually for a non-existent case, prev not properly set up, dating from before
hch gave us split_vma.) Pass mmu_gather** in the public interfaces, since we
might want to add latency lockdrops later; but no attempt to do so yet, going
by vma should itself reduce latency.
But what if is_hugepage_only_range? Those ia64 and ppc64 cases need careful
examination: put that off until a later patch of the series.
What of x86_64's 32bit vdso page __map_syscall32 maps outside any vma?
And the range to sparc64's flush_tlb_pgtables? It's less clear to me now that
we need to do more than is done here - every PMD_SIZE ever occupied will be
flushed, do we really have to flush every PGDIR_SIZE ever partially occupied?
A shame to complicate it unnecessarily.
Special thanks to David Miller for time spent repairing my ceilings.
I can't use list.h, since sk_buff doesn't have a list_head but instead
has two struct sk_buff pointers, and I want to avoid any extra memory
allocation.
send outgoing packets in order
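One way to keep packets in order without list.h and without extra allocation is an intrusive FIFO threaded through the packet's own next pointer. The sketch below uses a hypothetical struct pkt as a stand-in for sk_buff; the queue type and function names are likewise assumptions, not the driver's code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a buffer that carries its own link. */
struct pkt {
    struct pkt *next;
    int id;
};

/* Head and tail pointers give O(1) append and O(1) removal. */
struct pkt_queue {
    struct pkt *head;
    struct pkt *tail;
};

/* Append at the tail; no memory is allocated. */
void pkt_enqueue(struct pkt_queue *q, struct pkt *p)
{
    p->next = NULL;
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
}

/* Remove from the head, so packets leave in arrival order. */
struct pkt *pkt_dequeue(struct pkt_queue *q)
{
    struct pkt *p = q->head;

    if (!p)
        return NULL;
    q->head = p->next;
    if (!q->head)
        q->tail = NULL;
    p->next = NULL;
    return p;
}
```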
Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>