Mark Goodwin [Mon, 25 Apr 2005 20:21:54 +0000 (13:21 -0700)]
[IA64-SGI] Altix SN add support for slots in geoid_t locator
This patch against ia64-test-2.6.12 is needed for forthcoming
Altix chipsets. It renames geoid_any_t to geoid_common_t and
splits the 8bit 'slab' field into two 4bit fields for 'slab'
and 'slot'. Similar changes in the Altix SAL will retain backward
compatibility for old kernels.
Signed-off-by: Mark Goodwin <markgw@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Optimize ia64_leave_syscall() a bit better for McKinley-type cores.
The patch looks big, but that's mostly due to renaming r16/r17 to r2/r3.
Good for a 13 cycle improvement.
The problem is that the size of the physical stacked registers was
loaded into the wrong register (r3 instead of r17). Since r17 by
coincidence always had the value 1, this had the effect of turning
rse_clear_invalid into a no-op. That poses the risk of leaking kernel
state back to user-land and is hence not acceptable.
The fix below is simple, but unfortunately it costs us about 28 cycles
in syscall overhead. ;-(
Unfortunately, there isn't much we can do about that since those
registers have to be cleared one way or another.
Alex Williamson [Mon, 25 Apr 2005 20:14:36 +0000 (13:14 -0700)]
[IA64] sba_iommu bug fixes
This fixes a couple of bugs in the zx1/sx1000 sba_iommu. These are
all pretty low likelihood of hitting. The first problem is a simple off
by one, deep in the sba_alloc_range() error path. Surrounding that was
a lock ordering problem that could have potentially deadlocked with the
order the locks are grabbed in sba_unmap_single(). I moved the resource
locking into sba_search_bitmap() to prevent this. Finally, there's a
potential race between unmapping pdir entries and marking incoming DMA
pages clean. If you see any oddities, please let me know, but I've
tested it pretty thoroughly here. Tony, please apply. Thanks,
BTW, many of the options in this driver not on by default are becoming
more and more broken. I'll be working on some patches to clean them
out, but I wanted to get this bug fix out first.
Signed-off-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Robin Holt [Mon, 25 Apr 2005 20:13:16 +0000 (13:13 -0700)]
[IA64] Percpu quicklist for combined allocator for pgd/pmd/pte.
This patch introduces using the quicklists for pgd, pmd, and pte levels
by combining the alloc and free functions into a common set of routines.
This greatly simplifies the reading of this header file.
This patch is simple but necessary for large numa configurations.
It simply ensures that only pages from the local node are added to a
cpus quicklist. This prevents the trapping of pages on a remote nodes
quicklist by starting a process, touching a large number of pages to
fill pmd and pte entries, migrating to another node, and then unmapping
or exiting. With those conditions, the pages get trapped and if the
machine has more than 100 nodes of the same size, the calculation of
the pgtable high water mark will be larger than any single node so page
table cache flushing will never occur.
I ran lmbench lat_proc fork and lat_proc exec on a zx1 with and without
this patch and did not notice any change.
On an sn2 machine, there was a slight improvement which is possibly
due to pages from other nodes trapped on the test node before starting
the run. I did not investigate further.
This patch shrinks the quicklist based upon free memory on the node
instead of the high/low water marks. I have written it to enable
preemption periodically and recalculate the amount to shrink every time
we have freed enough pages that the quicklist size should have grown.
I rescan the nodes zones each pass because other processess may be
draining node memory at the same time as we are adding.
Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
This patch adds the necessary "hook" to allow SGI/SN
machines to perform a system power off upon a
'init 0', 'halt -p', 'poweroff' or 'shutdown -h'.
The "hook" is to set the pm_power_off callback
to ia64_sn_power_down(). pm_power_off is checked
in machine_power_off()/do_poweroff() and, if set, is executed.
ia64_sn_power_down() is a function already present (but not
used currently) in the sn kernel.
ia64_sn_power_down() makes a SAL call to execute the
power off.
Signed-off-by: Aaron J Young <ayoung@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
[IA64] perfmon: make pfm_sysctl a global, and other cleanup
- make pfm_sysctl a global such that it is possible
to enable/disable debug printk in sampling formats
using PFM_DEBUG.
- remove unused pfm_debug_var variable
- fix a bug in pfm_handle_work where an BUG_ON() could
be triggered. There is a path where pfm_handle_work()
can be called with interrupts enabled, i.e., when
TIF_NEED_RESCHED is set. The fix correct the masking
and unmasking of interrupts in pfm_handle_work() such
that we restore the interrupt mask as it was upon entry.
signed-off-by: stephane eranian <eranian@hpl.hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Recently I noticed that clearing ar.ssd/ar.csd right before srlz.d is
causing significant stalling in the syscall path. The patch below
fixes that by moving the register-writes after srlz.d. On a Madison,
this drops break-based getpid() from 241 to 226 cycles (-15 cycles).
Signed-off-by: David Mosberger-Tang <davidm@hpl.hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Keith Owens [Mon, 25 Apr 2005 18:45:26 +0000 (11:45 -0700)]
[IA64] Tighten up unw_unwind_to_user check
Detect user space by the unwind frame with predicate PRED_USER_STACK
set, instead of a user space IP. Tighten up the last ditch check for
running off the top of the kernel stack.
Based on a suggestion by David Mosberger, reworked to fit the current
tree. This survives my stress test which used to break 2.6.9 kernels.
Unlike 2.6.11, the stress test now unwinds to the correct point, so
gdb can get the user space registers.
Signed-off-by: Keith Owens <kaos@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Jack Steiner [Mon, 25 Apr 2005 18:42:39 +0000 (11:42 -0700)]
[IA64-SGI] Change SAL call request code for SN systems
Change the value of the SAL call number for a new SAL request. The
initial implementation in the PROM did not match what the OS expected.
Since the OS can run on PROMs that do not implement the new call,
changing the call number avoids the issue. New PROMs will implement
the new call number. (This avoids problems with the 4.05 PROM).
Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
replaced declaration of EA from u32 to unsigned long - this beast is
used only to cast it to (userland) pointer and proper integer type for
that is unsigned long.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Mon, 25 Apr 2005 14:55:57 +0000 (07:55 -0700)]
[PATCH] ppc iomem annotations: ->io_base_virt
* ->io_base_virt in struct pci_controller is iomem pointer. Marked as such.
Most of the places that used it are already annotated to expect iomem.
* places that did gratitious (and wrong) casts a-la
isa_io_base = (unsigned long)ioremap(...);
hose->io_base_virt = (void *)isa_io_base;
turned into
hose->io_base_virt = ioremap(...);
isa_io_base = (unsigned long)hose->io_base_virt;
* pci_bus_io_base() annotated as returning iomem pointer.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Peter Jones uncovered this one while we were debugging the framebuffer
issues. There are some references to -1 in the mxcc asm code, which
should be 0xffffffff.
This patch gets rid of the -1s.
Signed-off-by: Tom 'spot' Callaway <tcallawa@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Using the same logic as the other framebuffer fixes committed in 2.6.11,
this is a set of fixes to make TCX functional on the console again. Adds
the tcx_pan_display function, sets the
all->info.var.{red,green,blue}.length values to 8, and runs fb_set_cmap.
Also looks for the correct SUNW,tcx prom value.
This patch just slipped through the cracks.
Originally by: Georg Chini <georg.chini@triaton-webhosting.com>
Signed-off-by: Tom 'spot' Callaway <tcallawa@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Minor cleanups for sparc specific drivers (sunbmac, sunqe, sunlance,
sunhme, esp) so that they have a full module version definition that is
consistent with other upstream drivers.
Signed-off-by: Tom 'spot' Callaway <tcallawa@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Graf [Mon, 25 Apr 2005 03:19:54 +0000 (20:19 -0700)]
[PKT_SCHED]: improve hashing performance of cls_fw
Calculate hashtable size to fit into a page instead of a hardcoded
256 buckets hash table. Results in a 1024 buckets hashtable on
most systems.
Replace old naive extract-8-lsb-bits algorithm with a better
algorithm xor'ing 3 or 4 bit fields at the size of the hashtable
array index in order to improve distribution if the majority of
the lower bits are unused while keeping zero collision behaviour
for the most common use case.
Thanks to Wang Jian <lark@linux.net.cn> for bringing this issue
to attention and to Eran Mann <emann@mrv.com> for the initial
idea for this new algorithm.
Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
The SELinux hooks invoke ipv6_skip_exthdr() with an incorrect
length final argument. However, the length argument turns out
to be superfluous.
I was just reading ipv6_skip_exthdr and it occured to me that we can
get rid of len altogether. The only place where len is used is to
check whether the skb has two bytes for ipv6_opt_hdr. This check
is done by skb_header_pointer/skb_copy_bits anyway.
Now it might appear that we've made the code slower by deferring
the check to skb_copy_bits. However, this check should not trigger
in the common case so this is OK.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 25 Apr 2005 02:12:33 +0000 (19:12 -0700)]
[TCP]: skb pcount with MTU discovery
The problem is that when doing MTU discovery, the too-large segments in
the write queue will be calculated as having a pcount of >1. When
tcp_write_xmit() is trying to send, tcp_snd_test() fails the cwnd test
when pcount > cwnd.
The segments are eventually transmitted one at a time by keepalive, but
this can take a long time.
This patch checks if TSO is enabled when setting pcount.
Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
chas williams [Mon, 25 Apr 2005 01:55:35 +0000 (18:55 -0700)]
[ATM]: [he] Use the DMA_32BIT_MASK constant from dma-mapping.h
Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil> Signed-off-by: David S. Miller <davem@davemloft.net>
Replacing the open coded equivalents and making ax25 look more like
a linux network protocol, i.e. more similar to inet.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Mon, 25 Apr 2005 01:41:38 +0000 (18:41 -0700)]
[NETFILTER]: Fix NAT sequence number adjustment
The NAT changes in 2.6.11 changed the position where helpers
are called and perform packet mangling. Before 2.6.11, a NAT
helper was called before the packet was NATed and had its
sequence number adjusted. Since 2.6.11, the helpers get packets
with already adjusted sequence numbers.
This breaks sequence number adjustment, adjust_tcp_sequence()
needs the original sequence number to determine whether
a packet was a retransmission and to store it for further
corrections. It can't be reconstructed without more information
than available, so this patch restores the old order by
calling helpers from a new conntrack hook two priorities
below ip_conntrack_confirm() and adjusting the sequence number
from a new NAT hook one priority below ip_conntrack_confirm().
Tracked down by Phil Oester <kernel@linuxace.com>
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Sun, 24 Apr 2005 19:28:36 +0000 (12:28 -0700)]
[PATCH] mostek bogus sparse annotations fixed
void * __iomem foo is not a pointer to iomem - it's an iomem variable
containing void *. A pile of such guys in arch/sparc64/kernel/time.c,
drivers/sbus/char/rtc.c and include/asm-sparc64/mostek.h turned into
intended void __iomem *.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sun, 24 Apr 2005 19:28:35 +0000 (12:28 -0700)]
[PATCH] broken dependency for floppy on ARM
(!ARCH_S390 && !M68K && !IA64 && !UML) is obviously always true on ARM.
Intended behaviour for ARM is "absent unless we are on RiscPC or
EBSA285". So what we want is added && !ARM in the first term - without
it the last part (|| ARCH_RPC || ARCH_EBSA285, that is) doesn't do
anything.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sun, 24 Apr 2005 19:28:35 +0000 (12:28 -0700)]
[PATCH] __get_unaligned() turned into macro
Turns __get_unaligned() and __put_unaligned into macros. That is
definitely safe; leaving them as inlines breaks on e.g. alpha [try to
build ncpfs there and you'll get unresolved symbols since we end up
getting __get_unaligned() not inlined].
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sun, 24 Apr 2005 19:28:35 +0000 (12:28 -0700)]
[PATCH] broken dependency for I2C_MPC
All boards dealt with by I2C_MPC are 32bit. Moreover, driver simply
won't build on ppc64 - it uses ppc32-only types all over the place.
Dependency fixed - it's PPC32, not PPC.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sun, 24 Apr 2005 19:28:35 +0000 (12:28 -0700)]
[PATCH] missing dependency on sparc64
CONFIG_HW_CONSOLE selects vt.c; without the stuff pulled by CONFIG_VT it
will not build. Normally we get both in drivers/char/Kconfig and there
HW_CONSOLE depends on VT. sparc64 does not pull drivers/char/Kconfig
and has that sutff in arch/sparc64/Kconfig instead. However, it forgets
to add the same dependency. As the result, turning VT off [which is
possible] will end up with broken build. For no good reason...
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Sun, 24 Apr 2005 19:28:34 +0000 (12:28 -0700)]
[PATCH] SCSI GFP fixes
Somebody forgot that | has higher priority than ?:. As the result,
allocation is done with bogus flags - instead of GFP_ATOMIC + possibly
GFP_DMA we always get GFP_DMA and no GFP_ATOMIC.
Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch is required to support cpu removal for IPF systems. Existing code
just fakes the real offline by keeping it run the idle thread, and polling
for the bit to re-appear in the cpu_state to get out of the idle loop.
For the cpu-offline to work correctly, we need to pass control of this CPU
back to SAL so it can continue in the boot-rendez mode. This gives the
SAL control to not pick this cpu as the monarch processor for global MCA
events, and addition does not wait for this cpu to checkin with SAL
for global MCA events as well. The handoff is implemented as documented in
SAL specification section 3.2.5.1 "OS_BOOT_RENDEZ to SAL return State"
Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
[IA64] ia32_signal.c: erroneous use of memset/memcpy
Found by Alexander Nyberg, improved by Bjorn Helgaas.
- Fix the incorrect argument to sizeof()
- looks like memcpy() code pass was dervived from code that used
copy_from_user(). But in this case we are doing to kernel space
to kernel space copy, so memcpy is the right routine, but it
doesn't return an error code.
Signed-off-by: Arun Sharma <arun.sharma@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
David S. Miller [Fri, 22 Apr 2005 05:06:13 +0000 (22:06 -0700)]
[SPARC64]: In sunsu driver, make sure to fully init chip for kbd/ms
We were forgetting to call sunsu_change_speed(). The reason
that replugging in the mouse cable "fixes things" is that
causes a BREAK interrupt which in turn caused a call to
sunsu_change_speed() which would get the chip setup properly.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 22 Apr 2005 04:42:34 +0000 (21:42 -0700)]
[SPARC]: Provide generic ioctls in Sparc RTC driver.
Provide support for drivers/char/rtc.c ioctls in the
Mostek rtc driver as well as the Sparc specific RTCGET
and RTCSET.
This allows userspace to be much less messy. Currently
util-linux and other spots jump through hoops trying
various ioctl variants until it hits the right one whatever
driver actually being used supports.
Eventually all of this should move over to the genrtc.c
driver, but not today...
While we are here, fix up the register types for sparse.
Thanks to Frans Pop for helping point out this issue.
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 22 Apr 2005 00:13:59 +0000 (17:13 -0700)]
[TG3]: Add msi test
Add MSI test for chips that support MSI. If MSI test fails, it will
switch back to INTx mode and will print a message asking the user to
report the failure.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 22 Apr 2005 00:09:53 +0000 (17:09 -0700)]
[TG3]: Workaround 5752 A0 chip ID
The 5752 A0 chip ID is wrong in hardware. The simplest way to workaround
it is to change it to the correct value in tp->pci_chip_rev_id. This
way, it is easier to check for the ASIC_REV_5752 in the rest of the
driver.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 22 Apr 2005 00:09:08 +0000 (17:09 -0700)]
[TG3]: Fix tg3_set_power_state()
Fix tg3_set_power_state to drive GPIOs properly based on the
TG3_FLAG_EEPROM_WRITE_PROTECT flag. Some delays are also added after D0
and D3 power state changes.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 22 Apr 2005 00:06:20 +0000 (17:06 -0700)]
[TG3]: Split tg3_phy_probe into 2 functions
Split the 1st half of tg3_phy_probe() into tg3_get_eeprom_hw_cfg() so
that the TG3_FLAG_EEPROM_WRITE_PROT can be determined before calling
tg3_set_power_state() in tg3_get_invariants(). This will allow
tg3_set_power_state() to drive the GPIOs correctly based on the config.
information in eeprom.
On the 5752, there are no pull-up resistors on the GPIO pins and it is
necessary to drive the unused GPIOs as output.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Fri, 22 Apr 2005 00:03:52 +0000 (17:03 -0700)]
[TG3]: add support for bcm5752 rev a1
Replace existing ASIC_REV_5752 definition with ASIC_REV_5752_A0,
and add definition for ASIC_REV_5752_A1. Then, add ASIC_REV_5752_A1
to check for setting TG3_FLG2_5750_PLUS in tg3_get_invariants.
Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Thu, 21 Apr 2005 23:58:56 +0000 (16:58 -0700)]
[TG3]: add bcm5752 entry to pci_ids.h
Add proper entry for bcm5752 PCI ID to pci_ids.h, and use it in tg3.
I did this separately in case patches like this (i.e. new PCI IDs)
need to come from more "official" sources.
Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>