Daniel Yeisley [Wed, 15 Feb 2006 23:17:41 +0000 (15:17 -0800)]
[PATCH] x86_64: early initialization of cpu_to_node
The early initialization of cpu_to_node code as it is now only updates the
cpu_to_node array, and does not update cpu_pda()->nodenumber. This will
cause numa_node_id() to return 0 on systems where CPU 0 is not on Node 0.
This leads to a kernel panic in slab.c.
I've tested the patch below on a 16 processor x86_64 ES7000-600 server, and
no longer see the panic I saw with the original 2.6.16-rc3.
Signed-off-by: Dan Yeisley <dan.yeisley@unisys.com> Acked-by: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
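The failure mode described above can be sketched with a small userspace model. This is not the kernel code: the array size, the struct and every name ending in _model are hypothetical, and it only assumes what the message states, namely that numa_node_id() reads the per-CPU PDA while early init wrote only the array.

```c
#include <assert.h>

#define MODEL_NR_CPUS 4  /* hypothetical; stands in for NR_CPUS */

/* Hypothetical stand-in for the per-CPU PDA's node field. */
struct pda_model { int node; };

static int cpu_to_node_model[MODEL_NR_CPUS];
static struct pda_model pda_model[MODEL_NR_CPUS];

/* Early init must record the node in both places. */
static void early_set_cpu_node_model(int cpu, int node)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    cpu_to_node_model[cpu] = node;
    pda_model[cpu].node = node;  /* the update the message says was missing */
}

/* Models numa_node_id(): it consults the PDA, not the array. */
static int numa_node_id_model(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    return pda_model[cpu].node;
}
```

If only the array were written, numa_node_id_model(0) would keep returning the zero-initialized value even when CPU 0 sits on another node.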
Make new MADV_REMOVE, MADV_DONTFORK, MADV_DOFORK consistent across all
arches. The idea is to make it possible to use them portably even before
distros include them in libc headers.
Move common flags to asm-generic/mman.h
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Cc: Roland Dreier <rolandd@cisco.com> Cc: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paul Jackson [Wed, 15 Feb 2006 23:17:38 +0000 (15:17 -0800)]
[PATCH] cpuset: oops in exit on null cpuset fix
Fix a latent bug in cpuset_exit() handling. If a task tried to allocate
memory after calling cpuset_exit(), it oops'd in
cpuset_update_task_memory_state() on a NULL cpuset pointer.
So set the exiting task's cpuset to the root cpuset instead of to NULL.
A distro kernel hit this with an added kernel package that had just such a
hook (allocating memory) in the exit code path.
Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
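A minimal sketch of the idea, using simplified stand-in types rather than the real struct task_struct and struct cpuset. All names ending in _model are hypothetical, and the field shown is only an illustration of "something a late allocation would dereference".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct cpuset_model { int mems_generation; };
struct task_model { struct cpuset_model *cpuset; };

static struct cpuset_model top_cpuset_model = { .mems_generation = 1 };

/* Models a late allocation hook that dereferences the task's cpuset. */
static int read_mems_generation_model(const struct task_model *t)
{
    assert(t->cpuset != NULL);  /* with a NULL cpuset this is the oops */
    return t->cpuset->mems_generation;
}

/* Post-fix behaviour: point the exiting task at the root cpuset. */
static void cpuset_exit_model(struct task_model *t)
{
    t->cpuset = &top_cpuset_model;
}
```

Any code that runs after the exit hook still finds a valid cpuset to read, so it needs no NULL check of its own.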
[PATCH] neofb: avoid resetting display config on unblank (v2)
There were two mistakes in the register-read-on-(un)blank approach.
- First, without proper register (un)locking the value read back will always
be zero, and this is what I missed entirely until just now. Due to this,
the logic could not be verified at all and I tried some bogus checks which
are completely stupid.
- Second, the LCD status bit will always be set to zero when the backlight
has been turned off. Reading the value back during unblank will disable the
LCD unconditionally, regardless of the state it is supposed to be in, since
we set it to zero beforehand.
So this is what we do now:
- create a new variable in struct neofb_par, and use that to determine
whether to read back registers (initialized to true)
- before actually blanking the screen, read back the register to sense any
possible change made through Fn key combo
- use proper neoUnlock() / neoLock() to actually read something
- every call to neofb_blank() determines if we read back next time: blanking
disables readback, unblanking (FB_BLANK_UNBLANK) enables it
This should give us a nice and clean state machine. Has been thoroughly
tested on a Dell Latitude CPiA / NM2200 Chip docked to a C/Dock2 with attached
CRT in all possible combinations of LCD/CRT on/off. I changed the config via
Fn key, let the console blank, unblanked by keypress - works flawlessly.
Signed-off-by: Christian Trefzer <ctrefzer@gmx.de> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
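The state machine listed above can be modelled in userspace roughly as follows. This is a sketch under assumptions, not the driver code: the structs, the read_back flag name and the single display_reg byte are hypothetical, and the lock/unlock assignments merely stand in for neoUnlock() / neoLock().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum blank_mode_model { MODEL_UNBLANK = 0, MODEL_BLANK = 1 };

struct par_model {
    bool read_back;       /* hypothetical name for the new flag */
    unsigned char saved;  /* driver's cached display config */
};

struct hw_model {
    bool locked;
    unsigned char display_reg;
};

/* Without unlocking, the value read back is always zero. */
static unsigned char hw_read_model(const struct hw_model *hw)
{
    return hw->locked ? 0 : hw->display_reg;
}

static void blank_model(struct par_model *par, struct hw_model *hw,
                        enum blank_mode_model mode)
{
    assert(par != NULL && hw != NULL);

    if (par->read_back) {
        hw->locked = false;              /* stands in for neoUnlock() */
        par->saved = hw_read_model(hw);  /* sense any Fn-key change */
        hw->locked = true;               /* stands in for neoLock() */
    }

    if (mode == MODEL_UNBLANK) {
        hw->display_reg = par->saved;    /* restore the sensed config */
        par->read_back = true;           /* next call may read back */
    } else {
        hw->display_reg = 0;             /* backlight off zeroes the LCD bit */
        par->read_back = false;          /* the register is now meaningless */
    }
}
```

A config changed via the Fn key before blanking is sensed on the first blank call, survives repeated blank calls, and is what unblanking restores.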
What if kill_proc_info(p->pid) happens in between?
copy_process() holds current->sighand.siglock, so we are safe
in CLONE_THREAD case, because current->sighand == p->sighand.
Otherwise, p->sighand is unlocked, the new process is already
visible to the find_task_by_pid(), but has a copy of parent's
'struct pid' in ->pids[PIDTYPE_TGID].
This means that __group_complete_signal() may hang while doing
do ... while (next_thread() != p)
We can solve this problem if we reverse these 2 attach_pid()s:
attach_pid() does wmb()
group_send_sig_info() calls spin_lock(), which
provides a read barrier. // Yes ?
I don't think we can hit this race in practice, but still.
Oleg Nesterov [Wed, 15 Feb 2006 19:13:24 +0000 (22:13 +0300)]
[PATCH] fix kill_proc_info() vs CLONE_THREAD race
There is a window after copy_process() unlocks ->sighand.siglock
and before it adds the new thread to the thread list.
In that window __group_complete_signal(SIGKILL) will not see the
new thread yet, so this thread will start running while the whole
thread group was supposed to exit.
I believe we have another good reason to place attach_pid(PID/TGID)
under ->sighand.siglock. We can do the same for
release_task()->__unhash_process()
de_thread()->switch_exec_pids()
After that we don't need tasklist_lock to iterate over the thread
list, and we can simplify things, see for example do_sigaction()
or sys_times().
Looks like somebody forgot to use the _bh spin_lock variant. We ran into a
deadlock where br->hello_timer expired while br_stp_disable_br() walked
br->port_list.
Signed-off-by: Adrian Drzewiecki <z@drze.net> Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Wed, 15 Feb 2006 09:34:23 +0000 (01:34 -0800)]
[NETFILTER]: Fix xfrm lookup after SNAT
To find out if a packet needs to be handled by IPsec after SNAT, packets
are currently rerouted in POST_ROUTING and a new xfrm lookup is done. This
breaks SNAT of non-unicast packets to non-local addresses because the
packet is routed as incoming packet and no neighbour entry is bound to the
dst_entry. In general, it seems to be a bad idea to replace the dst_entry
after the packet was already sent to the output routine because its state
might not match what's expected.
This patch changes the xfrm lookup in POST_ROUTING to re-use the original
dst_entry without routing the packet again. This means no policy routing
can be used for transport mode transforms (which keep the original route)
when packets are SNATed to match the policy, but it looks like the best
we can do for now.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Steve French [Wed, 15 Feb 2006 04:30:52 +0000 (22:30 -0600)]
[PATCH] CIFS: fix cifs_user_read oops when null SMB response on forcedirectio mount
This patch fixes an oops reported by Adrian Bunk in cifs_user_read when a null
read response is returned on a forcedirectio mount.
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] neofb: avoid resetting display config on unblank
Fix issues with the NeoMagic framebuffer driver.
It nicely complements my previous fix already in linus' tree. The only
thing missing now is that the external CRT will not be activated at neofb
init when external-only is selected, either by register read or
module/kernel parameter.
Testing was done on a Dell Latitude CPi-A/NM2200 chip.
Previous behaviour:
- before booting linux, set the preferred display config X via FN+F8
- boot linux, neofb stores the register values in a private
variable
- change the display config to Y via keystroke
- leave the machine in peace until display is blanked
- touching any key will result in display config X being restored
- booting up, the BIOS will acknowledge config Y, though...
Current behaviour:
At the time of unblanking, config Y is honoured because we now read back
register contents instead of just overwriting them with outdated values.
Signed-off-by: Christian Trefzer <ctrefzer@gmx.de> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Howells [Tue, 14 Feb 2006 21:53:20 +0000 (13:53 -0800)]
[PATCH] FRV: Use virtual interrupt disablement
Make the FRV arch use virtual interrupt disablement because accesses to the
processor status register (PSR) are relatively slow and because we will
soon have the need to deal with multiple interrupt controls at the same
time (separate h/w and inter-core interrupts).
The way this is done is to dedicate one of the four integer condition code
registers (ICC2) to maintaining a virtual interrupt disablement state
whilst inside the kernel. This uses the ICC2.Z flag (Zero) to indicate
whether the interrupts are virtually disabled and the ICC2.C flag (Carry)
to indicate whether the interrupts are physically disabled.
ICC2.Z is set to indicate interrupts are virtually disabled. ICC2.C is set
to indicate interrupts are physically enabled. Under normal running
conditions Z==0 and C==1.
Disabling interrupts with local_irq_disable() doesn't then actually
physically disable interrupts - it merely sets ICC2.Z to 1. Should an
interrupt then happen, the exception prologue will note ICC2.Z is set and
branch out of line using one instruction (an unlikely BEQ). Here it will
physically disable interrupts and clear ICC2.C.
When it comes time to enable interrupts (local_irq_enable()), this simply
clears the ICC2.Z flag and invokes a trap #2 if both Z and C flags are
clear (the HI integer condition). This can be done with the TIHI
conditional trap instruction.
The trap then physically reenables interrupts and sets ICC2.C again. Upon
returning the interrupt will be taken as interrupts will then be enabled.
Note that whilst processing the trap, the whole exceptions system is
disabled, and so an interrupt can't happen till it returns.
If no pending interrupt had happened, ICC2.C would still be set, the HI
condition would not be fulfilled, and no trap will happen.
Saving interrupts (local_irq_save) is simply a matter of pulling the ICC2.Z
flag out of the CCR register, shifting it down and masking it off. This
gives a result of 0 if interrupts were enabled and 1 if they weren't.
Restoring interrupts (local_irq_restore) is then a matter of taking the
saved value mentioned previously and XOR'ing it against 1. If it was one,
the result will be zero, and if it was zero the result will be non-zero.
This result is then used to affect the ICC2.Z flag directly (it is a
condition code flag after all). An XOR instruction does not affect the
Carry flag, and so that bit of state is unchanged. The two flags can then
be sampled to see if they're both zero using the trap (TIHI) as for the
unconditional reenablement (local_irq_enable).
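The flag protocol described above can be followed in a small userspace model. It is only a sketch: the struct, the pending and handled bookkeeping and all *_model names are hypothetical, the real implementation is FRV assembly, and the "save also disables" step is assumed from the usual local_irq_save() semantics rather than taken from this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct icc2_model {
    bool z;        /* set: interrupts virtually disabled */
    bool c;        /* set: interrupts physically enabled */
    bool pending;  /* an interrupt arrived while virtually disabled */
    int handled;   /* interrupts actually delivered to the handler */
};

static void init_model(struct icc2_model *s)
{
    assert(s != NULL);
    s->z = false;  /* normal running conditions: Z==0 and C==1 */
    s->c = true;
    s->pending = false;
    s->handled = 0;
}

/* A hardware interrupt arrives. */
static void interrupt_model(struct icc2_model *s)
{
    if (!s->c)
        return;            /* physically masked; stays pending */
    if (s->z) {            /* out-of-line path taken by the prologue */
        s->c = false;      /* physically disable, clear ICC2.C */
        s->pending = true;
        return;
    }
    s->handled++;
}

/* Trap #2: physically re-enable; the pending interrupt is then taken. */
static void trap2_model(struct icc2_model *s)
{
    s->c = true;
    if (s->pending) {
        s->pending = false;
        s->handled++;
    }
}

/* TIHI: trap only if both Z and C are clear (the HI condition). */
static void tihi_model(struct icc2_model *s)
{
    if (!s->z && !s->c)
        trap2_model(s);
}

static void irq_disable_model(struct icc2_model *s) { s->z = true; }

static void irq_enable_model(struct icc2_model *s)
{
    s->z = false;
    tihi_model(s);
}

/* Returns 0 if interrupts were enabled, 1 if not; then disables. */
static unsigned int irq_save_model(struct icc2_model *s)
{
    unsigned int flags = s->z ? 1u : 0u;

    s->z = true;
    return flags;
}

static void irq_restore_model(struct icc2_model *s, unsigned int flags)
{
    s->z = ((flags ^ 1u) == 0);  /* XOR result drives Z; C is untouched */
    tihi_model(s);
}
```

In the common case where no interrupt arrives inside the disabled region, C stays set, the HI condition never holds, and neither the trap nor any PSR access happens.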
This patch also:
(1) Modifies the debugging stub (break.S) to handle single-stepping crossing
into the trap #2 handler and into virtually disabled interrupts.
(2) Removes superseded fixup pointers from the second instructions in the trap
tables (there's now a separate fixup table for this).
(3) Declares the trap #3 vector for use in .org directives in the trap table.
(4) Moves irq_enter() and irq_exit() in do_IRQ() to avoid problems with
virtual interrupt handling, and removes the duplicate code that has now
been folded into irq_exit() (softirq and preemption handling).
(5) Tells the compiler in the arch Makefile that ICC2 is now reserved.
(6) Documents the in-kernel ABI, including the virtual interrupts.
(7) Renames the old irq management functions to different names.
Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ingo Molnar [Tue, 14 Feb 2006 21:53:15 +0000 (13:53 -0800)]
[PATCH] hrtimer: round up relative start time on low-res arches
CONFIG_TIME_LOW_RES is a temporary way for architectures to signal that
they simply return xtime in do_gettimeoffset(). In this corner-case we
want to round up by resolution when starting a relative timer, to avoid
short timeouts. This will go away with the GTOD framework.
Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
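As a rough illustration of the rounding, here is a sketch in plain C. The helper name and its parameters are hypothetical and do not mirror the kernel's ktime API. It assumes the current time reading may lag real time by up to one resolution step, which is presumably why the relative timeout could otherwise come up short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: absolute expiry of a relative timer, in ns. */
static int64_t relative_expiry_ns_model(int64_t now_ns, int64_t rel_ns,
                                        int64_t resolution_ns,
                                        bool time_low_res)
{
    int64_t expires;

    assert(resolution_ns > 0);
    expires = now_ns + rel_ns;
    if (time_low_res)
        expires += resolution_ns;  /* round up so the timeout is never short */
    return expires;
}
```

With a 100 ns resolution, a 500 ns relative timer started at 1000 ns expires at 1600 ns on a low-res architecture and at 1500 ns otherwise.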
Apparently caused more than 10% performance regression for aim7 benchmark.
The setup in use is 16-cpu HP rx8620, 64Gb of memory and 12 MSA1000s with 144
disks. Each disk is 72Gb with a single ext3 filesystem (courtesy of HP, who
supplied benchmark results).
The problem is, for aim7, the wake-up pattern is random, but it still needs
load balancing action in the wake-up path to achieve best performance. With
the above commit, lack of load balancing hurts that workload.
However, for workloads like database transaction processing, the requirement
is exactly opposite. In the wake up path, best performance is achieved with
absolutely zero load balancing. We simply wake up the process on the CPU that
it previously ran on. Worst performance is obtained when we do load
balancing at wake up.
There isn't an easy way to auto detect the workload characteristics. Ingo's
earlier patch that detects idle CPUs and decides whether to load balance or not
doesn't perform with aim7 either since all CPUs are busy (it causes even
bigger perf. regression).
Currently, copy-on-write may change the physical address of a page even if the
user requested that the page is pinned in memory (either by mlock or by
get_user_pages). This happens if the process forks meanwhile, and the parent
writes to that page. As a result, the page is orphaned: in case of
get_user_pages, the application will never see any data hardware DMA's into
this page after the COW. In case of mlock'd memory, the parent is not getting
the realtime/security benefits of mlock.
In particular, this affects the Infiniband modules which do DMA from and into
user pages all the time.
This patch adds madvise options to control whether memory range is inherited
across fork. Useful e.g. for when hardware is doing DMA from/into these
pages. Could also be useful to an application wanting to speed up its forks
by cutting large areas out of consideration.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Acked-by: Hugh Dickins <hugh@veritas.com> Cc: Michael Kerrisk <mtk-manpages@gmx.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
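From userspace, the new options might be used along these lines. This is a sketch, not taken from the patch: the two wrapper names are hypothetical, and the fallback values 10 and 11 are assumed to match the kernel's common definitions for builds against libc headers that predate the flags.

```c
#define _DEFAULT_SOURCE  /* for madvise() under strict -std= modes (glibc) */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallbacks for older libc headers; values assumed, see lead-in. */
#ifndef MADV_DONTFORK
#define MADV_DONTFORK 10
#endif
#ifndef MADV_DOFORK
#define MADV_DOFORK 11
#endif

/* Keep [addr, addr+len) out of children created by fork().
 * Returns 0 on success, -1 with errno set on failure. */
static int exclude_from_fork(void *addr, size_t len)
{
    assert(addr != NULL);
    return madvise(addr, len, MADV_DONTFORK);
}

/* Undo exclude_from_fork(): the range is inherited across fork() again. */
static int include_in_fork(void *addr, size_t len)
{
    assert(addr != NULL);
    return madvise(addr, len, MADV_DOFORK);
}
```

A DMA user would call exclude_from_fork() on a page-aligned buffer before handing it to the hardware. Since the child has no mapping of the range, a later write by the parent has no reason to trigger copy-on-write and the physical page stays put.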
Karsten Keil [Tue, 14 Feb 2006 21:53:06 +0000 (13:53 -0800)]
[PATCH] Fix NULL pointer dereference in isdn_tty_at_cout
The changes in the tty related code introduced wrong parentheses in an if
condition in the isdn_tty_at_cout function. This caused access to index -1
in the dev->drv[] array. This patch changes it back to the correct
condition from the previous versions.
James Bottomley [Tue, 14 Feb 2006 21:53:05 +0000 (13:53 -0800)]
[PATCH] fix x86 topology export in sysfs for subarchitectures
The correct way to export hyperthreading based functions is to predicate
them on CONFIG_X86_HT. Without this, the topology exporting patch breaks
the build on all non-PC x86 subarchitectures.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Trond Myklebust [Tue, 14 Feb 2006 21:53:04 +0000 (13:53 -0800)]
[PATCH] NLM: Fix the NLM_GRANTED callback checks
If 2 threads attached to the same process are blocking on different locks on
different files (maybe even on different servers) but have the same lock
arguments (i.e. same offset+length - actually quite common, since most
processes try to lock the entire file) then the first GRANTED call that wakes
one up will also wake the other.
Currently when the NLM_GRANTED callback comes in, lockd walks the list of
blocked locks in search of a match to the lock that the NLM server has
granted. Although it checks the lock pid, start and end, it fails to check
the filehandle and the server address.
By checking the filehandle and server IP address, we ensure that this only
happens if the locks truly are referencing the same file.
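The stricter match can be sketched as below. The struct layout, the FH_MAX_MODEL bound and the IPv4-only address are hypothetical simplifications and not lockd's real data structures. Only the set of compared fields follows the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FH_MAX_MODEL 64  /* hypothetical bound on filehandle size */

struct lock_key_model {
    uint32_t pid;
    uint64_t start, end;
    uint32_t server_addr;          /* IPv4 address of the NLM server */
    uint8_t fh[FH_MAX_MODEL];      /* opaque filehandle bytes */
    size_t fh_len;
};

/* Does a GRANTED callback refer to this blocked lock? */
static bool granted_matches_model(const struct lock_key_model *blocked,
                                  const struct lock_key_model *granted)
{
    assert(blocked->fh_len <= FH_MAX_MODEL);

    /* The checks that were already made: pid, start and end. */
    if (blocked->pid != granted->pid ||
        blocked->start != granted->start ||
        blocked->end != granted->end)
        return false;

    /* The checks that were missing: server address and filehandle. */
    if (blocked->server_addr != granted->server_addr)
        return false;
    if (blocked->fh_len != granted->fh_len)
        return false;
    return memcmp(blocked->fh, granted->fh, blocked->fh_len) == 0;
}
```

Two whole-file locks held by threads of one process share pid, start and end, so only the last two comparisons tell them apart.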
These seem to be incremental bugfixes on the original patch and as such are
no longer needed.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When the _CRS for a single HPET contains multiple EXTENDED_IRQ resources,
we overwrote hdp->hd_nirqs every time we found one.
So the driver worked when all the IRQs were described in a single
EXTENDED_IRQ resource, but failed when multiple resources were used.
(Strictly speaking, I think the latter is actually more correct, but both
styles have been used.)
Someday we should remove all the ACPI stuff from hpet.c and use PNP driver
registration instead. But currently PNP_MAX_IRQ is 2, and HPETs often have
more IRQs. Hint, hint, Adam :-)
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Acked-by: Bob Picco <robert.picco@hp.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: Adam Belay <ambx1@neo.rr.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
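The difference between overwriting and accumulating can be shown with a simplified model. The structs, their array bounds and the helper name are hypothetical. Only the hd_nirqs counter comes from the text above.

```c
#include <assert.h>

#define MODEL_RES_IRQS 8   /* hypothetical per-resource bound */
#define MODEL_MAX_IRQS 32  /* hypothetical per-HPET bound */

/* Hypothetical description of one EXTENDED_IRQ resource. */
struct ext_irq_model {
    unsigned int count;
    unsigned int irqs[MODEL_RES_IRQS];
};

struct hpet_data_model {
    unsigned int hd_nirqs;
    unsigned int hd_irq[MODEL_MAX_IRQS];
};

/* Append this resource's IRQs instead of resetting hd_nirqs each time. */
static void add_ext_irq_model(struct hpet_data_model *hdp,
                              const struct ext_irq_model *res)
{
    unsigned int i;

    assert(res->count <= MODEL_RES_IRQS);
    for (i = 0; i < res->count && hdp->hd_nirqs < MODEL_MAX_IRQS; i++)
        hdp->hd_irq[hdp->hd_nirqs++] = res->irqs[i];
}
```

Calling the helper once per resource yields the same result whether the firmware lists three IRQs in one resource or spreads them over several.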
Paul Fulghum [Tue, 14 Feb 2006 21:53:00 +0000 (13:53 -0800)]
[PATCH] tty reference count fix
Fix hole where tty structure can be released when reference count is non
zero. Existing code can sleep without tty_sem protection between deciding
to release the tty structure (setting local variables tty_closing and
otty_closing) and setting TTY_CLOSING to prevent further opens. An open
can occur during this interval causing release_dev() to free the tty
structure while it is still referenced.
This should fix bugzilla.kernel.org [Bug 6041] New: Unable to handle kernel
paging request
In Bug 6041, tty_open() oopses on accessing the tty structure it has
successfully claimed. Bug was on SMP machine with the same tty being
opened and closed by multiple processes, and DEBUG_PAGEALLOC enabled.
Signed-off-by: Paul Fulghum <paulkf@microgate.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Hugh Dickins [Tue, 14 Feb 2006 21:52:59 +0000 (13:52 -0800)]
[PATCH] compound page: no access_process_vm check
The PageCompound check before access_process_vm's set_page_dirty_lock is no
longer necessary, so remove it. But leave the PageCompound checks in
bio_set_pages_dirty, dio_bio_complete and nfs_free_user_pages: at least some
of those were introduced as a little optimization on hugetlb pages.
Hugh Dickins [Tue, 14 Feb 2006 21:52:59 +0000 (13:52 -0800)]
[PATCH] compound page: default destructor
Somehow I imagined that calling a NULL destructor would free a compound page
rather than oopsing. No, we must supply a default destructor, __free_pages_ok
using the order noted by prep_compound_page. hugetlb can still replace this
as before with its own free_huge_page pointer.
The case that needs this is not common: rarely does put_compound_page's
put_page_testzero bring the count down to 0. But if get_user_pages is applied
to some part of a compound page, without immediate release (e.g. AIO or
Infiniband), then it's possible for its put_page to come after the containing
vma has been unmapped and the driver done its free_pages.
That's just the kind of case compound pages are supposed to be guarding
against (but Nick points out, nor did PageReserved handle this right).
Hugh Dickins [Tue, 14 Feb 2006 21:52:58 +0000 (13:52 -0800)]
[PATCH] compound page: use page[1].lru
If a compound page has its own put_page_testzero destructor (the only current
example is free_huge_page), that is noted in page[1].mapping of the compound
page. But that's rather a poor place to keep it: functions which call
set_page_dirty_lock after get_user_pages (e.g. Infiniband's
__ib_umem_release) ought to be checking first, otherwise set_page_dirty is
liable to crash on what's not the address of a struct address_space.
And now I'm about to make that worse: it turns out that every compound page
needs a destructor, so we can no longer rely on hugetlb pages going their own
special way, to avoid further problems of page->mapping reuse. For example,
not many people know that: on 50% of i386 -Os builds, the first tail page of a
compound page purports to be PageAnon (when its destructor has an odd
address), which surprises page_add_file_rmap.
Keep the compound page destructor in page[1].lru.next instead. And to free up
the common pairing of mapping and index, also move compound page order from
index to lru.prev. Slab reuses page->lru too: but if we ever need slab to use
compound pages, it can easily stack its use above this.
(akpm: decoded version of the above: the tail pages of a compound page now
have ->mapping==NULL, so there's no need for the set_page_dirty[_lock]()
caller to check that they're not compound pages before doing the dirty).
Peter Osterlund [Tue, 14 Feb 2006 21:52:56 +0000 (13:52 -0800)]
[PATCH] pktcdvd: Reduce stack usage
Reduce stack usage in the pkt_start_write() function. Even though it's not
currently a real problem, the pages and offsets arrays can be eliminated,
which saves approximately 1000 bytes of stack space.
Signed-off-by: Peter Osterlund <petero2@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Peter Osterlund [Tue, 14 Feb 2006 21:52:56 +0000 (13:52 -0800)]
[PATCH] pktcdvd: Don't unlock the door if the disc is in use
Unlocking the door when the disc is in use is obviously not good, because then
it's possible to eject the disc at the wrong time and cause severe disc data
corruption.
Signed-off-by: Peter Osterlund <petero2@telia.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ralf Baechle [Fri, 10 Feb 2006 01:31:24 +0000 (01:31 +0000)]
[MIPS] More uaccess.h fixes with gcc >= 4.0.1.
From Richard Sandiford <richard@codesourcery.com>:
This patch caused a miscompilation of the restore_gp_regs() block
in restore_sigcontext(). This was in a 32-bit kernel compiled with
GCC CVS head.
restore_gp_regs() copies 64-bit user fields into 32-bit variables,
and in this combination, the new __get_user_asm_ll32() clobbers too
many registers. It says:
Atsushi Nemoto [Thu, 9 Feb 2006 15:39:06 +0000 (00:39 +0900)]
[MIPS] Add protected_blast_icache_range, blast_icache_range, etc.
Add blast_xxx_range(), protected_blast_xxx_range() etc. for common
use. They are built by __BUILD_BLAST_CACHE_RANGE().
Use protected_cache_op() macro for various protected_ routines.
Output code should be logically same.
Atsushi Nemoto [Tue, 7 Feb 2006 16:48:03 +0000 (01:48 +0900)]
[MIPS] Rewrite get_wchan and its helper functions using kallsyms_lookup.
Implement get_wchan() and frame_info_init() using kallsyms_lookup().
This fixes problem with static sched/lock functions and mfinfo[]
maintenance issue. If CONFIG_KALLSYMS was disabled, get_wchan() just
returns thread_saved_pc() value.
Also unwind stackframe based on "addiu sp,-imm" analysis instead of
frame pointer. This fixes problem with functions compiled without
-fomit-frame-pointer.
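The "addiu sp,-imm" analysis can be illustrated with a small decoder. This is a sketch and not the kernel's implementation: the function name is hypothetical and it assumes 32-bit MIPS code, where the stack adjustment is encoded as addiu with both register fields set to $sp (register 29).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "addiu sp,sp,imm": opcode 0x09, rs = rt = 29 ($sp), so the upper
 * halfword of the instruction word is 0x27bd. */
#define ADDIU_SP_SP_HI 0x27bdu

/* Scan up to n instruction words from a function's start and return the
 * stack frame size in bytes, or 0 if no "addiu sp,sp,-imm" is found. */
static unsigned int frame_size_model(const uint32_t *code, size_t n)
{
    size_t i;

    assert(code != NULL);
    for (i = 0; i < n; i++) {
        /* The low halfword is a signed 16-bit immediate. */
        int16_t imm = (int16_t)(uint16_t)(code[i] & 0xffffu);

        if ((code[i] >> 16) == ADDIU_SP_SP_HI && imm < 0)
            return (unsigned int)(-(int)imm);
    }
    return 0;
}
```

For the word 0x27bdffe0 (addiu sp,sp,-32) the decoder reports a 32-byte frame. A positive adjustment, as found in an epilogue, is ignored.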
That commit reorganized tests for the userspace stack walking moving all
those tests into dump_backtrace(), however, dump_backtrace() was used for
both userspace and kernel stack walking. The result is typically no
recorded callgraph information for kernel samples.
Revive the original function as dump_kernel_backtrace() and rename the
other to dump_user_backtrace() to avoid future confusion.
Jean Delvare [Sun, 5 Feb 2006 22:13:48 +0000 (23:13 +0100)]
[PATCH] w83781d: Use real-time status registers
Use the real-time status registers of the Winbond W83782D, W83783S and
W83627HF chips, instead of the interrupt status registers. Interrupts
cannot be trusted at least for voltage inputs, as they are two-times
triggers (as opposed to comparator mode, which we want.) The w83627hf
driver was fixed in a similar way some time ago.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Brownell [Mon, 6 Feb 2006 20:15:15 +0000 (12:15 -0800)]
[PATCH] USB: sl811_cs needs platform_device conversion too
The switchover to "platform_driver" from "device_driver" missed
one rather essential usage, which broke the sl811_cs driver ...
this resolves the omission.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Brownell [Thu, 9 Feb 2006 21:35:31 +0000 (16:35 -0500)]
[PATCH] USB: fix up the usb early handoff logic for EHCI
Disable some dubious "early" USB handoff code that allegedly works around bugs
on some systems (we don't know which ones) but rudely breaks some others.
Also make the kernel warnings reporting BIOS handoff problems be more useful,
reporting the register whose value displays the trouble.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kyle McMartin [Tue, 14 Feb 2006 03:44:22 +0000 (22:44 -0500)]
[PATCH] sys_newfstatat -> sys_fstatat64
parisc defines ARCH_WANT_STAT64, so we want to use fstatat64. It does not
appear that it needs to be ENTRY_COMP, because struct stat64 is the same
on both 32-bit and 64-bit (unlike on other platforms which did define a
compat_sys_fstatat64.)
Herbert Xu [Tue, 14 Feb 2006 00:01:27 +0000 (16:01 -0800)]
[IPSEC]: Fix strange IPsec freeze.
Problem discovered and initial patch by Olaf Kirch:
there's a problem with IPsec that has been bugging some of our users
for the last couple of kernel revs. Every now and then, IPsec will
freeze the machine completely. This is with openswan user land,
and with kernels up to and including 2.6.16-rc2.
I managed to debug this a little, and what happens is that we end
up looping in xfrm_lookup, and never get out. With a bit of debug
printks added, I can see this happening:
ip_route_output_flow calls xfrm_lookup
xfrm_find_bundle returns NULL (apparently we're in the
middle of negotiating a new SA or something)
We therefore call xfrm_tmpl_resolve. This returns EAGAIN
We go to sleep, waiting for a policy update.
Then we loop back to the top
Apparently, the dst_orig that was passed into xfrm_lookup
has been dropped from the routing table (obsolete=2)
This leads to the endless loop, because we now create
a new bundle, check the new bundle and find it's stale
(stale_bundle -> xfrm_bundle_ok -> dst_check() return 0)
People have been testing with the patch below, which seems to fix the
problem partially. They still see connection hangs however (things
only clear up when they start a new ping or new ssh). So the patch
is obviously not sufficient, and something else seems to go wrong.
I'm grateful for any hints you may have...
I suggest that we simply bail out always. If the dst decides to die
on us later on, the packet will be dropped anyway. So there is no
great urgency to retry here. Once we have the proper resolution
queueing, we can then do the retry again.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Olaf Kirch <okir@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Mon, 13 Feb 2006 23:53:41 +0000 (15:53 -0800)]
[APPLETALK]: warning fix
drivers/net/appletalk/cops.c: In function `cops_load':
drivers/net/appletalk/cops.c:539: warning: assignment discards qualifiers from pointer target type
drivers/net/appletalk/cops.c:547: warning: assignment discards qualifiers from pointer target type
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Jones [Mon, 13 Feb 2006 23:36:21 +0000 (15:36 -0800)]
[IPV4] ICMP: Invert default for invalid icmp msgs sysctl
isic can trigger these msgs to be spewed at a very high rate.
There's already a sysctl to turn them off. Given these messages
aren't useful for most people, this patch disables them by
default.
Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Jones [Mon, 13 Feb 2006 17:46:58 +0000 (12:46 -0500)]
[PATCH] Remove "RV370 5B60 [Radeon X300 (PCIE)]" from DRI list
I get a machine check exception, triple fault, or NMI watchdog lockup
when DRI gets enabled on this card.
(And Mauro Tassinari <mtassinari@cmanet.it> reports hung kernels too in
http://lkml.org/lkml/2006/1/26/97)
[ Adrian Bunk also states that this is the only RV350 entry for an RV370
in our lists, which implies that it's just buggy ]
Cc: Adrian Bunk <bunk@stusta.de> Cc: Dave Jones <davej@redhat.com> Cc: Mauro Tassinari <mtassinari@cmanet.it> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Mahoney [Mon, 13 Feb 2006 16:12:36 +0000 (11:12 -0500)]
[PATCH] reiserfs: fix potential (unlikely) oops in reiserfs_get_acl
This fixes a potential oops if there is an error reported by
posix_acl_from_disk(). This is mostly theoretical due to the use of
magics and checksums in xattrs, but is still possible.
Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Marcel Holtmann [Mon, 13 Feb 2006 10:40:07 +0000 (11:40 +0100)]
[Bluetooth] Fix firmware loading problem of BT3C driver
Before the PCMCIA subsystem was fully integrated into the device and
driver model, the BT3C driver had to work around this when loading the
firmware. This workaround is broken and makes the driver oops when
loading the firmware. This patch removes this workaround and now uses
the device structure provided by the PCMCIA subsystem.
Marcel Holtmann [Mon, 13 Feb 2006 10:40:03 +0000 (11:40 +0100)]
[Bluetooth] Fix NULL pointer dereferences of the HCI socket
This patch fixes the two NULL pointer dereferences found by the sfuzz
tool from Ilja van Sprundel. The first one was a call of getsockname()
for an unbound socket and the second was calling accept() while this
operation isn't implemented for the HCI socket interface.
Marcel Holtmann [Mon, 13 Feb 2006 10:39:57 +0000 (11:39 +0100)]
[Bluetooth] Reduce L2CAP MTU for RFCOMM connections
This patch reduces the default L2CAP MTU for all RFCOMM connections
from 1024 to 1013 to improve the interoperability with some broken
RFCOMM implementations. To make this more flexible the L2CAP MTU
also becomes a module parameter, so it can be changed at runtime.
Andi Kleen [Sun, 12 Feb 2006 22:34:59 +0000 (14:34 -0800)]
[PATCH] x86_64: GART DMA merging fix
Don't touch the non DMA members in the sg list in dma_map_sg in the IOMMU
Some drivers (in particular ST) ran into problems because they reused the sg
lists after passing them to pci_map_sg(). The merging procedure in the K8
GART IOMMU corrupted the state. This patch changes it to only touch the dma*
entries during merging, but not the other fields. Approach suggested by Dave
Miller.
We found a problem with x86_64 kernels with preemption enabled, where
having multiple tasks doing ptrace singlesteps around the same time will
cause the system to 'oops'. The problem seems that a task can get
preempted out of the do_debug() processing while it is running on the
DEBUG_STACK stack. If another task on that same cpu then enters do_debug()
and uses the same per-cpu DEBUG_STACK stack, the previous preempted task's
stack contents can be corrupted, and the system will oops when the
preempted task is context switched back in again.
This patch disables preemption for the task upon entry to do_debug(), before
interrupts are reenabled, and then re-enables preemption before exiting
do_debug(), after disabling interrupts. I've noticed that the task can be
preempted either at the end of an interrupt, or on the call to
force_sig_info() on the spin_unlock_irqrestore() processing. It might be
better to attempt to code a fix in entry.S around the code that calls
do_debug().
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jesse Allen [Sun, 12 Feb 2006 22:34:56 +0000 (14:34 -0800)]
[PATCH] orinoco: support smc2532w
The orinoco wireless driver can support the SMC 2532W-B PC Card, so add the
id for it.
Signed-off-by: Jesse Allen <the3dfxdude@gmail.com> Cc: Pavel Roskin <proski@gnu.org> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>