Johannes Berg [Sun, 22 Jul 2007 16:11:42 +0000 (18:11 +0200)]
[NET] skbuff: remove export of static symbol
skb_clone_fraglist is static so it shouldn't be exported.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
[NETFILTER]: nf_conntrack: don't track locally generated special ICMP error
The conntrack assigned to locally generated ICMP error is usually the one
assigned to the original packet which has caused the error. But if
the original packet is handled as invalid by nf_conntrack, no conntrack
is assigned to the original packet. Then nf_ct_attach() cannot assign
any conntrack to the ICMP error packet. In that case the current
nf_conntrack_icmp assigns appropriate conntrack to it. But the current
code mistakes the direction of the packet. As a result, NAT code mistakes
the address to be mangled.
To fix the bug, this changes nf_conntrack_icmp not to assign a conntrack
to such ICMP errors. In fact, no address needs to be mangled
in this case.
Albert Lee [Sun, 22 Jul 2007 16:05:44 +0000 (18:05 +0200)]
ide: clear bmdma status in ide_intr() for ICHx controllers (revised #4)
patch 1/2 (revised):
- Fix drive->waiting_for_dma to work with CDB-intr devices.
- Do the dma status clearing in ide_intr() and add a new
hwif->ide_dma_clear_irq for Intel ICHx controllers.
Revised per Alan, Sergei and Bart's advice.
Patch against 2.6.20-rc6. Tested ok on my ICH4 and pdc20275 adapters.
Please review/apply, thanks.
Signed-off-by: Albert Lee <albertcc@tw.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
fix deadlock in the 8139too driver: poll handlers should never forcibly
enable local interrupts, because they might be used by netpoll/printk
from IRQ context.
=================================
[ INFO: inconsistent lock state ]
2.6.19 #11
---------------------------------
inconsistent {softirq-on-W} -> {in-softirq-W} usage.
swapper/1 [HC0[0]:SC1[1]:HE1:SE0] takes:
(&npinfo->poll_lock){-+..}, at: [<c0350a41>] net_rx_action+0x64/0x1de
{softirq-on-W} state was registered at:
[<c0134c86>] mark_lock+0x5b/0x39c
[<c0135012>] mark_held_locks+0x4b/0x68
[<c01351e9>] trace_hardirqs_on+0x115/0x139
[<c02879e6>] rtl8139_poll+0x3d7/0x3f4
[<c035c85d>] netpoll_poll+0x82/0x32f
[<c035c775>] netpoll_send_skb+0xc9/0x12f
[<c035cdcc>] netpoll_send_udp+0x253/0x25b
[<c0288463>] write_msg+0x40/0x65
[<c011cead>] __call_console_drivers+0x45/0x51
[<c011cf16>] _call_console_drivers+0x5d/0x61
[<c011d4fb>] release_console_sem+0x11f/0x1d8
[<c011d7d7>] register_console+0x1ac/0x1b3
[<c02883f8>] init_netconsole+0x55/0x67
[<c010040c>] init+0x9a/0x24e
[<c01049cf>] kernel_thread_helper+0x7/0x10
[<ffffffff>] 0xffffffff
irq event stamp: 819992
hardirqs last enabled at (819992): [<c0350a16>] net_rx_action+0x39/0x1de
hardirqs last disabled at (819991): [<c0350b1e>] net_rx_action+0x141/0x1de
softirqs last enabled at (817552): [<c01214e4>] __do_softirq+0xa3/0xa8
softirqs last disabled at (819987): [<c0106051>] do_softirq+0x5b/0xc9
other info that might help us debug this:
no locks held by swapper/1.
Mark Glines [Sun, 22 Jul 2007 15:51:59 +0000 (17:51 +0200)]
[TCP]: Use default 32768-61000 outgoing port range in all cases.
This diff changes the default port range used for outgoing connections,
from "use 32768-61000 in most cases, but use N-4999 on small boxes
(where N is a multiple of 1024, depending on just *how* small the box
is)" to just "use 32768-61000 in all cases".
I don't believe there are any drawbacks to this change, and it keeps
outgoing connection ports farther away from the mess of
IANA-registered ports.
Signed-off-by: Mark Glines <mark@glines.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
sys_setsockopt() does not properly check timeout values for
SO_RCVTIMEO/SO_SNDTIMEO; for example, it is possible to set negative timeout
values. POSIX does not define the behaviour of setsockopt() for negative
timeouts, but requires that setsockopt() shall fail with -EDOM if the send and
receive timeout values are too big to fit into the timeout fields in the socket
structure.
In the current implementation a negative timeout can lead to error messages like
"schedule_timeout: wrong timeout value".
Proposed patch:
- checks tv_usec and returns -EDOM if it is wrong
- does not allow negative timeout values to be set (sets 0 instead) and outputs
a ratelimited informational message about such attempts.
Signed-off-By: Vasily Averin <vvs@sw.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
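The two rules listed in the patch description can be sketched in userspace C as follows. This is only an illustration, not the kernel code: struct timeout_check and check_sock_timeout() are hypothetical names, and the ratelimited log message is left out.

```c
#include <errno.h>

#define USEC_PER_SEC 1000000L

/* Hypothetical result type: err is 0 or -EDOM, sec/usec hold the
 * sanitized timeout that would be stored in the socket. */
struct timeout_check {
    int err;
    long sec;
    long usec;
};

static struct timeout_check check_sock_timeout(long tv_sec, long tv_usec)
{
    struct timeout_check r = { 0, 0, 0 };

    /* Rule 1: an out-of-range microsecond field is a domain error. */
    if (tv_usec < 0 || tv_usec >= USEC_PER_SEC) {
        r.err = -EDOM;
        return r;
    }
    /* Rule 2: a negative timeout is replaced by 0 instead of being stored. */
    if (tv_sec < 0)
        return r;

    r.sec = tv_sec;
    r.usec = tv_usec;
    return r;
}
```

With this shape, check_sock_timeout(-5, 10) succeeds but yields a zero timeout, while a tv_usec of 2000000 is rejected with -EDOM.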
Jan Engelhardt [Sun, 22 Jul 2007 15:44:18 +0000 (17:44 +0200)]
[SPARC]: Linux always started with 9600 8N1
The Linux kernel ignored the PROM's serial settings (115200,n,8,1 in
my case). This was because mode_prop remained "ttyX-mode" (expected:
"ttya-mode") due to the constness of string literals when used with
"char *". Since there is no "ttyX-mode" property in the PROM, Linux
always used the default 9600.
[ Investigation of the suncore.s assembler reveals that gcc optimized
away the stores, yet did not emit a warning, which is a pretty
anti-social thing to do and is the only reason this bug lived for
so long -DaveM ]
Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
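The underlying pitfall is that a string literal is not writable, so patching the 'X' through a "char *" that points at the literal is undefined behaviour and the compiler may drop the store. A minimal sketch of the safe pattern, using a writable buffer; make_mode_prop() and MODE_PROP_LEN are hypothetical names, not taken from the SPARC serial code:

```c
#include <string.h>

/* "ttyX-mode" is 9 characters plus the terminating NUL. */
#define MODE_PROP_LEN 10

/* Build the PROM property name for a given port letter in a
 * caller-supplied writable buffer. */
static const char *make_mode_prop(char buf[MODE_PROP_LEN], char port)
{
    memcpy(buf, "ttyX-mode", MODE_PROP_LEN);
    buf[3] = port;  /* replace the 'X' placeholder */
    return buf;
}
```

Because the store goes into an array the caller owns, the compiler has no licence to discard it.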
Dave Jones [Sun, 22 Jul 2007 15:42:38 +0000 (17:42 +0200)]
[IPV4]: Correct rp_filter help text.
As mentioned in http://bugzilla.kernel.org/show_bug.cgi?id=5015,
the helptext implies that this is on by default.
This may be true on some distros (Fedora/RHEL have it enabled
in /etc/sysctl.conf), but the kernel defaults to it off.
Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
When creating a new connection by sending an unknown chunk type, we don't
transition to a valid state, causing a NULL pointer dereference in
sctp_packet when accessing sctp_timeouts[SCTP_CONNTRACK_NONE].
Fix by not creating a new conntrack entry if the initial state is invalid.
Noticed by Vilmos Nebehaj <vilmos.nebehaj@ramsys.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Corey Mutter [Tue, 22 May 2007 23:01:53 +0000 (01:01 +0200)]
[IPV6]: Reverse sense of promisc tests in ip6_mc_input
Reverse the sense of the promiscuous-mode tests in ip6_mc_input().
Signed-off-by: Corey Mutter <crm-netdev@mutternet.com> Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
David L Stevens [Tue, 22 May 2007 22:55:49 +0000 (00:55 +0200)]
[IPV6]: Send ICMPv6 error on scope violations.
When an IPv6 router is forwarding a packet with a link-local scope source
address off-link, RFC 4007 requires it to send an ICMPv6 destination
unreachable with code 2 ("not neighbor"), but Linux doesn't. Fix below.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
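The condition being tested can be sketched as below. Link-local unicast addresses are fe80::/10; scope_violation() and its same_link parameter are hypothetical names for illustration, and the actual forwarding path decides "off-link" from routing state rather than a boolean.

```c
#include <stdbool.h>
#include <stdint.h>

/* Link-local unicast is fe80::/10: the first ten bits are 1111 1110 10. */
static bool ipv6_is_link_local(const uint8_t addr[16])
{
    return addr[0] == 0xfe && (addr[1] & 0xc0) == 0x80;
}

/* Hypothetical decision helper: forwarding a packet with a link-local
 * source to a different link is a scope violation, which should be
 * answered with an ICMPv6 destination unreachable, code 2. */
static bool scope_violation(const uint8_t src[16], bool same_link)
{
    return ipv6_is_link_local(src) && !same_link;
}
```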
Srinivas Aji [Tue, 22 May 2007 22:54:10 +0000 (00:54 +0200)]
[TCP]: zero out rx_opt in tcp_disconnect()
When the server drops its connection, NFS client reconnects using the
same socket after disconnecting. If the new connection's SYN,ACK
doesn't contain the TCP timestamp option and the old connection's did,
tp->tcp_header_len is recomputed assuming no timestamp header but
tp->rx_opt.tstamp_ok remains set. Then tcp_build_and_update_options()
adds in a timestamp option past the end of the allocated TCP header,
overwriting TCP data, or when the data is in skb_shinfo(skb)->frags[],
overwriting skb_shinfo(skb) causing a crash soon after. (The issue was
debugged from such a crash.)
Similarly, wscale_ok and sack_ok also get set based on the SYN,ACK
packet but not reset on disconnect, since they are zeroed out at
initialization. The patch zeroes out the entire tp->rx_opt struct in
tcp_disconnect() to avoid this sort of problem.
Signed-off-by: Srinivas Aji <Aji_Srinivas@emc.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
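The idea of the fix, reduced to a userspace sketch: everything negotiated by the old connection is cleared in one go on disconnect, so a flag added later cannot be forgotten. The structs below are hypothetical, cut-down stand-ins and not the kernel's tcp_sock layout.

```c
#include <string.h>

/* Hypothetical, reduced stand-in for the negotiated receive options. */
struct rx_opt_sketch {
    unsigned int tstamp_ok : 1;
    unsigned int wscale_ok : 1;
    unsigned int sack_ok : 1;
    unsigned int snd_wscale;
};

struct conn_sketch {
    struct rx_opt_sketch rx_opt;
    unsigned int header_len;
};

/* On disconnect, forget everything the old connection negotiated so the
 * next SYN,ACK starts from a clean slate. */
static void disconnect_sketch(struct conn_sketch *c)
{
    memset(&c->rx_opt, 0, sizeof(c->rx_opt));
}
```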
Sergei Shtylyov [Tue, 22 May 2007 22:43:37 +0000 (00:43 +0200)]
[NETPOLL]: Remove CONFIG_NETPOLL_RX
Get rid of the CONFIG_NETPOLL_RX option completely since all the
dependencies have been removed long ago...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Acked-by: Jeff Garzik <jgarzik@pobox.com> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Sergei Shtylyov [Tue, 22 May 2007 22:41:22 +0000 (00:41 +0200)]
[NETPOLL]: Fix TX queue overflow in trapped mode.
CONFIG_NETPOLL_TRAP causes the TX queue controls to be completely bypassed in
the netpoll's "trapped" mode which easily causes overflows in the drivers with
short TX queues (most notably, in 8139too with its 4-deep queue). So, make
this option more sensible by making it only bypass the TX softirq wakeup.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Acked-by: Jeff Garzik <jgarzik@pobox.com> Acked-by: Tom Rini <trini@kernel.crashing.org> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
When network devices are renamed, the IPV6 snmp6 code
gets confused. It doesn't track name changes, so it will OOPS
when network devices are removed.
The fix is trivial: just unregister/re-register in the notify handler.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Eric Sesterhenn [Tue, 22 May 2007 22:38:17 +0000 (00:38 +0200)]
[IPV6]: Fix slab corruption running ip6sic
Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Andrew Morton [Tue, 22 May 2007 21:50:21 +0000 (23:50 +0200)]
gcc-4.1.0 is bust
Keith says
Compiling 2.6.19-rc6 with gcc version 4.1.0 (SUSE Linux), wait_hpet_tick is
optimized away to a never ending loop and the kernel hangs on boot in timer
setup.
Andrew Hendry [Fri, 4 May 2007 22:00:25 +0000 (00:00 +0200)]
[X.25]: Add missing sock_put in x25_receive_data
__x25_find_socket does a sock_hold.
This adds a missing sock_put in x25_receive_data.
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
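The hold/put pairing can be illustrated with a small userspace model. All names ending in _sketch are hypothetical; the point is only that a lookup which returns with a reference held must be balanced by a put in the caller.

```c
#include <stddef.h>

struct sock_sketch {
    int refcnt;
    int lci;    /* logical channel identifier used as lookup key */
};

static void hold_sketch(struct sock_sketch *sk) { sk->refcnt++; }
static void put_sketch(struct sock_sketch *sk) { sk->refcnt--; }

/* Like the lookup in the commit: returns with a reference held. */
static struct sock_sketch *find_sketch(struct sock_sketch *tbl, size_t n, int lci)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].lci == lci) {
            hold_sketch(&tbl[i]);
            return &tbl[i];
        }
    }
    return NULL;
}

/* Receive path: drop the reference taken by the lookup when done. */
static int receive_sketch(struct sock_sketch *tbl, size_t n, int lci)
{
    struct sock_sketch *sk = find_sketch(tbl, n, lci);

    if (!sk)
        return -1;
    /* ... frame processing would go here ... */
    put_sketch(sk);
    return 0;
}
```

Without the put_sketch() call the reference count would grow by one per received frame and the socket could never be freed.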
Jaroslav Kysela [Fri, 4 May 2007 21:59:07 +0000 (23:59 +0200)]
[NETFILTER]: ipt_CLUSTERIP: fix oops in checkentry function
The clusterip_config_find_get() already increases entries reference
counter, so there is no reason to do it twice in checkentry() callback.
This causes the config to be freed before it is removed from the list,
resulting in a crash when adding the next rule.
Signed-off-by: Jaroslav Kysela <perex@suse.cz> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jorge Boncompte [Thu, 3 May 2007 23:00:32 +0000 (01:00 +0200)]
[NETFILTER]: ip_nat_proto_gre: do not modify/corrupt GREv0 packets through NAT
While porting some changes of the 2.6.21-rc7 pptp/proto_gre conntrack
and nat modules to a 2.4.32 kernel I noticed that the gre_key function
returns a wrong pointer to the GRE key of a version 0 packet thus
corrupting the packet payload.
The intended behaviour for GREv0 packets is to act like
ip_conntrack_proto_generic/ip_nat_proto_unknown so I have ripped the
offending functions (not used anymore) and modified the
ip_nat_proto_gre modules to not touch version 0 (non PPTP) packets.
Signed-off-by: Jorge Boncompte <jorge@dti2.net> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Hugh Dickins [Thu, 3 May 2007 22:54:25 +0000 (00:54 +0200)]
holepunch: fix mmap_sem i_mutex deadlock
sys_madvise has down_write of mmap_sem, then madvise_remove calls
vmtruncate_range which takes i_mutex and i_alloc_sem: no, we can
easily devise deadlocks from that ordering.
Make madvise_remove drop mmap_sem while calling vmtruncate_range: luckily,
since madvise_remove doesn't split or merge vmas, it's easy to handle
this case with a NULL prev, without restructuring sys_madvise. (Though
sad to retake mmap_sem when it's unlikely to be needed, and certainly
down_read is sufficient for MADV_REMOVE, unlike the other madvices.)
Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Hugh Dickins [Thu, 3 May 2007 22:53:54 +0000 (00:53 +0200)]
holepunch: fix disconnected pages after second truncate
shmem_truncate_range has its own truncate_inode_pages_range, to free any
pages racily instantiated while it was in progress: a SHMEM_PAGEIN flag
is set when this might have happened. But holepunching gets no chance
to clear that flag at the start of vmtruncate_range, so it's always set
(unless a truncate came just before), so holepunch almost always does
this second truncate_inode_pages_range.
shmem holepunch has unlikely swap<->file races hereabouts whatever we do
(without a fuller rework than is fit for this release): I was going to
skip the second truncate in the punch_hole case, but Miklos points out
that would make holepunch correctness more vulnerable to swapoff. So
keep the second truncate, but follow it by an unmap_mapping_range to
eliminate the disconnected pages (freed from pagecache while still
mapped in userspace) that it might have left behind.
Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Hugh Dickins [Thu, 3 May 2007 22:52:56 +0000 (00:52 +0200)]
holepunch: fix shmem_truncate_range punch locking
Miklos Szeredi observes that during truncation of shmem page directories,
info->lock is released to improve latency (after lowering i_size and
next_index to exclude races); but this is quite wrong for holepunching,
which receives no such protection from i_size or next_index, and is left
vulnerable to races with shmem_unuse, shmem_getpage and shmem_writepage.
Hold info->lock throughout when holepunching? No, any user could prevent
rescheduling for far too long. Instead take info->lock just when needed:
in shmem_free_swp when removing the swap entries, and whenever removing
a directory page from the level above. But so long as we remove before
scanning, we can safely skip taking the lock at the lower levels, except
at misaligned start and end of the hole.
Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Hugh Dickins [Thu, 3 May 2007 22:52:18 +0000 (00:52 +0200)]
holepunch: fix shmem_truncate_range punching too far
Miklos Szeredi observes BUG_ON(!entry) in shmem_writepage() triggered
in rare circumstances, because shmem_truncate_range() erroneously
removes partially truncated directory pages at the end of the range:
later reclaim on pages pointing to these removed directories triggers
the BUG. Indeed, and it can also cause data loss beyond the hole.
Fix this as in the patch proposed by Miklos, but distinguish between
"limit" (how far we need to search: ignore truncation's next_index
optimization in the holepunch case - if there are races it's more
consistent to act on the whole range specified) and "upper_limit"
(how far we can free directory pages: generally we must be careful
to keep partially punched pages, but can relax at end of file -
i_size being held stable by i_mutex).
Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Adrian Bunk [Mon, 30 Apr 2007 23:11:29 +0000 (01:11 +0200)]
[NETLINK]: Infinite recursion in netlink (CVE-2007-1861)
Replies to NETLINK_FIB_LOOKUP messages were misrouted back to kernel,
which resulted in infinite recursion and stack overflow.
The bug is present in all kernel versions since the feature appeared.
The patch also makes some minimal cleanup:
1. Return something consistent (-ENOENT) when fib table is missing
2. Do not crash when queue is empty (does not happen, but yet)
3. Put result of lookup
Sergey Vlasov:
Oops fix
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Zach Brown [Wed, 25 Apr 2007 22:47:15 +0000 (00:47 +0200)]
aio: remove bare user-triggerable error printk
The user can generate console output if they cause do_mmap() to fail
during sys_io_setup(). This was seen in a regression test that does
exactly that by spinning calling mmap() until it gets -ENOMEM before
calling io_setup().
We don't need this printk at all, just remove it.
Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: G. Liakhovetski <gl@dsa-ac.de> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Acked-by: David Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
We must reserve SAR + MAX_HEADER bytes for IrLMP to fit in.
This fixes an oops reported (and fixed) by Jeet Chaudhuri, when max_sdu_size
is greater than 0.
Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Shaohua Li [Mon, 23 Apr 2007 23:25:26 +0000 (01:25 +0200)]
x86 microcode: don't check the size
The IA32 manual says that if a microcode update's size is 0, the size is
the default size (2048 bytes). But to me this doesn't suggest that every
microcode update's size should be above 2048 bytes. We actually had a
microcode update whose size is 1024 bytes. The patch just removed the
check.
Backported by Daniel Drake.
Signed-off-by: Daniel Drake <dsd@gentoo.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
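The size rule quoted from the manual amounts to the following. This is a sketch of the rule as stated in the message, with hypothetical names; it is not the driver's code.

```c
#define DEFAULT_UCODE_TOTALSIZE 2048u

/* A size field of 0 means "default size"; any other value is taken as
 * is, including values below the default such as 1024. */
static unsigned int ucode_total_size(unsigned int hdr_totalsize)
{
    return hdr_totalsize ? hdr_totalsize : DEFAULT_UCODE_TOTALSIZE;
}
```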
tty_io: fix race in master pty close/slave pty close path
This patch fixes a possible race that leads to double freeing an idr index.
When the master begins to close, release_dev() is called and then
pty_close() is called:
        if (tty->driver->close)
                tty->driver->close(tty, filp);
This is done without holding any locks other than BKL. Inside pty_close(),
being a master close, the devpts entry will be removed:
#ifdef CONFIG_UNIX98_PTYS
        if (tty->driver == ptm_driver)
                devpts_pty_kill(tty->index);
#endif
But devpts_pty_kill() will call get_node() that may sleep while waiting for
&devpts_root->d_inode->i_sem. When this happens and the slave is being
opened, tty_open() just found the driver and index:
This part of the code is already protected under tty_mutex. The problem is
that the slave close already got an index. Then init_dev() is called and
blocks waiting for the same &devpts_root->d_inode->i_sem.
When the master close resumes, it removes the devpts entry, and the
relation between idr index and the tty is gone. The master then sleeps
waiting for the tty_mutex on release_dev().
Slave open resumes and finds no tty for that index. As a result, a NULL tty
is returned and init_dev() doesn't jump to fast_track:
        /* check whether we're reopening an existing tty */
        if (driver->flags & TTY_DRIVER_DEVPTS_MEM) {
                tty = devpts_get_tty(idx);
                if (tty && driver->subtype == PTY_TYPE_MASTER)
                        tty = tty->link;
        } else {
                tty = driver->ttys[idx];
        }
        if (tty) goto fast_track;
The result of this is that a new tty will be created and init_dev() returns
successfully. After returning, tty_mutex is dropped and master close may resume.
Master close finds it's the only use and both sides are closing, then releases
the tty and the index. At this point, the idr index is free, but slave still
has it.
Slave open then calls pty_open() and finds that tty->link->count is 0,
because there's no master and returns error. Then tty_open() calls
release_dev() which executes without any warning, as it was a case of last
slave close when the master is already closed (master->count == 0,
slave->count == 1). The tty is then released with the already released idr
index.
This normally would only issue a warning on idr_remove() but in case of a
customer's critical application, it's never too simple:
thread1: opens master, gets index X
thread1: begin closing master
thread2: begin opening slave with index X
thread1: finishes closing master, index X released
thread3: opens master, gets index X, just released
thread2: fails opening slave, releases index X <----
thread4: opens master, gets index X, init_dev() then find an already in use
and healthy tty and fails
If no more indexes are released, ptmx_open() will keep failing, as the
first free index available is X, and it will make init_dev() fail because
you're trying to "reopen a master" which isn't valid.
The patch notices when this race happens and makes init_dev() fail
immediately. The init_dev() function is called with tty_mutex held, so it's
safe to continue with tty till the end of function because release_dev()
won't make any further changes without grabbing the tty_mutex.
Without the patch, on some machines it's possible to easily get idr warnings
like this one:
idr_remove called for id=15 which is not allocated.
[<c02555b9>] idr_remove+0x139/0x170
[<c02a1b62>] release_mem+0x182/0x230
[<c02a28e7>] release_dev+0x4b7/0x700
[<c02a0ea7>] tty_ldisc_enable+0x27/0x30
[<c02a1e64>] init_dev+0x254/0x580
[<c02a0d64>] check_tty_count+0x14/0xb0
[<c02a4f05>] tty_open+0x1c5/0x340
[<c02a4d40>] tty_open+0x0/0x340
[<c017388f>] chrdev_open+0xaf/0x180
[<c017c2ac>] open_namei+0x8c/0x760
[<c01737e0>] chrdev_open+0x0/0x180
[<c0167bc9>] __dentry_open+0xc9/0x210
[<c0167e2c>] do_filp_open+0x5c/0x70
[<c0167a91>] get_unused_fd+0x61/0xd0
[<c0167e93>] do_sys_open+0x53/0x100
[<c0167f97>] sys_open+0x27/0x30
[<c010303b>] syscall_call+0x7/0xb
using this test application available on:
http://www.ruivo.org/~aris/pty_sodomizer.c
Signed-off-by: Aristeu Sergio Rozanski Filho <aris@ruivo.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
A flag was recently added to the elevator code to avoid
performing an unplug when requests are being re-queued.
The goal of this flag was to avoid a deep recursion that
can occur when re-queueing requests after a SCSI device/host
reset. See http://lkml.org/lkml/2006/5/17/254
However, that fix added the flag near the bottom of a case
statement, where an earlier break (in an if statement) could
transport one out of the case, without setting the flag.
This patch sets the flag earlier in the case statement.
I re-discovered the deep recursion recently during testing;
I was told that it was a known problem, and the fix to it was
in the kernel I was testing. Indeed it was ... but it didn't
fix the bug. With the patch below, I no longer see the bug.
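The control-flow problem is easy to reproduce in miniature. In the sketch below (all names hypothetical, not the elevator code), the flag is set at the top of the case so that the early break inside the if statement cannot skip it:

```c
#include <stdbool.h>

enum where_sketch { INSERT_BACK_SKETCH, INSERT_REQUEUE_SKETCH };

struct insert_result {
    bool unplug;         /* whether an unplug would be performed */
    bool queued_sorted;  /* whether the sorted path was taken */
};

static struct insert_result insert_sketch(enum where_sketch where, bool has_sort_hint)
{
    struct insert_result r = { true, false };

    switch (where) {
    case INSERT_REQUEUE_SKETCH:
        /* Set the flag first so that no early break can skip it. */
        r.unplug = false;
        if (!has_sort_hint)
            break;  /* early exit from the case */
        r.queued_sorted = true;
        break;
    case INSERT_BACK_SKETCH:
        break;
    }
    return r;
}
```

Had the assignment to r.unplug been placed after the if statement, the requeue path without a sort hint would still unplug, which is the shape of the bug described above.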
start_kernel: test if irq's got enabled early, barf, and disable them again
The calls made by parse_parms to other initialization code might enable
interrupts again way too early.
Having interrupts on this early can make systems PANIC when they initialize
the IRQ controllers (which happens later in the code). This patch detects
that irq's are enabled again, barfs about it and disables them again as a
safety net.
[akpm@osdl.org: cleanups] Signed-off-by: Ard van Breemen <ard@telegraafnet.nl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Olaf Kirch [Thu, 19 Apr 2007 23:45:09 +0000 (01:45 +0200)]
[IrDA]: Correctly handling socket error
This patch fixes an oops first reported in mid 2006 (see
http://lkml.org/lkml/2006/8/29/358). The cause of this bug report is that
when an error is signalled on the socket, irda_recvmsg_stream returns
without removing a local wait_queue variable from the socket's sk_sleep
queue. This causes havoc further down the road.
In response to this problem, a patch was made that invoked sock_orphan on
the socket when receiving a disconnect indication. This is not a good fix,
as this sets sk_sleep to NULL, causing applications sleeping in recvmsg
(and other places) to oops.
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jean Delvare [Thu, 19 Apr 2007 23:43:12 +0000 (01:43 +0200)]
hwmon/w83627ehf: Fix the fan5 clock divider write
Users have been complaining about the w83627ehf driver flooding their logs
with debug messages like:
w83627ehf 9191-0a10: Increasing fan 4 clock divider from 64 to 128
or:
w83627ehf 9191-0290: Increasing fan 4 clock divider from 4 to 8
The reason is that we failed to actually write the LSB of the encoded clock
divider value for that fan, causing the next read to report the same old value
again and again.
Additionally, the fan number was improperly reported, making the bug harder to
find.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Aubrey Li [Thu, 19 Apr 2007 23:40:19 +0000 (01:40 +0200)]
[NET]: Fix UDP checksum issue in net poll mode.
In net poll mode, the current checksum function doesn't consider the
kind of packet which is padded to reach a specific minimum length. I
believe that's the problem that caused my test case to fail. The following
patch fixes this issue.
Signed-off-by: Aubrey Li <aubreylee@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Tom Callaway [Thu, 19 Apr 2007 23:38:57 +0000 (01:38 +0200)]
[SPARC64]: Fix inline directive in pci_iommu.c
While building a test kernel for the new esp driver (against
git-current), I hit this bug. Trivial fix, put the inline declaration
in the right place. :)
Signed-off-by: Tom Callaway <tcallawa@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
David S. Miller [Thu, 19 Apr 2007 23:35:52 +0000 (01:35 +0200)]
[SPARC64]: Fix SBUS IOMMU allocation code.
There are several IOMMU allocator bugs. Instead of trying to fix this
overly complicated code, just mirror the PCI IOMMU arena allocator
which is very stable and well stress tested.
I tried to make the code as identical as possible so we can switch
sun4u PCI and SBUS over to a common piece of IOMMU code. All that
will be needed are two callbacks, one to do a full IOMMU flush and one
to do a streaming buffer flush.
This patch gets rid of a lot of hangs and mysterious crashes on SBUS
sparc64 systems, at least for me.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Userspace uses an integer for TCA_TCINDEX_SHIFT, the kernel was changed
to expect and use a u16 value in 2.6.11, which broke compatibility on
big endian machines. Change back to use int.
Reported by Ole Reinartz <ole.reinartz@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
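Why the width mismatch only bites on big endian machines can be seen from a small userspace sketch. The function names are hypothetical; the payload stands for an attribute that userspace filled with an int.

```c
#include <stdint.h>
#include <string.h>

/* Read the payload with the same type userspace used to write it. */
static int read_shift_as_int(const void *payload)
{
    int v;

    memcpy(&v, payload, sizeof(v));
    return v;
}

/* Read only the first two bytes, as a u16 reader would. */
static uint16_t read_shift_as_u16(const void *payload)
{
    uint16_t v;

    memcpy(&v, payload, sizeof(v));
    return v;
}
```

For a small value such as 4, the first two bytes of the int hold the value on a little endian machine but are zero on a big endian one, so the u16 reader sees 0 there.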
Herbert Xu [Fri, 13 Apr 2007 19:32:53 +0000 (21:32 +0200)]
[IPSEC]: Reject packets within replay window but outside the bit mask
Up until this point we've accepted replay window settings greater than
32 but our bit mask can only accommodate 32 packets. Thus any packet
with a sequence number within the window but outside the bit mask would
be accepted.
This patch causes those packets to be rejected instead.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
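A minimal sketch of the check, assuming a 32-bit bitmap and ignoring sequence number wraparound. The struct and function names are hypothetical and the code is an illustration of the rule, not the xfrm implementation.

```c
#include <errno.h>
#include <stdint.h>

#define REPLAY_BITMAP_BITS 32u

struct replay_sketch {
    uint32_t seq;     /* highest sequence number seen so far */
    uint32_t bitmap;  /* bit n set: packet (seq - n) already seen */
    uint32_t window;  /* configured replay window, may exceed 32 */
};

static int replay_check_sketch(const struct replay_sketch *st, uint32_t seq)
{
    uint32_t diff, limit;

    if (seq == 0)
        return -EINVAL;
    if (seq > st->seq)
        return 0;           /* newer than anything seen so far */

    diff = st->seq - seq;
    /* The usable window is capped by what the bit mask can track. */
    limit = st->window < REPLAY_BITMAP_BITS ? st->window : REPLAY_BITMAP_BITS;
    if (diff >= limit)
        return -EINVAL;     /* outside the window or outside the bit mask */
    if (st->bitmap & (UINT32_C(1) << diff))
        return -EINVAL;     /* already seen: replay */
    return 0;
}
```

With a configured window of 64 and a highest sequence number of 100, a packet with sequence number 60 lies inside the window but 40 positions back, beyond the bit mask, and is rejected.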
[IPv6]: Fix incorrect length check in rawv6_sendmsg()
In article <20070329.142644.70222545.davem@davemloft.net> (at Thu, 29 Mar 2007 14:26:44 -0700 (PDT)), David Miller <davem@davemloft.net> says:
> From: Sridhar Samudrala <sri@us.ibm.com>
> Date: Thu, 29 Mar 2007 14:17:28 -0700
>
> > The check for length in rawv6_sendmsg() is incorrect.
> > As len is an unsigned int, (len < 0) will never be TRUE.
> > I think checking for IPV6_MAXPLEN(65535) is better.
> >
> > Is it possible to send ipv6 jumbo packets using raw
> > sockets? If so, we can remove this check.
>
> I don't see why such a limitation against jumbo would exist,
> does anyone else?
>
> Thanks for catching this Sridhar. A good compiler should simply
> fail to compile "if (x < 0)" when 'x' is an unsigned type, don't
> you think :-)
Dave, we use "int" for returning value,
so we should fix this anyway, IMHO;
we should not allow len > INT_MAX.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Acked-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
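The point of the discussion in miniature: a comparison "len < 0" on an unsigned length is always false, whereas an upper bound of INT_MAX keeps the value representable in the int return type. check_send_len() is a hypothetical helper, and the choice of -EMSGSIZE as the error code is an assumption of this sketch.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

static int check_send_len(size_t len)
{
    /* "len < 0" can never be true for an unsigned type, so bound the
     * length from above instead. */
    if (len > (size_t)INT_MAX)
        return -EMSGSIZE;
    return 0;
}
```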
The ADEF bits in the TSCR register have different meanings in read and
write mode. For this reason ADEF has to be reset on every
read-modify-write operation.
This patch introduces a special write function for this register, which
takes care of it.
Thanks to Holger Magnussen for pointing my nose at this problem.
Signed-off-by: Andreas Oberritter <obi@linuxtv.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
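A sketch of such a write helper, under loudly stated assumptions: the register is modelled as 8 bits wide and TSCR_ADEF_SKETCH is a made-up mask, since the message does not give the real bit positions. Only the pattern (clear ADEF in the value read back before modifying and writing) is taken from the text.

```c
#include <stdint.h>

/* Hypothetical mask for the ADEF bits; the real layout is not stated
 * in the commit message. */
#define TSCR_ADEF_SKETCH 0x7fu

/* Compute the value to write back for a read-modify-write of the bits
 * selected by mask. ADEF is reset first because the bits read back do
 * not mean the same thing when written. */
static uint8_t tscr_rmw_value(uint8_t read_val, uint8_t mask, uint8_t bits)
{
    uint8_t v = read_val & (uint8_t)~TSCR_ADEF_SKETCH;

    return (uint8_t)((v & ~mask) | (bits & mask));
}
```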
Oliver Endriss [Fri, 13 Apr 2007 19:23:49 +0000 (21:23 +0200)]
V4L: saa7146: Fix allocation of clipping memory
Olaf Hering pointed out that SAA7146_CLIPPING_MEM would become
very large for PAGE_SIZE > 4K.
In fact, the number of clipping windows is limited to 16,
and calculate_clipping_registers_rect() does not use more
than 256 bytes. SAA7146_CLIPPING_MEM adjusted accordingly.
Thanks-to: Olaf Hering <olaf@aepfle.de> Signed-off-by: Oliver Endriss <o.endriss@gmx.de> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
All the radio drivers need video_dev, but they were depending on
VIDEO_DEV!=n. That meant that one could try to compile the driver into
the kernel when VIDEO_DEV=m, which will not work. If video_dev is a
module, then the radio drivers must be modules too.
Driver needs to turn off carrier when down, otherwise it can
confuse bonding and bridging and looks like carrier is on immediately
when it is brought back up.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Thomas Gleixner [Fri, 13 Apr 2007 18:45:17 +0000 (20:45 +0200)]
i386: fix file_read_actor() and pipe_read() for original i386 systems
The __copy_to_user_inatomic() calls in file_read_actor() and pipe_read()
are broken on original i386 machines, where WP-works-ok == false, as
__copy_to_user_inatomic() on such systems calls functions which might
sleep and/or contain cond_resched() calls inside of a kmap_atomic()
region.
The original check for WP-works-ok was in access_ok(), but got moved
during the 2.5 series to fix a race vs. swap.
Return the number of bytes to copy in the case where we are in an atomic
region, so the non-atomic code paths in file_read_actor() and
pipe_read() are taken.
This could be optimized to avoid the kmap_atomic by moving the check for
WP-works-ok into fault_in_pages_writeable(), but this is more intrusive
and can be done later.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Adrian Bunk <bunk@stusta.de>
r8169: issue request_irq after the private data are completely initialized
The irq handler schedules a NAPI poll request unconditionally as soon as
the status register is not clean. It has been there - and wrong - for
ages but a recent timing change made it apparently easier to trigger.
Adrian Bunk:
backported to 2.6.16
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
The PM hooks are no-op if the r8169 interface is down (i.e. !IFF_UP).
However, as the chipset is enabled, the device will not work after a
suspend/resume cycle. The patch always issues the required PCI suspend
sequence and removes the module unload/reload workaround.
Jean Delvare [Tue, 10 Apr 2007 21:06:06 +0000 (23:06 +0200)]
APPLETALK: Fix a remotely triggerable crash (CVE-2007-1357)
When we receive an AppleTalk frame shorter than what its header says,
we still attempt to verify its checksum, and trip on the BUG_ON() at
the end of function atalk_sum_skb() because of the length mismatch.
This has security implications because this can be triggered by simply
sending a specially crafted ethernet frame to a target victim,
effectively crashing that host. Thus this qualifies, I think, as a
remote DoS. Here is the frame I used to trigger the crash, in npg
format:
<Appletalk Killer>
{
# Ethernet header -----
XX XX XX XX XX XX # Destination MAC
00 00 00 00 00 00 # Source MAC
00 1D # Length
The destination MAC address must be set to that of the victim.
The severity is mitigated by two requirements:
* The target host must have the appletalk kernel module loaded. I
suspect this isn't so frequent.
* AppleTalk frames are non-IP, thus I guess they can only travel on
local networks. I am no network expert though, maybe it is possible
to somehow encapsulate AppleTalk packets over IP.
The bug has been reported back in June 2004:
http://bugzilla.kernel.org/show_bug.cgi?id=2979
But it wasn't investigated, and was closed in July 2006 as both
reporters had vanished meanwhile.
This code was new in kernel 2.6.0-test5:
http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commitdiff;h=7ab442d7e0a76402c12553ee256f756097cae2d2
And not modified since then, so we can assume that vanilla kernels
2.6.0-test5 and later, and distribution kernels based thereon, are
affected.
Note that I still do not know for sure what triggered the bug in the
real-world cases. The frame could have been corrupted by the kernel if
we have a bug hiding somewhere. But more likely, we are receiving the
faulty frame from the network.
Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Thomas Gleixner [Sun, 8 Apr 2007 23:04:23 +0000 (01:04 +0200)]
hrtimer: prevent overrun DoS in hrtimer_forward()
hrtimer_forward() does not check for the possible overflow of
timer->expires. This can happen on 64 bit machines with large interval
values and results currently in an endless loop in the softirq because
the expiry value becomes negative and therefore the timer is expired all
the time.
Check for this condition and set the expiry value to the max. expiry
time in the future.
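A user-space sketch of the saturating forward, assuming expiry times are
non-negative 64-bit nanosecond counts. forward_expiry() and KTIME_MAX_NS are
hypothetical names for illustration, not the kernel code:

```c
#include <stdint.h>

#define KTIME_MAX_NS INT64_MAX  /* stand-in for the maximum expiry value */

/* Advance *expires by whole multiples of interval until it lies after
 * now. Returns the number of intervals added (the overrun count).
 * Assumes *expires >= 0 and now >= 0. */
static uint64_t forward_expiry(int64_t *expires, int64_t now, int64_t interval)
{
    if (interval <= 0 || now < *expires)
        return 0;  /* nothing to do: bad interval or not expired yet */

    int64_t delta = now - *expires;
    uint64_t overruns = (uint64_t)(delta / interval) + 1;

    /* How many intervals still fit before the maximum expiry value? */
    int64_t room = KTIME_MAX_NS - *expires;
    if ((uint64_t)(room / interval) < overruns)
        *expires = KTIME_MAX_NS;  /* would overflow: clamp instead */
    else
        *expires += (int64_t)overruns * interval;

    return overruns;
}
```

Clamping keeps the expiry value positive, so the timer is no longer seen as
permanently expired.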
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Thomas Gleixner [Sun, 8 Apr 2007 22:03:30 +0000 (00:03 +0200)]
prevent timespec/timeval to ktime_t overflow
Frank v. Waveren pointed out that on 64bit machines the timespec to
ktime_t conversion might overflow. This is also true for timeval to
ktime_t conversions. This breaks a "sleep inf" on 64bit machines.
While a timespec/timeval with tx.sec = MAX_LONG is valid by specification
the internal representation of ktime_t is based on nanoseconds. The
conversion of seconds to nanoseconds overflows for seconds values >=
(MAX_LONG / NSEC_PER_SEC).
Check the seconds argument to the conversion and limit it to the maximum
time which can be represented by ktime_t.
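A minimal model of the limited conversion, assuming ktime_t is a signed
64-bit nanosecond count and the seconds value is non-negative.
ktime_set_model() is a hypothetical name, and the macros are defined locally
for this sketch:

```c
#include <stdint.h>

#define NSEC_PER_SEC  1000000000LL
#define KTIME_MAX     INT64_MAX
#define KTIME_SEC_MAX (KTIME_MAX / NSEC_PER_SEC)

/* Convert seconds plus nanoseconds to a nanosecond count, clamping to
 * KTIME_MAX when the seconds value alone would overflow.
 * Assumes secs >= 0 and 0 <= nsecs < NSEC_PER_SEC. */
static int64_t ktime_set_model(int64_t secs, long nsecs)
{
    if (secs >= KTIME_SEC_MAX)
        return KTIME_MAX;
    return secs * NSEC_PER_SEC + nsecs;
}
```

With this limit, a request such as "sleep inf" maps to the largest
representable time instead of wrapping to a negative value.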
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
David Moore [Sun, 8 Apr 2007 21:54:41 +0000 (23:54 +0200)]
ieee1394: video1394: DMA fix
This together with the phys_to_virt fix in lib/swiotlb.c::swiotlb_sync_sg
fixes video1394 DMA on machines with DMA bounce buffers, especially Intel
x86-64 machines with > 3GB RAM.
Signed-off-by: David Moore <dcm@acm.org> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Fix reparenting to the same thread group. (take 2)
This patch fixes the case when we reparent to a different thread in the
same thread group. This modifies the code so that we do not send
signals and do not change the signal to send to SIGCHLD unless we have
changed the thread group of our parents. It also suppresses sending
pdeath_sig in this case as well since the result of getppid doesn't
change.
Thanks to Oleg for spotting my bug of only fixing this for non-ptraced
tasks.
This fixes the issues identified by Albert Cahalan in thread
http://lkml.org/lkml/2006/12/21/22
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Doug Leith observed a discrepancy between the version of CUBIC described
in the papers and the version in 2.6.18. A math error related to scaling
causes Cubic to grow too slowly.
Patch is from "Sangtae Ha" <sha2@ncsu.edu>. I validated that
it does fix the problems.
See the following to show behavior over 500ms 100 Mbit link.
Alan Cox [Wed, 4 Apr 2007 19:34:22 +0000 (21:34 +0200)]
ide-floppy: Fix unformatted media crash
A ZIP or similar with unformatted media will cause crashes when attempts
are made to read/write it in some cases. This is because bs_factor is
zero and we divide by it causing an oops.
As the size of a non-accessible/non-existent media is really a bit of a
zen question it doesn't matter if non-existent media is 512 bytes per
sector or zero. Setting it to 1 causes us to generate 512 bytes/sector
accesses and error properly.
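The guard can be sketched as below. This is an illustration under the
assumption that bs_factor is the media block length divided by 512;
bs_factor_for() and sector_to_block() are hypothetical helpers, not the
driver's functions:

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Block-size factor relative to 512-byte sectors. Unformatted media
 * reports a block length of 0, which would give a factor of 0. */
static unsigned bs_factor_for(unsigned block_len)
{
    unsigned factor = block_len / SECTOR_SIZE;
    return factor ? factor : 1;  /* never hand back a zero divisor */
}

/* Translate a 512-byte sector number into a media block number. */
static uint64_t sector_to_block(uint64_t sector, unsigned block_len)
{
    return sector / bs_factor_for(block_len);
}
```

With the factor forced to 1, requests to unformatted media go out as plain
512 bytes/sector accesses and fail with an ordinary I/O error.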
Based on a fix found lurking in an ancient bugzilla entry since about 2004 (ugghhh)
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Patrick McHardy [Tue, 3 Apr 2007 02:03:55 +0000 (04:03 +0200)]
[IFB]: Fix crash on input device removal
The input_device pointer is not refcounted, which means the device may
disappear while packets are queued, causing a crash when ifb passes packets
with a stale skb->dev pointer to netif_rx().
Fix by storing the interface index instead and doing a lookup where necessary.
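The idea can be modelled in user-space C as below. All names (dev_table,
dev_lookup(), pkt_model, deliver()) are hypothetical and only illustrate why
an index is safer than a raw pointer; the real code uses the kernel's device
lookup:

```c
#include <stddef.h>

/* Toy device table standing in for the kernel's list of interfaces. */
struct net_dev_model {
    int ifindex;
    int present;  /* 0 once the device has been removed */
};

#define MAX_DEVS 4
static struct net_dev_model dev_table[MAX_DEVS];

/* Look a device up by index; returns NULL if it no longer exists. */
static struct net_dev_model *dev_lookup(int ifindex)
{
    for (size_t i = 0; i < MAX_DEVS; i++)
        if (dev_table[i].present && dev_table[i].ifindex == ifindex)
            return &dev_table[i];
    return NULL;
}

/* A queued packet remembers only the input interface index. */
struct pkt_model {
    int iif;
};

/* Returns 1 if the packet can be delivered, 0 if it must be dropped
 * because its input device disappeared while it was queued. */
static int deliver(const struct pkt_model *pkt)
{
    return dev_lookup(pkt->iif) != NULL;
}
```

A stale index simply fails the lookup, whereas a stale pointer would be
dereferenced.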
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
David S. Miller [Mon, 2 Apr 2007 23:50:31 +0000 (01:50 +0200)]
[VIDEO] ffb: Fix two DAC handling bugs.
The determination of whether the DAC has inverted cursor logic is
broken; import the version checks the X.org driver uses to fix this.
Next, when we change the timing generator, borrow code from X.org that
does 10 NOP reads of the timing generator register afterwards to make
sure the video-enable transition occurs cleanly.
Finally, use macros for the DAC registers and fields in order to
provide documentation for the next person who reads this code.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Eric Sesterhenn [Wed, 28 Mar 2007 20:35:52 +0000 (22:35 +0200)]
[ALSA] fix NULL pointer dereference in sound/synth/emux/soundfont.c
this is about coverity id #100.
It seems the if statement is negated, since the else branch calls
remove_info() with sflist->currsf as a parameter where it gets
dereferenced.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
Takashi Iwai [Wed, 28 Mar 2007 20:29:24 +0000 (22:29 +0200)]
[ALSA] hda-intel - Don't try to probe invalid codecs
Fix the max number of codecs detected by HD-intel (and compatible)
controllers.
ATI controllers may have up to 4 codecs while ICH up to 3.
Now max codecs is defined according to the driver type, either 3 or 4.
Currently 4 is set only for ATI chips. Others might need the same
change, too.
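A small sketch of a per-driver-type limit, using hypothetical names
(hda_driver_type, max_codecs_for(), count_codecs()) rather than the driver's
own identifiers:

```c
/* Hypothetical driver types, for illustration only. */
enum hda_driver_type { DRV_ICH, DRV_ATI, DRV_OTHER };

/* Maximum number of codec slots to probe for a given driver type. */
static int max_codecs_for(enum hda_driver_type type)
{
    return type == DRV_ATI ? 4 : 3;
}

/* Count codecs reported in codec_mask, ignoring any slot beyond the
 * limit for this driver type. */
static int count_codecs(enum hda_driver_type type, unsigned codec_mask)
{
    int found = 0;
    for (int slot = 0; slot < max_codecs_for(type); slot++)
        if (codec_mask & (1u << slot))
            found++;
    return found;
}
```

Bits set above the limit are never probed, so a bogus fourth codec on an ICH
controller is ignored.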
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>