karo-tx-linux.git
17 years ago [NETLINK]: Fix unicast timeouts
Patrick McHardy [Tue, 13 Nov 2007 11:23:22 +0000 (12:23 +0100)]
[NETLINK]: Fix unicast timeouts

[ Upstream commit: c3d8d1e30cace31fed6186a4b8c6b1401836d89c ]

Commit ed6dcf4a in the history.git tree broke netlink_unicast timeouts
by moving the schedule_timeout() call to a new function that doesn't
propagate the remaining timeout back to the caller. This means on each
retry we start with the full timeout again.
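
The effect of dropping the remaining time can be modelled with a small sketch. This is a Python toy, not kernel code; `total_wait`, its arguments, and the jiffy values are hypothetical and chosen only for illustration.

```python
def total_wait(wakeups, timeout, propagate_remaining):
    """Model a sender that sleeps, is woken early, and retries.

    wakeups: how long each sleep lasts before the sender is woken (jiffies).
    Returns the total jiffies slept before timing out or running out of wakeups.
    """
    remaining = timeout
    slept = 0
    for wake_after in wakeups:
        if remaining <= 0:
            break  # the timeout has expired, give up
        nap = min(wake_after, remaining)
        slept += nap
        if propagate_remaining:
            remaining -= nap      # fixed: the leftover time carries into the retry
        else:
            remaining = timeout   # buggy: every retry starts with the full timeout
    return slept
```

With a 100-jiffy timeout and ten early wakeups of 60 jiffies each, the propagating variant stops after 100 jiffies, while the non-propagating one sleeps for 600.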

ipc/mqueue.c seems to actually want to wait indefinitely so this
behaviour is retained.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago PPPOE: fix memory leak (local DoS) (CVE-2007-2525)
Florian Zumbiehl [Tue, 13 Nov 2007 10:12:46 +0000 (11:12 +0100)]
PPPOE: fix memory leak (local DoS) (CVE-2007-2525)

This patch fixes a memory leak when a PPPoE socket is release()d after
it has been connect()ed, but before the PPPIOCGCHAN ioctl ever has been
called on it.

This is somewhat of a security problem, too, since PPPoE sockets can be
created by any user, so any user can easily allocate all the machine's
RAM to non-swappable address space and thus DoS the system.

Is there any specific reason for PPPoE sockets being available to any
unprivileged process, BTW? After all, you need a packet socket for the
discovery stage anyway, so it's unlikely that any unprivileged process
will ever need to create a PPPoE socket, no? Allocating all session IDs
for a known AC is a kind of DoS, too, after all - with Juniper ERXes,
this is really easy, actually, since they don't ever assign session ids
above 8000 ...

Signed-off-by: Florian Zumbiehl <florz@florz.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [PKT_SCHED] CLS_U32: Fix endianness problem with u32 classifier hash masks.
Radu Rendec [Tue, 13 Nov 2007 08:30:35 +0000 (09:30 +0100)]
[PKT_SCHED] CLS_U32: Fix endianness problem with u32 classifier hash masks.

While trying to implement u32 hashes in my shaping machine I ran into
a possible bug in the u32 hash/bucket computing algorithm
(net/sched/cls_u32.c).

The problem occurs only with hash masks that extend over the octet
boundary, on little endian machines (where htonl() actually does
something).

Let's say that I would like to use 0x3fc0 as the hash mask. This means
8 contiguous "1" bits starting at b6. With such a mask, the expected
(and logical) behavior is to hash any address in, for instance,
192.168.0.0/26 in bucket 0, then any address in 192.168.0.64/26 in
bucket 1, then 192.168.0.128/26 in bucket 2 and so on.

This is exactly what would happen on a big endian machine, but on
little endian machines, what would actually happen with current
implementation is 0x3fc0 being reversed (into 0xc03f0000) by htonl()
in the userspace tool and then applied to 192.168.x.x in the u32
classifier. When shifting right by 16 bits (rank of first "1" bit in
the reversed mask) and applying the divisor mask (0xff for divisor
256), what would actually remain is 0x3f applied on the "168" octet of
the address.

One could say this can be easily worked around by taking endianness
into account in userspace and supplying an appropriate mask (0xfc03)
that would be turned into contiguous "1" bits when reversed
(0x03fc0000). But the actual problem is the network address (inside
the packet) not being converted to host order, but used as a
host-order value when computing the bucket.

Let's say the network address is written as n31 n30 ... n0, with n0
being the least significant bit. When used directly (without any
conversion) on a little endian machine, it becomes n7 ... n0 n15 ... n8
etc in the machine's registers. Thus bits n7 and n8 would no longer be
adjacent and 192.168.64.0/26 and 192.168.128.0/26 would no longer be
consecutive.

The fix is to apply ntohl() on the hmask before computing fshift,
and in u32_hash_fold() convert the packet data to host order before
shifting down by fshift.
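
The arithmetic above can be replayed with a short sketch. It is a Python model of a little-endian host, not the code in net/sched/cls_u32.c; the helper names (`htonl_le`, `raw_load_le`, `bucket_buggy`, `bucket_fixed`) are hypothetical, and the byte swaps are done explicitly so the result does not depend on the machine running the script.

```python
import struct


def htonl_le(x):
    """htonl() as performed on a little-endian host: a 32-bit byte swap."""
    return struct.unpack("<I", struct.pack(">I", x))[0]


ntohl_le = htonl_le  # the byte swap is its own inverse


def ffs(x):
    """1-based index of the lowest set bit, 0 if x is 0."""
    return (x & -x).bit_length()


def raw_load_le(addr):
    """Dotted quad as a little-endian CPU sees the unconverted packet word."""
    return struct.unpack("<I", bytes(int(p) for p in addr.split(".")))[0]


def bucket_buggy(addr, hmask_host, divisor_mask=0xFF):
    hmask = htonl_le(hmask_host)   # userspace stores the mask in network order
    fshift = ffs(hmask) - 1        # shift derived from the swapped mask
    return ((raw_load_le(addr) & hmask) >> fshift) & divisor_mask


def bucket_fixed(addr, hmask_host, divisor_mask=0xFF):
    hmask = htonl_le(hmask_host)
    fshift = ffs(ntohl_le(hmask)) - 1              # shift from the host-order mask
    key = ntohl_le(raw_load_le(addr) & hmask)      # packet data to host order
    return (key >> fshift) & divisor_mask
```

With mask 0x3fc0, `bucket_fixed` places 192.168.0.0, 192.168.0.64 and 192.168.0.128 in buckets 0, 1 and 2, whereas `bucket_buggy` puts all three in bucket 0.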

With helpful feedback from Jamal Hadi Salim and Jarek Poplawski.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [PKT_SCHED]: Fix OOPS when removing devices from a teql queuing discipline
Evgeniy Polyakov [Tue, 13 Nov 2007 08:27:27 +0000 (09:27 +0100)]
[PKT_SCHED]: Fix OOPS when removing devices from a teql queuing discipline

[ Upstream commit: 4f9f8311a08c0d95c70261264a2b47f2ae99683a ]

teql_reset() is called from deactivate and qdisc is set to noop already,
but subsequent teql_xmit does not know about it and dereferences private
data as teql qdisc and thus oopses.
not catch it first :)

Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago i386: fixup TRACE_IRQ breakage
Peter Zijlstra [Tue, 13 Nov 2007 07:46:02 +0000 (08:46 +0100)]
i386: fixup TRACE_IRQ breakage

The TRACE_IRQS_ON function in iret_exc: calls a C function without
ensuring that the segments are set properly. Move the trace function and
the enabling of interrupts into the C stub.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago Handle bogus %cs selector in single-step instruction decoding (CVE-2007-3731)
Roland McGrath [Tue, 13 Nov 2007 07:43:25 +0000 (08:43 +0100)]
Handle bogus %cs selector in single-step instruction decoding (CVE-2007-3731)

The code for LDT segment selectors was not robust in the face of a bogus
selector set in %cs via ptrace before the single-step was done.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [ATM]: Add CPPFLAGS to byteorder.h check
Ben Collins [Tue, 13 Nov 2007 06:50:09 +0000 (07:50 +0100)]
[ATM]: Add CPPFLAGS to byteorder.h check

O= builds produced errors in the shell command because of unfound headers.

Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [PPP_MPPE]: Don't put InterimKey on the stack
Michal Schmidt [Tue, 13 Nov 2007 06:48:46 +0000 (07:48 +0100)]
[PPP_MPPE]: Don't put InterimKey on the stack

ppp_mppe puts a crypto key on the kernel stack, then passes the
address of that into the crypto layer.  That doesn't work because the
crypto layer needs to be able to do virt_to_*() on the address which
does not universally work for the kernel stack on all platforms.

Adrian Bunk:
Backported to 2.6.16.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [INET_DIAG]: Fix oops in netlink_rcv_skb
Patrick McHardy [Mon, 12 Nov 2007 12:04:20 +0000 (13:04 +0100)]
[INET_DIAG]: Fix oops in netlink_rcv_skb

netlink_run_queue() doesn't handle multiple processes processing the
queue concurrently. Serialize queue processing in inet_diag to fix
an oops in netlink_rcv_skb caused by netlink_run_queue passing a
NULL for the skb.

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000054
[349587.500454]  printing eip:
[349587.500457] c03318ae
[349587.500459] *pde = 00000000
[349587.500464] Oops: 0000 [#1]
[349587.500466] PREEMPT SMP
[349587.500474] Modules linked in: w83627hf hwmon_vid i2c_isa
[349587.500483] CPU:    0
[349587.500485] EIP:    0060:[<c03318ae>]    Not tainted VLI
[349587.500487] EFLAGS: 00010246   (2.6.22.3 #1)
[349587.500499] EIP is at netlink_rcv_skb+0xa/0x7e
[349587.500506] eax: 00000000   ebx: 00000000   ecx: c148d2a0   edx: c0398819
[349587.500510] esi: 00000000   edi: c0398819   ebp: c7a21c8c   esp: c7a21c80
[349587.500517] ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
[349587.500521] Process oidentd (pid: 17943, ti=c7a20000 task=cee231c0 task.ti=c7a20000)
[349587.500527] Stack: 00000000 c7a21cac f7c8ba78 c7a21ca4 c0331962 c0398819 f7c8ba00 0000004c
[349587.500542]        f736f000 c7a21cb4 c03988e3 00000001 f7c8ba00 c7a21cc4 c03312a5 0000004c
[349587.500558]        f7c8ba00 c7a21cd4 c0330681 f7c8ba00 e4695280 c7a21d00 c03307c6 7fffffff
[349587.500578] Call Trace:
[349587.500581]  [<c010361a>] show_trace_log_lvl+0x1c/0x33
[349587.500591]  [<c01036d4>] show_stack_log_lvl+0x8d/0xaa
[349587.500595]  [<c010390e>] show_registers+0x1cb/0x321
[349587.500604]  [<c0103bff>] die+0x112/0x1e1
[349587.500607]  [<c01132d2>] do_page_fault+0x229/0x565
[349587.500618]  [<c03c8d3a>] error_code+0x72/0x78
[349587.500625]  [<c0331962>] netlink_run_queue+0x40/0x76
[349587.500632]  [<c03988e3>] inet_diag_rcv+0x1f/0x2c
[349587.500639]  [<c03312a5>] netlink_data_ready+0x57/0x59
[349587.500643]  [<c0330681>] netlink_sendskb+0x24/0x45
[349587.500651]  [<c03307c6>] netlink_unicast+0x100/0x116
[349587.500656]  [<c0330f83>] netlink_sendmsg+0x1c2/0x280
[349587.500664]  [<c02fcce9>] sock_sendmsg+0xba/0xd5
[349587.500671]  [<c02fe4d1>] sys_sendmsg+0x17b/0x1e8
[349587.500676]  [<c02fe92d>] sys_socketcall+0x230/0x24d
[349587.500684]  [<c01028d2>] syscall_call+0x7/0xb
[349587.500691]  =======================
[349587.500693] Code: f0 ff 4e 18 0f 94 c0 84 c0 0f 84 66 ff ff ff 89 f0 e8 86 e2 fc ff e9 5a ff ff ff f0 ff 40 10 eb be 55 89 e5 57 89 d7 56 89 c6 53 <8b> 50 54 83 fa 10 72 55 8b 9e 9c 00 00 00 31 c9 8b 03 83 f8 0f

Reported by Athanasius <link@miggy.org>

Adrian Bunk:
Backported to 2.6.16 based on a suggestion by David S. Miller.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [IPV6]: Fix unbalanced socket reference with MSG_CONFIRM.
YOSHIFUJI Hideaki [Mon, 12 Nov 2007 12:00:22 +0000 (13:00 +0100)]
[IPV6]: Fix unbalanced socket reference with MSG_CONFIRM.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago Linux 2.6.16.57 v2.6.16.57
Adrian Bunk [Mon, 5 Nov 2007 20:27:33 +0000 (21:27 +0100)]
Linux 2.6.16.57

17 years ago Linux 2.6.16.57-rc1 v2.6.16.57-rc1
Adrian Bunk [Fri, 2 Nov 2007 22:11:56 +0000 (23:11 +0100)]
Linux 2.6.16.57-rc1

17 years ago knfsd: allow nfsd READDIR to return 64bit cookies
Neil Brown [Fri, 2 Nov 2007 22:08:36 +0000 (23:08 +0100)]
knfsd: allow nfsd READDIR to return 64bit cookies

->readdir passes loff_t offsets (used as nfs cookies) to
nfs3svc_encode_entry{,_plus}, but when they pass it on to encode_entry it
becomes an 'off_t', which isn't good.

So filesystems that returned 64bit offsets would lose.
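
A minimal sketch of what the narrowing does, assuming a 32-bit signed off_t (the width is an assumption about the affected platforms, and `as_off_t_32` is a hypothetical helper, not a kernel function):

```python
def as_off_t_32(offset):
    """Truncate a 64-bit directory offset to a signed 32-bit value."""
    low = offset & 0xFFFFFFFF
    # Reinterpret the low 32 bits as a two's-complement signed number.
    return low - (1 << 32) if low & 0x80000000 else low
```

Two distinct 64-bit cookies such as 0x100000005 and 0x200000005 both collapse to 5, so a client resuming a READDIR from either one would land at the same position.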

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago buffer: memorder fix
Nick Piggin [Fri, 2 Nov 2007 22:07:14 +0000 (23:07 +0100)]
buffer: memorder fix

unlock_buffer(), like unlock_page(), must not clear the lock without
ensuring that the critical section is closed.

Mingming later sent the same patch, saying:

We are running the SDET benchmark and saw a double free of an ext3 extended
attributes block, with a complaint that the same xattr block was already being
freed (in ext3_xattr_release_block()).  The problem can also be triggered by
multiple threads looping untar/rm of a kernel tree.

The race is caused by a missing memory barrier in unlock_buffer() before the
lock bit is cleared, resulting in possible concurrent h_refcount updates.
That causes a reference counter leak, which later leads to the double free that
we have seen.

Inside unlock_buffer(), a memory barrier is placed *after* the lock bit is
cleared; however, there is no memory barrier *before* the bit is cleared.  On
some architectures the h_refcount update instruction and the clear bit
instruction could be reordered, allowing the critical section to be re-entered.

The race is like this: For example, if the h_refcount is initialized as 1,

cpu 0:                                   cpu1
--------------------------------------   -----------------------------------
lock_buffer() /* test_and_set_bit */
clear_buffer_locked(bh);
                                        lock_buffer() /* test_and_set_bit */
h_refcount = h_refcount+1; /* = 2*/     h_refcount = h_refcount + 1; /*= 2 */
                                        clear_buffer_locked(bh);
....                                    ......

We lost a h_refcount here.  We need a memory barrier before the buffer head
lock bit being cleared to force the order of the two writes.  Please apply.
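
The lost update can be replayed deterministically with a toy interleaving model. This is only a sketch in Python: `run`, the schedule format, and the two example schedules are hypothetical, and real reordering happens at the instruction level rather than as explicit load and store steps.

```python
def run(schedule, h_refcount=1):
    """Replay (cpu, op) steps; 'load' reads h_refcount, 'store' writes back +1."""
    regs = {}
    for cpu, op in schedule:
        if op == "load":
            regs[cpu] = h_refcount        # CPU reads the counter into a register
        else:
            h_refcount = regs[cpu] + 1    # CPU writes back its incremented copy
    return h_refcount


# The lock keeps both critical sections apart: each increment is seen.
serialized = [(0, "load"), (0, "store"), (1, "load"), (1, "store")]

# CPU 0's store drifts past its unlock, so CPU 1 loads the stale value.
racy = [(0, "load"), (1, "load"), (0, "store"), (1, "store")]
```

Starting from 1, the serialized schedule ends at 3, while the racy one ends at 2: one reference has been lost.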

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [PKTGEN]: srcmac fix
Adit Ranadive [Fri, 2 Nov 2007 22:05:27 +0000 (23:05 +0100)]
[PKTGEN]: srcmac fix

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [SPARC64]: Fix show_stack() when stack argument is NULL.
David S. Miller [Fri, 2 Nov 2007 21:56:18 +0000 (22:56 +0100)]
[SPARC64]: Fix show_stack() when stack argument is NULL.

It didn't handle that case at all, and now dump_stack()
can be implemented directly as show_stack(current, NULL)

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [SNAP]: Check packet length before reading
Herbert Xu [Fri, 2 Nov 2007 21:53:44 +0000 (22:53 +0100)]
[SNAP]: Check packet length before reading

The snap_rcv code reads 5 bytes so we should make sure that
we have 5 bytes in the head before proceeding.
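
The shape of such a check can be sketched as follows. This is a Python illustration rather than the snap_rcv code; `parse_snap` is a hypothetical helper, and the 5 bytes are taken to be the 3-byte OUI followed by the 2-byte protocol ID.

```python
SNAP_HEADER_LEN = 5  # 3-byte OUI + 2-byte protocol ID


def parse_snap(frame):
    """Return (oui, protocol_id, payload), or None if the frame is too short."""
    if len(frame) < SNAP_HEADER_LEN:
        return None  # refuse to read past the end of a runt frame
    oui = frame[0:3]
    protocol_id = int.from_bytes(frame[3:5], "big")
    return oui, protocol_id, frame[SNAP_HEADER_LEN:]
```

A 4-byte frame is rejected up front instead of being read one byte past its end.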

Based on diagnosis and fix by Evgeniy Polyakov, reported by
Alan J. Wylie.

Patch also kills the skb->sk assignment before kfree_skb
since it's redundant.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [NET]: gen_estimator deadlock fix
Ranko Zivojnovic [Fri, 2 Nov 2007 21:51:48 +0000 (22:51 +0100)]
[NET]: gen_estimator deadlock fix

-Fixes ABBA deadlock noted by Patrick McHardy <kaber@trash.net>:

> There is at least one ABBA deadlock, est_timer() does:
> read_lock(&est_lock)
> spin_lock(e->stats_lock) (which is dev->queue_lock)
>
> and qdisc_destroy calls htb_destroy under dev->queue_lock, which
> calls htb_destroy_class, then gen_kill_estimator and this
> write_locks est_lock.

To fix the ABBA deadlock the rate estimators are now kept on an rcu list.
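
The inverted lock order can be made explicit with a small sketch. It is a Python illustration with hypothetical names (`abba_pairs`, `before`, `after`); the `after` table is a simplification that assumes the timer path no longer takes est_lock ahead of the stats lock once the list is walked under RCU.

```python
def abba_pairs(paths):
    """Return lock pairs that one path takes as A then B and another as B then A."""
    orders = set()
    for locks in paths.values():
        for i, first in enumerate(locks):
            for second in locks[i + 1:]:
                orders.add((first, second))
    # Report each conflicting pair once, in alphabetical order.
    return sorted((a, b) for (a, b) in orders if a < b and (b, a) in orders)


before = {
    "est_timer": ["est_lock", "queue_lock"],
    "qdisc_destroy": ["queue_lock", "est_lock"],
}

after = {
    "est_timer": ["queue_lock"],
    "qdisc_destroy": ["queue_lock", "est_lock"],
}
```

`abba_pairs(before)` reports the est_lock/queue_lock pair, and `abba_pairs(after)` reports nothing.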

-The est_lock now protects the update to the 'bstat' pointer instead of
the list, in order to avoid NULL dereferencing.

-The 'interval' member of the gen_estimator structure is removed as it is
not needed.

Signed-off-by: Ranko Zivojnovic <ranko@spidernet.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [ICMP]: Fix icmp_errors_use_inbound_ifaddr sysctl
Patrick McHardy [Fri, 2 Nov 2007 21:42:48 +0000 (22:42 +0100)]
[ICMP]: Fix icmp_errors_use_inbound_ifaddr sysctl

Currently when icmp_errors_use_inbound_ifaddr is set and an ICMP error is
sent after the packet passed through ip_output(), an address from the
outgoing interface is chosen as ICMP source address since skb->dev doesn't
point to the incoming interface anymore.

Fix this by doing an interface lookup on rt->dst.iif and using that device.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [Bluetooth] Fix NULL pointer dereference in HCI line discipline
Ohad Ben-Cohen [Fri, 2 Nov 2007 03:41:26 +0000 (04:41 +0100)]
[Bluetooth] Fix NULL pointer dereference in HCI line discipline

Normally a serial Bluetooth device is opened, TIOCSETD'ed to N_HCI line
discipline, HCIUARTSETPROTO'ed and finally closed. In case the device
fails to HCIUARTSETPROTO, closing it produces a NULL pointer dereference.

Signed-off-by: Ohad Ben-Cohen <ohad@bencohen.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [Bluetooth] Fix unintentional fall-through in HCI line discipline
Ohad Ben-Cohen [Fri, 2 Nov 2007 03:39:41 +0000 (04:39 +0100)]
[Bluetooth] Fix unintentional fall-through in HCI line discipline

A trivial fix to (what looks like) an unintentional fall-through in the
HCI line discipline.

Signed-off-by: Ohad Ben-Cohen <ohad@bencohen.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago ide: add "optical" to sysfs "media" attribute
Danny Kukawka [Fri, 2 Nov 2007 03:19:29 +0000 (04:19 +0100)]
ide: add "optical" to sysfs "media" attribute

Add "optical" to sysfs "media" attribute as already in /proc

Signed-off-by: Danny Kukawka <dkukawka@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago optical /proc/ide/*/media
Alexey Dobriyan [Fri, 2 Nov 2007 03:17:40 +0000 (04:17 +0100)]
optical /proc/ide/*/media

Sergey Vlasov reported that his "FUJITSU MCC3064AP, ATAPI OPTICAL drive"
pops up as UNKNOWN in /proc/ide/*/media .

Closes kernel Bugzilla #4145.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago aacraid: fix security hole (CVE-2007-4308)
Alan Cox [Fri, 2 Nov 2007 02:41:27 +0000 (03:41 +0100)]
aacraid: fix security hole (CVE-2007-4308)

On the SCSI layer ioctl path there is no implicit permissions check for
ioctls (and indeed other drivers implement unprivileged ioctls). aacraid
however allows all sorts of very admin only things to be done so should
check.

Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: Mark Salyzyn <mark_salyzyn@adaptec.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago CIFS should honour umask (CVE-2007-3740)
Steve French [Fri, 2 Nov 2007 02:30:35 +0000 (03:30 +0100)]
CIFS should honour umask (CVE-2007-3740)

This patch makes CIFS honour a process' umask like other filesystems.
Of course the server is still free to munge the permissions if it wants
to; but the client will send the "right" permissions to begin with.

A few caveats:

1) It only applies to filesystems that have CAP_UNIX (aka support unix
extensions)
2) It applies the correct mode to the follow up CIFSSMBUnixSetPerms()
after remote creation

When mode to CIFS/NTFS ACL mapping is complete we can do the
same thing for that case for servers which do not
support the Unix Extensions.

Signed-off-by: Matt Keenen <matt@opcode-solutions.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [IEEE80211]: avoid integer underflow for runt rx frames (CVE-2007-4997)
John W. Linville [Fri, 2 Nov 2007 02:13:03 +0000 (03:13 +0100)]
[IEEE80211]: avoid integer underflow for runt rx frames (CVE-2007-4997)

Reported by Chris Evans <scarybeasts@gmail.com>:

> The summary is that an evil 80211 frame can crash out a victim's
> machine. It only applies to drivers using the 80211 wireless code, and
> only then to certain drivers (and even then depends on a card's
> firmware not dropping a dubious packet). I must confess I'm not
> keeping track of Linux wireless support, and the different protocol
> stacks etc.
>
> Details are as follows:
>
> ieee80211_rx() does not explicitly check that "skb->len >= hdrlen".
> There are other skb->len checks, but not enough to prevent a subtle
> off-by-two error if the frame has the IEEE80211_STYPE_QOS_DATA flag
> set.
>
> This leads to integer underflow and crash here:
>
> if (frag != 0)
>    flen -= hdrlen;
>
> (flen is subsequently used as a memcpy length parameter).

How about this?

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
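
The underflow described in the IEEE80211 report above can be reproduced with a short sketch. It is a Python model, not the ieee80211_rx() code: the function names are hypothetical, a 32-bit size_t is assumed, and the frame lengths are illustrative values that are two bytes apart.

```python
SIZE_T_BITS = 32  # assumption: 32-bit size_t


def memcpy_len_unchecked(skb_len, hdrlen, frag):
    """Length handed to memcpy when skb_len is never compared with hdrlen."""
    flen = skb_len
    if frag != 0:
        flen -= hdrlen
    return flen % (1 << SIZE_T_BITS)  # a negative value wraps to a huge size


def memcpy_len_checked(skb_len, hdrlen, frag):
    """Same computation, but runt frames are dropped first."""
    if skb_len < hdrlen:
        return None
    return memcpy_len_unchecked(skb_len, hdrlen, frag)
```

A 24-byte fragment with a 26-byte header yields a copy length of 0xFFFFFFFE in the unchecked variant, while the checked variant drops the frame.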
17 years ago Fix oops in pwc v4l driver
Oliver Neukum [Thu, 1 Nov 2007 03:30:09 +0000 (04:30 +0100)]
Fix oops in pwc v4l driver

The pwc driver is deficient in locking, which can trigger an oops
when disconnecting.

Adrian Bunk:
Backported to 2.6.16.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago USB: fix DoS in pwc USB video driver (CVE-2007-5093)
Oliver Neukum [Sat, 27 Oct 2007 21:36:46 +0000 (23:36 +0200)]
USB: fix DoS in pwc USB video driver (CVE-2007-5093)

The pwc driver has a disconnect method that waits for user space to
close the device. This opens up an opportunity for a DoS attack,
blocking the USB subsystem and making khubd's task busy wait in
kernel space. This patch shifts freeing resources to close if an opened
device is disconnected.

Adrian Bunk:
Backported to 2.6.16.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [SPARC64] pass correct addr in get_fb_unmapped_area(MAP_FIXED)
Chris Wright [Wed, 24 Oct 2007 19:54:41 +0000 (21:54 +0200)]
[SPARC64] pass correct addr in get_fb_unmapped_area(MAP_FIXED)

Looks like the MAP_FIXED case is using the wrong address hint.  I'd
expect the comment "don't mess with it" means pass the request
straight on through, not change the address requested to -ENOMEM.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago Linux 2.6.16.56 v2.6.16.56
Adrian Bunk [Thu, 1 Nov 2007 02:23:29 +0000 (03:23 +0100)]
Linux 2.6.16.56

17 years ago Linux 2.6.16.56-rc2 v2.6.16.56-rc2
Adrian Bunk [Sun, 28 Oct 2007 21:33:36 +0000 (22:33 +0100)]
Linux 2.6.16.56-rc2

17 years ago hugetlb: fix size=4G parsing
Hugh Dickins [Sun, 28 Oct 2007 21:32:04 +0000 (22:32 +0100)]
hugetlb: fix size=4G parsing

On 32-bit machines, mount -t hugetlbfs -o size=4G gave a 0GB filesystem,
size=5G gave a 1GB filesystem etc: there's no point in masking size with
HPAGE_MASK just before shifting its lower bits away, and since HPAGE_MASK is a
UL, that removed all the higher bits of the unsigned long long size.
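
The numbers in this message can be checked with a short sketch. It is a Python model under stated assumptions: a 32-bit unsigned long and 4 MB huge pages (HPAGE_SHIFT of 22); the function names are hypothetical.

```python
HPAGE_SHIFT = 22                                             # assumption: 4 MB huge pages
HPAGE_MASK_UL32 = ~((1 << HPAGE_SHIFT) - 1) & 0xFFFFFFFF     # mask as a 32-bit UL

GIB = 1 << 30


def hugepages_buggy(size):
    """Mask with the 32-bit HPAGE_MASK first, which drops bits 32 and above."""
    return (size & HPAGE_MASK_UL32) >> HPAGE_SHIFT


def hugepages_fixed(size):
    """Shift only; the low bits fall away without any masking."""
    return size >> HPAGE_SHIFT
```

Under these assumptions `hugepages_buggy(4 * GIB)` is 0 and `hugepages_buggy(5 * GIB)` covers exactly 1 GB, matching the symptoms described, while `hugepages_fixed` gives 1024 and 1280 pages.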

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago hugetlb: fix error return for brk() entering a hugepage region
Hugh Dickins [Sun, 28 Oct 2007 21:22:25 +0000 (22:22 +0100)]
hugetlb: fix error return for brk() entering a hugepage region

The last commit causes the wrong return value.
is_hugepage_only_range() is a boolean, so we should return
-EINVAL rather than 1.

Also - we can use "mm" instead of looking up "current->mm" again.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago hugetlb: check for brk() entering a hugepage region
David Gibson [Sun, 28 Oct 2007 21:20:34 +0000 (22:20 +0100)]
hugetlb: check for brk() entering a hugepage region

Unlike mmap(), the codepath for brk() creates a vma without first checking
that it doesn't touch a region exclusively reserved for hugepages.  On
powerpc, this can allow it to create a normal page vma in a hugepage
region, causing oopses and other badness.

Add a test to prevent this.  With this patch, brk() will simply fail if it
attempts to move the break into a hugepage reserved region.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [IA64] fix ia64 is_hugepage_only_range
Ken Chen [Sun, 28 Oct 2007 20:40:41 +0000 (21:40 +0100)]
[IA64] fix ia64 is_hugepage_only_range

fix is_hugepage_only_range() definition to be "overlaps"
instead of "within architectural restricted hugetlb address
range".  Simplify the ia64 specific code that used to use
is_hugepage_only_range() to just check which region the
address is in.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago Linux 2.6.16.56-rc1 v2.6.16.56-rc1
Adrian Bunk [Fri, 19 Oct 2007 17:15:19 +0000 (19:15 +0200)]
Linux 2.6.16.56-rc1

17 years ago Don't allow the stack to grow into hugetlb reserved regions (CVE-2007-3739)
Adam Litke [Fri, 19 Oct 2007 17:05:10 +0000 (19:05 +0200)]
Don't allow the stack to grow into hugetlb reserved regions (CVE-2007-3739)

When expanding the stack, we don't currently check if the VMA will cross
into an area of the address space that is reserved for hugetlb pages.
Subsequent faults on the expanded portion of such a VMA will confuse the
low-level MMU code, resulting in an OOPS.  Check for this.

Signed-off-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago drivers/video/macmodes.c:mac_find_mode() mustn't be __init
Adrian Bunk [Fri, 19 Oct 2007 16:51:07 +0000 (18:51 +0200)]
drivers/video/macmodes.c:mac_find_mode() mustn't be __init

If it's EXPORT_SYMBOL'ed it can't be __devinit.

Reported by Mikael Pettersson.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago hugetlb: fix prio_tree unit (CVE-2007-4133)
Hugh Dickins [Fri, 19 Oct 2007 12:30:18 +0000 (14:30 +0200)]
hugetlb: fix prio_tree unit (CVE-2007-4133)

hugetlb_vmtruncate_list was misconverted to prio_tree: its prio_tree is in
units of PAGE_SIZE (PAGE_CACHE_SIZE) like any other, not HPAGE_SIZE (whereas
its radix_tree is kept in units of HPAGE_SIZE, otherwise slots would be
absurdly sparse).

At first I thought the error benign, just calling __unmap_hugepage_range on
more vmas than necessary; but on 32-bit machines, when the prio_tree is
searched correctly, it happens to ensure the v_offset calculation won't
overflow.  As it stood, when truncating at or beyond 4GB, it was liable to
discard pages COWed from lower offsets; or even to clear pmd entries of
preceding vmas, triggering exit_mmap's BUG_ON(nr_ptes).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago hugetlbfs: add Kconfig help text
Arthur Othieno [Fri, 19 Oct 2007 00:04:58 +0000 (02:04 +0200)]
hugetlbfs: add Kconfig help text

In kernel bugzilla #6248 (http://bugzilla.kernel.org/show_bug.cgi?id=6248),
Adrian Bunk <bunk@stusta.de> notes that CONFIG_HUGETLBFS is missing Kconfig
help text.

Signed-off-by: Arthur Othieno <apgo@patchbomb.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago hugetlbfs doc. update
Randy Dunlap [Fri, 19 Oct 2007 00:03:58 +0000 (02:03 +0200)]
hugetlbfs doc. update

Fix typos, spelling, etc., in Doc/vm/hugetlbpage.txt.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago x86: HUGETLBFS and DEBUG_PAGEALLOC are incompatible
Ken Chen [Thu, 18 Oct 2007 23:59:17 +0000 (01:59 +0200)]
x86: HUGETLBFS and DEBUG_PAGEALLOC are incompatible

DEBUG_PAGEALLOC is not compatible with hugetlb page support.  That debug
option turns off PSE.  Once it is turned off in CR4, the cpu will ignore
the pse bit in the pmd, causing infinite page-not-present faults.

So disable DEBUG_PAGEALLOC if the user selected hugetlbfs.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [IA64] lazy_mmu_prot_update needs to be aware of huge pages
Zhang Yanmin [Thu, 18 Oct 2007 23:52:21 +0000 (01:52 +0200)]
[IA64] lazy_mmu_prot_update needs to be aware of huge pages

Function lazy_mmu_prot_update is also used on huge pages when it is called
by set_huge_ptep_writable, but it isn't aware of huge pages.

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Acked-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago SELinux: clear parent death signal on SID transitions
Stephen Smalley [Thu, 18 Oct 2007 23:27:51 +0000 (01:27 +0200)]
SELinux: clear parent death signal on SID transitions

Clear parent death signal on SID transitions to prevent unauthorized
signaling between SIDs.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@parisplace.org>
Signed-off-by: James Morris <jmorris@localhost.localdomain>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago make UML compile (FC6/x86-64)
Ulrich Drepper [Thu, 18 Oct 2007 21:46:58 +0000 (23:46 +0200)]
make UML compile (FC6/x86-64)

I need this patch to get a UML kernel to compile.  This is with the
kernel headers in FC6 which are automatically generated from the kernel
tree.  Some headers are missing but those files don't need them.  At
least it appears so since the resulting kernel works fine.

Tested on x86-64.

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago DVB: get_dvb_firmware: update script for new location of tda10046 firmware
Andreas Arens [Thu, 18 Oct 2007 20:44:28 +0000 (22:44 +0200)]
DVB: get_dvb_firmware: update script for new location of tda10046 firmware

cherry picked from commit c545d6adbcacd296f7457bd992556feb055379de

Update get_dvb_firmware script for the new location of the
tda10046 firmware.

The old location doesn't work anymore.

Signed-off-by: Andreas Arens <ari@goron.de>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago DVB: get_dvb_firmware: update script for new location of sp8870 firmware
Michael Krufky [Thu, 18 Oct 2007 20:43:30 +0000 (22:43 +0200)]
DVB: get_dvb_firmware: update script for new location of sp8870 firmware

cherry picked from commit 302170a4b47e869372974abd885dd11d5536b64a

get_dvb_firmware: update script for new location of sp8870 firmware

This url is no longer valid:
http://www.technotrend.de/new/217g/tt_Premium_217g.zip

Replace with:
http://www.softwarepatch.pl/9999ccd06a4813cb827dbb0005071c71/tt_Premium_217g.zip

Thanks-to: Tobias Stoeber <tobi@to-st.de>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago alpha: fix epoll syscall enumerations
Mike Frysinger [Thu, 18 Oct 2007 19:30:46 +0000 (21:30 +0200)]
alpha: fix epoll syscall enumerations

We went and named them __NR_sys_foo instead of __NR_foo.

It may be too late to change this, but we can at least add the proper names
now.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago m68knommu: ptrace.h typo fix
Jan Altenberg [Thu, 18 Oct 2007 17:21:38 +0000 (19:21 +0200)]
m68knommu: ptrace.h typo fix

Signed-off-by: Jan Altenberg <tb10alj@tglx.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [TCP]: Fix fastpath_cnt_hint when GSO skb is partially ACKed
Ilpo Järvinen [Thu, 18 Oct 2007 16:55:43 +0000 (18:55 +0200)]
[TCP]: Fix fastpath_cnt_hint when GSO skb is partially ACKed

When only a GSO skb was partially ACKed, no hints are reset,
therefore fastpath_cnt_hint must be tweaked too or else it can
corrupt fackets_out. For the corruption to occur, one must have
a non-trivial ACK/SACK sequence, so this bug is not often
harmful. There's a fackets_out state reset in TCP because
fackets_out is known to be inaccurate and that fixes the issue
eventually anyway.

In case there was also at least one skb that got fully ACKed,
the fastpath_skb_hint is set to NULL which causes a recount for
fastpath_cnt_hint (the old value won't be accessed anymore),
thus it can safely be decremented without additional checking.

Reported by Cedric Le Goater <clg@fr.ibm.com>

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago [SPARC64]: Fix bugs in SYSV IPC handling in 64-bit processes.
David S. Miller [Thu, 18 Oct 2007 16:48:42 +0000 (18:48 +0200)]
[SPARC64]: Fix bugs in SYSV IPC handling in 64-bit processes.

Thanks to Tom Callaway for the excellent bug report and
test case.

sys_ipc() has several problems, most to do with semaphore
call handling:

1) 'err' return should be a 'long'
2) "union semun" is passed in a register on 64-bit compared
   to 32-bit which provides it on the stack and therefore
   by reference
3) Second and third arguments to SEMCTL are swapped compared
   to 32-bit.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago[NET]: Zero length write() on socket should not simply return 0.
David S. Miller [Thu, 18 Oct 2007 16:47:05 +0000 (18:47 +0200)]
[NET]: Zero length write() on socket should not simply return 0.

This fixes kernel bugzilla #5731

It should generate an empty packet for datagram protocols when the
socket is connected, for one.

The check is doubly-wrong because all that a write() can be is a
sendmsg() call with a NULL msg_control and a single entry iovec.  No
special semantics should be assigned to it, therefore the zero length
check should be removed entirely.

This matches the behavior of BSD and several other systems.

Alan Cox notes that SUSv3 says the behavior of a zero length write on
non-files is "unspecified", but that's kind of useless since BSD has
defined this behavior for a quarter century and BSD is essentially
what application folks code to.
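
A minimal sketch of the behavior described above, assuming a POSIX system. It uses a connected AF_UNIX datagram socket pair rather than a network socket, and the helper name is hypothetical. With the zero length check removed, the write() is expected to queue an empty datagram, so the non-blocking recv() on the peer reports a 0-byte message instead of failing with EAGAIN:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue a zero-length write() on one end of a
 * connected datagram socket pair and report what the other end receives.
 * Returns the recv() result: 0 means an empty datagram arrived,
 * -1 means nothing was queued or another error occurred.
 */
long zero_length_write_roundtrip(void)
{
	int sv[2];
	char buf[16];
	ssize_t got;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;

	/* Equivalent to sendmsg() with one zero-length iovec entry. */
	if (write(sv[0], buf, 0) < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}

	/* MSG_DONTWAIT: fail with EAGAIN instead of blocking if no datagram exists. */
	got = recv(sv[1], buf, sizeof(buf), MSG_DONTWAIT);

	close(sv[0]);
	close(sv[1]);
	return (long)got;
}
```

A caller would simply check whether zero_length_write_roundtrip() returns 0.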

Based upon a patch from Stephen Hemminger.

Adrian Bunk:
Backported to 2.6.16.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago[PKT_SCHED] cls_u32: error code isn't been propogated properly
Stephen Hemminger [Thu, 18 Oct 2007 16:31:51 +0000 (18:31 +0200)]
[PKT_SCHED] cls_u32: error code isn't been propogated properly

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago[POWERPC] Fix handling of stfiwx math emulation
Kumar Gala [Thu, 18 Oct 2007 16:00:21 +0000 (18:00 +0200)]
[POWERPC] Fix handling of stfiwx math emulation

It's legal for the stfiwx instruction to have RA = 0 as part of its
effective address calculation.  This is illegal for all other XE
form instructions.

Add code to compute the proper effective address for stfiwx if
RA = 0 rather than treating it as illegal.

Adrian Bunk:
Backported to 2.6.16.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago[PKT_SCHED] RED: Fix overflow in calculation of queue average
Ilpo Järvinen [Thu, 18 Oct 2007 15:56:27 +0000 (17:56 +0200)]
[PKT_SCHED] RED: Fix overflow in calculation of queue average

Overflow can occur very easily with 32 bits, e.g., after 1 second
of idle time us_idle is approx. 2^20, which leaves only 11-Wlog
bits for queue length. Since the EWMA exponent is typically around
9, queue lengths larger than 2^2 cause overflow. Whether the
affected branch is taken when us_idle is as high as 1 second
depends on Scell_log, but with a rather reasonable configuration
Scell_log is large enough to cause p->Stab to have zero index,
which always results in zero shift (typically a few other small
indices also result in zero shift).
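
The arithmetic above can be illustrated with a small sketch. It assumes that the queue average is stored as the queue length scaled by 2^Wlog and that the affected branch multiplies it by us_idle before shifting by Scell_log; the function names are hypothetical and this is not the kernel code. Unsigned types are used so that the wrap-around is well defined in C:

```c
#include <stdint.h>

/* 32-bit product: wraps around once qavg * us_idle reaches 2^32. */
uint32_t idle_product_32(uint32_t qavg, uint32_t us_idle, unsigned int scell_log)
{
	return (qavg * us_idle) >> scell_log;
}

/* Widening one operand to 64 bits before multiplying avoids the wrap. */
uint64_t idle_product_64(uint32_t qavg, uint32_t us_idle, unsigned int scell_log)
{
	return ((uint64_t)qavg * us_idle) >> scell_log;
}
```

For example, with Wlog = 9 a queue length of 8 is stored as 8 << 9 = 2^12; multiplied by us_idle = 2^20 the product is 2^32, which no longer fits in 32 bits.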

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agoLinux 2.6.16.55 v2.6.16.55
Adrian Bunk [Fri, 12 Oct 2007 15:27:10 +0000 (17:27 +0200)]
Linux 2.6.16.55

17 years agoRevert "TCP: Fix TCP handling of SACK in bidirectional flows"
Adrian Bunk [Fri, 12 Oct 2007 21:03:25 +0000 (23:03 +0200)]
Revert "TCP: Fix TCP handling of SACK in bidirectional flows"

This reverts commit 3198d0f16dec0c87071cf26f3f11af9c8f0a009b.

17 years agoLinux 2.6.16.55-rc1 v2.6.16.55-rc1
Adrian Bunk [Sat, 6 Oct 2007 23:01:31 +0000 (01:01 +0200)]
Linux 2.6.16.55-rc1

17 years agoConvert snd-page-alloc proc file to use seq_file (CVE-2007-4571)
Takashi Iwai [Sun, 7 Oct 2007 01:26:43 +0000 (03:26 +0200)]
Convert snd-page-alloc proc file to use seq_file (CVE-2007-4571)

Commit ccec6e2c4a74adf76ed4e2478091a311b1806212 in mainline.

Use seq_file for the proc file read/write of snd-page-alloc module.
This automatically fixes bugs in the old proc code.

Adrian Bunk:
Backported to 2.6.16.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agosnd_mem_proc_read(): convert to list_for_each_entry*
Adrian Bunk [Sat, 6 Oct 2007 22:58:15 +0000 (00:58 +0200)]
snd_mem_proc_read(): convert to list_for_each_entry*

Stolen from a patch by Johannes Berg <johannes@sipsolutions.net>.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agosysfs: store sysfs inode nrs in s_ino to avoid readdir oopses (CVE-2007-3104)
Eric Sandeen [Sat, 6 Oct 2007 22:52:10 +0000 (00:52 +0200)]
sysfs: store sysfs inode nrs in s_ino to avoid readdir oopses (CVE-2007-3104)

Backport of
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.22-rc1/2.6.22-rc1-mm1/broken-out/gregkh-driver-sysfs-allocate-inode-number-using-ida.patch

For regular files in sysfs, sysfs_readdir wants to traverse
sysfs_dirent->s_dentry->d_inode->i_ino to get to the inode number.
But, the dentry can be reclaimed under memory pressure, and there is
no synchronization with readdir.  This patch follows Tejun's scheme of
allocating and storing an inode number in the new s_ino member of a
sysfs_dirent, when dirents are created, and retrieving it from there
for readdir, so that the pointer chain doesn't have to be traversed.

Tejun's upstream patch uses a new-ish "ida" allocator which brings
along some extra complexity; this -stable patch has a brain-dead
incrementing counter which does not guarantee uniqueness, but because
sysfs doesn't hash inodes as iunique expects, uniqueness wasn't
guaranteed today anyway.

Adrian Bunk:
Backported to 2.6.16.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agorandom: fix bound check ordering (CVE-2007-3105)
Matt Mackall [Sat, 6 Oct 2007 22:27:53 +0000 (00:27 +0200)]
random: fix bound check ordering (CVE-2007-3105)

If root raised the default wakeup threshold over the size of the
output pool, the pool transfer function could overflow the stack with
RNG bytes, causing a DoS or potential privilege escalation.

(Bug reported by the PaX Team <pageexec@freemail.hu>)

Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agorandom: fix seeding with zero entropy (CVE-2007-2453 2 of 2)
Matt Mackall [Sat, 6 Oct 2007 22:24:49 +0000 (00:24 +0200)]
random: fix seeding with zero entropy (CVE-2007-2453 2 of 2)

Add data from zero-entropy random_writes directly to output pools to
avoid accounting difficulties on machines without entropy sources.

Tested on lguest with all entropy sources disabled.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Acked-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agorandom: fix error in entropy extraction (CVE-2007-2453 1 of 2)
Matt Mackall [Sat, 6 Oct 2007 22:19:10 +0000 (00:19 +0200)]
random: fix error in entropy extraction (CVE-2007-2453 1 of 2)

Fix cast error in entropy extraction.
Add comments explaining the magic 16.
Remove extra confusing loop variable.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Acked-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agoReset current->pdeath_signal on SUID binary execution (CVE-2007-3848)
Marcel Holtmann [Sat, 6 Oct 2007 22:03:26 +0000 (00:03 +0200)]
Reset current->pdeath_signal on SUID binary execution (CVE-2007-3848)

This fixes a vulnerability in the "parent process death signal"
implementation discovered by Wojciech Purczynski of COSEINC PTE Ltd.
and iSEC Security Research.

http://marc.info/?l=bugtraq&m=118711306802632&w=2

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agofix buffer overflow in the moxa driver (CVE-2005-0504)
Dann Frazier [Sat, 6 Oct 2007 21:51:05 +0000 (23:51 +0200)]
fix buffer overflow in the moxa driver (CVE-2005-0504)

Signed-off-by: Dann Frazier <dannf@hp.com>
Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago[POWERPC] Flush registers to proper task context
Kumar Gala [Sat, 6 Oct 2007 21:36:26 +0000 (23:36 +0200)]
[POWERPC] Flush registers to proper task context

When we flush register state for FP, Altivec, or SPE in flush_*_to_thread
we need to respect the task_struct that the caller has passed to us.

In most cases we are called with current; however, sometimes (ptrace) we may
be passed a different task_struct.

This showed up when using gdbserver debugging a simple program that used
floating point. When gdb tried to show the FP regs they all showed up as 0,
because the child's FP registers were never properly flushed to memory.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agox86_64: Zero extend all registers after ptrace in 32bit entry path (CVE-2007-4573)
Andi Kleen [Sat, 6 Oct 2007 21:32:18 +0000 (23:32 +0200)]
x86_64: Zero extend all registers after ptrace in 32bit entry path (CVE-2007-4573)

Strictly it's only needed for eax.

It actually does a little more than strictly needed -- the other registers
are already zero extended.

Also remove the now unnecessary and non functional compat task check
in ptrace.

Found by Wojciech Purczynski

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agounexport ip_conntrack_{,un}register_notifier
Adrian Bunk [Sat, 6 Oct 2007 20:38:04 +0000 (22:38 +0200)]
unexport ip_conntrack_{,un}register_notifier

Static functions mustn't be exported.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agosound/core/pcm_lib.c: don't export static functions
Adrian Bunk [Sat, 6 Oct 2007 20:29:05 +0000 (22:29 +0200)]
sound/core/pcm_lib.c: don't export static functions

Static functions mustn't be exported.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agounexport csr1212_release_keyval
Adrian Bunk [Sat, 6 Oct 2007 20:05:29 +0000 (22:05 +0200)]
unexport csr1212_release_keyval

A static function mustn't be exported.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agounexport cpufreq_parse_governor
Adrian Bunk [Sat, 6 Oct 2007 19:59:38 +0000 (21:59 +0200)]
unexport cpufreq_parse_governor

A static function mustn't be exported.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agounexport neigh_update_hhs
Adrian Bunk [Sat, 6 Oct 2007 19:13:06 +0000 (21:13 +0200)]
unexport neigh_update_hhs

A static function mustn't be exported.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago[SPARC]: fix sparc64 gcc 4.2 compile failure
Mikael Pettersson [Sat, 6 Oct 2007 19:05:23 +0000 (21:05 +0200)]
[SPARC]: fix sparc64 gcc 4.2 compile failure

Compiling 2.6.21-rc5 with gcc-4.2.0 20070317 (prerelease)
for sparc64 fails as follows:

  gcc -Wp,-MD,arch/sparc64/kernel/.time.o.d  -nostdinc -isystem /home/mikpe/pkgs/linux-sparc64/gcc-4.2.0/lib/gcc/sparc64-unknown-linux-gnu/4.2.0/include -D__KERNEL__ -Iinclude  -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os -m64 -pipe -mno-fpu -mcpu=ultrasparc -mcmodel=medlow -ffixed-g4 -ffixed-g5 -fcall-used-g7 -Wno-sign-compare -Wa,--undeclared-regs -fomit-frame-pointer  -fno-stack-protector -Wdeclaration-after-statement -Wno-pointer-sign -Werror   -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(time)"  -D"KBUILD_MODNAME=KBUILD_STR(time)" -c -o arch/sparc64/kernel/time.o arch/sparc64/kernel/time.c
cc1: warnings being treated as errors
arch/sparc64/kernel/time.c: In function 'kick_start_clock':
arch/sparc64/kernel/time.c:559: warning: overflow in implicit constant conversion
make[1]: *** [arch/sparc64/kernel/time.o] Error 1
make: *** [arch/sparc64/kernel] Error 2

gcc gets unhappy when the MSTK_SET macro's u8 __val variable
is updated with &= ~0xff (MSTK_YEAR_MASK). Making the constant
unsigned fixes the problem.

[ I fixed up the sparc32 side as well -DaveM ]

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agounexport ktime_get_real
Adrian Bunk [Sat, 6 Oct 2007 19:14:55 +0000 (21:14 +0200)]
unexport ktime_get_real

A static function mustn't be exported.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago[IPSEC] AH4: Update IPv4 options handling to conform to RFC 4302.
Nick Bowler [Sat, 6 Oct 2007 18:34:18 +0000 (20:34 +0200)]
[IPSEC] AH4: Update IPv4 options handling to conform to RFC 4302.

In testing our ESP/AH offload hardware, I discovered an issue with how
AH handles mutable fields in IPv4.  RFC 4302 (AH) states the following
on the subject:

        For IPv4, the entire option is viewed as a unit; so even
        though the type and length fields within most options are immutable
        in transit, if an option is classified as mutable, the entire option
        is zeroed for ICV computation purposes.

The current implementation does not zero the type and length fields,
resulting in authentication failures when communicating with hosts
that do (e.g. FreeBSD).
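
A minimal sketch of the rule quoted above, not the kernel implementation. It walks a raw IPv4 option area and zeroes each mutable option as a whole unit, type and length bytes included. The set of immutable option types is an assumption taken from RFC 4302, Appendix A, all names are hypothetical, and any special handling of source route options is left out:

```c
#include <stddef.h>
#include <string.h>

/* Option types treated as immutable (assumption: RFC 4302, Appendix A). */
enum {
	OPT_END   = 0,		/* end of option list */
	OPT_NOOP  = 1,		/* no operation */
	OPT_SEC   = 130,	/* security */
	OPT_ESEC  = 133,	/* extended security */
	OPT_CIPSO = 134,	/* commercial security */
	OPT_RA    = 148		/* router alert */
};

static int opt_is_immutable(unsigned char type)
{
	return type == OPT_SEC || type == OPT_ESEC ||
	       type == OPT_CIPSO || type == OPT_RA;
}

/*
 * Zero every mutable option in place, including its type and length bytes.
 * Returns 0 on success, -1 if the option area is malformed.
 */
int clear_mutable_options(unsigned char *opts, size_t len)
{
	size_t off = 0;

	while (off < len) {
		unsigned char type = opts[off];
		size_t optlen;

		if (type == OPT_END)
			break;
		if (type == OPT_NOOP) {
			off++;		/* single-byte option */
			continue;
		}
		if (len - off < 2)
			return -1;	/* no room for a length byte */
		optlen = opts[off + 1];
		if (optlen < 2 || optlen > len - off)
			return -1;
		if (!opt_is_immutable(type))
			memset(opts + off, 0, optlen);	/* whole option, not just the payload */
		off += optlen;
	}
	return 0;
}
```

Zeroing only the payload (opts + off + 2, optlen - 2) would reproduce the mismatch described in this commit, because the peer computes the ICV over an all-zero option.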

I have tested record route and timestamp options (ping -R and ping -T)
on a small network involving Windows XP, FreeBSD 6.2, and Linux hosts,
with one router.  In the presence of these options, the FreeBSD and
Linux hosts (with the patch or with the hardware) can communicate.
The Windows XP host simply fails to accept these packets with or
without the patch.

I have also been trying to test source routing options (using
traceroute -g), but haven't had much luck getting this option to work
*without* AH, let alone with.

Signed-off-by: Nick Bowler <nbowler@ellipticsemi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agoLinux 2.6.16.54 v2.6.16.54
Adrian Bunk [Mon, 24 Sep 2007 22:24:51 +0000 (00:24 +0200)]
Linux 2.6.16.54

17 years agoLinux 2.6.16.54-rc1 v2.6.16.54-rc1
Adrian Bunk [Thu, 30 Aug 2007 04:26:15 +0000 (06:26 +0200)]
Linux 2.6.16.54-rc1

17 years agoTCP: Fix TCP handling of SACK in bidirectional flows
Ilpo Järvinen [Thu, 30 Aug 2007 04:21:05 +0000 (06:21 +0200)]
TCP: Fix TCP handling of SACK in bidirectional flows

It's possible that new SACK blocks that should trigger new LOST
markings arrive with new data (which previously made is_dupack
false). In addition, I think this fixes a case where we get
a cumulative ACK with enough SACK blocks to trigger the fast
recovery (is_dupack would be false there too).

I'm not completely pleased with this solution because readability
of the code is somewhat questionable as 'is_dupack' in SACK case
is no longer about dupacks only but would mean something like
'lost_marker_work_todo' too... But because of Eifel stuff done
in CA_Recovery, the FLAG_DATA_SACKED check cannot be placed to
the if statement which seems attractive solution. Nevertheless,
I didn't like adding another variable just for that either... :-)

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago[PPP]: Fix output buffer size in ppp_decompress_frame().
Konstantin Sharlaimov [Thu, 30 Aug 2007 04:17:37 +0000 (06:17 +0200)]
[PPP]: Fix output buffer size in ppp_decompress_frame().

This patch addresses the issue with "osize too small" errors in mppe
encryption.  The patch fixes the issue with wrong output buffer size
being passed to ppp decompression routine.

--------------------
As pointed out by Suresh Mahalingam, the issue addressed by
ppp-fix-osize-too-small-errors-when-decoding patch is not fully resolved yet.
The size of the allocated output buffer is correct, however the size passed to
ppp->rcomp->decompress in ppp_generic.c is wrong. The patch fixes that.
--------------------

Signed-off-by: Konstantin Sharlaimov <konstantin.sharlaimov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago[PPP]: Fix osize too small errors when decoding mppe.
Konstantin Sharlaimov [Mon, 24 Sep 2007 21:01:54 +0000 (23:01 +0200)]
[PPP]: Fix osize too small errors when decoding mppe.

The mppe_decompress() function required a buffer that is 1 byte too
small when receiving a message of mru size. This fixes buffer
allocation to prevent this from occurring.

Signed-off-by: Konstantin Sharlaimov <konstantin.sharlaimov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago[MATH-EMU]: Fix underflow exception reporting.
David S. Miller [Thu, 30 Aug 2007 04:14:37 +0000 (06:14 +0200)]
[MATH-EMU]: Fix underflow exception reporting.

The underflow exception cases were wrong.

This is one weird area of IEEE 754 handling in that the underflow
behavior changes based upon whether underflow is enabled in the trap
enable mask of the FPU control register.  As a specific case the Sparc
V9 manual gives us the following description:

--------------------
If UFM = 0:     Underflow occurs if a nonzero result is tiny and a
                loss of accuracy occurs.  Tininess may be detected
                before or after rounding.  Loss of accuracy may be
                either a denormalization loss or an inexact result.

If UFM = 1:     Underflow occurs if a nonzero result is tiny.
                Tininess may be detected before or after rounding.
--------------------

What this amounts to in the packing case is if we go subnormal,
we set underflow if any of the following are true:

1) rounding sets inexact
2) we ended up rounding back up to normal (this is the case where
   we set the exponent to 1 and set the fraction to zero), this
   should set inexact too
3) underflow is set in FPU control register trap-enable mask

The initially discovered example was "DBL_MIN / 16.0" which
incorrectly generated an underflow.  It should not, unless underflow
is set in the trap-enable mask of the FPU csr.

Another example, "0x0.0000000000001p-1022 / 16.0", should signal both
inexact and underflow.  The cpu implementations and IEEE 754
literature are very clear about this.  This is case #2 above.
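
The three conditions listed above can be written as a small predicate. This is only a sketch of the decision as described in this message, with a hypothetical function name and boolean inputs standing in for the state the packing code tracks:

```c
#include <stdbool.h>

/*
 * Decide whether the underflow flag is raised when packing a result.
 * went_subnormal:       the result is in the subnormal range
 * inexact:              rounding lost precision (condition 1)
 * rounded_up_to_normal: rounding carried back up to the smallest
 *                       normal number (condition 2)
 * ufm_enabled:          underflow is set in the trap-enable mask
 *                       (condition 3)
 */
bool raises_underflow(bool went_subnormal, bool inexact,
		      bool rounded_up_to_normal, bool ufm_enabled)
{
	if (!went_subnormal)
		return false;
	return inexact || rounded_up_to_normal || ufm_enabled;
}
```

Under this model an exact subnormal result such as "DBL_MIN / 16.0" raises underflow only when the trap-enable bit is set.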

However, if underflow is set in the trap enable mask, only underflow
should be set and reported as a trap.  That is handled properly by the
prioritization logic in

arch/sparc{,64}/math-emu/math.c:record_exception().

Based upon a report and test case from Jakub Jelinek.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago[SPARC32]: Fix rounding errors in ndelay/udelay implementation.
Mark Fortescue [Thu, 30 Aug 2007 04:07:11 +0000 (06:07 +0200)]
[SPARC32]: Fix rounding errors in ndelay/udelay implementation.

__ndelay and __udelay have not been delaying >= the specified time.
The problem with __ndelay has been tracked down to the rounding of the
multiplier constant. By changing this, delays > approx. 18us are correctly
calculated.
The problem with __udelay has also been tracked down to rounding issues.
Changing the multiplier constant (to match that used in sparc64) corrects
for large delays and adding in a rounding constant corrects for truncation
errors in the calculations.
Many short delays will return without looping. This is not an error as there
is the fixed delay of doing all the maths to calculate the loop count.

Signed-off-by: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years ago[SPARC32]: Fix bug in sparc optimized memset.
Alexander Shmelev [Thu, 30 Aug 2007 04:05:29 +0000 (06:05 +0200)]
[SPARC32]: Fix bug in sparc optimized memset.

Sparc optimized memset (arch/sparc/lib/memset.S) does not fill last
byte of the memory area, if area size is less than 8 bytes and start
address is not word (4-bytes) aligned.

Here is the code chunk where the bug is located:
/* %o0 - memory address, %o1 - size, %g3 - value */
8:
     add    %o0, 1, %o0
    subcc    %o1, 1, %o1
    bne,a    8b
     stb %g3, [%o0 - 1]

This code should write a byte on every loop iteration, but on the last
iteration the delay-slot instruction stb is not executed because the
branch instruction sets the "annul" bit.

The patch replaces bne,a with a plain bne instruction.
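
The effect of the annul bit on this loop can be modelled in C. This is a hypothetical model of the four instructions above, not code from the patch: with annul set, the delay-slot store runs only when the branch is taken, so the final byte is skipped.

```c
#include <stddef.h>

/*
 * C model of the byte-fill loop. 'annul' selects "bne,a" (the delay-slot
 * store runs only when the branch is taken) versus plain "bne" (the store
 * always runs). Requires n >= 1. Returns the number of bytes stored.
 */
size_t fill_tail(unsigned char *dst, size_t n, unsigned char val, int annul)
{
	size_t stored = 0;
	int taken;

	do {
		dst++;			/* add   %o0, 1, %o0 */
		n--;			/* subcc %o1, 1, %o1 */
		taken = (n != 0);	/* bne or bne,a 8b */
		if (taken || !annul) {	/* delay slot: stb %g3, [%o0 - 1] */
			dst[-1] = val;
			stored++;
		}
	} while (taken);

	return stored;
}
```

Calling fill_tail() on a 3-byte area with annul set stores only the first two bytes; with annul clear all three are written.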

The error can be reproduced with a simple kernel module:

--------------------
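#include <linux/module.h>
#include <linux/config.h>
#include <linux/init.h>		/* __init, __exit */
#include <linux/kernel.h>	/* printk */
#include <linux/errno.h>
#include <linux/string.h>	/* memset; <string.h> is not available in-kernel */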

static void do_memset(void **p, int size)
{
        memset(p, 0x00, size);
}

static int __init memset_test_init(void)
{
    char fooc[8];
    int *fooi;
    memset(fooc, 0xba, sizeof(fooc));

    do_memset((void**)(fooc + 3), 1);

    fooi = (int*) fooc;
    printk("%08X %08X\n", fooi[0], fooi[1]);

    return -1;
}

static void __exit memset_test_cleanup(void)
{
    return;
}

module_init(memset_test_init);
module_exit(memset_test_cleanup);

MODULE_LICENSE("GPL");
EXPORT_NO_SYMBOLS;
--------------------

Signed-off-by: Alexander Shmelev <ashmelev@task.sun.mcst.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd: avoid possible BUG_ON in md bitmap handling
Neil Brown [Thu, 23 Aug 2007 00:13:17 +0000 (02:13 +0200)]
md: avoid possible BUG_ON in md bitmap handling

md/bitmap tracks how many active write requests are pending on blocks
associated with each bit in the bitmap, so that it knows when it can clear
the bit (when count hits zero).

The counter has 14 bits of space, so if there are ever more than 16383, we
cannot cope.

Currently the code just calls BUG_ON as "all" drivers have request queue
limits much smaller than this.

However it seems that some don't.  Apparently some multipath configurations
can allow more than 16383 concurrent write requests.

So, in this unlikely situation, instead of calling BUG_ON we now wait
for the count to drop down a bit.  This requires a new wait_queue_head,
some waiting code, and a wakeup call.

Tested by limiting the counter to 20 instead of 16383 (writes go a lot slower
in that case...).
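
The limit and the "wait instead of BUG_ON" idea can be sketched as follows. This is a simplified user-space illustration with hypothetical names; it only shows the saturation check and the point at which a waiter would be woken, not the wait queue itself, and it assumes the count occupies the low 14 bits of a 16-bit value:

```c
#include <stdbool.h>
#include <stdint.h>

#define COUNTER_BITS 14
#define COUNTER_MAX  ((1u << COUNTER_BITS) - 1)	/* 16383 */

/*
 * Try to account for one more pending write.
 * Returns false when the counter is already full; the caller is then
 * expected to sleep until counter_dec() reports that space is available.
 */
bool counter_try_inc(uint16_t *counter)
{
	if ((*counter & COUNTER_MAX) == COUNTER_MAX)
		return false;
	(*counter)++;
	return true;
}

/*
 * Account for one completed write (the count must be non-zero).
 * Returns true when the counter was full, i.e. a waiter should be woken.
 */
bool counter_dec(uint16_t *counter)
{
	bool was_full = (*counter & COUNTER_MAX) == COUNTER_MAX;

	(*counter)--;
	return was_full;
}
```

Lowering COUNTER_MAX to a small value such as 20 is the kind of change that makes the waiting path easy to exercise, as the testing note above describes.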

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd: fix a few problems with the interface (sysfs and ioctl) to md
Neil Brown [Wed, 22 Aug 2007 23:39:24 +0000 (01:39 +0200)]
md: fix a few problems with the interface (sysfs and ioctl) to md

While developing more functionality in mdadm I found some bugs in md...

- When we remove a device from an inactive array (write 'remove' to
  the 'state' sysfs file - see 'state_store') we should not
  update the superblock information - as we may not have
  read and processed it all properly yet.

- initialise all raid_disk entries to '-1' else the 'slot' sysfs file
  will claim '0' for all devices in an array before the array is
  started.

- allow '\n' to be present at the end of words written to
  sysfs files

- when we use SET_ARRAY_INFO to set the md metadata version,
  set the flag to say that there is persistent metadata.

- allow GET_BITMAP_FILE to be called on an array that hasn't
  been started yet.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd: assorted md and raid1 one-liners
Neil Brown [Wed, 22 Aug 2007 23:38:42 +0000 (01:38 +0200)]
md: assorted md and raid1 one-liners

Fix a few bugs that meant that:
  - superblocks weren't always written at exactly the right time (this
    could show up if the array was not written to - writing to the array
    causes lots of superblock updates and so hides these errors).

  - restarting device recovery after a clean shutdown (version-1 metadata
    only) didn't work as intended (or at all).

1/ Ensure superblock is updated when a new device is added.
2/ Remove an inappropriate test on MD_RECOVERY_SYNC in md_do_sync.
   The body of this if takes one of two branches depending on whether
   MD_RECOVERY_SYNC is set, so testing it in the clause of the if
   is wrong.
3/ Flag superblock for updating after a resync/recovery finishes.
4/ If we find the need to restart a recovery in the middle (version-1
   metadata only) make sure a full recovery (not just as guided by
   bitmaps) does get done.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd: allow SET_BITMAP_FILE to work on 64bit kernel with 32bit userspace
Neil Brown [Wed, 22 Aug 2007 23:37:54 +0000 (01:37 +0200)]
md: allow SET_BITMAP_FILE to work on 64bit kernel with 32bit userspace

..  so that you can use bitmaps with 32bit userspace on a 64 bit kernel.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd: fix some small races in bitmap plugging in raid5
Neil Brown [Wed, 22 Aug 2007 23:06:28 +0000 (01:06 +0200)]
md: fix some small races in bitmap plugging in raid5

The comment gives more details, but I didn't quite have the sequencing right,
so there was room for races to leave bits unset in the on-disk bitmap for
short periods of time.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd: fix a plug/unplug race in raid5
Neil Brown [Wed, 22 Aug 2007 23:02:57 +0000 (01:02 +0200)]
md: fix a plug/unplug race in raid5

When a device is unplugged, requests are moved from one or two (depending on
whether a bitmap is in use) queues to the main request queue.

So whenever requests are put on either of those queues, we should make sure
the raid5 array is 'plugged'.  However we don't.  We currently plug the raid5
queue just before putting requests on queues, so there is room for a race.  If
something unplugs the queue at just the wrong time, requests will be left on
the queue and nothing will want to unplug them.  Normally something else will
plug and unplug the queue fairly soon, but there is a risk that nothing will.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd: fix resync speed calculation for restarted resyncs
Neil Brown [Wed, 22 Aug 2007 22:57:45 +0000 (00:57 +0200)]
md: fix resync speed calculation for restarted resyncs

We introduced 'io_sectors' recently so we could count the sectors that cause
IO during resync separately from sectors which didn't cause IO - there can be a
difference if a bitmap is being used to accelerate resync.

However when a speed is reported, we find the number of sectors processed
recently by subtracting an oldish io_sectors count from a current
'curr_resync' count.  This is wrong because curr_resync counts all sectors,
not just io sectors.

So, add a field to mddev to store the current io_sectors separately from
curr_resync, and use that in the calculations.
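
A minimal sketch of the corrected calculation, assuming 512-byte sectors and a hypothetical helper name. Both operands of the subtraction must come from the same counter; mixing a current 'curr_resync' value with an older io_sectors mark inflates the result whenever a bitmap lets the resync skip sectors:

```c
#include <stdint.h>

/*
 * Resync speed in KiB/s from two samples of the same sector counter
 * taken elapsed_secs apart. Sectors are assumed to be 512 bytes, so
 * two sectors make one KiB.
 */
uint64_t resync_speed_kib(uint64_t sectors_now, uint64_t sectors_mark,
			  uint64_t elapsed_secs)
{
	if (elapsed_secs == 0)
		elapsed_secs = 1;	/* avoid dividing by zero */
	return (sectors_now - sectors_mark) / 2 / elapsed_secs;
}
```

For example, if io_sectors went from 10000 to 20000 in 5 seconds while curr_resync reached 1000000, the matching counters give 1000 KiB/s, whereas subtracting the io_sectors mark from curr_resync would report 99000 KiB/s.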

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd: Allow re-add to work on array without bitmaps
Neil Brown [Wed, 22 Aug 2007 22:56:48 +0000 (00:56 +0200)]
md: Allow re-add to work on array without bitmaps

When an array has a bitmap, a device can be removed and re-added and only
blocks changed since the removal (as recorded in the bitmap) will be resynced.

It should be possible to do a similar thing to arrays without bitmaps.  i.e.
if a device is removed and re-added and *no* changes have been made in the
interim, then the add should not require a resync.

This patch allows that option.  This means that when assembling an array one
device at a time (e.g.  during device discovery) the array can be enabled
read-only as soon as enough devices are available, but extra devices can still
be added without causing a resync.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd/bitmap: tidy up i_writecount handling in md/bitmap
Neil Brown [Sat, 11 Aug 2007 23:09:29 +0000 (01:09 +0200)]
md/bitmap: tidy up i_writecount handling in md/bitmap

md/bitmap modifies i_writecount of a bitmap file to make sure that no-one else
writes to it.  The reverting of the change is sometimes done twice, and there
is one error path where it is omitted.

This patch tidies that up.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd/bitmap: remove dead code from md/bitmap
Neil Brown [Sat, 11 Aug 2007 23:07:37 +0000 (01:07 +0200)]
md/bitmap: remove dead code from md/bitmap

bitmap_active is never called, and the BITMAP_ACTIVE flag is never used or
tested, so discard them both.

Also remove some out-of-date 'todo' comments.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd/bitmap: remove unnecessary page reference manipulations from md/bitmap code
Neil Brown [Sat, 11 Aug 2007 23:05:36 +0000 (01:05 +0200)]
md/bitmap: remove unnecessary page reference manipulations from md/bitmap code

md/bitmap gets a collection of pages representing the bitmap when it
initialises the bitmap, and puts all the references when discarding the
bitmap.

It also occasionally takes extra references without any good reason, and
sometimes drops them ...  though it doesn't always drop them, which can result
in a memory leak.

This patch removes the unnecessary 'get_page' calls, and the corresponding
'put_page' calls.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd/bitmap: use set_bit etc for bitmap page attributes
Neil Brown [Sat, 11 Aug 2007 23:04:41 +0000 (01:04 +0200)]
md/bitmap: use set_bit etc for bitmap page attributes

In particular, this means that we use 4 bits per page instead of a whole
unsigned long.
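
A user-space sketch of the packing idea, assuming four attribute bits per page stored back to back in an array of unsigned long. The helper names are hypothetical, and unlike the kernel's set_bit family these plain bit operations are not atomic:

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define ATTR_BITS_PER_PAGE 4
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Index of the bit holding attribute 'attr' (0..3) of page 'page'. */
static size_t attr_bit(size_t page, unsigned int attr)
{
	return page * ATTR_BITS_PER_PAGE + attr;
}

void attr_set(unsigned long *map, size_t page, unsigned int attr)
{
	size_t bit = attr_bit(page, attr);

	map[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

void attr_clear(unsigned long *map, size_t page, unsigned int attr)
{
	size_t bit = attr_bit(page, attr);

	map[bit / BITS_PER_WORD] &= ~(1UL << (bit % BITS_PER_WORD));
}

bool attr_test(const unsigned long *map, size_t page, unsigned int attr)
{
	size_t bit = attr_bit(page, attr);

	return (map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}
```

With this layout a single 64-bit word holds the attributes of 16 pages, where one unsigned long per page held them for only one.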

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd/bitmap: cleaner separation of page attribute handlers in md/bitmap
Neil Brown [Sat, 11 Aug 2007 23:03:48 +0000 (01:03 +0200)]
md/bitmap: cleaner separation of page attribute handlers in md/bitmap

md/bitmap has some attributes per-page.  Handling of these attributes is
largely abstracted in set_page_attr and clear_page_attr.  However
get_page_attr exposes the format used to store them.  So prior to changing
that format, introduce test_page_attr instead of get_page_attr, and make
appropriate usage changes.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd/bitmap: fix online removal of file-backed bitmaps
Neil Brown [Sat, 11 Aug 2007 22:18:08 +0000 (00:18 +0200)]
md/bitmap: fix online removal of file-backed bitmaps

When "mdadm --grow /dev/mdX --bitmap=none" is used to remove a file-backed
bitmap, the bitmap was disconnected from the array, but the file wasn't closed
(until the array was stopped).

The file also wasn't closed if adding the bitmap file failed.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd: Don't clear bits in bitmap when writing to one device fails during recovery
Neil Brown [Sat, 11 Aug 2007 22:17:07 +0000 (00:17 +0200)]
md: Don't clear bits in bitmap when writing to one device fails during recovery

Currently a device failure during recovery leaves bits set in the bitmap.
This normally isn't a problem as the offending device will be rejected because
of errors.  However if device re-adding is being used with non-persistent
bitmaps, this can be a problem.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
17 years agomd: Add '4' to the list of levels for which bitmaps are supported
Neil Brown [Sat, 11 Aug 2007 22:15:55 +0000 (00:15 +0200)]
md: Add '4' to the list of levels for which bitmaps are supported

I really should make this a function of the personality....

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@kernel.org>