Jan Beulich [Fri, 29 Sep 2006 08:59:42 +0000 (01:59 -0700)]
[PATCH] fix Intel RNG detection
Previously, since determination whether there was an Intel random number
generator was based on a single bit, on systems with a matching bridge
device but without a firmware hub, there was a 50% chance that the code
would incorrectly decide that the system had an RNG. This patch adds
detection of the firmware hub to better qualify the existence of an RNG.
There is one issue with the patch: I was unable to determine the LPC
equivalent for the PCI bridge 8086:2430 (since the old code didn't care
about which of the many devices provided by the ICH/ESB it was chose to use
the PCI bridge device, but the FWH settings live in the LPC device, so the
device list needed to be changed).
Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eric Sandeen [Fri, 29 Sep 2006 08:59:41 +0000 (01:59 -0700)]
[PATCH] mount udf UDF_PART_FLAG_READ_ONLY partitions with MS_RDONLY
There's a bug where a UDF_PART_FLAG_READ_ONLY udf partition gets mounted
read-write, then subsequent problems happen; files seem to be able to be
removed, but file creation results in EIO or worse, oops.
EIO is coming from udf_new_block(), which returns EIO if the right flags
aren't set; only UDF_PART_FLAG_READ_ONLY is set in this case. We probably
s hould not have gotten this far...
Attached patch seems to fix it - and includes a printk to alert the user
that their "rw" mount request has been converted to "ro."
Here's the testcase I used:
[root@magnesium ~]# mkisofs -R -J -udf -o testiso /tmp/
...
Total translation table size: 0
Total rockridge attributes bytes: 342923
Total directory bytes: 382312
Path table size(bytes): 104
Max brk space used 103000
105059 extents written (205 MB)
[root@magnesium ~]# mount -o loop testiso /mnt/test/
[root@magnesium ~]# ls /mnt/test/fsfile
/mnt/test/fsfile
[root@magnesium ~]# rm /mnt/test/fsfile
[root@magnesium ~]# ls /mnt/test/fsfile
ls: /mnt/test/fsfile: No such file or directory
[root@magnesium ~]# touch /mnt/test/fsfile
touch: cannot touch `/mnt/test/fsfile': Input/output error
[root@magnesium tmp]# grep udf /proc/mounts
/dev/loop1 /mnt/test udf rw 0 0
Force readonly mounts of UDF partitions marked as read-only.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Olaf Hering [Fri, 29 Sep 2006 08:59:39 +0000 (01:59 -0700)]
[PATCH] ignore partition table on disks with AIX label
The on-disk data structures from AIX are not known, also the filesystem
layout is not known. There is a msdos partition signature at the end of
the first block, and the kernel recognizes 3 small (and overlapping)
partitions. But they are not usable. Maybe the firmware uses it to find
the bootloader for AIX, but AIX boots also if the first block is cleared.
This fixes also YaST which compares the output from parted (and formerly
fdisk) with /proc/partitions. fdisk recognizes the AIX label since a long
time, SuSE has a patch for parted to handle the disk label as unknown.
dmesg will look like this:
sda: [AIX] unknown partition table
Tested on an IBM B50 with AIX V4.3.3.
Signed-off-by: Olaf Hering <olh@suse.de> Cc: Albert Cahalan <acahalan@gmail.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] DMI: Decode and save OEM String information
This teaches dmi_decode() how to decode and save OEM Strings (type 11) DMI
information, which is currently discarded silently. Existing code using
DMI is not affected. Follows the "System Management BIOS (SMBIOS)
Specification" (http://www.dmtf.org/standards/smbios), and also the
userspace dmidecode.c code.
OEM Strings are the only safe way to identify some hardware, e.g., the
ThinkPad embedded controller used by the soon-to-be-submitted tp_smapi
driver. This will also let us eliminate the long whitelist in the mainline
hdaps driver (in a future patch).
[PATCH] timer: add lock annotation to lock_timer_base
lock_timer_base acquires a lock and returns with that lock held. Add a
lock annotation to this function so that sparse can check callers for lock
pairing, and so that sparse will not complain about this function since it
intentionally uses the lock in this manner.
Some filesystems may want to report different values depending on the path
within the filesystem, i.e. one mount is actually several filesystems. This
can be the case for a network filesystem exported by an unprivileged server
(e.g. sshfs).
This is now possible, thanks to David Howells "VFS: Permit filesystem to
perform statfs with a known root dentry" patch.
This change is backward compatible, so no need to change interface version.
Mark Huang [Fri, 29 Sep 2006 08:59:34 +0000 (01:59 -0700)]
[PATCH] module_subsys: initialize earlier
Initialize module_subsys earlier (or at least earlier than devices) since
it could be used very early in the boot process if kmod loads a module
before the device initcalls. Otherwise, kmod will crash in
kernel/module.c:mod_sysfs_setup() since the kset in module_subsys is not
initialized yet.
I only noticed this problem because occasionally, kmod loads the modules
for my SCSI and Ethernet adapters very early, during the boot process
itself. I don't quite understand why it loads them sometimes and doesn't
load them other times. Or who is telling kmod to do so. Can someone
explain?
Signed-off-by: Mark Huang <mlhuang@cs.princeton.edu> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] Require mmap handler for a.out executables
Files supported by fs/proc/base.c, i.e. /proc/<pid>/*, are not capable of
meeting the validity checks in ELF load_elf_*() handling because they have
no mmap handler which is required by ELF. In order to stop a.out
executables being used as part of an exploit attack against /proc-related
vulnerabilities, we make a.out executables depend on ->mmap() existing.
[PATCH] rcu: add lock annotations to rcu{,_bh}_torture_read_{lock,unlock}
rcu_torture_read_lock and rcu_bh_torture_read_lock acquire locks without
releasing them, and the matching functions rcu_torture_read_unlock and
rcu_bh_torture_read_unlock get called with the corresponding locks held and
release them. Add lock annotations to these four functions so that sparse
can check callers for lock pairing, and so that sparse will not complain
about these functions since they intentionally use locks in this manner.
Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Paul McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
grab_super gets called with sb_lock held, and releases it. Add a lock
annotation to this function so that sparse can check callers for lock
pairing, and so that sparse will not complain about this function since it
intentionally uses the lock in this manner.
[PATCH] lockdep: don't pull in includes when lockdep disabled
Do not pull in various includes through lockdep.h if lockdep is disabled.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Cc: Ingo Molnar <mingo@elte.hu> Cc: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] hugetlbfs: add lock annotation to hugetlbfs_forget_inode()
hugetlbfs_forget_inode releases inode_lock. Add a lock annotation to this
function so that sparse can check callers for lock pairing, and so that
sparse will not complain about this functions since it intentionally uses
the lock in this manner.
[PATCH] fuse: add lock annotations to request_end and fuse_read_interrupt
request_end and fuse_read_interrupt release fc->lock. Add lock annotations
to these two functions so that sparse can check callers for lock pairing,
and so that sparse will not complain about these functions since they
intentionally use locks in this manner.
[PATCH] afs: add lock annotations to afs_proc_cell_servers_{start,stop}
afs_proc_cell_servers_start acquires a lock, and afs_proc_cell_servers_stop
releases that lock. Add lock annotations to these two functions so that
sparse can check callers for lock pairing, and so that sparse will not
complain about these functions since they intentionally use locks in this
manner.
Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] efi: add lock annotations for efi_call_phys_prelog and efi_call_phys_epilog
The functions efi_call_phys_prelog and efi_call_phys_epilog in
arch/i386/kernel/efi.c wrap the spinlock efi_rt_lock: efi_call_phys_prelog
returns with the lock held, and efi_call_phys_epilog releases the lock
without acquiring it. Add lock annotations to these two functions so that
sparse can check callers for lock pairing, and so that sparse will not
complain about these functions since they intentionally use locks in this
manner.
Olaf Hering [Fri, 29 Sep 2006 08:59:21 +0000 (01:59 -0700)]
[PATCH] use gcc -O1 in fs/reiserfs only for ancient gcc versions
Only compile with -O1 if the (very old) compiler is broken. We use
reiserfs alot since SLES9 on ppc64, and it was never seen with gcc33.
Assume the broken gcc is gcc-3.4 or older.
Signed-off-by: Olaf Hering <olh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Komal Shah [Fri, 29 Sep 2006 08:59:20 +0000 (01:59 -0700)]
[PATCH] OMAP: Update OMAP1/2 boards to give keymapsize and other pdata
This patch adds keymapsize, delay and debounce flag in the keypad platform
data for various TI OMAP1/2 based boards like F-sample, H2, H3, Innovator,
Nokia770, OSK, Perseus and H4.
Signed-off-by: Komal Shah <komal_shah802003@yahoo.com> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Komal Shah [Fri, 29 Sep 2006 08:59:19 +0000 (01:59 -0700)]
[PATCH] OMAP: Add keypad driver
This patch adds support for keypad driver running on different TI
OMAP(http://www.ti.com/omap) processor based boards like OSK, H2, H3, H4,
Persuas and Nokia 770.
Signed-off-by: Komal Shah <komal_shah802003@yahoo.com> Acked-by: Dmitry Torokhov <dtor@mail.ru> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] spinlock_debug: don't recompute (jiffies_per_loop * HZ) in spinloop
In spinlock_debug.c, the spinloops call __delay() on every iteration.
Because that is an external function, (jiffies_per_loop * HZ), the loop's
iteration limit, gets recomputed every time. Caching it explicitly
prevents that.
Convert loop.c from the deprecated kernel_thread to kthread. This patch
simplifies the code quite a bit and passes similar testing to the previous
submission on both emulated x86 and s390.
Changes since last submission:
switched to using a rather simple loop based on
wait_event_interruptible.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] libfs: remove page up-to-date check from simple_readpage
Remove the unnecessary PageUptodate check from simple_readpage. The only
two callers for ->readpage that don't have explicit PageUptodate check are
read_cache_pages and page_cache_read which operate on newly allocated pages
which don't have the flag set.
[akpm: use the allegedly-faster clear_page(), too] Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
drivers/char/pc8736x_gpio.c:192: warning: #pc8736x_gpio_set_high# defined but not used
drivers/char/pc8736x_gpio.c:197: warning: #pc8736x_gpio_set_low# defined but not used
Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Michal Schmidt [Fri, 29 Sep 2006 08:59:03 +0000 (01:59 -0700)]
[PATCH] Make touch_nmi_watchdog imply touch_softlockup_watchdog on all archs
touch_nmi_watchdog() calls touch_softlockup_watchdog() on both
architectures that implement it (i386 and x86_64). On other architectures
it does nothing at all. touch_nmi_watchdog() should imply
touch_softlockup_watchdog() on all architectures. Suggested by Andi Kleen.
[heiko.carstens@de.ibm.com: s390 fix] Signed-off-by: Michal Schmidt <xschmi00@stud.feec.vutbr.cz> Cc: Andi Kleen <ak@muc.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Michal Schmidt <xschmi00@stud.feec.vutbr.cz> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jason Baron [Fri, 29 Sep 2006 08:58:58 +0000 (01:58 -0700)]
[PATCH] make PROT_WRITE imply PROT_READ
Make PROT_WRITE imply PROT_READ for a number of architectures which don't
support write only in hardware.
While looking at this, I noticed that some architectures which do not
support write only mappings already take the exact same approach. For
example, in arch/alpha/mm/fault.c:
"
if (cause < 0) {
if (!(vma->vm_flags & VM_EXEC))
goto bad_area;
} else if (!cause) {
/* Allow reads even for write-only mappings */
if (!(vma->vm_flags & (VM_READ | VM_WRITE)))
goto bad_area;
} else {
if (!(vma->vm_flags & VM_WRITE))
goto bad_area;
}
"
Thus, this patch brings other architectures which do not support write only
mappings in-line and consistent with the rest. I've verified the patch on
ia64, x86_64 and x86.
Additional discussion:
Several architectures, including x86, can not support write-only mappings.
The pte for x86 reserves a single bit for protection and its two states are
read only or read/write. Thus, write only is not supported in h/w.
Currently, if i 'mmap' a page write-only, the first read attempt on that page
creates a page fault and will SEGV. That check is enforced in
arch/blah/mm/fault.c. However, if i first write that page it will fault in
and the pte will be set to read/write. Thus, any subsequent reads to the page
will succeed. It is this inconsistency in behavior that this patch is
attempting to address. Furthermore, if the page is swapped out, and then
brought back the first read will also cause a SEGV. Thus, any arbitrary read
on a page can potentially result in a SEGV.
According to the SuSv3 spec, "if the application requests only PROT_WRITE, the
implementation may also allow read access." Also as mentioned, some
archtectures, such as alpha, shown above already take the approach that i am
suggesting.
The counter-argument to this raised by Arjan, is that the kernel is enforcing
the write only mapping the best it can given the h/w limitations. This is
true, however Alan Cox, and myself would argue that the inconsitency in
behavior, that is applications can sometimes work/sometimes fails is highly
undesireable. If you read through the thread, i think people, came to an
agreement on the last patch i posted, as nobody has objected to it...
Signed-off-by: Jason Baron <jbaron@redhat.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Andi Kleen <ak@muc.de> Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Ian Molton <spyro@f2s.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In blockdevc-check-errors.patch, add_bd_holder() is modified to return error
values when some of its operation failed. Among them, it returns -EEXIST when
a given bd_holder object already exists in the list.
However, in this case, the function completed its work successfully and need
no action by its caller other than freeing unused bd_holder object. So I
think it's better to return success after freeing by itself.
Otherwise, bd_claim-ing with same claim pointer will fail.
Typically, lvresize will fails with following message:
device-mapper: reload ioctl failed: Invalid argument
and you'll see messages like below in kernel log:
device-mapper: table: 254:13: linear: dm-linear: Device lookup failed
device-mapper: ioctl: error adding target to table
Similarly, it should not add bd_holder to the list if either one of symlinking
fails. I don't have a test case for this to happen but it should cause
dereference of freed pointer.
If a matching bd_holder is found in bd_holder_list, add_bd_holder() completes
its job by just incrementing the reference count. In this case, it should be
considered as success but it used to return 'fail' to let the caller free
temporary bd_holder. Fixed it to return success and free given object by
itself.
Also, if either one of symlinking fails, the bd_holder should not be added to
the list so that it can be discarded later. Otherwise, the caller will free
bd_holder which is in the list.
Jeff Dike [Fri, 29 Sep 2006 08:58:52 +0000 (01:58 -0700)]
[PATCH] uml: remove unneeded file
Remove arch/um/kernel/skas/process_kern.c again. The stack alignment
change which resulted in this file being here is safely in
arch/um/kernel/process.c.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Dike [Fri, 29 Sep 2006 08:58:51 +0000 (01:58 -0700)]
[PATCH] uml: close file descriptor leaks
Close two file descriptor leaks, one in the ubd driver and one to
/proc/mounts. The ubd driver bug also leaked some vmalloc space. The
/proc/mounts leak was a descriptor that was just never closed.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Dike [Fri, 29 Sep 2006 08:58:50 +0000 (01:58 -0700)]
[PATCH] uml: locking documentation
Some locking documentation and a cleanup. uml_exitcode is copied into a local
before sprintf sees it, in case sprintf does anything non-atomic with it.
The rest are comments about why certain globals don't need any kind of
locking.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Dike [Fri, 29 Sep 2006 08:58:50 +0000 (01:58 -0700)]
[PATCH] uml: mechanical tidying after random MACs change
Mechanical, hopefully non-functional changes stemming from
setup_etheraddr always succeeding now that it always assigns a MAC,
either from the command line or generated randomly:
the test of the return of setup_etheraddr is removed, and code
dependent on it succeeding is now unconditional
setup_etheraddr can now be made void
struct uml_net.have_mac is now always 1, so tests of it can be
similarly removed, and uses of it can be replaced with 1
struct uml_net.have_mac is no longer used, so it can be removed
struct uml_net_private.have_mac is copied from struct uml_net, so
it is always 1
tests of uml_net_private.have_mac can be removed
uml_net_private.have_mac can now be removed
the only call to dev_ip_addr was removed, so it can be deleted
It also turns out that setup_etheraddr is called only once, from the same
file, so it can be static and its declaration removed from net_kern.h.
Similarly, set_ether_mac is defined and called only from one file.
Finally, setup_etheraddr and set_ether_mac were moved to avoid needing forward
declarations.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Dike [Fri, 29 Sep 2006 08:58:46 +0000 (01:58 -0700)]
[PATCH] uml: assign random MACs to interfaces if necessary
Assign a random MAC to an ethernet interface if one was not provided on the
command line. This became pressing when distros started bringing interfaces
up before assigning IPs to them. The previous pattern of assigning an IP then
bringing it up allowed the MAC to be generated from the first IP assigned.
However, once the thing is up, it's probably a bad idea to change the MAC, so
the MAC stayed initialized to fe:fd:0:0:0:0.
Now, if there is no MAC from the command line, one is generated. We use the
microseconds from gettimeofday (20 bits), plus the low 12 bits of the pid to
seed the random number generator. random() is called twice, with 16 bits of
each result used. I didn't want to have to try to fill in 32 bits optimally
given an arbitrary RAND_MAX, so I just assume that it is greater than 65536
and use 16 bits of each random() return.
There is also a bit of reformatting and whitespace cleanup here.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
keith mannthey [Fri, 29 Sep 2006 08:58:46 +0000 (01:58 -0700)]
[PATCH] convert i386 Summit subarch to use SRAT info for apicid_to_node calls
Convert the i386 summit subarch apicid_to_node to use node information
provided by the SRAT. It was discussed a little on LKML a few weeks ago
and was seen as an acceptable fix. The current way of obtaining the nodeid
is just not correct for all summit systems/bios. Assuming the apicid
matches the Linux node number require a leap of faith that the bios mapped
out the apicids a set way. Modern summit HW (IBM x460) does not layout its
bios in the manner for various reasons and is unable to boot i386 numa.
The best way to get the correct apicid to node information is from the SRAT
table during boot. It lays out what apicid belongs to what node. I use
this information to create a table for use at run time.
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] SELinux: support mls categories for context mounts
Allows commas to be embedded into context mount options (i.e. "-o
context=some_selinux_context_t"), to better support multiple categories,
which are separated by commas and confuse mount.
For example, with the current code:
mount -t iso9660 /dev/cdrom /media/cdrom -o \
ro,context=system_u:object_r:iso9660_t:s0:c1,c3,c4,exec
The context option that will be interpreted by SELinux is
context=system_u:object_r:iso9660_t:s0:c1
instead of
context=system_u:object_r:iso9660_t:s0:c1,c3,c4
The options that will be passed on to the file system will be
ro,c3,c4,exec.
The proposed solution is to allow/require the SELinux context option
specified to mount to use quotes when the context contains a comma.
This patch modifies the option parsing in parse_opts(), contained in
mount.c, to take options after finding a comma only if it hasn't seen a
quote or if the quotes are matched. It also introduces a new function that
will strip the quotes from the context option prior to translation. The
quotes are replaced after the translation is completed to insure that in
the event the raw context contains commas the kernel will be able to
interpret the correct context.
Signed-off-by: Cory Olmo <colmo@TrustedCS.com> Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Failing context is a multi threaded process context and the failing
sequence is as follows.
One thread T0 doing self modifying code on page X on processor P0 and
another thread T1 doing COW (breaking the COW setup as part of just
happened fork() in another thread T2) on the same page X on processor P1.
T0 doing SMC can endup modifying the new page Y (allocated by the T1 doing
COW on P1) but because of different I/D TLB's, P0 ITLB will not see the new
mapping till the flush TLB IPI from P1 is received. During this interval,
if T0 executes the code created by SMC it can result in an app error (as
ITLB still points to old page X and endup executing the content in page X
rather than using the content in page Y).
Fix this issue by first clearing the PTE and flushing it, before updating
it with new entry.
Hugh sayeth:
I was a bit sceptical, in the habit of thinking that Self Modifying Code
must look such issues itself: but I guess there's nothing it can do to avoid
this one.
Fair enough, what you're changing it to is pretty much what powerpc and
s390 were already doing, and is a more robust way of proceeding, consistent
with how ptes are set everywhere else.
The ptep_clear_flush is a bit heavy-handed (it's anxious to return the pte
that was atomically cleared), but we'd have to wander through lots of arches
to get the right minimal behaviour. It'd also be nice to eliminate
ptep_establish completely, now only used to define other macros/inlines: it
always seemed obfuscation to me, what you've got there now is clearer.
Let's put those cleanups on a TODO list.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: "David S. Miller" <davem@davemloft.net> Acked-by: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] convert s390 page handling macros to functions
Convert s390 page handling macros to functions. In particular this fixes a
problem with s390's SetPageUptodate macro which uses its input parameter
twice which again can cause subtle bugs.
[akpm@osdl.org: build fix] Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Woodhouse [Fri, 29 Sep 2006 08:58:37 +0000 (01:58 -0700)]
[PATCH] Fix uninitialised spinlock in via-pmu-backlight code.
The uninitialised pmu_backlight_lock causes the current Fedora test kernel
(which has spinlock debugging enabled) to panic on suspend.
This is suboptimal, so I fixed it.
Signed-off-by: David Woodhouse <dwmw2@infradead.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Michael Hanselmann <linux-kernel@hansmi.ch> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Matthew Wilcox [Fri, 29 Sep 2006 08:58:36 +0000 (01:58 -0700)]
[PATCH] remove generic__raw_read_trylock()
If the cpu has the lock held for write, is interrupted, and the interrupt
handler calls read_trylock(), it's an instant deadlock.
Now, Dave Miller has subsequently pointed out that we don't have any
situations where this can occur. Nevertheless, we should delete
generic__raw_read_lock (and its associated EXPORT to make Arjan happy) so that
nobody thinks they can use it.
Acked-by: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Al Viro [Fri, 29 Sep 2006 08:58:34 +0000 (01:58 -0700)]
[PATCH] __percpu_alloc_mask() has to be __always_inline in UP case
... or we'll end up with cpu_online_map being evaluated on UP. In
modules. cpumask.h is very careful to avoid that, and for a very good
reason. So should we...
PS: yes, it really triggers (on alpha).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[Bluetooth]: Fix section mismatch of bt_sysfs_cleanup()
The bt_sysfs_cleanup() is marked with __exit attribute, but it will
be called from an __init function in the error case. So the __exit
attribute must be removed.
Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[Bluetooth]: Don't update disconnect timer for incoming connections
In the case of device pairing the only safe method is to establish
a low-level ACL link. In this case, the remote side should not use
the disconnect timer to give the other side the chance to enter the
PIN code. If the disconnect timer is used, the connection will be
dropped to soon, because it is impossible to identify an actual user
of this link.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net>
John Heffner [Thu, 28 Sep 2006 21:47:38 +0000 (14:47 -0700)]
[TCP]: Fix and simplify microsecond rtt sampling
This changes the microsecond RTT sampling so that samples are taken in
the same way that RTT samples are taken for the RTO calculator: on the
last segment acknowledged, and only when the segment hasn't been
retransmitted.
Signed-off-by: John Heffner <jheffner@psc.edu> Acked-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Sesterhenn [Thu, 28 Sep 2006 21:37:07 +0000 (14:37 -0700)]
[SUNRPC]: Remove unnecessary check in net/sunrpc/svcsock.c
coverity spotted this one as possible dereference in the dprintk(),
but since there is only one caller of svc_create_socket(), which always
passes a valid sin, we dont need this check.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Thu, 28 Sep 2006 05:53:24 +0000 (22:53 -0700)]
[IPVS]: Make sure ip_vs_ftp ports are valid: module_param_array approach
I'm not entirely sure what happens in the case of a valid port,
at best it'll be silently ignored. This patch ensures that
the port values are unsigned short values, and thus always valid.
FIFOCTL_RXERR and FIFOCTL_TXERR are undocumented bits, according to the
Sigmatel datasheet. We should thus not take any assumption on their values
and semantics.
Problem spotted by andrzej zaborowski <balrogg@gmail.com> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch detects the smsc-ircc chipset on the nx1000
(including nx7000 and nx7010) and the nx5000 HP/Compaq laptop series.
Patch from "Linus Walleij (LD/EAB)" <linus.walleij@ericsson.com> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[IrDA] nsc-ircc: Configuration base address for PC87383
According to NatSemi datasheet, the configuration base address for the PC8738x
family is 0x2e or 0x164. 0x0 doesn't appear in any datasheet.
Patch from Lamarque Vieira Souza <lamarque@gmail.com> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[NET]: Move netlink interface bits to linux/if_link.h.
Moving netlink interface bits to linux/if.h is rather troublesome for
applications including both linux/if.h (which was changed to be included
from linux/rtnetlink.h automatically) and net/if.h.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[XFRM]: Do not add a state whose SPI is zero to the SPI hash.
SPI=0 is used for acquired IPsec SA and MIPv6 RO state.
Such state should not be added to the SPI hash
because we do not care about it on deleting path.
The loopback device status structure is a singleton and doesn't
need to be allocated. Add ethtool_ops hooks to show checksum always on,
and make ethtool_ops const.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Samuel Ortiz [Thu, 28 Sep 2006 03:05:38 +0000 (20:05 -0700)]
[IrDA]: af_irda.c cleanups
We lock the socket when both releasing and getting a disconnected
notification. In the latter case, we also ste the socket as orphan.
This fixes a potential kernel bug that can be triggered when we get the
disconnection notification before closing the socket.
Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Thu, 28 Sep 2006 02:03:36 +0000 (19:03 -0700)]
[IPV6]: Disable SG for GSO unless we have checksum
Because the system won't turn off the SG flag for us we
need to do this manually on the IPv6 path. Otherwise we
will throw IPv6 packets with bad checksums at the hardware.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>