gcc-4 warns with
include/linux/cpuset.h:21: warning: type qualifiers ignored on function
return type
cpuset_cpus_allowed is declared with const
extern const cpumask_t cpuset_cpus_allowed(const struct task_struct *p);
First const should be __attribute__((const)), but the gcc manual
explains that:
"Note that a function that has pointer arguments and examines the data
pointed to must not be declared const. Likewise, a function that calls a
non-const function usually must not be const. It does not make sense for
a const function to return void."
The following patch remove const from the function declaration.
Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org> Acked-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] pci enumeration on ixp2000: overflow in kernel/resource.c
IXP2000 (ARM-based) platforms use a separate 'struct resource' for PCI MEM
space. Resource allocation for PCI BARs always fails because the 'root'
resource (the IXP2000 PCI MEM resource) always has the entire address space
(00000000-ffffffff) free, and find_resource() calculates the size of that
range as ffffffff-00000000+1=0, so all allocations fail because it thinks
there is no space.
(akpm: pls. double-check)
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
James Bottomley [Sat, 16 Apr 2005 22:25:54 +0000 (15:25 -0700)]
[PATCH] add Big Endian variants of ioread/iowrite
In the new io infrastructure, all of our operators are expecting the
underlying device to be little endian (because the PCI bus, their main
consumer, is LE).
However, there are a fair few devices and busses in the world that are
actually Big Endian. There's even evidence that some of these BE bus and
chip types are attached to LE systems. Thus, there's a need for a BE
equivalent of our io{read,write}{16,32} operations.
The attached patch adds this as io{read,write}{16,32}be. When it's in,
I'll add the first consume (the 53c700 SCSI chip driver).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Daniel McNeil [Sat, 16 Apr 2005 22:25:50 +0000 (15:25 -0700)]
[PATCH] Direct IO async short read fix
The direct I/O code is mapping the read request to the file system block. If
the file size was not on a block boundary, the result would show the the read
reading past EOF. This was only happening for the AIO case. The non-AIO case
truncates the result to match file size (in direct_io_worker). This patch
does the same thing for the AIO case, it truncates the result to match the
file size if the read reads past EOF.
When I/O completes the result can be truncated to match the file size
without using i_size_read(), thus the aio result now matches the number of
bytes read to the end of file.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
These have been deprecated since ->compat_ioctl when in, thus only a short
deprecation period. There's four users left: i2o_config, s390/z90crypy,
s390/dasd and s390/zfcp and for the first two patches are about to be
submitted to get rid of it.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] quota: possible bug in quota format v2 support
Don't put root block of quota tree to the free list (when quota file is
completely empty). That should not actually happen anyway (somebody should
get accounted for the filesystem root and so quota file should never be
empty) but better prevent it here than solve magical quota file
corruption.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Bernard Blackham [Sat, 16 Apr 2005 22:25:45 +0000 (15:25 -0700)]
[PATCH] ext2 corruption - regression between 2.6.9 and 2.6.10
Whilst trying to stress test a Promise SX8 card, we stumbled across
some nasty filesystem corruption in ext2. Our tests involved
creating an ext2 partition, mounting, running several concurrent
fsx's over it, umounting, and fsck'ing, all scripted[1]. The fsck
would always return with errors.
This regression was traced back to a change between 2.6.9 and
2.6.10, which moves the functionality of ext2_put_inode into
ext2_clear_inode. The attached patch reverses this change, and
eliminated the source of corruption.
Mingming Cao <cmm@us.ibm.com> said:
I think his patch for ext2 is correct. The corruption on ext3 is not the same
issue he saw on ext2. I believe that's the race between discard reservation
and reservation in-use that we already fixed it in 2.6.12- rc1.
For the problem related to ext2, at the time when we design reservation for
ext3, we decide we only need to discard the reservation at the last file
close, so we have ext3_discard_reservation on iput_final- >ext3_clear_inode.
The ext2 handle discard preallocation differently at that time, it discard the
preallocation at each iput(), not in input_final(), so we think it's
unnecessary to thrash it so frequently, and the right thing to do, as we did
for ext3 reservation, discard preallocation on last iput(). So we moved the
ext2_discard_preallocation from ext2_put_inode(0 to ext2_clear_inode.
Since ext2 preallocation is doing pre-allocation on disk, so it is possible
that at the unmount time, someone is still hold the reference of the inode, so
the preallocation for a file is not discard yet, so we still mark those blocks
allocated on disk, while they are not actually in the inode's block map, so
fsck will catch/fix that error later.
This is not a issue for ext3, as ext3 reservation(pre-allocation) is done in
memory.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
- plug various leaks and use after frees in the remove and
initialization failure path (some still left)
- remove useless global list of boards and use pci_set_drvdata instead
- unobsfucate init path by merging functions together
- kill various totally useless state variables
- .. probably more I forgot
Note that the tty part still generates lots of sparse warnings and there's
still a totally useless layer of function pointer indirections, but maybe
someone else will fix that bit up.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ken Chen [Sat, 16 Apr 2005 22:25:43 +0000 (15:25 -0700)]
[PATCH] use cheaper elv_queue_empty when unplug a device
In function __generic_unplug_device(), kernel can use a cheaper function
elv_queue_empty() instead of more expensive elv_next_request to find
whether the queue is empty or not. blk_run_queue can also made conditional
on whether queue's emptiness before calling request_fn().
Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fix 3 calls to module_param_string() in
driver/media/video/tuner-core.c and drivers/media/video/tda9887.c. In all
three places, the len and the perm parameter was switched.
[PATCH] kernel/param.c: don't use .max when .num is NULL in param_array_set()
there seems to be a bug, at least for me, in kernel/param.c for arrays with
.num == NULL. If .num == NULL, the function param_array_set() uses &.max
for the call to param_array(), wich alters the .max value to the number of
arguments. The result is, you can't set more array arguments as the last
time you set the parameter.
example:
# a module 'example' with
# static int array[10] = { 0, };
# module_param_array(array, int, NULL, 0644);
$ insmod example.ko array=1,2,3
$ cat /sys/module/example/parameters/array
1,2,3
$ echo "4,3,2,1" > /sys/module/example/parameters/array
$ dmesg | tail -n 1
kernel: array: can take only 3 arguments
A question on sigwaitinfo based IO mechanism in multithreaded applications.
I am trying to use RT signals to notify me of IO events using RT signals
instead of SIGIO in a multithreaded applications. I noticed that there was
some discussion on lkml during november 1999 with the subject of the
discussion as "Signal driven IO". In the thread I noticed that RT signals
were being delivered to the worker thread. I am running 2.6.10 kernel and
I am trying to use the very same mechanism and I find that only SIGIO being
propogated to the worker threads and RT signals only being propogated to
the main thread and not the worker threads where I actually want them to be
propogated too. On further inspection I found that the following patch
which I have attached solves the problem.
I am not sure if this is a bug or feature in the kernel.
Roland McGrath <roland@redhat.com> said:
This relates only to fcntl F_SETSIG, which is a Linux extension. So there is
no POSIX issue. When changing various things like the normal SIGIO signalling
to do group signals, I was concerned strictly with the POSIX semantics and
generally avoided touching things in the domain of Linux inventions. That's
why I didn't change this when I changed the call right next to it. There is
no reason I can see that F_SETSIG-requested signals shouldn't use a group
signal like normal SIGIO does. I'm happy to ACK this patch, there is nothing
wrong with its change to the semantics in my book. But neither POSIX nor I
care a whit what F_SETSIG does.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There is a possibility that a bio will be accessed after it has been freed
on SCSI. It happens if you submit a bio with BIO_SYNC marked and the
auto-unplugging kicks the request_fn, SCSI re-enables interrupts in-between
so if the request completes between the add_request() in __make_request()
and the bio_sync() call, we could be looking at a dead bio. It's a slim
race, but it has been triggered in the Real World.
Pavel Machek [Sat, 16 Apr 2005 22:25:34 +0000 (15:25 -0700)]
[PATCH] u32 vs. pm_message_t in ppc and radeon
This fixes pm_message_t vs. u32 confusion in ppc and aty (I *hope* that's
basically radeon code...). I was not able to test most of these, but I'm
not really changing anything, so it should be okay.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pavel Machek [Sat, 16 Apr 2005 22:25:32 +0000 (15:25 -0700)]
[PATCH] fix u32 vs. pm_message_t in drivers/macintosh
I thought I'm done with fixing u32 vs. pm_message_t ... unfortunately that
turned out not to be the case as Russel King pointed out. Here are fixes for
drivers/macintosh.
Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pavel Machek [Sat, 16 Apr 2005 22:25:30 +0000 (15:25 -0700)]
[PATCH] fix pm_message_t vs. u32 in alsa
I thought I'm done with fixing u32 vs. pm_message_t ... unfortunately that
turned out not to be the case as Russel King pointed out. This fixes last few
bits in alsa.
Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pavel Machek [Sat, 16 Apr 2005 22:25:24 +0000 (15:25 -0700)]
[PATCH] pm_message_t: more fixes in common and i386
I thought I'm done with fixing u32 vs. pm_message_t ... unfortunately
that turned out not to be the case as Russel King pointed out. Here are
fixes for Documentation and common code (mainly system devices).
Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] x86, x86_64: dual core proc-cpuinfo and sibling-map fix
- broken sibling_map setup in x86_64
- grouping all the core and HT related cpuinfo fields.
We are reasonably sure that adding new cpuinfo fields after "siblings" field,
will not cause any app failure. Thats because today's /proc/cpuinfo
format is completely different on x86, x86_64 and we haven't heard of any
x86 app breakage because of this issue. Grouping these fields will
result in more or less common format on all architectures (ia64, x86 and
x86_64) and will cause less confusion.
[PATCH] x86_64: Switch SMP bootup over to new CPU hotplug state machine
This will allow hotplug CPU in the future and in general cleans up a lot of
crufty code. It also should plug some races that the old hackish way
introduces. Remove one old race workaround in NMI watchdog setup that is not
needed anymore.
I removed the old total sum of bogomips reporting code. The brag value of
BogoMips has been greatly devalued in the last years on the open market.
Real CPU hotplug will need some more work, but the infrastructure for it is
there now.
One drawback: the new TSC sync algorithm is less accurate than before. The
old way of zeroing TSCs is too intrusive to do later. Instead the TSC of the
BP is duplicated now, which is less accurate.
akpm:
- sync_tsc_bp_init seems to have the sense of `init' inverted.
- SPIN_LOCK_UNLOCKED is deprecated - use DEFINE_SPINLOCK.
[PATCH] x86_64: Rename the extended cpuid level field
It was confusingly named.
Signed-off-by: Andi Kleen <ak@suse.de>
DESC
x86_64: Switch SMP bootup over to new CPU hotplug state machine
EDESC
From: "Andi Kleen" <ak@suse.de>
This will allow hotplug CPU in the future and in general cleans up a lot of
crufty code. It also should plug some races that the old hackish way
introduces. Remove one old race workaround in NMI watchdog setup that is not
needed anymore.
I removed the old total sum of bogomips reporting code. The brag value of
BogoMips has been greatly devalued in the last years on the open market.
Real CPU hotplug will need some more work, but the infrastructure for it is
there now.
One drawback: the new TSC sync algorithm is less accurate than before. The
old way of zeroing TSCs is too intrusive to do later. Instead the TSC of the
BP is duplicated now, which is less accurate.
Exceptions and hardware interrupts can, to a certain degree, nest, so when
attempting to follow the sequence of stacks used in order to dump their
contents this has to be accounted for. Also, IST stacks have their tops
stored in the TSS, so there's no need to add the stack size to get to their
ends.
Minor changes from AK.
Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clean up the code greatly. Now uses the infrastructure from the Intel dual
core patch Should fix a final bug noticed by Tyan of not detecting the nodes
correctly in some corner cases.
Patch for x86-64 and i386
Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] x86_64: add support for Intel dual-core detection and displaying
Appended patch adds the support for Intel dual-core detection and displaying
the core related information in /proc/cpuinfo.
It adds two new fields "core id" and "cpu cores" to x86 /proc/cpuinfo and the
"core id" field for x86_64("cpu cores" field is already present in x86_64).
Number of processor cores in a die is detected using cpuid(4) and this is
documented in IA-32 Intel Architecture Software Developer's Manual (vol 2a)
(http://developer.intel.com/design/pentium4/manuals/index_new.htm#sdm_vol2a)
This patch also adds cpu_core_map similar to cpu_sibling_map.
[PATCH] x86_64: Keep only a single debug notifier chain
Calling a notifier three times in the debug handler does not make much sense,
because a debugger can figure out the various conditions by itself. Remove
the additional calls to DIE_DEBUG and DIE_DEBUGSTEP completely.
This matches what i386 does now.
This also makes sure interrupts are always still disabled when calling a
debugger, which prevents:
BUG: using smp_processor_id() in preemptible [00000001] code: tpopf/1470
caller is post_kprobe_handler+0x9/0x70
Could lead to a lost reschedule event when the process already rescheduled on
exception exit, and needs it again while still being in the kernel. Unlikely
case though.
Also remove one redundant cli in another entry.S path.
Jason Davis [Sat, 16 Apr 2005 22:24:53 +0000 (15:24 -0700)]
[PATCH] x86_64 genapic update
x86_64 genapic mechanism should be aware of machines that use physical APIC
mode regardless of how many clusters/processors are detected.
ACPI 3.0 FADT makes this determination very simple by providing a feature
flag "force_apic_physical_destination_mode" to state whether the machine
unconditionally uses physical APIC mode.
Unisys' next generation x86_64 ES7000 will need to utilize this FADT
feature flag in order to boot the x86_64 kernel in the correct APIC mode.
This patch has been tested on both x86_64 commodity and ES7000 boxes.
Signed-off-by: Jason Davis <jason.davis@unisys.com> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] x86-64/i386: Revert cpuinfo siblings behaviour back to 2.6.10
Only display physical id/siblings when there are siblings or dual core.
In 2.6.11 I accidentially broke it and it was always displaying these
fields But for compatibility to all these /proc parsers around it is better
to do it in the old way again.
Roland McGrath [Sat, 16 Apr 2005 22:24:48 +0000 (15:24 -0700)]
[PATCH] i386 vDSO: add PT_NOTE segment
This patch adds an ELF note to the vDSO giving the LINUX_VERSION_CODE
value. Having this in the vDSO lets the dynamic linker avoid the `uname'
syscall it now always does at startup to ascertain the kernel ABI
available.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Roland McGrath [Sat, 16 Apr 2005 22:24:46 +0000 (15:24 -0700)]
[PATCH] i386: Use loaddebug macro consistently
This moves the macro loaddebug from asm-i386/suspend.h to
asm-i386/processor.h, which is the place that makes sense for it to be
defined, removes the extra copy of the same macro in
arch/i386/kernel/process.c, and makes arch/i386/kernel/signal.c use the
macro in place of its expansion.
This is a purely cosmetic cleanup for the normal i386 kernel. However, it
is handy for Xen to be able to just redefine the loaddebug macro once
instead of also changing the signal.c code.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Olof Johansson [Sat, 16 Apr 2005 22:24:38 +0000 (15:24 -0700)]
[PATCH] ppc64: no prefetch for NULL pointers
For prefetches of NULL (as when walking a short linked list), PPC64 will in
some cases take a performance hit. The hardware needs to do the TLB walk,
and said walk will always miss, which means (up to) two L2 misses as
penalty. This seems to hurt overall performance, so for NULL pointers skip
the prefetch alltogether.
Signed-off-by: Olof Johansson <olof@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The code that parses the OF device tree contains an old bogus hack which
was killed a long time ago on ppc32, but survived in ppc64. It was
supposed to help with a problem on the f50 which is ... a 32 bits machine
:) Additionally, that hack is causing problems, so let's just get rid of
it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] ppc64: Detect altivec via firmware on unknown CPUs
This patch adds detection of the Altivec capability of the CPU via the
firmware in addition to the cpu table. This allows newer CPUs that aren't
in the table to still have working altivec support in the kernel.
It also fixes a problem where if a CPU isn't recognized as having altivec
features, and takes an altivec unavailable exception due to userland
issuing altivec instructions, the kernel would happily enable it and
context switch the registers ... but not all of them (it would basically
forget vrsave). With this patch, the kernel will refuse to enable altivec
when the feature isn't detected for the CPU (SIGILL).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch reworks the way the ppc64 is mapped in user memory by the kernel
to make it more robust against possible collisions with executable
segments. Instead of just whacking a VMA at 1Mb, I now use
get_unmapped_area() with a hint, and I moved the mapping of the vDSO to
after the mapping of the various ELF segments and of the interpreter, so
that conflicts get caught properly (it still has to be before
create_elf_tables since the later will fill the AT_SYSINFO_EHDR with the
proper address).
While I was at it, I also changed the 32 and 64 bits vDSO's to link at
their "natural" address of 1Mb instead of 0. This is the address where
they are normally mapped in absence of conflict. By doing so, it should be
possible to properly prelink one it's been verified to work on glibc.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paul Mackerras [Sat, 16 Apr 2005 22:24:34 +0000 (15:24 -0700)]
[PATCH] ppc64: fix export of wrong symbol
In arch/ppc64/kernel/ppc_ksyms.c, we are still exporting
flush_icache_range, but that has been changed to be an inline in
include/asm-ppc64/cacheflush.h which calls __flush_icache_range (defined in
arch/ppc64/kernel/misc.S).
This patch changes the export to __flush_icache_range, thus allowing
modules to use the inline flush_icache_range.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes ppc64 __ioremap() so that it stops adding implicitely
_PAGE_GUARDED when the cache is not writeback, and instead, let the callers
provide the flag they want here. This allows things like framebuffers to
explicitely request a non-cacheable and non-guarded mapping which is more
efficient for that type of memory without side effects. The patch also
fixes all current callers to add _PAGE_GUARDED except btext, which is fine
without it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>