Siddha, Suresh B [Sun, 11 Nov 2007 19:27:59 +0000 (11:27 -0800)]
x86: fix taking DNA during 64bit sigreturn
restore sigcontext is taking a DNA (device-not-available) exception while restoring FP context
from the user stack, during the sigreturn. Appended patch fixes it by
doing clts() if the app doesn't touch FP during the signal handler
execution. This will stop generating a DNA, during the fxrstor in the
sigreturn.
This improves 64-bit lat_sig numbers by ~30% on my core2 platform.
Roland McGrath [Mon, 12 Nov 2007 03:13:43 +0000 (19:13 -0800)]
core dump: remain dumpable
The coredump code always calls set_dumpable(0) when it starts (even
if RLIMIT_CORE prevents any core from being dumped). The effect of
this (via task_dumpable) is to make /proc/pid/* files owned by root
instead of the user, so the user can no longer examine his own
process--in a case where there was never any privileged data to
protect. This affects e.g. auxv, environ, fd; in Fedora (execshield)
kernels, also maps. In practice, you can only notice this when a
debugger has requested PTRACE_EVENT_EXIT tracing.
set_dumpable was only used in do_coredump for synchronization and not
intended for any security purpose. (It doesn't secure anything that wasn't
already unsecured when a process dies by SIGTERM instead of SIGQUIT.)
This changes do_coredump to check the core_waiters count as the means of
synchronization, which is sufficient. Now we leave the "dumpable" bits alone.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jerome Pinot [Sat, 10 Nov 2007 18:01:10 +0000 (03:01 +0900)]
ACPI: add documentation for deprecated /proc/acpi/battery in ACPI_PROCFS
Add documentation in Kconfig help about the move of /proc/acpi/battery
to /sys/class/power_supply when selecting ACPI_PROCFS. This will impact
a lot of users and should be documented.
Linus Torvalds [Sat, 10 Nov 2007 22:26:04 +0000 (14:26 -0800)]
Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: Don't fail device revalidation for bad _GTF methods
libata: port and host should be stopped before hardware resources are released
libata: skip 0xff polling for PATA controllers
libata: pata_platform: Support polling-mode configuration.
libata: Support PIO polling-only hosts.
libata sata_qstor conversion to new error handling (EH).
libata sata_qstor workaround for spurious interrupts
libata sata_qstor nuke idle state
nv_hardreset: update dangling reference to bugzilla entry
ata_piix: add SATELLITE PRO U200 to broken suspend list
Francois Romieu [Thu, 8 Nov 2007 22:23:21 +0000 (23:23 +0100)]
r8169: prevent bit sign expansion error in mdio_write
Oops.
The current code does not like being given a u16 with the highest
bit set as an argument to mdio_write. Let's enforce a correct range of
values for both the register address and value (resp. 5 and 16 bits).
The callers are currently left as-is.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Edward Hsu <edward_hsu@realtek.com.tw>
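The range enforcement described in the r8169 commit above can be sketched as follows. This is a standalone illustration, not the driver code: the helper names and the word layout (write flag in bit 31, register address in bits 16-20, value in bits 0-15) are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the 32-bit word for an MDIO write.
 * Assumed layout (illustration only): bit 31 = write flag,
 * bits 16-20 = register address, bits 0-15 = value. */
static uint32_t mdio_word_masked(int reg_addr, int value)
{
    /* Mask both arguments so stray high bits, for example from a
     * sign-extended value, cannot leak into the other fields. */
    return 0x80000000u
         | ((uint32_t)(reg_addr & 0x1f) << 16)
         | ((uint32_t)value & 0xffffu);
}

/* The same word built without masking, to show what goes wrong. */
static uint32_t mdio_word_unmasked(int reg_addr, int value)
{
    return 0x80000000u | ((uint32_t)reg_addr << 16) | (uint32_t)value;
}
```

With a negative value such as -32768 (a 16-bit quantity with its top bit set, after sign extension), the unmasked variant overwrites the register address field with ones, while the masked variant keeps the address intact.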
Ron Mercer [Wed, 7 Nov 2007 21:59:06 +0000 (13:59 -0800)]
qla3xxx: bugfix: Move link state machine into a worker thread
The link state machine requires access to some resources that
are shared with the iSCSI function on the chip. (See iSCSI
driver at drivers/scsi/qla4xxx) If the interface is being
up/downed at a rapid pace this driver may need to sleep
waiting to get access to the common resources. For this we
are moving the state machine to run as a work thread.
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Tue, 6 Nov 2007 21:33:29 +0000 (13:33 -0800)]
bonding: don't validate address at device open
The standard validate_addr handler refuses to accept the all zeroes address
as valid. However, it's common historical practice for the bonding
master to be configured up prior to having any slaves, at which time the
master will have a MAC address of all zeroes.
Resolved by setting the dev->validate_addr to NULL. The master still can't
end up with an invalid address, as the set_mac_address function tests
for validity.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Tue, 6 Nov 2007 21:33:28 +0000 (13:33 -0800)]
bonding: fix rtnl locking merge error
Looks like I incorrectly merged one of the rtnl lock changes,
so that one function, bonding_show_active_slave, held rtnl but didn't
release it, and another, bonding_store_active_slave, never held rtnl but
did release it.
Fixed so the first function doesn't mess with rtnl, and the
second correctly acquires and releases rtnl.
Bug reported by Moni Shoua <monis@voltaire.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Stefano Brivio [Wed, 7 Nov 2007 17:33:37 +0000 (18:33 +0100)]
b43legacy: fix shared IRQ race condition
Fix an IRQ race condition in b43legacy. If we call
b43legacy_wireless_core_stop(), it will set the status of the device to
INITIALIZED and the IRQ handler won't care any longer about IRQs, thus the
kernel will disable the IRQ if it's shared (unless we boot it with the
'irqpoll' option). So we must disable IRQs before changing the device
status.
Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Stefano Brivio [Wed, 7 Nov 2007 17:16:11 +0000 (18:16 +0100)]
b43: fix shared IRQ race condition
Fix an IRQ race condition in b43. If we call b43_stop_wireless_core(), it
will set the status of the device to INITIALIZED and the IRQ handler won't
care any longer about IRQs, thus the kernel will disable the IRQ if it's
shared (unless we boot it with the 'irqpoll' option). So we must disable
IRQs before changing the device status.
Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Stefano Brivio [Tue, 6 Nov 2007 21:48:56 +0000 (22:48 +0100)]
b43legacy: add me as maintainer and fix URLs
As b43legacy is going to be orphaned, add me as a maintainer. Fix URLs for
the related website and fix my e-mail address in MAINTAINERS file.
Signed-off-by: Stefano Brivio <stefano.brivio@polimi.it> Cc: Larry Finger <larry.finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Michael Buesch [Sat, 3 Nov 2007 13:34:32 +0000 (14:34 +0100)]
b43: Rewrite and fix rfkill init
The rfkill subsystem doesn't like code like this:
rfkill_allocate();
rfkill_register();
rfkill_unregister();
rfkill_register(); /* <- This will crash */
This sequence happens with
modprobe b43
ifconfig wlanX up
ifconfig wlanX down
ifconfig wlanX up
Fix this by always re-allocating the rfkill stuff before register.
Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
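The lifecycle problem in the b43 rfkill commit above can be modelled with a toy object that may be registered only once. Everything here (struct toy_switch and the toy_* functions) is hypothetical and stands in for the real rfkill calls; the kernel crash is represented by an error return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a switch object; this is not the rfkill API. */
struct toy_switch {
    bool registered;
    bool spent;   /* set on unregister: the object must not be reused */
};

static struct toy_switch *toy_allocate(void)
{
    return calloc(1, sizeof(struct toy_switch));
}

/* Returns 0 on success, -1 where the real code would misbehave. */
static int toy_register(struct toy_switch *sw)
{
    if (sw == NULL || sw->spent || sw->registered)
        return -1;
    sw->registered = true;
    return 0;
}

static void toy_unregister(struct toy_switch *sw)
{
    sw->registered = false;
    sw->spent = true;
}

/* Interface "up": always start from a freshly allocated object. */
static int toy_up(struct toy_switch **slot)
{
    free(*slot);
    *slot = toy_allocate();
    return toy_register(*slot);
}

/* Interface "down": unregister; toy_up() replaces the object later. */
static void toy_down(struct toy_switch **slot)
{
    if (*slot)
        toy_unregister(*slot);
}
```

Calling toy_up(), toy_down(), toy_up() succeeds because each "up" allocates a new object, whereas re-registering the same object after unregistering it fails.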
Roel Kluin [Mon, 5 Nov 2007 22:55:02 +0000 (23:55 +0100)]
ipw2100: fix postfix decrement errors
If i reaches zero, the loop ends, but the postfix decrement still takes it to -1.
A later test for 'i == 0' in the function therefore never fires.
Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Signed-off-by: John W. Linville <linville@tuxdriver.com>
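The off-by-one described in the ipw2100 commit above is easy to reproduce in isolation. The sketch below is not the driver code; the function names and the callback are made up for the example, and the fixed variant is only one possible way to leave the counter at zero on timeout.

```c
#include <assert.h>
#include <stdbool.h>

static bool never_ready(void)  { return false; }
static bool always_ready(void) { return true; }

/* Buggy shape: when the poll never succeeds, the final evaluation of
 * `i--` sees 0, ends the loop, and still decrements, leaving i at -1. */
static int poll_postfix(bool (*ready)(void), int tries)
{
    int i = tries;

    while (i--) {
        if (ready())
            break;
    }
    return i;   /* -1 on timeout, so a later `i == 0` check misses it */
}

/* One possible fix: decrement in the loop header only after the test,
 * so i is exactly 0 when every attempt has failed. */
static int poll_fixed(bool (*ready)(void), int tries)
{
    int i;

    for (i = tries; i > 0; i--) {
        if (ready())
            break;
    }
    return i;   /* 0 on timeout */
}
```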
Ivo van Doorn [Sun, 28 Oct 2007 13:39:52 +0000 (14:39 +0100)]
rt2x00: Block adhoc & master mode
rt2x00 is broken when it comes down to adhoc and master mode.
The main problem is the beaconing, which is completely failing.
Until a solution has been found, both beacon requiring modes
must be disabled to prevent numerous bug reports.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Holger Schurig [Tue, 9 Oct 2007 08:41:57 +0000 (10:41 +0200)]
libertas: fixes for slow hardware
Fixes for slow hardware.
Signed-off-by: Vitaly V. Bursov <vitalyvb@ukr.net> Signed-off-by: Holger Schurig <hs4233@mail.mn-solutions.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
The Intel device supported by the hermes driver core is the IPW2011. The
"Intel PRO/Wireless" wording suggests the later Centrino devices and may
be confusing to some users.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Various symptoms depending on the .config options:
- the card stops working after some (short) time
- the card does not work at all
- the card disappears (nothing in lspci/dmesg)
A real power-off is needed to recover the card.
Signed-off-by: Mark Lord <mlord@pobox.com> Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Francois Romieu [Tue, 6 Nov 2007 21:56:10 +0000 (22:56 +0100)]
r8169: do not enable the TBI for the 8168 and the 81x0
The 8168c and the 8100e choke on it. I have not seen an indication
nor received a report that the TBI is being actively used on the
remaining 8168b and 8110. Let's disable it for now until someone
complains.
eric miao [Tue, 30 Oct 2007 01:48:41 +0000 (09:48 +0800)]
add support for smc91x ethernet interface on zylonite
This patch adds LAN91C111 ethernet interface support for zylonite
(a.k.a Marvell's PXA3xx Development Platform) with smc91x driver.
It would be better if a patch would support zylonite along with all
other PXA boards with a single binary of smc91x driver, but it looks
quite difficult for the moment, so ugly #ifdef is still used here.
Signed-off-by: Aleksey Makarov <amakarov@ru.mvista.com> Acked-by: eric miao <eric.miao@marvell.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
The PCI AER support may not work for a couple of reasons.
It may not be configured into the kernel or there may be a BIOS
bug that prevents MMCONFIG from working. If MMCONFIG doesn't work
then the PCI registers that control AER will not be accessible via
pci_read_config functions; luckily there is another window to access
PCI space in the device, so use that.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
The D-Link PCI-X board (and maybe others) can lie about status
ring entries. It seems it will update the register for last status
index before completing the DMA for the ring entry. To avoid reading
stale data, zap the old entry and check.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
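The "zap the old entry and check" approach from the status ring commit above might look roughly like this. The ring size, the ownership flag and all names are assumptions for illustration; the real driver structures differ.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8      /* hypothetical ring length */
#define OWNER_BIT 0x80   /* hypothetical "written by hardware" flag */

struct status_entry {
    uint8_t  opcode;
    uint32_t data;
};

/* Process entries from *tail up to hw_index; returns how many were
 * handled. Stops early if the index has advanced past an entry that
 * the hardware has not finished writing yet. */
static int drain_status(struct status_entry *ring, unsigned int *tail,
                        unsigned int hw_index)
{
    int handled = 0;

    while (*tail != hw_index) {
        struct status_entry *e = &ring[*tail];

        if (!(e->opcode & OWNER_BIT))
            break;      /* stale (zapped) entry: try again later */

        /* ... act on e->data here ... */

        e->opcode = 0;  /* zap, so reuse of this slot is detectable */
        *tail = (*tail + 1) % RING_SIZE;
        handled++;
    }
    return handled;
}
```

If the index register claims three entries but only two have landed in memory, the loop handles two and leaves the tail pointing at the third until its ownership flag appears.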
Chuck Ebbert [Wed, 7 Nov 2007 15:48:39 +0000 (10:48 -0500)]
x86 - 32-bit ptrace emulation mishandles 6th arg
[ jdike - Pushing Chuck's patch - see
http://lkml.org/lkml/2005/9/16/261 for some history and a test
program. UML is also broken without this patch - its processes get
SIGBUS from the corrupt 6th argument to mmap being interpreted as a
file offset ]
When the 32-bit vDSO is used to make a system call, the %ebp register for
the 6th syscall arg has to be loaded from the user stack (where it's pushed
by the vDSO user code). The native i386 kernel always does this before
stopping for syscall tracing, so %ebp can be seen and modified via ptrace
to access the 6th syscall argument. The x86-64 kernel fails to do this,
presenting the stack address to ptrace instead. This makes the %rbp value
seen by 64-bit ptrace of a 32-bit process, and the %ebp value seen by a
32-bit caller of ptrace, both differ from the native i386 behavior.
This patch fixes the problem by putting the word loaded from the user stack
into %rbp before calling syscall_trace_enter, and reloading the 6th syscall
argument from there afterwards (so ptrace can change it). This makes the
behavior match that of i386 kernels.
Original-Patch-By: Roland McGrath <roland@redhat.com> Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Jeff Dike <jdike@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Tue, 6 Nov 2007 23:30:38 +0000 (15:30 -0800)]
x86_64: ia32 ptrace THREAD_AREA fix
The addr argument to PTRACE_GET_THREAD_AREA and PTRACE_SET_THREAD_AREA is
not a magic constant. It's derived from the segment register values being
used, which are computed originally from the index used with set_thread_area.
The value does not need to match what a native i386 kernel would accept.
It needs to match the segment selectors that can actually be in use in this
32-bit process. The 64-bit ptrace support for PTRACE_GET_THREAD_AREA
(normally used only on 32-bit processes) is correct, but the 32-bit emulation
of ptrace is broken.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Randy Dunlap [Sat, 10 Nov 2007 03:30:36 +0000 (04:30 +0100)]
voyager: use struct instead of PARAM
Use struct boot_params instead of PARAM + 0xoffsets.
Fixes one of many Voyager build problems.
arch/x86/kernel/setup_32.c:543: error: 'PARAM' undeclared (first use in this function)
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
David Miller [Wed, 7 Nov 2007 05:13:56 +0000 (21:13 -0800)]
[FUTEX] Fix address computation in compat code.
compat_exit_robust_list() computes a pointer to the
futex entry in userspace as follows:
(void __user *)entry + futex_offset
'entry' is a 'struct robust_list __user *', and
'futex_offset' is a 'compat_long_t' (typically a 's32').
Things explode if the 32-bit sign bit is set in futex_offset.
Type promotion sign extends futex_offset to a 64-bit value before
adding it to 'entry'.
This triggered a problem on sparc64 running 32-bit applications which
would lock up a cpu looping forever in the fault handling for the
userspace load in handle_futex_death().
Compat userspace runs with address masking (wherein the cpu zeros out
the top 32-bits of every effective address given to a memory operation
instruction) so the sparc64 fault handler accounts for this by
zero'ing out the top 32-bits of the fault address too.
Since the kernel properly uses the compat_uptr interfaces, kernel side
accesses to compat userspace work too since they will only use
addresses with the top 32-bit clear.
Because of this compat futex layer bug we get into the following loop
when executing the get_user() load near the top of handle_futex_death():
1) load from address '0xfffffffff7f16bd8', FAULT
2) fault handler clears upper 32-bits, processes fault
for address '0xf7f16bd8' which succeeds
3) goto #1
I want to thank Bernd Zeimetz, Josip Rodin, and Fabio Massimo Di Nitto
for their tireless efforts helping me track down this bug.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
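The sign-extension effect described in the futex commit above can be shown with plain integers standing in for user pointers. The function names are invented for the example, and the second function is only one way to keep the sum inside the 32-bit compat address space, not necessarily how the patch does it.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy shape: the signed 32-bit offset is sign-extended to 64 bits
 * before the addition, so an offset with bit 31 set produces an
 * address whose upper 32 bits are all ones. */
static uint64_t futex_addr_sign_extended(uint64_t entry, int32_t futex_offset)
{
    return entry + (uint64_t)(int64_t)futex_offset;
}

/* Sketch of a fix: do the arithmetic in 32 bits, then widen. */
static uint64_t futex_addr_compat(uint64_t entry, int32_t futex_offset)
{
    uint32_t base = (uint32_t)entry;

    return (uint64_t)(uint32_t)(base + (uint32_t)futex_offset);
}
```

With entry = 0x1000 and futex_offset = -135177256 (bit pattern 0xf7f15bd8), the first function yields 0xfffffffff7f16bd8, the faulting address quoted in the commit message, while the second yields 0xf7f16bd8.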
Linus Torvalds [Fri, 9 Nov 2007 23:28:11 +0000 (15:28 -0800)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] IOSAPIC bogus error cleanup
[IA64] Update printing of feature set bits
[IA64] Fix IOSAPIC delivery mode setting
[IA64] XPC heartbeat timer function must run on CPU 0
[IA64] Clean up /proc/interrupts output
[IA64] Disable/re-enable CPE interrupts on Altix
[IA64] Clean-up McKinley Errata message
[IA64] Add gate.lds to list of files ignored by Git
[IA64] Fix section mismatch in contig.c version of per_cpu_init()
[IA64] Wrong args to memset in efi_gettimeofday()
[IA64] Remove duplicate includes from ia32priv.h
[IA64] fix number of bytes zeroed by sys_fw_init() in arch/ia64/hp/sim/boot/fw-emu.c
[IA64] Fix perfmon sysctl directory modes
* master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6: (26 commits)
sh: remove dead config symbols from SH code
sh: Kill off broken snapgear ds1302 code.
sh: Add a dummy vga.h.
rtc: rtc-sh: Zero out tm value for invalid rtc states.
rtc: sh-rtc: Handle rtc_device_register() failure properly.
sh: Fix heartbeart on Solution Engine series
sh: Remove SCI_NPORTS from sh-sci.h
sh: Fix up PAGE_KERNEL_PCC() for nommu.
sh: hs7751rvoip: Kill off dead IPR IRQ mappings.
sh: hs7751rvoip: irq.c needs linux/interrupt.h.
sh: Kill off __{copy,clear}_user_page().
sh: Optimized copy_{to,from}_user_page() for SH-4.
sh: Wire up clear_user_highpage().
sh: Kill off the remaining ST40 cruft.
superhyway: Handle device_register() retval properly.
sh: kgdb sysrq depends on magic sysrq.
sh: Add -Werror for clean directories.
sh: Fix up kgdb build with modular sh-sci.
sh: Export __{s,u}divsi3_i4i on all CPUs.
sh: Fix up kgdb-on-NMI branch target.
...
Linus Torvalds [Fri, 9 Nov 2007 23:16:52 +0000 (15:16 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (37 commits)
[POWERPC] EEH: Make sure warning message is printed
[POWERPC] Make altivec code in swsusp_32.S depend on CONFIG_ALTIVEC
[POWERPC] windfarm: Fix windfarm thread freezer interaction
[POWERPC] Fix si_addr value on low level hash failures
[POWERPC] Refresh ppc64_defconfig and enable pasemi-related options
[POWERPC] pasemi: Update defconfig
[POWERPC] iSeries: Fix ref counting in vio setup
[POWERPC] ] Fix memset size error
[POWERPC] Fix link errors for allyesconfig
[POWERPC] iSeries_init_IRQ non-PCI tidy
[POWERPC] Change fallocate to match unistd.h on powerpc
[POWERPC] EEH: Avoid crash on null device
[POWERPC] EEH: Drivers that need reset trump others
[POWERPC] EEH: Clean up comments
[POWERPC] Fix off-by-one error in setting decrementer on Book E/4xx (v2)
[POWERPC] Fix switch_slb handling of 1T ESID values
[POWERPC] Fix build failure when CONFIG_VIRT_CPU_ACCOUNTING is not defined
[POWERPC] Include udbg.h when using udbg_printf
[POWERPC] Fix cache line vs. block size confusion
[POWERPC] Fix sysctl table check failure on PowerMac
...
Alan Cox [Wed, 7 Nov 2007 16:53:00 +0000 (16:53 +0000)]
frv: Remove bogus NO_IRQ = -1 define
The old NO_IRQ define some platforms had was long ago declared obsolete
and wrong. FRV should therefore not be re-introducing this, especially as
IRQs are usually unsigned in the kernel. The "no IRQ" case is defined to be
zero and Linus made this rather clear at the time.
arch/frv shows no dependency on this, but it might expose drivers that
need fixing, I guess.
Signed-off-by: Alan Cox <alan@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
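Why a -1 sentinel sits badly with unsigned IRQ numbers, as the frv commit above argues, can be seen in a few lines. The macro and function names below are illustrative only.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define LEGACY_NO_IRQ (-1)   /* the obsolete sentinel, kept for contrast */

/* Convention described in the commit: IRQ number 0 means "no IRQ". */
static bool irq_is_valid(unsigned int irq)
{
    return irq != 0;
}

/* What an unsigned IRQ field holds after being assigned -1. */
static unsigned int store_legacy_no_irq(void)
{
    unsigned int irq = LEGACY_NO_IRQ;   /* converts to UINT_MAX */

    return irq;
}
```

Code that tests an IRQ with a plain `if (irq)` would treat the stored legacy sentinel as a real interrupt line, which is the kind of driver breakage a mixed convention invites.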
Linus Torvalds [Fri, 9 Nov 2007 23:08:37 +0000 (15:08 -0800)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64]: Use "is_power_of_2" macro for simplicity.
[SPARC]: Remove duplicate includes.
Linus Torvalds [Fri, 9 Nov 2007 23:02:43 +0000 (15:02 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6:
SELinux: add more validity checks on policy load
SELinux: fix bug in new ebitmap code.
SELinux: suppress a warning for 64k pages.
Peter Zijlstra [Fri, 9 Nov 2007 21:39:39 +0000 (22:39 +0100)]
sched: avoid large irq-latencies in smp-balancing
SMP balancing is done with IRQs disabled and can iterate the full rq.
When rqs are large this can cause large irq-latencies. Limit the nr of
iterations on each run.
This fixes a scheduling latency regression reported by the -rt folks.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
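The idea of bounding the work done per balancing run, as in the commit above, can be sketched with two queue lengths and an iteration cap. This is a simplified model with invented names; the real balancer moves tasks between runqueues and uses its own limit.

```c
#include <assert.h>
#include <stddef.h>

/* Move at most `limit` items from the longer queue to the shorter one
 * in a single pass, so the time spent in the (notionally IRQ-disabled)
 * section stays bounded no matter how long the source queue is. */
static size_t balance_pass(int *src_len, int *dst_len, size_t limit)
{
    size_t moved = 0;

    while (*src_len > *dst_len + 1 && moved < limit) {
        (*src_len)--;
        (*dst_len)++;
        moved++;
    }
    return moved;
}
```

A heavily imbalanced pair of queues is then evened out over several short passes instead of one long one.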
sched: fix copy_namespace() <-> sched_fork() dependency in do_fork
Sukadev Bhattiprolu reported a kernel crash with control groups.
There are a couple of problems discovered by Suka's test:
- The test requires the cgroup filesystem to be mounted with
at least the cpu and ns options (i.e. both namespace and cpu
controllers are active in the same hierarchy).
# mkdir /dev/cpuctl
# mount -t cgroup -ocpu,ns none cpuctl
(or simply)
# mount -t cgroup none cpuctl -> Will activate all controllers
in same hierarchy.
- The test invokes clone() with CLONE_NEWNS set. This causes a new child
to be created, along with a new group (do_fork->copy_namespaces->ns_cgroup_clone->
cgroup_clone), and the child is attached to the new group (cgroup_clone->
attach_task->sched_move_task). At this point in time, the child's scheduler
related fields are uninitialized (including its on_rq field, which it has
inherited from the parent). As a result, sched_move_task thinks it's on
the runqueue when it isn't.
As a solution to this problem, I moved sched_fork() call, which
initializes scheduler related fields on a new task, before
copy_namespaces(). I am not sure though whether moving up will
cause other side-effects. Do you see any issue?
- The second problem exposed by this test is that task_new_fair()
assumes that parent and child will be part of the same group (which
needn't be as this test shows). As a result, cfs_rq->curr can be NULL
for the child.
The solution is to test for curr pointer being NULL in
task_new_fair().
With the patch below, I could run ns_exec() fine w/o a crash.
Ingo Molnar [Fri, 9 Nov 2007 21:39:39 +0000 (22:39 +0100)]
sched: turn off PREEMPT_RESTRICT
PREEMPT_RESTRICT was a method aimed at reducing the amount of wakeup
related preemption. It has a disadvantage though: it can prevent
legitimate wakeups if a task is 'unlucky' enough to be hit too early by a tick
that clears peer_preempt.
Now that the wakeup preemption has been cleaned up we don't seem to have
excessive preemptions anymore, so this feature can be turned off. (and
removed in the next patch)
Ingo Molnar [Fri, 9 Nov 2007 21:39:38 +0000 (22:39 +0100)]
KVM: fix !SMP build error
fix a !SMP build error:
drivers/kvm/kvm_main.c: In function 'kvm_flush_remote_tlbs':
drivers/kvm/kvm_main.c:220: error: implicit declaration of function 'smp_call_function_mask'
(and also avoid unused function warning related to up_smp_call_function()
not making use of the 'func' parameter.)
Ingo Molnar [Fri, 9 Nov 2007 21:39:38 +0000 (22:39 +0100)]
sched: reintroduce SMP tunings again
Yanmin Zhang reported an aim7 regression and bisected it down to:
| commit 38ad464d410dadceda1563f36bdb0be7fe4c8938
| Author: Ingo Molnar <mingo@elte.hu>
| Date: Mon Oct 15 17:00:02 2007 +0200
|
| sched: uniform tunings
|
| use the same defaults on both UP and SMP.
fix this by reintroducing similar SMP tunings again. This resolves
the regression.
(also update the comments to match the ilog2(nr_cpus) tuning effect)
Paul Mackerras [Fri, 9 Nov 2007 21:39:38 +0000 (22:39 +0100)]
sched: restore deterministic CPU accounting on powerpc
Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the
deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been
broken on powerpc, because we end up counting user time twice: once in
timer_interrupt() and once in update_process_times().
This fixes the problem by pulling the code in update_process_times
that updates utime and stime into a separate function called
account_process_tick. If CONFIG_VIRT_CPU_ACCOUNTING is not defined,
there is a version of account_process_tick in kernel/timer.c that
simply accounts a whole tick to either utime or stime as before. If
CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to
implement account_process_tick.
This also lets us simplify the s390 code a bit; it means that the s390
timer interrupt can now call update_process_times even when
CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a
suitable account_process_tick().
account_process_tick() now takes the task_struct * as an argument.
Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
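The split described in the powerpc accounting commit above, where the tick handler delegates to a single accounting function, might be modelled like this. The toy_* names and the reduced task structure are assumptions; only the generic whole-tick behaviour is shown, not an architecture-specific version.

```c
#include <assert.h>

/* Toy task holding only the two counters that matter here. */
struct toy_task {
    unsigned long utime;   /* ticks charged to user mode */
    unsigned long stime;   /* ticks charged to kernel mode */
};

/* Generic flavour: charge one whole tick to user or system time. */
static void toy_account_process_tick(struct toy_task *p, int user_tick)
{
    if (user_tick)
        p->utime += 1;
    else
        p->stime += 1;
}

/* Tick handler: delegates accounting to the one function above rather
 * than also updating utime/stime itself, which is how the same time
 * can end up being counted twice. */
static void toy_update_process_times(struct toy_task *p, int user_tick)
{
    toy_account_process_tick(p, user_tick);
    /* ... timers, scheduler tick and so on would follow here ... */
}
```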
Balbir Singh [Fri, 9 Nov 2007 21:39:37 +0000 (22:39 +0100)]
sched: fix delay accounting regression
Fix the delay accounting regression introduced by commit 75d4ef16a6aa84f708188bada182315f80aab6fa. rq no longer has sched_info
data associated with it. task_struct sched_info structure is used by delay
accounting to provide back statistics to user space.
also remove direct use of sched_clock() (which is not a valid thing to
do anymore) and use rq->clock instead.
Peter Zijlstra [Fri, 9 Nov 2007 21:39:37 +0000 (22:39 +0100)]
sched: reintroduce the sched_min_granularity tunable
we lost the sched_min_granularity tunable to a clever optimization
that uses the sched_latency/min_granularity ratio - but the ratio
is quite unintuitive to users and can also crash the kernel if the
ratio is set to 0. So reintroduce the min_granularity tunable,
while keeping the ratio maintained internally.
no functionality changed.
[ mingo@elte.hu: some fixlets. ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
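A sketch of why deriving the ratio internally is safer than exposing it, as the commit above describes: the divisor is computed from two positive tunables and so cannot be set to 0 directly. The function names and the period formula are assumptions for illustration, not the scheduler's exact code.

```c
#include <assert.h>
#include <stdint.h>

/* Number of tasks that fit into one latency period, rounded up.
 * Both inputs must be positive, so the result is at least 1. */
static unsigned int derive_nr_latency(uint64_t latency_ns, uint64_t min_gran_ns)
{
    assert(latency_ns > 0 && min_gran_ns > 0);
    return (unsigned int)((latency_ns + min_gran_ns - 1) / min_gran_ns);
}

/* Stretch the period once more tasks are runnable than fit in it.
 * The division is safe because the derived ratio is never 0. */
static uint64_t sched_period_ns(uint64_t latency_ns, uint64_t min_gran_ns,
                                unsigned int nr_running)
{
    unsigned int nr_latency = derive_nr_latency(latency_ns, min_gran_ns);
    uint64_t period = latency_ns;

    if (nr_running > nr_latency)
        period = period * nr_running / nr_latency;
    return period;
}
```

For example, a 20 ms latency with a 4 ms minimum granularity gives a ratio of 5, and ten runnable tasks stretch the period to 40 ms.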
Russ Anderson [Tue, 16 Oct 2007 22:02:38 +0000 (17:02 -0500)]
[IA64] Update printing of feature set bits
Newer Itanium versions have added additional processor feature set
bits. This patch prints all the implemented feature set bits. Some
bit descriptions have not been made public. For those bits, a generic
"Feature set X bit Y" message is printed. Bits that are not implemented
will no longer be printed.
Signed-off-by: Russ Anderson <rja@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Kenji Kaneshige [Wed, 7 Nov 2007 06:38:30 +0000 (15:38 +0900)]
[IA64] Fix IOSAPIC delivery mode setting
Fix the problem that redirect hit bit in I/O SAPIC RTE is set even
when it must be disabled (e.g. nointroute boot option is set, CPU
hotplug is enabled or percpu vector is enabled).
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Dean Nelson [Wed, 7 Nov 2007 13:53:06 +0000 (07:53 -0600)]
[IA64] XPC heartbeat timer function must run on CPU 0
Currently, XPC's heartbeat timer function runs on whatever CPU modprobe/insmod
ran on when XPC was started. To keep the heartbeat from being delayed for
long periods, the timer function must run on CPU 0.
N.B. Altix doesn't currently allow cpu0 to be taken offline, so this is
safe for now. This code must be revised when offline of cpu0 is enabled.
Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Jens Axboe [Fri, 9 Nov 2007 11:52:45 +0000 (12:52 +0100)]
block: fix requeue handling in blk_queue_invalidate_tags()
Credit goes to juergen.kadidlo@exasol.com for diagnosing this issue
and supplying the initial patch.
blk_queue_invalidate_tags() must use the proper requeueing paths instead
of open coding the re-add of the request, otherwise we bug out in rq
accounting. Just switch to using blk_requeue_request(), which takes care
of end-tag handling as well and also adds the blktrace REQUEUE notify
event that is also appropriate here.
Russell King [Thu, 8 Nov 2007 23:35:46 +0000 (23:35 +0000)]
[ARM] pxa: fix one-shot timer mode
One-shot timer mode on PXA has various bugs which prevent kernels
built with NO_HZ enabled from booting. They end up spinning on a
permanently asserted timer interrupt because we don't properly
clear it down - clearing the OIER bit does not stop the pending
interrupt status. Fix this in the set_mode handler as well.
Moreover, the code which sets the next expiry point may race with
the hardware, and we might not set the match register sufficiently
in the future. If we encounter that situation, return -ETIME so
the generic time code retries.
Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
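The race handling described in the PXA commit above, where the expiry may already be too close once the match register has been written, can be sketched against a simulated counter. The structure, the MIN_DELTA margin and the function names are assumptions; real hardware registers are replaced by plain fields.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#ifndef ETIME
#define ETIME 62   /* fallback for platforms whose errno.h lacks ETIME */
#endif

#define MIN_DELTA 8   /* hypothetical safety margin, in timer ticks */

/* Simulated free-running counter and match register. */
struct fake_timer {
    uint32_t counter;   /* current count */
    uint32_t match;     /* programmed expiry */
    uint32_t step;      /* ticks that elapse on every counter read */
};

static uint32_t read_counter(struct fake_timer *t)
{
    uint32_t now = t->counter;

    t->counter += t->step;
    return now;
}

static int set_next_event(struct fake_timer *t, uint32_t delta)
{
    uint32_t next = read_counter(t) + delta;
    uint32_t now;

    t->match = next;
    now = read_counter(t);   /* re-read: the counter kept running */

    /* The signed difference copes with counter wraparound. */
    return (int32_t)(next - now) <= MIN_DELTA ? -ETIME : 0;
}
```

A caller that receives -ETIME is expected to retry with a later expiry, which is what the commit says the generic time code does.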
Jan Rinze [Thu, 8 Nov 2007 20:51:05 +0000 (21:51 +0100)]
[ARM] 4645/1: Cyberpro: Trivial fix to restore 16bpp mode.
Cyberpro: when user requests 16bpp, use it and not 24bpp.
There was a missing break causing requests for 16bpp mode
to end up in 24bpp mode.
Signed-off-by: Jan Rinze Peterzon <janrinze@home.nl> Acked-by: Ralph Siemsen <ralphs@netwinder.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
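The missing-break bug from the Cyberpro commit above has this general shape. The functions are a reduced illustration with invented names, not the framebuffer driver's code.

```c
#include <assert.h>

/* Buggy shape: the 16bpp case lacks a break and falls into 24bpp. */
static int pick_depth_buggy(int requested_bpp)
{
    int depth = 8;

    switch (requested_bpp) {
    case 16:
        depth = 16;
        /* fall through (this is the bug) */
    case 24:
        depth = 24;
        break;
    default:
        break;
    }
    return depth;
}

/* Fixed shape: each case ends with its own break. */
static int pick_depth_fixed(int requested_bpp)
{
    int depth = 8;

    switch (requested_bpp) {
    case 16:
        depth = 16;
        break;
    case 24:
        depth = 24;
        break;
    default:
        break;
    }
    return depth;
}
```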
Tejun Heo [Thu, 8 Nov 2007 04:09:00 +0000 (13:09 +0900)]
libata: port and host should be stopped before hardware resources are released
Port / host stop calls used to be made from ata_host_release() which
is called after all hardware resources acquired after host allocation
are released. This is wrong as port and host stop routines often
access the hardware.
Add separate devres for port / host stop which is invoked right after
IRQ is released but with all other hardware resources intact. The
devres is added iff ->host_stop and/or ->port_stop exist.
This problem has been spotted by Mark Lord.
Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Mark Lord <liml@rtr.ca> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tejun Heo [Thu, 8 Nov 2007 02:20:18 +0000 (11:20 +0900)]
libata: skip 0xff polling for PATA controllers
In a presentation of true workmanship, pata_ali asserts IRQ
permanently if the TF status register is read more than once when
there's no device attached to the port.
Skip the wait that polls for !0xff if it's PATA. It's needed only for
some rare SATA devices anyway.
This problem is reported by Luca Tettamanti in bugzilla bug 9298.
Paul Mundt [Thu, 8 Nov 2007 02:15:21 +0000 (11:15 +0900)]
libata: pata_platform: Support polling-mode configuration.
Some SH boards (old R2D-1 boards) have generally not had working CF
under libata, due to both buswidth issues (handled by Aoi Shinkai
in 43f4b8c7578b928892b6f01d374346ae14e5eb70), and buggy interrupt
controllers. For these sorts of boards simply disabling the IRQ and
polling ends up working fine.
This conditionalizes the IRQ resource for pata_platform and lets
platforms that want to use polling mode simply omit the resource
entirely.
Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Paul Mundt [Thu, 8 Nov 2007 02:14:56 +0000 (11:14 +0900)]
libata: Support PIO polling-only hosts.
By default ata_host_activate() expects a valid IRQ in order to
successfully register the host. This patch enables a special case
for registering polling-only hosts that either don't have IRQs
or have buggy IRQ generation (either in terms of handling or
sensing), which otherwise work fine.
Hosts that want to use polling mode can simply set ATA_FLAG_PIO_POLLING
and pass in an invalid IRQ.
Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>