Milton Miller [Wed, 24 Oct 2007 16:23:48 +0000 (18:23 +0200)]
sched: fix sched_domain sysctl registration again
commit 029190c515f15f512ac85de8fc686d4dbd0ae731 (cpuset
sched_load_balance flag) was not tested SCHED_DEBUG enabled as
committed as it dereferences NULL when used and it reordered
the sysctl registration to cause it to never show any domains
or their tunables.
Fixes:
1) restore arch_init_sched_domains ordering
we can't walk the domains before we build them
presently we register cpus with empty directories (no domain
directories or files).
2) make unregister_sched_domain_sysctl do nothing when already unregistered
detach_destroy_domains is now called one set of cpus at a time
unregister_syctl dereferences NULL if called with a null.
While the the function would always dereference null if called
twice, in the previous code it was always called once and then
was followed a register. So only the hidden bug of the
sysctl_root_table not being allocated followed by an attempt to
free it would have shown the error.
3) always call unregister and register in partition_sched_domains
The code is "smart" about unregistering only needed domains.
Since we aren't guaranteed any calls to unregister, always
unregister. Without calling register on the way out we
will not have a table or any sysctl tree.
4) warn if register is called without unregistering
The previous table memory is lost, leaving pointers to the
later freed memory in sysctl and leaking the memory of the
tables.
Before this patch on a 2-core 4-thread box compiled for SMT and NUMA,
the domains appear empty (there are actually 3 levels per cpu). And as
soon as two domains a null pointer is dereferenced (unreliable in this
case is stack garbage):
bu19a:~# ls -R /proc/sys/kernel/sched_domain/
/proc/sys/kernel/sched_domain/:
cpu0 cpu1 cpu2 cpu3
Linus Torvalds [Wed, 24 Oct 2007 03:50:57 +0000 (20:50 -0700)]
Linux 2.6.24-rc1
The patch is big. Really big. You just won't believe how vastly hugely
mindbogglingly big it is. I mean you may think it's a long way down the
road to the chemist, but that's just peanuts to how big the patch from
2.6.23 is.
Greg Ungerer [Wed, 24 Oct 2007 02:03:25 +0000 (12:03 +1000)]
m68knommu: cleanup 68360 startup code
Clean up 68360 timer support code. Removed header includes not needed.
Remove use of old m68knommu timer function pointers. Use common function
naming for 68328 timer functions.
Linus Torvalds [Wed, 24 Oct 2007 01:57:39 +0000 (18:57 -0700)]
Merge branch 'irq-upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6
* 'irq-upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6:
[SPARC, XEN, NET/CXGB3] use irq_handler_t where appropriate
drivers/char/riscom8: clean up irq handling
isdn/sc: irq handler clean
isdn/act2000: fix major bug. clean irq handler.
char/pcmcia/synclink_cs: trim trailing whitespace
drivers/char/ip2: separate polling and irq-driven work entry points
drivers/char/ip2: split out irq core logic into separate function
[NETDRVR] lib82596, netxen: delete pointless tests from irq handler
Eliminate pointless casts from void* in a few driver irq handlers.
[PARPORT] Remove unused 'irq' argument from parport irq functions
[PARPORT] Kill useful 'irq' arg from parport_{generic_irq,ieee1284_interrupt}
[PARPORT] Consolidate code copies into a single generic irq handler
Linus Torvalds [Wed, 24 Oct 2007 01:56:54 +0000 (18:56 -0700)]
Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (39 commits)
Remove Andrew Morton from list of net driver maintainers.
bonding: Acquire correct locks in alb for promisc change
bonding: Convert more locks to _bh, acquire rtnl, for new locking
bonding: Convert locks to _bh, rework alb locking for new locking
bonding: Convert miimon to new locking
bonding: Convert balance-rr transmit to new locking
Convert bonding timers to workqueues
Update MAINTAINERS to reflect my (jgarzik's) current efforts.
pasemi_mac: fix typo
defxx.c: dfx_bus_init() is __devexit not __devinit
s390 MAINTAINERS
remove header_ops bug in qeth driver
sky2: crash on remove
MIPSnet: Delete all the useless debugging printks.
AR7 ethernet: small post-merge cleanups and fixes
mv643xx_eth: Hook up mv643xx_get_sset_count
mv643xx_eth: Remove obsolete checksum offload comment
mv643xx_eth: Merge drivers/net/mv643xx_eth.h into mv643xx_eth.c
mv643xx_eth: Remove unused register defines
mv643xx_eth: Clean up mv643xx_eth.h
...
Jay Vosburgh [Thu, 18 Oct 2007 00:37:50 +0000 (17:37 -0700)]
bonding: Convert more locks to _bh, acquire rtnl, for new locking
Convert more lock acquisitions to _bh flavor to avoid deadlock
with workqueue activity and add acquisition of RTNL in appropriate places.
Affects ALB mode, as well as core bonding functions and sysfs.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Thu, 18 Oct 2007 00:37:49 +0000 (17:37 -0700)]
bonding: Convert locks to _bh, rework alb locking for new locking
Convert locking-related activity to new & improved system.
Convert some lock acquisitions to _bh and rework parts of ALB mode, both
to avoid deadlocks with workqueue activity.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Thu, 18 Oct 2007 00:37:48 +0000 (17:37 -0700)]
bonding: Convert miimon to new locking
Convert mii (link state) monitor to acquire correct locks for
failover events. In particular, failovers generally require RTNL at a low
level (when manipulating device MAC addresses, for example) and no other
locks. The high level monitor is responsible for acquiring a known set
of locks, RTNL, the bond->lock for read and the slave_lock for write, and
the low level failover processing can then release appropriate locks as
needed. This patch provides the high level portion.
As it is undesirable to acquire RTNL for every monitor pass (which
may occur as often as every 10 ms), the miimon has been converted to
do conditional locking. A first pass inspects all slaves to determine
if any action is required, and if so, a second pass (after acquring RTNL)
is done to perform any actions (doing a complete rescan, as the situation
may have changed when all locks were released).
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Thu, 18 Oct 2007 00:37:47 +0000 (17:37 -0700)]
bonding: Convert balance-rr transmit to new locking
Change locking in balance-rr transmit processing to use a free
running counter to determine which slave to transmit on. Instead, a
free-running counter is maintained, and modulo arithmetic used to select
a slave for transmit.
This removes lock operations from the TX path, and eliminates
a deadlock introduced by the conversion to work queues.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Thu, 18 Oct 2007 00:37:45 +0000 (17:37 -0700)]
Convert bonding timers to workqueues
Convert bonding timers to workqueues. This converts the various
monitor functions to run in periodic work queues instead of timers. This
patch introduces the framework and convers the calls, but does not resolve
various locking issues, and does not stand alone.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Olof Johansson [Sat, 20 Oct 2007 19:10:03 +0000 (14:10 -0500)]
pasemi_mac: fix typo
Add missing &:
drivers/net/pasemi_mac.c: In function 'pasemi_mac_clean_rx':
drivers/net/pasemi_mac.c:553: warning: passing argument 1 of 'prefetch'
makes pointer from integer without a cast
Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Ursula Braun [Mon, 22 Oct 2007 14:16:14 +0000 (16:16 +0200)]
remove header_ops bug in qeth driver
Remove qeth bug caused by commit:
[NET]: Move hardware header operations out of netdevice.
This is the second part of the qeth header_ops patch, since
first patch sent 10/19 has been insufficient.
Nevertheless first patch is still valid and should be kept.
Signed-off-by: Ursula Braun <braunu@de.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Neil Brown [Tue, 23 Oct 2007 21:09:13 +0000 (17:09 -0400)]
NFS: Fix for bug in handling of errors for O_DIRECT writes
Commit eda3cef8dd2b83875affe82595db9d0c278879b2 ("NFS: Fix error
handling in nfs_direct_write_result()") ensured that if a WRITE returns
an error, then data->res.verf->committed is not tested (as it is not
initialised).
So move the test so that we never examine ->committed in an error case,
and fix a speeling error while we are there.
Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 23 Oct 2007 23:32:11 +0000 (16:32 -0700)]
Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 4630/1: Fix the vector stride of the double vector instruction.
[ARM] 4629/1: Fix VFP emulation code to clear all exception flags of FPEXC
[ARM] 4613/1: pxa300: MFP typo fix
Carlos Corbacho [Fri, 19 Oct 2007 17:51:27 +0000 (18:51 +0100)]
x86: Force enable HPET for CK804 (nForce 4) chipsets
This patch adds a quirk from LinuxBIOS to force enable HPET on
the nVidia CK804 (nForce 4) chipset.
This quirk can very likely support more than just nForce 4
(LinuxBIOS use the same code for nForce 5), and possibly nForce 3,
but I don't have those chipsets, so cannot add and test them.
H. Peter Anvin [Tue, 23 Oct 2007 20:37:25 +0000 (22:37 +0200)]
x86: clean up setup.h and the boot code
Make <asm/setup.h> usable by the boot code.
Clean up vestiges of the old command-line protocol from setup.h and
head_32.S (it is still supported from the boot loader point of
view, since it is converted to the new command-line protocol by the
boot code.)
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Dave Johnson [Tue, 23 Oct 2007 20:37:22 +0000 (22:37 +0200)]
x86: fix more TSC clock source calibration errors
The previous patch wasn't correctly handling the 'count' variable. If
a CPU gave bad results on the 1st or 2nd run but good results on the
3rd, it wouldn't do the correct thing. No idea if any such CPU
exists, but the patch below handles that case by discarding the bad
runs.
If a bad result (too quick, or too slow) occurs on any of the 3 runs
it will be discarded.
Also updated some comments to explain what's going on.
Signed-off-by: Dave Johnson <djohnson@sw.starentnetworks.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Dave Johnson [Tue, 23 Oct 2007 20:37:22 +0000 (22:37 +0200)]
x86: fix TSC clock source calibration error
I ran into this problem on a system that was unable to obtain NTP sync
because the clock was running very slow (over 10000ppm slow). ntpd had
declared all of its peers 'reject' with 'peer_dist' reason.
On investigation, the tsc_khz variable was significantly incorrect
causing xtime to run slow. After a reboot tsc_khz was correct so I
did a reboot test to see how often the problem occurred:
Test was done on a 2000 Mhz Xeon system. Of 689 reboots, 8 of them
had unacceptable tsc_khz values (>500ppm):
It appears that on rare occasions, mach_countup() is taking longer to
complete than necessary.
I suspect that this is caused by the CPU taking a periodic SMI
interrupt right at the end of the 30ms calibration loop. This would
cause the loop to delay while the SMI BIOS hander runs. The resulting
TSC value is beyond what it actually should be resulting in a higher
tsc_khz.
The below patch makes native_calculate_cpu_khz() take the best
(shortest duration, lowest khz) run of it's 3 calibration loops. If a
SMI goes off causing a bad result (long duration, higher khz) it will
be discarded.
With the patch applied, 300 boots of the same system produce good
results:
Problem was found and tested against 2.6.18. Patch is against 2.6.22.
Signed-off-by: Dave Johnson <djohnson@sw.starentnetworks.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Nick Piggin [Fri, 19 Oct 2007 05:13:02 +0000 (07:13 +0200)]
x86: lock bitops
I missed an obvious one!
x86 CPUs are defined not to reorder stores past earlier loads, so there is
no hardware memory barrier required to implement a release-consistent store
(all stores are, by definition).
So ditch the generic lock bitops, and implement optimised versions for x86,
which removes the mfence from __clear_bit_unlock (which is already a useful
primitive for SLUB).
Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
WANG Cong [Sat, 6 Oct 2007 03:17:13 +0000 (11:17 +0800)]
[WATCHDOG] Documentation/watchdog/src/watchdog-simple.c: improve this code
Make some improvements for Documentation/watchdog/src/watchdog-simple.c.
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[WATCHDOG] Linux kernel IPC SBC Watchdog Timer driver
ICP's Wafer 5823 SBC has, as far as I can tell, the same WDT as many,
if not all ICP's SBC's (that do have a WDT). I have tested it with
several boards, including Rocky 4783, Rocky 3703 and Rocky 3782.
I propose a rename of the Wafer 5823 watchdog timer driver
to something like "IPC (SBC) Watchdog Timer", to reflect that it
works with other IPC boards (maybe even all of them).
Signed-off-by: Veljkovic Srdjan <sveljko@gvs.co.yu> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Andrew Vasquez [Fri, 19 Oct 2007 22:59:19 +0000 (15:59 -0700)]
[SCSI] qla2xxx: Correct display of ISP serial-number.
The original serial-number calculations based on WWPN no longer
apply to newer ISPs (ISP24xx and ISP25xx). These newer board's
serial number reside in the VPD.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>