Yinghai Lu [Fri, 19 Dec 2008 23:23:44 +0000 (15:23 -0800)]
x86: fix lguest used_vectors breakage, -v2
Impact: fix lguest, clean up
32-bit lguest used used_vectors to record vectors, but that model of
allocating vectors changed and got broken, after we changed vector
allocation to a per_cpu array.
Try enable that for 64bit, and the array is used for all vectors that
are not managed by vector_irq per_cpu array.
Also kill system_vectors[], that is now a duplication of the
used_vectors bitmap.
[ merged in cpus4096 due to io_apic.c cpumask changes. ]
[ -v2, fix build failure ]
Ingo Molnar [Thu, 18 Dec 2008 23:59:09 +0000 (00:59 +0100)]
x86: fix warning in arch/x86/kernel/io_apic.c
this warning:
arch/x86/kernel/io_apic.c: In function ‘ir_set_msi_irq_affinity’:
arch/x86/kernel/io_apic.c:3373: warning: ‘cfg’ may be used uninitialized in this function
triggers because the variable was truly uninitialized. We'd crash on
entering this code.
sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0
Impact: change task balancing to save power more agressively
Add SD_BALANCE_NEWIDLE flag at MC level and CPU level
if sched_mc is set. This helps power savings and
will not affect performance when sched_mc=0
Ingo and Mike Galbraith have optimised the SD flags by
removing SD_BALANCE_NEWIDLE at MC and CPU level. This
helps performance but hurts power savings since this
slows down task consolidation by reducing the number
of times load_balance is run.
This patch selectively enables SD_BALANCE_NEWIDLE flag
only when sched_mc is set to 1 or 2. This helps power savings
by task consolidation and also does not hurt performance at
sched_mc=0 where all power saving optimisations are turned off.
sched: activate active load balancing in new idle cpus
Impact: tweak task balancing to save power more agressively
Active load balancing is a process by which migration thread
is woken up on the target CPU in order to pull current
running task on another package into this newly idle
package.
This method is already in use with normal load_balance(),
this patch introduces this method to new idle cpus when
sched_mc is set to POWERSAVINGS_BALANCE_WAKEUP.
This logic provides effective consolidation of short running
daemon jobs in a almost idle system
The side effect of this patch may be ping-ponging of tasks
if the system is moderately utilised. May need to adjust the
iterations before triggering.
sched: bias task wakeups to preferred semi-idle packages
Impact: tweak task wakeup to save power more agressively
Preferred wakeup cpu (from a semi idle package) has been
nominated in find_busiest_group() in the previous patch. Use
this information in sched_mc_preferred_wakeup_cpu in function
wake_idle() to bias task wakeups if the following conditions
are satisfied:
- The present cpu that is trying to wakeup the process is
idle and waking the target process on this cpu will
potentially wakeup a completely idle package
- The previous cpu on which the target process ran is
also idle and hence selecting the previous cpu may
wakeup a semi idle cpu package
- The task being woken up is allowed to run in the
nominated cpu (cpu affinity and restrictions)
Basically if both the current cpu and the previous cpu on
which the task ran is idle, select the nominated cpu from semi
idle cpu package for running the new task that is waking up.
Cache hotness is considered since the actual biasing happens
in wake_idle() only if the application is cache cold.
This technique will effectively move short running bursty jobs in
a mostly idle system.
Wakeup biasing for power savings gets automatically disabled if
system utilisation increases due to the fact that the probability
of finding both this_cpu and prev_cpu idle decreases.
Impact: extend load-balancing code (no change in behavior yet)
When the system utilisation is low and more cpus are idle,
then the process waking up from sleep should prefer to
wakeup an idle cpu from semi-idle cpu package (multi core
package) rather than a completely idle cpu package which
would waste power.
Use the sched_mc balance logic in find_busiest_group() to
nominate a preferred wakeup cpu.
This info can be stored in appropriate sched_domain, but
updating this info in all copies of sched_domain is not
practical. Hence this information is stored in root_domain
struct which is one copy per partitioned sched domain.
The root_domain can be accessed from each cpu's runqueue
and there is one copy per partitioned sched domain.
sched: favour lower logical cpu number for sched_mc balance
Impact: change load-balancing direction to match that of irqbalanced
Just in case two groups have identical load, prefer to move load to lower
logical cpu number rather than the present logic of moving to higher logical
number.
find_busiest_group() tries to look for a group_leader that has spare capacity
to take more tasks and freeup an appropriate least loaded group. Just in case
there is a tie and the load is equal, then the group with higher logical number
is favoured. This conflicts with user space irqbalance daemon that will move
interrupts to lower logical number if the system utilisation is very low.
Gautham R Shenoy [Thu, 18 Dec 2008 17:56:09 +0000 (23:26 +0530)]
sched: framework for sched_mc/smt_power_savings=N
Impact: extend range of /sys/devices/system/cpu/sched_mc_power_savings
Currently the sched_mc/smt_power_savings variable is a boolean,
which either enables or disables topology based power savings.
This patch extends the behaviour of the variable from boolean to
multivalued, such that based on the value, we decide how
aggressively do we want to perform powersavings balance at
appropriate sched domain based on topology.
Variable levels of power saving tunable would benefit end user to
match the required level of power savings vs performance
trade-off depending on the system configuration and workloads.
This version makes the sched_mc_power_savings global variable to
take more values (0,1,2). Later versions can have a single
tunable called sched_power_savings instead of
sched_{mc,smt}_power_savings.
sched: convert BALANCE_FOR_xx_POWER to inline functions
Impact: cleanup
BALANCE_FOR_MC_POWER and similar macros defined in sched.h are
not constants and have various condition checks and significant
amount of code that is not suitable to be contain in a macro.
Also there could be side effects on the expressions passed to
some of them like test_sd_parent().
This patch converts all complex macros related to power savings
balance to inline functions.
Mike Travis [Wed, 17 Dec 2008 23:21:39 +0000 (15:21 -0800)]
x86: fix cpu_mask_to_apicid_and to include cpu_online_mask
Impact: fix potential APIC crash
In determining the destination apicid, there are usually three cpumasks
that are considered: the incoming cpumask arg, cfg->domain and the
cpu_online_mask. Since we are just introducing the cpu_mask_to_apicid_and
function, make sure it includes the cpu_online_mask in it's evaluation.
[Added with this patch.]
There are two io_apic.c functions that did not previously use the
cpu_online_mask: setup_IO_APIC_irq and msi_compose_msg. Both of these
simply used cpu_mask_to_apicid(cfg->domain & TARGET_CPUS), and all but
one arch (NUMAQ[*]) returns only online cpus in the TARGET_CPUS mask,
so the behavior is identical for all cases.
[*: NUMAQ bug?]
Note that alloc_cpumask_var is only used for the 32-bit cases where
it's highly likely that the cpumask set size will be small and therefore
CPUMASK_OFFSTACK=n. But if that's not the case, failing the allocate
will cause the same return value as the default.
Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Thu, 18 Dec 2008 10:55:06 +0000 (11:55 +0100)]
Merge branch 'x86/apic' into cpus4096
This done for conflict prevention: we merge it into the cpus4096 tree
because upcoming cpumask changes will touch apic.c that would collide
with x86/apic otherwise.
Linus Torvalds [Wed, 17 Dec 2008 23:05:26 +0000 (15:05 -0800)]
Merge branch 'i2c-fixes' of git://aeryn.fluff.org.uk/bjdooks/linux
* 'i2c-fixes' of git://aeryn.fluff.org.uk/bjdooks/linux:
i2c-s3c2410: fix check for being in suspend.
i2c-cpm: Detect and report NAK right away instead of timing out
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
USB: pl2303: add id for Hewlett-Packard LD220-HP POS pole display
USB: set correct configuration in probe of ti_usb_3410_5052
USB: add 5372:2303 to pl2303
USB: skip Set-Interface(0) if already in altsetting 0
USB: fix comment about endianness of descriptors
USB: Documentation/usb/gadget_serial.txt: update to match driver use_acm behaviour
usbmon: drop bogus 0t from usbmon.txt
USB: gadget: fix rndis working at high speed
USB: ftdi_sio: Adding Ewert Energy System's CANdapter PID
USB: tty: SprogII DCC controller identifiers
usb-storage: update unusual_devs entry for Nokia 5310
USB: Unusual devs patch for Nokia 3500c
USB: storage: unusual_devs.h: Nokia 3109c addition
USB: fix problem with usbtmc driver not loading properly
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
STAGING: Move staging drivers back to staging-specific menu
driver core: add newlines to debugging enabled/disabled messages
xilinx_hwicap: remove improper wording in license statement
driver core: fix using 'ret' variable in unregister_dynamic_debug_module
Jeff Layton [Wed, 17 Dec 2008 11:31:53 +0000 (06:31 -0500)]
cifs: fix buffer overrun in parse_DFS_referrals
While testing a kernel with memory poisoning enabled, I saw some warnings
about the redzone getting clobbered when chasing DFS referrals. The
buffer allocation for the unicode converted version of the searchName is
too small and needs to take null termination into account.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 17 Dec 2008 22:58:56 +0000 (14:58 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc:
powerpc: Fix corruption error in rh_alloc_fixed()
powerpc/fsl-booke: Fix the miss interrupt restore
driver core: fix using 'ret' variable in unregister_dynamic_debug_module
The 'ret' variable is assigned, but not used in the return statement. Fix this.
Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Matthew Arnold [Sat, 13 Dec 2008 11:42:53 +0000 (22:42 +1100)]
USB: add 5372:2303 to pl2303
This patch adds the "Superial" USB-Serial converter to pl2303 so that it
is detected, by the correct driver. Adds the relevant vendor:product
(5372:2303) to the device tables in pl2303.c & pl2303.h. The patch has
been tested against 2.6.24-22-generic.
Signed-off-by: Matthew D Arnold <matthew.arnold-1@uts.edu.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Stern [Mon, 1 Dec 2008 15:24:41 +0000 (10:24 -0500)]
USB: skip Set-Interface(0) if already in altsetting 0
When a driver unbinds from an interface, usbcore always sends a
Set-Interface request to reinstall altsetting 0. Unforunately, quite
a few devices have buggy firmware that crashes when it receives this
request.
To avoid such problems, this patch (as1180) arranges to send the
Set-Interface request only when the interface is not already in
altsetting 0.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Phil Endecott [Mon, 1 Dec 2008 15:22:33 +0000 (10:22 -0500)]
USB: fix comment about endianness of descriptors
This patch fixes a comment and clarifies the documentation about the
endianness of descriptors. The current policy is that descriptors will
be little-endian at the API even on big-endian systems; however the
/proc/bus/usb API predates this policy and presents descriptors with
some multibyte fields byte-swapped.
Signed-off-by: Phil Endecott <usb_endian_patch@chezphil.org> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Peter Korsgaard [Thu, 4 Dec 2008 15:30:53 +0000 (16:30 +0100)]
USB: Documentation/usb/gadget_serial.txt: update to match driver use_acm behaviour
Commit 7bb5ea54 (usb gadget serial: use composite gadget framework)
changed the default for the use_acm parameter from 0 to 1.
Update the documentation to match.
Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Brownell [Tue, 25 Nov 2008 07:11:03 +0000 (23:11 -0800)]
USB: gadget: fix rndis working at high speed
Fix a bug specific to highspeed mode in the recently updated RNDIS
support: it wasn't setting up the high speed notification endpoint,
which prevented high speed RNDIS links from working.
Andrew Ewert [Thu, 4 Dec 2008 15:09:59 +0000 (09:09 -0600)]
USB: ftdi_sio: Adding Ewert Energy System's CANdapter PID
The following patch adds in the USB PID for Ewert Energy System's CANdapter
device (CANBUS to USB-Serial which uses the FTDI 245R chipset) to the ftdi_sio
device driver.
The patch was tested successfully on Linux kernel 2.6.27 under Ubuntu.
Relevant output from /proc/bus/usb/devices (With patch installed):
Alan Cox [Sun, 7 Dec 2008 07:46:04 +0000 (23:46 -0800)]
USB: tty: SprogII DCC controller identifiers
Someone on rmweb reminded me this had been overlooked from ages ago..
Add the identifiers for the Sprog II USB. This is a DCC control interface
using the FTDI-SIO hardware: http://www.sprog-dcc.co.uk/. People have been
using it with insmod options for ages, this just puts it into the driver
data.
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
CSÉCSY László [Tue, 9 Dec 2008 22:39:14 +0000 (23:39 +0100)]
USB: storage: unusual_devs.h: Nokia 3109c addition
2.6.26(.x, cannot remember) could handle the microSD card in my Nokia
3109c attached via USB as mass storage, 2.6.27(.x, up to and included
2.6.27.8) cannot. Please find the attached patch which fixes this
regression, and a copy of /proc/bus/usb/devices with my phone plugged in
running with this patch on Frugalware.
There is an error in rh_alloc_fixed() of the Remote Heap code:
If there is at least one free block blk won't be NULL at the end of the
search loop, so -ENOMEM won't be returned and the else branch of
"if (bs == s || be == e)" will be taken, corrupting the management
structures.
Signed-off-by: Guillaume Knispel <gknispel@proformatique.com> Acked-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Dave Liu [Wed, 17 Dec 2008 10:24:15 +0000 (18:24 +0800)]
powerpc/fsl-booke: Fix the miss interrupt restore
The commit e5e774d8833de1a0037be2384efccadf16935675
powerpc/fsl-booke: Fix problem with _tlbil_va being interrupted
introduce one issue. that casue the problem like this:
Mike Travis [Wed, 17 Dec 2008 01:34:03 +0000 (17:34 -0800)]
x86: Remove cpumask games in x86/kernel/cpu/intel_cacheinfo.c
Impact: remove cpumask_t from stack.
We should not try to save and restore cpus_allowed on current.
We can't use work_on_cpu() here, since it's in the hotplug cpu path
(if anyone else tries to get the hotplug lock from a workqueue we
could deadlock against them).
Fortunately, we can just use smp_call_function_single() since the
function can run from an interrupt.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oleg Nesterov <oleg@tv-sign.ru>
Mike Travis [Wed, 17 Dec 2008 01:33:58 +0000 (17:33 -0800)]
x86: fixup_irqs() doesnt need an argument.
Impact: cleanup, remove on-stack cpumask.
The "map" arg is always cpu_online_mask. Importantly, set_affinity
always ands the argument with cpu_online_mask anyway, so we don't need
to do it in fixup_irqs(), avoiding a temporary.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Mike Travis <travis@sgi.com>
Mike Travis [Wed, 17 Dec 2008 01:33:56 +0000 (17:33 -0800)]
x86: Update io_apic.c to use new cpumask API
Impact: cleanup, consolidate patches, use new API
Consolidate the following into a single patch to adapt to new
sparseirq code in arch/x86/kernel/io_apic.c, add allocation of
cpumask_var_t's in domain and old_domain, and reduce further
merge conflicts. Only one file (arch/x86/kernel/io_apic.c) is
changed in all of these patches.
Mike Travis [Wed, 17 Dec 2008 01:33:54 +0000 (17:33 -0800)]
x86: Add cpu_mask_to_apicid_and
Impact: new API
Add a helper function that takes two cpumask's, and's them and then
returns the apicid of the result. This removes a need in io_apic.c
that uses a temporary cpumask to hold (mask & cfg->domain).
Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Mike Travis [Wed, 17 Dec 2008 01:33:52 +0000 (17:33 -0800)]
x86 smp: modify send_IPI_mask interface to accept cpumask_t pointers
Impact: cleanup, change parameter passing
* Change genapic interfaces to accept cpumask_t pointers where possible.
* Modify external callers to use cpumask_t pointers in function calls.
* Create new send_IPI_mask_allbutself which is the same as the
send_IPI_mask functions but removes smp_processor_id() from list.
This removes another common need for a temporary cpumask_t variable.
* Functions that used a temp cpumask_t variable for:
cpumask_t allbutme = cpu_online_map;
cpu_clear(smp_processor_id(), allbutme);
if (!cpus_empty(allbutme))
...
become:
if (!cpus_equal(cpu_online_map, cpumask_of_cpu(cpu)))
...
* Other minor code optimizations (like using cpus_clear instead of
CPU_MASK_NONE, etc.)
Applies to linux-2.6.tip/master.
Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Ingo Molnar <mingo@elte.hu>
Yinghai Lu [Thu, 11 Dec 2008 08:15:01 +0000 (00:15 -0800)]
x86, sparseirq: move irq_desc according to smp_affinity, v7
Impact: improve NUMA handling by migrating irq_desc on smp_affinity changes
if CONFIG_NUMA_MIGRATE_IRQ_DESC is set:
- make irq_desc to go with affinity aka irq_desc moving etc
- call move_irq_desc in irq_complete_move()
- legacy irq_desc is not moved, because they are allocated via static array
for logical apic mode, need to add move_desc_in_progress_in_same_domain,
otherwise it will not be moved ==> also could need two phases to get
irq_desc moved.
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Mike Travis [Tue, 16 Dec 2008 17:13:11 +0000 (09:13 -0800)]
x86: fix build error with post-merge of tip/cpus4096 and rr-for-ingo/master.
Ingo Molnar wrote:
> allyes64 build failure:
>
> arch/x86/kernel/io_apic.c: In function ‘set_ir_ioapic_affinity_irq_desc’:
> arch/x86/kernel/io_apic.c:2295: error: incompatible type for argument 2 of
> ‘migrate_ioapic_irq_desc’
> arch/x86/kernel/io_apic.c: In function ‘ir_set_msi_irq_affinity’:
> arch/x86/kernel/io_apic.c:3205: error: incompatible type for argument 2 of
> ‘set_extra_move_desc’
> make[1]: *** wait: No child processes. Stop.
Here's a small patch to correct the build error with the post-merge tree.
Built and boot-tested. I'll will reset the follow on patches in my brand
new git tree to accommodate this change.
Fix two references in io_apic.c that were incorrect.
Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tao Ma [Fri, 5 Dec 2008 01:14:10 +0000 (09:14 +0800)]
ocfs2: Always update xattr search when creating bucket.
When we create xattr bucket during the process of xattr set, we always
need to update the ocfs2_xattr_search since even if the bucket size is
the same as block size, the offset will change because of the removal
of the ocfs2_xattr_block header.
Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Ben Dooks [Fri, 31 Oct 2008 16:10:22 +0000 (16:10 +0000)]
i2c-s3c2410: fix check for being in suspend.
As noted by Julia Lawall <julia@diku.dk>, we can never
trigger the check for being in suspend due to the result
of !readl(i2c->regs + S3C2410_IICCON) & S3C2410_IICCON_IRQEN
always being 0.
Add suspend/resume hooks to stop i2c transactions happening
until the driver has been resumed.
Mike Ditto [Tue, 16 Dec 2008 20:17:09 +0000 (20:17 +0000)]
i2c-cpm: Detect and report NAK right away instead of timing out
Make the driver report an ENXIO error immediately upon NAK instead of
waiting for another interrupt and getting a timeout.
When reading from a device that is not present or declines to respond
to, e.g., a non-existent register address, CPM immediately reports a
NAK condition in the TxBD, but the driver kept waiting until a timeout,
which takes 1 second and causes an ugly console error message.
Signed-off-by: Mike Ditto <mditto@consentry.com> Acked-by: Jochen Friedrich <jochen@scram.de>
[ben-linux@fluff.org: reordered description text] Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Linus Torvalds [Tue, 16 Dec 2008 17:47:58 +0000 (09:47 -0800)]
Merge branch 'sh/for-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* 'sh/for-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: Disable GENERIC_HARDIRQS_NO__DO_IRQ for unconverted platforms.
sh: maple: Do not pass SLAB_POISON to kmem_cache_create()
Linus Torvalds [Tue, 16 Dec 2008 17:47:43 +0000 (09:47 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc/cell/axon-msi: Fix MSI after kexec
powerpc: Fix bootmem reservation on uninitialized node
powerpc: Check for valid hugepage size in hugetlb_get_unmapped_area
Tejun Heo [Tue, 9 Dec 2008 08:13:19 +0000 (17:13 +0900)]
pata_hpt366: fix cable detection,
pata_hpt366 is strange in that its two channels occupy two PCI
functions and both are primary channels and bit1 of PCI configuration
register 0x5A indicates cable for both channels.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Matt Fleming [Tue, 16 Dec 2008 00:15:31 +0000 (09:15 +0900)]
sh: maple: Do not pass SLAB_POISON to kmem_cache_create()
SLAB_POISON is not a valid flag for kmem_create_cache() unless
CONFIG_DEBUG_SLAB is set, so remove it from the flags argument.
Acked-by: Adrian McMenamin <adrian@newgolddream.dyndns.info> Signed-off-by: Matt Fleming <mjf@gentoo.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Arnd Bergmann [Fri, 12 Dec 2008 09:19:50 +0000 (09:19 +0000)]
powerpc/cell/axon-msi: Fix MSI after kexec
Commit d015fe995 'powerpc/cell/axon-msi: Retry on missing interrupt'
has turned a rare failure to kexec on QS22 into a reproducible
error, which we have now analysed.
The problem is that after a kexec, the MSIC hardware still points
into the middle of the old ring buffer. We set up the ring buffer
during reboot, but not the offset into it. On older kernels, this
would cause a storm of thousands of spurious interrupts after a
kexec, which would most of the time get dropped silently.
With the new code, we time out on each interrupt, waiting for
it to become valid. If more interrupts come in that we time
out on, this goes on indefinitely, which eventually leads to
a hard crash.
The solution in this commit is to read the current offset from
the MSIC when reinitializing it. This now works correctly, as
expected.
Reported-by: Dirk Herrendoerfer <d.herrendoerfer@de.ibm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
Dave Hansen [Thu, 11 Dec 2008 08:36:06 +0000 (08:36 +0000)]
powerpc: Fix bootmem reservation on uninitialized node
careful_allocation() was calling into the bootmem allocator for
nodes which had not been fully initialized and caused a previous
bug: http://patchwork.ozlabs.org/patch/10528/ So, I merged a
few broken out loops in do_init_bootmem() to fix it. That changed
the code ordering.
I think this bug is triggered by having reserved areas for a node
which are spanned by another node's contents. In the
mark_reserved_regions_for_nid() code, we attempt to reserve the
area for a node before we have allocated the NODE_DATA() for that
nid. We do this since I reordered that loop. I suck.
This is causing crashes at bootup on some systems, as reported
by Jon Tollefson.
This may only present on some systems that have 16GB pages
reserved. But, it can probably happen on any system that is
trying to reserve large swaths of memory that happen to span other
nodes' contents.
This commit ensures that we do not touch bootmem for any node which
has not been initialized, and also removes a compile warning about
an unused variable.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Brian King [Thu, 4 Dec 2008 04:07:54 +0000 (04:07 +0000)]
powerpc: Check for valid hugepage size in hugetlb_get_unmapped_area
It looks like most of the hugetlb code is doing the correct thing if
hugepages are not supported, but the mmap code is not. If we get into
the mmap code when hugepages are not supported, such as in an LPAR
which is running Active Memory Sharing, we can oops the kernel. This
fixes the oops being seen in this path.
Linus Torvalds [Tue, 16 Dec 2008 00:31:05 +0000 (16:31 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 5348/1: fix documentation wrt location of the alignment trap interface
[ARM] Ensure linux/hardirqs.h is included where required
[ARM] fix kernel-doc syntax
[ARM] arch/arm/common/sa1111.c: Correct error handling code
[ARM] 5341/2: there is no copy_page on nommu ARM
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
Phonet: keep TX queue disabled when the device is off
SCHED: netem: Correct documentation comment in code.
netfilter: update rwlock initialization for nat_table
netlabel: Compiler warning and NULL pointer dereference fix
e1000e: fix double release of mutex
IA64: HP_SIMETH needs to depend upon NET
netpoll: fix race on poll_list resulting in garbage entry
ipv6: silence log messages for locally generated multicast
sungem: improve ethtool output with internal pcs and serdes
tcp: tcp_vegas cong avoid fix
sungem: Make PCS PHY support partially work again.
Rusty Russell [Mon, 15 Dec 2008 08:34:35 +0000 (19:04 +1030)]
Define smp_call_function_many for UP
Otherwise those using it in transition patches (eg. kvm) can't compile
with CONFIG_SMP=n:
arch/x86/kvm/../../../virt/kvm/kvm_main.c: In function 'make_all_cpus_request':
arch/x86/kvm/../../../virt/kvm/kvm_main.c:380: error: implicit declaration of function 'smp_call_function_many'
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Menage [Mon, 15 Dec 2008 21:54:22 +0000 (13:54 -0800)]
cgroups: fix a race between rmdir and remount
When a cgroup is removed, it's unlinked from its parent's children list,
but not actually freed until the last dentry on it is released (at which
point cgrp->root->number_of_cgroups is decremented).
Currently rebind_subsystems checks for the top cgroup's child list being
empty in order to rebind subsystems into or out of a hierarchy - this can
result in the set of subsystems bound to a hierarchy being
removed-but-not-freed cgroup.
The simplest fix for this is to forbid remounts that change the set of
subsystems on a hierarchy that has removed-but-not-freed cgroups. This
bug can be reproduced via:
ACPI toshiba: only register rfkill if bt is enabled
Part of the rfkill initialization was done whenever BT was on or not. The
following patch checks for BT presence before registering the rfkill to
the input layer. Some minor cleanups (> 80 char lines) were also added in
the process.
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Tested-by: Andrey Borzenkov <arvidjaar@mail.ru> Acked-by: Len Brown <len.brown@intel.com> Cc: Richard Purdie <rpurdie@rpsys.net> Acked-by: Philip Langdale <philipl@overt.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Catalin Marinas [Mon, 15 Dec 2008 21:54:16 +0000 (13:54 -0800)]
slob: do not pass the SLAB flags as GFP in kmem_cache_create()
The kmem_cache_create() function in the slob allocator passes the SLAB
flags as GFP flags to the slob_alloc() function. The patch changes this
call to pass GFP_KERNEL as the other allocators seem to do.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Matt Mackall <mpm@selenic.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven Rostedt [Mon, 15 Dec 2008 08:19:14 +0000 (00:19 -0800)]
netfilter: update rwlock initialization for nat_table
The commit e099a173573ce1ba171092aee7bb3c72ea686e59
(netfilter: netns nat: per-netns NAT table) renamed the
nat_table from __nat_table to nat_table without updating the
__RW_LOCK_UNLOCKED(__nat_table.lock).
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Zachary Amsden [Sat, 13 Dec 2008 20:36:58 +0000 (12:36 -0800)]
x86 Fix VMI crash on boot in 2.6.28-rc8
VMI initialiation can relocate the fixmap, causing early_ioremap to
malfunction if it is initialized before the relocation. To fix this,
VMI activation is split into two phases; the detection, which must
happen before setting up ioremap, and the activation, which must happen
after parsing early boot parameters.
This fixes a crash on boot when VMI is enabled under VMware.