Russell King [Tue, 20 Sep 2011 22:35:15 +0000 (23:35 +0100)]
ARM: fix vmlinux.lds.S discarding sections
We are seeing linker errors caused by sections being discarded, despite
the linker script trying to keep them. The result is (eg):
`.exit.text' referenced in section `.alt.smp.init' of drivers/built-in.o: defined in discarded section `.exit.text' of drivers/built-in.o
`.exit.text' referenced in section `.alt.smp.init' of net/built-in.o: defined in discarded section `.exit.text' of net/built-in.o
This is the relevent part of the linker script (reformatted to make it
clearer):
| SECTIONS
| {
| /*
| * unwind exit sections must be discarded before the rest of the
| * unwind sections get included.
| */
| /DISCARD/ : {
| *(.ARM.exidx.exit.text)
| *(.ARM.extab.exit.text)
| }
| ...
| .exit.text : {
| *(.exit.text)
| *(.memexit.text)
| }
| ...
| /DISCARD/ : {
| *(.exit.text)
| *(.memexit.text)
| *(.exit.data)
| *(.memexit.data)
| *(.memexit.rodata)
| *(.exitcall.exit)
| *(.discard)
| *(.discard.*)
| }
| }
Now, this is what the linker manual says about discarded output sections:
| The special output section name `/DISCARD/' may be used to discard
| input sections. Any input sections which are assigned to an output
| section named `/DISCARD/' are not included in the output file.
No questions, no exceptions. It doesn't say "unless they are listed
before the /DISCARD/ section." Now, this is what asn-generic/vmlinux.lds.S
says:
| /*
| * Default discarded sections.
| *
| * Some archs want to discard exit text/data at runtime rather than
| * link time due to cross-section references such as alt instructions,
| * bug table, eh_frame, etc. DISCARDS must be the last of output
| * section definitions so that such archs put those in earlier section
| * definitions.
| */
And guess what - the list _always_ includes .exit.text etc.
Now, what's actually happening is that the linker is reading the script,
and it finds the first /DISCARD/ output section at the beginning of the
script. It continues reading the script, and finds the 'DISCARD' macro
at the end, which having been postprocessed results in another
/DISCARD/ output section. As the linker already contains the earlier
/DISCARD/ output section, it adds it to that existing section, so it
effectively is placed at the start. This can be seen by using the -M
option to ld:
As you can see, all the discarded sections are grouped together - and
as a result of it being the first output section, they all appear before
any other section.
The result is that not only is the unwind information discarded (as
intended), but also the .exit.text, despite us wanting to have the
.exit.text preserved.
We can't move the unwind information elsewhere, because it'll then be
included even when we do actually discard the .exit.text (and similar)
sections.
So, work around this by avoiding the generic DISCARDS macro, and instead
conditionalize the sections to be discarded ourselves. This avoids the
ambiguity in how the linker assigns input sections to output sections,
making our script less dependent on undocumented linker behaviour.
Reported-by: Rob Herring <robherring2@gmail.com> Tested-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Thu, 1 Sep 2011 10:57:59 +0000 (11:57 +0100)]
ARM: pm: add L2 cache cleaning for suspend
We need to ensure that state is pushed out from the L2 cache when
suspending so that the resume paths can access their data before the
MMU and caches have been re-initialized. Add the necessary calls to
__cpu_suspend_save().
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Shawn Guo <shawn.guo@linaro.org> Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Thu, 1 Sep 2011 10:52:33 +0000 (11:52 +0100)]
ARM: pm: convert some assembly to C
Convert some of the sleep.S guts to C code, which makes it easier to
use our macros and to add L2 cache handling. We provide a helper
function, __cpu_suspend_save(), which deals with saving the common
state, setting up for resume, and flushing caches.
The remainder left as assembly code is the saving of the CPU general
purpose registers, and allocating space on the stack to save the CPU
specific registers and resume state.
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Shawn Guo <shawn.guo@linaro.org> Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Wed, 31 Aug 2011 22:26:18 +0000 (23:26 +0100)]
ARM: pm: get rid of cpu_resume_turn_mmu_on
We don't require cpu_resume_turn_mmu_on as we can combine the ldr
instruction with the following code provided we ensure that
cpu_resume_mmu is aligned for older CPUs. Note that we also align
to a 32-byte boundary to ensure that the code can't cross a section
boundary.
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Shawn Guo <shawn.guo@linaro.org> Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Sun, 28 Aug 2011 09:30:34 +0000 (10:30 +0100)]
ARM: pm: no need to save/restore context ID register
There is no need to save and restore the context ID register on ARMv6
and ARMv7 with a temporary page table as we write the context ID
register when we switch back to the real page tables for the thread.
Moreover, the temporary page tables do not contain any non-global
mappings, so the context ID value should not be used. To be safe,
initialize the register to a reserved context ID value.
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Shawn Guo <shawn.guo@linaro.org> Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Sat, 27 Aug 2011 21:39:09 +0000 (22:39 +0100)]
ARM: pm: only use preallocated page table during resume
Only use the preallocated page table during the resume, not while
suspending. This avoids the overhead of having to switch unnecessarily
to the resume page table in the suspend path.
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Shawn Guo <shawn.guo@linaro.org> Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Fri, 26 Aug 2011 19:28:52 +0000 (20:28 +0100)]
ARM: pm: preallocate a page table for suspend/resume
Preallocate a page table and setup an identity mapping for the MMU
enable code. This means we don't have to "borrow" a page table to
do this, avoiding complexities with L2 cache coherency.
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Shawn Guo <shawn.guo@linaro.org> Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Merge branch 'fixes' of git://git.linaro.org/people/arnd/arm-soc
* 'fixes' of git://git.linaro.org/people/arnd/arm-soc:
mach-integrator: fix VGA base regression
arm/dt: Tegra: Update SDHCI nodes to match bindings
ARM: EXYNOS4: fix incorrect pad configuration for keypad row lines
ARM: SAMSUNG: fix to prevent declaring duplicated
ARM: SAMSUNG: fix watchdog reset issue with clk_get()
ARM: S3C64XX: Remove un-used code backlight code on SMDK6410
ARM: EXYNOS4: restart clocksource while system resumes
ARM: EXYNOS4: Fix routing timer interrupt to offline CPU
ARM: EXYNOS4: Fix return type of local_timer_setup()
ARM: EXYNOS4: Fix wrong pll type for vpll
ARM: Dove: fix second SPI initialization call
After commit c5f5c4db3938 ("staging: zcache: fix crash on high memory
swap") cleancache crashes on the first successful get. This was caused
by a remaining virt_to_page() call in zcache_pampd_get_data_and_free()
that only gets run in the cleancache path.
The patch converts the virt_to_page() to struct page casting like was
done for other instances in c5f5c4db3938.
Makes the Integrator/AP freeze completely. I appears that
this is due to the VGA base address being assigned at PCI
init time, while this base is needed earlier than that.
Moving the initialization of the base address to the
.map_io function solves this problem.
Cc: Rob Herring <rob.herring@calxeda.com> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Acked-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
drivers/net/pxa168_eth.c: In function 'pxa168_eth_collect_events':
drivers/net/pxa168_eth.c:866: error: 'IRQ_NONE' undeclared (first use in this function)
drivers/net/pxa168_eth.c:866: error: (Each undeclared identifier is reported only once
drivers/net/pxa168_eth.c:866: error: for each function it appears in.)
drivers/net/pxa168_eth.c: At top level:
drivers/net/pxa168_eth.c:913: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'pxa168_eth_int_handler'
drivers/net/pxa168_eth.c: In function 'pxa168_eth_open':
drivers/net/pxa168_eth.c:1133: error: implicit declaration of function 'request_irq'
drivers/net/pxa168_eth.c:1133: error: 'pxa168_eth_int_handler' undeclared (first use in this function)
drivers/net/pxa168_eth.c:1134: error: 'IRQF_DISABLED' undeclared (first use in this function)
drivers/net/pxa168_eth.c:1160: error: implicit declaration of function 'free_irq'
Signed-off-by: Tanmay Upadhyay <tanmay.upadhyay@einfochips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Lin Ming [Tue, 20 Sep 2011 19:45:07 +0000 (15:45 -0400)]
netconsole: switch init_netconsole() to late_initcall
Commit 88491d8(drivers/net: Kconfig & Makefile cleanup) causes a
regression that netconsole does not work if netconsole and network
device driver are build into kernel, because netconsole is linked
before network device driver.
Andrew Morton suggested to fix this with initcall ordering.
Fixes it by switching init_netconsole() to late_initcall.
Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Tue, 6 Sep 2011 12:44:25 +0000 (12:44 +0000)]
gianfar: Fix overflow check and return value for gfar_get_cls_all()
This function may currently fill one entry beyond the end of the
array it is given. It also doesn't return an error code in case
it does detect overflow.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Warren [Tue, 20 Sep 2011 16:46:26 +0000 (10:46 -0600)]
arm/dt: Tegra: Add support-8bit to SDHCI nodes
For Seaboard's internal eMMC, this makes the difference between a
5.5MB/s and 10.2MB/s transfer rate. On Harmony, there wasn't any
measurable difference on my cheap/slow ~2MB/s card.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Henry Wong [Sun, 18 Sep 2011 13:41:49 +0000 (13:41 +0000)]
ppp_generic: fix multilink fragment MTU calculation (again)
When using MLPPP, the maximum size of a fragment is incorrectly
calculated with an offset of -2.
This patch reverses the changes in the patch found here:
http://marc.info/?l=linux-netdev&m=123541324010539&w=2
The value of hdrlen includes the size of both the 2-byte PPP protocol
field and the 2- or 4-byte multilink header (2+4=6 for long sequence
numbers, 2+2=4 for short sequence numbers). Section 2 of RFC1661 says
that the MRU that is negotiated (i.e., the MTU of the sending system)
includes only the PPP payload but not the protocol field, thus the
correct MTU should be the link's MTU minus the multilink header (mtu -
(hdrlen-2)).
The incorrect calculation causes Linux to fragment packets to a size two
bytes smaller than the allowed MTU. While not technically illegal, this
behaviour confounds MRU-tuning to avoid PPP-layer fragmentation.
Signed-off-by: Henry Wong <henry@stuffedcow.net> Signed-off-by: David S. Miller <davem@davemloft.net>
The GRETH GBIT core does not do checksum offloading for IP
segmentation. This patch adds a check in the xmit function to
determine if the stack has calculated the checksum for us.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roy Li [Tue, 20 Sep 2011 19:10:16 +0000 (15:10 -0400)]
ipv6: fix a possible double free
When calling snmp6_alloc_dev fails, the snmp6 relevant memory
are freed by snmp6_alloc_dev. Calling in6_dev_finish_destroy
will free these memory twice.
Double free will lead that undefined behavior occurs.
Signed-off-by: Roy Li <rongqing.li@windriver.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix broken sec=ntlmv2/i sec option (try #2)
Fix the conflict between rwpidforward and rw mount options
CIFS: Fix ERR_PTR dereference in cifs_get_root
cifs: fix possible memory corruption in CIFSFindNext
John Crispin [Wed, 24 Aug 2011 08:31:39 +0000 (10:31 +0200)]
watchdog: lantiq: fix watchdogs timeout handling
The enable function was using the global timeout variable for local operations.
This resulted in the value of the global variable being corrupted, thus
breaking the code.
Signed-off-by: John Crispin <blogic@openwrt.org> Signed-off-by: Thomas Langer <thomas.langer@lantiq.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Cc: linux-watchdog@vger.kernel.org Cc: linux-mips@linux-mips.org
On platforms with no iCRU support don't print two, (possibly conflicting),
"NMI occurred" messages when the firmware is unable to source the NMI.
Please note that one of the enhancements to the v1.3.0 hpwdt driver is to panic and allow
KDUMP to succeed even on NMIs that are unknown to the platform firmware.
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Reviewed-by: Thomas Mingarelli <thomas.mingarelli@hp.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
watchdog: WatchDog Timer Driver Core - use passed watchdog_device
Use the passed watchdog_device instead of the static global variable when
testing and setting the status in watchdog_ping, watchdog_start, and
watchdog_stop. Note that the callers of these functions are actually
passing the static global variable.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Extent the POWER7 PMU driver with definitions for generic front-end and back-end
stall events.
As explained in Ingo's original comment(8f62242246351b5a4bc0c1f00c0c7003edea128a
), the exact definitions of the stall events are very much processor specific as
different things mean different in their respective instruction pipeline. These
two Power7 raw events are the closest approximation to the concept detailed in
Ingo's comment.
[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x100f8, /* GCT_NOSLOT_CYC */
It means cycles when the Global Completion Table has no slots from this thread
[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 0x4000a, /* CMPLU_STALL */
It means no groups completed and GCT not empty for this thread
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The firmware doesn't wait after lifting the PCI reset. However it does
timestamp it in the device tree. We use that to ensure we wait long
enough (3s is our current arbitrary setting) from that timestamp to
actually probing the bus.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/powernv: Implement MSI support for p5ioc2 PCIe
This implements support for MSIs on p5ioc2 PHBs. We only support
MSIs on the PCIe PHBs, not the PCI-X ones as the later hasn't been
properly verified in HW.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/powernv: Add support for p5ioc2 PCI-X and PCIe
This adds support for PCI-X and PCIe on the p5ioc2 IO hub using
OPAL. This includes allocating & setting up TCE tables and config
space access routines.
This also supports fallbacks via RTAS when OPAL is absent, using
legacy TCE format pre-allocated via the device-tree (BML style)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/powernv: Machine check and other system interrupts
OPAL can handle various interrupt for us such as Machine Checks (it
performs all sorts of recovery tasks and passes back control to us with
informations about the error), Hardware Management Interrupts and Softpatch
interrupts.
This wires up the mechanisms and prints out specific informations returned
by HAL when a machine check occurs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/powernv: Register and handle OPAL interrupts
We do the minimum which is to "pass" interrupts to HAL, which
makes the console smoother and will allow us to implement
interrupt based completion and console.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks
Implements OPAL RTC and NVRAM support and wire all that up to
the powernv platform.
We use RTAS for RTC as a fallback if available. Using RTAS for nvram
is not supported yet, pending some rework/cleanup and generalization
of the pSeries & CHRP code. We also use RTAS fallbacks for power off
and reboot
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/powernv: Hookup reboot and poweroff functions
This calls the respective HAL functions, and spin on hal_poll_event()
to ensure the HAL has a chance to communicate with the FSP to trigger
the reboot or shutdown operation
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/powernv: Add support for instanciating OPAL v2 from Open Firmware
OPAL v2 is instantiated in a way similar to RTAS using Open Firmware
client interface calls, and the resulting address and entry point are
put in the device-tree
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add definition of OPAL interfaces along with the wrappers to call
into OPAL runtime and the early device-tree parsing hook to locate
the OPAL runtime firmware.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/powernv: Get kernel command line accross OPAL takeover
We stash it in boot_command_line which isn't in BSS and so won't
be overwritten. We then use that as a default cmd_line before
we walk the device-tree.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On machines supporting the OPAL firmware version 1, the system
is initially booted under pHyp. We then use a special hypercall
to verify if OPAL is available and if it is, we then trigger
a "takeover" which disables pHyp and loads the OPAL runtime
firmware, giving control to the kernel in hypervisor mode.
This patch add the necessary code to detect that the OPAL takeover
capability is present when running under PowerVM (aka pHyp) and
perform said takeover to get hypervisor control of the processor.
To perform the takeover, we must first use RTAS (within Open
Firmware runtime environment) to start all processors & threads,
in order to give control to OPAL on all of them. We then call
the takeover hypercall on everybody, OPAL will re-enter the kernel
main entry point passing it a flat device-tree.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
of: Change logic to overwrite cmd_line with CONFIG_CMDLINE
We used to overwrite with CONFIG_CMDLINE if we found a chosen
node but failed to get bootargs out of it or they were empty,
unless CONFIG_CMDLINE_FORCE is set.
Instead change that to overwrite if "data" is non empty after
the bootargs check. It allows arch code to have other mechanisms
to retrieve the command line prior to parsing the device-tree.
Note: CONFIG_CMDLINE_FORCE case should ideally be handled elsewhere
as it won't work as it-is if the device-tree has no /chosen node
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: devicetree-discuss@lists-ozlabs.org CC: Grant Likely <grant.likely@secretlab.ca> Acked-by: Grant Likely <grant.likely@secretlab.ca>
This adds a skeletton for the new Power "Non Virtualized"
platform which will be used by machines supporting running
without an hypervisor, for example in order to run KVM.
These machines will be using a new firmware called OPAL
for which the support will be provided by later patches.
The PowerNV platform is intended to be also usable under
the BML environment used internally for early CPU bringup
which is why the code also supports using RTAS instead of
OPAL in various places.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/powernv: Don't clobber r9 in relative_toc()
With OPAL, r8 and r9 will be used to pass the OPAL base and entry
for debugging purposes (those informations are also in the
device-tree). We don't want to clobber those registers that
early.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/smp: More generic support for "soft hotplug"
This adds more generic support for doing CPU hotplug with a simple
idle loop and no actual reset of the processors. The generic
smp_generic_kick_cpu() does the hotplug bringup trick if the PACA
shows that the CPU has already been started at boot and we provide
an accessor for the CPU state.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Brian King [Tue, 13 Sep 2011 11:22:51 +0000 (11:22 +0000)]
hvcs: Ensure page aligned partner info buffer
The Power platform requires the partner info buffer to be page aligned
otherwise it will fail the partner info hcall with H_PARAMETER. Switch
from using kmalloc to allocate this buffer to __get_free_page to ensure
page alignment.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
In create_section_mapping we BUG if htab_bolt_mapping returned
an error. A better approach is to return an error which will
propagate back to userspace.
Anton Blanchard [Sun, 24 Jul 2011 16:33:15 +0000 (16:33 +0000)]
powerpc/numa: Disable NEWIDLE balancing at node level
On big POWER7 boxes we see large amounts of CPU time in system
processes like workqueue and watchdog kernel threads.
We currently rebalance the entire machine each time a task goes
idle and this is very expensive on large machines. Disable newidle
balancing at the node level and rely on the scheduler tick to
rebalance across nodes.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Sun, 24 Jul 2011 16:33:13 +0000 (16:33 +0000)]
sched: Allow SD_NODES_PER_DOMAIN to be overridden
We want to override the default value of SD_NODES_PER_DOMAIN on ppc64,
so move it into linux/topology.h.
Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Sun, 24 Jul 2011 16:33:12 +0000 (16:33 +0000)]
powerpc/numa: Enable SD_WAKE_AFFINE in node definition
When chasing a performance issue on ppc64, I noticed tasks
communicating via a pipe would often end up on different nodes.
It turns out SD_WAKE_AFFINE is not set in our node defition. Commit 9fcd18c9e63e (sched: re-tune balancing) enabled SD_WAKE_AFFINE
in the node definition for x86 and we need a similar change for
ppc64.
I used lmbench lat_ctx and perf bench pipe to verify this fix. Each
benchmark was run 10 times and the average taken.
lmbench lat_ctx:
before: 66565 ops/sec
after: 204700 ops/sec
3.1x faster
perf bench pipe:
before: 5.6570 usecs
after: 1.3470 usecs
4.2x faster
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>