Paul Mackerras [Sat, 26 Jan 2008 05:40:33 +0000 (16:40 +1100)]
Revert "[POWERPC] Fake NUMA emulation for PowerPC"
This reverts commit 5c3f5892a2db6757a72ce8b27cba90db06683e1d,
basically because it changes behaviour even when no fake NUMA
information is specified on the kernel command line.
Firstly, it changes the nid, thus destroying the real NUMA
information. Secondly, it also changes behaviour in that if a node
ends up with no memory in it because of the memory limit, we used to
set it online and now we don't.
Also, in the non-NUMA case with no fake NUMA information, we do
node_set_online once for each LMB now, whereas previously we only did
it once. I don't know if that is actually a problem, but it does seem
unnecessary.
David Gibson [Fri, 11 Jan 2008 03:25:34 +0000 (14:25 +1100)]
[POWERPC] Enable RTC for Ebony and Walnut (v2)
This patch extends the Ebony and Walnut platform code to instantiate
the existing ds1742 RTC class driver for the DS1743 RTC/NVRAM chip
found on both those boards. The patch uses a helper function to scan
the device tree and instantiate the appropriate platform_device based
on it, so it should be easy to extend for other boards which have mmio
mapped RTC chips.
Along with this, the device tree binding for the ds1743 chips is
tweaked, based on the existing DS1385 OF binding found at:
http://playground.sun.com/1275/proposals/Closed/Remanded/Accepted/346-it.txt
Although that document covers the NVRAM portion of the chip, whereas
here we're interested in the RTC portion, so it's not entirely clear
if that's a good model.
This implements only RTC class driver support - that is /dev/rtc0, not
/dev/rtc, and the low-level get/set time callbacks remain
unimplemented. That means in order to get at the clock you will
either need a modified version of hwclock which will look at
/dev/rtc0, or you'll need to configure udev to symlink rtc0 to rtc.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
It can not work anymore because either the irq is not yet set to 14 or
pci_get_device() returns nothing. At least the printk() in
chrp_pci_fixup_vt8231_ata() does not trigger anymore.
pata_via works again on Pegasos with the change below.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Mon, 21 Jan 2008 05:42:48 +0000 (16:42 +1100)]
[POWERPC] Remove the global dma_direct_offset
We no longer need the global dma_direct_offset, update the comment to
reflect the new reality.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Mon, 21 Jan 2008 05:42:46 +0000 (16:42 +1100)]
[POWERPC] Have celleb use its own dma_direct_offset variable
Rather than using the global variable, have celleb use its own
variable to store the direct DMA offset.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Mon, 21 Jan 2008 05:42:45 +0000 (16:42 +1100)]
[POWERPC] Have cell use its own dma_direct_offset variable
Rather than using the global variable, have cell use its own variable
to store the direct DMA offset.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Mon, 21 Jan 2008 05:42:43 +0000 (16:42 +1100)]
[POWERPC] Use archdata.dma_data in dma_direct_ops and add the offset
Now that all platforms using dma_direct_offset setup the
archdata.dma_data correctly, we can change the dma_direct_ops to
retrieve the offset from the dma_data, rather than directly from the
global.
While we're here, change the way the offset is used - instead of
or'ing it into the value, add it. This should have no effect on
current implementations where the offset is far larger than memory,
however in future we may want to use smaller offsets.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Mon, 21 Jan 2008 05:42:42 +0000 (16:42 +1100)]
[POWERPC] Add celleb_dma_dev_setup()
Celleb always uses dma_direct_ops, and sets dma_direct_offset, so it
too should set dma_data to dma_direct_offset.
Currently there's no pci_dma_dev_setup() routine for Celleb so add one.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Ellerman [Mon, 21 Jan 2008 05:42:41 +0000 (16:42 +1100)]
[POWERPC] Set archdata.dma_data for direct DMA in cell_dma_dev_setup()
Store the direct_dma_offset in each device's dma_data in the case
where we're using the direct DMA ops.
We need to make sure we setup the ppc_md.pci_dma_dev_setup() callback
if we're using a non-zero offset.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] Always hookup PHB IO resource even when empty
We must always hookup the pci_bus resource 0 to the PHB io_resource
even if the latter is empty (the bus has no IO support). Otherwise,
some other code will end up hooking it up to something bogus and the
resource tree will end up being broken.
This fixes boot on QS20 Cell blades where the IDE driver failed to
allocate the IO resources due to breakage of the resource tree.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
The PS3's LV1 hypervisor provides a Logical Performance Monitor that
abstracts the Cell processor's performance monitor features for use
by guest operating systems.
[POWERPC] PS3: Add repository polling loop to work around timing bug
PS3: Add repository polling loop to work around timing bug
On some firmware versions (e.g. 1.90), the storage device may not show up
in the repository immediately after receiving the notification message.
Add a small polling loop to make sure we don't miss it.
[POWERPC] PS3: Use the HVs storage device notification mechanism properly
The PS3 hypervisor has a storage device notification mechanism to wait
until a storage device is ready. Unfortunately the storage device
probing code used this mechanism in an incorrect way, needing a
polling loop and handling of devices that are not yet ready.
This change corrects this by:
- First waiting for the reception of an asynchronous notification
that a new storage device became ready,
- Then looking up the storage device in the device repository.
On shutdown, the storage probe thread is stopped and the storage
notification device is closed using a reboot notifier.
The storage probe feature of the PS3 hypervisor returns device IDs. Add
the corresponding repository routine ps3_repository_find_device_by_id()
which can be used to retrieve the device info from the repository.
Change the PS3 bus_id and dev_id from type unsigned int to u64. These
IDs are 64-bit in the repository, and the special storage notification
device has a device ID of ULONG_MAX.
Michael Neuling [Fri, 18 Jan 2008 04:50:30 +0000 (15:50 +1100)]
[POWERPC] kdump shutdown hook support
This adds hooks into the default_machine_crash_shutdown so drivers can
register a function to be run in the first kernel before we hand off
to the second kernel. This should only be used in exceptional
circumstances, like where the device can't be reset in the second
kernel alone (as is the case with eHEA). To emphasize this, the
number of handles allowed to be registered is currently #def to 1.
This uses the setjmp/longjmp code around the call out to the
registered hooks, so any bogus exceptions we encounter will hopefully
be recoverable.
Tested with bogus data and instruction exceptions.
Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Michael Neuling [Fri, 18 Jan 2008 04:50:30 +0000 (15:50 +1100)]
[POWERPC] Make setjmp/longjmp code usable outside of xmon
This makes the setjmp/longjmp code used by xmon, generically available
to other code. It also removes the requirement for debugger hooks to
be only called on 0x300 (data storage) exception.
Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Olof Johansson [Fri, 28 Dec 2007 04:11:09 +0000 (15:11 +1100)]
[POWERPC] Make smp_send_stop() handle panic and xmon reboot
smp_send_stop() will send an IPI to all other cpus to shut them down.
However, for the case of xmon-based reboots (as well as potentially some
panics), the other cpus are (or might be) spinning with interrupts off,
and won't take the IPI.
Current code will drop us into the debugger when the IPI fails, which
means we're in an infinite loop that we can't get out of without an
external reset of some sort.
Instead, make the smp_send_stop() IPI call path just print the warning
about being unable to send IPIs, but make it return so the rest of the
shutdown sequence can continue. It's not perfect, but the lesser of
two evils.
Also move the call_lock handling outside of smp_call_function_map so we
can avoid deadlocks in smp_send_stop().
Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
Kumar Gala [Wed, 23 Jan 2008 11:53:47 +0000 (05:53 -0600)]
[RAPIDIO] Fix compile error and warning
drivers/rapidio/rio.c: In function 'rio_get_asm':
drivers/rapidio/rio.c:413: error: implicit declaration of function 'in_interrupt'
drivers/rapidio/rio.c: In function 'rio_init_mports':
drivers/rapidio/rio.c:480: warning: format '%8.8lx' expects type 'long unsigned int', but argument 2 has type 'resource_size_t'
drivers/rapidio/rio.c:480: warning: format '%8.8lx' expects type 'long unsigned int', but argument 3 has type 'resource_size_t'
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Dale Farnsworth [Thu, 22 Nov 2007 15:46:20 +0000 (08:46 -0700)]
[POWERPC] 85xx: Respect KERNELBASE, PAGE_OFFSET, and PHYSICAL_START on e500
The e500 MMU init code previously assumed KERNELBASE always equaled
PAGE_OFFSET and PHYSICAL_START was 0. This is useful for kdump
support as well as asymetric multicore.
For the initial kdump support the secondary kernel will run at 32M
but need access to all of memory so we bump the initial TLB up to
64M. This also matches with the forth coming ePAPR spec.
Signed-off-by: Dale Farnsworth <dale@farnsworth.org> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Scott Wood [Mon, 14 Jan 2008 16:29:35 +0000 (10:29 -0600)]
[POWERPC] fsl_soc: Fix get_immrbase() to use ranges, rather than reg.
Don't depend on the reg property as a way to determine the base
of the immr space. The reg property might be defined differently for
different SoC families.
Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Timur Tabi [Wed, 9 Jan 2008 23:35:05 +0000 (17:35 -0600)]
[POWERPC] QE: Add support for Freescale QUICCEngine UART
Add support for UART serial ports using a Freescale QUICCEngine. Update
booting-without-of.txt to define new properties for a QE UART node. Update
the MPC8323E-MDS device tree to add UCC5 as a UART. Update the QE library
to support slow UCC devices and modules.
Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Timur Tabi [Tue, 8 Jan 2008 16:30:58 +0000 (10:30 -0600)]
[POWERPC] QE: Add ability to upload QE firmware
Define the layout of a binary blob that contains a QE firmware and instructions
on how to upload it. Add function qe_upload_firmware() to parse the blob
and perform the actual upload. Fully define 'struct rsp' in immap_qe.h to
include the actual RISC Special Registers. Added description of a new
QE firmware node to booting-without-of.txt.
Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Vitaly Bordug [Thu, 6 Dec 2007 22:51:22 +0000 (01:51 +0300)]
phy/fixed.c: rework to not duplicate PHY layer functionality
With that patch fixed.c now fully emulates MDIO bus, thus no need
to duplicate PHY layer functionality. That, in turn, drastically
simplifies the code, and drops down line count.
As an additional bonus, now there is no need to register MDIO bus
for each PHY, all emulated PHYs placed on the platform fixed MDIO bus.
There is also no more need to pre-allocate PHYs via .config option,
this is all now handled dynamically.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Vitaly Bordug <vitb@kernel.crashing.org> Acked-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Kumar Gala [Mon, 14 Jan 2008 23:02:19 +0000 (17:02 -0600)]
[POWERPC] FSL: Rework PCI/PCIe support for 85xx/86xx
The current PCI code for Freescale 85xx/86xx was treating the virtual
P2P PCIe bridge as a transparent bridge. Rather than doing that fixup
the virtual P2P bridge by copying the resources from the PHB.
Also, fixup a bit of the code for dealing with resource_size_t being
64-bits and how we set ATMU registers for >4G.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Kumar Gala [Mon, 14 Jan 2008 15:41:36 +0000 (09:41 -0600)]
[POWERPC] Fixup transparent P2P resources
For transparent P2P bridges the first 3 resources may get set from based on
BAR registers and need to get fixed up. Where as the remainder come from the
parent bus and have already been fixed up.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Kumar Gala [Sat, 12 Jan 2008 23:23:26 +0000 (17:23 -0600)]
[POWERPC] Ensure we only handle PowerMac PCI bus fixup for memory resources
The fixup code that handles the case for PowerMac's that leave bridge
windows open over an inaccessible region should only be applied to
memory resources (IORESOURCE_MEM). If not we can get it trying to fixup
IORESOURCE_IO on some systems since the other conditions that are used to
detect the case can easily match for IORESOURCE_IO.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Kumar Gala [Wed, 9 Jan 2008 17:27:23 +0000 (11:27 -0600)]
[POWERPC] Fix handling of memreserve if the range lands in highmem
There were several issues if a memreserve range existed and happened
to be in highmem:
* The bootmem allocator is only aware of lowmem so calling
reserve_bootmem with a highmem address would cause a BUG_ON
* All highmem pages were provided to the buddy allocator
Added a lmb_is_reserved() api that we now use to determine if a highem
page should continue to be PageReserved or provided to the buddy
allocator.
Also, we incorrectly reported the amount of pages reserved since all
highmem pages are initally marked reserved and we clear the
PageReserved flag as we "free" up the highmem pages.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Paul Mackerras [Wed, 23 Jan 2008 21:35:13 +0000 (08:35 +1100)]
[POWERPC] Provide a way to protect 4k subpages when using 64k pages
Using 64k pages on 64-bit PowerPC systems makes life difficult for
emulators that are trying to emulate an ISA, such as x86, which use a
smaller page size, since the emulator can no longer use the MMU and
the normal system calls for controlling page protections. Of course,
the emulator can emulate the MMU by checking and possibly remapping
the address for each memory access in software, but that is pretty
slow.
This provides a facility for such programs to control the access
permissions on individual 4k sub-pages of 64k pages. The idea is
that the emulator supplies an array of protection masks to apply to a
specified range of virtual addresses. These masks are applied at the
level where hardware PTEs are inserted into the hardware page table
based on the Linux PTEs, so the Linux PTEs are not affected. Note
that this new mechanism does not allow any access that would otherwise
be prohibited; it can only prohibit accesses that would otherwise be
allowed. This new facility is only available on 64-bit PowerPC and
only when the kernel is configured for 64k pages.
The masks are supplied using a new subpage_prot system call, which
takes a starting virtual address and length, and a pointer to an array
of protection masks in memory. The array has a 32-bit word per 64k
page to be protected; each 32-bit word consists of 16 2-bit fields,
for which 0 allows any access (that is otherwise allowed), 1 prevents
write accesses, and 2 or 3 prevent any access.
Implicit in this is that the regions of the address space that are
protected are switched to use 4k hardware pages rather than 64k
hardware pages (on machines with hardware 64k page support). In fact
the whole process is switched to use 4k hardware pages when the
subpage_prot system call is used, but this could be improved in future
to switch only the affected segments.
The subpage protection bits are stored in a 3 level tree akin to the
page table tree. The top level of this tree is stored in a structure
that is appended to the top level of the page table tree, i.e., the
pgd array. Since it will often only be 32-bit addresses (below 4GB)
that are protected, the pointers to the first four bottom level pages
are also stored in this structure (each bottom level page contains the
protection bits for 1GB of address space), so the protection bits for
addresses below 4GB can be accessed with one fewer loads than those
for higher addresses.