David Gibson [Wed, 19 Oct 2005 04:53:32 +0000 (14:53 +1000)]
[PATCH] powerpc: Merge ppc64 pmc.[ch] with ppc32 perfmon.[ch]
This patches the ppc32 and ppc64 versions of the headers and .c files
with helper functions for manipulating the performance counting
hardware. As a side effect, it removes use of the term "perfmon" from
ppc32, thus avoiding confusion with the unrelated performance counter
interface from HP Labs also called "perfmon".
Built, but not booted, for g5, pSeries, iSeries, and 32-bit Powermac
with both ARCH=powerpc and ARCH=ppc{,64} as appropriate.
Signed-off-by: David Gibson <dwg@au1.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Kumar Gala [Tue, 18 Oct 2005 22:42:41 +0000 (17:42 -0500)]
[PATCH] ppc32: replace use of _GLOBAL with .globl for ppc32
The _GLOBAL() macro is for text symbols only. Changed to using
.globl for .data symbols. This is also needed in ppc32 land
to allow FSL Book-E, 40x, and 44x to work.
Signed-off-by: Kumar K. Gala <kumar.gala@freescale.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Paul Mackerras [Wed, 19 Oct 2005 23:37:02 +0000 (09:37 +1000)]
powerpc: Fix a corner case in __div64_32
The code was incorrectly doing a division by 0 in the case where
the denominator was 0x100000000 and the divisor was 0xffffffff.
Thanks to Fred Liu of Motorola for pointing this out.
Paul Mackerras [Wed, 19 Oct 2005 23:23:26 +0000 (09:23 +1000)]
powerpc: Merge time.c and asm/time.h.
We now use the merged time.c for both 32-bit and 64-bit compilation
with ARCH=powerpc, and for ARCH=ppc64, but not for ARCH=ppc32.
This removes setup_default_decr (folds its function into time_init)
and moves wakeup_decrementer into time.c. This also makes an
asm-powerpc/rtc.h.
Paul Mackerras [Wed, 19 Oct 2005 23:15:05 +0000 (09:15 +1000)]
ppc64: Minor compilation fixes
This defines CONFIG_PPC_STD_MMU for ppc64, changes an instance of
sys32_ to compat_sys_ in the ppc64 syscall table, and removes a
reference to a non-existent arch/powerpc/xmon/Makefile.
Paul Mackerras [Wed, 19 Oct 2005 13:11:21 +0000 (23:11 +1000)]
powerpc: Merge machdep.h
A few things change for consistency between ppc32 and ppc64:
idle functions return void; *_get_boot_time functions return
unsigned long (i.e. time_t) rather than filling in a struct rtc_time
(since that's useful to the callers and easier for pmac to
generate); *_get_rtc_time and *_set_rtc_time functions take
a struct rtc_time; irq_canonicalize is gone; nvram_sync returns
void.
Paul Mackerras [Wed, 19 Oct 2005 11:44:51 +0000 (21:44 +1000)]
ppc: Minor smp changes for consistency with ppc64
This makes platform code use the smp_ops variable directly instead
of ppc_md.smp_ops, removes the two unused `data' and `wait' arguments
from the *_message_pass() functions, and removes the call to the
never-implemented smp_ops->space_timers() function.
Paul Mackerras [Tue, 18 Oct 2005 04:19:41 +0000 (14:19 +1000)]
powerpc: Fix various compile errors with ARCH=ppc, ppc64 and powerpc
This makes ppc use the syscalls.c from arch/powerpc/kernel, exports
copy_and_flush from head_32.S for use by prom_init.c (ARCH=powerpc),
and consolidates the sys_fadvise64_64 implementations for 32-bit.
David Gibson [Thu, 13 Oct 2005 05:46:22 +0000 (15:46 +1000)]
[PATCH] powerpc: Another maple merge tree fix
With ARCH=powerpc, a spurious ifdef in prom_init prevented the
seconday hold loop being correctly copied down on Maple. With this
patch, Maple boots with ARCH=powerpc
Signed-off-by: David Gibson <dwg@au1.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
David Gibson [Thu, 13 Oct 2005 04:28:58 +0000 (14:28 +1000)]
[PATCH] powerpc: Fix use of LOADBASE in merge tree
The merge-tree version of LOADBASE actually loads the whole given
address from the toc for ppc64. The matching OFF macro adjust for
this, using an offset of 0 for ppc64, but we weren't using that in
power4_idle.
Signed-off-by: David Gibson <dwg@au1.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Paul Mackerras [Mon, 17 Oct 2005 09:20:46 +0000 (19:20 +1000)]
powerpc: Initialize btext subsystem later, after prom_init
We were initializing the btext stuff from prom_init(), thus breaking
the rule that all communication between prom_init() and the rest of
the kernel has to be via the flattened device tree. This removes
the btext initialization calls from prom_init() and initializes it
instead after the device tree is unflattened. It would be nice to
do it earlier, but that needs some more infrastructure to find the
properties we need in the flattened device tree.
Stephen Rothwell [Fri, 14 Oct 2005 04:51:42 +0000 (14:51 +1000)]
powerpc: move iSeries/iSeries_pci.h to platforms/iseries
The only real user of this file outside platforms/iseries was
drivers/net/iseries_veth.c but all it wanted was ISERIES_HV_ADDR()
so we move that to abs_addr.h (and lowercase it).
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Paul Mackerras [Wed, 12 Oct 2005 07:03:36 +0000 (17:03 +1000)]
ppc64: Remove duplicate versions of some headers
This removes three headers from include/asm-ppc64 that are now in
include/asm-powerpc and are sufficiently similar that they can be
used with ARCH=ppc64.
Paul Mackerras [Wed, 12 Oct 2005 07:01:50 +0000 (17:01 +1000)]
powerpc: Bring in some changes made to arch/ppc and include/asm-ppc64
Recent commits upstream have changed files which are currently
duplicated in arch/powerpc and include/asm-powerpc. This updates
them with the corresponding changes.
Paul Mackerras [Wed, 12 Oct 2005 06:58:53 +0000 (16:58 +1000)]
powerpc: Move default hash table size calculation to hash_utils_64.c
We weren't computing the size of the hash table correctly on iSeries
because the relevant code in prom.c was #ifdef CONFIG_PPC_PSERIES.
This moves the code to hash_utils_64.c, makes it unconditional, and
cleans it up a bit.
Stephen Rothwell [Tue, 11 Oct 2005 09:40:20 +0000 (19:40 +1000)]
powerpc: make iSeries boot again
On ARCH=ppc64 we were getting htab_hash_mask recalculated
to the correct value for our particular machine by accident.
In the merge tree, that code was commented out, so htab_hash_mask
was being corrupted.
We now set ppc64_pft_size instead which gets htab_has_mask
calculated correctly for us later. We should put an
ibm,pft-size property in the device tree at some point.
Also set -mno-minimal-toc in some makefiles.
Allow iSeries to configure PROC_DEVICETREE.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
David S. Miller [Tue, 11 Oct 2005 22:45:16 +0000 (15:45 -0700)]
[SPARC64]: Fix net booting on Ultra5
We were not doing alignment properly when remapping the kernel image.
What we want is a 4MB aligned physical address to map at KERNBASE.
Mistakedly we were 4MB aligning the virtual address where the kernel
initially sits, that's wrong.
Instead, we should PAGE align the virtual address, then 4MB align the
physical address result the prom gives to us.
Signed-off-by: David S. Miller <davem@davemloft.net>
akpm@osdl.org [Tue, 11 Oct 2005 15:29:08 +0000 (08:29 -0700)]
[PATCH] binfmt_elf bss padding fix
Nir Tzachar <tzachar@cs.bgu.ac.il> points out that if an ELF file specifies a
zero-length bss at a whacky address, we cannot load that binary because
padzero() tries to zero out the end of the page at the whacky address, and
that may not be writeable.
See also http://bugzilla.kernel.org/show_bug.cgi?id=5411
So teach load_elf_binary() to skip the bss settng altogether if the elf file
has a zero-length bss segment.
Cc: Roland McGrath <roland@redhat.com> Cc: Daniel Jacobowitz <dan@debian.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paolo Galtieri [Tue, 11 Oct 2005 15:29:07 +0000 (08:29 -0700)]
[PATCH] ppc highmem fix
I've noticed that the calculations for seg_size and nr_segs in
__dma_sync_page_highmem() (arch/ppc/kernel/dma-mapping.c) are wrong. The
incorrect calculations can result in either an oops or a panic when running
fsck depending on the size of the partition.
The problem with the seg_size calculation is that it can result in a
negative number if size is offset > size. The problem with the nr_segs
caculation is returns the wrong number of segments, e.g. it returns 1 when
size is 200 and offset is 4095, when it should return 2 or more.
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Suzuki [Tue, 11 Oct 2005 15:29:06 +0000 (08:29 -0700)]
[PATCH] madvise: Avoid returning error code -EBADF for anonymous mappings
Revert this recent correctness change: Douglas Crosher <dcrosher@scieneer.com>
reported that it broke an existing application, and that madvise() works
without error on anonymous mappings on Solaris.
This means that madvise() will remain non-standards-compliant: we should
return -EBADF for all requests against non-file-backed vma's, but Linux only
does this for MADV_WILLNEED requests.
Signed-off-by: Suzuki K P <suzuki@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Here is a compatibility fix between Linux and Solaris when used with VxFS
filesystems: Solaris usually accepts acl entries in any order, but with
VxFS it replies with NFSERR_INVAL when it sees a four-entry acl that is not
in canonical form. It may also fail with other non-canonical acls -- I
can't tell, because that case never triggers: We only send non-canonical
acls when we fake up an ACL_MASK entry.
Instead of adding fake ACL_MASK entries at the end, inserting them in the
correct position makes Solaris+VxFS happy. The Linux client and server
sides don't care about entry order. The three-entry-acl special case in
which we need a fake ACL_MASK entry was handled in xdr_nfsace_encode. The
patch moves this into nfsacl_encode.
Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Latchesar Ionkov [Tue, 11 Oct 2005 15:29:03 +0000 (08:29 -0700)]
[PATCH] v9fs: remove additional buffer allocation from v9fs_file_read and v9fs_file_write
v9fs_file_read and v9fs_file_write use kmalloc to allocate buffers as big
as the data buffer received as parameter. kmalloc cannot be used to
allocate buffers bigger than 128K, so reading/writing data in chunks bigger
than 128k fails.
This patch reorganizes v9fs_file_read and v9fs_file_write to allocate only
buffers as big as the maximum data that can be sent in one 9P message.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Cc: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Abhay Salunke [Tue, 11 Oct 2005 15:29:02 +0000 (08:29 -0700)]
[PATCH] dell_rbu: changes in packet update mechanism
In the current dell_rbu code ver 2.0 the packet update mechanism makes the
user app dump every individual packet in to the driver.
This adds in efficiency as every packet update makes the
/sys/class/firmware/dell_rbu/loading and data files to disappear and reappear
again. Thus the user app needs to wait for the files to reappear to dump
another packet. This slows down the packet update tremendously in case of
large number of packets. I am submitting a new patch for dell_rbu which will
change the way we do packet updates;
In the new method the user app will create a new single file which has already
packetized the rbu image and all the packets are now staged in this file.
This driver also creates a new entry in
/sys/devices/platform/dell_rbu/packet_size ; the user needs to echo the packet
size here before downloading the packet file.
The user should do the following:
create one single file which has all the packets stacked together.
echo the packet size in to /sys/devices/platform/dell_rbu/packet_size.
echo 1 > /sys/class/firmware/dell_rbu/loading
cat the packetfile > /sys/class/firmware/dell_rbu/data
echo 0 > /sys/class/firmware/dell_rbu/loading
The driver takes the file which came through /sys/class/firmware/dell_rbu/data
and takes chunks of paket_size data from it and place in contiguous memory.
This makes packet update process very efficient and fast. As all the packet
update happens in one single operation. The user can still read back the
downloaded file from /sys/devices/platform/dell_rbu/data.
Anton Blanchard [Tue, 11 Oct 2005 15:29:00 +0000 (08:29 -0700)]
[PATCH] ppc64: Fix PCI hotplug
pSeries_irq_bus_setup is marked __devinit but references s7a_workaround
which is marked __initdata.
Depending on who got the memory for s7a_workaround (and if the value was
now positive), it was possible for PCI hotplugged devices to have 3
subtracted from their interrupt number. This would happen randomly and
caused me much confusion :)
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Cornelia Huck [Tue, 11 Oct 2005 15:28:59 +0000 (08:28 -0700)]
[PATCH] s390: ccw device reconnect oops.
Search for a disconnect ccw_device on the ccw bus rather than on the css
bus (was a typo in patch I did for the klist conversion). A cast to an
embedding ccw_device from an embedded device in a struct subchannel will
lead us to oopses.
Signed-off-by: Cornelia Huck <cohuck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paul Mackerras [Tue, 11 Oct 2005 12:08:12 +0000 (22:08 +1000)]
ppc: Various minor compile fixes
This fixes up a variety of minor problems in compiling with ARCH=ppc
arising from using the merged versions of various header files.
A lot of the changes are just adding #include <asm/machdep.h> to
files that use ppc_md or smp_ops_t.
This also arranges for us to use semaphore.c, vecemu.c, vector.S and
fpu.S from arch/powerpc/kernel when compiling with ARCH=ppc.
Paul Mackerras [Tue, 11 Oct 2005 12:03:09 +0000 (22:03 +1000)]
ppc: Adapt to asm-powerpc/irq.h irq_canonicalize changes
Now instead of having a ppc_md function, we just have a variable
which says whether to do the i8259 irq canonicalization or not,
and set that variable on the platforms that need that. It looks
to me that radstone_ppc7d was trying to use irq canonicalization
for something else in a broken kind of way - it will need to be
fixed properly.
It breaks non-PCI builds, it's probably better to have a more
direct implementation on sparc32, and which driver actually
needs this is still questionable.
We can resolve this in 2.6.15
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Mackerras [Mon, 10 Oct 2005 23:25:40 +0000 (09:25 +1000)]
powerpc: Make building the boot image work for both 32-bit and 64-bit
At the moment we don't have a merged arch/powerpc/boot, so we build the
boot images in arch/ppc/boot and arch/ppc64/boot. Unfortunately the
makefile targets are different in those two directories, so this makes
a change to accommodate both for the moment.
Andi Kleen [Mon, 10 Oct 2005 23:28:33 +0000 (01:28 +0200)]
[PATCH] i386: Don't discard upper 32bits of HWCR on K8
Need to use long long, not long when RMWing a MSR. I think
it's harmless right now, but still should be better fixed
if AMD adds any bits in the upper 32bit of HWCR.
Bug was introduced with the TLB flush filter fix for i386
Andi Kleen [Mon, 10 Oct 2005 20:32:45 +0000 (22:32 +0200)]
[PATCH] x86_64: Allocate cpu local data for all possible CPUs
CPU hotplug fills up the possible map to NR_CPUs, but it did that after
setting up per CPU data. This lead to CPU data not getting allocated
for all possible CPUs, which lead to various side effects.
Harald Welte [Mon, 10 Oct 2005 17:44:29 +0000 (19:44 +0200)]
[PATCH] Fix signal sending in usbdevio on async URB completion
If a process issues an URB from userspace and (starts to) terminate
before the URB comes back, we run into the issue described above. This
is because the urb saves a pointer to "current" when it is posted to the
device, but there's no guarantee that this pointer is still valid
afterwards.
In fact, there are three separate issues:
1) the pointer to "current" can become invalid, since the task could be
completely gone when the URB completion comes back from the device.
2) Even if the saved task pointer is still pointing to a valid task_struct,
task_struct->sighand could have gone meanwhile.
3) Even if the process is perfectly fine, permissions may have changed,
and we can no longer send it a signal.
So what we do instead, is to save the PID and uid's of the process, and
introduce a new kill_proc_info_as_uid() function.
Signed-off-by: Harald Welte <laforge@gnumonks.org>
[ Fixed up types and added symbol exports ] Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David S. Miller [Mon, 10 Oct 2005 23:12:13 +0000 (16:12 -0700)]
[SPARC64]: Fix Ultra5, Ultra60, et al. boot failures.
On the boot processor, we need to do the move onto the Linux trap
table a little bit differently else we'll take unhandlable faults in
the firmware address space.
Previously we would do the following:
1) Disable PSTATE_IE in %pstate.
2) Set %tba by hand to sparc64_ttable_tl0
3) Initialize alternate, mmu, and interrupt global
trap registers.
4) Call prom_set_traptable()
That doesn't work very well actually with the way we boot the kernel
VM these days. It worked by luck on many systems because the firmware
accesses for the prom_set_traptable() call happened to be loaded into
the TLB already, something we cannot assume.
So the new scheme is this:
1) Clear PSTATE_IE in %pstate and set %pil to 15
2) Call prom_set_traptable()
3) Initialize alternate, mmu, and interrupt global
trap registers.
and this works quite well. This sequence has been moved into a
callable function in assembler named setup-trap_table(). The idea is
that eventually trampoline.S can use this code as well. That isn't
possible currently due to some complications, but eventually we should
be able to do it.
Thanks to Meelis Roos for the Ultra5 boot failure report.
Signed-off-by: David S. Miller <davem@davemloft.net>
Undo wrong change in global_flush_tlb. We need to flush the caches in all
cases, not just when pages were reverted. This was a bogus optimization
added earlier, but it was wrong.
Vincent Sanders [Mon, 10 Oct 2005 17:24:09 +0000 (18:24 +0100)]
[ARM] 2968/1: defconfig for the ARM Collie platform
Patch from Vincent Sanders
Add a defconfig for the ARM Collie platform
Signed-off-by: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Vincent Sanders <vince@arm.linux.org.uk> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Vincent Sanders [Mon, 10 Oct 2005 17:24:08 +0000 (18:24 +0100)]
[ARM] 2967/1: defconfig for the ARM Corgi platform
Patch from Vincent Sanders
Add a defconfig for the ARM Corgi Zarus platform
Signed-off-by: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Vincent Sanders <vince@arm.linux.org.uk> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Vincent Sanders [Mon, 10 Oct 2005 17:24:07 +0000 (18:24 +0100)]
[ARM] 2966/1: defconfig for the ARM Poodle platform
Patch from Vincent Sanders
Add a defconfig for the ARM Poodle Zarus platform
Signed-off-by: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Vincent Sanders <vince@arm.linux.org.uk> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Vincent Sanders [Mon, 10 Oct 2005 17:24:06 +0000 (18:24 +0100)]
[ARM] 2965/1: defconfig for the ARM Spitz platform
Patch from Vincent Sanders
Add a defconfig for the ARM Spitz Zarus platform
Signed-off-by: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Vincent Sanders <vince@arm.linux.org.uk> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Nicolas Pitre [Mon, 10 Oct 2005 17:22:17 +0000 (18:22 +0100)]
[ARM] 2956/1: fix the "Fix gcc4 build errors in ucb1x00-core.c"
Patch from Nicolas Pitre
drivers/mfd/ucb1x00-core.c: In function 'ucb1x00_probe':
drivers/mfd/ucb1x00-core.c:482: error: 'ucb1x00_class' undeclared (first use in this function)
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[PATCH] i386: fix stack alignment for signal handlers
This fixes the setup of the alignment of the signal frame, so that all
signal handlers are run with a properly aligned stack frame.
The current code "over-aligns" the stack pointer so that the stack frame
is effectively always mis-aligned by 4 bytes. But what we really want
is that on function entry ((sp + 4) & 15) == 0, which matches what would
happen if the stack were aligned before a "call" instruction.
Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The logic in ide_do_request() doesn't guarantee that both drives will be
serviced after a call. It may "forget" to service one in some
circumstances, including when one of the drive is suspended (it will
eventually fail to service the slave when the master is suspended for
example). This prevents the wakeup requests that gets queued on wakeup
from sleep from beeing serviced in some cases when 2 drives are sharing
an IDE bus.
The problem is deep enough in the way this code works (and there are
probably a few other problematic but rare corner cases) and fixing it
would require some major rethinking of the way IDE decides which channel
to service. This is not 2.6.14 material. However, in the meantime,
Bart has accepted this simple workaround that will fix the crash on
wakeup from sleep since this specific corner case is actually hitting
users to get into 2.6.14.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jeff Dike [Sun, 9 Oct 2005 20:11:44 +0000 (16:11 -0400)]
[PATCH] uml: fix x86_64 with !CONFIG_FRAME_POINTER
UML/x86_64 doesn't run when built with frame pointers disabled. There
was an implicit frame pointer assumption in the stub segfault handler.
With frame pointers disabled, UML dies on handling its first page fault.
The container-of part of this is from Paolo Giarrusso <blaisorblade@yahoo.it>.
Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] x86_64: Set up safe page tables during resume
The following patch makes swsusp avoid the possible temporary corruption
of page translation tables during resume on x86-64. This is achieved by
creating a copy of the relevant page tables that will not be modified by
swsusp and can be safely used by it on resume.
The problem is that during resume on x86-64 swsusp may temporarily
corrupt the page tables used for the direct mapping of RAM. If that
happens, a page fault occurs and cannot be handled properly, which leads
to the solid hang of the affected system. This leads to the loss of the
system's state from before suspend and may result in the loss of data or
the corruption of filesystems, so it is a serious issue. Also, it
appears to happen quite often (for me, as often as 50% of the time).
The problem is related to the fact that (at least) one of the PMD
entries used in the direct memory mapping (starting at PAGE_OFFSET)
points to a page table the physical address of which is much greater
than the physical address of the PMD entry itself. Moreover,
unfortunately, the physical address of the page table before suspend
(i.e. the one stored in the suspend image) happens to be different to
the physical address of the corresponding page table used during resume
(i.e. the one that is valid right before swsusp_arch_resume() in
arch/x86_64/kernel/suspend_asm.S is executed). Thus while the image is
restored, the "offending" PMD entry gets overwritten, so it does not
point to the right physical address any more (i.e. there's no page
table at the address pointed to by it, because it points to the address
the page table has been at during suspend). Consequently, if the PMD
entry is used later on, and it _is_ used in the process of copying the
image pages, a page fault occurs, but it cannot be handled in the normal
way and the system hangs.
In principle we can call create_resume_mapping() from
swsusp_arch_resume() (ie. from suspend_asm.S), but then the memory
allocations in create_resume_mapping(), resume_pud_mapping(), and
resume_pmd_mapping() must be made carefully so that we use _only_
NosaveFree pages in them (the other pages are overwritten by the loop in
swsusp_arch_resume()). Additionally, we are in atomic context at that
time, so we cannot use GFP_KERNEL. Moreover, if one of the allocations
fails, we should free all of the allocated pages, so we need to trace
them somehow.
All of this is done in the appended patch, except that the functions
populating the page tables are located in arch/x86_64/kernel/suspend.c
rather than in init.c. It may be done in a more elegan way in the
future, with the help of some swsusp patches that are in the works now.
[AK: move some externs into headers, renamed a function]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] uml: cleanup byte order macros for COW driver
After restoring the existing code, make it work also when included in
kernelspace code (which isn't currently the case, but at least this will prevent
people from "fixing" it as just happened).
Whitespace is fixed in next patch - it cluttered the diff too much.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It broke because:
a) because this part doesn't fall under the description
b) the author didn't know what he was doing here
c) the author didn't try to compile the existing code and see that it worked
perfectly.
d) the author didn't ask us what was happening
e) you didn't either, and somebody there should have learned that UML is a bit
different.
In fact, UML is special in linking to host libc and using its includes.
In particular, since host includes always define both __BIG_ENDIAN and
__LITTLE_ENDIAN, ntohll() macros started thinking to be in a big-endian world;
and on-disk compatibility was broken.
Many thanks go to Nix for reporting the problem and correctly diagnosing an
endianness problem.
Btw, this patch restores the previous code, which worked; but the definitions
would be uncorrect if used in kernelspace files.
Next patch addresses that.
Cc: Nix <nix@esperi.org.uk>, Olaf Hering <olh@suse.de> Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Linus Torvalds <torvalds@osdl.org>