Lachlan McIlroy [Mon, 23 Jun 2008 03:25:02 +0000 (13:25 +1000)]
[XFS] fix extent corruption in xfs_iext_irec_compact_full()
This function is used to compact the indirect extent list by moving
extents from one page to the previous to fill them up. After we move some
extents to an earlier page we need to shuffle the remaining extents to the
start of the page. The actual bug here is the second argument to memmove()
needs to index past the extents, that were copied to the previous page,
and move the remaining extents. For pages that are already full (ie
ext_avail == 0) the compaction code has no net effect so don't do it.
SGI-PV: 983337
SGI-Modid: xfs-linux-melb:xfs-kern:31332a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
Lachlan McIlroy [Mon, 23 Jun 2008 03:23:57 +0000 (13:23 +1000)]
[XFS] make inode reclaim wait for log I/O to complete
During a forced shutdown a xfs inode can be destroyed before log I/O
involving that inode is complete. We need to wait for the inode to be
unpinned before tearing it down. Version 2 cleans up the code a bit by
relying on xfs_iflush() to do the unpinning and forced shutdown check.
SGI-PV: 981240
SGI-Modid: xfs-linux-melb:xfs-kern:31326a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com>
Eric Sandeen [Mon, 23 Jun 2008 03:23:32 +0000 (13:23 +1000)]
[XFS] Pack some shortform dir2 structures for the ARM old ABI
architecture.
This should fix the longstanding issues with xfs and old ABI arm boxes,
which lead to various asserts and xfs shutdowns, and for which an
(incorrect) patch has been floating around for years.
I've verified this patch by comparing the on-disk structure layouts using
pahole from the dwarves package, as well as running through a bit of xfsqa
under qemu-arm, modified so that the check/repair phase after each test
actually executes check/repair from the x86 host, on the filesystem
populated by the arm emulator. Thus far it all looks good.
There are 2 other structures with extra padding at the end, but they don't
seem to cause trouble. I suppose they could be packed as well:
xfs_dir2_data_unused_t and xfs_dir2_sf_t.
Note that userspace needs a similar treatment, and any filesystems which
were running with the previous rogue "fix" will now see corruption (either
in the kernel, or during xfs_repair) with this fix properly in place; it
may be worth teaching xfs_repair to identify and fix that specific issue.
SGI-PV: 982930
SGI-Modid: xfs-linux-melb:xfs-kern:31280a
Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Lachlan McIlroy [Mon, 23 Jun 2008 03:23:01 +0000 (13:23 +1000)]
[XFS] Use the generic xattr methods.
Use the generic set, get and removexattr methods and supply the s_xattr
array with fine-grained handlers. All XFS/Linux highlevel attr handling is
rewritten from scratch and placed into fs/xfs/linux-2.6/xfs_xattr.c so
that it's separated from the generic low-level code.
SGI-PV: 982343
SGI-Modid: xfs-linux-melb:xfs-kern:31234a
Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Barry Naujok [Mon, 16 Jun 2008 02:07:41 +0000 (12:07 +1000)]
[XFS] Invalidate dentry in unlink/rmdir if in case-insensitive mode
The vfs_unlink/d_delete functionality in the Linux VFS make the
dentry negative if it is the only inode being referenced. Case-insensitive
mode doesn't work with negative dentries, so if using CI-mode, invalidate
the dentry on unlink/rmdir.
Barry Naujok [Tue, 3 Jun 2008 01:59:18 +0000 (11:59 +1000)]
[XFS] Zero uninitialised xfs_da_args structure in xfs_dir2.c
Fixes a problem in the xfs_dir2_remove and xfs_dir2_replace paths which
intenally call directory format specific lookup funtions that assume
args->cmpresult is zeroed.
Barry Naujok [Wed, 21 May 2008 06:58:55 +0000 (16:58 +1000)]
[XFS] XFS: ASCII case-insensitive support
Implement ASCII case-insensitive support. It's primary purpose is for
supporting existing filesystems that already use this case-insensitive
mode migrated from IRIX. But, if you only need ASCII-only case-insensitive
support (ie. English only) and will never use another language, then this
mode is perfectly adequate.
ASCII-CI is implemented by generating hashes based on lower-case letters
and doing lower-case compares. It implements a new xfs_nameops vector for
doing the hashes and comparisons for all filename operations.
To create a filesystem with this CI mode, use: # mkfs.xfs -n version=ci
<device>
Barry Naujok [Wed, 21 May 2008 06:58:22 +0000 (16:58 +1000)]
[XFS] Return case-insensitive match for dentry cache
This implements the code to store the actual filename found during a
lookup in the dentry cache and to avoid multiple entries in the dcache
pointing to the same inode.
To avoid polluting the dcache, we implement a new directory inode
operations for lookup. xfs_vn_ci_lookup() stores the correct case name in
the dcache.
The "actual name" is only allocated and returned for a case- insensitive
match and not an actual match.
Another unusual interaction with the dcache is not storing negative
dentries like other filesystems doing a d_add(dentry, NULL) when an ENOENT
is returned. During the VFS lookup, if a dentry returned has no inode,
dput is called and ENOENT is returned. By not doing a d_add, this actually
removes it completely from the dcache to be reused. create/rename have to
be modified to support unhashed dentries being passed in.
Barry Naujok [Wed, 21 May 2008 06:50:46 +0000 (16:50 +1000)]
dcache: Add case-insensitive support d_ci_add() routine
This add a dcache entry to the dcache for lookup, but changing the name
that is associated with the entry rather than the one passed in to the
lookup routine.
First, it sees if the case-exact match already exists in the dcache and
uses it if one exists. Otherwise, it allocates a new node with the new
name and splices it into the dcache.
Original code from ntfs_lookup in fs/ntfs/namei.c by Anton Altaparmakov.
Signed-off-by: Barry Naujok <bnaujok@sgi.com> Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Acked-by: Christoph Hellwig <hch@infradead.org>
Barry Naujok [Wed, 21 May 2008 06:42:05 +0000 (16:42 +1000)]
[XFS] Add op_flags field and helpers to xfs_da_args
The end of the xfs_da_args structure has 4 unsigned char fields for
true/false information on directory and attr operations using the
xfs_da_args structure.
The following converts these 4 into a op_flags field that uses the first 4
bits for these fields and allows expansion for future operation
information (eg. case-insensitive lookup request).
Barry Naujok [Wed, 21 May 2008 06:41:01 +0000 (16:41 +1000)]
[XFS] Name operation vector for hash and compare
Adds two pieces of functionality for the basis of case-insensitive support
in XFS:
1. A comparison result enumerated type: xfs_dacmp. It represents an
exact match, case-insensitive match or no match at all. This patch
only implements different and exact results.
2. xfs_nameops vector for specifying how to perform the hash generation
of filenames and comparision methods. In this patch the hash vector
points to the existing xfs_da_hashname function and the comparison
method does a length compare, and if the same, does a memcmp and
return the xfs_dacmp result.
All filename functions that use the hash (create, lookup remove, rename,
etc) now use the xfs_nameops.hashname function and all directory lookup
functions also use the xfs_nameops.compname function.
The lookup functions also handle case-insensitive results even though the
default comparison function cannot return that. And important aspect of
the lookup functions is that an exact match always has precedence over a
case-insensitive. So while a case-insensitive match is found, we have to
keep looking just in case there is an exact match. In the meantime, the
info for the first case-insensitive match is retained if no exact match is
found.
Eric Sandeen [Tue, 20 May 2008 05:11:17 +0000 (15:11 +1000)]
[XFS]
de-duplicate calls to xfs_attr_trace_enter
Every call to xfs_attr_trace_enter() shares the exact same 16 args in the
middle... just send in the context pointer and let the next level down
split it into the ktrace.
[XFS] kill calls to xfs_binval in the mount error path
xfs_binval aka xfs_flush_buftarg is the first thing done in
xfs_free_buftarg, so there is no need to have duplicated calls just before
xfs_free_buftarg in the mount failure path.
xfs_mount_init is inlined into xfs_fs_fill_super and allocation switched
to kzalloc. Plug a leak of the mount structure for most early mount
failures. Move xfs_icsb_init_counters to as late as possible in the mount
path and make sure to undo it so that no stale hotplug cpu notifiers are
left around on mount failures.
xfs_mount is already pretty linux-specific so merge it into
xfs_fs_fill_super to allow for a more structured mount code in the next
patches. xfs_start_flags and xfs_finish_flags also move to xfs_super.c.
[XFS] merge xfs_unmount into xfs_fs_put_super / xfs_fs_fill_super
xfs_unmount is small and already pretty Linux specific, so merge it into
the callers. The real unmount path is simplified a little by doing a
WARN_ON on the xfs_unmount_flush retval directly instead of propagating
the error back to the caller, and the mout failure case in simplified
significantly by removing the forced shutdown case and all the dmapi
events that shouldn't be sent because the dmapi mount event hasn't been
sent by that time either.
David Chinner [Tue, 20 May 2008 01:30:27 +0000 (11:30 +1000)]
[XFS] Update valid fields in xfs_mount_log_sb()
Recent changes to update the version number during mount (attr2 stuff)
failed to change the assert that checked for calid flags being changed on
mount. Clearly this path hasn't been exercised by the test code....
[XFS] Kill attr_capable checks as already done in xattr_permission.
No need for addition permission checks in the xattr handler,
fs/xattr.c:xattr_permission() already does them, and in fact slightly more
strict then what was in the attr_capable handlers.
Matthew Wilcox [Mon, 19 May 2008 06:34:27 +0000 (16:34 +1000)]
[XFS] Convert l_flushsema to a sv_t
The l_flushsema doesn't exactly have completion semantics, nor mutex
semantics. It's used as a list of tasks which are waiting to be notified
that a flush has completed. It was also being used in a way that was
potentially racy, depending on the semaphore implementation.
By using a sv_t instead of a semaphore we avoid the need for a separate
counter, since we know we just need to wake everything on the queue.
Original waitqueue implementation from Matthew Wilcox. Cleanup and
conversion to sv_t by Christoph Hellwig.
Tim Shimmin [Wed, 30 Apr 2008 08:15:28 +0000 (18:15 +1000)]
[XFS] Fix up noattr2 so that it will properly update the versionnum and
features2 fields.
Previously, mounting with noattr2 failed to achieve anything because
although it cleared the attr2 mount flag, it would set it again as soon as
it processed the superblock fields. The fix now has an explicit noattr2
flag and uses it later to fix up the versionnum and features2 fields.
Merge branch 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: Ensure we call nfs_sb_deactive() after releasing the directory inode
nfs_remount oops when rebooting + possible fix
NFS: Ensure we call nfs_sb_deactive() after releasing the directory inode
In order to avoid the "Busy inodes after unmount" error message, we need to
ensure that nfs_async_unlink_release() releases the super block after the
call to nfs_free_unlinkdata().
The new super.c:nfs_remount function doesn't check the validity of the
options/options4 pointers. Unfortunately, this seems to happend.
The obvious patch seems to check the pointers, and not to do anything if
the happend to be NULL.
Tested on an XScale PXA255 system, latest git.
Regards,
M.
Signed-off-by: Marc Zyngier <marc.zyngier@altran.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Thomas Gleixner [Sun, 27 Jul 2008 19:43:11 +0000 (21:43 +0200)]
x86: fix cpu hotplug on 32bit
commit 3e9704739daf46a8ba6593d749c67b5f7cd633d2 ("x86: boot secondary
cpus through initial_code") causes the kernel to crash when a CPU is
brought online after the read only sections have been write
protected. The write to initial_code in do_boot_cpu() fails.
Move inital_code to .cpuinit.data section.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: H. Peter Anvin <hpa@zytor.com>
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: state userland requirements in Kconfig help
firewire: avoid memleak after phy config transmit failure
firewire: fw-ohci: TSB43AB22/A dualbuffer workaround
firewire: queue the right number of data
firewire: warn on unfinished transactions during card removal
firewire: small fw_fill_request cleanup
firewire: fully initialize fw_transaction before marking it pending
firewire: fix race of bus reset with request transmission
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (59 commits)
[SCSI] replace __FUNCTION__ with __func__
[SCSI] extend the last_sector_bug flag to cover more sectors
[SCSI] qla2xxx: Update version number to 8.02.01-k6.
[SCSI] qla2xxx: Additional NPIV corrections.
[SCSI] qla2xxx: suppress uninitialized-var warning
[SCSI] qla2xxx: use memory_read_from_buffer()
[SCSI] qla2xxx: Issue proper ISP callbacks during stop-firmware.
[SCSI] ch: fix ch_remove oops
[SCSI] 3w-9xxx: add MSI support and misc fixes
[SCSI] scsi_lib: use blk_rq_tagged in scsi_request_fn
[SCSI] ibmvfc: Update driver version to 1.0.1
[SCSI] ibmvfc: Add ADISC support
[SCSI] ibmvfc: Miscellaneous fixes
[SCSI] ibmvfc: Fix hang on module removal
[SCSI] ibmvfc: Target refcounting fixes
[SCSI] ibmvfc: Reduce unnecessary log noise
[SCSI] sym53c8xx: free luntbl in sym_hcb_free
[SCSI] scsi_scan.c: Release mutex in error handling code
[SCSI] scsi_eh_prep_cmnd should save scmd->underflow
[SCSI] sd: Support for SCSI disk (SBC) Data Integrity Field
...
* git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6:
avr32: some mmc/sd cleanups
include/video/atmel_lcdc.h must #include <linux/workqueue.h>
avr32: allow system timer to share interrupt to make OProfile work
drivers/misc/atmel-ssc.c: Removed duplicated include
avr32: Add platform data for AC97C platform device
avr32: clean up mci platform code
fix avr32 build errors
Merge branch 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
KVM: ppc: fix invalidation of large guest pages
KVM: s390: Fix possible host kernel bug on lctl(g) handling
KVM: s390: Fix instruction naming for lctlg
KVM: s390: Fix program check on interrupt delivery handling
KVM: s390: Change guestaddr type in gaccess
KVM: s390: Fix guest kconfig
KVM: s390: Advertise KVM_CAP_USER_MEMORY
KVM: ia64: Fix irq disabling leak in error handling code
KVM: VMX: Fix undefined beaviour of EPT after reload kvm-intel.ko
KVM: VMX: Fix bypass_guest_pf enabling when disable EPT in module parameter
KVM: task switch: translate guest segment limit to virt-extension byte granular field
KVM: Avoid instruction emulation when event delivery is pending
KVM: task switch: use seg regs provided by subarch instead of reading from GDT
KVM: task switch: segment base is linear address
KVM: SVM: allow enabling/disabling NPT by reloading only the architecture module
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (25 commits)
setlocalversion: do not describe if there is nothing to describe
kconfig: fix typos: "Suport" -> "Support"
kconfig: make defconfig is no longer chatty
kconfig: make oldconfig is now less chatty
kconfig: speed up all*config + randconfig
kconfig: set all new symbols automatically
kconfig: add diffconfig utility
kbuild: remove Module.markers during mrproper
kbuild: sparse needs CF not CHECKFLAGS
kernel-doc: handle/strip __init
vmlinux.lds: move __attribute__((__cold__)) functions back into final .text section
init: fix URL of "The GNU Accounting Utilities"
kbuild: add arch/$ARCH/include to search path
kbuild: asm symlink support for arch/$ARCH/include
kbuild: support arch/$ARCH/include for tags, cscope
kbuild: prepare headers_* for arch/$ARCH/include
kbuild: install all headers when arch is changed
kbuild: make clean removes *.o.* as well
kbuild: optimize headers_* targets
kbuild: only one call for include/ in make headers_*
...
Andrea Righi [Sun, 27 Jul 2008 15:29:15 +0000 (17:29 +0200)]
task IO accounting: improve code readability
Put all i/o statistics in struct proc_io_accounting and use inline functions to
initialize and increment statistics, removing a lot of single variable
assignments.
This also reduces the kernel size as following (with CONFIG_TASK_XACCT=y and
CONFIG_TASK_IO_ACCOUNTING=y).
text data bss dec hex filename
11651 0 0 11651 2d83 kernel/exit.o.before
11619 0 0 11619 2d63 kernel/exit.o.after
10886 132 136 11154 2b92 kernel/fork.o.before
10758 132 136 11026 2b12 kernel/fork.o.after
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: Allow to force model to intel-mac-v3 in snd_hda_intel (sigmatel).
ALSA: cs4232: fix crash during chip PNP detection
ALSA: hda - Add automatic model setting for the Acer Aspire 5920G laptop
ALSA: make snd_ac97_add_vmaster() static
ALSA: sound/pci/azt3328.h: no variables for enums
ALSA: soc - wm9712 mono mixer
ALSA: hda - Add support of ASUS Eeepc P90*
ALSA: opti9xx: no isapnp param for !CONFIG_PNP
ALSA: opti93x - Fix NULL dereference
ALSA: hda - Added support for Asus V1Sn
ALSA: ASoC: Factor PGA DAPM handling into main
ALSA: ASoC: Refactor DAPM event handler
ALSA: ALSA: ens1370: communicate PCI device to AC97
ALSA: ens1370: SRC stands for Sample Rate Converter
ALSA: hda - Align BDL position adjustment parameter
ALSA: Au1xpsc: psc not disabled when TX is idle
ALSA: add TriTech 28023 AC97 codec ID and Wolfson 9701 name.
Al Viro [Sun, 27 Jul 2008 07:59:33 +0000 (08:59 +0100)]
missing bits of net-namespace / sysctl
Piss-poor sysctl registration API strikes again, film at 11...
What we really need is _pathname_ required to be present in already
registered table, so that kernel could warn about bad order. That's the
next target for sysctl stuff (and generally saner and more explicit
order of initialization of ipv[46] internals wouldn't hurt either).
For the time being, here are full fixups required by ..._rotable()
stuff; we make per-net sysctl sets descendents of "ro" one and make sure
that sufficient skeleton is there before we start registering per-net
sysctls.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Harvey Harrison [Fri, 4 Jul 2008 06:47:27 +0000 (23:47 -0700)]
[SCSI] replace __FUNCTION__ with __func__
[jejb: fixed up a ton of missed conversions.
All of you are on notice this has happened, driver trees will now
need to be rebased]
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: SCSI List <linux-scsi@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Alan Jenkins [Sun, 27 Jul 2008 08:38:42 +0000 (09:38 +0100)]
[SCSI] extend the last_sector_bug flag to cover more sectors
The last_sector_bug flag was added to work around a bug in certain usb
cardreaders, where they would crash if a multiple sector read included the
last sector. The original implementation avoids this by e.g. splitting an 8
sector read which includes the last sector into a 7 sector read, and a single
sector read for the last sector. The flag is enabled for all USB devices.
This revealed a second bug in other usb cardreaders, which crash when they
get a multiple sector read which stops 1 sector short of the last sector.
Affected hardware includes the Kingston "MobileLite" external USB cardreader
and the internal USB cardreader on the Asus EeePC.
Extend the last_sector_bug workaround to ensure that any access which touches
the last 8 hardware sectors of the device is a single sector long. Requests
are shrunk as necessary to meet this constraint.
This gives us a safety margin against potential unknown or future bugs
affecting multi-sector access to the end of the device. The two known bugs
only affect the last 2 sectors. However, they suggest that these devices
are prone to fencepost errors and that multi-sector access to the end of the
device is not well tested. Popular OS's use multi-sector accesses, but they
rarely read the last few sectors. Linux (with udev & vol_id) automatically
reads sectors from the end of the device on insertion. It is assumed that
single sector accesses are more thoroughly tested during development.
Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
David Brownell [Sun, 27 Jul 2008 09:34:45 +0000 (02:34 -0700)]
avr32: some mmc/sd cleanups
Minor cleanups for the MMC/SD support on avr32:
- Make at32_add_device_mci() properly initialize "missing"
platform data ... so boards like STK1002 won't try GPIO 0.
- Switch over to gpio_is_valid() instead of testing for only
one designated value.
- Provide STK1002 platform data for the unlikely case that
switches are set so first Ethernet controller isn't in use.
(That's the only way to get card detect and writeprotect
switch sensing on the STK1000.)
And get rid of one "unused variable" warning.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Russell King [Sun, 27 Jul 2008 09:35:54 +0000 (10:35 +0100)]
[ARM] Fix shared mmap when more than two maps of the same file exist
The shared mmap code works fine for the test case, which only checked
for two shared maps of the same file. However, three shared maps
result in one mapping remaining cached, resulting in stale data being
visible via that mapping. Fix this.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
... VIVT only stuff
... VIPT only stuff
... VIVT or VIPT stuff
which is clearly bogus - we would only ever use the "VIVT or VIPT" case
when both VIVT and VIPT are not selected. Fix this.
Add comments to each case, including noting the impossibility of
correctly detecting the cache type of ARM926 and ARMv6 cores from
the cache type register in the "VIVT or VIPT" case.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When guest invalidates a large tlb map, there may be more than one
corresponding shadow tlb maps that need to be invalidated. Use eaddr and eend
to find these shadow tlb maps.
Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
KVM: s390: Fix possible host kernel bug on lctl(g) handling
The lctl(g) instructions require a specific alignment for the parameters.
The architecture requires a specification program check if these alignments
are not used. Enforcing this alignment also removes a possible host BUG,
since the get_guest functions check for proper alignment and emits a BUG.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
KVM: s390: Fix program check on interrupt delivery handling
The current interrupt handling on s390 misbehaves on an error case. On s390
each cpu has the prefix area (lowcore) for interrupt delivery. This memory
must always be available. If we fail to access the prefix area for a guest
on interrupt delivery the configuration is completely unusable. There is no
point in sending another program interrupt to an inaccessible lowcore.
Furthermore, we should not bug the host kernel, because this can be triggered
by userspace. I think the guest kernel itself can not trigger the problem, as
SET PREFIX and SIGNAL PROCESSOR SET PREFIX both check that the memory is
available and sane. As this is a userspace bug (e.g. setting the wrong guest
offset, unmapping guest memory) we should kill the userspace process instead
of BUGing the host kernel.
In the long term we probably should notify the userspace process about this
problem.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
All registers are unsigned long types. This patch changes all occurences
of guestaddr in gaccess from u64 to unsigned long.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
The virtio transport only works with kvm guest support and only as a
builtin. Lets change the build process of drivers/s390/kvm/kvm_virtio.c
to depend on kvm guest support, which is also a bool.
CONFIG_S390_GUEST already selects CONFIG_VIRTIO, that should prevent
CONFIG_S390_GUEST=y CONFIG_VIRTIO=n situations.
CC: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
Julia Lawall [Tue, 22 Jul 2008 19:38:18 +0000 (21:38 +0200)]
KVM: ia64: Fix irq disabling leak in error handling code
There is a call to local_irq_restore in the normal exit case, so it would
seem that there should be one on an error return as well.
The semantic patch that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@@
expression l;
expression E,E1,E2;
@@
local_irq_save(l);
... when != local_irq_restore(l)
when != spin_unlock_irqrestore(E,l)
when any
when strict
(
if (...) { ... when != local_irq_restore(l)
when != spin_unlock_irqrestore(E1,l)
+ local_irq_restore(l);
return ...;
}
|
if (...)
+ {local_irq_restore(l);
return ...;
+ }
|
spin_unlock_irqrestore(E2,l);
|
local_irq_restore(l);
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Avi Kivity <avi@qumranet.com>
Avi Kivity [Sat, 19 Jul 2008 05:57:05 +0000 (08:57 +0300)]
KVM: Avoid instruction emulation when event delivery is pending
When an event (such as an interrupt) is injected, and the stack is
shadowed (and therefore write protected), the guest will exit. The
current code will see that the stack is shadowed and emulate a few
instructions, each time postponing the injection. Eventually the
injection may succeed, but at that time the guest may be unwilling
to accept the interrupt (for example, the TPR may have changed).
This occurs every once in a while during a Windows 2008 boot.
Fix by unshadowing the fault address if the fault was due to an event
injection.
KVM: task switch: use seg regs provided by subarch instead of reading from GDT
There is no guarantee that the old TSS descriptor in the GDT contains
the proper base address. This is the case for Windows installation's
reboot-via-triplefault.
Use guest registers instead. Also translate the address properly.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
KVM: SVM: allow enabling/disabling NPT by reloading only the architecture module
If NPT is enabled after loading both KVM modules on AMD and it should be
disabled, both KVM modules must be reloaded. If only the architecture module is
reloaded the behavior is undefined. With this patch it is possible to disable
NPT only by reloading the kvm_amd module.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
Nicolas Boichat [Mon, 21 Jul 2008 14:18:01 +0000 (22:18 +0800)]
ALSA: Allow to force model to intel-mac-v3 in snd_hda_intel (sigmatel).
Currently, even if you pass model=intel-mac-v3 as a module parameter to
snd_hda_intel, the function patch_stac922x (patch_sigmatel.c) will still
try to auto-detect the model type. This is a problem on my MacBook Pro 1st
generation, which needs intel-mac-v3, but sometimes incorrectly reports
0x00000100 as subsystem id, which causes the switch in patch_stac922x to
select intel-mac-v4.
To fix this, I added a new model called intel-mac-auto, so in case no
module parameter is passed, and an Intel Mac board is detected, the
model will be automatically detected, while no detection will be done
if the model is forced to intel-mac-v3.
This problem has been around for quite a while, and I used to fix it
by moving the case statement for 0x00000100 in patch_stac922x so that
intel-mac-v3 is chosen.
Another way to fix the problem would be to check if a module parameter
was set directly in patch_stac922x, using something like this:
if (spec->board_config == STAC_INTEL_MAC_V3 &&
!codec->bus->modelname) {
But I think it is less elegant (if you prefer that way, I can prepare a
patch).
Signed-off-by: Nicolas Boichat <nicolas@boichat.ch> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
mlx4: Update/add Mellanox Technologies copyright lines to mlx4 driver files
mlx4_core: Add VLAN tag field to WQE control segment struct
RDMA/nes: CM connection setup/teardown rework
IPoIB: Correct help text for INFINIBAND_IPOIB_DEBUG
IPoIB/cm: Connected mode is no longer EXPERIMENTAL
RDMA/ucm: BKL is not needed for ib_ucm_open()
RDMA/ucma: BKL is not needed for ucma_open()
* git://git.infradead.org/mtd-2.6: (57 commits)
[MTD] [NAND] subpage read feature as a way to increase performance.
CPUFREQ: S3C24XX NAND driver frequency scaling support.
[MTD][NAND] au1550nd: remove unused variable
[MTD] jedec_probe: Fix SST 16-bit chip detection
[MTD][MTDPART] Fix a division by zero bug
[MTD][MTDPART] Cleanup and document the erase region handling
[MTD][MTDPART] Handle most checkpatch findings
[MTD][MTDPART] Seperate main loop from per-partition code in add_mtd_partition
[MTD] physmap: resume already suspended chips on failure to suspend
[MTD] physmap: Fix suspend/resume/shutdown bugs.
[MTD] [NOR] Fix -ETIMEO errors in CFI driver
[MTD] [NAND] fsl_elbc_nand: fix section mismatch with CONFIG_MTD_OF_PARTS=y
[JFFS2] Use .unlocked_ioctl
[MTD] Fix const assignment in the MTD command line partitioning driver
[MTD] [NOR] gen_probe: No debug message when debugging is disabled
[MTD] [NAND] remove __PPC__ hardcoded address from DiskOnChip drivers
[MTD] [MAPS] Remove the bast-flash driver.
[MTD] [NAND] fsl_elbc_nand: ecclayout cleanups
[MTD] [NAND] fsl_elbc_nand: implement support for flash-based BBT
[MTD] [NAND] fsl_elbc_nand: fix OOB workability for large page NAND chips
...
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
atmel-mci: debugfs support
mmc: Add per-card debugfs support
mmc: Export internal host state through debugfs
imxmmc: fix crash when no platform data is provided
imxmmc: fix platform resources
imxmmc: remove DEBUG definition
mmc_spi: put signals to low power off fix
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (39 commits)
[PATCH] fix RLIM_NOFILE handling
[PATCH] get rid of corner case in dup3() entirely
[PATCH] remove remaining namei_{32,64}.h crap
[PATCH] get rid of indirect users of namei.h
[PATCH] get rid of __user_path_lookup_open
[PATCH] f_count may wrap around
[PATCH] dup3 fix
[PATCH] don't pass nameidata to __ncp_lookup_validate()
[PATCH] don't pass nameidata to gfs2_lookupi()
[PATCH] new (local) helper: user_path_parent()
[PATCH] sanitize __user_walk_fd() et.al.
[PATCH] preparation to __user_walk_fd cleanup
[PATCH] kill nameidata passing to permission(), rename to inode_permission()
[PATCH] take noexec checks to very few callers that care
Re: [PATCH 3/6] vfs: open_exec cleanup
[patch 4/4] vfs: immutable inode checking cleanup
[patch 3/4] fat: dont call notify_change
[patch 2/4] vfs: utimes cleanup
[patch 1/4] vfs: utimes: move owner check into inode_change_ok()
[PATCH] vfs: use kstrdup() and check failing allocation
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
netns: fix ip_rt_frag_needed rt_is_expired
netfilter: nf_conntrack_extend: avoid unnecessary "ct->ext" dereferences
netfilter: fix double-free and use-after free
netfilter: arptables in netns for real
netfilter: ip{,6}tables_security: fix future section mismatch
selinux: use nf_register_hooks()
netfilter: ebtables: use nf_register_hooks()
Revert "pkt_sched: sch_sfq: dump a real number of flows"
qeth: use dev->ml_priv instead of dev->priv
syncookies: Make sure ECN is disabled
net: drop unused BUG_TRAP()
net: convert BUG_TRAP to generic WARN_ON
drivers/net: convert BUG_TRAP to generic WARN_ON
Randy Dunlap [Sat, 26 Jul 2008 22:22:28 +0000 (15:22 -0700)]
firmware: fix memmap printk format warnings
Fix firmware/memmap printk format warnings:
drivers/firmware/memmap.c:156: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'resource_size_t'
drivers/firmware/memmap.c:161: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'resource_size_t'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Bernhard Walle <bwalle@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adrian Bunk [Sat, 26 Jul 2008 22:22:28 +0000 (15:22 -0700)]
mm/util.c must #include <linux/sched.h>
mm/util.c: In function 'arch_pick_mmap_layout':
mm/util.c:144: error: dereferencing pointer to incomplete type
mm/util.c:145: error: 'arch_get_unmapped_area' undeclared (first use in this function)
mm/util.c:145: error: (Each undeclared identifier is reported only once
mm/util.c:145: error: for each function it appears in.)
mm/util.c:146: error: 'arch_unmap_area' undeclared (first use in this function)
Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Righi [Sat, 26 Jul 2008 22:22:27 +0000 (15:22 -0700)]
hugetlb: remove unused variable warning
Remove the following warning when CONFIG_HUGETLB_PAGE is not set:
ipc/shm.c: In function `shm_get_stat':
ipc/shm.c:565: warning: unused variable `h'
[akpm@linux-foundation.org: use tabs, not spaces] Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>