Jacek Galowicz [Tue, 22 Nov 2011 00:09:01 +0000 (11:09 +1100)]
lguest: switch segment-voodoo-numbers to readable symbols
When studying lguest's x86 segment descriptor code, it is not longer
necessary to have the Intel x86 architecture manual open on the page
with the segment descriptor illustration to understand the crazy
numbers assigned to both descriptor structure halves a/b.
Now the struct desc_struct's fields, like suggested by
Glauber de Oliveira Costa in 2008, are used.
Signed-off-by: Jacek Galowicz <jacek@galowicz.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Kevin Cernekee [Sun, 13 Nov 2011 03:08:56 +0000 (19:08 -0800)]
module: Fix performance regression on modules with large symbol tables
Commit 554bdfe5acf3715e87c8d5e25a4f9a896ac9f014 (module: reduce string
table for loaded modules) introduced an optimization to shrink the size of
the resident string table. Part of this involves calling bitmap_weight()
on the strmap bitmap once for each core symbol. strmap contains one bit
for each byte of the module's strtab.
For kernel modules with a large number of symbols, the addition of the
bitmap_weight() operation to each iteration of the add_kallsyms() loop
resulted in a significant "insmod" performance regression from 2.6.31
to 2.6.32. bitmap_weight() is expensive when the bitmap is large.
The proposed alternative optimizes the common case in this loop
(is_core_symbol() == true, and the symbol name is not a duplicate), while
penalizing the exceptional case of a duplicate symbol.
My test was run on a 600 MHz MIPS processor, using a kernel module with
15,000 "core" symbols and 10,000 symbols in .init.text. .strtab takes up
250,227 bytes.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Kevin Cernekee <cernekee@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Tue, 22 Nov 2011 00:09:00 +0000 (11:09 +1100)]
virtio: add debugging if driver doesn't kick.
Under the existing #ifdef DEBUG, check that they don't have more than
1/10 of a second between an add_buf() and a
virtqueue_notify()/virtqueue_kick_prepare() call.
We could get false positives on a really busy system, but good for
development.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Tue, 22 Nov 2011 00:09:00 +0000 (11:09 +1100)]
virtio: expose added descriptors immediately.
A virtio driver does virtqueue_add_buf() multiple times before finally
calling virtqueue_kick(); previously we only exposed the added buffers
in the virtqueue_kick() call. This means we don't need a memory
barrier in virtqueue_add_buf(), but it reduces concurrency as the
device (ie. host) can't see the buffers until the kick.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Tue, 22 Nov 2011 00:08:59 +0000 (11:08 +1100)]
virtio: support unlocked queue kick
Based on patch by Christoph for virtio_blk speedup:
Split virtqueue_kick to be able to do the actual notification
outside the lock protecting the virtqueue. This patch was
originally done by Stefan Hajnoczi, but I can't find the
original one anymore and had to recreated it from memory.
Pointers to the original or corrections for the commit message
are welcome.
Rusty Russell [Tue, 22 Nov 2011 00:08:59 +0000 (11:08 +1100)]
virtio: rename virtqueue_add_buf_gfp to virtqueue_add_buf
Remove wrapper functions. This makes the allocation type explicit in
all callers; I used GPF_KERNEL where it seemed obvious, left it at
GFP_ATOMIC otherwise.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Reviewed-by: Christoph Hellwig <hch@lst.de>
virtio pci device reset actually just does an I/O
write, which in PCI is really posted, that is it
can complete on CPU before the device has received it.
Further, interrupts might have been pending on
another CPU, so device callback might get invoked after reset.
This conflicts with how drivers use reset, which is typically:
reset
unregister
a callback running after reset completed can race with
unregister, potentially leading to use after free bugs.
Fix by flushing out the write, and flushing pending interrupts.
This assumes that device is never reset from
its vq/config callbacks, or in parallel with being
added/removed, document this assumption.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Heiko Carstens [Tue, 15 Nov 2011 09:13:24 +0000 (10:13 +0100)]
virtio: add HAS_IOMEM dependency to MMIO platform bus driver
Fix this compile error on s390:
CC [M] drivers/virtio/virtio_mmio.o
drivers/virtio/virtio_mmio.c: In function 'vm_get_features':
drivers/virtio/virtio_mmio.c:107:2: error: implicit declaration of function 'writel'
Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Pawel Moll <pawel.moll@arm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Guenter Roeck [Tue, 22 Nov 2011 00:08:54 +0000 (11:08 +1100)]
hwmon: (lm63) Add support for LM96163
LM96163 is an enhanced version of LM63 with improved PWM resolution. Add chip
detection code as well as support for improved PWM resolution if the chip is
configured to use it.
Eric Miao [Tue, 22 Nov 2011 00:08:52 +0000 (11:08 +1100)]
hwmon: (max1111) Change sysfs interface to in[0-3]_input in millivolts
This patch fixed the inconsistent max1111 sysfs interface as pointed
out by Jean Delvare:
It was pointed to me that the max1111 driver doesn't implement the
standard sysfs interface for hwmon drivers (as described in
Documentation/hwmon/sysfs-interface). It exports files adc[0-3]_in,
which
aren't part of the standard interface. Presumably these should be
renamed to in[0-3]_input. Renaming them is probably not sufficient
though, as I see no scaling done in the driver. As the MAX1111 chip has
a documented full scale of 2.048V, I take it that the LSB of the ADC
has a weight of 8 mV. Exporting raw register values to user-space is
not OK.
Reported-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Eric Miao <eric.y.miao@gmail.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tyler Hicks [Mon, 21 Nov 2011 23:31:29 +0000 (17:31 -0600)]
eCryptfs: Flush file in vma close
Dirty pages weren't being written back when an mmap'ed eCryptfs file was
closed before the mapping was unmapped. Since f_ops->flush() is not
called by the munmap() path, the lower file was simply being released.
This patch flushes the eCryptfs file in the vm_ops->close() path.
Tyler Hicks [Mon, 21 Nov 2011 23:31:02 +0000 (17:31 -0600)]
eCryptfs: Prevent file create race condition
The file creation path prematurely called d_instantiate() and
unlock_new_inode() before the eCryptfs inode info was fully
allocated and initialized and before the eCryptfs metadata was written
to the lower file.
This could result in race conditions in subsequent file and inode
operations leading to unexpected error conditions or a null pointer
dereference while attempting to use the unallocated memory.
The changes to omap4-common.h were moved to arch/arm/mach-omap2/common.h
and the other trivial conflicts resolved. The now empty ifdef in irqs.h
was also eliminated.
Dan Carpenter [Mon, 21 Nov 2011 21:46:24 +0000 (16:46 -0500)]
caif: fix endian conversion in cffrml_transmit()
The "tmp" variable here is used to store the result of cpu_to_le16()
so it should be an __le16 instead of an int. We want the high bits
set and the current code works on little endian systems but not on
big endian systems.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Juhl [Sun, 20 Nov 2011 11:07:09 +0000 (11:07 +0000)]
net, sja1000: Don't include version.h in peak_pci.c when not needed
It was pointed out by "make versioncheck" that we do not need to include
version.h in drivers/net/can/sja1000/peak_pci.c
This patch removes the unneeded include.
Signed-off-by: Jesper Juhl <jj@chaosbits.net> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 18 Nov 2011 17:32:46 +0000 (17:32 +0000)]
netfilter: use jump_label for nf_hooks
On configs where CONFIG_JUMP_LABEL=y, we can replace in fast path a
load/compare/conditional jump by a single jump with no dcache reference.
Jump target is modified as soon as nf_hooks[pf][hook] switches from
empty state to non empty states. jump_label state is kept outside of
nf_hooks array so has no cost on cpu caches.
This patch removes the test on CONFIG_NETFILTER_DEBUG : No need to call
nf_hook_slow() at all if nf_hooks[pf][hook] is empty, this didnt give
useful information, but slowed down things a lot.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Patrick McHardy <kaber@trash.net> CC: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 18 Nov 2011 06:47:01 +0000 (06:47 +0000)]
tg3: switch to build_skb() infrastructure
This is very similar to bnx2x conversion, but simpler since no special
alignement is required, so goal was not to reduce skb truesize.
Using build_skb() reduces cache line misses in the driver, since we
use cache hot skb instead of cold ones. Number of in-flight sk_buff
structures is lower, they are more likely recycled in SLUB caches
while still hot.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Matt Carlson <mcarlson@broadcom.com> CC: Michael Chan <mchan@broadcom.com> CC: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Beulich [Fri, 18 Nov 2011 05:42:05 +0000 (05:42 +0000)]
xen-netback: use correct index for invalidation in xen_netbk_tx_check_gop()
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
We need to mask the MMC irq otherwise if we raise the mmc
interrupts that are not handled the driver loops in the
handler.
In fact, by default all mmc counters (only used for stats)
are managed in SW and registers are cleared on each READ.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Carlson [Thu, 17 Nov 2011 14:30:55 +0000 (14:30 +0000)]
net: Change mii to ethtool advertisement function names
This patch implements advice by Ben Hutchings to change the mii side of
the function names to look more like the register whose values they
convert. New LPA translation functions have been added as well.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 21 Nov 2011 20:11:37 +0000 (12:11 -0800)]
Merge branch 'dev' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'dev' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix up a undefined error in ext4_free_blocks in debugging code
ext4: add blk_finish_plug in error case of writepages.
ext4: Remove kernel_lock annotations
ext4: ignore journalled data options on remount if fs has no journal
Linus Torvalds [Mon, 21 Nov 2011 20:11:13 +0000 (12:11 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: Allocate larger oid buffer in request msgs
ceph: initialize root dentry
ceph: fix iput race when queueing inode work