Eric Dumazet [Tue, 6 Nov 2007 07:38:39 +0000 (23:38 -0800)]
[NET]: Define infrastructure to keep 'inuse' changes in an efficent SMP/NUMA way.
"struct proto" currently uses an array stats[NR_CPUS] to track change on
'inuse' sockets per protocol.
If NR_CPUS is big, this means we use a big memory area for this.
Moreover, all this memory area is located on a single node on NUMA
machines, increasing memory pressure on the boot node.
In this patch, I tried to :
- Keep a fast !CONFIG_SMP implementation
- Keep a fast CONFIG_SMP implementation for often used protocols
(tcp,udp,raw,...)
- Introduce a NUMA efficient implementation
Some helper macros are defined in include/net/sock.h
These macros take into account CONFIG_SMP
If a "struct proto" is declared without using DEFINE_PROTO_INUSE /
REF_PROTO_INUSE
macros, it will automatically use a default implementation, using a
dynamically allocated percpu zone.
This default implementation will be NUMA efficient, but might use 32/64
bytes per possible cpu
because of current alloc_percpu() implementation.
However it still should be better than previous implementation based on
stats[NR_CPUS] field.
When a "struct proto" is changed to use the new macros, we use a single
static "int" percpu variable,
lowering the memory and cpu costs, still preserving NUMA efficiency.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
James Chapman [Tue, 6 Nov 2007 07:32:37 +0000 (23:32 -0800)]
[PPP]: L2TP: Fix oops in transmit and receive paths
Changes made on 18-sep to fix skb handling in the pppol2tp driver
broke the transmit and receive paths. Users are only running into this
now because distros are now using 2.6.23 and I must have messed up
when I tested the change.
For receive, we now do our own calculation of how much to pull from
the skb (variable length L2TP header) rather than using
skb_transport_offset(). Also, if the skb isn't a data packet, it must
be passed back to UDP with skb->data pointing to the UDP header.
For transmit, make sure skb->sk is set up because ip_queue_xmit()
needs it.
Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Tue, 6 Nov 2007 05:32:31 +0000 (21:32 -0800)]
[IPV4]: Clean the ip_sockglue.c from some ugly ifdefs
The #idfed CONFIG_IP_MROUTE is sometimes places inside the if-s,
which looks completely bad. Similar ifdefs inside the functions
looks a bit better, but they are also not recommended to be used.
Provide an ifdef-ed ip_mroute_opt() helper to cleanup the code.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Mitsuru Chinen [Tue, 6 Nov 2007 05:29:17 +0000 (21:29 -0800)]
[IPv6] SNMP: Restore Udp6InErrors incrementation
As the checksum verification is postponed till user calls recv or poll,
the inrementation of Udp6InErrors counter should be also postponed.
Currently, it is postponed in non-blocking operation case. However it
should be postponed in all case like the IPv4 code.
Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Tue, 6 Nov 2007 05:04:31 +0000 (21:04 -0800)]
[IPV6]: Consolidate the ip cork destruction in ip6_output.c
The ip6_push_pending_frames and ip6_flush_pending_frames do the
same things to flush the sock's cork. Move this into a separate
function and save ~100 bytes from the .text
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Tue, 6 Nov 2007 05:03:24 +0000 (21:03 -0800)]
[IPV4]: Consolidate the ip cork destruction in ip_output.c
The ip_push_pending_frames and ip_flush_pending_frames do the
same things to flush the sock's cork. Move this into a separate
function and save ~80 bytes from the .text
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETFILTER]: ebt_arp: fix --arp-gratuitous matching dependence on --arp-ip-{src,dst}
Fix --arp-gratuitous matching dependence on --arp-ip-{src,dst}
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be> Signed-off-by: Lutz Preßler <Lutz.Pressler@SerNet.DE> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Tue, 6 Nov 2007 04:44:06 +0000 (20:44 -0800)]
[NETFILTER]: nf_sockopts list head cleanup
Code is using knowledge that nf_sockopt_ops::list list_head is first
field in structure by using casts. Switch to list_for_each_entry()
itetators while I am at it.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Tue, 6 Nov 2007 04:42:54 +0000 (20:42 -0800)]
[NETFILTER]: Clean up Makefile
Sort matches and targets in the NF makefiles.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Tue, 6 Nov 2007 04:42:16 +0000 (20:42 -0800)]
[NETFILTER]: Sort matches/targets in Kbuild file
Sort matches and targets in the Kbuild file.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Engelhardt [Tue, 6 Nov 2007 04:35:56 +0000 (20:35 -0800)]
[NETFILTER]: Copyright/Email update
Transfer all my copyright over to our company.
Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Dike [Tue, 6 Nov 2007 16:02:50 +0000 (11:02 -0500)]
UML: fix defconfig build again
Reported by Al Viro.
This fixes it:
[AC]FLAGS -> KBUILD_[AC]FLAGS conversion in Makefile-i386.
Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 6 Nov 2007 15:53:46 +0000 (07:53 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
9p: add missing end-of-options record for trans_fd
9p: return NULL when trans not found
9p: use copy of the options value instead of original
9p: fix memory leak in v9fs_get_sb
v9fs_match_trans function returns arbitrary transport module instead of NULL
when the requested transport is not registered. This patch modifies the
function to return NULL in that case.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
9p: use copy of the options value instead of original
v9fs_parse_options function uses strsep which modifies the value of the
v9ses->options field. That modified value is later passed to the function
that creates the transport potentially making the transport creation
function to fail.
This patch creates a copy of v9ses->option field that v9fs_parse_options
function uses instead of the original value.
Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
Linus Torvalds [Tue, 6 Nov 2007 01:43:04 +0000 (17:43 -0800)]
Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: handle broken cable reporting
pata_hpt37x: Fix outstanding bug reports on the HPT374 and 37x cable detect
ata_piix: Add additional PCI identifier for 40 wire short cable
pata_serverworks: Fix problem with some drive combinations
libata: Don't disable dipm with SET FEATURES
libata and bogus LBA48 drives
Li Zefan [Mon, 5 Nov 2007 22:51:10 +0000 (14:51 -0800)]
time: fix inconsistent function names in comments
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Greg Ungerer [Mon, 5 Nov 2007 22:51:04 +0000 (14:51 -0800)]
m68knommu: fix pread/pwrite defines
Fix system call defines for system call 180 and 181 to match the underlying
system call table function entries. System call 180 calls sys_pread64, and
181 calls sys_pwrite64, so make the definitions match.
Michael Halcrow [Mon, 5 Nov 2007 22:51:03 +0000 (14:51 -0800)]
eCryptfs: increment extent_offset once per loop interation
The extent_offset is getting incremented twice per loop iteration through any
given page. It should only be getting incremented once. This bug should only
impact hosts with >4K page sizes.
Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Neil Brown [Mon, 5 Nov 2007 22:51:02 +0000 (14:51 -0800)]
md: fix misapplied patch in raid5.c
commit 4ae3f847e49e3787eca91bced31f8fd328d50496 ("md: raid5: fix
clearing of biofill operations") did not get applied correctly,
presumably due to substantial similarities between handle_stripe5 and
handle_stripe6.
This patch moves the chunk of new code from handle_stripe6 (where it isn't
needed (yet)) to handle_stripe5.
Signed-off-by: Neil Brown <neilb@suse.de> Cc: "Dan Williams" <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Dike [Mon, 5 Nov 2007 22:50:59 +0000 (14:50 -0800)]
uml: correctly strip kernel defines from userspace CFLAGS
KERNEL_DEFINES needs whitespace trimmed, otherwise the whitespace crunching
done by make fools the patsubst which is used to remove KERNEL_DEFINES from
USER_CFLAGS.
Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
WANG Cong [Mon, 5 Nov 2007 22:50:59 +0000 (14:50 -0800)]
uml: fix incompatible types warning in previous SG fix
Fix an incompatible-pointer warning.
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Jeff Dike <jdike@linux.intel.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Build with randconfig fails with following error with the
2.6.24-rc4-git9
include/linux/kallsyms.h:56: error: `NULL' undeclared (first use in this
function)
include/linux/kallsyms.h:56: error: (Each undeclared identifier is
reported only once
include/linux/kallsyms.h:56: error: for each function it appears in.)
make[2]: *** [arch/powerpc/platforms/cell/spu_callbacks.o] Error 1
make[1]: *** [arch/powerpc/platforms/cell] Error 2
make: *** [arch/powerpc/platforms] Error 2
Grant Grundler was asking for more detail about correct usage of local
atomic operations and suggested adding the resulting summary to
local_ops.txt.
"Please add a bit more detail. If DaveM is correct (he normally is), then
there must be limits on how the local_t can be used in the kernel process
and interrupt contexts. I'd like those rules spelled out very clearly
since it's easy to get wrong and tracking down such a bug is quite painful."
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Mon, 5 Nov 2007 22:58:58 +0000 (22:58 +0000)]
libata: handle broken cable reporting
One or two ancient drives predated the cable spec and didn't sent the
valid bits for the field. I had hoped to leave this out of libata as a
piece of historical annoyance but a recent CD drive shows the same bug so
we have to import support for it.
Same concept as Bartlomiej's changes old IDE except that as we have
centralised blacklists we can avoid keeping another private table of stuff
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Alan Cox [Mon, 5 Nov 2007 15:04:40 +0000 (15:04 +0000)]
pata_serverworks: Fix problem with some drive combinations
The driver used the channel not the device number for deciding where to
load some timing parameters. Also change so that we clear the UDMA field
as the old driver did. Not believed neccessary but does no harm.
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
During booting of last vanilla kernel I got:
Trying to free nonexistent resource...
This because of if "ENABLE_APRICOT" is on we do:
request_region(ioaddr,...)
if (checksum test failed)
goto out1;
dev->base_addr = ioaddr;//<-here mistake
out1:
release_region(dev->base_addr,...)
This change fixes this bug for me.
Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6:
PCI: Add Kconfig option to disable deprecated pci_find_* API
PCI: pciserial_resume_one ignored return value of pci_enable_device
PCI Hotplug: cpqhp_pushbutton_thread(): remove a pointless if() check
PCI: make pci_match_device() static
PCI: Remove 3 incorrect MSI quirks.
PCI: Add MSI INTX_DISABLE quirks for ATI SB700/800 SATA and IXP SB400 USB
PCI: Add quirk for devices which disable MSI when INTX_DISABLE is set.
PCI: Add MSI quirk for ServerWorks HT1000 PCIX bridge.
PCI: Revert "PCI: disable MSI by default on systems with Serverworks HT1000 chips"
David Brownell [Wed, 31 Oct 2007 09:37:37 +0000 (10:37 +0100)]
leds: bugfixes for leds-gpio
Three bugfixes to the leds-gpio driver, plus minor whitespace tweaks:
- Do the INIT_WORK() before registering each LED, so if its trigger
becomes immediately active it can schedule work without oopsing..
- Use normal registration, not platform_driver_probe(), so that
devices appearing "late" (hotplug type) can still be bound.
- Mark the driver remove code as "__devexit", preventing oopses
when the underlying device is removed.
These issues came up when using this driver with some GPIO expanders
living on serial busses, which act unlike "normal" platform devices:
they can appear and vanish along with the serial bus driver.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
David Miller [Thu, 25 Oct 2007 08:17:16 +0000 (01:17 -0700)]
PCI: Remove 3 incorrect MSI quirks.
Now that we have dealt with the real issue, in that some ATI SATA and
USB controllers needed the INTX_DISABLE quirk, we can remove these AMD
chipset global MSI disabling quirks.
This is based upon testing and feedback from
Shane Huang <Shane.Huang@amd.com>.
Cc: Shane Huang <Shane.Huang@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
David Miller [Thu, 25 Oct 2007 08:16:30 +0000 (01:16 -0700)]
PCI: Add quirk for devices which disable MSI when INTX_DISABLE is set.
A reasonably common problem with some devices is that they will
disable MSI generation when the INTX_DISABLE bit is set in the
PCI_COMMAND register.
Quirk this explicitly, guarding the pci_intx() calls in msi.c with
this quirk indication.
The first entries for this quirk are for 5714 and 5780 Tigon3 chips,
and thus we can remove the workaround code from the tg3.c driver.
Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Michael Chan <mchan@broadcom.com> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
"PCI: disable MSI by default on systems with Serverworks HT1000 chips"
was not entirely correct, and has been reverted.
MSI does not work on the PCIX bus because the BIOS did not set the
HT_MSI_FLAGS_ENABLE bit in the HyperTransport MSI capability on the
bridge. We use the existing quirk_msi_ht_cap() to detect the problem
and disable MSI in all buses behind it.
Signed-off-by: Michael Chan <mchan@broadcom.com> Cc: Anantha Subramanyam <ananth@broadcom.com> Cc: Naren Sankar <nsankar@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ide: fix IDE_HFLAG_NO_ATAPI_DMA handling in config_drive_for_dma()
commit 33c1002ed912ac9dacedd5d5b166da3b72d18460 incorrectly changed return
value from '0' to '-1', fix it (ns87415 was the only host driver affected
since it uses both IDE_HFLAG_TRUST_BIOS_FOR_DMA and IDE_HFLAG_NO_ATAPI_DMA).
drive_cmd_intr() is used by both REQ_TYPE_ATA_CMD and REQ_TYPE_ATA_TASK
but commands using PIO-in protocol are valid only for REQ_TYPE_ATA_CMD
(&args[4] in case of REQ_TYPE_ATA_TASK points to a value for IDE_LCYL_REG
register instead of the data buffer). This fix allows REQ_TYPE_ATA_TASK
commands to use non-zero values for IDE_SECTOR_REG (args[3]).
This config option is effective only for host drivers that use
IDE_HFLAG_OFF_BOARD host flag (aec62xx, generic, hpt34x, hpt366,
pdc202xx_new, pdc202xx_old and tc86c001).
sebdeg@ngi.it [Mon, 5 Nov 2007 20:42:25 +0000 (21:42 +0100)]
piix: add support for ICH7 on Acer 5602aWLMi
In piix.c (and in ata_piix.c) are already included some patches to skip the
cable check on some laptops and to enable UDMA > 33 modes, but I've noticed
than theese doesn't work on my Acer Aspire 5602WLMi (maybe exist more
versions of this laptop). With this simple patch I can set transfer mode
to UDMA100.
From: "sebdeg@ngi.it" <sebdeg@ngi.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Linus Torvalds [Mon, 5 Nov 2007 19:39:25 +0000 (11:39 -0800)]
Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
[S390] tod clock: announce clocksource as perfect
[S390] Rename "idle_time" attribute to "idle_time_us".
[S390] Fix priority mistakes in drivers/s390/cio/cmf.c
[S390] Fix memory detection.
[S390] Fix compile on !CONFIG_SMP.
[S390] device_schedule_callback() for dcssblk.
[S390] Fix smsgiucv init on no iucv machines
[S390] cio: use INIT_WORK to initialize struct work.
Linus Torvalds [Mon, 5 Nov 2007 19:39:00 +0000 (11:39 -0800)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest:
lguest: tidy up documentation
kernel/futex.c: make 3 functions static
unexport access_process_vm
lguest: make async_hcall() static
Commit f2a0bd3753dad7ea4605ebd5435716b39e9f92bb defines the function
with "void cpm_load_patch(cpm8xx_t *cp)" prtotype and is declared as
"extern void cpm_load_patch(volatile immap_t *immr)" in the header file.
Fix the memory leak that may occur when we attempt to reuse a cpu_slab
that was allocated while we reenabled interrupts in order to be able to
grow a slab cache.
The per cpu freelist may contain objects and in that situation we may
overwrite the per cpu freelist pointer loosing objects. This only
occurs if we find that the concurrently allocated slab fits our
allocation needs.
If we simply always deactivate the slab then the freelist will be
properly reintegrated and the memory leak will go away.
The Time of Day clock is the standard time source for s390. It is
- monotonic
- allows very fast reading
- architecture guarantees at least microsecond stepping
- available as part of the architecture
We should announce the rate of tod as 400 to be in sync with the
description found in clocksource.h:
"400-499:Perfect The ideal clocksource. A must-use where available."
This change will prefer tod over less reliable clock sources.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
# cat /sys/devices/system/cpu/cpu0/idle_time 131473592 us
Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Mon, 5 Nov 2007 10:10:11 +0000 (11:10 +0100)]
[S390] Fix memory detection.
Yet another patch in the countless series of memory detection fixes:
if the last area of the reported storage size is a hole the detection
loop will loop forever.
Just break chunk detection loop if its end is going to be larger than
reported storage size.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>