Roland Dreier [Tue, 12 Dec 2006 22:48:18 +0000 (14:48 -0800)]
IPoIB: Make sure struct ipoib_neigh.queue is always initialized
Move the initialization of ipoib_neigh's skb_queue into
ipoib_neigh_alloc(), since commit 2745b5b7 ("IPoIB: Fix skb leak when
freeing neighbour") will iterate over the skb_queue to free any
packets left over when freeing the ipoib_neigh structure.
This fixes a crash when freeing ipoib_neigh structures allocated in
ipoib_mcast_send(), which otherwise don't have their skb_queue
initialized.
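A minimal user-space sketch of the pattern, assuming simplified stand-in types (struct pkt, struct pkt_queue, struct neigh and all helper names below are hypothetical, not the IPoIB driver's own). Initializing the queue inside the allocator means the free path can always walk it, whichever caller allocated the structure:

```c
#include <stdlib.h>

/* Hypothetical stand-ins for sk_buff, sk_buff_head and ipoib_neigh. */
struct pkt {
    struct pkt *next;
};

struct pkt_queue {
    struct pkt *head;
    unsigned int len;
};

struct neigh {
    struct pkt_queue queue;
};

void pkt_queue_init(struct pkt_queue *q)
{
    q->head = NULL;
    q->len = 0;
}

/* Every allocation goes through here, so the queue is always valid. */
struct neigh *neigh_alloc(void)
{
    struct neigh *n = malloc(sizeof(*n));

    if (!n)
        return NULL;
    pkt_queue_init(&n->queue);
    return n;
}

/* Queue one dummy packet; returns 0 on success, -1 on allocation failure. */
int neigh_enqueue(struct neigh *n)
{
    struct pkt *p = malloc(sizeof(*p));

    if (!p)
        return -1;
    p->next = n->queue.head;
    n->queue.head = p;
    n->queue.len++;
    return 0;
}

/* Free any queued packets, then the neighbour; returns the number dropped. */
unsigned int neigh_free(struct neigh *n)
{
    unsigned int dropped = 0;

    while (n->queue.head) {
        struct pkt *p = n->queue.head;

        n->queue.head = p->next;
        free(p);
        dropped++;
    }
    free(n);
    return dropped;
}
```

With the initialization left to individual callers, neigh_free() would walk an uninitialized list head for any caller that forgot it, which is the kind of crash described above.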
Ralph Campbell [Tue, 12 Dec 2006 22:27:41 +0000 (14:27 -0800)]
IB: Add DMA mapping functions to allow device drivers to interpose
The QLogic InfiniPath HCAs use programmed I/O instead of HW DMA.
This patch allows a verbs device driver to interpose on DMA mapping
function calls in order to avoid relying on bus_to_virt() and
phys_to_virt() to undo the mappings created by dma_map_single(),
dma_map_sg(), etc.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Sean Hefty [Fri, 1 Dec 2006 00:44:16 +0000 (16:44 -0800)]
RDMA/cma: Add support for RDMA_PS_UDP
Allow the use of UD QPs through the rdma_cm, in order to provide
address translation services for resolving IB addresses for datagram
messages using SIDR.
Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Sean Hefty [Fri, 1 Dec 2006 00:37:15 +0000 (16:37 -0800)]
RDMA/cma: Allow early transition to RTS to handle lost CM messages
During connection establishment, the passive side of a connection can
receive messages from the active side before the connection event has
been delivered to the user. Allow the passive side to send messages
in response to received data before the event is delivered. To handle
the case where the connection messages are lost, a new rdma_notify()
function is added that users may invoke to force a connection into the
established state.
Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Sean Hefty [Fri, 1 Dec 2006 00:33:14 +0000 (16:33 -0800)]
RDMA/cma: Report connect info with connect events
Connection information was never given to the recipient of a
connection request or reply message. Only the event was delivered.
Report the connection data with the event to allow the user to
reject the connection based on the requested parameters, or adjust
their resources to match the request.
Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Roland Dreier [Tue, 12 Dec 2006 19:50:20 +0000 (11:50 -0800)]
IB/ipath: Fix IRQ for PCI Express HCAs
Commit 51f65ebc ("IB/ipath - program intconfig register using new HT
irq hook"), which fixed interrupts for HyperTransport HCAs, broke PCI
Express HCAs, because for those HCAs, the driver uses the value of
pdev->irq before pci_enable_msi() and ends up getting a totally bogus
IRQ number. Fix this by using the value of pdev->irq after
pci_enable_msi().
Roland Dreier [Tue, 12 Dec 2006 19:50:20 +0000 (11:50 -0800)]
IB/iser: Remove unused "write-only" variables
Remove variables that are set but then never looked at in the iSER
initiator. These cleanups came from David Binderman's list of "set
but never used" warnings from icc.
Roland Dreier [Tue, 12 Dec 2006 19:50:20 +0000 (11:50 -0800)]
IB/ipath: Remove unused "write-only" variables
Remove variables that are set but then never looked at in the ipath
driver. These cleanups came from David Binderman's list of "set but
never used" warnings from icc.
Roland Dreier [Tue, 12 Dec 2006 19:50:19 +0000 (11:50 -0800)]
IB/fmr: ib_flush_fmr_pool() may wait too long
ib_flush_fmr_pool() stashes away the request generation number
properly, but then goes ahead and rereads it every time it tests
whether the flush generation number has caught up. This means that
there is a theoretical possibility of livelock, if the request
generation number keeps getting bumped and the flush generation number
never catches up. The fix is simple: use the request generation
number read at the beginning of the function.
Also, atomic_inc() followed by atomic_read() can be replaced with
atomic_inc_return(). There's no real requirement for atomicity here
but we might as well shrink the code.
This bug was discovered using David Binderman's list of "set but never
used" warnings from icc.
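The difference between the two tests can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions: struct flush_pool and the function names are hypothetical, and atomic_fetch_add() plus one stands in for the kernel's atomic_inc_return().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical pool with request and flush generation counters. */
struct flush_pool {
    atomic_int req_ser;   /* bumped by every flush request */
    atomic_int flush_ser; /* advanced by the worker after each flush */
};

/*
 * Register a flush request and return its generation number.
 * atomic_fetch_add() returns the old value, so adding 1 gives the
 * new value in a single atomic step.
 */
int flush_request(struct flush_pool *pool)
{
    return atomic_fetch_add(&pool->req_ser, 1) + 1;
}

/* Fixed test: compare against the generation saved at request time. */
bool flush_done(struct flush_pool *pool, int serial)
{
    return atomic_load(&pool->flush_ser) - serial >= 0;
}

/* Problematic test: rereads req_ser, which later requests keep moving. */
bool flush_done_rereading(struct flush_pool *pool)
{
    return atomic_load(&pool->flush_ser) - atomic_load(&pool->req_ser) >= 0;
}
```

If a second caller bumps req_ser before the worker finishes, flush_done() for the first caller still becomes true once its own generation has been flushed, while the rereading variant keeps chasing the newer request.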
Nicolas Pitre [Tue, 12 Dec 2006 18:32:29 +0000 (13:32 -0500)]
[PATCH] remove config ordering/dependency between ucb1400-ts and sound subsystem
Commit 2d4ba4a3b9aef95d328d74a17ae84f8d658059e2 introduced a dependency
that was never meant to exist when the ac97_bus.c module was created.
Move ac97_bus.c up the directory hierarchy to make sure it is built when
selected even if sound is configured out so things work as originally
intended.
Signed-off-by: Nicolas Pitre <nico@cam.org> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Tue, 12 Dec 2006 17:57:55 +0000 (09:57 -0800)]
Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c: Fix OMAP clock prescaler to match the comment
i2c: Refactor a kfree in i2c-dev
i2c: Fix return value check in i2c-dev
i2c: Enable PEC on more i2c-i801 devices
i2c: Discard the i2c algo del_bus wrappers
i2c: New ARM Versatile/Realview bus driver
i2c: fix broken ds1337 initialization
i2c: i2c-i801 documentation update
i2c: Use the __ATTR macro where possible
i2c: Whitespace cleanups
i2c: Use put_user instead of copy_to_user where possible
i2c: New Atmel AT91 bus driver
i2c: Add support for nested i2c bus locking
i2c: Cleanups to the i2c-nforce2 bus driver
i2c: Add request/release_mem_region to i2c-ibm_iic bus driver
i2c: New Philips PNX bus driver
i2c: Delete the broken i2c-ite bus driver
i2c: Update the list of driver IDs
i2c: Fix documentation typos
Ingo Molnar [Tue, 12 Dec 2006 12:49:35 +0000 (13:49 +0100)]
[PATCH] net, 8139too.c: fix netpoll deadlock
fix deadlock in the 8139too driver: poll handlers should never forcibly
enable local interrupts, because they might be used by netpoll/printk
from IRQ context.
=================================
[ INFO: inconsistent lock state ]
2.6.19 #11
---------------------------------
inconsistent {softirq-on-W} -> {in-softirq-W} usage.
swapper/1 [HC0[0]:SC1[1]:HE1:SE0] takes:
(&npinfo->poll_lock){-+..}, at: [<c0350a41>] net_rx_action+0x64/0x1de
{softirq-on-W} state was registered at:
[<c0134c86>] mark_lock+0x5b/0x39c
[<c0135012>] mark_held_locks+0x4b/0x68
[<c01351e9>] trace_hardirqs_on+0x115/0x139
[<c02879e6>] rtl8139_poll+0x3d7/0x3f4
[<c035c85d>] netpoll_poll+0x82/0x32f
[<c035c775>] netpoll_send_skb+0xc9/0x12f
[<c035cdcc>] netpoll_send_udp+0x253/0x25b
[<c0288463>] write_msg+0x40/0x65
[<c011cead>] __call_console_drivers+0x45/0x51
[<c011cf16>] _call_console_drivers+0x5d/0x61
[<c011d4fb>] release_console_sem+0x11f/0x1d8
[<c011d7d7>] register_console+0x1ac/0x1b3
[<c02883f8>] init_netconsole+0x55/0x67
[<c010040c>] init+0x9a/0x24e
[<c01049cf>] kernel_thread_helper+0x7/0x10
[<ffffffff>] 0xffffffff
irq event stamp: 819992
hardirqs last enabled at (819992): [<c0350a16>] net_rx_action+0x39/0x1de
hardirqs last disabled at (819991): [<c0350b1e>] net_rx_action+0x141/0x1de
softirqs last enabled at (817552): [<c01214e4>] __do_softirq+0xa3/0xa8
softirqs last disabled at (819987): [<c0106051>] do_softirq+0x5b/0xc9
other info that might help us debug this:
no locks held by swapper/1.
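The rule stated above (never forcibly enable interrupts in a poll handler) can be illustrated with a small user-space simulation. Everything here is hypothetical: sim_irqs_enabled stands in for the CPU interrupt flag, and the helpers only mimic the save/restore idea; this is not the actual 8139too change.

```c
#include <stdbool.h>

/* Simulated CPU interrupt-enable flag (an assumption for illustration). */
bool sim_irqs_enabled = true;

/* Disable "interrupts" and return the previous state. */
bool sim_irq_save(void)
{
    bool flags = sim_irqs_enabled;

    sim_irqs_enabled = false;
    return flags;
}

/* Put the interrupt state back to what the caller had. */
void sim_irq_restore(bool flags)
{
    sim_irqs_enabled = flags;
}

/* Problematic handler: always leaves interrupts enabled on exit. */
void poll_forcing_enable(void)
{
    sim_irqs_enabled = false;
    /* ... critical section ... */
    sim_irqs_enabled = true;
}

/* Safer handler: leaves the interrupt state exactly as it found it. */
void poll_preserving_state(void)
{
    bool flags = sim_irq_save();

    /* ... critical section ... */
    sim_irq_restore(flags);
}
```

When the handler is entered with interrupts already off, as can happen via netpoll/printk from IRQ context, only the state-preserving variant returns with them still off.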
Ingo Molnar [Tue, 12 Dec 2006 11:10:28 +0000 (12:10 +0100)]
[PATCH] lockdep: fix seqlock_init()
seqlock_init() needs to use spin_lock_init() for dynamic locks, so that
lockdep is notified about the presence of a new lock.
(this is a fallout of the recent networking merge, which started using
the so-far unused seqlock_init() API.)
This fix solves the following lockdep-internal warning on current -git:
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
__lock_acquire+0x10c/0x9f9
lock_acquire+0x56/0x72
_spin_lock+0x35/0x42
neigh_destroy+0x9d/0x12e
neigh_periodic_timer+0x10a/0x15c
run_timer_softirq+0x126/0x18e
__do_softirq+0x6b/0xe6
do_softirq+0x64/0xd2
ksoftirqd+0x82/0x138
Linus Torvalds [Tue, 12 Dec 2006 16:01:50 +0000 (08:01 -0800)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
V4L/DVB (4954): Fix: On ia64, i2c adap->inb/adap->outb are wrongly evaluated
V4L/DVB (4954): Fix: On ia64, i2c adap->inb/adap->outb are wrongly evaluated
i2c defines two callbacks (inb/outb). On ia64, since it defines also two macros
with those names, it causes the following errors:
drivers/media/video/usbvision/usbvision-i2c.c:64:39: macro "outb" passed 4 arguments, but takes just 2
drivers/media/video/usbvision/usbvision-i2c.c: In function `try_write_address':
drivers/media/video/usbvision/usbvision-i2c.c:64: warning: assignment makes integer from pointer without a cast
drivers/media/video/usbvision/usbvision-i2c.c:89:38: macro "inb" passed 4 arguments, but takes just 1
drivers/media/video/usbvision/usbvision-i2c.c: In function `try_read_address':
drivers/media/video/usbvision/usbvision-i2c.c:89: warning: assignment makes integer from pointer without a cast
drivers/media/video/usbvision/usbvision-i2c.c:85: warning: unused variable `buf'
drivers/media/video/usbvision/usbvision-i2c.c:173:53: macro "inb" passed 4 arguments, but takes just 1
drivers/media/video/usbvision/usbvision-i2c.c: In function `usb_xfer':
drivers/media/video/usbvision/usbvision-i2c.c:173: warning: assignment makes integer from pointer without a cast
drivers/media/video/usbvision/usbvision-i2c.c:179:54: macro "outb" passed 4 arguments, but takes just 2
drivers/media/video/usbvision/usbvision-i2c.c:179: warning: assignment makes integer from pointer without a cast
Thanks to Andrew Morton for pointing this out.
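The collision can be reproduced outside the kernel. The sketch below is an illustration only: the two macros are stand-ins for the ia64 port I/O macros, struct demo_adapter and the demo_* functions are hypothetical, and the parenthesized call is just one standard C way around the clash, not necessarily what the patch does.

```c
#include <stddef.h>

/* Stand-ins for the function-like port I/O macros that ia64 defines. */
#define inb(port)        (0)
#define outb(val, port)  ((void)0)

/* Hypothetical adapter whose callback members share the macro names. */
struct demo_adapter {
    int (*inb)(void *data, unsigned char addr, char *buf, short len);
    int (*outb)(void *data, unsigned char addr, char *buf, short len);
};

int demo_read(void *data, unsigned char addr, char *buf, short len)
{
    (void)data;
    (void)addr;
    if (len > 0)
        buf[0] = 0x5a;
    return len;
}

int demo_write(void *data, unsigned char addr, char *buf, short len)
{
    (void)data;
    (void)addr;
    (void)buf;
    return len;
}

int adapter_read(struct demo_adapter *adap, unsigned char addr,
                 char *buf, short len)
{
    /*
     * Writing adap->inb(NULL, addr, buf, len) would not compile here:
     * "inb" directly followed by "(" invokes the one-argument macro.
     * Parenthesizing the member access suppresses macro expansion.
     */
    return (adap->inb)(NULL, addr, buf, len);
}

int adapter_write(struct demo_adapter *adap, unsigned char addr,
                  char *buf, short len)
{
    return (adap->outb)(NULL, addr, buf, len);
}
```

Renaming the struct members avoids the problem at every call site at once, which is usually the more robust choice.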
Boaz Harrosh [Tue, 5 Dec 2006 09:19:14 +0000 (10:19 +0100)]
[PATCH] remove blk_queue_activity_fn
While working on bidi support at struct request level
I have found that blk_queue_activity_fn is actually never used.
The only user is in ide-probe.c with this code:
/* enable led activity for disk drives only */
if (drive->media == ide_disk && hwif->led_act)
blk_queue_activity_fn(q, hwif->led_act, drive);
And led_act is never initialized anywhere.
(Looking back at older kernels it was used in the PPC arch, but was removed around 2.6.18)
Unless it is all for future use, of course.
(this patch is against linux-2.6-block.git as of 2006/12/4)
Linus Torvalds [Tue, 12 Dec 2006 02:28:59 +0000 (18:28 -0800)]
Merge branch 'for-linus' of git://www.atmel.no/~hskinnemoen/linux/kernel/avr32
* 'for-linus' of git://www.atmel.no/~hskinnemoen/linux/kernel/avr32:
[AVR32] Add missing #include <linux/param.h> to delay.c
[AVR32] Pass dev parameter to dma_cache_sync()
[AVR32] Implement intc_get_pending()
[AVR32] Don't include <asm/delay.h>
[AVR32] Put the chip in "stop" mode when halting the system
[AVR32] Set flow handler for external interrupts
[AVR32] Remove unused file
[AVR32] Remove mii_phy_addr and eth_addr from eth_platform_data
[AVR32] Move ethernet tag parsing to board-specific code
[AVR32] Add macb1 platform_device
[AVR32] Portmux API update
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (36 commits)
[POWERPC] Generic BUG for powerpc
[PPC] Fix compile failure do to introduction of PHY_POLL
[POWERPC] Only export __mtdcr/__mfdcr if CONFIG_PPC_DCR is set
[POWERPC] Remove old dcr.S
[POWERPC] Fix SPU coredump code for max_fdset removal
[POWERPC] Fix irq routing on some 32-bit PowerMacs
[POWERPC] ps3: Add vuart support
[POWERPC] Support ibm,dynamic-reconfiguration-memory nodes
[POWERPC] dont allow pSeries_probe to succeed without initialising MMU
[POWERPC] micro optimise pSeries_probe
[POWERPC] Add SPURR SPR to sysfs
[POWERPC] Add DSCR SPR to sysfs
[POWERPC] Fix 440SPe CPU table entry
[POWERPC] Add support for FP emulation for the e300c2 core
[POWERPC] of_device_register: propagate device_create_file return code
[POWERPC] Fix mmap of PCI resource with hack for X
[POWERPC] iSeries: head_64.o needs to depend on lparmap.s
[POWERPC] cbe_thermal: Fix initialization of sysfs attribute_group
[POWERPC] Remove QE header files from lite5200.c
[POWERPC] of_platform_make_bus_id(): make `magic' int
...
Ralf Baechle [Mon, 11 Dec 2006 11:54:52 +0000 (11:54 +0000)]
[MIPS] Discard .exit.text and .exit.data at runtime.
While the recent cset 86384d544157db23879064cde36061cdcafc6794 did improve
things it didn't resolve all the problems. So bite the bullet and discard
.exit.text and .exit.data at runtime. This of course sucks because it
bloats binaries with code that will never ever be used, but it's the only
thing that will work reliably, as demonstrated by the function sd_major() in
drivers/scsi/sd.c.
gcc > 3.4.x inlines sd_major's function body into exit_sd(). If
CONFIG_BLK_DEV_SD has been set to y, we would like ld to discard exit_sd's
code at link time. However, sd_major() contains a switch statement which gcc
compiles using a jump table in .rodata on the architectures I checked. So,
when ld later discards .exit.text, only the jump table in .rodata is left,
with its stale references to the discarded .exit.text, which any non-antique
ld will honor with a link error.
Andrew Morton [Tue, 12 Dec 2006 01:24:46 +0000 (17:24 -0800)]
[NETPOLL]: Fix local_bh_enable() warning.
During boot we get:
netconsole: device eth0 not up yet, forcing it
e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
WARNING (!__warned) at kernel/softirq.c:137 local_bh_enable()
Normally networking isn't invoked with interrupts turned off, but I
suppose we don't have a choice here. This is unique in being a place where you
can get called with BH on, off, or IRQs off.
Given that this is only used for printk, the easiest solution is probably
just to disable local IRQs instead of BH.
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Mundt [Mon, 11 Dec 2006 11:29:28 +0000 (20:29 +0900)]
sh: Fixup sh_bios() trap handling.
This was inadvertently broken when the entry.S code was split up;
restore the missing branch and get subsequent traps working
under debug again. This manifested itself as a lockup when
attempting to reload the VBR base.
Paul Mundt [Fri, 8 Dec 2006 08:46:29 +0000 (17:46 +0900)]
sh: Fix get_wchan().
Some time ago the schedule frame size changed and we failed to reflect
this in get_wchan() at the time. This first popped up as a problem on
SH7751R where schedule_frame ended up being unaligned and generating
an unaligned trap. This fixes it up again.
Paul Mundt [Fri, 8 Dec 2006 08:41:43 +0000 (17:41 +0900)]
sh: BUG() handling through trapa vector.
Previously we haven't been doing anything with verbose BUG() reporting,
and we've been relying on the oops path for handling BUG()'s, which is
rather sub-optimal.
This switches BUG handling to use a fixed trapa vector (#0x3e) where we
construct a small bug frame post trapa instruction to get the context
right. This also makes it trivial to wire up a DIE_BUG for the atomic
die chain, which we couldn't really do before.
Jamie Lenehan [Fri, 8 Dec 2006 06:26:15 +0000 (15:26 +0900)]
rtc: rtc-sh: alarm support.
This adds alarm support for the RTC_ALM_SET, RTC_ALM_READ,
RTC_WKALM_SET and RTC_WKALM_RD operations to rtc-sh.
The only unusual part is the handling of the alarm interrupt. If you
clear the alarm flag (AF) while the time in the RTC still matches the
time in the alarm registers then AF is immediately re-set, and if the
alarm interrupt (AIE) is still enabled then it re-triggers. I was
originally getting around 20k+ interrupts generated during the second
when the RTC and alarm registers matched.
The solution I've used is to clear AIE when the alarm goes off and
then use the carry interrupt to re-enable it. The carry interrupt
will check AF and re-enable AIE if it's clear. If AF is not clear
it'll clear it and then the check will be repeated on the next carry
interrupt. A bit in the rtc structure indicates that it's
waiting to have AIE re-enabled, so it doesn't turn it on when it
wasn't enabled anyway.
Signed-off-by: Jamie Lenehan <lenehan@twibble.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Jamie Lenehan [Fri, 8 Dec 2006 05:49:30 +0000 (14:49 +0900)]
rtc: rtc-sh: fix rtc for out-by-one for the month.
The RMONCNT register, which holds the month in the RTC, takes a value
between 1 and 12 while the tm_mon field in the time structures takes
a value between 0 and 11. This wasn't being taken into account in
rtc-sh resulting in the month being out by one.
eg, on my board during boot the RTC is set to:
RTC is set to Thu Jul 01 09:00:00 1999
but "hwclock -r" immediately after logging in was showing:
Sun Aug 1 09:01:43 1999 0.000000 seconds
Signed-off-by: Jamie Lenehan <lenehan@twibble.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
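The off-by-one can be shown with a small sketch. It assumes the register value has already been decoded to a plain integer, and the helper names are hypothetical rather than taken from rtc-sh:

```c
/* RMONCNT counts months 1..12; struct tm's tm_mon counts 0..11. */
int rtc_hw_month_to_tm_mon(int hw_month)
{
    return hw_month - 1;
}

int rtc_tm_mon_to_hw_month(int tm_mon)
{
    return tm_mon + 1;
}

/* Abbreviated month name for a tm_mon value in the range 0..11. */
const char *month_name(int tm_mon)
{
    static const char *const names[12] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };

    return names[tm_mon];
}
```

A hardware value of 7 used directly as tm_mon names August; converted first, it names July, matching the boot message quoted above.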
Paul Mundt [Thu, 7 Dec 2006 11:33:38 +0000 (20:33 +0900)]
sh: Split out atomic ops logically.
We have a few different ways to do the atomic operations, so split
them out into different headers rather than bloating atomic.h.
Kernelspace gUSA will take this up to a third implementation.
Stuart Menefy [Thu, 7 Dec 2006 08:48:52 +0000 (17:48 +0900)]
sh: gcc4 symbol export fixups.
gcc 4 for sh changes the names of some compiler intrinsic functions
and adds some additional ones. This patch adds the new ones, and
fixes up various module symbol resolution issues.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Jamie Lenehan [Thu, 7 Dec 2006 08:23:50 +0000 (17:23 +0900)]
rtc: rtc-sh: fix for period rtc interrupts.
When testing the per second interrupt support (RTC_UIE_ON/RTC_UIE_OFF)
of the new RTC system it would die in sh_rtc_interrupt due to a null
ptr dereference. The following gets it working correctly.
Signed-off-by: Jamie Lenehan <lenehan@twibble.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Paul Mundt [Thu, 7 Dec 2006 03:43:06 +0000 (12:43 +0900)]
sh: Reworked swap cache entry encoding for SH-X2 MMU.
In the 64-bit PTE case there's no point in restricting the encoding
to the low bits of the PTE, we can instead bump all of this up to
the high 32 bits and extend PTE_FILE_MAX_BITS to 32, adopting the
same convention used by x86 PAE.
There's a minor discrepancy between the number of bits used for the
swap type encoding between 32 and 64-bit PTEs, but this is unlikely
to cause any problem given the extended offset.
Simon Horman [Mon, 11 Dec 2006 06:35:24 +0000 (22:35 -0800)]
[IPVS]: Use msleep_interruptable() instead of ssleep() aka msleep()
Dean Manners noticed that when the IPVS synchronisation daemons are
started the system load slowly climbs up to 1. This seems to be related
to the call to ssleep(1) (aka msleep(1000)) in the main loop. Replacing
this with a call to msleep_interruptible() seems to make the problem go
away, though I'm not sure that it is correct.
This is the second edition of this patch, which replaces ssleep()
in the main loop for both the master and backup threads, as well
as some thread synchronisation code. The latter is just for thoroughness
as it shouldn't be causing any problems.
Signed-Off-By: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
[DCCP] ccid3: Fixup some type conversions related to rtts
Spotted by David Miller when compiling on sparc64; I reproduced it here on
parisc64. These are the only platforms that define __kernel_suseconds_t as an
'int'; all the others, x86_64 and x86 included, typedef it as a 'long'. From
the definition of suseconds_t it should just be an 'int' on platforms where it
is >= 32 bits, which would remove the need for all the casts from suseconds_t
to (int) when printk-ing variables of this type, casts that are not needed on
parisc64 and sparc64.
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Gerrit Renker [Sun, 10 Dec 2006 02:24:57 +0000 (00:24 -0200)]
[DCCP] ccid3: BUG-FIX - conversion errors
This fixes conversion errors which arose by not properly type-casting
from u32 to __u64. Fixed by explicitly casting each type which is not
__u64, or by performing operation after assignment.
The patch further adds missing debug information to track the current
value of X_recv.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
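The class of error described here, and both remedies, can be sketched in standard C. The function names are hypothetical and uint32_t/uint64_t stand in for the kernel's u32/__u64:

```c
#include <stdint.h>

/* The product is computed in 32 bits and wraps before it is widened. */
uint64_t scale_without_cast(uint32_t bytes, uint32_t factor)
{
    uint64_t result = bytes * factor;

    return result;
}

/* Remedy 1: cast an operand so the multiplication is done in 64 bits. */
uint64_t scale_with_cast(uint32_t bytes, uint32_t factor)
{
    return (uint64_t)bytes * factor;
}

/* Remedy 2: assign to the 64-bit variable first, then operate on it. */
uint64_t scale_after_assignment(uint32_t bytes, uint32_t factor)
{
    uint64_t result = bytes;

    result *= factor;
    return result;
}
```

For 5000 * 1000000 the true product is 5000000000, which does not fit in 32 bits; the uncast version silently returns the value reduced modulo 2^32.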
Gerrit Renker [Sun, 10 Dec 2006 02:24:11 +0000 (00:24 -0200)]
[DCCP] ccid3: Reorder packet history header file
No code change at all.
To make the header file easier to read, the following ordering is established
among the declarations:
* hist_new
* hist_delete
* hist_entry_new
* hist_head
* hist_find_entry
* hist_add_entry
* hist_entry_delete
* hist_purge
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Gerrit Renker [Sun, 10 Dec 2006 02:09:21 +0000 (00:09 -0200)]
[DCCP] ccid3: Perform history operations only after packet has been sent
This migrates all packet history operations into the routine
ccid3_hc_tx_packet_sent, thereby removing synchronization problems
that occur when, as before, the operations are spread over multiple
routines.
The following minor simplifications are also applied:
* several simplifications now follow from this change - several tests
are now no longer required
* removal of one unnecessary variable (dp)
Justification:
Currently packet history operations span two different routines,
one of which is likely to pass through several iterations of sleeping
and awakening.
The first routine, ccid3_hc_tx_send_packet, allocates an entry and
sets a few fields. The remaining fields are filled in when the second
routine (which is not within a sleeping context), ccid3_hc_tx_packet_sent,
is called. This has several strong drawbacks:
* it is not necessary to split history operations - all fields can be
filled in by the second routine
* the first routine is called multiple times, until a packet can be sent,
and sleeps meanwhile - this causes a lot of difficulties with regard to
keeping the list consistent
* since both routines do not have a producer-consumer like synchronization,
it is very difficult to maintain data across calls to these routines
* the fact that the routines are called in different contexts (sleeping, not
sleeping) adds further problems
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Gerrit Renker [Sun, 10 Dec 2006 02:07:37 +0000 (00:07 -0200)]
[DCCP] ccid3: Shift window counter computation
This puts the window counter computation [RFC 4342, 8.1] into a separate
function which is called whenever a new packet is ready for immediate
transmission in ccid3_hc_tx_send_packet.
Justification:
The window counter update was previously computed after the packet was sent. This has
two drawbacks, both fixed by this patch:
1) re-compute another timestamp almost directly after the packet was sent (expensive),
2) the CCVal for the window counter is needed at the instant the packet is sent.
Further details:
The initialisation of the window counter is left in the state NO_SENT, as before.
The algorithm will do nothing if either RTT is initialised to 0 (which is ok) or if
the RTT value remains below 4 microseconds (which is almost pathological).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
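A rough sketch of such a window counter update, under stated assumptions: struct win_counter and win_counter_update() are hypothetical names, and the cap of 5 increments and the 4-bit wrap-around reflect RFC 4342, 8.1 as understood here, so they should be checked against the RFC. It also shows why an RTT of 0 or below 4 microseconds makes the algorithm do nothing: the quarter RTT is then 0.

```c
#include <stdint.h>

struct win_counter {
    uint8_t  ccval;        /* 4-bit window counter value (CCVal) */
    uint64_t t_last_usec;  /* time of the last counter update */
};

/* Advance the counter by the number of quarter RTTs elapsed, at most 5. */
void win_counter_update(struct win_counter *wc, uint64_t now_usec,
                        uint32_t rtt_usec)
{
    uint32_t quarter_rtt = rtt_usec / 4;
    uint64_t quarter_rtts;

    if (quarter_rtt == 0)  /* RTT is 0 or below 4 usec: do nothing */
        return;

    quarter_rtts = (now_usec - wc->t_last_usec) / quarter_rtt;
    if (quarter_rtts == 0)
        return;
    if (quarter_rtts > 5)
        quarter_rtts = 5;

    wc->ccval = (uint8_t)((wc->ccval + quarter_rtts) & 0xF);
    wc->t_last_usec = now_usec;
}
```

Calling this just before transmission means the CCVal placed in the packet reflects the time at which the packet is actually sent.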
Gerrit Renker [Sun, 10 Dec 2006 02:06:32 +0000 (00:06 -0200)]
[DCCP] ccid3: Sanity-check RTT samples
CCID3 performance depends much on the accuracy of RTT samples. If RTT
samples grow too large, performance can be catastrophically poor.
To limit the amount of possible damage in such cases, the patch
* introduces an upper limit which identifies a maximum `sane' RTT value;
* uses a macro to enforce this upper limit.
Using a macro was given preference, since it is necessary to identify the
calling function in the warning message. Since exceeding this threshold
identifies a critical condition, DCCP_CRIT is used and not DCCP_WARN.
Many thanks to Ian McDonald for collaboration on this issue.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
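Why a macro helps here can be sketched as follows. The limit value, the names SANE_RTT_MAX_USEC, SANE_RTT and rtt_clamp(), and the use of stderr in place of DCCP_CRIT are all assumptions for illustration; the patch's actual identifiers and threshold are not given above.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical upper bound on a sane RTT sample, in microseconds. */
#define SANE_RTT_MAX_USEC (4u * 1000000u)

/* Clamp an RTT sample and report which function supplied it. */
uint32_t rtt_clamp(uint32_t rtt_usec, const char *caller)
{
    if (rtt_usec > SANE_RTT_MAX_USEC) {
        fprintf(stderr, "%s: RTT sample %u usec too large, using maximum\n",
                caller, (unsigned int)rtt_usec);
        return SANE_RTT_MAX_USEC;
    }
    return rtt_usec;
}

/*
 * A macro expands at the call site, so __func__ names the calling
 * function rather than the helper that prints the message.
 */
#define SANE_RTT(rtt) rtt_clamp((rtt), __func__)

/* Example caller: the warning would mention "demo_rx_sample". */
uint32_t demo_rx_sample(uint32_t rtt_usec)
{
    return SANE_RTT(rtt_usec);
}
```

Had the check been a plain function using __func__ internally, every warning would name the helper itself, which is of no use when tracking down the source of a bad sample.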
Gerrit Renker [Sun, 10 Dec 2006 02:06:01 +0000 (00:06 -0200)]
[DCCP] ccid3: Initialise RTT values
In both the sender and the receiver it is possible that the stored
RTT value is accessed before an actual RTT estimate has been computed.
This patch
* initialises the sender RTT to 0
- the sender always accesses the RTT in ccid3_hc_tx_packet_sent
- the RTT is further needed for the window counter algorithm
* replaces the receiver initialisation of 5msec with 0
- which has the same effect and removes an `XXX'
- the RTT value is needed in ccid3_hc_rx_packet_recv as rtt_prev
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
is already performed _unconditionally_ in ccid3_hc_tx_send_packet.
Since there is further no current need for this function, it is removed
entirely. Since furthermore, there is actually no present need for the
entire interface function ccid_hc_tx_insert_options, it was decided to
remove it also, to clean up the interface.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Gerrit Renker [Sun, 10 Dec 2006 02:05:12 +0000 (00:05 -0200)]
[DCCP]: Warn when discarding packet due to internal errors
This adds a (debug) warning message which is triggered whenever a packet is
discarded due to send failure.
It also adds a conditional, so that an interruption during dccp_wait_for_ccid
is not treated as a `BUG': the rationale is that interruptions are external,
whereas bug warnings are concerned with the internals.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Gerrit Renker [Sun, 10 Dec 2006 02:04:43 +0000 (00:04 -0200)]
[DCCP]: Only deliver to the CCID rx side in charge
This is an optimisation to reduce CPU load. The received feedback is now
only directed to the active CCID component, without requiring processing
also by the inactive one.
As a consequence, a similar test in ccid3.c is now redundant and is
also removed.
Justification:
Currently DCCP works as a unidirectional service, i.e. a listening server
is not at the same time a connecting client.
As far as I can see, several modifications are necessary until that
becomes possible.
At the present time, received feedback is fed to both the rx and tx CCID
modules. In unidirectional service, only one of these is active at any
one time.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Gerrit Renker [Sun, 10 Dec 2006 02:04:16 +0000 (00:04 -0200)]
[DCCP]: Simplify TFRC calculation
In migrating towards using the newer functions scaled_div/scaled_div32
for TFRC computations mapped from floating-point onto integer arithmetic,
this completes the last stage of modifications.
In particular, the overflow case for computing X_calc is circumvented by
* breaking the computation into two stages
* the first stage, res = (s*1E6)/R, cannot overflow due to use of u64
* in the second stage, res = (res*1E6)/f, overflow on u32 is avoided due
to (i) returning UINT_MAX in this case (which is logically appropriate)
and (ii) issuing a warning message into the system log (since very likely
there is a problem somewhere else with the parameters)
Lastly, all such scaling operations are now exported into tfrc.h, since
actually this form of scaled computation is specific to TFRC and not to CCID3.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
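The two-stage computation can be sketched in user space as below. The sketch_* functions are hypothetical stand-ins written for this illustration, not the kernel's scaled_div/scaled_div32; UINT32_MAX is used where the text says UINT_MAX, the warning message is omitted, and divisors are assumed to be non-zero.

```c
#include <stdint.h>

/* (a * 1E6) / b in 64-bit arithmetic; b must be non-zero. */
uint64_t sketch_scaled_div(uint64_t a, uint32_t b)
{
    return a * 1000000ULL / b;
}

/* Same, but saturating at the 32-bit maximum instead of wrapping. */
uint32_t sketch_scaled_div32(uint64_t a, uint32_t b)
{
    uint64_t result = sketch_scaled_div(a, b);

    if (result > UINT32_MAX)
        return UINT32_MAX;  /* a real implementation would also warn */
    return (uint32_t)result;
}

/*
 * X_calc from packet size s (bytes), RTT R (microseconds) and f
 * (scaled by 1E6), computed in the two stages described above.
 */
uint32_t sketch_calc_x(uint16_t s, uint32_t rtt_usec, uint32_t f_scaled)
{
    uint64_t res = sketch_scaled_div(s, rtt_usec);  /* (s * 1E6) / R */

    return sketch_scaled_div32(res, f_scaled);      /* (res * 1E6) / f */
}
```

With s = 1000 bytes, R = 100 ms and f = 1.0 this yields 10000 bytes/second; with an absurdly small R and f the second stage saturates rather than wrapping around.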
Gerrit Renker [Sun, 10 Dec 2006 02:03:51 +0000 (00:03 -0200)]
[DCCP]: Debug timeval operations
Problem:
Most target types in the CCID3 code are u32, so subtle conversion errors
can occur if signed time calculations yield negative results: the original
values are lost in the conversion to unsigned, calculation errors go undetected.
This patch therefore
* sets all critical time types from unsigned to suseconds_t
* avoids comparison between signed/unsigned via type-casting
* provides ample warning messages in case time calculations are negative
These warning messages can be removed at a later stage when the code
has undergone more testing.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
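The problem can be demonstrated with plain integer types. In this sketch int64_t stands in for suseconds_t, the function names are hypothetical, and clamping a negative result to 0 is a choice made for the example rather than something the patch is stated to do:

```c
#include <stdint.h>
#include <stdio.h>

/* Unchecked: a negative difference turns into a huge unsigned value. */
uint32_t delta_as_u32_unchecked(int64_t later_usec, int64_t earlier_usec)
{
    return (uint32_t)(later_usec - earlier_usec);
}

/* Checked: keep the result signed, warn if negative, then convert. */
uint32_t delta_as_u32_checked(int64_t later_usec, int64_t earlier_usec)
{
    int64_t delta = later_usec - earlier_usec;

    if (delta < 0) {
        fprintf(stderr, "negative time delta: %lld usec\n",
                (long long)delta);
        return 0;
    }
    return (uint32_t)delta;
}
```

A difference of -150 microseconds stored unchecked in a 32-bit unsigned variable reads back as 4294967146, and nothing downstream can tell that the original value was negative.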
Gerrit Renker [Sun, 10 Dec 2006 02:03:30 +0000 (00:03 -0200)]
[DCCP] ccid3: Simplify calculation for reverse lookup of p
This simplifies the calculation of a value p for a given fval when the
first loss interval is computed (RFC 3448, 6.3.1). It makes use of the
two new functions scaled_div/scaled_div32 to provide overflow protection.
Additionally, protection against divide-by-zero is extended - in this
case the function will return the maximally possible value of p=100%.
Background:
The maximum fval, f(100%), is approximately 244, i.e. the scaled value of fval
should never exceed 244E6, which fits easily into u32. The problem is the scaling
by 10^6, since additionally R(TT) is in microseconds.
This is resolved by breaking the division into two stages: the first stage
computes fval=(s*10^6)/R, stores that into u64; the second stage computes
fval = (fval*10^6)/X_recv and complains if overflow is reached for u32.
This case is safe since the TFRC reverse-lookup routine then returns p=100%.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Gerrit Renker [Sun, 10 Dec 2006 02:02:12 +0000 (00:02 -0200)]
[DCCP] ccid3: Finer-grained resolution of sending rates
This patch
* resolves a bug where packets smaller than 32/64 bytes resulted in sending rates of 0
* supports all sending rates from 1/64 bytes/second up to 4Gbyte/second
* simplifies the present overflow problems in calculations
Current sending rate X and the cached value X_recv of the receiver-estimated
sending rate are both scaled by 64 (2^6) in order to
* cope with low sending rates (minimally 1 byte/second)
* allow upgrading to use a packets-per-second implementation of CCID 3
* avoid calculation errors due to integer arithmetic cut-off
The patch implements a revised strategy from
http://www.mail-archive.com/dccp@vger.kernel.org/msg01040.html
The only difference with regard to that strategy is that t_ipi is already
used in the calculation of the nofeedback timeout, which saves one division.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
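A small sketch of what the scaling by 64 buys. RATE_SHIFT and the function names are hypothetical, and the inter-packet interval formula t_ipi = s / X is the usual TFRC relation written for the scaled representation; the scaled rate is assumed to be non-zero.

```c
#include <stdint.h>

#define RATE_SHIFT 6  /* rates are stored multiplied by 64 (2^6) */

/* Convert a rate in bytes/second to the scaled representation. */
uint64_t rate_to_scaled(uint32_t bytes_per_sec)
{
    return (uint64_t)bytes_per_sec << RATE_SHIFT;
}

/*
 * Inter-packet interval in microseconds for packet size s (bytes):
 * t_ipi = s / X, with X given scaled by 64. x_scaled must be non-zero.
 */
uint64_t t_ipi_usec(uint32_t s, uint64_t x_scaled)
{
    return (((uint64_t)s << RATE_SHIFT) * 1000000ULL) / x_scaled;
}
```

Halving a rate of 1 byte/second gives 0 in plain integer arithmetic, but the scaled value goes from 64 to 32, which still represents half a byte per second and still yields a finite sending interval.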
Gerrit Renker [Sun, 10 Dec 2006 02:01:22 +0000 (00:01 -0200)]
[DCCP] ccid3: Fix two bugs in sending rate computation
This fixes
1) a bug in the recomputation of the sending rate by the nofeedback
timer when no feedback at all has so far been sent by the receiver:
min_t was used instead of max_t, which is wrong (cf. RFC 3448, p. 10);
2) an error in the computation of larger initial windows: instead of
min(... max()) (cf. RFC 4342, 5.), the code had used max(... max()).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
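The second bug can be illustrated with the initial window formula min(4*MSS, max(2*MSS, 4380 bytes)). The constants are filled in here from RFC 3390, which RFC 4342 section 5 builds on, as an assumption of this sketch; the function names are hypothetical.

```c
#include <stdint.h>

uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

uint32_t max_u32(uint32_t a, uint32_t b)
{
    return a > b ? a : b;
}

/* Intended form: min(4*MSS, max(2*MSS, 4380)). */
uint32_t initial_window_bytes(uint32_t mss)
{
    return min_u32(4 * mss, max_u32(2 * mss, 4380));
}

/* Form with the bug described above: outer max() instead of min(). */
uint32_t initial_window_bytes_buggy(uint32_t mss)
{
    return max_u32(4 * mss, max_u32(2 * mss, 4380));
}
```

For an MSS of 1460 bytes the intended form allows 4380 bytes, while the buggy one allows 5840; for small packets the buggy form never drops below 4380 bytes, so the 4*MSS limit is lost entirely.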
Gerrit Renker [Sun, 10 Dec 2006 02:00:14 +0000 (00:00 -0200)]
[DCCP] ccid3: Two optimisations for sending rate recomputation
This performs two optimisations for the recomputation of the sending rate.
1) Currently the target sending rate X_calc is recalculated whenever
a) the nofeedback timer expires, or
b) a feedback packet is received.
In the (a) case, recomputing X_calc is redundant, since
* the parameters p and RTT do not change in between the
reception of feedback packets;
* the parameter X_recv is either modified from received
feedback or via the nofeedback timer;
* a test (`p == 0') in the nofeedback timer avoids using
a stale/undefined value of X_calc if p was previously 0.
2) The nofeedback timer now only recomputes a timestamp when p == 0.
This is according to step (4) of [RFC 3448, 4.3] and avoids
unnecessarily determining a timestamp.
A debug statement about not updating X is also removed - it helps very
little in debugging and just clutters the logs.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Gerrit Renker [Sun, 10 Dec 2006 01:59:14 +0000 (23:59 -0200)]
[DCCP] ccid3: Check against too large p
This patch follows a suggestion by Ian McDonald and ensures that in
the current code the value of p can not exceed 100%. Such a value is
illegal and would consequently cause a bug condition in tfrc_calc_x().
The receiver case is also tested, and a warning message is added.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Ian McDonald [Sun, 10 Dec 2006 01:56:09 +0000 (23:56 -0200)]
[DCCP]: Remove timeo from output.c
It simplifies waiting for the CCID module to signal that a packet
is ready to be sent. Other simplifications flow on from this such as
removing constants.
As a result of this EAGAIN is not returned any more by dccp_wait_for_ccid
(which would otherwise lead to unnecessarily discarding the packet in
dccp_write_xmit).
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Randy Dunlap [Sat, 9 Dec 2006 21:58:42 +0000 (13:58 -0800)]
[NET]: Fix WAN routers kconfig dependency.
Currently WAN router drivers can be built in-kernel while the
register/unregister_wan_device interfaces are built as modules.
This causes:
drivers/built-in.o: In function `cycx_init':
cycx_main.c:(.init.text+0x5c4b): undefined reference to `register_wan_device'
drivers/built-in.o: In function `cycx_exit':
cycx_main.c:(.exit.text+0x560): undefined reference to `unregister_wan_device'
make: *** [.tmp_vmlinux1] Error 1
The problem is caused by tristate -> bool conversion (y or m => y),
so convert WAN_ROUTER_DRIVERS to a tristate so that the correct
dependency is preserved.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>