git.karo-electronics.de Git - linux-beck.git/log
linux-beck.git
19 years ago[NETFILTER]: Make NETMAP target usable in OUTPUT
Gary Wayne Smith [Mon, 15 Aug 2005 00:33:24 +0000 (17:33 -0700)]
[NETFILTER]: Make NETMAP target usable in OUTPUT

Signed-off-by: Gary Wayne Smith <gary.w.smith@primeexalia.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: Don't exclude local packets from MASQUERADING
Patrick McHardy [Mon, 15 Aug 2005 00:32:50 +0000 (17:32 -0700)]
[NETFILTER]: Don't exclude local packets from MASQUERADING

Increases consistency in source-address selection.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: Remove two unused files
Domen Puncer [Mon, 15 Aug 2005 00:32:05 +0000 (17:32 -0700)]
[NETFILTER]: Remove two unused files

Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NET]: Store skb->timestamp as offset to a base timestamp
Patrick McHardy [Mon, 15 Aug 2005 00:24:31 +0000 (17:24 -0700)]
[NET]: Store skb->timestamp as offset to a base timestamp

Reduces skb size by 8 bytes on 64-bit.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
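
A minimal userspace sketch of the idea in the commit above: keep one full base timestamp and store only a compact offset per packet. The type and function names (`ex_timeval`, `ex_tstamp_off`, `ex_encode`, `ex_decode`) are hypothetical and not taken from the kernel sources; the sketch assumes the packet time is never earlier than the base.

```c
#include <stdint.h>

/* Stand-in for a 64-bit struct timeval: two 8-byte fields, 16 bytes total. */
struct ex_timeval {
	int64_t sec;
	int64_t usec;
};

/* Compact per-packet form: two 32-bit offsets, 8 bytes total. */
struct ex_tstamp_off {
	uint32_t off_sec;
	uint32_t off_usec;
};

/* Encode `now` relative to `base` (assumes now >= base). */
static struct ex_tstamp_off ex_encode(const struct ex_timeval *base,
				      const struct ex_timeval *now)
{
	struct ex_tstamp_off o;
	int64_t sec = now->sec - base->sec;
	int64_t usec = now->usec - base->usec;

	if (usec < 0) {		/* borrow one second */
		usec += 1000000;
		sec--;
	}
	o.off_sec = (uint32_t)sec;
	o.off_usec = (uint32_t)usec;
	return o;
}

/* Rebuild the absolute time from the base and the stored offset. */
static struct ex_timeval ex_decode(const struct ex_timeval *base,
				   struct ex_tstamp_off off)
{
	struct ex_timeval t;

	t.sec = base->sec + off.off_sec;
	t.usec = base->usec + off.off_usec;
	if (t.usec >= 1000000) {	/* carry into seconds */
		t.usec -= 1000000;
		t.sec++;
	}
	return t;
}
```

In this sketch the per-packet field shrinks from 16 to 8 bytes, which matches the 8-byte saving on 64-bit quoted in the commit message.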
19 years ago[NETFILTER]: Nicer names for ipt_connbytes constants
Patrick McHardy [Sat, 13 Aug 2005 20:58:21 +0000 (13:58 -0700)]
[NETFILTER]: Nicer names for ipt_connbytes constants

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: Fix div64_64 in ipt_connbytes
Patrick McHardy [Sat, 13 Aug 2005 20:57:58 +0000 (13:57 -0700)]
[NETFILTER]: Fix div64_64 in ipt_connbytes

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: Add new iptables "connbytes" match
Harald Welte [Sat, 13 Aug 2005 20:56:26 +0000 (13:56 -0700)]
[NETFILTER]: Add new iptables "connbytes" match

This patch adds a new "connbytes" match that utilizes the CONFIG_NF_CT_ACCT
per-connection byte and packet counters.  Using it you can do things like
packet classification on average packet size within a connection.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
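
The "average packet size within a connection" classification mentioned above could be sketched as follows. This is a plain userspace illustration, not the ipt_connbytes code; the function names and the inclusive from/to range are assumptions.

```c
#include <stdint.h>

/* Average packet size from per-connection byte and packet counters.
 * Returns 0 when no packets were counted, to avoid dividing by zero. */
static uint64_t ex_avg_pkt_size(uint64_t bytes, uint64_t packets)
{
	return packets ? bytes / packets : 0;
}

/* Match when the average lies inside the inclusive range [from, to]. */
static int ex_match_avg(uint64_t bytes, uint64_t packets,
			uint64_t from, uint64_t to)
{
	uint64_t avg = ex_avg_pkt_size(bytes, packets);

	return avg >= from && avg <= to;
}
```

Both counters are 64-bit, so in kernel code the division needs a 64-by-64 helper on 32-bit architectures, which is presumably what the div64_64 fix listed above addresses.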
19 years ago[NETFILTER]: introduce and use aligned_u64 data type
Harald Welte [Sat, 13 Aug 2005 20:55:44 +0000 (13:55 -0700)]
[NETFILTER]: introduce and use aligned_u64 data type

As proposed by Andi Kleen, this is required esp. for x86_64 architecture,
where 64bit code needs 8byte aligned 64bit data types, but 32bit userspace
apps will only align to 4bytes.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
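
A sketch of the alignment problem described above, using the GCC/Clang `aligned` attribute. The typedef name `ex_aligned_u64` and both structs are hypothetical; only the general technique (forcing 8-byte alignment on a 64-bit member) is what the commit message describes.

```c
#include <stddef.h>
#include <stdint.h>

/* 64-bit integer that is 8-byte aligned on every ABI. */
typedef uint64_t __attribute__((aligned(8))) ex_aligned_u64;

/* With a plain uint64_t, a 32-bit x86 build places `bytes` at offset 4,
 * while a 64-bit build places it at offset 8: the layouts disagree. */
struct ex_counters_plain {
	uint32_t flags;
	uint64_t bytes;
};

/* With the aligned type, `bytes` sits at offset 8 in both builds, so a
 * 32-bit userspace program and a 64-bit kernel see the same layout. */
struct ex_counters_aligned {
	uint32_t flags;
	ex_aligned_u64 bytes;
};
```

Compiling the same header with `-m32` and `-m64` and comparing `offsetof()` for the two structs is one way to observe the difference.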
19 years ago[INET_DIAG]: Move the tcp_diag interface to the proper place
Arnaldo Carvalho de Melo [Fri, 12 Aug 2005 15:59:17 +0000 (12:59 -0300)]
[INET_DIAG]: Move the tcp_diag interface to the proper place

With this the previous setup is back, i.e. tcp_diag can be built as a module,
as can dccp_diag, and both share the infrastructure available in inet_diag.

If CONFIG_INET_DIAG is selected as a module, CONFIG_INET_TCP_DIAG will also be
built as a module, as will CONFIG_INET_DCCP_DIAG if CONFIG_IP_DCCP was selected
either statically or as a module. If CONFIG_INET_DIAG is y, CONFIG_INET_TCP_DIAG
will follow suit and be statically linked, and CONFIG_INET_DCCP_DIAG will be
built in the same manner as CONFIG_IP_DCCP.

Now to aim at UDP, converting it to use inet_hashinfo, so that we can use
iproute2 for UDP sockets as well.

Ah, just to show an example of this new infrastructure working for DCCP :-)

[root@qemu ~]# ./ss -dane
State      Recv-Q Send-Q Local Address:Port  Peer Address:Port
LISTEN     0      0                  *:5001             *:*     ino:942 sk:cfd503a0
ESTAB      0      0          127.0.0.1:5001     127.0.0.1:32770 ino:943 sk:cfd50a60
ESTAB      0      0          127.0.0.1:32770    127.0.0.1:5001  ino:947 sk:cfd50700
TIME-WAIT  0      0          127.0.0.1:32769    127.0.0.1:5001  timer:(timewait,3.430ms,0) ino:0 sk:cf209620

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET_DIAG]: Rename tcp_diag.[ch] to inet_diag.[ch]
Arnaldo Carvalho de Melo [Fri, 12 Aug 2005 15:56:38 +0000 (12:56 -0300)]
[INET_DIAG]: Rename tcp_diag.[ch] to inet_diag.[ch]

Next changeset will introduce net/ipv4/tcp_diag.c, moving the code that was put
transitionally in inet_diag.c.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[TCPDIAG]: Just rename everything to inet_diag
Arnaldo Carvalho de Melo [Fri, 12 Aug 2005 15:51:49 +0000 (12:51 -0300)]
[TCPDIAG]: Just rename everything to inet_diag

Next changeset will rename tcp_diag.[ch] to inet_diag.[ch].

I'm taking this longer route so as to ease review, making clear the changes
made all along the way.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[TCPDIAG]: Introduce inet_diag_{register,unregister}
Arnaldo Carvalho de Melo [Fri, 12 Aug 2005 12:27:49 +0000 (09:27 -0300)]
[TCPDIAG]: Introduce inet_diag_{register,unregister}

Next changeset will rename tcp_diag to inet_diag and move the tcp_diag code out
of it and into a new tcp_diag.c, similar to the net/dccp/diag.c introduced in
this changeset, completing the transition to a generic inet_diag
infrastructure.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET6_HASHTABLES]: Move inet6_lookup functions to net/ipv6/inet6_hashtables.c
Arnaldo Carvalho de Melo [Fri, 12 Aug 2005 12:26:18 +0000 (09:26 -0300)]
[INET6_HASHTABLES]: Move inet6_lookup functions to net/ipv6/inet6_hashtables.c

Doing this we allow tcp_diag to support IPV6 even if tcp_diag is compiled
statically and IPV6 is compiled as a module, removing the previous restriction
while not building any IPV6 code if it is not selected.

Now to work on the tcpdiag_register infrastructure and then to rename the whole
thing to inetdiag, reflecting its by then completely generic nature.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[IPV6]: Generalise the tcp_v6_lookup routines
Arnaldo Carvalho de Melo [Fri, 12 Aug 2005 12:19:38 +0000 (09:19 -0300)]
[IPV6]: Generalise the tcp_v6_lookup routines

In the same way as was done with the v4 counterparts, this will be moved
to inet6_hashtables.c.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: Fix gcc-3.4.x warning about implicit operator precedence
Harald Welte [Fri, 12 Aug 2005 18:36:44 +0000 (11:36 -0700)]
[NETFILTER]: Fix gcc-3.4.x warning about implicit operator precedence

Fix gcc-3.4.x warning about implicit operator precedence in NF_QUEUE_NR()

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NET]: Deinline netif_carrier_{on,off}().
Denis Vlasenko [Thu, 11 Aug 2005 22:32:53 +0000 (15:32 -0700)]
[NET]: Deinline netif_carrier_{on,off}().

# grep -r 'netif_carrier_o[nf]' linux-2.6.12 | wc -l
246

# size vmlinux.org vmlinux.carrier
text    data     bss     dec     hex filename
4339634 1054414  259296 5653344  564360 vmlinux.org
4337710 1054414  259296 5651420  563bdc vmlinux.carrier

And this ain't an allyesconfig kernel!

Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: Fix NF_QUEUE_NR() macro
Harald Welte [Thu, 11 Aug 2005 22:31:15 +0000 (15:31 -0700)]
[NETFILTER]: Fix NF_QUEUE_NR() macro

I obviously wanted to use bitwise-or, not logical or.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
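
The difference between bitwise-or and logical or in a verdict-packing macro can be illustrated like this. The macro names and the 16-bit field layout are assumptions chosen for the example, not the definitions from the netfilter headers.

```c
#include <stdint.h>

/* Hypothetical layout: low 16 bits carry the verdict, high 16 bits carry
 * the queue number. */
#define EX_VERDICT_BITS  16
#define EX_VERDICT_QUEUE 3u

/* Buggy form: logical or yields only 0 or 1, so both fields are lost. */
#define EX_QUEUE_NR_LOGICAL(x) \
	((((uint32_t)(x)) << EX_VERDICT_BITS) || EX_VERDICT_QUEUE)

/* Fixed form: bitwise or combines the queue number and the verdict.
 * Every sub-expression is parenthesised, so no precedence is implicit. */
#define EX_QUEUE_NR_BITWISE(x) \
	((((uint32_t)(x)) << EX_VERDICT_BITS) | EX_VERDICT_QUEUE)

/* Recover the queue number from a packed verdict. */
static inline uint32_t ex_queue_of(uint32_t verdict)
{
	return verdict >> EX_VERDICT_BITS;
}
```

With queue number 5, the logical form evaluates to 1, while the bitwise form keeps 5 in the upper half and the verdict in the lower half.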
19 years ago[NETFILTER]: Fix compilation when no PROC_FS enabled
Harald Welte [Thu, 11 Aug 2005 22:30:45 +0000 (15:30 -0700)]
[NETFILTER]: Fix compilation when no PROC_FS enabled

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[TCPDIAG]: Introduce CONFIG_IP_TCPDIAG_DCCP
Arnaldo Carvalho de Melo [Thu, 11 Aug 2005 21:37:16 +0000 (14:37 -0700)]
[TCPDIAG]: Introduce CONFIG_IP_TCPDIAG_DCCP

Similar to CONFIG_IP_TCPDIAG_IPV6

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[BNX2]: Possible sparse fixes, take two
Peter Hagervall [Wed, 10 Aug 2005 21:18:16 +0000 (14:18 -0700)]
[BNX2]: Possible sparse fixes, take two

This patch contains the following possible cleanups/fixes:

- use C99 struct initializers
- make a few arrays and structs static
- remove a few uses of literal 0 as NULL pointer
- use convenience function instead of cast+dereference in bnx2_ioctl()
- remove superfluous casts to u8 * in calls to readl/writel

Signed-off-by: Peter Hagervall <hager@cs.umu.se>
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NET]: Make use of ->private_data in sockfd_lookup
Benjamin LaHaise [Wed, 10 Aug 2005 21:16:04 +0000 (14:16 -0700)]
[NET]: Make use of ->private_data in sockfd_lookup

Please consider the patch below which makes use of file->private_data to
store the pointer to the socket, which avoids touching several unused
cachelines in the dentry and inode in sockfd_lookup.

Signed-off-by: Benjamin LaHaise <bcrl@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[CCID3]: Ditch USEC_IN_SEC as time.h has USEC_PER_SEC
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 16:29:27 +0000 (13:29 -0300)]
[CCID3]: Ditch USEC_IN_SEC as time.h has USEC_PER_SEC

That is equivalent, no need to have a private one.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
19 years ago[CCID3]: Separate most of the packet history code
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 15:59:38 +0000 (12:59 -0300)]
[CCID3]: Separate most of the packet history code

This also changes the list_for_each_entry_safe_continue behaviour to match its
kerneldoc comment, that is, to start after the pos passed.

Also adds several helper functions from previously open coded fragments, making
the code more clear.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
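
The "start after the pos passed" behaviour described above can be sketched with a small self-contained list. This is a userspace illustration, not the kernel's list.h: all `ex_` names are hypothetical, and the macro takes the entry type as an explicit argument where the kernel macro uses `typeof`.

```c
#include <stddef.h>

struct ex_list {
	struct ex_list *next, *prev;
};

#define ex_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void ex_list_init(struct ex_list *h)
{
	h->next = h->prev = h;
}

static void ex_list_add_tail(struct ex_list *n, struct ex_list *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void ex_list_del(struct ex_list *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Iterate over the entries that follow pos. n caches the next entry so
 * the body may safely unlink pos. */
#define ex_for_each_entry_safe_continue(pos, n, head, type, member)	\
	for (pos = ex_container_of((pos)->member.next, type, member),	\
	     n = ex_container_of((pos)->member.next, type, member);	\
	     &(pos)->member != (head);					\
	     pos = n, n = ex_container_of((n)->member.next, type, member))

struct ex_pkt {
	int seq;
	struct ex_list node;
};

/* Unlink every entry after `from`; returns how many were unlinked. */
static int ex_drop_after(struct ex_pkt *from, struct ex_list *head)
{
	struct ex_pkt *pos = from, *n;
	int dropped = 0;

	ex_for_each_entry_safe_continue(pos, n, head, struct ex_pkt, node) {
		ex_list_del(&pos->node);
		dropped++;
	}
	return dropped;
}
```

The entry passed in as `from` is itself left on the list; only its successors are visited.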
19 years ago[TCPDIAG]: Implement cheapest way of supporting DCCPDIAG_GETSOCK
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 08:54:28 +0000 (05:54 -0300)]
[TCPDIAG]: Implement cheapest way of supporting DCCPDIAG_GETSOCK

With ugly ifdefs, etc, but this actually:

1. keeps the existing ABI, i.e. no need to recompile the iproute2
   utilities if not interested in DCCP.

2. Provides all the tcp_diag functionality in DCCP, with just a
   small patch that makes iproute2 support DCCP.

Of course I'll get this cleaned-up in time, but for now I think it's
OK to be this way to quickly get this functionality.

iproute2-ss050808 patch at:

http://vger.kernel.org/~acme/iproute2-ss050808.dccp.patch

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[ICSK]: Move TCP congestion avoidance members to icsk
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 07:03:31 +0000 (04:03 -0300)]
[ICSK]: Move TCP congestion avoidance members to icsk

This changeset basically moves tcp_sk()->{ca_ops,ca_state,etc} to inet_csk(),
with minimal renaming/moving done in this changeset to ease review.

Most of it is just changes of struct tcp_sock * to struct sock * parameters.

With this we move to a state closer to two interesting goals:

1. Generalisation of net/ipv4/tcp_diag.c, becoming inet_diag.c, being used
   for any INET transport protocol that has struct inet_hashinfo and are
   derived from struct inet_connection_sock. Keeps the userspace API, that will
   just not display DCCP sockets, while newer versions of tools can support
   DCCP.

2. INET generic transport pluggable Congestion Avoidance infrastructure, using
   the current TCP CA infrastructure with DCCP.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NET]: Make NETDEBUG pure printk wrappers
Patrick McHardy [Wed, 10 Aug 2005 03:50:53 +0000 (20:50 -0700)]
[NET]: Make NETDEBUG pure printk wrappers

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[DCCP]: Finish the TIMEWAIT minisock support
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:45:21 +0000 (20:45 -0700)]
[DCCP]: Finish the TIMEWAIT minisock support

Using most of the infrastructure TCP uses, with a dccp_death_row,
etc. As per my current interpretation of the draft what we have with
this changeset seems to be all we need (or very close to it 8)).

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[TIMEWAIT]: Move inet_timewait_death_row routines to net/ipv4/inet_timewait_sock.c
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:45:03 +0000 (20:45 -0700)]
[TIMEWAIT]: Move inet_timewait_death_row routines to net/ipv4/inet_timewait_sock.c

Also export the ones that will be used in the next changeset, when
DCCP uses this infrastructure.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[TIMEWAIT]: Introduce inet_timewait_death_row
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:44:40 +0000 (20:44 -0700)]
[TIMEWAIT]: Introduce inet_timewait_death_row

That groups all of the tables and variables associated with the TCP timewait
scheduling/recycling/killing code, which can now be isolated from the TCP
specific code and used by other transport protocols, such as DCCP.

Next changeset will move this code to net/ipv4/inet_timewait_sock.c

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[DCCP]: Initialize icsk_rto in dccp_v4_init_sock
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:31:11 +0000 (20:31 -0700)]
[DCCP]: Initialize icsk_rto in dccp_v4_init_sock

Fixes nasty bug related to the retransmit timer (yeah, DCCP does
retransmits) firing too early.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[DCCP]: Introduce dccp_write_xmit from code in dccp_sendmsg
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:30:56 +0000 (20:30 -0700)]
[DCCP]: Introduce dccp_write_xmit from code in dccp_sendmsg

This way it gets closer to the TCP flow, where congestion window
checks are done. It seems we can map ccid_hc_tx_send_packet in
dccp_write_xmit to tcp_snd_wnd_test in tcp_write_xmit; a CCID2
decision should just fit in here as well...

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[Bluetooth]: Move packet type into the SKB control buffer
Marcel Holtmann [Wed, 10 Aug 2005 03:30:28 +0000 (20:30 -0700)]
[Bluetooth]: Move packet type into the SKB control buffer

This patch moves the usage of packet type into the SKB control
buffer. After this patch it is now possible to shrink the sk_buff
structure and redefine its pkt_type.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[Bluetooth]: Fix sparse warnings (__nocast type)
Victor Fusco [Wed, 10 Aug 2005 03:29:11 +0000 (20:29 -0700)]
[Bluetooth]: Fix sparse warnings (__nocast type)

This patch fixes the sparse warnings "implicit cast to nocast type"
for the priority or gfp_mask parameters of the memory allocations.

Signed-off-by: Victor Fusco <victor@cetuc.puc-rio.br>
Signed-off-by: Domen Puncer <domen@coderock.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[Bluetooth]: Implement RFCOMM remote port negotiation
J. Suter [Wed, 10 Aug 2005 03:28:46 +0000 (20:28 -0700)]
[Bluetooth]: Implement RFCOMM remote port negotiation

This patch implements the remote port negotiation (RPN) of the RFCOMM
protocol for Bluetooth.

Signed-off-by: J. Suter <jsuter@hardwave.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[Bluetooth]: Call tty_hangup() when DCD is de-asserted
Timo Teräs [Wed, 10 Aug 2005 03:28:21 +0000 (20:28 -0700)]
[Bluetooth]: Call tty_hangup() when DCD is de-asserted

The RFCOMM layer does not properly handle the de-assertion
of the CD signal. It should call tty_hangup() to work properly.

Signed-off-by: Timo Teräs <ext-timo.teras@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[Bluetooth]: Track page scan repetition mode changes
Marcel Holtmann [Wed, 10 Aug 2005 03:28:02 +0000 (20:28 -0700)]
[Bluetooth]: Track page scan repetition mode changes

The HCI page scan repetition mode change event contains the actual
page scan repetition mode for the remote device. It is the same
value that is received from an inquiry response and it can be used
to make further reconnections faster.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[Bluetooth]: Workaround for inquiry results with RSSI and page scan mode
Marcel Holtmann [Wed, 10 Aug 2005 03:27:49 +0000 (20:27 -0700)]
[Bluetooth]: Workaround for inquiry results with RSSI and page scan mode

This patch implements a workaround for buggy Bluetooth 1.2 devices from
Silicon Wave. Their inquiry results with RSSI contain the page scan mode
field. This field was removed in the final Bluetooth 1.2 specification.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[Bluetooth]: Update and cleanup of the virtual HCI driver
Marcel Holtmann [Wed, 10 Aug 2005 03:27:37 +0000 (20:27 -0700)]
[Bluetooth]: Update and cleanup of the virtual HCI driver

This patch cleans up the virtual HCI driver. It also adds support for
the dynamic minor device number allocation.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[DCCP]: Fix u64 printf format warnings.
David S. Miller [Wed, 10 Aug 2005 03:27:14 +0000 (20:27 -0700)]
[DCCP]: Fix u64 printf format warnings.

Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: New iptables DCCP protocol header match
Harald Welte [Wed, 10 Aug 2005 03:26:55 +0000 (20:26 -0700)]
[NETFILTER]: New iptables DCCP protocol header match

Using this new iptables DCCP protocol header match, it is possible to
create simplistic stateless packet filtering rules for DCCP.  It
permits matching of port numbers, packet type and options.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[DCCP]: Fix struct sockaddr_dccp definition
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:26:28 +0000 (20:26 -0700)]
[DCCP]: Fix struct sockaddr_dccp definition

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[DCCP]: make <linux/dccp.h> include-able from userspace
Harald Welte [Wed, 10 Aug 2005 03:26:03 +0000 (20:26 -0700)]
[DCCP]: make <linux/dccp.h> include-able from userspace

The protocol header files in <linux/foo.h> are usually structured in a
way to be included by userspace code.  The top section consists of
general protocol structure definitions, typedefs, enums - followed by
an #ifdef __KERNEL__ section.

Currently <linux/dccp.h> doesn't follow that convention and can
therefore not be used from userspace.  However, for example iptables'
libipt_dccp.c actually needs various definitions from there.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[IPV4]: fib_trie: Use const
Stephen Hemmigner [Wed, 10 Aug 2005 03:25:39 +0000 (20:25 -0700)]
[IPV4]: fib_trie: Use const

Use const where possible and get rid of EXTRACT() macro
that was never used.

Signed-off-by: Stephen Hemmigner <shemminger@osdl.org>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[IPV4]: fib_trie: Use ERR_PTR to handle errno return
Robert Olsson [Wed, 10 Aug 2005 03:25:06 +0000 (20:25 -0700)]
[IPV4]: fib_trie: Use ERR_PTR to handle errno return

Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[IPV4]: FIB Trie cleanups.
Olof Johansson [Wed, 10 Aug 2005 03:24:39 +0000 (20:24 -0700)]
[IPV4]: FIB Trie cleanups.

Below is a patch that cleans up some of this, supposedly without
changing any behaviour:

* Whitespace cleanups
* Introduce DBG()
* BUG_ON() instead of if () { BUG(); }
* Remove some of the deep nesting to make the code flow more
  comprehensible
* Some mask operations were simplified

Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: return ENOMEM when ip_conntrack_alloc() fails.
Yasuyuki Kozakai [Wed, 10 Aug 2005 03:24:15 +0000 (20:24 -0700)]
[NETFILTER]: return ENOMEM when ip_conntrack_alloc() fails.

This patch fixes a bug where ERR_PTR(-ENOMEM) was not returned when
allocating memory from the slab cache failed.  Under stress, this bug
leads to packets erroneously not being dropped, and to wrong statistic
counters ('invalid' is incremented instead of 'drop').  It was
introduced during the ctnetlink merge in the net-2.6.14 tree, so no
stable or mainline releases are affected.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
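
The ERR_PTR(-ENOMEM) convention referred to above encodes a negative errno in the pointer return value. The following is a userspace re-creation for illustration only; the `ex_` helpers mimic the idea behind the kernel's ERR_PTR/PTR_ERR/IS_ERR but are not the kernel definitions, and `ex_node_new` is a hypothetical allocator.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define EX_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *ex_err_ptr(long error)
{
	return (void *)error;
}

/* Decode the errno value from an error pointer. */
static inline long ex_ptr_err(const void *ptr)
{
	return (long)ptr;
}

/* Error pointers occupy the top EX_MAX_ERRNO addresses. */
static inline int ex_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-EX_MAX_ERRNO;
}

struct ex_node {
	int key;
};

/* Returns a new node, or an encoded errno instead of a bare NULL so the
 * caller can tell "out of memory" apart from other failures. */
static struct ex_node *ex_node_new(int key)
{
	struct ex_node *n;

	if (key < 0)
		return ex_err_ptr(-EINVAL);
	n = malloc(sizeof(*n));
	if (!n)
		return ex_err_ptr(-ENOMEM);
	n->key = key;
	return n;
}
```

A caller checks `ex_is_err()` first and then maps `ex_ptr_err()` to the appropriate action, for example dropping the packet on -ENOMEM rather than counting it as invalid.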
19 years ago[NETFILTER]: check nf_log function call arguments
Harald Welte [Wed, 10 Aug 2005 03:23:53 +0000 (20:23 -0700)]
[NETFILTER]: check nf_log function call arguments

Check whether pf is too large in order to prevent array overflow.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: more verbose return codes from nf_{log,queue}
Harald Welte [Wed, 10 Aug 2005 03:23:36 +0000 (20:23 -0700)]
[NETFILTER]: more verbose return codes from nf_{log,queue}

This adds EEXIST to distinguish between the following return values:
0:  nobody was registered, registration successful
EEXIST: the exact same handler was already registered, no registration
required
EBUSY: somebody else is registered, registration unsuccessful.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
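
The three return values listed above, together with a range check on pf like the one in the nf_log argument-checking commit, could look roughly like this. The table, the handler type and the function name are hypothetical, and returning the codes as negative values is an assumption based on the usual kernel convention.

```c
#include <errno.h>
#include <stddef.h>

#define EX_NPROTO 32	/* assumed number of protocol families */

struct ex_handler {
	const char *name;
};

/* One registered handler per protocol family. */
static const struct ex_handler *ex_handlers[EX_NPROTO];

static int ex_register(int pf, const struct ex_handler *h)
{
	/* Reject out-of-range families before indexing the array. */
	if (pf < 0 || pf >= EX_NPROTO)
		return -EINVAL;

	/* The exact same handler is already registered. */
	if (ex_handlers[pf] == h)
		return -EEXIST;

	/* Somebody else holds the slot. */
	if (ex_handlers[pf] != NULL)
		return -EBUSY;

	/* Nobody was registered: take the slot. */
	ex_handlers[pf] = h;
	return 0;
}
```

A module that registers twice can then treat -EEXIST as harmless, while -EBUSY tells it that another handler owns the protocol family.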
19 years ago[NETFILTER]: add /proc/net/netfilter interface to nf_queue
Harald Welte [Wed, 10 Aug 2005 03:23:11 +0000 (20:23 -0700)]
[NETFILTER]: add /proc/net/netfilter interface to nf_queue

This patch adds a /proc/net/netfilter/nf_queue file, similar to the
recently-added /proc/net/netfilter/nf_log.  It indicates which queue
handler is registered to which protocol family.  This is useful since
there are now multiple queue handlers in the tree (ip[6]_queue,
nfnetlink_queue).

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: add correct bridging support to nfnetlink_{queue,log}
Harald Welte [Wed, 10 Aug 2005 03:22:10 +0000 (20:22 -0700)]
[NETFILTER]: add correct bridging support to nfnetlink_{queue,log}

This patch adds support for passing the real 'physical' device ifindex
down to userspace via nfnetlink_log and nfnetlink_queue.

This feature basically obsoletes net/bridge/netfilter/ebt_ulog.c, and
it is likely ebt_ulog.c will die with one of the next couple of
patches.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: split net/core/netfilter.c into net/netfilter/*.c
Harald Welte [Wed, 10 Aug 2005 03:21:49 +0000 (20:21 -0700)]
[NETFILTER]: split net/core/netfilter.c into net/netfilter/*.c

This patch doesn't introduce any code changes, but merely splits the
core netfilter code into four separate files.  It also moves it from
its old location in net/core/ to the recently-created net/netfilter/
directory.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: ip{6}_queue: prevent unregistration race with nfnetlink_queue
Harald Welte [Wed, 10 Aug 2005 03:20:54 +0000 (20:20 -0700)]
[NETFILTER]: ip{6}_queue: prevent unregistration race with nfnetlink_queue

Since nfnetlink_queue can override ip{6}_queue as queue handlers, we
can no longer blindly unregister whoever is registered for PF_INET[6],
but only unregister ourselves.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
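
"Only unregister ourselves" can be sketched as a loop that clears just the slots still pointing at the caller's own handler. The names and the returned count are assumptions made for this illustration, not the netfilter API.

```c
#include <stddef.h>

#define EX_QH_NPROTO 32	/* assumed number of protocol families */

struct ex_queue_handler {
	const char *name;
};

/* One queue handler per protocol family. */
static const struct ex_queue_handler *ex_slots[EX_QH_NPROTO];

/* Clear only the slots owned by `me`; slots that another handler has
 * taken over in the meantime are left alone. Returns how many slots
 * were cleared. */
static int ex_unregister_own(const struct ex_queue_handler *me)
{
	int pf, cleared = 0;

	for (pf = 0; pf < EX_QH_NPROTO; pf++) {
		if (ex_slots[pf] == me) {
			ex_slots[pf] = NULL;
			cleared++;
		}
	}
	return cleared;
}
```

In kernel code the table would also need locking against concurrent registration, which this sketch leaves out.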
19 years ago[NETFILTER]: fix autoloading of nfnetlink_log
Harald Welte [Wed, 10 Aug 2005 03:20:34 +0000 (20:20 -0700)]
[NETFILTER]: fix autoloading of nfnetlink_log

This patch adds the MODULE_ALIAS required for netnlink autoloading of
nfnetlink_log.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[SUNRPC]: svcsock.c needs linux/tcp.h
Andrew Morton [Wed, 10 Aug 2005 03:20:07 +0000 (20:20 -0700)]
[SUNRPC]: svcsock.c needs linux/tcp.h

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: move conntrack helper buffers from BSS to kmalloc()ed memory
Harald Welte [Wed, 10 Aug 2005 03:19:44 +0000 (20:19 -0700)]
[NETFILTER]: move conntrack helper buffers from BSS to kmalloc()ed memory

According to DaveM, it is preferable to have large data structures be
allocated dynamically from the module init() function rather than
putting them as static global variables into BSS.

This patch moves the conntrack helper packet buffers into dynamically
allocated memory.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET]: Make inet_create try to load protocol modules
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:19:14 +0000 (20:19 -0700)]
[INET]: Make inet_create try to load protocol modules

Syntax is net-pf-PROTOCOL_FAMILY-PROTOCOL-SOCK_TYPE and if this
fails net-pf-PROTOCOL_FAMILY-PROTOCOL.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
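
The two-step lookup described above (most specific alias first, then the shorter one) can be sketched as follows. The loader callback, the fake loader and the exact alias strings are assumptions for illustration: the "-proto-" and "-type-" separators are not spelled out in the commit message, and the numeric values in the fake loader are examples only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical loader callback: returns 0 when a module matching the
 * alias could be loaded, non-zero otherwise. */
typedef int (*ex_loader_fn)(const char *alias);

static int ex_attempts;	/* number of aliases tried so far */

/* Fake loader that only knows the shorter alias form. */
static int ex_fake_loader(const char *alias)
{
	ex_attempts++;
	return strcmp(alias, "net-pf-2-proto-33") == 0 ? 0 : -1;
}

/* Try family-protocol-type first, then fall back to family-protocol. */
static int ex_load_proto(ex_loader_fn load, int family, int protocol,
			 int sock_type)
{
	char alias[64];

	snprintf(alias, sizeof(alias), "net-pf-%d-proto-%d-type-%d",
		 family, protocol, sock_type);
	if (load(alias) == 0)
		return 0;

	snprintf(alias, sizeof(alias), "net-pf-%d-proto-%d",
		 family, protocol);
	return load(alias);
}
```

With the fake loader, the first, more specific alias fails and the second one succeeds, so two attempts are recorded.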
19 years ago[TG3]: Fix bug in setting a tg3_flag
Michael Chan [Wed, 10 Aug 2005 03:17:41 +0000 (20:17 -0700)]
[TG3]: Fix bug in setting a tg3_flag

Found a bug while reviewing the patches the second time.

The TG3_FLAG_TXD_MBOX_HWBUG flag is set after the register access
methods have been determined. This patch fixes it by moving it up before
the various access methods are assigned.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[TG3]: Eliminate one register write in tg3_restart_ints()
Michael Chan [Wed, 10 Aug 2005 03:17:28 +0000 (20:17 -0700)]
[TG3]: Eliminate one register write in tg3_restart_ints()

The register write to register 0x68 to restart interrupts is unnecessary
as the interrupt wasn't masked in that register by the irq handler. This
will save one register write in the fast path.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[TG3]: Add indirect register method for 5703 behind ICH
Michael Chan [Wed, 10 Aug 2005 03:17:14 +0000 (20:17 -0700)]
[TG3]: Add indirect register method for 5703 behind ICH

This patch adds the new workaround for 5703 A1/A2 if it is behind
certain ICH bridges. The workaround disables memory access and uses only
config cycles to access all registers. The 5702/03 chips can mistakenly
decode the special cycles from the ICH chipsets as memory write cycles,
causing corruption of register and memory space. Only certain ICH
bridges will drive special cycles with non-zero data during the address
phase which can fall within the 5703's address range. This is not an ICH
bug as the PCI spec allows non-zero address during special cycles.
However, only these ICH bridges are known to drive non-zero addresses
during special cycles.

The indirect_lock is also changed to spin_lock_irqsave from spin_lock_bh
because it is used in irq handler when using the indirect method to
disable interrupts.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[TG3]: Add mailbox read method
Michael Chan [Wed, 10 Aug 2005 03:17:00 +0000 (20:17 -0700)]
[TG3]: Add mailbox read method

This patch adds the mailbox read method and also adds an inline function
tw32_mailbox_f() for mailbox writes that require read flush.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[TG3]: Add various register methods
Michael Chan [Wed, 10 Aug 2005 03:16:46 +0000 (20:16 -0700)]
[TG3]: Add various register methods

This patch adds various dedicated register read/write methods for the
existing workarounds, including PCIX target workaround, write with read
flush, etc. The chips that require these workarounds will use these
dedicated access functions.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[TG3]: Add basic register access function pointers
Michael Chan [Wed, 10 Aug 2005 03:16:32 +0000 (20:16 -0700)]
[TG3]: Add basic register access function pointers

This patch adds the basic function pointers to do register accesses in
the fast path. This was suggested by David Miller. The idea is that
various register access methods for different hardware errata can easily
be implemented with these function pointers and performance will not be
degraded on chips that use normal register access methods.

The various register read write macros (e.g. tw32, tr32, tw32_mailbox)
are redefined to call the function pointers.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
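
The function-pointer approach described above can be sketched in userspace as follows. The struct, the fake register array and the `ex_tr32`/`ex_tw32` macros are hypothetical stand-ins that only mirror the shape of the idea; they are not the tg3 driver's definitions.

```c
#include <stdint.h>
#include <string.h>

struct ex_nic {
	uint32_t regs[64];		/* stand-in for the MMIO register space */
	unsigned int flush_reads;	/* read-backs done by the flush variant */
	uint32_t (*read32)(struct ex_nic *nic, uint32_t off);
	void (*write32)(struct ex_nic *nic, uint32_t off, uint32_t val);
};

/* Normal access methods. */
static uint32_t ex_read32(struct ex_nic *nic, uint32_t off)
{
	return nic->regs[off / 4];
}

static void ex_write32(struct ex_nic *nic, uint32_t off, uint32_t val)
{
	nic->regs[off / 4] = val;
}

/* Erratum variant: write, then read back to flush the posted write. */
static void ex_write32_flush(struct ex_nic *nic, uint32_t off, uint32_t val)
{
	nic->regs[off / 4] = val;
	(void)nic->regs[off / 4];
	nic->flush_reads++;
}

/* The access macros always go through the per-device pointers. */
#define ex_tr32(nic, off)	((nic)->read32((nic), (off)))
#define ex_tw32(nic, off, val)	((nic)->write32((nic), (off), (val)))

/* Choose the access methods once, after the quirk flags are known. */
static void ex_nic_init(struct ex_nic *nic, int needs_flush)
{
	memset(nic, 0, sizeof(*nic));
	nic->read32 = ex_read32;
	nic->write32 = needs_flush ? ex_write32_flush : ex_write32;
}
```

Because the methods are chosen once at init time, any quirk flag that influences the choice has to be set before that point, which is the ordering problem the "Fix bug in setting a tg3_flag" entry above describes.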
19 years ago[CCID3]: Reenable list_for_each_entry_safe_continue usage
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:16:04 +0000 (20:16 -0700)]
[CCID3]: Reenable list_for_each_entry_safe_continue usage

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[LIST]: Introduce list_for_each_entry_safe_continue
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:15:51 +0000 (20:15 -0700)]
[LIST]: Introduce list_for_each_entry_safe_continue

Used in the dccp CCID3 code, that is going to be submitted RSN.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[DCCP]: Fix checksum routines
Yoshifumi Nishida [Wed, 10 Aug 2005 03:15:35 +0000 (20:15 -0700)]
[DCCP]: Fix checksum routines

Signed-off-by: Yoshifumi Nishida <nishida@csl.sony.co.jp>
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[ICSK]: Move generalised functions from tcp to inet_connection_sock
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:15:09 +0000 (20:15 -0700)]
[ICSK]: Move generalised functions from tcp to inet_connection_sock

This also improves reqsk_queue_prune and renames it to
inet_csk_reqsk_queue_prune, as it deals with both inet_connection_sock
and inet_request_sock objects, not just with request_sock ones thus
belonging to inet_request_sock.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[DCCP]: Initial implementation
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:14:34 +0000 (20:14 -0700)]
[DCCP]: Initial implementation

Development to this point was done on a subversion repository at:

http://oops.ghostprotocols.net:81/cgi-bin/viewcvs.cgi/dccp-2.6/

This repository will be kept at this site for the foreseeable future,
so that interested parties can see the history of this code,
attributions, etc.

If I ever decide to take this offline I'll provide the full history at
some other suitable place.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[RANDOM]: Introduce secure_dccp_sequence_number
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:12:30 +0000 (20:12 -0700)]
[RANDOM]: Introduce secure_dccp_sequence_number

Code contributed by Stephen Hemminger.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NET]: Export symbols needed by the current DCCP code
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:12:12 +0000 (20:12 -0700)]
[NET]: Export symbols needed by the current DCCP code

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[ICSK]: Introduce reqsk_queue_prune from code in tcp_synack_timer
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:11:56 +0000 (20:11 -0700)]
[ICSK]: Introduce reqsk_queue_prune from code in tcp_synack_timer

With this we're very close to getting all of the current TCP
refactorings in my dccp-2.6 tree merged, next changeset will export
some functions needed by the current DCCP code and then dccp-2.6.git
will be born!

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[ICSK]: Generalise tcp_listen_{start,stop}
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:11:41 +0000 (20:11 -0700)]
[ICSK]: Generalise tcp_listen_{start,stop}

This also moved inet_iif from tcp to inet_hashtables.h, as it is
needed by the inet_lookup callers. Perhaps this needs a bit of
polishing, but for now it seems fine.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[ICSK]: Introduce inet_csk_clone
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:11:24 +0000 (20:11 -0700)]
[ICSK]: Introduce inet_csk_clone

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NET]: Just move the inet_connection_sock function from tcp sources
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:11:08 +0000 (20:11 -0700)]
[NET]: Just move the inet_connection_sock function from tcp sources

Completing the previous changeset, this also generalises tcp_v4_synq_add,
renaming it to inet_csk_reqsk_queue_hash_add, already being used in the
DCCP tree, which I plan to merge RSN.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NET]: Introduce inet_connection_sock
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:10:42 +0000 (20:10 -0700)]
[NET]: Introduce inet_connection_sock

This creates struct inet_connection_sock, moving members out of struct
tcp_sock that are shareable with other INET connection oriented
protocols, such as DCCP, that in my private tree already uses most of
these members.

The functions that operate on these members were renamed, using an
inet_csk_ prefix while not being moved yet to a new file, so as to
ease the review of these changes.
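
A rough sketch of the layering this describes, with every struct and field name invented for illustration (the real members are not listed here). Each level embeds the more generic one as its first member, so a pointer to the base sock can be viewed as the connection-oriented level by any protocol.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily trimmed stand-ins for the real structures. */
struct sock_sketch { int state; };

struct inet_sock_sketch {
	struct sock_sketch sk;		/* must stay first */
	unsigned short sport;
};

/* Members shared by connection-oriented INET protocols live here. */
struct icsk_sketch {
	struct inet_sock_sketch inet;	/* must stay first */
	unsigned int rto;
};

struct tcp_sock_sketch {
	struct icsk_sketch icsk;	/* must stay first */
	unsigned int snd_cwnd;		/* TCP-only state */
};

struct dccp_sock_sketch {
	struct icsk_sketch icsk;	/* must stay first */
	unsigned char ccid;		/* DCCP-only state */
};

/* Valid because each embedded struct sits at offset 0 of its container. */
static struct icsk_sketch *icsk_of(struct sock_sketch *sk)
{
	assert(sk != NULL);
	return (struct icsk_sketch *)sk;
}

/* A protocol-independent helper: works for TCP and DCCP sockets alike. */
static void icsk_set_rto(struct sock_sketch *sk, unsigned int rto)
{
	icsk_of(sk)->rto = rto;
}
```

Code written against the shared level, such as icsk_set_rto above, never needs to know which protocol owns the socket.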

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[SOCK]: Introduce sk_clone
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:10:12 +0000 (20:10 -0700)]
[SOCK]: Introduce sk_clone

Out of tcp_create_openreq_child, will be used in
dccp_create_openreq_child, and is a nice sock function anyway.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET_TWSK]: Introduce inet_twsk_alloc
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:09:59 +0000 (20:09 -0700)]
[INET_TWSK]: Introduce inet_twsk_alloc

With the parts of tcp_time_wait that are not TCP specific, tcp_time_wait uses
it and so will dccp_time_wait.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET]: Generalise the TCP sock ID lookup routines
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:09:46 +0000 (20:09 -0700)]
[INET]: Generalise the TCP sock ID lookup routines

And also some TIME_WAIT functions.

[acme@toy net-2.6.14]$ grep built-in /tmp/before.size /tmp/after.size
/tmp/before.size: 282955   13122    9312  305389   4a8ed net/ipv4/built-in.o
/tmp/after.size:  281566   13122    9312  304000   4a380 net/ipv4/built-in.o
[acme@toy net-2.6.14]$

I kept them still inlined, will uninline at some point to see what
would be the performance difference.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET]: Generalise tcp_tw_bucket, aka TIME_WAIT sockets
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:09:30 +0000 (20:09 -0700)]
[INET]: Generalise tcp_tw_bucket, aka TIME_WAIT sockets

This paves the way to generalise the rest of the sock ID lookup
routines and saves some bytes in TCPv4 TIME_WAIT sockets on distro
kernels (where IPv6 is always built as a module):

[root@qemu ~]# grep tw_sock /proc/slabinfo
tw_sock_TCPv6  0  0  128  31  1
tw_sock_TCP    0  0   96  41  1
[root@qemu ~]#

Now if a protocol wants to use the TIME_WAIT generic infrastructure it
only has to set the sk_prot->twsk_obj_size field with the size of its
inet_timewait_sock derived sock and proto_register will create
sk_prot->twsk_slab. For now it's only for INET sockets, but we can
introduce timewait_sock later if some non-INET transport protocol
wants to use this stuff.
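
The registration contract above can be pictured roughly as follows. This is a user-space sketch under stated assumptions: the _sketch types and functions are hypothetical, and a recorded size plus calloc stands in for the real slab cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Generic TIME_WAIT part, shared by every protocol (fields invented). */
struct tw_sock_sketch { int family; int state; };

/* A protocol's own TIME_WAIT sock embeds the generic part first. */
struct tcp_tw_sketch {
	struct tw_sock_sketch tw;
	unsigned int rcv_nxt, snd_nxt;
};

struct proto_sketch {
	const char *name;
	size_t twsk_obj_size;	/* 0: protocol has no TIME_WAIT socks */
	size_t twsk_slab_size;	/* stand-in for the cache made at registration */
};

/* Registration creates the TIME_WAIT cache only when a size was given. */
static int proto_register_sketch(struct proto_sketch *p)
{
	assert(p != NULL);
	p->twsk_slab_size = 0;
	if (p->twsk_obj_size != 0) {
		/* The derived sock must at least hold the generic part. */
		if (p->twsk_obj_size < sizeof(struct tw_sock_sketch))
			return -1;
		p->twsk_slab_size = p->twsk_obj_size;
	}
	return 0;
}

/* Generic allocator: sized by the protocol, usable by shared code. */
static struct tw_sock_sketch *twsk_alloc_sketch(const struct proto_sketch *p)
{
	if (p->twsk_slab_size == 0)
		return NULL;
	return calloc(1, p->twsk_slab_size);
}
```

The protocol contributes one number, and the generic code does the rest, which is why a separate cache per protocol can hold differently sized objects as in the slabinfo output above.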

Next changesets will take advantage of this new infrastructure to
generalise even more TCP code.

[acme@toy net-2.6.14]$ grep built-in /tmp/before.size /tmp/after.size
/tmp/before.size: 188646   11764    5068  205478   322a6 net/ipv4/built-in.o
/tmp/after.size:  188144   11764    5068  204976   320b0 net/ipv4/built-in.o
[acme@toy net-2.6.14]$

Tested with both IPv4 & IPv6 (::1 (localhost) & ::ffff:172.20.0.1
(qemu host)).

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET]: Generalise tcp_v4_lookup_listener
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:09:06 +0000 (20:09 -0700)]
[INET]: Generalise tcp_v4_lookup_listener

[acme@toy net-2.6.14]$ grep built-in /tmp/before /tmp/after
/tmp/before: 282560       13122    9312  304994   4a762 net/ipv4/built-in.o
/tmp/after:  282560       13122    9312  304994   4a762 net/ipv4/built-in.o

Will be used in DCCP, not exporting it right now not to get in Adrian
Bunk's exported-but-not-used-on-modules radar 8)

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET]: Generalise tcp_v4_hash & tcp_unhash
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:08:50 +0000 (20:08 -0700)]
[INET]: Generalise tcp_v4_hash & tcp_unhash

It really just makes the existing code be a helper function that
tcp_v4_hash and tcp_unhash use, specifying the right inet_hashinfo,
tcp_hashinfo.

One thing I'll investigate at some point is to have the inet_hashinfo
pointer in sk_prot, so that we get all the hashtable information from
the sk pointer, this can lead to some extra indirections that may well
hurt performance/code size, we'll see. Ultimate idea would be that
sk_prot would provide _all_ the information about a protocol
implementation.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[TCP]: Move the tcp sock states to net/tcp_states.h
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:08:28 +0000 (20:08 -0700)]
[TCP]: Move the tcp sock states to net/tcp_states.h

Lots of places just need the states; not even linux/tcp.h, where this
enum was, needs it.

This speeds up development of the refactorings as less sources are
rebuilt when things get moved from net/tcp.h.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET]: Generalise the tcp_listen_ lock routines
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:08:09 +0000 (20:08 -0700)]
[INET]: Generalise the tcp_listen_ lock routines

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET]: Move tcp_port_rover to inet_hashinfo
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:07:35 +0000 (20:07 -0700)]
[INET]: Move tcp_port_rover to inet_hashinfo

Also expose all of the tcp_hashinfo members, i.e. killing those
tcp_ehash, etc macros, this will more clearly expose already generic
functions and some that need just a bit of work to become generic, as
we'll see in the upcoming changesets.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET]: Generalise tcp_bind_hash & tcp_inherit_port
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:07:13 +0000 (20:07 -0700)]
[INET]: Generalise tcp_bind_hash & tcp_inherit_port

This required moving tcp_bucket_cachep to inet_hashinfo.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: fix list traversal order in ctnetlink
Pablo Neira Ayuso [Wed, 10 Aug 2005 03:06:42 +0000 (20:06 -0700)]
[NETFILTER]: fix list traversal order in ctnetlink

Currently conntracks are inserted after the head. That means that
conntracks are sorted from the biggest to the smallest id. This happens
because we use list_prepend (list_add) instead of list_add_tail. This can
result in problems during the list iteration.

                 list_for_each(i, &ip_conntrack_hash[cb->args[0]]) {
                         h = (struct ip_conntrack_tuple_hash *) i;
                         if (DIRECTION(h) != IP_CT_DIR_ORIGINAL)
                                 continue;
                         ct = tuplehash_to_ctrack(h);
                         if (ct->id <= *id)
                                 continue;

In that case just the first conntrack in the bucket will be dumped. To
fix this, we iterate the list from the tail to the head via
list_for_each_prev. Same thing for the list of expectations.
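
To make the failure concrete, here is a small simulation. It models one bucket as an array in head-to-tail order and assumes the dump advances its cursor to each id it emits; the function names are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * ids[] holds one hash bucket in head-to-tail order. Head insertion
 * (list_add) leaves the newest, i.e. largest, id at index 0.
 * *last_id is the resume cursor kept between dump calls.
 */
static size_t dump_forward(const unsigned int *ids, size_t n,
			   unsigned int *last_id)
{
	size_t i, dumped = 0;

	assert(last_id != NULL);
	for (i = 0; i < n; i++) {
		if (ids[i] <= *last_id)
			continue;	/* treated as "already dumped" */
		*last_id = ids[i];
		dumped++;
	}
	return dumped;
}

/* Same filter, walking tail to head as list_for_each_prev would. */
static size_t dump_reverse(const unsigned int *ids, size_t n,
			   unsigned int *last_id)
{
	size_t i, dumped = 0;

	assert(last_id != NULL);
	for (i = n; i-- > 0; ) {
		if (ids[i] <= *last_id)
			continue;
		*last_id = ids[i];
		dumped++;
	}
	return dumped;
}
```

With a bucket of {3, 2, 1} and a cursor starting at 0, the forward walk emits id 3 and then skips everything else, while the reverse walk sees ids in ascending order and emits all three.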

Signed-off-by: Pablo Neira Ayuso <pablo@eurodev.net>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: Fix typo in ctnl_exp_cb array (no bug, just memory waste)
Pablo Neira Ayuso [Wed, 10 Aug 2005 03:06:27 +0000 (20:06 -0700)]
[NETFILTER]: Fix typo in ctnl_exp_cb array (no bug, just memory waste)

This fixes the size of the ctnl_exp_cb array that is IPCTNL_MSG_EXP_MAX
instead of IPCTNL_MSG_MAX. Simple typo.

Signed-off-by: Pablo Neira Ayuso <pablo@eurodev.net>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: fix conntrack refcount leak in unlink_expect()
Pablo Neira Ayuso [Wed, 10 Aug 2005 03:06:11 +0000 (20:06 -0700)]
[NETFILTER]: fix conntrack refcount leak in unlink_expect()

In unlink_expect(), the expectation is removed from the list so the
refcount must be dropped as well.

Signed-off-by: Pablo Neira Ayuso <pablo@eurodev.net>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: ctnetlink: make sure event order is correct
Pablo Neira Ayuso [Wed, 10 Aug 2005 03:05:52 +0000 (20:05 -0700)]
[NETFILTER]: ctnetlink: make sure event order is correct

The following sequence is displayed during events dumping of an ICMP
connection: [NEW] [DESTROY] [UPDATE]

This happens because the event IPCT_DESTROY is delivered in
death_by_timeout(), that is called from the icmp protocol helper
(ct->timeout.function) once we see the reply.

To fix this, we move this event to destroy_conntrack().

Signed-off-by: Pablo Neira Ayuso <pablo@eurodev.net>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: don't use nested attributes for conntrack_expect
Harald Welte [Wed, 10 Aug 2005 03:04:07 +0000 (20:04 -0700)]
[NETFILTER]: don't use nested attributes for conntrack_expect

We used to use nested nfattr structures for ip_conntrack_expect.  This is
bogus, since ip_conntrack and ip_conntrack_expect are communicated in
different netlink message types.  Both should be encoded at the top level
attributes, no extra nesting required.  This patch addresses the issue.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: cleanup nfnetlink_check_attributes()
Harald Welte [Wed, 10 Aug 2005 03:03:54 +0000 (20:03 -0700)]
[NETFILTER]: cleanup nfnetlink_check_attributes()

1) memset return parameter 'cda' (nfattr pointer array) only on success
2) a message without attributes and just a 'struct nfgenmsg' is valid,
   don't return -EINVAL
3) use likely() and unlikely() where appropriate
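
Points 1 and 2 can be sketched in user space as below. This is not the nfnetlink code: the attribute layout, the validation rule and the function name are assumptions, and the branch hints from point 3 are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute: types are 1-based, 0 is invalid. */
struct attr_sketch { unsigned short type; unsigned short len; };

/*
 * attrs points at the n_attrs attributes that follow the fixed header;
 * cda receives one pointer per attribute type (attr_count slots).
 */
static int check_attributes_sketch(const struct attr_sketch *attrs,
				   size_t n_attrs,
				   const struct attr_sketch **cda,
				   size_t attr_count)
{
	size_t i;

	assert(cda != NULL);

	/* Validate first; cda is left untouched on every error path. */
	for (i = 0; i < n_attrs; i++)
		if (attrs[i].type == 0 || attrs[i].type > attr_count)
			return -EINVAL;

	/* Success path: only now clear the output array and fill it. */
	memset(cda, 0, attr_count * sizeof(*cda));
	for (i = 0; i < n_attrs; i++)
		cda[attrs[i].type - 1] = &attrs[i];

	return 0;	/* n_attrs == 0 (header only) is valid too */
}
```

A caller that gets an error back can rely on its array being unchanged, and a header-only message simply yields an array of empty slots.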

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: attribute count is an attribute of message type, not subsystem
Harald Welte [Wed, 10 Aug 2005 03:03:40 +0000 (20:03 -0700)]
[NETFILTER]: attribute count is an attribute of message type, not subsystem

Prior to this patch, every nfnetlink subsystem had to specify its
attribute count.  However, in reality the attribute count depends on
the message type within the subsystem, not the subsystem itself.  This
patch moves 'attr_count' from 'struct nfnetlink_subsys' into
nfnl_callback to fix this.
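
The resulting shape can be sketched like this. The struct layouts are simplified guesses, not the real definitions, and the handler signature, table contents and attr_limit helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: each message type carries its own attribute count. */
struct cb_sketch {
	int (*call)(void);		/* handler, signature simplified */
	unsigned int attr_count;	/* highest attribute this message accepts */
};

struct subsys_sketch {
	const char *name;
	unsigned int cb_count;
	const struct cb_sketch *cb;	/* indexed by message type */
};

static int noop_handler(void) { return 0; }

/* Two message types of one subsystem, with different attribute limits. */
static const struct cb_sketch demo_cb[] = {
	{ noop_handler, 12 },
	{ noop_handler, 6 },
};

static const struct subsys_sketch demo_subsys = { "demo", 2, demo_cb };

/* Return the attribute limit for one message type, or -1 if unknown. */
static int attr_limit(const struct subsys_sketch *ss, unsigned int msg_type)
{
	assert(ss != NULL);
	if (msg_type >= ss->cb_count || ss->cb[msg_type].call == NULL)
		return -1;
	return (int)ss->cb[msg_type].attr_count;
}
```

With the count stored per callback, a subsystem whose message types take different attribute sets no longer has to advertise one limit for all of them.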

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: fix ctnetlink 'create_expect' parsing
Harald Welte [Wed, 10 Aug 2005 03:03:22 +0000 (20:03 -0700)]
[NETFILTER]: fix ctnetlink 'create_expect' parsing

There was a stupid copy+paste mistake where we parse the MASK nfattr into
the "tuple" variable instead of the "mask" variable.  This patch fixes it.
Thanks to Pablo Neira.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: conntrack_netlink: Fix locking during conntrack_create
Pablo Neira [Wed, 10 Aug 2005 03:02:55 +0000 (20:02 -0700)]
[NETFILTER]: conntrack_netlink: Fix locking during conntrack_create

The current codepath allowed for ip_conntrack_lock to be unlocked twice.

Signed-off-by: Pablo Neira <pablo@eurodev.net>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: remove bogus memset() calls from ip_conntrack_netlink.c
Pablo Neira [Wed, 10 Aug 2005 03:02:36 +0000 (20:02 -0700)]
[NETFILTER]: remove bogus memset() calls from ip_conntrack_netlink.c

nfattr_parse_nested() calls nfattr_parse() which in turn does a memset
on the 'tb' array.  All callers therefore don't need to memset before
calling it.

Signed-off-by: Pablo Neira <pablo@eurodev.net>
Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: Fix multiple problems with the conntrack event cache
Patrick McHardy [Wed, 10 Aug 2005 03:02:13 +0000 (20:02 -0700)]
[NETFILTER]: Fix multiple problems with the conntrack event cache

refcnt underflow: the reference count is decremented when a conntrack
entry is removed from the hash but it is not incremented when entering
new entries.

missing protection of process context against softirq context: all
cache operations need to locally disable softirqs to avoid races.
Additionally the event cache can't be initialized when a packet
enters the conntrack code but needs to be initialized whenever we
cache an event and the stored conntrack entry doesn't match the
current one.

incorrect flushing of the event cache in ip_ct_iterate_cleanup:
without real locking we can't flush the cache for different CPUs
without incurring races. The cache for different CPUs can only be
flushed when no packets are going through the
code. ip_ct_iterate_cleanup doesn't need to drop all references, so
flushing is moved to the cleanup path.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET]: Move bind_hash from tcp_sk to inet_sk
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:01:14 +0000 (20:01 -0700)]
[INET]: Move bind_hash from tcp_sk to inet_sk

This should really be in an inet_connection_sock, but I'm leaving it
for a later optimization, when some more fields common to INET
transport protocols now in tcp_sk or inet_sk will be chunked out into
inet_connection_sock. For now it's better to concentrate on getting the
changes in the core merged to leave the DCCP tree with only DCCP
specific code.

Next changesets will take advantage of this move to generalise things
like tcp_bind_hash, tcp_put_port, tcp_inherit_port, making the latter
receive an inet_hashinfo parameter, and even __tcp_tw_hashdance, etc in
the future, when tcp_tw_bucket gets transformed into the struct
timewait_sock hierarchy.

tcp_destroy_sock also is eligible as soon as tcp_orphan_count gets
moved to sk_prot.

A cascade of incremental changes will ultimately make the tcp_lookup
functions be fully generic.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET]: Move the TCP hashtable functions/structs to inet_hashtables.[ch]
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 03:00:51 +0000 (20:00 -0700)]
[INET]: Move the TCP hashtable functions/structs to inet_hashtables.[ch]

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET]: Just rename the TCP hashtable functions/structs to inet_
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 02:59:44 +0000 (19:59 -0700)]
[INET]: Just rename the TCP hashtable functions/structs to inet_

This is to break down the complexity of the series of patches,
making it very clear that this one just does:

1. Renames tcp_ prefixed hashtable functions and data structures that
   were already mostly generic to inet_ to share them with DCCP and
   other INET transport protocols.

2. Removes unused functions (__tb_head & tb_head)

3. Removes some leftover prototypes in the headers (tcp_bucket_unlock &
   tcp_v4_build_header)

Next changesets will move tcp_sk(sk)->bind_hash to inet_sock so that we can
make functions such as tcp_inherit_port, __tcp_inherit_port, tcp_v4_get_port,
__tcp_put_port,  generic and get others like tcp_destroy_sock closer to generic
(tcp_orphan_count will go to sk->sk_prot to allow this).

Eventually most of these functions will be used passing the transport protocol
inet_hashinfo structure.

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[INET]: Move the TCP ehash functions to include/net/inet_hashtables.h
Arnaldo Carvalho de Melo [Wed, 10 Aug 2005 02:59:20 +0000 (19:59 -0700)]
[INET]: Move the TCP ehash functions to include/net/inet_hashtables.h

To be shared with DCCP (and others), this is the start of a series of patches
that will expose the already generic TCP hash table routines.

The few changes noticed when calling gcc -S before/after on a pentium4 were of
this type:

        movl    40(%esp), %edx
        cmpl    %esi, 472(%edx)
        je      .L168
-       pushl   $291
+       pushl   $272
        pushl   $.LC0
        pushl   $.LC1
        pushl   $.LC2

[acme@toy net-2.6.14]$ size net/ipv4/tcp_ipv4.before.o net/ipv4/tcp_ipv4.after.o
   text    data     bss     dec     hex filename
  17804     516     140   18460    481c net/ipv4/tcp_ipv4.before.o
  17804     516     140   18460    481c net/ipv4/tcp_ipv4.after.o

Holler if some weird architecture has issues with things like this 8)

Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 years ago[NETFILTER]: Add new "nfnetlink_log" userspace packet logging facility
Harald Welte [Wed, 10 Aug 2005 02:58:39 +0000 (19:58 -0700)]
[NETFILTER]: Add new "nfnetlink_log" userspace packet logging facility

This is a generic (layer3 independent) version of what ipt_ULOG is already
doing for IPv4 today.  ipt_ULOG, ebt_ulog and finally also ip[6]t_LOG will
be deprecated by this mechanism in the long term.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>