git.karo-electronics.de Git - linux-beck.git/log

[NETFILTER]: conntrack: fix SYSCTL=n compile

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: nf_queue: handle NF_STOP and unknown verdicts in nf_reinject

In case of an unknown verdict or NF_STOP the packet leaks. Unknown verdicts
can happen when userspace is buggy. Reinject the packet in case of NF_STOP,
drop on unknown verdicts.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NETFILTER]: H.323 helper: fix possible NULL-ptr dereference

An RCF message containing a timeout results in a NULL-ptr dereference if
no RRQ has been seen before.

Noticed by the "SATURN tool", reported by Thomas Dillig <tdillig@stanford.edu>
and Isil Dillig <isil@stanford.edu>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Correct dev_alloc_skb kerneldoc

dev_alloc_skb is designated for RX descriptors, not TX. (Some drivers
use it for the latter anyway, but that's a different story)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Remove CONFIG_HAVE_ARCH_DEV_ALLOC_SKB

skbuff.h has an #ifndef CONFIG_HAVE_ARCH_DEV_ALLOC_SKB to allow
architectures to reimplement __dev_alloc_skb. It's not set on any
architecture and now that we have an architecture-overrideable
NET_SKB_PAD there is not point at all to have one either.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

[VLAN]: Fix link state propagation

When the queue of the underlying device is stopped at initialization time
or the device is marked "not present", the state will be propagated to the
vlan device and never change. Based on an analysis by Patrick McHardy.

Signed-off-by: Stefan Rompf <stefan@loplof.de>
ACKed-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV6] xfrm6_tunnel: Delete debugging code.

It doesn't compile, and it's dubious in several regards:

1) is enabled by non-Kconfig controlled CONFIG_* value
(noted by Randy Dunlap)
2) XFRM6_TUNNEL_SPI_MAGIC is defined after it's first use
3) the debugging messages print object pointer addresses
which have no meaning without context

So let's just get rid of it.

Signed-off-by: David S. Miller <davem@davemloft.net>

[Bluetooth] Enable SCO support for Broadcom HID proxy dongle

The Broadcom dongles with HID proxy support actually support SCO over
HCI if the SCO buffer size values are corrected. So instead of disabling
the SCO support, mark this dongle with the quirk for the Bluetooth core
to correct the wrong buffer size values.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[Bluetooth] Add quirk for another broken RTX Telecom based dongle

This patch disables the ISOC transfers for another broken RTX Telecom
based USB dongle. Starting the USB ISOC transfers only ends in a burst
of error messages for invalid SCO packets on connection handle 0.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[Bluetooth] Correct SCO buffer size for Belkin devices

The Belkin F8T012 and F8T013 devices are both based on a Bluetooth chip
from Broadcom and their SCO buffer size values are wrong. The Bluetooth
core should correct these values.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[Bluetooth] Correct SCO buffer size for another Broadcom chip

The SCO buffer size values on IBM/Lenovo ThinkPad laptops with a
Bluetooth chip from Broadcom are wrong. The USB Bluetooth driver
has to set a quirk to correct the SCO buffer size values.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[Bluetooth] Correct RFCOMM channel MTU for broken implementations

Some Bluetooth RFCOMM implementations try to negotiate a bigger channel
MTU than we can support for a particular session. The maximum MTU for
a RFCOMM session is limited through the L2CAP layer. So if the other
side proposes a channel MTU that is bigger than the underlying L2CAP
MTU, we should reduce it to the L2CAP MTU of the session minus five
bytes for the RFCOMM headers.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>

[PKT_SCHED]: Fix regression in PSCHED_TADD{,2}.

In PSCHED_TADD and PSCHED_TADD2, if delta is less than tv.tv_usec (so,
less than USEC_PER_SEC too) then tv_res will be smaller than tv. The
affectation "(tv_res).tv_usec = __delta;" is wrong. The fix is to
revert to the original code before
4ee303dfeac6451b402e3d8512723d3a0f861857 and change the 'if' in
'while'.

[Shuya MAEDA: "while (__delta >= USEC_PER_SEC){ ... }" instead of
"while (__delta > USEC_PER_SEC){ ... }"]

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

[DCCP]: Fix default sequence window size

When using the default sequence window size (100) I got the following in
my logs:

Jun 22 14:24:09 localhost kernel: [ 1492.114775] DCCP: Step 6 failed for
DATA packet, (LSWL(6279674225) <= P.seqno(6279674749) <=
S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <=
P.ackno(1125899906842620) <= S.AWH(18798206548), sending SYNC...
Jun 22 14:24:09 localhost kernel: [ 1492.115147] DCCP: Step 6 failed for
DATA packet, (LSWL(6279674225) <= P.seqno(6279674750) <=
S.SWH(6279674324)) and (P.ackno doesn't exist or LAWL(18798206530) <=
P.ackno(1125899906842620) <= S.AWH(18798206549), sending SYNC...

I went to alter the default sysctl and it didn't take for new sockets.
Below patch fixes this.

I think the default is too low but it is what the DCCP spec specifies.

As a side effect of this my rx speed using iperf goes from about 2.8 Mbits/sec
to 3.5. This is still far too slow but it is a step in the right direction.

Compile tested only for IPv6 but not particularly complex change.

Signed off by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

IB/mthca: Initialize max_cmds before debug code prints it

Read the max_cmds value from the response to the QUERY_FW command
before printing out the value, so that the real value goes into the
debug output.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipoib: Fix packet loss after hardware address update

The neighbour ha field may get updated without destroying the
neighbour. In this case, the ha field gets out of sync with the
address handle stored in ipoib_neigh->ah, with the result that
the ah field would point to an incorrect path, resulting in all
packets being lost.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipoib: Fix oops with ipoib_debug_mcast set

Need to set mcast->ah before debug code dereferences it.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mad: Validate MADs for spec compliance

Validate MADs sent by userspace clients for spec compliance with
C13-18.1.1 (prevent duplicate requests and responses sent on the
same port). Without this, RMPP transactions get aborted because
of duplicate packets.

This patch is similar to that provided by Jack Morgenstein.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: ipath_skip_sge() can break if num_sge > 1

ipath_skip_sge() doesn't exactly duplicate the side effects of
ipath_copy_sge() if num_sge > 1 since it doesn't decrement ss->num_sge.
This could result in the sg_list being accessed out of bounds.
Since ipath_skip_sge() is almost always called with num_sge == 1,
the original "optimization" is almost never used.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Fix ib_ipath driver to work with SRP

I am still working on a proposal to remove the phys_to_virt() calls
in the ib_ipath driver. In the mean time, this patch allows SRP
to work by fixing the R_Key check and conversion from IB address
to kernel virtual address. It also returns the correct page size
for FMRs.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Fix a data corruption

This patch fixes a problem where certain error packets are passed
to the InfiniBand layer for processing even though the packet
actually was received with an error.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mthca: Fix SRQ limit event range check

Mem-free HCAs always keep one spare SRQ WQE, so the SRQ limit cannot
be set beyond srq->max - 1.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/uverbs: Fix lockdep warnings

Lockdep warns because uverbs is trying to take uobj->mutex when it
already holds that lock. This is because there are really multiple
types of uobjs even though all of their locks are initialized in
common code.

Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/uverbs: Fix unlocking in error paths

ib_uverbs_create_ah() and ib_uverbs_create_srq() did not release the
PD's read lock in their error paths, which lead to deadlock when
destroying the PD.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

[PATCH] Cpuset: fix ABBA deadlock with cpu hotplug lock

Fix ABBA deadlock between lock_cpu_hotplug() and the cpuset
callback_mutex lock.

It only happens on cpu_exclusive cpusets, due to the dynamic
sched domain code trying to take the cpu hotplug lock inside
the cpuset callback_mutex lock.

This bug has apparently been here for several months, but didn't
get hit until the right customer load on a large system.

This fix appears right from inspection, but it will take a few
more days running it on that customers workload to be confident
we nailed it.  We don't have any other reproducible test case.

The cpu_hotplug_lock() tends to cover large runs of code.
The other places that hold both that lock and the cpuset callback
mutex lock always nest the cpuset lock inside the hotplug lock.
This place tries to do the reverse, risking an ABBA deadlock.

This is in the cpuset_rmdir() code, where we:
  * take the callback_mutex lock
  * mark the cpuset CS_REMOVED
  * call update_cpu_domains for cpu_exclusive cpusets
  * in that call, take the cpu_hotplug lock if the
    cpuset is marked for removal.

Thanks to Jack Steiner for identifying this deadlock.

The fix is to tear down the dynamic sched domain before we grab
the cpuset callback_mutex lock.  This way, the two locks are
serialized, with the hotplug lock taken and released before
trying for the cpuset lock.

I suspect that this bug was introduced when I changed the
cpuset locking from one lock to two.  The dynamic sched domain
dependency on cpu_exclusive cpusets and its hotplug hooks were
added to this code earlier, when cpusets had only a single lock.
It may well have been fine then.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

cpu hotplug: simplify and hopefully fix locking

The CPU hotplug locking was quite messy, with a recursive lock to
handle the fact that both the actual up/down sequence wanted to
protect itself from being re-entered, but the callbacks that it
called also tended to want to protect themselves from CPU events.

This splits the lock into two (one to serialize the whole hotplug
sequence, the other to protect against the CPU present bitmaps
changing). The latter still allows recursive usage because some
subsystems (ondemand policy for cpufreq at least) had already gotten
too used to the lax locking, but the locking mistakes are hopefully
now less fundamental, and we now warn about recursive lock usage
when we see it, in the hope that it can be fixed.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[cpufreq] ondemand: make shutdown sequence more robust

Shutting down the ondemand policy was fraught with potential
problems, causing issues for SMP suspend (which wants to hot-
unplug) all but the last CPU.

This should fix at least the worst problems (divide-by-zero
and infinite wait for the workqueue to shut down).

Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
  [TIPC]: Removing useless casts
  [IPV4]: Fix nexthop realm dumping for multipath routes
  [DUMMY]: Avoid an oops when dummy_init_one() failed
  [IFB] After ifb_init_one() failed, i is increased. Decrease
  [NET]: Fix reversed error test in netif_tx_trylock
  [MAINTAINERS]: Mark LAPB as Oprhan.
  [NET]: Conversions from kmalloc+memset to k(z|c)alloc.
  [NET]: sun happymeal, little pci cleanup
  [IrDA]: Use alloc_skb() in IrDA TX path
  [I/OAT]: Remove pci_module_init() from Intel I/OAT DMA engine
  [I/OAT]: net/core/user_dma.c should #include <net/netdma.h>
  [SCTP]: ADDIP: Don't use an address as source until it is ASCONF-ACKed
  [SCTP]: Set chunk->data_accepted only if we are going to accept it.
  [SCTP]: Verify all the paths to a peer via heartbeat before using them.
  [SCTP]: Unhash the endpoint in sctp_endpoint_free().
  [SCTP]: Check for NULL arg to sctp_bucket_destroy().
  [PKT_SCHED] netem: Fix slab corruption with netem (2nd try)
  [WAN]: Converted synclink drivers to use netif_carrier_*()
  [WAN]: Cosmetic changes to N2 and C101 drivers
  [WAN]: Added missing netif_dormant_off() to generic HDLC
  ...

[TIPC]: Removing useless casts

Removing useless casts

Signed-off-by: Panagiotis Issaris <takis@issaris.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: Fix nexthop realm dumping for multipath routes

Routing realms exist per nexthop, but are only returned to userspace
for the first nexthop. This is due to the fact that iproute2 only
allows to set the realm for the first nexthop and the kernel refuses
multipath routes where only a single realm is present.

Dump all realms for multipath routes to enable iproute to correctly
display them.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

[DUMMY]: Avoid an oops when dummy_init_one() failed

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IFB] After ifb_init_one() failed, i is increased. Decrease

It before entering in the loop for freeing the other ifb devices.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Fix reversed error test in netif_tx_trylock

A non-zero return value indicates success from spin_trylock,
not error.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

[MAINTAINERS]: Mark LAPB as Oprhan.

Maintainer email not longer exists.

Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: Conversions from kmalloc+memset to k(z|c)alloc.

Signed-off-by: Panagiotis Issaris <takis@issaris.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[NET]: sun happymeal, little pci cleanup

Use pci_register_driver instead of pci_module_init. Use PCI_DEVICE macro.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IrDA]: Use alloc_skb() in IrDA TX path

As pointed out by Christoph Hellwig, dev_alloc_skb() is not intended to be
used for allocating TX sk_buff. The IrDA stack was exclusively calling
dev_alloc_skb() on the TX path, and this patch fixes that.

Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[I/OAT]: Remove pci_module_init() from Intel I/OAT DMA engine

Changes pci_module_init() to pci_register_driver().

Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

[I/OAT]: net/core/user_dma.c should #include <net/netdma.h>

Every file should #include the headers containing the prototypes for
its global functions.

Especially in cases like this one where gcc can tell us through a
compile error that the prototype was wrong...

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: ADDIP: Don't use an address as source until it is ASCONF-ACKed

This implements Rules D1 and D4 of Sec 4.3 in the ADDIP draft.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: Set chunk->data_accepted only if we are going to accept it.

Currently there is a code path in sctp_eat_data() where it is possible
to set this flag even when we are dropping this chunk.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: Verify all the paths to a peer via heartbeat before using them.

This patch implements Path Initialization procedure as described in
Sec 2.36 of RFC4460.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: Unhash the endpoint in sctp_endpoint_free().

This prevents a race between the close of a socket and receive of an
incoming packet.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SCTP]: Check for NULL arg to sctp_bucket_destroy().

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

[PKT_SCHED] netem: Fix slab corruption with netem (2nd try)

CONFIG_DEBUG_SLAB found the following bug:
netem_enqueue() in sch_netem.c gets a pointer inside a slab object:
struct netem_skb_cb *cb = (struct netem_skb_cb *)skb->cb;
But then, the slab object may be freed:
skb = skb_unshare(skb, GFP_ATOMIC)
cb is still pointing inside the freed skb, so here is a patch to
initialize cb later, and make it clear that initializing it sooner
is a bad idea.

[From Stephen Hemminger: leave cb unitialized in order to let gcc
complain in case of use before initialization]

Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

[WAN]: Converted synclink drivers to use netif_carrier_*()

WAN: Converted synclink drivers to use netif_carrier_*() instead
of hdlc_set_carrier().

Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

[WAN]: Cosmetic changes to N2 and C101 drivers

WAN: Cosmetic changes to N2 and C101 drivers

Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

[WAN]: Added missing netif_dormant_off() to generic HDLC

WAN: Fixed a problem with PPP/raw HDLC/X.25 protocols not doing
netif_dormant_off() at startup.

Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

[IPV4]: Get rid of redundant IPCB->opts initialisation

Now that we always zero the IPCB->opts in ip_rcv, it is no longer
necessary to do so before calling netif_rx for tunneled packets.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Update defconfig.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Fix length parameter verification in sys_getdomainname().

Found by scrashme.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SERIAL] sunzilog: Fix instance enumeration.

Just do a linear enumeration so that we handle sun4d systems
correctly. As a consequence, eliminate the hard coded keyboard and
mouse channel line values, use the CONS_{KEYB,MS} flags instead.

Also, report the keyboard/mouse Zilog channels just like the uart ones
do.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SERIAL] sunzilog: Remove duplicate IRQ registry in zs_probe().

We do it now in sunzilog_init() after all devices have been
probed.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Get sun4d SMP building again.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Do not call sun4m_irq_rotate on sun4d.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Simplify and correct __cpu_find_by()

By using for_each_node_by_type().

Also, correct a spurioud test in check_cpu_node() on sparc64.
It is only called with nodes that have device_type "cpu".

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Initialize iounit spinlock in iounit_init().

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Fix initialization of sun4d SBUS interrupts.

1) Explicitly traverse to the root looking for the "sbi".
2) Grab the "board#" property from the sbi's parent and
verify that this parent is an "io-unit" node.
3) Skip IRQ initialization when device lacks "reg" property.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SERIAL] sunzilog: Register IRQ after all devices have been probed.

Otherwise we will deref half-initialized channel pointers
and crash in the interrupt handler.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC] sbus: Make sure sbus nodes are named uniquely.

Just name them "sbus%d" otherwise on sun4d we try to register
multiple entries named "sbi@0,0" which does not work.

Based upon a report from Raymond Burns.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Fix property name acquisition in prom.c

On sparc32 the prom_{first,next}prop() interfaces work
a little differently. The buffer argument is ignored on
sparc32 and the firmware just returns a raw pointer to
the property name.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SERIAL] sunsab: Get line numbers and table sizing correct.

Table sizing code should look for "se" not "su" nodes.

The chip at the lower address should get the first index.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64] Fix sunsab ports ordering

Register second SAB port before the first one, as serial A is wired to
it, and expected to appear as ttyS0.

Signed-off-by: Marc Zyngier <maz@misterjones.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC]: Kill prom_getname, unused and not implemented properly.

The m68k port's sun3 asm/oplib.h had a stray reference too, so I
killed that off as well.

Signed-off-by: David S. Miller <davem@davemloft.net>

[SPARC64]: Fix more of_device layer IRQ bugs, and correct PROMREG_MAX.

Sabre and Psycho PCI controllers can have partial interrupt-map
properties, meaning that on-board devices don't match up to any
entries. Instead, they are fully specified from the beginning and
we should pass them directly to the IRQ translator as-is.

Also, fill in the necessary translator slots for the "graphics"
and "expansion UPA" interrupts on Sabre, Psycho, and SYSIO SBUS.

Increase PROMREG_MAX to 24, as seen on SUNW,ffb devices.

Finally, prevent accidentally writing past the end of the of_device
struct resource[] and irqs[] arrays. Spit out a log message when
we ignore some entries because there are too many of them.

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (38 commits)
  [SCSI] More buffer->request_buffer changes
  [SCSI] mptfusion: bump version to 3.04.01
  [SCSI] mptfusion: misc fix's
  [SCSI] mptfusion: firmware download boot fix's
  [SCSI] mptfusion: task abort fix's
  [SCSI] mptfusion: sas nexus loss support
  [SCSI] mptfusion: sas loginfo update
  [SCSI] mptfusion: mptctl panic when loading
  [SCSI] mptfusion: sas enclosures with smart drive
  [SCSI] NCR_D700: misc fixes (section and argument ordering)
  [SCSI] scsi_debug: must_check fixes
  [SCSI] scsi_transport_sas: kill the use of channel
  [SCSI] scsi_transport_sas: add expander backlink
  [SCSI] hide EH backup data outside the scsi_cmnd
  [SCSI] ibmvscsi: handle inactive SCSI target during probe
  [SCSI] ibmvscsi: allocate lpevents for ibmvscsi on iseries
  [SCSI] aic7[9x]xx: Remove last vestiges of reverse_scan
  [SCSI] aha152x: stop poking at saved scsi_cmnd members
  [SCSI] st.c: Improve sense output
  [SCSI] lpfc 8.1.7: Change version number to 8.1.7
  ...

Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6

* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] sysfs_create_xxx return values.
  [S390] .align 4096 statements in head.S
  [S390] get_clock inline assembly.
  [S390] channel measurement interval display.
  [S390] xpram module parameter parsing - take 2.
  [S390] Fix gcc warning about unused return values.

Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6

* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
  [PATCH] spidernet: rework tx queue handling
  [PATCH] spidernet: bug fix for init code
  [PATCH] sky2: NAPI poll fix
  [NET] ethtool: fix oops by testing correct struct member
  e1000: bump version to 7.1.9-k4
  e1000: fix panic on large frame receive when mtu=default
  e1000: remove CRC bytes from measured packet length
  e1000: Redo netpoll fix to address community concerns

[S390] sysfs_create_xxx return values.

Take return values of sysfs_create_group & friends into account.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

[S390] .align 4096 statements in head.S

SLES9 binutils don't like .align 4096 statements in head.S. Work around this
by using .org statements.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

[PATCH] spidernet: rework tx queue handling

With this patch TX queue descriptors are not chained per default any more.
The pointer to next descriptor is set only when next descriptor is prepaired
for transfer. Also the mechanism of checking wether Spider is ready has been
changed: it checks not for CARDOWNED flag in status of previous descriptor
but for a TXDMAENABLED flag in Spider's register.

Signed-off-by: Maxim Shchetynin <maxim@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jens Osterkamp <Jens.Osterkamp@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

[PATCH] spidernet: bug fix for init code

We want to intitialize addr instead of data register first.

Signed-off-by: Jens Osterkamp <Jens.Osterkamp@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

[PATCH] sky2: NAPI poll fix

When sky2 driver gets lots of received packets at once, it can get stuck.
The NAPI poll routine gets called back to keep going, but since no IRQ bits
are set it doesn't make progress.

Increase version, since this is serious enough problem that I want to be
able to tell new from old problems.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

Merge branch 'upstream-fixes-jgarzik' of git://lost.foo-projects.org/~ahkok/git/netdev-2.6 into upstream-fixes

[NET] ethtool: fix oops by testing correct struct member

Noticed by Willy Tarreau.

Signed-off-by: Jeff Garzik <jeff@garzik.org>

[S390] get_clock inline assembly.

Add missing volatile to the get_clock / get_cycles inline assemblies
to avoid that consecutive calls get optimized away.

Signed-off-by: Andreas Krebbel <krebbel1@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

[S390] channel measurement interval display.

Display avg_sample_interval in nanoseconds, like it is documented.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

[S390] xpram module parameter parsing - take 2.

Don't use memparse since the default size modifier is 'k'.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

[S390] Fix gcc warning about unused return values.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

[libata] ata_piix: correct 'invalid MAP value' typo-caused error

Signed-off-by: Jeff Garzik <jeff@garzik.org>

[libata] ata_piix: minor cleanups noticed in prior patch run

* delete unused PIIX_FLAG_COMBINED*
* port_enable should be u16 rather than u32

Signed-off-by: Jeff Garzik <jeff@garzik.org>

[libata] ata_piix: attempt to fix ICH8 support

Take into account the fact that ICH8 changed the register layout of
the MAP and PCS register bits.

Signed-off-by: Jeff Garzik <jeff@garzik.org>

[libata] ata_piix: Consolidate PCS register writing

Prior to this patch, the driver would do this for each port:
read 8-bit PCS
write 8-bit PCS
read 8-bit PCS
write 8-bit PCS

In the field, flaky behavior has been observed related to this register.
In particular, these overzealous register writes can cause misdetection
problems.

Update to do the following once (not once per port) at boot:
read 16-bit PCS
if needs changing,
write 16-bit PCS

And thereafter, we only perform a 'read 16-bit PCS' per port.

This should eliminate all PCS writes in many cases, and be more friendly
in the cases where we do need to enable ports.

Signed-off-by: Jeff Garzik <jeff@garzik.org>

[PATCH] ata_piix: add host_set private structure

Add host_set private structure piix_host_priv. Currently the only
field is ->map which used to be stored directly at
host_set->private_data. This change allows more host_set private
fields to be added.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

Linux 2.6.18-rc2

Finishing up for the kernel summit. Ottawa, here I come.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6:
JFS: commit_mutex cleanups

[PATCH] UML - fix utsname build breakage

Some -mm-only material leaked into a patch destined for mainline, and I didn't
notice.

This was the replacement of system_utsname with utsname() that's required by
the uts namespace patch. This patch reverts those changes (which are correct
in -mm) so that mainline UML builds again.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Don't allow chmod() on the /proc/<pid>/ files

This just turns off chmod() on the /proc/<pid>/ files, since there is no
good reason to allow it, and had we disallowed it originally, the nasty
/proc race exploit wouldn't have been possible.

The other patches already fixed the problem chmod() could cause, so this
is really just some final mop-up..

This particular version is based off a patch by Eugene and Marcel which
had much better naming than my original equivalent one.

Signed-off-by: Eugene Teo <eteo@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Mark /proc MS_NOSUID and MS_NOEXEC

Not that we really need this any more, but at the same time there's no
reason not to do this.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] sch_htb compile fix.

net/sched/sch_htb.c: In function 'htb_change_class':
net/sched/sch_htb.c:1605: error: expected ';' before 'do_gettimeofday'

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Merge master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6:
[CRYPTO] padlock: Fix alignment after aes_ctx rearrange

Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC64] Fix PSYCHO PCI controler init.
  [SPARC64] psycho: Fix pbm->name handling in pbm_register_toplevel_resources()
  [SERIAL] sunsab: Fix significant typo in sab_probe()
  [SERIAL] sunsu: Report keyboard and mouse ports in kernel log.
  [SPARC64]: Make sure IRQs are disabled properly during early boot.

Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [VLAN]: __vlan_hwaccel_rx can use the faster ether_compare_addr
  [PKT_SCHED] HTB: initialize upper bound properly
  [IPV4]: Clear skb cb on IP input
  [NET]: Update frag_list in pskb_trim

[PATCH] remove set_wmb - arch removal

set_wmb should not be used in the kernel because it just confuses the
code more and has no benefit. Since it is not currently used in the
kernel this patch removes it so that new code does not include it.

All archs define set_wmb(var, value) to do { var = value; wmb(); }
while(0) except ia64 and sparc which use a mb() instead. But this is
still moot since it is not used anyway.

Hasn't been tested on any archs but x86 and x86_64 (and only compiled
tested)

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] remove set_wmb - doc update

This patch removes the reference to set_wmb from memory-barriers.txt
since it shouldn't be used.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] Remove down_write() from taskstats code invoked on the exit() path

In send_cpu_listeners(), which is called on the exit path, a down_write()
was protecting operations like skb_clone() and genlmsg_unicast() that do
GFP_KERNEL allocations.  If the oom-killer decides to kill tasks to satisfy
the allocations,the exit of those tasks could block on the same semphore.

The down_write() was only needed to allow removal of invalid listeners from
the listener list.  The patch converts the down_write to a down_read and
defers the removal to a separate critical region.  This ensures that even
if the oom-killer is called, no other task's exit is blocked as it can
still acquire another down_read.

Thanks to Andrew Morton & Herbert Xu for pointing out the oom related
pitfalls, and to Chandra Seetharaman for suggesting this fix instead of
using something more complex like RCU.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] per-task delay accounting taskstats interface: control exit data through cpumasks

On systems with a large number of cpus, with even a modest rate of tasks
exiting per cpu, the volume of taskstats data sent on thread exit can
overflow a userspace listener's buffers.

One approach to avoiding overflow is to allow listeners to get data for a
limited and specific set of cpus. By scaling the number of listeners
and/or the cpus they monitor, userspace can handle the statistical data
overload more gracefully.

In this patch, each listener registers to listen to a specific set of cpus
by specifying a cpumask. The interest is recorded per-cpu. When a task
exits on a cpu, its taskstats data is unicast to each listener interested
in that cpu.

Thanks to Andrew Morton for pointing out the various scalability and
general concerns of previous attempts and for suggesting this design.

[akpm@osdl.org: build fix]
Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] per-task delay accounting: avoid send without listeners

Don't send taskstats (per-pid or per-tgid) on thread exit when no one is
listening for such data.

Currently the taskstats interface allocates a structure, fills it in and
calls netlink to send out per-pid and per-tgid stats regardless of whether
a userspace listener for the data exists (netlink layer would check for
that and avoid the multicast).

As a result of this patch, the check for the no-listener case is performed
early, avoiding the redundant allocation and filling up of the taskstats
structures.

Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] per task delay accounting taskstats interface: documentation fix

Change documentation and example program to reflect the flow control issues
being addressed by the cpumask changes.

Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

[PATCH] delay accounting taskstats interface send tgid once

Send per-tgid data only once during exit of a thread group instead of once
with each member thread exit.

Currently, when a thread exits, besides its per-tid data, the per-tgid data
of its thread group is also sent out, if its thread group is non-empty.
The per-tgid data sent consists of the sum of per-tid stats for all
*remaining* threads of the thread group.

This patch modifies this sending in two ways:

- the per-tgid data is sent only when the last thread of a thread group
  exits.  This cuts down heavily on the overhead of sending/receiving
  per-tgid data, especially when other exploiters of the taskstats
  interface aren't interested in per-tgid stats

- the semantics of the per-tgid data sent are changed.  Instead of being
  the sum of per-tid data for remaining threads, the value now sent is the
  true total accumalated statistics for all threads that are/were part of
  the thread group.

The patch also addresses a minor issue where failure of one accounting
subsystem to fill in the taskstats structure was causing the send of
taskstats to not be sent at all.

The patch has been tested for stability and run cerberus for over 4 hours
on an SMP.

[akpm@osdl.org: bugfixes]
Signed-off-by: Shailabh Nagar <nagar@watson.ibm.com>
Signed-off-by: Balbir Singh <balbir@in.ibm.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>