David Howells [Wed, 9 Jul 2008 22:06:45 +0000 (15:06 -0700)]
netfilter: nf_nat_snmp_basic: fix a range check in NAT for SNMP
Fix a range check in netfilter IP NAT for SNMP to always use a big enough size
variable that the compiler won't moan about comparing it to ULONG_MAX/8 on a
64-bit platform.
Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Wed, 9 Jul 2008 22:06:12 +0000 (15:06 -0700)]
netfilter: nf_conntrack_tcp: fix endless loop
When a conntrack entry is destroyed in process context and destruction
is interrupted by packet processing and the packet is an attempt to
reopen a closed connection, TCP conntrack tries to kill the old entry
itself and returns NF_REPEAT to pass the packet through the hook
again. This may lead to an endless loop: TCP conntrack repeatedly
finds the old entry, but can not kill it itself since destruction
is already in progress, but destruction in process context can not
complete since TCP conntrack is keeping the CPU busy.
Drop the packet in TCP conntrack if we can't kill the connection
ourselves to avoid this.
Reported by: hemao77@gmail.com [ Kernel bugzilla #11058 ] Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
After this we have an ifp->rt inserted into fib6 lists, but
queued for gc, which in turn can result in oopses in the
fib6_run_gc. Maybe some other nasty things, but we caught
only the oops in gc so far.
The solution is to disarm the ifp->timer before flushing the
rt from it.
Signed-off-by: Andrey Vagin <avagin@parallels.com> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Julius Volz [Tue, 8 Jul 2008 10:07:43 +0000 (03:07 -0700)]
irda: Fix netlink error path return value
Fix an incorrect return value check of genlmsg_put() in irda_nl_get_mode().
genlmsg_put() does not use ERR_PTR() to encode return values, it just
returns NULL on error.
Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Ville Syrjala [Tue, 8 Jul 2008 10:07:16 +0000 (03:07 -0700)]
irda: New device ID for nsc-ircc
HP OmniBook 500's DSDT code changes the HID of the FIR device from
NSC6001 to HWPC224 when run under an "NT" operating system. Add the
new ID to the pnp device id table.
Signed-off-by: Ville Syrjala <syrjala@sci.fi> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Chen [Tue, 8 Jul 2008 10:06:46 +0000 (03:06 -0700)]
irda: via-ircc proper dma freeing
1. dma should be freed when dma2 request fail.
2. dma2 should be freed too when device close.
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
sctp: Mark the tsn as received after all allocations finish
If we don't have the buffer space or memory allocations fail,
the data chunk is dropped, but TSN is still reported as received.
This introduced a data loss that can't be recovered. We should
only mark TSNs are received after memory allocations finish.
The one exception is the invalid stream identifier, but that's
due to user error and is reported back to the user.
This was noticed by Michael Tuexen.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Firat Birlik [Fri, 4 Jul 2008 03:31:50 +0000 (04:31 +0100)]
zd1211rw: add ID for AirTies WUS-201
I would like to inform you of our zd1211 based usb wifi adapter (AirTies
WUS-201), which works with the zd1211rw driver with the following device
id definition.
Signed-off-by: Daniel Drake <dsd@gentoo.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo van Doorn [Thu, 3 Jul 2008 17:02:44 +0000 (19:02 +0200)]
mac80211: Only flush workqueue when last interface was removed
Currently the ieee80211_hw->workqueue is flushed each time
an interface is being removed. However most scheduled work
is not interface specific but device specific, for example things like
periodic work for link tuners.
This patch will move the flush_workqueue() call to directly behind
the call to ops->stop() to make sure the workqueue is only flushed
when all interfaces are gone and there really shouldn't be any scheduled
work in the drivers left.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Guy Cohen [Thu, 3 Jul 2008 16:56:13 +0000 (19:56 +0300)]
mac80211: move netif_carrier_on to after ieee80211_bss_info_change_notify
Putting netif_carrier_on before configuring the driver/device with the
new association state may cause a race (tx frames may be sent before
configuration is done)
Signed-off-by: Guy Cohen <guy.cohen@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Oliver Hartkopp [Sun, 6 Jul 2008 06:38:43 +0000 (23:38 -0700)]
can: add sanity checks
Even though the CAN netlayer only deals with CAN netdevices, the
netlayer interface to the userspace and to the device layer should
perform some sanity checks.
This patch adds several sanity checks that mainly prevent userspace apps
to send broken content into the system that may be misinterpreted by
some other userspace application.
Signed-off-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de> Signed-off-by: Urs Thuermann <urs.thuermann@volkswagen.de> Acked-by: Andre Naujoks <nautsch@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Laurent Pinchart [Thu, 26 Jun 2008 09:48:22 +0000 (11:48 +0200)]
fs_enet: restore promiscuous and multicast settings in restart()
The restart() function is called when the link state changes and resets
multicast and promiscuous settings. This patch restores those settings at the
end of restart().
Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Short packets has to be discarded by the driver. So this patch addresses the
issue of discarding the short packets of size lesser then ethernet header
size.
Signed-off-by: Sathya Narayanan <sathyan@teamf1.com> Signed-off-by: Stefan Roese <sr@denx.de> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
ibm_newemac: Fixes kernel crashes when speed of cable connected changes
The descriptor pointers were not initialized to NIL values, so it was
poiniting to some random addresses which was completely invalid. This
fix takes care of initializing the descriptor to NIL values and clearing
the valid descriptors on clean ring operation.
Signed-off-by: Sathya Narayanan <sathyan@teamf1.com> Signed-off-by: Stefan Roese <sr@denx.de> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Roland Dreier [Tue, 1 Jul 2008 17:22:45 +0000 (10:22 -0700)]
pasemi_mac: Access iph->tot_len with correct endianness
iph->tot_len is stored in network byte order, so access it using
ntohs(). This doesn't have any real world impact on pasemi_mac, since
the device only exists as part of a big-endian system-on-chip, but
fixing this gets rid of a sparse warning and avoids having a bad example
in the tree.
Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Roland Dreier [Tue, 1 Jul 2008 17:20:33 +0000 (10:20 -0700)]
ehea: Access iph->tot_len with correct endianness
iph->tot_len is stored in network byte order, so access it using
ntohs(). This doesn't have any real world impact on ehea, since ehea
only exists for big-endian platfroms (at the moment at least) but fixing
this gets rid of a sparse warning and avoids having a bad example in the
tree.
Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
When ehea_stop is called the function
cancel_work_sync(&port->reset_task) is used to ensure
that the reset task is not running anymore. We need an
additional flag to ensure that it can not be scheduled
after this call again for a certain time.
Signed-off-by: Jan-Bernd Themann <themann@de.ibm.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
A mutex has to be replaced by spinlocks as it can be called from
a context which does not allow sleeping.
The kzalloc flag GFP_KERNEL has to be replaced by GFP_ATOMIC
for the same reason.
Signed-off-by: Jan-Bernd Themann <themann@de.ibm.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Because of nv_disable_irq this is probably not really a problem
though (I guess) and replacing the spin_lock with spin_lock_irqsave
could keep interrupts disabled for a longer period of time because
of delays in nv_stop_rx and nv_stop_tx.
Signed-off-by: Tobias Diedrich <ranma+kernel@tdiedrich.de> Cc: Ayaz Abdulla <aabdulla@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Krzysztof Halasa [Sun, 29 Jun 2008 19:48:11 +0000 (21:48 +0200)]
Add missing skb->dev assignment in Frame Relay RX code
Commit 4c13eb6657fe9ef7b4dc8f1a405c902e9e5234e0 ([ETH]: Make
eth_type_trans set skb->dev like the other *_type_trans) removed
skb->dev assignment from hdlc_fr.c:fr_rx(). Unfortunately it was also
needed for cases other than eth_type_trans().
Adding it back.
It's quite serious and may be a security risk as it causes a wrong
input interface indication (the physical hdlcX instead of logical
pvcX). Probably -stable class fix.
Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Patrick McHardy [Thu, 3 Jul 2008 10:53:42 +0000 (03:53 -0700)]
bridge: fix use-after-free in br_cleanup_bridges()
Unregistering a bridge device may cause virtual devices stacked on the
bridge, like vlan or macvlan devices, to be unregistered as well.
br_cleanup_bridges() uses for_each_netdev_safe() to iterate over all
devices during cleanup. This is not enough however, if one of the
additionally unregistered devices is next in the list to the bridge
device, it will get freed as well and the iteration continues on
the freed element.
Restart iteration after each bridge device removal from the beginning to
fix this, similar to what rtnl_link_unregister() does.
Signed-off-by: Patrick McHardy <kaber@trash.net> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Thu, 3 Jul 2008 10:22:02 +0000 (03:22 -0700)]
tcp: net/ipv4/tcp.c needs linux/scatterlist.h
alpha:
net/ipv4/tcp.c: In function 'tcp_calc_md5_hash':
net/ipv4/tcp.c:2479: error: implicit declaration of function 'sg_init_table' net/ipv4/tcp.c:2482: error: implicit declaration of function 'sg_set_buf'
net/ipv4/tcp.c:2507: error: implicit declaration of function 'sg_mark_end'
Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge branch 'i2c-fix' of git://aeryn.fluff.org.uk/bjdooks/linux
* 'i2c-fix' of git://aeryn.fluff.org.uk/bjdooks/linux:
I2C: S3C2410: Add MODULE_ALIAS() for s3c2440 device.
I2C: S3C2410: Fixup error codes returned rom a transfer.
I2C: S3C2410: Check ACK on byte transmission
Merge branch 'for-2.6.26' of git://git.kernel.dk/linux-2.6-block
* 'for-2.6.26' of git://git.kernel.dk/linux-2.6-block:
Properly notify block layer of sync writes
block: Fix the starving writes bug in the anticipatory IO scheduler
* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
V4L/DVB (8178): uvc: Fix compilation breakage for the other drivers, if uvc is selected
V4L/DVB (8145a): USB Video Class driver
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 5131/1: Annotate platform_secondary_init with trace_hardirqs_off
[ARM] 5117/1: pxafb: fix __devinit/exit annotations
[ARM] Export dma_sync_sg_for_device()
[ARM] 5109/1: Mark rtc sa1100 driver as wakeup source before registering it
[ARM] 5116/1: pxafb: cleanup and fix order of failure handling
[ARM] 5115/1: pxafb: fix ifdef for command line option handling
ARM: OMAP: Correcting the gpmc prefetch control register address
ARM: OMAP: DMA: Don't mark channel active in omap_enable_channel_irq
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: fix divide error when trying to configure rt_period to zero
Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c: Fix bad hint about irqs in i2c.h
i2c: Documentation: fix device matching description
Merge branch 'for-2.6.26' of git://neil.brown.name/md
* 'for-2.6.26' of git://neil.brown.name/md:
Fix error paths if md_probe fails.
Don't acknowlege that stripe-expand is complete until it really is.
Ensure interrupted recovery completed properly (v1 metadata plus bitmap)
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc/mpc5200: Fix lite5200b suspend/resume
powerpc/legacy_serial: Bail if reg-offset/shift properties are present
powerpc/bootwrapper: update for initrd with simpleImage
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (55 commits)
net: fib_rules: fix error code for unsupported families
netdevice: Fix wrong string handle in kernel command line parsing
net: Tyop of sk_filter() comment
netlink: Unneeded local variable
net-sched: fix filter destruction in atm/hfsc qdisc destruction
net-sched: change tcf_destroy_chain() to clear start of filter list
ipv4: fix sysctl documentation of time related values
mac80211: don't accept WEP keys other than WEP40 and WEP104
hostap: fix sparse warnings
hostap: don't report useless WDS frames by default
textsearch: fix Boyer-Moore text search bug
netfilter: nf_conntrack_tcp: fixing to check the lower bound of valid ACK
ipv6 route: Convert rt6_device_match() to use RT6_LOOKUP_F_xxx flags.
netlabel: Fix a problem when dumping the default IPv6 static labels
net/inet_lro: remove setting skb->ip_summed when not LRO-able
inet fragments: fix race between inet_frag_find and inet_frag_secret_rebuild
CONNECTOR: add a proc entry to list connectors
netlink: Fix some doc comments in net/netlink/attr.c
tcp: /proc/net/tcp rto,ato values not scaled properly (v2)
include/linux/netdevice.h: don't export MAX_HEADER to userspace
...
When scheduled swaps occur, we need to blit between front & back
buffers. If the buffers are tiled, we need to set the appropriate
XY_SRC_COPY tile bit, but only on 965 chips, since it will cause
corruption on pre-965 (e.g. 945).
Bug reported by and fix tested by Tomas Janousek <tomi@nomi.cz>.
| linux/drivers/input/ff-core.c: In function 'input_ff_upload':
| linux/drivers/input/ff-core.c:172: error: dereferencing pointer to incomplete type
| linux/drivers/input/ff-core.c: In function 'erase_effect':
| linux/drivers/input/ff-core.c:197: error: dereferencing pointer to incomplete type
| linux/drivers/input/ff-core.c:204: error: dereferencing pointer to incomplete type
| make[4]: *** [drivers/input/ff-core.o] Error 1
As the incomplete type is `struct task_struct', including <linux/sched.h> fixes
it.
libertas: support USB persistence on suspend/resume (resend)
Handle .reset_resume() so that libertas can survive suspend/resume without
reloading the firmware.
Signed-off-by: Andrey Yurovsky <andrey@cozybit.com> Acked-by: Deepak Saxena <dsaxena@laptop.org> Acked-by: Dan Williams <dcbw@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Alex Chiang [Wed, 2 Jul 2008 02:02:23 +0000 (20:02 -0600)]
PCI: acpiphp: cleanup notify handler on all root bridges
During the development of the physical PCI slot patch series, Gary Hade
kept on reporting strange oopses due to interactions between pci_slot
and acpiphp.
http://lkml.org/lkml/2007/11/28/319
find_root_bridges() unconditionally installs
handle_hotplug_event_bridge() as an ACPI_SYSTEM_NOTIFY handler for all
root bridges.
However, during module cleanup, remove_bridge() will only remove the
notify handler iff the root bridge had a hot-pluggable slot directly
underneath. That is:
root bridge -> hotplug slot
But, if the topology looks like either of the following:
Then we currently do not remove the notify handler from that root
bridge.
This can cause a kernel oops if we modprobe acpiphp later and it gets
loaded somewhere else in memory. If the root bridge then receives a
hotplug event, it will then attempt to call a stale, non-existent notify
handler and we blow up.
Much thanks goes to Gary Hade for his persistent debugging efforts.
Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Gary Hade <garyhade@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
For Broadcom 5706, 5708, 5709 rev. A nics, any read beyond the
VPD end tag will hang the device. This problem was initially
observed when a vpd entry was created in sysfs
('/sys/bus/pci/devices/<id>/vpd'). A read to this sysfs entry
will dump 32k of data. Reading a full 32k will cause an access
beyond the VPD end tag causing the device to hang. Once the device
is hung, the bnx2 driver will not be able to reset the device.
We believe that it is legal to read beyond the end tag and
therefore the solution is to limit the read/write length.
A majority of this patch is from Matthew Wilcox who gave code for
reworking the PCI vpd size information. A PCI quirk added for the
Broadcom NIC's to limit the read/write's.
Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Milan Broz [Wed, 2 Jul 2008 08:34:28 +0000 (09:34 +0100)]
dm crypt: use cond_resched
Add cond_resched() to prevent monopolising CPU when processing large bios.
dm-crypt processes encryption of bios in sector units. If the bio request
is big it can spend a long time in the encryption call.
Signed-off-by: Milan Broz <mbroz@redhat.com> Tested-by: Yan Li <elliot.li.tech@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Wang Chen [Wed, 2 Jul 2008 02:57:19 +0000 (19:57 -0700)]
netdevice: Fix wrong string handle in kernel command line parsing
v1->v2: Use strlcpy() to ensure s[i].name be null-termination.
1. In netdev_boot_setup_add(), a long name will leak.
ex. : dev=21,0x1234,0x1234,0x2345,eth123456789verylongname.........
2. In netdev_boot_setup_check(), mismatch will happen if s[i].name
is a substring of dev->name.
ex. : dev=...eth1 dev=...eth11
[ With feedback from Ben Hutchings. ]
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ipv4: fix sysctl documentation of time related values
These sysctl values are time related and all use the same routine
(proc_dointvec_jiffies) that internally converts from seconds to jiffies.
The code is fine, the documentation is just wrong.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tim Yamin [Tue, 17 Jun 2008 08:33:14 +0000 (09:33 +0100)]
powerpc/mpc5200: Fix lite5200b suspend/resume
Suspend/resume ("echo mem > /sys/power/state") does not work with
vanilla kernels -- the system does not suspend correctly and just
hangs. This patch fixes this so suspend/resume works:
1) of_iomap does not map the whole 0xC000 of the MPC5200 immr so
saving registers does not work.
2) PCI registers need to be saved and restored.
Signed-off-by: Tim Yamin <plasm@roo.me.uk> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
John Linn [Tue, 1 Jul 2008 17:52:41 +0000 (10:52 -0700)]
powerpc/legacy_serial: Bail if reg-offset/shift properties are present
The legacy serial driver does not work with an 8250 type UART that is
described in the device tree with the reg-offset and reg-shift
properties. This change makes legacy_serial ignore these devices.
Signed-off-by: John Linn <john.linn@xilinx.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Wolfram Sang [Tue, 1 Jul 2008 20:38:18 +0000 (22:38 +0200)]
i2c: Fix bad hint about irqs in i2c.h
i2c.h mentions -1 as a not-issued irq. This false hint was taken by
of_i2c and caused crashes. Don't give any advice as 'no irq' is not
consistent across all architectures yet and it is not needed internally
by the i2c-core.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Jean Delvare <khali@linux-fr.org>
The matching process described for new style clients in
Documentation/i2c/writing-clients is classed as out-of-date
as it requires the presence of an .id_table entry in the
driver's i2c_driver entry.
Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Jean Delvare <khali@linux-fr.org>
Ben Hutchings [Tue, 1 Jul 2008 16:18:17 +0000 (17:18 +0100)]
PCI: Restrict VPD read permission to root
Some PCI devices will lock up if we attempt to read from VPD addresses
beyond some device-dependent limit. Until we can identify these
devices and adjust the file size accordingly, only let root read VPD
through sysfs to prevent a DoS by normal users.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Ben Dooks [Tue, 1 Jul 2008 10:59:41 +0000 (11:59 +0100)]
I2C: S3C2410: Check ACK on byte transmission
We should check for the reception of an ACK after transmitting each
data byte. The address send has been correctly checking this, but the
data write byte state should have also been checking for these failures.
As part of the same fix, we remove the ACK checking from the receive
path where it should not have been checking for an ACK which our hardware
was sending.
Gautham R Shenoy [Fri, 27 Jun 2008 04:47:38 +0000 (10:17 +0530)]
rcu: fix hotplug vs rcu race
Dhaval Giani reported this warning during cpu hotplug stress-tests:
| On running kernel compiles in parallel with cpu hotplug:
|
| WARNING: at arch/x86/kernel/smp.c:118
| native_smp_send_reschedule+0x21/0x36()
| Modules linked in:
| Pid: 27483, comm: cc1 Not tainted 2.6.26-rc7 #1
| [...]
| [<c0110355>] native_smp_send_reschedule+0x21/0x36
| [<c014fe8f>] force_quiescent_state+0x47/0x57
| [<c014fef0>] call_rcu+0x51/0x6d
| [<c01713b3>] __fput+0x130/0x158
| [<c0171231>] fput+0x17/0x19
| [<c016fd99>] filp_close+0x4d/0x57
| [<c016fdff>] sys_close+0x5c/0x97
IMHO the warning is a spurious one.
cpu_online_map is updated by the _cpu_down() using stop_machine_run().
Since force_quiescent_state is invoked from irqs disabled section,
stop_machine_run() won't be executing while a cpu is executing
force_quiescent_state(). Hence the cpu_online_map is stable while we're
in the irq disabled section.
However, a cpu might have been offlined _just_ before we disabled irqs
while entering force_quiescent_state(). And rcu subsystem might not yet
have handled the CPU_DEAD notification, leading to the offlined cpu's
bit being set in the rcp->cpumask.
Hence cpumask = (rcp->cpumask & cpu_online_map) to prevent sending
smp_reschedule() to an offlined CPU.
Here's the timeline:
CPU_A CPU_B
--------------------------------------------------------------
cpu_down(): .
. .
. .
stop_machine(): /* disables preemption, .
* and irqs */ .
. .
. .
take_cpu_down(); .
. .
. .
. .
cpu_disable(); /*this removes cpu .
*from cpu_online_map .
*/ .
. .
. .
restart_machine(); /* enables irqs */ .
------WINDOW DURING WHICH rcp->cpumask is stale ---------------
. call_rcu();
. /* disables irqs here */
. .force_quiescent_state();
.CPU_DEAD: .for_each_cpu(rcp->cpumask)
. . smp_send_reschedule();
. .
. . WARN_ON() for offlined CPU!
.
.
.
rcu_cpu_notify:
.
-------- WINDOW ENDS ------------------------------------------
rcu_offline_cpu() /* Which calls cpu_quiet()
* which removes
* cpu from rcp->cpumask.
*/
If a new batch was started just before calling stop_machine_run(), the
"tobe-offlined" cpu is still present in rcp-cpumask.
During a cpu-offline, from take_cpu_down(), we queue an rt-prio idle
task as the next task to be picked by the scheduler. We also call
cpu_disable() which will disable any further interrupts and remove the
cpu's bit from the cpu_online_map.
Once the stop_machine_run() successfully calls take_cpu_down(), it calls
schedule(). That's the last time a schedule is called on the offlined
cpu, and hence the last time when rdp->passed_quiesc will be set to 1
through rcu_qsctr_inc().
But the cpu_quiet() will be on this cpu will be called only when the
next RCU_SOFTIRQ occurs on this CPU. So at this time, the offlined CPU
is still set in rcp->cpumask.
Now coming back to the idle_task which truely offlines the CPU, it does
check for a pending RCU and raises the softirq, since it will find
rdp->passed_quiesc to be 0 in this case. However, since the cpu is
offline I am not sure if the softirq will trigger on the CPU.
Even if it doesn't the rcu_offline_cpu() will find that rcp->completed
is not the same as rcp->cur, which means that our cpu could be holding
up the grace period progression. Hence we call cpu_quiet() and move
ahead.
But because of the window explained in the timeline, we could still have
a call_rcu() before the RCU subsystem executes it's CPU_DEAD
notification, and we send smp_send_reschedule() to offlined cpu while
trying to force the quiescent states. The appended patch adds comments
and prevents checking for offlined cpu everytime.
cpu_online_map is updated by the _cpu_down() using stop_machine_run().
Since force_quiescent_state is invoked from irqs disabled section,
stop_machine_run() won't be executing while a cpu is executing
force_quiescent_state(). Hence the cpu_online_map is stable while we're
in the irq disabled section.
fsync_buffers_list() and sync_dirty_buffer() both issue async writes and
then immediately wait on them. Conceptually, that makes them sync writes
and we should treat them as such so that the IO schedulers can handle
them appropriately.
This patch fixes a write starvation issue that Lin Ming reported, where
xx is stuck for more than 2 minutes because of a large number of
synchronous IO in the system:
Lin Ming confirms that this patch fixes the issue. I've run tests with
it for the past week and no ill effects have been observed, so I'm
proposing it for inclusion into 2.6.26.
Divyesh Shah [Mon, 16 Jun 2008 16:37:08 +0000 (18:37 +0200)]
block: Fix the starving writes bug in the anticipatory IO scheduler
AS scheduler alternates between issuing read and write batches. It does
the batch switch only after all requests from the previous batch are
completed.
When switching to a write batch, if there is an on-going read request,
it waits for its completion and indicates its intention of switching by
setting ad->changed_batch and the new direction but does not update the
batch_expire_time for the new write batch which it does in the case of
no previous pending requests.
On completion of the read request, it sees that we were waiting for the
switch and schedules work for kblockd right away and resets the
ad->changed_data flag.
Now when kblockd enters dispatch_request where it is expected to pick
up a write request, it in turn ends the write batch because the
batch_expire_timer was not updated and shows the expire timestamp for
the previous batch.
This results in the write starvation for all the cases where there is
the intention for switching to a write batch, but there is a previous
in-flight read request and the batch gets reverted to a read_batch
right away.
This also holds true in the reverse case (switching from a write batch
to a read batch with an in-flight write request).
I've checked that this bug exists on 2.6.11, 2.6.18, 2.6.24 and
linux-2.6-block git HEAD. I've tested the fix on x86 platforms with
SCSI drives where the driver asks for the next request while a current
request is in-flight.
This patch is based off linux-2.6-block git HEAD.
Bug reproduction:
A simple scenario which reproduces this bug is:
- dd if=/dev/hda3 of=/dev/null &
- lilo
The lilo takes forever to complete.
This can also be reproduced fairly easily with the earlier dd and
another test
program doing msync().
The example test program below should print out a message after every
iteration
but it simply hangs forever. With this bugfix it makes forward progress.
====
Example test program using msync() (thanks to suleiman AT google DOT
com)
I'm checking as soon as possible for the period not being zero since, if
it is, going ahead is useless. This way we also save a mutex_lock() and
a read_lock() wrt doing it inside tg_set_bandwidth() or
__rt_schedulable().
Tony Luck [Mon, 30 Jun 2008 22:03:14 +0000 (15:03 -0700)]
[IA64] Bugfix for system with 32 cpus
On a system where there are no hot pluggable cpus "additional_cpus"
is still set to -1 at the point where we call per_cpu_scan_finalize().
If we didn't find an SRAT table and so pick the default "32" for the
number of cpus, when we get to:
high_cpu = min(high_cpu + reserve_cpus, NR_CPUS);
we will end up initializing for just 31 cpus ... and so we will
die horribly when bringing up cpu#32.
Laurent Pinchart [Mon, 30 Jun 2008 18:04:50 +0000 (15:04 -0300)]
V4L/DVB (8145a): USB Video Class driver
This driver supports video input devices compliant with the USB Video Class
specification. This means lots of currently manufactured webcams, and probably
most of the future ones.
mac80211: don't accept WEP keys other than WEP40 and WEP104
This patch makes mac80211 refuse a WEP key whose length is not WEP40 nor
WEP104.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Pavel Roskin [Fri, 27 Jun 2008 20:19:58 +0000 (16:19 -0400)]
hostap: fix sparse warnings
Rewrite AID calculation in handle_pspoll() to avoid truncating bits.
Make hostap_80211_header_parse() static, don't export it. Avoid
shadowing variables.
Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Joonwoo Park [Mon, 30 Jun 2008 19:42:23 +0000 (12:42 -0700)]
textsearch: fix Boyer-Moore text search bug
The current logic has a bug which cannot find matching pattern, if the
pattern is matched from the first character of target string.
for example:
pattern=abc, string=abcdefg
pattern=a, string=abcdefg
Searching algorithm should return 0 for those things.
Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Jozsef Kadlecsik [Mon, 30 Jun 2008 19:41:30 +0000 (12:41 -0700)]
netfilter: nf_conntrack_tcp: fixing to check the lower bound of valid ACK
Lost connections was reported by Thomas Bätzler (running 2.6.25 kernel) on
the netfilter mailing list (see the thread "Weird nat/conntrack Problem
with PASV FTP upload"). He provided tcpdump recordings which helped to
find a long lingering bug in conntrack.
In TCP connection tracking, checking the lower bound of valid ACK could
lead to mark valid packets as INVALID because:
- We have got a "higher or equal" inequality, but the test checked
the "higher" condition only; fixed.
- If the packet contains a SACK option, it could occur that the ACK
value was before the left edge of our (S)ACK "window": if a previous
packet from the other party intersected the right edge of the window
of the receiver, we could move forward the window parameters beyond
accepting a valid ack. Therefore in this patch we check the rightmost
SACK edge instead of the ACK value in the lower bound of valid (S)ACK
test.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Catalin Marinas [Fri, 27 Jun 2008 14:15:12 +0000 (15:15 +0100)]
[ARM] 5131/1: Annotate platform_secondary_init with trace_hardirqs_off
This patch annotates the platform_secondary_init function in
arch/arm/mach-realview/platsmp.c with trace_hardirqs_off to avoid a
warning when LOCKDEP and TRACE_IRQFLAGS are enabled.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Mon, 30 Jun 2008 15:56:57 +0000 (08:56 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
ptrace GET/SET FPXREGS broken
x86: fix cpu hotplug crash
x86: section/warning fixes
x86: shift bits the right way in native_read_tscp
Dmitry Torokhov [Thu, 26 Jun 2008 15:30:02 +0000 (11:30 -0400)]
Input: fix locking in force-feedback core
The newly added event_lock spinlock in the input core disallows sleeping
and therefore using mutexes in event handlers. Convert force-feedback
core to rely on event_lock instead of mutex to protect slots allocated
for fore-feedback effects. The original mutex is still used to serialize
uploading and erasing of effects.
struct user_fxsr_struct {
unsigned short cwd;
unsigned short swd;
unsigned short twd;
unsigned short fop;
long fip;
long fcs;
long foo;
long fos;
long mxcsr;
long reserved;
long st_space[32]; /* 8*16 bytes for each FP-reg = 128 bytes */
long xmm_space[32]; /* 8*16 bytes for each XMM-reg = 128 bytes */
long padding[56];
};
int main(void)
{
pid_t pid;
pid = fork();
switch(pid){
case -1:/* error */
break;
case 0:/* child */
child();
break;
default:
parent(pid);
break;
}
return 0;
}
int child(void)
{
ptrace(PTRACE_TRACEME);
kill(getpid(), SIGSTOP);
sleep(10);
return 0;
}
int parent(pid_t pid)
{
int ret;
struct user_fxsr_struct fpxregs;
In function _cpu_up, the panic happens when calling
__raw_notifier_call_chain at the second time. Kernel doesn't panic when
calling it at the first time. If just say because of nr_cpu_ids, that's
not right.
By checking the source code, I found that function do_boot_cpu is the culprit.
Consider below call chain:
_cpu_up=>__cpu_up=>smp_ops.cpu_up=>native_cpu_up=>do_boot_cpu.
So do_boot_cpu is called in the end. In do_boot_cpu, if
boot_error==true, cpu_clear(cpu, cpu_possible_map) is executed. So later
on, when _cpu_up calls __raw_notifier_call_chain at the second time to
report CPU_UP_CANCELED, because this cpu is already cleared from
cpu_possible_map, get_cpu_sysdev returns NULL.
Many resources are related to cpu_possible_map, so it's better not to
change it.
Below patch against 2.6.26-rc7 fixes it by removing the bit clearing in
cpu_possible_map.
* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (42 commits)
V4L/DVB (8108): Fix open/close race in saa7134
V4L/DVB (8100): V4L/vivi: fix possible memory leak in vivi_fillbuff
V4L/DVB (8097): xc5000: check device hardware state to determine if firmware download is needed
V4L/DVB (8096): au8522: prevent false-positive lock status
V4L/DVB (8092): videodev: simplify and fix standard enumeration
V4L/DVB (8075): stv0299: Uncorrected block count and bit error rate fixed
V4L/DVB (8074): av7110: OSD transfers should not be interrupted
V4L/DVB (8073): av7110: Catch another type of ARM crash
V4L/DVB (8071): tda10023: Fix possible kernel oops during initialisation
V4L/DVB (8069): cx18: Fix S-Video and Compsite inputs for the Yuan MPC718 and enable card entry
V4L/DVB (8068): cx18: Add I2C slave reset via GPIO upon initialization
V4L/DVB (8067): cx18: Fix firmware load for case when digital capture happens first
V4L/DVB (8066): cx18: Fix audio mux input definitions for HVR-1600 Line In 2 and FM radio
V4L/DVB (8063): cx18: Fix unintended auto configurations in cx18-av-core
V4L/DVB (8061): cx18: only select tuner / frontend modules if !DVB_FE_CUSTOMISE
V4L/DVB (8048): saa7134: Fix entries for Avermedia A16d and Avermedia E506
V4L/DVB (8044): au8522: tuning optimizations
V4L/DVB (8043): au0828: add support for additional USB device id's
V4L/DVB (8042): DVB-USB UMT-010 channel scan oops
V4L/DVB (8040): soc-camera: remove soc_camera_host_class class
...
Linus Torvalds [Sun, 29 Jun 2008 19:22:30 +0000 (12:22 -0700)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
dock: bay: Don't call acpi_walk_namespace() when ACPI is disabled.
ACPI: don't walk tables if ACPI was disabled
thermal: Create CONFIG_THERMAL_HWMON=n