libertas: clean up scan thread handling
The libertas scan thread expects priv->scan_req to be non-NULL. In theory,
it should always be set. In practice, we've seen the following oops:
[ 8363.067444] Unable to handle kernel NULL pointer dereference at virtual address
00000004
[ 8363.067490] pgd =
c0004000
[ 8363.078393] [
00000004] *pgd=
00000000
[ 8363.086711] Internal error: Oops: 17 [#1] PREEMPT
[ 8363.091375] Modules linked in: fuse libertas_sdio libertas psmouse mousedev ov7670 mmp_camera joydev videobuf2_core videobuf2_dma_sg videobuf2_memops [last unloaded: scsi_wait_scan]
[ 8363.107490] CPU: 0 Not tainted (
3.0.0-gf7ccc69 #671)
[ 8363.112799] PC is at lbs_scan_worker+0x108/0x5a4 [libertas]
[ 8363.118326] LR is at 0x0
[ 8363.120836] pc : [<
bf03a854>] lr : [<
00000000>] psr:
60000113
[ 8363.120845] sp :
ee66bf48 ip :
00000000 fp :
00000000
[ 8363.120845] r10:
ee2c2088 r9 :
c04e2efc r8 :
eef97005
[ 8363.132231] r7 :
eee0716f r6 :
ee2c02c0 r5 :
ee2c2088 r4 :
eee07160
[ 8363.137419] r3 :
00000000 r2 :
a0000113 r1 :
00000001 r0 :
eee07160
[ 8363.143896] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
[ 8363.157630] Control:
10c5387d Table:
2e754019 DAC:
00000015
[ 8363.163334] Process kworker/u:1 (pid: 25, stack limit = 0xee66a2f8)
While I've not found a smoking gun, there are two places that raised red flags
for me. The first is in _internal_start_scan, when we queue up a scan; we
first queue the worker, and then set priv->scan_req. There's theoretically
a 50mS delay which should be plenty, but doing things that way just seems
racy (and not in the good way).
The second is in the scan worker thread itself. Depending on the state of
priv->scan_channel, we cancel pending scan runs and then requeue a run in
300mS. We then send the scan command down to the hardware, sleep, and if
we get scan results for all the desired channels, we set priv->scan_req to
NULL. However, it that's happened in less than 300mS, what happens with
the pending scan run?
This patch addresses both of those concerns. With the patch applied, we
have not seen the oops in the past two weeks.
Signed-off-by: Andres Salomon <dilinger@queued.net>
Cc: stable@kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>