We met a ksoftirqd 100% issue, the perf top shows kernel is busy with
tasklet_action(), but no actual action is shown. From dumped kernel,
there's only one disabled tasklet on the tasklet_vec.
tasklet_action might be handled after tasklet is disabled, this will make
disabled tasklet stayed on tasklet_vec. tasklet_action will not handle
disabled tasklet, but place it on the tail of tasklet_vec, still raise
softirq for this tasklet. Things will become worse if device driver uses
tasklet_disable on its device remove/close code. The disabled tasklet
will stay on the vec, frequently __raise_softirq_off() and make ksoftirqd
wakeup even if no tasklets need to be handled.
This patch introduced a new TASKLET_STATE_HI bit to indicate HI_SOFTIRQ,
in tasklet_action(), simply ignore the disabled tasklet and don't raise
the softirq nr. In my previous patch, I remove tasklet_hi_enable() since
it is the same as tasklet_enable(). So only tasklet_enable() needs to be
modified, if tasklet state is changed from disable to enable, use
__tasklet_schedule() to put it on the right vec.
Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
enum
{
TASKLET_STATE_SCHED, /* Tasklet is scheduled for execution */
- TASKLET_STATE_RUN /* Tasklet is running (SMP only) */
+ TASKLET_STATE_RUN, /* Tasklet is running (SMP only) */
+ TASKLET_STATE_HI /* Tasklet is HI_SOFTIRQ */
};
#ifdef CONFIG_SMP
static inline void tasklet_enable(struct tasklet_struct *t)
{
smp_mb__before_atomic_dec();
- atomic_dec(&t->count);
+ if (atomic_dec_and_test(&t->count)) {
+ if (!test_bit(TASKLET_STATE_SCHED, &t->state))
+ return;
+ if (test_bit(TASKLET_STATE_HI, &t->state))
+ __tasklet_hi_schedule(t);
+ else
+ __tasklet_schedule(t);
+ }
}
static inline void tasklet_hi_enable(struct tasklet_struct *t)
*__this_cpu_read(tasklet_hi_vec.tail) = t;
__this_cpu_write(tasklet_hi_vec.tail, &(t->next));
raise_softirq_irqoff(HI_SOFTIRQ);
+ set_bit(TASKLET_STATE_HI, &t->state);
local_irq_restore(flags);
}
t->next = __this_cpu_read(tasklet_hi_vec.head);
__this_cpu_write(tasklet_hi_vec.head, t);
+ set_bit(TASKLET_STATE_HI, &t->state);
__raise_softirq_irqoff(HI_SOFTIRQ);
}
if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
BUG();
t->func(t->data);
- tasklet_unlock(t);
- continue;
}
tasklet_unlock(t);
+ continue;
}
local_irq_disable();
if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
BUG();
t->func(t->data);
- tasklet_unlock(t);
- continue;
}
tasklet_unlock(t);
+ continue;
}
local_irq_disable();