sched: Initialize cfs_rq->runtime_remaining to non-zero on cfs bw set

If cfs_rq->runtime_remaining is <= 0, then one of the following holds:

- cfs_rq is throttled and waiting for quota redistribution, or
- cfs_rq is currently executing and will be throttled on
  put_prev_entity, or
- cfs_rq is not throttled and has not executed since its quota was set
  (runtime_remaining is set to 0 on cfs bandwidth reconfiguration).

The last case is clearly an exception to the rule
"runtime_remaining <= 0 iff cfs_rq is throttled or will be
throttled as soon as it finishes its execution".

Moreover, it can lead to a task hang as follows. If
put_prev_task() is called immediately after the first
pick_next_task() after the quota was set ("immediately" meaning
that rq->clock is the same in both functions), then the
corresponding cfs_rq will be throttled.
This is unfair, since the cfs_rq has not in fact executed. Worse,
the quota refilling timer can be idle at that time, and it won't
be activated on put_prev_task() because update_curr() calls
account_cfs_rq_runtime(), which activates the timer, only if
delta_exec is strictly positive. As a result, we can get a task
"running" inside a throttled cfs_rq which will probably never be
unthrottled.

To avoid the problem, make tg_set_cfs_bandwidth() initialize
runtime_remaining of each cfs_rq to 1 instead of 0, so that a
cfs_rq is throttled only after it has executed for some positive
number of nanoseconds.
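
The scenario above can be sketched as a small user-space model. This is
not kernel code: struct model_cfs_rq and the model_* helpers are
hypothetical, and the refill timer is reduced to a flag that gets set
whenever runtime is accounted and runs out. Only the field names
runtime_remaining, throttled and timer_active come from this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for a cfs_rq. */
struct model_cfs_rq {
	int64_t runtime_remaining;	/* quota left, in ns */
	bool throttled;
	bool timer_active;		/* models cfs_bandwidth.timer_active */
};

/*
 * Models update_curr(): runtime is accounted, and the refill timer is
 * activated, only if delta_exec is strictly positive.
 */
static void model_update_curr(struct model_cfs_rq *rq, uint64_t delta_exec)
{
	assert(rq);
	if (delta_exec == 0)
		return;			/* accounting is skipped entirely */
	rq->runtime_remaining -= (int64_t)delta_exec;
	if (rq->runtime_remaining <= 0)
		rq->timer_active = true;
}

/* Models the put_prev path: throttle once runtime_remaining is <= 0. */
static void model_put_prev(struct model_cfs_rq *rq, uint64_t delta_exec)
{
	model_update_curr(rq, delta_exec);
	if (rq->runtime_remaining <= 0)
		rq->throttled = true;
}

/*
 * Returns true if the first put_prev after a bandwidth change leaves
 * the cfs_rq throttled with no timer armed to unthrottle it.
 */
bool model_stuck_after_first_put(int64_t initial_runtime, uint64_t delta_exec)
{
	struct model_cfs_rq rq = {
		.runtime_remaining = initial_runtime,
		.throttled = false,
		.timer_active = false,
	};

	model_put_prev(&rq, delta_exec);
	return rq.throttled && !rq.timer_active;
}
```

In this model, model_stuck_after_first_put(0, 0) is true, which matches
the hang described above, while model_stuck_after_first_put(1, 0) is
false. With an initial value of 1 the cfs_rq can still be throttled,
but only after a positive delta_exec has gone through the accounting
path that arms the timer.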

Our customers have encountered such hangs inside a VM several
times (it seems that time accounting is wrong, or at least
different, there). Analysis of crash dumps revealed that the hung
tasks were running inside cfs_rqs with the following state:

  cfs_rq->throttled=1
  cfs_rq->runtime_enabled=1
  cfs_rq->runtime_remaining=0
  cfs_rq->tg->cfs_bandwidth.idle=1
  cfs_rq->tg->cfs_bandwidth.timer_active=0

which agrees well with the explanation given above.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: <devel@openvz.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Turner <pjt@google.com>
Link: http://lkml.kernel.org/r/1360307446-26978-1-git-send-email-vdavydov@parallels.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>