per-task block plug can reduce block queue lock contention and increase
request merge. Currently page reclaim doesn't support it. I originally
thought page reclaim doesn't need it, because kswapd thread count is
limited and file cache write is done at flusher mostly.
When I test a workload with heavy swap in a 4-node machine, each CPU is
doing direct page reclaim and swap. This causes block queue lock
contention. In my test, without below patch, the CPU utilization is about
2% ~ 7%. With the patch, the CPU utilization is about 1% ~ 3%. Disk
throughput isn't changed. This should improve normal kswapd write and
file cache write too (increase request merge for example), but might not
be so obvious as I explain above.
Signed-off-by: Shaohua Li <shaohua.li@intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <>