memcg, vmscan: do not fall into reclaim-all pass too quickly

shrink_zone starts with a soft reclaim pass and then falls back to
regular reclaim if nothing has been scanned. This behavior is natural but
there is a catch. Memcg iterators, when used with the reclaim cookie, are
designed to help prevent over-reclaim by interleaving reclaimers (per
node-zone-priority), so the tree walk might miss many (even all) nodes in
the hierarchy, e.g. when direct reclaimers race with each other or with
kswapd in the global case, or when multiple allocators reach the limit in
the targeted reclaim case. To make it even more complicated, targeted
reclaim doesn't do the whole tree walk because it stops as soon as it has
reclaimed sufficient pages. As a result, groups over the soft limit might
be missed, nothing is scanned, and reclaim falls back to the reclaim-all
mode.
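
To illustrate how a shared iterator can hide every eligible group from
one reclaimer, here is a minimal userspace sketch. It is a toy model
under stated assumptions, not the kernel's memcg iterator: struct
toy_iter, struct toy_walk, toy_iter_next() and toy_walk_eligible() are
hypothetical names, and the single cursor plus generation counter only
approximates what the reclaim cookie does per node-zone-priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical): one cursor shared by all reclaimers. */
struct toy_iter {
	size_t pos;		/* next group index to hand out */
	unsigned int generation;/* bumped when a round completes */
};

/* Per-reclaimer walk state (hypothetical). */
struct toy_walk {
	unsigned int generation;/* round this walk joined */
	bool started;
};

/* Returns the next group index, or -1 once the joined round is over. */
int toy_iter_next(struct toy_iter *it, struct toy_walk *w, size_t ngroups)
{
	assert(ngroups > 0);

	if (!w->started) {
		w->generation = it->generation;
		w->started = true;
	} else if (w->generation != it->generation) {
		return -1;	/* another reclaimer finished the round */
	}

	if (it->pos == ngroups) {
		it->pos = 0;	/* end of round: reset for the next one */
		it->generation++;
		return -1;
	}
	return (int)it->pos++;
}

/* Walks at most max_steps groups, counts those over the soft limit. */
size_t toy_walk_eligible(struct toy_iter *it, const bool *over_soft_limit,
			 size_t ngroups, size_t max_steps)
{
	struct toy_walk w = { 0, false };
	size_t eligible = 0;

	for (size_t i = 0; i < max_steps; i++) {
		int group = toy_iter_next(it, &w, ngroups);

		if (group < 0)
			break;
		if (over_soft_limit[group])
			eligible++;
	}
	return eligible;
}
```

With four groups of which only the first two are over the soft limit, a
first reclaimer that stops after two steps sees both of them. A second
reclaimer that picks up the shared cursor afterwards walks only the
remaining two groups and sees no eligible group at all, although two
exist in the hierarchy.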

This patch checks for an incomplete tree walk in shrink_zone. If no
group has been visited and the hierarchy is soft reclaimable then we must
have missed some groups, in which case __shrink_zone is called again.
This doesn't guarantee any progress, of course, because the current
reclaimer might still be racing with others, but it at least gives it a
chance to start the walk again without a big risk of reclaim latencies.
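
The decision described above can be sketched in userspace as follows.
This is an illustration under stated assumptions and not the patch
itself: struct toy_pass, enum toy_result and toy_shrink_zone() are
hypothetical names, the outcome of each soft pass is scripted by the
caller instead of coming from a real tree walk, and limiting the retry
to a single extra pass is an assumption made here to keep latency
bounded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Scripted outcome of one soft reclaim pass (hypothetical). */
struct toy_pass {
	int visited_groups;	  /* groups the tree walk reached */
	unsigned long nr_scanned; /* pages scanned during the pass */
};

enum toy_result {
	TOY_SOFT_DONE,		/* a soft pass scanned something */
	TOY_RECLAIM_ALL,	/* fell back to reclaiming every group */
};

/*
 * pass[0] is the first soft pass, pass[1] the retry (read only if the
 * retry happens). *soft_passes reports how many soft passes were run.
 */
enum toy_result toy_shrink_zone(const struct toy_pass pass[2],
				bool soft_reclaimable, int *soft_passes)
{
	unsigned long scanned = 0;

	assert(pass != NULL && soft_passes != NULL);
	*soft_passes = 0;

	/* No soft limit filtering applies: reclaim from everybody. */
	if (!soft_reclaimable)
		return TOY_RECLAIM_ALL;

	scanned += pass[0].nr_scanned;
	(*soft_passes)++;

	/*
	 * An empty walk in a soft reclaimable hierarchy means groups
	 * were missed, so try once more. Only once, so a reclaimer
	 * that keeps losing the race is not delayed indefinitely.
	 */
	if (pass[0].visited_groups == 0) {
		scanned += pass[1].nr_scanned;
		(*soft_passes)++;
	}

	/* Nothing scanned even after the retry: reclaim everybody. */
	return scanned ? TOY_SOFT_DONE : TOY_RECLAIM_ALL;
}
```

In this model a first pass that visits no group triggers exactly one
retry. A first pass that visits groups but scans nothing skips the retry
and goes straight to the reclaim-all fallback, since in that case the
walk was not incomplete.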

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Glauber Costa <glommer@openvz.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>