From: Michal Hocko
Date: Mon, 12 May 2014 14:34:17 +0000 (+0200)
Subject: memcg: remove tasks/children test from mem_cgroup_force_empty()
X-Git-Url: https://git.karo-electronics.de/?a=commitdiff_plain;h=f61c42a7d911;p=linux-beck.git

Tejun has correctly pointed out that tasks/children test in
mem_cgroup_force_empty is not correct because there is no other locking
which preserves this state throughout the rest of the function so both
new tasks can join the group or new children groups can be added while
somebody is writing to memory.force_empty. A new task would break
mem_cgroup_reparent_charges expectation that all failures as described
by mem_cgroup_force_empty_list are temporal and there is no way out.

The main use case for the knob as described by
Documentation/cgroups/memory.txt is to:
"
  The typical use case for this interface is before calling rmdir().
  Because rmdir() moves all pages to parent, some out-of-use page caches can be
  moved to the parent. If you want to avoid that, force_empty will be useful.
"

This means that reparenting is not really required as rmdir will
reparent pages implicitly from the safe context. If we remove it from
mem_cgroup_force_empty then we are safe even with existing tasks because
the number of reclaim attempts is bounded. Moreover the knob still does
what the documentation claims (modulo reparenting which doesn't make any
difference) and users might expect. Longterm we want to deprecate the
whole knob and put the reparented pages to the tail of parent LRU during
cgroup removal.

tj: Removed unused variable @cgrp from mem_cgroup_force_empty()

Signed-off-by: Michal Hocko
Acked-by: Johannes Weiner
Acked-by: Li Zefan
Signed-off-by: Tejun Heo
---

diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index 2622115276aa..8cb87e32e8cb 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -453,15 +453,11 @@ About use_hierarchy, see Section 6.
 
 5.1 force_empty
   memory.force_empty interface is provided to make cgroup's memory usage empty.
-  You can use this interface only when the cgroup has no tasks.
   When writing anything to this
 
   # echo 0 > memory.force_empty
 
-  Almost all pages tracked by this memory cgroup will be unmapped and freed.
-  Some pages cannot be freed because they are locked or in-use. Such pages are
-  moved to parent (if use_hierarchy==1) or root (if use_hierarchy==0) and this
-  cgroup will be empty.
+  the cgroup will be reclaimed and as many pages reclaimed as possible.
 
   The typical use case for this interface is before calling rmdir().
   Because rmdir() moves all pages to parent, some out-of-use page caches can be
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a5e0417b4f9a..6144a8e7283f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4857,11 +4857,6 @@ static inline bool memcg_has_children(struct mem_cgroup *memcg)
 static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
 {
 	int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
-	struct cgroup *cgrp = memcg->css.cgroup;
-
-	/* returns EBUSY if there is a task or if we come here twice. */
-	if (cgroup_has_tasks(cgrp) || !list_empty(&cgrp->children))
-		return -EBUSY;
 
 	/* we call try-to-free pages for make this cgroup empty */
 	lru_add_drain_all();
@@ -4881,8 +4876,6 @@ static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
 		}
 
 	}
-	lru_add_drain();
-	mem_cgroup_reparent_charges(memcg);
 
 	return 0;
 }
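
Editorial illustration (not part of the patch): the commit message argues
that, without the reparenting step, force_empty is safe even with tasks
still in the group "because the number of reclaim attempts is bounded".
The userspace toy below is a rough sketch of that argument only. Every
name in it (toy_memcg, toy_reclaim, toy_force_empty, TOY_RECLAIM_RETRIES,
TOY_BATCH) is hypothetical, the retry limit and batch size are assumed
values, and the loop shape is an assumption based on the nr_retries
counter visible in the diff context rather than a copy of the kernel code.

```c
#include <assert.h>

/* Assumed stand-in for MEM_CGROUP_RECLAIM_RETRIES; the value is illustrative. */
#define TOY_RECLAIM_RETRIES 5
/* Hypothetical number of pages one reclaim pass may free. */
#define TOY_BATCH 32

/* Toy model of a memory cgroup: charged pages, some of them unreclaimable. */
struct toy_memcg {
	long usage;	/* pages currently charged to the group */
	long pinned;	/* pages that cannot be freed, e.g. in use by a live task */
};

/* Hypothetical reclaim pass: frees up to TOY_BATCH unpinned pages. */
static long toy_reclaim(struct toy_memcg *cg)
{
	long reclaimable, freed;

	assert(cg->pinned >= 0 && cg->pinned <= cg->usage);
	reclaimable = cg->usage - cg->pinned;
	freed = reclaimable < TOY_BATCH ? reclaimable : TOY_BATCH;
	cg->usage -= freed;
	return freed;
}

/*
 * Sketch of a bounded force_empty: keep reclaiming while usage is non-zero,
 * but give up after TOY_RECLAIM_RETRIES passes that made no progress.
 * There is no reparenting step, so pinned pages simply stay charged.
 */
static int toy_force_empty(struct toy_memcg *cg, int *passes)
{
	int nr_retries = TOY_RECLAIM_RETRIES;

	*passes = 0;
	while (nr_retries && cg->usage > 0) {
		(*passes)++;
		if (!toy_reclaim(cg))
			nr_retries--;	/* no progress: consume one retry */
	}
	return 0;
}
```

In this toy, a group with pinned pages stops after a fixed number of
fruitless passes and leaves those pages charged, while a group with no
pinned pages drains to zero. Neither case can loop forever, which is the
property the commit message relies on.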