git.karo-electronics.de Git - karo-tx-linux.git/commit

author	Tim Chen <tim.c.chen@linux.intel.com>
	Fri, 7 Jun 2013 00:07:56 +0000 (10:07 +1000)
committer	Stephen Rothwell <sfr@canb.auug.org.au>
	Fri, 7 Jun 2013 05:42:16 +0000 (15:42 +1000)
commit	51b307c05c203bd411911f95f32e654511b19f8d
tree	76bc7571eaae7d9befe3ac6d104936b034924325
parent	43db87cad3f34df7d6d6c4daaf99ccbf1ee67705

mm: tune vm_committed_as percpu_counter batching size

Currently the per-cpu counter's batch size for memory accounting is
configured as twice the number of cpus in the system.  However, for systems
with very large memory, it is more appropriate to make it proportional to
the memory size per cpu in the system.

For example, for an x86_64 system with 64 cpus and 128 GB of memory, the
batch size is only 2*64 pages (0.5 MB).  So any memory accounting change
of more than 0.5 MB will overflow the per-cpu counter into the global
counter.  Instead, for the new scheme, the batch size is configured to be
about 0.4% of the memory per cpu = 8 MB (128 GB / 64 / 256), which is more
in line with the memory size.
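
The arithmetic above can be checked with a small userspace sketch.  This is
not the code from the patch: the function names, the 4 KB page size, the
clamp to INT32_MAX and the choice to never go below the old value are
assumptions made for illustration only.

```c
#include <stdint.h>

/* Assumed page size for the x86_64 example (4 KB). */
#define SKETCH_PAGE_SIZE 4096ULL

/* Old scheme as described in the text: twice the number of cpus, in pages. */
static int32_t sketch_old_batch(int32_t nr_cpus)
{
	return nr_cpus * 2;
}

/*
 * New scheme as described in the text: 1/256 (about 0.4%) of the memory
 * per cpu, in pages.  Clamping to INT32_MAX and falling back to the old
 * value when it is larger are assumptions of this sketch.
 */
static int32_t sketch_new_batch(uint64_t total_pages, int32_t nr_cpus)
{
	uint64_t memsized = (total_pages / (uint64_t)nr_cpus) / 256;
	int32_t old = sketch_old_batch(nr_cpus);

	if (memsized > INT32_MAX)
		memsized = INT32_MAX;

	return (int32_t)memsized > old ? (int32_t)memsized : old;
}

/* Helper: convert a page count to bytes. */
static uint64_t sketch_pages_to_bytes(int32_t pages)
{
	return (uint64_t)pages * SKETCH_PAGE_SIZE;
}
```

With 128 GB of memory (33554432 pages of 4 KB) and 64 cpus, the old
function gives 128 pages (0.5 MB) and the new one gives 2048 pages (8 MB),
matching the figures quoted above.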

I've done a repeated brk test of 800KB (from will-it-scale test suite)
with 80 concurrent processes on a 4 socket Westmere machine with a total
of 40 cores.  Without the patch, about 80% of cpu is spent on spin-lock
contention within the vm_committed_as counter.  With the patch, there's a
73x speedup on the benchmark and the lock contention drops off almost
entirely.
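
To see why an 800 KB brk loop is hit so hard by the old batch size, here is
a single-threaded model of one cpu's view of a batched counter.  It is only
a sketch: the struct and function names are hypothetical, the real
percpu_counter takes a spinlock where this model merely counts a "fold", and
800 KB is taken as 200 pages of 4 KB.

```c
#include <stdint.h>

/* Simplified model of a batched counter as seen from one cpu. */
struct sim_counter {
	int64_t  global;   /* shared total (lock-protected in the kernel) */
	int64_t  local;    /* this cpu's private, not yet folded delta */
	int32_t  batch;    /* threshold at which local is folded into global */
	uint64_t folds;    /* how often the shared total had to be touched */
};

/* Add to the local delta; fold into the shared total once |delta| >= batch. */
static void sim_counter_add(struct sim_counter *c, int64_t amount)
{
	int64_t count = c->local + amount;

	if (count >= c->batch || count <= -(int64_t)c->batch) {
		c->global += count;   /* this is where the lock would be taken */
		c->local = 0;
		c->folds++;
	} else {
		c->local = count;
	}
}

/* Model a brk grow/shrink loop and return the number of folds it caused. */
static uint64_t sim_brk_loop(int32_t batch, int64_t pages, unsigned iterations)
{
	struct sim_counter c = { 0, 0, batch, 0 };
	unsigned i;

	for (i = 0; i < iterations; i++) {
		sim_counter_add(&c, pages);    /* brk grows the heap */
		sim_counter_add(&c, -pages);   /* brk shrinks it again */
	}
	return c.folds;
}
```

In this model, sim_brk_loop(128, 200, 1000) touches the shared total 2000
times (every grow and every shrink exceeds the 128-page batch), while
sim_brk_loop(2048, 200, 1000) never touches it.  With 80 processes doing
this concurrently, the first case is the kind of pattern that would
serialize on the counter's lock.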

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

include/linux/mman.h
mm/mm_init.c