mm: add extra free kbytes tunable
author    Rik van Riel <riel@redhat.com>
          Wed, 16 Nov 2011 23:41:23 +0000 (10:41 +1100)
committer Stephen Rothwell <sfr@canb.auug.org.au>
          Mon, 28 Nov 2011 04:37:42 +0000 (15:37 +1100)
commit    b350e5cebce57d785e8dfe24cdddebd182638410
tree      eb192276433ff22de6d6851c7070fb9fb5fd8545
parent    37c88e5be16edc7ce7f8d6c98a3d4fabf9f765a3
mm: add extra free kbytes tunable

Add a userspace-visible knob to tell the VM to keep an extra amount of
memory free, by increasing the gap between each zone's min and low
watermarks.
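
As a rough illustration of the mechanism, here is a userspace simulation
of the watermark arithmetic, not the patch itself.  It assumes the extra
kilobytes are split across zones in proportion to zone size; the zone
names and sizes are made up, and the exact low/high offsets (min/4 and
min/2) are illustrative.

    /* Simulation of the watermark arithmetic described above.
     * Assumption: extra_free_kbytes is split across zones in proportion
     * to zone size, and each zone's low and high watermarks are lifted
     * by that share, widening the min->low gap so kswapd starts
     * reclaiming earlier. */
    #include <stdio.h>

    #define PAGE_SHIFT 12   /* 4 KB pages */

    struct zone {
        const char *name;
        unsigned long present_pages;
    };

    int main(void)
    {
        unsigned long min_free_kbytes = 8113;         /* from the test below */
        unsigned long extra_free_kbytes = 640 * 1024; /* the new tunable */
        struct zone zones[] = {                       /* sizes are made up */
            { "DMA32",  1UL << 18 },
            { "Normal", 1UL << 19 },
        };
        size_t i, n = sizeof(zones) / sizeof(zones[0]);
        unsigned long total_pages = 0;
        unsigned long pages_min   = min_free_kbytes   >> (PAGE_SHIFT - 10);
        unsigned long pages_extra = extra_free_kbytes >> (PAGE_SHIFT - 10);

        for (i = 0; i < n; i++)
            total_pages += zones[i].present_pages;

        for (i = 0; i < n; i++) {
            struct zone *z = &zones[i];
            /* this zone's share, proportional to its size */
            unsigned long min = (unsigned long long)pages_min *
                                z->present_pages / total_pages;
            unsigned long extra = (unsigned long long)pages_extra *
                                  z->present_pages / total_pages;

            /* without the tunable: low = min + min/4; with it,
             * the min->low gap grows by the zone's extra share */
            printf("%-6s min=%lu low=%lu high=%lu (pages)\n",
                   z->name, min,
                   min + extra + (min >> 2),
                   min + extra + (min >> 1));
        }
        return 0;
    }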

This is useful for realtime applications that make system calls and have a
bound on how much memory they allocate in any short time period.  For such
an application, extra_free_kbytes would be set to an amount equal to or
larger than the maximum amount of memory allocated in any burst.
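
For example, an application with a burst bound of 640 MB could reserve
that much headroom at startup.  A minimal sketch follows, assuming the
tunable appears as /proc/sys/vm/extra_free_kbytes (it is registered in
kernel/sysctl.c below, so it should show up under /proc/sys/vm/); the
write requires root, and the value is in kilobytes.

    /* Sketch: reserve 640 MB of free-memory headroom before entering
     * the latency-sensitive section.  Requires root. */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/proc/sys/vm/extra_free_kbytes", "w");
        if (!f) {
            perror("extra_free_kbytes");
            return 1;
        }
        fprintf(f, "%d\n", 640 * 1024); /* value is in kilobytes */
        return fclose(f) == 0 ? 0 : 1;
    }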

It may also be useful to reduce the memory use of virtual machines
(temporarily?), in a way that does not cause memory fragmentation like
ballooning does.

Testing results from Satoru Moriya:

: I ran some sample workloads and measured memory allocation latency
: (the latency of __alloc_pages_nodemask()).
: The test setup was as follows:
:
:  - CPU: 1 socket, 4 core
:  - Memory: 4GB
:
:  - Background load:
:    $ dd if=/dev/zero of=/tmp/tmp1
:    $ dd if=/dev/zero of=/tmp/tmp2
:    $ dd if=/dev/zero of=/tmp/tmp3
:
:  - Main load:
:    $ mapped-file-stream 1 $((1024 * 1024 * 640))  --(*)
:
:  (*) This is made by Johannes Weiner
:      https://lkml.org/lkml/2010/8/30/226
:
:      It allocates and accesses 640 MB of memory in a burst.
:
: The results are as follows:
:
:                                |         |  extra   |
:                                | default |  kbytes  |
: --------------------------------------------------------------
: min_free_kbytes                |    8113 |   8113   |
: extra_free_kbytes              |       0 | 640*1024 | (KB)
: --------------------------------------------------------------
: worst latency                  | 517.762 |  20.775  | (usec)
: --------------------------------------------------------------
: vmstat result                  |         |          |
:  nr_vmscan_write               |       0 |      0   |
:  pgsteal_dma                   |       0 |      0   |
:  pgsteal_dma32                 |  143667 | 144882   |
:  pgsteal_normal                |   31486 |  27001   |
:  pgsteal_movable               |       0 |      0   |
:  pgscan_kswapd_dma             |       0 |      0   |
:  pgscan_kswapd_dma32           |  138617 | 156351   |
:  pgscan_kswapd_normal          |   30593 |  27955   |
:  pgscan_kswapd_movable         |       0 |      0   |
:  pgscan_direct_dma             |       0 |      0   |
:  pgscan_direct_dma32           |    5050 |      0   |
:  pgscan_direct_normal          |     896 |      0   |
:  pgscan_direct_movable         |       0 |      0   |
:  kswapd_steal                  |  169207 | 171883   |
:  kswapd_inodesteal             |       0 |      0   |
:  kswapd_low_wmark_hit_quickly  |      43 |     45   |
:  kswapd_high_wmark_hit_quickly |       1 |      0   |
:  allocstall                    |      32 |      0   |
:
:
: As you can see, in the default case there were 32 direct reclaims
: (allocstall) and the worst latency was 517.762 usecs.  This value could be
: larger if a process were to sleep or issue I/O in the direct reclaim path.
: OTOH, in the other case, where I add extra free kbytes, there were no
: direct reclaims and the worst latency was 20.775 usecs.
:
: In this test case, we can avoid direct reclaim and keep latency low.

Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Johannes Weiner <jweiner@redhat.com>
Tested-by: Satoru Moriya <satoru.moriya@hds.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Documentation/sysctl/vm.txt
kernel/sysctl.c
mm/page_alloc.c