Minchan Kim [Mon, 16 Dec 2013 23:45:52 +0000 (10:45 +1100)]
zram: promote zram from staging
Zram has lived in staging for a LONG LONG time and has been
fixed/improved by many contributors, so the code is clean and stable now.
Of course, many products already use zram in practice.
The major TV companies have used zram as swap for two years, and
recently our production team released an Android smartphone that uses
zram as swap, too. Android KitKat has also started to use zram on
low-memory smartphones. There was a report that Google released
ChromeOS with zram, and CyanogenMod has been using zram for a long time.
I have also heard that some distros use a zram block device for tmpfs. In
addition, I have seen many reports from other people. For example, Lubuntu
has started to use it.
The benefit of zram is very clear. In my experience, one benefit was
removing jitter from a video application under background memory pressure.
Part of that is the effect of more efficient memory use through
compression, but the bigger issue is whether the system has swap at all.
Recent mobile platforms use Java, so there are many anonymous pages. But
embedded systems are normally reluctant to use eMMC or an SD card as swap
because of wear-leveling and latency issues. Without swap, we can't
reclaim anonymous pages and can eventually hit an OOM kill. :(
Even with real storage as swap, there was a problem, too: slow swap
storage performance sometimes ends up making the system very
unresponsive.
Quote from Luigi at Google:
"
Since Chrome OS was mentioned: the main reason why we don't use swap
to a disk (rotating or SSD) is because it doesn't degrade gracefully
and leads to a bad interactive experience. Generally we prefer to
manage RAM at a higher level, by transparently killing and restarting
processes. But we noticed that zram is fast enough to be competitive
with the latter, and it lets us make more efficient use of the
available RAM.
"
and he announced it: http://www.spinics.net/lists/linux-mm/msg57717.html
Another use case is to use zram as a plain block device. Zram is a block
device, so anyone can format it and mount it, and some people on the
internet have started using zram as /var/tmp.
http://forums.gentoo.org/viewtopic-t-838198-start-0.html
Let's promote zram and enhance/maintain it instead of removing it.
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Minchan Kim [Mon, 16 Dec 2013 23:45:52 +0000 (10:45 +1100)]
zsmalloc: move it under mm
This patch moves zsmalloc under the mm directory.
Before that, this description explains why we need a custom allocator.
Zsmalloc is a new slab-based memory allocator for storing compressed
pages. It is designed for low fragmentation and a high allocation success
rate on large, but <= PAGE_SIZE, allocations.
zsmalloc differs from the kernel slab allocator in two primary ways to
achieve these design goals.
zsmalloc never requires high order page allocations to back slabs, or
"size classes" in zsmalloc terms. Instead it allows multiple single-order
pages to be stitched together into a "zspage" which backs the slab. This
allows for higher allocation success rate under memory pressure.
Also, zsmalloc allows objects to span page boundaries within the zspage.
This allows for lower fragmentation than could be had with the kernel slab
allocator for objects between PAGE_SIZE/2 and PAGE_SIZE. With the kernel
slab allocator, if a page compresses to 60% of its original size, the
memory savings gained through compression are lost to fragmentation
because another object of the same size can't be stored in the leftover
space.
This ability to span pages results in zsmalloc allocations not being
directly addressable by the user. The user is given a
non-dereferenceable handle in response to an allocation request. That
handle must be mapped, using zs_map_object(), which returns a pointer to
the mapped region that can be used. The mapping is necessary since the
object data may reside in two different noncontiguous pages.
zsmalloc fulfills the allocation needs of zram perfectly.
[sjenning@linux.vnet.ibm.com: borrow Seth's quote]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Wanpeng Li [Mon, 16 Dec 2013 23:45:52 +0000 (10:45 +1100)]
mm/migrate.c: fix setting of cpupid on page migration twice against normal page
7851a45cd3 ("mm: numa: Copy cpupid on page migration") copies over the
cpupid at page migration time. It is unnecessary to set it again in
alloc_misplaced_dst_page().
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Wanpeng Li [Mon, 16 Dec 2013 23:45:52 +0000 (10:45 +1100)]
mm/migrate.c: fix set cpupid on page migration twice against thp
7851a45cd3 ("mm: numa: Copy cpupid on page migration") copies over the
cpupid at page migration time. It is unnecessary to set it again in
migrate_misplaced_transhuge_page().
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Chen Gang [Mon, 16 Dec 2013 23:45:50 +0000 (10:45 +1100)]
kernel/kexec.c: use vscnprintf() instead of vsnprintf() in vmcoreinfo_append_str()
vsnprintf() may return an 'r' larger than sizeof(buf). In that case, if
'r' is also less than "vmcoreinfo_max_size - vmcoreinfo_size" (the space
left in the destination buffer), the following memcpy() will read beyond
the end of buf.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>