Jacek Anaszewski [Fri, 20 Nov 2015 15:38:50 +0000 (16:38 +0100)]
leds: powernv: Implement brightness_set_blocking op
Since brightness setting can sleep for this driver, implement
brightness_set_blocking op, instead of brightness_set.
It makes this driver compatible with LED triggers.
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com> Cc: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Jacek Anaszewski [Fri, 20 Nov 2015 15:47:51 +0000 (16:47 +0100)]
leds: ipaq-micro: Implement brightness_set_blocking op
Since brightness setting can sleep for this driver, implement
brightness_set_blocking op, instead of brightness_set.
It makes this driver compatible with LED triggers.
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Simon Arlott [Sun, 15 Nov 2015 13:34:37 +0000 (13:34 +0000)]
leds: bcm6328: Swap LED ON and OFF definitions
The values of BCM6328_LED_MODE_ON and BCM6328_LED_MODE_OFF were named
for active low LEDs. These should be swapped so that they are named for
the default case of active high LEDs.
Signed-off-by: Simon Arlott <simon@fire.lp0.eu> Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Simon Arlott [Mon, 16 Nov 2015 20:24:59 +0000 (20:24 +0000)]
leds: bcm6328: Reuse bcm6328_led_set() instead of copying its functionality
When ensuring a consistent initial LED state in bcm6328_led (as they may
be blinking instead of on/off), the LED register is set using an inverted
copy of bcm6328_led_set(). To avoid further errors relating to active low
handling, call this function directly instead.
As bcm6328_led_set() acquires the same spinlock again when updating the
register, it is called after unlocking.
Signed-off-by: Simon Arlott <simon@fire.lp0.eu> Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Milo Kim [Fri, 20 Nov 2015 08:03:00 +0000 (17:03 +0900)]
leds: turn off the LED and wait for completion on unregistering LED class device
Workqueue, 'set_brightness_work' is used for scheduling brightness control.
This workqueue is canceled when the LED class device is unregistered.
Currently, LED subsystem handles like below.
However, this could be a problem.
Workqueue is going to be canceled but LED device needs to be off.
The worst case is null pointer access due to scheduling a workqueue.
LED module is loaded.
LED driver private data is allocated by using devm_zalloc().
LED module is unloaded.
led_classdev_unregister() is called.
cancel_work_sync()
led_set_brightness(led_cdev, LED_OFF)
schedule_work() if LED driver uses brightness_set_blocking()
In the meantime, driver private data will be freed.
..scheduling..
brightness_set_blocking() callback is invoked.
For the brightness control, LED driver tries to access private
data but resource is removed!
To avoid this problem, LED subsystem should turn off the brightness first
and wait for completion.
leds: call led_pwm_set() in leds-pwm to enforce default LED_OFF
Some PWMs are disabled by default or the default pin setting
does not match the LED_OFF state (e.g., active-low leds).
Hence, the driver may end up reporting 0 brightness, but
the leds are actually on using full brightness, because
it never enforces its default configuration.
So enforce it by calling led_pwm_set() after successfully
registering the device.
Tested on a Phytec phyFLEX i.MX6Q board based on kernel
v3.19.5.
Signed-off-by: Markus Hofstaetter <markus.hofstaetter@ait.ac.at> Tested-by: Markus Hofstaetter <markus.hofstaetter@ait.ac.at> Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
The header of this file fixes the license to GPL 2 only without the
option to use later version. So use the string "GPL v2" that is to be
used in this case.
Rob Herring [Tue, 10 Nov 2015 23:10:17 +0000 (17:10 -0600)]
leds: ledtrig-transient: fix duration to be msec instead of jiffies
The transient trigger duration is documented to be in msec units, but is
actually in jiffies units. Other time based triggers are in msec units
as well. Fix the timer setup to convert from msec.
This could break an existing userspace that worked around this problem,
but exposing jiffies to userspace is just wrong and would break anyway
if HZ is changed.
Signed-off-by: Rob Herring <robh@kernel.org> Cc: Shuah Khan <shuahkhan@gmail.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: linux-leds@vger.kernel.org Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
Andrew Lunn [Thu, 20 Aug 2015 10:59:45 +0000 (12:59 +0200)]
leds: wm8350: Remove work queue
Now the core implements the work queue, remove it from the drivers,
and switch to using brightness_set_blocking op.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com> Acked-by: Antonio Ospite <ao2@ao2.it> Reviewed-by: Mark Brown <broonie@kernel.org>
Andrew Lunn [Thu, 20 Aug 2015 13:54:57 +0000 (15:54 +0200)]
leds: adp5520: Remove work queue
Now the core implements the work queue, remove it from the driver,
and switch to using brightness_set_blocking op.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Cc: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com>
media: flash: use led_set_brightness_sync for torch brightness
LED subsystem shifted responsibility for choosing between SYNC or ASYNC
way of setting brightness from drivers to the caller. Adapt the wrapper
to those changes.
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com> Cc: linux-media@vger.kernel.org Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Now the core implements the work queue, remove it from the drivers,
and switch to using brightness_set_blocking op.
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com> Cc: Ingi Kim <ingi2.kim@samsung.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Acked-by: Pavel Machek <pavel@ucw.cz>
This patch removes SET_BRIGHTNESS_ASYNC and SET_BRIGHTNESS_SYNC flags.
led_set_brightness() now calls led_set_brightness_nosleep() instead of
choosing between sync and async op basing on the flags defined by the
driver.
From now on, if a user wants to make sure that brightness will be set
synchronously, they have to use led_set_brightness_sync() API. It is now
being made publicly available since it has become apparent that it is
a caller who should decide whether brightness is to be set in
a synchronous or an asynchronous way.
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Jacek Anaszewski [Mon, 19 Oct 2015 07:04:01 +0000 (09:04 +0200)]
leds: core: Use set_brightness_work for the blocking op
This patch makes LED core capable of setting brightness for drivers
that implement brightness_set_blocking op. It removes from LED class
drivers responsibility for using work queues on their own.
In order to achieve this set_brightness_delayed callback is being
modified to directly call one of available ops for brightness setting.
led_set_brightness_async() function didn't set brightness in an
asynchronous way in all cases. It was mistakenly assuming that all
LED subsystem drivers used work queue in their brightness_set op,
whereas only half of them did that. Since it has no users now,
it is being removed.
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
This patch adds led_set_brightness_nosleep() and led_set_brightness_nopm()
functions, that guarantee setting LED brightness in a non-blocking way.
The latter is used from pm_ops context and doesn't modify the brightness
cached in the struct led_classdev. Its execution always ends up with
a call to brightness setting op - either directly or through
a set_brightness_work, regardless of LED_SUSPENDED flag state.
The patch also replaces led_set_brightness_async() with
led_set_brightness_nosleep() in all places where the most vital was setting
brightness in a non sleeping way but not necessarily asynchronously, which
is not needed for non-blocking drivers.
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
leds: Rename brightness_set_sync op to brightness_set_blocking
The initial purpose of brightness_set_sync op, introduced along with
the LED flash class extension, was to add a means for setting torch LED
brightness as soon as possible, which couldn't have been guaranteed by
brightness_set op. This patch renames the op to brightness_set_blocking,
which describes its purpose in a more generic way. It is beneficial
in view of the prospective changes in the LED core, aiming at removing
the need for using work queues in LED class drivers that can sleep
or use delays while setting brightness.
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com> Acked-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
This patch adds LED_BLINK_BRIGHTNESS_CHANGE flag to indicate that blink
brightness has changed, and LED_BLINK_DISABLE flag to indicate that
blinking deactivation has been requested. In order to use the flags
led_timer_function and set_brightness_delayed callbacks as well as
led_set_brightness() function are being modified. The main goal of these
modifications is to prepare set_brightness_work for extension of the
scope of its responsibilities.
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Jacek Anaszewski [Mon, 28 Sep 2015 13:07:10 +0000 (15:07 +0200)]
leds: core: Use EXPORT_SYMBOL_GPL consistently
LED core has a mixture of EXPORT_SYMBOL and EXPORT_SYMBOL_GPL macros.
This patch fixes this discrepancy and switches to using EXPORT_SYMBOL_GPL
for each exported function.
Signed-off-by: Jacek Anaszewski <j.anaszewski@samsung.com> Acked-by: Pavel Machek <pavel@ucw.cz>
Linus Torvalds [Sun, 3 Jan 2016 19:49:31 +0000 (11:49 -0800)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS build fix from Ralf Baechle:
"Fix a makefile issue resulting in build breakage with older binutils.
This has sat in -next for a few days, testers and buildbot are happy
with it, too though if you are going for another -rc that'd certainly
help ironing out a few more issues"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: VDSO: Fix build error with binutils 2.24 and earlier
Linus Torvalds [Sun, 3 Jan 2016 19:36:26 +0000 (11:36 -0800)]
Merge tag 'drm-intel-fixes-2016-01-02' of git://anongit.freedesktop.org/drm-intel
Pull i915 drm fixes from Jani Nikula:
"Two display fixes still for v4.4.
The new year's resolution is to start using signed tags per Linus'
request. This one is still unsigned; I want to fix this up in our
maintainer scripts instead of doing it one-off"
* tag 'drm-intel-fixes-2016-01-02' of git://anongit.freedesktop.org/drm-intel:
drm/i915: increase the tries for HDMI hotplug live status checking
drm/i915: Unbreak check_digital_port_conflicts()
Linus Torvalds [Thu, 31 Dec 2015 22:59:21 +0000 (14:59 -0800)]
Merge tag 'pci-v4.4-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI bugfix from Bjorn Helgaas:
"Here's another fix for v4.4.
This fixes 32-bit config reads for the HiSilicon driver. Obviously
the driver is completely broken without this fix (apparently it
actually was tested internally, but got broken somehow in the process
of upstreaming it).
Xin Long [Tue, 29 Dec 2015 09:49:25 +0000 (17:49 +0800)]
sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close
In sctp_close, sctp_make_abort_user may return NULL because of memory
allocation failure. If this happens, it will bypass any state change
and never free the assoc. The assoc has no chance to be freed and it
will be kept in memory with the state it had even after the socket is
closed by sctp_close().
So if sctp_make_abort_user fails to allocate memory, we should abort
the asoc via sctp_primitive_ABORT as well. Just like the annotation in
sctp_sf_cookie_wait_prm_abort and sctp_sf_do_9_1_prm_abort said,
"Even if we can't send the ABORT due to low memory delete the TCB.
This is a departure from our typical NOMEM handling".
But then the chunk is NULL (low memory) and the SCTP_CMD_REPLY cmd would
dereference the chunk pointer, and system crash. So we should add
SCTP_CMD_REPLY cmd only when the chunk is not NULL, just like other
places where it adds SCTP_CMD_REPLY cmd.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolai Stange [Tue, 29 Dec 2015 12:29:55 +0000 (13:29 +0100)]
net, socket, socket_wq: fix missing initialization of flags
Commit ceb5d58b2170 ("net: fix sock_wake_async() rcu protection") from
the current 4.4 release cycle introduced a new flags member in
struct socket_wq and moved SOCKWQ_ASYNC_NOSPACE and SOCKWQ_ASYNC_WAITDATA
from struct socket's flags member into that new place.
Unfortunately, the new flags field is never initialized properly, at least
not for the struct socket_wq instance created in sock_alloc_inode().
One particular issue I encountered because of this is that my GNU Emacs
failed to draw anything on my desktop -- i.e. what I got is a transparent
window, including the title bar. Bisection lead to the commit mentioned
above and further investigation by means of strace told me that Emacs
is indeed speaking to my Xorg through an O_ASYNC AF_UNIX socket. This is
reproducible 100% of times and the fact that properly initializing the
struct socket_wq ->flags fixes the issue leads me to the conclusion that
somehow SOCKWQ_ASYNC_WAITDATA got set in the uninitialized ->flags,
preventing my Emacs from receiving any SIGIO's due to data becoming
available and it got stuck.
Make sock_alloc_inode() set the newly created struct socket_wq's ->flags
member to zero.
Fixes: ceb5d58b2170 ("net: fix sock_wake_async() rcu protection") Signed-off-by: Nicolai Stange <nicstange@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 30 Dec 2015 18:26:20 +0000 (10:26 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Make the block layer great again.
Basically three amazing fixes in this pull request, split into 4
patches. Believe me, they should go into 4.4. Two of them fix a
regression, the third and last fixes an easy-to-trigger bug.
- Fix a bad irq enable through null_blk, for queue_mode=1 and using
timer completions. Add a block helper to restart a queue
asynchronously, and use that from null_blk. From me.
- Fix a performance issue in NVMe. Some devices (Intel Pxxxx) expose
a stripe boundary, and performance suffers if we cross it. We took
that into account for merging, but not for the newer splitting
code. Fix from Keith.
- Fix a kernel oops in lightnvm with multiple channels. From Matias"
* 'for-linus' of git://git.kernel.dk/linux-block:
lightnvm: wrong offset in bad blk lun calculation
null_blk: use async queue restart helper
block: add blk_start_queue_async()
block: Split bios on chunk boundaries
It looks like 70-80 ms is BSW platform needs in some bad cases of the
monitors at this end (8 times delay at most). Keep less than 100ms for
HDCP pulse HPD low (with at least 100ms) to respond a plug out.
Reviewed-by: Cooper Chiou <cooper.chiou@intel.com> Tested-by: Gary Wang <gary.c.wang@intel.com> Cc: Gavin Hindman <gavin.hindman@intel.com> Cc: Sonika Jindal <sonika.jindal@intel.com> Cc: Shashank Sharma <shashank.sharma@intel.com> Cc: Shobhit Kumar <shobhit.kumar@intel.com> Signed-off-by: Gary Wang <gary.c.wang@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1450858295-12804-1-git-send-email-gary.c.wang@intel.com Tested-by: Shobhit Kumar <shobhit.kumar@intel.com> Cc: drm-intel-fixes@lists.freedesktop.org Fixes: 237ed86c693d ("drm/i915: Check live status before reading edid") Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit f8d03ea0053b23de42c828d559016eabe0b91523)
[Jani: undo the file mode change of the original commit] Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Linus Torvalds [Wed, 30 Dec 2015 02:04:41 +0000 (18:04 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"9 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/vmstat: fix overflow in mod_zone_page_state()
ocfs2/dlm: clear migration_pending when migration target goes down
mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone()
ocfs2: fix flock panic issue
m32r: add io*_rep helpers
m32r: fix build failure
arch/x86/xen/suspend.c: include xen/xen.h
mm: memcontrol: fix possible memcg leak due to interrupted reclaim
ocfs2: fix BUG when calculate new backup super
Heiko Carstens [Tue, 29 Dec 2015 22:54:32 +0000 (14:54 -0800)]
mm/vmstat: fix overflow in mod_zone_page_state()
mod_zone_page_state() takes a "delta" integer argument. delta contains
the number of pages that should be added or subtracted from a struct
zone's vm_stat field.
If a zone is larger than 8TB this will cause overflows. E.g. for a
zone with a size slightly larger than 8TB the line
in mm/page_alloc.c:free_area_init_core() will result in a negative
result for the NR_ALLOC_BATCH entry within the zone's vm_stat, since 8TB
contain 0x8xxxxxxx pages which will be sign extended to a negative
value.
Fix this by changing the delta argument to long type.
This could fix an early boot problem seen on s390, where we have a 9TB
system with only one node. ZONE_DMA contains 2GB and ZONE_NORMAL the
rest. The system is trying to allocate a GFP_DMA page but ZONE_DMA is
completely empty, so it tries to reclaim pages in an endless loop.
This was seen on a heavily patched 3.10 kernel. One possible
explaination seem to be the overflows caused by mod_zone_page_state().
Unfortunately I did not have the chance to verify that this patch
actually fixes the problem, since I don't have access to the system
right now. However the overflow problem does exist anyway.
Given the description that a system with slightly less than 8TB does
work, this seems to be a candidate for the observed problem.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
xuejiufei [Tue, 29 Dec 2015 22:54:29 +0000 (14:54 -0800)]
ocfs2/dlm: clear migration_pending when migration target goes down
We have found a BUG on res->migration_pending when migrating lock
resources. The situation is as follows.
dlm_mark_lockres_migration
res->migration_pending = 1;
__dlm_lockres_reserve_ast
dlm_lockres_release_ast returns with res->migration_pending remains
because other threads reserve asts
wait dlm_migration_can_proceed returns 1
>>>>>>> o2hb found that target goes down and remove target
from domain_map
dlm_migration_can_proceed returns 1
dlm_mark_lockres_migrating returns -ESHOTDOWN with
res->migration_pending still remains.
When reentering dlm_mark_lockres_migrating(), it will trigger the BUG_ON
with res->migration_pending. So clear migration_pending when target is
down.
Signed-off-by: Jiufei Xue <xuejiufei@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Banman [Tue, 29 Dec 2015 22:54:25 +0000 (14:54 -0800)]
mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone()
test_pages_in_a_zone() does not account for the possibility of missing
sections in the given pfn range. pfn_valid_within always returns 1 when
CONFIG_HOLES_IN_ZONE is not set, allowing invalid pfns from missing
sections to pass the test, leading to a kernel oops.
Wrap an additional pfn loop with PAGES_PER_SECTION granularity to check
for missing sections before proceeding into the zone-check code.
This also prevents a crash from offlining memory devices with missing
sections. Despite this, it may be a good idea to keep the related patch
'[PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with
missing sections' because missing sections in a memory block may lead to
other problems not covered by the scope of this fix.
Signed-off-by: Andrew Banman <abanman@sgi.com> Acked-by: Alex Thorlton <athorlton@sgi.com> Cc: Russ Anderson <rja@sgi.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Greg KH <greg@kroah.com> Cc: Seth Jennings <sjennings@variantweb.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Junxiao Bi [Tue, 29 Dec 2015 22:54:22 +0000 (14:54 -0800)]
ocfs2: fix flock panic issue
Commit 4f6563677ae8 ("Move locks API users to locks_lock_inode_wait()")
move flock/posix lock indentify code to locks_lock_inode_wait(), but
missed to set fl_flags to FL_FLOCK which caused the following kernel
panic on 4.4.0_rc5.
Fixes: 4f6563677ae8 ("Move locks API users to locks_lock_inode_wait()") Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <joseph.qi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sudip Mukherjee [Tue, 29 Dec 2015 22:54:16 +0000 (14:54 -0800)]
m32r: fix build failure
m32r allmodconfig is failing with:
In file included from ../include/linux/kvm_para.h:4:0,
from ../kernel/watchdog.c:26:
../include/uapi/linux/kvm_para.h:30:26: fatal error: asm/kvm_para.h: No such file or directory
Andrew Morton [Tue, 29 Dec 2015 22:54:13 +0000 (14:54 -0800)]
arch/x86/xen/suspend.c: include xen/xen.h
Fix the build warning:
arch/x86/xen/suspend.c: In function 'xen_arch_pre_suspend':
arch/x86/xen/suspend.c:70:9: error: implicit declaration of function 'xen_pv_domain' [-Werror=implicit-function-declaration]
if (xen_pv_domain())
^
Reported-by: kbuild test robot <fengguang.wu@intel.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vladimir Davydov [Tue, 29 Dec 2015 22:54:10 +0000 (14:54 -0800)]
mm: memcontrol: fix possible memcg leak due to interrupted reclaim
Memory cgroup reclaim can be interrupted with mem_cgroup_iter_break()
once enough pages have been reclaimed, in which case, in contrast to a
full round-trip over a cgroup sub-tree, the current position stored in
mem_cgroup_reclaim_iter of the target cgroup does not get invalidated
and so is left holding the reference to the last scanned cgroup. If the
target cgroup does not get scanned again (we might have just reclaimed
the last page or all processes might exit and free their memory
voluntary), we will leak it, because there is nobody to put the
reference held by the iterator.
The problem is easy to reproduce by running the following command
sequence in a loop:
mkdir /sys/fs/cgroup/memory/test
echo 100M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
echo $$ > /sys/fs/cgroup/memory/test/cgroup.procs
memhog 150M
echo $$ > /sys/fs/cgroup/memory/cgroup.procs
rmdir test
The cgroups generated by it will never get freed.
This patch fixes this issue by making mem_cgroup_iter avoid taking
reference to the current position. In order not to hit use-after-free
bug while running reclaim in parallel with cgroup deletion, we make use
of ->css_released cgroup callback to clear references to the dying
cgroup in all reclaim iterators that might refer to it. This callback
is called right before scheduling rcu work which will free css, so if we
access iter->position from rcu read section, we might be sure it won't
go away under us.
[hannes@cmpxchg.org: clean up css ref handling] Fixes: 5ac8fb31ad2e ("mm: memcontrol: convert reclaim iterator to simple css refcounting") Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> [3.19+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joseph Qi [Tue, 29 Dec 2015 22:54:06 +0000 (14:54 -0800)]
ocfs2: fix BUG when calculate new backup super
When resizing, it firstly extends the last gd. Once it should backup
super in the gd, it calculates new backup super and update the
corresponding value.
But it currently doesn't consider the situation that the backup super is
already done. And in this case, it still sets the bit in gd bitmap and
then decrease from bg_free_bits_count, which leads to a corrupted gd and
trigger the BUG in ocfs2_block_group_set_bits:
Guenter Roeck [Thu, 24 Dec 2015 05:04:31 +0000 (21:04 -0800)]
MIPS: VDSO: Fix build error with binutils 2.24 and earlier
Commit 2a037f310bab ("MIPS: VDSO: Fix build error") tries to fix a build
error seen with binutils 2.24 and earlier. However, the fix does not work,
and again results in the already known build errors if the kernel is built
with an earlier version of binutils.
Joe Stringer [Wed, 23 Dec 2015 22:39:27 +0000 (14:39 -0800)]
openvswitch: Fix template leak in error cases.
Commit 5b48bb8506c5 ("openvswitch: Fix helper reference leak") fixed a
reference leak on helper objects, but inadvertently introduced a leak on
the ct template.
Previously, ct_info.ct->general.use was initialized to 0 by
nf_ct_tmpl_alloc() and only incremented when ovs_ct_copy_action()
returned successful. If an error occurred while adding the helper or
adding the action to the actions buffer, the __ovs_ct_free_action()
cleanup would use nf_ct_put() to free the entry; However, this relies on
atomic_dec_and_test(ct_info.ct->general.use). This reference must be
incremented first, or nf_ct_put() will never free it.
Fix the issue by acquiring a reference to the template immediately after
allocation.
Fixes: cae3a2627520 ("openvswitch: Allow attaching helpers to ct action") Fixes: 5b48bb8506c5 ("openvswitch: Fix helper reference leak") Signed-off-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Matias Bjørling [Tue, 29 Dec 2015 13:37:56 +0000 (14:37 +0100)]
lightnvm: wrong offset in bad blk lun calculation
dev->nr_luns reports the total number of luns available in a device
while dev->luns_per_chnl is the number of luns per channel.
When multiple channels are available, the offset is calculated from a
channel and lun id into a linear array. As it multiplies with
the total number of luns, we go out of bound when channel id > 0 and
causes the kernel to panic when we read a protected kernel memory area.
Linus Torvalds [Tue, 29 Dec 2015 00:45:14 +0000 (16:45 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"Three late 4.4-rc fixes.
The first two were very small in terms of number of lines, the third
is more lines of change than I like this late in the cycle, but there
are positive test results from Avagotech and from my own test setup
with the target hardware, and given the problem was a 100% failure
case, I sent it through.
- A previous patch updated the mlx4 driver to use vmalloc when there
was not enough memory to get a contiguous region large enough for
our needs, so we need kvfree() whenever we free that item. We
missed one place, so fix that now.
- A previous patch added code to match incoming packets against a
specific device, but failed to compensate for devices that have
both InfiniBand and Ethernet ports. Fix that.
- Under certain vlan conditions, the ocrdma driver would fail to
bring up any vlan interfaces and would print out a circular locking
failure. Fix that"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
RDMA/be2net: Remove open and close entry points
RDMA/ocrdma: Depend on async link events from CNA
RDMA/ocrdma: Dispatch only port event when port state changes
RDMA/ocrdma: Fix vlan-id assignment in qp parameters
IB/mlx4: Replace kfree with kvfree in mlx4_ib_destroy_srq
IB/cma: cma_match_net_dev needs to take into account port_num
Jens Axboe [Mon, 28 Dec 2015 20:02:47 +0000 (13:02 -0700)]
null_blk: use async queue restart helper
If null_blk is run in NULL_IRQ_TIMER mode and with queue_mode NULL_Q_RQ,
we need to restart the queue from the hrtimer interrupt. We can't
directly invoke the request_fn from that context, so punt the queue run
to async kblockd context.
Tested-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Jens Axboe <axboe@fb.com>
Jens Axboe [Mon, 28 Dec 2015 20:01:22 +0000 (13:01 -0700)]
block: add blk_start_queue_async()
We currently only have an inline/sync helper to restart a stopped
queue. If drivers need an async version, they have to roll their
own. Add a generic helper instead.
Linus Torvalds [Mon, 28 Dec 2015 18:44:41 +0000 (10:44 -0800)]
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"This fixes a bug in the algif_skcipher interface that can trigger a
kernel WARN_ON from user-space. It does so by using the new skcipher
interface which unlike the previous ablkcipher does not need to create
extra geniv objects which is what was used to trigger the WARN_ON"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: algif_skcipher - Use new skcipher interface
Devesh Sharma [Thu, 24 Dec 2015 18:14:08 +0000 (13:14 -0500)]
RDMA/be2net: Remove open and close entry points
Recently Dough Ledford reported a deadlock happening
between ocrdma-load sequence and NetworkManager service
issueing "open" on be2net interface.
The deadlock happens when any be2net hook (e.g. open/close) is called
in parallel to insmod ocrdma.ko.
A. be2net is sending administrative open/close event to ocrdma holding
device_list_mutex. It does this from ndo_open/ndo_stop hooks of be2net.
So sequence of locks is rtnl_lock---> device_list lock
B. When new ocrdma roce device gets registered, infiniband stack now
takes rtnl_lock in ib_register_device() in GID initialization routines.
So sequence of locks in this path is device_list lock ---> rtnl_lock.
This improper locking sequence causes deadlock.
In order to resolve the above deadlock condition, ocrdma intorduced a
patch to stop listening to administrative open/close events generated from
be2net driver. It now depends on link-state-change async-event generated from
CNA. This change leaves behind dead code which used to generate administrative
open/close events. This patch cleans-up all that dead code from be2net.
Devesh Sharma [Thu, 24 Dec 2015 18:14:07 +0000 (13:14 -0500)]
RDMA/ocrdma: Depend on async link events from CNA
Recently Dough Ledford reported a deadlock happening
between ocrdma-load sequence and NetworkManager service
issuing "open" on be2net interface.
The deadlock happens when any be2net hook (e.g. open/close) is called
in parallel to insmod ocrdma.ko.
A. be2net is sending administrative open/close event to ocrdma holding
device_list_mutex. It does this from ndo_open/ndo_stop hooks of be2net.
So sequence of locks is rtnl_lock---> device_list lock
B. When new ocrdma roce device gets registered, infiniband stack now
takes rtnl_lock in ib_register_device() in GID initialization routines.
So sequence of locks in this path is device_list lock ---> rtnl_lock.
This improper locking sequence causes deadlock.
With this patch we stop using administrative open and close events
injected by be2net driver. These events were used to dispatch PORT_ACTIVE
and PORT_ERROR events to the IB-stack. This patch implements a logic
to receive async-link-events generated from CNA whenever link-state-change
is detected. Now on, these async-events will be used to dispatch
PORT_ACTIVE and PORT_ERROR events to IB-stack.
Depending on async-events from CNA removes the need to hold device-list-mutex
and thus breaks the busy-wait scenario.
Devesh Sharma [Thu, 24 Dec 2015 18:14:06 +0000 (13:14 -0500)]
RDMA/ocrdma: Dispatch only port event when port state changes
Dispatch only port event to IB stack when port state changes.
Don't explicitly modify qps to error. Let application listen to
port events on async event queue or let QP fail with retry-exceeded
completion error.
Devesh Sharma [Thu, 24 Dec 2015 18:14:05 +0000 (13:14 -0500)]
RDMA/ocrdma: Fix vlan-id assignment in qp parameters
vlan-id is wrongly getting as 0 when PFC is enabled.
Set vlan-id configured by user in QP parameters.
In case vlan interface is not used, flash a warning to
user to configure vlan and assign vlan-id as 0 in qp params.
Fixes: dbf727de7440 ('IB/core: Use GID table in AH creation and dmac resolution') Cc: Matan Barak <matanb@mellanox.com> Signed-off-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Accepted or peeled off sockets were missing a security label (e.g.
SELinux) which means that socket was in "unlabeled" state.
This patch clones the sock's label from the parent sock and resolves the
issue (similar to AF_BLUETOOTH protocol family).
Cc: Paul Moore <pmoore@redhat.com> Cc: David Teigland <teigland@redhat.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Commit cacc06215271 ("sctp: use GFP_USER for user-controlled kmalloc")
missed two other spots.
For connectx, as it's more likely to be used by kernel users of the API,
it detects if GFP_USER should be used or not.
Fixes: cacc06215271 ("sctp: use GFP_USER for user-controlled kmalloc") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 28 Dec 2015 02:12:21 +0000 (18:12 -0800)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
- Fix bitrot in __get_user_unaligned()
- EVA userspace accessor bug fixes.
- Fix for build issues with certain toolchains.
- Fix build error for VDSO with particular toolchain versions.
- Fix build error due to a variable that should have been removed by an
earlier patch
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Fix bitrot in __get_user_unaligned()
MIPS: Fix build error due to unused variables.
MIPS: VDSO: Fix build error
MIPS: CPS: drop .set mips64r2 directives
MIPS: uaccess: Take EVA into account in [__]clear_user
MIPS: uaccess: Take EVA into account in __copy_from_user()
MIPS: uaccess: Fix strlen_user with EVA
Linus Torvalds [Mon, 28 Dec 2015 02:06:31 +0000 (18:06 -0800)]
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A smallish set of fixes that we've been sitting on for a while now,
flushing the queue here so they go in. Summary:
A handful of fixes for OMAP, i.MX, Allwinner and Tegra:
- A clock rate and a PHY setup fix for i.MX6Q/DL
- A couple of fixes for the reduced serial bus (sunxi-rsb) on
Allwinner
- UART wakeirq fix for an OMAP4 board, timer config fixes for AM43XX.
- Suspend fix for Tegra124 Chromebooks
- Fix for missing implicit include that's different between
ARM/ARM64"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: tegra: Fix suspend hang on Tegra124 Chromebooks
bus: sunxi-rsb: Fix peripheral IC mapping runtime address
bus: sunxi-rsb: Fix primary PMIC mapping hardware address
ARM: dts: Fix UART wakeirq for omap4 duovero parlor
ARM: OMAP2+: AM43xx: select ARM TWD timer
ARM: OMAP2+: am43xx: enable GENERIC_CLOCKEVENTS_BROADCAST
fsl-ifc: add missing include on ARM64
ARM: dts: imx6: Fix Ethernet PHY mode on Ventana boards
ARM: dts: imx: Fix the assigned-clock mismatch issue on imx6q/dl
bus: sunxi-rsb: unlock on error in sunxi_rsb_read()
ARM: dts: sunxi: sun6i-a31s-primo81.dts: add touchscreen axis swapping property
Linus Torvalds [Sun, 27 Dec 2015 04:08:47 +0000 (20:08 -0800)]
Merge tag 'pm+acpi-4.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"These fix an ACPI processor driver regression introduced during the
4.3 cycle and a mistake in the recently added SCPI support in the
arm_big_little cpufreq driver.
Specifics:
- Fix a thermal management issue introduced by an ACPI processor
driver change made during the 4.3 development cycle that failed to
return 0 from a function on success which triggered an error
cleanup path every time it had been called that deleted useful data
structures created previously (Srinivas Pandruvada).
- Fix a variable data type issue in the arm_big_little cpufreq
driver's SCPI support code added recently that prevents error
handling in there from working correctly (Dan Carpenter)"
* tag 'pm+acpi-4.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: scpi-cpufreq: signedness bug in scpi_get_dvfs_info()
ACPI / processor: Fix thermal cooling device regression
Linus Torvalds [Sun, 27 Dec 2015 03:55:16 +0000 (19:55 -0800)]
Merge tag 'upstream-4.4-rc7' of git://git.infradead.org/linux-ubifs
Pull UBI bug fixes from Richard Weinberger:
"This contains four bug fixes for UBI"
* tag 'upstream-4.4-rc7' of git://git.infradead.org/linux-ubifs:
mtd: ubi: don't leak e if schedule_erase() fails
mtd: ubi: fixup error correction in do_sync_erase()
UBI: fix use of "VID" vs. "EC" in header self-check
UBI: fix return error code
Linus Torvalds [Sun, 27 Dec 2015 03:48:09 +0000 (19:48 -0800)]
Merge tag 'trace-v4.4-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull ftrace/recordmcount fix from Steven Rostedt:
"Russell King was reporting lots of warnings when he compiled his
kernel with ftrace enabled. With some investigation it was discovered
that it was his compile setup. He was using ccache with hard links,
which allowed recordmcount to process the same .o twice. When this
happens, recordmcount will detect that it was already done and give a
warning about it.
Russell fixed this by having recordmcount detect that the object file
has more than one hard link, and if it does, it unlinks the object
file after it maps it and processes then. This appears to fix the
issue.
As you did not like the fact that recordmcount modified the file in
place and thought that it should do the modifications in memory and
then write it out to disk and move it over the old file to prevent
other more subtle issues like the one above, a second patch is added
on top of Russell's to do just that. Luckily the original code had
write and lseek wrappers that I was able to modify to not do inplace
writes, but simply keep track of the changes made in memory. When a
write is made, a "update" flag is set, and at the end of processing,
if the update is set, then it writes the file with changes out to a
new file, and then renames it over the original one.
The file descriptor is still passed to the write and lseek wrappers
because removing that would cause the change to be more intrusive.
That can be removed in a follow up cleanup patch that can wait till
the next merge window"
* tag 'trace-v4.4-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace/scripts: Have recordmcount copy the object file
scripts: recordmcount: break hardlinks