Shahed Shaikh [Wed, 4 Feb 2015 10:41:25 +0000 (05:41 -0500)]
qlcnic: Fix NAPI poll routine for Tx completion
After d75b1ade567f ("net: less interrupt masking in NAPI")
driver's NAPI poll routine is expected to return
exact budget value if it wants to be re-called.
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Fixes: d75b1ade567f ("net: less interrupt masking in NAPI") Signed-off-by: David S. Miller <davem@davemloft.net>
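The return-value rule described above can be modelled outside the kernel. The following is a minimal user-space sketch, not the qlcnic code: `model_poll`, `struct poll_result` and its parameters are hypothetical names. It assumes that unfinished Tx completion work is reported by returning the full budget, and that polling ends only when less than the budget was consumed.

```c
#include <stdbool.h>

/* Hypothetical result of one poll pass (illustration only). */
struct poll_result {
    int  work_done;       /* value handed back to the NAPI core */
    bool napi_completed;  /* true if polling would be ended      */
};

/*
 * Model of a poll routine that handles Tx completions and Rx packets.
 * If Tx cleanup did not finish, report exactly `budget` so that the
 * core calls the routine again.
 */
struct poll_result model_poll(bool tx_complete, int rx_done, int budget)
{
    struct poll_result r;

    r.work_done = tx_complete ? rx_done : budget;
    r.napi_completed = r.work_done < budget;
    return r;
}
```

With a budget of 64, an unfinished Tx pass reports 64 and keeps polling, while a finished pass with 3 Rx packets reports 3 and ends polling.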
John Stultz [Thu, 5 Feb 2015 00:45:26 +0000 (16:45 -0800)]
hrtimer: Fix incorrect tai offset calculation for non high-res timer systems
I noticed some CLOCK_TAI timer test failures on one of my
less-frequently used configurations. And after digging in I
found in 76f4108892d9 (Cleanup hrtimer accessors to the
timekepeing state), the hrtimer_get_softirq_time tai offset
calculation was incorrectly rewritten, as the tai offset we
return should be from CLOCK_MONOTONIC, and not CLOCK_REALTIME.
This results in CLOCK_TAI timers expiring early on non-highres
capable machines.
This patch fixes the issue, calculating the tai time properly
from the monotonic base.
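A small arithmetic model may make the effect clearer. This is a sketch only: `ns_t`, `struct clock_offsets` and both functions are hypothetical, and it assumes that both offsets are expressed relative to CLOCK_MONOTONIC, as the description above implies.

```c
#include <stdint.h>

typedef int64_t ns_t;  /* nanoseconds, stand-in for ktime_t */

/* Assumption: both offsets are relative to CLOCK_MONOTONIC. */
struct clock_offsets {
    ns_t offs_real;  /* monotonic -> realtime */
    ns_t offs_tai;   /* monotonic -> TAI      */
};

/* Intended calculation: derive TAI from the monotonic base. */
ns_t tai_from_monotonic(ns_t mono, const struct clock_offsets *o)
{
    return mono + o->offs_tai;
}

/* Faulty variant: applies the monotonic-relative offset to realtime. */
ns_t tai_from_realtime_buggy(ns_t mono, const struct clock_offsets *o)
{
    ns_t real = mono + o->offs_real;

    return real + o->offs_tai;
}
```

In this model the faulty variant runs ahead of the correct value by `offs_real`, which matches the symptom of timers expiring early.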
amd-xgbe: Set RSS enablement based on hardware features
The RSS support requires enablement based on the features reported by
the hardware. The setting of this flag is missing. Add support to
set the RSS enablement flag based on the reported hardware features.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ignacy Gawędzki [Tue, 3 Feb 2015 18:05:18 +0000 (19:05 +0100)]
cls_api.c: Fix dumping of non-existing actions' stats.
In tcf_exts_dump_stats(), ensure that exts->actions is not empty before
accessing the first element of that list and calling tcf_action_copy_stats()
on it. This fixes some random segvs when adding filters of type "basic" with
no particular action.
This also fixes the dumping of those "no-action" filters, which more often
than not made calls to tcf_action_copy_stats() fail and consequently netlink
attributes added by the caller to be removed by a call to nla_nest_cancel().
Fixes: 33be62715991 ("net_sched: act: use standard struct list_head") Signed-off-by: Ignacy Gawędzki <ignacy.gawedzki@green-communications.fr> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
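The empty-list check can be illustrated with a self-contained sketch. The list type below is a reduced stand-in for struct list_head, and `struct action` and `dump_first_action_stats` are hypothetical names, not the cls_api.c code.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on struct list_head. */
struct list_node {
    struct list_node *next, *prev;
};

void list_init(struct list_node *head)
{
    head->next = head->prev = head;
}

int list_is_empty(const struct list_node *head)
{
    return head->next == head;
}

void list_add_tail_node(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Hypothetical action; the list node is its first member. */
struct action {
    struct list_node node;
    int stats;
};

/* Copies the first action's stats; an empty list copies nothing. */
int dump_first_action_stats(const struct list_node *actions, int *out)
{
    if (list_is_empty(actions))
        return 0;  /* no action attached: nothing to dump */

    /* node is the first member, so this cast is valid */
    *out = ((const struct action *)actions->next)->stats;
    return 0;
}
```

Without the check, `actions->next` on an empty list points back at the list head itself, so the "first element" would be read from memory that is not an action at all.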
Configuring fq with quantum 0 hangs the system, presumably because of a
non-interruptible infinite loop. Either way quantum 0 does not make sense.
Reproduce with:
sudo tc qdisc add dev lo root fq quantum 0 initial_quantum 0
ping 127.0.0.1
Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
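A validation step of roughly the following shape rejects the meaningless value before it reaches the scheduler. This is a sketch under assumptions: `struct fq_config` and `fq_set_quantum` are hypothetical names, and returning -EINVAL is assumed rather than taken from the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical subset of the qdisc configuration. */
struct fq_config {
    uint32_t quantum;
};

/* Reject a zero quantum and leave the previous value untouched. */
int fq_set_quantum(struct fq_config *cfg, uint32_t quantum)
{
    if (quantum == 0)
        return -EINVAL;

    cfg->quantum = quantum;
    return 0;
}
```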
Takashi Iwai [Tue, 3 Feb 2015 16:51:23 +0000 (17:51 +0100)]
drm/cirrus: Limit modes depending on bpp option
The commit [8975626ea35a: drm/cirrus: allow 32bpp framebuffers for
cirrus drm] broke X modesetting driver because cirrus driver still
provides the full list of modes up to 1280x1024 while the 32bpp can
support only up to 800x600.
We might be able to filter out the invalid modes in mode_valid
callback, but unfortunately the bpp in question can't be referred
there for now (let me know if there is a better way to retrieve the
bpp for the probed fb).
So, instead, this patch adds the bpp module option to specify the
maximal bpp explicitly and limits the resolutions in get_modes
depending on its value.
The default value is set to 24 so that the existing stuff keeps
working. If you need a new 32bpp feature, specify cirrus.bpp=32
option explicitly.
Fixes: 8975626ea35a ('drm/cirrus: allow 32bpp framebuffers for cirrus drm') Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Dave Airlie <airlied@redhat.com>
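The limit described above amounts to a small lookup. The sketch below is illustrative only: `struct mode_limit` and `cirrus_mode_limit` are hypothetical names, and the `bpp <= 24` threshold is an assumption chosen so that the default of 24 keeps the existing 1280x1024 list.

```c
/* Largest mode offered for a given maximal bpp. */
struct mode_limit {
    int max_width;
    int max_height;
};

/* bpp: value of the module option (default 24 per the message). */
struct mode_limit cirrus_mode_limit(int bpp)
{
    struct mode_limit lim;

    if (bpp <= 24) {
        lim.max_width = 1280;   /* existing behaviour */
        lim.max_height = 1024;
    } else {
        lim.max_width = 800;    /* 32bpp supports only up to 800x600 */
        lim.max_height = 600;
    }
    return lim;
}
```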
In virtio 1.0 mode, when mergeable buffers are enabled on a big-endian
host, num_buffers wasn't byte-swapped correctly, so large incoming
packets got corrupted.
To fix, fill it in within hdr - this also makes sure it gets
the correct type.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
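Virtio 1.0 header fields are little-endian, so a big-endian host has to convert them when writing. The helpers below are a hypothetical, host-independent sketch of that conversion and are not the driver code.

```c
#include <stdint.h>

/* Store a 16-bit value in little-endian byte order on any host. */
void store_le16(uint8_t out[2], uint16_t v)
{
    out[0] = (uint8_t)(v & 0xff);  /* low byte first */
    out[1] = (uint8_t)(v >> 8);
}

/* Read back a little-endian 16-bit value on any host. */
uint16_t load_le16(const uint8_t in[2])
{
    return (uint16_t)(in[0] | (in[1] << 8));
}
```

A `num_buffers` value of 3 is stored as the bytes 0x03 0x00; a host that wrote its native big-endian layout instead would produce 0x00 0x03, which the other side would read as 768.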
Chen Gang [Mon, 2 Feb 2015 21:00:40 +0000 (05:00 +0800)]
net: usb: sr9700: Use 'SR_' prefix for the common register macros
The common register macros (e.g. RSR) are too generic for a driver; they may
conflict with architecture definitions (e.g. xtensa, sh).
The related warnings (with allmodconfig under xtensa):
CC [M] drivers/net/usb/sr9700.o
In file included from drivers/net/usb/sr9700.c:24:0:
drivers/net/usb/sr9700.h:65:0: warning: "RSR" redefined
#define RSR 0x06
^
In file included from ./arch/xtensa/include/asm/bitops.h:22:0,
from include/linux/bitops.h:36,
from include/linux/kernel.h:10,
from include/linux/list.h:8,
from include/linux/module.h:9,
from drivers/net/usb/sr9700.c:13:
./arch/xtensa/include/asm/processor.h:190:0: note: this is the location of the previous definition
#define RSR(v,sr) __asm__ __volatile__ ("rsr %0,"__stringify(sr) : "=a"(v));
^
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
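The effect of the prefix can be shown in isolation. In this sketch the function-like `RSR()` macro is a simplified stand-in for the architecture definition quoted in the warning, and `read_reg_offset` is a hypothetical helper; only the value 0x06 comes from the text above.

```c
/* Stand-in for an architecture header that already defines RSR(). */
#define RSR(v, sr) ((v) = (sr))

/* Driver register offset with a driver-specific prefix: no clash. */
#define SR_RSR 0x06

int read_reg_offset(void)
{
    int v;

    RSR(v, SR_RSR);  /* both macros coexist in one translation unit */
    return v;
}
```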
The return type of wait_for_completion_timeout is unsigned long, not int, and
the function always returns >= 0. This patch adds a suitable return variable
and simplifies the return value checking, as there is no < 0 case.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Mark Brown <broonie@kernel.org>
The return type of wait_for_completion_timeout is unsigned long, not int. This
patch uses the return value of wait_for_completion_timeout in the condition
directly rather than assigning it to an incorrect type variable.
Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at> Signed-off-by: Mark Brown <broonie@kernel.org>
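The pattern behind these wait_for_completion_timeout() cleanups can be sketched in user space. The stub below only imitates the documented contract (0 on timeout, otherwise the remaining time); `stub_wait_for_completion_timeout` and `wait_for_transfer` are hypothetical names.

```c
#include <errno.h>

/*
 * Stub standing in for wait_for_completion_timeout(): returns the
 * remaining time, or 0 if the wait timed out. It is never negative.
 */
unsigned long stub_wait_for_completion_timeout(unsigned long remaining)
{
    return remaining;
}

int wait_for_transfer(unsigned long remaining)
{
    /* Matching type: no truncation, and no dead "< 0" branch. */
    unsigned long time_left = stub_wait_for_completion_timeout(remaining);

    if (time_left == 0)
        return -ETIMEDOUT;
    return 0;
}
```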
Guenter Roeck [Tue, 3 Feb 2015 18:01:19 +0000 (10:01 -0800)]
regmap: Fix i2c word access when using SMBus access functions
SMBus access functions assume that 16-bit values are formatted as
little endian numbers. The direct i2c access functions in regmap,
however, assume that 16-bit values are formatted as big endian numbers.
As a result, the current code returns different values if an i2c chip's
16-bit registers are accessed through i2c access functions vs. SMBus
access functions.
Use regmap_smbus_read_word_swapped and regmap_smbus_write_word_swapped
for 16-bit SMBus accesses if a chip is configured as REGMAP_ENDIAN_BIG.
If the chip is configured as REGMAP_ENDIAN_LITTLE, keep using
regmap_smbus_write_word_data and regmap_smbus_read_word_data. Otherwise
reject registration if the controller does not support direct i2c accesses.
Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Mark Brown <broonie@kernel.org>
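The selection logic reduces to a conditional byte swap. The sketch below uses hypothetical names (`enum reg_endian`, `swab16_model`, `smbus_word_to_reg`) and only models the value conversion, not the regmap bus registration.

```c
#include <stdint.h>

enum reg_endian { REG_ENDIAN_LITTLE, REG_ENDIAN_BIG };

/* Swap the two bytes of a 16-bit value. */
uint16_t swab16_model(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * smbus_word: value as delivered by an SMBus word read, which treats
 * the data as little endian. Big-endian chips need the bytes swapped.
 */
uint16_t smbus_word_to_reg(uint16_t smbus_word, enum reg_endian e)
{
    return e == REG_ENDIAN_BIG ? swab16_model(smbus_word) : smbus_word;
}
```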
Linus Torvalds [Wed, 4 Feb 2015 18:22:08 +0000 (10:22 -0800)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Three small cifs fixes. One fixes a hang under stress, and the other
two are security related"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix MUST SecurityFlags filtering
Complete oplock break jobs before closing file handle
cifs: use memzero_explicit to clear stack buffer
Linus Torvalds [Wed, 4 Feb 2015 17:42:55 +0000 (09:42 -0800)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"A number of ARM fixes, the biggest is fixing a regression caused by
appended DT blobs exceeding 64K, causing the decompressor fixup code
to fail to patch the DT blob. Another important fix is for the ASID
allocator from Will Deacon which prevents some rare crashes seen on
some systems. Lastly, there's a build fix for v7M systems when printk
support is disabled.
The last two remaining fixes are more cosmetic - the IOMMU one
prevents an annoying harmless warning message, and we disable the
kernel strict memory permissions on non-MMU which can't support it
anyway"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8299/1: mm: ensure local active ASID is marked as allocated on rollover
ARM: 8298/1: ARM_KERNMEM_PERMS only works with MMU enabled
ARM: 8295/1: fix v7M build for !CONFIG_PRINTK
ARM: 8294/1: ATAG_DTB_COMPAT: remove the DT workspace's hardcoded 64KB size
ARM: 8288/1: dma-mapping: don't detach devices without an IOMMU during teardown
The return type of wait_for_completion_timeout is unsigned long, not int. This
patch adds an appropriate variable and fixes up the assignment. It removes
the else branch as the only thing it was doing is assigning ret = 0; - but
ret is never used thereafter so that is not needed. As the string in
dev_err already states "timeout" there is little point in printing the 0.
A typo in "trasfer" -> transfer is also fixed.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Mark Brown <broonie@kernel.org>
Add missing stubs for regulator_suspend_prepare() and
regulator_suspend_finish() to fix exynos_defconfig build without
REGULATOR:
arch/arm/mach-exynos/built-in.o: In function `exynos_suspend_finish':
arch/arm/mach-exynos/suspend.c:537: undefined reference to `regulator_suspend_finish'
arch/arm/mach-exynos/built-in.o: In function `exynos_suspend_prepare':
arch/arm/mach-exynos/suspend.c:520: undefined reference to `regulator_suspend_prepare'
make: *** [vmlinux] Error 1
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Reported-by: Joerg Roedel <joro@8bytes.org> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Mark Brown <broonie@kernel.org>
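The usual shape of such stubs is a pair of static inline functions in the header's disabled branch. The sketch below is an assumption about that shape rather than the actual header: `suspend_state_t` is replaced by a plain typedef so that the fragment compiles on its own.

```c
typedef int suspend_state_t;  /* stand-in for the kernel type */

#ifdef CONFIG_REGULATOR
int regulator_suspend_prepare(suspend_state_t state);
int regulator_suspend_finish(void);
#else
/* No regulator framework: callers still link, and the calls succeed. */
static inline int regulator_suspend_prepare(suspend_state_t state)
{
    (void)state;
    return 0;
}

static inline int regulator_suspend_finish(void)
{
    return 0;
}
#endif
```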
Mark Rutland [Wed, 7 Jan 2015 15:01:54 +0000 (15:01 +0000)]
perf: Decouple unthrottling and rotating
Currently the adjustments made as part of perf_event_task_tick() use the
percpu rotation lists to iterate over any active PMU contexts, but these
are not used by the context rotation code, having been replaced by
separate (per-context) hrtimer callbacks. However, some manipulation of
the rotation lists (i.e. removal of contexts) has remained in
perf_rotate_context(). This leads to the following issues:
* Contexts are not always removed from the rotation lists. Removal of
PMUs which have been placed in rotation lists, but have not been
removed by a hrtimer callback can result in corruption of the rotation
lists (when memory backing the context is freed).
This has been observed to result in hangs when PMU drivers built as
modules are inserted and removed around the creation of events for
said PMUs.
* Contexts which do not require rotation may be removed from the
rotation lists as a result of a hrtimer, and will not be considered by
the unthrottling code in perf_event_task_tick.
This patch fixes the issue by updating the rotation list when events are
scheduled in/out, ensuring that each rotation list stays in sync with
the HW state. As each event holds a refcount on the module of its PMU,
this ensures that when a PMU module is unloaded none of its CPU contexts
can be in a rotation list. By maintaining a list of perf_event_contexts
rather than perf_event_cpu_contexts, we don't need separate paths to
handle the cpu and task contexts, which also makes the code a little
simpler.
As the rotation_list variables are not used for rotation, these are
renamed to active_ctx_list, which better matches their current function.
perf_pmu_rotate_{start,stop} are renamed to
perf_pmu_ctx_{activate,deactivate}.
Reported-by: Johannes Jensen <johannes.jensen@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Will Deacon <Will.Deacon@arm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20150129134511.GR17721@leverpostej Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mark Rutland [Wed, 7 Jan 2015 14:56:51 +0000 (14:56 +0000)]
perf: Drop module reference on event init failure
When initialising an event, perf_init_event will call try_module_get() to
ensure that the PMU's module cannot be removed for the lifetime of the
event, with __free_event() dropping the reference when the event is
finally destroyed. If something fails after the event has been
initialised, but before the event is installed, perf_event_alloc will
drop the reference on the module.
However, if we fail to initialise an event for some reason (e.g. we ask
an uncore PMU to perform sampling, and it refuses to initialise the
event), we do not drop the refcount. If we try to open such a bogus
event without a precise IDR type, we will loop over each PMU in the pmus
list, incrementing each of their refcounts without decrementing them.
This patch adds a module_put when pmu->event_init(event) fails, ensuring
that the refcounts are balanced in failure cases. As the innards of the
precise and search based initialisation look very similar, this logic is
hoisted out into a new helper function. While the early return for the
failed try_module_get is removed from the search case, this is handled
by the remaining return when ret is not -ENOENT.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1420642611-22667-1-git-send-email-mark.rutland@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
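The refcount balancing can be modelled with plain counters. Everything in this sketch is hypothetical (`struct pmu_model`, `try_init_event`, `init_event_search`); the increment and decrement stand in for try_module_get() and module_put().

```c
#include <errno.h>
#include <stddef.h>

struct pmu_model {
    int module_refs;          /* models the module refcount           */
    int (*event_init)(void);  /* 0 on success, negative errno on fail */
};

/* Shared helper: take a reference, drop it again if init fails. */
int try_init_event(struct pmu_model *pmu)
{
    int ret;

    pmu->module_refs++;       /* models try_module_get() */
    ret = pmu->event_init();
    if (ret)
        pmu->module_refs--;   /* models module_put() on failure */
    return ret;
}

/* Search case: stop at the first PMU that does not return -ENOENT. */
int init_event_search(struct pmu_model *pmus, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int ret = try_init_event(&pmus[i]);

        if (ret != -ENOENT)
            return ret;
    }
    return -ENOENT;
}

/* Example callbacks for the model. */
int init_refuses(void) { return -ENOENT; }
int init_ok(void) { return 0; }
```

After a search in which every PMU refuses the event, all counters are back at zero; only a PMU that accepted the event keeps its reference.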
Peter Zijlstra [Thu, 29 Jan 2015 13:44:34 +0000 (14:44 +0100)]
perf: Fix put_event() ctx lock
So what I suspect; but I'm in zombie mode today it seems; is that while
I initially thought that it was impossible for ctx to change when
refcount dropped to 0, I now suspect it's possible.
Note that until perf_remove_from_context() the event is still active and
visible on the lists. So a concurrent sys_perf_event_open() from another
task into this task can race.
I think the below should cure this; if we install a group leader it
will iterate the (still intact) group list and find its siblings and
try and install those too -- even though those still have the old
event->ctx -- in the new ctx.
Upon installing the first group sibling we'd try and schedule out the
group and trigger the above warn.
Fix this by installing the group leader last, installing siblings
would have no effect, they're not reachable through the group lists
and therefore we don't schedule them.
Also delay resetting the state until we're absolutely sure the events
are quiescent.
Reported-by: Jiri Olsa <jolsa@redhat.com> Reported-by: vincent.weaver@maine.edu Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20150126162639.GA21418@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
Turned off UFO support to virtio-net based devices due to issues
with IPv6 fragment id generation for UFO packets. The issue
was that IPv6 UFO/GSO implementation expects the fragment id
to be supplied in skb_shinfo(). However, for packets generated
by the VMs, the fragment id is not supplied which causes all
IPv6 fragments to have the id of 0.
The problem is that turning off UFO support on tap/macvtap
as well as virtio devices caused issues with migrations.
Migrations would fail when moving a vm from a kernel that supports UFO
and expects it to work to the newer kernels that disabled UFO.
This series provides a partial solution to address the migration
issue. The series allows us to track whether skb_shinfo()->ip6_frag_id
has been set by treating value of 0 as unset.
This lets GSO code generate fragment ids if they are necessary
(ex: packet was generated by VM or packet socket).
Since v3:
- Resolved build issue when IPv6 is a module.
- Removed trailing white space.
Since v2:
- Rebase and rebuild to make sure everything works. No changes
to the patches were done.
Since v1:
- Removed the skb bit and use value of 0 as tracker.
- Used Eric's suggestion to set fragment id as 0x80000000 if id
generation procedure yielded a 0 result.
- Consolidated ipv6 id generation code.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that GSO functionality can correctly track if the fragment
id has been selected and select a fragment id if necessary,
we can re-enable UFO on tap/macvtap and virtio devices.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Now that GSO layer can track if fragment id has been selected
and can allocate one if necessary, we don't need to do this in
tap and macvtap. This reverts most of the code and only keeps
the new ipv6 fragment id generation function that is still needed.
Fixes: 3d0ad09412ff (drivers/net: Disable UFO through virtio) Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Tue, 3 Feb 2015 21:36:15 +0000 (16:36 -0500)]
ipv6: Select fragment id during UFO segmentation if not set.
If the IPv6 fragment id has not been set and we perform
fragmentation due to UFO, select a new fragment id.
We now consider a fragment id of 0 as unset and if id selection
process returns 0 (after all the perturbations), we set it to
0x80000000, thus giving us ample space not to create collisions
with the next packet we may have to fragment.
When doing UFO integrity checking, we also select the
fragment id if it has not been set yet. This is stored into
the skb_shinfo() thus allowing UFO to function correctly.
This patch also removes duplicate fragment id generation code
and moves ipv6_select_ident() into the header as it may be
used during GSO.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
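The "0 means unset" convention can be written down in a few lines. The constant 0x80000000 comes from the message above; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define FRAG_ID_UNSET    0u
#define FRAG_ID_FALLBACK 0x80000000u

/* A generated id of 0 would look unset, so remap it. */
uint32_t finalize_frag_id(uint32_t generated)
{
    return generated != FRAG_ID_UNSET ? generated : FRAG_ID_FALLBACK;
}

/* Select a new id only when none has been set yet. */
uint32_t ensure_frag_id(uint32_t current, uint32_t generated)
{
    return current != FRAG_ID_UNSET ? current : finalize_frag_id(generated);
}
```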
Davidlohr Bueso [Mon, 2 Feb 2015 06:16:24 +0000 (22:16 -0800)]
locking/rtmutex: Optimize setting task running after being blocked
We explicitly mark the task running after returning from
a __rt_mutex_slowlock() call, which does the actual sleeping
via wait-wake-trylocking. As such, this patch does two things:
(1) refactors the code so that setting current to TASK_RUNNING
is done by __rt_mutex_slowlock(), and not by the callers. The
downside to this is that it becomes a bit unclear at what
point we block. As such I've added a comment that the task
blocks when calling __rt_mutex_slowlock() so readers can figure
out when it is running again.
(2) relaxes setting current's state through __set_current_state(),
instead of its more expensive barrier alternative. There was no
need for the implied barrier as we're obviously not planning on
blocking.
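As a rough user-space analogy only, the difference between the two helpers resembles a sequentially consistent store versus a relaxed store in C11 atomics. The names below are hypothetical and the mapping onto the kernel's barrier semantics is an approximation, not an equivalence.

```c
#include <stdatomic.h>

enum { MODEL_TASK_RUNNING = 0, MODEL_TASK_UNINTERRUPTIBLE = 2 };

/* Analogue of set_current_state(): store with full ordering. */
void model_set_state(atomic_long *state, long v)
{
    atomic_store_explicit(state, v, memory_order_seq_cst);
}

/* Analogue of __set_current_state(): store with no ordering implied. */
void model_set_state_relaxed(atomic_long *state, long v)
{
    atomic_store_explicit(state, v, memory_order_relaxed);
}
```

The ordered variant matters before a task checks its wakeup condition and sleeps; once the task already holds the lock and is about to continue running, the cheaper store is enough.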
Davidlohr Bueso [Mon, 26 Jan 2015 07:36:04 +0000 (23:36 -0800)]
locking/rwsem: Use task->state helpers
Call __set_task_state() instead of assigning the new state
directly. These interfaces also aid CONFIG_DEBUG_ATOMIC_SLEEP
environments, keeping track of who last changed the state.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Jason Low <jason.low2@hp.com> Cc: Michel Lespinasse <walken@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1422257769-14083-2-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
Davidlohr Bueso [Tue, 20 Jan 2015 01:39:21 +0000 (17:39 -0800)]
locking/mutex: Explicitly mark task as running after wakeup
By the time we wake up and get the lock after being asleep
in the slowpath, we better be running. As good practice,
be explicit about this and avoid any mischief.
Sharon Dvir [Sun, 1 Feb 2015 21:47:32 +0000 (23:47 +0200)]
sched/Documentation: Remove unneeded word
The second 'mutex' shouldn't be there, it can't be about the mutex,
as the mutex can't be freed, but unlocked, the memory where the
mutex resides however, can be freed.
__schedule() disables preemption during its job and re-enables it
afterward without doing a preemption check to avoid recursion.
But if an event happens after the context switch which requires
rescheduling, we need to check again if a task of a higher priority
needs the CPU. A preempt irq can raise such a situation. To handle that,
__schedule() loops on need_resched().
But preempt_schedule_*() functions, which call __schedule(), also loop
on need_resched() to handle missed preempt irqs. Hence we end up with
the same loop happening twice.
Let's simplify that by attributing the need_resched() loop responsibility
to all __schedule() callers.
There is a risk that the outer loop now handles reschedules that used
to be handled by the inner loop with the added overhead of caller details
(inc/dec of PREEMPT_ACTIVE, irq save/restore) but assuming those inner
rescheduling loops weren't too frequent, this shouldn't matter. Especially
since the whole preemption path is now losing one loop in any case.
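The resulting control flow, with a single loop owned by the caller, can be modelled as follows. `struct cpu_model` and both functions are hypothetical; a "pending preempt irq" here simply sets the reschedule flag again after a pass.

```c
#include <stdbool.h>

struct cpu_model {
    bool need_resched;
    int  pending_preempt_irqs;  /* irqs that fire after a context switch */
    int  switches;              /* number of scheduler passes            */
};

/* Inner scheduler: one pass, no loop of its own. */
void model_schedule(struct cpu_model *c)
{
    c->need_resched = false;
    c->switches++;
    if (c->pending_preempt_irqs > 0) {
        c->pending_preempt_irqs--;
        c->need_resched = true;  /* a preempt irq asked for another pass */
    }
}

/* Caller: owns the only need_resched loop. */
void model_preempt_schedule(struct cpu_model *c)
{
    do {
        model_schedule(c);
    } while (c->need_resched);
}
```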
Xunlei Pang [Mon, 19 Jan 2015 04:49:37 +0000 (04:49 +0000)]
sched/deadline: Remove cpu_active_mask from cpudl_find()
cpu_active_mask is rarely changed (only on hotplug), so remove this
operation to gain a little performance.
If there is a change in cpu_active_mask, rq_online_dl() and
rq_offline_dl() should take care of it normally, so cpudl::free_cpus
carries enough information for us.
For the rare case when a task is put onto a dying cpu (which
rq_offline_dl() can't handle in a timely fashion), it will be
handled through _cpu_down()->...->multi_cpu_stop()->migration_call()
->migrate_tasks(), preventing the task from hanging on the
dead cpu.
Cc: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
[peterz: changelog] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/1421642980-10045-2-git-send-email-pang.xunlei@linaro.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Wanpeng Li [Wed, 26 Nov 2014 00:44:06 +0000 (08:44 +0800)]
sched: Fix hrtick_start() on UP
The commit 177ef2a6315e ("sched/deadline: Fix a precision problem in
the microseconds range") forgot to change the UP version of
hrtick_start(), do so now.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Fixes: 177ef2a6315e ("sched/deadline: Fix a precision problem in the microseconds range")
[ Fixed the changelog. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Kirill Tkhai <ktkhai@parallels.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1416962647-76792-7-git-send-email-wanpeng.li@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Wanpeng Li [Wed, 26 Nov 2014 00:44:04 +0000 (08:44 +0800)]
sched/deadline: Avoid pointless __setscheduler()
There is no need to dequeue/enqueue and push/pull if there are
no scheduling parameters changed for the DL class.
Both fair and RT classes already check if parameters changed for
them to avoid unnecessary overhead. This patch adds the parameters
changed test for the DL class in order to reduce overhead.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
[ Fixed up the changelog. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Kirill Tkhai <ktkhai@parallels.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1416962647-76792-5-git-send-email-wanpeng.li@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
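A parameters-changed test for the DL class could look roughly like this. The struct and function are hypothetical, and the set of compared fields (runtime, deadline, period, flags) is an assumption about which parameters matter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of deadline scheduling parameters. */
struct dl_params {
    uint64_t runtime;
    uint64_t deadline;
    uint64_t period;
    uint64_t flags;
};

/* True if any parameter differs, i.e. a dequeue/enqueue is needed. */
bool dl_params_changed(const struct dl_params *cur,
                       const struct dl_params *req)
{
    return cur->runtime != req->runtime ||
           cur->deadline != req->deadline ||
           cur->period != req->period ||
           cur->flags != req->flags;
}
```

A caller would skip the dequeue/enqueue and push/pull work entirely when this returns false.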
Peter Zijlstra [Wed, 26 Nov 2014 00:44:03 +0000 (08:44 +0800)]
sched/deadline: Fix stale yield state
When we fail to start the deadline timer in update_curr_dl(), we
forget to clear ->dl_yielded, resulting in wrecked time keeping.
Since the natural place to clear both ->dl_yielded and ->dl_throttled
is in replenish_dl_entity(); both are after all waiting for that event;
make it so.
Luckily since 67dfa1b756f2 ("sched/deadline: Implement
cancel_dl_timer() to use in switched_from_dl()") the
task_on_rq_queued() condition in dl_task_timer() must be true, and can
therefore call enqueue_task_dl() unconditionally.
Reported-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Kirill Tkhai <ktkhai@parallels.com> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1416962647-76792-4-git-send-email-wanpeng.li@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 67dfa1b756f2 ("sched/deadline: Implement cancel_dl_timer() to
use in switched_from_dl()") removed the hrtimer_try_cancel() function
call out from init_dl_task_timer(), which gets called from
__setparam_dl().
The result is that we can now re-init the timer while it's active --
this is bad and corrupts timer state.
Furthermore; changing the parameters of an active deadline task is
tricky in that you want to maintain guarantees, while immediately
effective change would allow one to circumvent the CBS guarantees --
this too is bad, as one (bad) task should not be able to affect the
others.
Rework things to avoid both problems. We only need to initialize the
timer once, so move that to __sched_fork() for new tasks.
Then make sure __setparam_dl() doesn't affect the current running
state but only updates the parameters used to calculate the next
scheduling period -- this guarantees the CBS functions as expected
(albeit slightly pessimistic).
This however means we need to make sure __dl_clear_params() resets
the active state, otherwise new tasks (and tasks flipping between
classes) will not properly (re)compute their first instance.
Todo: close class flipping CBS hole.
Todo: implement delayed BW release.
Reported-by: Luca Abeni <luca.abeni@unitn.it> Acked-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Luca Abeni <luca.abeni@unitn.it> Fixes: 67dfa1b756f2 ("sched/deadline: Implement cancel_dl_timer() to use in switched_from_dl()") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Kirill Tkhai <tkhai@yandex.ru> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20150128140803.GF23038@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Wed, 4 Feb 2015 04:12:57 +0000 (20:12 -0800)]
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband reverts from Roland Dreier:
"Last minute InfiniBand/RDMA changes for 3.19:
- Revert IPoIB driver back to 3.18 state. We had a number of fixes
go into 3.19, but they introduced regressions. We tried to get
everything fixed up but ran out of time, so we'll try again for
3.20.
- Similarly, turn off the new "extended query port" verb. Late in
the cycle we realized the ABI is not quite right, and rather than
freeze something in a rush and make a mistake, we'll take a bit
more time and get it right in 3.20"
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/core: Temporarily disable ex_query_device uverb
Revert "IPoIB: Consolidate rtnl_lock tasks in workqueue"
Revert "IPoIB: Make the carrier_on_task race aware"
Revert "IPoIB: fix MCAST_FLAG_BUSY usage"
Revert "IPoIB: fix mcast_dev_flush/mcast_restart_task race"
Revert "IPoIB: change init sequence ordering"
Revert "IPoIB: Use dedicated workqueues per interface"
Revert "IPoIB: Make ipoib_mcast_stop_thread flush the workqueue"
Revert "IPoIB: No longer use flush as a parameter"
Linus Torvalds [Wed, 4 Feb 2015 03:54:57 +0000 (19:54 -0800)]
Merge tag 'md/3.19-fixes' of git://neil.brown.name/md
Pull two fixes for md from Neil Brown:
- Another live lock, needs backporting
- work-around false positive with new warnings.
* tag 'md/3.19-fixes' of git://neil.brown.name/md:
md/bitmap: fix a might_sleep() warning.
md/raid5: fix another livelock caused by non-aligned writes.
Myron Stowe [Tue, 3 Feb 2015 23:01:24 +0000 (16:01 -0700)]
PCI: Handle read-only BARs on AMD CS553x devices
Some AMD CS553x devices have read-only BARs because of a firmware or
hardware defect. There's a workaround in quirk_cs5536_vsa(), but it no
longer works after 36e8164882ca ("PCI: Restore detection of read-only
BARs"). Prior to 36e8164882ca, we filled in res->start; afterwards we
leave it zeroed out. The quirk only updated the size, so the driver tried
to use a region starting at zero, which didn't work.
Expand quirk_cs5536_vsa() to read the base addresses from the BARs and
hard-code the sizes.
On Nix's system BAR 2's read-only value is 0x6200. Prior to 36e8164882ca,
we interpreted that as a 512-byte BAR based on the lowest-order bit set. Per
datasheet sec 5.6.1, that BAR (MFGPT) requires only 64 bytes; use that to
avoid clearing any address bits if a platform uses only 64-byte alignment.
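The arithmetic behind the 512-byte interpretation is easy to check. The helper names below are hypothetical, and 0x6240 is a made-up address used only to show what a too-large size would do to a 64-byte aligned base.

```c
#include <stdint.h>

/* Size implied by the lowest-order set bit of a value. */
uint32_t size_from_lowest_bit(uint32_t v)
{
    return v & (~v + 1u);
}

/* Round an address down to a power-of-two size. */
uint32_t align_down(uint32_t addr, uint32_t size)
{
    return addr & ~(size - 1u);
}
```

0x6200 has bit 9 as its lowest set bit, giving 0x200 = 512 bytes. Aligning a hypothetical 64-byte aligned base such as 0x6240 to 512 bytes would yield 0x6200 and lose address bits, while aligning to 64 bytes leaves it intact.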
Setting a dev_pm_ops suspend/resume pair but not a set of
hibernation functions means those pm functions will not be
called upon hibernation.
Fix this by using SIMPLE_DEV_PM_OPS, which appropriately
assigns the suspend and hibernation handlers, and move
tmp102_suspend/tmp102_resume under CONFIG_PM_SLEEP to avoid
build warnings.
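What the helper macro buys can be seen in a user-space model. `struct pm_ops_model` and `MODEL_SIMPLE_PM_OPS` are hypothetical stand-ins that mimic how one suspend/resume pair is also wired into the hibernation callbacks; they are not the kernel definitions.

```c
#include <stddef.h>

struct device_model;
typedef int (*pm_cb)(struct device_model *);

struct pm_ops_model {
    pm_cb suspend, resume;    /* system suspend                    */
    pm_cb freeze, thaw;       /* hibernation: around image creation */
    pm_cb poweroff, restore;  /* hibernation: power down, restore   */
};

/* One pair of handlers fills every system sleep slot. */
#define MODEL_SIMPLE_PM_OPS(name, suspend_fn, resume_fn) \
    const struct pm_ops_model name = {                   \
        .suspend = suspend_fn, .resume = resume_fn,      \
        .freeze = suspend_fn, .thaw = resume_fn,         \
        .poweroff = suspend_fn, .restore = resume_fn,    \
    }

int model_suspend(struct device_model *d) { (void)d; return 0; }
int model_resume(struct device_model *d) { (void)d; return 0; }

MODEL_SIMPLE_PM_OPS(model_pm_ops, model_suspend, model_resume);
```

Setting only `.suspend` and `.resume` by hand would leave the other four slots NULL, which is the situation the message describes.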
Linus Torvalds [Tue, 3 Feb 2015 19:36:57 +0000 (11:36 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull final block layer fixes from Jens Axboe:
"Unfortunately the hctx/ctx lifetime fix from last pull had some
issues. This pull request contains a revert of the problematic
commit, and a proper rewrite of it.
The rewrite has been tested by the users complaining about the
regression, and it works fine now. Additionally, I've run testing on
all the blk-mq use cases for it and it passes. So we should
definitely get this into 3.19, to avoid regression for some cases"
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq: release mq's kobjects in blk_release_queue()
Revert "blk-mq: fix hctx/ctx kobject use-after-free"
Linus Torvalds [Tue, 3 Feb 2015 19:26:54 +0000 (11:26 -0800)]
Merge tag 'gpio-v3.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull gpio fixes from Linus Walleij:
"Yet more GPIO fixes for the v3.19 series.
There is a high bug-spot activity in GPIO this merge window, much due
to Johan Hovold's spearheading into actually exercising the removal
path for GPIO chips, something that was never really exercised before.
The other two fixes are augmenting erroneous behaviours in two
specific drivers for minor systems.
Summary from signed tag:
- Two fixes stabilizing that which was never stable before: removal
of GPIO chips, now let's stop leaking memory.
- Make sure OMAP IRQs are usable when the irqchip API is used
orthogonally to the gpiochip API.
- Provide a default GPIO base for the mcp23s08 driver"
* tag 'gpio-v3.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: sysfs: fix memory leak in gpiod_sysfs_set_active_low
gpio: sysfs: fix memory leak in gpiod_export_link
gpio: mcp23s08: handle default gpio base
gpio: omap: Fix bad device access with setup_irq()
Commit 5a77abf9a97a ("IB/core: Add support for extended query device caps")
added a new extended verb to query the capabilities of RDMA devices, but the
semantics of this verb are still under debate [1].
Don't expose this verb to userspace until the ABI is nailed down.
Suman Tripathi [Mon, 2 Feb 2015 18:07:19 +0000 (23:37 +0530)]
ahci_xgene: Fix the dma state machine lockup for the ATA_CMD_SMART PIO mode command.
This patch addresses the issue with ATA_CMD_SMART pio mode command for
enumeration and device detection with ATA devices. The X-Gene AHCI
controller has an erratum in which it cannot clear the BSY bit after
the PIO setup FIS. The dma state machine enters CMFatalErrorUpdate
state and locks up. It is the same issue as in the commit 2a0bdff6b958
("ahci-xgene: fix the dma state machine lockup for the IDENTIFY DEVICE
PIO mode command").
For example : without this patch it results in READ DMA command failure
as shown below :
Revert "ACPI / LPSS: introduce a 'proxy' device to power on LPSS for DMA"
Revert commit 6c17ee44d524 (ACPI / LPSS: introduce a 'proxy' device
to power on LPSS for DMA), as it introduced registration and probe
ordering problems between devices on the LPSS that may lead to full
hard system hang on boot in some cases.
Eric Nelson [Fri, 30 Jan 2015 21:07:55 +0000 (14:07 -0700)]
ASoC: sgtl5000: add delay before first I2C access
To quote from section 1.3.1 of the data sheet:
The SGTL5000 has an internal reset that is deasserted
8 SYS_MCLK cycles after all power rails have been brought
up. After this time, communication can start
...
1.0us represents 8 SYS_MCLK cycles at the minimum 8.0 MHz SYS_MCLK.
Signed-off-by: Eric Nelson <eric.nelson@boundarydevices.com> Reviewed-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org
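The 1.0us figure in the sgtl5000 message above is plain cycle arithmetic. A minimal sketch, assuming a hypothetical helper named cycles_to_ns() (not part of the driver) that rounds up so the wait is never too short:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time taken by `cycles` clock cycles at `freq_hz`, in nanoseconds,
 * rounded up. Hypothetical helper for illustration only.
 */
uint64_t cycles_to_ns(uint64_t cycles, uint64_t freq_hz)
{
    assert(freq_hz > 0);
    return (cycles * 1000000000ULL + freq_hz - 1) / freq_hz;
}
```

With the values quoted from the data sheet, cycles_to_ns(8, 8000000) gives 1000 ns, i.e. 1.0us. A faster SYS_MCLK only shortens the required wait, so the delay computed for the minimum clock is the worst case.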
Will Deacon [Thu, 29 Jan 2015 15:41:46 +0000 (16:41 +0100)]
ARM: 8299/1: mm: ensure local active ASID is marked as allocated on rollover
Commit e1a5848e3398 ("ARM: 7924/1: mm: don't bother with reserved ttbr0
when running with LPAE") removed the use of the reserved TTBR0 value
for LPAE systems, since the ASID is held in the TTBR and can be updated
atomically with the pgd of the next mm.
Unfortunately, this patch forgot to update flush_context, which
deliberately avoids marking the local active ASID as allocated, since we
used to switch via ASID zero and didn't need to allocate the ASID of
the previous mm. The side-effect of this is that we can allocate the
same ASID to the next mm and, between flushing the local TLB and updating
TTBR0, we can perform speculative TLB fills for userspace nG mappings
using the page table of the previous mm.
The consequence of this is that the next mm can erroneously hit some
mappings of the previous mm. Note that this was made significantly
harder to hit by a391263cd84e ("ARM: 8203/1: mm: try to re-use old ASID
assignments following a rollover") but is still theoretically possible.
This patch fixes the problem by removing the code from flush_context
that forces the allocated ASID to zero for the local CPU. Many thanks
to the Broadcom guys for tracking this one down.
Fixes: e1a5848e3398 ("ARM: 7924/1: mm: don't bother with reserved ttbr0 when running with LPAE") Cc: <stable@vger.kernel.org> # v3.14+ Reported-by: Raymond Ngun <rngun@broadcom.com> Tested-by: Raymond Ngun <rngun@broadcom.com> Reviewed-by: Gregory Fong <gregory.0xf0@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Robin Gong [Tue, 3 Feb 2015 02:25:53 +0000 (10:25 +0800)]
spi: imx: use pio mode for i.mx6dl
Hardware issue TKT238285, which may cause the txfifo to store data twice, has
only been caught on i.mx6dl, so use pio mode instead of DMA mode on i.mx6dl.
Fixes: f62caccd12c17e4 (spi: spi-imx: add DMA support) Signed-off-by: Robin Gong <b38343@freescale.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org
Mikulas Patocka [Mon, 2 Feb 2015 14:39:02 +0000 (09:39 -0500)]
sched/wait: Remove might_sleep() from wait_event_cmd()
The patch e22b886a8a43 ("sched/wait: Add might_sleep() checks")
introduced a bug in the raid5 subsystem.
The function raid5_quiesce() (and resize_stripes()) uses the 'cmd'
part to release and acquire a spinlock (so we call the sleep
primitives in atomic context), and therefore we cannot do the
might_sleep() check.
David Vrabel [Mon, 2 Feb 2015 16:57:51 +0000 (16:57 +0000)]
xen-netback: stop the guest rx thread after a fatal error
After commit e9d8b2c2968499c1f96563e6522c56958d5a1d0d (xen-netback:
disable rogue vif in kthread context), a fatal (protocol) error would
leave the guest Rx thread spinning, wasting CPU time. Commit ecf08d2dbb96d5a4b4bcc53a39e8d29cc8fef02e (xen-netback: reintroduce
guest Rx stall detection) made this even worse by removing a
cond_resched() from this path.
Since a fatal error is non-recoverable, just allow the guest Rx thread
to exit. This requires taking additional refs to the task so the
thread exiting early is handled safely.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reported-by: Julien Grall <julien.grall@linaro.org> Tested-by: Julien Grall <julien.grall@linaro.org> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/mlx4_core: Fix kernel Oops (mem corruption) when working with more than 80 VFs
Commit de966c592802 (net/mlx4_core: Support more than 64 VFs) was meant to
allow up to 126 VFs. However, because MLX4_MFUNC_MAX was left too low,
requesting more than 80 VFs resulted in memory corruption (and Oopses).
In addition, the number of slaves was left too high.
This commit fixes these issues.
Fixes: de966c592802 ("net/mlx4_core: Support more than 64 VFs") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Sun, 1 Feb 2015 20:54:25 +0000 (23:54 +0300)]
isdn: off by one in connect_res()
The bug here is that we use "Reject" as the index into the cau_t[] array
in the else path. Since cau_t[] has 9 elements, if Reject == 9 then
we are reading beyond the end of the array.
My understanding of the code is that it's saying that if Reject is 1 or
too high then that's invalid and we should hang up.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
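The off-by-one in the connect_res() message above comes down to using `>` where `>=` is needed against the array length. A minimal userspace sketch, assuming a placeholder table (the real cau_t[] contents are not shown in the commit message) and hypothetical function names:

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Placeholder table with 9 entries; values are made up for illustration. */
static const unsigned char cau_table[9] = { 0, 10, 20, 30, 40, 50, 60, 70, 80 };

/*
 * Returns 1 when `reject` must be treated as invalid (hang up),
 * 0 when it is safe to use as an index. Valid indexes are 0..8,
 * so 9 has to be rejected as well: compare with >=, not >.
 */
int reject_is_invalid(unsigned int reject)
{
    return reject == 1 || reject >= ARRAY_LEN(cau_table);
}

/* Returns the table entry, or -1 when the index is invalid. */
int lookup_cause(unsigned int reject)
{
    if (reject_is_invalid(reject))
        return -1;
    return cau_table[reject];
}
```

Deriving the bound from the array itself rather than writing a literal 9 keeps the check correct if the table ever changes size.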
The following patchset contains Netfilter/IPVS fixes for your net tree,
they are:
1) Validate hooks for nf_tables NAT expressions, otherwise users can
crash the kernel when using them from the wrong hook. We already
got one user trapped on this when configuring masquerading.
2) Fix a BUG splat in nf_tables with CONFIG_DEBUG_PREEMPT=y. Reported
by Andreas Schultz.
3) Avoid unnecessary reroute of traffic in the local input path
in IPVS that triggers a crash in xfrm. Reported by Florian
Wiessner and fixed by Julian Anastasov.
4) Fix memory and module refcount leak from the error path of
nf_tables_newchain().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David L Stevens [Fri, 30 Jan 2015 17:29:45 +0000 (12:29 -0500)]
sunvnet: set queue mapping when doing packet copies
This patch fixes a bug where vnet_skb_shape() didn't set the already-selected
queue mapping when a packet copy was required. This results in using the
wrong queue index for stops/starts, hung tx queues and watchdog timeouts
under heavy load.
Signed-off-by: David L Stevens <david.stevens@oracle.com> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Leitner [Fri, 30 Jan 2015 11:56:01 +0000 (09:56 -0200)]
qlge: Fix qlge_update_hw_vlan_features to handle if interface is down
Currently qlge_update_hw_vlan_features() will always first put the
interface down, then update features and then bring it up again. But it
is possible to hit this code while the adapter is down and this causes a
non-paired call to napi_disable(), which will get stuck.
This patch fixes it by skipping these down/up actions if the interface
is already down.
Fixes: a45adbe8d352 ("qlge: Enhance nested VLAN (Q-in-Q) handling.") Cc: Harish Patil <harish.patil@qlogic.com> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
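The qlge fix above follows a common pattern: remember whether the interface was running and only pair down/up around the update in that case. A minimal userspace model, assuming hypothetical types and names (struct adapter, adapter_down(), adapter_up()) that stand in for the driver's real state and are not taken from it:

```c
#include <assert.h>

/* Toy model of an adapter; disable_depth counts unpaired disable calls. */
struct adapter {
    int running;
    int disable_depth;
    unsigned int features;
};

void adapter_down(struct adapter *a)
{
    a->disable_depth++;   /* models the disable that must later be paired */
    a->running = 0;
}

void adapter_up(struct adapter *a)
{
    a->disable_depth--;   /* the matching enable */
    a->running = 1;
}

/* Update features, cycling the interface only if it was up to begin with. */
void update_features(struct adapter *a, unsigned int features)
{
    int was_running = a->running;

    if (was_running)
        adapter_down(a);
    a->features = features;
    if (was_running)
        adapter_up(a);
}
```

In this model an adapter that starts out down keeps its disable depth unchanged, so no second, unpaired disable is issued.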
Dave Airlie [Tue, 3 Feb 2015 01:21:11 +0000 (11:21 +1000)]
Merge tag 'drm-amdkfd-fixes-2015-02-02' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes
Three small fixes that came up during last week, nothing scary:
- Accidentally incremented a counter instead of decrementing it (copy-paste error)
- Module parameter of max num of queues must be at least 1 and not 0
- Don't do BUG() as a result of wrong user input
* tag 'drm-amdkfd-fixes-2015-02-02' of git://people.freedesktop.org/~gabbayo/linux:
drm/amdkfd: Don't create BUG due to incorrect user parameter
drm/amdkfd: max num of queues can't be 0
drm/amdkfd: Fix bug in accounting of queues
Dave Airlie [Tue, 3 Feb 2015 01:20:39 +0000 (11:20 +1000)]
Merge branch 'drm-fixes-3.19' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
One last round of fixes for radeon for 3.19:
- fix some fallout from the reservation object integration on the
test/benchmark options
- fix a crash in the gpu vm code if gfx init fails
- fix a pll issue that leads to a blank screen on older IGP parts
* 'drm-fixes-3.19' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: fix the crash in test functions
drm/radeon: fix the crash in benchmark functions
drm/radeon: properly set vm fragment size for TN/RL
drm/radeon: don't init gpuvm if accel is disabled (v3)
drm/radeon: fix PLLs on RS880 and older v2
Bhuvanchandra DV [Sat, 31 Jan 2015 16:33:25 +0000 (22:03 +0530)]
spi: fsl-dspi: Remove possible memory leak of 'chip'
Move the check for spi->bits_per_word
before allocation, to avoid memory leak.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bhuvanchandra DV <bhuvanchandra.dv@toradex.com> Signed-off-by: Mark Brown <broonie@kernel.org>
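The fsl-dspi change above is an instance of validating input before allocating, so that the early-return path has nothing to free. A minimal userspace sketch, assuming a hypothetical struct chip_cfg and a made-up valid range of 4 to 16 bits (the driver's actual limits are not stated in the commit message):

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical per-device configuration, for illustration only. */
struct chip_cfg {
    unsigned int bits_per_word;
};

/*
 * Validate first, allocate second. Returns 0 and stores the new
 * object in *out on success, or a negative errno value on failure.
 */
int setup_chip(unsigned int bits_per_word, struct chip_cfg **out)
{
    struct chip_cfg *chip;

    /* Assumed range; rejecting here cannot leak, nothing is allocated yet. */
    if (bits_per_word < 4 || bits_per_word > 16)
        return -EINVAL;

    chip = malloc(sizeof(*chip));
    if (!chip)
        return -ENOMEM;

    chip->bits_per_word = bits_per_word;
    *out = chip;
    return 0;
}
```

The alternative is to keep the original order and free the object on the error path; moving the check is simpler because it removes the error path's cleanup entirely.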
spi: sh-msiof: Update calculation of frequency dividing
The sh-msiof driver does not calculate the frequency divider; it has to
pick the setting from a table. With a table it is not possible to set a
divider value close to the actual data. Change from table-based
management of the frequency divider to setting it by calculation.
The driver is then able to set a value close to the actual data.
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com> Signed-off-by: Mark Brown <broonie@kernel.org>
Takashi Iwai [Fri, 30 Jan 2015 19:29:31 +0000 (20:29 +0100)]
regulator: Build sysfs entries with static attribute groups
Instead of calling device_create_file() manually after the device
registration, put all in attribute groups and filter the unwanted ones
via is_visible callback. This not only simplifies the code but also
avoids the possible race between the device registration and sysfs
registration.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org>
Ian Abbott [Fri, 30 Jan 2015 18:43:33 +0000 (18:43 +0000)]
spi: spidev: Convert buf pointers for 32-bit compat SPI_IOC_MESSAGE(n)
The SPI_IOC_MESSAGE(n) ioctl commands' argument points to an array of n
struct spi_ioc_transfer elements. The spidev's compat_ioctl handler
just converts this pointer and passes it on to the unlocked_ioctl
handler to process it.
The tx_buf and rx_buf members of struct spi_ioc_transfer are of type
__u64 and hold pointer values. A 32-bit userspace application running
in a 64-bit kernel might not have widened the 32-bit pointers correctly
for the kernel. The application might have sign-extended the pointer
when the kernel expects it to be zero-extended, or vice versa, leading
to an -EFAULT being returned by spidev_message() if the widened pointer
is invalid.
Handle the SPI_IOC_MESSAGE(n) ioctl commands specially in the
compat_ioctl handler, calling new function spidev_compat_ioctl_message()
to handle them. This processes them in the same way as the
unlocked_ioctl handler except that it uses compat_ptr() to convert the
tx_buf and rx_buf members of each struct spi_ioc_transfer element.
To save code, factor out part of the unlocked_ioctl handler into a new
function spidev_get_ioc_message(). This checks the ioctl command code
is a valid SPI_IOC_MESSAGE(n), determines n and copies the array of n
struct spi_ioc_transfer elements from userspace into dynamically
allocated memory, returning either a pointer to the memory, an
ERR_PTR(-err) value, or NULL (for SPI_IOC_MESSAGE(0)).
Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Mark Brown <broonie@kernel.org>
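The difference between the two widenings described in the spidev message above only shows up for 32-bit pointers with the top bit set. A minimal userspace model, assuming hypothetical helper names; it illustrates the arithmetic only and is not the kernel's compat_ptr():

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extension: the upper 32 bits of the result are cleared. */
uint64_t widen_zero_extend(uint32_t user_ptr)
{
    return (uint64_t)user_ptr;
}

/*
 * Sign-extension: bit 31 is copied into the upper 32 bits.
 * Relies on the usual two's-complement conversion to int32_t.
 */
uint64_t widen_sign_extend(uint32_t user_ptr)
{
    return (uint64_t)(int64_t)(int32_t)user_ptr;
}

/*
 * Normalise a 64-bit field written by a 32-bit application by keeping
 * only the low 32 bits, so both widenings end up as the same address.
 */
uint64_t normalize_compat_field(uint64_t field)
{
    return (uint64_t)(uint32_t)field;
}
```

For 0x80001000 the two widenings give 0x0000000080001000 and 0xFFFFFFFF80001000 respectively, while normalize_compat_field() maps both back to 0x80001000. For pointers below 0x80000000 the two widenings already agree, which is why such a problem can go unnoticed.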
Ilija Hadzic [Fri, 30 Jan 2015 05:38:44 +0000 (00:38 -0500)]
drm/radeon: fix the crash in test functions
radeon_copy_dma and radeon_copy_blit must be called with
a valid reservation object. Otherwise a crash will be provoked.
We borrow the object from vram BO.
Cc: stable@vger.kernel.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ilija Hadzic [Fri, 30 Jan 2015 05:38:43 +0000 (00:38 -0500)]
drm/radeon: fix the crash in benchmark functions
radeon_copy_dma and radeon_copy_blit must be called with
a valid reservation object. Otherwise a crash will be provoked.
We borrow the object from destination BO.
Cc: stable@vger.kernel.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Ilija Hadzic <ihadzic@research.bell-labs.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 28 Jan 2015 19:36:26 +0000 (14:36 -0500)]
drm/radeon: don't init gpuvm if accel is disabled (v3)
If acceleration is disabled, it does not make sense
to init gpuvm since nothing will use it. Moreover,
if radeon_vm_init() gets called it uses accel to try
and clear the pde tables, etc. which results in a bug.
v2: handle vm_fini as well
v3: handle bo_open/close as well
Brian King [Thu, 29 Jan 2015 21:54:40 +0000 (15:54 -0600)]
sd: Fix max transfer length for 4k disks
The following patch fixes an issue observed with 4k sector disks
where the max_hw_sectors attribute was getting set too large in
sd_revalidate_disk. Since sdkp->max_xfer_blocks is in units
of SCSI logical blocks and queue_max_hw_sectors is in units of
512 byte blocks, on a 4k sector disk, every time we went through
sd_revalidate_disk, we were taking the current value of
queue_max_hw_sectors and increasing it by a factor of 8. Fix
this by only shifting sdkp->max_xfer_blocks.
Cc: stable@vger.kernel.org Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
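The unit mismatch in the sd message above can be modelled in a few lines. This is a simplified userspace sketch with hypothetical function names; it uses a plain minimum and ignores any special handling of zero limits that the real code may have:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a count of logical blocks into 512-byte units.
 * Assumes block_size is 512 multiplied by a power of two.
 */
uint64_t blocks_to_512b(uint64_t blocks, unsigned int block_size)
{
    unsigned int shift = 0;

    assert(block_size >= 512);
    while ((512u << shift) < block_size)
        shift++;
    return blocks << shift;
}

uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/* Buggy order: the queue limit (already in 512-byte units) is rescaled too. */
uint64_t revalidate_buggy(uint64_t queue_max_hw_sectors,
                          uint64_t max_xfer_blocks, unsigned int block_size)
{
    return blocks_to_512b(min_u64(queue_max_hw_sectors, max_xfer_blocks),
                          block_size);
}

/* Fixed order: only the device limit is converted before comparing. */
uint64_t revalidate_fixed(uint64_t queue_max_hw_sectors,
                          uint64_t max_xfer_blocks, unsigned int block_size)
{
    return min_u64(queue_max_hw_sectors,
                   blocks_to_512b(max_xfer_blocks, block_size));
}
```

With a queue limit of 1024 sectors, a device limit of 65535 blocks and 4096-byte blocks, the buggy variant returns 8192 and would grow again on the next pass, while the fixed variant stays at 1024 however often it is run.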
Mike Christie [Wed, 28 Jan 2015 09:46:53 +0000 (03:46 -0600)]
scsi: fix device handler detach oops
This fixes a regression caused by commit 1d5203 ("scsi: handle more device
handler setup/teardown in common code").
The bug is that the alua detach() callout will try to access the
sdev->scsi_dh_data, but we have already set it to NULL. This patch
moves the clearing of that field to after detach() is called.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Christoph Hellwig <hch@lst.de>
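The ordering problem in the device handler message above can be shown with a toy model. The struct and function names below are hypothetical stand-ins, not the SCSI core's types; the callback simply records whether the per-device data was still present when it ran:

```c
#include <assert.h>
#include <stddef.h>

/* Toy device: dh_data stands in for the handler's per-device data. */
struct toy_dev {
    void *dh_data;
    int detach_saw_data;
};

/* Stand-in for a handler's detach() callout, which needs dh_data. */
void toy_handler_detach(struct toy_dev *d)
{
    d->detach_saw_data = (d->dh_data != NULL);
}

/* Buggy order: the field is cleared before the callout can use it. */
void detach_buggy(struct toy_dev *d)
{
    d->dh_data = NULL;
    toy_handler_detach(d);
}

/* Fixed order: run the callout first, then clear the field. */
void detach_fixed(struct toy_dev *d)
{
    toy_handler_detach(d);
    d->dh_data = NULL;
}
```

In the real driver the callout dereferences the pointer instead of merely testing it, which is why the wrong order produces an oops rather than a wrong flag.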