git.karo-electronics.de Git - karo-tx-linux.git/log


Stan Drozd [Fri, 21 Apr 2017 11:07:10 +0000 (13:07 +0200)]

docs: Fix a spelling error in vfio-mediated-device.txt

This commit fixes a repeated "the" in vfio-mediated-device.txt and reflows the
paragraph.

Signed-off-by: Stan Drozd <drozdziak1@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>


Stan Drozd [Fri, 21 Apr 2017 10:58:52 +0000 (12:58 +0200)]

docs: Fix a spelling error in ioctl-number.txt

This commit fixes a misspelled header name in the ioctl numbers list

Signed-off-by: Stan Drozd <drozdziak1@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>


Tobias Klauser [Fri, 21 Apr 2017 13:59:04 +0000 (15:59 +0200)]

MAINTAINERS: update file entry for HSI subsystem

The HSI documentation was moved into Documentation/driver-api/hsi.rst in
commit 5e995786850e ("docs: split up serial-interfaces.rst"). Update the
corresponding file entry in MAINTAINERS.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>


Ankit Kumar [Thu, 27 Apr 2017 11:33:13 +0000 (17:03 +0530)]

pstore: Fix flags to enable dumps on powerpc

After commit c950fd6f201a, the kernel registers pstore write based on the flags
set. Pstore write for powerpc is broken because the PSTORE_FLAGS_DMESG flag is
not set for the powerpc architecture. On panic, the kernel doesn't write the
message to /fs/pstore/dmesg* (the entry doesn't get created at all).

This patch enables pstore write for powerpc architecture by setting
PSTORE_FLAGS_DMESG flag.

Fixes: c950fd6f201a ("pstore: Split pstore fragile flags")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Ankit Kumar <ankit@linux.vnet.ibm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
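
The flag gating described in the pstore fix above can be pictured with a small
user-space model. This is only a sketch, not the kernel code: the struct, the
function names and the bit values are assumptions made up for illustration. The
one idea taken from the commit message is that a backend which does not
advertise the dmesg flag never gets its panic dump path registered.

```c
#include <stdbool.h>

/* Illustrative flag bits; the values are assumptions for this sketch. */
#define SKETCH_FLAGS_DMESG   (1u << 0)
#define SKETCH_FLAGS_CONSOLE (1u << 1)

struct sketch_backend {
	unsigned int flags;     /* record types the backend accepts */
	bool dmesg_registered;  /* set when the panic dump path is hooked up */
};

/* Register the dump path only if the backend advertises the dmesg flag. */
static void sketch_register(struct sketch_backend *b)
{
	b->dmesg_registered = (b->flags & SKETCH_FLAGS_DMESG) != 0;
}

/* Returns true if a panic message would be written by such a backend. */
static bool sketch_dump_on_panic(unsigned int flags)
{
	struct sketch_backend b = { .flags = flags, .dmesg_registered = false };

	sketch_register(&b);
	return b.dmesg_registered;
}
```

Calling sketch_dump_on_panic() with only the console bit set models the broken
powerpc case; adding the dmesg bit models the fix.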


Geliang Tang [Thu, 23 Mar 2017 13:15:13 +0000 (21:15 +0800)]

pstore: Remove unused vmalloc.h in pmsg

Since the vmalloc code has been removed from write_pmsg() in the commit
"5bf6d1b pstore/pmsg: drop bounce buffer", remove the unused header
vmalloc.h.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>


Linus Torvalds [Thu, 27 Apr 2017 20:39:19 +0000 (13:39 -0700)]

Merge tag 'nfsd-4.11-3' of git://linux-nfs.org/~bfields/linux

Pull nfsd fixes from Bruce Fields:
"Thanks to Ari Kauppi and Tuomas Haanpää at Synopsys for spotting bugs
  in our NFSv2/v3 xdr code that could crash the server or leak memory"

* tag 'nfsd-4.11-3' of git://linux-nfs.org/~bfields/linux:
  nfsd: stricter decoding of write-like NFSv2/v3 ops
  nfsd4: minor NFSv2/v3 write decoding cleanup
  nfsd: check for oversized NFSv2/v3 arguments


Linus Torvalds [Thu, 27 Apr 2017 18:38:05 +0000 (11:38 -0700)]

Merge tag 'ceph-for-4.11-rc9' of git://github.com/ceph/ceph-client

Pull ceph fix from Ilya Dryomov:
"A fix for a kernel stack overflow bug in ceph setattr code, marked for
stable"

* tag 'ceph-for-4.11-rc9' of git://github.com/ceph/ceph-client:
  ceph: fix recursion between ceph_set_acl() and __ceph_setattr()


Linus Torvalds [Thu, 27 Apr 2017 18:09:37 +0000 (11:09 -0700)]

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:

- fix orangefs handling of faults on write() - I'd missed that one back
   when orangefs was going through review.

- readdir counterpart of "9p: cope with bogus responses from server in
   p9_client_{read,write}" - server might be lying or broken, and we'd
   better not overrun the kmalloc'ed buffer we are copying the results
   into.

- NFS O_DIRECT read/write can leave iov_iter advanced by too much;
   that's what had been causing iov_iter_pipe() warnings davej had been
   seeing.

- statx_timestamp.tv_nsec type fix (s32 -> u32). That one really should
   go in before 4.11.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  uapi: change the type of struct statx_timestamp.tv_nsec to unsigned
  fix nfs O_DIRECT advancing iov_iter too much
  p9_client_readdir() fix
  orangefs_bufmap_copy_from_iovec(): fix EFAULT handling


Michael Kerrisk (man-pages) [Thu, 27 Apr 2017 11:54:11 +0000 (13:54 +0200)]

statx: correct error handling of NULL pathname

The change in commit 1e2f82d1e9d1 ("statx: Kill fd-with-NULL-path
support in favour of AT_EMPTY_PATH") to error on a NULL pathname to
statx() is inconsistent.

It results in the error EINVAL for a NULL pathname. Other system calls
with similar APIs (fchownat(), fstatat(), linkat()), return EFAULT.

The solution is simply to remove the EINVAL check. As I already pointed
out in [1], user_path_at*() and filename_lookup() will handle the NULL
pathname as per the other APIs, to correctly produce the error EFAULT.

[1] https://lkml.org/lkml/2017/4/26/561

Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


Jens Axboe [Thu, 27 Apr 2017 17:33:01 +0000 (11:33 -0600)]

Merge branch 'nvme-4.12' of git://git.infradead.org/nvme into for-4.12/post-merge

Christoph writes:

"A couple more updates for 4.12. The biggest pile is fc and lpfc
updates from James, but there are various small fixes and cleanups as
well."

Fixes up a few merge issues, and also a warning in
lpfc_nvmet_rcv_unsol_abort() if CONFIG_NVME_TARGET_FC isn't enabled.

Signed-off-by: Jens Axboe <axboe@fb.com>


Jens Axboe [Thu, 27 Apr 2017 13:45:46 +0000 (07:45 -0600)]

blk-mq-sched: allocate reserved tags out of normal pool

At least one driver, mtip32xx, has a hard coded dependency on
the value of the reserved tag used for internal commands. While
that should really be fixed up, for now let's ensure that we just
bypass the scheduler tags for an allocation marked as reserved. They
are used for housekeeping or error handling, so we can safely
ignore them in the scheduler.

Tested-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>


Ming Lei [Thu, 27 Apr 2017 13:45:18 +0000 (07:45 -0600)]

mtip32xx: use runtime tag to initialize command header

mtip32xx assumes that the 'request_idx' passed to .init_request()
is the tag of the request, and uses it as the request's tag to
initialize the command header.

Now that the MQ IO scheduler is in, the tag assigned to a request is no
longer the same as the request index, which causes strange hardware
failures on mtip32xx and can even trigger a whole-system panic.

This patch fixes the issue by initializing the command header via the
request's real tag.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>


Borislav Petkov [Wed, 26 Apr 2017 10:22:08 +0000 (12:22 +0200)]

EDAC, ghes: Do not enable it by default

Leave it to the user to decide whether to enable this or not. Otherwise,
platform-specific drivers won't initialize (currently, EDAC supports
only a single platform driver loaded).

Signed-off-by: Borislav Petkov <bp@suse.de>


Sudeep Holla [Tue, 21 Mar 2017 11:30:16 +0000 (11:30 +0000)]

mailbox: handle empty message in tx_tick

We already check if the message is empty before calling the client
tx_done callback. Calling completion on a wait event is also invalid
if the message is empty.

This patch moves the existing empty message check earlier.

Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox")
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>


Sudeep Holla [Tue, 21 Mar 2017 11:30:15 +0000 (11:30 +0000)]

mailbox: skip complete wait event if timer expired

If a wait_for_completion_timeout() call returns due to a timeout,
complete() can get called after returning from the wait, which is
incorrect and can cause subsequent transmissions on a channel to fail:
the next wait_for_completion_timeout() sees that the completion variable
is non-zero because of the erroneous/spurious complete() call and
returns immediately, without waiting for the time expected by the
client.

This patch fixes the issue by skipping the complete() call on timer
expiry.

Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox")
Reported-by: Alexey Klimov <alexey.klimov@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
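
The stale-completion problem described in the mailbox timeout fix above can be
reproduced with a toy model. This is a sketch under stated assumptions: the
toy_* names are hypothetical, the "completion" is just a counter, and the wait
never sleeps. It shows only why a complete() issued after a timeout makes the
next wait return immediately.

```c
#include <stdbool.h>

/* Toy completion: 'done' counts pending complete() calls. */
struct toy_completion {
	unsigned int done;
};

static void toy_complete(struct toy_completion *c)
{
	c->done++;
}

/*
 * Toy wait: returns true at once if a completion is pending, otherwise
 * reports a timeout (false). A real wait would sleep first.
 */
static bool toy_wait_timeout(struct toy_completion *c)
{
	if (c->done > 0) {
		c->done--;
		return true;
	}
	return false;
}

/* Toy tx_tick: signal the waiter unless the tick came from the timeout. */
static void toy_tx_tick(struct toy_completion *c, bool timed_out)
{
	if (!timed_out)
		toy_complete(c);
}

/*
 * Returns true if the wait for a second message returns immediately
 * even though nothing completed that message.
 */
static bool second_wait_is_spurious(bool skip_on_timeout)
{
	struct toy_completion c = { .done = 0 };

	/* First message: the wait times out, then the timeout path ticks. */
	(void)toy_wait_timeout(&c);
	if (skip_on_timeout)
		toy_tx_tick(&c, true);  /* fixed: no complete() after timeout */
	else
		toy_complete(&c);       /* old behaviour: stale complete() */

	/* Second message: nothing has completed it yet. */
	return toy_wait_timeout(&c);
}
```

With skip_on_timeout set to false the second wait succeeds spuriously; with it
set to true the second wait times out as the client expects.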


Sudeep Holla [Tue, 21 Mar 2017 11:30:14 +0000 (11:30 +0000)]

mailbox: always wait in mbox_send_message for blocking Tx mode

There exists a race when msg_submit returns immediately because there was
an active request being processed, which may have completed just before
it's checked again in mbox_send_message. This results in returning to the
caller without waiting in mbox_send_message even when Tx is blocking.

This patch fixes the issue by always waiting for the completion if Tx
is in blocking mode.

Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox")
Reported-by: Alexey Klimov <alexey.klimov@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Alexey Klimov <alexey.klimov@arm.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>


Sabrina Dubroca [Thu, 27 Apr 2017 10:03:37 +0000 (12:03 +0200)]

xfrm: fix GRO for !CONFIG_NETFILTER

In xfrm_input() when called from GRO, async == 0, and we end up
skipping the processing in xfrm4_transport_finish(). GRO path will
always skip the NF_HOOK, so we don't need the special-case for
!NETFILTER during GRO processing.

Fixes: 7785bba299a8 ("esp: Add a software GRO codepath")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>


Frederic Weisbecker [Tue, 25 Apr 2017 14:10:48 +0000 (16:10 +0200)]

sched/cputime: Fix ksoftirqd cputime accounting regression

irq_time_read() returns the irqtime minus the ksoftirqd time. This
is necessary because irq_time_read() is used to subtract the IRQ time
from the sum_exec_runtime of a task. If we were to include the softirq
time of ksoftirqd, this task would subtract its own CPU time every time
it updates ksoftirqd->sum_exec_runtime which would therefore never
progress.

But this behaviour got broken by:

a499a5a14db ("sched/cputime: Increment kcpustat directly on irqtime account")

... which now includes ksoftirqd softirq time in the time returned by
irq_time_read().

This has resulted in wrong ksoftirqd cputime reported to userspace
through /proc/stat and thus "top" not showing ksoftirqd when it should
after intense networking load.

ksoftirqd->stime happens to be correct but it gets scaled down by
sum_exec_runtime through task_cputime_adjusted().

To fix this, just account the strict IRQ time in a separate counter and
use it to report the IRQ time.

Reported-and-tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1493129448-5356-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
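
The accounting argument in the cputime fix above comes down to simple
arithmetic, sketched here as a user-space model. The struct and the toy_* names
are assumptions for illustration and do not mirror the kernel's data
structures; the window is assumed to consist entirely of softirq work run by
ksoftirqd.

```c
#include <stdint.h>

/* Per-CPU time buckets in nanoseconds (illustrative model only). */
struct toy_irqtime {
	uint64_t hardirq;            /* time in hard interrupts */
	uint64_t softirq;            /* softirq time outside ksoftirqd */
	uint64_t ksoftirqd_softirq;  /* softirq time run by ksoftirqd */
};

/* IRQ time to subtract from task runtime: excludes ksoftirqd's share. */
static uint64_t toy_irq_time_read(const struct toy_irqtime *t)
{
	return t->hardirq + t->softirq;
}

/* Regressed variant: also counts the softirq time run by ksoftirqd. */
static uint64_t toy_irq_time_read_regressed(const struct toy_irqtime *t)
{
	return t->hardirq + t->softirq + t->ksoftirqd_softirq;
}

/*
 * Runtime credited to ksoftirqd for a window of 'wall' nanoseconds
 * during which it ran only softirq work.
 */
static uint64_t toy_ksoftirqd_runtime(uint64_t wall, int regressed)
{
	struct toy_irqtime t = { 0, 0, wall };
	uint64_t irq = regressed ? toy_irq_time_read_regressed(&t)
				 : toy_irq_time_read(&t);

	return wall - irq;
}
```

In the regressed variant the task subtracts all of its own CPU time, so its
runtime stays at zero, which matches the symptom reported for "top".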


Christoph Hellwig [Tue, 25 Apr 2017 16:56:44 +0000 (18:56 +0200)]

nvme-scsi: remove nvme_trans_security_protocol

This function just returns the same error code and sense data as
the default statement in the switch in the caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>


Martin Schwidefsky [Thu, 27 Apr 2017 05:29:40 +0000 (07:29 +0200)]

Merge branch 's390forkvm' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into features

Pull cpacf changes for KVM from Jason Herne:

Add query support for the KMA instruction.


Dmitry V. Levin [Wed, 26 Apr 2017 13:50:00 +0000 (14:50 +0100)]

uapi: change the type of struct statx_timestamp.tv_nsec to unsigned

The comment asserting that the value of struct statx_timestamp.tv_nsec
must be negative when statx_timestamp.tv_sec is negative, is wrong, as
could be seen from the following example:

#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>
#include <asm/unistd.h>
#include <linux/stat.h>

int main(void)
{
        static const struct timespec ts[2] = {
                { .tv_nsec = UTIME_OMIT },
                { .tv_sec = -2, .tv_nsec = 42 }
        };
        assert(utimensat(AT_FDCWD, ".", ts, 0) == 0);

        struct stat st;
        assert(stat(".", &st) == 0);
        printf("st_mtim.tv_sec = %lld, st_mtim.tv_nsec = %lu\n",
               (long long) st.st_mtim.tv_sec,
               (unsigned long) st.st_mtim.tv_nsec);

        struct statx stx;
        assert(syscall(__NR_statx, AT_FDCWD, ".", 0, 0, &stx) == 0);
        printf("stx_mtime.tv_sec = %lld, stx_mtime.tv_nsec = %lu\n",
               (long long) stx.stx_mtime.tv_sec,
               (unsigned long) stx.stx_mtime.tv_nsec);

        return 0;
}

It expectedly prints:
st_mtim.tv_sec = -2, st_mtim.tv_nsec = 42
stx_mtime.tv_sec = -2, stx_mtime.tv_nsec = 42

The more generic comment asserting that the value of struct
statx_timestamp.tv_nsec might be negative is confusing to say the least.

It contradicts both the struct stat.st_[acm]time_nsec tradition and
struct timespec.tv_nsec requirements in utimensat syscall.
If the statx syscall ever returns a stx_[acm]time containing a negative
tv_nsec that cannot be passed unmodified to the utimensat syscall,
it will cause immense confusion.

Fix this source of confusion by changing the type of struct
statx_timestamp.tv_nsec from __s32 to __u32.

Fixes: a528d35e8bfc ("statx: Add a system call to make enhanced file info available")
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-api@vger.kernel.org
cc: mtk.manpages@gmail.com
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
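
The reason a non-negative tv_nsec is enough even for times before the epoch can
be shown with a short conversion, sketched below. The function name is made up
for illustration, and the caller is assumed to pass a tv_nsec in the range 0 to
999999999, as utimensat() requires for ordinary timestamps.

```c
#include <stdint.h>

/*
 * Convert signed seconds plus unsigned nanoseconds into one signed
 * nanosecond count relative to the epoch. The nanosecond part always
 * moves the result forward in time, even when tv_sec is negative.
 */
static int64_t toy_to_ns(int64_t tv_sec, uint32_t tv_nsec)
{
	return tv_sec * INT64_C(1000000000) + (int64_t)tv_nsec;
}
```

For the example in the commit message, tv_sec = -2 and tv_nsec = 42 denote a
point 1999999958 ns before the epoch, so no negative nanosecond value is ever
needed.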


Linus Torvalds [Wed, 26 Apr 2017 22:10:45 +0000 (15:10 -0700)]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

Pull sparc fixes from David Miller:
"I didn't want the release to go out without the statx system call
  properly hooked up"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Update syscall tables.
  sparc64: Fill in rest of HAVE_REGS_AND_STACK_ACCESS_API


David Howells [Wed, 26 Apr 2017 21:15:55 +0000 (22:15 +0100)]

statx: Kill fd-with-NULL-path support in favour of AT_EMPTY_PATH

With the new statx() syscall, the following both allow the attributes of
the file attached to a file descriptor to be retrieved:

statx(dfd, NULL, 0, ...);

and:

statx(dfd, "", AT_EMPTY_PATH, ...);

Change the code to reject the first option, though this means copying
the path and engaging pathwalk for the fstat() equivalent.  dfd can be a
non-directory provided path is "".

[ The timing of this isn't wonderful, but applying this now before we
  have statx() in any released kernel, before anybody starts using the
  NULL special case.    - Linus ]

Fixes: a528d35e8bfc ("statx: Add a system call to make enhanced file info available")
Reported-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Sandeen <sandeen@sandeen.net>
cc: fstests@vger.kernel.org
cc: linux-api@vger.kernel.org
cc: linux-man@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


Bart Van Assche [Wed, 26 Apr 2017 20:47:57 +0000 (13:47 -0700)]

scsi: Implement blk_mq_ops.show_rq()

Show the SCSI CDB for pending SCSI commands in
/sys/kernel/debug/block/*/mq/*/dispatch and */rq_list. An example
of how SCSI commands are displayed by this code:

ffff8801703245c0 {.op=READ, .cmd_flags=META PRIO, .rq_flags=DONTPREP IO_STAT STATS, .tag=14, .internal_tag=-1, .cmd=Read(10) 28 00 2a 81 1b 30 00 00 08 00}

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: <linux-scsi@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>


Bart Van Assche [Wed, 26 Apr 2017 20:47:56 +0000 (13:47 -0700)]

blk-mq: Add blk_mq_ops.show_rq()

This new callback function will be used in the next patch to show
more information about SCSI requests.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>


Bart Van Assche [Wed, 26 Apr 2017 20:47:55 +0000 (13:47 -0700)]

blk-mq: Show operation, cmd_flags and rq_flags names

Show the operation name, .cmd_flags and .rq_flags as names instead
of numbers.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>


Bart Van Assche [Wed, 26 Apr 2017 20:47:54 +0000 (13:47 -0700)]

blk-mq: Make blk_flags_show() callers append a newline character

This patch does not change any functionality but makes it possible
to produce a single line of output with multiple flag-to-name
translations.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
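
Why moving the newline to the callers helps can be seen in a small user-space
sketch of a flag-to-name translator. Everything here is an assumption for
illustration: the toy_* names, the name tables and the bit positions are made
up, and the code does not use the kernel's seq_file interface.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append 's' to the NUL-terminated string in 'buf' (capacity 'size'). */
static void toy_append(char *buf, size_t size, const char *s)
{
	size_t len = strlen(buf);

	if (len + 1 < size)
		snprintf(buf + len, size - len, "%s", s);
}

/*
 * Append the names of the bits set in 'flags', separated by spaces and
 * with no trailing newline. Unnamed bits are printed as their number.
 */
static void toy_flags_show(char *buf, size_t size, unsigned long flags,
			   const char *const *names, size_t nr_names)
{
	int first = 1;

	for (size_t i = 0; i < sizeof(flags) * 8; i++) {
		char num[24];

		if (!(flags & (1UL << i)))
			continue;
		if (!first)
			toy_append(buf, size, " ");
		first = 0;
		if (i < nr_names && names[i]) {
			toy_append(buf, size, names[i]);
		} else {
			snprintf(num, sizeof(num), "%zu", i);
			toy_append(buf, size, num);
		}
	}
}

/* Hypothetical name tables; the bit positions are made up. */
static const char *const toy_cmd_flag_names[] = { "META", "PRIO" };
static const char *const toy_rq_flag_names[] = { "DONTPREP", "IO_STAT", "STATS" };

/* The caller puts two translations on one line and adds the newline. */
static const char *toy_show_rq(unsigned long cmd_flags, unsigned long rq_flags)
{
	static char line[128];

	line[0] = '\0';
	toy_append(line, sizeof(line), ".cmd_flags=");
	toy_flags_show(line, sizeof(line), cmd_flags, toy_cmd_flag_names, 2);
	toy_append(line, sizeof(line), ", .rq_flags=");
	toy_flags_show(line, sizeof(line), rq_flags, toy_rq_flag_names, 3);
	toy_append(line, sizeof(line), "\n");
	return line;
}
```

Because toy_flags_show() never emits a newline itself, toy_show_rq() can place
both flag sets on a single line, which is the layout the later show_rq()
patches in this series rely on.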


Bart Van Assche [Wed, 26 Apr 2017 20:47:53 +0000 (13:47 -0700)]

blk-mq: Move the "state" debugfs attribute one level down

Move the "state" attribute from the top level to the "mq" directory
as requested by Omar.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>


Bart Van Assche [Wed, 26 Apr 2017 20:47:52 +0000 (13:47 -0700)]

blk-mq: Unregister debugfs attributes earlier

We currently call blk_mq_free_queue() from blk_cleanup_queue()
before we unregister the debugfs attributes for that queue in
blk_release_queue(). This leaves a window open during which
accessing most of the mq debugfs attributes would cause a
use-after-free. Additionally, the "state" attribute allows
running the queue, which we should not do after the queue has
entered the "dead" state. Fix both cases by unregistering the
debugfs attributes before freeing queue resources starts.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>


Bart Van Assche [Wed, 26 Apr 2017 20:47:51 +0000 (13:47 -0700)]

blk-mq: Only unregister hctxs for which registration succeeded

Hctx unregistration involves calling kobject_del(). kobject_del()
must not be called if kobject_add() has not been called. Hence in
the error path only unregister hctxs for which registration succeeded.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
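
The error-path rule in the hctx registration fix above is the usual "unwind
only what succeeded" pattern. The following user-space sketch illustrates it
under stated assumptions: the toy_* names are hypothetical, registration is
simulated with a boolean, and a failure is injected at a chosen index.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR 4

struct toy_hctx {
	bool added;  /* true between a successful add and the matching del */
};

/* Pretend registration: fails for the context at index 'fail_at'. */
static int toy_register(struct toy_hctx *h, size_t idx, size_t fail_at)
{
	if (idx == fail_at)
		return -1;
	h->added = true;
	return 0;
}

/* Deleting something that was never added is the bug being avoided. */
static int toy_unregister(struct toy_hctx *h)
{
	if (!h->added)
		return -1;
	h->added = false;
	return 0;
}

/*
 * Register all contexts; on failure unwind only those that succeeded.
 * Returns the number of contexts unwound, or -1 if an unregister call
 * hit a context that had never been added.
 */
static int toy_register_all(size_t fail_at)
{
	struct toy_hctx hctxs[TOY_NR] = { { false } };
	int unwound = 0;
	size_t i;

	for (i = 0; i < TOY_NR; i++)
		if (toy_register(&hctxs[i], i, fail_at) != 0)
			break;

	if (i < TOY_NR) {               /* error path */
		while (i-- > 0) {       /* walk back over indices below the failure */
			if (toy_unregister(&hctxs[i]) != 0)
				return -1;
			unwound++;
		}
	}
	return unwound;
}
```

The unwind loop starts just below the failing index, so the context whose
registration failed is never passed to toy_unregister().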


Bart Van Assche [Wed, 26 Apr 2017 20:47:50 +0000 (13:47 -0700)]

blk-mq-debugfs: Rename functions for registering and unregistering the mq directory

Since the blk_mq_debugfs_*register_hctxs() functions register and
unregister all attributes under the "mq" directory, rename these
into blk_mq_debugfs_*register_mq().

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>


Bart Van Assche [Wed, 26 Apr 2017 20:47:49 +0000 (13:47 -0700)]

blk-mq: Let blk_mq_debugfs_register() look up the queue name

A later patch will move the call of blk_mq_debugfs_register() to
a function to which the queue name is not passed as an argument.
To avoid having to add a 'name' argument to multiple callers, let
blk_mq_debugfs_register() look up the queue name.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>


Bart Van Assche [Wed, 26 Apr 2017 20:47:48 +0000 (13:47 -0700)]

blk-mq: Register <dev>/queue/mq after having registered <dev>/queue

A later patch in this series will modify blk_mq_debugfs_register()
such that it uses q->kobj.parent to determine the name of a
request queue. Hence make sure that that pointer is initialized
before blk_mq_debugfs_register() is called. To avoid lock inversion,
protect sysfs / debugfs registration with the queue sysfs_lock
instead of the global mutex all_q_mutex.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>


Linus Torvalds [Wed, 26 Apr 2017 20:42:32 +0000 (13:42 -0700)]

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) MLX5 bug fixes from Saeed Mahameed et al:
     - released wrong resources when firmware timeout happens
     - fix wrong check for encapsulation size limits
     - UAR memory leak
     - ETHTOOL_GRXCLSRLALL failed to fill in info->data

2) Don't cache l3mdev on mismatched local routes, which causes net
    devices to leak refs. From Robert Shearman.

3) Handle fragmented SKBs properly in macsec driver, the problem is
    that we were mis-sizing the sgvec table. From Jason A. Donenfeld.

4) We cannot have checksum offload enabled for inner UDP tunneled
    packet during IPSEC, from Ansis Atteka.

5) Fix double SKB free in ravb driver, from Dan Carpenter.

6) Fix CPU port handling in b53 DSA driver, from Florian Fainelli.

7) Don't use on-stack buffers for usb_control_msg() in CAN usb driver,
    from Maksim Salau.

8) Fix device leak in macvlan driver, from Herbert Xu. We have to purge
    the broadcast queue properly on port destroy.

9) Fix tx ring entry limit on EF10 devices in sfc driver. From Bert
    Kenward.

10) Fix memory leaks in team driver, from Pan Bian.

11) Don't setup ipv6_stub before it can be actually used, from Paolo
    Abeni.

12) Fix tipc socket flow control accounting, from Parthasarathy
    Bhuvaragan.

13) Fix crash on module unload in hso driver, from Andreas Kemnade.

14) Fix purging of bridge multicast entries, the problem is that if we
    don't defer it to ndo_uninit it's possible for new entries to get
    added after we purge. Fix from Xin Long.

15) Don't return garbage for PACKET_HDRLEN getsockopt, from Alexander
    Potapenko.

16) Fix autoneg stall properly in PHY layer, and revert micrel driver
    change that was papering over it. From Alexander Kochetkov.

17) Don't dereference an ipv4 route as an ipv6 one in the ip6_tunnel
    code, from Cong Wang.

18) Clear out the congestion control private of the TCP socket in all of
    the right places, from Wei Wang.

19) rawv6_ioctl measures SKB length incorrectly, fix from Jamie
    Bainbridge.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
  ipv6: check raw payload size correctly in ioctl
  tcp: memset ca_priv data to 0 properly
  ipv6: check skb->protocol before lookup for nexthop
  net: core: Prevent from dereferencing null pointer when releasing SKB
  macsec: dynamically allocate space for sglist
  Revert "phy: micrel: Disable auto negotiation on startup"
  net: phy: fix auto-negotiation stall due to unavailable interrupt
  net/packet: check length in getsockopt() called with PACKET_HDRLEN
  net: ipv6: regenerate host route if moved to gc list
  bridge: move bridge multicast cleanup to ndo_uninit
  ipv6: fix source routing
  qed: Fix error in the dcbx app meta data initialization.
  netvsc: fix calculation of available send sections
  net: hso: fix module unloading
  tipc: fix socket flow control accounting error at tipc_recv_stream
  tipc: fix socket flow control accounting error at tipc_send_stream
  ipv6: move stub initialization after ipv6 setup completion
  team: fix memory leaks
  sfc: tx ring can only have 2048 entries for all EF10 NICs
  macvlan: Fix device ref leak when purging bc_queue
  ...


Jamie Bainbridge [Wed, 26 Apr 2017 00:43:27 +0000 (10:43 +1000)]

ipv6: check raw payload size correctly in ioctl

In situations where an skb is paged, the transport header pointer and
tail pointer can be the same because the skb contents are in frags.

This results in ioctl(SIOCINQ/FIONREAD) incorrectly returning a
length of 0 when the length to receive is actually greater than zero.

skb->len is already correctly set in ip6_input_finish() with
pskb_pull(), so use skb->len as it always returns the correct result
for both linear and paged data.

Signed-off-by: Jamie Bainbridge <jbainbri@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
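
The difference between the two length calculations in the raw IPv6 ioctl fix
above can be modelled in a few lines. This is a sketch only: struct toy_skb and
its fields are simplified stand-ins invented for illustration, not the kernel's
struct sk_buff.

```c
#include <stddef.h>

/* Toy packet buffer: some bytes in the linear area, the rest in fragments. */
struct toy_skb {
	size_t transport_off;  /* offset of the transport header */
	size_t tail_off;       /* end of the linear data */
	size_t frag_len;       /* bytes held in page fragments */
};

/* Old calculation: only sees what is in the linear area. */
static size_t toy_len_linear_only(const struct toy_skb *skb)
{
	return skb->tail_off - skb->transport_off;
}

/* Total payload once the headers have been pulled: linear plus paged. */
static size_t toy_len_total(const struct toy_skb *skb)
{
	return (skb->tail_off - skb->transport_off) + skb->frag_len;
}
```

For a fully paged buffer the transport header offset equals the tail offset, so
the linear-only calculation yields 0 while the total length is still the size
of the paged data.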


Wei Wang [Wed, 26 Apr 2017 00:38:02 +0000 (17:38 -0700)]

tcp: memset ca_priv data to 0 properly

Always zero out ca_priv data in tcp_assign_congestion_control() so that
ca_priv data is cleared out during socket creation.
Also always zero out ca_priv data in tcp_reinit_congestion_control() so
that when cc algorithm is changed, ca_priv data is cleared out as well.
We should still zero out ca_priv data even in TCP_CLOSE state because
user could call connect() on AF_UNSPEC to disconnect the socket and
leave it in TCP_CLOSE state and later call setsockopt() to switch cc
algorithm on this socket.

Fixes: 2b0a8c9ee ("tcp: add CDG congestion control")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


WANG Cong [Tue, 25 Apr 2017 21:37:15 +0000 (14:37 -0700)]

ipv6: check skb->protocol before lookup for nexthop

Andrey reported an out-of-bounds access in ip6_tnl_xmit(). This
is because we use an ipv4 dst in ip6_tnl_xmit() and cast an IPv4
neigh key as an IPv6 address:

        neigh = dst_neigh_lookup(skb_dst(skb),
                                 &ipv6_hdr(skb)->daddr);
        if (!neigh)
                goto tx_err_link_failure;

        addr6 = (struct in6_addr *)&neigh->primary_key; // <=== HERE
        addr_type = ipv6_addr_type(addr6);

        if (addr_type == IPV6_ADDR_ANY)
                addr6 = &ipv6_hdr(skb)->daddr;

        memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr));

Also, the network header of the skb at this point should still be IPv4
for 4in6 tunnels, so we should not just use it as an IPv6 header.

This patch fixes it by checking if skb->protocol is ETH_P_IPV6: if it
is, we are safe to do the nexthop lookup using skb_dst() and
ipv6_hdr(skb)->daddr; if not (aka IPv4), we have no clue about which
dest address we can pick here, we have to rely on callers to fill it
from tunnel config, so just fall to ip6_route_output() to make the
decision.

Fixes: ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel endpoints.")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Myungho Jung [Tue, 25 Apr 2017 18:58:15 +0000 (11:58 -0700)]

net: core: Prevent from dereferencing null pointer when releasing SKB

Add a NULL check to make __dev_kfree_skb_irq consistent with the kfree
family of functions.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=195289
Signed-off-by: Myungho Jung <mhjungk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
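
The convention referred to in the NULL-check fix above is that a release
function treats a NULL argument as a no-op, just as free() does. A minimal
user-space sketch of such a release helper follows; the toy_buf type and
function names are hypothetical and stand in for the kernel's skb handling.

```c
#include <stdlib.h>

struct toy_buf {
	int refs;  /* reference count */
};

/* Allocate a buffer holding one reference; returns NULL on failure. */
static struct toy_buf *toy_buf_new(void)
{
	struct toy_buf *b = malloc(sizeof(*b));

	if (b)
		b->refs = 1;
	return b;
}

/*
 * Drop a reference; like free(), a NULL argument is a no-op.
 * Returns 1 if the buffer was freed, 0 otherwise.
 */
static int toy_buf_put(struct toy_buf *b)
{
	if (!b)  /* tolerate NULL instead of dereferencing it */
		return 0;
	if (--b->refs > 0)
		return 0;
	free(b);
	return 1;
}
```

With the check in place, error paths can call toy_buf_put() unconditionally
without first testing whether the pointer was ever set.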


Jason A. Donenfeld [Tue, 25 Apr 2017 17:08:18 +0000 (19:08 +0200)]

macsec: dynamically allocate space for sglist

We call skb_cow_data, which is good anyway to ensure we can actually
modify the skb as such (another error from prior). Now that we have the
number of fragments required, we can safely allocate exactly that amount
of memory.

Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


David S. Miller [Wed, 26 Apr 2017 18:33:14 +0000 (14:33 -0400)]

Revert "phy: micrel: Disable auto negotiation on startup"

This reverts commit 99f81afc139c6edd14d77a91ee91685a414a1c66.

It was papering over the real problem, which is fixed by commit
f555f34fdc58 ("net: phy: fix auto-negotiation stall due to unavailable
interrupt")

Signed-off-by: David S. Miller <davem@davemloft.net>


Alexander Kochetkov [Thu, 20 Apr 2017 11:00:04 +0000 (14:00 +0300)]

net: phy: fix auto-negotiation stall due to unavailable interrupt

The Ethernet link on an interrupt-driven PHY was not coming up if the Ethernet
cable was plugged in before the Ethernet interface was brought up.

The patch triggers the PHY state machine to update the link state if the PHY was
requested to do auto-negotiation and the auto-negotiation complete flag is
already set.

During the power-up cycle the PHY does auto-negotiation, generates an interrupt
and sets the auto-negotiation complete flag. The interrupt is handled by the PHY
state machine, but the link state is not updated because the PHY is in the
PHY_READY state. Some time later the MAC is brought up, starts, and requests the
PHY to do auto-negotiation. If there are no new settings to advertise,
genphy_config_aneg() doesn't start PHY auto-negotiation. The PHY stays in the
auto-negotiation complete state and doesn't fire an interrupt. At the same time
the PHY state machine expects that the PHY has started auto-negotiation and
waits for an interrupt from the PHY that it will never get.

Fixes: 321beec5047a ("net: phy: Use interrupts when available in NOLINK state")
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Cc: stable <stable@vger.kernel.org> # v4.9+
Tested-by: Roger Quadros <rogerq@ti.com>
Tested-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Linus Torvalds [Wed, 26 Apr 2017 16:30:33 +0000 (09:30 -0700)]

Merge tag 'sound-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"Since we got a bonus week, let me try to screw a few pending fixes.

  A slightly large fix is the locking fix in ASoC STI driver, but it's
  pretty board-specific, and the risk is fairly low.

  All the rest are small / trivial fixes, mostly marked as stable, for
  ALSA sequencer core, ASoC topology, ASoC Intel bytcr and Firewire
  drivers"

* tag 'sound-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: intel: Fix PM and non-atomic crash in bytcr drivers
  ALSA: firewire-lib: fix inappropriate assignment between signed/unsigned type
  ALSA: seq: Don't break snd_use_lock_sync() loop by timeout
  ASoC: topology: Fix to store enum text values
  ASoC: STI: Fix null ptr dereference in IRQ handler
  ALSA: oxfw: fix regression to handle Stanton SCS.1m/1d

Al Viro [Wed, 5 Apr 2017 23:17:18 +0000 (19:17 -0400)]

HAVE_ARCH_HARDENED_USERCOPY is unconditional now

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Wed, 5 Apr 2017 23:15:53 +0000 (19:15 -0400)]

CONFIG_ARCH_HAS_RAW_COPY_USER is unconditional now

all architectures converted

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Wed, 26 Apr 2017 16:06:59 +0000 (12:06 -0400)]

Merge branches 'uaccess.alpha', 'uaccess.arc', 'uaccess.arm', 'uaccess.arm64', 'uaccess.avr32', 'uaccess.bfin', 'uaccess.c6x', 'uaccess.cris', 'uaccess.frv', 'uaccess.h8300', 'uaccess.hexagon', 'uaccess.ia64', 'uaccess.m32r', 'uaccess.m68k', 'uaccess.metag', 'uaccess.microblaze', 'uaccess.mips', 'uaccess.mn10300', 'uaccess.nios2', 'uaccess.openrisc', 'uaccess.parisc', 'uaccess.powerpc', 'uaccess.s390', 'uaccess.score', 'uaccess.sh', 'uaccess.sparc', 'uaccess.tile', 'uaccess.um', 'uaccess.unicore32', 'uaccess.x86' and 'uaccess.xtensa' into work.uaccess

Al Viro [Tue, 21 Mar 2017 16:02:25 +0000 (12:02 -0400)]

m32r: switch to RAW_COPY_USER

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Harald Freudenberger [Wed, 26 Apr 2017 11:56:41 +0000 (13:56 +0200)]

s390/crypt: use the correct module alias for paes_s390.

For automatic module loading (e.g. as it is used with cryptsetup)
an alias "paes" for the paes_s390 kernel module is needed.
Correct the paes_s390 module alias from "aes-all" to "paes".

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Christoph Hellwig [Wed, 26 Apr 2017 07:34:22 +0000 (09:34 +0200)]

ide-pm: always pass 0 error to ide_complete_rq in ide_do_devset

The caller only looks at the scsi_request result field anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

Christoph Hellwig [Wed, 26 Apr 2017 07:34:21 +0000 (09:34 +0200)]

ide-pm: always pass 0 error to __blk_end_request_all

ide_pm_execute_rq executes a PM request synchronously, and in the failure
case where it calls __blk_end_request_all it never checks the error field
passed to the end_io callback, so don't bother setting it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

Christoph Hellwig [Wed, 26 Apr 2017 07:34:20 +0000 (09:34 +0200)]

scsi_transport_sas: always pass 0 error to blk_end_request_all

The SAS transport queues are only used by bsg, and bsg always looks at
the scsi_request results and never at the error passed to the end_io
callback.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

Jason J. Herne [Thu, 2 Feb 2017 13:25:19 +0000 (08:25 -0500)]

s390/cpacf: Introduce kma instruction

Provide a kma instruction definition for use by callers of __cpacf_query.

Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

Jason J. Herne [Tue, 21 Feb 2017 14:00:54 +0000 (09:00 -0500)]

s390/cpacf: query instructions use unique parameters for compatibility with KMA

The new KMA instruction requires unique parameters. Update __cpacf_query to
generate a compatible assembler instruction.

Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Acked-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

Harald Freudenberger [Fri, 24 Feb 2017 08:10:05 +0000 (09:10 +0100)]

s390/trng: Introduce s390 TRNG device driver.

This patch introduces a new device driver s390-trng for the
s390 platform which exploits the new PRNO TRNG cpacf
subfunction. The true-random-number-generator is accessible
from userspace, by default visible as /dev/trng. The driver
also registers with the kernel's built-in hwrng API to feed the
hwrng with fresh entropy data. This generic device driver
for hardware random data is visible from userspace as
/dev/hwrng.

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Harald Freudenberger [Fri, 17 Mar 2017 09:46:31 +0000 (10:46 +0100)]

s390/crypto: Provide s390 specific arch random functionality.

This patch introduces s390 specific arch random functionality.
There exists a generic kernel API for arch specific random
number implementation (see include/linux/random.h). Here
comes the header file and a very small static code part
implementing the arch_random_* API based on the TRNG
subfunction coming with the reworked PRNG instruction.

The arch random implementation hooks into the kernel
initialization and checks for availability of the TRNG
function. In accordance with the arch random API, all functions
return false if the TRNG is not available. Otherwise the new
high quality entropy source provides fresh random data on each
invocation.

The s390 arch random feature build is controlled via
CONFIG_ARCH_RANDOM. This config option located in
arch/s390/Kconfig is enabled by default and appears
as entry "s390 architectural random number generation API"
in the submenu "Processor type and features" for s390 builds.

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Harald Freudenberger [Tue, 28 Feb 2017 07:59:22 +0000 (08:59 +0100)]

s390/crypto: Add new subfunctions to the cpacf PRNO function.

There is a new TRNG extension in the subcodes for the cpacf
PRNO function. This patch introduces new defines and a new
cpacf_trng inline function to provide these new features for
other kernel code parts.

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Harald Freudenberger [Fri, 24 Feb 2017 09:11:54 +0000 (10:11 +0100)]

s390/crypto: Renaming PPNO to PRNO.

The PPNO (Perform Pseudorandom Number Operation) instruction
has been renamed to PRNO (Perform Random Number Operation).
To avoid confusion and conflicts with future extensions of
this instruction (e.g. providing a true random number
generator), this patch renames all occurrences in cpacf.h and
adjusts the only exploiter code, which is the prng device
driver and one line in the s390 kvm feature check.

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Heiko Carstens [Mon, 24 Apr 2017 13:27:35 +0000 (15:27 +0200)]

s390/pageattr: avoid unnecessary page table splitting

The kernel page table splitting code will split page tables even for
features the CPU does not support. E.g. a CPU may not support the NX
feature.
In order to avoid this, remove those bits from the flags parameter
that correlate with unsupported CPU features within __set_memory(). In
addition add an early exit if the flags parameter does not have any
bits set afterwards.
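
The masking and early exit described above can be sketched in plain C. This
is a userspace illustration only, not the s390 code: the ATTR_* flag names,
the cpu_has_nx variable and set_memory_sketch() are hypothetical stand-ins.

```c
#include <stdbool.h>

/* Hypothetical attribute bits; the kernel uses its own constants. */
#define ATTR_RO (1UL << 0)
#define ATTR_RW (1UL << 1)
#define ATTR_NX (1UL << 2)
#define ATTR_X  (1UL << 3)

/* Stand-in for a CPU feature test; assumed, not a kernel API. */
static bool cpu_has_nx = false;

/* Counts how often the (expensive) page table walk would run. */
static int walks;

static int set_memory_sketch(unsigned long addr, int numpages,
			     unsigned long flags)
{
	(void)addr;
	(void)numpages;

	/* Drop bits that refer to a feature this CPU cannot enforce. */
	if (!cpu_has_nx)
		flags &= ~(ATTR_NX | ATTR_X);

	/* Nothing left to change: skip the walk and any table split. */
	if (!flags)
		return 0;

	walks++;	/* the real code would walk and split here */
	return 0;
}
```

With cpu_has_nx false, a request that only carries ATTR_NX returns before any
walk happens, while ATTR_NX | ATTR_RO still walks once for the read-only part.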

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Xin Long [Mon, 24 Apr 2017 07:33:39 +0000 (15:33 +0800)]

xfrm: do the garbage collection after flushing policy

Now xfrm garbage collection can be triggered by 'ip xfrm policy del'.
There is no reason not to do it after flushing policies, especially
considering that 'garbage collection deferred' is only triggered
when it reaches gc_thresh.

It's no good that the policy is gone but the xdst is still held.
Worse, xdst->route/orig_dst is also held and cannot be released even
if the orig_dst has already expired.

This patch does the garbage collection if any policy was removed in
xfrm_policy_flush.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

Andy Lutomirski [Sat, 22 Apr 2017 07:01:22 +0000 (00:01 -0700)]

x86/mm: Fix flush_tlb_page() on Xen

flush_tlb_page() passes a bogus range to flush_tlb_others() and
expects the latter to fix it up. native_flush_tlb_others() has the
fixup but Xen's version doesn't. Move the fixup to
flush_tlb_others().

AFAICS the only real effect is that, without this fix, Xen would
flush everything instead of just the one page on remote vCPUs
when flush_tlb_page() was called.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: e7b52ffd45a6 ("x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range")
Link: http://lkml.kernel.org/r/10ed0e4dfea64daef10b87fb85df1746999b4dba.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Andy Lutomirski [Sat, 22 Apr 2017 07:01:21 +0000 (00:01 -0700)]

x86/mm: Make flush_tlb_mm_range() more predictable

I'm about to rewrite the function almost completely, but first I
want to get a functional change out of the way.  Currently, if
flush_tlb_mm_range() does not flush the local TLB at all, it will
never do individual page flushes on remote CPUs.  This seems to be
an accident, and preserving it will be awkward.  Let's change it
first so that any regressions in the rewrite will be easier to
bisect and so that the rewrite can attempt to change no visible
behavior at all.

The fix is simple: we can simply avoid short-circuiting the
calculation of base_pages_to_flush.

As a side effect, this also eliminates a potential corner case: if
tlb_single_page_flush_ceiling == TLB_FLUSH_ALL, flush_tlb_mm_range()
could have ended up flushing the entire address space one page at a
time.
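
A rough sketch of a page-count calculation that is not short-circuited, as
described above. It is simplified (the huge-page case is left out) and all
names are hypothetical; SKETCH_FLUSH_ALL stands in for TLB_FLUSH_ALL.

```c
#include <limits.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_FLUSH_ALL  ULONG_MAX	/* sentinel: flush everything */

/*
 * Returns the number of single-page flushes to issue, or
 * SKETCH_FLUSH_ALL when a full flush is the better choice.
 */
static unsigned long pages_to_flush(unsigned long start, unsigned long end,
				    unsigned long ceiling)
{
	unsigned long pages = SKETCH_FLUSH_ALL;

	/* Always compute the count, regardless of local TLB state. */
	if (end != SKETCH_FLUSH_ALL)
		pages = (end - start) >> SKETCH_PAGE_SHIFT;

	/* Too many pages: fall back to a full flush. */
	if (pages > ceiling)
		pages = SKETCH_FLUSH_ALL;

	return pages;
}
```

Because the result starts out as the "flush all" sentinel, a full-range
request stays a full flush even when the ceiling itself equals the sentinel,
which is the corner case mentioned above.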

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4b29b771d9975aad7154c314534fec235618175a.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Andy Lutomirski [Sat, 22 Apr 2017 07:01:20 +0000 (00:01 -0700)]

x86/mm: Remove flush_tlb() and flush_tlb_current_task()

I was trying to figure out how flush_tlb_current_task() would
possibly work correctly if current->mm != current->active_mm, but I
realized I could spare myself the effort: it has no callers except
the unused flush_tlb() macro.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e52d64c11690f85e9f1d69d7b48cc2269cd2e94b.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Andy Lutomirski [Sat, 22 Apr 2017 07:01:19 +0000 (00:01 -0700)]

x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()

mark_screen_rdonly() is the last remaining caller of flush_tlb().
flush_tlb_mm_range() is potentially faster and isn't obsolete.

Compile-tested only because I don't know whether software that uses
this mechanism even exists.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/791a644076fc3577ba7f7b7cafd643cc089baa7d.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Kirill A. Shutemov [Tue, 25 Apr 2017 09:25:57 +0000 (12:25 +0300)]

x86/mm/64: Fix crash in remove_pagetable()

remove_pagetable() does a page walk using p*d_page_vaddr() plus a cast.
That's not the canonical approach -- we usually use p*d_offset() for that.

It works fine as long as all page table levels are present. We broke that
invariant by introducing the folded p4d page table level.

As a result, remove_pagetable() interprets a PMD as a PUD, which leads to a
crash:

BUG: unable to handle kernel paging request at ffff880300000000
IP: memchr_inv+0x60/0x110
PGD 317d067
P4D 317d067
PUD 3180067
PMD 33f102067
PTE 8000000300000060

Let's fix this by using p*d_offset() instead of p*d_page_vaddr() for
page walk.

Reported-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Fixes: f2a6a7050109 ("x86: Convert the rest of the code to support p4d_t")
Link: http://lkml.kernel.org/r/20170425092557.21852-1-kirill.shutemov@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Josh Poimboeuf [Wed, 26 Apr 2017 01:48:52 +0000 (20:48 -0500)]

x86/unwind: Dump all stacks in unwind_dump()

Currently unwind_dump() dumps only the most recently accessed stack.
But it has a few issues.

In some cases, 'first_sp' can get out of sync with 'stack_info', causing
unwind_dump() to start from the wrong address, flood the printk buffer,
and eventually read a bad address.

In other cases, dumping only the most recently accessed stack doesn't
give enough data to diagnose the error.

Fix both issues by dumping *all* stacks involved in the trace, not just
the last one.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 8b5e99f02264 ("x86/unwind: Dump stack data on warnings")
Link: http://lkml.kernel.org/r/016d6a9810d7d1bfc87ef8c0e6ee041c6744c909.1493171120.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Josh Poimboeuf [Wed, 26 Apr 2017 01:48:51 +0000 (20:48 -0500)]

x86/unwind: Silence more entry-code related warnings

Borislav Petkov reported the following unwinder warning:

  WARNING: kernel stack regs at ffffc9000024fea8 in udevadm:92 has bad 'bp' value 00007fffc4614d30
  unwind stack type:0 next_sp:          (null) mask:0x6 graph_idx:0
  ffffc9000024fea8: 000055a6100e9b38 (0x55a6100e9b38)
  ffffc9000024feb0: 000055a6100e9b35 (0x55a6100e9b35)
  ffffc9000024feb8: 000055a6100e9f68 (0x55a6100e9f68)
  ffffc9000024fec0: 000055a6100e9f50 (0x55a6100e9f50)
  ffffc9000024fec8: 00007fffc4614d30 (0x7fffc4614d30)
  ffffc9000024fed0: 000055a6100eaf50 (0x55a6100eaf50)
  ffffc9000024fed8: 0000000000000000 ...
  ffffc9000024fee0: 0000000000000100 (0x100)
  ffffc9000024fee8: ffff8801187df488 (0xffff8801187df488)
  ffffc9000024fef0: 00007ffffffff000 (0x7ffffffff000)
  ffffc9000024fef8: 0000000000000000 ...
  ffffc9000024ff10: ffffc9000024fe98 (0xffffc9000024fe98)
  ffffc9000024ff18: 00007fffc4614d00 (0x7fffc4614d00)
  ffffc9000024ff20: ffffffffffffff10 (0xffffffffffffff10)
  ffffc9000024ff28: ffffffff811c6c1f (SyS_newlstat+0xf/0x10)
  ffffc9000024ff30: 0000000000000010 (0x10)
  ffffc9000024ff38: 0000000000000296 (0x296)
  ffffc9000024ff40: ffffc9000024ff50 (0xffffc9000024ff50)
  ffffc9000024ff48: 0000000000000018 (0x18)
  ffffc9000024ff50: ffffffff816b2e6a (entry_SYSCALL_64_fastpath+0x18/0xa8)
  ...

It unwound from an interrupt which came in right after entry code
called into a C syscall handler, before it had a chance to set up the
frame pointer, so regs->bp still had its user space value.

Add a check to silence warnings in such a case, where an interrupt
has occurred and regs->sp is almost at the end of the stack.

Reported-by: Borislav Petkov <bp@suse.de>
Tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: c32c47c68a0a ("x86/unwind: Warn on bad frame pointer")
Link: http://lkml.kernel.org/r/c695f0d0d4c2cfe6542b90e2d0520e11eb901eb5.1493171120.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Linus Torvalds [Tue, 25 Apr 2017 21:07:24 +0000 (14:07 -0700)]

Merge tag 'arc-4.11-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fix from Vineet Gupta:
"Last minute fixes for ARC:

   - build error in Mellanox nps platform

   - addressing lack of saving FPU regs in relevant configs"

* tag 'arc-4.11-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARCv2: entry: save Accumulator register pair (r58:59) if present
  ARC: [plat-eznps] Fix build error

J. Bruce Fields [Fri, 21 Apr 2017 19:26:30 +0000 (15:26 -0400)]

nfsd: stricter decoding of write-like NFSv2/v3 ops

The NFSv2/v3 code does not systematically check whether we decode past
the end of the buffer. This generally appears to be harmless, but there
are a few places where we do arithmetic on the pointers involved and
don't account for the possibility that a length could be negative. Add
checks to catch these.
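
As an illustration of the kind of check meant here (not the nfsd code itself),
the sketch below validates a decoded length against the bytes remaining in
the buffer before any pointer arithmetic is done with it. payload_fits() is a
hypothetical helper; the only protocol fact it relies on is that XDR pads
opaque data to a multiple of 4 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Returns 1 if a payload of 'len' bytes, padded to XDR's 4-byte
 * alignment, fits between 'p' and 'end', and 0 otherwise.
 */
static int payload_fits(const uint8_t *p, const uint8_t *end, uint32_t len)
{
	size_t avail;
	size_t padded;

	/* Already past the end of the buffer. */
	if (p > end)
		return 0;
	avail = (size_t)(end - p);

	/* A huge length is what a "negative" value looks like unsigned. */
	if (len > UINT32_MAX - 3)
		return 0;

	/* Round up to a multiple of 4 and compare sizes, not pointers. */
	padded = ((size_t)len + 3) & ~(size_t)3;
	return padded <= avail;
}
```

Comparing sizes instead of computing p + len first avoids forming a pointer
that may lie outside the buffer.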

Reported-by: Tuomas Haanpää <thaan@synopsys.com>
Reported-by: Ari Kauppi <ari@synopsys.com>
Reviewed-by: NeilBrown <neilb@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

J. Bruce Fields [Tue, 25 Apr 2017 20:21:34 +0000 (16:21 -0400)]

nfsd4: minor NFSv2/v3 write decoding cleanup

Use a couple shortcuts that will simplify a following bugfix.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

J. Bruce Fields [Fri, 21 Apr 2017 20:10:18 +0000 (16:10 -0400)]

nfsd: check for oversized NFSv2/v3 arguments

A client can append random data to the end of an NFSv2 or NFSv3 RPC call
without our complaining; we'll just stop parsing at the end of the
expected data and ignore the rest.

Encoded arguments and replies are stored together in an array of pages,
and if a call is too large it could leave inadequate space for the
reply.  This is normally OK because NFS RPC's typically have either
short arguments and long replies (like READ) or long arguments and short
replies (like WRITE).  But a client that sends an incorrectly long call
can violate those assumptions.  This was observed to cause crashes.

Also, several operations increment rq_next_page in the decode routine
before checking the argument size, which can leave rq_next_page pointing
well past the end of the page array, causing trouble later in
svc_free_pages.

So, following a suggestion from Neil Brown, add a central check to
enforce our expectation that no NFSv2/v3 call has both a large call and
a large reply.
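
The idea of such a central check can be sketched as follows. This is a
simplified model under stated assumptions, not the actual nfsd code: struct
proc_sketch, its max_reply_bytes field, call_size_ok() and the one-page
threshold are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical per-procedure description. */
struct proc_sketch {
	size_t max_reply_bytes;	/* upper bound on the encoded reply */
};

/* Accept a call unless both the call and its possible reply are large. */
static bool call_size_ok(const struct proc_sketch *proc, size_t call_len)
{
	bool large_reply = proc->max_reply_bytes > SKETCH_PAGE_SIZE;
	bool large_call = call_len > SKETCH_PAGE_SIZE;

	return !(large_reply && large_call);
}
```

A WRITE-like procedure (small reply) may still carry a long call, and a
READ-like procedure (large reply) a short one; only the combination of both
is rejected.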

As followup we may also want to rewrite the encoding routines to check
more carefully that they aren't running off the end of the page array.

We may also consider rejecting calls that have any extra garbage
appended.  That would be safer, and within our rights by spec, but given
the age of our server and the NFS protocol, and the fact that we've
never enforced this before, we may need to balance that against the
possibility of breaking some oddball client.

Reported-by: Tuomas Haanpää <thaan@synopsys.com>
Reported-by: Ari Kauppi <ari@synopsys.com>
Cc: stable@vger.kernel.org
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

Yan, Zheng [Wed, 19 Apr 2017 02:01:48 +0000 (10:01 +0800)]

ceph: fix recursion between ceph_set_acl() and __ceph_setattr()

ceph_set_acl() calls __ceph_setattr() if the setacl operation needs
to modify inode's i_mode. __ceph_setattr() updates inode's i_mode,
then calls posix_acl_chmod().

The problem is that __ceph_setattr() calls posix_acl_chmod() before
sending the setattr request. The get_acl() call in posix_acl_chmod()
can trigger a getxattr request. The reply of the getxattr request
can restore inode's i_mode to its old value. The set_acl() call in
posix_acl_chmod() sees old value of inode's i_mode, so it calls
__ceph_setattr() again.

Cc: stable@vger.kernel.org # needs backporting for < 4.9
Link: http://tracker.ceph.com/issues/19688
Reported-by: Jerry Lee <leisurelysw24@gmail.com>
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Luis Henriques <lhenriques@suse.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Alexander Potapenko [Tue, 25 Apr 2017 16:51:46 +0000 (18:51 +0200)]

net/packet: check length in getsockopt() called with PACKET_HDRLEN

If getsockopt() is called with PACKET_HDRLEN and optlen < 4,
|val| remains uninitialized and the syscall may behave differently
depending on its value, and even copy garbage to userspace on certain
architectures. To fix this we now return -EINVAL if optlen is too small.

This bug has been detected with KMSAN.
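
A minimal userspace sketch of the length check, assuming a plain memcpy() in
place of the kernel's user-copy helpers. get_hdrlen_sketch() and its
parameters are hypothetical and only model the ordering: validate optlen
first, read the value second.

```c
#include <errno.h>
#include <string.h>

/*
 * Copies an int-sized option value out of 'optval' into '*out'.
 * Returns -EINVAL if the caller supplied fewer bytes than an int,
 * so 'val' is never used uninitialized.
 */
static int get_hdrlen_sketch(const void *optval, int optlen, int *out)
{
	int val;

	if (optlen < (int)sizeof(val))
		return -EINVAL;

	memcpy(&val, optval, sizeof(val));
	*out = val;
	return 0;
}
```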

Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

David Ahern [Tue, 25 Apr 2017 16:17:29 +0000 (09:17 -0700)]

net: ipv6: regenerate host route if moved to gc list

Taking down the loopback device wreaks havoc on IPv6 routing. By
extension, taking down a VRF device wreaks havoc on its table.

Dmitry and Andrey both reported heap out-of-bounds accesses in the IPv6
FIB code while running the syzkaller fuzzer. The root cause is that a dead
dst that is on the garbage list gets reinserted into the IPv6 FIB. While on
the gc list (or perhaps when it gets added to the gc list) the dst->next is
set to an IPv4 dst. A subsequent walk of the ipv6 tables causes the
out-of-bounds access.

Andrey's reproducer was the key to getting to the bottom of this.

With IPv6, host routes for an address have the dst->dev set to the
loopback device. When the 'lo' device is taken down, rt6_ifdown initiates
a walk of the fib evicting routes with the 'lo' device which means all
host routes are removed. That process moves the dst which is attached to
an inet6_ifaddr to the gc list and marks it as dead.

The recent change to keep global IPv6 addresses added a new function,
fixup_permanent_addr, that is called on admin up. That function restarts
dad for an inet6_ifaddr and when it completes the host route attached
to it is inserted into the fib. Since the route was marked dead and
moved to the gc list, re-inserting the route causes the reported
out-of-bounds accesses. If the device with the address is taken down
or the address is removed, the WARN_ON in fib6_del is triggered.

All of those faults are fixed by regenerating the host route if the
existing one has been moved to the gc list, something that can be
determined by checking if the rt6i_ref counter is 0.

Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Xin Long [Tue, 25 Apr 2017 14:58:37 +0000 (22:58 +0800)]

bridge: move bridge multicast cleanup to ndo_uninit

While a bridge device is being removed, if the bridge is still up, a new mdb
entry can still be added in br_multicast_add_group() after all mdb entries
have been removed in br_multicast_dev_del(), for example via this path:

  mld_ifc_timer_expire ->
    mld_sendpack -> ...
      br_multicast_rcv ->
        br_multicast_add_group

The new mp's timer will be set up. If the timer expires after the bridge
is freed, it may cause a use-after-free panic in br_multicast_group_expired.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
IP: [<ffffffffa07ed2c8>] br_multicast_group_expired+0x28/0xb0 [bridge]
Call Trace:
<IRQ>
[<ffffffff81094536>] call_timer_fn+0x36/0x110
[<ffffffffa07ed2a0>] ? br_mdb_free+0x30/0x30 [bridge]
[<ffffffff81096967>] run_timer_softirq+0x237/0x340
[<ffffffff8108dcbf>] __do_softirq+0xef/0x280
[<ffffffff8169889c>] call_softirq+0x1c/0x30
[<ffffffff8102c275>] do_softirq+0x65/0xa0
[<ffffffff8108e055>] irq_exit+0x115/0x120
[<ffffffff81699515>] smp_apic_timer_interrupt+0x45/0x60
[<ffffffff81697a5d>] apic_timer_interrupt+0x6d/0x80

Nikolay also found it would cause a memory leak - the mdb hash is
reallocated and not freed due to the mdb rehash.

unreferenced object 0xffff8800540ba800 (size 2048):
  backtrace:
    [<ffffffff816e2287>] kmemleak_alloc+0x67/0xc0
    [<ffffffff81260bea>] __kmalloc+0x1ba/0x3e0
    [<ffffffffa05c60ee>] br_mdb_rehash+0x5e/0x340 [bridge]
    [<ffffffffa05c74af>] br_multicast_new_group+0x43f/0x6e0 [bridge]
    [<ffffffffa05c7aa3>] br_multicast_add_group+0x203/0x260 [bridge]
    [<ffffffffa05ca4b5>] br_multicast_rcv+0x945/0x11d0 [bridge]
    [<ffffffffa05b6b10>] br_dev_xmit+0x180/0x470 [bridge]
    [<ffffffff815c781b>] dev_hard_start_xmit+0xbb/0x3d0
    [<ffffffff815c8743>] __dev_queue_xmit+0xb13/0xc10
    [<ffffffff815c8850>] dev_queue_xmit+0x10/0x20
    [<ffffffffa02f8d7a>] ip6_finish_output2+0x5ca/0xac0 [ipv6]
    [<ffffffffa02fbfc6>] ip6_finish_output+0x126/0x2c0 [ipv6]
    [<ffffffffa02fc245>] ip6_output+0xe5/0x390 [ipv6]
    [<ffffffffa032b92c>] NF_HOOK.constprop.44+0x6c/0x240 [ipv6]
    [<ffffffffa032bd16>] mld_sendpack+0x216/0x3e0 [ipv6]
    [<ffffffffa032d5eb>] mld_ifc_timer_expire+0x18b/0x2b0 [ipv6]

This can happen when 'ip link' removes a bridge or when a netns with a
bridge device inside is destroyed.

Following Nikolay's suggestion, this patch cleans up bridge multicast in
ndo_uninit, after the bridge dev is shut down, instead of in br_dev_delete,
so that the netif_running check in br_multicast_add_group can avoid this issue.
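
Why the ordering matters can be shown with a toy model. Everything below is a
hypothetical simplification (struct bridge_sketch, add_group(),
multicast_cleanup(), teardown()); it only mirrors the idea that entries can
no longer be added once the device has stopped running.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for a bridge device. */
struct bridge_sketch {
	bool running;		/* models netif_running() */
	int mdb_entries;	/* models the number of group entries */
};

/* Models the add path: refuses new entries once the device is down. */
static int add_group(struct bridge_sketch *br)
{
	if (!br->running)
		return -1;
	br->mdb_entries++;
	return 0;
}

/* Models the multicast cleanup: drops every entry. */
static void multicast_cleanup(struct bridge_sketch *br)
{
	br->mdb_entries = 0;
}

/* Teardown in the intended order: stop first, clean up second. */
static void teardown(struct bridge_sketch *br)
{
	br->running = false;
	multicast_cleanup(br);
}
```

If the cleanup ran while the device was still running, a late add_group()
call could succeed afterwards and leave an entry behind; with the order
above it fails and the entry count stays at zero.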

v1->v2:
  - fix this issue by moving br_multicast_dev_del to ndo_uninit, instead
    of calling dev_close in br_dev_delete.

(NOTE: Depends upon b6fe0440c637 ("bridge: implement missing ndo_uninit()"))

Fixes: e10177abf842 ("bridge: multicast: fix handling of temp and perm entries")
Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Christoph Hellwig [Fri, 21 Apr 2017 06:26:57 +0000 (08:26 +0200)]

nvme-lightnvm: add missing endianness conversion in nvme_nvm_end_io

Found by sparse.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Matias Bjørling <matias@cnexlabs.com>

Jon Derrick [Tue, 25 Apr 2017 00:02:43 +0000 (18:02 -0600)]

nvme-scsi: Consider LBA format in IO splitting calculation

The current command submission code uses a sector-based value when
considering the maximum number of blocks per command. With a
4k-formatted namespace and a command exceeding max hardware limits, this
calculation doesn't split IOs which should be split and fails in the
nvme layer. This patch fixes that calculation and enables IO splitting
in these circumstances.

Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Reviewed-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

Ewan D. Milne [Mon, 24 Apr 2017 17:24:16 +0000 (13:24 -0400)]

nvme-fc: avoid memory corruption caused by calling nvmf_free_options() twice

Do not call nvmf_free_options() from the nvme_fc_ctlr destructor if
nvme_fc_create_ctrl() returns an error, because nvmf_create_ctrl()
frees the options when an error is returned.

Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

James Smart [Sat, 22 Apr 2017 00:49:08 +0000 (17:49 -0700)]

lpfc: Fix memory corruption of the lpfc_ncmd->list pointers

lpfc was changing the private pointer that is set/maintained by
the nvme_fc transport. This caused two issues: a) the transport, on
teardown, may erroneously attempt to free whatever address was set;
and b) lpfc uses any value set in lpfc_nvme_fcp_abort() and
assumes it's a valid io request.

Correct the issue by properly defining a context structure for lpfc.
lpfc is also updated to clear the private context structure on io
completion.

Since this bug caused scrutiny of the way lpfc moves local request
structures between lists, list_del() calls were also cleaned up to
list_del_init() calls.

This is a nvme-specific bug. The patch was cut against the
linux-block tree, for-4.12/block tree. It should be pulled in through
that tree.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

Sabrina Dubroca [Tue, 25 Apr 2017 13:56:50 +0000 (15:56 +0200)]

ipv6: fix source routing

Commit a149e7c7ce81 ("ipv6: sr: add support for SRH injection through
setsockopt") introduced handling of IPV6_SRCRT_TYPE_4, but at the same
time restricted it to only IPV6_SRCRT_TYPE_0 and
IPV6_SRCRT_TYPE_4. Previously, ipv6_push_exthdr() and fl6_update_dst()
would also handle other values (ie STRICT and TYPE_2).

Restore previous source routing behavior, by handling IPV6_SRCRT_STRICT
and IPV6_SRCRT_TYPE_2 the same way as IPV6_SRCRT_TYPE_0 in
ipv6_push_exthdr() and fl6_update_dst().
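
The restored dispatch can be pictured as a switch that groups the three older
types together. This is an illustrative sketch: the enum names mirror the
constants in the text, but the numeric values, the handler enum and
pick_handler() are assumptions made for the example.

```c
/* Illustrative type codes; the numeric values are assumptions. */
enum srcrt_type_sketch {
	SRCRT_TYPE_0 = 0,
	SRCRT_STRICT = 1,
	SRCRT_TYPE_2 = 2,
	SRCRT_TYPE_4 = 4
};

/* Hypothetical labels for the code path that handles a header. */
enum rthdr_handler_sketch {
	HANDLER_NONE,
	HANDLER_RTHDR0,
	HANDLER_RTHDR4
};

static enum rthdr_handler_sketch pick_handler(int type)
{
	switch (type) {
	case SRCRT_TYPE_0:
	case SRCRT_STRICT:
	case SRCRT_TYPE_2:
		/* STRICT and TYPE_2 share the TYPE_0 path again. */
		return HANDLER_RTHDR0;
	case SRCRT_TYPE_4:
		return HANDLER_RTHDR4;
	default:
		return HANDLER_NONE;
	}
}
```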

Fixes: a149e7c7ce81 ("ipv6: sr: add support for SRH injection through setsockopt")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


Wei Yongjun [Tue, 25 Apr 2017 16:15:30 +0000 (16:15 +0000)]

lightnvm: fix possible memory leak in pblk_bb_discovery()

'blks' is allocated in pblk_bb_discovery() and should be freed
before returning from the nvm_get_tgt_bb_tbl() error handling cases;
otherwise it is leaked. Also skip assigning blks to
rlun->bb_list on error.
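
The pattern, free on the error path and hand over ownership only on success,
looks roughly like this in plain C. It is a sketch under assumptions and not
the pblk code: struct lun, get_bb_tbl() and bb_discovery() are hypothetical
userspace stand-ins.

```c
#include <assert.h>
#include <stdlib.h>

struct lun {
    unsigned char *bb_list;   /* owned by the lun once assigned */
};

/* Hypothetical table reader; a nonzero `fail` simulates the error case. */
static int get_bb_tbl(unsigned char *blks, size_t n, int fail)
{
    if (fail)
        return -1;
    for (size_t i = 0; i < n; i++)
        blks[i] = 0;
    return 0;
}

static int bb_discovery(struct lun *rlun, size_t nr_blks, int fail)
{
    unsigned char *blks;
    int ret;

    assert(rlun != NULL && nr_blks > 0);
    blks = malloc(nr_blks);
    if (!blks)
        return -1;

    ret = get_bb_tbl(blks, nr_blks, fail);
    if (ret) {
        free(blks);           /* release the buffer on the error path */
        return ret;           /* and leave rlun->bb_list untouched */
    }

    rlun->bb_list = blks;     /* ownership moves only on success */
    return 0;
}
```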

Fixes: a4bd217b4326 ("lightnvm: physical block device (pblk) target")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Jens Axboe <axboe@fb.com>


sudarsana.kalluru@cavium.com [Tue, 25 Apr 2017 03:59:10 +0000 (20:59 -0700)]

qed: Fix error in the dcbx app meta data initialization.

The DCBX app_data array is initialized with incorrect values for
the personality field. This would prevent offloaded protocols from
honoring the PFC.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


stephen hemminger [Tue, 25 Apr 2017 01:33:38 +0000 (18:33 -0700)]

netvsc: fix calculation of available send sections

My change (introduced in 4.11) to use find_first_clear_bit
incorrectly assumed that the size argument was in words, not bits.
As a result, only a small number of the available send
sections were actually being used. This can cause performance loss
with some workloads.

Since map_words is now used only during initialization, it can
be on stack instead of in per-device data.
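
The words-versus-bits mix-up is easy to reproduce outside the kernel. The
sketch below is not the netvsc code and does not use the kernel bitmap API:
first_clear_bit() and set_section() are hypothetical userspace helpers whose
size argument, like the one described above, counts bits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Return the index of the first zero bit among the first nbits bits,
 * or nbits if all of them are set. The size argument counts BITS. */
static size_t first_clear_bit(const unsigned long *map, size_t nbits)
{
    assert(map != NULL);
    for (size_t i = 0; i < nbits; i++) {
        if (!(map[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD))))
            return i;
    }
    return nbits;
}

/* Mark send section i as in use. */
static void set_section(unsigned long *map, size_t i)
{
    map[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
}
```

With a two-word map, passing 2 (the word count) searches only bits 0 and 1,
so the map looks full as soon as two sections are busy; passing
2 * BITS_PER_WORD searches every section.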

Fixes: b58a185801da ("netvsc: simplify get next send section")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Andreas Kemnade [Mon, 24 Apr 2017 19:18:39 +0000 (21:18 +0200)]

net: hso: fix module unloading

Keep the tty driver registered until the usb driver is unregistered.
Without that, rmmod hso produces traces like this:

[40261.645904] usb 2-2: new high-speed USB device number 2 using ehci-omap
[40261.854644] usb 2-2: New USB device found, idVendor=0af0, idProduct=8800
[40261.862609] usb 2-2: New USB device strings: Mfr=3, Product=2, SerialNumber=0
[40261.872772] usb 2-2: Product: Globetrotter HSUPA Modem
[40261.880279] usb 2-2: Manufacturer: Option N.V.
[40262.021270] hso 2-2:1.5: Not our interface
[40265.556945] hso: unloaded
[40265.559875] usbcore: deregistering interface driver hso
[40265.595947] Unable to handle kernel NULL pointer dereference at virtual address 00000033
[40265.604522] pgd = ecb14000
[40265.611877] [00000033] *pgd=00000000
[40265.617034] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[40265.622650] Modules linked in: hso(-) bnep bluetooth ipv6 arc4 twl4030_madc_hwmon wl18xx wlcore mac80211 cfg80211 snd_soc_simple_card snd_soc_simple_card_utils snd_soc_omap_twl4030 snd_soc_gtm601 generic_adc_battery extcon_gpio omap3_isp videobuf2_dma_contig videobuf2_memops wlcore_sdio videobuf2_v4l2 videobuf2_core ov9650 bmp280_i2c v4l2_common bmp280 bmg160_i2c bmg160_core at24 nvmem_core videodev bmc150_accel_i2c bmc150_magn_i2c media bmc150_accel_core tsc2007 bmc150_magn leds_tca6507 bno055 snd_soc_omap_mcbsp industrialio_triggered_buffer snd_soc_omap kfifo_buf snd_pcm_dmaengine gpio_twl4030 snd_soc_twl4030 twl4030_vibra twl4030_madc wwan_on_off ehci_omap pwm_bl pwm_omap_dmtimer panel_tpo_td028ttec1 encoder_opa362 connector_analog_tv omapdrm drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect
[40265.698211]  sysimgblt fb_sys_fops cfbcopyarea drm omapdss usb_f_ecm g_ether usb_f_rndis u_ether libcomposite configfs omap2430 phy_twl4030_usb musb_hdrc twl4030_charger industrialio w2sg0004 twl4030_pwrbutton bq27xxx_battery w1_bq27000 omap_hdq [last unloaded: hso]
[40265.723175] CPU: 0 PID: 2701 Comm: rmmod Not tainted 4.11.0-rc6-letux+ #6
[40265.730346] Hardware name: Generic OMAP36xx (Flattened Device Tree)
[40265.736938] task: ecb81100 task.stack: ecb82000
[40265.741729] PC is at cdev_del+0xc/0x2c
[40265.745666] LR is at tty_unregister_device+0x40/0x50
[40265.750915] pc : [<c027472c>]    lr : [<c04b3ecc>]    psr: 600b0113
sp : ecb83ea8  ip : eca4f898  fp : 00000000
[40265.763000] r10: 00000000  r9 : 00000000  r8 : 00000001
[40265.768493] r7 : eca4f800  r6 : 00000003  r5 : 00000000  r4 : ffffffff
[40265.775360] r3 : c1458d54  r2 : 00000000  r1 : 00000004  r0 : ffffffff
[40265.782257] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[40265.789764] Control: 10c5387d  Table: acb14019  DAC: 00000051
[40265.795806] Process rmmod (pid: 2701, stack limit = 0xecb82218)
[40265.802062] Stack: (0xecb83ea8 to 0xecb84000)
[40265.806640] 3ea0:                   ec9e8100 c04b3ecc bf737378 ed5b7c00 00000003 bf7327ec
[40265.815277] 3ec0: eca4f800 00000000 ec9fd800 eca4f800 bf737070 bf7328bc eca4f820 c05a9a04
[40265.823883] 3ee0: eca4f820 00000000 00000001 eca4f820 ec9fd870 bf737070 eca4f854 ec9fd8a4
[40265.832519] 3f00: ecb82000 00000000 00000000 c04e6960 eca4f820 bf737070 bf737048 00000081
[40265.841125] 3f20: c01071e4 c04e6a60 ecb81100 bf737070 bf737070 c04e5d94 bf737020 c05a8f88
[40265.849731] 3f40: bf737100 00000800 7f5fa254 00000081 c01071e4 c01c4afc 00000000 006f7368
[40265.858367] 3f60: ecb815f4 00000000 c0cac9c4 c01071e4 ecb82000 00000000 00000000 c01512f4
[40265.866973] 3f80: ed5b3200 c01071e4 7f5fa220 7f5fa220 bea78ec9 0010711c 7f5fa220 7f5fa220
[40265.875579] 3fa0: bea78ec9 c0107040 7f5fa220 7f5fa220 7f5fa254 00000800 dd35b800 dd35b800
[40265.884216] 3fc0: 7f5fa220 7f5fa220 bea78ec9 00000081 bea78dcc 00000000 bea78bd8 00000000
[40265.892822] 3fe0: b6f70521 bea78b6c 7f5dd613 b6f70526 80070030 7f5fa254 ffffffff ffffffff
[40265.901458] [<c027472c>] (cdev_del) from [<c04b3ecc>] (tty_unregister_device+0x40/0x50)
[40265.909942] [<c04b3ecc>] (tty_unregister_device) from [<bf7327ec>] (hso_free_interface+0x80/0x144 [hso])
[40265.919982] [<bf7327ec>] (hso_free_interface [hso]) from [<bf7328bc>] (hso_disconnect+0xc/0x18 [hso])
[40265.929718] [<bf7328bc>] (hso_disconnect [hso]) from [<c05a9a04>] (usb_unbind_interface+0x84/0x200)
[40265.939239] [<c05a9a04>] (usb_unbind_interface) from [<c04e6960>] (device_release_driver_internal+0x138/0x1cc)
[40265.949798] [<c04e6960>] (device_release_driver_internal) from [<c04e6a60>] (driver_detach+0x60/0x6c)
[40265.959503] [<c04e6a60>] (driver_detach) from [<c04e5d94>] (bus_remove_driver+0x64/0x8c)
[40265.968017] [<c04e5d94>] (bus_remove_driver) from [<c05a8f88>] (usb_deregister+0x5c/0xb8)
[40265.976654] [<c05a8f88>] (usb_deregister) from [<c01c4afc>] (SyS_delete_module+0x160/0x1dc)
[40265.985443] [<c01c4afc>] (SyS_delete_module) from [<c0107040>] (ret_fast_syscall+0x0/0x1c)
[40265.994171] Code: c1458d54 e59f3020 e92d4010 e1a04000 (e5941034)
[40266.016693] ---[ end trace 9d5ac43c7e41075c ]---

Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


Parthasarathy Bhuvaragan [Mon, 24 Apr 2017 13:00:43 +0000 (15:00 +0200)]

tipc: fix socket flow control accounting error at tipc_recv_stream

Until now in tipc_recv_stream(), we update the received
unacknowledged bytes based on a stack variable and not based on the
actual message size.
If the user buffer passed at tipc_recv_stream() is smaller than the
received skb, the size variable in stack differs from the actual
message size in the skb. This leads to a flow control accounting
error causing permanent congestion.

In this commit, we fix this accounting error by always using the
size of the incoming message.
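
A simplified model of the accounting shows why the copied size is the wrong
quantity. This is an assumption-laden sketch, not tipc_recv_stream(): struct
rx_sock and recv_stream() are hypothetical, and header sizes are ignored.

```c
#include <assert.h>
#include <stddef.h>

struct rx_sock {
    size_t rcv_unacked;   /* bytes received but not yet acknowledged */
};

/* Copy up to buf_len bytes of a msg_len-byte message, starting at *offset.
 * Returns the number of bytes copied by this call. Once the message is
 * fully consumed, the whole message size is accounted, not the size of
 * the last copy. */
static size_t recv_stream(struct rx_sock *sk, size_t msg_len,
                          size_t *offset, size_t buf_len)
{
    size_t need, sz;

    assert(sk != NULL && offset != NULL && *offset < msg_len);
    need = msg_len - *offset;
    sz = need < buf_len ? need : buf_len;
    *offset += sz;

    if (*offset == msg_len) {
        sk->rcv_unacked += msg_len;   /* accounting by message size */
        *offset = 0;                  /* ready for the next message */
    }
    return sz;
}
```

Reading a 100-byte message through a 30-byte buffer takes four calls; adding
the last copy size (10) instead of the message size (100) would leave 90
bytes permanently unaccounted.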

Fixes: 10724cc7bb78 ("tipc: redesign connection-level flow control")
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Parthasarathy Bhuvaragan [Mon, 24 Apr 2017 13:00:42 +0000 (15:00 +0200)]

tipc: fix socket flow control accounting error at tipc_send_stream

Until now in tipc_send_stream(), we return -1 when the socket
encounters link congestion even if the socket had successfully
sent partial data. This is incorrect, as the application resends
the same partial data, leading to data corruption at the
receiver's end.

In this commit, we return the partially sent bytes as the return
value at link congestion.
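
The return-value convention can be sketched as follows. It is an
illustration only, not tipc_send_stream(): struct fake_link, link_xmit() and
send_stream() are hypothetical, and congestion is modelled as a fixed amount
of remaining room.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical link model: accepts at most `room` more bytes. */
struct fake_link {
    size_t room;
};

/* Try to queue one chunk; returns 0 on success, -1 on congestion. */
static int link_xmit(struct fake_link *l, size_t chunk)
{
    if (chunk > l->room)
        return -1;
    l->room -= chunk;
    return 0;
}

/* Send len bytes in chunks of at most mtu bytes. On congestion, report
 * the bytes already sent, and return the error only if nothing was sent. */
static long send_stream(struct fake_link *l, size_t len, size_t mtu)
{
    size_t sent = 0;

    assert(l != NULL && mtu > 0);
    while (sent < len) {
        size_t chunk = len - sent < mtu ? len - sent : mtu;

        if (link_xmit(l, chunk) != 0)
            return sent ? (long)sent : -1;
        sent += chunk;
    }
    return (long)sent;
}
```

A caller that receives a short positive count resumes from that offset
instead of resending data the peer already has.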

Fixes: 10724cc7bb78 ("tipc: redesign connection-level flow control")
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Takashi Iwai [Tue, 25 Apr 2017 15:43:56 +0000 (17:43 +0200)]

Merge tag 'asoc-fix-v4.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v4.11

A few last minute fixes for v4.11, the STI fix is relatively large but
driver specific and has been cooking in -next for a little while now:

- A fix from Takashi for some suspend/resume related crashes in the
   Intel drivers.
- A fix from Mousumi Jana for issues with incorrectly created
   enumeration controls generated from topology files which could cause
   problems for userspace.
- Fixes from Arnaud Pouliquen for some crashes due to races with the
   interrupt handler in the STI driver.


Paolo Abeni [Mon, 24 Apr 2017 12:18:28 +0000 (14:18 +0200)]

ipv6: move stub initialization after ipv6 setup completion

The ipv6 stub pointer is currently initialized before the ipv6
routing subsystem: a 3rd party can access and use the stub
before the routing data is ready.
Moreover, the pointer is not cleared in case of an initialization
error, possibly leading to use of a dangling pointer.

This change addresses the above by moving the stub initialization
to the end of the ipv6 init code.
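
The ordering rule, publish the pointer last and never on a failed init, can
be shown in a few lines. This sketch is not the kernel's ipv6 init code:
struct ipv6_ops, stub, routing_init() and ipv6_init() are hypothetical names
used for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct ipv6_ops {
    int (*route_lookup)(int dst);
};

static int real_lookup(int dst)
{
    return dst + 1;
}

static const struct ipv6_ops real_ops = { .route_lookup = real_lookup };

/* Pointer that other modules read; NULL means "not available". */
static const struct ipv6_ops *stub;

/* Hypothetical routing setup; a nonzero `fail` simulates an error. */
static int routing_init(int fail)
{
    return fail ? -1 : 0;
}

static int ipv6_init(int fail_routing)
{
    int err = routing_init(fail_routing);

    if (err)
        return err;       /* stub was never published, nothing dangles */

    stub = &real_ops;     /* publish last, once everything is ready */
    return 0;
}

/* What a third party would do: only valid after a successful init. */
static int stub_lookup(int dst)
{
    assert(stub != NULL);
    return stub->route_lookup(dst);
}
```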

Fixes: 5f81bd2e5d80 ("ipv6: export a stub for IPv6 symbols used by vxlan")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Pan Bian [Mon, 24 Apr 2017 10:29:16 +0000 (18:29 +0800)]

team: fix memory leaks

In functions team_nl_send_port_list_get() and
team_nl_send_options_get(), pointer skb holds the return value of
nlmsg_new(). When the call to genlmsg_put() fails, the memory is not
freed. This results in a memory leak.
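
A userspace analogue makes the leak observable. This is a sketch under
assumptions, not the team driver or the netlink API: msg_new(), msg_free(),
msg_put_header() and send_options() are hypothetical helpers, and
live_buffers exists only so the example can check for leaks.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Counts live buffers so a leak is visible in this sketch. */
static int live_buffers;

static void *msg_new(size_t size)
{
    void *buf = malloc(size);

    if (buf)
        live_buffers++;
    return buf;
}

static void msg_free(void *buf)
{
    if (buf) {
        assert(live_buffers > 0);
        free(buf);
        live_buffers--;
    }
}

/* Hypothetical header writer; a nonzero `fail` forces the error path. */
static void *msg_put_header(void *buf, int fail)
{
    return fail ? NULL : buf;
}

static int send_options(int fail_header)
{
    void *skb = msg_new(128);

    if (!skb)
        return -ENOMEM;

    if (!msg_put_header(skb, fail_header)) {
        msg_free(skb);        /* without this, the buffer is leaked */
        return -EMSGSIZE;
    }

    /* A real sender would fill and transmit here; the sketch just
     * releases the buffer. */
    msg_free(skb);
    return 0;
}
```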

Fixes: 9b00cf2d1024 ("team: implement multipart netlink messages for options transfers")
Signed-off-by: Pan Bian <bianpan2016@163.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Mark Brown [Tue, 25 Apr 2017 15:25:07 +0000 (16:25 +0100)]

Merge remote-tracking branches 'asoc/fix/intel', 'asoc/fix/topology' and 'asoc/fix/sti' into asoc-linus


David S. Miller [Tue, 25 Apr 2017 15:20:30 +0000 (11:20 -0400)]

Merge tag 'linux-can-fixes-for-4.11-20170425' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2017-04-25

this is a pull request of three patches for net/master.

There are two patches by Stephane Grosjean that add a new variant to the
PCAN-Chip USB driver. The other patch, by Maksim Salau, switches the
memory for USB transfers from stack to heap.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>


Bert Kenward [Tue, 25 Apr 2017 12:44:54 +0000 (13:44 +0100)]

sfc: tx ring can only have 2048 entries for all EF10 NICs

Fixes: dd248f1bc65b ("sfc: Add PCI ID for Solarflare 8000 series 10/40G NIC")
Reported-by: Patrick Talbert <ptalbert@redhat.com>
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


Takashi Iwai [Mon, 24 Apr 2017 12:09:55 +0000 (14:09 +0200)]

ASoC: intel: Fix PM and non-atomic crash in bytcr drivers

The FE setups of Intel SST bytcr_rt5640 and bytcr_rt5651 drivers carry
the ignore_suspend flag, and this prevents suspend/resume from working
properly while the stream is running, since the SST core code checks
for running streams and returns -EBUSY.  Drop these superfluous flags
to fix the behavior.

Also, the bytcr_rt5640 driver lacks the nonatomic flag in some FE
definitions, which leads to a kernel Oops at suspend/resume like:

  BUG: scheduling while atomic: systemd-sleep/3144/0x00000003
  Call Trace:
   dump_stack+0x5c/0x7a
   __schedule_bug+0x55/0x70
   __schedule+0x63c/0x8c0
   schedule+0x3d/0x90
   schedule_timeout+0x16b/0x320
   ? del_timer_sync+0x50/0x50
   ? sst_wait_timeout+0xa9/0x170 [snd_intel_sst_core]
   ? sst_wait_timeout+0xa9/0x170 [snd_intel_sst_core]
   ? remove_wait_queue+0x60/0x60
   ? sst_prepare_and_post_msg+0x275/0x960 [snd_intel_sst_core]
   ? sst_pause_stream+0x9b/0x110 [snd_intel_sst_core]
   ....

This patch addresses these appropriately, too.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: <stable@vger.kernel.org> # v4.1+


Herbert Xu [Thu, 20 Apr 2017 12:55:12 +0000 (20:55 +0800)]

macvlan: Fix device ref leak when purging bc_queue

When a parent macvlan device is destroyed we end up purging its
broadcast queue without dropping the device reference count on
the packet source device. This causes the source device to linger.

This patch drops that reference count.
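
The rule that a purge must undo what enqueueing did can be sketched with a
toy queue. This is not the macvlan code: struct device, struct packet and the
helpers below are hypothetical userspace stand-ins, with dev_hold() and
dev_put() reimplemented locally as plain counters.

```c
#include <assert.h>
#include <stdlib.h>

struct device {
    int refcnt;
};

struct packet {
    struct device *src;     /* holds one reference on src while queued */
    struct packet *next;
};

static void dev_hold(struct device *d)
{
    d->refcnt++;
}

static void dev_put(struct device *d)
{
    assert(d->refcnt > 0);
    d->refcnt--;
}

/* Queue a packet from src, taking a reference on the source device. */
static struct packet *enqueue(struct packet *head, struct device *src)
{
    struct packet *p = malloc(sizeof(*p));

    if (!p)
        return head;
    dev_hold(src);
    p->src = src;
    p->next = head;
    return p;
}

/* Drop every queued packet together with the reference it holds. */
static void purge_queue(struct packet *head)
{
    while (head) {
        struct packet *next = head->next;

        dev_put(head->src);   /* the step a plain purge would miss */
        free(head);
        head = next;
    }
}
```

If purge_queue() only freed the packets, the source device's count would
never return to its starting value and the device would linger.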

Fixes: 260916dfb48c ("macvlan: Fix potential use-after free for...")
Reported-by: Joe Ghalam <Joe.Ghalam@dell.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


Roman Spychała [Thu, 20 Apr 2017 10:04:10 +0000 (12:04 +0200)]

usb: plusb: Add support for PL-27A1

This patch adds support for the PL-27A1 by adding the appropriate
USB IDs. This chip is used in the goobay Active USB 3.0 Data Link
and Unitek Y-3501 cables.

Signed-off-by: Roman Spychała <roed@onet.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>


Maksim Salau [Sun, 23 Apr 2017 17:31:40 +0000 (20:31 +0300)]

net: can: usb: gs_usb: Fix buffer on stack

Allocate buffers on HEAP instead of STACK for local structures
that are to be sent using usb_control_msg().

Signed-off-by: Maksim Salau <maksim.salau@gmail.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v4.8
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>


Martin Schwidefsky [Mon, 24 Apr 2017 16:14:48 +0000 (18:14 +0200)]

s390/mm: simplify arch_get_unmapped_area[_topdown]

With TASK_SIZE now reflecting the maximum size of the address space for
a process, the code for arch_get_unmapped_area[_topdown] can be simplified.
Just let the logic pick a suitable address and deal with the page table
upgrade after the address has been selected.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>


Stephane Grosjean [Mon, 27 Mar 2017 12:36:11 +0000 (14:36 +0200)]

can: usb: Kconfig: Add PCAN-USB X6 device in help text

This patch adds a text line in the help section of the CAN_PEAK_USB
config item describing the support of the PCAN-USB X6 adapter, which has
been included in the kernel since 4.9.

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>


Stephane Grosjean [Mon, 27 Mar 2017 12:36:10 +0000 (14:36 +0200)]

can: usb: Add support of PCAN-Chip USB stamp module

This patch adds support for the PCAN-Chip USB, a stamp module for
customer hardware designs, which communicates via USB 2.0 with the
hardware. The integrated CAN controller supports the protocols CAN 2.0 A/B
as well as CAN FD. The physical CAN connection is determined by external
wiring. The Stamp module with its single-sided mounting and plated
half-holes is suitable for automatic assembly.

Note that the chip is equipped with the same logic as the PCAN-USB FD.

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>


Martin Schwidefsky [Thu, 20 Apr 2017 12:43:51 +0000 (14:43 +0200)]

s390/mm: make TASK_SIZE independent from the number of page table levels

The TASK_SIZE for a process should be the maximum possible size of the address
space, 2GB for a 31-bit process and 8PB for a 64-bit process. The number
of page table levels required for a given memory layout is a consequence
of the mapped memory areas and their location.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>


Jean Delvare [Tue, 25 Apr 2017 04:07:10 +0000 (22:07 -0600)]

virtio_blk: Fix English description of VIRTIO_BLK_SCSI

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: 97b50a654d5d ("virtio_blk: make SCSI passthrough support configurable")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>


Andy Lutomirski [Fri, 21 Apr 2017 23:19:24 +0000 (16:19 -0700)]

nvme: Add nvme_core.force_apst to ignore the NO_APST quirk

We're probably going to be stuck quirking APST off on an over-broad
range of devices for 4.11. Let's make it easy to override the quirk
for testing.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>

Linux kernel for KaRo TX COM modules
