Paul Moore [Wed, 19 Mar 2014 20:46:18 +0000 (16:46 -0400)]
selinux: correctly label /proc inodes in use before the policy is loaded
This patch is based on an earlier patch by Eric Paris, he describes
the problem below:
"If an inode is accessed before policy load it will get placed on a
list of inodes to be initialized after policy load. After policy
load we call inode_doinit() which calls inode_doinit_with_dentry()
on all inodes accessed before policy load. In the case of inodes
in procfs that means we'll end up at the bottom where it does:
/* Default to the fs superblock SID. */
isec->sid = sbsec->sid;
if ((sbsec->flags & SE_SBPROC) && !S_ISLNK(inode->i_mode)) {
if (opt_dentry) {
isec->sclass = inode_mode_to_security_class(...)
rc = selinux_proc_get_sid(opt_dentry,
isec->sclass,
&sid);
if (rc)
goto out_unlock;
isec->sid = sid;
}
}
Since opt_dentry is null, we'll never call selinux_proc_get_sid()
and will leave the inode labeled with the label on the superblock.
I believe a fix would be to mimic the behavior of xattrs. Look
for an alias of the inode. If it can't be found, just leave the
inode uninitialized (and pick it up later) if it can be found, we
should be able to call selinux_proc_get_sid() ..."
On a system exhibiting this problem, you will notice a lot of files in
/proc with the generic "proc_t" type (at least the ones that were
accessed early in the boot), for example:
Paul Moore [Wed, 19 Mar 2014 20:46:11 +0000 (16:46 -0400)]
selinux: put the mmap() DAC controls before the MAC controls
It turns out that doing the SELinux MAC checks for mmap() before the
DAC checks was causing users and the SELinux policy folks headaches
as users were seeing a lot of SELinux AVC denials for the
memprotect:mmap_zero permission that would have also been denied by
the normal DAC capability checks (CAP_SYS_RAWIO).
This patch corrects things so that when the above example is run by a
user without CAP_SYS_RAWIO the SELinux AVC is no longer generated as
the DAC capability check fails before the SELinux permission check.
Signed-off-by: Paul Moore <pmoore@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Paul Moore [Wed, 19 Mar 2014 20:46:04 +0000 (16:46 -0400)]
selinux: fix the output of ./scripts/get_maintainer.pl for SELinux
Correctly tag the SELinux mailing list as moderated for non-subscribers
and do some shuffling of the SELinux maintainers to try and make things
more clear when the scripts/get_maintainer.pl script is used.
# ./scripts/get_maintainer.pl -f security/selinux
Paul Moore <paul@paul-moore.com> (supporter:SELINUX SECURITY...)
Stephen Smalley <sds@tycho.nsa.gov> (supporter:SELINUX SECURITY...)
Eric Paris <eparis@parisplace.org> (supporter:SELINUX SECURITY...)
James Morris <james.l.morris@oracle.com> (supporter:SECURITY SUBSYSTEM)
selinux@tycho.nsa.gov (moderated list:SELINUX SECURITY...)
linux-security-module@vger.kernel.org (open list:SECURITY SUBSYSTEM)
linux-kernel@vger.kernel.org (open list)
Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Eric Paris <eparis@parisplace.org> Cc: James Morris <james.l.morris@oracle.com> Signed-off-by: Paul Moore <pmoore@redhat.com>
Dmitry Kasatkin [Fri, 28 Feb 2014 12:18:09 +0000 (14:18 +0200)]
evm: enable key retention service automatically
If keys are not enabled, EVM is not visible in the configuration menu.
It may be difficult to figure out what to do unless you really know.
Other subsystems as NFS, CIFS select keys automatically. This patch does
the same.
This patch also removes '(TRUSTED_KEYS=y || TRUSTED_KEYS=n)' dependency,
which is unnecessary. EVM does not depend on trusted keys, but on
encrypted keys. evm.h provides compile time dependency.
Dmitry Kasatkin [Wed, 13 Nov 2013 20:23:20 +0000 (22:23 +0200)]
ima: return d_name.name if d_path fails
This is a small refactoring so ima_d_path() returns dentry name
if path reconstruction fails. It simplifies callers actions
and removes code duplication.
Dmitry Kasatkin [Tue, 4 Mar 2014 16:04:20 +0000 (18:04 +0200)]
integrity: fix checkpatch errors
Between checkpatch changes (eg. sizeof) and inconsistencies between
Lindent and checkpatch, unfixed checkpatch errors make it difficult
to see new errors. This patch fixes them. Some lines with over 80 chars
remained unchanged to improve code readability.
The "extern" keyword is removed from internal evm.h to make it consistent
with internal ima.h.
Dmitry Kasatkin [Wed, 13 Nov 2013 21:42:39 +0000 (23:42 +0200)]
ima: fix erroneous removal of security.ima xattr
ima_inode_post_setattr() calls ima_must_appraise() to check if the
file needs to be appraised. If it does not then it removes security.ima
xattr. With original policy matching code it might happen that even
file needs to be appraised with FILE_CHECK hook, it might not be
for POST_SETATTR hook. 'security.ima' might be erronously removed.
This patch treats POST_SETATTR as special wildcard function and will
cause ima_must_appraise() to be true if any of the hooks rules matches.
security.ima will not be removed if any of the hooks would require
appraisal.
Roberto Sassu [Mon, 3 Feb 2014 12:56:05 +0000 (13:56 +0100)]
ima: reduce memory usage when a template containing the n field is used
Before this change, to correctly calculate the template digest for the
'ima' template, the event name field (id: 'n') length was set to the fixed
size of 256 bytes.
This patch reduces the length of the event name field to the string
length incremented of one (to make room for the termination character '\0')
and handles the specific case of the digest calculation for the 'ima'
template directly in ima_calc_field_array_hash_tfm().
Roberto Sassu [Mon, 3 Feb 2014 12:56:04 +0000 (13:56 +0100)]
ima: restore the original behavior for sending data with ima template
With the new template mechanism introduced in IMA since kernel 3.13,
the format of data sent through the binary_runtime_measurements interface
is slightly changed. Now, for a generic measurement, the format of
template data (after the template name) is:
In addition, fields containing a string now include the '\0' termination
character.
Instead, the format for the 'ima' template should be:
SHA1 digest | event name length | event name
It must be noted that while in the IMA 3.13 code 'event name length' is
'IMA_EVENT_NAME_LEN_MAX + 1' (256 bytes), so that the template digest
is calculated correctly, and 'event name' contains '\0', in the pre 3.13
code 'event name length' is exactly the string length and 'event name'
does not contain the termination character.
The patch restores the behavior of the IMA code pre 3.13 for the 'ima'
template so that legacy userspace tools obtain a consistent behavior
when receiving data from the binary_runtime_measurements interface
regardless of which kernel version is used.
Tetsuo Handa [Tue, 24 Dec 2013 11:49:01 +0000 (20:49 +0900)]
Integrity: Pass commname via get_task_comm()
When we pass task->comm to audit_log_untrustedstring(), we need to pass it
via get_task_comm() because task->comm can be changed to contain untrusted
string by other threads after audit_log_untrustedstring() confirmed that
task->comm does not contain untrusted string.
Mimi Zohar [Wed, 11 Dec 2013 19:44:04 +0000 (14:44 -0500)]
ima: use static const char array definitions
A const char pointer allocates memory for a pointer as well as for
a string, This patch replaces a number of the const char pointers
throughout IMA, with a static const char array.
Suggested-by: David Howells <dhowells@redhat.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: David Howells <dhowells@redhat.com>
Jeff Layton [Wed, 5 Mar 2014 17:47:37 +0000 (12:47 -0500)]
security: have cap_dentry_init_security return error
Currently, cap_dentry_init_security returns 0 without actually
initializing the security label. This confuses its only caller
(nfs4_label_init_security) which expects an error in that situation, and
causes it to end up sending out junk onto the wire instead of simply
suppressing the label in the attributes sent.
When CONFIG_SECURITY is disabled, security_dentry_init_security returns
-EOPNOTSUPP. Have cap_dentry_init_security do the same.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
Rashika Kheria [Thu, 27 Feb 2014 12:20:19 +0000 (17:50 +0530)]
kernel: Mark function as static in kernel/seccomp.c
Mark function as static in kernel/seccomp.c because it is not used
outside this file.
This eliminates the following warning in kernel/seccomp.c:
kernel/seccomp.c:296:6: warning: no previous prototype for ?seccomp_attach_user_filter? [-Wmissing-prototypes]
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Will Drewry <wad@chromium.org> Signed-off-by: James Morris <james.l.morris@oracle.com>
Joe Perches [Fri, 21 Feb 2014 22:19:30 +0000 (14:19 -0800)]
capability: Use current logging styles
Prefix logging output with "capability: " via pr_fmt.
Convert printks to pr_<level>.
Use pr_<level>_once instead of guard flags.
Coalesce formats.
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
Paul Moore [Thu, 6 Feb 2014 12:51:54 +0000 (07:51 -0500)]
selinux: fix the output of ./scripts/get_maintainer.pl for SELinux
Correctly tag the SELinux mailing list as moderated for non-subscribers
and do some shuffling of the SELinux maintainers to try and make things
more clear when the scripts/get_maintainer.pl script is used.
# ./scripts/get_maintainer.pl -f security/selinux
Paul Moore <paul@paul-moore.com> (supporter:SELINUX SECURITY...)
Stephen Smalley <sds@tycho.nsa.gov> (supporter:SELINUX SECURITY...)
Eric Paris <eparis@parisplace.org> (supporter:SELINUX SECURITY...)
James Morris <james.l.morris@oracle.com> (supporter:SECURITY SUBSYSTEM)
selinux@tycho.nsa.gov (moderated list:SELINUX SECURITY...)
linux-security-module@vger.kernel.org (open list:SECURITY SUBSYSTEM)
linux-kernel@vger.kernel.org (open list)
Cc: Eric Paris <eparis@parisplace.org> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
Tetsuo Handa [Mon, 6 Jan 2014 12:28:15 +0000 (21:28 +0900)]
SELinux: Fix memory leak upon loading policy
Hello.
I got below leak with linux-3.10.0-54.0.1.el7.x86_64 .
[ 681.903890] kmemleak: 5538 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
Below is a patch, but I don't know whether we need special handing for undoing
ebitmap_set_bit() call.
----------
>>From fe97527a90fe95e2239dfbaa7558f0ed559c0992 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Mon, 6 Jan 2014 16:30:21 +0900
Subject: [PATCH] SELinux: Fix memory leak upon loading policy
Commit 2463c26d "SELinux: put name based create rules in a hashtable" did not
check return value from hashtab_insert() in filename_trans_read(). It leaks
memory if hashtab_insert() returns error.
However, we should not return EEXIST error to the caller, or the systemd will
show below message and the boot sequence freezes.
systemd[1]: Failed to load SELinux policy. Freezing.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Eric Paris <eparis@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paul Moore <pmoore@redhat.com>
Jason Gunthorpe [Tue, 26 Nov 2013 20:30:46 +0000 (13:30 -0700)]
tpm: tpm_tis: Fix compile problems with CONFIG_PM_SLEEP/CONFIG_PNP
If CONFIG_PM_SLEEP=n, CONFIG_PNP=y we get this warning:
drivers/char/tpm/tpm_tis.c:706:13: warning: 'tpm_tis_reenable_interrupts' defined but not used [-Wunused-function]
This seems to have been introduced in a2fa3fb0d 'tpm: convert tpm_tis driver
to use dev_pm_ops from legacy pm_ops'
Also, unpon reviewing, the #ifdefs around tpm_tis_pm are not right, the first
reference is protected, the second is not. tpm_tis_pm is always defined so we
can drop the #ifdef.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Jason Gunthorpe [Tue, 26 Nov 2013 20:30:45 +0000 (13:30 -0700)]
tpm: Make tpm-dev allocate a per-file structure
This consolidates everything that is only used within tpm-dev.c
into tpm-dev.c and out of the publicly visible struct tpm_chip.
The per-file allocation lays the ground work for someday fixing the
strange forced O_EXCL behaviour of the current code.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Joel Schopp <jschopp@linux.vnet.ibm.com> Reviewed-by: Ashley Lai <adlai@linux.vnet.ibm.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Jason Gunthorpe [Tue, 26 Nov 2013 20:30:44 +0000 (13:30 -0700)]
tpm: Use the ops structure instead of a copy in tpm_vendor_specific
This builds on the last commit to use the ops structure in the core
and reduce the size of tpm_vendor_specific.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Joel Schopp <jschopp@linux.vnet.ibm.com> Reviewed-by: Ashley Lai <adlai@linux.vnet.ibm.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Jason Gunthorpe [Tue, 26 Nov 2013 20:30:43 +0000 (13:30 -0700)]
tpm: Create a tpm_class_ops structure and use it in the drivers
This replaces the static initialization of a tpm_vendor_specific
structure in the drivers with the standard Linux idiom of providing
a const structure of function pointers.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Joel Schopp <jschopp@linux.vnet.ibm.com> Reviewed-by: Ashley Lai <adlai@linux.vnet.ibm.com>
[phuewe: did apply manually due to commit 191ffc6bde3 tpm/tpm_i2c_atmel: fix coccinelle warnings] Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Jason Gunthorpe [Tue, 26 Nov 2013 20:30:42 +0000 (13:30 -0700)]
tpm: Pull all driver sysfs code into tpm-sysfs.c
The tpm core now sets up and controls all sysfs attributes, instead
of having each driver have a unique take on it.
All drivers now now have a uniform set of attributes, and no sysfs
related entry points are exported from the tpm core module.
This also uses the new method used to declare sysfs attributes
with DEVICE_ATTR_RO and 'struct attribute *'
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
[phuewe: had to apply the tpm_i2c_atmel part manually due to commit 191ffc6bde3fc tpm/tpm_i2c_atmel: fix coccinelle warnings]
Jason Gunthorpe [Tue, 26 Nov 2013 20:30:41 +0000 (13:30 -0700)]
tpm: Move sysfs functions from tpm-interface to tpm-sysfs
CLASS-sysfs.c is a common idiom for linux subsystems.
This is the first step to pulling all the sysfs support code from
the drivers into tpm-sysfs. This is a plain text copy from tpm-interface
with support changes to make it compile.
_tpm_pcr_read is made non-static and is called tpm_pcr_read_dev.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Jason Gunthorpe [Tue, 26 Nov 2013 20:30:40 +0000 (13:30 -0700)]
tpm: Pull everything related to /dev/tpmX into tpm-dev.c
CLASS-dev.c is a common idiom for Linux subsystems
This pulls all the code related to the miscdev into tpm-dev.c and makes it
static. The identical file_operation structs in the drivers are purged and the
tpm common code unconditionally creates the miscdev.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Joel Schopp <jschopp@linux.vnet.ibm.com> Reviewed-by: Ashley Lai <adlai@linux.vnet.ibm.com>
[phuewe:
tpm_dev_release is now used only in this file, thus the EXPORT_SYMBOL
can be dropped and the function be marked as static.
It has no other in-kernel users] Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
“wait” wait queue is defined but never used in the function, thus
it can be removed.
Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Acked-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Fengguang Wu [Sat, 2 Nov 2013 16:41:51 +0000 (17:41 +0100)]
tpm/tpm_i2c_atmel: fix coccinelle warnings
drivers/char/tpm/tpm_i2c_atmel.c:178:8-9: WARNING: return of 0/1 in function 'i2c_atmel_req_canceled' with return type bool
Return statements in functions returning bool should use
true/false instead of 1/0.
Generated by: coccinelle/misc/boolreturn.cocci
CC: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> CC: Peter Huewe <peterhuewe@gmx.de> Acked-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Peter Huewe [Tue, 29 Oct 2013 23:54:20 +0000 (00:54 +0100)]
tpm/tpm_i2c_stm_st33: Check return code of get_burstcount
The 'get_burstcount' function can in some circumstances 'return -EBUSY' which
in tpm_stm_i2c_send is stored in an 'u32 burstcnt'
thus converting the signed value into an unsigned value, resulting
in 'burstcnt' being huge.
Changing the type to u32 only does not solve the problem as the signed
value is converted to an unsigned in I2C_WRITE_DATA, resulting in the
same effect.
Thus
-> Change type of burstcnt to u32 (the return type of get_burstcount)
-> Add a check for the return value of 'get_burstcount' and propagate a
potential error.
This makes also sense in the 'I2C_READ_DATA' case, where the there is no
signed/unsigned conversion.
found by coverity Cc: stable@vger.kernel.org Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Peter Huewe [Wed, 30 Oct 2013 00:40:28 +0000 (01:40 +0100)]
tpm/tpm_ppi: Check return value of acpi_get_name
If
status = acpi_get_name(handle, ACPI_FULL_PATHNAME, &buffer);
fails for whatever reason and does not return AE_OK
if (strstr(buffer.pointer, context) != NULL) {
does dereference a null pointer.
-> Check the return value and return the status to the caller
Found by coverity Cc: stable@vger.kernel.org Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Peter Huewe [Wed, 30 Oct 2013 19:46:55 +0000 (20:46 +0100)]
tpm/tpm_ppi: Do not compare strcmp(a,b) == -1
Depending on the implementation strcmp might return the difference between
two strings not only -1,0,1 consequently
if (strcmp (a,b) == -1)
might lead to taking the wrong branch
-> compare with < 0 instead,
which in any case is more canonical.
Cc: stable@vger.kernel.org Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Linus Torvalds [Fri, 3 Jan 2014 21:48:25 +0000 (13:48 -0800)]
Merge tag 'for-v3.13-fixes' of git://git.infradead.org/battery-2.6
Pull battery fixes from Anton Vorontsov:
"Two fixes:
- fix build error caused by max17042_battery conversion to the regmap
API.
- fix kernel oops when booting with wakeup_source_activate enabled"
* tag 'for-v3.13-fixes' of git://git.infradead.org/battery-2.6:
max17042_battery: Fix build errors caused by missing REGMAP_I2C config
power_supply: Fix Oops from NULL pointer dereference from wakeup_source_activate
Linus Torvalds [Fri, 3 Jan 2014 21:44:41 +0000 (13:44 -0800)]
Merge tag 'pm+acpi-3.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and PM fixes and new device IDs from Rafael Wysocki:
"These commits, except for one, are regression fixes and the remaining
one fixes a divide error leading to a kernel panic. The majority of
the regressions fixed here were introduced during the 3.12 cycle, one
of them is from this cycle and one is older.
Specifics:
- VGA switcheroo was broken for some users as a result of the
ACPI-based PCI hotplug (ACPIPHP) changes in 3.12, because some
previously ignored hotplug events started to be handled. The fix
causes them to be ignored again.
- There are two more issues related to cpufreq's suspend/resume
handling changes from the 3.12 cycle addressed by Viresh Kumar's
fixes.
- intel_pstate triggers a divide error in a timer function if the
P-state information it needs is missing during initialization.
This leads to kernel panics on nested KVM clients and is fixed by
failing the initialization cleanly in those cases.
- PCI initalization code changes during the 3.9 cycle uncovered BIOS
issues related to ACPI wakeup notifications (some BIOSes send them
for devices that aren't supposed to support ACPI wakeup). Work
around them by installing an ACPI wakeup notify handler for all PCI
devices with ACPI support.
- The Calxeda cpuilde driver's probe function is tagged as __init,
which is incorrect and causes a section mismatch to occur during
build. Fix from Andre Przywara removes the __init tag from there.
- During the 3.12 cycle ACPIPHP started to print warnings about
missing _ADR for devices that legitimately don't have it. Fix from
Toshi Kani makes it only print the warnings where they make sense"
* tag 'pm+acpi-3.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPIPHP / radeon / nouveau: Fix VGA switcheroo problem related to hotplug
intel_pstate: Fail initialization if P-state information is missing
ARM/cpuidle: remove __init tag from Calxeda cpuidle probe function
PCI / ACPI: Install wakeup notify handlers for all PCI devs with ACPI
cpufreq: preserve user_policy across suspend/resume
cpufreq: Clean up after a failing light-weight initialization
ACPI / PCI / hotplug: Avoid warning when _ADR not present
Roberto Sassu [Fri, 8 Nov 2013 18:21:37 +0000 (19:21 +0100)]
ima: remove unneeded size_limit argument from ima_eventdigest_init_common()
This patch removes the 'size_limit' argument from
ima_eventdigest_init_common(). Since the 'd' field will never include
the hash algorithm as prefix and the 'd-ng' will always have it, we can
use the hash algorithm to differentiate the two cases in the modified
function (it is equal to HASH_ALGO__LAST in the first case, the opposite
in the second).
Mimi Zohar [Sun, 17 Nov 2013 05:31:47 +0000 (00:31 -0500)]
ima: update IMA-templates.txt documentation
Patch "ima: extend the measurement list to include the file signature"
defined a new field called 'sig' and a new template called 'ima-sig'.
This patch updates the Documentation/security/IMA-templates.txt.
Linus Torvalds [Thu, 2 Jan 2014 22:40:38 +0000 (14:40 -0800)]
Merge branch 'akpm' (incoming from Andrew)
Merge patches from Andrew Morton:
"Ten fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
epoll: do not take the nested ep->mtx on EPOLL_CTL_DEL
sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to sh_ksyms_32.c
drivers/dma/ioat/dma.c: check DMA mapping error in ioat_dma_self_test()
mm/memory-failure.c: transfer page count from head page to tail page after split thp
MAINTAINERS: set up proper record for Xilinx Zynq
mm: remove bogus warning in copy_huge_pmd()
memcg: fix memcg_size() calculation
mm: fix use-after-free in sys_remap_file_pages
mm: munlock: fix deadlock in __munlock_pagevec()
mm: munlock: fix a bug where THP tail page is encountered
Jason Baron [Thu, 2 Jan 2014 20:58:54 +0000 (12:58 -0800)]
epoll: do not take the nested ep->mtx on EPOLL_CTL_DEL
The EPOLL_CTL_DEL path of epoll contains a classic, ab-ba deadlock.
That is, epoll_ctl(a, EPOLL_CTL_DEL, b, x), will deadlock with
epoll_ctl(b, EPOLL_CTL_DEL, a, x). The deadlock was introduced with
commmit 67347fe4e632 ("epoll: do not take global 'epmutex' for simple
topologies").
The acquistion of the ep->mtx for the destination 'ep' was added such
that a concurrent EPOLL_CTL_ADD operation would see the correct state of
the ep (Specifically, the check for '!list_empty(&f.file->f_ep_links')
However, by simply not acquiring the lock, we do not serialize behind
the ep->mtx from the add path, and thus may perform a full path check
when if we had waited a little longer it may not have been necessary.
However, this is a transient state, and performing the full loop
checking in this case is not harmful.
The important point is that we wouldn't miss doing the full loop
checking when required, since EPOLL_CTL_ADD always locks any 'ep's that
its operating upon. The reason we don't need to do lock ordering in the
add path, is that we are already are holding the global 'epmutex'
whenever we do the double lock. Further, the original posting of this
patch, which was tested for the intended performance gains, did not
perform this additional locking.
Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: Nathan Zimmer <nzimmer@sgi.com> Cc: Eric Wong <normalperson@yhbt.net> Cc: Nelson Elhage <nelhage@nelhage.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to sh_ksyms_32.c
Min_low_pfn and max_low_pfn were used in pfn_valid macro if defined
CONFIG_FLATMEM. When the functions that use the pfn_valid is used in
driver module, max_low_pfn and min_low_pfn is to undefined, and fail to
build.
Naoya Horiguchi [Thu, 2 Jan 2014 20:58:51 +0000 (12:58 -0800)]
mm/memory-failure.c: transfer page count from head page to tail page after split thp
Memory failures on thp tail pages cause kernel panic like below:
mce: [Hardware Error]: Machine check events logged
MCE exception done on CPU 7
BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
IP: [<ffffffff811b7cd1>] dequeue_hwpoisoned_huge_page+0x131/0x1e0
PGD bae42067 PUD ba47d067 PMD 0
Oops: 0000 [#1] SMP
...
CPU: 7 PID: 128 Comm: kworker/7:2 Tainted: G M O 3.13.0-rc4-131217-1558-00003-g83b7df08e462 #25
...
Call Trace:
me_huge_page+0x3e/0x50
memory_failure+0x4bb/0xc20
mce_process_work+0x3e/0x70
process_one_work+0x171/0x420
worker_thread+0x11b/0x3a0
? manage_workers.isra.25+0x2b0/0x2b0
kthread+0xe4/0x100
? kthread_create_on_node+0x190/0x190
ret_from_fork+0x7c/0xb0
? kthread_create_on_node+0x190/0x190
...
RIP dequeue_hwpoisoned_huge_page+0x131/0x1e0
CR2: 0000000000000058
The reasoning of this problem is shown below:
- when we have a memory error on a thp tail page, the memory error
handler grabs a refcount of the head page to keep the thp under us.
- Before unmapping the error page from processes, we split the thp,
where page refcounts of both of head/tail pages don't change.
- Then we call try_to_unmap() over the error page (which was a tail
page before). We didn't pin the error page to handle the memory error,
this error page is freed and removed from LRU list.
- We never have the error page on LRU list, so the first page state
check returns "unknown page," then we move to the second check
with the saved page flag.
- The saved page flag have PG_tail set, so the second page state check
returns "hugepage."
- We call me_huge_page() for freed error page, then we hit the above panic.
The root cause is that we didn't move refcount from the head page to the
tail page after split thp. So this patch suggests to do this.
This panic was introduced by commit 524fca1e73 ("HWPOISON: fix
misjudgement of page_action() for errors on mlocked pages"). Note that we
did have the same refcount problem before this commit, but it was just
ignored because we had only first page state check which returned "unknown
page." The commit changed the refcount problem from "doesn't work" to
"kernel panic."
This warning was introduced by "mm: numa: Avoid unnecessary disruption
of NUMA hinting during migration" for paranoia reasons but the warning
is bogus. I was thinking of parallel races between NUMA hinting faults
and forks but this warning would also be triggered by a parallel reclaim
splitting a THP during a fork. Remote the bogus warning.
Signed-off-by: Mel Gorman <mgorman@suse.de> Reported-by: Sasha Levin <sasha.levin@oracle.com> Cc: Alex Thorlton <athorlton@sgi.com> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rik van Riel [Thu, 2 Jan 2014 20:58:46 +0000 (12:58 -0800)]
mm: fix use-after-free in sys_remap_file_pages
remap_file_pages calls mmap_region, which may merge the VMA with other
existing VMAs, and free "vma". This can lead to a use-after-free bug.
Avoid the bug by remembering vm_flags before calling mmap_region, and
not trying to dereference vma later.
Signed-off-by: Rik van Riel <riel@redhat.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: PaX Team <pageexec@freemail.hu> Cc: Kees Cook <keescook@chromium.org> Cc: Michel Lespinasse <walken@google.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vlastimil Babka [Thu, 2 Jan 2014 20:58:44 +0000 (12:58 -0800)]
mm: munlock: fix deadlock in __munlock_pagevec()
Commit 7225522bb429 ("mm: munlock: batch non-THP page isolation and
munlock+putback using pagevec" introduced __munlock_pagevec() to speed
up munlock by holding lru_lock over multiple isolated pages. Pages that
fail to be isolated are put_page()d immediately, also within the lock.
This can lead to deadlock when __munlock_pagevec() becomes the holder of
the last page pin and put_page() leads to __page_cache_release() which
also locks lru_lock. The deadlock has been observed by Sasha Levin
using trinity.
This patch avoids the deadlock by deferring put_page() operations until
lru_lock is released. Another pagevec (which is also used by later
phases of the function is reused to gather the pages for put_page()
operation.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Sasha Levin <sasha.levin@oracle.com> Cc: Michel Lespinasse <walken@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vlastimil Babka [Thu, 2 Jan 2014 20:58:43 +0000 (12:58 -0800)]
mm: munlock: fix a bug where THP tail page is encountered
Since commit ff6a6da60b89 ("mm: accelerate munlock() treatment of THP
pages") munlock skips tail pages of a munlocked THP page. However, when
the head page already has PageMlocked unset, it will not skip the tail
pages.
Commit 7225522bb429 ("mm: munlock: batch non-THP page isolation and
munlock+putback using pagevec") has added a PageTransHuge() check which
contains VM_BUG_ON(PageTail(page)). Sasha Levin found this triggered
using trinity, on the first tail page of a THP page without PageMlocked
flag.
This patch fixes the issue by skipping tail pages also in the case when
PageMlocked flag is unset. There is still a possibility of race with
THP page split between clearing PageMlocked and determining how many
pages to skip. The race might result in former tail pages not being
skipped, which is however no longer a bug, as during the skip the
PageTail flags are cleared.
However this race also affects correctness of NR_MLOCK accounting, which
is to be fixed in a separate patch.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Sasha Levin <sasha.levin@oracle.com> Cc: Michel Lespinasse <walken@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Bob Liu <bob.liu@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 2 Jan 2014 20:45:47 +0000 (12:45 -0800)]
Merge tag 'gfs2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes
Pull GFS2 fixes from Steven Whitehouse:
"Here is a set of small fixes for GFS2. There is a fix to drop
s_umount which is copied in from the core vfs, two patches relate to a
hard to hit "use after free" and memory leak. Two patches related to
using DIO and buffered I/O on the same file to ensure correct
operation in relation to glock state changes. The final patch adds an
RCU read lock to ensure correct locking on an error path"
* tag 'gfs2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
GFS2: Fix unsafe dereference in dump_holder()
GFS2: Wait for async DIO in glock state changes
GFS2: Fix incorrect invalidation for DIO/buffered I/O
GFS2: Fix slab memory leak in gfs2_bufdata
GFS2: Fix use-after-free race when calling gfs2_remove_from_ail
GFS2: don't hold s_umount over blkdev_put
Linus Torvalds [Thu, 2 Jan 2014 20:45:07 +0000 (12:45 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"Two small bug fixes and a follow-up to the CONFIG_NR_CPUS change.
A kernel compiled with CONFIG_NR_CPUS=256 will waste quite a bit of
memory for the per-cpu arrays. Under z/VM the maximum number of CPUs
is 64, the code now limits the possible cpu mask to 64 if running
under z/VM"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/pci: obtain function handle in hotplug notifier
s390/3270: fix allocation of tty3270_screen structure
s390/smp: improve setup of possible cpu mask
Jan Kiszka [Sat, 28 Dec 2013 15:31:52 +0000 (16:31 +0100)]
KVM: nVMX: Unconditionally uninit the MMU on nested vmexit
Three reasons for doing this: 1. arch.walk_mmu points to arch.mmu anyway
in case nested EPT wasn't in use. 2. this aligns VMX with SVM. But 3. is
most important: nested_cpu_has_ept(vmcs12) queries the VMCS page, and if
one guest VCPU manipulates the page of another VCPU in L2, we may be
fooled to skip over the nested_ept_uninit_mmu_context, leaving mmu in
nested state. That can crash the host later on if nested_ept_get_cr3 is
invoked while L1 already left vmxon and nested.current_vmcs12 became
NULL therefore.
Cc: stable@kernel.org Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Linus Torvalds [Wed, 1 Jan 2014 19:36:16 +0000 (11:36 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull radeon drm fixes from Dave Airlie:
"Just piping a bunch of fixes from pre-xmas from Alex for radeon, all
either fix bad hw setup issues or regressions"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: Bump version for CIK DCE tiling fix
drm/radeon: set correct number of banks for CIK chips in DCE
drm/radeon: set correct pipe config for Hawaii in DCE
drm/radeon: expose render backend mask to the userspace
drm/radeon: fix render backend setup for SI and CIK
drm/radeon: 0x9649 is SUMO2 not SUMO
drm/radeon: fix UVD 256MB check
Dave Airlie [Wed, 1 Jan 2014 10:32:19 +0000 (20:32 +1000)]
Merge branch 'drm-fixes-3.13' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Radeon fixes, Christmas eve edition. Fix incorrect family for 0x9649
which lead to bogus rendering, tiling and RB fixes for SI and CIK,
and a UVD fix.
* 'drm-fixes-3.13' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: Bump version for CIK DCE tiling fix
drm/radeon: set correct number of banks for CIK chips in DCE
drm/radeon: set correct pipe config for Hawaii in DCE
drm/radeon: expose render backend mask to the userspace
drm/radeon: fix render backend setup for SI and CIK
drm/radeon: 0x9649 is SUMO2 not SUMO
drm/radeon: fix UVD 256MB check
Krzysztof Hałasa [Tue, 31 Dec 2013 10:51:16 +0000 (11:51 +0100)]
crypto: ixp4xx - Fix kernel compile error
drivers/crypto/ixp4xx_crypto.c: In function 'ixp_module_init':
drivers/crypto/ixp4xx_crypto.c:1419:2: error: 'dev' undeclared (first use in this function)
Now builds. Not tested on real hw.
Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Casey Schaufler [Mon, 30 Dec 2013 17:38:00 +0000 (09:38 -0800)]
Smack: Rationalize mount restrictions
The mount restrictions imposed by Smack rely heavily on the
use of the filesystem "floor", which is the label that all
processes writing to the filesystem must have access to. It
turns out that while the "floor" notion is sound, it has yet
to be fully implemented and has never been used.
The sb_mount and sb_umount hooks only make sense if the
filesystem floor is used actively, and it isn't. They can
be reintroduced if a rational restriction comes up. Until
then, they get removed.
The sb_kern_mount hook is required for the option processing.
It is too permissive in the case of unprivileged mounts,
effectively bypassing the CAP_MAC_ADMIN restrictions if
any of the smack options are specified. Unprivileged mounts
are no longer allowed to set Smack filesystem options.
Additionally, the root and default values are set to the
label of the caller, in keeping with the policy that objects
get the label of their creator.
Targeted for git://git.gitorious.org/smack-next/kernel.git
* pm-cpufreq:
intel_pstate: Fail initialization if P-state information is missing
cpufreq: preserve user_policy across suspend/resume
cpufreq: Clean up after a failing light-weight initialization
* pm-cpuidle:
ARM/cpuidle: remove __init tag from Calxeda cpuidle probe function
Merge branches 'acpi-pci-pm' and 'acpi-pci-hotplug'
* acpi-pci-pm:
PCI / ACPI: Install wakeup notify handlers for all PCI devs with ACPI
* acpi-pci-hotplug:
ACPIPHP / radeon / nouveau: Fix VGA switcheroo problem related to hotplug
ACPI / PCI / hotplug: Avoid warning when _ADR not present
Linus Torvalds [Tue, 31 Dec 2013 20:19:30 +0000 (12:19 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
"A fix for a panic in gpio-keys driver when set up with absolute
events, a fixup to the new zforce driver and a new keycode definition"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: allocate absinfo data when setting ABS capability
Input: define KEY_WWAN for Wireless WAN
Input: zforce - fix possible driver hang during suspend
Linus Torvalds [Tue, 31 Dec 2013 20:17:14 +0000 (12:17 -0800)]
Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"A few small cifs fixes including two for stable, and fixing a
regression introduced by the VFS change to file create"
* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
cifs: set FILE_CREATED
cifs: We do not drop reference to tlink in CIFSCheckMFSymlink()
Add missing end of line termination to some cifs messages
Dmitry Torokhov [Fri, 27 Dec 2013 01:44:29 +0000 (17:44 -0800)]
Input: allocate absinfo data when setting ABS capability
We need to make sure we allocate absinfo data when we are setting one of
EV_ABS/ABS_XXX capabilities, otherwise we may bomb when we try to emit this
event.
Rested-by: Paul Cercueil <pcercuei@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
ACPIPHP / radeon / nouveau: Fix VGA switcheroo problem related to hotplug
The changes in the ACPI-based PCI hotplug (ACPIPHP) subsystem made
during the 3.12 development cycle uncovered a problem with VGA
switcheroo that on some systems, when the device-specific method
(ATPX in the radeon case, _DSM in the nouveau case) is used to turn
off the discrete graphics, the BIOS generates ACPI hotplug events for
that device and those events cause ACPIPHP to attempt to remove the
device from the system (they are events for a device that was present
previously and is not present any more, so that's what should be done
according to the spec). Then, the system stops functioning correctly.
Since the hotplug events in question were simply silently ignored
previously, the least intrusive way to address that problem is to
make ACPIPHP ignore them again. For this purpose, introduce a new
ACPI device flag, no_hotplug, and modify ACPIPHP to ignore hotplug
events for PCI devices whose ACPI companions have that flag set.
Next, make the radeon and nouveau switcheroo detection code set the
no_hotplug flag for the discrete graphics' ACPI companion.
Fixes: bbd34fcdd1b2 (ACPI / hotplug / PCI: Register all devices under the given bridge)
References: https://bugzilla.kernel.org/show_bug.cgi?id=61891
References: https://bugzilla.kernel.org/show_bug.cgi?id=64891 Reported-and-tested-by: Mike Lothian <mike@fireburn.co.uk> Reported-and-tested-by: <madcatx@atlas.cz> Reported-and-tested-by: Joaquín Aramendía <samsagax@gmail.com> Cc: Alex Deucher <alexdeucher@gmail.com> Cc: Dave Airlie <airlied@linux.ie> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: 3.12+ <stable@vger.kernel.org> # 3.12+
intel_pstate: Fail initialization if P-state information is missing
If pstate.current_pstate is 0 after the initial
intel_pstate_get_cpu_pstates(), this means that we were unable to
obtain any useful P-state information and there is no reason to
continue, so free memory and return an error in that case.
This fixes the following divide error occuring in a nested KVM
guest:
Jan Kiszka [Sun, 29 Dec 2013 01:29:30 +0000 (02:29 +0100)]
KVM: x86: Fix APIC map calculation after re-enabling
Update arch.apic_base before triggering recalculate_apic_map. Otherwise
the recalculation will work against the previous state of the APIC and
will fail to build the correct map when an APIC is hardware-enabled
again.
Linus Torvalds [Mon, 30 Dec 2013 18:22:57 +0000 (10:22 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
"A bit more endian problems found during testing of 3.13 and a few
other simple fixes and regressions fixes"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Fix alignment of secondary cpu spin vars
powerpc: Align p_end
powernv/eeh: Add buffer for P7IOC hub error data
powernv/eeh: Fix possible buffer overrun in ioda_eeh_phb_diag()
powerpc: Make 64-bit non-VMX __copy_tofrom_user bi-endian
powerpc: Make unaligned accesses endian-safe for powerpc
powerpc: Fix bad stack check in exception entry
powerpc/512x: dts: disable MPC5125 usb module
powerpc/512x: dts: remove misplaced IRQ spec from 'soc' node (5125)
Pull networking fixes from David Miller:
"Some holiday bug fixes for 3.13... There is still one bug I'd like to
get fixed before 3.13-final.
The vlan code erroneously assignes the header ops of the underlying
real device to the VLAN device above it when the real device can
hardware offload VLAN handling. That's completely bogus because
header ops are tied to the device type, so they only expect to see a
'dev' argument compatible with their ops.
The fix is the have the VLAN code use a special set of header ops that
does the pass-thru correctly, by calling the underlying real device's
header ops but _also_ passing in the real device instead of the VLAN
device.
That fix is currently waiting some testing.
Anyways, of note here:
1) Fix bitmap edge case in radiotap, from Johannes Berg.
2) Fix oops on driver unload in rtlwifi, from Larry Finger.
3) Bonding doesn't do locking correctly during speed/duplex/link
changes, from Ding Tianhong.
4) Fix header parsing in GRE code, this bug has been around for a few
releases. From Timo Teräs.
5) SIT tunnel driver MTU check needs to take GSO into account, from
Eric Dumazet.
6) Minor info leak in inet_diag, from Daniel Borkmann.
7) Info leak in YAM hamradio driver, from Salva Peiró.
8) Fix route expiration state handling in ipv6 routing code, from Li
RongQing.
9) DCCP probe module does not check request_module()'s return value,
from Wang Weidong.
10) cpsw driver passes NULL device names to request_irq(), from
Mugunthan V N.
11) Prevent a NULL splat in RDS binding code, from Sasha Levin.
12) Fix 4G overflow test in tg3 driver, from Nithin Sujir.
13) Cure use after free in arc_emac and fec driver's software
timestamp handling, from Eric Dumazet.
14) SIT driver can fail to release the route when
iptunnel_handle_offloads() throws an error. From Li RongQing.
15) Several batman-adv fixes from Simon Wunderlich and Antonio
Quartulli.
16) Fix deadlock during TIPC socket release, from Ying Xue.
17) Fix regression in ROSE protocol recvmsg() msg_name handling, from
Florian Westphal.
18) stmmac PTP support releases wrong spinlock, from Vince Bridgers"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
stmmac: Fix incorrect spinlock release and PTP cap detection.
phy: IRQ cannot be shared
net: rose: restore old recvmsg behavior
xen-netback: fix guest-receive-side array sizes
fec: Do not assume that PHY reset is active low
tipc: fix deadlock during socket release
netfilter: nf_tables: fix wrong datatype in nft_validate_data_load()
batman-adv: fix vlan header access
batman-adv: clean nf state when removing protocol header
batman-adv: fix alignment for batadv_tvlv_tt_change
batman-adv: fix size of batadv_bla_claim_dst
batman-adv: fix size of batadv_icmp_header
batman-adv: fix header alignment by unrolling batadv_header
batman-adv: fix alignment for batadv_coded_packet
netfilter: nf_tables: fix oops when updating table with user chains
netfilter: nf_tables: fix dumping with large number of sets
ipv6: release dst properly in ipip6_tunnel_xmit
netxen: Correct off-by-one errors in bounds checks
net: Add some clarification to skb_tx_timestamp() comment.
arc_emac: fix potential use after free
...
This patch adjusts the refcount in the walk of the interrupt tree.
When a match is found, there is no need to increase the refcount
on 'out_irq->np' as 'newpar' is already holding a ref. The refcount
balance between 'ipar' and 'newpar' is maintained in the skiplevel:
goto label.
This patch also removes the usage of the device_node variable 'old'
which seems useless after the latest changes.
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> Signed-off-by: Rob Herring <robh@kernel.org>
Nikita Yushchenko reports:
While trying to make freescale p2020ds and mpc8572ds boards working
with mainline kernel, I faced that commit e38c0a1f (Handle
Both these boards have uli1575 chip.
Corresponding part in device tree is something like
With commit e38c0a1f reverted, devices under uli1575 are registered
correctly, e.g. for rtc
OF: ** translation for device /pcie@ffe09000/pcie@0/uli1575@0/isa@1e/rtc@70 **
OF: bus is isa (na=2, ns=1) on /pcie@ffe09000/pcie@0/uli1575@0/isa@1e
OF: translating address: 0000000100000070
OF: parent bus is default (na=3, ns=2) on /pcie@ffe09000/pcie@0/uli1575@0
OF: walking ranges...
OF: ISA map, cp=0, s=1000, da=70
OF: parent translation for: 010000000000000000000000
OF: with offset: 70
OF: one level translation: 000000000000000000000070
OF: parent bus is pci (na=3, ns=2) on /pcie@ffe09000/pcie@0
OF: walking ranges...
OF: default map, cp=a0000000, s=20000000, da=70
OF: default map, cp=0, s=10000, da=70
OF: parent translation for: 010000000000000000000000
OF: with offset: 70
OF: one level translation: 010000000000000000000070
OF: parent bus is pci (na=3, ns=2) on /pcie@ffe09000
OF: walking ranges...
OF: PCI map, cp=0, s=10000, da=70
OF: parent translation for: 010000000000000000000000
OF: with offset: 70
OF: one level translation: 010000000000000000000070
OF: parent bus is default (na=2, ns=2) on /
OF: walking ranges...
OF: PCI map, cp=0, s=10000, da=70
OF: parent translation for: 00000000ffc10000
OF: with offset: 70
OF: one level translation: 00000000ffc10070
OF: reached root node
With commit e38c0a1f in place, address translation fails:
OF: ** translation for device /pcie@ffe09000/pcie@0/uli1575@0/isa@1e/rtc@70 **
OF: bus is isa (na=2, ns=1) on /pcie@ffe09000/pcie@0/uli1575@0/isa@1e
OF: translating address: 0000000100000070
OF: parent bus is default (na=3, ns=2) on /pcie@ffe09000/pcie@0/uli1575@0
OF: walking ranges...
OF: ISA map, cp=0, s=1000, da=70
OF: parent translation for: 010000000000000000000000
OF: with offset: 70
OF: one level translation: 000000000000000000000070
OF: parent bus is pci (na=3, ns=2) on /pcie@ffe09000/pcie@0
OF: walking ranges...
OF: default map, cp=a0000000, s=20000000, da=70
OF: default map, cp=0, s=10000, da=70
OF: not found !
Thierry Reding confirmed this commit was not needed after all:
"We ended up merging a different address representation for Tegra PCIe
and I've confirmed that reverting this commit doesn't cause any obvious
regressions. I think all other drivers in drivers/pci/host ended up
copying what we did on Tegra, so I wouldn't expect any other breakage
either."
There doesn't appear to be a simple way to support both behaviours, so
reverting this as nothing should be depending on the new behaviour.
Cc: stable@vger.kernel.org # v3.7+ Signed-off-by: Rob Herring <robh@kernel.org>
Andre Przywara [Fri, 13 Dec 2013 20:49:19 +0000 (21:49 +0100)]
ARM/cpuidle: remove __init tag from Calxeda cpuidle probe function
Commit 60a66e370007e8535b7a561353b07b37deaf35ba changed the Calxeda
cpuidle driver to a platform driver, copying the __init tag from the
_init() to the newly used _probe() function. However, "probe should
not be __init." (Rob said ;-)
Remove the __init tag to fix a section mismatch in the Calxeda
cpuidle driver.
Signed-off-by: Andre Przywara <andre.przywara@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Sebastian Ott [Wed, 18 Dec 2013 15:46:02 +0000 (16:46 +0100)]
s390/pci: obtain function handle in hotplug notifier
When using the CLP interface to enable or disable a pci device a
valid function handle needs to be delivered. So far our assumption
was that we always have an up-to-date version of the function handle
(since it doesn't change when the device is in use). This assumption
is incorrect if the pci device is enabled or disabled outside of our
control. When we are notified about such a change we already receive
the new function handle. Just use it.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Vince Bridgers [Fri, 20 Dec 2013 17:19:34 +0000 (11:19 -0600)]
stmmac: Fix incorrect spinlock release and PTP cap detection.
This patch corrects a problem in stmmac_ptp.c, functions
stmmac_adjust_time and stmmac_adjust_freq where the incorrect spinlocks
were released. This patch also addresses a problem in stmmac_main,
function stmmac_init_ptp where the capability detection for
advanced timestamping was masked by message masking.
This patch was touch tested using linuxptp, and runs without the previously
observed instabilities. More extensive testing is ongoing.
Vince
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Fri, 20 Dec 2013 19:09:04 +0000 (22:09 +0300)]
phy: IRQ cannot be shared
With the way PHY IRQ handler is implemented (all real handling being pushed to
the workqueue and returning IRQ_HANDLED all the time PHY is active), we cannot
really claim that PHY IRQ can be shared when calling request_irq().
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Sun, 22 Dec 2013 23:32:31 +0000 (00:32 +0100)]
net: rose: restore old recvmsg behavior
recvmsg handler in net/rose/af_rose.c performs size-check ->msg_namelen.
After commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c
(net: rework recvmsg handler msg_name and msg_namelen logic), we now
always take the else branch due to namelen being initialized to 0.
Digging in netdev-vger-cvs git repo shows that msg_namelen was
initialized with a fixed-size since at least 1995, so the else branch
was never taken.
Compile tested only.
Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Durrant [Mon, 23 Dec 2013 09:27:17 +0000 (09:27 +0000)]
xen-netback: fix guest-receive-side array sizes
The sizes chosen for the metadata and grant_copy_op arrays on the guest
receive size are wrong;
- The meta array is needlessly twice the ring size, when we only ever
consume a single array element per RX ring slot
- The grant_copy_op array is way too small. It's sized based on a bogus
assumption: that at most two copy ops will be used per ring slot. This
may have been true at some point in the past but it's clear from looking
at start_new_rx_buffer() that a new ring slot is only consumed if a frag
would overflow the current slot (plus some other conditions) so the actual
limit is MAX_SKB_FRAGS grant_copy_ops per ring slot.
This patch fixes those two sizing issues and, because grant_copy_ops grows
so much, it pulls it out into a separate chunk of vmalloc()ed memory.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Fabio Estevam [Tue, 24 Dec 2013 14:16:01 +0000 (12:16 -0200)]
fec: Do not assume that PHY reset is active low
We should not assume that the PHY reset is always active low.
Retrieve this information from the device tree instead, so that the PHY reset
can work on both cases.
Reported-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The opposite order of holding port lock and node lock on above two
different paths may result in a deadlock. If socket lock instead of
port lock is used to protect port instance in tipc_withdraw(), the
reverse order of holding port lock and node lock will be eliminated,
as a result, the deadlock is killed as well.
Reported-by: Lars Everbrand <lars.everbrand@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Olof Johansson [Sat, 28 Dec 2013 21:01:47 +0000 (13:01 -0800)]
powerpc: Fix alignment of secondary cpu spin vars
Commit 5c0484e25ec0 ('powerpc: Endian safe trampoline') resulted in
losing proper alignment of the spinlock variables used when booting
secondary CPUs, causing some quite odd issues with failing to boot on
PA Semi-based systems.
This showed itself on ppc64_defconfig, but not on pasemi_defconfig,
so it had gone unnoticed when I initially tested the LE patch set.
Fix is to add explicit alignment instead of relying on good luck. :)
[ It appears that there is a different issue with PA Semi systems
however this fix is definitely correct so applying anyway -- BenH
]
Fixes: 5c0484e25ec0 ('powerpc: Endian safe trampoline') Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=67811 Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>