Nadav Amit [Thu, 25 Dec 2014 00:52:21 +0000 (02:52 +0200)]
KVM: x86: POP [ESP] is not emulated correctly
According to Intel SDM: "If the ESP register is used as a base register for
addressing a destination operand in memory, the POP instruction computes the
effective address of the operand after it increments the ESP register."
The current emulation does not behave so. The fix required to waste another
of the precious instruction flags and to check the flag in decode_modrm.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nadav Amit [Thu, 25 Dec 2014 00:52:19 +0000 (02:52 +0200)]
KVM: x86: JMP/CALL using call- or task-gate causes exception
The KVM emulator does not emulate JMP and CALL that target a call gate or a
task gate. This patch does not try to implement these scenario as they are
presumably rare; yet it returns X86EMUL_UNHANDLEABLE error in such cases
instead of generating an exception.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nadav Amit [Thu, 25 Dec 2014 00:52:18 +0000 (02:52 +0200)]
KVM: x86: fnstcw and fnstsw may cause spurious exception
Since the operand size of fnstcw and fnstsw is updated during the execution,
the emulation may cause spurious exceptions as it reads the memory beforehand.
Marking these instructions as Mov (since the previous value is ignored) and
DstMem16 to simplify the setting of operand size.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nadav Amit [Thu, 25 Dec 2014 00:52:17 +0000 (02:52 +0200)]
KVM: x86: pop sreg accesses only 2 bytes
Although pop sreg updates RSP according to the operand size, only 2 bytes are
read. The current behavior may result in incorrect #GP or #PF exceptions.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 2 Oct 2013 14:56:14 +0000 (16:56 +0200)]
KVM: x86: mmu: remove argument to kvm_init_shadow_mmu and kvm_init_shadow_ept_mmu
The initialization function in mmu.c can always use walk_mmu, which
is known to be vcpu->arch.mmu. Only init_kvm_nested_mmu is used to
initialize vcpu->arch.nested_mmu.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Marcelo Tosatti [Tue, 16 Dec 2014 14:08:15 +0000 (09:08 -0500)]
KVM: x86: add option to advance tscdeadline hrtimer expiration
For the hrtimer which emulates the tscdeadline timer in the guest,
add an option to advance expiration, and busy spin on VM-entry waiting
for the actual expiration time to elapse.
This allows achieving low latencies in cyclictest (or any scenario
which requires strict timing regarding timer expiration).
Reduces average cyclictest latency from 12us to 8us
on Core i5 desktop.
Note: this option requires tuning to find the appropriate value
for a particular hardware/guest combination. One method is to measure the
average delay between apic_timer_fn and VM-entry.
Another method is to start with 1000ns, and increase the value
in say 500ns increments until avg cyclictest numbers stop decreasing.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tiejun Chen [Mon, 22 Dec 2014 09:32:57 +0000 (10:32 +0100)]
kvm: x86: vmx: NULL out hwapic_isr_update() in case of !enable_apicv
In most cases calling hwapic_isr_update(), we always check if
kvm_apic_vid_enabled() == 1, but actually,
kvm_apic_vid_enabled()
-> kvm_x86_ops->vm_has_apicv()
-> vmx_vm_has_apicv() or '0' in svm case
-> return enable_apicv && irqchip_in_kernel(kvm)
So its a little cost to recall vmx_vm_has_apicv() inside
hwapic_isr_update(), here just NULL out hwapic_isr_update() in
case of !enable_apicv inside hardware_setup() then make all
related stuffs follow this. Note we don't check this under that
condition of irqchip_in_kernel() since we should make sure
definitely any caller don't work without in-kernel irqchip.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nicholas Krause [Fri, 19 Dec 2014 02:13:22 +0000 (21:13 -0500)]
KVM: x86: Remove FIXMEs in emulate.c for the function,task_switch_32
Remove FIXME comments about needing fault addresses to be returned. These
are propaagated from walk_addr_generic to gva_to_gpa and from there to
ops->read_std and ops->write_std.
Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM: nVMX: consult PFEC_MASK and PFEC_MATCH when generating #PF VM-exit
When generating #PF VM-exit, check equality:
(PFEC & PFEC_MASK) == PFEC_MATCH
If there is equality, the 14 bit of exception bitmap is used to take decision
about generating #PF VM-exit. If there is inequality, inverted 14 bit is used.
Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch improve checks required by Intel Software Developer Manual.
- SMM MSRs are not allowed.
- microcode MSRs are not allowed.
- check x2apic MSRs only when LAPIC is in x2apic mode.
- MSR switch areas must be aligned to 16 bytes.
- address of first and last byte in MSR switch areas should not set any bits
beyond the processor's physical-address width.
Also it adds warning messages on failures during MSR switch. These messages
are useful for people who debug their VMMs in nVMX.
Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Wincy Van [Thu, 11 Dec 2014 05:52:58 +0000 (08:52 +0300)]
KVM: nVMX: Add nested msr load/restore algorithm
Several hypervisors need MSR auto load/restore feature.
We read MSRs from VM-entry MSR load area which specified by L1,
and load them via kvm_set_msr in the nested entry.
When nested exit occurs, we get MSRs via kvm_get_msr, writing
them to L1`s MSR store area. After this, we read MSRs from VM-exit
MSR load area, and load them via kvm_set_msr.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linus Torvalds [Mon, 5 Jan 2015 22:49:02 +0000 (14:49 -0800)]
Merge tag 'powerpc-3.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc fixes from Michael Ellerman:
- Wire up sys_execveat(). Tested on 32 & 64 bit.
- Fix for kdump on LE systems with cpus hot unplugged.
- Revert Anton's fix for "kernel BUG at kernel/smpboot.c:134!", this
broke other platforms, we'll do a proper fix for 3.20.
* tag 'powerpc-3.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
Revert "powerpc: Secondary CPUs must set cpu_callin_map after setting active and online"
powerpc/kdump: Ignore failure in enabling big endian exception during crash
powerpc: Wire up sys_execveat() syscall
Linus Torvalds [Sun, 4 Jan 2015 19:46:43 +0000 (11:46 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML fixes from Richard Weinberger:
"Two fixes for UML regressions. Nothing exciting"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
x86, um: actually mark system call tables readonly
um: Skip futex_atomic_cmpxchg_inatomic() test
Pavel Machek [Sun, 4 Jan 2015 19:01:23 +0000 (20:01 +0100)]
Revert "ARM: 7830/1: delay: don't bother reporting bogomips in /proc/cpuinfo"
Commit 9fc2105aeaaf ("ARM: 7830/1: delay: don't bother reporting
bogomips in /proc/cpuinfo") breaks audio in python, and probably
elsewhere, with message
Daniel Borkmann [Sat, 3 Jan 2015 12:11:10 +0000 (13:11 +0100)]
x86, um: actually mark system call tables readonly
Commit a074335a370e ("x86, um: Mark system call tables readonly") was
supposed to mark the sys_call_table in UML as RO by adding the const,
but it doesn't have the desired effect as it's nevertheless being placed
into the data section since __cacheline_aligned enforces sys_call_table
being placed into .data..cacheline_aligned instead. We need to use
the ____cacheline_aligned version instead to fix this issue.
Before:
$ nm -v arch/x86/um/sys_call_table_64.o | grep -1 "sys_call_table"
U sys_writev 0000000000000000 D sys_call_table 0000000000000000 D syscall_table_size
After:
$ nm -v arch/x86/um/sys_call_table_64.o | grep -1 "sys_call_table"
U sys_writev 0000000000000000 R sys_call_table 0000000000000000 D syscall_table_size
Fixes: a074335a370e ("x86, um: Mark system call tables readonly") Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Richard Weinberger <richard@nod.at>
futex_atomic_cmpxchg_inatomic() does not work on UML because
it triggers a copy_from_user() in kernel context.
On UML copy_from_user() can only be used if the kernel was called
by a real user space process such that UML can use ptrace()
to fetch the value.
Reported-by: Miklos Szeredi <miklos@szeredi.hu> Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Richard Weinberger <richard@nod.at> Tested-by: Daniel Walter <d.walter@0x90.at>
Linus Torvalds [Fri, 2 Jan 2015 21:24:41 +0000 (13:24 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a set of three fixes: one to correct an abort path thinko
causing failures (and a panic) in USB on device misbehaviour, One to
fix an out of order issue in the fnic driver and one to match discard
expectations to qemu which otherwise cause Linux to behave badly as a
guest"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
SCSI: fix regression in scsi_send_eh_cmnd()
fnic: IOMMU Fault occurs when IO and abort IO is out of order
sd: tweak discard heuristics to work around QEMU SCSI issue
Linus Torvalds [Fri, 2 Jan 2015 20:57:20 +0000 (12:57 -0800)]
Merge tag 'sound-3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Nothing too exciting as a new year's start here: most of fixes are for
ASoC, a boot crash fix on OMAP for deferred probe, a few driver
specific fixes (Intel, dwc, rockchip, rt5677), in addition to typo
fixes in kerneldoc comments for PCM"
* tag 'sound-3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: pcm: Fix kerneldoc for params_*() functions
ASoC: rockchip: i2s: fix maxburst of dma data to 4
ASoC: rockchip: i2s: fix error defination of transmit data level
ASoC: Intel: correct the fixed free block allocation
ASoC: rt5677: fixed rt5677_dsp_vad_put rt5677_dsp_vad_get panic
ASoC: Intel: Fix BYTCR machine driver MODULE_ALIAS
ASoC: Intel: Fix BYTCR firmware name
ASoC: dwc: Iterate over all channels
ASoC: dwc: Ensure FIFOs are flushed to prevent channel swap
ASoC: Intel: Add I2C dependency to two new machines
ASoC: dapm: Remove snd_soc_of_parse_audio_routing() due to deferred probe
Linus Torvalds [Fri, 2 Jan 2015 20:07:50 +0000 (12:07 -0800)]
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull vhost cleanup and virtio bugfix
"There's a single change here, fixing a vhost bug where vhost
initialization fails due to used ring alignment check being too
strict"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost: relax used address alignment
virtio_ring: document alignment requirements
Linus Torvalds [Wed, 31 Dec 2014 22:52:18 +0000 (14:52 -0800)]
Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit
Pull audit fix from Paul Moore:
"One audit patch to resolve a panic/oops when recording filenames in
the audit log, see the mail archive link below.
The fix isn't as nice as I would like, as it involves an allocate/copy
of the filename, but it solves the problem and the overhead should
only affect users who have configured audit rules involving file
names.
We'll revisit this issue with future kernels in an attempt to make
this suck less, but in the meantime I think this fix should go into
the next release of v3.19-rcX.
Tobias Klauser [Wed, 31 Dec 2014 02:53:11 +0000 (10:53 +0800)]
nios2: Use preempt_schedule_irq
Follow aa0d53260596 ("ia64: Use preempt_schedule_irq") and use
preempt_schedule_irq instead of enabling/disabling interrupts and
messing around with PREEMPT_ACTIVE in the nios2 low-level preemption
code ourselves. Also get rid of the now needless re-check for
TIF_NEED_RESCHED, preempt_schedule_irq will already take care of
rescheduling.
This also fixes the following build error when building with
CONFIG_PREEMPT:
arch/nios2/kernel/built-in.o: In function `need_resched':
arch/nios2/kernel/entry.S:374: undefined reference to `PREEMPT_ACTIVE'
Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Ley Foon Tan <lftan@altera.com>
Linus Torvalds [Wed, 31 Dec 2014 01:13:13 +0000 (17:13 -0800)]
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"A very small set of fixes for 3.19, as everyone was out.
The clocksource patch was something I missed for the merge window
after the change that broke arm64 was merged through arm-soc. The
other two patches are a fix for an undetected merge problem in mvebu
and a defconfig change to make some exynos boards work with the normal
multi_v7_defconfig"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
Add USB_EHCI_EXYNOS to multi_v7_defconfig
ARM: mvebu: Fix pinctrl configuration for Armada 370 DB
clocksource: arch_timer: Only use the virtual counter (CNTVCT) on arm64
Linus Torvalds [Wed, 31 Dec 2014 01:04:56 +0000 (17:04 -0800)]
Merge tag 'fbdev-fixes-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux
Pull fbdev fixes from Tomi Valkeinen:
- Fix regression with Nokia N900 display
- Fix crash on fbdev using freed __initdata logos
- Fix fb_deferred_io_fsync() return value.
* tag 'fbdev-fixes-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
OMAPDSS: SDI: fix output port_num
video/fbdev: fix defio's fsync
video/logo: prevent use of logos after they have been freed
OMAPDSS: pll: NULL dereference in error handling
OMAPDSS: HDMI: remove double initializer entries
It's causing severe userspace breakage. Namely, all the utilities from
wireless-utils which are relying on CONFIG_WEXT (which means tools like
'iwconfig', 'iwlist', etc) are not working anymore. There is a 'iw'
utility in newer wireless-tools, which is supposed to be a replacement
for all the "deprecated" binaries, but it's far away from being
massively adopted.
Please see [1] for example of the userspace breakage this is causing.
In addition to that, Larry Finger reports [2] that this patch is also
causing ipw2200 driver being impossible to build.
To me this clearly shows that CONFIG_WEXT is far, far away from being
"deprecated enough" to be removed.
1) Fix double SKB free in bluetooth 6lowpan layer, from Jukka Rissanen.
2) Fix receive checksum handling in enic driver, from Govindarajulu
Varadarajan.
3) Fix NAPI poll list corruption in virtio_net and caif_virtio, from
Herbert Xu. Also, add code to detect drivers that have this mistake
in the future.
4) Fix doorbell endianness handling in mlx4 driver, from Amir Vadai.
5) Don't clobber IP6CB() before xfrm6_policy_check() is called in TCP
input path,f rom Nicolas Dichtel.
6) Fix MPLS action validation in openvswitch, from Pravin B Shelar.
7) Fix double SKB free in vxlan driver, also from Pravin.
8) When we scrub a packet, which happens when we are switching the
context of the packet (namespace, etc.), we should reset the
secmark. From Thomas Graf.
9) ->ndo_gso_check() needs to do more than return true/false, it also
has to allow the driver to clear netdev feature bits in order for
the caller to be able to proceed properly. From Jesse Gross.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits)
genetlink: A genl_bind() to an out-of-range multicast group should not WARN().
netlink/genetlink: pass network namespace to bind/unbind
ne2k-pci: Add pci_disable_device in error handling
bonding: change error message to debug message in __bond_release_one()
genetlink: pass multicast bind/unbind to families
netlink: call unbind when releasing socket
netlink: update listeners directly when removing socket
genetlink: pass only network namespace to genl_has_listeners()
netlink: rename netlink_unbind() to netlink_undo_bind()
net: Generalize ndo_gso_check to ndo_features_check
net: incorrect use of init_completion fixup
neigh: remove next ptr from struct neigh_table
net: xilinx: Remove unnecessary temac_property in the driver
net: phy: micrel: use generic config_init for KSZ8021/KSZ8031
net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding
openvswitch: fix odd_ptr_err.cocci warnings
Bluetooth: Fix accepting connections when not using mgmt
Bluetooth: Fix controller configuration with HCI_QUIRK_INVALID_BDADDR
brcmfmac: Do not crash if platform data is not populated
ipw2200: select CFG80211_WEXT
...
Alan Stern [Fri, 21 Nov 2014 15:44:49 +0000 (10:44 -0500)]
SCSI: fix regression in scsi_send_eh_cmnd()
Commit ac61d1955934 (scsi: set correct completion code in
scsi_send_eh_cmnd()) introduced a bug. It changed the stored return
value from a queuecommand call, but it didn't take into account that
the return value was used again later on. This patch fixes the bug by
changing the later usage.
There is a big comment in the middle of scsi_send_eh_cmnd() which
does a good job of explaining how the routine works. But it mentions
a "rtn = FAILURE" value that doesn't exist in the code. This patch
adjusts the code to match the comment (I assume the comment is right
and the code is wrong).
This fixes Bugzilla #88341.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Андрей Аладьев <aladjev.andrew@gmail.com> Tested-by: Андрей Аладьев <aladjev.andrew@gmail.com> Fixes: ac61d19559349e205dad7b5122b281419aa74a82 Acked-by: Hannes Reinecke <hare@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Takashi Iwai [Tue, 30 Dec 2014 15:17:13 +0000 (16:17 +0100)]
Merge tag 'asoc-fix-v3.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v3.19
A few fixes for v3.19, a few driver specifics and one core fix which
fixes a boot crash on OMAP if deferred probing kicks in due to
attempting to modify static data.
Currently we enable Exynos devices in the multi v7 defconfig, however, when
testing on my ODROID-U3, I noticed that USB was not working. Enabling this
option causes USB to work, which enables networking support as well since the
ODROID-U3 has networking on the USB bus.
[arnd] Support for odroid-u3 was added in 3.10, so it would be nice to
backport this fix at least that far.
Paul Moore [Tue, 30 Dec 2014 14:26:21 +0000 (09:26 -0500)]
audit: create private file name copies when auditing inodes
Unfortunately, while commit 4a928436 ("audit: correctly record file
names with different path name types") fixed a problem where we were
not recording filenames, it created a new problem by attempting to use
these file names after they had been freed. This patch resolves the
issue by creating a copy of the filename which the audit subsystem
frees after it is done with the string.
At some point it would be nice to resolve this issue with refcounts,
or something similar, instead of having to allocate/copy strings, but
that is almost surely beyond the scope of a -rcX patch so we'll defer
that for later. On the plus side, only audit users should be impacted
by the string copying.
Reported-by: Toralf Foerster <toralf.foerster@gmx.de> Signed-off-by: Paul Moore <pmoore@redhat.com>
fnic: IOMMU Fault occurs when IO and abort IO is out of order
When I/O is aborted by mid-layer, fnic FW will complete the I/O before
completing the abort task. In some cases abort request is completed before
the I/O, which could lead to inconsistent driver and firmware states.
In this case firmware reset would clear the inconsistent state.
Signed-off-by: Anil Chintalapati <achintal@cisco.com> Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com> Signed-off-by: Hiral Shah <hishah@cisco.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
sd: tweak discard heuristics to work around QEMU SCSI issue
7985090aa020 changed the discard heuristics to give preference to the
WRITE SAME commands that (unlike UNMAP) guarantee deterministic results.
Ming Lei discovered that QEMU SCSI's WRITE SAME implementation
internally relied on limits that were only communicated for the UNMAP
case. And therefore discard commands backed by WRITE SAME would fail.
Tweak the heuristics so we still pick UNMAP in the LBPRZ=0 case and only
prefer the WRITE SAME variants if the device has the LBPRZ flag set.
Reported-by: Ming Lei <ming.lei@canonical.com> Tested-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Tomi Valkeinen [Mon, 29 Dec 2014 07:57:11 +0000 (09:57 +0200)]
OMAPDSS: SDI: fix output port_num
After the commit ef691ff48bc8 (OMAPDSS: DT: Get source endpoint by
matching reg-id) we look for the SDI output using the port number.
However, the SDI driver doesn't set the port number, which causes the
SDI display to not initialize.
Fix this by setting the SDI port number to 1. We use a hardcoded value,
as SDI was used only on OMAP3 and it's always port number 1 there.
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi> Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Tomi Valkeinen [Fri, 19 Dec 2014 11:55:41 +0000 (13:55 +0200)]
video/fbdev: fix defio's fsync
fb_deferred_io_fsync() returns the value of schedule_delayed_work() as
an error code, but schedule_delayed_work() does not return an error. It
returns true/false depending on whether the work was already queued.
Fix this by ignoring the return value of schedule_delayed_work().
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: stable@vger.kernel.org
Linus Torvalds [Tue, 30 Dec 2014 05:09:57 +0000 (21:09 -0800)]
Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS fixes from Steve French:
"A set of three minor cifs fixes"
* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
cifs: make new inode cache when file type is different
Fix signed/unsigned pointer warning
Convert MessageID in smb2_hdr to LE
Linus Torvalds [Tue, 30 Dec 2014 04:43:10 +0000 (20:43 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF & isofs fixes from Jan Kara:
"A couple of UDF fixes of handling of corrupted media and one iso9660
fix of the same"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Reduce repeated dereferences
udf: Check component length before reading it
udf: Check path length when reading symlink
udf: Verify symlink size before loading it
udf: Verify i_size when loading inode
isofs: Fix unchecked printing of ER records
Linus Torvalds [Tue, 30 Dec 2014 02:50:02 +0000 (18:50 -0800)]
Merge tag 'pm+acpi-3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI material from Rafael J Wysocki:
"These are fixes (operating performance points library, cpufreq-dt
driver, cpufreq core, ACPI backlight, cpupower tool), cleanups
(cpuidle), new processor IDs for the RAPL (Running Average Power
Limit) power capping driver, and a modification of the generic power
domains framework allowing modular drivers to call one of its helper
functions.
Specifics:
- Fix for a potential NULL pointer dereference in the cpufreq core
due to an initialization race condition (Ethan Zhao).
- Fixes for abuse of the OPP (Operating Performance Points) API
related to RCU and other minor issues in the OPP library and the
cpufreq-dt driver (Dmitry Torokhov).
- cpuidle governors cleanup making them measure idle duration in a
better way without using the CPUIDLE_FLAG_TIME_INVALID flag which
allows that flag to be dropped from the ACPI cpuidle driver and
from the core too (Len Brown).
- New ACPI backlight blacklist entries for Samsung machines without a
working native backlight interface that need to use the ACPI
backlight instead (Aaron Lu).
- New CPU IDs of future Intel Xeon CPUs for the Intel RAPL power
capping driver (Jacob Pan).
- Generic power domains framework modification to export the
of_genpd_get_from_provider() function to modular drivers that will
allow future driver modifications to be based on the mainline (Amit
Daniel Kachhap).
- Two fixes for the cpupower tool (Michal Privoznik, Prarit
Bhargava)"
* tag 'pm+acpi-3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / video: Add some Samsung models to disable_native_backlight list
tools / cpupower: Fix no idle state information return value
tools / cpupower: Correctly detect if running as root
cpufreq: fix a NULL pointer dereference in __cpufreq_governor()
cpufreq-dt: defer probing if OPP table is not ready
PM / OPP: take RCU lock in dev_pm_opp_get_opp_count
PM / OPP: fix warning in of_free_opp_table()
PM / OPP: add some lockdep annotations
powercap / RAPL: add IDs for future Xeon CPUs
PM / Domains: Export of_genpd_get_from_provider function
cpuidle / ACPI: remove unused CPUIDLE_FLAG_TIME_INVALID
cpuidle: ladder: Better idle duration measurement without using CPUIDLE_FLAG_TIME_INVALID
cpuidle: menu: Better idle duration measurement without using CPUIDLE_FLAG_TIME_INVALID
Linus Torvalds [Mon, 29 Dec 2014 21:30:50 +0000 (13:30 -0800)]
Merge tag 'spi-v3.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few driver specific fixes here, the DMA burst size increase in the
spfi driver is a fix to make the hardware happier in some situations"
* tag 'spi-v3.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: img-spfi: Increase DMA burst size
spi: img-spfi: Enable controller before starting TX DMA
spi: sh-msiof: Add runtime PM lock in initializing
Linus Torvalds [Mon, 29 Dec 2014 21:24:38 +0000 (13:24 -0800)]
Merge tag 'regulator-v3.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull one regulator fix from Mark Brown:
"One fix here, a fix for the voltage mapping on one of the s2mps11
regulators which broke systems using it including apparently the
Gear 2 smartwatches"
* tag 'regulator-v3.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: s2mps11: Fix dw_mmc failure on Gear 2
Linus Torvalds [Mon, 29 Dec 2014 21:13:41 +0000 (13:13 -0800)]
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management updates from Zhang Rui:
"First of all, the most important change is the thermal cpu cooling
fixes. The major fix here is to have proper sequencing between
cpufreq layer and thermal cpu cooling registration. A take away of
this fix is an improvement in the thermal drivers code. Thermal
drivers that require cpu cooling do not need to check for cpufreq
layer. The requirement now is to propagate the error code, if any,
while registering cpu cooling device. Thanks to Viresh for
implementing the required CPUfreq changes.
Second, a new driver is introduced for int340x processor thermal
device. Given that int340x thermal is disabled by default, and this
processor thermal device is only available on limited platforms, plus
the driver does nothing but exposes some thermal limitation
information for user space to use, thus I think it is safe to include
it in this pull request after missing 3.19-rc2.
- several small fixes and cleanups for int340x thermal drivers"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: (43 commits)
Thermal/int340x/int3403: Free acpi notification handler
Thermal/int340x/processor_thermal: Fix memory leak
Thermal/int340x/int3403: Fix memory leak
thermal: int340x: Introduce processor reporting device
thermal: int340x_thermal: drop owner assignment from platform_drivers
thermal: drop owner assignment from platform_drivers
thermal: cpu_cooling: document node in struct cpufreq_cooling_device
thermal/powerclamp: add ids for future xeon cpus
Thermal/int340x: Handle properly the case when _trt or _art acpi entry is missing
thermal: cpu_cooling: return ERR_PTR() for !CPU_THERMAL or !THERMAL_OF
thermal: cpu_cooling: small memory leak on error
thermal: ti-soc-thermal: Do not print error message in the EPROBE_DEFER case
thermal: db8500: Do not print error message in the EPROBE_DEFER case
thermal: imx: Do not print error message in the EPROBE_DEFER case
thermal: Fix cdev registration with THERMAL_NO_LIMIT on 64bit
drivers: thermal: Remove ARCH_HAS_BANDGAP dependency for samsung
thermal:core:fix: Check return code of the ->get_max_state() callback
thermal: cpu_cooling: update copyright tags
thermal: cpu_cooling: Use cpufreq_dev->freq_table for finding level/freq
thermal: cpu_cooling: Store frequencies in descending order
...
Michal Hocko [Mon, 29 Dec 2014 19:30:35 +0000 (20:30 +0100)]
mm: get rid of radix tree gfp mask for pagecache_get_page
Commit 2457aec63745 ("mm: non-atomically mark page accessed during page
cache allocation where possible") has added a separate parameter for
specifying gfp mask for radix tree allocations.
Not only this is less than optimal from the API point of view because it
is error prone, it is also buggy currently because
grab_cache_page_write_begin is using GFP_KERNEL for radix tree and if
fgp_flags doesn't contain FGP_NOFS (mostly controlled by fs by
AOP_FLAG_NOFS flag) but the mapping_gfp_mask has __GFP_FS cleared then
the radix tree allocation wouldn't obey the restriction and might
recurse into filesystem and cause deadlocks. This is the case for most
filesystems unfortunately because only ext4 and gfs2 are using
AOP_FLAG_NOFS.
Let's simply remove radix_gfp_mask parameter because the allocation
context is same for both page cache and for the radix tree. Just make
sure that the radix tree gets only the sane subset of the mask (e.g. do
not pass __GFP_WRITE).
Long term it is more preferable to convert remaining users of
AOP_FLAG_NOFS to use mapping_gfp_mask instead and simplify this
interface even further.
Reported-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mmc: core: stop trying to switch width when only one bit is supported
mmc_select_bus_width() will try to switch to MMC_BUS_WIDTH_4 even if
MMC_CAP_4_BIT_DATA and MMC_CAP_8_BIT_DATA are not set in host->caps.
Return as soon as possible when those flags are not set
virtio 1.0 only requires used address to be 4 byte aligned,
vhost required 8 bytes (size of vring_used_elem).
Fix up vhost to match that.
Additionally, while vhost correctly requires 8 byte
alignment for log, it's unconnected to used ring:
it's a consequence that log has u64 entries.
Tweak code to make that clearer.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Host needs to know vring element alignment requirements:
simply doing alignof on structures doesn't work reliably: on some
platforms gcc has alignof(uint32_t) == 2.
Add macros for alignment as specified in virtio 1.0 cs01,
export them to userspace as well.
Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tomi Valkeinen [Thu, 18 Dec 2014 11:40:06 +0000 (13:40 +0200)]
video/logo: prevent use of logos after they have been freed
If the probe of an fb driver has been deferred due to missing
dependencies, and the probe is later ran when a module is loaded, the
fbdev framework will try to find a logo to use.
However, the logos are __initdata, and have already been freed. This
causes sometimes page faults, if the logo memory is not mapped,
sometimes other random crashes as the logo data is invalid, and
sometimes nothing, if the fbdev decides to reject the logo (e.g. the
random value depicting the logo's height is too big).
This patch adds a late_initcall function to mark the logos as freed. In
reality the logos are freed later, and fbdev probe may be ran between
this late_initcall and the freeing of the logos. In that case we will
miss drawing the logo, even if it would be possible.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: stable@vger.kernel.org
Although this did fix the bug it was aimed at, it also broke secondary
startup on platforms that use give/take_timebase(). Unfortunately we
didn't detect that while it was in next.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Hari Bathini [Thu, 18 Dec 2014 18:06:55 +0000 (23:36 +0530)]
powerpc/kdump: Ignore failure in enabling big endian exception during crash
In LE kernel, we currently have a hack for kexec that resets the exception
endian before starting a new kernel as the kernel that is loaded could be a
big endian or a little endian kernel. In kdump case, resetting exception
endian fails when one or more cpus is disabled. But we can ignore the failure
and still go ahead, as in most cases crashkernel will be of same endianess
as primary kernel and reseting endianess is not even needed in those cases.
This patch adds a new inline function to say if this is kdump path. This
function is used at places where such a check is needed.
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
[mpe: Rename to kdump_in_progress(), use bool, and edit comment] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Linus Torvalds [Sun, 28 Dec 2014 21:08:08 +0000 (13:08 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"The important fixes are for two bugs introduced by the merge window.
On top of this, add a couple of WARN_ONs and stop spamming dmesg on
pretty much every boot of a virtual machine"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: warn on more invariant breakage
kvm: fix sorting of memslots with base_gfn == 0
kvm: x86: drop severity of "generation wraparound" message
kvm: x86: vmx: reorder some msr writing
Paolo Bonzini [Sat, 27 Dec 2014 17:01:00 +0000 (18:01 +0100)]
kvm: fix sorting of memslots with base_gfn == 0
Before commit 0e60b0799fed (kvm: change memslot sorting rule from size
to GFN, 2014-12-01), the memslots' sorting key was npages, meaning
that a valid memslot couldn't have its sorting key equal to zero.
On the other hand, a valid memslot can have base_gfn == 0, and invalid
memslots are identified by base_gfn == npages == 0.
Because of this, commit 0e60b0799fed broke the invariant that invalid
memslots are at the end of the mslots array. When a memslot with
base_gfn == 0 was created, any invalid memslot before it were left
in place.
This can be fixed by changing the insertion to use a ">=" comparison
instead of "<=", but some care is needed to avoid breaking the case
of deleting a memslot; see the comment in update_memslots.
Thanks to Tiejun Chen for posting an initial patch for this bug.
Reported-by: Jamie Heilman <jamie@audible.transient.net> Reported-by: Andy Lutomirski <luto@amacapital.net> Tested-by: Jamie Heilman <jamie@audible.transient.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linus Torvalds [Sat, 27 Dec 2014 21:12:00 +0000 (13:12 -0800)]
Merge tag 'sound-3.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Just a couple of fixes for the new Intel Skylake HD-audio support"
* tag 'sound-3.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda_intel: apply the Seperate stream_tag for Skylake
ALSA: hda_controller: Separate stream_tag for input and output streams.
Tiejun Chen [Tue, 23 Dec 2014 08:21:11 +0000 (16:21 +0800)]
kvm: x86: vmx: reorder some msr writing
The commit 34a1cd60d17f, "x86: vmx: move some vmx setting from
vmx_init() to hardware_setup()", tried to refactor some codes
specific to vmx hardware setting into hardware_setup(), but some
msr writing should depend on our previous setting condition like
enable_apicv, enable_ept and so on.
Johannes Berg [Tue, 23 Dec 2014 20:00:06 +0000 (21:00 +0100)]
netlink/genetlink: pass network namespace to bind/unbind
Netlink families can exist in multiple namespaces, and for the most
part multicast subscriptions are per network namespace. Thus it only
makes sense to have bind/unbind notifications per network namespace.
To achieve this, pass the network namespace of a given client socket
to the bind/unbind functions.
Also do this in generic netlink, and there also make sure that any
bind for multicast groups that only exist in init_net is rejected.
This isn't really a problem if it is accepted since a client in a
different namespace will never receive any notifications from such
a group, but it can confuse the family if not rejected (it's also
possible to silently (without telling the family) accept it, but it
would also have to be ignored on unbind so families that take any
kind of action on bind/unbind won't do unnecessary work for invalid
clients like that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jia-Ju Bai [Tue, 23 Dec 2014 03:29:03 +0000 (11:29 +0800)]
ne2k-pci: Add pci_disable_device in error handling
For linux-3.18.0
The driver lacks pci_disable_device in error handling code of
ne2k_pci_init_one, so the device enabled by pci_enable_device is not
disabled when errors occur.
This patch fixes this problem.
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wengang Wang [Tue, 23 Dec 2014 01:24:36 +0000 (09:24 +0800)]
bonding: change error message to debug message in __bond_release_one()
In __bond_release_one(), when the interface is not a slave or not a slave of
"this" master, it log error message.
The message actually should be a debug message matching what bond_enslave()
does.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
I'm looking at using the multicast group functionality in a way that would
benefit from knowing when there are subscribers to avoid collecting the
required data when there aren't any. During this I noticed that the unbind
for multicast groups doesn't actually work - it's never called when sockets
are closed. Luckily, nobody actually uses the functionality.
While looking at the code trying to find why it's not called and where the
multicast listeners are actually removed, I found the potential cleanup in
patch 3. Patch 2 also has a cleanup for a generic netlink API in this area.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Mon, 22 Dec 2014 17:56:38 +0000 (18:56 +0100)]
netlink: call unbind when releasing socket
Currently, netlink_unbind() is only called when the socket
explicitly unbinds, which limits its usefulness (luckily
there are no users of it yet anyway.)
Call netlink_unbind() also when a socket is released, so it
becomes possible to track listeners with this callback and
without also implementing a netlink notifier (and checking
netlink_has_listeners() in there.)
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Mon, 22 Dec 2014 17:56:37 +0000 (18:56 +0100)]
netlink: update listeners directly when removing socket
The code is now confusing to read - first in one function down
(netlink_remove) any group subscriptions are implicitly removed
by calling __sk_del_bind_node(), but the subscriber database is
only updated far later by calling netlink_update_listeners().
Move the latter call to just after removal from the list so it
is easier to follow the code.
This also enables moving the locking inside the kernel-socket
conditional, which improves the normal socket destruction path.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Mon, 22 Dec 2014 17:56:36 +0000 (18:56 +0100)]
genetlink: pass only network namespace to genl_has_listeners()
There's no point to force the caller to know about the internal
genl_sock to use inside struct net, just have them pass the network
namespace. This doesn't really change code generation since it's
an inline, but makes the caller less magic - there's never any
reason to pass another socket.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Sat, 27 Dec 2014 03:43:19 +0000 (22:43 -0500)]
[regression] braino in "lustre: use is_root_inode()"
In one of the places (ll_md_blocking_ast()) we had open-coded
!is_root_inode(inode) and replaced it with is_root_inode(inode).
See the last chunk of f76c23:
- inode != inode->i_sb->s_root->d_inode)
+ is_root_inode(inode))
should've been
+ !is_root_inode(inode))
obviously...
David S. Miller [Fri, 26 Dec 2014 23:08:38 +0000 (18:08 -0500)]
Merge tag 'wireless-drivers-for-davem-2014-12-26' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
o Paul made a Kconfig dependency fix to ipw2200, it was not possible to
enable that driver because Wireless Extensions is now disabled by default.
o Mika fixed brcmfmac not to crash when platform data is not populated
o Emmanuel provided few fixes to iwlwifi, he says:
"I have here new device IDs and a fix for double free bug I
introduced. I also fix an issue with the RFKILL interrupt - the HW
needs us to ACK the interrupt again after we reset it. Liad fixes an
issue with the firmware debugging infrastructure. While working on
torture scenarios of firmware restarts, Eliad found an issue which
he fixed."
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesse Gross [Wed, 24 Dec 2014 06:37:26 +0000 (22:37 -0800)]
net: Generalize ndo_gso_check to ndo_features_check
GSO isn't the only offload feature with restrictions that
potentially can't be expressed with the current features mechanism.
Checksum is another although it's a general issue that could in
theory apply to anything. Even if it may be possible to
implement these restrictions in other ways, it can result in
duplicate code or inefficient per-packet behavior.
This generalizes ndo_gso_check so that drivers can remove any
features that don't make sense for a given packet, similar to
netif_skb_features(). It also converts existing driver
restrictions to the new format, completing the work that was
done to support tunnel protocols since the issues apply to
checksums as well.
By actually removing features from the set that are used to do
offloading, it solves another problem with the existing
interface. In these cases, GSO would run with the original set
of features and not do anything because it appears that
segmentation is not required.
CC: Tom Herbert <therbert@google.com> CC: Joe Stringer <joestringer@nicira.com> CC: Eric Dumazet <edumazet@google.com> CC: Hayes Wang <hayeswang@realtek.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Tom Herbert <therbert@google.com> Fixes: 04ffcb255f22 ("net: Add ndo_gso_check") Tested-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Tue, 23 Dec 2014 16:50:37 +0000 (17:50 +0100)]
neigh: remove next ptr from struct neigh_table
After commit d7480fd3b173 ("neigh: remove dynamic neigh table registration support"),
this field is not used anymore.
CC: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 26 Dec 2014 21:41:05 +0000 (13:41 -0800)]
Merge branch 'parisc-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc build fix from Helge Deller:
"This unbreaks the kernel compilation on parisc with gcc-4.9"
* 'parisc-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: fix out-of-register compiler error in ldcw inline assembler function
Johan Hovold [Tue, 23 Dec 2014 11:59:17 +0000 (12:59 +0100)]
net: phy: micrel: use generic config_init for KSZ8021/KSZ8031
Use generic config_init callback also for KSZ8021 and KSZ8031.
This has been avoided this far due to commit b838b4aced99 ("phy/micrel:
KSZ8031RNL RMII clock reconfiguration bug"), which claims that the PHY
becomes unresponsive if the broadcast-disable flag is set before
configuring the clock mode.
Turns out that the problem seemingly worked-around by the above
mentioned commit was really due to a hardware-configuration issue, where
the PHY was in fact strapped to address 3 rather than 0.
Tested-by: Bruno Thomsen <bth@kamstrup.dk> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
When VXLAN encapsulated traffic is received from a similarly
configured peer, the above warning is generated in the receive
processing of the encapsulated packet. Note that the warning is
associated with the container eth0.
The skbs from sky2 have ip_summed set to CHECKSUM_COMPLETE, and
because the packet is an encapsulated Ethernet frame, the checksum
generated by the hardware includes the inner protocol and Ethernet
headers.
The receive code is careful to update the skb->csum, except in
__dev_forward_skb, as called by dev_forward_skb. __dev_forward_skb
calls eth_type_trans, which in turn calls skb_pull_inline(skb, ETH_HLEN)
to skip over the Ethernet header, but does not update skb->csum when
doing so.
This patch resolves the problem by adding a call to
skb_postpull_rcsum to update the skb->csum after the call to
eth_type_trans.
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>