During DRAM initialization on certain ASpeed devices, an incorrect
bit (bit 10) was checked in the "SDRAM Bus Width Status" register
to determine DRAM width.
Query bit 6 instead in accordance with the Aspeed AST2050 datasheet v1.05.
In a couple places we are not converting to/from the Linux
block layer 512 bytes sectors.
1.
The request queue values and what we do are a mismatch of
things:
max_discard_sectors - This is in linux block layer 512 byte
sectors. We are just copying this to max_unmap_lba_count.
discard_granularity - This is in bytes. We are converting it
to Linux block layer 512 byte sectors.
discard_alignment - This is in bytes. We are just copying
this over.
The problem is that the core LIO code exports these values in
spc_emulate_evpd_b0 and we use them to test request arguments
in sbc_execute_unmap, but we never convert to the block size
we export to the initiator. If we are not using 512 byte sectors
then we are exporting the wrong values or are checks are off.
And, for the discard_alignment/bytes case we are just plain messed
up.
2.
blkdev_issue_discard's start and number of sector arguments
are supposed to be in linux block layer 512 byte sectors. We are
currently passing in the values we get from the initiator which
might be based on some other sector size.
There is a similar problem in iblock_execute_write_same where
the bio functions want values in 512 byte sectors but we are
passing in what we got from the initiator.
Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
[ kamal: backport to 4.4-stable: no unmap_zeroes_data ] Signed-off-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the PCI hotplug path of the Intel IOMMU driver, replace
the usage of the BUS_NOTIFY_DEL_DEVICE notifier, which is
executed before the driver is unbound from the device, with
BUS_NOTIFY_REMOVED_DEVICE, which runs after that.
This fixes a kernel BUG being triggered in the VT-d code
when the device driver tries to unmap DMA buffers and the
VT-d driver already destroyed all mappings.
The setup code for the performance counters in the AMD IOMMU driver
tests whether the counters can be written. It tests to setup a counter
for device 00:00.0, which fails on systems where this particular device
is not covered by the IOMMU.
Fix this by not relying on device 00:00.0 but only on the IOMMU being
present.
The AMD Family 15h Models 30h-3Fh (Kaveri) BIOS and Kernel Developer's
Guide omitted part of the BIOS IOMMU L2 register setup specification.
Without this setup the IOMMU L2 does not fully respect write permissions
when handling an ATS translation request.
The IOMMU L2 will set PTE dirty bit when handling an ATS translation with
write permission request, even when PTE RW bit is clear. This may occur by
direct translation (which would cause a PPR) or by prefetch request from
the ATC.
This is observed in practice when the IOMMU L2 modifies a PTE which maps a
pagecache page. The ext4 filesystem driver BUGs when asked to writeback
these (non-modified) pages.
Enable ATS write permission check in the Kaveri IOMMU L2 if BIOS has not.
Calling return copy_to_user(...) in an ioctl will not
do the right thing if there's a pagefault:
copy_to_user returns the number of bytes not copied
in this case.
Fix up kvm to do
return copy_to_user(...)) ? -EFAULT : 0;
everywhere.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 172b2386ed16 ("KVM: x86: fix missed hardware breakpoints",
2016-02-10) worked around a case where the debug registers are not loaded
correctly on preemption and on the first entry to KVM_RUN.
However, Xiao Guangrong pointed out that the root cause must be that
KVM_DEBUGREG_BP_ENABLED is not being set correctly. This can indeed
happen due to the lazy debug exit mechanism, which does not call
kvm_update_dr7. Fix it by replacing the existing loop (more or less
equivalent to kvm_update_dr0123) with calls to all the kvm_update_dr*
functions.
Calling return copy_to_user(...) in an ioctl will not
do the right thing if there's a pagefault:
copy_to_user returns the number of bytes not copied
in this case.
Fix up vfio to do
return copy_to_user(...)) ?
-EFAULT : 0;
everywhere.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This issue is caused by commit 02323db17e3a7 ("cifs: fix
cifs_uniqueid_to_ino_t not to ever return 0"), when BITS_PER_LONG
is 64 on s390x, the corresponding cifs_uniqueid_to_ino_t()
function will cast 64-bit fileid to 32-bit by using (ino_t)fileid,
because ino_t (typdefed __kernel_ino_t) is int type.
It's defined in arch/s390/include/uapi/asm/posix_types.h
#ifndef __s390x__
typedef unsigned long __kernel_ino_t;
...
#else /* __s390x__ */
typedef unsigned int __kernel_ino_t;
So the #ifdef condition is wrong for s390x, we can just still use
one cifs_uniqueid_to_ino_t() function with comparing sizeof(ino_t)
and sizeof(u64) to choose the correct execution accordingly.
Signed-off-by: Yadan Fan <ydfan@suse.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For interim responses we only need to parse a header and update
a number credits. Now it is done for all SMB2+ command except
SMB2_READ which is wrong. Fix this by adding such processing.
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Tested-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When opening a file, SMB2_open() attempts to parse the lease state from the
SMB2 CREATE Response. However, the parsing code was not careful to ensure
that the create contexts are not empty or invalid, which can lead to out-
of-bounds memory access. This can be seen easily by trying
to read a file from a OSX 10.11 SMB3 server. Here is sample crash output:
two attempts have been made at fixing a possible hang caused by
cursor_timer_handler. That function registers a timer to be triggered at
"jiffies + fbcon_ops.cur_blink_jiffies".
A new case had been encountered during initialisation of clcd-pl11x:
If we take an softirq anywhere between A and B (and we do),
cursor_timer_handler executes indefinitely.
Instead of patching all possible paths that lead to this case one at a
time, fix the issue at the source and initialise cur_blink_jiffies to
200ms when allocating fbcon_ops. This was its default value before
aforesaid commit. fbcon_cursor or fbcon_init will refine this value
downstream.
vmx.c writes the TSC_MULTIPLIER field in vmx_vcpu_load, but only when a
vcpu has migrated physical cpus. Record the last value written and
update in vmx_vcpu_load on any change, otherwise a cpu migration must
occur for TSC frequency scaling to take effect.
Returning directly whatever copy_to_user(...) or copy_from_user(...)
returns may not do the right thing if there's a pagefault:
copy_to_user/copy_from_user return the number of bytes not copied in
this case, but ioctls need to return -EFAULT instead.
Fix up kvm on mips to do
return copy_to_user(...)) ? -EFAULT : 0;
and
return copy_from_user(...)) ? -EFAULT : 0;
everywhere.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mike Frysinger reported that his ptrace testcase showed strange
behaviour on parisc: It was not possible to avoid a syscall and the
return value of a syscall couldn't be changed.
To modify a syscall number, we were missing to save the new syscall
number to gr20 which is then picked up later in assembly again.
The effect that the return value couldn't be changed is a side-effect of
another bug in the assembly code. When a process is ptraced, userspace
expects each syscall to report entrance and exit of a syscall. If a
syscall number was given which doesn't exist, we jumped to the normal
syscall exit code instead of informing userspace that the (non-existant)
syscall exits. This unexpected behaviour confuses userspace and thus the
bug was misinterpreted as if we can't change the return value.
This patch fixes both problems and was tested on 64bit kernel with
32bit userspace.
Signed-off-by: Helge Deller <deller@gmx.de> Cc: Mike Frysinger <vapier@gentoo.org> Tested-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit cbce7900598c ("PCI: designware: Make driver arch-agnostic") changed
the host bridge sysdata pointer from the ARM pci_sys_data to the DesignWare
pcie_port structure, and changed pcie-designware.c to reflect that. But it
did not change the corresponding code in pci-keystone-dw.c, so it caused
crashes on Keystone:
Unable to handle kernel NULL pointer dereference at virtual address 00000030
pgd = c0003000
[00000030] *pgd=80000800004003, *pmd=00000000
Internal error: Oops: 206 [#1] PREEMPT SMP ARM
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.2-00139-gb74f926 #2
Hardware name: Keystone
PC is at ks_dw_pcie_msi_irq_unmask+0x24/0x58
Change pci-keystone-dw.c to expect sysdata to be the struct pcie_port
pointer.
Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Parameter of trace_btrfs_work_queued() can be freed in its workqueue.
So no one use use that pointer after queue_work().
Fix the user-after-free bug by move the trace line before queue_work().
Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
I see no_space in v4.4-rc1 again in xfstests generic/102.
It happened randomly in some node only.
(one of 4 phy-node, and a kvm with non-virtio block driver)
By bisect, we can found the first-bad is:
commit bdced438acd8 ("block: setup bi_phys_segments after splitting")'
But above patch only triggered the bug by making bio operation
faster(or slower).
Main reason is in our space_allocating code, we need to commit
page writeback before wait it complish, this patch fixed above
bug.
BTW, there is another reason for generic/102 fail, caused by
disable default mixed-blockgroup, I'll fix it in xfstests.
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This happens because we can recursively acquire the semaphore
fs_info->delayed_iput_sem when attempting to allocate space to satisfy
a file write request as shown in the first trace above - when committing
a transaction we acquire (down_read) the semaphore before running the
delayed iputs, and when running a delayed iput() we can end up calling
an inode's eviction handler, which in turn commits another transaction
and attempts to acquire (down_read) again the semaphore to run more
delayed iput operations.
This results in a deadlock because if a task acquires multiple times a
semaphore it should invoke down_read_nested() with a different lockdep
class for each level of recursion.
Fix this by simplifying the implementation and use a mutex instead that
is acquired by the cleaner kthread before it runs the delayed iputs
instead of always acquiring a semaphore before delayed references are
run from anywhere.
Fixes: d7c151717a1e (btrfs: Fix NO_SPACE bug caused by delayed-iput) Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CONFIG_ARCH_SHMOBILE is not only enabled for Renesas ARM platforms
(which are DT based and multi-platform), but also on a select set of
Renesas SuperH platforms (SH7722/SH7723/SH7724/SH7343/SH7366). Hence
since commit 0ba58de231066e47 ("drivers: sh: Get rid of
CONFIG_ARCH_SHMOBILE_MULTI"), the legacy clock domain is no longer
installed on these SuperH platforms, and module clocks may not be
enabled when needed, leading to driver failures.
To fix this, add an additional check for CONFIG_OF.
Fixes: 0ba58de231066e47 ("drivers: sh: Get rid of CONFIG_ARCH_SHMOBILE_MULTI"). Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Games with ordering and barriers are way too brittle. Just
bump ->d_seq before and after updating ->d_inode and ->d_flags
type bits, so that verifying ->d_seq would guarantee they are
coherent.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The firmware can perform a scheduled scan with not matchsets passed,
but it can't send notification that results were found. Since the
userspace then cannot know when we got new results and the firmware
wouldn't trigger a wake in case we are sleeping, it's better not to
allow scans without matchsets.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=110831
8000 device family has a new debug engine that needs to be
configured differently than 7000's.
The debug engine's DMA works in chunks of memory and the
size of the buffer really means the start of the last
chunk. Since one chunk is 256-byte long, we should
configure the device to write to buffer_size - 256.
This fixes a situation were the device would write to
memory it is not allowed to access.
My commit below introduced a mutex in the transport to
prevent concurrent operations. To do so, it added a flag
(is_down) to make sure the transport is in the right state.
This uncoverred an bug that didn't cause any harm until
now: iwldvm calls stop_device and then starts the firmware
without calling start_hw in between. While this flow is
fine from the device configuration point of view (register,
etc...), it is now forbidden by the new is_down flag.
This led to this error to appear:
iwlwifi 0000:05:00.0: Can't start_fw since the HW hasn't been started
and the suspend would fail.
This fixes:
https://bugzilla.kernel.org/show_bug.cgi?id=109591
It looks like smack and yama weren't aware that the ptrace mode
can have flags ORed into it - PTRACE_MODE_NOAUDIT until now, but
only for /proc/$pid/stat, and with the PTRACE_MODE_*CREDS patch,
all modes have flags ORed into them.
Signed-off-by: Jann Horn <jann@thejh.net> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cma_validate_port wrongly assumed that Ethernet devices are RoCE
devices and thus their ndev should be matched in the GID table.
This broke the iWarp support. Fixing that matching the ndev only if
we work on a RoCE port.
Cc: <stable@vger.kernel.org> # 4.4.x- Fixes: abae1b71dd37 ('IB/cma: cma_validate_port should verify the port
and netdevice') Reported-by: Hariprasad Shenai <hariprasad@chelsio.com> Tested-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
So we need to serialize the vector assignment against a pending cleanup. The
solution is rather simple now. We not only check for the move_in_progress flag
in assign_irq_vector(), we also check whether there is still a cleanup pending
in the old_domain cpumask. If so, we return -EBUSY to the caller and let him
deal with it. Though we have to be careful in the cpu unplug case. If the
cleanout has not yet completed then the following setaffinity() call would
return -EBUSY. Add code which prevents this.
Full context is here: http://lkml.kernel.org/r/5653B688.4050809@stratus.com
Reported-and-tested-by: Joe Lawrence <joe.lawrence@stratus.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <bp@alien8.de> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Jeremiah Mahler <jmmahler@gmail.com> Cc: andy.shevchenko@gmail.com Cc: Guenter Roeck <linux@roeck-us.net> Link: http://lkml.kernel.org/r/20151231160107.207265407@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
First of all there is no point in looking up the irq descriptor again, but we
also need the descriptor for the final cleanup race fix in the next
patch. Make that change seperate. No functional difference.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <bp@alien8.de> Tested-by: Joe Lawrence <joe.lawrence@stratus.com> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Jeremiah Mahler <jmmahler@gmail.com> Cc: andy.shevchenko@gmail.com Cc: Guenter Roeck <linux@roeck-us.net> Link: http://lkml.kernel.org/r/20151231160107.125211743@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no need to allocate a new cpumask for sending the cleanup vector. The
old_domain mask is now protected by the vector_lock, so we can safely remove
the offline cpus from it and send the IPI with the resulting mask.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <bp@alien8.de> Tested-by: Joe Lawrence <joe.lawrence@stratus.com> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Jeremiah Mahler <jmmahler@gmail.com> Cc: andy.shevchenko@gmail.com Cc: Guenter Roeck <linux@roeck-us.net> Link: http://lkml.kernel.org/r/20151231160106.967993932@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
send_cleanup_vector() fiddles with the old_domain mask unprotected because it
relies on the protection by the move_in_progress flag. But this is fatal, as
the flag is reset after the IPI has been sent. So a cpu which receives the IPI
can still see the flag set and therefor ignores the cleanup request. If no
other cleanup request happens then the vector stays stale on that cpu and in
case of an irq removal the vector still persists. That can lead to use after
free when the next cleanup IPI happens.
Protect the code with vector_lock and clear move_in_progress before sending
the IPI.
This does not plug the race which Joe reported because:
In the case that the new vector mask is a subset of the existing mask there is
no point to do a AND operation of currentmask & newmask. The result is
newmask. So we can simply copy the new mask to the current mask and be done
with it. Preparatory patch for further consolidation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <bp@alien8.de> Tested-by: Joe Lawrence <joe.lawrence@stratus.com> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Jeremiah Mahler <jmmahler@gmail.com> Cc: andy.shevchenko@gmail.com Cc: Guenter Roeck <linux@roeck-us.net> Link: http://lkml.kernel.org/r/20151231160106.640253454@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
__assign_irq_vector() uses the vector_cpumask which is assigned by
apic->vector_allocation_domain() without doing basic sanity checks. That can
result in a situation where the final assignement of a newly found vector
fails in apic->cpu_mask_to_apicid_and(). So we have to do rollbacks for no
reason.
Check for this condition right away and if the result is empty try immediately
the next possible cpu in the requested mask. So in case of a failure the old
setting is unchanged and we can remove the rollback code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <bp@alien8.de> Tested-by: Joe Lawrence <joe.lawrence@stratus.com> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Jeremiah Mahler <jmmahler@gmail.com> Cc: andy.shevchenko@gmail.com Cc: Guenter Roeck <linux@roeck-us.net> Link: http://lkml.kernel.org/r/20151231160106.561877324@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Split out the code which advances the target cpu for the search so we can
reuse it for the next patch which adds an early validation check for the
vectormask which we get from the apic.
Add comments while at it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <bp@alien8.de> Tested-by: Joe Lawrence <joe.lawrence@stratus.com> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Jeremiah Mahler <jmmahler@gmail.com> Cc: andy.shevchenko@gmail.com Cc: Guenter Roeck <linux@roeck-us.net> Link: http://lkml.kernel.org/r/20151231160106.484562040@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Function __assign_irq_vector() makes use of apic_chip_data.old_domain as a
temporary buffer, which is in the way of using apic_chip_data.old_domain for
synchronizing the vector cleanup with the vector assignement code.
Use a proper temporary cpumask for this.
[ tglx: Renamed the mask to searched_cpumask for clarity ]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Tested-by: Borislav Petkov <bp@alien8.de> Tested-by: Joe Lawrence <joe.lawrence@stratus.com> Cc: Jeremiah Mahler <jmmahler@gmail.com> Cc: andy.shevchenko@gmail.com Cc: Guenter Roeck <linux@roeck-us.net> Link: http://lkml.kernel.org/r/1450880014-11741-1-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In fixup_irqs() we unconditionally dereference the irq chip of an irq
descriptor. The descriptor might still be valid, but already cleaned up,
i.e. the chip removed. Add a check for this condition.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jiang Liu <jiang.liu@linux.intel.com> Cc: Joe Lawrence <joe.lawrence@stratus.com> Cc: Jeremiah Mahler <jmmahler@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: andy.shevchenko@gmail.com Cc: Guenter Roeck <linux@roeck-us.net> Link: http://lkml.kernel.org/r/20151231160106.236423282@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
x86_vector_free_irqs()
{
free_apic_chip_data(irq_data->chip_data);
xxxxx //irq_data->chip_data has been freed, but the pointer
//hasn't been reset yet
irq_domain_reset_irq_data(irq_data);
}
and
smp_irq_move_cleanup_interrupt()
{
raw_spin_lock(&vector_lock);
data = apic_chip_data(irq_desc_get_irq_data(desc));
access data->xxxx // may access freed memory
raw_spin_unlock(&desc->lock);
}
which may cause smp_irq_move_cleanup_interrupt() to access freed memory.
Call irq_domain_reset_irq_data(), which clears the pointer with vector lock
held.
[ tglx: Free memory outside of lock held region. ]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Tested-by: Borislav Petkov <bp@alien8.de> Tested-by: Joe Lawrence <joe.lawrence@stratus.com> Cc: Jeremiah Mahler <jmmahler@gmail.com> Cc: andy.shevchenko@gmail.com Cc: Guenter Roeck <linux@roeck-us.net> Link: http://lkml.kernel.org/r/1450880014-11741-3-git-send-email-jiang.liu@linux.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the unlikely event that regno == nr_registers then we get an array
overrun on regoff because the invalid register check is currently
off-by-one. Fix this with a check that regno is >= nr_registers instead.
Detected with static analysis using CoverityScan.
Fixes: fcc7ffd67991 "x86, mpx: Decode MPX instruction to get bound violation information" Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/1456512931-3388-1-git-send-email-colin.king@canonical.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The delete opration can allocate additional space on the HPFS filesystem
due to btree split. The HPFS driver checks in advance if there is
available space, so that it won't corrupt the btree if we run out of space
during splitting.
If there is not enough available space, the HPFS driver attempted to
truncate the file, but this results in a deadlock since the commit 7dd29d8d865efdb00c0542a5d2c87af8c52ea6c7 ("HPFS: Introduce a global mutex
and lock it on every callback from VFS").
This patch removes the code that tries to truncate the file and -ENOSPC is
returned instead. If the user hits -ENOSPC on delete, he should try to
delete other files (that are stored in a leaf btree node), so that the
delete operation will make some space for deleting the file stored in
non-leaf btree node.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we call pci_scan_root_bus we would pass in 'sd' (sysdata)
pointer which was an 'pcifront_sd' structure. However in the
pci_device_add it expects that the 'sd' is 'struct sysdata' and
sets the dev->node to what is in sd->node (offset 4):
That is an hole - filled with garbage as we used kmalloc instead of
kzalloc (the second problem).
This patch fixes the issue by:
1) Use kzalloc to initialize to a well known state.
2) Put 'struct pci_sysdata' at the start of 'pcifront_sd'. That
way access to the 'node' will access the right offset.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 8135cf8b092723dbfcc611fe6fdcb3a36c9951c5 (xen/pciback: Save
xen_pci_op commands before processing it) broke enabling MSI-X because
it would never copy the resulting vectors into the response. The
number of vectors requested was being overwritten by the return value
(typically zero for success).
Save the number of vectors before processing the op, so the correct
number of vectors are copied afterwards.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 408fb0e5aa7fda0059db282ff58c3b2a4278baa0 (xen/pciback: Don't
allow MSI-X ops if PCI_COMMAND_MEMORY is not set) prevented enabling
MSI-X on passed-through virtual functions, because it checked the VF
for PCI_COMMAND_MEMORY but this is not a valid bit for VFs.
Instead, check the physical function for PCI_COMMAND_MEMORY.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When adding a new frontend to xen-scsiback don't decrement the number
of active frontends in case of no error. Doing so results in a failure
when trying to remove the xen-pvscsi nexus even if no domain is using
it.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently xen_dma_map_page concludes that DMA to anything other than
the head page of a compound page must be foreign, since the PFN of the
page is that of the head.
Fix the check to instead consider the whole of a compound page to be
local if the PFN of the head passes the 1:1 check.
We can never see a compound page which is a mixture of foreign and
local sub-pages.
The comment already correctly described the intention, but fixup the
spelling and some grammar.
This fixes the various SSH protocol errors which we have been seeing
on the cubietrucks in our automated test infrastructure.
This has been broken since commit 3567258d281b ("xen/arm: use
hypercall to flush caches in map_page"), which was in v3.19-rc1.
NB arch/arm64/.../xen/page-coherent.h also includes this file.
Commit e7b11dc7b77b ("ARM: OMAP2+: Fix onenand rate detection to avoid
filesystem corruption") partially fixed onenand configuration when GPMC
module is reset. Finish the job by also providing the correct values in
ONENAND_REG_SYS_CFG1 register.
Fixes: e7b11dc7b77b ("ARM: OMAP2+: Fix onenand rate detection to avoid
filesystem corruption") Signed-off-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
... into returning a positive to path_openat(), which would interpret that
as "symlink had been encountered" and proceed to corrupt memory, etc.
It can only happen due to a bug in some ->open() instance or in some LSM
hook, etc., so we report any such event *and* make sure it doesn't trick
us into further unpleasantness.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In __request_region, if a conflict with a BUSY and MUXED resource is
detected, then the caller goes to sleep and waits for the resource to be
released. A pointer on the conflicting resource is kept. At wake-up
this pointer is used as a parent to retry to request the region.
A first problem is that this pointer might well be invalid (if for
example the conflicting resource have already been freed). Another
problem is that the next call to __request_region() fails to detect a
remaining conflict. The previously conflicting resource is passed as a
parameter and __request_region() will look for a conflict among the
children of this resource and not at the resource itself. It is likely
to succeed anyway, even if there is still a conflict.
Instead, the parent of the conflicting resource should be passed to
__request_region().
As a fix, this patch doesn't update the parent resource pointer in the
case we have to wait for a muxed region right after.
Reported-and-tested-by: Vincent Pelletier <plr.vincent@gmail.com> Signed-off-by: Simon Guinot <simon.guinot@sequanux.org> Tested-by: Vincent Donnefort <vdonnefort@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The qword_get() function NUL-terminates its output buffer. If the input
string is in hex format \xXXXX... and the same length as the output
buffer, there is an off-by-one:
This patch ensures the NUL terminator doesn't fall outside the output
buffer.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The ftrace:function event is only displayed for parsing the function tracer
data. It is not used to enable function tracing, and does not include an
"enable" file in its event directory.
Originally, this event was kept separate from other events because it did
not have a ->reg parameter. But perf added a "reg" parameter for its use
which caused issues, because it made the event available to functions where
it was not compatible for.
Commit 9b63776fa3ca9 "tracing: Do not enable function event with enable"
added a TRACE_EVENT_FL_IGNORE_ENABLE flag that prevented the function event
from being enabled by normal trace events. But this commit missed keeping
the function event from being displayed by the "available_events" directory,
which is used to show what events can be enabled by set_event.
One documented way to enable all events is to:
cat available_events > set_event
But because the function event is displayed in the available_events, this
now causes an INVALID error:
cat: write error: Invalid argument
Reported-by: Chunyu Hu <chuhu@redhat.com> Fixes: 9b63776fa3ca9 "tracing: Do not enable function event with enable" Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During error recovery, the device could be removed as part of the
partial hotplug. The criterion used to come with partial hotplug
is: if the device driver provides error_detected(), slot_reset()
and resume() callbacks, it's immune from hotplug. Otherwise,
it's going to experience partial hotplug during EEH recovery. But
the criterion isn't correct enough: mlx4_core driver for Mellanox
adapters provides error_detected(), slot_reset() callbacks, but
resume() isn't there. Those Mellanox adapters won't be to involved
in the partial hotplug.
This fixes the criterion to a practical one: adpater with driver
that provides error_detected(), slot_reset() will be immune from
partial hotplug. resume() isn't mandatory.
Fixes: f2da4ccf ("powerpc/eeh: More relaxed hotplug criterion") Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit e8dd2d2d641c ("Silence compiler warning in arch/x86/kvm/emulate.c",
2015-09-06) broke boot of the Hurd. The bug is that the "default:"
case actually could modify "la", but after the patch this change is
not reflected in *linear.
The bug is visible whenever a non-zero segment base causes the linear
address to wrap around the 4GB mark.
Sometimes when setting a breakpoint a process doesn't stop on it.
This is because the debug registers are not loaded correctly on
VCPU load.
The following simple reproducer from Oleg Nesterov tries using debug
registers in two threads. To see the bug, run a 2-VCPU guest with
"taskset -c 0" and run "./bp 0 1" inside the guest.
When we allocate bitmaps in vgic_vcpu_init_maps, we divide the number of
bits we need by 8 to figure out how many bytes to allocate. However,
bitmap elements are always accessed as unsigned longs, and if we didn't
happen to allocate a size such that size % sizeof(unsigned long) == 0,
bitmap accesses may go past the end of the allocation.
When using KASAN (which does byte-granular access checks), this results
in a continuous stream of BUGs whenever these bitmaps are accessed:
=============================================================================
BUG kmalloc-128 (Tainted: G B ): kasan: bad access detected
-----------------------------------------------------------------------------
In async_pf we try to allocate with NOWAIT to get an element quickly
or fail. This code also handle failures gracefully. Lets silence
potential page allocation failures under load.
The existing msi-map code is fine for shifting the entire RID space
upwards, but attempting finer-grained remapping reveals a bug. It turns
out that we are mistakenly treating the msi-base part as an offset, not
as a new base to remap onto, so things get squiffy when rid-base is
nonzero. Fix this, and at the same time add a sanity check against
having msi-map-mask clash with a nonzero rid-base, as that's another
thing one can easily get wrong.
Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Stuart Yoder <stuart.yoder@nxp.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: David Daney <david.daney@cavium.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the case where d_add_unique() finds an appropriate alias to use it will
have already incremented the reference count. An additional dget() to swap
the open context's dentry is unnecessary and will leak a reference.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 275bb307865a3 ("NFSv4: Move dentry instantiation into the NFSv4-...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 35dc248383bbab0a7203fca4d722875bc81ef091 introduced a check for
current->mm to see if we have a user space context and only copies data
if we do. Now if an IO gets interrupted by a signal data isn't copied
into user space any more (as we don't have a user space context) but
user space isn't notified about it.
This patch modifies the behaviour to return -EINTR from bio_uncopy_user()
to notify userland that a signal has interrupted the syscall, otherwise
it could lead to a situation where the caller may get a buffer with
no data returned.
This can be reproduced by issuing SG_IO ioctl()s in one thread while
constantly sending signals to it.
Fixes: 35dc248 [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Starting from Intel Sunrisepoint (Skylake PCH) the iTCO watchdog
resources have been moved to reside under the i801 SMBus host
controller whereas previously they were under the LPC device.
This patch adds Intel lewisburg SMBus support for iTCO device.
It allows to load watchdog dynamically when the hardware is
present.
Fixes: cdc5a3110e7c ("i2c: i801: add Intel Lewisburg device IDs") Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Alexandra Yates <alexandra.yates@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If phy_pm_runtime_get_sync failed but we already
enable regulator, current code return directly without
doing regulator_disable. This patch fix this problem
and cleanup err handle of phy_power_on to be more readable.
Fixes: 3be88125d85d ("phy: core: Support regulator ...") Cc: Roger Quadros <rogerq@ti.com> Cc: Axel Lin <axel.lin@ingics.com> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If cgroup writeback is in use, an inode is associated with a cgroup
for writeback. If the inode's main dirtier changes to another cgroup,
the association gets updated asynchronously. Nothing was pinning the
superblock while such switches are in progress and superblock could go
away while async switching is pending or in progress leading to
crashes like the following.
There are three subsystem callbacks in css shutdown path -
css_offline(), css_released() and css_free(). Except for
css_released(), cgroup core didn't guarantee the order of invocation.
css_offline() or css_free() could be called on a parent css before its
children. This behavior is unexpected and led to bugs in cpu and
memory controller.
This patch updates offline path so that a parent css is never offlined
before its children. Each css keeps online_cnt which reaches zero iff
itself and all its children are offline and offline_css() is invoked
only after online_cnt reaches zero.
This fixes the memory controller bug and allows the fix for cpu
controller.
If "cpuset.memory_migrate" is set, when a process is moved from one
cpuset to another with a different memory node mask, pages in used by
the process are migrated to the new set of nodes. This was performed
synchronously in the ->attach() callback, which is synchronized
against process management. Recently, the synchronization was changed
from per-process rwsem to global percpu rwsem for simplicity and
optimization.
Combined with the synchronous mm migration, this led to deadlocks
because mm migration could schedule a work item which may in turn try
to create a new worker blocking on the process management lock held
from cgroup process migration path.
This heavy an operation shouldn't be performed synchronously from that
deep inside cgroup migration in the first place. This patch punts the
actual migration to an ordered workqueue and updates cgroup process
migration and cpuset config update paths to flush the workqueue after
all locks are released. This way, the operations still seem
synchronous to userland without entangling mm migration with process
management synchronization. CPU hotplug can also invoke mm migration
but there's no reason for it to wait for mm migrations and thus
doesn't synchronize against their completions.
A Root Port's AER structure (rpc) contains a queue of events. aer_irq()
enqueues AER status information and schedules aer_isr() to dequeue and
process it. When we remove a device, aer_remove() waits for the queue to
be empty, then frees the rpc struct.
But aer_isr() references the rpc struct after dequeueing and possibly
emptying the queue, which can cause a use-after-free error as in the
following scenario with two threads, aer_isr() on the left and a
concurrent aer_remove() on the right:
Thread A Thread B
-------- --------
aer_irq():
rpc->prod_idx++
aer_remove():
wait_event(rpc->prod_idx == rpc->cons_idx)
# now blocked until queue becomes empty
aer_isr(): # ...
rpc->cons_idx++ # unblocked because queue is now empty
... kfree(rpc)
mutex_unlock(&rpc->rpc_mutex)
To prevent this problem, use flush_work() to wait until the last scheduled
instance of aer_isr() has completed before freeing the rpc struct in
aer_remove().
I reproduced this use-after-free by flashing a device FPGA and
re-enumerating the bus to find the new device. With SLUB debug, this
crashes with 0x6b bytes (POISON_FREE, the use-after-free magic number) in
GPR25:
ARConnect/MCIP Inter-Core-Interrupt module can't send interrupt to
local core. So use core intc capability to trigger software
interrupt to self, using an unsued IRQ #21.
This showed up as csd deadlock with LTP trace_sched on a dual core
system. This test acts as scheduler fuzzer, triggering all sorts of
schedulting activity. Trouble starts with IPI to self, which doesn't get
delivered (effectively lost due to H/w capability), but the msg intended
to be sent remain enqueued in per-cpu @ipi_data.
All subsequent IPIs to this core from other cores get elided due to the
IPI coalescing optimization in ipi_send_msg_one() where a pending msg
implies an IPI already sent and assumes other core is yet to ack it.
After the elided IPI, other core simply goes into csd_lock_wait()
but never comes out as this core never sees the interrupt.
Returning to delay slot, riding an interrupti, had one loose end.
AUX_USER_SP used for restoring user mode SP upon RTIE was not being
setup from orig task's saved value, causing task to use wrong SP,
leading to ProtV errors.
The reason being:
- INTERRUPT_EPILOGUE returns to a kernel trampoline, thus not expected to restore it
- EXCEPTION_EPILOGUE is not used at all
Fix that by restoring AUX_USER_SP explicitly in the trampoline.
This was broken in the original workaround, but the error scenarios got
reduced considerably since v3.14 due to following:
1. The Linuxthreads.old based userspace at the time caused many more
exceptions in delay slot than the current NPTL based one.
Infact with current userspace the error doesn't happen at all.
2. Return from interrupt (delay slot or otherwise) doesn't get exercised much
after commit 4de0e52867d8 ("Really Re-enable interrupts to avoid deadlocks")
since IRQ_ACTIVE.active being clear means most returns are as if from pure
kernel (even for active interrupts)
Infact the issue only happened in an experimental branch where I was tinkering with
reverted 4de0e52867d8
Fixes: 4255b07f2c9c ("ARCv2: STAR 9000793984: Handle return from intr to Delay Slot") Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The bulk of ATA host state machine is implemented by
ata_sff_hsm_move(). The function is called from either the interrupt
handler or, if polling, a work item. Unlike from the interrupt path,
the polling path calls the function without holding the host lock and
ata_sff_hsm_move() selectively grabs the lock.
This is completely broken. If an IRQ triggers while polling is in
progress, the two can easily race and end up accessing the hardware
and updating state machine state at the same time. This can put the
state machine in an illegal state and lead to a crash like the
following.
Fix it by ensuring that the polling path is holding the host lock
before entering ata_sff_hsm_move() so that all hardware accesses and
state updates are performed under the host lock.
Commit 84e32a0 ("qla2xxx: Use pci_enable_msix_range() instead of
pci_enable_msix()") introduced a regression when target mode is enabled.
In qla24xx_enable_msix(), ha->max_rsp_queues was incorrectly set
to a value higher than the number of response queues allocated causing
an invalid dereference. Specifically here in qla2x00_init_rings():
*rsp->in_ptr = 0;
Add additional check to make sure the pointer is valid. following
call stack will be seen
The non-DT platform that uses this driver (actually the AVR32) was taking a bad
branch for determining if the IP would use gpio for CS.
Adding the presence of DT as a condition fixes this issue.
Fixes: 4820303480a1 ("spi: atmel: add support for the internal chip-select of the spi controller") Reported-by: Mans Rullgard <mans@mansr.com> Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
[nicolas.ferre@atmel.com: extract from ml discussion] Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Tested-by: Mans Rullgard <mans@mansr.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes a race between setting of SCF_SEND_DELAYED_TAS
in transport_send_task_abort(), and check of the same bit in
transport_check_aborted_status().
It adds a __transport_check_aborted_status() version that is
used by target_execute_cmd() when se_cmd->t_state_lock is
held, and a transport_check_aborted_status() wrapper for
all other existing callers.
Also, it handles the case where the check happens before
transport_send_task_abort() gets called. For this, go
ahead and set SCF_SEND_DELAYED_TAS early when necessary,
and have transport_send_task_abort() send the abort.
Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To address the bug where fabric driver level shutdown
of se_cmd occurs at the same time when TMR CMD_T_ABORTED
is happening resulting in a -1 ->cmd_kref, this patch
adds a CMD_T_FABRIC_STOP bit that is used to determine
when TMR + driver I_T nexus shutdown is happening
concurrently.
It changes target_sess_cmd_list_set_waiting() to obtain
se_cmd->cmd_kref + set CMD_T_FABRIC_STOP, and drop local
reference in target_wait_for_sess_cmds() and invoke extra
target_put_sess_cmd() during Task Aborted Status (TAS)
when necessary.
Also, it adds a new target_wait_free_cmd() wrapper around
transport_wait_for_tasks() for the special case within
transport_generic_free_cmd() to set CMD_T_FABRIC_STOP,
and is now aware of CMD_T_ABORTED + CMD_T_TAS status
bits to know when an extra transport_put_cmd() during
TAS is required.
Note transport_generic_free_cmd() is expected to block on
cmd->cmd_wait_comp in order to follow what iscsi-target
expects during iscsi_conn context se_cmd shutdown.
Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@daterainc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes a bug in TMR task aborted status (TAS)
handling when multiple sessions are connected to the
same target WWPN endpoint and se_node_acl descriptor,
resulting in TASK_ABORTED status to not be generated
for aborted se_cmds on the remote port.
This is due to core_tmr_handle_tas_abort() incorrectly
comparing se_node_acl instead of se_session, for which
the multi-session case is expected to be sharing the
same se_node_acl.
Instead, go ahead and update core_tmr_handle_tas_abort()
to compare tmr_sess + cmd->se_sess in order to determine
if the LUN_RESET was received on a different I_T nexus,
and TASK_ABORTED status response needs to be generated.
Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Quinn Tran <quinn.tran@qlogic.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes a NULL pointer se_cmd->cmd_kref < 0
refcount bug during TMR LUN_RESET with active TMRs,
triggered during se_cmd + se_tmr_req descriptor
shutdown + release via core_tmr_drain_tmr_list().
To address this bug, go ahead and obtain a local
kref_get_unless_zero(&se_cmd->cmd_kref) for active I/O
to set CMD_T_ABORTED, and transport_wait_for_tasks()
followed by the final target_put_sess_cmd() to drop
the local ->cmd_kref.
Also add two new checks within target_tmr_work() to
avoid CMD_T_ABORTED -> TFO->queue_tm_rsp() callbacks
ahead of invoking the backend -> fabric put in
transport_cmd_check_stop_to_fabric().
For good measure, also change core_tmr_release_req()
to use list_del_init() ahead of se_tmr_req memory
free.
Reviewed-by: Quinn Tran <quinn.tran@qlogic.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes a NULL pointer se_cmd->cmd_kref < 0
refcount bug during TMR LUN_RESET with active se_cmd
I/O, that can be triggered during se_cmd descriptor
shutdown + release via core_tmr_drain_state_list() code.
To address this bug, add common __target_check_io_state()
helper for ABORT_TASK + LUN_RESET w/ CMD_T_COMPLETE
checking, and set CMD_T_ABORTED + obtain ->cmd_kref for
both cases ahead of last target_put_sess_cmd() after
TFO->aborted_task() -> transport_cmd_finish_abort()
callback has completed.
It also introduces SCF_ACK_KREF to determine when
transport_cmd_finish_abort() needs to drop the second
extra reference, ahead of calling target_put_sess_cmd()
for the final kref_put(&se_cmd->cmd_kref).
It also updates transport_cmd_check_stop() to avoid
holding se_cmd->t_state_lock while dropping se_cmd
device state via target_remove_from_state_list(), now
that core_tmr_drain_state_list() is holding the
se_device lock while checking se_cmd state from
within TMR logic.
Finally, move transport_put_cmd() release of SGL +
TMR + extended CDB memory into target_free_cmd_mem()
in order to avoid potential resource leaks in TMR
ABORT_TASK + LUN_RESET code-paths. Also update
target_release_cmd_kref() accordingly.
Reviewed-by: Quinn Tran <quinn.tran@qlogic.com> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Andy Grover <agrover@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After login to the desktop on Dell Inspiron 3162,
there's a very loud background noise comes from the builtin speaker.
The noise does not go away even if the speaker is muted.
The noise disappears after using the aamix fixup.
Codec: Realtek ALC3234
Address: 0
AFG Function Id: 0x1 (unsol 1)
Vendor Id: 0x10ec0255
Subsystem Id: 0x10280725
Revision Id: 0x100002
No Modem Function Group found
Some Skylake machines show the codec probe errors in certain
situations, e.g. HP Z240 desktop fails to probe the onboard Realtek
codec at reloading the snd-hda-intel module like:
snd_hda_intel 0000:00:1f.3: spurious response 0x200:0x2, last cmd=0x000000
snd_hda_intel 0000:00:1f.3: azx_get_response timeout, switching to polling mode: lastcmd=0x000f0000
snd_hda_intel 0000:00:1f.3: No response from codec, disabling MSI: last cmd=0x000f0000
snd_hda_intel 0000:00:1f.3: Codec #0 probe error; disabling it...
hdaudio hdaudioC0D2: no AFG or MFG node found
snd_hda_intel 0000:00:1f.3: no codecs initialized
Also, HP G470 G3 suffers from the similar problem, as reported in
bugzilla below. On this machine, the codec probe error appears even
at a fresh boot.
As Libin suggested, the same workaround used for Broxton in the commit
[6639484ddaf6: ALSA: hda - disable dynamic clock gating on Broxton
before reset] can be applied for Skylake in order to fix this problem.
The Intel HW team also confirmed that this is needed for SKL.
This patch makes the workaround applied to both SKL and BXT
platforms. The referred macros are moved and one superfluous macro
(IS_BROXTON()) is another one (IS_BXT()) as well.
Workqueue used to implicity guarantee that work items queued without
explicit CPU specified are put on the local CPU. Recent changes in
timer broke the guarantee and led to vmstat breakage which was fixed
by 176bed1de5bf ("vmstat: explicitly schedule per-cpu work on the CPU
we need it to run on").
vmstat is the most likely to expose the issue and it's quite possible
that there are other similar problems which are a lot more difficult
to trigger. As a preventive measure, 874bbfe600a6 ("workqueue: make
sure delayed work run in local cpu") was applied to restore the local
CPU guarnatee. Unfortunately, the change exposed a bug in timer code
which got fixed by 22b886dd1018 ("timers: Use proper base migration in
add_timer_on()"). Due to code restructuring, the commit couldn't be
backported beyond certain point and stable kernels which only had 874bbfe600a6 started crashing.
The local CPU guarantee was accidental more than anything else and we
want to get rid of it anyway. As, with the vmstat case fixed, 874bbfe600a6 is causing more problems than it's fixing, it has been
decided to take the chance and officially break the guarantee by
reverting the commit. A debug feature will be added to force foreign
CPU assignment to expose cases relying on the guarantee and fixes for
the individual cases will be backported to stable as necessary.
Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: 874bbfe600a6 ("workqueue: make sure delayed work run in local cpu") Link: http://lkml.kernel.org/g/20160120211926.GJ10810@quack.suse.cz Cc: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: Daniel Bilik <daniel.bilik@neosystem.cz> Cc: Jan Kara <jack@suse.cz> Cc: Shaohua Li <shli@fb.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Daniel Bilik <daniel.bilik@neosystem.cz> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When looking up the pool_workqueue to use for an unbound workqueue,
workqueue assumes that the target CPU is always bound to a valid NUMA
node. However, currently, when a CPU goes offline, the mapping is
destroyed and cpu_to_node() returns NUMA_NO_NODE.
This has always been broken but hasn't triggered often enough before 874bbfe600a6 ("workqueue: make sure delayed work run in local cpu").
After the commit, workqueue forcifully assigns the local CPU for
delayed work items without explicit target CPU to fix a different
issue. This widens the window where CPU can go offline while a
delayed work item is pending causing delayed work items dispatched
with target CPU set to an already offlined CPU. The resulting
NUMA_NO_NODE mapping makes workqueue try to queue the work item on a
NULL pool_workqueue and thus crash.
While 874bbfe600a6 has been reverted for a different reason making the
bug less visible again, it can still happen. Fix it by mapping
NUMA_NO_NODE to the default pool_workqueue from unbound_pwq_by_node().
This is a temporary workaround. The long term solution is keeping CPU
-> NODE mapping stable across CPU off/online cycles which is being
worked on.
During a sw scan ieee80211_iface_work ignores work items for all vifs.
However after the scan complete work is requeued only for STA, ADHOC
and MESH iftypes.
This occasionally results in event processing getting delayed/not
processed for iftype AP when it coexists with a STA. This can result
in data halt and eventually disconnection on the AP interface.
The code within wait_event_interruptible() is called with
!TASK_RUNNING, so mustn't call any functions that can sleep,
like mutex_lock().
Since we re-check the list_empty() in a loop after the wait,
it's safe to simply use list_empty() without locking.
This bug has existed forever, but was only discovered now
because all userspace implementations, including the default
'rfkill' tool, use poll() or select() to get a readable fd
before attempting to read.
commit 0ff53d096422 sets the next tick interrupt to the last jiffies update,
i.e. in the past, because the forward operation is invoked before the set
operation. There is no resulting damage (yet), but we get an extra pointless
tick interrupt.
Revert the order so we get the next tick interrupt in the future.
Fixes: commit 0ff53d096422 "tick: sched: Force tick interrupt and get rid of softirq magic" Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1453893967-3458-1-git-send-email-wanpeng.li@hotmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This commit merged interval and overall counters reading into single
read_counters function.
The old interval code cleaned the stddev data for some reason (it's
never displayed in interval mode) and the mentioned commit kept on
cleaning the stddev data in merged function, which resulted in the
stddev not being displayed.
Removing the wrong stddev data cleanup init_stats call.
Reported-and-Tested-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Fixes: 106a94a0f8c2 ("perf stat: Introduce read_counters function") Link: http://lkml.kernel.org/r/1453290995-18485-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit f79b0d9c223c ("staging: speakup: Fixed warning <linux/serial.h>
instead of <asm/serial.h>") broke the port information in the speakup
driver: SERIAL_PORT_DFNS only gets defined if asm/serial.h is included,
and no other header includes asm/serial.h.
We here make sure serialio.c does get the arch-specific definition of
SERIAL_PORT_DFNS from asm/serial.h, if any.
Along the way, this makes sure that we do have information for the
requested serial port number (index)
Fixes: f79b0d9c223c ("staging: speakup: Fixed warning <linux/serial.h> instead of <asm/serial.h>") Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit ca369d51b3e1 ("block/sd: Fix device-imposed transfer length
limits") accidentally switched optimal I/O size reporting from bytes to
block layer sectors.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Fixes: ca369d51b3e1649be4a72addd6d6a168cfb3f537 Reviewed-by: James E.J. Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>