We need an indication that the isert_conn->iscsi_conn binding has
happened so we know not to invoke a connection reinstatement
on an unbound connection, which would lead to a bogus isert_conn->conn
dereference.
Once a connection request is accepted, one rx descriptor
is posted to receive the login request. This descriptor has rx type,
but it lives outside the main pool of rx descriptors, and thus
was mistreated as tx type.
This patch fixes an active I/O shutdown bug for fabric
drivers using target_wait_for_sess_cmds(), where se_cmd
descriptor shutdown would result in hung tasks waiting
indefinitely for se_cmd->cmd_wait_comp to complete().
To address this bug, drop the incorrect list_del_init()
usage in target_wait_for_sess_cmds() and always complete()
during the se_cmd target_release_cmd_kref() put, in order to
let the caller invoke the final fabric release callback
into se_cmd->se_tfo->release_cmd() code.
Our dividers weren't being set successfully because CM_PASSWORD wasn't
included in the register write. It looks easier to just compute the
divider to write ourselves than to update clk-divider for the ability
to OR in some arbitrary bits on write.
Fixes about half of the video modes on my HDMI monitor (everything
except 720x400).
Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Michael Turquette <mturquette@baylibre.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hclk_cpubus needs to keep running because it is needed for devices like
the rom, i2s0 or spdif to be accessible via the cpu. Without it, all
accesses to these devices (readl/writel) return wrong data. So add it
to the list of critical clocks.
The vdpu and vepu clocks can also be parented to the npll, and the current
parent list is also wrong, as it would use the npll as the "usbphy" source,
so adapt the parent list to the correct one.
Similar to commit 9880d4277f6a ("clk: rockchip: fix rk3288 cpuclk core
dividers") it seems the cpuclk dividers are one to high on the rk3368
as well.
And again similar to the previous fix, we opt to make the divider list
contain the values to be written to use the same paradigm for them on all
supported socs.
Both clusters have their mux bit in bit 7 of their respective register.
For whatever reason the big cluster currently lists bit 15, which is
definitely wrong.
Normally the timeout clock frequency is read from the capabilities
register. It is also possible to set the value prior to calling
sdhci_add_host() in which case that value will override the
capabilities register value. However, that was being done after
calculating max_busy_timeout, so max_busy_timeout was being
calculated using the wrong value of timeout_clk.
Fix that by moving the override before max_busy_timeout is
calculated.
The result is that the max_busy_timeout and max_discard
increase for BSW devices so that, for example, the time for
mkfs.ext4 on a 64GB eMMC drops from about 1 minute 40 seconds
to about 20 seconds.
Note, in the future, the capabilities setting will be tidied up
and this override won't be used anymore. However this fix is
needed for stable.
The calculation for the timeout based on the number of card clocks is
incorrect. The calculation assumed:
timeout in microseconds = clock cycles / clock in Hz
which is clearly several orders of magnitude wrong. Fix this by
multiplying the clock cycles by 1000000 prior to dividing by the Hz-based
clock. Also, as per part 1, ensure that the division rounds
up.
As this needs 64-bit math via do_div(), avoid it if the clock cycles
is zero.
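A minimal sketch of the corrected arithmetic (variable names such as clks,
clk_hz and timeout_us are illustrative, not the driver's exact code):

    /* cycles -> microseconds: scale up before dividing, round up, and
     * skip the 64-bit division entirely when there are no cycles */
    if (clks) {
        u64 us = (u64)clks * 1000000;
        u32 rem = do_div(us, clk_hz);   /* us = clks * 1e6 / clk_hz */

        if (rem)
            us++;                       /* round up, as per part 1 */
        timeout_us += us;
    }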
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The data timeout gives the minimum amount of time that should be
waited before timing out if no data is received from the card.
Simply dividing the nanosecond part by 1000 does not give this
required guarantee, since such a division rounds down. Use
DIV_ROUND_UP() to give the desired timeout.
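A sketch of the change, assuming 'data' is the struct mmc_data whose timeout
is being computed:

    /* old: integer division rounds down, shortening the guaranteed wait */
    timeout_us = data->timeout_ns / 1000;

    /* new: round up so the minimum wait is never shorter than requested */
    timeout_us = DIV_ROUND_UP(data->timeout_ns, 1000);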
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
# dmesg | grep mmc
mmc_spi spi32766.0: SD/MMC host mmc0, no WP, no poweroff, cd polling
mmc0: host does not support reading read-only switch, assuming write-enable
mmc0: new SDHC card on SPI
mmcblk0: mmc0:0000 SU04G 3.69 GiB
mmcblk0: p1
With this patch applied the "cd polling" portion above disappears.
If mmc_blk_ioctl returns -EINVAL, blkdev_ioctl continues to
work without returning the error to user-space. But now we check
CAP_SYS_RAWIO first, so we return -EPERM to blkdev_ioctl,
which makes blkdev_ioctl return -EPERM to user-space directly.
This breaks all ioctls on BLKROSET. We found that
Android's adb suffers from it, per the following log:
remount of /system failed;
couldn't make block device writable: Operation not permitted
openat(AT_FDCWD, "/dev/block/platform/ff420000.dwmmc/by-name/system", O_RDONLY) = 3
ioctl(3, BLKROSET, 0) = -1 EPERM (Operation not permitted)
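A hedged sketch of the intended ordering (helper names are illustrative):
decide whether the ioctl is one of ours before applying the permission
check, so unrecognised commands such as BLKROSET still fall back to
blkdev_ioctl() via -EINVAL:

    static int mmc_blk_ioctl(struct block_device *bdev, fmode_t mode,
                             unsigned int cmd, unsigned long arg)
    {
        switch (cmd) {
        case MMC_IOC_CMD:
            if (!capable(CAP_SYS_RAWIO))
                return -EPERM;
            return mmc_blk_ioctl_cmd(bdev, (void __user *)arg);
        case MMC_IOC_MULTI_CMD:
            if (!capable(CAP_SYS_RAWIO))
                return -EPERM;
            return mmc_blk_ioctl_multi_cmd(bdev, (void __user *)arg);
        default:
            return -EINVAL; /* lets blkdev_ioctl() handle BLKROSET etc. */
        }
    }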
Fixes: a5f5774c55a2 ("mmc: block: Add new ioctl to send multi commands") Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some Lenovo ideapad models lack a physical rfkill switch.
On Lenovo models ideapad Y700 Touch-15ISK and ideapad Y700-15ISK,
ideapad-laptop would wrongly report all radios as blocked by
hardware which caused wireless network connections to fail.
Add these models without an rfkill switch to the no_hw_rfkill list.
mkspec copies the built kernel to a temporary location
/boot/vmlinuz-$KERNELRELEASE-rpm
and runs installkernel on it. However, this directly leads to a grub2
menuentry for this suffixed binary being generated as well during the run
of the installkernel script.
Later in the process the temporary -rpm suffixed files are removed, and
therefore we end up with spurious (and non-functional) grub2 menu entries
for each installed kernel RPM.
Fix that by using a different temporary name (prefixed by '.'), so that
the binary is not recognized as an actual kernel binary and no menuentry
is created for it.
Signed-off-by: Jiri Kosina <jkosina@suse.cz> Fixes: 3c9c7a14b627 ("rpm-pkg: add %post section to create initramfs and grub hooks") Signed-off-by: Michal Marek <mmarek@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Documentation/Changes still lists this as the minimal required version,
so it ought to remain usable for the time being.
Fixes: d2036f30cf ("scripts/kconfig/Makefile: Allow KBUILD_DEFCONFIG to be a target") Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Michal Marek <mmarek@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
__clear_bit_unlock() is a special little snowflake. While it carries the
non-atomic '__' prefix, it is specifically documented to pair with
test_and_set_bit() and therefore should be 'somewhat' atomic.
Therefore the generic implementation of __clear_bit_unlock() cannot use
the fully non-atomic __clear_bit() as a default.
If an arch is able to do better, it must provide an implementation of
__clear_bit_unlock() itself.
Specifically, this came up as a result of hackbench livelock'ing in
slab_lock() on ARC with SMP + SLUB + !LLSC.
The non serializing __clear_bit() was getting "lost"
80543b8e:  ld_s  r2,[r13,0]   <--- (A) Finds PG_locked is set
80543b90:  or    r3,r2,1      <--- (B) other core unlocks right here
80543b94:  st_s  r3,[r13,0]   <--- (C) sets PG_locked (overwrites unlock)
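A conservative way to express that requirement in the generic header is to
fall back to the fully atomic unlock primitive unless the architecture
provides its own; a sketch, not necessarily the exact upstream change:

    /*
     * __clear_bit_unlock() pairs with test_and_set_bit(); the generic
     * fallback must not lose concurrent updates to other bits in the
     * word, so default to the atomic clear_bit_unlock().  An arch that
     * can do better overrides this with its own definition.
     */
    #ifndef __clear_bit_unlock
    #define __clear_bit_unlock(nr, addr)   clear_bit_unlock(nr, addr)
    #endif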
The trace_printk() code will allocate extra buffers if the compile detects
that a trace_printk() is used. To do this, the format of the trace_printk()
is saved to the __trace_printk_fmt section, and if that section is bigger
than zero, the buffers are allocated (along with a message that this has
happened).
If trace_printk() uses a format that is not a constant, and thus something
not guaranteed to be around when the print happens, the compiler optimizes
the fmt out, as it is not used, and the __trace_printk_fmt section is not
filled. This means the kernel will not allocate the special buffers needed
for the trace_printk() and the trace_printk() will not write anything to the
tracing buffer.
Adding a "__used" to the variable in the __trace_printk_fmt section will
keep it around, even though it is set to NULL. This will keep the string
from being printed in the debugfs/tracing/printk_formats section as it is
not needed.
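A simplified sketch of the idea (the real macro carries more than this);
the __used attribute keeps the section entry even when gcc can prove the
variable is never read:

    #define do_trace_printk(fmt, args...)                              \
    do {                                                               \
        /* __used keeps the section entry even if gcc never reads it */\
        static const char *trace_printk_fmt __used                     \
            __attribute__((section("__trace_printk_fmt"))) =           \
            __builtin_constant_p(fmt) ? fmt : NULL;                    \
                                                                       \
        __trace_printk(_THIS_IP_, fmt, ##args);                        \
    } while (0)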
Reported-by: Vlastimil Babka <vbabka@suse.cz> Fixes: 07d777fe8c398 "tracing: Add percpu buffers for trace_printk()" Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If tracing contains data and the trace_pipe file is read with sendfile(),
then it can trigger a NULL pointer dereference and various BUG_ON within the
VM code.
There's a patch to fix this in the splice_to_pipe() code, but it's also a
good idea to not let that happen from trace_pipe either.
Joel Fernandes reported that the function tracing of preempt disabled
sections was not being reported when running either the preemptirqsoff or
preemptoff tracers. This was due to the fact that the function tracer
callback for those tracers checked if irqs were disabled before tracing. But
this fails when we want to trace preempt off locations as well.
Joel explained that he wanted to see functions where interrupts are enabled
but preemption is disabled, and gave an example of the output he expected.
There's a comment saying that the irq disabled check is because there's a
possible race that tracing_cpu may be set when the function is executed. But
I don't remember that race. For now, I added a check for preemption being
enabled too to not record the function, as there would be no race if that
was the case. I need to re-investigate this, as I'm now thinking that the
tracing_cpu will always be correct. But no harm in keeping the check for
now, except for the slight performance hit.
Link: http://lkml.kernel.org/r/1457770386-88717-1-git-send-email-agnel.joel@gmail.com Fixes: 5e6d2b9cfa3a "tracing: Use one prologue for the preempt irqs off tracer function tracers" Reported-by: Joel Fernandes <agnel.joel@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A narrow window for a race condition still exists between the
multicast join thread and the *dev_flush workers.
A kernel crash caused by prolonged erratic link state changes
was observed (most likely faulty cabling):
[167275.656270] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[167275.665973] IP: [<ffffffffa05f8f2e>] ipoib_mcast_join+0xae/0x1d0 [ib_ipoib]
[167275.674443] PGD 0
[167275.677373] Oops: 0000 [#1] SMP
...
[167275.977530] Call Trace:
[167275.982225] [<ffffffffa05f92f0>] ? ipoib_mcast_free+0x200/0x200 [ib_ipoib]
[167275.992024] [<ffffffffa05fa1b7>] ipoib_mcast_join_task+0x2a7/0x490
[ib_ipoib]
[167276.002149] [<ffffffff8109d5fb>] process_one_work+0x17b/0x470
[167276.010754] [<ffffffff8109e3cb>] worker_thread+0x11b/0x400
[167276.019088] [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[167276.027737] [<ffffffff810a5aef>] kthread+0xcf/0xe0
Here is the spot that was hit:
ipoib_mcast_join() {
..............
rec.qkey = priv->broadcast->mcmember.qkey;
^^^^^^^
.....
}
The proposed patch prevents the multicast join task from continuing
if a link state change is detected.
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Changes from v4:
- as suggested by Doug Ledford, optimized spinlock usage,
i.e. ipoib_mcast_join() is called with lock held.
Changes from v3:
- sync with priv->lock before flag check.
Changes from v2:
- Move check for OPER_UP flag state to mcast_join() to
ensure no event worker is in progress.
- minor style fixes.
Changes from v1:
- No need to lock again if error detected. Signed-off-by: Doug Ledford <dledford@redhat.com> Cc: Nikolay Borisov <kernel@kyup.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ken Wang <Qingqing.Wang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some PX laptops don't provide an ACPI method to control dGPU power. On
those systems, the driver is responsible for handling the dGPU power
state. Disable runtime PM on them until support for this is implemented.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As observed on Apple iMac10,1, DCE-3.2, RV-730,
a link rate of 2.7 GHz is not selected, because
the args.v1.ucConfig flag setting for 2.7 GHz
gets overwritten by a following assignment of
the transmitter to use.
Move link rate setup a few lines down to fix this.
In practice this didn't have any positive or
negative effect on display setup on the tested
iMac10,1, so I don't know if backporting to stable
makes sense or not.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some PX laptops don't provide an ACPI method to control dGPU power. On
those systems, the driver is responsible for handling the dGPU power
state. Disable runtime PM on them until support for this is implemented.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the opmode is stopped and started again, we did not free
the paging buffers. Fix that.
In addition when freeing the firmware's paging download
buffer, set the pointer to NULL.
Signed-off-by: Matti Gottlieb <matti.gottlieb@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit d63c7dd5bcb9 ("ipr: Fix out-of-bounds null overwrite") removed
the end of line handling when storing the update_fw sysfs attribute.
This changed the userspace API because it started refusing writes
terminated by a line feed, which broke the update tools we already have.
This patch re-adds that handling, so both a write terminated by a line
feed or not can make it through with the update.
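A hypothetical sketch of the re-added handling (not necessarily the exact
hunk): strip an optional trailing newline before using the file name:

    char fname[100];
    char *endline;

    snprintf(fname, sizeof(fname), "%s", buf);
    endline = strchr(fname, '\n');
    if (endline)
        *endline = '\0';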
Fixes: d63c7dd5bcb9 ("ipr: Fix out-of-bounds null overwrite") Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Cc: Insu Yun <wuninsu@gmail.com> Acked-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The return value of snprintf is not bounded by the size value (2nd argument)
(https://www.kernel.org/doc/htmldocs/kernel-api/API-snprintf.html).
The return value is the number of printed chars and can be larger than the
2nd argument. Therefore, it can write a null byte out of bounds of the buffer.
Since snprintf already puts a null byte, there is no need to put an additional null byte.
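An illustrative fragment (buffer and value names are made up) showing why
trusting snprintf()'s return value is unsafe, and a bounded alternative:

    len = snprintf(buf, PAGE_SIZE, "%s\n", value);
    buf[len] = '\0';    /* BAD: len may be >= PAGE_SIZE, writes out of bounds */

    /* snprintf() already NUL-terminates within the buffer; if a clamped
     * length is needed, scnprintf() returns at most PAGE_SIZE - 1. */
    len = scnprintf(buf, PAGE_SIZE, "%s\n", value);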
Signed-off-by: Insu Yun <wuninsu@gmail.com> Reviewed-by: Shane Seymour <shane.seymour@hpe.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix deadlocking during concurrent receive and transmit operations on SMP
platforms caused by the use of an incorrect lock: on transmit, the 'tx_lock'
spinlock should be used instead of 'lock', which is used for the receive
operation.
This fix is applicable to kernel versions starting from v2.15.
Signed-off-by: Aurelien Jacquiot <a-jacquiot@ti.com> Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This commit fixes the following security hole affecting systems where
all of the following conditions are fulfilled:
- The fs.suid_dumpable sysctl is set to 2.
- The kernel.core_pattern sysctl's value starts with "/". (Systems
where kernel.core_pattern starts with "|/" are not affected.)
- Unprivileged user namespace creation is permitted. (This is
true on Linux >=3.8, but some distributions disallow it by
default using a distro patch.)
Under these conditions, if a program executes under secure exec rules,
causing it to run with the SUID_DUMP_ROOT flag, then unshares its user
namespace, changes its root directory and crashes, the coredump will be
written using fsuid=0 and a path derived from kernel.core_pattern - but
this path is interpreted relative to the root directory of the process,
allowing the attacker to control where a coredump will be written with
root privileges.
To fix the security issue, always interpret core_pattern for dumps that
are written under SUID_DUMP_ROOT relative to the root directory of init.
Signed-off-by: Jann Horn <jann@thejh.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 'reqs' member of fuse_io_priv serves two purposes. The first is to track
the number of outstanding async requests to the server and to signal that
the io request is completed. The second is to be a reference count on the
structure to know when it can be freed.
For sync io requests these purposes can be at odds. fuse_direct_IO() wants
to block until the request is done, and since the signal is sent when
'reqs' reaches 0 it cannot keep a reference to the object. Yet it needs to
use the object after the userspace server has completed processing
requests. This leads to some handshaking and special casing that is
needlessly complicated and responsible for at least one race condition.
It's much cleaner and safer to maintain a separate reference count for the
object lifecycle and to let 'reqs' just be a count of outstanding requests
to the userspace server. Then we can know for sure when it is safe to free
the object without any handshaking or special cases.
The catch here is that most of the time these objects are stack allocated
and should not be freed. Initializing these objects with a single reference
that is never released prevents accidental attempts to free the objects.
There's a race in fuse_direct_IO(), whereby is_sync_kiocb() is called on an
iocb that could have been freed if async io has already completed. The fix
in this case is simple and obvious: cache the result before starting io.
It was discovered by KASan:
kernel: ==================================================================
kernel: BUG: KASan: use after free in fuse_direct_IO+0xb1a/0xcc0 at addr ffff88036c414390
Signed-off-by: Robert Doebbelin <robert@quobyte.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: bcba24ccdc82 ("fuse: enable asynchronous processing direct IO") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Inside multipath_make_request(), multipath maps the incoming
bio onto the low level device's bio, but it is totally wrong to
copy the bio into the mapped bio via '*mapped_bio = *bio'. For
example, .__bi_remaining is kept in the copy, especially if
the incoming bio is chained to via bio splitting, so .bi_end_io
can't be called for the mapped bio at all in the completion path
in this kind of situation.
This patch fixes the issue by cloning the bio instead.
Reported-and-tested-by: Andrea Righi <righi.andrea@gmail.com> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
break_stripe_batch_list breaks up a batch and copies some flags from
the batch head to the members, preserving others.
It doesn't preserve or copy STRIPE_PREREAD_ACTIVE. This is not
normally a problem as STRIPE_PREREAD_ACTIVE is cleared when a
stripe_head is added to a batch, and is not set on stripe_heads
already in a batch.
However there is no locking to ensure one thread doesn't set the flag
after it has just been cleared in another. This does occasionally happen.
md/raid5 maintains a count of the number of stripe_heads with
STRIPE_PREREAD_ACTIVE set: conf->preread_active_stripes. When
break_stripe_batch_list clears STRIPE_PREREAD_ACTIVE inadvertently,
this count becomes incorrect and will never again return to zero.
md/raid5 delays the handling of some stripe_heads until
preread_active_stripes becomes zero. So when the above-mentioned race
happens, those stripe_heads become blocked and never progress,
resulting in writes to the array hanging.
So: change break_stripe_batch_list to preserve STRIPE_PREREAD_ACTIVE
in the members of a batch.
URL: https://bugzilla.kernel.org/show_bug.cgi?id=108741
URL: https://bugzilla.redhat.com/show_bug.cgi?id=1258153
URL: http://thread.gmane.org/5649C0E9.2030204@zoner.cz Reported-by: Martin Svec <martin.svec@zoner.cz> (and others) Tested-by: Tom Weber <linux@junkyard.4t2.com> Fixes: 1b956f7a8f9a ("md/raid5: be more selective about distributing flags across batch.") Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revert commit e9e4c377e2f563 ("md/raid5: per hash value and exclusive wait_for_stripe")
The problem is raid5_get_active_stripe waits on
conf->wait_for_stripe[hash]. Assume hash is 0. My test release stripes
in this order:
- release all stripes with hash 0
- raid5_get_active_stripe still sleeps since active_stripes >
max_nr_stripes * 3 / 4
- release all stripes with hash other than 0. active_stripes becomes 0
- raid5_get_active_stripe still sleeps, since nobody wakes up
wait_for_stripe[0]
The system livelocks. The problem is that active_stripes isn't a per-hash
count. Reverting the patch makes the livelock go away.
Cc: Yuanhan Liu <yuanhan.liu@linux.intel.com> Cc: NeilBrown <neilb@suse.de> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
check_reshape() is called from the raid5d thread. The raid5d thread shouldn't
call mddev_suspend(), because mddev_suspend() waits for all IO to finish
but IO is handled in the raid5d thread, so we could easily deadlock here.
This issue was introduced by 738a273 ("md/raid5: fix allocation of 'scribble' array.")
Reported-and-tested-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
'max_discard_sectors' is in sectors, while 'stripe' is in bytes.
This fixes the problem where DISCARD would get disabled on some larger
RAID5 configurations (6 or more drives in my testing), while it worked
as expected with smaller configurations.
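A sketch of the unit fix, comparing sectors with sectors ('stripe' is in
bytes, so shift it down by 9 before comparing with max_discard_sectors);
the surrounding condition is abbreviated:

    if (mddev->queue->limits.max_discard_sectors >= (stripe >> 9) &&
        mddev->queue->limits.discard_granularity >= stripe)
        queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mddev->queue);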
Fixes: 620125f2bf8 ("MD: raid5 trim support") Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If raid1d is handling a mix of read and write errors, handle_read_error's
call to freeze_array can get stuck.
This can happen because, though the bio_end_io_list is initially drained,
writes can be added to it via handle_write_finished as the retry_list
is processed. These writes contribute to nr_pending but are not included
in nr_queued.
If a later entry on the retry_list triggers a call to handle_read_error,
freeze_array hangs waiting for nr_pending == nr_queued+extra. The writes
on the bio_end_io_list aren't included in nr_queued so the condition will
never be satisfied.
To prevent the hang, include bio_end_io_list writes in nr_queued.
There's probably a better way to handle decrementing nr_queued, but this
seemed like the safest way to avoid breaking surrounding code.
I'm happy to supply the script I used to repro this hang.
Fixes: 55ce74d4bfe1b(md/raid1: ensure device failure recorded before write request returns.) Signed-off-by: Nate Dailey <nate.dailey@stratus.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When dqget() in __dquot_initialize() fails e.g. due to an IO error,
__dquot_initialize() will pass an array of uninitialized pointers to
dqput_all() and thus can lead to a dereference of random data. Fix the
problem by properly initializing the array.
Signed-off-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
read{l,w}() and write{l,w}() primitives should use le{16,32}_to_cpu() and
cpu_to_le{16,32}() respectively to ensure device registers are accessed
correctly in a Big Endian CPU configuration.
Per Arnd Bergmann
| Most drivers using readl() or readl_relaxed() expect those to perform byte
| swaps on big-endian architectures, as the registers tend to be fixed endian
This was needed to get the UART to work correctly on a Big Endian ARC.
The ARC accessors were originally fine, and the bug was introduced
inadvertently by commit b8a033023994 ("ARCv2: barriers").
Simulator stdin may be connected to a file; when its end is reached, the
kernel hangs in an infinite loop inside rs_poll, because simc_poll always
signals that descriptor 0 is readable and simc_read always returns 0.
Check the simc_read return value and exit the loop if it's not positive.
Also don't rewind the polling timer if it's zero.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(busybox's cat uses sendfile(2), unlike the coreutils version)
This is because tracing_splice_read_pipe() can call splice_to_pipe()
with spd->nr_pages == 0. spd_pages underflows in splice_to_pipe() and
we fill the page pointers and the other fields of the pipe_buffers with
garbage.
All other callers of splice_to_pipe() avoid calling it when nr_pages ==
0, and we could make tracing_splice_read_pipe() do that too, but it
seems reasonable to have splice_to_pipe() handle this condition
gracefully.
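A sketch of the guard near the top of splice_to_pipe() (placement is
illustrative):

    if (!spd->nr_pages)
        return 0;   /* nothing to splice; avoid underflowing spd_pages */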
Signed-off-by: Rabin Vincent <rabin@rab.in> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
early_init_dt_alloc_reserved_memory_arch passes end as 0 to
__memblock_alloc_base, when limits are not specified. But
__memblock_alloc_base takes end value of 0 as MEMBLOCK_ALLOC_ACCESSIBLE
and limits the end to memblock.current_limit. This results in regions
never being placed in the HIGHMEM area, e.g. for CMA.
Let __memblock_alloc_base allocate from anywhere in memory if limits are
not specified.
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Function eth_prepare_mac_addr_change() is called as part of a MAC
address change. This function checks whether the interface is running.
To enable changing the MAC address while the interface is running, the
IFF_LIVE_ADDR_CHANGE flag must be set in dev->priv_flags, as sketched below.
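A sketch of the one-line change, set at probe time before register_netdev()
(the placement is an assumption):

    dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;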
Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP
network unit") Signed-off-by: Dmitri Epshtein <dima@marvell.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Before 2e91fa7f6d45 ("cgroup: keep zombies associated with their
original cgroups"), all dead tasks were associated with init_css_set.
If a zombie task is requested for migration, while migration prep
operations would still be performed on init_css_set, the actual
migration would ignore zombie tasks. As init_css_set is always valid,
this worked fine.
However, after 2e91fa7f6d45, zombie tasks stay with the css_set they were
associated with at the time of death. Let's say a task T is associated
with cgroup A on hierarchy H-1 and cgroup B on hierarchy H-2. After T
becomes a zombie, it would still remain associated with A and B. If A
only contains zombie tasks, it can be removed. On removal, A gets
marked offline but stays pinned until all zombies are drained. At
this point, if migration is initiated on T to a cgroup C on hierarchy
H-2, migration path would try to prepare T's css_set for migration and
trigger the following.
It doesn't make sense to prepare migration for css_sets pointing to
dead cgroups as they are guaranteed to contain only zombies which are
ignored later during migration. This patch makes cgroup destruction
path mark all affected css_sets as dead and updates the migration path
to ignore them during preparation.
Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: 2e91fa7f6d45 ("cgroup: keep zombies associated with their original cgroups") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Add Advertising command handler does the appropriate checks for
the AD and Scan Response data, however fails to take into account the
general length of the mgmt command itself, which could lead to
potential buffer overflows. This patch adds the necessary check that
the mgmt command length is consistent with the given ad and scan_rsp
lengths.
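A hedged sketch of the added consistency check in the Add Advertising
handler ('cp' is the parsed mgmt_cp_add_advertising; the exact error path
may differ from the real patch):

    if (len != sizeof(*cp) + cp->adv_data_len + cp->scan_rsp_len)
        return mgmt_cmd_status(sk, hdev->id, MGMT_OP_ADD_ADVERTISING,
                               MGMT_STATUS_INVALID_PARAMS);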
Calling return copy_to_user(...) in an ioctl will not do the right thing
if there's a pagefault: copy_to_user returns the number of bytes not
copied in this case.
Fix up watchdog/rc32434_wdt to do:
return copy_to_user(...) ? -EFAULT : 0;
instead.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While working on a script to restore all sysctl params before a series of
tests, I found that writing any value into the
/proc/sys/kernel/{nmi_watchdog,soft_watchdog,watchdog,watchdog_thresh}
files causes them to call proc_watchdog_update(), which logs:
NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
There doesn't appear to be a reason for doing this work every time a write
occurs, so only do it when the values change.
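A sketch of the idea rather than the exact sysctl handler: remember the old
value and only rebuild the watchdog state when it actually changed ('param'
stands for whichever variable the table entry points at):

    old = READ_ONCE(*param);
    err = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
    if (!err && write && old != READ_ONCE(*param))
        err = proc_watchdog_update();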
All architectures now need ioremap_uc(); ia64 seems to define this already
through its ioremap_nocache(), which already ensures it *only* uses UC.
This is needed since v4.3 to complete an allyesconfig compile on ia64;
there were other archs that needed this, and this one seems to have
fallen through the cracks.
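A sketch of the ia64 fallback, assuming a header-level alias is acceptable
there since ioremap_nocache() already yields UC mappings:

    #define ioremap_uc ioremap_nocache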
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Reported-by: kbuild test robot <fengguang.wu@intel.com> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Setting the original memory.limit_in_bytes hardlimit is subject to a
race condition when the desired value is below the current usage. The
code tries a few times to first reclaim and then see if the usage has
dropped to where we would like it to be, but there is no locking, and
the workload is free to continue making new charges up to the old limit.
Thus, attempting to shrink a workload relies on pure luck and hope that
the workload happens to cooperate.
To fix this in the cgroup2 memory.max knob, do it the other way round:
set the limit first, then try enforcement. And if reclaim is not able
to succeed, trigger OOM kills in the group. Keep going until the new
limit is met, we run out of OOM victims and there's only unreclaimable
memory left, or the task writing to memory.max is killed. This allows
users to shrink groups reliably, and the behavior is consistent with
what happens when new charges are attempted in excess of memory.max.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When setting memory.high below usage, nothing happens until the next
charge comes along, and then it will only reclaim its own charge and not
the now potentially huge excess of the new memory.high. This can cause
groups to stay in excess of their memory.high indefinitely.
To fix that, when shrinking memory.high, kick off a reclaim cycle that
goes after the delta.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When bch_cache_set_alloc() fails to kzalloc the cache_set, the
asynchronous closure handling tries to dereference a cache_set that
hadn't yet been allocated inside of cache_set_flush(), which is called
by __cache_set_unregister() during cleanup. This appears to happen only
during an OOM condition on bcache_register.
Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The bch_writeback_thread might BUG_ON in read_dirty() if
dc->sb==BDEV_STATE_DIRTY and bch_sectors_dirty_init has not yet completed
its related initialization. This patch downs the dc->writeback_lock until
after initialization is complete, thus preventing bch_writeback_thread
from proceeding prematurely.
See this thread:
http://thread.gmane.org/gmane.linux.kernel.bcache.devel/3453
Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net> Tested-by: Marc MERLIN <marc@merlins.org> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix null pointer dereference by changing register_cache() to return an int
instead of being void. This allows it to return -ENOMEM or -ENODEV and
enables upper layers to handle the OOM case without NULL pointer issues.
See this thread:
http://thread.gmane.org/gmane.linux.kernel.bcache.devel/3521
Fixes this error:
gargamel:/sys/block/md5/bcache# echo /dev/sdh2 > /sys/fs/bcache/register
Let the target core check task existence instead of the SRP target
driver. Additionally, let the target core check the validity of the
task management request instead of the ib_srpt driver.
On the umount path, jbd2_journal_destroy() writes the latest transaction ID
(->j_tail_sequence) to be used at the next mount.
The bug is that ->j_tail_sequence does not hold the latest transaction ID
in some cases. So, at the next mount, there is a chance of conflict with
remaining (not yet overwritten) transactions.
mount (id=10)
write transaction (id=11)
write transaction (id=12)
umount (id=10) <= the bug doesn't write latest ID
mount (id=10)
write transaction (id=11)
crash
mount
[recovery process]
transaction (id=11)
transaction (id=12) <= valid transaction ID, but old commit
must not replay
Like above, this bug becomes the cause of recovery failure, or FS
corruption.
So why doesn't ->j_tail_sequence point to the latest ID?
Because if checkpoint transactions were reclaimed by memory pressure
(i.e. bdev_try_to_free_page()), then ->j_tail_sequence is not updated.
(Another case is when __jbd2_journal_clean_checkpoint_list() is called
with an empty transaction.)
So in the above cases, ->j_tail_sequence does not point to the latest
transaction ID on the umount path. Plus, the REQ_FLUSH for the checkpoint
is not done either.
So, to fix this problem with minimum changes, this patch updates
->j_tail_sequence and issues a REQ_FLUSH. (With more complex changes,
some optimizations would be possible to avoid unnecessary REQ_FLUSH,
for example.)
Use the local uapi headers to keep in sync with "recently" added #define's
(e.g. VSS_OP_REGISTER1).
Fixes: 3eb2094c59e8 ("Adding makefile for tools/hv") Signed-off-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cirrus HD-audio driver may adjust GPIO pins for EAPD dynamically
depending on the jack plug state. This works fine for the auto-mute
mode where the speaker gets muted upon the HP jack plug. OTOH, when
the auto-mute mode is off, this turns off the EAPD unexpectedly
depending on the jack state, which results in the silent speaker
output.
This patch fixes the silent speaker output issue by setting GPIO bits
constantly when the auto-mute mode is off.
This Lenovo ThinkCentre AIO also uses Line2 as mic mute button and
uses GPIO2 to control the mic mute led, so applying this quirk can
make both the button and led work.
The current Intel HDMI codec driver supports only three fixed ports
from port B to port D. However, i915 driver may assign a DP on other
ports, e.g. port A, when no eDP is used. This incompatibility is
caught later at pin_nid_to_pin_index() and results in a warning
message like "HDMI: pin nid 4 not registered" at each time.
This patch filters out such invalid events beforehand, so that the
kernel won't be too grumbling.
The commit [d507941beb1e: ALSA: pcm: Correct PCM BUG error message]
made the warning prefix back to "BUG:" due to its previous wrong
prefix. But a kernel message containing "BUG:" seems taken as an Oops
message wrongly by some brain-dead daemons, and it annoys users in the
end. Instead of teaching daemons, change the string again to a more
reasonable one.
Allow device initialization to finish gracefully when it is in
FTL rebuild failure state. Also, recover device out of this state
after successfully secure erasing it.
When FTL rebuild is in progress, alloc_disk() initializes the disk
but device node will be created by add_disk() only after successful
completion of FTL rebuild. So, skip deletion of device node in
removal path when FTL rebuild is in progress.
Signed-off-by: Selvan Mani <smani@micron.com> Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove setting and clearing MTIP_PF_EH_ACTIVE_BIT flag in
mtip_handle_tfe() as they are redundant. Also avoid waking
up service thread from mtip_handle_tfe() because it is
already woken up in case of taskfile error.
During the recent vb2_buffer restructuring, the calculation of the
buffer payload reported to userspace was accidentally broken for the
first encoded frame, counting only the length of the headers.
This patch re-adds the length of the actual frame data.
Fixes: 2d7007153f0c ("[media] media: videobuf2: Restructure vb2_buffer") Reported-by: Michael Olbrich <m.olbrich@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Tested-by: Jan Luebbe <jlu@pengutronix.de> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On my bttv card "Hauppauge WinTV [card=10]" capturing in YV12 fmt at max
size results in a solid green rectangle being captured (all colors 0 in
YUV).
This turns out to be caused by max-width (924) not being a multiple of 16.
We've likely never hit this problem before since normally xawtv / tvtime,
etc. will prefer packed pixel formats. But when using a video card which
is using xf86-video-modesetting + glamor, only planar XVideo fmts are
available, and xawtv will chose a matching capture format to avoid needing
to do conversion, triggering the solid green window problem.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The V4L2_CID_TX_EDID_PRESENT control reports if an EDID is present.
The adv7511 however still reported the EDID present after disconnecting
the HDMI cable. Fix the logic regarding this control. And when the EDID
is disconnected also call ADV7511_EDID_DETECT to notify the bridge driver.
This was also missing.
Some UART HW has a single register combining UART_DLL/UART_DLM
(this was probably forgotten in the change that introduced the
callbacks, commit b32b19b8ffc05cbd3bf91c65e205f6a912ca15d9)
Fixes: b32b19b8ffc0 ("[SERIAL] 8250: set divisor register correctly ...") Signed-off-by: Sebastian Frias <sf84@laposte.net> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The N_IRDA line discipline may access the previous line discipline's closed
and already-freed private data on open [1].
The tty->disc_data field _never_ refers to valid data on entry to the
line discipline's open() method. Rather, the ldisc is expected to
initialize that field for its own use for the lifetime of the instance
(ie. from open() to close() only).
[1]
==================================================================
BUG: KASAN: use-after-free in irtty_open+0x422/0x550 at addr ffff8800331dd068
Read of size 4 by task a.out/13960
=============================================================================
BUG kmalloc-512 (Tainted: G B ): kasan: bad access detected
-----------------------------------------------------------------------------
...
Call Trace:
[<ffffffff815fa2ae>] __asan_report_load4_noabort+0x3e/0x40 mm/kasan/report.c:279
[<ffffffff836938a2>] irtty_open+0x422/0x550 drivers/net/irda/irtty-sir.c:436
[<ffffffff829f1b80>] tty_ldisc_open.isra.2+0x60/0xa0 drivers/tty/tty_ldisc.c:447
[<ffffffff829f21c0>] tty_set_ldisc+0x1a0/0x940 drivers/tty/tty_ldisc.c:567
[< inline >] tiocsetd drivers/tty/tty_io.c:2650
[<ffffffff829da49e>] tty_ioctl+0xace/0x1fd0 drivers/tty/tty_io.c:2883
[< inline >] vfs_ioctl fs/ioctl.c:43
[<ffffffff816708ac>] do_vfs_ioctl+0x57c/0xe60 fs/ioctl.c:607
[< inline >] SYSC_ioctl fs/ioctl.c:622
[<ffffffff81671204>] SyS_ioctl+0x74/0x80 fs/ioctl.c:613
[<ffffffff852a7876>] entry_SYSCALL_64_fastpath+0x16/0x7a
commit 9ce119f318ba ("tty: Fix GPF in flush_to_ldisc()") fixed a
GPF caused by a line discipline which does not define a receive_buf()
method.
However, the vt driver (and speakup driver also) pushes selection
data directly to the line discipline receive_buf() method via
tty_ldisc_receive_buf(). Fix the same problem in tty_ldisc_receive_buf().
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On error, platform_device_register_simple() returns an ERR_PTR() value,
so a check for NULL always fails. The change corrects the check itself and
propagates the returned error upwards.
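A sketch of the corrected check and error propagation (the device name
string is illustrative):

    struct platform_device *pdev;

    pdev = platform_device_register_simple("ion-test", -1, NULL, 0);
    if (IS_ERR(pdev))
        return PTR_ERR(pdev);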
Fixes: 81fb0b901397 ("staging: android: ion_test: unregister the platform device") Signed-off-by: Vladimir Zapolskiy <vz@mleia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hid_ignore_special_drivers works fine until hid_scan_report autodetects and
reassigns devices (for hid-multitouch, hid-microsoft and hid-rmi).
Simplify the handling of the parameter: if it is there, use hid-generic, no
matter what, and if not, scan the device or rely on the hid_have_special_driver
table.
This was detected while trying to disable hid-multitouch on a Surface Pro cover
which prevented to use the keyboard.
Even though hid_hw_* checks that passed in data_len is less than
HID_MAX_BUFFER_SIZE it is not enough, as i2c-hid does not necessarily
allocate buffers of HID_MAX_BUFFER_SIZE but rather checks all device
reports and selects the largest size. In-kernel users normally just send as
much data as the report needs, so there is no problem, but hidraw users can do
whatever they please:
The patch that added Logitech Dual Action gamepad support forgot to
update the special driver list for the device. This caused the logitech
driver not to probe unless kernel module load order was favorable.
Update the special driver list to fix it. Thanks to Simon Wood for the
idea.
If the initialization fails before tpm_chip_register(), put_device()
will not be called, which causes the release callback not to be called.
This patch fixes the issue by adding put_device() to the devres list of
the parent device.
Fixes: 313d21eeab ("tpm: device class for tpm") Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>