Hiral Shah [Fri, 18 Apr 2014 19:28:17 +0000 (12:28 -0700)]
fnic: NoFIP solicitation frame in NONFIP mode and changed IO Throttle count
This patch contains following three minor fixes.
1) During Probe, fnic was sending FIP solicitation in Non FIP mode which is not
expected, setting the internal fip state to Non FIP mode explicitly, avoids
sending FIP frame.
2) When target goes offline, all outstanding IOs belong to the target will be
terminated by driver, If the termination count is high, then it influences
firmware responsiveness. To improve the responsiveness, default IO throttle
count is reduced to 256.
3) Accessing Virtual Fabric Id (vfid) and fc_map of Fibre-Channel Forwarder(FCF)
is invalid in fnic driver when Clear Virtual Link(CVL) is received prior to
receiving flogi reject from switch. As CVL clears all FCFs.
Signed-off-by: Hiral Shah <hishah@cisco.com> Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com> Signed-off-by: Narsimhulu Musini <nmusini@cisco.com> Signed-off-by: Anantha Tungarakodi <atungara@cisco.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Chad Dupuis [Fri, 11 Apr 2014 20:54:46 +0000 (16:54 -0400)]
qla2xxx: Remove wait for online from host reset handler.
This can block progress of the SCSI error handler thread and cause long I/O
outages. Instead just fail immediately if another reset is going on or we are
accessing flash memory.
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Chad Dupuis [Fri, 11 Apr 2014 20:54:45 +0000 (16:54 -0400)]
qla2xxx: Do logins from a chip reset in DPC thread instead of the error handler thread.
Attempting to do any logins from the SCSI reset handler can lead to a deadlock
scenario if a rport times out and the FC transport layer. Move doing any port
logins to the DPC thread so as not to impede the progress of the SCSI error
handler thread and avoid deadlock situations.
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Fix double free problem within qla2xxx driver where
current code prematurely free qla_tgt_cmd while firmware
still has the command. When firmware release the command
after abort, the code attempt a second free as part of
command completion processing.
When TCM start the free process, NULL pointer was hit.
Add support for T10-Dif for Target Mode to qla driver.
The driver will look for firmware attribute that support
this feature. When the feature is present, the capabilities
will be report to TCM layer.
Add CTIO CRC2 iocb to build T10-Dif commands.
Add support routines to process good & error cases.
Signed-off-by: Quinn Tran <quinn.tran@qlogic.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com> Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Chad Dupuis [Fri, 11 Apr 2014 20:54:31 +0000 (16:54 -0400)]
qla2xxx: Avoid escalating the SCSI error handler if the command is not found in firmware.
If the firmware cannot find the command specified then return SUCCESS to the
error handler so as not to needlessly escalate. Also cleanup the resources for
the command since we cannot expect the original command to returned in
interrupt context.
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com> Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Tomas Henzl [Tue, 1 Apr 2014 11:59:50 +0000 (13:59 +0200)]
megaraid_sas: fix a small problem when reading state value from hw
When the driver reads state values from the hw it might happen that
different values are read in subsequent reads and this can cause problems,
this may lead to a timeout in this function and a non working adapter.
Cc: Adam Radford <aradford@gmail.com> Signed-off-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Shintaro Minemoto <fj3207hq@aa.jp.fujitsu.com> Acked-by: Sumit Saxena <sumit.saxena@lsi.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
Linus Torvalds [Fri, 9 May 2014 19:24:20 +0000 (12:24 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
"A somewhat unpleasantly large collection of small fixes. The big ones
are the __visible tree sweep and a fix for 'earlyprintk=efi,keep'. It
was using __init functions with predictably suboptimal results.
Another key fix is a build fix which would produce output that simply
would not decompress correctly in some configuration, due to the
existing Makefiles picking up an unfortunate local label and mistaking
it for the global symbol _end.
Additional fixes include the handling of 64-bit numbers when setting
the vdso data page (a latent bug which became manifest when i386
started exporting a vdso with time functions), a fix to the new MSR
manipulation accessors which would cause features to not get properly
unblocked, a build fix for 32-bit userland, and a few new platform
quirks"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, vdso, time: Cast tv_nsec to u64 for proper shifting in update_vsyscall()
x86: Fix typo in MSR_IA32_MISC_ENABLE_LIMIT_CPUID macro
x86: Fix typo preventing msr_set/clear_bit from having an effect
x86/intel: Add quirk to disable HPET for the Baytrail platform
x86/hpet: Make boot_hpet_disable extern
x86-64, build: Fix stack protector Makefile breakage with 32-bit userland
x86/reboot: Add reboot quirk for Certec BPC600
asmlinkage: Add explicit __visible to drivers/*, lib/*, kernel/*
asmlinkage, x86: Add explicit __visible to arch/x86/*
asmlinkage: Revert "lto: Make asmlinkage __visible"
x86, build: Don't get confused by local symbols
x86/efi: earlyprintk=efi,keep fix
Boris Ostrovsky [Fri, 9 May 2014 15:11:27 +0000 (11:11 -0400)]
x86, vdso, time: Cast tv_nsec to u64 for proper shifting in update_vsyscall()
With tk->wall_to_monotonic.tv_nsec being a 32-bit value on 32-bit
systems, (tk->wall_to_monotonic.tv_nsec << tk->shift) in update_vsyscall()
may lose upper bits or, worse, add them since compiler will do this:
(u64)(tk->wall_to_monotonic.tv_nsec << tk->shift)
instead of
((u64)tk->wall_to_monotonic.tv_nsec << tk->shift)
So if, for example, tv_nsec is 0x800000 and shift is 8 we will end up
with 0xffffffff80000000 instead of 0x80000000. And then we are stuck in
the subsequent 'while' loop.
Andres Freund [Fri, 9 May 2014 01:29:16 +0000 (03:29 +0200)]
x86: Fix typo preventing msr_set/clear_bit from having an effect
Due to a typo the msr accessor function introduced in 22085a66c2fab6cf9b9393c056a3600a6b4735de didn't have any lasting
effects because they accidentally wrote the old value back.
After c0a639ad0bc6b178b46996bd1f821a04643e2bde this at the very least
this causes cpuid limits not to be lifted on some cpus leading to
missing capabilities for those.
Linus Torvalds [Fri, 9 May 2014 02:20:45 +0000 (19:20 -0700)]
Merge tag 'xfs-for-linus-3.15-rc5' of git://oss.sgi.com/xfs/xfs
Pull xfs fixes from Dave Chinner:
"The main fix is adding support for default ACLs on O_TMPFILE opened
inodes to bring XFS into line with other filesystems. Metadata CRCs
are now also considered well enough tested to be fully supported, so
we're removing the shouty warnings issued at mount time for
filesystems with that format. And there's transaction block
reservation overrun fix.
Summary:
- fix a remote attribute size calculation bug that leads to a
transaction overrun
- add default ACLs to O_TMPFILE files
- Remove the EXPERIMENTAL tag from filesystems with metadata CRC
support"
* tag 'xfs-for-linus-3.15-rc5' of git://oss.sgi.com/xfs/xfs:
xfs: remote attribute overwrite causes transaction overrun
xfs: initialize default acls for ->tmpfile()
xfs: fully support v5 format filesystems
Linus Torvalds [Thu, 8 May 2014 21:17:13 +0000 (14:17 -0700)]
Merge tag 'trace-fixes-v3.15-rc4-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"This contains two fixes.
The first is a long standing bug that causes bogus data to show up in
the refcnt field of the module_refcnt tracepoint. It was introduced
by a merge conflict resolution back in 2.6.35-rc days.
The result should be 'refcnt = incs - decs', but instead it did
'refcnt = incs + decs'.
The second fix is to a bug that was introduced in this merge window
that allowed for a tracepoint funcs pointer to be used after it was
freed. Moving the location of where the probes are released solved
the problem"
* tag 'trace-fixes-v3.15-rc4-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracepoint: Fix use of tracepoint funcs after rcu free
trace: module: Maintain a valid user count
Linus Torvalds [Thu, 8 May 2014 21:06:45 +0000 (14:06 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input subsystem fixes from Dmitry Torokhov:
"Just a few fixups to various drivers"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elantech - fix touchpad initialization on Gigabyte U2442
Input: tca8418 - fix loading this driver as a module from a device tree
Input: bma150 - extend chip detection for bma180
Input: atkbd - fix keyboard not working on some LG laptops
Input: synaptics - add min/max quirk for ThinkPad Edge E431
Linus Torvalds [Thu, 8 May 2014 20:51:53 +0000 (13:51 -0700)]
Merge tag 'sound-3.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A bunch of small fixes for USB-audio and HD-audio, where most of them
are for regressions: USB-audio PM fixes, ratelimit annoyance fix, HDMI
offline state fix, and a couple of device-specific quirks"
* tag 'sound-3.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - hdmi: Set converter channel count even without sink
ALSA: usb-audio: work around corrupted TEAC UD-H01 feedback data
ALSA: usb-audio: Fix deadlocks at resuming
ALSA: usb-audio: Save mixer status only once at suspend
ALSA: usb-audio: Prevent printk ratelimiting from spamming kernel log while DEBUG not defined
ALSA: hda - add headset mic detect quirk for a Dell laptop
Linus Torvalds [Thu, 8 May 2014 19:41:14 +0000 (12:41 -0700)]
Merge tag 'mfd-mmc-fixes-3.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull mmc/rtsx revert from Lee Jones.
* tag 'mfd-mmc-fixes-3.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
mmc: rtsx: Revert "mmc: rtsx: add support for pre_req and post_req"
tracepoint: Fix use of tracepoint funcs after rcu free
Commit de7b2973903c "tracepoint: Use struct pointer instead of name hash
for reg/unreg tracepoints" introduces a use after free by calling
release_probes on the old struct tracepoint array before the newly
allocated array is published with rcu_assign_pointer. There is a race
window where tracepoints (RCU readers) can perform a
"use-after-grace-period-after-free", which shows up as a GPF in
stress-tests.
Romain Izard [Tue, 4 Mar 2014 09:09:39 +0000 (10:09 +0100)]
trace: module: Maintain a valid user count
The replacement of the 'count' variable by two variables 'incs' and
'decs' to resolve some race conditions during module unloading was done
in parallel with some cleanup in the trace subsystem, and was integrated
as a merge.
Unfortunately, the formula for this replacement was wrong in the tracing
code, and the refcount in the traces was not usable as a result.
Use 'count = incs - decs' to compute the user count.
commit <mmc: rtsx: add support for pre_req and post_req> did use
mutex_unlock() in tasklet, but mutex_unlock() can't be used in
tasklet(atomic context). The driver needs to use mutex to avoid
concurrency, so we can't use tasklet here, the patch need to be
removed.
The spinlock host->lock and pcr->lock may deadlock, one way to solve
the deadlock is remove host->lock in sd_isr_done_transfer(), but if
using workqueue the we can avoid using the spinlock and also avoid
the problem.
Signed-off-by: Micky Ching <micky_ching@realsil.com.cn> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
HPET on some platform has accuracy problem. Making
"boot_hpet_disable" extern so that we can runtime disable
the HPET timer by using quirk to check the platform.
Linus Torvalds [Wed, 7 May 2014 23:07:58 +0000 (16:07 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- fix a small bug in computation of report size, which might cause some
devices (Atmel touchpad found on the Samsung Ativ 9) to reject
reports with otherwise valid contents
- a few device-ID specific quirks/additions piggy-backing on top of it
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: sensor-hub: Add in quirk for sensor hub in Lenovo Ideapad Yogas
HID: add NO_INIT_REPORTS quirk for Synaptics Touch Pad V 103S
HID: core: fix computation of the report size
HID: multitouch: add support of EliteGroup 05D8 panels
Linus Torvalds [Wed, 7 May 2014 22:47:47 +0000 (15:47 -0700)]
Merge branch 'drm-radeon-mullins' of git://people.freedesktop.org/~airlied/linux
Pull radeon mullins support from Dave Airlie:
"This is support for the new AMD mullins APU, it pretty much just adds
support to the driver in the all the right places, and is pretty low
risk wrt other GPUs"
Oh well. I guess it ends up fitting under "support new hardware" for
merging late.
* 'drm-radeon-mullins' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: add pci ids for Mullins
drm/radeon: add Mullins VCE support
drm/radeon: modesetting updates for Mullins.
drm/radeon: dpm updates for KV/KB
drm/radeon: add Mullins dpm support.
drm/radeon: add Mullins UVD support.
drm/radeon: update cik init for Mullins.
drm/radeon: add Mullins chip family
Linus Torvalds [Wed, 7 May 2014 22:45:13 +0000 (15:45 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"radeon, i915 and nouveau fixes, all fixes for regressions or black
screens, or possible oopses"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: lower the ref * post PLL maximum
drm/radeon: check that we have a clock before PLL setup
drm/radeon: drm/radeon: add missing radeon_semaphore_free to error path
drm/radeon: Fix num_banks calculation for SI
agp: info leak in agpioc_info_wrap()
drm/gm107/gr: bump attrib cb size quite a bit
drm/nouveau: fix another lock unbalance in nouveau_crtc_page_flip
drm/nouveau/bios: fix shadowing from PROM on big-endian systems
drm/nouveau/acpi: allow non-optimus setups to load vbios from acpi
drm/radeon/dp: check for errors in dpcd reads
drm/radeon: avoid high jitter with small frac divs
drm/radeon: check buffer relocation offset
drm/radeon: use pflip irq on R600+ v2
drm/radeon/uvd: use lower clocks on old UVD to boot v2
drm/i915: don't try DP_LINK_BW_5_4 on HSW ULX
drm/i915: Sanitize the enable_ppgtt module option once
drm/i915: Break encoder->crtc link separately in intel_sanitize_crtc()
George Spelvin [Wed, 7 May 2014 21:05:52 +0000 (17:05 -0400)]
x86-64, build: Fix stack protector Makefile breakage with 32-bit userland
If you are using a 64-bit kernel with 32-bit userland, then
scripts/gcc-x86_64-has-stack-protector.sh invokes 32-bit gcc
with -mcmodel=kernel, which produces:
<stdin>:1:0: error: code model 'kernel' not supported in the 32 bit mode
and trips the "broken compiler" test at arch/x86/Makefile:120.
There are several places a fix is possible, but the following seems
cleanest. (But it's minimal; it would also be possible to factor
out a bunch of stuff from the two branches of the if.)
Dave Airlie [Tue, 6 May 2014 23:06:21 +0000 (09:06 +1000)]
Merge branch 'drm-nouveau-next' of git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes
nouveau fixes.
* 'drm-nouveau-next' of git://anongit.freedesktop.org/git/nouveau/linux-2.6:
drm/gm107/gr: bump attrib cb size quite a bit
drm/nouveau: fix another lock unbalance in nouveau_crtc_page_flip
drm/nouveau/bios: fix shadowing from PROM on big-endian systems
drm/nouveau/acpi: allow non-optimus setups to load vbios from acpi
Dave Airlie [Tue, 6 May 2014 22:56:03 +0000 (08:56 +1000)]
Merge tag 'topc/core-stuff-2014-05-05' of git://anongit.freedesktop.org/drm-intel into drm-fixes
Some more i915 fixes. There's still some DP issues we are looking into,
but wanted to get these moving.
* tag 'topc/core-stuff-2014-05-05' of git://anongit.freedesktop.org/drm-intel:
drm/i915: don't try DP_LINK_BW_5_4 on HSW ULX
drm/i915: Sanitize the enable_ppgtt module option once
drm/i915: Break encoder->crtc link separately in intel_sanitize_crtc()
Dave Airlie [Tue, 6 May 2014 22:55:27 +0000 (08:55 +1000)]
Merge branch 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux into drm-fixes
this is the next pull quested for stashed up radeon fixes for 3.15. As discussed support for Mullins was separated out and will get it's own pull request. Remaining highlights are:
1. Some more patches to better handle PLL limits.
2. Making use of the PFLIP additional to the VBLANK interrupt, otherwise we sometimes miss page flip events.
3. Fix for the UVD command stream parser.
4. Fix for bootup UVD clocks on RV7xx systems.
5. Adding missing error check on dpcd reads.
6. Fixes number of banks calculation on SI.
* 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux:
drm/radeon: lower the ref * post PLL maximum
drm/radeon: check that we have a clock before PLL setup
drm/radeon: drm/radeon: add missing radeon_semaphore_free to error path
drm/radeon: Fix num_banks calculation for SI
drm/radeon/dp: check for errors in dpcd reads
drm/radeon: avoid high jitter with small frac divs
drm/radeon: check buffer relocation offset
drm/radeon: use pflip irq on R600+ v2
drm/radeon/uvd: use lower clocks on old UVD to boot v2
Linus Torvalds [Tue, 6 May 2014 20:07:41 +0000 (13:07 -0700)]
Merge branch 'akpm' (incoming from Andrew)
Merge misc fixes from Andrew Morton:
"13 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
agp: info leak in agpioc_info_wrap()
fs/affs/super.c: bugfix / double free
fanotify: fix -EOVERFLOW with large files on 64-bit
slub: use sysfs'es release mechanism for kmem_cache
revert "mm: vmscan: do not swap anon pages just because free+file is low"
autofs: fix lockref lookup
mm: filemap: update find_get_pages_tag() to deal with shadow entries
mm/compaction: make isolate_freepages start at pageblock boundary
MAINTAINERS: zswap/zbud: change maintainer email address
mm/page-writeback.c: fix divide by zero in pos_ratio_polynom
hugetlb: ensure hugepage access is denied if hugepages are not supported
slub: fix memcg_propagate_slab_attrs
drivers/rtc/rtc-pcf8523.c: fix month definition
Dan Carpenter [Tue, 6 May 2014 19:50:12 +0000 (12:50 -0700)]
agp: info leak in agpioc_info_wrap()
On 64 bit systems the agp_info struct has a 4 byte hole between
->agp_mode and ->aper_base. We need to clear it to avoid disclosing
stack information to userspace.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 842a859db26b ("affs: use ->kill_sb() to simplify ->put_super()
and failure exits of ->mount()") adds .kill_sb which frees sbi but
doesn't remove sbi free in case of parse_options error causing double
free+random crash.
Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> [3.14.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Woods [Tue, 6 May 2014 19:50:10 +0000 (12:50 -0700)]
fanotify: fix -EOVERFLOW with large files on 64-bit
On 64-bit systems, O_LARGEFILE is automatically added to flags inside
the open() syscall (also openat(), blkdev_open(), etc). Userspace
therefore defines O_LARGEFILE to be 0 - you can use it, but it's a
no-op. Everything should be O_LARGEFILE by default.
But: when fanotify does create_fd() it uses dentry_open(), which skips
all that. And userspace can't set O_LARGEFILE in fanotify_init()
because it's defined to 0. So if fanotify gets an event regarding a
large file, the read() will just fail with -EOVERFLOW.
This patch adds O_LARGEFILE to fanotify_init()'s event_f_flags on 64-bit
systems, using the same test as open()/openat()/etc.
Signed-off-by: Will Woods <wwoods@redhat.com> Acked-by: Eric Paris <eparis@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sysfs has a release mechanism. Use that to release the kmem_cache
structure if CONFIG_SYSFS is enabled.
Only slub is changed - slab currently only supports /proc/slabinfo and
not /sys/kernel/slab/*. We talked about adding that and someone was
working on it.
[akpm@linux-foundation.org: fix CONFIG_SYSFS=n build]
[akpm@linux-foundation.org: fix CONFIG_SYSFS=n build even more] Signed-off-by: Christoph Lameter <cl@linux.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Tested-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Greg KH <greg@kroah.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Pekka Enberg <penberg@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Tue, 6 May 2014 19:50:07 +0000 (12:50 -0700)]
revert "mm: vmscan: do not swap anon pages just because free+file is low"
This reverts commit 0bf1457f0cfc ("mm: vmscan: do not swap anon pages
just because free+file is low") because it introduced a regression in
mostly-anonymous workloads, where reclaim would become ineffective and
trap every allocating task in direct reclaim.
The problem is that there is a runaway feedback loop in the scan balance
between file and anon, where the balance tips heavily towards a tiny
thrashing file LRU and anonymous pages are no longer being looked at.
The commit in question removed the safe guard that would detect such
situations and respond with forced anonymous reclaim.
This commit was part of a series to fix premature swapping in loads with
relatively little cache, and while it made a small difference, the cure
is obviously worse than the disease. Revert it.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Rafael Aquini <aquini@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: <stable@kernel.org> [3.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ian Kent [Tue, 6 May 2014 19:50:06 +0000 (12:50 -0700)]
autofs: fix lockref lookup
autofs needs to be able to see private data dentry flags for its dentrys
that are being created but not yet hashed and for its dentrys that have
been rmdir()ed but not yet freed. It needs to do this so it can block
processes in these states until a status has been returned to indicate
the given operation is complete.
It does this by keeping two lists, active and expring, of dentrys in
this state and uses ->d_release() to keep them stable while it checks
the reference count to determine if they should be used.
But with the recent lockref changes dentrys being freed sometimes don't
transition to a reference count of 0 before being freed so autofs can
occassionally use a dentry that is invalid which can lead to a panic.
Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1343 /*
1344 * This function is never used on a shmem/tmpfs
1345 * mapping, so a swap entry won't be found here.
1346 */
1347 BUG();
After commit 0cd6144aadd2 ("mm + fs: prepare for non-page entries in
page cache radix trees") this comment and BUG() are out of date because
exceptional entries can now appear in all mappings - as shadows of
recently evicted pages.
However, as Hugh Dickins notes,
"it is truly surprising for a PAGECACHE_TAG_WRITEBACK (and probably
any other PAGECACHE_TAG_*) to appear on an exceptional entry.
I expect it comes down to an occasional race in RCU lookup of the
radix_tree: lacking absolute synchronization, we might sometimes
catch an exceptional entry, with the tag which really belongs with
the unexceptional entry which was there an instant before."
And indeed, not only is the tree walk lockless, the tags are also read
in chunks, one radix tree node at a time. There is plenty of time for
page reclaim to swoop in and replace a page that was already looked up
as tagged with a shadow entry.
Remove the BUG() and update the comment. While reviewing all other
lookup sites for whether they properly deal with shadow entries of
evicted pages, update all the comments and fix memcg file charge moving
to not miss shmem/tmpfs swapcache pages.
Fixes: 0cd6144aadd2 ("mm + fs: prepare for non-page entries in page cache radix trees") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Dave Jones <davej@redhat.com> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vlastimil Babka [Tue, 6 May 2014 19:50:03 +0000 (12:50 -0700)]
mm/compaction: make isolate_freepages start at pageblock boundary
The compaction freepage scanner implementation in isolate_freepages()
starts by taking the current cc->free_pfn value as the first pfn. In a
for loop, it scans from this first pfn to the end of the pageblock, and
then subtracts pageblock_nr_pages from the first pfn to obtain the first
pfn for the next for loop iteration.
This means that when cc->free_pfn starts at offset X rather than being
aligned on pageblock boundary, the scanner will start at offset X in all
scanned pageblock, ignoring potentially many free pages. Currently this
can happen when
a) zone's end pfn is not pageblock aligned, or
b) through zone->compact_cached_free_pfn with CONFIG_HOLES_IN_ZONE
enabled and a hole spanning the beginning of a pageblock
This patch fixes the problem by aligning the initial pfn in
isolate_freepages() to pageblock boundary. This also permits replacing
the end-of-pageblock alignment within the for loop with a simple
pageblock_nr_pages increment.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Heesub Shin <heesub.shin@samsung.com> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Mel Gorman <mgorman@suse.de> Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Christoph Lameter <cl@linux.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Dongjun Shin <d.j.shin@samsung.com> Cc: Sunghwan Yun <sunghwan.yun@samsung.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rik van Riel [Tue, 6 May 2014 19:50:01 +0000 (12:50 -0700)]
mm/page-writeback.c: fix divide by zero in pos_ratio_polynom
It is possible for "limit - setpoint + 1" to equal zero, after getting
truncated to a 32 bit variable, and resulting in a divide by zero error.
Using the fully 64 bit divide functions avoids this problem. It also
will cause pos_ratio_polynom() to return the correct value when
(setpoint - limit) exceeds 2^32.
Also uninline pos_ratio_polynom, at Andrew's request.
hugetlb: ensure hugepage access is denied if hugepages are not supported
Currently, I am seeing the following when I `mount -t hugetlbfs /none
/dev/hugetlbfs`, and then simply do a `ls /dev/hugetlbfs`. I think it's
related to the fact that hugetlbfs is properly not correctly setting
itself up in this state?:
Unable to handle kernel paging request for data at address 0x00000031
Faulting instruction address: 0xc000000000245710
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=2048 NUMA pSeries
....
In KVM guests on Power, in a guest not backed by hugepages, we see the
following:
HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
are not supported at boot-time, but this is only checked in
hugetlb_init(). Extract the check to a helper function, and use it in a
few relevant places.
This does make hugetlbfs not supported (not registered at all) in this
environment. I believe this is fine, as there are no valid hugepages
and that won't change at runtime.
[akpm@linux-foundation.org: use pr_info(), per Mel]
[akpm@linux-foundation.org: fix build when HPAGE_SHIFT is undefined] Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After creating a cache for a memcg we should initialize its sysfs attrs
with the values from its parent. That's what memcg_propagate_slab_attrs
is for. Currently it's broken - we clearly muddled root-vs-memcg caches
there. Let's fix it up.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 6 May 2014 19:22:20 +0000 (12:22 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"dcache fixes + kvfree() (uninlined, exported by mm/util.c) + posix_acl
bugfix from hch"
The dcache fixes are for a subtle LRU list corruption bug reported by
Miklos Szeredi, where people inside IBM saw list corruptions with the
LTP/host01 test.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
nick kvfree() from apparmor
posix_acl: handle NULL ACL in posix_acl_equiv_mode
dcache: don't need rcu in shrink_dentry_list()
more graceful recovery in umount_collect()
don't remove from shrink list in select_collect()
dentry_kill(): don't try to remove from shrink list
expand the call of dentry_lru_del() in dentry_kill()
new helper: dentry_free()
fold try_prune_one_dentry()
fold d_kill() and d_free()
fix races between __d_instantiate() and checks of dentry flags
posix_acl: handle NULL ACL in posix_acl_equiv_mode
Various filesystems don't bother checking for a NULL ACL in
posix_acl_equiv_mode, and thus can dereference a NULL pointer when it
gets passed one. This usually happens from the NFS server, as the ACL tools
never pass a NULL ACL, but instead of one representing the mode bits.
Instead of adding boilerplat to all filesystems put this check into one place,
which will allow us to remove the check from other filesystems as well later
on.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Ben Greear <greearb@candelatech.com> Reported-by: Marco Munderloh <munderl@tnt.uni-hannover.de>, Cc: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Tue, 6 May 2014 16:09:35 +0000 (09:09 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
"This adds ctime update in the new cached writeback mode and also
fixes/simplifies the mtime update handling. Support for rename flags
(aka renameat2) is also added to the userspace API"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: add renameat2 support
fuse: clear MS_I_VERSION
fuse: clear FUSE_I_CTIME_DIRTY flag on setattr
fuse: trust kernel i_ctime only
fuse: remove .update_time
fuse: allow ctime flushing to userspace
fuse: fuse: add time_gran to INIT_OUT
fuse: add .write_inode
fuse: clean up fsync
fuse: fuse: fallocate: use file_update_time()
fuse: update mtime on open(O_TRUNC) in atomic_o_trunc mode
fuse: update mtime on truncate(2)
fuse: do not use uninitialized i_mode
fuse: fix mtime update error in fsync
fuse: check fallocate mode
fuse: add __exit to fuse_ctl_cleanup
Pull sparc fixes from David Miller:
"I've been auditing the THP support on sparc64 and found several bugs,
hopefully most of which are fixed completely here.
Also an RT kernel locking fix from Kirill Tkhai"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Give more detailed information in {pgd,pmd}_ERROR() and kill pte_ERROR().
sparc64: Add basic validations to {pud,pmd}_bad().
sparc64: Use 'ILOG2_4MB' instead of constant '22'.
sparc64: Fix range check in kern_addr_valid().
sparc64: Fix top-level fault handling bugs.
sparc64: Handle 32-bit tasks properly in compute_effective_address().
sparc64: Don't use _PAGE_PRESENT in pte_modify() mask.
sparc64: Fix hex values in comment above pte_modify().
sparc64: Fix bugs in get_user_pages_fast() wrt. THP.
sparc64: Fix huge PMD invalidation.
sparc64: Fix executable bit testing in set_pmd_at() paths.
sparc64: Normalize NMI watchdog logging and behavior.
sparc64: Make itc_sync_lock raw
sparc64: Fix argument sign extension for compat_sys_futex().
Samuel Li [Wed, 30 Apr 2014 22:40:55 +0000 (18:40 -0400)]
drm/radeon: add pci ids for Mullins
Signed-off-by: Samuel Li <samuel.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
Leo Liu [Wed, 30 Apr 2014 22:40:54 +0000 (18:40 -0400)]
drm/radeon: add Mullins VCE support
VCE 2.0 just like the other CIK parts.
Signed-off-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
Samuel Li [Wed, 30 Apr 2014 22:40:53 +0000 (18:40 -0400)]
drm/radeon: modesetting updates for Mullins.
Uses the same code as Kabini.
Signed-off-by: Samuel Li <samuel.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
Alex Deucher [Wed, 30 Apr 2014 22:40:52 +0000 (18:40 -0400)]
drm/radeon: dpm updates for KV/KB
- Use vddc/sclk dep table for voltage if available
- Fix UVD DPM setup
- Patch voltage tables properly for non-UVD blocks
- Fix DPM + UVD/VCE on Mullins
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
Samuel Li [Wed, 30 Apr 2014 22:40:51 +0000 (18:40 -0400)]
drm/radeon: add Mullins dpm support.
Generic dpm support similar to Kabini. Mullins specific features
will be worked on later.
Signed-off-by: Samuel Li <samuel.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
Samuel Li [Wed, 30 Apr 2014 22:40:50 +0000 (18:40 -0400)]
drm/radeon: add Mullins UVD support.
Has same version of UVD as other CIK parts.
Signed-off-by: Samuel Li <samuel.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
Samuel Li [Wed, 30 Apr 2014 22:40:49 +0000 (18:40 -0400)]
drm/radeon: update cik init for Mullins.
Also add golden registers, update firmware loading functions.
Signed-off-by: Samuel Li <samuel.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
Samuel Li [Wed, 30 Apr 2014 22:40:48 +0000 (18:40 -0400)]
drm/radeon: add Mullins chip family
Mullins is a new CI-based APU.
Signed-off-by: Samuel Li <samuel.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
drm/radeon: drm/radeon: add missing radeon_semaphore_free to error path
It would appear this bug has been copy/pasted many times without being noticed.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Michel Dänzer [Tue, 22 Apr 2014 07:53:52 +0000 (16:53 +0900)]
drm/radeon: Fix num_banks calculation for SI
The way the tile mode array index was calculated only makes sense for
the CIK specific macrotile mode array. For SI, we need to use one of the
tile mode array indices reserved for displayable surfaces.
This happened to result in correct display most if not all of the time
because most of the SI tiling modes use the same number of banks.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>