f2fs: fix to build free nids from readaheaded nat pages
When there are not enough free nids in the free nid cache, we try to
readahead FREE_NID_PAGES (4) nat pages into the page cache of
meta_inode, and then read the nat entries in those nat pages to add
free nids to the free nid cache.
But when traversing the readaheaded nat pages in a loop, our exit
condition is wrong: one more nat page is scanned without readahead,
resulting in worse read performance.
This patch reads the correct number of nat pages to avoid the bad
performance.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
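The off-by-one described above can be illustrated with a small userspace sketch. It is not the f2fs code: the function names and the exact form of the two exit conditions are assumptions chosen to show how a loop bounded by a page count can scan one page too many.

```c
#include <assert.h>

/*
 * Hypothetical loop shape with the faulty exit test: the counter is
 * compared before it is incremented, so the loop body runs limit + 1 times.
 */
int pages_scanned_post_increment(int limit)
{
	int i = 0;
	int scanned = 0;

	assert(limit >= 0);
	while (1) {
		scanned++;		/* stands in for scanning one nat page */
		if (i++ == limit)
			break;
	}
	return scanned;
}

/* Same loop with the counter incremented before the comparison. */
int pages_scanned_pre_increment(int limit)
{
	int i = 0;
	int scanned = 0;

	assert(limit >= 0);
	while (1) {
		scanned++;
		if (++i >= limit)
			break;
	}
	return scanned;
}
```

With a limit of 4, the first variant scans 5 pages and the second scans 4, which matches the "one more nat page" symptom in the message.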
If we clear the inline data/dentry flag in handle_failed_inode, we will
fail to decrease the stat count of inline data/dentry in
f2fs_evict_inode because the flag is no longer set in the inode. So
remove the wrong clearing.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
f2fs: convert inline data before set atomic/volatile flag
In f2fs_ioc_start_{atomic,volatile}_write, if we fail to convert inline
data, we report an error to the user but still leave the atomic/volatile
flag set in the inode, which affects further writes to this file. Fix it.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
f2fs: fix to wait all atomic written pages writeback
This patch fixes the incorrect range (0, LONG_MAX) which is used
in ranged fsync. If we use LONG_MAX as the parameter indicating
the end of the file range we want to synchronize, then on 32-bit
architectures data after the 4GB offset may not be persisted in
storage after ->fsync returns.
Here, we change LONG_MAX to LLONG_MAX to fix this issue.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
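A minimal sketch of why the end value matters, assuming a 64-bit file offset type as the kernel's loff_t is. The names file_off_t, ILP32_LONG_MAX, FULL_RANGE_END and range_covers are hypothetical and exist only for this illustration. On an ILP32 target LONG_MAX is 2^31 - 1, so it is spelled out as a constant here to make the sketch behave the same on any host.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a 64-bit file offset type. */
typedef int64_t file_off_t;

/* What LONG_MAX evaluates to when long is 32 bits wide. */
#define ILP32_LONG_MAX	((file_off_t)INT32_MAX)

/* End of range that covers every representable offset. */
#define FULL_RANGE_END	((file_off_t)LLONG_MAX)

/* Does a ranged sync over [start, end] include the byte at pos? */
bool range_covers(file_off_t start, file_off_t end, file_off_t pos)
{
	assert(start <= end);
	return pos >= start && pos <= end;
}
```

For example, range_covers(0, ILP32_LONG_MAX, (file_off_t)4 << 30) is false, while the same call with FULL_RANGE_END as the end is true.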
f2fs: skip writing in ->writepages when no dirty pages exist
When flushing comes from background, if there are no dirty pages in the
mapping of the inode, we had better skip searching the mapping for
dirty pages to write back.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Tiezhu Yang [Fri, 17 Jul 2015 04:56:00 +0000 (12:56 +0800)]
f2fs: optimize f2fs_write_cache_pages
The two if statements both execute "goto continue_unlock", and their
conditions are true exactly when "step" and "is_cold_data(page)" are
both 0 or both 1. In other words, the branch is taken whenever the
value of "step" equals "is_cold_data(page)", so the two statements
can be merged into one and "goto continue_unlock" appears only once,
which removes the duplicated code.
Signed-off-by: Tiezhu Yang <kernelpatch@126.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
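The equivalence claimed above can be checked with a small standalone sketch. The two-branch form below is an assumed reconstruction of the pre-patch logic, not a quote from the f2fs source, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed original shape: two tests that lead to the same skip. */
bool skip_two_branches(int step, int is_cold)
{
	assert((step == 0 || step == 1) && (is_cold == 0 || is_cold == 1));
	if (step == 0 && !is_cold)
		return true;	/* would be "goto continue_unlock" */
	if (step == 1 && is_cold)
		return true;	/* would be "goto continue_unlock" */
	return false;
}

/* Merged shape: a single comparison covers both cases. */
bool skip_merged(int step, int is_cold)
{
	assert((step == 0 || step == 1) && (is_cold == 0 || is_cold == 1));
	return step == is_cold;
}
```

The merged form relies on both values being exactly 0 or 1; a helper that returns some other nonzero value for "cold" would need to be normalized first.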
f2fs: reduce region of cp_rwsem covered in f2fs_do_collapse
In f2fs_do_collapse, the region covered by cp_rwsem is large, since the
lock is held until all blocks have been shifted left. So if we collapse
a small area at the beginning of a large file, a checkpoint that wants
to grab the writer's lock of cp_rwsem will be delayed for a long time.
To avoid this, lock and unlock cp_rwsem around each shift operation.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Fan Li [Wed, 15 Jul 2015 10:05:17 +0000 (18:05 +0800)]
f2fs: add new interfaces for extent tree
Add a lookup and an insertion interface for the extent tree.
The new lookup returns the insert position and the prev/next
extents closest to the offset we look up when it finds no match.
The new insertion uses the above parameters to improve performance.
There are three possible insertions after the lookup in
f2fs_update_extent_tree. Two of them insert parts of the removed extent
back into the tree; since no merge happens during this process, the new
insertion skips the merge check in this scenario. The other insertion
inserts a new extent into the tree; the new insertion uses the prev/next
extents and the insert position to insert this extent directly, and
saves the time of searching down the tree.
As long as the tree remains unchanged between lookup and insertion, this
works fine. The new lookup will also be useful when adding
multi-block extent support to the insertion interface.
Signed-off-by: Fan li <fanofcode.li@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
f2fs: use atomic_t to record hit ratio info of extent cache
Variables recording extent cache hit ratio info were updated without
protection; this patch converts them to atomic_t for more accurate
stats.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
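As a userspace analogue of the change above, the sketch below keeps hit and total counters in C11 atomics so that concurrent increments are not lost. It uses <stdatomic.h> rather than the kernel's atomic_t API, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Counters that may be bumped from several threads at once. */
struct extent_cache_stats {
	atomic_int total;	/* all lookups */
	atomic_int hit;		/* lookups served from the cache */
};

void stat_inc_total(struct extent_cache_stats *s)
{
	assert(s != NULL);
	atomic_fetch_add(&s->total, 1);
}

void stat_inc_hit(struct extent_cache_stats *s)
{
	assert(s != NULL);
	atomic_fetch_add(&s->hit, 1);
}

/* Hit ratio in percent, or 0 when nothing has been looked up yet. */
int hit_ratio_percent(struct extent_cache_stats *s)
{
	int total, hit;

	assert(s != NULL);
	total = atomic_load(&s->total);
	hit = atomic_load(&s->hit);
	return total ? hit * 100 / total : 0;
}
```

A plain int incremented with "++" from two threads can lose updates because the read, add and write are separate steps; the atomic read-modify-write avoids that.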
If there are gc'ed dirty pages and normal dirty pages in the mapping
of one inode, we might write them back alternately with discontinuous
block addresses, resulting in low performance.
This patch introduces f2fs_write_cache_pages with code copied from
write_cache_pages in mm/page-writeback.c.
In this function, we refactor the flow into two steps:
1) write back all cold type pages.
2) write back all non-cold type pages.
With this method, f2fs writes back dirty pages of the same temperature
in batches, so the blocks written out have more contiguous addresses,
can be merged as much as possible in the f2fs bio cache, and are less
likely to be submitted as small IOs to the block layer.
Test environment: 8g nokia sd card (very old sd card, but it shows
better effect when testing with this patch, and with a 32g kingston
sd card, I didn't see much more improvement).
Test step:
1. touch testfile;
2. truncate -s 512K testfile;
3. write all pages with odd index;
4. trigger gc by ioctl;
5. write all pages with even index;
6. time fsync testfile.
before:
real 0m0.402s
user 0m0.000s
sys 0m0.000s
after:
real 0m0.143s
user 0m0.004s
sys 0m0.004s
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
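The two-step flow described in this commit can be sketched in userspace as two passes over the same page list. This is an illustration only: struct page_info and two_pass_order are hypothetical names, and the real function works on the page cache rather than on an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for a dirty page: its index and its temperature. */
struct page_info {
	int index;
	bool is_cold;
};

/*
 * Record in out[] the order in which pages would be written:
 * pass 0 takes only cold pages, pass 1 takes only non-cold pages.
 * Returns the number of entries stored in out[].
 */
size_t two_pass_order(const struct page_info *pages, size_t n, int *out)
{
	size_t written = 0;

	assert(pages != NULL && out != NULL);
	for (int step = 0; step < 2; step++) {
		bool want_cold = (step == 0);

		for (size_t i = 0; i < n; i++) {
			if (pages[i].is_cold != want_cold)
				continue;	/* left for the other pass */
			out[written++] = pages[i].index;
		}
	}
	return written;
}
```

With pages 0..3 where the odd-indexed ones are cold, the resulting order is 1, 3, 0, 2: each temperature is written as one batch instead of alternating.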
This patch fixes ->setxattr to return the correct error number, as
reported by xfstests tests/generic/026 below:
generic/026 - output mismatch
--- tests/generic/026.out
+++ results/generic/026.out.bad
@@ -4,6 +4,6 @@
1 below acl max
acl max
1 above acl max
-chacl: cannot set access acl on "largeaclfile": Argument list too long
+chacl: cannot set access acl on "largeaclfile": Numerical result out of range
use 16 aces
use 17 aces
...
Ran: generic/026
Failures: generic/026
Failed 1 of 1 tests
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Previously, since 'commit 4531929e3922 ("f2fs: move grabing orphan
pages out of protection region")', write_orphan_inodes() has grabbed
all meta pages in a batch before using them under the spinlock, to
avoid the long delay of grabbing meta pages while holding the
spinlock.
Now, 'commit d6c67a4fee86 ("f2fs: revmove spin_lock for
write_orphan_inodes")' has removed the spinlock in write_orphan_inodes,
so the issue described above no longer exists. Move the grab operation
back to its original place for readability.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
With the cost-benefit method, background gc will consider an old section
with fewer valid blocks as a candidate victim; the old blocks in that
section are treated as cold data and are later moved into a cold segment.
But if the page being gc'ed is touched by the user through a buffered or
mmapped write, we should reset the page to non-cold, because it is more
likely to be updated again.
So add the clearing code for the missed 'mmap' case.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
When background gc is off, the only way to trigger gc is a forced gc
executed by an operation that needs to grab space on disk.
The condition for executing it is restrictive: forced gc runs only
when there are almost no free sections left for LFS allocation. This
is not reasonable for users who want to control gc triggering
themselves.
This patch introduces the F2FS_IOC_GARBAGE_COLLECT interface for
triggering garbage collection via ioctl. It gives users one more
option to trigger gc.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch moves extent cache related code from data.c into
extent_cache.c. Since the extent cache is an independent feature and its
code is not related to the rest of data.c, it is better to maintain it
in a separate place.
There is no functionality change, but several small coding style fixes
including:
* rename __drop_largest_extent to f2fs_drop_largest_extent for exporting;
* rename misspelled word 'untill' to 'until';
* remove unneeded 'return' in the end of f2fs_destroy_extent_tree().
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Fan Li [Wed, 8 Jul 2015 08:02:54 +0000 (16:02 +0800)]
f2fs: don't try to split extents shorter than F2FS_MIN_EXTENT_LEN
Since only parts of extents longer than F2FS_MIN_EXTENT_LEN are
kept in the extent cache after a split, extents already shorter than
F2FS_MIN_EXTENT_LEN need not be split at all.
Signed-off-by: Fan Li <fanofcode.li@samsung.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In ->writepages, we use the writepages mutex to serialize all block
address allocation and page submission pairs from different inodes.
This lets the delayed dirty pages of one inode be written as
contiguously as possible.
But there is one problem: we did not submit the current cached bio
inside the region protected by the writepages mutex, so there is a
small chance that we submit another thread's bio as below, resulting
in more bio splitting.
They are regarded as cold files since their filenames end with a
multimedia file extension, but this is wrong because we only match the
tail of the filename against the extension, not the whole name.
In this patch, we fix the format of a multimedia filename as:
"filename + '.' + extension", and mark a file as cold only if its
filename matches that format.
This change reduces the probability of wrongly marking a file as cold,
and also helps fs_mark's performance on f2fs a little.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
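The "filename + '.' + extension" rule above can be sketched as a suffix check that also requires a dot and a non-empty base name. This is an illustration under assumptions: has_extension is a hypothetical name, and the case-insensitive comparison is an assumption rather than something the message states.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true only if name has the form "<base>.<ext>" with a
 * non-empty base. A bare suffix match such as "dumpmp4" is rejected.
 */
bool has_extension(const char *name, const char *ext)
{
	size_t nlen, elen;

	assert(name != NULL && ext != NULL);
	nlen = strlen(name);
	elen = strlen(ext);

	/* Need at least one base character, the dot, and the extension. */
	if (nlen < elen + 2)
		return false;
	if (name[nlen - elen - 1] != '.')
		return false;

	/* Compare the tail with the extension, ignoring ASCII case. */
	for (size_t i = 0; i < elen; i++) {
		unsigned char a = (unsigned char)name[nlen - elen + i];
		unsigned char b = (unsigned char)ext[i];

		if (tolower(a) != tolower(b))
			return false;
	}
	return true;
}
```

For example, "movie.mp4" matches the extension "mp4", while "dumpmp4" and ".mp4" do not.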
This patch adds the missing trace file to the maintainership of f2fs,
which completes the list of files maintained under f2fs and lets
people find the correct mailing list with get_maintainer.pl when a
patch only touches the f2fs trace file.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Nicholas Krause [Wed, 1 Jul 2015 01:37:21 +0000 (21:37 -0400)]
f2fs: make the function check_dnode have a return type of bool and change it's name to is_alive
This makes the function check_dnode have a return type of bool
due to this particular function only ever returning either one
or zero as its return value and changes the name of the function
to is_alive in order to better explain this function's intended
work of checking if a dnode is still in use by the filesystem.
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
[Jaegeuk Kim: change the return value check for the renamed function] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Jaegeuk Kim [Sat, 20 Jun 2015 00:53:26 +0000 (17:53 -0700)]
f2fs: use extent_cache by default
We don't need to handle the duplicate extent information.
The integrated rule is:
- update on-disk extent with largest one tracked by in-memory extent_cache
- destroy extent_tree for the truncation case
- drop per-inode extent_cache by shrinker
Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Jaegeuk Kim [Tue, 16 Jun 2015 22:17:01 +0000 (15:17 -0700)]
f2fs: update on-disk extents even under extent_cache
Previously, f2fs_update_extent_cache() updates in-memory extent_cache all the
time, and then finally preserves its up-to-date extent into on-disk one during
f2fs_evict_inode.
But, in the following scenario:
1. mount
2. open & write an extent X
3. f2fs_evict_inode; on-disk extent is X
4. open & update the extent X with Y
5. sync; trigger checkpoint
6. power-cut
after power-on, f2fs should serve extent Y, but we have an on-disk extent X.
This causes a failure on xfstests/311.
Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Jaegeuk Kim [Tue, 23 Jun 2015 17:36:08 +0000 (10:36 -0700)]
f2fs: avoid to use failed inode immediately
Before iput is called, the inode number used by a bad inode can be
reassigned to another new inode, resulting in abnormal behavior on the
new inode.
This should not happen for the new inode.
Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Chao Yu [Mon, 29 Jun 2015 10:14:10 +0000 (18:14 +0800)]
f2fs: fix to record dirty page count for symlink
Dirty pages can exist in the mapping of a newly created symlink, but
previously we did not maintain the dirty page count for symlinks as we
do for regular files and directories, so the count we looked up could
be wrong.
This patch adds the missing dirty page counting for symlinks to fix
this issue.
Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Linus Torvalds [Tue, 4 Aug 2015 15:51:06 +0000 (08:51 -0700)]
Merge tag 'topic/mst-fixes-2015-08-04' of git://anongit.freedesktop.org/drm-intel
Pull drm mst fixes from Daniel Vetter:
"Special pull request for mst fixes since most of the patches touch
code outside of i915 proper. DRM parts have also been reviewed by
Thierry (nvidia) since Dave's enjoying vacations"
* tag 'topic/mst-fixes-2015-08-04' of git://anongit.freedesktop.org/drm-intel:
drm/atomic-helpers: Make encoder picking more robust
drm/dp-mst: Remove debug WARN_ON
drm/i915: Fixup dp mst encoder selection
drm/atomic-helper: Add an atomice best_encoder callback
Linus Torvalds [Tue, 4 Aug 2015 15:49:08 +0000 (08:49 -0700)]
Merge tag 'for-linus-4.2-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen bug fixes from David Vrabel:
- don't lose interrupts when offlining CPUs
- fix gntdev oops during unmap
- drop the balloon lock occasionally to allow domain create/destroy
* tag 'for-linus-4.2-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/events/fifo: Handle linked events when closing a port
xen: release lock occasionally during ballooning
xen/gntdevt: Fix race condition in gntdev_release()
Ross Lagerwall [Fri, 31 Jul 2015 13:30:42 +0000 (14:30 +0100)]
xen/events/fifo: Handle linked events when closing a port
An event channel bound to a CPU that was offlined may still be linked
on that CPU's queue. If this event channel is closed and reused,
subsequent events will be lost because the event channel is never
unlinked and thus cannot be linked onto the correct queue.
When a channel is closed and the event is still linked into a queue,
ensure that it is unlinked before completing.
If the CPU to which the event channel is bound is online, spin until the
event is handled by that CPU. If that CPU is offline, it can't handle
the event, so clear the event queue during the close, dropping the
events.
This fixes the missing interrupts (and subsequent disk stalls etc.)
when offlining a CPU.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Linus Torvalds [Tue, 4 Aug 2015 13:57:32 +0000 (06:57 -0700)]
Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild fixes from Michal Marek:
"Two fixes for kbuild:
- The new ARCH_{CPP,A,C}FLAGS variables are reset before including
the arch Makefile
- Fix calling make modules_install twice when module compression is
enabled"
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
Makefile: Force gzip and xz on module install
kbuild: Do not pick up ARCH_{CPP,A,C}FLAGS from the environment
Daniel Vetter [Mon, 3 Aug 2015 15:24:11 +0000 (17:24 +0200)]
drm/atomic-helpers: Make encoder picking more robust
We've had a few issues with atomic where subtle bugs in the encoder
picking logic led to accidental self-stealing of the encoder,
resulting in a NULL connector_state->crtc in update_connector_routing
and a subsequent oops.
Linus applied some duct-tape for an mst regression in
i915: temporary fix for DP MST docking station NULL pointer dereference
But that was incomplete (the code will still oops when debugging is
enabled) and mangled the state even further. So instead WARN and bail
out as the more future-proof option.
Cc: Theodore Ts'o <tytso@mit.edu> Cc: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Daniel Vetter [Mon, 3 Aug 2015 15:24:10 +0000 (17:24 +0200)]
drm/dp-mst: Remove debug WARN_ON
Apparently been in there since forever and fairly easy to hit when
hotplugging really fast. I can do that since my mst hub has a manual
button to flick the hpd line for reprobing. The resulting WARNING spam
isn't pretty.
Cc: Dave Airlie <airlied@gmail.com> Cc: stable@vger.kernel.org Reviewed-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Daniel Vetter [Mon, 3 Aug 2015 15:24:09 +0000 (17:24 +0200)]
drm/i915: Fixup dp mst encoder selection
In
commit 8c7b5ccb729870e606321b3703e2c2e698c49a95
Author: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Date: Tue Apr 21 17:13:19 2015 +0300
drm/i915: Use atomic helpers for computing changed flags
we've switched over to the atomic version to compute the
crtc->encoder->connector routing from the i915 variant. That one
relies upon the ->best_encoder callback, but the i915-private version
relied upon intel_find_encoder. Which didn't matter except for dp mst,
where the encoder depends upon the selected crtc.
Fix this functional bug by implementing a correct atomic-state based
encoder selector for dp mst.
Note that we can't get rid of the legacy best_encoder callback since
the fbdev emulation uses that still. That means it's incorrect there
still, but that's been the case ever since i915 dp mst support was
merged so not a regression. Best to fix that by converting fbdev over
to atomic too.
Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Daniel Vetter [Mon, 3 Aug 2015 15:24:08 +0000 (17:24 +0200)]
drm/atomic-helper: Add an atomice best_encoder callback
With legacy helpers all the routing was already set up when calling
best_encoder and so could be inspected. But with atomic it's staged,
hence we need a new atomic compliant callback for drivers which need
to inspect the requested state and can't just decide the best encoder
statically.
This is needed to fix up i915 dp mst where we need to pick the right
encoder depending upon the requested CRTC for the connector.
v2: Don't forget to amend the kerneldoc
Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Theodore Ts'o <tytso@mit.edu> Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Linus Torvalds [Mon, 3 Aug 2015 21:51:30 +0000 (14:51 -0700)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"A refcounting bugfix for the i2c-core, bugfixes for the generic bus
recovery algorithm and for its omap-user, making binary file
attributes for EEPROMs behave POSIX compliant, and a small typo fix
while we are here"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: fix leaked device refcount on of_find_i2c_* error path
i2c: Fix typo in i2c-bfin-twi.c
i2c: omap: fix bus recovery setup
i2c: core: only use set_scl for bus recovery after calling prepare_recovery
misc: eeprom: at24: clean up at24_bin_write()
i2c: slave eeprom: clean up sysfs bin attribute read()/write()
Linus Torvalds [Mon, 3 Aug 2015 18:09:07 +0000 (11:09 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph fixes from Sage Weil:
"There are two critical regression fixes for CephFS from Zheng, and an
RBD completion fix for layered images from Ilya"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: fix copyup completion race
ceph: always re-send cap flushes when MDS recovers
ceph: fix ceph_encode_locks_to_buffer()
Linus Torvalds [Mon, 3 Aug 2015 01:07:36 +0000 (18:07 -0700)]
Merge tag 'powerpc-4.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- TCE table memory calculation fix from Alexey
- Build fix for ans-lcd from Luis
- Unbalanced IRQ warning fix from Alistair
* tag 'powerpc-4.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/eeh-powernv: Fix unbalanced IRQ warning
macintosh/ans-lcd: fix build failure after module_init/exit relocation
powerpc/powernv/ioda2: Fix calculation for memory allocated for TCE table
i915: temporary fix for DP MST docking station NULL pointer dereference
Ted Ts'o reports that his Lenovo T540p ThinkPad crashes at boot if
attached to the docking station. This is a regression that he was able
to bisect to commit 8c7b5ccb7298: "drm/i915: Use atomic helpers for
computing changed flags:"
The reason seems to be the new call to drm_atomic_helper_check_modeset()
added to intel_modeset_compute_config(), which in turn calls
update_connector_routing(), and somehow ends up picking a NULL crtc for
the connector state, causing the subsequent drm_crtc_index() to OOPS.
Daniel Vetter says that the fundamental issue seems to be confusion in
the encoder selection, and this isn't the right fix, but while he chases
down the proper fix, this at least avoids the NULL pointer dereference
and makes Ted's docking station work again.
Reported-bisected-and-tested-by: Theodore Ts'o <tytso@mit.edu> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Dave Airlie <airlied@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 2 Aug 2015 16:36:21 +0000 (09:36 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"A set of three fixes for the ipr driver and one fairly major one for
memory leaks in the mq path of SCSI"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: fix memory leak with scsi-mq
ipr: Fix invalid array indexing for HRRQ
ipr: Fix incorrect trace indexing
ipr: Fix locking for unit attention handling
Linus Torvalds [Sun, 2 Aug 2015 16:12:46 +0000 (09:12 -0700)]
Merge tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Things are calming down nicely here w.r.t. fixes. This batch
includes two week's worth since I missed to send before -rc4.
Nothing particularly scary to point out, smaller fixes here and there.
Shortlog describes it pretty well"
* tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: keystone: fix dt bindings to use post div register for mainpll
ARM: nomadik: disable UART0 on Nomadik boards
ARM: dts: i.MX35: Fix can support.
ARM: OMAP2+: hwmod: Fix _wait_target_ready() for hwmods without sysc
ARM: dts: add CPU OPP and regulator supply property for exynos4210
ARM: dts: Update video-phy node with syscon phandle for exynos3250
ARM: DRA7: hwmod: fix gpmc hwmod
Al Viro [Sat, 1 Aug 2015 23:59:28 +0000 (19:59 -0400)]
link_path_walk(): be careful when failing with ENOTDIR
In RCU mode we might end up with the dentry evicted just as we check
that it's a directory. In such a case we should return ECHILD
rather than ENOTDIR, so that the pathwalk is retried in non-RCU
mode.
Breakage had been introduced in commit b18825a - prior to that
we were looking at nd->inode, which had been fetched before
verifying that ->d_seq was still valid. That form of check
would only be satisfied if at some point the pathname prefix
would indeed have resolved to a non-directory. The fix consists
of checking ->d_seq after we'd run into a non-directory dentry,
and failing with ECHILD in case of mismatch.
Note that all branches since 3.12 have that problem...
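The decision described above can be sketched as a small standalone function. This is a simplified model, not the VFS code: struct walk_state, its fields, and check_component are hypothetical, and the real fix validates ->d_seq through the kernel's own helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of what a lockless walk knows about a component. */
struct walk_state {
	bool rcu_mode;			/* walking without taking references */
	bool is_dir;			/* what the possibly stale dentry looks like */
	unsigned int seq_sampled;	/* sequence number seen at lookup time */
	unsigned int seq_now;		/* sequence number at check time */
};

/*
 * A non-directory only yields -ENOTDIR if the observation is trustworthy.
 * In RCU mode a changed sequence number means the dentry may have been
 * evicted meanwhile, so the caller is told to retry in non-RCU mode.
 */
int check_component(const struct walk_state *ws)
{
	assert(ws != NULL);
	if (ws->is_dir)
		return 0;
	if (ws->rcu_mode && ws->seq_now != ws->seq_sampled)
		return -ECHILD;
	return -ENOTDIR;
}
```

The point of the ordering is that the sequence check happens after the non-directory observation, so a stale observation is never reported as a definite ENOTDIR.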
Linus Torvalds [Sat, 1 Aug 2015 19:47:04 +0000 (12:47 -0700)]
Merge tag 'dmaengine-fix-4.2-rc5' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:
"We had a regression due to reuse of descriptor so we have reverted
that.
The rest are driver fixes:
- at_hdmac and at_xdmac for residue, transfer width, and channel config
- pl330 final fix for dma fails and overflow issue
- xgene resource map fix
- mv_xor big endian op fix"
* tag 'dmaengine-fix-4.2-rc5' of git://git.infradead.org/users/vkoul/slave-dma:
Revert "dmaengine: virt-dma: don't always free descriptor upon completion"
dmaengine: mv_xor: fix big endian operation in register mode
dmaengine: xgene-dma: Fix the resource map to handle overlapping
dmaengine: at_xdmac: fix transfer data width in at_xdmac_prep_slave_sg()
dmaengine: at_hdmac: fix residue computation
dmaengine: at_xdmac: fix bug about channel configuration
dmaengine: pl330: Really fix choppy sound because of wrong residue calculation
dmaengine: pl330: Fix overflow when reporting residue in memcpy
Linus Torvalds [Sat, 1 Aug 2015 16:47:11 +0000 (09:47 -0700)]
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixlets from Thomas Gleixner:
"Just two updates to the maintainers file"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Appoint Jiang and Marc as irqdomain maintainers
MAINTAINERS: Appoint Marc Zyngier as irqchips co-maintainer
Linus Torvalds [Sat, 1 Aug 2015 16:16:33 +0000 (09:16 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Fallout from the recent NMI fixes: make x86 LDT handling more robust.
Also some EFI fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ldt: Make modify_ldt synchronous
x86/xen: Probe target addresses in set_aliased_prot() before the hypercall
x86/irq: Use the caller provided polarity setting in mp_check_pin_attr()
efi: Check for NULL efi kernel parameters
x86/efi: Use all 64 bit of efi_memmap in setup_e820()
i2c: fix leaked device refcount on of_find_i2c_* error path
If of_find_i2c_device_by_node() or of_find_i2c_adapter_by_node() find
a device by node, but its type does not match, a reference to that
device is still held. This change fixes the problem.
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
1) Must teardown SR-IOV before unregistering netdev in igb driver, from
Alex Williamson.
2) Fix ipv6 route unreachable crash in IPVS, from Alex Gartrell.
3) Default route selection in ipv4 should take the prefix length, table
ID, and TOS into account, from Julian Anastasov.
4) sch_plug must have a reset method in order to purge all buffered
packets when the qdisc is reset, likewise for sch_choke, from WANG
Cong.
5) Fix deadlock and races in slave_changelink/br_setport in bridging.
From Nikolay Aleksandrov.
6) mlx4 bug fixes (wrong index in port event propagation to VFs,
overzealous BUG_ON assertion, etc.) from Ido Shamay, Jack
Morgenstein, and Or Gerlitz.
7) Turn off klog message about SCTP userspace interface compat that
makes no sense at all, from Daniel Borkmann.
8) Fix unbounded restarts of inet frag eviction process, causing NMI
watchdog soft lockup messages, from Florian Westphal.
9) Suspend/resume fixes for r8152 from Hayes Wang.
10) Fix busy loop when MSG_WAITALL|MSG_PEEK is used in TCP recv, from
Sabrina Dubroca.
11) Fix performance regression when removing a lot of routes from the
ipv4 routing tables, from Alexander Duyck.
12) Fix device leak in AF_PACKET, from Lars Westerhoff.
13) AF_PACKET also has a header length comparison bug due to signedness,
from Alexander Drozdov.
14) Fix bug in EBPF tail call generation on x86, from Daniel Borkmann.
15) Memory leaks, TSO stats, watchdog timeout and other fixes to
thunderx driver from Sunil Goutham and Thanneeru Srinivasulu.
16) act_bpf can leak memory when replacing programs, from Daniel
Borkmann.
17) WOL packet fixes in gianfar driver, from Claudiu Manoil.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (79 commits)
stmmac: fix missing MODULE_LICENSE in stmmac_platform
gianfar: Enable device wakeup when appropriate
gianfar: Fix suspend/resume for wol magic packet
gianfar: Fix warning when CONFIG_PM off
act_pedit: check binding before calling tcf_hash_release()
net: sk_clone_lock() should only do get_net() if the parent is not a kernel socket
net: sched: fix refcount imbalance in actions
r8152: reset device when tx timeout
r8152: add pre_reset and post_reset
qlcnic: Fix corruption while copying
act_bpf: fix memory leaks when replacing bpf programs
net: thunderx: Fix for crash while BGX teardown
net: thunderx: Add PCI driver shutdown routine
net: thunderx: Fix crash when changing rss with mutliple traffic flows
net: thunderx: Set watchdog timeout value
net: thunderx: Wakeup TXQ only if CQE_TX are processed
net: thunderx: Suppress alloc_pages() failure warnings
net: thunderx: Fix TSO packet statistic
net: thunderx: Fix memory leak when changing queue count
net: thunderx: Fix RQ_DROP miscalculation
...
Linus Torvalds [Sat, 1 Aug 2015 00:05:37 +0000 (17:05 -0700)]
Merge branch 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"Filipe fixed up a hard to trigger ENOSPC regression from our merge
window pull, and we have a few other smaller fixes"
* 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix quick exhaustion of the system array in the superblock
btrfs: its btrfs_err() instead of btrfs_error()
btrfs: Avoid NULL pointer dereference of free_extent_buffer when read_tree_block() fail
btrfs: Fix lockdep warning of btrfs_run_delayed_iputs()
Linus Torvalds [Sat, 1 Aug 2015 00:00:25 +0000 (17:00 -0700)]
Merge tag 'sound-4.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This became a relative big update as it includes the collected ASoC
fixes. There are a few fixes in ASoC core side, mostly for DAPM and
the new topology API. The rest are various ASoC driver-specific
fixes, as well as the usual HD-audio and USB-audio quirks"
* tag 'sound-4.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (29 commits)
ALSA: hda - Fix MacBook Pro 5,2 quirk
ALSA: hda - Fix race between PM ops and HDA init/probe
ALSA: usb-audio: add dB range mapping for some devices
ALSA: hda - Apply a fixup to Dell Vostro 5480
ALSA: hda - Add pin quirk for the headset mic jack detection on Dell laptop
ALSA: hda - Apply fixup for another Toshiba Satellite S50D
ALSA: fireworks: add support for AudioFire2 quirk
ALSA: hda - Fix the headset mic that will not work on Dell desktop machine
ALSA: hda - fix cs4210_spdif_automute()
ASoC: pcm1681: Fix setting de-emphasis sampling rate selection
ASoC: ssm4567: Keep TDM_BCLKS in ssm4567_set_dai_fmt
ASoC: sgtl5000: Fix up define for SGTL5000_SMALL_POP
ASoC: dapm: Don't add prefix to widget stream name
ASoC: rt5645: Check if codec is initialized in workqueue handler
ASoC: Intel: Get correct usage_count value to load firmware
ASoC: topology: Fix to add dapm mixer info
ASoC: zx: spdif: Fix devm_ioremap_resource return value check
ASoC: zx: i2s: Fix devm_ioremap_resource return value check
ASoC: mediatek: Use platform_of_node for machine drivers
ASoC: Free card DAPM context on snd_soc_instantiate_card() error path
...
Joachim Eastwood [Fri, 31 Jul 2015 17:13:22 +0000 (19:13 +0200)]
stmmac: fix missing MODULE_LICENSE in stmmac_platform
Commit 50649ab14982 ("stmmac: drop driver from stmmac platform code")
was a bit overzealous in removing code and dropped the MODULE_*
macros that are still needed since stmmac_platform can be a module.
Fix this by putting the macros removed in 50649ab14982 back.
This fixes the following errors when used as a module:
stmmac_platform: module license 'unspecified' taints kernel.
Disabling lock debugging due to kernel taint
stmmac_platform: Unknown symbol devm_kmalloc (err 0)
stmmac_platform: Unknown symbol stmmac_suspend (err 0)
stmmac_platform: Unknown symbol platform_get_irq_byname (err 0)
stmmac_platform: Unknown symbol stmmac_dvr_remove (err 0)
stmmac_platform: Unknown symbol platform_get_resource (err 0)
stmmac_platform: Unknown symbol of_get_phy_mode (err 0)
stmmac_platform: Unknown symbol of_property_read_u32_array (err 0)
stmmac_platform: Unknown symbol of_alias_get_id (err 0)
stmmac_platform: Unknown symbol stmmac_resume (err 0)
stmmac_platform: Unknown symbol stmmac_dvr_probe (err 0)
Fixes: 50649ab14982 ("stmmac: drop driver from stmmac platform code") Reported-by: Igor Gnatenko <i.gnatenko.brain@gmail.com> Signed-off-by: Joachim Eastwood <manabian@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 31 Jul 2015 22:41:50 +0000 (15:41 -0700)]
Merge branch 'gianfar-wol-fixes'
Claudiu Manoil says:
====================
gianfar: wol magic packet fixes
These changes were already validated as part of FSL SDK.
Patch 2 fixes occasional wake-on magic packet failures during
traffic, probably due to incorrect traffic stop/ device halt
sequence and incorrect usage of txlock.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The wol_en flag is 0 by default anyway, and we have the
following inconsistency: a MAGIC packet wol capable eth
interface is registered as a wake-up source but unable
to wake-up the system as wol_en is 0 (wake-on flag set to 'd').
Calling set_wakeup_enable() at netdev open is just redundant
because wol_en is 0 by default.
Let only ethtool call set_wakeup_enable() for now.
The bflock is obviously obsolete; its utility has eroded
over time. The bitfield flags used today in gianfar are accessed
only on the init/config path, with no real possibility of
concurrency, so nothing justifies something like bflock.
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
If we disable NAPI in the first place we can mask the device's
interrupts (and halt it) without fearing that imask may be
concurrently accessed from interrupt context, so there's
no need to do local_irq_save() around gfar_halt_nodisable().
lock_rx_qs()/unlock_tx_qs() are just obsolete and potentially
buggy routines. The txlock is currently used in the driver only
to manage TX congestion, it has nothing to do with halting the
device. With these changes, the TX processing is stopped before
gfar_halt().
Compact gfar_halt() is used instead of gfar_halt_nodisable(),
as it disables Rx/TX DMA h/w blocks and the Rx/TX h/w queues.
gfar_start() re-enables all these blocks on resume. Enabling
the magic-packet mode remains the same, note that the RX block
is re-enabled just before entering sleep mode.
Add IRQF_NO_SUSPEND flag for the error interrupt line, to signal
that the interrupt line must remain active during sleep in order
to wake the system by magic packet (MAG) reception interrupt.
(On some systems the MAG interrupt did trigger w/o this flag
as well, but on others it didn't.)
Without these fixes, when suspended during fair Tx traffic the
interface occasionally failed to be woken up by magic packet.
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
CC drivers/net/ethernet/freescale/gianfar.o
drivers/net/ethernet/freescale/gianfar.c:568:13: warning: 'lock_tx_qs'
defined but not used [-Wunused-function]
static void lock_tx_qs(struct gfar_private *priv)
^
drivers/net/ethernet/freescale/gianfar.c:576:13: warning: 'unlock_tx_qs'
defined but not used [-Wunused-function]
static void unlock_tx_qs(struct gfar_private *priv)
^
Reported-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Fri, 31 Jul 2015 00:12:21 +0000 (17:12 -0700)]
act_pedit: check binding before calling tcf_hash_release()
When we share an action within a filter, the bind refcnt
should increase, therefore we should not call tcf_hash_release().
Fixes: 1a29321ed045 ("net_sched: act: Dont increment refcnt on replace") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Murali Karicheri [Fri, 29 May 2015 16:04:13 +0000 (12:04 -0400)]
ARM: dts: keystone: fix dt bindings to use post div register for mainpll
All of the keystone devices have a separate register to hold post
divider value for main pll clock. Currently the fixed-postdiv
value used for k2hk/l/e SoCs works by sheer luck as u-boot happens to
use a value of 2 for this. Now that we have fixed this in the pll
clock driver, change the dt bindings for the same.
Merge tag 'iommu-fixes-v4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"These fixes are all for the AMD IOMMU driver:
- A regression with HSA caused by the conversion of the driver to
default domains. The fixes make sure that an HSA device can still
be attached to an IOMMUv2 domain and that these domains also allow
non-IOMMUv2 capable devices.
- Fix iommu=pt mode which did not work because the dma_ops were set
to nommu_ops, which breaks devices that can only do 32bit DMA.
- Fix an issue with non-PCI devices not working, because there are no
dma_ops for them. This issue was discovered recently as new AMD
x86 platforms have non-PCI devices too"
* tag 'iommu-fixes-v4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Allow non-ATS devices in IOMMUv2 domains
iommu/amd: Set global dma_ops if swiotlb is disabled
iommu/amd: Use swiotlb in passthrough mode
iommu/amd: Allow non-IOMMUv2 devices in IOMMUv2 domains
iommu/amd: Use iommu core for passthrough mode
iommu/amd: Use iommu_attach_group()
Merge tag 'drm-intel-fixes-2015-07-31' of git://anongit.freedesktop.org/drm-intel
Pull drm intel fixes from Daniel Vetter:
"I delayed my -fixes pull a bit hoping that I could include a fix for
the dp mst stuff but looks a bit more nasty than that. So just 3
other regression fixes, one 4.2 other two cc: stable"
* tag 'drm-intel-fixes-2015-07-31' of git://anongit.freedesktop.org/drm-intel:
drm/i915: Declare the swizzling unknown for L-shaped configurations
drm/i915: Mark PIN_USER binding as GLOBAL_BIND without the aliasing ppgtt
drm/i915: Replace WARN inside I915_READ64_2x32 with retry loop
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"This has a bunch of nouveau fixes, as Ben has been hibernating and has
lots of small fixes for lots of bugs across nouveau.
Radeon has one major fix for hdmi/dp audio regression that is larger
than Alex would like, but seems to fix up a fair few bugs, along with
some misc fixes.
And a few msm fixes, one of which is also a bit large.
But nothing in here seems insane or crazy for this stage, just more
than I'd like"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (33 commits)
drm/msm/mdp5: release SMB (shared memory blocks) in various cases
drm/msm: change to uninterruptible wait in atomic commit
drm/msm: mdp4: Fix drm_framebuffer dereference crash
drm/msm: fix msm_gem_prime_get_sg_table()
drm/amdgpu: add new parameter to seperate map and unmap
drm/amdgpu: hdp_flush is not needed for inside IB
drm/amdgpu: different emit_ib for gfx and compute
drm/amdgpu: information leak in amdgpu_info_ioctl()
drm/amdgpu: clean up init sequence for failures
drm/radeon/combios: add some validation of lvds values
drm/radeon: rework audio modeset to handle non-audio hdmi features
drm/radeon: rework audio detect (v4)
drm/amdgpu: Drop drm/ prefix for including drm.h in amdgpu_drm.h
drm/radeon: Drop drm/ prefix for including drm.h in radeon_drm.h
drm/nouveau/nouveau/ttm: fix tiled system memory with Maxwell
drm/nouveau/kms/nv50-: guard against enabling cursor on disabled heads
drm/nouveau/fbcon/g80: reduce PUSH_SPACE alloc, fire ring on accel init
drm/nouveau/fbcon/gf100-: reduce RING_SPACE allocation
drm/nouveau/fbcon/nv11-: correctly account for ring space usage
drm/nouveau/bios: add proper support for opcode 0x59
...
Jun Nie [Fri, 10 Jul 2015 12:02:49 +0000 (20:02 +0800)]
Revert "dmaengine: virt-dma: don't always free descriptor upon completion"
This reverts commit b9855f03d560d351e95301b9de0bc3cad3b31fe9.
The patch breaks existing DMA use cases. For example, the audio SoC
dmaengine code never releases its channel, which causes virt-dma to
cache so many descriptors that system memory is exhausted.
dmaengine: mv_xor: fix big endian operation in register mode
Commit 6f166312c6ea2 ("dmaengine: mv_xor: add support for a38x command
in descriptor mode") introduced the support for a feature that
appeared in Armada 38x: specifying the operation to be performed on a
per-descriptor basis rather than globally per channel.
However, when doing so, it changed the function mv_chan_set_mode() to
use:
if (IS_ENABLED(__BIG_ENDIAN))
instead of:
#if defined(__BIG_ENDIAN)
While IS_ENABLED() is perfectly fine for CONFIG_* symbols, it is not
for other symbols such as __BIG_ENDIAN that is provided directly by
the compiler. Consequently, the commit broke support for big-endian,
as the XOR_DESCRIPTOR_SWAP flag was not set in the XOR channel
configuration register.
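The difference between the two tests can be reproduced in user space. The
following is a minimal sketch, not the kernel's own header: the SK_* macros
imitate a "defined to 1" preprocessor test, and DEMO_CONFIG_FOO and
DEMO_BIG_ENDIAN are hypothetical stand-ins. It assumes the byte-order macro
expands to a value other than 1 (4321 is used here only as an illustration).

```c
/* All SK_* and DEMO_* names are hypothetical and exist only for this sketch. */
#define SK_PLACEHOLDER_1 0,
#define SK_SECOND(ignored, val, ...) val
#define SK_PICK(arg1_or_junk) SK_SECOND(arg1_or_junk 1, 0, 0)
#define SK_PASTE(value) SK_PICK(SK_PLACEHOLDER_##value)
/* Yields 1 only when 'option' is a macro that expands to exactly 1. */
#define SK_IS_ENABLED(option) SK_PASTE(option)

#define DEMO_CONFIG_FOO 1     /* shaped like a config symbol: defined to 1 */
#define DEMO_BIG_ENDIAN 4321  /* assumed shape: defined, but not to 1 */

/* A "defined to 1" test works for the config-style symbol. */
int config_style_check(void)
{
	return SK_IS_ENABLED(DEMO_CONFIG_FOO);
}

/* The same test silently yields 0 for a macro defined to another value. */
int endian_style_check(void)
{
	return SK_IS_ENABLED(DEMO_BIG_ENDIAN);
}

/* A plain preprocessor test only asks whether the macro exists at all. */
int endian_ifdef_check(void)
{
#if defined(DEMO_BIG_ENDIAN)
	return 1;
#else
	return 0;
#endif
}
```

In this sketch the branch guarded by the "defined to 1" test is compiled out
without any diagnostic, which is consistent with the flag not being set.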
The primary visible effect was some nasty warnings and failures
appearing during the self-test of the XOR unit.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Fixes: 6f166312c6ea2 ("dmaengine: mv_xor: add support for a38x command in descriptor mode") Reviewed-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
dmaengine: xgene-dma: Fix the resource map to handle overlapping
There is an overlap in dma ring cmd csr region due to sharing of ethernet
ring cmd csr region. This patch fixes the resource overlap by mapping
the entire dma ring cmd csr region.
Cyrille Pitchen [Tue, 30 Jun 2015 12:36:57 +0000 (14:36 +0200)]
dmaengine: at_xdmac: fix transfer data width in at_xdmac_prep_slave_sg()
This patch adds the missing update of the transfer data width in
at_xdmac_prep_slave_sg().
Indeed, for each item in the scatter-gather list, we check whether the
transfer length is aligned with the data width provided by
dmaengine_slave_config(). If so, we directly use this data width for the
current part of the transfer we are preparing. Otherwise, the data width
is reduced to 8 bits (1 byte). Of course, the actual number of register
accesses must also be updated to match the new data width.
So one chunk was missing in the original patch (see Fixes tag below): the
number of register accesses was correctly set to (len >> fixed_dwidth) in
mbr_ubc but the real data width was not updated in mbr_cfg. Since mbr_cfg
may change for each part of the scatter-gather transfer this also explains
why the original patch used the Descriptor View 2 instead of the
Descriptor View 1.
Let's take the example of a DMA transfer to write 8bit data into an Atmel
USART with FIFOs. When FIFOs are enabled in the USART, its Transmit
Holding Register (THR) works in multidata mode: up to four 8-bit data
items can be written into the THR in a single 32-bit access, and it is
still possible to write a single data item with an 8-bit access. To take
advantage of this new feature, the DMA driver was modified to allow
multiple dwidths when doing slave transfers.
For instance, when the total length is 22 bytes, the USART driver splits
the transfer into 2 parts:
First part: 20 bytes transferred through 5 32bit writes into THR
Second part: 2 bytes transferred through 2 8bit writes into THR
For the second part, the data width was first set to 4_BYTES by the USART
driver thanks to dmaengine_slave_config() then at_xdmac_prep_slave_sg()
reduces this data width to 1_BYTE because the 2 byte length is not aligned
with the original 4_BYTES data width. Since the data width is modified,
the actual number of writes into THR must be set accordingly.
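The width selection described above can be sketched as follows. This is an
illustration under stated assumptions, not the driver code: plan_chunk() and
struct chunk_plan are hypothetical names, and the data width is assumed to
be expressed as a log2 value (0 = 1 byte, 2 = 4 bytes).

```c
#include <stddef.h>

/* Hypothetical types and names; they do not come from the at_xdmac driver. */
struct chunk_plan {
	unsigned int dwidth_log2; /* 0 = 1 byte, 1 = 2 bytes, 2 = 4 bytes */
	size_t accesses;          /* register accesses needed for this chunk */
};

struct chunk_plan plan_chunk(size_t len, unsigned int cfg_dwidth_log2)
{
	struct chunk_plan plan;
	size_t align_mask = ((size_t)1 << cfg_dwidth_log2) - 1;

	/* Keep the configured width only if the length is a multiple of it. */
	plan.dwidth_log2 = (len & align_mask) ? 0 : cfg_dwidth_log2;
	/* The access count must use the same width that gets programmed. */
	plan.accesses = len >> plan.dwidth_log2;
	return plan;
}
```

With a configured width of 4 bytes, the 20-byte part keeps that width and
needs 5 accesses, while the 2-byte part falls back to 1 byte and needs 2.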
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@atmel.com> Fixes: 6d3a7d9e3ada ("dmaengine: at_xdmac: allow muliple dwidths when doing slave transfers") Cc: stable@vger.kernel.org #4.0 and later Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Ludovic Desroches <ludovic.desroches@atmel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Cyrille Pitchen [Thu, 18 Jun 2015 11:25:41 +0000 (13:25 +0200)]
dmaengine: at_hdmac: fix residue computation
As claimed by the programmer datasheet and confirmed by the IP designer,
the Block Transfer Size (BTSIZE) bitfield of the Channel x Control A
Register (CTRLAx) always refers to a number of Source Width (SRC_WIDTH)
transfers.
Both the SRC_WIDTH and BTSIZE bitfields can be extracted from the CTRLAx
register to compute the DMA residue. So the 'tx_width' field is useless
and can be removed from the struct at_desc.
Before this patch, atc_prep_slave_sg() was not consistent: BTSIZE was
correctly initialized according to the SRC_WIDTH but 'tx_width' was always
set to reg_width, which was incorrect for MEM_TO_DEV transfers. It led to
bad DMA residue when 'tx_width' != SRC_WIDTH.
Also the 'tx_width' field was mostly set only in the first and last
descriptors. Depending on the kind of DMA transfer, this field remained
uninitialized for intermediate descriptors. The accurate DMA residue was
computed only when the currently processed descriptor was the first or the
last of the chain. This algorithm was a little bit odd. An accurate DMA
residue can always be computed using the SRC_WIDTH and BTSIZE bitfields
in the CTRLAx register.
Finally, the test to check whether the currently processed descriptor is
the last of the chain was wrong: for cyclic transfer, last_desc->lli.dscr
is NOT equal to zero, since set_desc_eol() is never called, but logically
equal to first_desc->txd.phys. This bug has a side effect on the
drivers/tty/serial/atmel_serial.c driver, which uses cyclic DMA transfer
to receive data. Since the DMA residue was wrong each time the DMA
transfer reached the second (and last) period of the transfer, no more
data was received by the USART driver until the cyclic DMA transfer looped
back to the first period.
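The residue computation from the CTRLAx bitfields can be sketched as below.
The field positions (BTSIZE in the low 16 bits, SRC_WIDTH as a 2-bit log2
value at bit 24) and all DEMO_* and demo_* names are assumptions made for
illustration; the datasheet is the authority on the real layout.

```c
#include <stdint.h>

/* Assumed field layout, for illustration only. */
#define DEMO_BTSIZE_MASK     0xFFFFu
#define DEMO_SRC_WIDTH_SHIFT 24
#define DEMO_SRC_WIDTH_MASK  (0x3u << DEMO_SRC_WIDTH_SHIFT)

/* Bytes left in the current descriptor, derived from the register alone. */
uint32_t demo_residue_bytes(uint32_t ctrla)
{
	uint32_t btsize = ctrla & DEMO_BTSIZE_MASK;
	uint32_t src_width = (ctrla & DEMO_SRC_WIDTH_MASK) >> DEMO_SRC_WIDTH_SHIFT;

	/* BTSIZE counts transfers of SRC_WIDTH, so scale it to bytes. */
	return btsize << src_width;
}
```

Because both inputs come from the same register value, no per-descriptor
copy of the width is needed in this sketch.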
dmaengine: at_xdmac: fix bug about channel configuration
When using descriptor view 2 or higher, we don't write the configuration
into AT_XDMAC_CC register because this configuration will be fetched from
the descriptor. Unfortunately, the PROT bit is not updated with this
method, we have to do it manually before enabling the channel.
iommu/amd: Allow non-ATS devices in IOMMUv2 domains
With the grouping of multi-function devices a non-ATS
capable device might also end up in the same domain as an
IOMMUv2 capable device.
So handle this situation gracefully and don't consider it a
bug anymore.
Jan Luebbe [Wed, 8 Jul 2015 14:35:27 +0000 (16:35 +0200)]
i2c: omap: fix bus recovery setup
At least on the AM335x, enabling OMAP_I2C_SYSTEST_ST_EN is not enough to
allow direct access to the SCL and SDA pins. In addition to ST_EN, we
need to set the TMODE to 0b11 (Loop back & SDA/SCL IO mode select).
Also, as the reset values of SCL_O and SDA_O are 0 (which means "drive
low level"), we need to set them to 1 (which means "high-impedance") to
avoid unwanted changes on the pins.
As a precaution, reset all these bits to their default values after
recovery is complete.
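The register handling described above amounts to a few bit operations. The
sketch below assumes bit positions for the SYSTEST fields and uses
hypothetical DEMO_* and demo_* names; it only models the value written to
the register, not the actual register access.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration; check the TRM for real values. */
#define DEMO_SYSTEST_ST_EN       (1u << 15)
#define DEMO_SYSTEST_TMODE_SHIFT 12
#define DEMO_SYSTEST_TMODE_MASK  (3u << DEMO_SYSTEST_TMODE_SHIFT)
#define DEMO_SYSTEST_SCL_O       (1u << 2)
#define DEMO_SYSTEST_SDA_O       (1u << 0)

/* Value to write before recovery: test mode on, both lines released. */
uint32_t demo_prepare_recovery(uint32_t reg)
{
	reg |= DEMO_SYSTEST_ST_EN;                 /* enable system test */
	reg &= ~DEMO_SYSTEST_TMODE_MASK;
	reg |= 3u << DEMO_SYSTEST_TMODE_SHIFT;     /* TMODE = 0b11 */
	reg |= DEMO_SYSTEST_SCL_O | DEMO_SYSTEST_SDA_O; /* 1 = high-impedance */
	return reg;
}

/* Value to write after recovery: all touched bits back to 0. */
uint32_t demo_unprepare_recovery(uint32_t reg)
{
	reg &= ~(DEMO_SYSTEST_ST_EN | DEMO_SYSTEST_TMODE_MASK |
		 DEMO_SYSTEST_SCL_O | DEMO_SYSTEST_SDA_O);
	return reg;
}
```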
Signed-off-by: Jan Luebbe <jlu@pengutronix.de> Tested-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Jan Luebbe [Wed, 8 Jul 2015 14:35:06 +0000 (16:35 +0200)]
i2c: core: only use set_scl for bus recovery after calling prepare_recovery
Using set_scl may be ineffective before calling the driver specific
prepare_recovery callback, which might change into a test mode. So
instead of setting SCL in i2c_generic_scl_recovery, move it to
i2c_generic_recovery (after the optional prepare_recovery).
Signed-off-by: Jan Luebbe <jlu@pengutronix.de> Acked-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Tested-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
For write/discard obj_requests that involved a copyup method call, the
opcode of the first op is CEPH_OSD_OP_CALL and the ->callback is
rbd_img_obj_copyup_callback(). The latter frees copyup pages, sets
->xferred and delegates to rbd_img_obj_callback(), the "normal" image
object callback, for reporting to block layer and putting refs.
rbd_osd_req_callback() however treats CEPH_OSD_OP_CALL as a trivial op,
which means obj_request is marked done in rbd_osd_trivial_callback(),
*before* ->callback is invoked and rbd_img_obj_copyup_callback() has
a chance to run. Marking obj_request done essentially means giving
rbd_img_obj_callback() a license to end it at any moment, so if another
obj_request from the same img_request is being completed concurrently,
rbd_img_obj_end_request() may very well be called on such prematurely
marked done request:
Calling rbd_img_obj_end_request() on such a request leads to trouble,
in particular because its ->xferred is 0. We report 0 to the block
layer with blk_update_request(), get back 1 for "this request has more
data in flight" and then trip on
with rhs (which == ...) being 1 because rbd_img_obj_end_request() has
been called for both requests and lhs (more) being 1 because we haven't
got a chance to set ->xferred in rbd_img_obj_copyup_callback() yet.
To fix this, leverage that rbd wants to call class methods in only two
cases: one is a generic method call wrapper (obj_request is standalone)
and the other is a copyup (obj_request is part of an img_request). So
make a dedicated handler for CEPH_OSD_OP_CALL and directly invoke
rbd_img_obj_copyup_callback() from it if obj_request is part of an
img_request, similar to how CEPH_OSD_OP_READ handler invokes
rbd_img_obj_request_read_callback().
Since rbd_img_obj_copyup_callback() is now being called from the OSD
request callback (only), it is renamed to rbd_osd_copyup_callback().
Cc: Alex Elder <elder@linaro.org> Cc: stable@vger.kernel.org # 3.10+, needs backporting for < 3.18 Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
ceph: always re-send cap flushes when MDS recovers
Commit e548e9b93d3e565e42b938a99804114565be1f81 makes the kclient
only re-send cap flush once during MDS failover. If the kclient sends
a cap flush after the MDS enters the reconnect stage but before the MDS
recovers, the kclient will skip re-sending the same cap flush when the
MDS recovers.
This causes a problem for newly created inodes. The MDS handles cap
flushes before replaying unsafe requests, so it's possible that the MDS
finds the corresponding inode missing when handling a cap flush. The fix
is to revert to the old behaviour: always re-send when the MDS recovers.
Andy Lutomirski [Thu, 30 Jul 2015 21:31:32 +0000 (14:31 -0700)]
x86/ldt: Make modify_ldt synchronous
modify_ldt() has questionable locking and does not synchronize
threads. Improve it: redesign the locking and synchronize all
threads' LDTs using an IPI on all modifications.
This will dramatically slow down modify_ldt in multithreaded
programs, but there shouldn't be any multithreaded programs that
care about modify_ldt's performance in the first place.
This fixes some fallout from the CVE-2015-5157 fixes.
Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: security@kernel.org <security@kernel.org> Cc: <stable@vger.kernel.org> Cc: xen-devel <xen-devel@lists.xen.org> Link: http://lkml.kernel.org/r/4c6978476782160600471bd865b318db34c7b628.1438291540.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Andy Lutomirski [Thu, 30 Jul 2015 21:31:31 +0000 (14:31 -0700)]
x86/xen: Probe target addresses in set_aliased_prot() before the hypercall
The update_va_mapping hypercall can fail if the VA isn't present
in the guest's page tables. Under certain loads, this can
result in an OOPS when the target address is in unpopulated vmap
space.
While we're at it, add comments to help explain what's going on.
This isn't a great long-term fix. This code should probably be
changed to use something like set_memory_ro.
Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Vrabel <dvrabel@cantab.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: security@kernel.org <security@kernel.org> Cc: <stable@vger.kernel.org> Cc: xen-devel <xen-devel@lists.xen.org> Link: http://lkml.kernel.org/r/0b0e55b995cda11e7829f140b833ef932fcabe3a.1438291540.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent
Pull EFI fixes from Matt Fleming:
* Fix an EFI boot issue preventing a Parallels virtual machine from
booting because the upper 32-bits of the EFI memmap pointer were
being discarded in setup_e820(). (Dmitry Skorodumov)
* Validate that the "efi" kernel parameter gets used with an argument,
otherwise we will oops. (Ricardo Neri)
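The truncation described in the first item can be illustrated with a small
sketch. It assumes the 64-bit memmap pointer is handed over as two 32-bit
halves; the demo_* names are hypothetical and not taken from setup_e820().

```c
#include <stdint.h>

/* Combine both halves so that a map placed above 4 GiB stays reachable. */
uint64_t demo_memmap_addr(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/* The faulty pattern: the upper half is dropped. */
uint64_t demo_memmap_addr_truncated(uint32_t lo, uint32_t hi)
{
	(void)hi; /* discarded, which is the bug being illustrated */
	return lo;
}
```

The two helpers only differ once the upper half is non-zero, which would
explain why such a bug can stay hidden on many machines.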
Merge tag 'xfs-for-linus-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs
Pull xfs fixes from Dave Chinner:
"There are a couple of recently found, long standing remote attribute
corruption fixes caused by log recovery getting confused after a
crash, and the new DAX code in XFS (merged in 4.2-rc1) needs to
actually use the DAX fault path on read faults.
Summary:
- remote attribute log recovery corruption fixes
- DAX page faults need to use direct mappings, not a page cache
mapping"
* tag 'xfs-for-linus-4.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
xfs: remote attributes need to be considered data
xfs: remote attribute headers contain an invalid LSN
xfs: call dax_fault on read page faults for DAX
net: sk_clone_lock() should only do get_net() if the parent is not a kernel socket
The newsk returned by sk_clone_lock should hold a get_net()
reference if, and only if, the parent is not a kernel socket
(making this similar to sk_alloc()).
E.g., for the SYN_RECV path, tcp_v4_syn_recv_sock->..inet_csk_clone_lock
sets up the syn_recv newsk from sk_clone_lock. When the parent (listen)
socket is a kernel socket (defined in sk_alloc() as having
sk_net_refcnt == 0), then the newsk should also have a 0 sk_net_refcnt
and should not hold a get_net() reference.
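The rule above can be modelled in a few lines. This is a simplified sketch
with hypothetical demo_* types, not the socket code: the clone inherits the
parent's net_refcnt flag and takes a namespace reference only when that flag
is set.

```c
/* Hypothetical stand-ins for the namespace and socket structures. */
struct demo_net {
	int refs;
};

struct demo_sock {
	struct demo_net *net;
	int net_refcnt; /* 0 for a kernel socket, non-zero otherwise */
};

void demo_clone_init(struct demo_sock *newsk, const struct demo_sock *parent)
{
	newsk->net = parent->net;
	newsk->net_refcnt = parent->net_refcnt;
	/* Only a clone of a non-kernel socket pins the namespace. */
	if (newsk->net_refcnt)
		newsk->net->refs++;
}

/* Helper: namespace reference count after cloning from such a parent. */
int demo_net_refs_after_clone(int parent_net_refcnt)
{
	struct demo_net net = { .refs = 1 };
	struct demo_sock parent = { .net = &net, .net_refcnt = parent_net_refcnt };
	struct demo_sock newsk;

	demo_clone_init(&newsk, &parent);
	return net.refs;
}
```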
Fixes: 26abe14379f8 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.") Acked-by: Eric Dumazet <edumazet@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 29 Jul 2015 21:35:25 +0000 (23:35 +0200)]
net: sched: fix refcount imbalance in actions
Since commit 55334a5db5cd ("net_sched: act: refuse to remove bound action
outside"), we end up with a wrong reference count for a tc action.
Test case 1:
FOO="1,6 0 0 4294967295,"
BAR="1,6 0 0 4294967294,"
tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 \
action bpf bytecode "$FOO"
tc actions show action bpf
action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe
index 1 ref 1 bind 1
tc actions replace action bpf bytecode "$BAR" index 1
tc actions show action bpf
action order 0: bpf bytecode '1,6 0 0 4294967294' default-action pipe
index 1 ref 2 bind 1
tc actions replace action bpf bytecode "$FOO" index 1
tc actions show action bpf
action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe
index 1 ref 3 bind 1
Test case 2:
FOO="1,6 0 0 4294967295,"
tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 action ok
tc actions show action gact
action order 0: gact action pass
random type none pass val 0
index 1 ref 1 bind 1
tc actions add action drop index 1
RTNETLINK answers: File exists [...]
tc actions show action gact
action order 0: gact action pass
random type none pass val 0
index 1 ref 2 bind 1
tc actions add action drop index 1
RTNETLINK answers: File exists [...]
tc actions show action gact
action order 0: gact action pass
random type none pass val 0
index 1 ref 3 bind 1
What happens is that in tcf_hash_check(), we check tcf_common for a given
index and increase tcfc_refcnt and conditionally tcfc_bindcnt when we've
found an existing action. Now there are the following cases:
1) We do a late binding of an action. In that case, we leave the
tcfc_refcnt/tcfc_bindcnt increased and are done with the ->init()
handler. This is correctly handled.
2) We replace the given action, or we try to add one without replacing
and find out that the action at a specific index already exists
(thus, we go out with error in that case).
In case of 2), we have to undo the reference count increase from
tcf_hash_check() in the tcf_hash_release() function. Currently, we fail to
do so because of the 'tcfc_bindcnt > 0' check which bails out early with
an -EPERM error.
Now, while commit 55334a5db5cd prevents 'tc actions del action ...' on an
already classifier-bound action to drop the reference count (which could
then become negative, wrap around etc), this restriction only accounts for
invocations outside a specific action's ->init() handler.
One possible solution would be to add a flag so that we trigger
the -EPERM only in situations where it is indeed relevant.
After the patch, above test cases have correct reference count again.
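The imbalance and the flag-based approach can be modelled in user space. The
sketch below is an assumption about how such a flag might look, not the
patch itself: demo_action, demo_hash_check(), demo_hash_release() and the
'strict' parameter are hypothetical names.

```c
#include <errno.h>
#include <stdbool.h>

struct demo_action {
	int refcnt;
	int bindcnt;
};

/* Lookup of an existing action: takes a reference, and a binding if asked. */
void demo_hash_check(struct demo_action *a, bool bind)
{
	a->refcnt++;
	if (bind)
		a->bindcnt++;
}

/* Release: 'strict' refuses to drop a reference on a bound action. */
int demo_hash_release(struct demo_action *a, bool bind, bool strict)
{
	if (bind)
		a->bindcnt--;
	else if (strict && a->bindcnt > 0)
		return -EPERM; /* bails out before the reference is dropped */
	a->refcnt--;
	return 0;
}

/* Replace a classifier-bound action 'times' times; return the final refcnt. */
int demo_refcnt_after_replaces(int times, bool strict)
{
	struct demo_action a = { .refcnt = 1, .bindcnt = 1 };

	for (int i = 0; i < times; i++) {
		demo_hash_check(&a, false);           /* ->init() finds the action */
		demo_hash_release(&a, false, strict); /* meant to undo the lookup */
	}
	return a.refcnt;
}
```

With the strict check applied inside the ->init() path, two replaces leave
the model at ref 3, as in test case 1; without it the count returns to 1.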
Fixes: 55334a5db5cd ("net_sched: act: refuse to remove bound action outside") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>