git.karo-electronics.de Git - linux-beck.git/log

xfs: writeback and inval. file range to be shifted by collapse

The collapse range operation currently writes the entire file before
starting the collapse to avoid changes in the in-core extent list due to
writeback causing the extent count to change. Now that collapse range is
fsb based rather than extent index based it can sustain changes in the
extent list during the shift sequence without disruption.

Modify xfs_collapse_file_space() to writeback and invalidate pages
associated with the range of the file to be shifted.
xfs_free_file_space() currently has similar behavior, but the space free
need only affect the region of the file that is freed and this could
change in the future.

Also update the comments to reflect the current implementation. We
retain the eofblocks trim permanently as a best option for dealing with
delalloc extents. We don't shift delalloc extents because this scenario
only occurs with post-eof preallocation (since data must be flushed such
that the cache can be invalidated and data can be shifted). That means
said space must also be initialized before being shifted into the
accessible region of the file only to be immediately truncated off as
the last part of the collapse. In other words, the eofblocks trim will
happen anyways, we just run it first to ensure the file remains in a
consistent state throughout the collapse.

Finally, detect and fail explicitly in the event of a delalloc extent
during the extent shift. The implementation does not support delalloc
extents and the caller is expected to prevent this scenario in advance
as is done by collapse.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: refactor single extent shift into xfs_bmse_shift_one() helper

xfs_bmap_shift_extents() has a variety of conditions and error checks
that make the logic difficult to follow and indent heavy. Refactor the
loop body of this function into a new xfs_bmse_shift_one() helper. This
simplifies the error checks, eliminates index decrement on merge hack by
pushing the index increment down into the helper, and makes the code
more readable by reducing multiple levels of indentation.

This is a code refactor only. The behavior of extent shift and collapse
range is not modified.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: refactor shift-by-merge into xfs_bmse_merge() helper

The extent shift mechanism in xfs_bmap_shift_extents() is complicated
and handles several different, non-deterministic scenarios. These
include extent shifts, extent merges and potential btree updates in
either of the former scenarios.

Refactor the code to be more linear and readable. The loop logic in
xfs_bmap_shift_extents() and some initial error checking is adjusted
slightly. The associated btree lookup and update/delete operations are
condensed into single blocks of code. This reduces the number of
btree-specific blocks and facilitates the separation of the merge
operation into a new xfs_bmse_merge() and xfs_bmse_can_merge() helpers.

This is a code refactor only. The behavior of extent shift and collapse
range is not modified.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: track collapse via file offset rather than extent index

The collapse range implementation uses a transaction per extent shift.
The progress of the overall operation is tracked via the current extent
index of the in-core extent list. This is racy because the ilock must be
dropped and reacquired for each transaction according to locking and log
reservation rules. Therefore, writeback to prior regions of the file is
possible and can change the extent count. This changes the extent to
which the current index refers and causes the collapse to fail mid
operation. To avoid this problem, the entire file is currently written
back before the collapse operation starts.

To eliminate the need to flush the entire file, use the file offset
(fsb) to track the progress of the overall extent shift operation rather
than the extent index. Modify xfs_bmap_shift_extents() to
unconditionally convert the start_fsb parameter to an extent index and
return the file offset of the extent where the shift left off, if
further extents exist. The bulk of ths function can remain based on
extent index as ilock is held by the caller. xfs_collapse_file_space()
now uses the fsb output as the starting point for the subsequent shift.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: ensure WB_SYNC_ALL writeback handles partial pages correctly

XFS has been having trouble with stray delayed allocation extents
beyond EOF for a long time. Recent changes to the collapse range
code has triggered erroneous EBUSY errors on page invalidtion for
block size smaller than page size filesystems. These
have been caused by dirty buffers beyond EOF on a partial page which
do not get written to disk during a sync.

The issue is that write-ahead in xfs_cluster_write() finds such a
partial page and handles it by leaving the page dirty but pushing it
into a writeback state. This used to work just fine, as the
write_cache_pages() code would then find the dirty partial page in
the next mapping tree lookup as the dirty tag is still set.

Unfortunately, when we moved to a mark and sweep approach to
writeback to fix other writeback sync issues, we broken this. THe
act of marking the page as under writeback now clears the TOWRITE
tag in the radix tree, even though the page is still dirty. This
causes the TOWRITE tag to be cleared, and hence the next lookup on
the mapping tree does not find the dirty partial page and so doesn't
try to write it again.

This same writeback bug was found recently in ext4 and fixed in
commit 1c8349a ("ext4: fix data integrity sync in ordered mode")
without communication to the wider filesystem community. We can use
exactly the same fix here so the TOWRITE flag is not cleared on
partial page writes.

cc: stable@vger.kernel.org # dependent on 1c8349a17137b93f0a83f276c764a6df1b9a116e
Root-cause-found-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: trim eofblocks before collapse range

xfs_collapse_file_space() currently writes back the entire file
undergoing collapse range to settle things down for the extent shift
algorithm. While this prevents changes to the extent list during the
collapse operation, the writeback itself is not enough to prevent
unnecessary collapse failures.

The current shift algorithm uses the extent index to iterate the in-core
extent list. If a post-eof delalloc extent persists after the writeback
(e.g., a prior zero range op where the end of the range aligns with eof
can separate the post-eof blocks such that they are not written back and
converted), xfs_bmap_shift_extents() becomes confused over the encoded
br_startblock value and fails the collapse.

As with the full writeback, this is a temporary fix until the algorithm
is improved to cope with a volatile extent list and avoid attempts to
shift post-eof extents.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: xfs_file_collapse_range is delalloc challenged

If we have delalloc extents on a file before we run a collapse range
opertaion, we sync the range that we are going to collapse to
convert delalloc extents in that region to real extents to simplify
the shift operation.

However, the shift operation then assumes that the extent list is
not going to change as it iterates over the extent list moving
things about. Unfortunately, this isn't true because we can't hold
the ILOCK over all the operations. We can prevent new IO from
modifying the extent list by holding the IOLOCK, but that doesn't
prevent writeback from running....

And when writeback runs, it can convert delalloc extents is the
range of the file prior to the region being collapsed, and this
changes the indexes of all the extents in the file. That causes the
collapse range operation to Go Bad.

The right fix is to rewrite the extent shift operation not to be
dependent on the extent list not changing across the entire
operation, but this is a fairly significant piece of work to do.
Hence, as a short-term workaround for the problem, sync the entire
file before starting a collapse operation to remove all delalloc
ranges from the file and so avoid the problem of concurrent
writeback changing the extent list.

Diagnosed-and-Reported-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: don't log inode unless extent shift makes extent modifications

The file collapse mechanism uses xfs_bmap_shift_extents() to collapse
all subsequent extents down into the specified, previously punched out,
region. This function performs some validation, such as whether a
sufficient hole exists in the target region of the collapse, then shifts
the remaining exents downward.

The exit path of the function currently logs the inode unconditionally.
While we must log the inode (and abort) if an error occurs and the
transaction is dirty, the initial validation paths can generate errors
before the transaction has been dirtied. This creates an unnecessary
filesystem shutdown scenario, as the caller will cancel a transaction
that has been marked dirty.

Modify xfs_bmap_shift_extents() to OR the logflags bits as modifications
are made to the inode bmap. Only log the inode in the exit path if
logflags has been set. This ensures we only have to cancel a dirty
transaction if modifications have been made and prevents an unnecessary
filesystem shutdown otherwise.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: use ranged writeback and invalidation for direct IO

Now we are not doing silly things with dirtying buffers beyond EOF
and using invalidation correctly, we can finally reduce the ranges of
writeback and invalidation used by direct IO to match that of the IO
being issued.

Bring the writeback and invalidation ranges back to match the
generic direct IO code - this will greatly reduce the perturbation
of cached data when direct IO and buffered IO are mixed, but still
provide the same buffered vs direct IO coherency behaviour we
currently have.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: don't zero partial page cache pages during O_DIRECT writes

Similar to direct IO reads, direct IO writes are using
truncate_pagecache_range to invalidate the page cache. This is
incorrect due to the sub-block zeroing in the page cache that
truncate_pagecache_range() triggers.

This patch fixes things by using invalidate_inode_pages2_range
instead. It preserves the page cache invalidation, but won't zero
any pages.

cc: stable@vger.kernel.org
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: don't zero partial page cache pages during O_DIRECT writes

xfs is using truncate_pagecache_range to invalidate the page cache
during DIO reads.  This is different from the other filesystems who
only invalidate pages during DIO writes.

truncate_pagecache_range is meant to be used when we are freeing the
underlying data structs from disk, so it will zero any partial
ranges in the page.  This means a DIO read can zero out part of the
page cache page, and it is possible the page will stay in cache.

buffered reads will find an up to date page with zeros instead of
the data actually on disk.

This patch fixes things by using invalidate_inode_pages2_range
instead.  It preserves the page cache invalidation, but won't zero
any pages.

[dchinner: catch error and warn if it fails. Comment.]

cc: stable@vger.kernel.org
Signed-off-by: Chris Mason <clm@fb.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>

xfs: don't dirty buffers beyond EOF

generic/263 is failing fsx at this point with a page spanning
EOF that cannot be invalidated. The operations are:

1190 mapwrite   0x52c00 thru    0x5e569 (0xb96a bytes)
1191 mapread    0x5c000 thru    0x5d636 (0x1637 bytes)
1192 write      0x5b600 thru    0x771ff (0x1bc00 bytes)

where 1190 extents EOF from 0x54000 to 0x5e569. When the direct IO
write attempts to invalidate the cached page over this range, it
fails with -EBUSY and so any attempt to do page invalidation fails.

The real question is this: Why can't that page be invalidated after
it has been written to disk and cleaned?

Well, there's data on the first two buffers in the page (1k block
size, 4k page), but the third buffer on the page (i.e. beyond EOF)
is failing drop_buffers because it's bh->b_state == 0x3, which is
BH_Uptodate | BH_Dirty.  IOWs, there's dirty buffers beyond EOF. Say
what?

OK, set_buffer_dirty() is called on all buffers from
__set_page_buffers_dirty(), regardless of whether the buffer is
beyond EOF or not, which means that when we get to ->writepage,
we have buffers marked dirty beyond EOF that we need to clean.
So, we need to implement our own .set_page_dirty method that
doesn't dirty buffers beyond EOF.

This is messy because the buffer code is not meant to be shared
and it has interesting locking issues on the buffer dirty bits.
So just copy and paste it and then modify it to suit what we need.

Note: the solutions the other filesystems and generic block code use
of marking the buffers clean in ->writepage does not work for XFS.
It still leaves dirty buffers beyond EOF and invalidations still
fail. Hence rather than play whack-a-mole, this patch simply
prevents those buffers from being dirtied in the first place.

cc: <stable@kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>

Linux 3.17-rc2

Merge tag 'nfs-for-3.17-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client fixes from Trond Myklebust:
"Highlights:

   - more fixes for read/write codepath regressions
     * sleeping while holding the inode lock
     * stricter enforcement of page contiguity when coalescing requests
     * fix up error handling in the page coalescing code

   - don't busy wait on SIGKILL in the file locking code"

* tag 'nfs-for-3.17-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  nfs: Don't busy-wait on SIGKILL in __nfs_iocounter_wait
  nfs: can_coalesce_requests must enforce contiguity
  nfs: disallow duplicate pages in pgio page vectors
  nfs: don't sleep with inode lock in lock_and_join_requests
  nfs: fix error handling in lock_and_join_requests
  nfs: use blocking page_group_lock in add_request
  nfs: fix nonblocking calls to nfs_page_group_lock
  nfs: change nfs_page_group_lock argument

Merge tag 'renesas-sh-drivers-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas

Pull SH driver fix from Simon Horman:
"Confine SH_INTC to platforms that need it"

* tag 'renesas-sh-drivers-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
sh: intc: Confine SH_INTC to platforms that need it

Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus

Pull MIPS fixes from Ralf Baechle:
"Pretty much all across the field so with this we should be in
  reasonable shape for the upcoming -rc2"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: OCTEON: make get_system_type() thread-safe
  MIPS: CPS: Initialize EVA before bringing up VPEs from secondary cores
  MIPS: Malta: EVA: Rename 'eva_entry' to 'platform_eva_init'
  MIPS: EVA: Add new EVA header
  MIPS: scall64-o32: Fix indirect syscall detection
  MIPS: syscall: Fix AUDIT value for O32 processes on MIPS64
  MIPS: Loongson: Fix COP2 usage for preemptible kernel
  MIPS: NL: Fix nlm_xlp_defconfig build error
  MIPS: Remove race window in page fault handling
  MIPS: Malta: Improve system memory detection for '{e, }memsize' >= 2G
  MIPS: Alchemy: Fix db1200 PSC clock enablement
  MIPS: BCM47XX: Fix reboot problem on BCM4705/BCM4785
  MIPS: Remove duplicated include from numa.c
  MIPS: Add common plat_irq_dispatch declaration
  MIPS: MSP71xx: remove unused plat_irq_dispatch() argument
  MIPS: GIC: Remove useless parens from GICBIS().
  MIPS: perf: Mark pmu interupt IRQF_NO_THREAD

Merge tag 'trace-fixes-v3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull fix for ftrace function tracer/profiler conflict from Steven Rostedt:
"The rewrite of the ftrace code that makes it possible to allow for
  separate trampolines had a design flaw with the interaction between
  the function and function_graph tracers.

  The main flaw was the simplification of the use of multiple tracers
  having the same filter (like function and function_graph, that use the
  set_ftrace_filter file to filter their code).  The design assumed that
  the two tracers could never run simultaneously as only one tracer can
  be used at a time.  The problem with this assumption was that the
  function profiler could be implemented on top of the function graph
  tracer, and the function profiler could run at the same time as the
  function tracer.  This caused the assumption to be broken and when
  ftrace detected this failed assumpiton it would spit out a nasty
  warning and shut itself down.

  Instead of using a single ftrace_ops that switches between the
  function and function_graph callbacks, the two tracers can again use
  their own ftrace_ops.  But instead of having a complex hierarchy of
  ftrace_ops, the filter fields are placed in its own structure and the
  ftrace_ops can carefully use the same filter.  This change took a bit
  to be able to allow for this and currently only the global_ops can
  share the same filter, but this new design can easily be modified to
  allow for any ftrace_ops to share its filter with another ftrace_ops.

  The first four patches deal with the change of allowing the ftrace_ops
  to share the filter (and this needs to go to 3.16 as well).

  The fifth patch fixes a bug that was also caused by the new changes
  but only for archs other than x86, and only if those archs implement a
  direct call to the function_graph tracer which they do not do yet but
  will in the future.  It does not need to go to stable, but needs to be
  fixed before the other archs update their code to allow direct calls
  to the function_graph trampoline"

* tag 'trace-fixes-v3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Use current addr when converting to nop in __ftrace_replace_code()
  ftrace: Fix function_profiler and function tracer together
  ftrace: Fix up trampoline accounting with looping on hash ops
  ftrace: Update all ftrace_ops for a ftrace_hash_ops update
  ftrace: Allow ftrace_ops to use the hashes from other ops

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"A couple of EFI fixes, plus misc fixes all around the map"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi/arm64: Store Runtime Services revision
  firmware: Do not use WARN_ON(!spin_is_locked())
  x86_32, entry: Clean up sysenter_badsys declaration
  x86/doc: Fix the 'tlb_single_page_flush_ceiling' sysconfig path
  x86/mm: Fix sparse 'tlb_single_page_flush_ceiling' warning and make the variable read-mostly
  x86/mm: Fix RCU splat from new TLB tracepoints

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"A kprobes and a perf compat ioctl fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Handle compat ioctl
kprobes: Skip kretprobe hit in NMI context to avoid deadlock

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
"A collection of fixes from this week, it's been pretty quiet and
  nothing really stands out as particularly noteworthy here -- mostly
  minor fixes across the field:

   - ODROID booting was fixed due to PMIC interrupts missing in DT
   - a collection of i.MX fixes
   - minor Tegra fix for regulators
   - Rockchip fix and addition of SoC-specific mailing list to make it
     easier to find posted patches"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  bus: arm-ccn: Fix warning message
  ARM: shmobile: koelsch: Remove non-existent i2c6 pinmux
  ARM: tegra: apalis/colibri t30: fix on-module 5v0 supplies
  MAINTAINERS: add new Rockchip SoC list
  ARM: dts: rockchip: readd missing mmc0 pinctrl settings
  ARM: dts: ODROID i2c improvements
  ARM: dts: Enable PMIC interrupts on ODROID
  ARM: dts: imx6sx: fix the pad setting for uart CTS_B
  ARM: dts: i.MX53: fix apparent bug in VPU clks
  ARM: imx: correct gpu2d_axi and gpu3d_axi clock setting
  ARM: dts: imx6: edmqmx6: change enet reset pin
  ARM: dts: vf610-twr: Fix pinctrl_esdhc1 pin definitions.
  ARM: imx: remove unnecessary ARCH_HAS_OPP select
  ARM: imx: fix TLB missing of IOMUXC base address during suspend
  ARM: imx6: fix SMP compilation again
  ARM: dt: sun6i: Add #address-cells and #size-cells to i2c controller nodes

Merge tag 'gpio-v3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull gpio fixes from Linus Walleij:

- a largeish fix for the IRQ handling in the new Zynq driver.  The
   quite verbose commit message gives the exact details.
- move some defines for gpiod flags outside an ifdef to make stub
   functions work again.
- various minor fixes that we can accept for -rc1.

* tag 'gpio-v3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio-lynxpoint: enable input sensing in resume
  gpio: move GPIOD flags outside #ifdef
  gpio: delete unneeded test before of_node_put
  gpio: zynq: Fix IRQ handlers
  gpiolib: devres: use correct structure type name in sizeof
  MAINTAINERS: Change maintainer for gpio-bcm-kona.c

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"Intel and radeon fixes.

  Post KS/LC git requests from i915 and radeon stacked up.  They are all
  fixes along with some new pci ids for radeon, and one maintainers file
  entry.

   - i915: display fixes and irq fixes
   - radeon: pci ids, and misc gpuvm, dpm and hdp cache"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (29 commits)
  MAINTAINERS: Add entry for Renesas DRM drivers
  drm/radeon: add additional SI pci ids
  drm/radeon: add new bonaire pci ids
  drm/radeon: add new KV pci id
  Revert "drm/radeon: Use write-combined CPU mappings of ring buffers with PCIe"
  drm/radeon: fix active_cu mask on SI and CIK after re-init (v3)
  drm/radeon: fix active cu count for SI and CIK
  drm/radeon: re-enable selective GPUVM flushing
  drm/radeon: Sync ME and PFP after CP semaphore waits v4
  drm/radeon: fix display handling in radeon_gpu_reset
  drm/radeon: fix pm handling in radeon_gpu_reset
  drm/radeon: Only flush HDP cache for indirect buffers from userspace
  drm/radeon: properly document reloc priority mask
  drm/i915: don't try to retrain a DP link on an inactive CRTC
  drm/i915: make sure VDD is turned off during system suspend
  drm/i915: cancel hotplug and dig_port work during suspend and unload
  drm/i915: fix HPD IRQ reenable work cancelation
  drm/i915: take display port power domain in DP HPD handler
  drm/i915: Don't try to enable cursor from setplane when crtc is disabled
  drm/i915: Skip load detect when intel_crtc->new_enable==true
  ...

aio: fix reqs_available handling

As reported by Dan Aloni, commit f8567a3845ac ("aio: fix aio request
leak when events are reaped by userspace") introduces a regression when
user code attempts to perform io_submit() with more events than are
available in the ring buffer.  Reverting that commit would reintroduce a
regression when user space event reaping is used.

Fixing this bug is a bit more involved than the previous attempts to fix
this regression.  Since we do not have a single point at which we can
count events as being reaped by user space and io_getevents(), we have
to track event completion by looking at the number of events left in the
event ring.  So long as there are as many events in the ring buffer as
there have been completion events generate, we cannot call
put_reqs_available().  The code to check for this is now placed in
refill_reqs_available().

A test program from Dan and modified by me for verifying this bug is available
at http://www.kvack.org/~bcrl/20140824-aio_bug.c .

Reported-by: Dan Aloni <dan@kernelim.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Acked-by: Dan Aloni <dan@kernelim.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: stable@vger.kernel.org # v3.16 and anything that f8567a3845ac was backported to
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bus: arm-ccn: Fix warning message

A message warning a user about wrong vc value was printing
out port instead.

Reported-by: Drew Richardson <drew.richardson@arm.com>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Olof Johansson <olof@lixom.net>

ARM: shmobile: koelsch: Remove non-existent i2c6 pinmux

On r8a7791, i2c6 (aka iic3) doesn't need pinmux, but the koelsch dts
refers to non-existent pinmux configuration data:

pinmux core: sh-pfc does not support function i2c6
sh-pfc e6060000.pfc: invalid function i2c6 in map table

Remove it to fix this.

Fixes: commit 1d41f36a68c0f4e9b01d563ce33bab5201858b54 ("ARM: shmobile:
koelsch dts: Add VDD MPU regulator for DVFS")

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Olof Johansson <olof@lixom.net>

ARM: tegra: apalis/colibri t30: fix on-module 5v0 supplies

Working on Gigabit/PCIe support in U-Boot for Apalis T30 I realised
that the current device tree source includes for our modules only
happen to work due to referencing the on-carrier 5v0 supply from USB
which is not at all available on-module. The modules actually contain
TPS60150 charge pumps to generate the PMIC required 5 volts from the
one and only 3.3 volt module supply. This patch fixes this.

(Note: When back-porting this to v3.16 stable releases, simply drop the
change to tegra30-apalis.dtsi; that file was added in v3.17)

Cc: <stable@vger.kernel.org> #v3.16+
Signed-off-by: Marcel Ziswiler <marcel@ziswiler.com>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Olof Johansson <olof@lixom.net>

Merge tag 'v3.17-rockchip-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes

Merge "ARM: rockchip: fix for 3.17" from Heiko Stubner:

Pinctrl that got accidentially dropped when reorganizing the
dts files and addition of the new Rockchip list to MAINTAINERS.

* tag 'v3.17-rockchip-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
MAINTAINERS: add new Rockchip SoC list
ARM: dts: rockchip: readd missing mmc0 pinctrl settings

Signed-off-by: Olof Johansson <olof@lixom.net>

MAINTAINERS: Add entry for Renesas DRM drivers

Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>

Merge branch 'drm-fixes-3.17' of git://people.freedesktop.org/~agd5f/linux into drm-next

This pull just contains some new pci ids.

* 'drm-fixes-3.17' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: add additional SI pci ids
  drm/radeon: add new bonaire pci ids
  drm/radeon: add new KV pci id

MAINTAINERS: add new Rockchip SoC list

Add the new list that Rockchip-specific patches should also be directed to.

Signed-off-by: Heiko Stuebner <heiko@sntech.de>

ARM: dts: rockchip: readd missing mmc0 pinctrl settings

During the restructuring of the Rockchip Cortex-A9 dtsi files it seems
like the pinctrl settings vanished at some point from the mmc0 support.

This of course renders them unusable, so readd the necessary pinctrl
properties.

Signed-off-by: Heiko Stuebner <heiko@sntech.de>

Merge tag 'sunxi-dt-for-3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into fixes

Merge "Allwinner DT changes, take 2" from Maxime Ripard:

Only a single patch in here that fixes a DTC warning.

* tag 'sunxi-dt-for-3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
ARM: dt: sun6i: Add #address-cells and #size-cells to i2c controller nodes

Signed-off-by: Olof Johansson <olof@lixom.net>

ftrace: Use current addr when converting to nop in __ftrace_replace_code()

In __ftrace_replace_code(), when converting the call to a nop in a function
it needs to compare against the "curr" (current) value of the ftrace ops, and
not the "new" one. It currently does not affect x86 which is the only arch
to do the trampolines with function graph tracer, but when other archs that do
depend on this code implement the function graph trampoline, it can crash.

Here's an example when ARM uses the trampolines (in the future):

------------[ cut here ]------------
WARNING: CPU: 0 PID: 9 at kernel/trace/ftrace.c:1716 ftrace_bug+0x17c/0x1f4()
Modules linked in: omap_rng rng_core ipv6
CPU: 0 PID: 9 Comm: migration/0 Not tainted 3.16.0-test-10959-gf0094b28f303-dirty #52
[<c02188f4>] (unwind_backtrace) from [<c021343c>] (show_stack+0x20/0x24)
[<c021343c>] (show_stack) from [<c095a674>] (dump_stack+0x78/0x94)
[<c095a674>] (dump_stack) from [<c02532a0>] (warn_slowpath_common+0x7c/0x9c)
[<c02532a0>] (warn_slowpath_common) from [<c02532ec>] (warn_slowpath_null+0x2c/0x34)
[<c02532ec>] (warn_slowpath_null) from [<c02cbac4>] (ftrace_bug+0x17c/0x1f4)
[<c02cbac4>] (ftrace_bug) from [<c02cc44c>] (ftrace_replace_code+0x80/0x9c)
[<c02cc44c>] (ftrace_replace_code) from [<c02cc658>] (ftrace_modify_all_code+0xb8/0x164)
[<c02cc658>] (ftrace_modify_all_code) from [<c02cc718>] (__ftrace_modify_code+0x14/0x1c)
[<c02cc718>] (__ftrace_modify_code) from [<c02c7244>] (multi_cpu_stop+0xf4/0x134)
[<c02c7244>] (multi_cpu_stop) from [<c02c6e90>] (cpu_stopper_thread+0x54/0x130)
[<c02c6e90>] (cpu_stopper_thread) from [<c0271cd4>] (smpboot_thread_fn+0x1ac/0x1bc)
[<c0271cd4>] (smpboot_thread_fn) from [<c026ddf0>] (kthread+0xe0/0xfc)
[<c026ddf0>] (kthread) from [<c020f318>] (ret_from_fork+0x14/0x20)
---[ end trace dc9ce72c5b617d8f ]---
[ 65.047264] ftrace failed to modify [<c0208580>] asm_do_IRQ+0x10/0x1c
[ 65.054070] actual: 85:1b:00:eb

Fixes: 7413af1fb70e7 "ftrace: Make get_ftrace_addr() and get_ftrace_addr_old() global"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

ftrace: Fix function_profiler and function tracer together

The latest rewrite of ftrace removed the separate ftrace_ops of
the function tracer and the function graph tracer and had them
share the same ftrace_ops. This simplified the accounting by removing
the multiple layers of functions called, where the global_ops func
would call a special list that would iterate over the other ops that
were registered within it (like function and function graph), which
itself was registered to the ftrace ops list of all functions
currently active. If that sounds confusing, the code that implemented
it was also confusing and its removal is a good thing.

The problem with this change was that it assumed that the function
and function graph tracer can never be used at the same time.
This is mostly true, but there is an exception. That is when the
function profiler uses the function graph tracer to profile.
The function profiler can be activated the same time as the function
tracer, and this breaks the assumption and the result is that ftrace
will crash (it detects the error and shuts itself down, it does not
cause a kernel oops).

To solve this issue, a previous change allowed the hash tables
for the functions traced by a ftrace_ops to be a pointer and let
multiple ftrace_ops share the same hash. This allows the function
and function_graph tracer to have separate ftrace_ops, but still
share the hash, which is what is done.

Now the function and function graph tracers have separate ftrace_ops
again, and the function tracer can be run while the function_profile
is active.

Cc: stable@vger.kernel.org # 3.16 (apply after 3.17-rc4 is out)
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

nfs: Don't busy-wait on SIGKILL in __nfs_iocounter_wait

If a SIGKILL is sent to a task waiting in __nfs_iocounter_wait,
it will busy-wait or soft lockup in its while loop.
nfs_wait_bit_killable won't sleep, and the loop won't exit on
the error return.

Stop the busy-wait by breaking out of the loop when
nfs_wait_bit_killable returns an error.

Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: can_coalesce_requests must enforce contiguity

Commit 6094f83864c1d1296566a282cba05ba613f151ee
"nfs: allow coalescing of subpage requests" got rid of the requirement
that requests cover whole pages, but it made some incorrect assumptions.

It turns out that callers of this interface can map adjacent requests
(by file position as seen by req_offset + req->wb_bytes) to different pages,
even when they could share a page. An example is the direct I/O interface -
iov_iter_get_pages_alloc may return one segment with a partial page filled
and the next segment (which is adjacent in the file position) starts with a
new page.

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: disallow duplicate pages in pgio page vectors

Adjacent requests that share the same page are allowed, but should only
use one entry in the page vector. This avoids overruning the page
vector - it is sized based on how many bytes there are, not by
request count.

This fixes issues that manifest as "Redzone overwritten" bugs (the
vector overrun) and hangs waiting on page read / write, as it waits on
the same page more than once.

This also adds bounds checking to the page vector with a graceful failure
(WARN_ON_ONCE and pgio error returned to application).

Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: don't sleep with inode lock in lock_and_join_requests

This handles the 'nonblock=false' case in nfs_lock_and_join_requests.
If the group is already locked and blocking is allowed, drop the inode lock
and wait for the group lock to be cleared before trying it all again.
This should fix warnings found in peterz's tree (sched/wait branch), where
might_sleep() checks are added to wait.[ch].

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: fix error handling in lock_and_join_requests

This fixes handling of errors from nfs_page_group_lock in
nfs_lock_and_join_requests. It now releases the inode lock and the
reference to the head request.

Reported-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: use blocking page_group_lock in add_request

__nfs_pageio_add_request was calling nfs_page_group_lock nonblocking, but
this can return -EAGAIN which would end up passing -EIO to the application.

There is no reason not to block in this path, so change the two calls to
do so. Also, there is no need to check the return value of
nfs_page_group_lock when nonblock=false, so remove the error handling code.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: fix nonblocking calls to nfs_page_group_lock

nfs_page_group_lock was calling wait_on_bit_lock even when told not to
block. Fix by first trying test_and_set_bit, followed by wait_on_bit_lock
if and only if blocking is allowed. Return -EAGAIN if nonblocking and the
test_and_set of the bit was already locked.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

nfs: change nfs_page_group_lock argument

Flip the meaning of the second argument from 'wait' to 'nonblock' to
match related functions. Update all five calls to reflect this change.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Reviewed-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

Merge tag 'pwm/for-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm

Pull pwm fix from Thierry Reding:
"Just one bugfix for the PWM lookup table code that would cause a PWM
  channel to be set to the wrong period and polarity for non-perfect
  matches"

* tag 'pwm/for-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
  pwm: Fix period and polarity in pwm_get() for non-perfect matches

mac80211: fix channel switch for chanctx-based drivers

The new_ctx pointer is set only for non-chanctx drivers.  This yielded a
crash for chanctx-based drivers during channel switch finalization:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  IP: ieee80211_vif_use_reserved_switch+0x71c/0xb00 [mac80211]

Use an adequate chanctx pointer to fix this.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
"Here are some bug fixes that have piled up during ksummit/linuxcon.

   1) Fix endian problems in ibmveth, from Anton Blanchard.

   2) IPV6 routing code does GFP_KERNEL allocation in atomic, fix from
      Benjamin Block.

   3) SCTP association fixes from Daniel Borkmann.

   4) When multiple VLAN headers are present we have to make sure the
      second and subsequent ones are pullable in the SKB otherwise we
      blindly dereference garbage.  From Jiri Benc.

   5) The argument adjustment of the signature of hlist_add_after*()
      introduced a regression in the batman-adv code, fix from Sven
      Eckelmann.

   6) Fix TX hang handling to avoid a panic in i40e, from Anjali Singhai
      Jain.

   7) PTP flag test is inverted in i40e driver, from Jesse Brandeburg.

   8) ATM LEC driver needs to hold RTNL mutex over MTU changes, from
      Chas Williams.

   9) Truncate packets larger then the TPACKET_V3 format configured
      buffers, otherwise we overwrite past the end of said buffers.
      From Eric Dumazet.

  10) Fix endianness bugs in qlcnic firmware handling, from Rajesh
      Borundia and Shahed Shaikh.

  11) CXGB4 sometimes doesn't get all of the TX completion events it
      should resulting in SKBs getting stuck in the TX queue, from
      Hariprasad Shenai.

  12) When the FEC chip's PTP clock is disabled, you can't access the
      register.  Add necessary checks to avoid the resulting hang, from
      Fugang Duan"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (37 commits)
  drivers: isdn: eicon: xdi_msg.h: Fix typo in #ifndef
  net: sctp: fix suboptimal edge-case on non-active active/retrans path selection
  net: sctp: spare unnecessary comparison in sctp_trans_elect_best
  net: ethernet: broadcom: bnx2x: Remove redundant #ifdef
  ibmveth: Fix endian issues with rx_no_buffer statistic
  net: xgene: fix possible NULL dereference in xgene_enet_free_desc_rings()
  openvswitch: fix panic with multiple vlan headers
  net: ipv6: fib: don't sleep inside atomic lock
  net: fec: ptp: avoid register access when ipg clock is disabled
  cxgb4: Free completed tx skbs promptly
  cxgb4: Fix race condition in cleanup
  sctp: not send SCTP_PEER_ADDR_CHANGE notifications with failed probe
  bnx2x: Revert UNDI flushing mechanism
  qlcnic: Fix endianess issue in firmware load from file operation
  qlcnic: Fix endianess issue in FW dump template header
  qlcnic: Fix flash access interface to application
  MAINTAINERS: Add section for MRF24J40 IEEE 802.15.4 radio driver
  macvlan: Allow setting multicast filter on all macvlan types
  packet: handle too big packets for PACKET_V3
  MAINTAINERS: add entry for ec_bhf driver
  ...

ftrace: Fix up trampoline accounting with looping on hash ops

Now that a ftrace_hash can be shared by multiple ftrace_ops, they can dec
the rec->flags by more than once (one per those that share the ftrace_hash).
This means that the tramp_hash may not have a hash item when it was added.

For example, if two ftrace_ops share a hash for a ftrace record, and the
first ops has a trampoline, when it adds itself it will set the rec->flags
TRAMP flag and increments its nr_trampolines counter. When the second ops
is added, it must clear that tramp flag but also decrement the other ops
that shares its hash. As the update to the function callbacks has not yet
been performed, the other ops will not have the tramp hash set yet and it
can not be used to know to decrement its nr_trampolines.

Luckily, the tramp_hash does not need to be used. As the ftrace_mutex is
held, a ops with a trampoline to a record during an update of another ops
that shares the record will have its func_hash pointing to it. Since a
trampoline can only be set for a record if only one ops is attached to it,
we can just check if the record has a trampoline (the FTRACE_FL_TRAMP flag
is set) and then find the ops that has this record in its hashes.

Also added some output to help debug when things go wrong.

Cc: stable@vger.kernel.org # 3.16+ (apply after 3.17-rc4 is out)
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

drivers: isdn: eicon: xdi_msg.h: Fix typo in #ifndef

Test for definedness of the macro which is actually defined (the
change is hard to see: it is s/SSS/SSA/).

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sctp: fix suboptimal edge-case on non-active active/retrans path selection

In SCTP, selection of active (T.ACT) and retransmission (T.RET)
transports is being done whenever transport control operations
(UP, DOWN, PF, ...) are engaged through sctp_assoc_control_transport().

Commits 4c47af4d5eb2 ("net: sctp: rework multihoming retransmission
path selection to rfc4960") and a7288c4dd509 ("net: sctp: improve
sctp_select_active_and_retran_path selection") have both improved
it towards a more fine-grained and optimal path selection.

Currently, the selection algorithm for T.ACT and T.RET is as follows:

1) Elect the two most recently used ACTIVE transports T1, T2 for
   T.ACT, T.RET, where T.ACT<-T1 and T1 is most recently used
2) In case primary path T.PRI not in {T1, T2} but ACTIVE, set
   T.ACT<-T.PRI and T.RET<-T1
3) If only T1 is ACTIVE from the set, set T.ACT<-T1 and T.RET<-T1
4) If none is ACTIVE, set T.ACT<-best(T.PRI, T.RET, T3) where
   T3 is the most recently used (if avail) in PF, set T.RET<-T.PRI

Prior to above commits, 4) was simply a camp on T.ACT<-T.PRI and
T.RET<-T.PRI, ignoring possible paths in PF. Camping on T.PRI is
still slightly suboptimal as it can lead to the following scenario:

Setup:
        <A>                                <B>
    T1: p1p1 (10.0.10.10) <==>  .'`)  <==> p1p1 (10.0.10.12)  <= T.PRI
    T2: p1p2 (10.0.10.20) <==> (_ . ) <==> p1p2 (10.0.10.22)

    net.sctp.rto_min = 1000
    net.sctp.path_max_retrans = 2
    net.sctp.pf_retrans = 0
    net.sctp.hb_interval = 1000

T.PRI is permanently down, T2 is put briefly into PF state (e.g. due to
link flapping). Here, the first time transmission is sent over PF path
T2 as it's the only non-INACTIVE path, but the retransmitted data-chunks
are sent over the INACTIVE path T1 (T.PRI), which is not good.

After the patch, it's choosing better transports in both cases by
modifying step 4):

4) If none is ACTIVE, set T.ACT_new<-best(T.ACT_old, T3) where T3 is
   the most recently used (if avail) in PF, set T.RET<-T.ACT_new

This will still select a best possible path in PF if available (which
can also include T.PRI/T.RET), and set both T.ACT/T.RET to it.

In case sctp_assoc_control_transport() *just* put T.ACT_old into INACTIVE
as it transitioned from ACTIVE->PF->INACTIVE and stays in INACTIVE just
for a very short while before going back ACTIVE, it will guarantee that
this path will be reselected for T.ACT/T.RET since T3 (PF) is not
available.

Previously, this was not possible, as we would only select between T.PRI
and T.RET, and a possible T3 would be NULL due to the fact that we have
just transitioned T3 in sctp_assoc_control_transport() from PF->INACTIVE
and would select a suboptimal path when T.PRI/T.RET have worse properties.

In the case that T.ACT_old permanently went to INACTIVE during this
transition and there's no PF path available, plus T.PRI and T.RET are
INACTIVE as well, we would now camp on T.ACT_old, but if everything is
being INACTIVE there's really not much we can do except hoping for a
successful HB to bring one of the transports back up again and, thus
cause a new selection through sctp_assoc_control_transport().

Now both tests work fine:

Case 1:

1. T1 S(ACTIVE) T.ACT
    T2 S(ACTIVE) T.RET

2. T1 S(ACTIVE) T.ACT, T.RET
    T2 S(PF)

3. T1 S(ACTIVE) T.ACT, T.RET
    T2 S(INACTIVE)

5. T1 S(PF) T.ACT, T.RET
    T2 S(INACTIVE)

[ 5.1 T1 S(INACTIVE) T.ACT, T.RET
      T2 S(INACTIVE) ]

6. T1 S(ACTIVE) T.ACT, T.RET
    T2 S(INACTIVE)

7. T1 S(ACTIVE) T.ACT
    T2 S(ACTIVE) T.RET

Case 2:

1. T1 S(ACTIVE) T.ACT
    T2 S(ACTIVE) T.RET

2. T1 S(PF)
    T2 S(ACTIVE) T.ACT, T.RET

3. T1 S(INACTIVE)
    T2 S(ACTIVE) T.ACT, T.RET

5. T1 S(INACTIVE)
    T2 S(PF) T.ACT, T.RET

[ 5.1 T1 S(INACTIVE)
      T2 S(INACTIVE) T.ACT, T.RET ]

6. T1 S(INACTIVE)
    T2 S(ACTIVE) T.ACT, T.RET

7. T1 S(ACTIVE) T.ACT
    T2 S(ACTIVE) T.RET

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sctp: spare unnecessary comparison in sctp_trans_elect_best

When both transports are the same, we don't have to go down that
road only to realize that we will return the very same transport.
We are guaranteed that curr is always non-NULL. Therefore, just
short-circuit this special case.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: broadcom: bnx2x: Remove redundant #ifdef

Nothing defines _ASM_GENERIC_INT_L64_H, it is a weird way to check for
64 bit longs, and u64 should be printed using %llx anyway.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmveth: Fix endian issues with rx_no_buffer statistic

Hidden away in the last 8 bytes of the buffer_list page is a solitary
statistic. It needs to be byte swapped or else ethtool -S will
produce numbers that terrify the user.

Since we do this in multiple places, create a helper function with a
comment explaining what is going on.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

net: xgene: fix possible NULL dereference in xgene_enet_free_desc_rings()

A NULL pointer dereference is possible for the argument ring->buf_pool
which is passed to xgene_enet_free_desc_ring(), as ring could be NULL.

And now since NULL pointers are being checked for before the calls to
xgene_enet_free_desc_ring(), might as well take advantage of them and
not call the function if the argument would be NULL.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

openvswitch: fix panic with multiple vlan headers

When there are multiple vlan headers present in a received frame, the first
one is put into vlan_tci and protocol is set to ETH_P_8021Q. Anything in the
skb beyond the VLAN TPID may be still non-linear, including the inner TCI
and ethertype. While ovs_flow_extract takes care of IP and IPv6 headers, it
does nothing with ETH_P_8021Q. Later, if OVS_ACTION_ATTR_POP_VLAN is
executed, __pop_vlan_tci pulls the next vlan header into vlan_tci.

This leads to two things:

1. Part of the resulting ethernet header is in the non-linear part of the
   skb. When eth_type_trans is called later as the result of
   OVS_ACTION_ATTR_OUTPUT, kernel BUGs in __skb_pull. Also, __pop_vlan_tci
   is in fact accessing random data when it reads past the TPID.

2. network_header points into the ethernet header instead of behind it.
   mac_len is set to a wrong value (10), too.

Reported-by: Yulong Pei <ypei@redhat.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipv6: fib: don't sleep inside atomic lock

The function fib6_commit_metrics() allocates a piece of memory in mode
GFP_KERNEL while holding an atomic lock from higher up in the stack, in
the function __ip6_ins_rt(). This produces the following BUG:

> BUG: sleeping function called from invalid context at mm/slub.c:1250
> in_atomic(): 1, irqs_disabled(): 0, pid: 2909, name: dhcpcd
> 2 locks held by dhcpcd/2909:
>  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff81978e67>] rtnl_lock+0x17/0x20
>  #1:  (&tb->tb6_lock){++--+.}, at: [<ffffffff81a6951a>] ip6_route_add+0x65a/0x800
> CPU: 1 PID: 2909 Comm: dhcpcd Not tainted 3.17.0-rc1 #1
> Hardware name: ASUS All Series/Q87T, BIOS 0216 10/16/2013
>  0000000000000008 ffff8800c8f13858 ffffffff81af135a 0000000000000000
>  ffff880212202430 ffff8800c8f13878 ffffffff810f8d3a ffff880212202c98
>  0000000000000010 ffff8800c8f138c8 ffffffff8121ad0e 0000000000000001
> Call Trace:
>  [<ffffffff81af135a>] dump_stack+0x4e/0x68
>  [<ffffffff810f8d3a>] __might_sleep+0x10a/0x120
>  [<ffffffff8121ad0e>] kmem_cache_alloc_trace+0x4e/0x190
>  [<ffffffff81a6bcd6>] ? fib6_commit_metrics+0x66/0x110
>  [<ffffffff81a6bcd6>] fib6_commit_metrics+0x66/0x110
>  [<ffffffff81a6cbf3>] fib6_add+0x883/0xa80
>  [<ffffffff81a6951a>] ? ip6_route_add+0x65a/0x800
>  [<ffffffff81a69535>] ip6_route_add+0x675/0x800
>  [<ffffffff81a68f2a>] ? ip6_route_add+0x6a/0x800
>  [<ffffffff81a6990c>] inet6_rtm_newroute+0x5c/0x80
>  [<ffffffff8197cf01>] rtnetlink_rcv_msg+0x211/0x260
>  [<ffffffff81978e67>] ? rtnl_lock+0x17/0x20
>  [<ffffffff81119708>] ? lock_release_holdtime+0x28/0x180
>  [<ffffffff81978e67>] ? rtnl_lock+0x17/0x20
>  [<ffffffff8197ccf0>] ? __rtnl_unlock+0x20/0x20
>  [<ffffffff819a989e>] netlink_rcv_skb+0x6e/0xd0
>  [<ffffffff81978ee5>] rtnetlink_rcv+0x25/0x40
>  [<ffffffff819a8e59>] netlink_unicast+0xd9/0x180
>  [<ffffffff819a9600>] netlink_sendmsg+0x700/0x770
>  [<ffffffff81103735>] ? local_clock+0x25/0x30
>  [<ffffffff8194e83c>] sock_sendmsg+0x6c/0x90
>  [<ffffffff811f98e3>] ? might_fault+0xa3/0xb0
>  [<ffffffff8195ca6d>] ? verify_iovec+0x7d/0xf0
>  [<ffffffff8194ec3e>] ___sys_sendmsg+0x37e/0x3b0
>  [<ffffffff8111ef15>] ? trace_hardirqs_on_caller+0x185/0x220
>  [<ffffffff81af979e>] ? mutex_unlock+0xe/0x10
>  [<ffffffff819a55ec>] ? netlink_insert+0xbc/0xe0
>  [<ffffffff819a65e5>] ? netlink_autobind.isra.30+0x125/0x150
>  [<ffffffff819a6520>] ? netlink_autobind.isra.30+0x60/0x150
>  [<ffffffff819a84f9>] ? netlink_bind+0x159/0x230
>  [<ffffffff811f989a>] ? might_fault+0x5a/0xb0
>  [<ffffffff8194f25e>] ? SYSC_bind+0x7e/0xd0
>  [<ffffffff8194f8cd>] __sys_sendmsg+0x4d/0x80
>  [<ffffffff8194f912>] SyS_sendmsg+0x12/0x20
>  [<ffffffff81afc692>] system_call_fastpath+0x16/0x1b

Fixing this by replacing the mode GFP_KERNEL with GFP_ATOMIC.

Signed-off-by: Benjamin Block <bebl@mageta.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fec: ptp: avoid register access when ipg clock is disabled

The current kernel hang on i.MX6SX with rootfs mount from MMC.
The root cause is that ptp uses a periodic timer to access enet register
even if ipg clock is disabled.

FEC ptp driver start one period timer to read 1588 counter register in the
ptp init function that is called after FEC driver is probed.

To save power, after FEC probe finish, FEC driver disable all clocks including
ipg clock that is needed for register access.

i.MX5x, i.MX6q/dl/sl FEC register access don't cause system hang when ipg clock
is disabled, just return zero value. But for i.MX6sx SOC, it cause system hang.

To avoid the issue, we need to check ptp clock status before ptp timer count access.

Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ftrace: Update all ftrace_ops for a ftrace_hash_ops update

When updating what an ftrace_ops traces, if it is registered (that is,
actively tracing), and that ftrace_ops uses the shared global_ops
local_hash, then we need to update all tracers that are active and
also share the global_ops' ftrace_hash_ops.

Cc: stable@vger.kernel.org # 3.16 (apply after 3.17-rc4 is out)
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

ftrace: Allow ftrace_ops to use the hashes from other ops

Currently the top level debug file system function tracer shares its
ftrace_ops with the function graph tracer. This was thought to be fine
because the tracers are not used together, as one can only enable
function or function_graph tracer in the current_tracer file.

But that assumption proved to be incorrect. The function profiler
can use the function graph tracer when function tracing is enabled.
Since all function graph users uses the function tracing ftrace_ops
this causes a conflict and when a user enables both function profiling
as well as the function tracer it will crash ftrace and disable it.

The quick solution so far is to move them as separate ftrace_ops like
it was earlier. The problem though is to synchronize the functions that
are traced because both function and function_graph tracer are limited
by the selections made in the set_ftrace_filter and set_ftrace_notrace
files.

To handle this, a new structure is made called ftrace_ops_hash. This
structure will now hold the filter_hash and notrace_hash, and the
ftrace_ops will point to this structure. That will allow two ftrace_ops
to share the same hashes.

Since most ftrace_ops do not share the hashes, and to keep allocation
simple, the ftrace_ops structure will include both a pointer to the
ftrace_ops_hash called func_hash, as well as the structure itself,
called local_hash. When the ops are registered, the func_hash pointer
will be initialized to point to the local_hash within the ftrace_ops
structure. Some of the ftrace internal ftrace_ops will be initialized
statically. This will allow for the function and function_graph tracer
to have separate ops but still share the same hash tables that determine
what functions they trace.

Cc: stable@vger.kernel.org # 3.16 (apply after 3.17-rc4 is out)
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
"This small set of fixes addresses a few issues introduced during the
  merge window, including:

   - fix typo in I-cache detection that was causing us to treat all
     I-caches as aliasing
   - hook up memfd_create and getrandom syscalls for native and compat
   - revert a temporary hack for defconfig builds in -next (the audit
     tree changes didn't make it in this merge window)
   - a couple of UEFI fixes for TEXT_OFFSET fuzzing and /memreserve/
   - a simple sparsemem fix for 48-bit physical addressing
   - small defconfig updates to get autotesters working with X-gene"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  Revert "arm64: Do not invoke audit_syscall_* functions if !CONFIG_AUDIT_SYSCALL"
  arm64: mm: update max pa bits to 48
  arm64: ignore DT memreserve entries when booting in UEFI mode
  arm64: configs: Enable X-Gene SATA and ethernet in defconfig
  arm64: align randomized TEXT_OFFSET on 4 kB boundary
  asm-generic: add memfd_create system call to unistd.h
  arm64: compat: wire up memfd_create and getrandom syscalls for aarch32
  arm64: fix typo in I-cache policy detection

Merge tag 'iommu-fixes-v3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU fixes from Joerg Roedel:
"The fixes include:

   - fix a crash in the VT-d driver when devices with a driver attached
     are hot-unplugged

   - fix a AMD IOMMU driver crash with device assignment of 32 bit PCI
     devices to KVM guests

   - fix for a copy&paste error in generic IOMMU code.  Now the right
     function pointer is checked before calling"

* tag 'iommu-fixes-v3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/core: Check for the right function pointer in iommu_map()
  iommu/amd: Fix cleanup_domain for mass device removal
  iommu/vt-d: Defer domain removal if device is assigned to a driver

drm/radeon: add additional SI pci ids

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/radeon: add new bonaire pci ids

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/radeon: add new KV pci id

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=82912

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent

Pull EFI fixes from Matt Fleming:

* WARN_ON(!spin_is_locked()) always triggers on non-SMP machines.
   Swap it for the more canonical lockdep_assert_held() which always
   does the right thing - Guenter Roeck

* Assign the correct value to efi.runtime_version on arm64 so that all
   the runtime services can be invoked - Semen Protsenko

Signed-off-by: Ingo Molnar <mingo@kernel.org>

efi/arm64: Store Runtime Services revision

"efi" global data structure contains "runtime_version" field which must
be assigned in order to use it later in Runtime Services virtual calls
(virt_efi_* functions).

Before this patch "runtime_version" was unassigned (0), so each
Runtime Service virtual call that checks revision would fail.

Signed-off-by: Semen Protsenko <semen.protsenko@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>

firmware: Do not use WARN_ON(!spin_is_locked())

spin_is_locked() always returns false for uniprocessor configurations
in several architectures, so do not use WARN_ON with it.
Use lockdep_assert_held() instead to also reduce overhead in
non-debug kernels.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>

cxgb4: Free completed tx skbs promptly

Description of problem:
The NIC card is not reporting back to the driver the transmitted skbs,
so they get stuck in the TX ring causing issues with reference
counters in other kernel components.

Developed a new Automatic Egress Queue Update firmware facility to slowly tick
through Egress Queues and send back any outstanding CIDX Updates which are
laying around.

Based on original work by Casey Leedom <leedom@chelsio.com>

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'linux-can-fixes-for-3.17-20140821' of git://gitorious.org/linux-can/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2014-08-21

The first patch is from Mirza Krak, it fixes the initialization of the hardware
in the sja1000 driver. The next patch is contributed by Dan Carpenter, it fixes
the error handling in the c_can's probe function. Then there are two patches
for the flexcan driver, one by Alexander Stein, which fixes the resetting of
the bus error interrupt mask, the other one by Sebastian Andrzej Siewior which
adds an additional error state transition message.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: Fix race condition in cleanup

There is a possible race condition when we unregister the PCI Driver and then
flush/destroy the global "workq". This could lead to situations where there
are tasks on the Work Queue with references to now deleted adapter data
structures. Instead, have per-adapter Work Queues which were instantiated and
torn down in init_one() and remove_one(), respectively.

v2: Remove unnecessary call to flush_workqueue() before destroy_workqueue()

Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: not send SCTP_PEER_ADDR_CHANGE notifications with failed probe

Since the transport has always been in state SCTP_UNCONFIRMED, it
therefore wasn't active before and hasn't been used before, and it
always has been, so it is unnecessary to bug the user with a
notification.

Reported-by: Deepak Khandelwal <khandelwal.deepak.1987@gmail.com>
Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
Suggested-by: Michael Tuexen <tuexen@fh-muenster.de>
Suggested-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Zhu Yanjun <Yanjun.Zhu@windriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh: intc: Confine SH_INTC to platforms that need it

Currently the sh-intc driver is compiled on all SuperH and
non-multiplatform SH-Mobile platforms, while it's only used on a limited
number of platforms:
- SuperH: SH2(A), SH3(A), SH4(A)(L) (all but SH5)
- ARM: sh7372, sh73a0

Drop the "default y" on SH_INTC, make all CPU platforms that use it
select it, and protect all sub-options by "if SH_INTC" to fix this.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Magnus Damm <damm+renesas@opensource.se>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>

bnx2x: Revert UNDI flushing mechanism

Commit 91ebb929b6f8 ("bnx2x: Add support for Multi-Function UNDI") [which was
later supposedly fixed by de682941eef3 ("bnx2x: Fix UNDI driver unload")]
introduced a bug in which in some [yet-to-be-determined] scenarios the
alternative flushing mechanism which was to guarantee the Rx buffers are
empty before resetting them during device probe will fail.
If this happens, when device will be loaded once more a fatal attention will
occur; Since this most likely happens in boot from SAN scenarios, the machine
will fail to load.

Notice this may occur not only in the 'Multi-Function' scenario but in the
regular scenario as well, i.e., this introduced a regression in the driver's
ability to perform boot from SAN.

The patch reverts the mechanism and applies the old scheme to multi-function
devices as well as to single-function devices.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'qlcnic'

Shahed Shaikh says:

====================
qlcnic: Bug fixes

This series fixes some bugs related to endianess.

Please apply this series to net.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix endianess issue in firmware load from file operation

Firmware binary file is in little endian. On big-endian architecture, while
writing this binary FW file to adapters memory, writel() swaps the data resulting into
corruption of FW image. So, swap the data before writing into adapters memory.

Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix endianess issue in FW dump template header

Firmware dump template header is read from adapter using
readl() which swaps the data. So, adjust structure
element on the boundary of 32bit dword.

Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlcnic: Fix flash access interface to application

Application expects flash data in little endian, but driver reads/writes
flash data using readl()/writel() APIs which swaps data on big endian machine.
So, swap the data after reading from and before writing to flash memory.

Signed-off-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

MAINTAINERS: Add section for MRF24J40 IEEE 802.15.4 radio driver

Alan is the original author of the driver. This change was discussed
with the 802.15.4 subsystem maintainer, Alexander Aring.

Signed-off-by: Alan Ott <alan@signal11.us>
Acked-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

macvlan: Allow setting multicast filter on all macvlan types

Currently, macvlan code restricts multicast and unicast
filter setting only to passthru devices. As a result,
if a guest using macvtap wants to receive multicast
traffic, it has to set IFF_ALLMULTI or IFF_PROMISC.

This patch makes it possible to use the fdb interface
to add multicast addresses to the filter thus allowing
a guest to receive only targeted multicast traffic.

CC: John Fastabend <john.r.fastabend@intel.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Jason Wang <jasowang@redhat.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

packet: handle too big packets for PACKET_V3

af_packet can currently overwrite kernel memory by out of bound
accesses, because it assumed a [new] block can always hold one frame.

This is not generally the case, even if most existing tools do it right.

This patch clamps too long frames as API permits, and issue a one time
error on syslog.

[ 394.357639] tpacket_rcv: packet too big, clamped from 5042 to 3966. macoff=82

In this example, packet header tp_snaplen was set to 3966,
and tp_len was set to 5042 (skb->len)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation.")
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

MAINTAINERS: add entry for ec_bhf driver

Added entry for ec_bhf driver.

Signed-off-by: Dariusz Marcinkiewicz <reksio@newterm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

lec: Use rtnl lock/unlock when updating MTU

The LECS response contains the MTU that should be used. Correctly
synchronize with other layers when updating.

Signed-off-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'drm-intel-fixes-2014-08-21' of git://anongit.freedesktop.org/drm-intel

Display fixes from Ville and Imre, all cc: stable.

* tag 'drm-intel-fixes-2014-08-21' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: don't try to retrain a DP link on an inactive CRTC
  drm/i915: make sure VDD is turned off during system suspend
  drm/i915: cancel hotplug and dig_port work during suspend and unload
  drm/i915: fix HPD IRQ reenable work cancelation
  drm/i915: take display port power domain in DP HPD handler
  drm/i915: Don't try to enable cursor from setplane when crtc is disabled
  drm/i915: Skip load detect when intel_crtc->new_enable==true
  drm/i915: Fix locking for intel_enable_pipe_a()

Merge branch 'drm-fixes-3.17' of git://people.freedesktop.org/~agd5f/linux

more radeon fixes

* 'drm-fixes-3.17' of git://people.freedesktop.org/~agd5f/linux:
  Revert "drm/radeon: Use write-combined CPU mappings of ring buffers with PCIe"
  drm/radeon: fix active_cu mask on SI and CIK after re-init (v3)
  drm/radeon: fix active cu count for SI and CIK
  drm/radeon: re-enable selective GPUVM flushing
  drm/radeon: Sync ME and PFP after CP semaphore waits v4
  drm/radeon: fix display handling in radeon_gpu_reset
  drm/radeon: fix pm handling in radeon_gpu_reset
  drm/radeon: Only flush HDP cache for indirect buffers from userspace
  drm/radeon: properly document reloc priority mask

Merge branch 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata

Pull libata fixes from Tejun Heo:
"Nothing drastic but pushing out early due to build breakage in the new
  tegra platform.

  Additionally:

   - M550 tagged trim blacklist pattern is widened so that it matches
     the new 1TB model

   - three controller specific fixes"

* 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  libata: widen Crucial M550 blacklist matching
  pata_scc: propagate return value of scc_wait_after_reset
  ata: ahci_tegra: Change include to fix compilation
  pata_samsung_cf: change ret type to signed
  ahci_xgene: Removing NCQ support from the APM X-Gene SoC AHCI SATA Host Controller driver.

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid

Pull HID fixes from Jiri Kosina:

- fixes for a couple potential memory corruption problems (the HW would
   have to be manufactured to be deliberately evil to trigger those)
   found by Ben Hawkes
- fix for potential infinite loop when using sysfs interface of
   logitech driver, from Simon Wood
- a couple more simple driver fixes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: fix a couple of off-by-ones
  HID: logitech: perform bounds checking on device_id early enough
  HID: logitech: fix bounds checking on LED report size
  HID: logitech: Prevent possibility of infinite loop when using /sys interface
  HID: rmi: print an error if F11 is not found instead of stopping the device
  HID: hid-sensor-hub: use devm_ functions consistently
  HID: huion: Use allocated buffer for DMA
  HID: huion: Fail on parameter retrieval errors

Merge tag 'sound-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"A bunch of ASoC fixes with a few HD-audio fixes in this pull request.

  All fairly small, boring and device-specific fixes, in addition to
  MAINTAINERS update for better reviewing"

* tag 'sound-3.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/hdmi - apply Valleyview fix-ups to Cherryview display codec
  ALSA: hda/hdmi - set depop_delay for haswell plus
  ALSA: hda - restore the gpio led after resume
  ALSA: hda/realtek - Avoid setting wrong COEF on ALC269 & co
  ASoC: pxa-ssp: drop SNDRV_PCM_FMTBIT_S24_LE
  ASoC: fsl-esai: Revert .xlate_tdm_slot_mask() support
  ASoC: mcasp: Fix implicit BLCK divider setting
  ASoC: arizona: Fix TDM slot length handling in arizona_hw_params
  ASoC: pcm512x: Correct Digital Playback control names
  ASoC: dapm: Fix uninitialized variable in snd_soc_dapm_get_enum_double()
  ASoC: Intel: Restore Baytrail ADSP streams only when ADSP was in reset
  ASoC: Intel: Wait Baytrail ADSP boot at resume_early stage
  ASoC: Intel: Merge Baytrail ADSP suspend_noirq into suspend_late
  MAINTAINERS: Add i.MX maintainers and paths to Freescale ASoC entry
  ASoC: Intel: Update Baytrail ADSP firmware name

Merge branch 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
"Here is the fixup for the 'lowlight' of my last pull request.  I2C is
  not selected anymore by I2C_ACPI.  Instead, the code in question now
  depends on I2C=y.

  Also, Mika has agreed to support me and be the maintainer for I2C-ACPI
  related patches.  Finally, a new-ID-patch came along last week"

* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  MAINTAINERS: add maintainer for ACPI parts of I2C
  i2c: i801: Add PCI ID for Intel Braswell
  i2c: rework kernel config I2C_ACPI

Merge tag 'please-pull-memfd_create' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux

Pull ia64 update from Tony Luck:
"Add memfd_create syscall to ia64"

* tag 'please-pull-memfd_create' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
[IA64] Wire up memfd_create() system call

Merge tag 'microblaze-3.17-rc2' of git://git.monstr.eu/linux-2.6-microblaze

Pull microblaze update from Michal Simek:
"Wire-up seccomp/getrandom/memfd_create syscalls"

* tag 'microblaze-3.17-rc2' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Wire-up memfd_create syscall
  microblaze: Wire-up getrandom syscall
  microblaze: Wire-up seccomp syscall

HID: fix a couple of off-by-ones

There are a few very theoretical off-by-one bugs in report descriptor size
checking when performing a pre-parsing fixup. Fix those.

Cc: stable@vger.kernel.org
Reported-by: Ben Hawkes <hawkes@google.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

HID: logitech: perform bounds checking on device_id early enough

device_index is a char type and the size of paired_dj_deivces is 7
elements, therefore proper bounds checking has to be applied to
device_index before it is used.

We are currently performing the bounds checking in
logi_dj_recv_add_djhid_device(), which is too late, as malicious device
could send REPORT_TYPE_NOTIF_DEVICE_UNPAIRED early enough and trigger the
problem in one of the report forwarding functions called from
logi_dj_raw_event().

Fix this by performing the check at the earliest possible ocasion in
logi_dj_raw_event().

Cc: stable@vger.kernel.org
Reported-by: Ben Hawkes <hawkes@google.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

HID: logitech: fix bounds checking on LED report size

The check on report size for REPORT_TYPE_LEDS in logi_dj_ll_raw_request()
is wrong; the current check doesn't make any sense -- the report allocated
by HID core in hid_hw_raw_request() can be much larger than
DJREPORT_SHORT_LENGTH, and currently logi_dj_ll_raw_request() doesn't
handle this properly at all.

Fix the check by actually trimming down the report size properly if it is
too large.

Cc: stable@vger.kernel.org
Reported-by: Ben Hawkes <hawkes@google.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

gpio-lynxpoint: enable input sensing in resume

It appears that input sensing bit might be reset during
suspend/resume. Set input sensing again for all requested gpios
in resume

Tested-by: Jerome Blin <jerome.blin@intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

gpio: move GPIOD flags outside #ifdef

The GPIOD flags are defined inside the #ifdef CONFIG_GPIOLIB
switch, making the gpiolib stubs fail if these flags are used
by a consumer. This is not correct: the stubs should compile
fine without GPIOLIB.

Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

can: flexcan: handle state passive -> warning transition

Once the CAN-bus is open and a packet is sent, the controller switches
into the PASSIVE state. Once the BUS is closed again it goes the back
err-warning. The TX error counter goes 0 -> 0x80 -> 0x7f.
This patch makes sure that the user learns about this state chang
(CAN_STATE_ERROR_WARNING => CAN_STATE_ERROR_PASSIVE)

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Matthias Klein <matthias.klein@optimeas.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: Disable error interrupt when bus error reporting is disabled

In case we don't have FLEXCAN_HAS_BROKEN_ERR_STATE and the user set
CAN_CTRLMODE_BERR_REPORTING once it can not be unset again until reboot.
So in case neither hardware nor user wants the error interrupt disable
the bit.

Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: c_can: checking IS_ERR() instead of NULL

devm_ioremap() returns NULL on error, not an ERR_PTR().

Fixes: 33cf75656923 ('can: c_can_platform: Fix raminit, use devm_ioremap() instead of devm_ioremap_resource()')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.11
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: sja1000: Validate initialization state in start method

When sja1000 is not compiled as module the SJA1000 chip is only
initialized during device registration on kernel boot. Should the chip
get a hardware reset there is no way to reinitialize it without re-
booting the Linux kernel.

This patch adds a check in sja1000_start if the chip is initialized, if
not we initialize it.

Signed-off-by: Mirza Krak <mirza.krak@hostmobility.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

microblaze: Wire-up memfd_create syscall

Add new memfd_create syscall.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>

microblaze: Wire-up getrandom syscall

Add new getrandom syscall.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>

microblaze: Wire-up seccomp syscall

Add new seccomp syscall.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>