IB/hfi1: Fetch monitor values on-demand for CableInfo query
The monitor values from bytes 22 through 81 of the QSFP memory space
(SFF 8636) are dynamic and serving them out of the QSFP memory cache
maintained by the driver provides stale data to the CableInfo SMA query.
This patch refreshes the dynamic values from the QSFP memory on request
and overwrites the stale data from the cache for the overlap between the
requested range and the monitor range.
Reviewed-by: Jubin John <jubin.john@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Fri, 12 Aug 2016 15:17:37 +0000 (11:17 -0400)]
IB/hfi1,IB/qib: Fix qp_stats sleep with rcu read lock held
The qp init function does a kzalloc() while holding the RCU
lock that encounters the following warning with a debug kernel
when a cat of the qp_stats is done:
Wei Yongjun [Fri, 5 Aug 2016 13:46:49 +0000 (13:46 +0000)]
IB/core: Fix possible memory leak in cma_resolve_iboe_route()
'work' and 'route->path_rec' are malloced in cma_resolve_iboe_route()
and should be freed before leaving from the error handling cases,
otherwise it will cause memory leak.
Fixes: 200298326b27 ('IB/core: Validate route when we init ah') Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
Tadeusz Struk [Thu, 4 Aug 2016 00:19:32 +0000 (20:19 -0400)]
IB/hfi1: Allocate cpu mask on the heap to silence warning
If CONFIG_FRAME_WARN is small (1K) and CONFIG_NR_CPUS big
then a frame size warning is triggered during build.
Allocate the cpu mask dynamically to silence the warning.
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
IB/mlx4: Return EAGAIN for any error in mlx4_ib_poll_one
Error code EAGAIN should be used when errors are temporary and next call
might succeeds.
When error code other than EAGAIN is returned, the caller (mlx4_ib_poll)
will assume all CQE in the same bunch are error too and will drop them all.
Linus Torvalds [Sun, 21 Aug 2016 21:28:24 +0000 (14:28 -0700)]
Merge branch 'parisc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull two parisc fixes from Helge Deller:
"The first patch ensures that the high-res cr16 clocksource (which was
added in kernel 4.7) gets choosen as default clocksource for parisc.
The second patch moves the #define of EREFUSED down inside errno.h and
thus unbreaks building the gccgo compiler"
* 'parisc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix order of EREFUSED define in errno.h
parisc: Fix automatic selection of cr16 clocksource
Tony Luck [Sat, 20 Aug 2016 23:27:58 +0000 (16:27 -0700)]
EDAC, skx_edac: Add EDAC driver for Skylake
This is an entirely new driver instead of yet another set of patches
to sb_edac.c because:
1) Mapping from PCI devices to socket/memory controller is significantly
different. Skylake scatters devices on a socket across a number of
PCI buses.
2) There is an extra level of interleaving via the "mcroute" register
that would be a little messy to squeeze into the old driver.
3) Validation is getting too expensive. Changes to sb_edac need to
be checked against Sandy Bridge, Ivy Bridge, Haswell, Broadwell and
Knights Landing.
Helge Deller [Sat, 20 Aug 2016 09:51:38 +0000 (11:51 +0200)]
parisc: Fix order of EREFUSED define in errno.h
When building gccgo in userspace, errno.h gets parsed and the go include file
sysinfo.go is generated.
Since EREFUSED is defined to the same value as ECONNREFUSED, and ECONNREFUSED
is defined later on in errno.h, this leads to go complaining that EREFUSED
isn't defined yet.
Fix this trivial problem by moving the define of EREFUSED down after
ECONNREFUSED in errno.h (and clean up the indenting while touching this line).
Helge Deller [Fri, 19 Aug 2016 20:39:02 +0000 (22:39 +0200)]
parisc: Fix automatic selection of cr16 clocksource
Commit 54b66800907 (parisc: Add native high-resolution sched_clock()
implementation) added support to use the CPU-internal cr16 counters as reliable
clocksource with the help of HAVE_UNSTABLE_SCHED_CLOCK.
Sadly the commit missed to remove the hack which prevented cr16 to become the
default clocksource even on SMP systems.
Linus Torvalds [Fri, 19 Aug 2016 19:47:01 +0000 (12:47 -0700)]
Make the hardened user-copy code depend on having a hardened allocator
The kernel test robot reported a usercopy failure in the new hardened
sanity checks, due to a page-crossing copy of the FPU state into the
task structure.
This happened because the kernel test robot was testing with SLOB, which
doesn't actually do the required book-keeping for slab allocations, and
as a result the hardening code didn't realize that the task struct
allocation was one single allocation - and the sanity checks fail.
Since SLOB doesn't even claim to support hardening (and you really
shouldn't use it), the straightforward solution is to just make the
usercopy hardening code depend on the allocator supporting it.
Linus Torvalds [Fri, 19 Aug 2016 19:10:06 +0000 (12:10 -0700)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"I2C has some pretty standard driver bugfixes and one minor cleanup"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: meson: Use complete() instead of complete_all()
i2c: brcmstb: Use complete() instead of complete_all()
i2c: bcm-kona: Use complete() instead of complete_all()
i2c: bcm-iproc: Use complete() instead of complete_all()
i2c: at91: fix support of the "alternative command" feature
i2c: ocores: add missed clk_disable_unprepare() on failure paths
i2c: cros-ec-tunnel: Fix usage of cros_ec_cmd_xfer()
i2c: mux: demux-pinctrl: properly roll back when adding adapter fails
Linus Torvalds [Fri, 19 Aug 2016 16:32:48 +0000 (09:32 -0700)]
Merge tag 'dm-4.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- a stable fix for DM round robin multipath path selector to disable
preemption before using this_cpu_ptr()
- a slight increase in DM crypt's mempool reserves to make swap ontop
of DM crypt more performant
- a few DM raid fixes to issues found while testing changes that were
merged in v4.8-rc1
* tag 'dm-4.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm raid: support raid0 with missing metadata devices
dm raid: enhance attempt_restore_of_faulty_devices() to support more devices
dm raid: fix restoring of failed devices regression
dm raid: fix frozen recovery regression
dm crypt: increase mempool reserve to better support swapping
dm round robin: do not use this_cpu_ptr() without having preemption disabled
Linus Torvalds [Fri, 19 Aug 2016 16:22:50 +0000 (09:22 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Six fairly small fixes. The ipr, mpt3sas and ses ones all trigger
oopses. The megaraid one fixes an attach failure on io mapped only
cards, the fcoe one is an obvious problem in the error path and the
aacraid one is a theoretical security issue (ability to trick the
kernel into a buffer overrun)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
ses: Fix racy cleanup of /sys in remove_dev()
mpt3sas: Fix resume on WarpDrive flash cards
ipr: Fix sync scsi scan
megaraid_sas: Fix probing cards without io port
aacraid: Check size values after double-fetch from user
fcoe: Use kfree_skb() instead of kfree()
Linus Torvalds [Fri, 19 Aug 2016 16:21:24 +0000 (09:21 -0700)]
Merge tag 'usb-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of USB fixes for reported issues for your tree.
The normal amount of gadget fixes, xhci fixes, new device ids, and a
few other minor things. All of them have been in linux-next for a
while, the full details are in the shortlog below"
* tag 'usb-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (43 commits)
xhci: don't dereference a xhci member after removing xhci
usb: xhci: Fix panic if disconnect
xhci: really enqueue zero length TRBs.
xhci: always handle "Command Ring Stopped" events
cdc-acm: fix wrong pipe type on rx interrupt xfers
usb: misc: usbtest: add fix for driver hang
usb: dwc3: gadget: stop processing on HWO set
usb: dwc3: don't set last bit for ISOC endpoints
usb: gadget: rndis: free response queue during REMOTE_NDIS_RESET_MSG
usb: udc: core: fix error handling
usb: gadget: fsl_qe_udc: off by one in setup_received_handle()
usb/gadget: fix gadgetfs aio support.
usb: gadget: composite: Fix return value in case of error
usb: gadget: uvc: Fix return value in case of error
usb: gadget: fix check in sync read from ep in gadgetfs
usb: misc: usbtest: usbtest_do_ioctl may return positive integer
usb: dwc3: fix missing platform_set_drvdata() in dwc3_of_simple_probe()
usb: phy: omap-otg: Fix missing platform_set_drvdata() in omap_otg_probe()
usb: gadget: configfs: add mutex lock before unregister gadget
usb: gadget: u_ether: fix dereference after null check coverify warning
...
Linus Torvalds [Fri, 19 Aug 2016 16:06:41 +0000 (09:06 -0700)]
Merge tag 'xfs-iomap-for-linus-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs
Pull xfs and iomap fixes from Dave Chinner:
"Changes in this update:
Regression fixes for XFS changes introduce in 4.8-rc1:
- buffer IO accounting assert failure
- ENOSPC block accounting reservation issue
- DAX IO path page cache invalidation fix
- rmapbt on-disk block count in agf
- correct classification of rmap block type when updating AGFL.
- iomap support for attribute fork mapping
Regression fixes for iomap infrastructure in 4.8-rc1:
- fiemap: honor FIEMAP_FLAG_SYNC
- fiemap: implement FIEMAP_FLAG_XATTR support to fix XFS regression
- make mark_page_accessed and pagefault_disable usage consistent with
other IO paths"
* tag 'xfs-iomap-for-linus-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
xfs: remove OWN_AG rmap when allocating a block from the AGFL
xfs: (re-)implement FIEMAP_FLAG_XATTR
xfs: simplify xfs_file_iomap_begin
iomap: mark ->iomap_end as optional
iomap: prepare iomap_fiemap for attribute mappings
iomap: fiemap should honor the FIEMAP_FLAG_SYNC flag
iomap: remove superflous pagefault_disable from iomap_write_actor
iomap: remove superflous mark_page_accessed from iomap_write_actor
xfs: store rmapbt block count in the AGF
xfs: don't invalidate whole file on DAX read/write
xfs: fix bogus space reservation in xfs_iomap_write_allocate
xfs: don't assert fail on non-async buffers on ioacct decrement
Linus Torvalds [Fri, 19 Aug 2016 15:52:17 +0000 (08:52 -0700)]
Merge tag 'hwmon-for-linus-v4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"Fix a bug in it87 driver and URLs in ftsteutates driver"
* tag 'hwmon-for-linus-v4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (ftsteutates) Correct ftp urls in driver documentation
hwmon: (it87) Features mask must be 32 bit wide
Linus Torvalds [Fri, 19 Aug 2016 02:38:18 +0000 (19:38 -0700)]
Merge tag 'drm-fixes-for-4.8-rc3-2' of git://people.freedesktop.org/~airlied/linux
Pull more drm fixes from Dave Airlie:
"Daniel pointed out I'd missed some i915 fixes, and I also found a
single etnaviv fix I missed.
So here they are"
* tag 'drm-fixes-for-4.8-rc3-2' of git://people.freedesktop.org/~airlied/linux:
drm/etnaviv: take GPU lock later in the submit process
drm/i915: Fix modeset handling during gpu reset, v5.
drm/i915: fix aliasing_ppgtt leak
drm/i915: fix WaInsertDummyPushConstPs
drm/i915: Fix iboost setting for SKL Y/U DP DDI buffer translation entry 2
drm/i915/gen9: Give one extra block per line for SKL plane WM calculations
drm/i915: Acquire audio powerwell for HD-Audio registers
drm/i915: Add missing rpm wakelock to GGTT pread
drm/i915/fbc: FBC causes display flicker when VT-d is enabled on Skylake
drm/i915: Clean up the extra RPM ref on CHV with i915.enable_rc6=0
drm/i915: Program iboost settings for HDMI/DVI on SKL
drm/i915: Fix iboost setting for DDI with 4 lanes on SKL
drm/i915: Handle ENOSPC after failing to insert a mappable node
drm/i915: Flush GT idle status upon reset
Linus Torvalds [Fri, 19 Aug 2016 02:31:08 +0000 (19:31 -0700)]
Merge tag 'devicetree-fixes-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull DeviceTree fixes from Rob Herring:
- a couple of DT node ref counting fixes
- fix __unflatten_device_tree for PPC PCI hotplug case
- rework marking irq controllers as OF_POPULATED in cases where real
driver is used.
- disable of_platform_default_populate_init on PPC. The change in
initcall order causes problems which need to be sorted out later.
* tag 'devicetree-fixes-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: fix reference counting in of_graph_get_endpoint_by_regs
of/platform: disable the of_platform_default_populate_init() for all the ppc boards
ARM: imx6: mark GPC node as not populated after irq init to probe pm domain driver
of/irq: Mark interrupt controllers as populated before initialisation
drivers/of: Validate device node in __unflatten_device_tree()
of: Delete an unnecessary check before the function call "of_node_put"
Linus Torvalds [Fri, 19 Aug 2016 01:54:40 +0000 (18:54 -0700)]
Merge tag '4.8-doc-fixes' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
"Three small fixes for Sphinx-formatted documentation generation"
* tag '4.8-doc-fixes' of git://git.lwn.net/linux:
doc-rst: customize RTD theme, drop padding of inline literal
docs: kernel-documentation: remove some highlight directives
docs: Set the Sphinx default highlight language to "guess"
Dave Airlie [Thu, 18 Aug 2016 22:51:13 +0000 (08:51 +1000)]
Merge tag 'drm-intel-fixes-2016-08-15' of git://anongit.freedesktop.org/drm-intel into drm-fixes
Collection of i915 fixes.
* tag 'drm-intel-fixes-2016-08-15' of git://anongit.freedesktop.org/drm-intel:
drm/i915: Fix modeset handling during gpu reset, v5.
drm/i915: fix aliasing_ppgtt leak
drm/i915: fix WaInsertDummyPushConstPs
drm/i915: Fix iboost setting for SKL Y/U DP DDI buffer translation entry 2
drm/i915/gen9: Give one extra block per line for SKL plane WM calculations
drm/i915: Acquire audio powerwell for HD-Audio registers
drm/i915: Add missing rpm wakelock to GGTT pread
drm/i915/fbc: FBC causes display flicker when VT-d is enabled on Skylake
drm/i915: Clean up the extra RPM ref on CHV with i915.enable_rc6=0
drm/i915: Program iboost settings for HDMI/DVI on SKL
drm/i915: Fix iboost setting for DDI with 4 lanes on SKL
drm/i915: Handle ENOSPC after failing to insert a mappable node
drm/i915: Flush GT idle status upon reset
Linus Torvalds [Thu, 18 Aug 2016 22:09:41 +0000 (15:09 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"An initrd microcode loading fix, and an SMP bootup topology setup fix
to resolve crashes on SGI/UV systems if the BIOS is configured in a
certain way"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/smp: Fix __max_logical_packages value setup
x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y
Linus Torvalds [Thu, 18 Aug 2016 22:08:31 +0000 (15:08 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
"Three clocksource driver fixes"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/mips-gic-timer: Make gic_clocksource_of_init() return int
clocksource/drivers/kona: Fix get_counter() error handling
clocksource/drivers/time-armada-370-xp: Fix the clock reference
Linus Torvalds [Thu, 18 Aug 2016 22:04:53 +0000 (15:04 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Mostly tooling fixes, but also start/stop filter related fixes, a perf
event read() fix, a fix uncovered by fuzzing, and an uprobes leak fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Check return value of the perf_event_read() IPI
perf/core: Enable mapping of the stop filters
perf/core: Update filters only on executable mmap
perf/core: Fix file name handling for start/stop filters
perf/core: Fix event_function_local()
uprobes: Fix the memcg accounting
perf intel-pt: Fix occasional decoding errors when tracing system-wide
tools: Sync kvm related header files for arm64 and s390
perf probe: Release resources on error when handling exit paths
perf probe: Check for dup and fdopen failures
perf symbols: Fix annotation of objects with debuginfo files
perf script: Don't disable use_callchain if input is pipe
perf script: Show proper message when failed list scripts
perf jitdump: Add the right header to get the major()/minor() definitions
perf ppc64le: Fix build failure when libelf is not present
perf tools mem: Fix -t store option for record command
perf intel-pt: Fix ip compression
Linus Torvalds [Thu, 18 Aug 2016 18:17:13 +0000 (11:17 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Avoid a literal load with the MMU off on the CPU resume path
(potential inconsistency between cache and RAM)
- Build error with CONFIG_ACPI=n fixed
- Compiler warning in the arch/arm64/mm/dump.c code fixed
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Fix shift warning in arch/arm64/mm/dump.c
arm64: kernel: avoid literal load of virtual address with MMU off
arm64: Fix NUMA build error when !CONFIG_ACPI
Linus Torvalds [Thu, 18 Aug 2016 18:09:43 +0000 (11:09 -0700)]
Merge tag 'pm-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"More hibernation-related material: one fix for a recent regression in
the core, one small cleanup of the x86-64 resume code and a
documentation update.
Specifics:
- Fix a hibernate core regression resulting from uncovering a latent
bug in its implementation of memory bitmaps by a recent commit
(James Morse).
- Use __pa() to compute a physical address in the x86-64 code
finalizing resume from hibernation (Rafael Wysocki).
- Update power management documentation related to system sleep
states to remove outdated information from it and to add a
description of a recently introduced hibernation debug feature to
it (Rafael Wysocki)"
* tag 'pm-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / hibernate: Fix rtree_next_node() to avoid walking off list ends
x86/power/64: Use __pa() for physical address computation
PM / sleep: Update some system sleep documentation
Linus Torvalds [Thu, 18 Aug 2016 17:58:50 +0000 (10:58 -0700)]
Merge tag 'drm-fixes-for-4.8-rc3' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Pretty quiet so far:
- a few amdgpu/radeon fixup for pcie pm changes
- a couple of amdgpu fixes
- some build fixes
- printk fix"
* tag 'drm-fixes-for-4.8-rc3' of git://people.freedesktop.org/~airlied/linux:
drm/amdgpu: Change GART offset to 64-bit
drm/mediatek: add ARM_SMCCC dependency
drm/mediatek: add CONFIG_OF dependency
drm/mediatek: add COMMON_CLK dependency
drm/amdgpu: Fix memory trashing if UVD ring test fails
drm/amdgpu: fix vm init error path
drm/amdkfd: print doorbell offset as a hex value
Revert "drm/radeon: work around lack of upstream ACPI support for D3cold"
Revert "drm/amdgpu: work around lack of upstream ACPI support for D3cold"
Johannes Berg [Thu, 11 Aug 2016 09:50:22 +0000 (11:50 +0200)]
locking/barriers: Suppress sparse warnings in lockless_dereference()
After Peter's commit:
331b6d8c7afc ("locking/barriers: Validate lockless_dereference() is used on a pointer type")
... we get a lot of sparse warnings (one for every rcu_dereference, and more)
since the expression here is assigning to the wrong address space.
Instead of validating that 'p' is a pointer this way, instead make
it fail compilation when it's not by using sizeof(*(p)). This will
not cause any sparse warnings (tested, likely since the address
space is irrelevant for sizeof), and will fail compilation when
'p' isn't a pointer type.
Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 331b6d8c7afc ("locking/barriers: Validate lockless_dereference() is used on a pointer type") Link: http://lkml.kernel.org/r/1470909022-687-2-git-send-email-johannes@sipsolutions.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
Johannes Berg [Thu, 11 Aug 2016 09:50:21 +0000 (11:50 +0200)]
Revert "drm/fb-helper: Reduce READ_ONCE(master) to lockless_dereference"
This reverts commit:
fa7d81bb3c269 ("drm/fb-helper: Reduce READ_ONCE(master) to lockless_dereference")
As Peter explained:
[...] lockless_dereference() is _stronger_ than READ_ONCE(), not weaker.
[...]
Also, clue is in the name: 'dereference', you don't actually dereference
the pointer here, only load it.
My next patch breaks the compile without this revert, because it assumes
you want to deference and thus also need the struct type visible (which
it isn't here), so revert it.
Tested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1470909022-687-1-git-send-email-johannes@sipsolutions.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
On an i5 laptop, 4 pCPUs, 4vCPUs for one full dynticks guest, there are four
CPU hog processes(for loop) running in the guest, I hot-unplug the pCPUs
on host one by one until there is only one left, then observe CPU utilization
via 'top' in the guest, it shows:
100% st for cpu0(housekeeping)
75% st for other CPUs (nohz full mode)
However, w/o this commit it shows the correct 75% for all four CPUs.
When a guest is interrupted for a longer amount of time, missed clock ticks
are not redelivered later. Because of that, we should not limit the amount
of steal time accounted to the amount of time that the calling functions
think have passed.
However, the interval returned by account_other_time() is NOT rounded down
to the nearest jiffy, while the base interval in get_vtime_delta() it is
subtracted from is, so the max cputime limit is required to avoid underflow.
This patch fixes the regression by limiting the account_other_time() from
get_vtime_delta() to avoid underflow, and lets the other three call sites
(in account_other_time() and steal_account_process_time()) account however
much steal time the host told us elapsed.
Suggested-by: Rik van Riel <riel@redhat.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krcmar <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm@vger.kernel.org Link: http://lkml.kernel.org/r/1471399546-4069-1-git-send-email-wanpeng.li@hotmail.com
[ Improved the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
perf/core: Check return value of the perf_event_read() IPI
The call to smp_call_function_single in perf_event_read() may fail if
an invalid or not online CPU index is passed. Warn user if such bug is
present and return error.
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1471467307-61171-2-git-send-email-davidcc@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
At this time the perf_addr_filter_needs_mmap() function will _not_
return true on a user space 'stop' filter. But stop filters need
exactly the same kind of mapping that range and start filters get.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1468860187-318-4-git-send-email-mathieu.poirier@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
Function perf_event_mmap() is called by the MM subsystem each time
part of a binary is loaded in memory. There can be several mapping
for a binary, many times unrelated to the code section.
Each time a section of a binary is mapped address filters are
updated, event when the map doesn't pertain to the code section.
The end result is that filters are configured based on the last map
event that was received rather than the last mapping of the code
segment.
For example if we have an executable 'main' that calls library
'libcstest.so.1.0', and that we want to collect traces on code
that is in that library. The perf cmd line for this scenario
would be:
perf record -e cs_etm// --filter 'filter 0x72c/0x40@/opt/lib/libcstest.so.1.0' --per-thread ./main
Before 'main()' can execute 'libcstest.so.1.0' has to be loaded in
memory. Once that has been done perf_event_mmap() has been called
4 times, with the last map starting at address 0x7fa25ce000 and
the address filter configured to start filtering when the
IP has passed over address 0x0x7fa25ce72c (0x7fa25ce000 + 0x72c).
But that is wrong since the code segment for library 'libcstest.so.1.0'
as been mapped at 0x7fa25bd000, resulting in traces not being
collected.
This patch corrects the situation by requesting that address
filters be updated only if the mapped event is for a code
segment.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1468860187-318-3-git-send-email-mathieu.poirier@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
perf/core: Fix file name handling for start/stop filters
Binary file names have to be supplied for both range and start/stop
filters but the current code only processes the filename if an
address range filter is specified. This code adds processing of
the filename for start/stop filters.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1468860187-318-2-git-send-email-mathieu.poirier@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
The reason for the panic is wrong value of __max_logical_packages,
which lets logical_package_map uninitialized and the uncore code
relying on this map being properly initialized (maybe we should
add some safety checks there as well).
The __max_logical_packages is computed as:
DIV_ROUND_UP(total_cpus, ncpus);
- ncpus being number of cores
With above BIOS setup we get total_cpus == 16 which set
__max_logical_packages to 2 (ncpus is 12).
Once topology_update_package_map processes CPU with logical
pkg over 2 we display above messages and fail to initialize
the physical_to_logical_pkg map, which makes the uncore code
crash.
The fix is to remove logical_package_map bitmap completely
and keep and update the logical_packages number instead.
After we enumerate all the present CPUs, we check if the
enumerated logical packages count is within its computed
maximum from BIOS data.
If it's not the case, we set this maximum to the new enumerated
value and freeze any new addition of logical packages.
The freeze is because lot of init code like uncore/rapl/cqm
depends on having maximum logical package value set to allocate
their data, so we can't change it later on.
Prarit Bhargava tested the patch and confirms that it solves
the problem:
Oleg Nesterov [Wed, 17 Aug 2016 15:36:29 +0000 (17:36 +0200)]
uprobes: Fix the memcg accounting
__replace_page() wronlgy calls mem_cgroup_cancel_charge() in "success" path,
it should only do this if page_check_address() fails.
This means that every enable/disable leads to unbalanced mem_cgroup_uncharge()
from put_page(old_page), it is trivial to underflow the page_counter->count
and trigger OOM.
Reported-and-tested-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Cc: stable@vger.kernel.org # 3.17+ Fixes: 00501b531c47 ("mm: memcontrol: rewrite charge API") Link: http://lkml.kernel.org/r/20160817153629.GB29724@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* pm-sleep:
PM / hibernate: Fix rtree_next_node() to avoid walking off list ends
x86/power/64: Use __pa() for physical address computation
PM / sleep: Update some system sleep documentation
1) Buffers powersave frame test is reversed in cfg80211, fix from Felix
Fietkau.
2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme.
3) Fix some tg3 ethtool logic bugs, and one that would cause no
interrupts to be generated when rx-coalescing is set to 0. From
Satish Baddipadige and Siva Reddy Kallam.
4) QLCNIC mailbox corruption and napi budget handling fix from Manish
Chopra.
5) Fix fib_trie logic when walking the trie during /proc/net/route
output than can access a stale node pointer. From David Forster.
6) Several sctp_diag fixes from Phil Sutter.
7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel.
8) Checksum fixup fixes in bpf from Daniel Borkmann.
9) Memork leaks in nfnetlink, from Liping Zhang.
10) Use after free in rxrpc, from David Howells.
11) Use after free in new skb_array code of macvtap driver, from Jason
Wang.
12) Calipso resource leak, from Colin Ian King.
13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang.
14) Fix bpf non-linear packet write helpers, from Daniel Borkmann.
15) Fix lockdep splats in macsec, from Sabrina Dubroca.
16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF
handling.
17) Various tc-action bug fixes, from CONG Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
net_sched: allow flushing tc police actions
net_sched: unify the init logic for act_police
net_sched: convert tcf_exts from list to pointer array
net_sched: move tc offload macros to pkt_cls.h
net_sched: fix a typo in tc_for_each_action()
net_sched: remove an unnecessary list_del()
net_sched: remove the leftover cleanup_a()
mlxsw: spectrum: Allow packets to be trapped from any PG
mlxsw: spectrum: Unmap 802.1Q FID before destroying it
mlxsw: spectrum: Add missing rollbacks in error path
mlxsw: reg: Fix missing op field fill-up
mlxsw: spectrum: Trap loop-backed packets
mlxsw: spectrum: Add missing packet traps
mlxsw: spectrum: Mark port as active before registering it
mlxsw: spectrum: Create PVID vPort before registering netdevice
mlxsw: spectrum: Remove redundant errors from the code
mlxsw: spectrum: Don't return upon error in removal path
i40e: check for and deal with non-contiguous TCs
ixgbe: Re-enable ability to toggle VLAN filtering
ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths
...
Roman Mashak [Sun, 14 Aug 2016 05:35:02 +0000 (22:35 -0700)]
net_sched: allow flushing tc police actions
The act_police uses its own code to walk the
action hashtable, which leads to that we could
not flush standalone tc police actions, so just
switch to tcf_generic_walker() like other actions.
(Joint work from Roman and Cong.)
Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Sun, 14 Aug 2016 05:35:01 +0000 (22:35 -0700)]
net_sched: unify the init logic for act_police
Jamal reported a crash when we create a police action
with a specific index, this is because the init logic
is not correct, we should always create one for this
case. Just unify the logic with other tc actions.
Fixes: a03e6fe56971 ("act_police: fix a crash during removal") Reported-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Sun, 14 Aug 2016 05:35:00 +0000 (22:35 -0700)]
net_sched: convert tcf_exts from list to pointer array
As pointed out by Jamal, an action could be shared by
multiple filters, so we can't use list to chain them
any more after we get rid of the original tc_action.
Instead, we could just save pointers to these actions
in tcf_exts, since they are refcount'ed, so convert
the list to an array of pointers.
The "ugly" part is the action API still accepts list
as a parameter, I just introduce a helper function to
convert the array of pointers to a list, instead of
relying on the C99 feature to iterate the array.
Fixes: a85a970af265 ("net_sched: move tc_action into tcf_common") Reported-by: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Sun, 14 Aug 2016 05:34:59 +0000 (22:34 -0700)]
net_sched: move tc offload macros to pkt_cls.h
struct tcf_exts belongs to filters, should not be visible
to plain tc actions.
Cc: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Sun, 14 Aug 2016 05:34:58 +0000 (22:34 -0700)]
net_sched: fix a typo in tc_for_each_action()
It is harmless because all users pass 'a' to this macro.
Fixes: 00175aec941e ("net/sched: Macro instead of CONFIG_NET_CLS_ACT ifdef") Cc: Amir Vadai <amir@vadai.me> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Sun, 14 Aug 2016 05:34:57 +0000 (22:34 -0700)]
net_sched: remove an unnecessary list_del()
This list_del() for tc action is not needed actually,
because we only use this list to chain bulk operations,
therefore should not be carried for latter operations.
Fixes: ec0595cc4495 ("net_sched: get rid of struct tcf_common") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Sun, 14 Aug 2016 05:34:56 +0000 (22:34 -0700)]
net_sched: remove the leftover cleanup_a()
After refactoring tc_action into tcf_common, we no
longer need to cleanup temporary "actions" in list,
they are permanently stored in the hashtable.
Fixes: a85a970af265 ("net_sched: move tc_action into tcf_common") Reported-by: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 17 Aug 2016 23:20:24 +0000 (19:20 -0400)]
Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2016-08-16
This series contains fixes to e1000e, igb, ixgbe and i40e.
Kshitiz Gupta provides a fix for igb to resolve the PHY delay compensation
math in several functions.
Jarod Wilson provides a fix for e1000e which had to broken up into 2
patches, first is prepares the driver for expanding the list of NICs
that have occasional ~10 hour clock jumps when being used for PTP.
Second patch actually fixes i218 silicon which has been experiencing
the clock jumps while using PTP.
Alex provides 2 patches for ixgbe now that he is back at Intel. First
fixes setting VLNCTRL.VFE bit, which was left unchanged in earlier patches
which resulted in disabling VLAN filtering for all the VFs. Second
corrects the support for disabling the VLAN tag filtering via the
feature bit.
Lastly, David fixes i40e which was causing a kernel panic when
non-contiguous traffic classes or traffic classes not starting with TC0,
were configured on a link partner switch. To fix this, changed the
logic when determining the total number of TCs enabled.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 17 Aug 2016 23:18:34 +0000 (19:18 -0400)]
Merge branch 'mlxsw-fixes'
Jiri Pirko says:
====================
mlxsw: IPv4 UC router fixes
Ido says:
Patches 1-3 fix a long standing problem in the driver's init sequence,
which manifests itself quite often when routing daemons try to configure
an IP address on registered netdevs that don't yet have an associated
vPort.
Patches 4-9 add missing packet traps for the router to work properly and
also fix ordering issue following the recent changes to the driver's init
sequence.
The last patch isn't related to the router, but fixes a general problem
in which under certain conditions packets aren't trapped to CPU.
v1->v2:
- Change order of patch 7
- Add patch 6 following Ilan's comment
- Add patchset name and cover letter
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 17 Aug 2016 14:39:37 +0000 (16:39 +0200)]
mlxsw: spectrum: Allow packets to be trapped from any PG
When packets enter the device they are classified to a priority group
(PG) buffer based on their PCP value. After their egress port and
traffic class are determined they are moved to the switch's shared
buffer and await transmission, if:
Packets scheduled to transmission through CPU port (trapped to CPU) use
traffic class 7, which has a zero maximum and minimum quotas. However,
when such packets arrive from PG 0 they are admitted to the shared
buffer as PG 0 has a non-zero minimum quota.
Allow all packets to be trapped to the CPU - regardless of the PG they
were classified to - by assigning a 10KB minimum quota for CPU port and
TC7.
Fixes: 8e8dfe9fdf06 ("mlxsw: spectrum: Add IEEE 802.1Qaz ETS support") Reported-by: Tamir Winetroub <tamirw@mellanox.com> Tested-by: Tamir Winetroub <tamirw@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 17 Aug 2016 14:39:36 +0000 (16:39 +0200)]
mlxsw: spectrum: Unmap 802.1Q FID before destroying it
Before destroying the 802.1Q FID we should first remove the VID-to-FID
mapping. This makes mlxsw_sp_fid_destroy() symmetric with regards to
mlxsw_sp_fid_create().
Fixes: 14d39461b3f4 ("mlxsw: spectrum: Use per-FID struct for the VLAN-aware bridge") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 17 Aug 2016 14:39:35 +0000 (16:39 +0200)]
mlxsw: spectrum: Add missing rollbacks in error path
While going over the code I noticed we are missing two rollbacks in the
port's creation error path. Add them and adjust the place of one of them
in the port's removal sequence so that both are symmetric.
Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 17 Aug 2016 14:39:33 +0000 (16:39 +0200)]
mlxsw: spectrum: Trap loop-backed packets
One of the conditions to generate an ICMP Redirect Message is that "the
packet is being forwarded out the same physical interface that it was
received from" (RFC 1812).
Therefore, we need to be able to trap such packets and let the kernel
decide what to do with them.
For each RIF, enable the loop-back filter, which will raise the LBERROR
trap whenever the ingress RIF equals the egress RIF.
Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces") Reported-by: Ilan Tayari <ilant@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 17 Aug 2016 14:39:31 +0000 (16:39 +0200)]
mlxsw: spectrum: Mark port as active before registering it
Commit bbf2a4757b30 ("mlxsw: spectrum: Initialize ports at the end of
init sequence") moved ports initialization to the end of the init
sequence, which means ports are the first to be removed during fini.
Since the FDB delayed work is still active when ports are removed it's
possible for it to process FDB notifications of inactive ports,
resulting in a warning message.
Fix that by marking ports as inactive only after unregistering them. The
NETDEV_UNREGISTER event will invoke bridge's driver port removal
sequence that will cause the FDB (and FDB notifications) to be flushed.
Fixes: bbf2a4757b30 ("mlxsw: spectrum: Initialize ports at the end of init sequence") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 17 Aug 2016 14:39:30 +0000 (16:39 +0200)]
mlxsw: spectrum: Create PVID vPort before registering netdevice
After registering a netdevice it's possible for user space applications
to configure an IP address on it. From the driver's perspective, this
means a router interface (RIF) should be created for the PVID vPort.
Therefore, we must create the PVID vPort before registering the
netdevice.
Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 17 Aug 2016 14:39:29 +0000 (16:39 +0200)]
mlxsw: spectrum: Remove redundant errors from the code
Currently, when device configuration fails we emit errors to the kernel
log despite the fact we already get these from the EMAD transaction
layer, so remove them.
In addition to being unnecessary, removing these error messages will
allow us to reuse mlxsw_sp_port_add_vid() to create the PVID vPort
before registering the netdevice.
Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 17 Aug 2016 14:39:28 +0000 (16:39 +0200)]
mlxsw: spectrum: Don't return upon error in removal path
When removing a VLAN filter from the device we shouldn't return upon the
first error we encounter, as otherwise we'll have resources that will
never be freed nor used.
Instead, we should keep trying to free as much resources as possible in
a best effort mode.
Remove the error message as well, since we already get these from the
EMAD transaction code.
Fixes: 99724c18fc66 ("mlxsw: spectrum: Introduce support for router interfaces") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 17 Aug 2016 19:10:22 +0000 (12:10 -0700)]
Merge tag 'for-v4.8-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply fixes from Sebastian Reichel.
* tag 'for-v4.8-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
power_supply: tps65217-charger: fix missing platform_set_drvdata()
power: reset: hisi-reboot: Unmap region obtained by of_iomap
power: reset: reboot-mode: fix build error of missing ioremap/iounmap on UM
power: supply: max17042_battery: fix model download bug.
Ard Biesheuvel [Wed, 17 Aug 2016 15:54:41 +0000 (17:54 +0200)]
arm64: kernel: avoid literal load of virtual address with MMU off
Literal loads of virtual addresses are subject to runtime relocation when
CONFIG_RELOCATABLE=y, and given that the relocation routines run with the
MMU and caches enabled, literal loads of relocated values performed with
the MMU off are not guaranteed to return the latest value unless the
memory covering the literal is cleaned to the PoC explicitly.
So defer the literal load until after the MMU has been enabled, just like
we do for primary_switch() and secondary_switch() in head.S.
Fixes: 1e48ef7fcc37 ("arm64: add support for building vmlinux as a relocatable PIE binary") Cc: <stable@vger.kernel.org> # 4.6+ Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Catalin Marinas [Mon, 15 Aug 2016 15:33:10 +0000 (16:33 +0100)]
arm64: Fix NUMA build error when !CONFIG_ACPI
Since asm/acpi.h is only included by linux/acpi.h when CONFIG_ACPI is
enabled, disabling the latter leads to the following build error on
arm64:
arch/arm64/mm/numa.c: In function ‘arm64_numa_init’:
arch/arm64/mm/numa.c:395:24: error: ‘arm64_acpi_numa_init’ undeclared (first use in this function)
if (!acpi_disabled && !numa_init(arm64_acpi_numa_init))
This patch include the asm/acpi.h explicitly in arch/arm64/mm/numa.c for
the arm64_acpi_numa_init() definition.
Fixes: d8b47fca8c23 ("arm64, ACPI, NUMA: NUMA support based on SRAT and SLIT") Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
dm raid: support raid0 with missing metadata devices
The raid0 MD personality does not start a raid0 array with any of its
data devices missing.
dm-raid was removing data/metadata device pairs unconditionally if it
failed to read a superblock off the respective metadata device of such
pair, resulting in failure to start arrays with the raid0 personality.
Avoid removing any data/metadata device pairs in case of raid0
(e.g. lvm2 segment type 'raid0_meta') thus allowing MD to start the
array.
Also, avoid region size validation for raid0.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Paul Gortmaker [Wed, 17 Aug 2016 10:21:35 +0000 (12:21 +0200)]
clocksource/drivers/mips-gic-timer: Make gic_clocksource_of_init() return int
In commit:
d8152bf85d2c0 ("clocksource/drivers/mips-gic-timer: Convert init function to return error")
several return values were added to a void function resulting in the following warnings:
clocksource/mips-gic-timer.c: In function 'gic_clocksource_of_init':
clocksource/mips-gic-timer.c:175:3: warning: 'return' with a value, in function returning void [enabled by default]
clocksource/mips-gic-timer.c:183:4: warning: 'return' with a value, in function returning void [enabled by default]
clocksource/mips-gic-timer.c:190:3: warning: 'return' with a value, in function returning void [enabled by default]
clocksource/mips-gic-timer.c:195:3: warning: 'return' with a value, in function returning void [enabled by default]
clocksource/mips-gic-timer.c:200:3: warning: 'return' with a value, in function returning void [enabled by default]
clocksource/mips-gic-timer.c:211:2: warning: 'return' with a value, in function returning void [enabled by default]
clocksource/mips-gic-timer.c: At top level:
clocksource/mips-gic-timer.c:213:1: warning: comparison of distinct pointer types lacks a cast [enabled by default]
clocksource/mips-gic-timer.c: In function 'gic_clocksource_of_init':
clocksource/mips-gic-timer.c:183:18: warning: ignoring return value of 'PTR_ERR', declared with attribute warn_unused_result [-Wunused-result]
Given that the addition of the return values was intentional, it seems
that the conversion of the containing function from void to int was
simply overlooked.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mips@linux-mips.org Fixes: d8152bf85d2c ("clocksource/drivers/mips-gic-timer: Convert init function to return error") Link: http://lkml.kernel.org/r/1471429296-9053-3-git-send-email-daniel.lezcano@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
I could not figure out why, but GCC cannot prove that the
kona_timer_init() function always initializes its two outputs,
and we get a warning for the use of the 'lsw' variable later,
which is obviously correct.
drivers/clocksource/bcm_kona_timer.c: In function 'kona_timer_init':
drivers/clocksource/bcm_kona_timer.c:119:13: error: 'lsw' may be used uninitialized in this function [-Werror=maybe-uninitialized]
Slightly reordering the loop makes the warning disappear, after
it becomes more obvious to the compiler that the loop is
always entered on the first iteration.
As pointed out by Ray Jui, there is a related problem in the
way we deal with the loop running into the limit, as we just
keep going there with an invalid counter data, so instead we
now propagate a -ETIMEDOUT result to the caller.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Acked-by: Ray Jui <ray.jui@broadcom.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: bcm-kernel-feedback-list@broadcom.com Link: http://lkml.kernel.org/r/1471429296-9053-2-git-send-email-daniel.lezcano@linaro.org Link: https://patchwork.kernel.org/patch/9174261/ Signed-off-by: Ingo Molnar <mingo@kernel.org>
Gregory CLEMENT [Wed, 17 Aug 2016 10:21:33 +0000 (12:21 +0200)]
clocksource/drivers/time-armada-370-xp: Fix the clock reference
While converting the init function to return an error, the wrong clock
was get. This leads to the wrong clock rate and slows down the kernel.
For example, it affects typical boot time:
- without fix: over 1 minute
- with fix: 15 seconds
Tested-by: Stefan Roese <sr@denx.de> Tested-by: Ralph Sennhauser <ralph.sennhauser@gmail.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 12549e27c63c ("clocksource/drivers/time-armada-370-xp: Convert init function to return error") Link: http://lkml.kernel.org/r/1471429296-9053-1-git-send-email-daniel.lezcano@linaro.org
[ Refined the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
Darrick J. Wong [Wed, 17 Aug 2016 01:12:57 +0000 (11:12 +1000)]
xfs: remove OWN_AG rmap when allocating a block from the AGFL
When we're really tight on space, xfs_alloc_ag_vextent_small() can
allocate a block from the AGFL and give it to the caller. Since the
caller is never the AGFL-fixing method, we must remove the OWN_AG
reverse mapping because it will clash with whatever rmap the caller
wants to set up. This bug was discovered by running generic/299
repeatedly.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
Linus Torvalds [Tue, 16 Aug 2016 22:51:57 +0000 (15:51 -0700)]
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/vhost fixes from Michael Tsirkin:
- test fixes
- a vsock fix
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
tools/virtio: add dma stubs
vhost/test: fix after swiotlb changes
vhost/vsock: drop space available check for TX vq
ringtest: test build fix
Linus Torvalds [Tue, 16 Aug 2016 22:50:22 +0000 (15:50 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"A couple of bug fixes, minor cleanup and a change to the default
config"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/dasd: fix failing CUIR assignment under LPAR
s390/pageattr: handle numpages parameter correctly
s390/dasd: fix hanging device after clear subchannel
s390/qdio: avoid reschedule of outbound tasklet once killed
s390/qdio: remove checks for ccw device internal state
s390/qdio: fix double return code evaluation
s390/qdio: get rid of spin_lock_irqsave usage
s390/cio: remove subchannel_id from ccw_device_private
s390/qdio: obtain subchannel_id via ccw_device_get_schid()
s390/cio: stop using subchannel_id from ccw_device_private
s390/config: make the vector optimized crc function builtin
s390/lib: fix memcmp and strstr
s390/crc32-vx: Fix checksum calculation for small sizes
s390: clarify compressed image code path
Dave Chinner [Tue, 16 Aug 2016 22:41:34 +0000 (08:41 +1000)]
iomap: prepare iomap_fiemap for attribute mappings
By bassing through an -ENOENT, similar to the old XFS implementation of
FIEMAP_FLAG_XATTR.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
[hch: split from a larger patch] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
Dave Chinner [Tue, 16 Aug 2016 22:41:10 +0000 (08:41 +1000)]
iomap: fiemap should honor the FIEMAP_FLAG_SYNC flag
The flag is checked as supported, but then we do an unconditional
sync of the file, regardless of whether the flag is set or not. Make
the sync conditional on having the FIEMAP_FLAG_SYNC flag set.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
iomap: remove superflous pagefault_disable from iomap_write_actor
iov_iter_copy_from_user_atomic disables page faults internally, no need to
do it around the call. This also brings the iomap code in line with
the original filemap version.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
iomap: remove superflous mark_page_accessed from iomap_write_actor
This catches up with commit 2457ae ("mm: non-atomically mark page
accessed during page cache allocation where possible"), which
moved the initial access marking into the pagecache allocator.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
Darrick J. Wong [Tue, 16 Aug 2016 22:31:49 +0000 (08:31 +1000)]
xfs: store rmapbt block count in the AGF
Track the number of blocks used for the rmapbt in the AGF. When we
get to the AG reservation code we need this counter to quickly
make our reservation during mount.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
Dave Chinner [Tue, 16 Aug 2016 22:31:33 +0000 (08:31 +1000)]
xfs: don't invalidate whole file on DAX read/write
When we do DAX IO, we try to invalidate the entire page cache held
on the file. This is incorrect as it will trash the entire mapping
tree that now tracks dirty state in exceptional entries in the radix
tree slots.
What we are trying to do is remove cached pages (e.g from reads
into holes) that sit in the radix tree over the range we are about
to write to. Hence we should just limit the invalidation to the
range we are about to overwrite.
Reported-by: Jan Kara <jack@suse.cz> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
xfs: fix bogus space reservation in xfs_iomap_write_allocate
The space reservations was without an explaination in commit
"Add error reporting calls in error paths that return EFSCORRUPTED"
back in 2003. There is no reason to reserve disk blocks in the
transaction when allocating blocks for delalloc space as we already
reserved the space when creating the delalloc extent.
With this fix we stop running out of the reserved pool in
generic/229, which has happened for long time with small blocksize
file systems, and has increased in severity with the new buffered
write path.
[ dchinner: we still need to pass the block reservation into
xfs_bmapi_write() to ensure we don't deadlock during AG selection.
See commit dbd5c8c ("xfs: pass total block res. as total
xfs_bmapi_write() parameter") for more details on why this is
necessary. ]
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
Brian Foster [Tue, 16 Aug 2016 22:30:28 +0000 (08:30 +1000)]
xfs: don't assert fail on non-async buffers on ioacct decrement
The buffer I/O accounting mechanism tracks async buffers under I/O. As
an optimization, the buffer I/O count is incremented only once on the
first async I/O for a given hold cycle of a buffer and decremented once
the buffer is released to the LRU (or freed).
xfs_buf_ioacct_dec() has an ASSERT() check for an XBF_ASYNC buffer, but
we have one or two corner cases where a buffer can be submitted for I/O
multiple times via different methods in a single hold cycle. If an async
I/O occurs first, the I/O count is incremented. If a sync I/O occurs
before the hold count drops, XBF_ASYNC is cleared by the time the I/O
count is decremented.
Remove the async assert check from xfs_buf_ioacct_dec() as this is a
perfectly valid scenario. For the purposes of I/O accounting, we really
only care about the buffer async state at I/O submission time.
Discovered-and-analyzed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
Dave Ertman [Fri, 12 Aug 2016 16:56:32 +0000 (09:56 -0700)]
i40e: check for and deal with non-contiguous TCs
The i40e driver was causing a kernel panic when
non-contiguous Traffic Classes, or Traffic Classes not
starting with TC0, were configured on a link partner switch.
i40e does not support non-contiguous TCs.
To fix this, the patch changes the logic when determining
the total number of TCs enabled. Before, this would use the
highest TC number enabled and assume that all TCs below it were
also enabled. Now, we create a bitmask of enabled TCs and scan
it to determine not only the number of TCs, but also if the set
of enabled TCs starts at zero and is contiguous. If not, then
DCB is disabled by only returning one TC.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
dm raid: enhance attempt_restore_of_faulty_devices() to support more devices
attempt_restore_of_faulty_devices() is limited to 64 when it should support
the new maximum of 253 when identifying any failed devices. It clears any
revivable devices via an MD personality hot remove and add cylce to allow
for their recovery.
Address by using existing functions to retrieve and update all failed
devices' bitfield members in the dm raid superblocks on all RAID devices
and check for any devices to clear in it.
Whilst on it, don't call attempt_restore_of_faulty_devices() for any MD
personality not providing disk hot add/remove methods (i.e. raid0 now),
because such personalities don't support reviving of failed disks.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Alexander Duyck [Fri, 12 Aug 2016 16:53:39 +0000 (09:53 -0700)]
ixgbe: Re-enable ability to toggle VLAN filtering
Back when I submitted the GSO code I messed up and dropped the support for
disabling the VLAN tag filtering via the feature bit. This patch
re-enables the use of the NETIF_F_HW_VLAN_CTAG_FILTER to enable/disable the
VLAN filtering independent of toggling promiscuous mode.
Fixes: b83e30104b ("ixgbe/ixgbevf: Add support for GSO partial") Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
dm raid: fix restoring of failed devices regression
'lvchange --refresh RaidLV' causes a mapped device suspend/resume cycle
aiming at device restore and resync after transient device failures. This
failed because flag RT_FLAG_RS_RESUMED was always cleared in the suspend path,
thus the device restore wasn't performed in the resume path.
Solve by removing RT_FLAG_RS_RESUMED from the suspend path and resume
unconditionally. Also, remove superfluous comment from raid_resume().
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Alexander Duyck [Thu, 11 Aug 2016 21:51:56 +0000 (14:51 -0700)]
ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths
When I was adding the code for enabling VLAN promiscuous mode with SR-IOV
enabled I had inadvertently left the VLNCTRL.VFE bit unchanged as I has
assumed there was code in another path that was setting it when we enabled
SR-IOV. This wasn't the case and as a result we were just disabling VLAN
filtering for all the VFs apparently.
Also the previous patches were always clearing CFIEN which was always set
to 0 by the hardware anyway so I am dropping the redundant bit clearing.
Fixes: 16369564915a ("ixgbe: Add support for VLAN promiscuous with SR-IOV") Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On LVM2 conversions via lvconvert(8), the target keeps mapped devices in
frozen state when requesting RAID devices be resynchronized. This
applies to e.g. adding legs to a raid1 device or taking over from raid0
to raid4 when the rebuild flag's set on the new raid1 legs or the added
dedicated parity stripe.
Also, fix frozen recovery for reshaping as well.
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>