git.karo-electronics.de Git - karo-tx-linux.git/log

s390/vtime: correct system time accounting

There is a slight misaccounting of system time in vtime_account_user.
This function is called once per HZ tick in interrupt context.
The irq_enter function already accounted the system time up to the
point of the irq_enter call. The system time from irq_enter until
vtime_account_user/do_account_vtime is reached is irq time but it
is accounted to the previous context.

Just drop the hardirq offset from arch/s390/kernel/vtime.c.

Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull quota, fsnotify and ext2 updates from Jan Kara:
"Changes to locking of some quota operations from dedicated quota mutex
  to s_umount semaphore, a fsnotify fix and a simple ext2 fix"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: Fix bogus warning in dquot_disable()
  fsnotify: Fix possible use-after-free in inode iteration on umount
  ext2: reject inodes with negative size
  quota: Remove dqonoff_mutex
  ocfs2: Use s_umount for quota recovery protection
  quota: Remove dqonoff_mutex from dquot_scan_active()
  ocfs2: Protect periodic quota syncing with s_umount semaphore
  quota: Use s_umount protection for quota operations
  quota: Hold s_umount in exclusive mode when enabling / disabling quotas
  fs: Provide function to get superblock with exclusive s_umount

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
"Early fixes for x86.

  Instead of the (botched) revert, the lockdep/might_sleep splat has a
  real fix provided by Andrea"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: nVMX: Allow L1 to intercept software exceptions (#BP and #OF)
  kvm: take srcu lock around kvm_steal_time_set_preempted()
  kvm: fix schedule in atomic in kvm_steal_time_set_preempted()
  KVM: hyperv: fix locking of struct kvm_hv fields
  KVM: x86: Expose Intel AVX512IFMA/AVX512VBMI/SHA features to guest.
  kvm: nVMX: Correct a VMX instruction error code for VMPTRLD

Merge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging

Pull dmi fix from Jean Delvare.

* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
firmware: dmi_scan: Always show system identification string

Merge tag 'mfd-for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

Pull MFD updates from Lee Jones:
"New Device Support
   - Add support for Ricoh RC5T619 PMIC to rn5t618
   - Add support for PM8821 PMIC to qcom-pm8xxx

  New Functionality:
   - Add support for GPIO to lpc_ich
   - Add support for GPADC to sun4i
   - Add ability for rk808 to shutdown

  Fix-ups:
   - Simplify/strip unnecessary code; tps65218, palmas, tps65217
   - Device Tree binding updates; tps65218, altera-a10sr
   - Provide/export device ID info; tps65218, axp20x-i2c, hi655x-pmic,
     fsl-imx25-tsadc, intel_soc_pmic_bxtwc
   - Use MFD API instead of of_platform_populate(); tps65218
   - Generalise name-space; pm8xxx
   - Supply/edit regmap configuration; axp20x, cs47l24-tables, axp20x
   - Enable compile testing; max77620, max77686, exynos-lpass,
     abx500-core
   - Coding style issues; wm8994-core, wm5102-tables
   - Supply endian support; syscon
   - Remove module support; ab3100-core, ab8500-debugfs, ab8500-gpadc,
     abx500-core

  Bug Fixes:
   - Fix ordering issues; wm8994
   - Fix dependencies (build-time/run-time); exynos_lpass, sun4i-gpadc
   - Fix compiler warnings; sun4i-gpadc
   - Fix leaks; mfd-core
   - Fix page fault during module unload; tps65217"

* tag 'mfd-for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (49 commits)
  mfd: tps65217: Support an interrupt pin as the system wakeup
  mfd: tps65217: Make an interrupt handler simpler
  mfd: tps65217: Update register interrupt mask bits instead of writing operation
  mfd: tps65217: Specify the IRQ name
  mfd: tps65217: Fix page fault on unloading modules
  mfd: palmas: Remove redundant check in palmas_power_off
  mfd: arizona: Disable IRQs during driver remove
  mfd: pm8xxx: add support to pm8821
  mfd: intel-lpss: Try to enable Memory-Write-Invalidate
  mfd: rn5t618: Add Ricoh RC5T619 PMIC support
  mfd: axp20x: Add address extension registers for AXP806 regmap
  mfd: intel_soc_pmic_bxtwc: Fix a typo in MODULE_DEVICE_TABLE()
  mfd: core: Fix device reference leak in mfd_clone_cell
  mfd: bcm590xx: Simplify a test
  mfd: sun4i-gpadc: Select regmap-irq
  mfd: abx500-core: drop unused MODULE_ tags from non-modular code
  mfd: ab8500: make sysctrl explicitly non-modular
  mfd: ab8500-gpadc: Make it explicitly non-modular
  mfd: ab8500-debugfs: Make it explicitly non-modular
  mfd: ab8500-core: Make it explicitly non-modular
  ...

kvm: nVMX: Allow L1 to intercept software exceptions (#BP and #OF)

When L2 exits to L0 due to "exception or NMI", software exceptions
(#BP and #OF) for which L1 has requested an intercept should be
handled by L1 rather than L0. Previously, only hardware exceptions
were forwarded to L1.

Signed-off-by: Jim Mattson <jmattson@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: take srcu lock around kvm_steal_time_set_preempted()

kvm_memslots() will be called by kvm_write_guest_offset_cached() so
take the srcu lock.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: fix schedule in atomic in kvm_steal_time_set_preempted()

kvm_steal_time_set_preempted() isn't disabling the pagefaults before
calling __copy_to_user and the kernel debug notices.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

quota: Fix bogus warning in dquot_disable()

dquot_disable() was warning when sb_has_quota_loaded() was true when
invalidating page cache for quota files. The thinking behind this
warning was that we must have raced with somebody else turning quotas on
and this should not happen because all places modifying quota state must
hold s_umount exclusively now. However sb_has_quota_loaded() can be also
true at this point when we are just suspending quotas on remount
read-only. Just restore the behavior to situation before commit
c3b004460d77 ("quota: Remove dqonoff_mutex") which introduced the
warning.

The code in dquot_disable() can be further simplified with the new
locking of quota state changes however let's leave that to a separate
commit that can get more testing exposure.

Fixes: c3b004460d77bf3f980d877be539016f2df4df12
Signed-off-by: Jan Kara <jack@suse.cz>

firmware: dmi_scan: Always show system identification string

Let's keep consistent when print dmi_ids_string between SMBIOS 2.x
and SMBIOS 3.x, and always show the system identification string,
like Vendor, Product/Board name and BIOS infos.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>

Merge tag 'rtc-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
  "Subsystem:
   - non-modular drivers are now explicitly non-modular

  New driver:
    - Epson Toyocom rtc-7301sf/dg

  Drivers:
   - cmos: reject unsupported alarm values wrt the RTC capabilities
   - ds1307: ACPI support
   - jz4740: DT support, jz4780 handling, can now be used as a system
     power controller
   - mcp795: many fixes, in particular proper month handling
   - twl: driver is now DT only"

* tag 'rtc-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (31 commits)
  rtc: mcp795: Fix whitespace and indentation.
  rtc: mcp795: Prefer using the BIT() macro.
  rtc: mcp795: fix month write resetting date to 1.
  rtc: mcp795: fix time range difference between linux and RTC chip.
  rtc: mcp795: fix bitmask value for leap year (LP).
  rtc: mcp795: use bcd2bin/bin2bcd.
  rtc: add support for EPSON TOYOCOM RTC-7301SF/DG
  rtc: ds1307: Add ACPI support
  rtc: imxdi: (trivial) fix a typo
  rtc: ds1374: Merge conditional + WARN_ON()
  rtc: twl: make driver DT only
  rtc: twl: kill static variables
  rtc: fix typos in Kconfig
  rtc: jz4740: make the driver builtin only
  rtc: jz4740: remove unused EXPORT_SYMBOL
  Documentation: bindings: fix twl-rtc documentation
  rtc: Enable compile testing for Maxim and Samsung drivers
  MIPS: jz4740: Remove obsolete code
  MIPS: qi_lb60: Probe RTC driver from DT and use it as power controller
  MIPS: jz4740: DTS: Probe the jz4740-rtc driver from devicetree
  ...

rtc: mcp795: Fix whitespace and indentation.

Fix whitespace and indentation errors and the following
checkpatch warnings:
- line 15: Block comments use a trailing */ on a separate line
- line 256: Line over 80 characters
No code change.

Signed-off-by: Emil Bartczak <emilbart@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

rtc: mcp795: Prefer using the BIT() macro.

This patch doesn't change the code but replaces all bitmask values
with the BIT(x) macro.

Signed-off-by: Emil Bartczak <emilbart@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

rtc: mcp795: fix month write resetting date to 1.

According to Microchip errata some combinations of date and month
values may result in the date being reset to 1, even if the date
is also written with the month (for example 31-07 or 31-08).
As a workaround avoid writing date and month values within the same
Write command. Instead, terminate the Write command after loading
the date and begin a new command to write the month. In addition,
disable the oscillator before loading the new values. This is done
by ensuring both the ST and EXTOSC bits are cleared and waiting for
the OSCON bit to clear.

Signed-off-by: Emil Bartczak <emilbart@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

rtc: mcp795: fix time range difference between linux and RTC chip.

In linux rtc_time struct, tm_mon range is 0~11, while in RTC HW REG,
month range is 1~12. This patch adjusts difference of them.

Signed-off-by: Emil Bartczak <emilbart@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

rtc: mcp795: fix bitmask value for leap year (LP).

According the datasheet the leap year is a fifth bit in month register.

Signed-off-by: Emil Bartczak <emilbart@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

rtc: mcp795: use bcd2bin/bin2bcd.

Change rtc-mcp795.c to use the bcd2bin/bin2bcd functions.
This change fixes the wrong conversion of month value
from binary to BCD (missing right shift operation for 10 month).

Signed-off-by: Emil Bartczak <emilbart@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

rtc: add support for EPSON TOYOCOM RTC-7301SF/DG

This adds support for EPSON TOYOCOM RTC-7301SF/DG which has parallel
interface compatible with SRAM.

This driver supports basic clock, calendar and alarm functionality.

Tested with Microblaze linux running on Artix7 FPGA board with my own
custom IP for RTC-7301.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

rtc: ds1307: Add ACPI support

This patch enables ACPI support for rtc-ds1307 driver.

Signed-off-by: Tin Huynh <tnhuynh@apm.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>

Merge tag 'libnvdimm-for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
"The libnvdimm pull request is relatively small this time around due to
  some development topics being deferred to 4.11.

  As for this pull request the bulk of it has been in -next for several
  releases leading to one late fix being added (commit 868f036fee4b
  ("libnvdimm: fix mishandled nvdimm_clear_poison() return value")). It
  has received a build success notification from the 0day-kbuild robot
  and passes the latest libnvdimm unit tests.

  Summary:

   - Dynamic label support: To date namespace label support has been
     limited to disambiguating cases where PMEM (direct load/store) and
     BLK (mmio aperture) accessed-capacity alias on the same DIMM. Since
     4.9 added support for multiple namespaces per PMEM-region there is
     value to support namespace labels even in the non-aliasing case.
     The presence of a valid namespace index block force-enables label
     support when the kernel would otherwise rely on region boundaries,
     and permits the region to be sub-divided.

   - Handle media errors in namespace metadata: Complement the error
     handling for media errors in namespace data areas with support for
     clearing errors on writes, and downgrading potential machine-check
     exceptions to simple i/o errors on read.

   - Device-DAX region attributes: Add 'align', 'id', and 'size' as
     attributes for device-dax regions. In particular this enables
     userspace tooling to generically size memory mapping and i/o
     operations. Prevent userspace from growing assumptions /
     dependencies about the parent device topology for a dax region. A
     libnvdimm namespace may not always be the parent device of a dax
     region.

   - Various cleanups and small fixes"

* tag 'libnvdimm-for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  dax: add region 'id', 'size', and 'align' attributes
  libnvdimm: fix mishandled nvdimm_clear_poison() return value
  libnvdimm: replace mutex_is_locked() warnings with lockdep_assert_held
  libnvdimm, pfn: fix align attribute
  libnvdimm, e820: use module_platform_driver
  libnvdimm, namespace: use octal for permissions
  libnvdimm, namespace: avoid multiple sector calculations
  libnvdimm: remove else after return in nsio_rw_bytes()
  libnvdimm, namespace: fix the type of name variable
  libnvdimm: use consistent naming for request_mem_region()
  nvdimm: use the right length of "pmem"
  libnvdimm: check and clear poison before writing to pmem
  tools/testing/nvdimm: dynamic label support
  libnvdimm: allow a platform to force enable label support
  libnvdimm: use generic iostat interfaces

Merge tag 'platform-drivers-x86-v4.10-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86

Pull more x86 platform driver updates from Darren Hart:
"Move and add registration for the mlx-platform driver. Introduce
  button and lid drivers for the surface3 (different from the
  surface3-pro). Add BXT PMIC TMU support. Add Y700 to existing
  ideapad-laptop quirk.

  Summary:

  ideapad-laptop:
   - Add Y700 15-ACZ to no_hw_rfkill DMI list

  surface3_button:
   - Introduce button support for the Surface 3

  surface3-wmi:
   - Add custom surface3 platform device for controlling LID
   - Balance locking on error path

  mlx-platform:
   - Add mlxcpld-hotplug driver registration
   - Fix semicolon.cocci warnings
   - Move module from arch/x86

  platform/x86:
   - Add Whiskey Cove PMIC TMU support"

* tag 'platform-drivers-x86-v4.10-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
  platform/x86: surface3-wmi: Balance locking on error path
  platform/x86: Add Whiskey Cove PMIC TMU support
  platform/x86: ideapad-laptop: Add Y700 15-ACZ to no_hw_rfkill DMI list
  platform/x86: Introduce button support for the Surface 3
  platform/x86: Add custom surface3 platform device for controlling LID
  platform/x86: mlx-platform: Add mlxcpld-hotplug driver registration
  platform/x86: mlx-platform: Fix semicolon.cocci warnings
  platform/x86: mlx-platform: Move module from arch/x86

platform/x86: surface3-wmi: Balance locking on error path

There is a possibility that lock will be left acquired.
Consolidate error path under out_free_unlock label.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

platform/x86: Add Whiskey Cove PMIC TMU support

This adds TMU (Time Management Unit) support for Intel BXT platform.
It enables the alarm wake-up functionality in the TMU unit of Whiskey Cove
PMIC.

Signed-off-by: Nilesh Bacchewar <nilesh.bacchewar@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[andy: resolve merge conflict in Kconfig]
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

Merge branch 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer updates from Thomas Gleixner:
"This is the last functional update from the tip tree for 4.10. It got
  delayed due to a newly reported and anlyzed variant of BIOS bug and
  the resulting wreckage:

   - Seperation of TSC being marked realiable and the fact that the
     platform provides the TSC frequency via CPUID/MSRs and making use
     for it for GOLDMONT.

   - TSC adjust MSR validation and sanitizing:

     The TSC adjust MSR contains the offset to the hardware counter. The
     sum of the adjust MSR and the counter is the TSC value which is
     read via RDTSC.

     On at least two machines from different vendors the BIOS sets the
     TSC adjust MSR to negative values. This happens on cold and warm
     boot. While on cold boot the offset is a few milliseconds, on warm
     boot it basically compensates the power on time of the system. The
     BIOSes are not even using the adjust MSR to set all CPUs in the
     package to the same offset. The offsets are different which renders
     the TSC unusable,

     What's worse is that the TSC deadline timer has a HW feature^Wbug.
     It malfunctions when the TSC adjust value is negative or greater
     equal 0x80000000 resulting in silent boot failures, hard lockups or
     non firing timers. This looks like some hardware internal 32/64bit
     issue with a sign extension problem. Intel has been silent so far
     on the issue.

     The update contains sanity checks and keeps the adjust register
     within working limits and in sync on the package.

     As it looks like this disease is spreading via BIOS crapware, we
     need to address this urgently as the boot failures are hard to
     debug for users"

* 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tsc: Limit the adjust value further
  x86/tsc: Annotate printouts as firmware bug
  x86/tsc: Force TSC_ADJUST register to value >= zero
  x86/tsc: Validate TSC_ADJUST after resume
  x86/tsc: Validate cpumask pointer before accessing it
  x86/tsc: Fix broken CONFIG_X86_TSC=n build
  x86/tsc: Try to adjust TSC if sync test fails
  x86/tsc: Prepare warp test for TSC adjustment
  x86/tsc: Move sync cleanup to a safe place
  x86/tsc: Sync test only for the first cpu in a package
  x86/tsc: Verify TSC_ADJUST from idle
  x86/tsc: Store and check TSC ADJUST MSR
  x86/tsc: Detect random warps
  x86/tsc: Use X86_FEATURE_TSC_ADJUST in detect_art()
  x86/tsc: Finalize the split of the TSC_RELIABLE flag
  x86/tsc: Set TSC_KNOWN_FREQ and TSC_RELIABLE flags on Intel Atom SoCs
  x86/tsc: Mark Intel ATOM_GOLDMONT TSC reliable
  x86/tsc: Mark TSC frequency determined by CPUID as known
  x86/tsc: Add X86_FEATURE_TSC_KNOWN_FREQ flag

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes and cleanups from Thomas Gleixner:
"This set of updates contains:

   - Robustification for the logical package managment. Cures the AMD
     and virtualization issues.

   - Put the correct start_cpu() return address on the stack of the idle
     task.

   - Fixups for the fallout of the nodeid <-> cpuid persistent mapping
     modifciations

   - Move the x86/MPX specific mm_struct member to the arch specific
     mm_context where it belongs

   - Cleanups for C89 struct initializers and useless function
     arguments"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/floppy: Use designated initializers
  x86/mpx: Move bd_addr to mm_context_t
  x86/mm: Drop unused argument 'removed' from sync_global_pgds()
  ACPI/NUMA: Do not map pxm to node when NUMA is turned off
  x86/acpi: Use proper macro for invalid node
  x86/smpboot: Prevent false positive out of bounds cpumask access warning
  x86/boot/64: Push correct start_cpu() return address
  x86/boot/64: Use 'push' instead of 'call' in start_cpu()
  x86/smpboot: Make logical package management more robust

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Thomas Gleixner:
"Prevent NULL pointer dereferencing in the tick broadcast code. Old
bug, which got unearthed by the hotplug ordering problem"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/broadcast: Prevent NULL pointer dereference

Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull SMP hotplug fixes from Thomas Gleixner:
"Two fixlets for cpu hotplug:

   - Fix a subtle ordering problem with the dummy timer. This happened
     to work before the conversion by chance due to initcall ordering.

   - Fix the function comment for __cpuhp_setup_state()"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/hotplug: Clarify description of __cpuhp_setup_state() return value
  clocksource/dummy_timer: Move hotplug callback after the real timers

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Thomas Gleixner:
"A fix for the irq affinity spread algorithm so it handles non linear
node numbering nicely"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/affinity: Fix node generation from cpumask

x86/tsc: Limit the adjust value further

Adjust value 0x80000000 and other values larger than that render the TSC
deadline timer disfunctional.

We have not yet any information about this from Intel, but experimentation
clearly proves that this is a 32/64 bit and sign extension issue.

If adjust values larger than that are actually required, which might be the
case for physical CPU hotplug, then we need to disable the deadline timer
on the affected package/CPUs and use the local APIC timer instead.

That requires some surgery in the APIC setup code, so we just limit the
ADJUST register value into the known to work range for now and revisit this
when Intel comes forth with proper information.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
Cc: Kevin Stanton <kevin.b.stanton@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>

x86/tsc: Annotate printouts as firmware bug

Make it more obvious that the BIOS is screwed up.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Roland Scheidegger <rscheidegger_lists@hispeed.ch>
Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
Cc: Kevin Stanton <kevin.b.stanton@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>

x86/floppy: Use designated initializers

Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. These were identified during
allyesconfig builds of x86, arm, and arm64, with most initializer fixes
extracted from grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161217213705.GA1248@beast
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes and cleanups from David Miller:

1) Revert bogus nla_ok() change, from Alexey Dobriyan.

2) Various bpf validator fixes from Daniel Borkmann.

3) Add some necessary SET_NETDEV_DEV() calls to hsis_femac and hip04
    drivers, from Dongpo Li.

4) Several ethtool ksettings conversions from Philippe Reynes.

5) Fix bugs in inet port management wrt. soreuseport, from Tom Herbert.

6) XDP support for virtio_net, from John Fastabend.

7) Fix NAT handling within a vrf, from David Ahern.

8) Endianness fixes in dpaa_eth driver, from Claudiu Manoil

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (63 commits)
  net: mv643xx_eth: fix build failure
  isdn: Constify some function parameters
  mlxsw: spectrum: Mark split ports as such
  cgroup: Fix CGROUP_BPF config
  qed: fix old-style function definition
  net: ipv6: check route protocol when deleting routes
  r6040: move spinlock in r6040_close as SOFTIRQ-unsafe lock order detected
  irda: w83977af_ir: cleanup an indent issue
  net: sfc: use new api ethtool_{get|set}_link_ksettings
  net: davicom: dm9000: use new api ethtool_{get|set}_link_ksettings
  net: cirrus: ep93xx: use new api ethtool_{get|set}_link_ksettings
  net: chelsio: cxgb3: use new api ethtool_{get|set}_link_ksettings
  net: chelsio: cxgb2: use new api ethtool_{get|set}_link_ksettings
  bpf: fix mark_reg_unknown_value for spilled regs on map value marking
  bpf: fix overflow in prog accounting
  bpf: dynamically allocate digest scratch buffer
  gtp: Fix initialization of Flags octet in GTPv1 header
  gtp: gtp_check_src_ms_ipv4() always return success
  net/x25: use designated initializers
  isdn: use designated initializers
  ...

Merge uncontroversial parts of branch 'readlink' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull partial readlink cleanups from Miklos Szeredi.

This is the uncontroversial part of the readlink cleanup patch-set that
simplifies the default readlink handling.

Miklos and Al are still discussing the rest of the series.

* git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  vfs: make generic_readlink() static
  vfs: remove ".readlink = generic_readlink" assignments
  vfs: default to generic_readlink()
  vfs: replace calling i_op->readlink with vfs_readlink()
  proc/self: use generic_readlink
  ecryptfs: use vfs_get_link()
  bad_inode: add missing i_op initializers

net: mv643xx_eth: fix build failure

The build of sparc allmodconfig fails with the error:
"of_irq_to_resource" [drivers/net/ethernet/marvell/mv643xx_eth.ko]
undefined!

of_irq_to_resource() is defined when CONFIG_OF_IRQ is defined. And also
CONFIG_OF_IRQ can only be defined if CONFIG_IRQ is defined. So we can
safely use #if defined(CONFIG_OF_IRQ) in the code.

Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

isdn: Constify some function parameters

The coming initify gcc plugin expects const pointer types, and caught
some __printf arguments that weren't const yet. This fixes those.

Signed-off-by: Emese Revfy <re.emese@gmail.com>
[kees: expanded commit message]
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Mark split ports as such

When a port is split we should mark it as such, as otherwise the split
ports aren't renamed correctly (e.g. sw1p3 -> sw1p3s1) and the unsplit
operation fails:

$ devlink port split sw1p3 count 4
$ devlink port unsplit eth0
devlink answers: Invalid argument
[ 598.565307] mlxsw_spectrum 0000:03:00.0 eth0: Port wasn't split

Fixes: 67963a33b4fd ("mlxsw: Make devlink port instances independent of spectrum/switchx2 port instances")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Tamir Winetroub <tamirw@mellanox.com>
Reviewed-by: Elad Raz <eladr@mellanox.com>
Tested-by: Tamir Winetroub <tamirw@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull more vfs updates from Al Viro:
"In this pile:

   - autofs-namespace series
   - dedupe stuff
   - more struct path constification"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (40 commits)
  ocfs2: implement the VFS clone_range, copy_range, and dedupe_range features
  ocfs2: charge quota for reflinked blocks
  ocfs2: fix bad pointer cast
  ocfs2: always unlock when completing dio writes
  ocfs2: don't eat io errors during _dio_end_io_write
  ocfs2: budget for extent tree splits when adding refcount flag
  ocfs2: prohibit refcounted swapfiles
  ocfs2: add newlines to some error messages
  ocfs2: convert inode refcount test to a helper
  simple_write_end(): don't zero in short copy into uptodate
  exofs: don't mess with simple_write_{begin,end}
  9p: saner ->write_end() on failing copy into non-uptodate page
  fix gfs2_stuffed_write_end() on short copies
  fix ceph_write_end()
  nfs_write_end(): fix handling of short copies
  vfs: refactor clone/dedupe_file_range common functions
  fs: try to clone files first in vfs_copy_file_range
  vfs: misc struct path constification
  namespace.c: constify struct path passed to a bunch of primitives
  quota: constify struct path in quota_on
  ...

cgroup: Fix CGROUP_BPF config

CGROUP_BPF depended on SOCK_CGROUP_DATA which can't be manually
enabled, making it rather challenging to turn CGROUP_BPF on.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mac80211-for-davem-2016-12-16' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Three fixes:
* avoid a WARN_ON() when trying to use WEP with AP_VLANs
* ensure enough headroom on mesh forwarding packets
* don't report unknown/invalid rates to userspace
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

qed: fix old-style function definition

The newly added file causes a harmless warning, with "make W=1":

drivers/net/ethernet/qlogic/qed/qed_iscsi.c: In function 'qed_get_iscsi_ops':
drivers/net/ethernet/qlogic/qed/qed_iscsi.c:1268:29: warning: old-style function definition [-Wold-style-definition]

This makes it a proper prototype.

Fixes: fc831825f99e ("qed: Add support for hardware offloaded iSCSI.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipv6: check route protocol when deleting routes

The protocol field is checked when deleting IPv4 routes, but ignored for
IPv6, which causes problems with routing daemons accidentally deleting
externally set routes (observed by multiple bird6 users).

This can be verified using `ip -6 route del <prefix> proto something`.

Signed-off-by: Mantas Mikulėnas <grawity@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r6040: move spinlock in r6040_close as SOFTIRQ-unsafe lock order detected

'ifconfig eth0 down' makes r6040_close() trigger:
INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected

Fixed by moving calls to phy_stop(), napi_disable(), netif_stop_queue()
to outside of the module's private spin_lock_irq block.

Found on a Versalogic Tomcat SBC with a Vortex86 SoC

s1660e_5150:~# sudo ifconfig eth0 down
[   61.306415] ======================================================
[   61.306415] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
[   61.306415] 4.9.0-gb898d2d-manuel #1 Not tainted
[   61.306415] ------------------------------------------------------
[   61.306415] ifconfig/449 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[   61.306415]  (&dev->lock){+.+...}, at: [<c1336276>] phy_stop+0x16/0x80

[   61.306415] and this task is already holding:
[   61.306415]  (&(&lp->lock)->rlock){+.-...}, at: [<d0934c84>] r6040_close+0x24/0x230 [r6040]
which would create a new lock dependency:
[   61.306415]  (&(&lp->lock)->rlock){+.-...} -> (&dev->lock){+.+...}

[   61.306415] but this new dependency connects a SOFTIRQ-irq-safe lock:
[   61.306415]  (&(&lp->lock)->rlock){+.-...}
[   61.306415] ... which became SOFTIRQ-irq-safe at:
[   61.306415]   [   61.306415] [<c1075bc5>] __lock_acquire+0x555/0x1770
[   61.306415]   [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
[   61.306415]   [   61.306415] [<c14bb334>] _raw_spin_lock_irqsave+0x24/0x40
[   61.306415]   [   61.306415] [<d0934ac0>] r6040_start_xmit+0x30/0x1d0 [r6040]
[   61.306415]   [   61.306415] [<c13a7d4d>] dev_hard_start_xmit+0x9d/0x2d0
[   61.306415]   [   61.306415] [<c13c8a38>] sch_direct_xmit+0xa8/0x140
[   61.306415]   [   61.306415] [<c13a8436>] __dev_queue_xmit+0x416/0x780
[   61.306415]   [   61.306415] [<c13a87aa>] dev_queue_xmit+0xa/0x10
[   61.306415]   [   61.306415] [<c13b4837>] neigh_resolve_output+0x147/0x220
[   61.306415]   [   61.306415] [<c144541b>] ip6_finish_output2+0x2fb/0x910
[   61.306415]   [   61.306415] [<c14494e6>] ip6_finish_output+0xa6/0x1a0
[   61.306415]   [   61.306415] [<c1449635>] ip6_output+0x55/0x320
[   61.306415]   [   61.306415] [<c146f4d2>] mld_sendpack+0x352/0x560
[   61.306415]   [   61.306415] [<c146fe55>] mld_ifc_timer_expire+0x155/0x280
[   61.306415]   [   61.306415] [<c108b081>] call_timer_fn+0x81/0x270
[   61.306415]   [   61.306415] [<c108b331>] expire_timers+0xc1/0x180
[   61.306415]   [   61.306415] [<c108b4f7>] run_timer_softirq+0x77/0x150
[   61.306415]   [   61.306415] [<c1043d04>] __do_softirq+0xb4/0x3d0
[   61.306415]   [   61.306415] [<c101a15c>] do_softirq_own_stack+0x1c/0x30
[   61.306415]   [   61.306415] [<c104416e>] irq_exit+0x8e/0xa0
[   61.306415]   [   61.306415] [<c1019d31>] do_IRQ+0x51/0x100
[   61.306415]   [   61.306415] [<c14bc176>] common_interrupt+0x36/0x40
[   61.306415]   [   61.306415] [<c1134928>] set_root+0x68/0xf0
[   61.306415]   [   61.306415] [<c1136120>] path_init+0x400/0x640
[   61.306415]   [   61.306415] [<c11386bf>] path_lookupat+0xf/0xe0
[   61.306415]   [   61.306415] [<c1139ebc>] filename_lookup+0x6c/0x100
[   61.306415]   [   61.306415] [<c1139fd5>] user_path_at_empty+0x25/0x30
[   61.306415]   [   61.306415] [<c11298c6>] SyS_faccessat+0x86/0x1e0
[   61.306415]   [   61.306415] [<c1129a30>] SyS_access+0x10/0x20
[   61.306415]   [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
[   61.306415]   [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
[   61.306415]
[   61.306415] to a SOFTIRQ-irq-unsafe lock:
[   61.306415]  (&dev->lock){+.+...}
[   61.306415] ... which became SOFTIRQ-irq-unsafe at:
[   61.306415] ...[   61.306415]
[   61.306415] [<c1075c0c>] __lock_acquire+0x59c/0x1770
[   61.306415]   [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
[   61.306415]   [   61.306415] [<c14b7add>] mutex_lock_nested+0x2d/0x4a0
[   61.306415]   [   61.306415] [<c133747d>] phy_probe+0x4d/0xc0
[   61.306415]   [   61.306415] [<c1338afe>] phy_attach_direct+0xbe/0x190
[   61.306415]   [   61.306415] [<c1338ca7>] phy_connect_direct+0x17/0x60
[   61.306415]   [   61.306415] [<c1338d23>] phy_connect+0x33/0x70
[   61.306415]   [   61.306415] [<d09357a0>] r6040_init_one+0x3a0/0x500 [r6040]
[   61.306415]   [   61.306415] [<c12a78c7>] pci_device_probe+0x77/0xd0
[   61.306415]   [   61.306415] [<c12f5e15>] driver_probe_device+0x145/0x280
[   61.306415]   [   61.306415] [<c12f5fd9>] __driver_attach+0x89/0x90
[   61.306415]   [   61.306415] [<c12f43ef>] bus_for_each_dev+0x4f/0x80
[   61.306415]   [   61.306415] [<c12f5954>] driver_attach+0x14/0x20
[   61.306415]   [   61.306415] [<c12f55b7>] bus_add_driver+0x197/0x210
[   61.306415]   [   61.306415] [<c12f6a21>] driver_register+0x51/0xd0
[   61.306415]   [   61.306415] [<c12a6955>] __pci_register_driver+0x45/0x50
[   61.306415]   [   61.306415] [<d0938017>] 0xd0938017
[   61.306415]   [   61.306415] [<c100043f>] do_one_initcall+0x2f/0x140
[   61.306415]   [   61.306415] [<c10e48c0>] do_init_module+0x4a/0x19b
[   61.306415]   [   61.306415] [<c10a680e>] load_module+0x1b2e/0x2070
[   61.306415]   [   61.306415] [<c10a6eb9>] SyS_finit_module+0x69/0x80
[   61.306415]   [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
[   61.306415]   [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
[   61.306415]
[   61.306415] other info that might help us debug this:
[   61.306415]
[   61.306415]  Possible interrupt unsafe locking scenario:
[   61.306415]
[   61.306415]        CPU0                    CPU1
[   61.306415]        ----                    ----
[   61.306415]   lock(&dev->lock);
[   61.306415]                                local_irq_disable();
[   61.306415]                                lock(&(&lp->lock)->rlock);
[   61.306415]                                lock(&dev->lock);
[   61.306415]   <Interrupt>
[   61.306415]     lock(&(&lp->lock)->rlock);
[   61.306415]
[   61.306415]  *** DEADLOCK ***
[   61.306415]
[   61.306415] 2 locks held by ifconfig/449:
[   61.306415]  #0:  (rtnl_mutex){+.+.+.}, at: [<c13b68ef>] rtnl_lock+0xf/0x20
[   61.306415]  #1:  (&(&lp->lock)->rlock){+.-...}, at: [<d0934c84>] r6040_close+0x24/0x230 [r6040]
[   61.306415]
[   61.306415] the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
[   61.306415] -> (&(&lp->lock)->rlock){+.-...} ops: 3049 {
[   61.306415]    HARDIRQ-ON-W at:
[   61.306415]                     [   61.306415] [<c1075be7>] __lock_acquire+0x577/0x1770
[   61.306415]                     [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
[   61.306415]                     [   61.306415] [<c14bb21b>] _raw_spin_lock+0x1b/0x30
[   61.306415]                     [   61.306415] [<d09343cc>] r6040_poll+0x2c/0x330 [r6040]
[   61.306415]                     [   61.306415] [<c13a5577>] net_rx_action+0x197/0x340
[   61.306415]                     [   61.306415] [<c1043d04>] __do_softirq+0xb4/0x3d0
[   61.306415]                     [   61.306415] [<c1044037>] run_ksoftirqd+0x17/0x40
[   61.306415]                     [   61.306415] [<c105fe91>] smpboot_thread_fn+0x141/0x180
[   61.306415]                     [   61.306415] [<c105c84e>] kthread+0xde/0x110
[   61.306415]                     [   61.306415] [<c14bb949>] ret_from_fork+0x19/0x30
[   61.306415]    IN-SOFTIRQ-W at:
[   61.306415]                     [   61.306415] [<c1075bc5>] __lock_acquire+0x555/0x1770
[   61.306415]                     [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
[   61.306415]                     [   61.306415] [<c14bb334>] _raw_spin_lock_irqsave+0x24/0x40
[   61.306415]                     [   61.306415] [<d0934ac0>] r6040_start_xmit+0x30/0x1d0 [r6040]
[   61.306415]                     [   61.306415] [<c13a7d4d>] dev_hard_start_xmit+0x9d/0x2d0
[   61.306415]                     [   61.306415] [<c13c8a38>] sch_direct_xmit+0xa8/0x140
[   61.306415]                     [   61.306415] [<c13a8436>] __dev_queue_xmit+0x416/0x780
[   61.306415]                     [   61.306415] [<c13a87aa>] dev_queue_xmit+0xa/0x10
[   61.306415]                     [   61.306415] [<c13b4837>] neigh_resolve_output+0x147/0x220
[   61.306415]                     [   61.306415] [<c144541b>] ip6_finish_output2+0x2fb/0x910
[   61.306415]                     [   61.306415] [<c14494e6>] ip6_finish_output+0xa6/0x1a0
[   61.306415]                     [   61.306415] [<c1449635>] ip6_output+0x55/0x320
[   61.306415]                     [   61.306415] [<c146f4d2>] mld_sendpack+0x352/0x560
[   61.306415]                     [   61.306415] [<c146fe55>] mld_ifc_timer_expire+0x155/0x280
[   61.306415]                     [   61.306415] [<c108b081>] call_timer_fn+0x81/0x270
[   61.306415]                     [   61.306415] [<c108b331>] expire_timers+0xc1/0x180
[   61.306415]                     [   61.306415] [<c108b4f7>] run_timer_softirq+0x77/0x150
[   61.306415]                     [   61.306415] [<c1043d04>] __do_softirq+0xb4/0x3d0
[   61.306415]                     [   61.306415] [<c101a15c>] do_softirq_own_stack+0x1c/0x30
[   61.306415]                     [   61.306415] [<c104416e>] irq_exit+0x8e/0xa0
[   61.306415]                     [   61.306415] [<c1019d31>] do_IRQ+0x51/0x100
[   61.306415]                     [   61.306415] [<c14bc176>] common_interrupt+0x36/0x40
[   61.306415]                     [   61.306415] [<c1134928>] set_root+0x68/0xf0
[   61.306415]                     [   61.306415] [<c1136120>] path_init+0x400/0x640
[   61.306415]                     [   61.306415] [<c11386bf>] path_lookupat+0xf/0xe0
[   61.306415]                     [   61.306415] [<c1139ebc>] filename_lookup+0x6c/0x100
[   61.306415]                     [   61.306415] [<c1139fd5>] user_path_at_empty+0x25/0x30
[   61.306415]                     [   61.306415] [<c11298c6>] SyS_faccessat+0x86/0x1e0
[   61.306415]                     [   61.306415] [<c1129a30>] SyS_access+0x10/0x20
[   61.306415]                     [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
[   61.306415]                     [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
[   61.306415]    INITIAL USE at:
[   61.306415]                    [   61.306415] [<c107586e>] __lock_acquire+0x1fe/0x1770
[   61.306415]                    [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
[   61.306415]                    [   61.306415] [<c14bb334>] _raw_spin_lock_irqsave+0x24/0x40
[   61.306415]                    [   61.306415] [<d093474e>] r6040_get_stats+0x1e/0x60 [r6040]
[   61.306415]                    [   61.306415] [<c139fb16>] dev_get_stats+0x96/0xc0
[   61.306415]                    [   61.306415] [<c14b416e>] rtnl_fill_stats+0x36/0xfd
[   61.306415]                    [   61.306415] [<c13b7b3c>] rtnl_fill_ifinfo+0x47c/0xce0
[   61.306415]                    [   61.306415] [<c13bc08e>] rtmsg_ifinfo_build_skb+0x4e/0xd0
[   61.306415]                    [   61.306415] [<c13bc120>] rtmsg_ifinfo.part.20+0x10/0x40
[   61.306415]                    [   61.306415] [<c13bc16b>] rtmsg_ifinfo+0x1b/0x20
[   61.306415]                    [   61.306415] [<c13a9d19>] register_netdevice+0x409/0x550
[   61.306415]                    [   61.306415] [<c13a9e72>] register_netdev+0x12/0x20
[   61.306415]                    [   61.306415] [<d09357e8>] r6040_init_one+0x3e8/0x500 [r6040]
[   61.306415]                    [   61.306415] [<c12a78c7>] pci_device_probe+0x77/0xd0
[   61.306415]                    [   61.306415] [<c12f5e15>] driver_probe_device+0x145/0x280
[   61.306415]                    [   61.306415] [<c12f5fd9>] __driver_attach+0x89/0x90
[   61.306415]                    [   61.306415] [<c12f43ef>] bus_for_each_dev+0x4f/0x80
[   61.306415]                    [   61.306415] [<c12f5954>] driver_attach+0x14/0x20
[   61.306415]                    [   61.306415] [<c12f55b7>] bus_add_driver+0x197/0x210
[   61.306415]                    [   61.306415] [<c12f6a21>] driver_register+0x51/0xd0
[   61.306415]                    [   61.306415] [<c12a6955>] __pci_register_driver+0x45/0x50
[   61.306415]                    [   61.306415] [<d0938017>] 0xd0938017
[   61.306415]                    [   61.306415] [<c100043f>] do_one_initcall+0x2f/0x140
[   61.306415]                    [   61.306415] [<c10e48c0>] do_init_module+0x4a/0x19b
[   61.306415]                    [   61.306415] [<c10a680e>] load_module+0x1b2e/0x2070
[   61.306415]                    [   61.306415] [<c10a6eb9>] SyS_finit_module+0x69/0x80
[   61.306415]                    [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
[   61.306415]                    [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
[   61.306415]  }
[   61.306415]  ... key      at: [<d0936280>] __key.45893+0x0/0xfffff739 [r6040]
[   61.306415]  ... acquired at:
[   61.306415]    [   61.306415] [<c1074a32>] check_irq_usage+0x42/0xb0
[   61.306415]    [   61.306415] [<c107677c>] __lock_acquire+0x110c/0x1770
[   61.306415]    [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
[   61.306415]    [   61.306415] [<c14b7add>] mutex_lock_nested+0x2d/0x4a0
[   61.306415]    [   61.306415] [<c1336276>] phy_stop+0x16/0x80
[   61.306415]    [   61.306415] [<d0934ce9>] r6040_close+0x89/0x230 [r6040]
[   61.306415]    [   61.306415] [<c13a0a91>] __dev_close_many+0x61/0xa0
[   61.306415]    [   61.306415] [<c13a0bbf>] __dev_close+0x1f/0x30
[   61.306415]    [   61.306415] [<c13a9127>] __dev_change_flags+0x87/0x150
[   61.306415]    [   61.306415] [<c13a9213>] dev_change_flags+0x23/0x60
[   61.306415]    [   61.306415] [<c1416238>] devinet_ioctl+0x5f8/0x6f0
[   61.306415]    [   61.306415] [<c1417f75>] inet_ioctl+0x65/0x90
[   61.306415]    [   61.306415] [<c1389b54>] sock_ioctl+0x124/0x2b0
[   61.306415]    [   61.306415] [<c113cf7c>] do_vfs_ioctl+0x7c/0x790
[   61.306415]    [   61.306415] [<c113d6b8>] SyS_ioctl+0x28/0x50
[   61.306415]    [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
[   61.306415]    [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
[   61.306415]
[   61.306415]
the dependencies between the lock to be acquired[   61.306415]  and SOFTIRQ-irq-unsafe lock:
[   61.306415] -> (&dev->lock){+.+...} ops: 56 {
[   61.306415]    HARDIRQ-ON-W at:
[   61.306415]                     [   61.306415] [<c1075be7>] __lock_acquire+0x577/0x1770
[   61.306415]                     [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
[   61.306415]                     [   61.306415] [<c14b7add>] mutex_lock_nested+0x2d/0x4a0
[   61.306415]                     [   61.306415] [<c133747d>] phy_probe+0x4d/0xc0
[   61.306415]                     [   61.306415] [<c1338afe>] phy_attach_direct+0xbe/0x190
[   61.306415]                     [   61.306415] [<c1338ca7>] phy_connect_direct+0x17/0x60
[   61.306415]                     [   61.306415] [<c1338d23>] phy_connect+0x33/0x70
[   61.306415]                     [   61.306415] [<d09357a0>] r6040_init_one+0x3a0/0x500 [r6040]
[   61.306415]                     [   61.306415] [<c12a78c7>] pci_device_probe+0x77/0xd0
[   61.306415]                     [   61.306415] [<c12f5e15>] driver_probe_device+0x145/0x280
[   61.306415]                     [   61.306415] [<c12f5fd9>] __driver_attach+0x89/0x90
[   61.306415]                     [   61.306415] [<c12f43ef>] bus_for_each_dev+0x4f/0x80
[   61.306415]                     [   61.306415] [<c12f5954>] driver_attach+0x14/0x20
[   61.306415]                     [   61.306415] [<c12f55b7>] bus_add_driver+0x197/0x210
[   61.306415]                     [   61.306415] [<c12f6a21>] driver_register+0x51/0xd0
[   61.306415]                     [   61.306415] [<c12a6955>] __pci_register_driver+0x45/0x50
[   61.306415]                     [   61.306415] [<d0938017>] 0xd0938017
[   61.306415]                     [   61.306415] [<c100043f>] do_one_initcall+0x2f/0x140
[   61.306415]                     [   61.306415] [<c10e48c0>] do_init_module+0x4a/0x19b
[   61.306415]                     [   61.306415] [<c10a680e>] load_module+0x1b2e/0x2070
[   61.306415]                     [   61.306415] [<c10a6eb9>] SyS_finit_module+0x69/0x80
[   61.306415]                     [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
[   61.306415]                     [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
[   61.306415]    SOFTIRQ-ON-W at:
[   61.306415]                     [   61.306415] [<c1075c0c>] __lock_acquire+0x59c/0x1770
[   61.306415]                     [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
[   61.306415]                     [   61.306415] [<c14b7add>] mutex_lock_nested+0x2d/0x4a0
[   61.306415]                     [   61.306415] [<c133747d>] phy_probe+0x4d/0xc0
[   61.306415]                     [   61.306415] [<c1338afe>] phy_attach_direct+0xbe/0x190
[   61.306415]                     [   61.306415] [<c1338ca7>] phy_connect_direct+0x17/0x60
[   61.306415]                     [   61.306415] [<c1338d23>] phy_connect+0x33/0x70
[   61.306415]                     [   61.306415] [<d09357a0>] r6040_init_one+0x3a0/0x500 [r6040]
[   61.306415]                     [   61.306415] [<c12a78c7>] pci_device_probe+0x77/0xd0
[   61.306415]                     [   61.306415] [<c12f5e15>] driver_probe_device+0x145/0x280
[   61.306415]                     [   61.306415] [<c12f5fd9>] __driver_attach+0x89/0x90
[   61.306415]                     [   61.306415] [<c12f43ef>] bus_for_each_dev+0x4f/0x80
[   61.306415]                     [   61.306415] [<c12f5954>] driver_attach+0x14/0x20
[   61.306415]                     [   61.306415] [<c12f55b7>] bus_add_driver+0x197/0x210
[   61.306415]                     [   61.306415] [<c12f6a21>] driver_register+0x51/0xd0
[   61.306415]                     [   61.306415] [<c12a6955>] __pci_register_driver+0x45/0x50
[   61.306415]                     [   61.306415] [<d0938017>] 0xd0938017
[   61.306415]                     [   61.306415] [<c100043f>] do_one_initcall+0x2f/0x140
[   61.306415]                     [   61.306415] [<c10e48c0>] do_init_module+0x4a/0x19b
[   61.306415]                     [   61.306415] [<c10a680e>] load_module+0x1b2e/0x2070
[   61.306415]                     [   61.306415] [<c10a6eb9>] SyS_finit_module+0x69/0x80
[   61.306415]                     [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
[   61.306415]                     [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
[   61.306415]    INITIAL USE at:
[   61.306415]                    [   61.306415] [<c107586e>] __lock_acquire+0x1fe/0x1770
[   61.306415]                    [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
[   61.306415]                    [   61.306415] [<c14b7add>] mutex_lock_nested+0x2d/0x4a0
[   61.306415]                    [   61.306415] [<c133747d>] phy_probe+0x4d/0xc0
[   61.306415]                    [   61.306415] [<c1338afe>] phy_attach_direct+0xbe/0x190
[   61.306415]                    [   61.306415] [<c1338ca7>] phy_connect_direct+0x17/0x60
[   61.306415]                    [   61.306415] [<c1338d23>] phy_connect+0x33/0x70
[   61.306415]                    [   61.306415] [<d09357a0>] r6040_init_one+0x3a0/0x500 [r6040]
[   61.306415]                    [   61.306415] [<c12a78c7>] pci_device_probe+0x77/0xd0
[   61.306415]                    [   61.306415] [<c12f5e15>] driver_probe_device+0x145/0x280
[   61.306415]                    [   61.306415] [<c12f5fd9>] __driver_attach+0x89/0x90
[   61.306415]                    [   61.306415] [<c12f43ef>] bus_for_each_dev+0x4f/0x80
[   61.306415]                    [   61.306415] [<c12f5954>] driver_attach+0x14/0x20
[   61.306415]                    [   61.306415] [<c12f55b7>] bus_add_driver+0x197/0x210
[   61.306415]                    [   61.306415] [<c12f6a21>] driver_register+0x51/0xd0
[   61.306415]                    [   61.306415] [<c12a6955>] __pci_register_driver+0x45/0x50
[   61.306415]                    [   61.306415] [<d0938017>] 0xd0938017
[   61.306415]                    [   61.306415] [<c100043f>] do_one_initcall+0x2f/0x140
[   61.306415]                    [   61.306415] [<c10e48c0>] do_init_module+0x4a/0x19b
[   61.306415]                    [   61.306415] [<c10a680e>] load_module+0x1b2e/0x2070
[   61.306415]                    [   61.306415] [<c10a6eb9>] SyS_finit_module+0x69/0x80
[   61.306415]                    [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
[   61.306415]                    [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
[   61.306415]  }
[   61.306415]  ... key      at: [<c1f28f39>] __key.43998+0x0/0x8
[   61.306415]  ... acquired at:
[   61.306415]    [   61.306415] [<c1074a32>] check_irq_usage+0x42/0xb0
[   61.306415]    [   61.306415] [<c107677c>] __lock_acquire+0x110c/0x1770
[   61.306415]    [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
[   61.306415]    [   61.306415] [<c14b7add>] mutex_lock_nested+0x2d/0x4a0
[   61.306415]    [   61.306415] [<c1336276>] phy_stop+0x16/0x80
[   61.306415]    [   61.306415] [<d0934ce9>] r6040_close+0x89/0x230 [r6040]
[   61.306415]    [   61.306415] [<c13a0a91>] __dev_close_many+0x61/0xa0
[   61.306415]    [   61.306415] [<c13a0bbf>] __dev_close+0x1f/0x30
[   61.306415]    [   61.306415] [<c13a9127>] __dev_change_flags+0x87/0x150
[   61.306415]    [   61.306415] [<c13a9213>] dev_change_flags+0x23/0x60
[   61.306415]    [   61.306415] [<c1416238>] devinet_ioctl+0x5f8/0x6f0
[   61.306415]    [   61.306415] [<c1417f75>] inet_ioctl+0x65/0x90
[   61.306415]    [   61.306415] [<c1389b54>] sock_ioctl+0x124/0x2b0
[   61.306415]    [   61.306415] [<c113cf7c>] do_vfs_ioctl+0x7c/0x790
[   61.306415]    [   61.306415] [<c113d6b8>] SyS_ioctl+0x28/0x50
[   61.306415]    [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
[   61.306415]    [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
[   61.306415]
[   61.306415]
[   61.306415] stack backtrace:
[   61.306415] CPU: 0 PID: 449 Comm: ifconfig Not tainted 4.9.0-gb898d2d-manuel #1
[   61.306415] Call Trace:
[   61.306415]  dump_stack+0x16/0x19
[   61.306415]  check_usage+0x3f6/0x550
[   61.306415]  ? check_usage+0x4d/0x550
[   61.306415]  check_irq_usage+0x42/0xb0
[   61.306415]  __lock_acquire+0x110c/0x1770
[   61.306415]  lock_acquire+0x7c/0x150
[   61.306415]  ? phy_stop+0x16/0x80
[   61.306415]  mutex_lock_nested+0x2d/0x4a0
[   61.306415]  ? phy_stop+0x16/0x80
[   61.306415]  ? r6040_close+0x24/0x230 [r6040]
[   61.306415]  ? __delay+0x9/0x10
[   61.306415]  phy_stop+0x16/0x80
[   61.306415]  r6040_close+0x89/0x230 [r6040]
[   61.306415]  __dev_close_many+0x61/0xa0
[   61.306415]  __dev_close+0x1f/0x30
[   61.306415]  __dev_change_flags+0x87/0x150
[   61.306415]  dev_change_flags+0x23/0x60
[   61.306415]  devinet_ioctl+0x5f8/0x6f0
[   61.306415]  inet_ioctl+0x65/0x90
[   61.306415]  sock_ioctl+0x124/0x2b0
[   61.306415]  ? dlci_ioctl_set+0x30/0x30
[   61.306415]  do_vfs_ioctl+0x7c/0x790
[   61.306415]  ? trace_hardirqs_on+0xb/0x10
[   61.306415]  ? call_rcu_sched+0xd/0x10
[   61.306415]  ? __put_cred+0x32/0x50
[   61.306415]  ? SyS_faccessat+0x178/0x1e0
[   61.306415]  SyS_ioctl+0x28/0x50
[   61.306415]  do_int80_syscall_32+0x3f/0x110
[   61.306415]  entry_INT80_32+0x2f/0x2f
[   61.306415] EIP: 0xb764d364
[   61.306415] EFLAGS: 00000286 CPU: 0
[   61.306415] EAX: ffffffda EBX: 00000004 ECX: 00008914 EDX: bfa99d7c
[   61.306415] ESI: bfa99e4c EDI: fffffffe EBP: 00000004 ESP: bfa99d58
[   61.306415]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[   63.836607] r6040 0000:00:08.0 eth0: Link is Down

Signed-off-by: Manuel Bessler <manuel.bessler@sensus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

irda: w83977af_ir: cleanup an indent issue

In commit 99d8d2159d7c ("irda: w83977af_ir: Neaten logging"), we
accidentally added an extra tab to these lines.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sfc: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Tested-by: Bert Kenward <bkenward@solarflare.com>
Acked-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: davicom: dm9000: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cirrus: ep93xx: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: chelsio: cxgb3: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: chelsio: cxgb2: use new api ethtool_{get|set}_link_ksettings

The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bpf-fixes'

Daniel Borkmann says:

====================
Couple of BPF fixes

This set contains three BPF fixes for net, one that addresses the
complaint from Geert wrt static allocations, and the other is a fix
wrt mem accounting that I found recently during testing and there's
still one more fix on the map value marking.

Thanks!

v1 -> v2:
  - Patch 1 as is.
  - Fixed kbuild bot issue by letting charging helpers stay in the
    syscall.c, since there locked_vm is valid and only export the
    ones needed by bpf_prog_realloc(). Add empty stubs in case the
    bpf syscall is not enabled.
  - Added patch 3 that addresses one more issue in map val marking.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: fix mark_reg_unknown_value for spilled regs on map value marking

Martin reported a verifier issue that hit the BUG_ON() for his
test case in the mark_reg_unknown_value() function:

  [  202.861380] kernel BUG at kernel/bpf/verifier.c:467!
  [...]
  [  203.291109] Call Trace:
  [  203.296501]  [<ffffffff811364d5>] mark_map_reg+0x45/0x50
  [  203.308225]  [<ffffffff81136558>] mark_map_regs+0x78/0x90
  [  203.320140]  [<ffffffff8113938d>] do_check+0x226d/0x2c90
  [  203.331865]  [<ffffffff8113a6ab>] bpf_check+0x48b/0x780
  [  203.343403]  [<ffffffff81134c8e>] bpf_prog_load+0x27e/0x440
  [  203.355705]  [<ffffffff8118a38f>] ? handle_mm_fault+0x11af/0x1230
  [  203.369158]  [<ffffffff812d8188>] ? security_capable+0x48/0x60
  [  203.382035]  [<ffffffff811351a4>] SyS_bpf+0x124/0x960
  [  203.393185]  [<ffffffff810515f6>] ? __do_page_fault+0x276/0x490
  [  203.406258]  [<ffffffff816db320>] entry_SYSCALL_64_fastpath+0x13/0x94

This issue got uncovered after the fix in a08dd0da5307 ("bpf: fix
regression on verifier pruning wrt map lookups"). The reason why it
wasn't noticed before was, because as mentioned in a08dd0da5307,
mark_map_regs() was doing the id matching incorrectly based on the
uncached regs[regno].id. So, in the first loop, we walked all regs
and as soon as we found regno == i, then this reg's id was cleared
when calling mark_reg_unknown_value() thus that every subsequent
register was probed against id of 0 (which, in combination with the
PTR_TO_MAP_VALUE_OR_NULL type is an invalid condition that no other
register state can hold), and therefore wasn't type transitioned such
as in the spilled register case for the second loop.

Now since that got fixed, it turned out that 57a09bf0a416 ("bpf:
Detect identical PTR_TO_MAP_VALUE_OR_NULL registers") used
mark_reg_unknown_value() incorrectly for the spilled regs, and thus
hitting the BUG_ON() in some cases due to regno >= MAX_BPF_REG.

Although spilled regs have the same type as the non-spilled regs
for the verifier state, that is, struct bpf_reg_state, they are
semantically different from the non-spilled regs. In other words,
there can be up to 64 (MAX_BPF_STACK / BPF_REG_SIZE) spilled regs
in the stack, for example, register R<x> could have been spilled by
the program to stack location X, Y, Z, and in mark_map_regs() we
need to scan these stack slots of type STACK_SPILL for potential
registers that we have to transition from PTR_TO_MAP_VALUE_OR_NULL.
Therefore, depending on the location, the spilled_regs regno can
be a lot higher than just MAX_BPF_REG's value since we operate on
stack instead. The reset in mark_reg_unknown_value() itself is
just fine, only that the BUG_ON() was inappropriate for this. Fix
it by making a __mark_reg_unknown_value() version that can be
called from mark_map_reg() generically; we know for the non-spilled
case that the regno is always < MAX_BPF_REG anyway.

Fixes: 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Reported-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: fix overflow in prog accounting

Commit aaac3ba95e4c ("bpf: charge user for creation of BPF maps and
programs") made a wrong assumption of charging against prog->pages.
Unlike map->pages, prog->pages are still subject to change when we
need to expand the program through bpf_prog_realloc().

This can for example happen during verification stage when we need to
expand and rewrite parts of the program. Should the required space
cross a page boundary, then prog->pages is not the same anymore as
its original value that we used to bpf_prog_charge_memlock() on. Thus,
we'll hit a wrap-around during bpf_prog_uncharge_memlock() when prog
is freed eventually. I noticed this that despite having unlimited
memlock, programs suddenly refused to load with EPERM error due to
insufficient memlock.

There are two ways to fix this issue. One would be to add a cached
variable to struct bpf_prog that takes a snapshot of prog->pages at the
time of charging. The other approach is to also account for resizes. I
chose to go with the latter for a couple of reasons: i) We want accounting
rather to be more accurate instead of further fooling limits, ii) adding
yet another page counter on struct bpf_prog would also be a waste just
for this purpose. We also do want to charge as early as possible to
avoid going into the verifier just to find out later on that we crossed
limits. The only place that needs to be fixed is bpf_prog_realloc(),
since only here we expand the program, so we try to account for the
needed delta and should we fail, call-sites check for outcome anyway.
On cBPF to eBPF migrations, we don't grab a reference to the user as
they are charged differently. With that in place, my test case worked
fine.

Fixes: aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: dynamically allocate digest scratch buffer

Geert rightfully complained that 7bd509e311f4 ("bpf: add prog_digest
and expose it via fdinfo/netlink") added a too large allocation of
variable 'raw' from bss section, and should instead be done dynamically:

  # ./scripts/bloat-o-meter kernel/bpf/core.o.1 kernel/bpf/core.o.2
  add/remove: 3/0 grow/shrink: 0/0 up/down: 33291/0 (33291)
  function                                     old     new   delta
  raw                                            -   32832  +32832
  [...]

Since this is only relevant during program creation path, which can be
considered slow-path anyway, lets allocate that dynamically and be not
implicitly dependent on verifier mutex. Move bpf_prog_calc_digest() at
the beginning of replace_map_fd_with_map_ptr() and also error handling
stays straight forward.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile

Pull arch/tile updates from Chris Metcalf:
"Another grab-bag of miscellaneous changes"

* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: use __ro_after_init instead of tile-specific __write_once
  tile: migrate exception table users off module.h and onto extable.h
  tile: remove #pragma unroll from finv_buffer_remote()
  tile-module: Rename jump labels in module_alloc()
  tile-module: Use kmalloc_array() in module_alloc()
  tile/pci_gx: fix spelling mistake: "delievered" -> "delivered"

Merge tag 'kvmgt-vfio-mdev-for-v4.10-rc1' of git://github.com/01org/gvt-linux

Pull i915/gvt KVMGT updates from Zhenyu Wang:
"KVMGT support depending on the VFIO/mdev framework"

* tag 'kvmgt-vfio-mdev-for-v4.10-rc1' of git://github.com/01org/gvt-linux:
  drm/i915/gvt/kvmgt: add vfio/mdev support to KVMGT
  drm/i915/gvt/kvmgt: read/write GPA via KVM API
  drm/i915/gvt/kvmgt: replace kmalloc() by kzalloc()

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input subsystem updates from Dmitry Torokhov:

- updated support for Synaptics RMI4 devices, including support for
   SMBus controllers, firmware update support, sensor tuning, and PS/2
   guest support

- ALPS driver now supports tracksticks on SS5 controllers

- i8042 now uses chassis info to skip selftest on Asus laptops as list
   of individual models became too unwieldy

- miscellaneous fixes to other drivers

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (67 commits)
  Input: imx6ul_tsc - generalize the averaging property
  Input: drv260x - use generic device properties
  Input: drv260x - use temporary for &client->dev
  Input: drv260x - fix input device's parent assignment
  Input: synaptics-rmi4 - add support for F34 V7 bootloader
  Input: drv260x - fix initializing overdrive voltage
  Input: ALPS - fix protcol -> protocol
  Input: i8042 - comment #else/#endif of CONFIG_PNP
  Input: lpc32xx-keys - fix invalid error handling of a requested irq
  Input: synaptics-rmi4 - fix debug for sensor clip
  Input: synaptics-rmi4 - store the attn data in the driver
  Input: synaptics-rmi4 - allow to add attention data
  Input: synaptics-rmi4 - f03 - grab data passed by transport device
  Input: synaptics-rmi4 - add support for F03
  Input: imx6ul_tsc - convert int to u32
  Input: imx6ul_tsc - add mask when set REG_ADC_CFG
  Input: synaptics-rmi4 - have only one struct platform data
  Input: synaptics-rmi4 - remove EXPORT_SYMBOL_GPL for internal functions
  Input: synaptics-rmi4 - remove mutex calls while updating the firmware
  Input: drv2667 - fix misuse of regmap_update_bits
  ...

Merge tag 'for-linus-20161216' of git://git.infradead.org/linux-mtd

Pull MTD updates from Brian Norris:
"Nothing enormous here, though notably we have some of the first work
  of a few new maintainers. I think for now I'll still be sending pull
  requests, but that's open to change in the future. Summary:

  Core:

   - dynamic BDI object allocation (resolves some problems when built as
     a module)
   - cleanups in the ooblayout handling

  NAND:

   - new tango NAND controller driver
   - new ox820 NAND controller driver
   - addition of a new full-ID entry in the nand_ids table
   - rework of the s3c240 driver to support DT
   - extension of the nand_sdr_timings to expose tCCS, tPROG and tR
   - addition of a new flag to ask the core to wait for tCCS when
     sending a RNDIN/RNDOUT command
   - addition of a new flag to ask the core to let the controller driver
     send the READ/PROGPAGE command

  Minor fixes/cleanup/cosmetic changes:

   - properly support 512 ECC step size in the sunxi driver
   - improve the error messages in the PXA probe path
   - fix module autoload in the omap2 driver
   - cleanup of several nand drivers to return nand_scan{_tail}() error
     code instead of returning -EIO
   - various cleanups in the denali driver
   - fix an error check in nandsim

  SPI NOR:

   - new flash IDs
   - wait for Spansion flash to be ready after quad-enable
   - error handling fixes for Candence QSPI
   - constify some structures in Freescale QSPI driver"

* tag 'for-linus-20161216' of git://git.infradead.org/linux-mtd: (71 commits)
  mtd: Allocate bdi objects dynamically
  mtd: nand: tango: Add standard legalese header
  mtd: maps: add missing iounmap() in error path
  mtd: spi-nor: constify fsl_qspi_devtype_data
  mtd: spi-nor: Add support for mr25h40
  mtd: spi-nor: Add support for N25Q016A
  mtd: spi-nor: Add at25df321 spi-nor flash support
  mtd: spi-nor: Fix some error codes in cqspi_setup_flash()
  mtd: spi-nor: Off by one in cqspi_setup_flash()
  mtd: spi-nor: add support for s25fl208k
  mtd: spi-nor: fix flags for s25fl128s
  mtd: spi-nor: fix spansion quad enable
  mtd: spi-nor: add Macronix mx25u25635f to list of known devices.
  mtd: mtdswap: fix spelling mistake "erassure" -> "erasure"
  mtd: bcm47xxpart: fix parsing first block after aligned TRX
  mtd: nand: tango: Use nand_to_mtd() instead of directly accessing chip->mtd
  mtd: remove unneeded initializer in mtd_ooblayout_count_bytes()
  mtd: use min_t() to refactor mtd_ooblayout_{get, set}_bytes()
  mtd: remove unneeded initializer in mtd_ooblayout_{get, set}_bytes()
  mtd: nand: nandsim: fix error check
  ...

Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild

Pull kbuild misc updates from Michal Marek:

- one new coccinelle check and improvements to irqf_oneshot.cocci

- 'make rpm' POSIX compatibility fix

- 'make deb-pkg' arm64 cross-compiling fix. I forgot to send this one
   during the v4.9 rc-phase, therefor the pull request is based on -rc6
   and not -rc1

* 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  Coccinelle: misc: Add support for devm variant in all modes
  Coccinelle: misc: Improve the result given by context mode
  Coccinelle: misc: Improve the matching of rules
  kbuild/mkspec: avoid using brace expansion
  Coccinelle: Add misc/boolconv.cocci
  builddeb: fix cross-building to arm64 producing host-arch debs

Merge branch 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild

Pull kconfig updates from Michal Marek:

- 'make xconfig' gui fixes

- 'make nconfig' fix for options with long prompts

- fix 'make nconfig' warning when pkg-config forces -D_GNU_SOURCE

* 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  xconfig: fix missing suboption and help panels on first run
  xconfig: fix 'Show Debug' functionality
  kconfig/nconf: Fix hang when editing symbol with a long prompt
  Scripts: kconfig: nconf: fix _GNU_SOURCE redefined warning

Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild

Pull kbuild updates from Michal Marek:

- prototypes for x86 asm-exported symbols (Adam Borowski) and a warning
   about missing CRCs (Nick Piggin)

- asm-exports fix for LTO (Nicolas Pitre)

- thin archives improvements (Nick Piggin)

- linker script fix for CONFIG_LD_DEAD_CODE_DATA_ELIMINATION (Nick
   Piggin)

- genksyms support for __builtin_va_list keyword

- misc minor fixes

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  x86/kbuild: enable modversions for symbols exported from asm
  kbuild: fix scripts/adjust_autoksyms.sh* for the no modules case
  scripts/kallsyms: remove last remnants of --page-offset option
  make use of make variable CURDIR instead of calling pwd
  kbuild: cmd_export_list: tighten the sed script
  kbuild: minor improvement for thin archives build
  kbuild: modpost warn if export version crc is missing
  kbuild: keep data tables through dead code elimination
  kbuild: improve linker compatibility with lib-ksyms.o build
  genksyms: Regenerate parser
  kbuild/genksyms: handle va_list type
  kbuild: thin archives for multi-y targets
  kbuild: kallsyms allow 3-pass generation if symbols size has changed

Merge tag 'docs-4.10-2' of git://git.lwn.net/linux

Pull more documentation updates from Jonathan Corbet:
"This converts the crypto DocBook to Sphinx"

* tag 'docs-4.10-2' of git://git.lwn.net/linux:
  crypto: doc - optimize compilation
  crypto: doc - clarify AEAD memory structure
  crypto: doc - remove crypto_alloc_ablkcipher
  crypto: doc - add KPP documentation
  crypto: doc - fix separation of cipher / req API
  crypto: doc - fix source comments for Sphinx
  crypto: doc - remove crypto API DocBook
  crypto: doc - convert crypto API documentation to Sphinx

Merge branch 'for-4.10/libnvdimm' into libnvdimm-for-next

dax: add region 'id', 'size', and 'align' attributes

While this information is available by looking at the nvdimm parent
device that may not always be the case when/if we add support for other
memory regions. Tooling should not depend on walking a given ancestor
topology that is not guaranteed by the device's class. For example, a
device-dax instance will always have a dax_region parent, but it may not
always have a libnvdimm "dax" device as a grandparent.

Reported-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Merge branch 'gtp-fixes'

Pablo Neira Ayuso says:

====================
GTP tunneling fixes for net

The following patchset contains two GTP tunneling fixes for your net
tree, they are:

1) Offset to IPv4 header in gtp_check_src_ms_ipv4() is incorrect, thus
   this function always succeeds and therefore this defeats this sanity
   check. This allows packets that have no PDP to go though, patch from
   Lionel Gauthier.

2) According to Note 0 of Figure 2 in Section 6 of 3GPP TS 29.060 v13.5.0
   Release 13, always set GTPv1 reserved bit to zero. This may cause
   interoperability problems, patch from Harald Welte.

Please, apply, thanks a lot!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

gtp: Fix initialization of Flags octet in GTPv1 header

When generating a GTPv1 header in gtp1_push_header(), initialize the
'reserved' bit to zero. All 3GPP specifications for GTPv1 from Release
99 through Release 13 agree that a transmitter shall set this bit to
zero, see e.g. Note 0 of Figure 2 in Section 6 of 3GPP TS 29.060 v13.5.0
Release 13, available from
http://www.etsi.org/deliver/etsi_ts/129000_129099/129060/13.05.00_60/ts_129060v130500p.pdf

Signed-off-by: Harald Welte <laforge@gnumonks.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

gtp: gtp_check_src_ms_ipv4() always return success

gtp_check_src_ms_ipv4() did not find the PDP context matching with the
UE IP address because the memory location is not right, but the result
is inverted by the Boolean "not" operator. So whatever is the PDP
context, any call to this function is successful.

Signed-off-by: Lionel Gauthier <Lionel.Gauthier@eurecom.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/x25: use designated initializers

Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. These were identified during
allyesconfig builds of x86, arm, and arm64, with most initializer fixes
extracted from grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

isdn: use designated initializers

Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. These were identified during
allyesconfig builds of x86, arm, and arm64, with most initializer fixes
extracted from grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bna: use designated initializers

Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. These were identified during
allyesconfig builds of x86, arm, and arm64, with most initializer fixes
extracted from grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

WAN: use designated initializers

Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. These were identified during
allyesconfig builds of x86, arm, and arm64, with most initializer fixes
extracted from grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: use designated initializers

Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. These were identified during
allyesconfig builds of x86, arm, and arm64, with most initializer fixes
extracted from grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ATM: use designated initializers

Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. These were identified during
allyesconfig builds of x86, arm, and arm64, with most initializer fixes
extracted from grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

isdn/gigaset: use designated initializers

Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. These were identified during
allyesconfig builds of x86, arm, and arm64, with most initializer fixes
extracted from grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'virtio_net-XDP'

John Fastabend says:

====================
XDP for virtio_net

This implements virtio_net for the mergeable buffers and big_packet
modes. I tested this with vhost_net running on qemu and did not see
any issues. For testing num_buf > 1 I added a hack to vhost driver
to only but 100 bytes per buffer.

There are some restrictions for XDP to be enabled and work well
(see patch 3) for more details.

  1. GUEST_TSO{4|6} must be off
  2. MTU must be less than PAGE_SIZE
  3. queues must be available to dedicate to XDP
  4. num_bufs received in mergeable buffers must be 1
  5. big_packet mode must have all data on single page

To test this I used pktgen in the hypervisor and ran the XDP sample
programs xdp1 and xdp2 from ./samples/bpf in the host. The default
mode that is used with these patches with Linux guest and QEMU/Linux
hypervisor is the mergeable buffers mode. I tested this mode for 2+
days running xdp2 without issues. Additionally I did a series of
driver unload/load tests to check the allocate/release paths.

To test the big_packets path I applied the following simple patch against
the virtio driver forcing big_packets mode,

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2242,7 +2242,7 @@ static int virtnet_probe(struct virtio_device *vdev)
                vi->big_packets = true;

        if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
-               vi->mergeable_rx_bufs = true;
+               vi->mergeable_rx_bufs = false;

        if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF) ||
            virtio_has_feature(vdev, VIRTIO_F_VERSION_1))

I then repeated the tests with xdp1 and xdp2. After letting them run
for a few hours I called it good enough.

Testing the unexpected case where virtio receives a packet across
multiple buffers required patching the hypervisor vhost driver to
convince it to send these unexpected packets. Then I used ping with
the -s option to trigger the case with multiple buffers. This mode
is not expected to be used but as MST pointed out per spec it is
not strictly speaking illegal to generate multi-buffer packets so we
need someway to handle these. The following patch can be used to
generate multiple buffers,

--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1777,7 +1777,8 @@ static int translate_desc(struct vhost_virtqueue
*vq, u64

                _iov = iov + ret;
                size = node->size - addr + node->start;
-               _iov->iov_len = min((u64)len - s, size);
+               printk("%s: build 100 length headers!\n", __func__);
+               _iov->iov_len = min((u64)len - s, (u64)100);//size);
                _iov->iov_base = (void __user *)(unsigned long)
                        (node->userspace_addr + addr - node->start);
                s += size;

The qemu command I most frequently used for testing (although I did test
various other combinations of devices) is the following,

./x86_64-softmmu/qemu-system-x86_64              \
    -hda /var/lib/libvirt/images/Fedora-test0.img \
    -m 4096  -enable-kvm -smp 2                   \
    -netdev tap,id=hn0,queues=4,vhost=on          \
    -device virtio-net-pci,netdev=hn0,mq=on,vectors=9,guest_tso4=off,guest_tso6=off \
    -serial stdio

The options 'guest_tso4=off,guest_tso6=off' are required because we
do not support LRO with XDP at the moment.

Please review any comments/feedback welcome as always.
====================

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

virtio_net: xdp, add slowpath case for non contiguous buffers

virtio_net XDP support expects receive buffers to be contiguous.
If this is not the case we enable a slowpath to allow connectivity
to continue but at a significan performance overhead associated with
linearizing data. To make it painfully aware to users that XDP is
running in a degraded mode we throw an xdp buffer error.

To linearize packets we allocate a page and copy the segments of
the data, including the header, into it. After this the page can be
handled by XDP code flow as normal.

Then depending on the return code the page is either freed or sent
to the XDP xmit path. There is no attempt to optimize this path.

This case is being handled simple as a precaution in case some
unknown backend were to generate packets in this form. To test this
I had to hack qemu and force it to generate these packets. I do not
expect this case to be generated by "real" backends.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

virtio_net: add XDP_TX support

This adds support for the XDP_TX action to virtio_net. When an XDP
program is run and returns the XDP_TX action the virtio_net XDP
implementation will transmit the packet on a TX queue that aligns
with the current CPU that the XDP packet was processed on.

Before sending the packet the header is zeroed. Also XDP is expected
to handle checksum correctly so no checksum offload support is
provided.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

virtio_net: add dedicated XDP transmit queues

XDP requires using isolated transmit queues to avoid interference
with normal networking stack (BQL, NETDEV_TX_BUSY, etc). This patch
adds a XDP queue per cpu when a XDP program is loaded and does not
expose the queues to the OS via the normal API call to
netif_set_real_num_tx_queues(). This way the stack will never push
an skb to these queues.

However virtio/vhost/qemu implementation only allows for creating
TX/RX queue pairs at this time so creating only TX queues was not
possible. And because the associated RX queues are being created I
went ahead and exposed these to the stack and let the backend use
them. This creates more RX queues visible to the network stack than
TX queues which is worth mentioning but does not cause any issues as
far as I can tell.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

virtio_net: Add XDP support

This adds XDP support to virtio_net. Some requirements must be
met for XDP to be enabled depending on the mode. First it will
only be supported with LRO disabled so that data is not pushed
across multiple buffers. Second the MTU must be less than a page
size to avoid having to handle XDP across multiple pages.

If mergeable receive is enabled this patch only supports the case
where header and data are in the same buf which we can check when
a packet is received by looking at num_buf. If the num_buf is
greater than 1 and a XDP program is loaded the packet is dropped
and a warning is thrown. When any_header_sg is set this does not
happen and both header and data is put in a single buffer as expected
so we check this when XDP programs are loaded. Subsequent patches
will process the packet in a degraded mode to ensure connectivity
and correctness is not lost even if backend pushes packets into
multiple buffers.

If big packets mode is enabled and MTU/LRO conditions above are
met then XDP is allowed.

This patch was tested with qemu with vhost=on and vhost=off where
mergeable and big_packet modes were forced via hard coding feature
negotiation. Multiple buffers per packet was forced via a small
test patch to vhost.c in the vhost=on qemu mode.

Suggested-by: Shrijeet Mukherjee <shrijeet@gmail.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: xdp: add invalid buffer warning

This adds a warning for drivers to use when encountering an invalid
buffer for XDP. For normal cases this should not happen but to catch
this in virtual/qemu setups that I may not have expected from the
emulation layer having a standard warning is useful.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: sctp_transport_lookup_process should rcu_read_unlock when transport is null

Prior to this patch, sctp_transport_lookup_process didn't rcu_read_unlock
when it failed to find a transport by sctp_addrs_lookup_transport.

This patch is to fix it by moving up rcu_read_unlock right before checking
transport and also to remove the out path.

Fixes: 1cceda784980 ("sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: sctp_epaddr_lookup_transport should be protected by rcu_read_lock

Since commit 7fda702f9315 ("sctp: use new rhlist interface on sctp transport
rhashtable"), sctp has changed to use rhlist_lookup to look up transport, but
rhlist_lookup doesn't call rcu_read_lock inside, unlike rhashtable_lookup_fast.

It is called in sctp_epaddr_lookup_transport and sctp_addrs_lookup_transport.
sctp_addrs_lookup_transport is always in the protection of rcu_read_lock(),
as __sctp_lookup_association is called in rx path or sctp_lookup_association
which are in the protection of rcu_read_lock() already.

But sctp_epaddr_lookup_transport is called by sctp_endpoint_lookup_assoc, it
doesn't call rcu_read_lock, which may cause "suspicious rcu_dereference_check
usage' in __rhashtable_lookup.

This patch is to fix it by adding rcu_read_lock in sctp_endpoint_lookup_assoc
before calling sctp_epaddr_lookup_transport.

Fixes: 7fda702f9315 ("sctp: use new rhlist interface on sctp transport rhashtable")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'dpaa_eth-fixes'

Madalin Bucur says:

====================
dpaa_eth: a couple of fixes

This patch set introduces big endian accessors in the dpaa_eth driver
making sure accesses to the QBMan HW are correct on little endian
platforms. Removing a redundant Kconfig dependency on FSL_SOC.
Adding myself as maintainer of the dpaa_eth driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

MAINTAINERS: net: add entry for Freescale QorIQ DPAA Ethernet driver

Add record for Freescale QORIQ DPAA Ethernet driver adding myself as
maintainer.

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa_eth: remove redundant dependency on FSL_SOC

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dpaa_eth: use big endian accessors

Ensure correct access to the big endian QMan HW through proper
accessors.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

irda: irnet: add member name to the miscdevice declaration

Since the struct miscdevice have many members, it is dangerous to init
it without members name relying only on member order.

This patch add member name to the init declaration.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

irda: irnet: Remove unused IRNET_MAJOR define

The IRNET_MAJOR define is not used, so this patch remove it.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

irnet: ppp: move IRNET_MINOR to include/linux/miscdevice.h

This patch move the define for IRNET_MINOR to include/linux/miscdevice.h
It is better that all minor number definitions are in the same place.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

irda: irnet: Move linux/miscdevice.h include

The only use of miscdevice is irda_ppp so no need to include
linux/miscdevice.h for all irda files.
This patch move the linux/miscdevice.h include to irnet_ppp.h

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

irda: irproc.c: Remove unneeded linux/miscdevice.h include

irproc.c does not use any miscdevice so this patch remove this
unnecessary inclusion.

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: cgroup: annotate pointers in struct cgroup_bpf with __rcu

The member 'effective' in 'struct cgroup_bpf' is protected by RCU.
Annotate it accordingly to squelch a sparse warning.

Signed-off-by: Daniel Mack <daniel@zonque.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'inet_csk_get_port-and-soreusport-fixes'

Tom Herbert says:

====================
inet: Fixes for inet_csk_get_port and soreusport

This patch set fixes a couple of issues I noticed while debugging our
softlockup issue in inet_csk_get_port.

- Don't allow jump into port scan in inet_csk_get_port if function
  was called with non-zero port number (looking up explicit port
  number).
- When inet_csk_get_port is called with zero port number (ie. perform
  scan) an reuseport is set on the socket, don't match sockets that
  also have reuseport set. The intent from the user should be
  to get a new port number and then explictly bind other
  sockets to that number using soreuseport.

Tested:

Ran first patch on production workload with no ill effect.

For second patch, ran a little listener application and first
demonstrated that unbound sockets with soreuseport can indeed
be bound to unrelated soreuseport sockets.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

inet: Fix get port to handle zero port number with soreuseport set

A user may call listen with binding an explicit port with the intent
that the kernel will assign an available port to the socket. In this
case inet_csk_get_port does a port scan. For such sockets, the user may
also set soreuseport with the intent a creating more sockets for the
port that is selected. The problem is that the initial socket being
opened could inadvertently choose an existing and unreleated port
number that was already created with soreuseport.

This patch adds a boolean parameter to inet_bind_conflict that indicates
rather soreuseport is allowed for the check (in addition to
sk->sk_reuseport). In calls to inet_bind_conflict from inet_csk_get_port
the argument is set to true if an explicit port is being looked up (snum
argument is nonzero), and is false if port scan is done.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inet: Don't go into port scan when looking for specific bind port

inet_csk_get_port is called with port number (snum argument) that may be
zero or nonzero. If it is zero, then the intent is to find an available
ephemeral port number to bind to. If snum is non-zero then the caller
is asking to allocate a specific port number. In the latter case we
never want to perform the scan in ephemeral port range. It is
conceivable that this can happen if the "goto again" in "tb_found:"
is done. This patch adds a check that snum is zero before doing
the "goto again".

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf, test_verifier: fix a test case error result on unprivileged

Running ./test_verifier as unprivileged lets 1 out of 98 tests fail:

  [...]
  #71 unpriv: check that printk is disallowed FAIL
  Unexpected error message!
  0: (7a) *(u64 *)(r10 -8) = 0
  1: (bf) r1 = r10
  2: (07) r1 += -8
  3: (b7) r2 = 8
  4: (bf) r3 = r1
  5: (85) call bpf_trace_printk#6
  unknown func bpf_trace_printk#6
  [...]

The test case is correct, just that the error outcome changed with
ebb676daa1a3 ("bpf: Print function name in addition to function id").
Same as with e00c7b216f34 ("bpf: fix multiple issues in selftest suite
and samples") issue 2), so just fix up the function name.

Fixes: ebb676daa1a3 ("bpf: Print function name in addition to function id")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: fix regression on verifier pruning wrt map lookups

Commit 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL
registers") introduced a regression where existing programs stopped
loading due to reaching the verifier's maximum complexity limit,
whereas prior to this commit they were loading just fine; the affected
program has roughly 2k instructions.

What was found is that state pruning couldn't be performed effectively
anymore due to mismatches of the verifier's register state, in particular
in the id tracking. It doesn't mean that 57a09bf0a416 is incorrect per
se, but rather that verifier needs to perform a lot more work for the
same program with regards to involved map lookups.

Since commit 57a09bf0a416 is only about tracking registers with type
PTR_TO_MAP_VALUE_OR_NULL, the id is only needed to follow registers
until they are promoted through pattern matching with a NULL check to
either PTR_TO_MAP_VALUE or UNKNOWN_VALUE type. After that point, the
id becomes irrelevant for the transitioned types.

For UNKNOWN_VALUE, id is already reset to 0 via mark_reg_unknown_value(),
but not so for PTR_TO_MAP_VALUE where id is becoming stale. It's even
transferred further into other types that don't make use of it. Among
others, one example is where UNKNOWN_VALUE is set on function call
return with RET_INTEGER return type.

states_equal() will then fall through the memcmp() on register state;
note that the second memcmp() uses offsetofend(), so the id is part of
that since d2a4dd37f6b4 ("bpf: fix state equivalence"). But the bisect
pointed already to 57a09bf0a416, where we really reach beyond complexity
limit. What I found was that states_equal() often failed in this
case due to id mismatches in spilled regs with registers in type
PTR_TO_MAP_VALUE. Unlike non-spilled regs, spilled regs just perform
a memcmp() on their reg state and don't have any other optimizations
in place, therefore also id was relevant in this case for making a
pruning decision.

We can safely reset id to 0 as well when converting to PTR_TO_MAP_VALUE.
For the affected program, it resulted in a ~17 fold reduction of
complexity and let the program load fine again. Selftest suite also
runs fine. The only other place where env->id_gen is used currently is
through direct packet access, but for these cases id is long living, thus
a different scenario.

Also, the current logic in mark_map_regs() is not fully correct when
marking NULL branch with UNKNOWN_VALUE. We need to cache the destination
reg's id in any case. Otherwise, once we marked that reg as UNKNOWN_VALUE,
it's id is reset and any subsequent registers that hold the original id
and are of type PTR_TO_MAP_VALUE_OR_NULL won't be marked UNKNOWN_VALUE
anymore, since mark_map_reg() reuses the uncached regs[regno].id that
was just overridden. Note, we don't need to cache it outside of
mark_map_regs(), since it's called once on this_branch and the other
time on other_branch, which are both two independent verifier states.
A test case for this is added here, too.

Fixes: 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: vrf: Drop conntrack data after pass through VRF device on Tx

Locally originated traffic in a VRF fails in the presence of a POSTROUTING
rule. For example,

    $ iptables -t nat -A POSTROUTING -s 11.1.1.0/24  -j MASQUERADE
    $ ping -I red -c1 11.1.1.3
    ping: Warning: source address might be selected on device other than red.
    PING 11.1.1.3 (11.1.1.3) from 11.1.1.2 red: 56(84) bytes of data.
    ping: sendmsg: Operation not permitted

Worse, the above causes random corruption resulting in a panic in random
places (I have not seen a consistent backtrace).

Call nf_reset to drop the conntrack info following the pass through the
VRF device.  The nf_reset is needed on Tx but not Rx because of the order
in which NF_HOOK's are hit: on Rx the VRF device is after the real ingress
device and on Tx it is is before the real egress device. Connection
tracking should be tied to the real egress device and not the VRF device.

Fixes: 8f58336d3f78a ("net: Add ethernet header for pass through VRF device")
Fixes: 35402e3136634 ("net: Add IPv6 support to VRF device")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: vrf: Fix NAT within a VRF

Connection tracking with VRF is broken because the pass through the VRF
device drops the connection tracking info. Removing the call to nf_reset
allows DNAT and MASQUERADE to work across interfaces within a VRF.

Fixes: 73e20b761acf ("net: vrf: Add support for PREROUTING rules on vrf device")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'cls_flower-mask'

Paul Blakey says:

====================
net/sched: cls_flower: Fix mask handling

The series fix how the mask is being handled.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: cls_flower: Use masked key when calling HW offloads

Zero bits on the mask signify a "don't care" on the corresponding bits
in key. Some HWs require those bits on the key to be zero. Since these
bits are masked anyway, it's okay to provide the masked key to all
drivers.

Fixes: 5b33f48842fa ('net/flower: Introduce hardware offload support')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: cls_flower: Use mask for addr_type

When addr_type is set, mask should also be set.

Fixes: 66530bdf85eb ('sched,cls_flower: set key address type when present')
Fixes: bc3103f1ed40 ('net/sched: cls_flower: Classify packet in ip tunnels')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>