Rafał Miłecki [Tue, 19 Jan 2016 07:45:25 +0000 (08:45 +0100)]
bcma: use _PMU_ in all names of PMU registers
PMU (Power Management Unit) seems to be a separated piece of hardware,
just accessed using ChipCommon core registers. In recent Broadcom
chipsets PMU is not bounded to CC but available as separated core.
To make code cleaner & easier to review (for a correct R/W access) use
clearer names.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Rafał Miłecki [Fri, 15 Jan 2016 23:48:53 +0000 (00:48 +0100)]
bcma: support chipsets with PMU and GCI cores (devices)
Both cores are another exceptions. They are not accessed in a standard
way and to they don't need or have wrapping addresses.
This fixes bus scanning after finding such core.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Kalle Valo [Sat, 6 Feb 2016 11:01:25 +0000 (13:01 +0200)]
Merge tag 'iwlwifi-next-for-kalle-2016-01-31_2' of https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next
Here's a first batch of patches for 4.6:
* continue the work on multiple Rx queues (Sara)
* add support for beacon storing
used in low power states (Sara)
* cleanups (Rodrigo, Johannes)
* fix the LED behavior for iwldvm (Hubert)
* Use the regular firmware image of WoWLAN (Matti)
* fix 8000 devices for Big Endian machines (Johannes)
* more firmware debug hooks (Golan)
* add support for P2P Client snoozing (Avri)
* make the beacon filtering for AP mode configurable (Andrei)
* fix transmit queues overflow with LSO
iwlwifi: mvm: allow to disable beacon filtering for AP/GO interface
When in AP mode we need to filter in beacons from other APs to update HT
operation mode. As a power optimization the beacons are filtered out when
there are no associated stations. As a result, when there are no
associated stations, we will not update the HT operation mode until a
station connects.
Add a debugfs parameter that allows to disable this optimization.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Sara Sharon [Tue, 26 Jan 2016 10:35:13 +0000 (12:35 +0200)]
iwlwifi: pcie: update iwl_mpdu_desc fields
Final API of iwl_mpdu_desc has a change in the order of
the fields and does not include energy from the third
antenna (which is perfectly fine, since we don't have one).
Update the structure accordingly.
Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Sara Sharon [Mon, 25 Jan 2016 16:14:49 +0000 (18:14 +0200)]
iwlwifi: pcie: enable multi-queue rx path
Previous patches enabled new 9000 hardware DMA for one queue
only.
Enable the actual multi-queue path and configuration now.
This requires also per-queue NAPI struct.
Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Sara Sharon [Thu, 31 Dec 2015 09:49:18 +0000 (11:49 +0200)]
iwlwifi: mvm: support rss queues configuration command
9000 series supports multi-queue rx. The hardware needs
to be configured with the hash functions to perform and
indirection table that maps hash results to the relevant
CPUs\queues.
Support this configuration.
Add debugfs hook to configure the indirection table in
order to enable performance analysis. The configuration
is stateless, receives a partial or full pattern and sends
the command to the firmware.
Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Sara Sharon [Tue, 26 Jan 2016 11:17:47 +0000 (13:17 +0200)]
iwlwifi: mvm: add new ADD_STA command version
The 9000 hardware introduces the frame releaser, which
keeps track of the aggregation window and notifies host
of the window status. This requires in turn updating
the hardware with the RX BA session window size.
Firmware API was changed to enable that, update the driver
accordingly.
Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Johannes Berg [Sun, 24 Jan 2016 14:28:43 +0000 (15:28 +0100)]
iwlwifi: treat iwl_parse_nvm_data() MAC addr as little endian
The MAC address parameters passed to iwl_parse_nvm_data() are passed on
to iwl_set_hw_address_family_8000() which treats them as little endian.
Annotate them as such, and add the missing byte-swapping in mvm.
While at it, add the MAC address to the error to make debugging issues
with it easier.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Sara Sharon [Wed, 30 Dec 2015 21:58:29 +0000 (23:58 +0200)]
iwlwifi: mvm: add tlv for multi queue rx support
Previous patches enabled the multi-queue rx path based on
iwl_mvm_has_new_rx_api() which returned false by default.
Change it to return the actual value based on the firmware
TLV which is now defined.
Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Sara Sharon [Thu, 7 Jan 2016 14:50:45 +0000 (16:50 +0200)]
iwlwifi: mvm: change the check for ADD_STA status
The firmware will return the baid for BA session in the
ADD_STA command response.
This requires masking the check of the status, which is
actually only 8 bits, and not the whole 32 bits.
Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Johannes Berg [Wed, 13 Jan 2016 14:01:00 +0000 (15:01 +0100)]
iwlwifi: mvm: support setting minimum quota from debugfs
For debug purposes, allow setting minimum quota (for a single
virtual interface) from debugfs. This is an absolute minimum,
so it can only be set up to 95%.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
To be able to test low-latency behaviour properly, split the
different low-latency sources so that setting any one of them,
for example from debugfs, is sufficient; this avoids getting
the debug setting overwritten by other sources.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Sara Sharon [Tue, 29 Dec 2015 09:07:15 +0000 (11:07 +0200)]
iwlwifi: mvm: support beacon storing
Currently firmware is configured to filter out beacons. In case
a beacon was changed - it is waking the host.
However, some vendors change their IEs frequently without any
significant change, and redundant wakeups are triggered as a
result.
As a solution disable beacon filtering when entering d0i3.
Instead, firmware will store the latest beacon and upon exiting
d0i3 it will send it up to the host, so the host can act upon
changes (if there were any).
This beacon will arrive as a dedicated notification - support it
as well.
Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
iwlwifi: mvm: add support for negative temperatures
The driver should support also negative temperatures.
So there is a need to separate between the return value and
temperature in order to be able to distinguish between
a negative temperature and error value.
Avri Altman [Wed, 25 Nov 2015 11:17:10 +0000 (13:17 +0200)]
iwlwifi: mvm: Add P2P client snoozing
Enable snoozing and U-APSD on P2P client. The firwmare will
support this only if the BSS vif is not associated.
Make this configurable by a constant variable and disable
it by default.
Hubert Tarasiuk [Sun, 24 Jan 2016 12:35:06 +0000 (13:35 +0100)]
iwlwifi: dvm: handle zero brightness for wifi LED
In order to have the LED being OFF constantly when the
brightness is set to 0, we need to pass IWL_LED_SOLID to
iwl_led_cmd as the off parameter, otherwise the led will
stay on constantly.
This fixes
https://bugzilla.kernel.org/show_bug.cgi?id=110551
Matti Gottlieb [Thu, 31 Dec 2015 16:18:02 +0000 (18:18 +0200)]
iwlwifi: mvm: Do not switch to D3 image on suspend
Currently when the driver is configured with wowlan parameters, and enters
D3 mode, the driver switches the FW image to D3, and when it exists
suspend, it reloads the D0 image.
If the firmware supports the consolidation of the D0 & D3 images there is
no need to load the D3 image on suspend, and no need to reload the D0
image on resume.
Do not switch images on suspend / resume, for firmwares that support
consolidated images.
Signed-off-by: Matti Gottlieb <matti.gottlieb@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Luca Coelho [Wed, 6 Jan 2016 20:40:38 +0000 (18:40 -0200)]
iwlwifi: pcie: add initial RTPM support for PCI
Add an initial implementation of runtime power management (RTPM) for
PCI devices. With this patch, RTPM is only used when wifi is off
(i.e. the wifi interface is down). This implementation is behind a
new Kconfig flag, IWLWIFI_PCIE_RTPM.
Sara Sharon [Wed, 23 Dec 2015 13:10:03 +0000 (15:10 +0200)]
iwlwifi: pcie: add 9000 series multi queue rx DMA support
The 9000 series introduces several changes in the device
DMA operation.
As the device now supports multi-queue rx, several DMA channels
should be configured.
The flows of providing the device with the allocated RBDs now
changes as well - the device maintains a separate table of used
and free table.
The hardware may use the free table to feed RBDs to any queue.
This requires maintaing a shared table to map returned RBDs to
the original RXB - for that purpose the VID is introduced - an
internal identifier of the RB placed in the lower 12 bits and
returned by HW in the used data.
Another change is the support of 64 bit DMA address.
Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Sara Sharon [Mon, 14 Dec 2015 15:44:11 +0000 (17:44 +0200)]
iwlwifi: pcie: add infrastructure for multi-queue rx
The 9000 series devices will support multi rx queues.
Current code has one static rx queue - change it to allocate
a number of queues per the device capability (pre-9000 devices
have the number of rx queues set to one).
Subsequent generalizations are:
Change the code to access an explicit numbered rx queue only
when the queue number is known - when handling interrupt, when
accessing the default queue and when iterating the queues.
The rest of the functions will receive the rx queue as a pointer.
Generalize the warning in allocation failure to consider the
allocator status instead of a single rx queue status.
Move the rx initial pool of memory buffers to be shared among
all the queues and allocated to the default queue on init.
Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
iwlwifi: pcie: buffer packets to avoid overflowing Tx queues
When the Tx queues are full above a threshold, we
immediately stop the mac80211's queue to stop getting new
packets. This worked until TSO was enabled.
With TSO, one single packet from mac80211 can use many
descriptors since a large send needs to be split into
several segments.
This means that stopping mac80211's queues is not enough
and we also need to ensure that we don't overflow the Tx
queues with one single packet from mac80211.
Add code to transport layer to do just that. Stop
mac80211's queue as soon as the queue is full above the
same threshold as before, and keep pushing the current
packet along with its segments on the queue, but check
that we don't overflow. If that would happen, buffer the
segments, and send them when there is room in the Tx queue
again. Of course, we first need to send the buffered
segments and only then, wake up mac80211's queues.
Amitkumar Karwar [Wed, 13 Jan 2016 09:26:57 +0000 (01:26 -0800)]
mwifiex: use SYNC flag for canceling host sleep
Host sleep is cancelled in sdio resume() handler.
Cfg80211's resume handler is immediately called after
this. SYNC flag here ensures that host sleep handshake
gets completed and we have valid "adapter->nd_config"
before we report host wakeup reason in cfg80211's
resume handler.
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Xinming Hu [Wed, 13 Jan 2016 09:26:52 +0000 (01:26 -0800)]
mwifiex: add schedule scan support
This patch add sched scan support for mwifiex, include cfg80211
sched_scan_start/sched_scan_stop handler, corresponding bgscan
command path and event handler.
Signed-off-by: Xinming Hu <huxm@marvell.com> Signed-off-by: chunfan chen <jeffc@marvell.com> Signed-off-by: Cathy Luo <cluo@marvell.com> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Shengzhen Li [Tue, 12 Jan 2016 13:43:16 +0000 (05:43 -0800)]
mwifiex: fix power state out of sync problem
It's been observed that driver's power state goes out
of sync with firmware in some corner cases. When this
happens, driver tries to download the data to firmware
assuming it's awake which causes Tx data timeout.
Main thread will process interrupt as soon as interrupt
handler fills 'int_status' variable.
This patch fixes the race issue by updating 'int_status'
at the end of mwifiex_interrupt_status().
Signed-off-by: Shengzhen Li <szli@marvell.com> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Fixes: 3719c17e1816 ("wlcore/wl18xx: fw logger over sdio") Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
chunfan chen [Thu, 7 Jan 2016 07:40:48 +0000 (23:40 -0800)]
mwifiex: fix IBSS data path issue.
The port_open flag is not applicable for IBSS mode. IBSS data
path was broken when port_open flag was introduced.
This patch fixes the problem by correcting the checks.
Fixes: 5c8946330abfa4c ("mwifiex: enable traffic only when port is open") Signed-off-by: chunfan chen <jeffc@marvell.com> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
chunfan chen [Thu, 7 Jan 2016 07:40:47 +0000 (23:40 -0800)]
mwifiex: firmware download enhancements
Same chip is being used by WLAN as well as bluetooth
drivers. Each driver needs to check during initialisation
if firmware is already active or it needs to be freshly
downloaded. If one driver has started downloading the
firmware, other finds the winner flag as false.
mwifiex_check_fw_status() checks firmware status and also
check if WLAN is the winner for firmware downloading.
Once we detect that other interface is downloading
the firmware, we call this routine again with max
poll count to wait until firmware is ready.
This patch splits the routine to avoid checking
winner status unnecessarily multiple times and ensures
that correct messages are displayed to user.
Firmware status poll count is also increased to 150.
Signed-off-by: Chunfan Chen <jeffc@marvell.com> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Uri Mashiach [Wed, 30 Dec 2015 13:35:32 +0000 (15:35 +0200)]
wlcore/wl12xx: spi: add device tree support
Add DT support for the wl1271 SPI WiFi.
Add documentation file for the wl1271 SPI WiFi.
Signed-off-by: Uri Mashiach <uri.mashiach@compulab.co.il> Acked-by: Rob Herring <robh@kernel.org> Tested-By: Sebastian Reichel <sre@kernel.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Igor Grinberg <grinberg@compulab.co.il> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Uri Mashiach [Wed, 30 Dec 2015 13:35:31 +0000 (15:35 +0200)]
wlcore/wl12xx: spi: add power operation function
The power function uses a consumer regulator access to update the WiFi
enable GPIO value.
Signed-off-by: Uri Mashiach <uri.mashiach@compulab.co.il> Tested-By: Sebastian Reichel <sre@kernel.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Igor Grinberg <grinberg@compulab.co.il> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Pull networking fixes from David Miller:
"A quick set of bug fixes after there initial networking merge:
1) Netlink multicast group storage allocator only was tested with
nr_groups equal to 1, make it work for other values too. From
Matti Vaittinen.
2) Check build_skb() return value in macb and hip04_eth drivers, from
Weidong Wang.
3) Don't leak x25_asy on x25_asy_open() failure.
4) More DMA map/unmap fixes in 3c59x from Neil Horman.
5) Don't clobber IP skb control block during GSO segmentation, from
Konstantin Khlebnikov.
6) ECN helpers for ipv6 don't fixup the checksum, from Eric Dumazet.
7) Fix SKB segment utilization estimation in xen-netback, from David
Vrabel.
8) Fix lockdep splat in bridge addrlist handling, from Nikolay
Aleksandrov"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
bgmac: Fix reversed test of build_skb() return value.
bridge: fix lockdep addr_list_lock false positive splat
net: smsc: Add support h8300
xen-netback: free queues after freeing the net device
xen-netback: delete NAPI instance when queue fails to initialize
xen-netback: use skb to determine number of required guest Rx requests
net: sctp: Move sequence start handling into sctp_transport_get_idx()
ipv6: update skb->csum when CE mark is propagated
net: phy: turn carrier off on phy attach
net: macb: clear interrupts when disabling them
sctp: support to lookup with ep+paddr in transport rhashtable
net: hns: fixes no syscon error when init mdio
dts: hisi: fixes no syscon fault when init mdio
net: preserve IP control block during GSO segmentation
fsl/fman: Delete one function call "put_device" in dtsec_config()
hip04_eth: fix missing error handle for build_skb failed
3c59x: fix another page map/single unmap imbalance
3c59x: balance page maps and unmaps
x25_asy: Free x25_asy on x25_asy_open() failure.
mlxsw: fix SWITCHDEV_OBJ_ID_PORT_MDB
...
Linus Torvalds [Fri, 15 Jan 2016 21:18:47 +0000 (13:18 -0800)]
Merge tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Core:
- Ground work for the new Power9 MMU from Aneesh Kumar K.V
- Optimise FP/VMX/VSX context switching from Anton Blanchard
Misc:
- Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica
Gupta, Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling,
Andrew Donnellan
- Allow wrapper to work on non-english system from Laurent Vivier
- Add rN aliases to the pt_regs_offset table from Rashmica Gupta
- Fix module autoload for rackmeter & axonram drivers from Luis de
Bethencourt
- Include KVM guest test in all interrupt vectors from Paul Mackerras
- Fix DSCR inheritance over fork() from Anton Blanchard
- Make value-returning atomics & {cmp}xchg* & their atomic_ versions
fully ordered from Boqun Feng
- Print MSR TM bits in oops messages from Michael Neuling
- Add TM signal return & invalid stack selftests from Michael Neuling
- Limit EPOW reset event warnings from Vipin K Parashar
- Remove the Cell QPACE code from Rashmica Gupta
- Append linux_banner to exception information in xmon from Rashmica
Gupta
- Add selftest to check if VSRs are corrupted from Rashmica Gupta
- Remove broken GregorianDay() from Daniel Axtens
- Import Anton's context_switch2 benchmark into selftests from
Michael Ellerman
- Add selftest script to test HMI functionality from Daniel Axtens
- Remove obsolete OPAL v2 support from Stewart Smith
- Make enter_rtas() private from Michael Ellerman
- PPR exception cleanups from Michael Ellerman
- Add page soft dirty tracking from Laurent Dufour
- Add support for Nvlink NPUs from Alistair Popple
- Add support for kexec on 476fpe from Alistair Popple
- Enable kernel CPU dlpar from sysfs from Nathan Fontenot
- Copy only required pieces of the mm_context_t to the paca from
Michael Neuling
- Add a kmsg_dumper that flushes OPAL console output on panic from
Russell Currey
- Implement save_stack_trace_regs() to enable kprobe stack tracing
from Steven Rostedt
- Add HWCAP bits for Power9 from Michael Ellerman
- Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V
- Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins
- scripts/recordmcount.pl: support data in text section on powerpc
from Ulrich Weigand
- Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand
cxl:
- cxl: Fix possible idr warning when contexts are released from
Vaibhav Jain
- cxl: use correct operator when writing pcie config space values
from Andrew Donnellan
- cxl: Fix DSI misses when the context owning task exits from Vaibhav
Jain
- cxl: fix build for GCC 4.6.x from Brian Norris
- cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris
- cxl: Enable PCI device ID for future IBM CXL adapter from Uma
Krishnan
Freescale:
- Freescale updates from Scott: Highlights include moving QE code out
of arch/powerpc (to be shared with arm), device tree updates, and
minor fixes"
* tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (149 commits)
powerpc/module: Handle R_PPC64_ENTRY relocations
scripts/recordmcount.pl: support data in text section on powerpc
powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
powerpc/mm: Fix _PAGE_PTE breaking swapoff
cxl: Enable PCI device ID for future IBM CXL adapter
cxl: use -Werror only with CONFIG_PPC_WERROR
cxl: fix build for GCC 4.6.x
powerpc: Add HWCAP bits for Power9
powerpc/powernv: Reserve PE#0 on NPU
powerpc/powernv: Change NPU PE# assignment
powerpc/powernv: Fix update of NVLink DMA mask
powerpc/powernv: Remove misleading comment in pci.c
powerpc: Implement save_stack_trace_regs() to enable kprobe stack tracing
powerpc: Fix build break due to paca mm_context_t changes
cxl: Fix DSI misses when the context owning task exits
MAINTAINERS: Update Scott Wood's e-mail address
powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery()
powerpc: Fix style of self-test config prompts
powerpc/powernv: Only delay opal_rtc_read() retry when necessary
...
Linus Torvalds [Fri, 15 Jan 2016 21:01:01 +0000 (13:01 -0800)]
Merge tag 'vfio-v4.5-rc1' of git://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:
- Fixes in AMD xgbe reset, spapr structure padding, type 1 flags (Dan
Carpenter, Alexey Kardashevskiy, Pierre Morel)
- Re-introduce no-iommu mode, with a user this time (Alex Williamson)
* tag 'vfio-v4.5-rc1' of git://github.com/awilliam/linux-vfio:
vfio/iommu_type1: make use of info.flags
vfio: Include No-IOMMU mode
vfio: Add explicit alignments in vfio_iommu_spapr_tce_create
VFIO: platform: reset: fix a warning message condition
Linus Torvalds [Fri, 15 Jan 2016 20:49:44 +0000 (12:49 -0800)]
Merge tag 'nfsd-4.5' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
"Smaller bugfixes and cleanup, including a fix for a failures of
kerberized NFSv4.1 mounts, and Scott Mayhew's work addressing ACK
storms that can affect some high-availability NFS setups"
* tag 'nfsd-4.5' of git://linux-nfs.org/~bfields/linux:
nfsd: add new io class tracepoint
nfsd: give up on CB_LAYOUTRECALLs after two lease periods
nfsd: Fix nfsd leaks sunrpc module references
lockd: constify nlmsvc_binding structure
lockd: use to_delayed_work
nfsd: use to_delayed_work
Revert "svcrdma: Do not send XDR roundup bytes for a write chunk"
lockd: Register callbacks on the inetaddr_chain and inet6addr_chain
nfsd: Register callbacks on the inetaddr_chain and inet6addr_chain
sunrpc: Add a function to close temporary transports immediately
nfsd: don't base cl_cb_status on stale information
nfsd4: fix gss-proxy 4.1 mounts for some AD principals
nfsd: fix unlikely NULL deref in mach_creds_match
nfsd: minor consolidation of mach_cred handling code
nfsd: helper for dup of possibly NULL string
svcrpc: move some initialization to common code
nfsd: fix a warning message
nfsd: constify nfsd4_callback_ops structure
nfsd: recover: constify nfsd4_client_tracking_ops structures
svcrdma: Do not send XDR roundup bytes for a write chunk
After promisc mode management was introduced a bridge device could do
dev_set_promiscuity from its ndo_change_rx_flags() callback which in
turn can be called after the bridge's addr_list_lock has been taken
(e.g. by dev_uc_add). This causes a false positive lockdep splat because
the port interfaces' addr_list_lock is taken when br_manage_promisc()
runs after the bridge's addr list lock was already taken.
To remove the false positive introduce a custom bridge addr_list_lock
class and set it on bridge init.
A simple way to reproduce this is with the following:
$ brctl addbr br0
$ ip l add l br0 br0.100 type vlan id 100
$ ip l set br0 up
$ ip l set br0.100 up
$ echo 1 > /sys/class/net/br0/bridge/vlan_filtering
$ brctl addif br0 eth0
Splat:
[ 43.684325] =============================================
[ 43.684485] [ INFO: possible recursive locking detected ]
[ 43.684636] 4.4.0-rc8+ #54 Not tainted
[ 43.684755] ---------------------------------------------
[ 43.684906] brctl/1187 is trying to acquire lock:
[ 43.685047] (_xmit_ETHER){+.....}, at: [<ffffffff8150169e>] dev_set_rx_mode+0x1e/0x40
[ 43.685460] but task is already holding lock:
[ 43.685618] (_xmit_ETHER){+.....}, at: [<ffffffff815072a7>] dev_uc_add+0x27/0x80
[ 43.686015] other info that might help us debug this:
[ 43.686316] Possible unsafe locking scenario:
CC: Vlad Yasevich <vyasevic@redhat.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: Bridge list <bridge@lists.linux-foundation.org> CC: Andy Gospodarek <gospo@cumulusnetworks.com> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Fixes: 2796d0c648c9 ("bridge: Automatically manage port promiscuous mode.") Reported-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 15 Jan 2016 20:28:00 +0000 (12:28 -0800)]
Merge tag 'md/4.5' of git://neil.brown.name/md
Pull md updates from Neil Brown:
"Mostly clustered-raid1 and raid5 journal updates. one Y2038 fix and
other minor stuff.
One patch removes me from the MAINTAINERS file and adds a record of my
md maintainership to Credits"
Many thanks to Neil, who has been around for a _looong_ time.
* tag 'md/4.5' of git://neil.brown.name/md: (26 commits)
md/raid: only permit hot-add of compatible integrity profiles
Remove myself as MD Maintainer, and add to Credits.
raid5-cache: handle journal hotadd in quiesce
MD: add journal with array suspended
md: set MD_HAS_JOURNAL in correct places
md: Remove 'ready' field from mddev.
md: remove unnecesary md_new_event_inintr
raid5: allow r5l_io_unit allocations to fail
raid5-cache: use a mempool for the metadata block
raid5-cache: use a bio_set
raid5-cache: add journal hot add/remove support
drivers: md: use ktime_get_real_seconds()
md: avoid warning for 32-bit sector_t
raid5-cache: free meta_page earlier
raid5-cache: simplify r5l_move_io_unit_list
md: update comment for md_allow_write
md-cluster: update comments for MD_CLUSTER_SEND_LOCKED_ALREADY
md-cluster: Protect communication with mutexes
md-cluster: Defer MD reloading to mddev->thread
md-cluster: update the documentation
...
Linus Torvalds [Fri, 15 Jan 2016 20:14:47 +0000 (12:14 -0800)]
Merge tag 'regulator-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
"Aside from a fix for a spurious warning (which caused more problems
than it fixed in the fixing really) this is all driver updates,
including new drivers for Dialog PV88060/90 and TI LM363x and TPS65086
devices. The qcom_smd driver has had PM8916 and PMA8084 support
added"
* tag 'regulator-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (36 commits)
regulator: core: remove some dead code
regulator: core: use dev_to_rdev
regulator: lp872x: Get rid of duplicate reference to DVS GPIO
regulator: lp872x: Add missing of_match in regulators descriptions
regulator: axp20x: Fix GPIO LDO enable value for AXP22x
regulator: lp8788: constify regulator_ops structures
regulator: wm8*: constify regulator_ops structures
regulator: da9*: constify regulator_ops structures
regulator: mt6311: Use REGCACHE_RBTREE
regulator: tps65917/palmas: Add bypass ops for LDOs with bypass capability
regulator: qcom-smd: Add support for PMA8084
regulator: qcom-smd: Add PM8916 support
soc: qcom: documentation: Update SMD/RPM Docs
regulator: pv88090: logical vs bitwise AND typo
regulator: pv88090: Fix irq leak
regulator: pv88090: new regulator driver
regulator: wm831x-ldo: Use platform_register/unregister_drivers()
regulator: wm831x-dcdc: Use platform_register/unregister_drivers()
regulator: lp8788-ldo: Use platform_register/unregister_drivers()
regulator: core: Fix nested locking of supplies
...
Borislav Petkov [Fri, 15 Jan 2016 18:26:44 +0000 (19:26 +0100)]
amdkfd: Copy from the proper user command pointer
8f1d57c17248 ("amdkfd: don't open-code memdup_user()") mistakenly uses
an uninitialized local pointer, gcc complains:
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c: In function ‘kfd_ioctl_dbg_address_watch’:
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c:562:12: warning: ‘args_buff’ may be used uninitialized in this function [-Wmaybe-uninitialized]
args_buff = memdup_user(args_buff,
^
Fix it.
Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
David Vrabel [Fri, 15 Jan 2016 14:55:34 +0000 (14:55 +0000)]
xen-netback: use skb to determine number of required guest Rx requests
Using the MTU or GSO size to determine the number of required guest Rx
requests for an skb was subtly broken since these value may change at
runtime.
After 1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b (xen-netback: always
fully coalesce guest Rx packets) we always fully pack a packet into
its guest Rx slots. Calculating the number of required slots from the
packet length is then easy.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: sctp: Move sequence start handling into sctp_transport_get_idx()
net/sctp/proc.c: In function ‘sctp_transport_get_idx’:
net/sctp/proc.c:313: warning: ‘obj’ may be used uninitialized in this function
This is currently a false positive, as all callers check for a zero
offset first, and handle this case in the exact same way.
Move the check and handling into sctp_transport_get_idx() to kill the
compiler warning, and avoid future bugs.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 15 Jan 2016 12:56:56 +0000 (04:56 -0800)]
ipv6: update skb->csum when CE mark is propagated
When a tunnel decapsulates the outer header, it has to comply
with RFC 6080 and eventually propagate CE mark into inner header.
It turns out IP6_ECN_set_ce() does not correctly update skb->csum
for CHECKSUM_COMPLETE packets, triggering infamous "hw csum failure"
messages and stack traces.
Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 15 Jan 2016 19:51:51 +0000 (11:51 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF fixes and quota cleanups from Jan Kara:
"Several UDF fixes and some minor quota cleanups"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Check output buffer length when converting name to CS0
udf: Prevent buffer overrun with multi-byte characters
quota: constify qtree_fmt_operations structures
udf: avoid uninitialized variable use
udf: Fix lost indirect extent block
udf: Factor out code for creating indirect extent
udf: limit the maximum number of indirect extents in a row
udf: limit the maximum number of TD redirections
fs: make quota/dquot.c explicitly non-modular
fs: make quota/netlink.c explicitly non-modular
Sjoerd Simons [Thu, 14 Jan 2016 20:57:18 +0000 (21:57 +0100)]
net: phy: turn carrier off on phy attach
The operstate of a networking device initially IF_OPER_UNKNOWN aka
"unknown", updated on carrier state changes (with carrier state being on
by default). This means it will stay unknown unless the carrier state
goes to off at some point, which is not the case if the phy is already
up/connected at startup.
Explicitly turn off the carrier on phy attach, leaving the phy state
machine to turn the carrier on when it has done the initial negotiation.
Signed-off-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Nathan Sullivan [Thu, 14 Jan 2016 19:27:27 +0000 (13:27 -0600)]
net: macb: clear interrupts when disabling them
Disabling interrupts with the IDR register does not stop the macb hardware
from asserting its interrupt line if there are interrupts pending. Always
clear the interrupts using ISR, and be sure to write it on hardware that
is not read-to-clear, like Zynq. Not doing so will cause interrupts when
the driver doesn't expect them.
Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 15 Jan 2016 19:41:44 +0000 (11:41 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge first patch-bomb from Andrew Morton:
- A few hotfixes which missed 4.4 becasue I was asleep. cc'ed to
-stable
- A few misc fixes
- OCFS2 updates
- Part of MM. Including pretty large changes to page-flags handling
and to thp management which have been buffered up for 2-3 cycles now.
I have a lot of MM material this time.
[ It turns out the THP part wasn't quite ready, so that got dropped from
this series - Linus ]
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (117 commits)
zsmalloc: reorganize struct size_class to pack 4 bytes hole
mm/zbud.c: use list_last_entry() instead of list_tail_entry()
zram/zcomp: do not zero out zcomp private pages
zram: pass gfp from zcomp frontend to backend
zram: try vmalloc() after kmalloc()
zram/zcomp: use GFP_NOIO to allocate streams
mm: add tracepoint for scanning pages
drivers/base/memory.c: fix kernel warning during memory hotplug on ppc64
mm/page_isolation: use macro to judge the alignment
mm: fix noisy sparse warning in LIBCFS_ALLOC_PRE()
mm: rework virtual memory accounting
include/linux/memblock.h: fix ordering of 'flags' argument in comments
mm: move lru_to_page to mm_inline.h
Documentation/filesystems: describe the shared memory usage/accounting
memory-hotplug: don't BUG() in register_memory_resource()
hugetlb: make mm and fs code explicitly non-modular
mm/swapfile.c: use list_for_each_entry_safe in free_swap_count_continuations
mm: /proc/pid/clear_refs: no need to clear VM_SOFTDIRTY in clear_soft_dirty_pmd()
mm: make sure isolate_lru_page() is never called for tail page
vmstat: make vmstat_updater deferrable again and shut down on idle
...
Xin Long [Thu, 14 Jan 2016 05:49:34 +0000 (13:49 +0800)]
sctp: support to lookup with ep+paddr in transport rhashtable
Now, when we sendmsg, we translate the ep to laddr by selecting the
first element of the list, and then do a lookup for a transport.
But sctp_hash_cmp() will compare it against asoc addr_list, which may
be a subset of ep addr_list, meaning that this chosen laddr may not be
there, and thus making it impossible to find the transport.
So we fix it by using ep + paddr to lookup transports in hashtable. In
sctp_hash_cmp, if .ep is set, we will check if this ep == asoc->ep,
or we will do the laddr check.
Fixes: d6c0256a60e6 ("sctp: add the rhashtable apis for sctp global transport hashtable") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reported-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Do not __GFP_ZERO allocated zcomp ->private pages. We keep allocated
streams around and use them for read/write requests, so we supply a
zeroed out ->private to compression algorithm as a scratch buffer only
once -- the first time we use that stream. For the rest of IO requests
served by this stream ->private usually contains some temporarily data
from the previous requests.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Thu, 14 Jan 2016 23:22:32 +0000 (15:22 -0800)]
zram: pass gfp from zcomp frontend to backend
Each zcomp backend uses own gfp flag but it's pointless because the
context they could be called is driven by upper layer(ie, zcomp
frontend). As well, zcomp frondend could call them in different
context. One context(ie, zram init part) is it should be better to make
sure successful allocation other context(ie, further stream allocation
part for accelarating I/O speed) is just optional so let's pass gfp down
from driver (ie, zcomp frontend) like normal MM convention.
[sergey.senozhatsky@gmail.com: add missing __vmalloc zero and highmem gfps] Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kyeongdon Kim [Thu, 14 Jan 2016 23:22:29 +0000 (15:22 -0800)]
zram: try vmalloc() after kmalloc()
When we're using LZ4 multi compression streams for zram swap, we found
out page allocation failure message in system running test. That was
not only once, but a few(2 - 5 times per test). Also, some failure
cases were continually occurring to try allocation order 3.
In order to make parallel compression private data, we should call
kzalloc() with order 2/3 in runtime(lzo/lz4). But if there is no order
2/3 size memory to allocate in that time, page allocation fails. This
patch makes to use vmalloc() as fallback of kmalloc(), this prevents
page alloc failure warning.
After using this, we never found warning message in running test, also
It could reduce process startup latency about 60-120ms in each case.
We can end up allocating a new compression stream with GFP_KERNEL from
within the IO path, which may result is nested (recursive) IO
operations. That can introduce problems if the IO path in question is a
reclaimer, holding some locks that will deadlock nested IOs.
Allocate streams and working memory using GFP_NOIO flag, forbidding
recursive IO and FS operations.
An example:
inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
git/20158 [HC0[0]:SC0[0]:HE1:SE1] takes:
(jbd2_handle){+.+.?.}, at: start_this_handle+0x4ca/0x555
{IN-RECLAIM_FS-W} state was registered at:
__lock_acquire+0x8da/0x117b
lock_acquire+0x10c/0x1a7
start_this_handle+0x52d/0x555
jbd2__journal_start+0xb4/0x237
__ext4_journal_start_sb+0x108/0x17e
ext4_dirty_inode+0x32/0x61
__mark_inode_dirty+0x16b/0x60c
iput+0x11e/0x274
__dentry_kill+0x148/0x1b8
shrink_dentry_list+0x274/0x44a
prune_dcache_sb+0x4a/0x55
super_cache_scan+0xfc/0x176
shrink_slab.part.14.constprop.25+0x2a2/0x4d3
shrink_zone+0x74/0x140
kswapd+0x6b7/0x930
kthread+0x107/0x10f
ret_from_fork+0x3f/0x70
irq event stamp: 138297
hardirqs last enabled at (138297): debug_check_no_locks_freed+0x113/0x12f
hardirqs last disabled at (138296): debug_check_no_locks_freed+0x33/0x12f
softirqs last enabled at (137818): __do_softirq+0x2d3/0x3e9
softirqs last disabled at (137813): irq_exit+0x41/0x95
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(jbd2_handle);
<Interrupt>
lock(jbd2_handle);
yankejian [Wed, 13 Jan 2016 07:09:59 +0000 (15:09 +0800)]
net: hns: fixes no syscon error when init mdio
As dtsi files use the normal naming conventions using '-' instead of '_'
inside of property names, the driver needs to update the phandle name
strings of the of_parse_phandle func.
Signed-off-by: Kejian Yan <yankejian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
yankejian [Wed, 13 Jan 2016 07:09:58 +0000 (15:09 +0800)]
dts: hisi: fixes no syscon fault when init mdio
When linux start up, we get the log below:
"Hi-HNS_MDIO 803c0000.mdio: no syscon hisilicon,peri-c-subctrl
mdio_bus mdio@803c0000: mdio sys ctl reg has not maped"
The source code about the subctrl is dealt syscon, but dts doesn't.
It cause such fault, so this patch adds the syscon info on dts files to
fixes it.
Signed-off-by: Kejian Yan <yankejian@huawei.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
net: preserve IP control block during GSO segmentation
Skb_gso_segment() uses skb control block during segmentation.
This patch adds 32-bytes room for previous control block which
will be copied into all resulting segments.
This patch fixes kernel crash during fragmenting forwarded packets.
Fragmentation requires valid IP CB in skb for clearing ip options.
Also patch removes custom save/restore in ovs code, now it's redundant.
Linus Torvalds [Fri, 15 Jan 2016 01:04:19 +0000 (17:04 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
floppy: make local variable non-static
exynos: fixes an incorrect header guard
dt-bindings: fixes some incorrect header guards
cpufreq-dt: correct dead link in documentation
cpufreq: ARM big LITTLE: correct dead link in documentation
treewide: Fix typos in printk
Documentation: filesystem: Fix typo in fs/eventfd.c
fs/super.c: use && instead of & for warn_on condition
Documentation: fix sysfs-ptp
lib: scatterlist: fix Kconfig description
Linus Torvalds [Fri, 15 Jan 2016 00:38:02 +0000 (16:38 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching
Pull livepatching updates from Jiri Kosina:
- RO/NX attribute fixes for patch module relocations from Josh
Poimboeuf. As part of this effort, module.c has been cleaned up as
well and livepatching is piggy-backing on this cleanup. Rusty is OK
with this whole lot going through livepatching tree.
- symbol disambiguation support from Chris J Arges. That series is
also
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
but this came in only after I've alredy pushed out. Didn't want to
rebase because of that, hence I am mentioning it here.
- symbol lookup fix from Miroslav Benes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: Cleanup module page permission changes
module: keep percpu symbols in module's symtab
module: clean up RO/NX handling.
module: use a structure to encapsulate layout.
gcov: use within_module() helper.
module: Use the same logic for setting and unsetting RO/NX
livepatch: function,sympos scheme in livepatch sysfs directory
livepatch: add sympos as disambiguator field to klp_reloc
livepatch: add old_sympos as disambiguator field to klp_func
Linus Torvalds [Fri, 15 Jan 2016 00:20:42 +0000 (16:20 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID updates from Jiri Kosina:
- appoint Benjamin Tissoires as co-maintainer / designated reviewer
- sysfs report_descriptor visibility fix for unclaimed devices, from
Andy Lutomirski
- suspend/resume fixes for Sony driver from Frank Praznik
- IRQ deadlock fix from Ioan-Adrian Ratiu
- hid-i2c fixes affecting (at least) Yoga 900 from Mika Westerberg and
Srinivas Pandruvada
- a lot of new device support (especially, but not limited to, Wacom)
and assorted small misc fixes
- almost complete G920 support; the only bit that is missing is
switching the device to HID mode automatically; Simon Wood and Michal
Maly are working on it.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (46 commits)
Revert "INPUT: xpad: switch Logitech G920 Wheel into HID mode"
HID: sensor-hub: Add quirk for Lenovo Yoga 900 with ITE Chips
HID: Add new PID for Microchip Pick16F1454
HID: wacom: Use correct report to query pen ID from INTUOSHT2 devices
HID: i2c-hid: Prevent sending reports from racing with device reset
HID: use kobj_to_dev()
HID: wiimote: use dev_to_wii()
HID: add a new helper to_hid_driver()
HID: use to_hid_device()
HID: move to_hid_device() to hid.h
HID: usbhid: use to_usb_device
HID: corsair: Convert to use module_hid_driver
HID: input: ignore the battery in OKLICK Laser BTmouse
HID: wacom: Fix pad button range for CINTIQ_COMPANION_2
HID: wacom: Fix touchring value reporting
HID: wacom: Report 'strip2' values in ABS_RY
HID: wacom: Limit touchstrip data to 13 bits
HID: wacom: bitwise vs logical ORs
HID: wacom: Apply lowres quirk to BAMBOO_TOUCH devices
HID: enable hid device to suspend/resume asynchronously
...
Linus Torvalds [Fri, 15 Jan 2016 00:08:23 +0000 (16:08 -0800)]
Merge tag 'nfs-for-4.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Stable fixes:
- Fix a regression in the SunRPC socket polling code
- Fix the attribute cache revalidation code
- Fix race in __update_open_stateid()
- Fix an lo->plh_block_lgets imbalance in layoutreturn
- Fix an Oopsable typo in ff_mirror_match_fh()
Features:
- pNFS layout recall performance improvements.
- pNFS/flexfiles: Support server-supplied layoutstats sampling period
Bugfixes + cleanups:
- NFSv4: Don't perform cached access checks before we've OPENed the
file
- Fix starvation issues with background flushes
- Reclaim writes should be flushed as unstable writes if there are
already entries in the commit lists
- Various bugfixes from Chuck to fix NFS/RDMA send queue ordering
problems
- Ensure that we propagate fatal layoutget errors back to the
application
- Fixes for sundry flexfiles layoutstats bugs
- Fix files/flexfiles to not cache invalidated layouts in the DS
commit buckets"
* tag 'nfs-for-4.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (68 commits)
NFS: Fix a compile warning about unused variable in nfs_generic_pg_pgios()
NFSv4: Fix a compile warning about no prototype for nfs4_ioctl()
NFS: Use wait_on_atomic_t() for unlock after readahead
SUNRPC: Fixup socket wait for memory
NFSv4.1/pNFS: Cleanup constify struct pnfs_layout_range arguments
NFSv4.1/pnfs: Cleanup copying of pnfs_layout_range structures
NFSv4.1/pNFS: Cleanup pnfs_mark_matching_lsegs_invalid()
NFSv4.1/pNFS: Fix a race in initiate_file_draining()
NFSv4.1/pNFS: pnfs_error_mark_layout_for_return() must always return layout
NFSv4.1/pNFS: pnfs_mark_matching_lsegs_return() should set the iomode
NFSv4.1/pNFS: Use nfs4_stateid_copy for copying stateids
NFSv4.1/pNFS: Don't pass stateids by value to pnfs_send_layoutreturn()
NFS: Relax requirements in nfs_flush_incompatible
NFSv4.1/pNFS: Don't queue up a new commit if the layout segment is invalid
NFS: Allow multiple commit requests in flight per file
NFS/pNFS: Fix up pNFS write reschedule layering violations and bugs
SUNRPC: Fix a missing break in rpc_anyaddr()
pNFS/flexfiles: Fix an Oopsable typo in ff_mirror_match_fh()
NFS: Fix attribute cache revalidation
NFS: Ensure we revalidate attributes before using execute_ok()
...
Ebru Akagunduz [Thu, 14 Jan 2016 23:22:19 +0000 (15:22 -0800)]
mm: add tracepoint for scanning pages
This patch series makes swapin readahead up to a certain number to gain
more thp performance and adds tracepoint for khugepaged_scan_pmd,
collapse_huge_page, __collapse_huge_page_isolate.
This patch series was written to deal with programs that access most,
but not all, of their memory after they get swapped out. Currently
these programs do not get their memory collapsed into THPs after the
system swapped their memory out, while they would get THPs before
swapping happened.
This patch series was tested with a test program, it allocates 400MB of
memory, writes to it, and then sleeps. I force the system to swap out
all. Afterwards, the test program touches the area by writing and
leaves a piece of it without writing. This shows how much swap in
readahead made by the patch.
John Allen [Thu, 14 Jan 2016 23:22:16 +0000 (15:22 -0800)]
drivers/base/memory.c: fix kernel warning during memory hotplug on ppc64
Fix a bug where a kernel warning is triggered when performing a memory
hotplug on ppc64. This warning may also occur on any architecture that
uses the memory_probe_store interface.
The warning is triggered because there is a udev rule that automatically
tries to online memory after it has been added. The udev rule varies
from distro to distro, but will generally look something like:
On any architecture that uses memory_probe_store to reserve memory, the
udev rule will be triggered after the first section of the block is
reserved and will subsequently attempt to online the entire block,
interrupting the memory reservation process and causing the warning.
This patch modifies memory_probe_store to add a block of memory with a
single call to add_memory as opposed to looping through and adding each
section individually. A single call to add_memory is protected by the
mem_hotplug mutex which will prevent the udev rule from onlining memory
until the reservation of the entire block is complete.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When inspecting a vague code inside prctl(PR_SET_MM_MEM) call (which
testing the RLIMIT_DATA value to figure out if we're allowed to assign
new @start_brk, @brk, @start_data, @end_data from mm_struct) it's been
commited that RLIMIT_DATA in a form it's implemented now doesn't do
anything useful because most of user-space libraries use mmap() syscall
for dynamic memory allocations.
Linus suggested to convert RLIMIT_DATA rlimit into something suitable
for anonymous memory accounting. But in this patch we go further, and
the changes are bundled together as:
* keep vma counting if CONFIG_PROC_FS=n, will be used for limits
* replace mm->shared_vm with better defined mm->data_vm
* account anonymous executable areas as executable
* account file-backed growsdown/up areas as stack
* drop struct file* argument from vm_stat_account
* enforce RLIMIT_DATA for size of data areas
This way code looks cleaner: now code/stack/data classification depends
only on vm_flags state:
Florian Fainelli [Thu, 14 Jan 2016 23:22:04 +0000 (15:22 -0800)]
include/linux/memblock.h: fix ordering of 'flags' argument in comments
for_each_free_mem_range() and for_each_free_mem_range_reverse() both
accept a 'flags' argument, the comment surrounding the macro placed the
'flags' documentation at the very end, while 'flags' is in fact the 3rd
argument to the macro, so let's preserve natural ordering here.
Fixes: fc6daaf931518 ("mm/memblock: add extra "flags" to memblock to allow selection of memory based on attribute") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rodrigo Freire [Thu, 14 Jan 2016 23:21:58 +0000 (15:21 -0800)]
Documentation/filesystems: describe the shared memory usage/accounting
The Shared Memory accounting support is present in Kernel since commit 4b02108ac1b3 ("mm: oom analysis: add shmem vmstat") and in userland
free(1) since 2014. This patch updates the Documentation to reflect
this change.
Vitaly Kuznetsov [Thu, 14 Jan 2016 23:21:55 +0000 (15:21 -0800)]
memory-hotplug: don't BUG() in register_memory_resource()
Out of memory condition is not a bug and while we can't add new memory
in such case crashing the system seems wrong. Propagating the return
value from register_memory_resource() requires interface change.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: Sheng Yong <shengyong1@huawei.com> Cc: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Gortmaker [Thu, 14 Jan 2016 23:21:52 +0000 (15:21 -0800)]
hugetlb: make mm and fs code explicitly non-modular
The Kconfig currently controlling compilation of this code is:
config HUGETLBFS
bool "HugeTLB file system support"
...meaning that it currently is not being built as a module by anyone.
Lets remove the modular code that is essentially orphaned, so that when
reading the driver there is no doubt it is builtin-only.
Since module_init translates to device_initcall in the non-modular case,
the init ordering gets moved to earlier levels when we use the more
appropriate initcalls here.
Originally I had the fs part and the mm part as separate commits, just
by happenstance of the nature of how I detected these non-modular use
cases. But that can possibly introduce regressions if the patch merge
ordering puts the fs part 1st -- as the 0-day testing reported a splat
at mount time.
Investigating with "initcall_debug" showed that the delta was
init_hugetlbfs_fs being called _before_ hugetlb_init instead of after. So
both the fs change and the mm change are here together.
In addition, it worked before due to luck of link order, since they were
both in the same initcall category. So we now have the fs part using
fs_initcall, and the mm part using subsys_initcall, which puts it one
bucket earlier. It now passes the basic sanity test that failed in
earlier 0-day testing.
We delete the MODULE_LICENSE tag and capture that information at the top
of the file alongside author comments, etc.
We don't replace module.h with init.h since the file already has that.
Also note that MODULE_ALIAS is a no-op for non-modular code.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reported-by: kernel test robot <ying.huang@linux.intel.com> Cc: Nadia Yvette Chambers <nyc@holomorphy.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: David Rientjes <rientjes@google.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>