Trond Myklebust [Mon, 29 Oct 2012 23:02:20 +0000 (19:02 -0400)]
NFSv4: Clean up handling of privileged operations
Privileged rpc calls are those that are run by the state recovery thread,
in cases where we're trying to recover the system after a server reboot
or a network partition. In those cases, we want to fence off all other
rpc calls (see nfs4_begin_drain_session()) so that they don't end up
using stateids or clientids that are in the process of being recovered.
Prior to this patch, we had to set up special callback functions in
order to declare an rpc call as being privileged.
By adding a new field to the sequence arguments, this patch simplifies
things considerably, and allows us to declare the rpc call as privileged
before it is run.
Trond Myklebust [Thu, 1 Nov 2012 21:07:07 +0000 (17:07 -0400)]
NFSv4.1: Remove the 'FIFO' behaviour for nfs41_setup_sequence
It is more important to preserve the task priority behaviour, which ensures
that things like reclaim writes take precedence over background and kupdate
writes.
Trond Myklebust [Tue, 23 Oct 2012 00:28:44 +0000 (20:28 -0400)]
NFSv4.1: Simplify the sequence setup
Nobody calls nfs4_setup_sequence or nfs41_setup_sequence without
also calling rpc_call_start() on success. This commit therefore
folds the rpc_call_start call into nfs41_setup_sequence().
Trond Myklebust [Mon, 26 Nov 2012 21:16:54 +0000 (16:16 -0500)]
NFSv4.1: Ping server when our session table limits are too high
If the server requests a lower target_highest_slotid, then ensure
that we ping it with at least one RPC call containing an
appropriate SEQUENCE op. This ensures that the server won't need to
send a recall callback in order to shrink the slot table.
Trond Myklebust [Thu, 22 Nov 2012 03:34:45 +0000 (22:34 -0500)]
NFSv4.1: Set the maximum slot table size to 1024 slots
This means that we end up statically allocating 128 bytes for the
bitmap on each slot table.
For a server that supports 1MB write and read I/O sizes this means
that we can completely fill the maximum 1GB TCP send/receive
windows.
Trond Myklebust [Mon, 26 Nov 2012 18:13:29 +0000 (13:13 -0500)]
NFSv4: Move nfs4_wait_clnt_recover and nfs4_client_recover_expired_lease
nfs4_wait_clnt_recover and nfs4_client_recover_expired_lease are both
generic state related functions. As such, they belong in nfs4state.c,
and not nfs4proc.c
Trond Myklebust [Fri, 23 Nov 2012 18:09:38 +0000 (13:09 -0500)]
NFSv4.1: Clean up session draining
Coalesce nfs4_check_drain_bc_complete and nfs4_check_drain_fc_complete
into a single function that can be called when the slot table is known
to be empty, then change nfs4_callback_free_slot() and nfs4_free_slot()
to use it.
Trond Myklebust [Thu, 22 Nov 2012 18:21:02 +0000 (13:21 -0500)]
NFSv4.1: If slot allocation fails due to OOM, retry more quickly
If the NFSv4.1 session slot allocation fails due to an ENOMEM condition,
then set the task->tk_timeout to 1/4 second to ensure that we do retry
the slot allocation more quickly.
Trond Myklebust [Wed, 21 Nov 2012 14:06:11 +0000 (09:06 -0500)]
NFSv4.1: CB_RECALL_SLOT must schedule a sequence op after updating targets
RFC5661 requires us to make sure that the server knows we've updated
our slot table size by sending at least one SEQUENCE op containing the
new 'highest_slotid' value.
We can do so using the 'CHECK_LEASE' functionality of the state
manager.
Trond Myklebust [Wed, 21 Nov 2012 01:12:38 +0000 (20:12 -0500)]
NFSv4.1: Remove the state manager code to resize the slot table
The state manager no longer needs any special machinery to stop the
session flow and resize the slot table. It is all done on the fly by
the SEQUENCE op code now.
Trond Myklebust [Tue, 20 Nov 2012 23:10:30 +0000 (18:10 -0500)]
NFSv4.1: Reset the sequence number for slots that have been deallocated
When the server tells us that it is dynamically resizing the session
replay cache, we should reset the sequence number for those slots
that have been deallocated.
Trond Myklebust [Tue, 20 Nov 2012 17:49:27 +0000 (12:49 -0500)]
NFSv4.1: Ensure that the client tracks the server target_highest_slotid
Dynamic slot allocation in NFSv4.1 depends on the client being able to
track the server's target value for the highest slotid in the
slot table. See the reference in Section 2.10.6.1 of RFC5661.
To avoid ordering problems in the case where 2 SEQUENCE replies contain
conflicting updates to this target value, we also introduce a generation
counter, to track whether or not an RPC containing a SEQUENCE operation
was launched before or after the last update.
Also rename the nfs4_slot_table target_max_slots field to
'target_highest_slotid' to avoid confusion with a slot
table size or number of slots.
Trond Myklebust [Fri, 16 Nov 2012 21:10:11 +0000 (16:10 -0500)]
NFSv4.1: Simplify slot allocation
Clean up the NFSv4.1 slot allocation by replacing nfs_find_slot() with
a function nfs_alloc_slot() that returns a pointer to the nfs4_slot
instead of an offset into the slot table.
Trond Myklebust [Tue, 20 Nov 2012 16:13:12 +0000 (11:13 -0500)]
NFSv4.1: We must bump the clientid sequence number after CREATE_SESSION
We must always bump the clientid sequence number after a successful
call to CREATE_SESSION on the server. The result of
nfs4_verify_channel_attrs() is irrelevant to that requirement.
Trond Myklebust [Tue, 20 Nov 2012 15:53:39 +0000 (10:53 -0500)]
NFSv4.1: Adjust CREATE_SESSION arguments when mounting a new filesystem
If we're mounting a new filesystem, ensure that the session has negotiated
large enough request and reply sizes to match the wsize and rsize mount
arguments.
Trond Myklebust [Wed, 21 Nov 2012 14:22:14 +0000 (09:22 -0500)]
NFSv4.1: Handle session reset and bind_conn_to_session before lease check
We can't send a SEQUENCE op unless the session is OK, so it is pointless
to handle the CHECK_LEASE state before we've dealt with SESSION_RESET
and BIND_CONN_TO_SESSION.
Bryan Schumaker [Mon, 12 Nov 2012 21:55:38 +0000 (16:55 -0500)]
NFS: Add sequence_priviliged_ops for nfs4_proc_sequence()
If I mount an NFS v4.1 server to a single client multiple times and then
run xfstests over each mountpoint I usually get the client into a state
where recovery deadlocks. The server informs the client of a
cb_path_down sequence error, the client then does a
bind_connection_to_session and checks the status of the lease.
I found that bind_connection_to_session sets the NFS4_SESSION_DRAINING
flag on the client, but this flag is never unset before
nfs4_check_lease() reaches nfs4_proc_sequence(). This causes the client
to deadlock, halting all NFS activity to the server. nfs4_proc_sequence()
is only called by the state manager, so I can change it to run in privileged
mode to bypass the NFS4_SESSION_DRAINING check and avoid the deadlock.
Replace multiple BUG_ON() calls with WARN_ON_ONCE() and early return when
sanity checking socket ownership (lock). The bind call will fail if the
socket was unsuccessfully reclassified.
Replace BUG_ON() with WARN_ON_ONCE(). The error condition is a simple
ref counting sanity check and the following code will not free anything
until final put.
Replace BUG_ON() with WARN_ON_ONCE() - rpc_run_bc_task calls rpc_init_task()
then increments the tk_count, so this is a simple sanity check that
if hit once would hit every time this code path is executed.
Trond Myklebust [Mon, 15 Oct 2012 21:14:38 +0000 (17:14 -0400)]
lockd: Remove unnecessary BUG_ON()s in the xdr client code
- Offset bound checks are done in the NFS client code.
- So are filehandle size checks
- The cookie length is a constant
- The utsname()->nodename is already bounded
Trond Myklebust [Mon, 15 Oct 2012 15:51:21 +0000 (11:51 -0400)]
NFS: Remove asserts from the NFS XDR code
Convert the ones that are not trivial to check into WARN_ON_ONCE().
Remove checks for things such as NFS2_MAXPATHLEN, which are trivially
done by the caller.
Add a comment to the case of nfs3_xdr_enc_setacl3args. What is being
done there is just wrong...
rpc_shutdown_client should never be called from a workqueue context.
If it is, it could deadlock looping forever trying to kill tasks that are
assigned to the same kworker thread (and will never run rpc_exit_task).
Linus Torvalds [Sat, 3 Nov 2012 22:27:21 +0000 (15:27 -0700)]
Merge tag 'nfs-for-3.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Fix a bunch of deadlock situations:
* State recovery can deadlock if we fail to release sequence ids
before scheduling the recovery thread.
* Calling deactivate_super() from an RPC workqueue thread can
deadlock because of the call to rpc_shutdown_client.
- Display the device name correctly in /proc/*/mounts
- Fix a number of incorrect error return values:
* When NFSv3 mounts fail due to a timeout.
* On NFSv4.1 backchannel setup failure
* On NFSv4 open access checks
- pnfs_find_alloc_layout() must check the layout pointer for NULL
- Fix a regression in the legacy DNS resolved
* tag 'nfs-for-3.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS4: nfs4_opendata_access should return errno
NFSv4: Initialise the NFSv4.1 slot table highest_used_slotid correctly
SUNRPC: return proper errno from backchannel_rqst
NFS: add nfs_sb_deactive_async to avoid deadlock
nfs: Show original device name verbatim in /proc/*/mount{s,info}
nfsv3: Make v3 mounts fail with ETIMEDOUTs instead EIO on mountd timeouts
nfs: Check whether a layout pointer is NULL before free it
NFS: fix bug in legacy DNS resolver.
NFSv4: nfs4_locku_done must release the sequence id
NFSv4.1: We must release the sequence id when we fail to get a session slot
NFS: Wait for session recovery to finish before returning
Linus Torvalds [Sat, 3 Nov 2012 22:25:14 +0000 (15:25 -0700)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management & ACPI update from Zhang Rui,
Ho humm. Normally these things go through Len. But it's just three
small fixes, I guess I can pull directly too.
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
exynos4_tmu_driver_ids should be exynos_tmu_driver_ids.
ACPI video: Ignore errors after _DOD evaluation.
thermal: solve compilation errors in rcar_thermal
Linus Torvalds [Sat, 3 Nov 2012 22:14:54 +0000 (15:14 -0700)]
Merge branch 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux
Pull i2c embedded fixes from Wolfram Sang:
"Two patches are usual stuff.
The bigger patch is needed to correct a wrong decision made in this
merge window. We hoped to get the PIOQUEUE mode in the mxs driver
working with DMA, but it turned out to be too broken (leading to data
loss), so we now think it is best to remove it entirely and work only
with DMA now. The patch should be in 3.7. IMO, so users never get
the chance to use both modes in parallel."
* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
i2c: tegra: set irq name as device name
i2c-nomadik: Fixup clock handling
i2c: mxs: remove broken PIOQUEUE support
it's slightly bigger than I'd probably like, but nothing looked
dangerous enough to hold off on."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/udl: fix stride issues scanning out stride != width*bpp
drm/radeon: add load detection support for ext DAC on R200 (v2)
DRM/radeon: For single CRTC GPUs move handling of CRTC_CRT_ON to crtc_dpms().
DRM/Radeon: Fix TV DAC Load Detection for single CRTC chips.
DRM/Radeon: Clean up code in TV DAC load detection.
drm/radeon: fix ATPX function documentation
drivers/gpu/drm/radeon/evergreen_cs.c: Remove unnecessary semicolon
DRM/Radeon: On DVI-I use Load Detection when EDID is bogus.
DRM/Radeon: Fix primary DAC Load Detection for RV100 chips.
DRM/Radeon: Fix Load Detection on legacy primary DAC.
drm: exynos: removed warning due to missing typecast for mixer driver data
drm/exynos: add support for ARCH_MULTIPLATFORM
MAINTAINERS: Add git repository for Exynos DRM
drm/exynos: fix display on issue
drm/i915: Only kick out vesafb if we takeover the fbcon with KMS
drm/i915: be less verbose about inability to provide vendor backlight
drm/i915: clear the entire sdvo infoframe buffer
drm/i915: VGA needs to be on pipe A on i830M
drm/i915: fix overlay on i830M
Pull networking fixes from David Miller:
"First post-Sandy pull request"
1) Fix antenna gain handling and initialization of chan->max_reg_power
in wireless, from Felix Fietkau.
2) Fix nexthop handling in H.232 conntrack helper, from Julian
Anastasov.
3) Only process 80211 mesh config header in certain kinds of frames,
from Javier Cardona.
4) 80211 management frame header length needs to be validated, from
Johannes Berg.
5) Don't access free'd SKBs in ath9k driver, from Felix Fietkay.
6) Test for permanent state correctly in VXLAN driver, from Stephen
Hemminger.
7) BNX2X bug fixes from Yaniv Rosner and Dmitry Kravkov.
8) Fix off by one errors in bonding, from Nikolay ALeksandrov.
9) Fix divide by zero in TCP-Illinois congestion control. From Jesper
Dangaard Brouer.
10) TCP metrics code says "Yo dawg, I heard you like sizeof, so I did a
sizeof of a sizeof, so you can size your size" Fix from Julian
Anastasov.
11) Several drivers do mdiobus_free without first doing an
mdiobus_unregister leading to stray pointer references. Fix from
Peter Senna Tschudin.
12) Fix OOPS in l2tp_eth_create() error path, it's another danling
pointer kinda situation. Fix from Tom Parkin.
13) Hardware driven by the vmxnet driver can't handle larger than 16K
fragments, so split them up when necessary. From Eric Dumazet.
14) Handle zero length data length in tcp_send_rcvq() properly. Fix
from Pavel Emelyanov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
tcp-repair: Handle zero-length data put in rcv queue
vmxnet3: must split too big fragments
l2tp: fix oops in l2tp_eth_create() error path
cxgb4: Fix unable to get UP event from the LLD
drivers/net/phy/mdio-bitbang.c: Call mdiobus_unregister before mdiobus_free
drivers/net/ethernet/nxp/lpc_eth.c: Call mdiobus_unregister before mdiobus_free
bnx2x: fix HW initialization using fw 7.8.x
tcp: Fix double sizeof in new tcp_metrics code
net: fix divide by zero in tcp algorithm illinois
net: sctp: Fix typo in net/sctp
bonding: fix second off-by-one error
bonding: fix off-by-one error
bnx2x: Disable FCoE for 57840 since not yet supported by FW
bnx2x: Fix no link on 577xx 10G-baseT
bnx2x: Fix unrecognized SFP+ module after driver is loaded
bnx2x: Fix potential incorrect link speed provision
bnx2x: Restore global registers back to default.
bnx2x: Fix link down in 57712 following LFA
bnx2x: Fix 57810 1G-KR link against certain switches.
ixgbe: PTP get_ts_info missing software support
...
Pavel Emelyanov [Mon, 29 Oct 2012 05:05:33 +0000 (05:05 +0000)]
tcp-repair: Handle zero-length data put in rcv queue
When sending data into a tcp socket in repair state we should check
for the amount of data being 0 explicitly. Otherwise we'll have an skb
with seq == end_seq in rcv queue, but tcp doesn't expect this to happen
(in particular a warn_on in tcp_recvmsg shoots).
Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Reported-by: Giorgos Mavrikas <gmavrikas@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Mon, 29 Oct 2012 23:41:48 +0000 (23:41 +0000)]
l2tp: fix oops in l2tp_eth_create() error path
When creating an L2TPv3 Ethernet session, if register_netdev() should fail for
any reason (for example, automatic naming for "l2tpeth%d" interfaces hits the
32k-interface limit), the netdev is freed in the error path. However, the
l2tp_eth_sess structure's dev pointer is left uncleared, and this results in
l2tp_eth_delete() then attempting to unregister the same netdev later in the
session teardown. This results in an oops.
To avoid this, clear the session dev pointer in the error path.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Murzov [Sat, 13 Oct 2012 00:41:25 +0000 (04:41 +0400)]
ACPI video: Ignore errors after _DOD evaluation.
There are systems where video module known to work fine regardless
of broken _DOD and ignoring returned value here doesn't cause
any issues later. This should fix brightness controls on some laptops.
Devendra Naga [Wed, 31 Oct 2012 08:46:10 +0000 (17:46 +0900)]
thermal: solve compilation errors in rcar_thermal
following were the errors reported
drivers/thermal/rcar_thermal.c: In function ‘rcar_thermal_probe’:
drivers/thermal/rcar_thermal.c:214:10: warning: passing argument 3 of ‘thermal_zone_device_register’ makes integer from pointer without a cast [enabled by default]
include/linux/thermal.h:166:29: note: expected ‘int’ but argument is of type ‘struct rcar_thermal_priv *’
drivers/thermal/rcar_thermal.c:214:10: error: too few arguments to function ‘thermal_zone_device_register’
include/linux/thermal.h:166:29: note: declared here
make[1]: *** [drivers/thermal/rcar_thermal.o] Error 1
make: *** [drivers/thermal/rcar_thermal.o] Error 2
with gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
Vipul Pandya [Mon, 29 Oct 2012 02:02:36 +0000 (02:02 +0000)]
cxgb4: Fix unable to get UP event from the LLD
If T4 configuration file gets loaded from the /lib/firmware/cxgb4/ directory
then offload capabilities of the cards were getting disabled during
initialization. Hence ULDs do not get an UP event from the LLD.
Signed-off-by: Jay Hernandez <jay@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com> Tested-by: Roland Stigge <stigge@antcom.de> Tested-by: Alexandre Pereira da Silva <aletes.xgr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Sun, 28 Oct 2012 21:59:04 +0000 (21:59 +0000)]
bnx2x: fix HW initialization using fw 7.8.x
Since commit 96bed4b9 (use FW 7.8.2) BRB HW block needs to be
initialized using fw values for all devices.
Otherwise ETS on 57712/578xx will not work.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pull more scsi target fixes from Nicholas Bellinger:
"This series is a second round of target fixes for v3.7-rc4 that have
come into target-devel over the last days, and are important enough to
be applied ASAP.
All are being CC'ed to stable. The most important two are:
- target: Re-add explict zeroing of INQUIRY bounce buffer memory to
fix a regression for handling zero-length payloads, a bug that went
during v3.7-rc1, and hit >= v3.6.3 stable. (nab + paolo)
- iscsi-target: Fix a long-standing missed R2T wakeup race in TX
thread processing when using a single queue slot. (Roland)
Thanks to Roland & PureStorage team for helping to track down this
long standing race with iscsi-target single queue slot operation.
Also, the tcm_fc(FCoE) regression bug that was observed recently with
-rc2 code has also been resolved with the cancel_delayed_work() return
bugfix (commit c0158ca64da5: "workqueue: cancel_delayed_work() should
return %false if work item is idle") now in -rc3. Thanks again to Yi
Zou, MDR, Robert Love @ Intel for helping to track this down."
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
target: Fix incorrect usage of nested IRQ spinlocks in ABORT_TASK path
iscsi-target: Fix missed wakeup race in TX thread
target: Avoid integer overflow in se_dev_align_max_sectors()
target: Don't return success from module_init() if setup fails
target: Re-add explict zeroing of INQUIRY bounce buffer memory
Linus Torvalds [Fri, 2 Nov 2012 20:27:52 +0000 (13:27 -0700)]
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"An e-mail address update, and fix a compile error on SPARC"
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: Only include of_match_table with CONFIG_OF_GPIO
hwmon, fam15h_power: Change email address, MAINTAINERS entry
Linus Torvalds [Fri, 2 Nov 2012 20:27:01 +0000 (13:27 -0700)]
Merge tag 'frv-fixes-20121102' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-frv
Pull FRV fixes from David Howells:
"A collection of small fixes for the FRV architecture."
* tag 'frv-fixes-20121102' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-frv:
frv: fix the broken preempt
frv: switch to saner kernel_execve() semantics
FRV: Fix the new-style kernel_thread() stuff
FRV: Fix the preemption handling
FRV: gcc-4.1.2 also inlines weak functions
FRV: Don't objcopy the GNU build_id note
FRV: Add missing linux/export.h #inclusions
Linus Torvalds [Fri, 2 Nov 2012 20:26:11 +0000 (13:26 -0700)]
Merge tag 'stable/for-linus-3.7-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen bugfixes from Konrad Rzeszutek Wilk:
- Use appropriate macros instead of hand-rolling our own (ARM).
- Fixes if FB/KBD closed unexpectedly.
- Fix memory leak in /dev/gntdev ioctl calls.
- Fix overflow check in xenbus_file_write.
- Document cleanup.
- Performance optimization when migrating guests.
* tag 'stable/for-linus-3.7-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/mmu: Use Xen specific TLB flush instead of the generic one.
xen/arm: use the __HVC macro
xen/xenbus: fix overflow check in xenbus_file_write()
xen-kbdfront: handle backend CLOSED without CLOSING
xen-fbfront: handle backend CLOSED without CLOSING
xen/gntdev: don't leak memory from IOCTL_GNTDEV_MAP_GRANT_REF
x86: remove obsolete comment from asm/xen/hypervisor.h
Sasha Levin [Tue, 30 Oct 2012 18:45:57 +0000 (14:45 -0400)]
hashtable: introduce a small and naive hashtable
This hashtable implementation is using hlist buckets to provide a simple
hashtable to prevent it from getting reimplemented all over the kernel.
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
[ Merging this now, so that subsystems can start applying Sasha's
patches that use this - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Fri, 2 Nov 2012 16:05:44 +0000 (12:05 -0400)]
frv: fix the broken preempt
Just get %icc2 into the state we would have after local_irq_disable()
and physical IRQ having happened since then. Then we can simply
use preempt_schedule_irq() and be done with the whole mess.
Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>