Andy Adamson [Mon, 22 Jul 2013 22:41:23 +0000 (18:41 -0400)]
NFSv4.1 Increase NFS4_DEF_SLOT_TABLE_SIZE
Increase NFS4_DEF_SLOT_TABLE_SIZE which is used as the client ca_maxreequests
value in CREATE_SESSION. Current non-dynamic session slot server
implementations use the client ca_maxrequests as a maximum slot number: 64
session slots can handle most workloads.
Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Wed, 24 Jul 2013 16:28:37 +0000 (12:28 -0400)]
NFS: Never use user credentials for lease renewal
Never try to use a non-UID 0 user credential for lease management,
as that credential can change out from under us. The server will
block NFSv4 lease recovery with NFS4ERR_CLID_INUSE.
Since the mechanism to acquire a credential for lease management
is now the same for all minor versions, replace the minor version-
specific callout with a single function.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Wed, 24 Jul 2013 16:28:28 +0000 (12:28 -0400)]
NFS: Use root's credential for lease management when keytab is missing
Commit 05f4c350 "NFS: Discover NFSv4 server trunking when mounting"
Fri Sep 14 17:24:32 2012 introduced Uniform Client String support,
which forces our NFS client to establish a client ID immediately
during a mount operation rather than waiting until a user wants to
open a file.
Normally machine credentials (eg. from a keytab) are used to perform
a mount operation that is protected by Kerberos. Before 05fc350,
SETCLIENTID used a machine credential, or fell back to a regular
user's credential if no keytab is available.
On clients that don't have a keytab, performing SETCLIENTID early
means there's no user credential to fall back on, since no regular
user has kinit'd yet. 05f4c350 seems to have broken the ability
to mount with sec=krb5 on clients that don't have a keytab in
kernels 3.7 - 3.10.
To address this regression, commit 4edaa308 (NFS: Use "krb5i" to
establish NFSv4 state whenever possible), Sat Mar 16 15:56:20 2013,
was merged in 3.10. This commit forces the NFS client to fall back
to AUTH_SYS for lease management operations if no keytab is
available.
Neil Brown noticed that, since root is required to kinit to do a
sec=krb5 mount when a client doesn't have a keytab, we can try to
use root's Kerberos credential before AUTH_SYS.
Now, when determining a principal and flavor to use for lease
management, the NFS client tries in this order:
1. Flavor: AUTH_GSS, krb5i
Principal: service principal (via keytab)
2. Flavor: AUTH_GSS, krb5i
Principal: user principal established for UID 0 (via kinit)
3. Flavor: AUTH_SYS
Principal: UID 0 / GID 0
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Jeff Layton [Fri, 2 Aug 2013 15:39:32 +0000 (11:39 -0400)]
nfs: verify open flags before allowing an atomic open
Currently, you can open a NFSv4 file with O_APPEND|O_DIRECT, but cannot
fcntl(F_SETFL,...) with those flags. This flag combination is explicitly
forbidden on NFSv3 opens, and it seems like it should also be on NFSv4.
Reported-by: Chao Ye <cye@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NFSv4: Fix nfs4_init_uniform_client_string for net namespaces
Commit 6f2ea7f2a (NFS: Add nfs4_unique_id boot parameter) introduces a
boot parameter that allows client administrators to set a string
identifier for use by the EXCHANGE_ID and SETCLIENTID arguments in order
to make them more globally unique.
Unfortunately, that uniquifier is no longer globally unique in the presence
of net namespaces, since each container expects to be able to set up their
own lease when mounting a new NFSv4/4.1 partition.
The fix is to add back in the container-specific hostname in addition to
the unique id.
Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 12 Jul 2013 16:31:45 +0000 (12:31 -0400)]
NFS: Fix return type of nfs4_end_drain_session() stub
Clean up: when NFSv4.1 support is compiled out,
nfs4_end_drain_session() becomes a stub. Make the synopsis of the
stub match the synopsis of the real version of the function.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NFSv4: encode_attrs should not backfill the bitmap and attribute length
The attribute length is already calculated in advance. There is no
reason why we cannot calculate the bitmap in advance too so that
we don't have to play pointer games.
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha
Pull alpha architecture fixes from Matt Turner:
"This contains mostly clean ups and fixes but also an implementation of
atomic64_dec_if_positive() and a pair of new syscalls"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
alpha: Use handle_percpu_irq for the timer interrupt
alpha: Force the user-visible HZ to a constant 1024.
alpha: Don't if-out dp264_device_interrupt.
alpha: Use __builtin_alpha_rpcc
alpha: Fix type compatibility warning for marvel_map_irq
alpha: Generate dwarf2 unwind info for various kernel entry points.
alpha: Implement atomic64_dec_if_positive
alpha: Improve atomic_add_unless
alpha: Modernize lib/mpi/longlong.h
alpha: Add kcmp and finit_module syscalls
alpha: locks: remove unused arch_*_relax operations
alpha: kernel: typo issue, using '1' instead of '11'
alpha: kernel: using memcpy() instead of strcpy()
alpha: Convert print_symbol to %pSR
Merge tag 'trace-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes and cleanups from Steven Rostedt:
"This contains fixes, optimizations and some clean ups
Some of the fixes need to go back to 3.10. They are minor, and deal
mostly with incorrect ref counting in accessing event files.
There was a couple of optimizations that should have perf perform a
bit better when accessing trace events.
And some various clean ups. Some of the clean ups are necessary to
help in a fix to a theoretical race between opening a event file and
deleting that event"
* tag 'trace-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Kill the unbalanced tr->ref++ in tracing_buffers_open()
tracing: Kill trace_array->waiter
tracing: Do not (ab)use trace_seq in event_id_read()
tracing: Simplify the iteration logic in f_start/f_next
tracing: Add ref_data to function and fgraph tracer structs
tracing: Miscellaneous fixes for trace_array ref counting
tracing: Fix error handling to ensure instances can always be removed
tracing/kprobe: Wait for disabling all running kprobe handlers
tracing/perf: Move the PERF_MAX_TRACE_SIZE check into perf_trace_buf_prepare()
tracing/syscall: Avoid perf_trace_buf_*() if sys_data->perf_events is empty
tracing/function: Avoid perf_trace_buf_*() if event_function.perf_events is empty
tracing: Typo fix on ring buffer comments
tracing: Use trace_seq_puts()/trace_seq_putc() where possible
tracing: Use correct config guard CONFIG_STACK_TRACER
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management fixes from Zhang Rui:
"These are fixes collected over the last week, they fixes several
problems caused by the x86_pkg_temp_thermal introduced in 3.11-rc1.
Specifics:
- the x86_pkg_temp_thermal driver causes crash on systems with no
package MSR support as there is a bug in the logic to check
presence of DTHERM and PTS feature together. Added a change so
that when there is no PTS support, module doesn't get loaded.
- fix krealloc() misuse in pkg_temp_thermal_device_add().
If krealloc() returns NULL, it doesn't free the original. Thus if
we want to exit because of the krealloc() failure, we must make
sure the original one is freed.
- The error code path of the x86 package temperature thermal driver's
initialization routine makes an unbalanced call to
get_online_cpus(), which causes subsequent CPU offline operations,
and consequently system suspend, to permanently block in
cpu_hotplug_begin() on systems where get_core_online() returns an
error code.
Remove the extra get_online_cpus() to fix the problem"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
Thermal: Fix lockup of cpu_down()
Thermal: x86_pkg_temp: Limit number of pkg temp zones
Thermal: x86_pkg_temp: fix krealloc() misuse in in pkg_temp_thermal_device_add()
Thermal: x86 package temp thermal crash
Merge tag 'gpio-for-v3.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull gpio fixes from Linus Walleij:
"A first round of GPIO fixes for the v3.11 series:
- OMAP device tree boot fix
- Handle an error condition in the MSM driver
The OMAP patches have been around since around the merge window, but
since they first caused more breakage I let them boil in -next for a
while. These should be fine now"
* tag 'gpio-for-v3.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
drivers: gpio: msm: Fix the error condition for reading ngpio
gpio/omap: fix build error when OF_GPIO is not defined.
gpio/omap: auto request GPIO as input if used as IRQ via DT
gpio/omap: don't create an IRQ mapping for every GPIO on DT
Merge branch 'for-3.11/drivers' of git://git.kernel.dk/linux-block
Pull block IO driver bits from Jens Axboe:
"As I mentioned in the core block pull request, due to real life
circumstances the driver pull request would be late. Now it looks
like -rc2 late... On the plus side, apart form the rsxx update, these
are all things that I could argue could go in later in the cycle as
they are fixes and not features. So even though things are late, it's
not ALL bad.
The pull request contains:
- Updates to bcache, all bug fixes, from Kent.
- A pile of drbd bug fixes (no big features this time!).
- xen blk front/back fixes.
- rsxx driver updates, some of them deferred form 3.10. So should be
well cooked by now"
* 'for-3.11/drivers' of git://git.kernel.dk/linux-block: (63 commits)
bcache: Allocation kthread fixes
bcache: Fix GC_SECTORS_USED() calculation
bcache: Journal replay fix
bcache: Shutdown fix
bcache: Fix a sysfs splat on shutdown
bcache: Advertise that flushes are supported
bcache: check for allocation failures
bcache: Fix a dumb race
bcache: Use standard utility code
bcache: Update email address
bcache: Delete fuzz tester
bcache: Document shrinker reserve better
bcache: FUA fixes
drbd: Allow online change of al-stripes and al-stripe-size
drbd: Constants should be UPPERCASE
drbd: Ignore the exit code of a fence-peer handler if it returns too late
drbd: Fix rcu_read_lock balance on error path
drbd: fix error return code in drbd_init()
drbd: Do not sleep inside rcu
bcache: Refresh usage docs
...
Steven Rostedt [Tue, 16 Jul 2013 18:02:28 +0000 (14:02 -0400)]
Thermal: Fix lockup of cpu_down()
Commit f1a18a105 "Thermal: CPU Package temperature thermal" had code
that did a get_online_cpus(), run a loop and then do a
put_online_cpus(). The problem is that the loop had an error exit that
would skip the put_online_cpus() part.
In the error exit part of the function, it also did a get_online_cpus(),
run a loop and then put_online_cpus(). The only way to get to the error
exit part is with get_online_cpus() already performed. If this error
condition is hit, the system will be prevented from taking CPUs offline.
The process taking the CPU offline will lock up hard.
Removing the get_online_cpus() removes the lockup as the hotplug CPU
refcount is back to zero.
This was bisected with ktest.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Merge tag 'acpi-video-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI video support fixes from Rafael Wysocki:
"I'm sending a separate pull request for this as it may be somewhat
controversial. The breakage addressed here is not really new and the
fixes may not satisfy all users of the affected systems, but we've had
so much back and forth dance in this area over the last several weeks
that I think it's time to actually make some progress.
The source of the problem is that about a year ago we started to tell
BIOSes that we're compatible with Windows 8, which we really need to
do, because some systems shipping with Windows 8 are tested with it
and nothing else, so if we tell their BIOSes that we aren't compatible
with Windows 8, we expose our users to untested BIOS/AML code paths.
However, as it turns out, some Windows 8-specific AML code paths are
not tested either, because Windows 8 actually doesn't use the ACPI
methods containing them, so if we declare Windows 8 compatibility and
attempt to use those ACPI methods, things break. That occurs mostly
in the backlight support area where in particular the _BCM and _BQC
methods are plain unusable on some systems if the OS declares Windows
8 compatibility.
[ The additional twist is that they actually become usable if the OS
says it is not compatible with Windows 8, but that may cause
problems to show up elsewhere ]
Investigation carried out by Matthew Garrett indicates that what
Windows 8 does about backlight is to leave backlight control up to
individual graphics drivers. At least there's evidence that it does
that if the Intel graphics driver is used, so we've decided to follow
Windows 8 in that respect and allow i915 to control backlight (Daniel
likes that part).
The first commit from Aaron Lu makes ACPICA export the variable from
which we can infer whether or not the BIOS believes that we are
compatible with Windows 8.
The second commit from Matthew Garrett prepares the ACPI video driver
by making it initialize the ACPI backlight even if it is not going to
be used afterward (that is needed for backlight control to work on
Thinkpads).
The third commit implements the actual workaround making i915 take
over backlight control if the firmware thinks it's dealing with
Windows 8 and is based on the work of multiple developers, including
Matthew Garrett, Chun-Yi Lee, Seth Forshee, and Aaron Lu.
The final commit from Aaron Lu makes us follow Windows 8 by informing
the firmware through the _DOS method that it should not carry out
automatic brightness changes, so that brightness can be controlled by
GUI.
Hopefully, this approach will allow us to avoid using blacklists of
systems that should not declare Windows 8 compatibility just to avoid
backlight control problems in the future.
- Change from Aaron Lu makes ACPICA export a variable which can be
used by driver code to determine whether or not the BIOS believes
that we are compatible with Windows 8.
- Change from Matthew Garrett makes the ACPI video driver initialize
the ACPI backlight even if it is not going to be used afterward
(that is needed for backlight control to work on Thinkpads).
- Fix from Rafael J Wysocki implements Windows 8 backlight support
workaround making i915 take over bakclight control if the firmware
thinks it's dealing with Windows 8. Based on the work of multiple
developers including Matthew Garrett, Chun-Yi Lee, Seth Forshee,
and Aaron Lu.
- Fix from Aaron Lu makes the kernel follow Windows 8 by informing
the firmware through the _DOS method that it should not carry out
automatic brightness changes, so that brightness can be controlled
by GUI"
* tag 'acpi-video-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / video: no automatic brightness changes by win8-compatible firmware
ACPI / video / i915: No ACPI backlight if firmware expects Windows 8
ACPI / video: Always call acpi_video_init_brightness() on init
ACPICA: expose OSI version
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext[34] tmpfile bugfix from Ted Ts'o:
"Fix regression caused by commit af51a2ac36d1f which added ->tmpfile()
support (along with a similar fix for ext3)"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext3: fix a BUG when opening a file with O_TMPFILE flag
ext4: fix a BUG when opening a file with O_TMPFILE flag
Zheng Liu [Sun, 21 Jul 2013 02:03:20 +0000 (22:03 -0400)]
ext3: fix a BUG when opening a file with O_TMPFILE flag
When we try to open a file with O_TMPFILE flag, we will trigger a bug.
The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and
this check always fails because we set ->i_nlink = 1 in
inode_init_always(). We can use the following program to trigger it:
Here we couldn't call clear_nlink() directly because in d_tmpfile() we
will call inode_dec_link_count() to decrease ->i_nlink. So this commit
tries to call d_tmpfile() before ext4_orphan_add() to fix this problem.
Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk>
Zheng Liu [Sun, 21 Jul 2013 01:58:38 +0000 (21:58 -0400)]
ext4: fix a BUG when opening a file with O_TMPFILE flag
When we try to open a file with O_TMPFILE flag, we will trigger a bug.
The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and
this check always fails because we set ->i_nlink = 1 in
inode_init_always(). We can use the following program to trigger it:
Here we couldn't call clear_nlink() directly because in d_tmpfile() we
will call inode_dec_link_count() to decrease ->i_nlink. So this commit
tries to call d_tmpfile() before ext4_orphan_add() to fix this problem.
Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Tested-by: Darrick J. Wong <darrick.wong@oracle.com> Tested-by: Dave Jones <davej@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Merge tag 'staging-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging tree fixes from Greg KH:
"Here are a few iio driver fixes for 3.11-rc2. They are still spread
across drivers/iio and drivers/staging/iio so they are coming in
through this tree.
I've also removed the drivers/staging/csr/ driver as the developers
who originally sent it to me have moved on to other companies, and CSR
still will not send us the specs for the device, making the driver
pretty much obsolete and impossible to fix up. Deleting it now
prevents people from sending in lots of tiny codingsyle fixes that
will never go anywhere.
It also helps to offset the large lustre filesystem merge that
happened in 3.11-rc1 in the overall 3.11.0 diffstat. :)"
* tag 'staging-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: csr: remove driver
iio: lps331ap: Fix wrong in_pressure_scale output value
iio staging: fix lis3l02dq, read error handling
staging:iio:ad7291: add missing .driver_module to struct iio_info
iio: ti_am335x_adc: add missing .driver_module to struct iio_info
iio: mxs-lradc: Remove useless check in read_raw
iio: mxs-lradc: Fix misuse of iio->trig
iio: inkern: fix iio_convert_raw_to_processed_unlocked
iio: Fix iio_channel_has_info
iio:trigger: device_unregister->device_del to avoid double free
iio: dac: ad7303: fix error return code in ad7303_probe()
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"The sget() one is a long-standing bug and will need to go into -stable
(in fact, it had been originally caught in RHEL6), the other two are
3.11-only"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: constify dentry parameter in d_count()
livelock avoidance in sget()
allow O_TMPFILE to work with O_WRONLY
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 bugfixes from Ted Ts'o:
"Fixes for 3.11-rc2, sent at 5pm, in the professoinal style. :-)"
I'm not sure I like this new level of "professionalism".
9-5, people, 9-5.
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: call ext4_es_lru_add() after handling cache miss
ext4: yield during large unlinks
ext4: make the extent_status code more robust against ENOMEM failures
ext4: simplify calculation of blocks to free on error
ext4: fix error handling in ext4_ext_truncate()
Merge tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Fix a regression against NFSv4 FreeBSD servers when creating a new
file
- Fix another regression in rpc_client_register()
* tag 'nfs-for-3.11-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: Fix a regression against the FreeBSD server
SUNRPC: Fix another issue with rpc_client_register()
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next
Pull btrfs fixes from Josef Bacik:
"I'm playing the role of Chris Mason this week while he's on vacation.
There are a few critical fixes for btrfs here, all regressions and
have been tested well"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-next:
Btrfs: fix wrong write offset when replacing a device
Btrfs: re-add root to dead root list if we stop dropping it
Btrfs: fix lock leak when resuming snapshot deletion
Btrfs: update drop progress before stopping snapshot dropping
gpio/omap: fix build error when OF_GPIO is not defined.
The OMAP GPIO driver check if the chip has an associated
Device Tree node using the struct gpio_chip of_node member.
But this is only build if CONFIG_OF_GPIO is defined which
leads to the following error when using omap1_defconfig:
linux/drivers/gpio/gpio-omap.c: In function 'omap_gpio_chip_init':
linux/drivers/gpio/gpio-omap.c:1080:17: error: 'struct gpio_chip' has no member named 'of_node'
linux/drivers/gpio/gpio-omap.c: In function 'omap_gpio_irq_map':
linux/drivers/gpio/gpio-omap.c:1116:16: error: 'struct gpio_chip' has no member named 'of_node'
Reported-by: Kevin Hilman <khilman@linaro.org> Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
gpio/omap: auto request GPIO as input if used as IRQ via DT
When an OMAP GPIO is used as an IRQ line, a call to gpio_request()
has to be made to initialize the OMAP GPIO bank before a driver
request the IRQ. Otherwise the call to request_irq() fails.
Drives should not be aware of this neither care wether an IRQ line
is a GPIO or not. They should just request the IRQ and this has to
be handled by the irq_chip driver.
With the current OMAP GPIO DT binding, if we define:
The GPIO is correctly mapped as an IRQ but a call to gpio_request()
is never made. Since a call to the custom IRQ domain .map function
handler is made for each GPIO used as an IRQ, the GPIO can be setup
and configured as input there automatically.
Changes since v3:
- Use bank->chip.of_node instead of_have_populated_dt() to check
DT or legacy boot as suggested by Jean-Christophe PLAGNIOL-VILLARD
- Add a comment that this is just a temporary solution until and
that it has to be removed once is handled by the IRQ core.
Changes since v2:
- Only make the call to gpio_request_one() conditional in the DT
case as suggested by Grant Likely.
Changes since v1:
- Split the irq domain mapping function handler and the GPIO
request in two different patches.
Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Tested-by: Enric Balletbo i Serra <eballetbo@gmail.com> Acked-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
gpio/omap: don't create an IRQ mapping for every GPIO on DT
When a GPIO is defined as an interrupt line using Device
Tree, a call to irq_create_of_mapping() is made that calls
irq_create_mapping(). So, is not necessary to do the mapping
for all OMAP GPIO lines and explicitly call irq_create_mapping()
on the driver probe() when booting with Device Tree.
Add a custom IRQ domain .map function handler that will be
called by irq_create_mapping() to map the GPIO lines used as IRQ.
This also allows to execute needed setup code such as configuring
a GPIO as input and enabling the GPIO bank.
Changes since v3:
- Use bank->chip.of_node instead of_have_populated_dt() to check
DT or legacy boot as suggested by Jean-Christophe PLAGNIOL-VILLARD
Changes since v2:
- Unconditionally do the IRQ setup in the .map() function and
only call irq_create_mapping() in the gpio chip init to avoid
code duplication as suggested by Grant Likely.
Changes since v1:
- Split the addition of the .map function handler and the
automatic gpio request in two different patches.
- Add GPIO IRQ setup logic to the irq domain mapping function.
- Only call irq_create_mapping for every GPIO on legacy boot.
- Only setup a GPIO IRQ on the .map function for DeviceTree boot.
Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Tested-by: Enric Balletbo i Serra <eballetbo@gmail.com> Acked-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Al Viro [Fri, 19 Jul 2013 23:13:55 +0000 (03:13 +0400)]
livelock avoidance in sget()
Eric Sandeen has found a nasty livelock in sget() - take a mount(2) about
to fail. The superblock is on ->fs_supers, ->s_umount is held exclusive,
->s_active is 1. Along comes two more processes, trying to mount the same
thing; sget() in each is picking that superblock, bumping ->s_count and
trying to grab ->s_umount. ->s_active is 3 now. Original mount(2)
finally gets to deactivate_locked_super() on failure; ->s_active is 2,
superblock is still ->fs_supers because shutdown will *not* happen until
->s_active hits 0. ->s_umount is dropped and now we have two processes
chasing each other:
s_active = 2, A acquired ->s_umount, B blocked
A sees that the damn thing is stillborn, does deactivate_locked_super()
s_active = 1, A drops ->s_umount, B gets it
A restarts the search and finds the same superblock. And bumps it ->s_active.
s_active = 2, B holds ->s_umount, A blocked on trying to get it
... and we are in the earlier situation with A and B switched places.
The root cause, of course, is that ->s_active should not grow until we'd
got MS_BORN. Then failing ->mount() will have deactivate_locked_super()
shut the damn thing down. Fortunately, it's easy to do - the key point
is that grab_super() is called only for superblocks currently on ->fs_supers,
so it can bump ->s_count and grab ->s_umount first, then check MS_BORN and
bump ->s_active; we must never increment ->s_count for superblocks past
->kill_sb(), but grab_super() is never called for those.
The bug is pretty old; we would've caught it by now, if not for accidental
exclusion between sget() for block filesystems; the things like cgroup or
e.g. mtd-based filesystems don't have anything of that sort, so they get
bitten. The right way to deal with that is obviously to fix sget()...
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML fixes from Richard Weinberger:
"Special thanks goes to Toralf Föster for continuously testing UML and
reporting issues!"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: remove dead code
um: siginfo cleanup
uml: Fix which_tmpdir failure when /dev/shm is a symlink, and in other edge cases
um: Fix wait_stub_done() error handling
um: Mark stub pages mapping with VM_PFNMAP
um: Fix return value of strnlen_user()
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
"MIPS fixes for 3.11. Half of then is for Netlogic the remainder
touches things across arch/mips.
Nothing really dramatic and by rc1 standards MIPS will be in fairly
good shape with this applied. Tested by building all MIPS defconfigs
of which with this pull request four platforms won't build. And yes,
it boots also on my favorite test systems"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: kvm: Kconfig: Drop HAVE_KVM dependency from VIRTUALIZATION
MIPS: Octeon: Fix DT pruning bug with pip ports
MIPS: KVM: Mark KVM_GUEST (T&E KVM) as BROKEN_ON_SMP
MIPS: tlbex: fix broken build in v3.11-rc1
MIPS: Netlogic: Add XLP PIC irqdomain
MIPS: Netlogic: Fix USB block's coherent DMA mask
MIPS: tlbex: Fix typo in r3000 tlb store handler
MIPS: BMIPS: Fix thinko to release slave TP from reset
MIPS: Delete dead invocation of exception_exit().
Merge tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64
Pull arm64 fixes from Catalin Marinas:
- Post -rc1 update to the common reboot infrastructure.
- Fixes (user cache maintenance fault handling, !COMPAT compilation,
CPU online and interrupt hanlding).
* tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
arm64: use common reboot infrastructure
arm64: mm: don't treat user cache maintenance faults as writes
arm64: add '#ifdef CONFIG_COMPAT' for aarch32_break_handler()
arm64: Only enable local interrupts after the CPU is marked online
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"An update for the BFP jit to the latest and greatest, two patches to
get kdump working again, the random-abort ptrace extention for
transactional execution, the z90crypt module alias for ap and a tiny
cleanup"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/zcrypt: Alias for new zcrypt device driver base module
s390/kdump: Allow copy_oldmem_page() copy to virtual memory
s390/kdump: Disable mmap for s390
s390/bpf,jit: add pkt_type support
s390/bpf,jit: address randomize and write protect jit code
s390/bpf,jit: use generic jit dumper
s390/bpf,jit: call module_free() from any context
s390/qdio: remove unused variable
s390/ptrace: PTRACE_TE_ABORT_RAND
alpha: Fix type compatibility warning for marvel_map_irq
Acked-by: Phil Carmody <pc+lkml@asdf.org> Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
Use ll/sc loops instead of C loops around cmpxchg.
Update the atomic64_add_unless block comment to match the code.
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
Remove the compile warning for __udiv_qrnnd not having a prototype.
Use the __builtin_alpha_umulh introduced in gcc 4.0.
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
The arch_{spin,read,write}_relax macros are not used anywhere in the
kernel and are typically just aliases for cpu_relax().
This patch removes the unused definitions for Alpha.
Cc: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
Stefan Behrens [Thu, 4 Jul 2013 14:14:23 +0000 (16:14 +0200)]
Btrfs: fix wrong write offset when replacing a device
Miao Xie reported the following issue:
The filesystem was corrupted after we did a device replace.
Steps to reproduce:
# mkfs.btrfs -f -m single -d raid10 <device0>..<device3>
# mount <device0> <mnt>
# btrfs replace start -rfB 1 <device4> <mnt>
# umount <mnt>
# btrfsck <device4>
The reason for the issue is that we changed the write offset by mistake,
introduced by commit 625f1c8dc.
We read the data from the source device at first, and then write the
data into the corresponding place of the new device. In order to
implement the "-r" option, the source location is remapped using
btrfs_map_block(). The read takes place on the mapped location, and
the write needs to take place on the unmapped location. Currently
the write is using the mapped location, and this commit changes it
back by undoing the change to the write address that the aforementioned
commit added by mistake.
Reported-by: Miao Xie <miaox@cn.fujitsu.com> Cc: <stable@vger.kernel.org> # 3.10+ Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Josef Bacik [Wed, 17 Jul 2013 23:30:20 +0000 (19:30 -0400)]
Btrfs: re-add root to dead root list if we stop dropping it
If we stop dropping a root for whatever reason we need to add it back to the
dead root list so that we will re-start the dropping next transaction commit.
The other case this happens is if we recover a drop because we will add a root
without adding it to the fs radix tree, so we can leak it's root and commit root
extent buffer, adding this to the dead root list makes this cleanup happen.
Thanks,
Cc: stable@vger.kernel.org Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Josef Bacik [Mon, 15 Jul 2013 16:41:42 +0000 (12:41 -0400)]
Btrfs: fix lock leak when resuming snapshot deletion
We aren't setting path->locks[level] when we resume a snapshot deletion which
means we won't unlock the buffer when we free the path. This causes deadlocks
if we happen to re-allocate the block before we've evicted the extent buffer
from cache. Thanks,
Cc: stable@vger.kernel.org Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Josef Bacik [Mon, 15 Jul 2013 15:57:06 +0000 (11:57 -0400)]
Btrfs: update drop progress before stopping snapshot dropping
Alex pointed out a problem and fix that exists in the drop one snapshot at a
time patch. If we decide we need to exit for whatever reason (umount for
example) we will just exit the snapshot dropping without updating the drop
progress. So the next time we go to resume we will BUG_ON() because we can't
find the extent we left off at because we never updated it. This patch fixes
the problem.
Cc: stable@vger.kernel.org Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
tracing: Kill the unbalanced tr->ref++ in tracing_buffers_open()
tracing_buffers_open() does trace_array_get() and then it wrongly
inrcements tr->ref again under trace_types_lock. This means that
every caller leaks trace_array:
# cd /sys/kernel/debug/tracing/
# mkdir instances/X
# true < instances/X/per_cpu/cpu0/trace_pipe_raw
# rmdir instances/X
rmdir: failed to remove `instances/X': Device or resource busy
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fix from Paolo Bonzini:
"This single patch fixes a regression caused by one of the
optimizations introduced in 3.11, which is generally visible only on
AMD processors"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: MMU: avoid fast page fault fixing mmio page fault
Merge tag 'pm+acpi-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"These are fixes collected over the last week, most importnatly two
cpufreq reverts fixing regressions introduced in 3.10, an autoseelp
fix preventing systems using it from crashing during shutdown and two
ACPI scan fixes related to hotplug.
Specifics:
- Two cpufreq commits from the 3.10 cycle introduced regressions.
The first of them was buggy (it did way much more than it needed to
do) and the second one attempted to fix an issue introduced by the
first one. Fixes from Srivatsa S Bhat revert both.
- If autosleep triggers during system shutdown and the shutdown
callbacks of some device drivers have been called already, it may
crash the system. Fix from Liu Shuo prevents that from happening
by making try_to_suspend() check system_state.
- The ACPI memory hotplug driver doesn't clear its driver_data on
errors which may cause a NULL poiter dereference to happen later.
Fix from Toshi Kani.
- The ACPI namespace scanning code should not try to attach scan
handlers to device objects that have them already, which may
confuse things quite a bit, and it should rescan the whole
namespace branch starting at the given node after receiving a bus
check notify event even if the device at that particular node has
been discovered already. Fixes from Rafael J Wysocki.
- New ACPI video blacklist entry for a system whose initial backlight
setting from the BIOS doesn't make sense. From Lan Tianyu.
- Garbage string output avoindance for ACPI PNP from Liu Shuo.
- Two Kconfig fixes for issues introduced recently in the s3c24xx
cpufreq driver (when moving the driver to drivers/cpufreq) from
Paul Bolle.
- Trivial comment fix in pm_wakeup.h from Chanwoo Choi"
* tag 'pm+acpi-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / video: ignore BIOS initial backlight value for Fujitsu E753
PNP / ACPI: avoid garbage in resource name
cpufreq: Revert commit 2f7021a8 to fix CPU hotplug regression
cpufreq: s3c24xx: fix "depends on ARM_S3C24XX" in Kconfig
cpufreq: s3c24xx: rename CONFIG_CPU_FREQ_S3C24XX_DEBUGFS
PM / Sleep: Fix comment typo in pm_wakeup.h
PM / Sleep: avoid 'autosleep' in shutdown progress
cpufreq: Revert commit a66b2e to fix suspend/resume regression
ACPI / memhotplug: Fix a stale pointer in error path
ACPI / scan: Always call acpi_bus_scan() for bus check notifications
ACPI / scan: Do not try to attach scan handlers to devices having them
Marc Zyngier [Thu, 11 Jul 2013 11:13:00 +0000 (12:13 +0100)]
arm64: use common reboot infrastructure
Commit 7b6d864b48d9 (reboot: arm: change reboot_mode to use enum
reboot_mode) changed the way reboot is handled on arm, which has a
direct impact on arm64 as we share the reset driver on the VE platform.
The obvious fix is to move arm64 to use the same infrastructure.
Will Deacon [Fri, 19 Jul 2013 14:37:12 +0000 (15:37 +0100)]
arm64: mm: don't treat user cache maintenance faults as writes
On arm64, cache maintenance faults appear as data aborts with the CM
bit set in the ESR. The WnR bit, usually used to distinguish between
faulting loads and stores, always reads as 1 and (slightly confusingly)
the instructions are treated as reads by the architecture.
This patch fixes our fault handling code to treat cache maintenance
faults in the same way as loads.
Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Chen Gang [Mon, 24 Jun 2013 09:27:49 +0000 (10:27 +0100)]
arm64: add '#ifdef CONFIG_COMPAT' for aarch32_break_handler()
If 'COMPAT' not defined, aarch32_break_handler() cannot pass compiling,
and it can work independent with 'COMPAT', so remove dummy definition.
The related error:
arch/arm64/kernel/debug-monitors.c:249:5: error: redefinition of ‘aarch32_break_handler’
In file included from arch/arm64/kernel/debug-monitors.c:29:0:
/root/linux-next/arch/arm64/include/asm/debug-monitors.h:89:12: note: previous definition of ‘aarch32_break_handler’ was here
Signed-off-by: Chen Gang <gang.chen@asianux.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
arm64: Only enable local interrupts after the CPU is marked online
There is a slight chance that (timer) interrupts are triggered before a
secondary CPU has been marked online with implications on softirq thread
affinity.
Markos Chandras [Tue, 11 Jun 2013 09:02:33 +0000 (09:02 +0000)]
MIPS: kvm: Kconfig: Drop HAVE_KVM dependency from VIRTUALIZATION
Virtualization does not always need KVM capabilities so drop the
dependency. The KVM symbol already depends on HAVE_KVM.
Fixes the following problem on a randconfig:
warning: (REMOTEPROC && RPMSG) selects VIRTUALIZATION which has unmet direct
dependencies (HAVE_KVM)
warning: (REMOTEPROC && RPMSG) selects VIRTUALIZATION which has unmet
direct dependencies (HAVE_KVM)
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com> Acked-by: Steven J. Hill <Steven.Hill@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5443/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Currently we use both struct siginfo and siginfo_t.
Let's use struct siginfo internally to avoid ongoing
compiler warning. We are allowed to do so because
struct siginfo and siginfo_t are equivalent.
Signed-off-by: Richard Weinberger <richard@nod.at>
During the pruning of the device tree octeon_fdt_pip_iface() is called
for each PIP interface and every port up to the port count is removed
from the device tree. However, the count was set to the return value of
cvmx_helper_interface_enumerate() which doesn't actually return the
count but just returns zero on success. This effectively removed *all*
ports from the tree.
Use cvmx_helper_ports_on_interface() instead to fix this. This
successfully restores the 3 ports of my ERLite-3 and fixes the "kernel
assigns random MAC addresses" issue.
Signed-off-by: Faidon Liambotis <paravoid@debian.org> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: David Daney <david.daney@cavium.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5587/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
uml: Fix which_tmpdir failure when /dev/shm is a symlink, and in other edge cases
which_tmpdir did the wrong thing if /dev/shm was a symlink (e.g., to /run/shm),
if there were multiple mounts on top of each other, if the mount(s) were
obscured by a later mount, or if /dev/shm was a prefix of another mount point.
This fixes these cases. Applies to 3.9.6.
Signed-off-by: Tristan Schmelcher <tschmelcher@google.com> Signed-off-by: Richard Weinberger <richard@nod.at>
James Hogan [Fri, 12 Jul 2013 10:26:11 +0000 (10:26 +0000)]
MIPS: KVM: Mark KVM_GUEST (T&E KVM) as BROKEN_ON_SMP
Make KVM_GUEST depend on BROKEN_ON_SMP so that it cannot be enabled with
SMP.
SMP kernels use ll/sc instructions for an atomic section in the tlb fill
handler, with a tlbp instruction contained in the middle. This cannot be
emulated with trap & emulate KVM because the tlbp instruction traps and
the eret to return to the guest code clears the LLbit which makes the sc
instruction always fail.
Aaro Koskinen [Mon, 15 Jul 2013 07:21:57 +0000 (07:21 +0000)]
MIPS: tlbex: fix broken build in v3.11-rc1
Commit 6ba045f9fbdafb48da42aa8576ea7a3980443136 (MIPS: Move generated code
to .text for microMIPS) deleted tlbmiss_handler_setup_pgd_array, but some
references were not converted. Fix that to enable building a MIPS kernel.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Jayachandran C. <jchandra@broadcom.com> Acked-by: David Daney <david.daney@cavium.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5589/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Jayachandran C [Wed, 17 Jul 2013 10:27:26 +0000 (10:27 +0000)]
MIPS: Netlogic: Add XLP PIC irqdomain
Add a legacy irq domain for the XLP PIC interrupts. This will be used
when interrupts are assigned from the device tree. This change is required
after commit c5cdc67 "irqdomain: Remove temporary MIPS workaround code".
Signed-off-by: Jayachandran C <jchandra@broadcom.com> Cc: linux-mips@linux-mips.org Cc: Jayachandran C <jchandra@broadcom.com>
Patchwork: https://patchwork.linux-mips.org/patch/5597/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The on-chip USB controller on Netlogic XLP does not suppport
DMA beyond 32-bit physical address. Set the coherent_dma_mask
of the USB in its PCI fixup to support this.
Tony Wu [Thu, 18 Jul 2013 09:45:47 +0000 (09:45 +0000)]
MIPS: tlbex: Fix typo in r3000 tlb store handler
commit 6ba045f (MIPS: Move generated code to .text for microMIPS)
causes a panic at boot. The handler builder should test against
handle_tlbs_end, not handle_tlbs.
Signed-off-by: Tony Wu <tung7970@gmail.com> Acked-by: Jayachandran C. <jchandra@broadcom.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5600/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MIPS: BMIPS: Fix thinko to release slave TP from reset
Commit 4df715aa ["MIPS: BMIPS: support booting from physical CPU other
than 0"] introduced a thinko which will prevents slave CPUs from being
released from reset on systems where we boot from TP0. The problem is
that we are checking whether the slave CPU logical CPU map is 0, which
is never true for systems booting from TP0, so we do not release the
slave TP from reset and we are just stuck. Fix this by properly checking
that the CPU we intend to boot really is the physical slave CPU (logical
and physical value being 1).
s390/zcrypt: Alias for new zcrypt device driver base module
The zcrypt device driver has been split into base/bus module, api-module,
card modules and message type modules. The base module has been renamed
from z90crypt to ap.
A module alias (with the well-known z90crypt identifier) will be introduced
that enable users to use their existing way to load the zcrypt device driver.
Signed-off-by: Ingo Tuchscherer <ingo.tuchscherer@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull networking fixes from David Miller:
"A couple interesting SKB fragment handling fixes, plus the usual small
bits here and there:
1) Fix 64-bit divide build failure on 32-bit platforms in mlx5, from
Tim Gardner.
2) Get rid of a stupid reimplementation on "%*phC" in our sysfs MAC
address printing helper.
3) Fix NETIF_F_SG capability advertisement in hyperv driver, if the
device can't do checksumming offloads then it shouldn't say it can
do SG either. From Haiyang Zhang.
4) bgmac needs to depend on PHYLIB, from Hauke Mehrtens.
5) Don't leak DMA mappings on mapping failures, from Neil Horman.
6) We need to reset the transport header of SKBs in ipv4 before we
attempt to perform early socket demux, just like ipv6 does. From
Eric Dumazet.
7) Add missing locking on vxlan device removal, from Stephen
Hemminger.
8) xen-netfront has to make two passes over an SKB to prepare it for
transfer. One pass calculates the number of slots needed, the
second massages the SKB and fills the slots. Unfortunately, the
first pass doesn't calculate the number of slots properly so we
can end up trying to build a MAX_SKB_FRAGS + 1 SKB which doesn't
work out so well. Fix from Jan Beulich with help and discussion
with several others.
9) Fix a similar problem in tun and macvtap, which have to split up
scatter-gather elements at PAGE_SIZE boundaries. Don't do
zerocopy if it would result in a > MAX_SKB_FRAGS skb. Fixes from
Jason Wang.
10) On receive, once we've decoded the VLAN state completely, clear
skb->vlan_tci. Otherwise demuxed tunnels underneath can trigger
the VLAN code again, corrupting the packet. Fix from Eric
Dumazet"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
vlan: fix a race in egress prio management
vlan: mask vlan prio bits
macvtap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS
tuntap: do not zerocopy if iov needs more pages than MAX_SKB_FRAGS
pkt_sched: sch_qfq: remove a source of high packet delay/jitter
xen-netfront: pull on receive skb may need to happen earlier
vxlan: add necessary locking on device removal
hyperv: Fix the NETIF_F_SG flag setting in netvsc
net: Fix sysfs_format_mac() code duplication.
be2net: Fix to avoid hardware workaround when not needed
macvtap: do not assume 802.1Q when send vlan packets
macvtap: fix the missing ret value of TUNSETQUEUE
ipv4: set transport header earlier
mlx5 core: Fix __udivdi3 when compiling for 32 bit arches
bgmac: add dependency to phylib
net/irda: fixed style issues in irlan_eth
ethtool: fixed trailing statements in ethtool
ndisc: bool initializations should use true and false
atl1e: unmap partially mapped skb on dma error and free skb
tracing: Do not (ab)use trace_seq in event_id_read()
event_id_read() has no reason to kmalloc "struct trace_seq"
(more than PAGE_SIZE!), it can use a small buffer instead.
Note: "if (*ppos) return 0" looks strange and even wrong,
simple_read_from_buffer() handles ppos != 0 case corrrectly.
And it seems that almost every user of trace_seq in this file
should be converted too. Unless you use seq_open(), trace_seq
buys nothing compared to the raw buffer, but it needs a bit
more memory and code.
tracing: Simplify the iteration logic in f_start/f_next
f_next() looks overcomplicated, and it is not strictly correct
even if this doesn't matter.
Say, FORMAT_FIELD_SEPERATOR should not return NULL (means EOF)
if trace_get_fields() returns an empty list, we should simply
advance to FORMAT_PRINTFMT as we do when we find the end of list.
1. Change f_next() to return "struct list_head *" rather than
"ftrace_event_field *", and change f_show() to do list_entry().
This simplifies the code a bit, only f_show() needs to know
about ftrace_event_field, and f_next() can play with ->prev
directly
2. Change f_next() to not play with ->prev / return inside the
switch() statement. It can simply set node = head/common_head,
the prev-or-advance-to-the-next-magic below does all work.
While at it. f_start() looks overcomplicated too. I don't think
*pos == 0 makes sense as a separate case, just change this code
to do "while" instead of "do/while".
The patch also moves f_start() down, close to f_stop(). This is
purely cosmetic, just to make the locking added by the next patch
more clear/visible.
tracing: Add ref_data to function and fgraph tracer structs
The selftest for function and function graph tracers are defined as
__init, as they are only executed at boot up. The "tracer" structs
that are associated to those tracers are not setup as __init as they
are used after boot. To stop mismatch warnings, those structures
need to be annotated with __ref_data.
Currently, the tracer structures are defined to __read_mostly, as they
do not really change. But in the future they should be converted to
consts, but that will take a little work because they have a "next"
pointer that gets updated when they are registered. That will have to
wait till the next major release.
Alexander Z Lam [Thu, 18 Jul 2013 18:18:44 +0000 (11:18 -0700)]
tracing: Miscellaneous fixes for trace_array ref counting
Some error paths did not handle ref counting properly, and some trace files need
ref counting.
Link: http://lkml.kernel.org/r/1374171524-11948-1-git-send-email-azl@google.com Cc: stable@vger.kernel.org # 3.10 Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: David Sharp <dhsharp@google.com> Cc: Alexander Z Lam <lambchop468@gmail.com> Signed-off-by: Alexander Z Lam <azl@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Alexander Z Lam [Thu, 11 Jul 2013 00:34:34 +0000 (17:34 -0700)]
tracing: Fix error handling to ensure instances can always be removed
Remove debugfs directories for tracing instances during creation if an error
occurs causing the trace_array for that instance to not be added to
ftrace_trace_arrays. If the directory continues to exist after the error, it
cannot be removed because the respective trace_array is not in
ftrace_trace_arrays.
Link: http://lkml.kernel.org/r/1373502874-1706-2-git-send-email-azl@google.com Cc: stable@vger.kernel.org # 3.10 Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Cc: David Sharp <dhsharp@google.com> Cc: Alexander Z Lam <lambchop468@gmail.com> Signed-off-by: Alexander Z Lam <azl@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
tracing/kprobe: Wait for disabling all running kprobe handlers
Wait for disabling all running kprobe handlers when a kprobe
event is disabled, since the caller, trace_remove_event_call()
supposes that a removing event is disabled completely by
disabling the event.
With this change, ftrace can ensure that there is no running
event handlers after disabling it.
Oleg Nesterov [Mon, 17 Jun 2013 17:02:07 +0000 (19:02 +0200)]
tracing/syscall: Avoid perf_trace_buf_*() if sys_data->perf_events is empty
perf_trace_buf_prepare() + perf_trace_buf_submit(head, task => NULL)
make no sense if hlist_empty(head). Change perf_syscall_enter/exit()
to check sys_data->{enter,exit}_event->perf_events beforehand.
Oleg Nesterov [Mon, 17 Jun 2013 17:02:04 +0000 (19:02 +0200)]
tracing/function: Avoid perf_trace_buf_*() if event_function.perf_events is empty
perf_trace_buf_prepare() + perf_trace_buf_submit(head, task => NULL)
make no sense if hlist_empty(head). Change perf_ftrace_function_call()
to check event_function.perf_events beforehand.
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
"Trying again to get the fixes queue, including the fixed IDT alignment
patch.
The UEFI patch is by far the biggest issue at hand: it is currently
causing quite a few machines to boot. Which is sad, because the only
reason they would is because their BIOSes touch memory that has
already been freed. The other major issue is that we finally have
tracked down the root cause of a significant number of machines
failing to suspend/resume"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Make sure IDT is page aligned
x86, suspend: Handle CPUs which fail to #GP on RDMSR
x86/platform/ce4100: Add header file for reboot type
Revert "UEFI: Don't pass boot services regions to SetVirtualAddressMap()"
efivars: check for EFI_RUNTIME_SERVICES
Merge tag 'md-3.11-fixes' of git://neil.brown.name/md
Pull md bug fixes from NeilBrown:
"Sorry boss, back at work now boss. Here's them nice shiny patches ya
wanted. All nicely tagged and justified for -stable and everyfing:
Three bug fixes for md in 3.10
3.10 wasn't a good release for md. The bio changes left a couple of
bugs, and an md "fix" created another one.
These three patches appear to fix the issues and have been tagged for
-stable"
* tag 'md-3.11-fixes' of git://neil.brown.name/md:
md/raid1: fix bio handling problems in process_checks()
md: Remove recent change which allows devices to skip recovery.
md/raid10: fix two problems with RAID10 resync.
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"You'll be terribly disappointed in this, I'm not trying to sneak any
features in or anything, its mostly radeon and intel fixes, a couple
of ARM driver fixes"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (34 commits)
drm/radeon/dpm: add debugfs support for RS780/RS880 (v3)
drm/radeon/dpm/atom: fix broken gcc harder
drm/radeon/dpm/atom: restructure logic to work around a compiler bug
drm/radeon/dpm: fix atom vram table parsing
drm/radeon: fix an endian bug in atom table parsing
drm/radeon: add a module parameter to disable aspm
drm/rcar-du: Use the GEM PRIME helpers
drm/shmobile: Use the GEM PRIME helpers
uvesafb: Really allow mtrr being 0, as documented and warn()ed
radeon kms: do not flush uninitialized hotplug work
drm/radeon/dpm/sumo: handle boost states properly when forcing a perf level
drm/radeon: align VM PTBs (Page Table Blocks) to 32K
drm/radeon: allow selection of alignment in the sub-allocator
drm/radeon: never unpin UVD bo v3
drm/radeon: fix UVD fence emit
drm/radeon: add fault decode function for CIK
drm/radeon: add fault decode function for SI (v2)
drm/radeon: add fault decode function for cayman/TN (v2)
drm/radeon: use radeon device for request firmware
drm/radeon: add missing ttm_eu_backoff_reservation to radeon_bo_list_validate
...
Eric Dumazet [Thu, 18 Jul 2013 16:35:10 +0000 (09:35 -0700)]
vlan: fix a race in egress prio management
egress_priority_map[] hash table updates are protected by rtnl,
and we never remove elements until device is dismantled.
We have to make sure that before inserting an new element in hash table,
all its fields are committed to memory or else another cpu could
find corrupt values and crash.
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 18 Jul 2013 14:19:26 +0000 (07:19 -0700)]
vlan: mask vlan prio bits
In commit 48cc32d38a52d0b68f91a171a8d00531edc6a46e
("vlan: don't deliver frames for unknown vlans to protocols")
Florian made sure we set pkt_type to PACKET_OTHERHOST
if the vlan id is set and we could find a vlan device for this
particular id.
But we also have a problem if prio bits are set.
Steinar reported an issue on a router receiving IPv6 frames with a
vlan tag of 4000 (id 0, prio 2), and tunneled into a sit device,
because skb->vlan_tci is set.
Forwarded frame is completely corrupted : We can see (8100:4000)
being inserted in the middle of IPv6 source address :
It seems we are not really ready to properly cope with this right now.
We can probably do better in future kernels :
vlan_get_ingress_priority() should be a netdev property instead of
a per vlan_dev one.
For stable kernels, lets clear vlan_tci to fix the bugs.
Reported-by: Steinar H. Gunderson <sesse@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>