Jayachandran C [Wed, 25 Sep 2013 10:58:04 +0000 (16:28 +0530)]
MIPS: mm: Use scratch for PGD when !CONFIG_MIPS_PGD_C0_CONTEXT
Allow usage of scratch register for current pgd even when
MIPS_PGD_C0_CONTEXT is not configured. MIPS_PGD_C0_CONTEXT is set
for 64r2 platforms to indicate availability of Xcontext for saving
cpuid, thus freeing Context to be used for saving PGD. This option
was also tied to using a scratch register for storing PGD.
This commit will allow usage of scratch register to store the current
pgd if one can be allocated for the platform, even when
MIPS_PGD_C0_CONTEXT is not set. The cpuid will be kept in the CP0
Context register in this case.
The code to store the current pgd for the TLB miss handler is now
generated in all cases. When scratch register is available, the PGD
is also stored in the scratch register.
There is no reliable way to tell R4000/R4400 SC and MC variations apart,
however a simple heuristic should give good results. Only the MC version
supports coherent caching so we can rely on such a mode having been set
for KSEG0 by the power-on firmware to reliably indicate an MC processor.
SC processors reportedly hang on coherent cached memory accesses and Linux
is linked to a cached load address so the firmware has to use the correct
caching mode to download the kernel image in a cached mode successfully.
OTOH if the firmware chooses to use either the non-coherent cached or the
uncached mode for KSEG0 on an MC processor, then the SC variant will be
reported, just as we currently do, so no regression here.
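The heuristic above can be sketched as follows. This is an illustrative Python model, not the kernel code: the numeric cache-attribute encodings, the three-bit mask and the function name are assumptions made for the example.

```python
# Assumed encodings of the KSEG0 caching mode (K0 field of CP0 Config).
UNCACHED = 2
CACHABLE_NONCOHERENT = 3
CACHABLE_CE = 4   # coherent exclusive
CACHABLE_COW = 5  # coherent exclusive on write
CACHABLE_CUW = 6  # coherent update on write

COHERENT_MODES = {CACHABLE_CE, CACHABLE_COW, CACHABLE_CUW}


def guess_r4k_variant(config_reg: int) -> str:
    """Report "MC" only if firmware left KSEG0 in a coherent cached mode."""
    k0 = config_reg & 0x7  # assumed: the low three bits hold the KSEG0 mode
    return "MC" if k0 in COHERENT_MODES else "SC"
```

An MC processor whose firmware picked a non-coherent or uncached mode falls through to "SC", which matches the behaviour before the change.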
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Cc: Jonas Gorski <jogo@openwrt.org> Cc: MIPS Mailing List <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/5882/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This change complements commits d0da7c002f7b2a93582187a9e3f73891a01d8ee4
[MIPS: DEC: Convert to new irq_chip functions] and 5359b938c088423a28c41499f183cd10824c1816 [MIPS: DECstation I/O ASIC DMA
interrupt handling fix] and implements automatic handling of the two
classes of DMA interrupts the I/O ASIC implements, informational and
errors.
Informational DMA interrupts do not stop the transfer and use the
`handle_edge_irq' handler that clears the request right away so that
another request may be recorded while the previous is being handled.
DMA error interrupts stop the transfer and require a corrective action
before DMA can be reenabled. Therefore they use the `handle_fasteoi_irq'
handler that only clears the request on the way out. Because MIPS
processor interrupt inputs, one of which the I/O ASIC's interrupt
controller is cascaded to, are level-triggered it is recommended that
error DMA interrupt action handlers are registered with the IRQF_ONESHOT
flag set so that they are run with the interrupt line masked.
This change removes the export of clear_ioasic_dma_irq that now does not
have to be called by device drivers to clear interrupts explicitly
anymore. Originally these interrupts were cleared in the .end handler of
the `irq_chip' structure, before it was removed.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5874/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The semantics stay the same - on Cavium Octeon the functions were dead
code (it overrides the MIPS DMA ops) - on other platforms they contained
no code at all.
Add support for the LZ4 compression scheme in the ZBOOT decompression
stub. To support it we need to:
- select the "lz4" compression tool to compress the vmlinux.bin
payload
- memcpy() is also required for decompress_unlz4.c so we share the
implementation between GZIP, XZ and now LZ4
MIPS: ZBOOT: Define program header for text loadable segment
There is currently no corresponding ELF program header for the "text"
loadable segment, which is confusing for some bootloaders out there such
as CFE because they expect to find a program header matching the segment
they are trying to load. The Linux kernel ELF binary "vmlinux" has a
similar program header for the text segment so we just mimic this here
too.
Add support for the XZ compression scheme in the ZBOOT decompression
stub. To support it we need to:
- select the "xzkern" compression tool to compress the vmlinux.bin
payload
- link with ashldi3.o for xz_dec_run() to work
- memcpy() is also required for decompress_unxz.c so we share the
implementation between GZIP and XZ
MIPS: Kbuild: Do not allow building vmlinuz when !ZBOOT
When CONFIG_SYS_SUPPORTS_ZBOOT is not enabled, we will still try to
build the decompressor code in arch/mips/boot/compressed as a
dependency for producing the vmlinuz target and this will result in
the following build failure:
OBJCOPY arch/mips/boot/compressed/vmlinux.bin
arch/mips/boot/compressed/decompress.c: In function 'decompress_kernel':
arch/mips/boot/compressed/decompress.c:105:2: error: implicit
declaration of function 'decompress'
make[1]: *** [arch/mips/boot/compressed/decompress.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [vmlinuz] Error 2
This is a genuine build failure because we have no implementation for
the decompress() function body since no kernel compression method
defined in CONFIG_KERNEL_(GZIP,BZIP2...) has been enabled.
arch/mips/Makefile already guards the install target for the "vmlinuz"
binary with a proper ifdef CONFIG_SYS_SUPPORTS_ZBOOT, we now also do the
same if we attempt to do a "make vmlinuz" and show that
CONFIG_SYS_SUPPORTS_ZBOOT is not enabled.
[ralf@linux-mips.org: Cleanup the makefile rule as suggested by James
Hogan.]
Currently when using an initrd on a MIPS system the start of the bootmem region of
memory is set to the larger of the end of the kernel bss region (_end) or the end
of the initrd. In a typical memory layout where the initrd is at some address above
the kernel image this means that the start of the bootmem region will be the end of
the initrd. But when we are done processing/loading the initrd we have no way to
reclaim the memory region it occupied, and we lose a large chunk of now otherwise
empty RAM from our final running system.
The bootmem code is designed to allow this initrd to be reserved (and the code in
finalize_initrd() currently does this). When the initrd is finally processed/loaded
its reserved memory is freed.
Fix the setting of the start of the bootmem map to be the end of the kernel.
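A small arithmetic sketch of the difference, in Python. The addresses are made up for illustration and the function names are hypothetical; only the max() versus kernel-end choice comes from the description above.

```python
def bootmem_start_before_fix(kernel_end: int, initrd_end: int) -> int:
    # Old behaviour: bootmem begins after whichever ends later.
    return max(kernel_end, initrd_end)


def bootmem_start_after_fix(kernel_end: int, initrd_end: int) -> int:
    # New behaviour: bootmem begins at the end of the kernel; the initrd
    # is reserved separately and freed once it has been processed.
    return kernel_end


# Hypothetical layout: kernel ends at 8 MiB, initrd occupies 64..80 MiB.
KERNEL_END = 0x00800000
INITRD_END = 0x05000000

unmanaged_bytes = (bootmem_start_before_fix(KERNEL_END, INITRD_END)
                   - bootmem_start_after_fix(KERNEL_END, INITRD_END))
```

With this layout the old calculation leaves 72 MiB below the bootmem start that can never be handed back to the running system.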
MIPS: BCM47XX: Fix detected clock on Asus WL520GC and WL520GU
The Asus WL520GC and WL520GU are based on the BCM5354 and clocked at
200MHz, but they do not have a clkfreq nvram variable set to the
correct value. This adds a workaround for these devices.
MIPS: BCM47XX: Fix clock detection for BCM5354 with 200MHz clock
Some BCM5354 SoCs are running at 200MHz, but it is not possible to read
the clock from a register like it is done on some other SoCs in ssb and
bcma. These devices should have a clkfreq nvram configuration value set
to 200, read it and set the clock to the correct value.
Detect on which board this code is running based on some nvram
settings. This is needed to start board specific workarounds and
configure the leds and buttons which are on different gpios on every board.
This patch adds some boards we have seen, but there are many more.
MIPS: Kconfig: CMP support needs to select SMP as well
The CMP code is only designed to work with SMP configurations.
Fixes multiple build problems on certain randconfigs:
In file included from arch/mips/kernel/smp-cmp.c:34:0:
arch/mips/include/asm/smp.h:28:0:
error: "raw_smp_processor_id" redefined [-Werror]
In file included from include/linux/sched.h:30:0,
from arch/mips/kernel/smp-cmp.c:22:
include/linux/smp.h:135:0: note: this is the location of the
previous definition
In file included from arch/mips/kernel/smp-cmp.c:34:0:
arch/mips/include/asm/smp.h:57:20:
error: redefinition of 'smp_send_reschedule'
In file included from include/linux/sched.h:30:0,
from arch/mips/kernel/smp-cmp.c:22:
include/linux/smp.h:179:20: note: previous
definition of 'smp_send_reschedule' was here
In file included from arch/mips/kernel/smp-cmp.c:34:0:
arch/mips/include/asm/smp.h: In function 'smp_send_reschedule':
arch/mips/include/asm/smp.h:61:8:
error: dereferencing pointer to incomplete type
[...]
In commit 15ef17f622033455dcf03ae96256e474073a7b11
(tty: ar933x_uart: use the clk API to get the uart
clock), the AR933x UART driver has been converted
to get the uart clock rate via the clock API and it
does not use the platform data anymore.
Remove the ar933x_uart_platform.h header file and get
rid of the superfluous variable and initialization code
in platform setup.
Jiang Liu [Wed, 11 Sep 2013 16:07:15 +0000 (00:07 +0800)]
MIPS: SMP: kill redundant call of generic_smp_call_function_single_interrupt()
Since commit 9a46ad6d6df3b54 "smp: make smp_call_function_many() use
logic similar to smp_call_function_single()",
generic_smp_call_function_single_interrupt() is an alias of
generic_smp_call_function_interrupt(), so kill the redundant call.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Jiang Liu <liuj97@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Shaohua Li <shli@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Jiri Kosina <trivial@kernel.org> Cc: Wang YanQing <udknight@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: linux-arch@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/5820/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Merge tag 'staging-3.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging fixes from Greg KH:
"Here are a number of small staging tree and iio driver fixes. Nothing
major, just lots of little things"
* tag 'staging-3.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (34 commits)
iio:buffer_cb: Add missing iio_buffer_init()
iio: Prevent race between IIO chardev opening and IIO device free
iio: fix: Keep a reference to the IIO device for open file descriptors
iio: Stop sampling when the device is removed
iio: Fix crash when scan_bytes is computed with active_scan_mask == NULL
iio: Fix mcp4725 dev-to-indio_dev conversion in suspend/resume
iio: Fix bma180 dev-to-indio_dev conversion in suspend/resume
iio: Fix tmp006 dev-to-indio_dev conversion in suspend/resume
iio: iio_device_add_event_sysfs() bugfix
staging: iio: ade7854-spi: Fix return value
staging:iio:hmc5843: Fix measurement conversion
iio: isl29018: Fix uninitialized value
staging:iio:dummy fix kfifo_buf kconfig dependency issue if kfifo modular and buffer enabled for built in dummy driver.
iio: at91: fix adc_clk overflow
staging: line6: add bounds check in snd_toneport_source_put()
Staging: comedi: Fix dependencies for drivers misclassified as PCI
staging: r8188eu: Adjust RX gain
staging: r8188eu: Fix smatch warning in core/rtw_ieee80211.
staging: r8188eu: Fix smatch error in core/rtw_mlme_ext.c
staging: r8188eu: Fix Smatch off-by-one warning in hal/rtl8188e_hal_init.c
...
Merge tag 'usb-3.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of small USB fixes for 3.12-rc2.
One is a revert of a EHCI change that isn't quite ready for 3.12.
Others are minor things, gadget fixes, Kconfig fixes, and some quirks
and documentation updates.
All have been in linux-next for a bit"
* tag 'usb-3.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: pl2303: distinguish between original and cloned HX chips
USB: Faraday fotg210: fix email addresses
USB: fix typo in usb serial simple driver Kconfig
Revert "USB: EHCI: support running URB giveback in tasklet context"
usb: s3c-hsotg: do not disconnect gadget when receiving ErlySusp intr
usb: s3c-hsotg: fix unregistration function
usb: gadget: f_mass_storage: reset endpoint driver data when disabled
usb: host: fsl-mph-dr-of: Staticize local symbols
usb: gadget: f_eem: Staticize eem_alloc
usb: gadget: f_ecm: Staticize ecm_alloc
usb: phy: omap-usb3: Fix return value
usb: dwc3: gadget: avoid memory leak when failing to allocate all eps
usb: dwc3: remove extcon dependency
usb: gadget: add '__ref' for rndis_config_register() and cdc_config_register()
usb: dwc3: pci: add support for BayTrail
usb: gadget: cdc2: fix conversion to new interface of f_ecm
usb: gadget: fix a bug and a WARN_ON in dummy-hcd
usb: gadget: mv_u3d_core: fix violation of locking discipline in mv_u3d_ep_disable()
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
- some small fixes for msm and exynos
- a regression revert affecting nouveau users with old userspace
- intel pageflip deadlock and gpu hang fixes, hsw modesetting hangs
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (22 commits)
Revert "drm: mark context support as a legacy subsystem"
drm/i915: Don't enable the cursor on a disable pipe
drm/i915: do not update cursor in crtc mode set
drm/exynos: fix return value check in lowlevel_buffer_allocate()
drm/exynos: Fix address space warnings in exynos_drm_fbdev.c
drm/exynos: Fix address space warning in exynos_drm_buf.c
drm/exynos: Remove redundant OF dependency
drm/msm: drop unnecessary set_need_resched()
drm/i915: kill set_need_resched
drm/msm: fix potential NULL pointer dereference
drm/i915/dvo: set crtc timings again for panel fixed modes
drm/i915/sdvo: Robustify the dtd<->drm_mode conversions
drm/msm: workaround for missing irq
drm/msm: return -EBUSY if bo still active
drm/msm: fix return value check in ERR_PTR()
drm/msm: fix cmdstream size check
drm/msm: hangcheck harder
drm/msm: handle read vs write fences
drm/i915/sdvo: Fully translate sync flags in the dtd->mode conversion
drm/i915: Use proper print format for debug prints
...
Merge branch 'for-3.12/core' of git://git.kernel.dk/linux-block
Pull block IO fixes from Jens Axboe:
"After merge window, no new stuff this time only a collection of neatly
confined and simple fixes"
* 'for-3.12/core' of git://git.kernel.dk/linux-block:
cfq: explicitly use 64bit divide operation for 64bit arguments
block: Add nr_bios to block_rq_remap tracepoint
If the queue is dying then we only call the rq->end_io callout. This leaves bios setup on the request, because the caller assumes when the blk_execute_rq_nowait/blk_execute_rq call has completed that the rq->bios have been cleaned up.
bio-integrity: Fix use of bs->bio_integrity_pool after free
blkcg: relocate root_blkg setting and clearing
block: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...)
block: trace all devices plug operation
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"These are mostly bug fixes and two small performance fixes. The
most important of the bunch are Josef's fix for a snapshotting
regression and Mark's update to fix compile problems on arm"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (25 commits)
Btrfs: create the uuid tree on remount rw
btrfs: change extent-same to copy entire argument struct
Btrfs: dir_inode_operations should use btrfs_update_time also
btrfs: Add btrfs: prefix to kernel log output
btrfs: refuse to remount read-write after abort
Btrfs: btrfs_ioctl_default_subvol: Revert back to toplevel subvolume when arg is 0
Btrfs: don't leak transaction in btrfs_sync_file()
Btrfs: add the missing mutex unlock in write_all_supers()
Btrfs: iput inode on allocation failure
Btrfs: remove space_info->reservation_progress
Btrfs: kill delay_iput arg to the wait_ordered functions
Btrfs: fix worst case calculator for space usage
Revert "Btrfs: rework the overcommit logic to be based on the total size"
Btrfs: improve replacing nocow extents
Btrfs: drop dir i_size when adding new names on replay
Btrfs: replay dir_index items before other items
Btrfs: check roots last log commit when checking if an inode has been logged
Btrfs: actually log directory we are fsync()'ing
Btrfs: actually limit the size of delalloc range
Btrfs: allocate the free space by the existed max extent size when ENOSPC
...
Merge tag 'iio-fixes-for-3.12a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus
Jonathan writes:
First round of IIO fixes for 3.12
A series of wrong 'struct dev' assumptions in suspend/resume callbacks
following on from this issue being identified in a new driver review.
One to watch out for in future.
A number of driver specific fixes
1) at91 - fix an overflow in clock rate computation
2) dummy - Kconfig dependency issue
3) isl29018 - uninitialized value
4) hmc5843 - measurement conversion bug introduced by recent cleanup.
5) ade7854-spi - wrong return value.
Some IIO core fixes
1) Wrong value picked up for event code creation for a modified channel
2) A null dereference on failure to initialize a buffer after no buffer has
been in use, when using the available_scan_masks approach.
3) Sampling not stopped when a device is removed. Affects forced removal
such as hot unplugging.
4) Prevent device going away if a chrdev is still open in userspace.
5) Prevent race on chardev opening and device being freed.
6) Add a missing iio_buffer_init in the call back buffer.
These last few are the first part of a set from Lars-Peter Clausen who
has been taking a closer look at our removal paths and buffer handling
than anyone has for quite some time.
Adding the number of bios in a remapped request to 'block_rq_remap'
tracepoint.
Request remapper clones bios in a request to track the completion
status of each bio. So the number of bios can be useful information
for investigation.
Related discussions:
http://www.redhat.com/archives/dm-devel/2013-August/msg00084.html
http://www.redhat.com/archives/dm-devel/2013-September/msg00024.html
Josef Bacik [Sat, 21 Sep 2013 02:33:20 +0000 (22:33 -0400)]
Btrfs: create the uuid tree on remount rw
Users have been complaining of the uuid tree stuff warning that there is no uuid
root when trying to do snapshot operations. This is because if you mount -o ro
we will not create the uuid tree. But then if you mount -o rw,remount we will
still not create it and then any subsequent snapshot/subvol operations you try
to do will fail gloriously. Fix this by creating the uuid_root on remount rw if
it was not already there. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Mark Fasheh [Tue, 17 Sep 2013 22:43:54 +0000 (15:43 -0700)]
btrfs: change extent-same to copy entire argument struct
btrfs_ioctl_file_extent_same() uses __put_user_unaligned() to copy some data
back to its argument struct. Unfortunately, not all architectures provide
__put_user_unaligned(), so compiles break on them if btrfs is selected.
Instead, just copy the whole struct in / out at the start and end of
operations, respectively.
Signed-off-by: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Guangyu Sun [Mon, 16 Sep 2013 17:42:03 +0000 (10:42 -0700)]
Btrfs: dir_inode_operations should use btrfs_update_time also
Commit 2bc5565286121d2a77ccd728eb3484dff2035b58 (Btrfs: don't update atime on
RO subvolumes) ensures that the access time of an inode is not updated when
the inode lives in a read-only subvolume.
However, if a directory on a read-only subvolume is accessed, the atime is
updated. This results in a write operation to a read-only subvolume. I
believe that access times should never be updated on read-only subvolumes.
Reported-by: Koen De Wit <koen.de.wit@oracle.com> Signed-off-by: Guangyu Sun <guangyu.sun@oracle.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Frank Holton [Fri, 13 Sep 2013 15:46:50 +0000 (11:46 -0400)]
btrfs: Add btrfs: prefix to kernel log output
The kernel log entries for device label %s and device fsid %pU
are missing the btrfs: prefix. Add those here.
Signed-off-by: Frank Holton <fholton@gmail.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
David Sterba [Fri, 13 Sep 2013 15:41:20 +0000 (17:41 +0200)]
btrfs: refuse to remount read-write after abort
It's still possible to flip the filesystem into RW mode after it's
remounted RO due to an abort. There are lots of places that check for
the superblock error bit and will not write data, but we should not let
the filesystem appear read-write.
Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Btrfs: btrfs_ioctl_default_subvol: Revert back to toplevel subvolume when arg is 0
This patch makes it possible to set BTRFS_FS_TREE_OBJECTID as the default
subvolume by passing a subvolume id of 0.
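As a rough Python sketch of the mapping described above (the numeric value of BTRFS_FS_TREE_OBJECTID and the function name are assumptions for illustration):

```python
BTRFS_FS_TREE_OBJECTID = 5  # assumed value of the top-level subvolume id


def resolve_default_subvol(objectid: int) -> int:
    """Treat a requested id of 0 as 'revert to the top-level subvolume'."""
    if objectid == 0:
        return BTRFS_FS_TREE_OBJECTID
    return objectid
```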
Signed-off-by: chandan <chandan@linux.vnet.ibm.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Btrfs: don't leak transaction in btrfs_sync_file()
In btrfs_sync_file(), if the call to btrfs_log_dentry_safe() returns
a negative error (e.g. -ENOMEM via btrfs_log_inode()), we would
return without ending/freeing the transaction.
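The shape of the fix can be modelled in a few lines of Python. The Transaction class and the callback are hypothetical stand-ins; the only point is that the error path must end the transaction before returning.

```python
ENOMEM = 12


class Transaction:
    """Toy transaction handle that counts how many are still open."""
    open_count = 0

    def __init__(self) -> None:
        Transaction.open_count += 1

    def end(self) -> None:
        Transaction.open_count -= 1


def sync_file(log_dentry) -> int:
    trans = Transaction()
    ret = log_dentry()
    if ret < 0:
        trans.end()  # the fix: do not return with the transaction open
        return ret
    trans.end()
    return 0
```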
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Stefan Behrens [Wed, 11 Sep 2013 07:59:22 +0000 (09:59 +0200)]
Btrfs: add the missing mutex unlock in write_all_supers()
The BUG() was replaced by btrfs_error() and return -EIO with the
patch "get rid of one BUG() in write_all_supers()", but the missing
mutex_unlock() was overlooked.
The 0-DAY kernel build service from Intel reported the missing
unlock which was found by the coccinelle tool:
fs/btrfs/disk-io.c:3422:2-8: preceding lock on line 3374
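A minimal Python model of the error path, assuming a single lock and a simplified error check (both hypothetical; the real function does far more):

```python
import threading

EIO = 5
device_list_mutex = threading.Lock()


def write_all_supers(total_errors: int, max_errors: int) -> int:
    device_list_mutex.acquire()
    if total_errors > max_errors:
        # The early return that replaced BUG() must drop the lock too.
        device_list_mutex.release()
        return -EIO
    device_list_mutex.release()
    return 0
```

Without the release on the early return, the next caller would block forever on the mutex.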
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Tue, 17 Sep 2013 14:55:51 +0000 (10:55 -0400)]
Btrfs: kill delay_iput arg to the wait_ordered functions
This is a leftover of how we used to wait for ordered extents, which was to
grab the inode and then run filemap flush on it. However if we have an ordered
extent then we already are holding a ref on the inode, and we just use
btrfs_start_ordered_extent anyway, so there is no reason to have an extra ref on
the inode to start work on the ordered extent. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Tue, 17 Sep 2013 14:50:06 +0000 (10:50 -0400)]
Btrfs: fix worst case calculator for space usage
Forever ago I made the worst case calculator say that we could potentially split
into 3 blocks for every level on the way down, which isn't right. If we split
we're only going to get two new blocks, the one we originally cow'ed and the new
one we're going to split. Thanks,
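To see what the change means for reservations, here is a hedged Python sketch. The formula shape (one leaf plus one node per remaining level, times blocks per level, times items), the maximum tree depth and the block sizes are assumptions for illustration.

```python
BTRFS_MAX_LEVEL = 8  # assumed maximum tree depth


def worst_case_bytes(leafsize: int, nodesize: int, num_items: int,
                     blocks_per_level: int) -> int:
    """Bytes to reserve if every level may need blocks_per_level blocks."""
    per_path = leafsize + nodesize * (BTRFS_MAX_LEVEL - 1)
    return per_path * blocks_per_level * num_items


old_estimate = worst_case_bytes(16384, 16384, 1, blocks_per_level=3)
new_estimate = worst_case_bytes(16384, 16384, 1, blocks_per_level=2)
```

Under these assumptions the reservation per item drops by a third, from 384 KiB to 256 KiB.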
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Tue, 17 Sep 2013 14:48:00 +0000 (10:48 -0400)]
Revert "Btrfs: rework the overcommit logic to be based on the total size"
This reverts commit 70afa3998c9baed4186df38988246de1abdab56d. It is causing
performance issues and wasn't actually correct. There were problems with the
way we flushed delalloc and that was the real cause of the early enospc.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Thu, 12 Sep 2013 20:58:28 +0000 (16:58 -0400)]
Btrfs: improve replacing nocow extents
Various people have hit a deadlock when running btrfs/011. This is because when
replacing nocow extents we will take the i_mutex to make sure nobody messes with
the file while we are replacing the extent. The problem is we are already
holding a transaction open, which is a locking inversion, so instead we need to
save these inodes we find and then process them outside of the transaction.
Further we can't just lock the inode and assume we are good to go. We need to
lock the extent range and then read back the extent cache for the inode to make
sure the extent really still points at the physical block we want. If it
doesn't we don't have to copy it. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Wed, 11 Sep 2013 18:17:00 +0000 (14:17 -0400)]
Btrfs: drop dir i_size when adding new names on replay
So if we have dir_index items in the log that means we also have the inode item
as well, which means that the inode's i_size is correct. However when we
process dir_index'es we call btrfs_add_link() which will increase the
directory's i_size for the new entry. To fix this we need to just set the dir
items i_size to 0, and then as we find dir_index items we adjust the i_size.
btrfs_add_link() will do it for new entries, and if the entry already exists we
can just add the name_len to the i_size ourselves. Thanks,
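The double counting and its fix can be illustrated with a toy model in Python. The per-entry cost factor, the dictionary layout and the helper names are assumptions, not the kernel's data structures.

```python
ENTRY_COST = 2  # assumed: each name is counted twice in a directory's i_size


def add_link(dir_state: dict, name: str) -> None:
    """Stand-in for btrfs_add_link(): inserts the name and grows i_size."""
    dir_state["names"].add(name)
    dir_state["i_size"] += len(name) * ENTRY_COST


def replay_dir_index(dir_state: dict, logged_names, reset_first: bool) -> int:
    if reset_first:
        dir_state["i_size"] = 0  # the fix: rebuild i_size from scratch
    for name in logged_names:
        if name in dir_state["names"]:
            if reset_first:
                dir_state["i_size"] += len(name) * ENTRY_COST
        else:
            add_link(dir_state, name)
    return dir_state["i_size"]


# The logged inode item already says i_size covers "a" and "bb" (6 bytes),
# but only "a" exists in the directory before the dir_index items replay.
old_size = replay_dir_index({"names": {"a"}, "i_size": 6}, ["a", "bb"], False)
new_size = replay_dir_index({"names": {"a"}, "i_size": 6}, ["a", "bb"], True)
```

Here the old path ends at 10 because "bb" is counted twice, while the reset path ends at the correct 6.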
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Wed, 11 Sep 2013 15:57:23 +0000 (11:57 -0400)]
Btrfs: replay dir_index items before other items
A user reported a bug where his log would not replay because he was getting
-EEXIST back. This was because he had a file moved into a directory that was
logged. What happens is the file had a lower inode number, and so it is
processed first when replaying the log, and so we add the inode ref in for the
directory it was moved to. But then we process the directory's DIR_INDEX item
and try to add the inode ref for that inode and it fails because we already
added it when we replayed the inode. To solve this problem we need to just
process any DIR_INDEX items we have in the log first so this all is taken care
of, and then we can replay the rest of the items. With this patch my reproducer
can remount the file system properly instead of erroring out. Thanks,
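The ordering problem can be reproduced with a small Python simulation. The item tuples and the rule that only DIR_INDEX replay fails on an existing link are modelling assumptions based on the description above.

```python
def replay_log(items, dir_index_first: bool) -> set:
    """Replay (kind, dir_inode, name) items and return the links created."""
    links = set()
    if dir_index_first:
        # Stable sort: DIR_INDEX items move to the front, order otherwise kept.
        items = sorted(items, key=lambda item: item[0] != "DIR_INDEX")
    for kind, dir_inode, name in items:
        link = (dir_inode, name)
        if kind == "DIR_INDEX":
            if link in links:
                raise FileExistsError(link)  # models the -EEXIST failure
            links.add(link)
        else:
            links.add(link)  # inode ref replay tolerates an existing link
    return links


# The moved file has the lower inode number, so its ref sorts first.
LOG = [("INODE_REF", 300, "f"), ("DIR_INDEX", 300, "f")]
```

Calling replay_log(LOG, dir_index_first=False) raises FileExistsError, while dir_index_first=True replays cleanly.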
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Wed, 11 Sep 2013 13:55:42 +0000 (09:55 -0400)]
Btrfs: check roots last log commit when checking if an inode has been logged
Liu introduced a local copy of the last log commit for an inode to make sure we
actually log an inode even if a log commit has already taken place. In order to
make sure we didn't relog the same inode multiple times he set this local copy
to the current trans when we log the inode, because usually we log the inode and
then sync the log. The exception to this is during rename, we will relog an
inode if the name changed and it is already in the log. The problem with this
is then we go to sync the inode, and our check to see if the inode has already
been logged is tripped and we don't sync the log. To fix this we need to _also_
check against the root's last log commit, because it could be less than what is
in our local copy of the log commit. This fixes a bug where we rename a file
into a directory and then fsync the directory and then on remount the directory
is no longer there. Thanks,
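A sketch of the extra check in Python. The field names are modelled on the description above and the numbers are invented; treat the whole thing as an assumption rather than the actual kernel predicate.

```python
def inode_in_log(inode: dict, root_last_log_commit: int, generation: int,
                 check_root: bool = True) -> bool:
    logged = (inode["logged_trans"] == generation
              and inode["last_sub_trans"] <= inode["last_log_commit"])
    if check_root:
        # The fix: the root's view of the last log commit must agree too.
        logged = logged and inode["last_sub_trans"] <= root_last_log_commit
    return logged


# After a rename relogs the inode, its local copy claims sub-transaction 5
# is committed, but the root has only committed the log up to 4.
renamed = {"logged_trans": 7, "last_sub_trans": 5, "last_log_commit": 5}
ROOT_LAST_LOG_COMMIT = 4
```

With check_root=False the inode looks already logged and the sync is skipped; with the root check it is correctly reported as not yet in the committed log.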
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Wed, 11 Sep 2013 13:36:30 +0000 (09:36 -0400)]
Btrfs: actually log directory we are fsync()'ing
If you just create a directory and then fsync that directory and then pull the
power plug you will come back up and the directory will not be there. That is
because we won't actually create directories if we've logged files inside of
them since they will be created on replay, but in this check we will set our
logged_trans of our current directory if it happens to be a directory, making us
think it doesn't need to be logged. Fix the logic to only do this to parent
directories. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Fri, 30 Aug 2013 18:38:49 +0000 (14:38 -0400)]
Btrfs: actually limit the size of delalloc range
So forever we have had this thing to limit the amount of delalloc pages we'll
setup to be written out to 128mb. This is because we have to lock all the pages
in this range, so anything above this gets a bit unwieldy, and also without a
limit we'll happily allocate gigantic chunks of disk space. Turns out our check
for this wasn't quite right, we wouldn't actually limit the chunk we wanted to
write out, we'd just stop looking for more space after we went over the limit.
So if you do a giant 20gb dd on my box with lots of ram I could get 2gig
extents. This is fine normally, except when you go to relocate these extents
and we can't find enough space to relocate these monster extents, since we have
to be able to allocate exactly the same sized extent to move it around. So fix
this by actually enforcing the limit. With this patch I'm no longer seeing
giant 1.5gb extents. Thanks,
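The difference between "stop looking after the limit" and "enforce the limit" can be sketched in Python. The extent representation and the function name are hypothetical.

```python
MAX_BYTES = 128 * 1024 * 1024  # the 128mb limit mentioned above


def find_delalloc_range(extents, max_bytes: int, enforce: bool):
    """Walk contiguous (start, end) delalloc extents; end is exclusive."""
    start = extents[0][0]
    end = start
    total = 0
    for ext_start, ext_end in extents:
        end = ext_end
        total += ext_end - ext_start
        if total >= max_bytes:
            if enforce:
                end = start + max_bytes  # the fix: clamp the range itself
            break  # without enforce we only stop searching further
    return start, end


TWO_GIB = 2 * 1024 * 1024 * 1024
unclamped = find_delalloc_range([(0, TWO_GIB)], MAX_BYTES, enforce=False)
clamped = find_delalloc_range([(0, TWO_GIB)], MAX_BYTES, enforce=True)
```

A single 2 GiB dirty range sails through the unenforced check intact, which is how the oversized extents came about.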
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Btrfs: allocate the free space by the existed max extent size when ENOSPC
By the current code, if the requested size is very large, and all the extents
in the free space cache are small, we will waste lots of the cpu time to cut
the requested size in half and search the cache again and again until it gets
down to the size the allocator can return. In fact, we can know the max extent
size in the cache after the first search, so we needn't cut the size in half
repeatedly, and can just use the max extent size directly. This saves
lots of CPU time and improves performance when there are only fragments
in the free space cache.
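The saving is easy to see in a simplified Python model that counts cache searches. The flat list standing in for the free space cache and both function names are assumptions for illustration.

```python
def alloc_by_halving(free_extents, want: int, min_size: int):
    """Old approach: halve the request until some extent fits."""
    searches = 0
    size = want
    while size >= min_size:
        searches += 1
        if any(extent >= size for extent in free_extents):
            return size, searches
        size //= 2
    return 0, searches


def alloc_by_max_extent(free_extents, want: int, min_size: int):
    """New approach: remember the largest extent seen in the first search."""
    searches = 1
    largest = 0
    for extent in free_extents:
        if extent >= want:
            return want, searches
        largest = max(largest, extent)
    if largest >= min_size:
        return largest, searches + 1  # one retry at the known max size
    return 0, searches


FRAGMENTS = [4096] * 1000  # only 4KB free extents
halving = alloc_by_halving(FRAGMENTS, 1024 * 1024, 4096)
direct = alloc_by_max_extent(FRAGMENTS, 1024 * 1024, 4096)
```

For a 1 MiB request against 4 KiB fragments, halving needs nine searches to reach a size that fits, while the max-extent approach needs two.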
According to my test, if there are only 4KB free space extents in the fs,
and the total size of those extents is 256MB, we can reduce the execution
time of the following test from 5.4s to 1.4s.
dd if=/dev/zero of=<testfile> bs=1MB count=1 oflag=sync
Changelog v2 -> v3:
- fix the problem that we skip the block group with the space which is
less than we need.
Changelog v1 -> v2:
- address the problem that we return a wrong start position when searching
the free space in a bitmap.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Stefan Behrens [Tue, 3 Sep 2013 13:25:27 +0000 (15:25 +0200)]
btrfs: show compiled-in config features at module load time
We want to know if there are debugging features compiled in, as this may
affect performance. The message is printed before the sanity checks.
(This commit message is a copy of David Sterba's commit message when
he introduced btrfs_print_info()).
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Btrfs: more efficient inode tree replace operation
Instead of removing the current inode from the red black tree
and then adding the new one, just use the red black tree replace
operation, which is more efficient.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Reviewed-by: Zach Brown <zab@redhat.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Btrfs: do not add replace target to the alloc_list
If replace was suspended by the umount, the replace target device is added
to the fs_devices->alloc_list during a later mount. This is obviously
wrong. ->is_tgtdev_for_dev_replace is supposed to guard against that,
but ->is_tgtdev_for_dev_replace is (and can only ever be) initialized
*after* everything is opened and fs_devices lists are populated. Fix
this by checking the devid instead: for replace targets it's always
equal to BTRFS_DEV_REPLACE_DEVID.
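A sketch of the devid-based filter in Python. The numeric value of BTRFS_DEV_REPLACE_DEVID, the device dictionaries and the function name are assumptions for illustration.

```python
BTRFS_DEV_REPLACE_DEVID = 0  # assumed value reserved for replace targets


def build_alloc_list(devices):
    """Keep writeable devices, skipping any replace target by its devid."""
    return [dev for dev in devices
            if dev["writeable"] and dev["devid"] != BTRFS_DEV_REPLACE_DEVID]


DEVICES = [
    {"devid": 1, "writeable": True},
    {"devid": 2, "writeable": True},
    {"devid": BTRFS_DEV_REPLACE_DEVID, "writeable": True},  # replace target
]
alloc_list = build_alloc_list(DEVICES)
```

The devid is known as soon as the device is opened, so this check works at the point where the lists are populated, unlike a flag that is set later.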
Cc: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Fri, 30 Aug 2013 19:09:51 +0000 (15:09 -0400)]
Btrfs: fixup error handling in btrfs_reloc_cow
If we failed to actually allocate the correct size of the extent to relocate we
will end up in an infinite loop because we won't return an error, we'll just
move on to the next extent. So fix this up by returning an error, and then fix
all the callers to return an error up the stack rather than BUG_ON()'ing.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
iio: Prevent race between IIO chardev opening and IIO device free
Set the IIO device as the parent for the character device
We need to make sure that the IIO device is not freed while the character device
exists, otherwise the freeing of the IIO device might race against the file open
callback. Do this by setting the character device's parent to the IIO device,
this will cause the character device to grab a reference to the IIO device and
only release it once the character device itself has been removed.
Also move the registration of the character device before the registration of
the IIO device to avoid the (rather theoretical) case that the IIO device is
already freed again before we can add the character device and grab a reference
to the IIO device.
We also need to move the call to cdev_del() from iio_dev_release() to
iio_device_unregister() (where it should have been in the first place anyway) to
avoid a reference cycle: iio_dev_release() is only called once all references
are dropped, but the character device holds a reference to the IIO device.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Peter Meerwald [Wed, 18 Sep 2013 21:10:00 +0000 (22:10 +0100)]
iio: Fix crash when scan_bytes is computed with active_scan_mask == NULL
If the device has available_scan_masks set and the buffer is enabled without
any scan_elements enabled, a NULL pointer is dereferenced in iio_compute_scan_bytes().
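A guard of the following kind would avoid such a dereference. This is a userspace sketch only: compute_scan_bytes() and enable_buffer() are hypothetical stand-ins, the 2 bytes per channel is an arbitrary assumption, and the real fix may handle the empty case differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in: every enabled channel contributes 2 bytes. */
int compute_scan_bytes(const unsigned long *mask, unsigned int nchannels)
{
    int bytes = 0;

    assert(mask != NULL);
    for (unsigned int i = 0; i < nchannels; i++)
        if (*mask & (1UL << i))
            bytes += 2;
    return bytes;
}

/* Enable path: refuse to continue when no scan mask was selected. */
int enable_buffer(const unsigned long *active_scan_mask,
                  unsigned int nchannels)
{
    if (active_scan_mask == NULL)
        return -EINVAL;
    return compute_scan_bytes(active_scan_mask, nchannels);
}
```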
Merge tag 'pm+acpi-3.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
1) Four fixes for cpufreq regressions introduced by the changes that
removed Device Tree parsing for CPU device nodes from cpufreq
drivers from Sudeep KarkadaNagesha.
2) Two fixes for recent cpufreq regressions introduced by changes
related to the preservation of sysfs attributes over system
suspend/resume cycles from Viresh Kumar.
3) Fix for ACPI-based wakeup signaling in the PCI subsystem that
fails to stop PME polling for devices put into the D3cold power
state from Rafael J Wysocki.
4) Fix for bad interactions between cpufreq and udev on systems
supporting intel_pstate where acpi-cpufreq is available as well
from Yinghai Lu.
* tag 'pm+acpi-3.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: return EEXIST instead of EBUSY for second registering
PCI / ACPI / PM: Clear pme_poll for devices in D3cold on wakeup
ARM: shmobile: change dev_id to cpu0 while registering cpu clock
ARM: i.MX: change dev_id to cpu0 while registering cpu clock
cpufreq: imx6q-cpufreq: assign cpu_dev correctly to cpu0 device
cpufreq: cpufreq-cpu0: assign cpu_dev correctly to cpu0 device
cpufreq: unlock correct rwsem while updating policy->cpu
cpufreq: Clear policy->cpus bits in __cpufreq_remove_dev_finish()
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull vhost updates from Michael Tsirkin:
"vhost: minor changes on top of 3.12-rc1
This fixes module loading for vhost-scsi, and tweaks locking in vhost
core a bit. Both of these are not exactly release blockers but it's
early in the cycle so I think it's a good idea to apply them now"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost-scsi: whitespace tweak
vhost/scsi: use vmalloc for order-10 allocation
vhost: wake up worker outside spin_lock
David Howells [Fri, 20 Sep 2013 13:18:00 +0000 (14:18 +0100)]
CacheFiles: Don't try to dump the index key if the cookie has been cleared
Don't try to dump the index key that distinguishes an object if netfs
data in the cookie the object refers to has been cleared (ie. the
cookie has passed most of the way through
__fscache_relinquish_cookie()).
Since the netfs holds the index key, we can't get at it once the ->def
and ->netfs_data pointers have been cleared - and a NULL pointer
exception will ensue, usually just after a:
CacheFiles: Error: Unexpected object collision
error is reported.
Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CacheFiles: Fix memory leak in cachefiles_check_auxdata error paths
In cachefiles_check_auxdata(), we allocate auxbuf but fail to free it if
we determine there's an error or that the data is stale.
Further, assigning the output of vfs_getxattr() to auxbuf->len gives
problems with checking for errors as auxbuf->len is a u16. We don't
actually need to set auxbuf->len, so keep the length in a variable for
now. We shouldn't need to check the upper limit of the buffer as an
overflow there should be indicated by -ERANGE.
While we're at it, fscache_check_aux() returns an enum value, not an
int, so assign it to an appropriately typed variable rather than to ret.
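The u16 problem described above is easy to reproduce in isolation. In this sketch the two helper functions are hypothetical and a long stands in for the return type of vfs_getxattr(); it only shows why a negative error code stored in a 16-bit unsigned field can no longer be detected with a "less than zero" test.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Buggy pattern: the result is stored in a 16-bit unsigned field first. */
int error_detected_u16(long ret)
{
    uint16_t len = (uint16_t)ret; /* -ERANGE wraps to a large positive value */
    int widened = len;            /* integer promotion keeps it positive */

    return widened < 0;           /* never true */
}

/* Fixed pattern: keep the result in a signed variable before checking. */
int error_detected_signed(long ret)
{
    return ret < 0;
}
```

Calling both with -ERANGE shows the difference: the first reports no error, the second does.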
Will Deacon [Thu, 19 Sep 2013 18:06:46 +0000 (19:06 +0100)]
lockref: use cmpxchg64 explicitly for lockless updates
The cmpxchg() function tends not to support 64-bit arguments on 32-bit
architectures. This could be either due to use of unsigned long
arguments (like on ARM) or lack of instruction support (cmpxchgq on
x86). However, these architectures may implement a specific cmpxchg64()
function to provide 64-bit cmpxchg support instead.
Since the lockref code requires a 64-bit cmpxchg and relies on the
architecture selecting ARCH_USE_CMPXCHG_LOCKREF, move to using cmpxchg64
instead of cmpxchg and allow 32-bit architectures to make use of the
lockless lockref implementation.
Cc: Waiman Long <Waiman.Long@hp.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm-cpufreq:
cpufreq: return EEXIST instead of EBUSY for second registering
ARM: shmobile: change dev_id to cpu0 while registering cpu clock
ARM: i.MX: change dev_id to cpu0 while registering cpu clock
cpufreq: imx6q-cpufreq: assign cpu_dev correctly to cpu0 device
cpufreq: cpufreq-cpu0: assign cpu_dev correctly to cpu0 device
cpufreq: unlock correct rwsem while updating policy->cpu
cpufreq: Clear policy->cpus bits in __cpufreq_remove_dev_finish()
Merge tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64
Pull ARM64 fixes from Catalin Marinas:
- Compat register fault reporting fix
- Documentation clarification on tagged pointers
- hwcap widened to 64-bit (user space already reading it as 64-bit)
* tag 'arm64-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
arm64: Widen hwcap to be 64 bit
arm64: Correctly report LR and SP for compat tasks
arm64: documentation: tighten up tagged pointer documentation
arm64: Make do_bad_area() function static
arm64: Correctly report LR and SP for compat tasks
When a task crashes and we print debugging information, ensure that
compat tasks show the actual AArch32 LR and SP registers rather than the
AArch64 ones.
Will Deacon [Tue, 17 Sep 2013 10:46:23 +0000 (11:46 +0100)]
arm64: documentation: tighten up tagged pointer documentation
Commit d50240a5f6ce ("arm64: mm: permit use of tagged pointers at EL0")
added support for tagged pointers in userspace, but the corresponding
update to Documentation/ contained some imprecise statements.
This patch fixes up some minor ambiguities in the text, hopefully making
it more clear about exactly what the kernel expects from user virtual
addresses.
Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A set of fixes for ARM platforms for 3.12. Among them:
- A fix for build breakage in the MTD subsystem for some PXA devices.
David Woodhouse has this patch in his for-next branch but has not
been responding to our requests to send it up so here it is. I
should have amended the commit message to describe the build
failure for CONFIG_OF=n setups, but forgot and now it's down in the
stack of commits.
- Added device-tree for the BeagleBone Black. Turns out people have
been using the older "regular" bone DT for the newer boards, and
there's risk of damaging hardware that way.
- Misc DT and regular fixes for OMAP.
- Fix to make the ST-Ericsson "snowball" boards boot with
multi_v7_defconfig, and enable one of the ST-E reference boards on
the same config.
- Kconfig cleanup for u300 to hide submenus when the platform isn't
enabled.
- Enable ARM_ATAG_DTB_COMPAT to let firmware override command line
when booting with an appended devicetree on non-DT-enabled firmware
(needed to boot snowball)"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (26 commits)
ARM: multi_v7: add HREFv60 to multi_v7 defconfig
ARM: OMAP2+: mux: fix trivial typo in name
ARM: OMAP4 SMP: Corrected a typo fucntions to functions
ARM: OMAP4: cpuidle: fix: call cpu_cluster_pm_exit conditionally
mailbox: remove unnecessary platform_set_drvdata()
ARM: mach-omap2: gpmc: Fix warning when CONFIG_ARM_LPAE=y
ARM: OMAP: fix return value check in omap_device_build_from_dt()
ARM: OMAP4: Fix clock_get error for GPMC during boot
ARM: sa1100: collie.c: fall back to jedec_probe flash detection
ARM: u300: hide submenus
ARM: dts: igep00x0: Add pinmux configuration for MCBSP2
ARM: dts: Fix muxing and regulator for wl12xx on the SDIO bus for blaze
ARM: dts: Fix muxing and regulator for wl12xx on the SDIO bus for pandaboard
mtd: nand: pxa3xx: Remove unneeded ifdef CONFIG_OF
ARM: multi_v7_defconfig: enable ARM_ATAG_DTB_COMPAT
ARM: ux500: disable outer cache debug
ARM: dts: OMAP5: fix ocp2scp DTS data
ARM: dts: OMAP5: fix reg property size
ARM: dts: am335x-bone*: add DT for BeagleBone Black
ARM: dts: omap3-beagle-xm: fix string error in compatible property
...
Dave Airlie [Thu, 19 Sep 2013 23:06:48 +0000 (09:06 +1000)]
Merge branch 'msm-fixes-3.12' of git://people.freedesktop.org/~robclark/linux into drm-fixes
A couple small msm fixes. Plus drop of set_need_resched().
* 'msm-fixes-3.12' of git://people.freedesktop.org/~robclark/linux:
drm/msm: drop unnecessary set_need_resched()
drm/msm: fix potential NULL pointer dereference
drm/msm: workaround for missing irq
drm/msm: return -EBUSY if bo still active
drm/msm: fix return value check in ERR_PTR()
drm/msm: fix cmdstream size check
drm/msm: hangcheck harder
drm/msm: handle read vs write fences
Dave Airlie [Thu, 19 Sep 2013 23:01:27 +0000 (09:01 +1000)]
Merge branch 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes
Just small fixes, and code cleanups.
* 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
drm/exynos: fix return value check in lowlevel_buffer_allocate()
drm/exynos: Fix address space warnings in exynos_drm_fbdev.c
drm/exynos: Fix address space warning in exynos_drm_buf.c
drm/exynos: Remove redundant OF dependency
Dave Airlie [Thu, 19 Sep 2013 22:42:56 +0000 (08:42 +1000)]
Merge tag 'drm-intel-fixes-2013-09-19' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes
Some more deadlock fixes around pageflips and gpu hangs, fixes for hsw hangs
when doing modesets/dpms. And a few minor things to rectify issues with our
modeset state tracking which the checker spotted.
* tag 'drm-intel-fixes-2013-09-19' of git://people.freedesktop.org/~danvet/drm-intel:
drm/i915: Don't enable the cursor on a disable pipe
drm/i915: do not update cursor in crtc mode set
drm/i915: kill set_need_resched
drm/i915/dvo: set crtc timings again for panel fixed modes
drm/i915/sdvo: Robustify the dtd<->drm_mode conversions
drm/i915/sdvo: Fully translate sync flags in the dtd->mode conversion
drm/i915: Use proper print format for debug prints
drm/i915: fix wait_for_pending_flips vs gpu hang deadlock
drm/i915: Track pfit enable state separately from size
Yinghai Lu [Thu, 19 Sep 2013 04:05:20 +0000 (21:05 -0700)]
cpufreq: return EEXIST instead of EBUSY for second registering
On systems that support intel_pstate, acpi_cpufreq fails to load, and
udev keeps trying until trace gets filled up and kernel crashes.
The root cause is that the driver returns the error from
cpufreq_register_driver() unchanged: when some other driver has already
taken over, that function returns EBUSY, and udev then keeps retrying.
cpufreq_register_driver() should return EEXIST instead so that the
system can boot without appending intel_pstate=disable and still use
intel_pstate.
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
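The change in return value can be pictured with a single-slot registry. This is a hypothetical sketch, not the cpufreq core: struct driver, current_driver and register_driver() are invented for the example. It shows a second registration being refused with -EEXIST, the code that, per the message above, lets the system boot while keeping the first driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct driver {
    const char *name;
};

/* At most one scaling driver may be registered at a time. */
static const struct driver *current_driver;

int register_driver(const struct driver *drv)
{
    if (drv == NULL)
        return -EINVAL;

    /* The slot is taken for good; this is not a transient condition. */
    if (current_driver != NULL)
        return -EEXIST;

    current_driver = drv;
    return 0;
}
```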
Reported-by: Paul Zimmerman <Paul.Zimmerman@synopsys.com> Reported-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Dave Airlie <airlied@redhat.com>
PCI / ACPI / PM: Clear pme_poll for devices in D3cold on wakeup
Commit 448bd85 (PCI/PM: add PCIe runtime D3cold support) added a
piece of code to pci_acpi_wake_dev() causing that function to behave
in a special way for devices in D3cold (so that their configuration
registers are not accessed before those devices are resumed).
However, it didn't take the clearing of the pme_poll flag into
account. That has to be done for all devices, even if they are in
D3cold, or pci_pme_list_scan() will not know that wakeup has been
signaled for the device and will poll its PME Status bit
unnecessarily.
Fix the problem by moving the clearing of the pme_poll flag in
pci_acpi_wake_dev() before the code introduced by commit 448bd85.
Reported-and-tested-by: David E. Box <david.e.box@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Cc: 3.6+ <stable@vger.kernel.org> # 3.6+
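The ordering fix can be sketched as follows. The structure and function are hypothetical stand-ins rather than the real pci_acpi_wake_dev(); the sketch only shows the flag being cleared before the D3cold early return, so that it is cleared for every device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum power_state { D0, D3HOT, D3COLD };

/* Hypothetical, stripped-down device state. */
struct dev {
    enum power_state state;
    bool pme_poll;
    bool config_space_touched;
    bool resume_requested;
};

void wake_dev(struct dev *d)
{
    assert(d != NULL);

    /* Clear the flag first, for every device, so the poller stops. */
    d->pme_poll = false;

    if (d->state == D3COLD) {
        /* Registers are not accessible in D3cold: only request a resume. */
        d->resume_requested = true;
        return;
    }

    d->config_space_touched = true; /* stand-in for checking PME status */
    d->resume_requested = true;
}
```

Had the clearing been placed after the early return, a device in D3cold would keep its pme_poll flag set and continue to be polled.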
1) If the local_df boolean is set on an SKB we have to allocate a
unique ID even if IP_DF is set in the ipv4 headers, from Ansis
Atteka.
2) Some fixups for the new chipset support that went into the sfc
driver, from Ben Hutchings.
3) Because SCTP bypasses a good chunk of, and actually duplicates, the
logic of the ipv6 output path, some IPSEC things don't get done
properly. Integrate SCTP better into the ipv6 output path so that
these problems are fixed and such issues don't get missed in the
future either. From Daniel Borkmann.
4) Fix skge regressions added by the DMA mapping error return checking
added in v3.10, from Mikulas Patocka.
5) Kill some more IRQF_DISABLED references, from Michael Opdenacker.
6) Fix races and deadlocks in the bridging code, from Hong Zhiguo.
7) Fix error handling in tun_set_iff(), in particular don't leak
resources. From Jason Wang.
8) Prevent format-string injection into xen-netback driver, from Kees
Cook.
9) Fix regression added to netpoll ARP packet handling, in particular
check for the right ETH_P_ARP protocol code. From Sonic Zhang.
10) Try to deal with AMD IOMMU errors when using r8169 chips, from
Francois Romieu.
11) Cure freezes due to recent changes in the rt2x00 wireless driver,
from Stanislaw Gruszka.
12) Don't do SPI transfers (which can sleep) in interrupt context in
cw1200 driver, from Solomon Peachy.
13) Fix LEDs handling bug in 5720 tg3 chips already handled for 5719.
From Nithin Sujir.
14) Make xen_netbk_count_skb_slots() count the actual number of slots
that will be used, taking into consideration packing and other
issues that the transmit path will run into. From David Vrabel.
15) Use the correct maximum age when calculating the bridge
message_age_timer, from Chris Healy.
16) Get rid of memory leaks in mcs7780 IRDA driver, from Alexey
Khoroshilov.
17) Netfilter conntrack extensions were converted to RCU but are not
always freed properly using kfree_rcu(). Fix from Michal Kubecek.
18) VF reset recovery not being done correctly in qlcnic driver, from
Manish Chopra.
19) Fix inverted test in ATM nicstar driver, from Andy Shevchenko.
20) Missing workqueue destroy in cxgb4 error handling, from Wei Yang.
21) Internal switch not initialized properly in bgmac driver, from Rafał
Miłecki.
22) Netlink messages report wrong local and remote addresses in IPv6
tunneling, from Ding Zhi.
23) ICMP redirects should not generate socket errors in DCCP and SCTP.
We're still working out how this should be handled for RAW and UDP
sockets. From Daniel Borkmann and Duan Jiong.
24) We've had several bugs wherein the network namespace's loopback
device gets accessed after it is free'd, NULL it out so that we can
catch these problems more readily. From Eric W Biederman.
25) Fix regression in TCP RTO calculations, from Neal Cardwell.
26) Fix too early free of xen-netback network device when VIFs still
exist. From Paul Durrant.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
netconsole: fix a deadlock with rtnl and netconsole's mutex
netpoll: fix NULL pointer dereference in netpoll_cleanup
skge: fix broken driver
ip: generate unique IP identificator if local fragmentation is allowed
ip: use ip_hdr() in __ip_make_skb() to retrieve IP header
xen-netback: Don't destroy the netdev until the vif is shut down
net:dccp: do not report ICMP redirects to user space
cnic: Fix crash in cnic_bnx2x_service_kcq()
bnx2x, cnic, bnx2i, bnx2fc: Fix bnx2i and bnx2fc regressions.
vxlan: Avoid creating fdb entry with NULL destination
tcp: fix RTO calculated from cached RTT
drivers: net: phy: cicada.c: clears warning Use #include <linux/io.h> instead of <asm/io.h>
net loopback: Set loopback_dev to NULL when freed
batman-adv: set the TAG flag for the vid passed to BLA
netfilter: nfnetlink_queue: use network skb for sequence adjustment
net: sctp: rfc4443: do not report ICMP redirects to user space
net: usb: cdc_ether: use usb.h macros whenever possible
net: usb: cdc_ether: fix checkpatch errors and warnings
net: usb: cdc_ether: Use wwan interface for Telit modules
ip6_tunnels: raddr and laddr are inverted in nl msg
...
netconsole: fix a deadlock with rtnl and netconsole's mutex
This bug was introduced by commit 7a163bfb7ce50895bbe67300ea610d31b9c09230 ("netconsole: avoid a crash with
multiple sysfs writers"). In store_enabled() we have the following
sequence: acquire nt->mutex then rtnl, but in the netconsole netdev
notifier we have rtnl then nt->mutex effectively leading to a deadlock.
The NULL pointer dereference that the above commit tries to fix is
actually due to another bug in netpoll_cleanup(). This is fixed by dropping
the mutex from the netdev notifier as it's already protected by rtnl.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
netpoll: fix NULL pointer dereference in netpoll_cleanup
I've been hitting a NULL ptr deref while using netconsole because the
np->dev check and the pointer manipulation in netpoll_cleanup are done
without rtnl and the following sequence happens when having a netconsole
over a vlan and we remove the vlan while disabling the netconsole:
CPU 1                                   CPU 2
                                        removes vlan and calls the notifier
enters store_enabled(), calls
netpoll_cleanup which checks np->dev
and then waits for rtnl
                                        executes the netconsole netdev
                                        release notifier making np->dev
                                        == NULL and releases rtnl
continues to dereference a member of
np->dev which at this point is == NULL
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>