Fenghua Yu [Fri, 21 Dec 2012 07:44:28 +0000 (23:44 -0800)]
x86/microcode_intel_early.c: Early update ucode on Intel's CPU
Implement early microcode (ucode) update on Intel CPUs.
load_ucode_intel_bsp() scans the initrd image for ucode. The image is a
cpio-format ucode archive followed by the ordinary initrd image. The binary
ucode file is stored in kernel/x86/microcode/GenuineIntel.bin in the cpio
data. All ucode patches with the same model as the BSP are saved in memory,
and the matching ucode patch is applied on the BSP.
load_ucode_intel_ap() reads the saved ucode patches and updates ucode on each AP.
Fenghua Yu [Fri, 21 Dec 2012 07:44:25 +0000 (23:44 -0800)]
x86/microcode_core_early.c: Define interfaces for early loading ucode
Define interfaces load_ucode_bsp() and load_ucode_ap() to load ucode on BSP and
AP in early boot time. These are generic interfaces. Internally they call
vendor specific implementations.
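A minimal user-space sketch of the dispatch described above, not the kernel code. Every sketch_ name, the enum and the call counter are hypothetical; only the "GenuineIntel" vendor string is taken from the file path mentioned in this log.

```c
#include <string.h>

/* Hypothetical vendor ids, for illustration only. */
enum sketch_vendor { SKETCH_VENDOR_UNKNOWN, SKETCH_VENDOR_INTEL };

/* Stand-in for the vendor-specific early loader; it only counts calls. */
static int sketch_intel_bsp_calls;
static void sketch_load_ucode_intel_bsp(void)
{
    sketch_intel_bsp_calls++;
}

/* Map a CPUID vendor string to a vendor id. */
static enum sketch_vendor sketch_vendor_of(const char *vendor_string)
{
    if (strcmp(vendor_string, "GenuineIntel") == 0)
        return SKETCH_VENDOR_INTEL;
    return SKETCH_VENDOR_UNKNOWN;
}

/*
 * Generic entry point: pick the vendor implementation, if there is one.
 * Returns 1 when a vendor loader ran and 0 otherwise.
 */
static int sketch_load_ucode_bsp(const char *vendor_string)
{
    switch (sketch_vendor_of(vendor_string)) {
    case SKETCH_VENDOR_INTEL:
        sketch_load_ucode_intel_bsp();
        return 1;
    default:
        return 0;
    }
}
```

Callers only see the generic function, so adding another vendor would mean adding one more case without touching the call sites.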
Fenghua Yu [Fri, 21 Dec 2012 07:44:24 +0000 (23:44 -0800)]
x86/common.c: load ucode in 64 bit or show loading ucode info in 32 bit on AP
In 64 bit, load ucode on AP in cpu_init().
In 32 bit, show ucode loading info on AP in cpu_init(). Microcode has been
loaded earlier before paging. Now it is safe to show the loading microcode
info on this AP.
Fenghua Yu [Fri, 21 Dec 2012 07:44:22 +0000 (23:44 -0800)]
x86/microcode_intel.h: Define functions and macros for early loading ucode
Define some functions and macros that will be used in early loading ucode. Some
of them are moved out of the microcode_intel.c driver so that they can be called
in the early boot phase, before the module is available.
Yinghai Lu [Thu, 24 Jan 2013 20:20:16 +0000 (12:20 -0800)]
x86: Don't panic if can not alloc buffer for swiotlb
Normal boot path on a system with iommu support:
the swiotlb buffer is allocated early, and then the iommu is initialized;
if the Intel or AMD iommu is set up properly, the swiotlb buffer
is freed.
The early allocation uses bootmem and can panic when we use kdump with
a buffer only above 4G, or use memmap to limit memory under 4G,
for example memmap=4095M$1M to remove memory under 4G.
As suggested by Eric, add a _nopanic version and no_iotlb_memory so that
map single fails later if swiotlb is still needed.
-v2: don't pass nopanic, and use -ENOMEM return value according to Eric.
panic early instead of using swiotlb_full to panic...according to Eric/Konrad.
-v3: make swiotlb_init to be notpanic, but will affect:
arm64, ia64, powerpc, tile, unicore32, x86.
-v4: cleanup swiotlb_init by removing swiotlb_init_with_default_size.
Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1359058816-7615-36-git-send-email-yinghai@kernel.org
Reviewed-and-tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Cc: linux-mips@linux-mips.org
Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Cc: Shuah Khan <shuahkhan@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Yinghai Lu [Thu, 24 Jan 2013 20:20:12 +0000 (12:20 -0800)]
x86: Merge early kernel reserve for 32bit and 64bit
They are the same, and we could move them out from head32/64.c to setup.c.
We are using memblock, and it could handle overlapping properly, so
we don't need to reserve some at first to hold the location, and just
need to make sure we reserve them before we are using memblock to find
free mem to use.
Yinghai Lu [Thu, 24 Jan 2013 20:20:11 +0000 (12:20 -0800)]
x86: Add Crash kernel low reservation
During the kdump kernel's boot, it needs to find low RAM for the
swiotlb buffer when the system does not support Intel iommu/dmar remapping.
kexec-tools appends memmap=exactmap and the range from /proc/iomem
labelled "Crash kernel", and that range is above 4G for 64bit after boot
protocol 2.12.
We need to add another range to /proc/iomem, such as "Crash kernel low",
so that kexec-tools can find it and append it to the kdump kernel
command line.
Try to reserve some memory under 4G if the normal "Crash kernel" is above 4G.
The user can specify the size with crashkernel_low=XX[KMG].
-v2: fix warning that is found by Fengguang's test robot.
-v3: move out get_mem_size change to another patch, to solve compiling
warning that is found by Borislav Petkov <bp@alien8.de>
-v4: user must specify crashkernel_low if system does not support
intel or amd iommu.
Yinghai Lu [Tue, 29 Jan 2013 04:16:44 +0000 (20:16 -0800)]
x86, boot: Support loading bzImage, boot_params and ramdisk above 4G
xloadflags bit 1 indicates that the kernel and all data structures can be
loaded above 4G; it is set if the kernel is relocatable and 64bit.
The bootloader checks whether xloadflags bit 1 is set to decide whether
it can load the ramdisk and kernel above 4G.
When it loads the ramdisk above 4G, the bootloader stores the high 32 bits
in ext_ramdisk_image/size.
The kernel uses get_ramdisk_image/size, which take ext_ramdisk_image/size
into account, to get the right position for the ramdisk.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Rob Landley <rob@landley.net>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Gokul Caushik <caushik1@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Joe Millenbach <jmillenbach@gmail.com>
Link: http://lkml.kernel.org/r/1359058816-7615-26-git-send-email-yinghai@kernel.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
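A rough user-space sketch of how the low and high halves described in this commit could be combined. The struct is a hypothetical subset of struct boot_params, and the low-half field names ramdisk_image/ramdisk_size are assumed here; only xloadflags bit 1 and ext_ramdisk_image/size are named in the log itself.

```c
#include <stdint.h>

/* Bit 1 of xloadflags: kernel and data may be loaded above 4G. */
#define SKETCH_XLF_ABOVE_4G (1u << 1)

/* Hypothetical subset of the boot parameters, for illustration only. */
struct sketch_boot_params {
    uint16_t xloadflags;
    uint32_t ramdisk_image;      /* low 32 bits of the ramdisk address */
    uint32_t ramdisk_size;       /* low 32 bits of the ramdisk size */
    uint32_t ext_ramdisk_image;  /* high 32 bits of the ramdisk address */
    uint32_t ext_ramdisk_size;   /* high 32 bits of the ramdisk size */
};

/* What a bootloader would test before placing anything above 4G. */
static int sketch_can_load_above_4g(const struct sketch_boot_params *bp)
{
    return (bp->xloadflags & SKETCH_XLF_ABOVE_4G) != 0;
}

/* Join the two 32-bit halves into one 64-bit physical address. */
static uint64_t sketch_get_ramdisk_image(const struct sketch_boot_params *bp)
{
    return (uint64_t)bp->ramdisk_image |
           ((uint64_t)bp->ext_ramdisk_image << 32);
}

/* Same for the size. */
static uint64_t sketch_get_ramdisk_size(const struct sketch_boot_params *bp)
{
    return (uint64_t)bp->ramdisk_size |
           ((uint64_t)bp->ext_ramdisk_size << 32);
}
```

The casts to uint64_t before the shift matter: shifting a 32-bit value by 32 would be undefined in C.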
Yinghai Lu [Thu, 24 Jan 2013 20:20:04 +0000 (12:20 -0800)]
x86, kexec: Replace ident_mapping_init and init_level4_page
ident_mapping_init currently checks whether the pgd/pud is present for every
2M, so when several 2M ranges share the same PUD it keeps checking the
same pud.
init_level4_page does not check for an existing pgd/pud at all.
We can use the generic mapping_init, with different settings in info, to
replace those two home-grown functions.
Yinghai Lu [Thu, 24 Jan 2013 20:20:00 +0000 (12:20 -0800)]
x86, boot: Move verify_cpu.S and no_longmode down
We need to move some code to 32bit section in following patch:
x86, boot: Move lldt/ltr out of 64bit code section
but that will push startup_64 down from 0x200.
According to hpa, we can not change startup_64 position and that
is an ABI.
We could move function verify_cpu and no_longmode down, because
verify_cpu is used via function call and no_longmode will not
return, then we don't need to add extra code for jumping back.
Yinghai Lu [Thu, 24 Jan 2013 20:19:59 +0000 (12:19 -0800)]
x86, boot: Pass cmd_line_ptr with unsigned long instead
boot/compressed/misc.c is used for bzImage in both 64bit and 32bit.
cmd_line_ptr can point to a buffer above 4G, so it must be 64bit;
otherwise the high 32 bits are truncated.
So change the data type to unsigned long, which is 64bit on a 64bit kernel
and holds the correct address of the command line buffer.
This is still fine for a 32bit bzImage, because unsigned long on a 32bit
kernel is still 32bit.
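The truncation can be shown with a small user-space sketch. The function names are hypothetical, and uint64_t stands in for the kernel's 64-bit unsigned long so that the example behaves the same on any host.

```c
#include <stdint.h>

/* Old behaviour: the address passes through a 32-bit variable. */
static uint64_t sketch_cmdline_addr_narrow(uint64_t addr)
{
    uint32_t cmd_line_ptr = (uint32_t)addr;  /* high 32 bits are dropped */
    return cmd_line_ptr;
}

/* New behaviour: the variable is as wide as a 64-bit address. */
static uint64_t sketch_cmdline_addr_wide(uint64_t addr)
{
    uint64_t cmd_line_ptr = addr;            /* nothing is lost */
    return cmd_line_ptr;
}
```

For a command line placed at 0x100001000, just above 4G, the narrow variant yields 0x1000, which points at unrelated low memory.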
Yinghai Lu [Thu, 24 Jan 2013 20:19:55 +0000 (12:19 -0800)]
x86: Merge early_reserve_initrd for 32bit and 64bit
They are the same, so we could move them out from head32/64.c to setup.c.
We are using memblock, and it could handle overlapping properly, so
we don't need to reserve some at first to hold the location, and just
need to make sure we reserve them before we are using memblock to find
free mem to use.
Yinghai Lu [Thu, 24 Jan 2013 20:19:53 +0000 (12:19 -0800)]
x86, 64bit: #PF handler set page to cover only 2M per #PF
We only map a single 2 MiB page per #PF, even though we should be able
to do this a full gigabyte at a time with no additional memory cost.
This is a workaround for a broken AMD reference BIOS (and its
derivatives in shipping systems) which maps a large chunk of memory as
WB in the MTRR system but will #MC if the processor wanders off and
tries to prefetch that memory, which can happen any time the memory is
mapped in the TLB.
H. Peter Anvin [Thu, 24 Jan 2013 20:19:52 +0000 (12:19 -0800)]
x86, 64bit: Use a #PF handler to materialize early mappings on demand
Linear mode (CR0.PG = 0) is mutually exclusive with 64-bit mode; all
64-bit code has to use page tables. This makes it awkward before we
have first set up properly all-covering page tables to access objects
that are outside the static kernel range.
So far we have dealt with that simply by mapping a fixed amount of
low memory, but that fails in at least two upcoming use cases:
1. We will support loading and running the kernel, struct boot_params,
ramdisk, command line, etc. above the 4 GiB mark.
2. We need to access the ramdisk early to get the microcode, so that it
can be updated as early as possible.
We could use early_iomap to access them too, but that would make the code
too messy and hard to unify with 32 bit.
Hence, set up a #PF table and use a fixed number of buffers to set up
page tables on demand. If the buffers fill up then we simply flush
them and start over. These buffers are all in __initdata, so it does
not increase RAM usage at runtime.
Thus, with the help of the #PF handler, we can set the final kernel
mapping from blank, and switch to init_level4_pgt later.
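The "fixed number of buffers, flush and start over" policy can be sketched in user space as follows. The pool size, the struct and the function name are hypothetical; the sketch tracks only buffer indices, not real page tables.

```c
/* Assumed pool size, chosen only for the example. */
#define SKETCH_POOL_PAGES 4

struct sketch_pgt_pool {
    unsigned int next;    /* index of the next free buffer */
    unsigned int resets;  /* how many times the pool was flushed */
};

/*
 * Hand out the index of the next page-table buffer. When the pool is
 * exhausted, every on-demand mapping is dropped and allocation starts
 * over from the first buffer; later faults rebuild what is still needed.
 */
static unsigned int sketch_pool_get(struct sketch_pgt_pool *pool)
{
    if (pool->next >= SKETCH_POOL_PAGES) {
        /* A real handler would also clear the top-level entries that
         * point into the buffers before reusing them. */
        pool->next = 0;
        pool->resets++;
    }
    return pool->next++;
}
```

Because dropped mappings are simply re-created by the next fault, running out of buffers costs time but never correctness.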
During the switchover in head_64.S, before the #PF handler is available,
we use three pages to handle a kernel crossing the 1G and 512G boundaries,
sharing pages by playing games with page aliasing: the same page is
mapped twice in the higher-level tables with appropriate wraparound.
The kernel region itself will be properly mapped; other mappings may
be spurious.
early_make_pgtable uses the kernel high mapping address to access the pages
used to set up the page table.
-v4: Add phys_base offset to make kexec happy, and add
init_mapping_kernel() - Yinghai
-v5: fix compiling with xen, and add back ident level3 and level2 for xen
also move back init_level4_pgt from BSS to DATA again.
because we have to clear it anyway. - Yinghai
-v6: switch to init_level4_pgt in init_mem_mapping. - Yinghai
-v7: remove not needed clear_page for init_level4_page
it is with fill 512,8,0 already in head_64.S - Yinghai
-v8: we need to keep that handler alive until init_mem_mapping and don't
let early_trap_init to trash that early #PF handler.
So split early_trap_pf_init out and move it down. - Yinghai
-v9: switchover only cover kernel space instead of 1G so could avoid
touch possible mem holes. - Yinghai
-v11: change far jmp back to far return to initial_code, that is needed
to fix failure that is reported by Konrad on AMD systems. - Yinghai
Yinghai Lu [Thu, 24 Jan 2013 20:19:51 +0000 (12:19 -0800)]
x86, realmode: Separate real_mode reserve and setup
After we switch to using the #PF handler to help set up the page table,
init_level4_pgt only has its entries set after init_mem_mapping().
We need to move the copying of init_level4_pgt to trampoline_pgd after that.
So split reserve and setup, and move the setup after init_mem_mapping().
Yinghai Lu [Thu, 24 Jan 2013 20:19:49 +0000 (12:19 -0800)]
x86, 64bit: Copy struct boot_params early
We want to support struct boot_params (formerly known as the
zero-page, or real-mode data) above the 4 GiB mark. We will have a #PF
handler to set up the page table for RAM that is not yet accessible early
on, but we want to confine its use to before x86_64_start_reservations, to
limit the code change to the native path only.
We will also need the ramdisk info in struct boot_params to access the
microcode blob in the ramdisk in x86_64_start_kernel, so copying struct
boot_params early makes accessing the ramdisk info simple.
Yinghai Lu [Thu, 24 Jan 2013 20:19:47 +0000 (12:19 -0800)]
x86, realmode: Set real_mode permissions early
Trampoline code is executed by APs with kernel low mapping on 64bit.
We need to set trampoline code to EXEC early before we boot APs.
The problem was found after switching to the #PF handler for setting up
page tables: we no longer set the initial kernel low mapping with EXEC in
arch/x86/kernel/head_64.S.
Use early_initcall instead, which makes sure the trampoline has EXEC set.
-v2: Merge two comments according to Borislav Petkov <bp@alien8.de>
Yinghai Lu [Thu, 24 Jan 2013 20:19:46 +0000 (12:19 -0800)]
x86, 64bit, mm: Make pgd next calculation consistent with pud/pmd
Calculate next for pgd the same way as for pud and pmd: round down and
add size.
Also, do not do boundary-checking with 'next', and just pass 'end' down
to phys_pud_init() instead. Because the loop in phys_pud_init() stops at
PTRS_PER_PUD and thus can handle a possibly bigger 'end' properly.
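"Round down and add size" can be written out as below. The shift of 39 assumes x86-64 4-level paging, where one pgd entry spans 512 GiB; the SKETCH_ names are hypothetical stand-ins for the kernel's PGDIR_* macros.

```c
#include <stdint.h>

/* Assumes x86-64 4-level paging: one pgd entry covers 2^39 bytes. */
#define SKETCH_PGDIR_SHIFT 39
#define SKETCH_PGDIR_SIZE  (1ULL << SKETCH_PGDIR_SHIFT)
#define SKETCH_PGDIR_MASK  (~(SKETCH_PGDIR_SIZE - 1))

/* Start of the pgd-sized region that follows the one containing addr. */
static uint64_t sketch_pgd_next(uint64_t addr)
{
    return (addr & SKETCH_PGDIR_MASK) + SKETCH_PGDIR_SIZE;
}
```

The result is always aligned to the pgd size, whatever the alignment of addr, which is what makes it consistent with the pud and pmd loops.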
Yinghai Lu [Thu, 24 Jan 2013 20:19:45 +0000 (12:19 -0800)]
x86: Factor out e820_add_kernel_range()
Separate out the reservation of the kernel static memory areas into a
separate function.
Also add support for the case where memmap=xxM$yyM is used without exactmap.
The reserved range must be removed first, before we add the E820_RAM
range; otherwise the added E820_RAM range is ignored.
Yinghai Lu [Thu, 24 Jan 2013 20:19:42 +0000 (12:19 -0800)]
x86, mm: Fix page table early allocation offset checking
While debugging loading the kernel above 4G, we found that one page in the
pre-allocated BRK area for early page allocation is never used.
pgt_buf_top is the first address that cannot be used, so we should check
whether the new end is above that top; otherwise the last page is never used.
Fix that check, and also print allocations from the pre-allocated
BRK area to catch possible bugs later.
But once that page is available for pgt again, it triggers a bug in pgt
allocation with Xen: we must avoid using a page as a pgt to map a range
that overlaps with that pgt page.
Add a check for the overlap and, when it happens, use memblock allocation
instead. That fixes the crash on a Xen PV guest with 2G that Stefan found.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1359058816-7615-2-git-send-email-yinghai@kernel.org
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
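The off-by-one in the limit check can be illustrated as below. This is one plausible reading of the description, not the actual patch: the function names are hypothetical, and the "old" comparison is an assumption about how an exclusive limit could end up treated as inclusive.

```c
/*
 * buf_end: first free page in the pre-allocated area
 * buf_top: first page that must NOT be used (exclusive limit)
 * Both functions return 1 when `num` pages can be taken from the area.
 */

/* Assumed old check: rejects an allocation that ends exactly at the top. */
static int sketch_fits_old(unsigned long buf_end, unsigned long num,
                           unsigned long buf_top)
{
    return !(buf_end + num >= buf_top);
}

/* Fixed check: only a new end strictly above the top is rejected. */
static int sketch_fits_fixed(unsigned long buf_end, unsigned long num,
                             unsigned long buf_top)
{
    return !(buf_end + num > buf_top);
}
```

With buf_end = 9 and buf_top = 10, page 9 is the last usable page; the old form refuses it and the fixed form hands it out.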
H. Peter Anvin [Tue, 29 Jan 2013 09:05:24 +0000 (01:05 -0800)]
x86, boot: Sanitize boot_params if not zeroed on creation
Use the new sentinel field to detect bootloaders which fail to follow
the protocol and do not zero the fields in struct boot_params that they
do not explicitly initialize.
Based on an original patch and research by Yinghai Lu.
Changed by hpa to be invoked both in the decompression path and in the
kernel proper; the latter for the case where a bootloader takes over
decompression.
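A minimal sketch of the sentinel idea, assuming a much reduced structure. The struct layout and function name are hypothetical, and which fields get cleared is chosen only for the example; the real set of fields is defined by the patch, not by this sketch.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for struct boot_params. */
struct sketch_boot_params {
    uint8_t  sentinel;           /* zero if the struct was zeroed on creation */
    uint32_t cmd_line_ptr;       /* field the bootloader sets explicitly */
    uint32_t ext_ramdisk_image;  /* newer field an old bootloader never touches */
    uint32_t ext_ramdisk_size;   /* likewise */
};

/*
 * A non-zero sentinel means the bootloader did not start from a zeroed
 * structure, so fields it does not know about may hold garbage.
 * Clear those and keep the fields it is known to set.
 */
static void sketch_sanitize_boot_params(struct sketch_boot_params *bp)
{
    if (bp->sentinel) {
        bp->ext_ramdisk_image = 0;
        bp->ext_ramdisk_size = 0;
    }
}
```

A bootloader that follows the protocol leaves the sentinel at zero, so its values pass through untouched.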
H. Peter Anvin [Sun, 27 Jan 2013 18:43:28 +0000 (10:43 -0800)]
x86, boot: Define the 2.12 bzImage boot protocol
Define the 2.12 bzImage boot protocol: add xloadflags and additional
fields to allow the command line, initramfs and struct boot_params to
live above the 4 GiB mark.
The xloadflags now communicates if this is a 64-bit kernel with the
legacy 64-bit entry point and which of the EFI handover entry points
are supported.
Avoid adding new read flags to loadflags because of claimed
bootloaders testing the whole byte for == 1 to determine bzImageness
at least until the issue can be researched further.
This is based on patches by Yinghai Lu and David Woodhouse.
Originally-by: Yinghai Lu <yinghai@kernel.org>
Originally-by: David Woodhouse <dwmw2@infradead.org>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: David Woodhouse <dwmw2@infradead.org>
Acked-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1359058816-7615-26-git-send-email-yinghai@kernel.org
Cc: Rob Landley <rob@landley.net>
Cc: Gokul Caushik <caushik1@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Joe Millenbach <jmillenbach@gmail.com>
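The concern about bootloaders comparing the whole loadflags byte can be shown with a short sketch. Bit 0 of loadflags is the existing "loaded high" (bzImage) flag; the function names are hypothetical, and the extra bit used in the usage note is an imaginary new flag, not one defined by the protocol.

```c
#include <stdint.h>

/* Bit 0 of loadflags marks a bzImage (loaded high). */
#define SKETCH_LOADED_HIGH 0x01u

/* Fragile test: breaks as soon as any other bit in the byte is set. */
static int sketch_is_bzimage_fragile(uint8_t loadflags)
{
    return loadflags == 1;
}

/* Robust test: looks only at the bit that carries the information. */
static int sketch_is_bzimage_robust(uint8_t loadflags)
{
    return (loadflags & SKETCH_LOADED_HIGH) != 0;
}
```

If a new flag were added as, say, bit 1, a bzImage would report 0x03 and the fragile test would no longer recognize it, which is why the new flags live in the separate xloadflags field.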
Linus Torvalds [Fri, 25 Jan 2013 18:55:21 +0000 (10:55 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"It turns out that we had two crc bugs when running fsx-linux in a
loop. Many thanks to Josef, Miao Xie, and Dave Sterba for nailing it
all down. Miao also has a new OOM fix in this v2 pull as well.
Ilya fixed a regression Liu Bo found in the balance ioctls for pausing
and resuming a running balance across drives.
Josef's orphan truncate patch fixes an obscure corruption we'd see
during xfstests.
Arne's patches address problems with subvolume quotas. If the user
destroys quota groups incorrectly the FS will refuse to mount.
The rest are smaller fixes and plugs for memory leaks."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (30 commits)
Btrfs: fix repeated delalloc work allocation
Btrfs: fix wrong max device number for single profile
Btrfs: fix missed transaction->aborted check
Btrfs: Add ACCESS_ONCE() to transaction->abort accesses
Btrfs: put csums on the right ordered extent
Btrfs: use right range to find checksum for compressed extents
Btrfs: fix panic when recovering tree log
Btrfs: do not allow logged extents to be merged or removed
Btrfs: fix a regression in balance usage filter
Btrfs: prevent qgroup destroy when there are still relations
Btrfs: ignore orphan qgroup relations
Btrfs: reorder locks and sanity checks in btrfs_ioctl_defrag
Btrfs: fix unlock order in btrfs_ioctl_rm_dev
Btrfs: fix unlock order in btrfs_ioctl_resize
Btrfs: fix "mutually exclusive op is running" error code
Btrfs: bring back balance pause/resume logic
btrfs: update timestamps on truncate()
btrfs: fix btrfs_cont_expand() freeing IS_ERR em
Btrfs: fix a bug when llseek for delalloc bytes behind prealloc extents
Btrfs: fix off-by-one in lseek
...
Linus Torvalds [Thu, 24 Jan 2013 20:44:57 +0000 (12:44 -0800)]
Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM fixes from Russell King:
"A number of fixes:
Patrik found a problem with preempt counting in the VFP assembly
functions which can cause the preempt count to be upset.
Nicolas fixed a problem with the parsing of the DT when it straddles a
1MB boundary.
Subhash Jadavani reported a problem with sparsemem and our highmem
support for cache maintenance for DMA areas, and TI found a bug in
their strongly ordered memory mapping type.
Also, three fixes by way of Will Deacon's tree from Dave Martin for
instruction compatibility and Marc Zyngier to fix hypervisor boot mode
issues."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7629/1: mm: Fix missing XN flag for for MT_MEMORY_SO
ARM: DMA: Fix struct page iterator in dma_cache_maint() to work with sparsemem
ARM: 7628/1: head.S: map one extra section for the ATAG/DTB area
ARM: 7627/1: Predicate preempt logic on PREEMP_COUNT not PREEMPT alone
ARM: virt: simplify __hyp_stub_install epilog
ARM: virt: boot secondary CPUs through the right entry point
ARM: virt: Avoid bx instruction for compatibility with <=ARMv4
Linus Torvalds [Thu, 24 Jan 2013 20:42:50 +0000 (12:42 -0800)]
Merge tag 'fixes-for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Here's a long-pending fixes pull request for arm-soc (I didn't send
one in the -rc4 cycle).
The larger deltas are from:
- A fixup of error paths in the mvsdio driver
- Header file move for a driver that hadn't been properly converted
to multiplatform on i.MX, which was causing build failures when
included
- Device tree updates for at91 dealing mostly with their new pinctrl
setup merged in 3.8 and mistakes in those initial configs
The rest are the normal mix of small fixes all over the place; sunxi,
omap, imx, mvebu, etc, etc."
* tag 'fixes-for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (40 commits)
mfd: vexpress-sysreg: Don't skip initialization on probe
ARM: vexpress: Enable A7 cores in V2P-CA15_A7's Device Tree
ARM: vexpress: extend the MPIDR range used for pen release check
ARM: at91/dts: correct comment in at91sam9x5.dtsi for mii
ARM: at91/at91_dt_defconfig: add at91sam9n12 SoC to DT defconfig
ARM: at91/at91_dt_defconfig: remove memory specification to cmdline
ARM: at91/dts: add macb mii pinctrl config for kizbox
ARM: at91: rm9200: remake the BGA as default version
ARM: at91: fix gpios on i2c-gpio for RM9200 DT
ARM: at91/at91sam9x5 DTS: add SCK USART pins
ARM: at91/at91sam9x5 DTS: correct wrong PIO BANK values on u(s)arts
ARM: at91/at91-pinctrl documentation: fix typo and add some details
ARM: kirkwood: fix missing #interrupt-cells property
mmc: mvsdio: use devm_ API to simplify/correct error paths.
clk: mvebu/clk-cpu.c: fix memory leakage
ARM: OMAP2+: omap4-panda: add UART2 muxing for WiLink shared transport
ARM: OMAP2+: DT node Timer iteration fix
ARM: OMAP2+: Fix section warning for omap_init_ocp2scp()
ARM: OMAP2+: fix build break for omapdrm
ARM: OMAP2: Fix missing omap2xxx_clkt_vps_late_init function calls
...
Linus Torvalds [Thu, 24 Jan 2013 18:19:13 +0000 (10:19 -0800)]
Merge tag 'pm+acpi-for-3.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
- Two cpuidle initialization fixes from Konrad Rzeszutek Wilk.
- cpufreq regression fixes for AMD processors from Borislav Petkov,
Stefan Bader, and Matthew Garrett.
- ACPI cpufreq fix from Thomas Schlichter.
- cpufreq and devfreq fixes related to incorrect usage of operating
performance points (OPP) framework and RCU from Nishanth Menon.
- APEI workaround for incorrect BIOS information from Lans Zhang.
* tag 'pm+acpi-for-3.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: Add module aliases for acpi-cpufreq
ACPI: Check MSR valid bit before using P-state frequencies
PM / devfreq: exynos4_bus: honor RCU lock usage
PM / devfreq: add locking documentation for recommended_opp
cpufreq: cpufreq-cpu0: use RCU locks around usage of OPP
cpufreq: OMAP: use RCU locks around usage of OPP
ACPI, APEI: Fixup incorrect 64-bit access width firmware bug
ACPI / processor: Get power info before updating the C-states
powernow-k8: Add a kconfig dependency on acpi-cpufreq
ACPI / cpuidle: Fix NULL pointer issues when cpuidle is disabled
intel_idle: Don't register CPU notifier if we are not running.
Linus Torvalds [Thu, 24 Jan 2013 18:18:37 +0000 (10:18 -0800)]
Merge tag 'regmap-fix-3.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"One more oversight in the debugfs code was reported and fixed, plus a
documentation fix."
* tag 'regmap-fix-3.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: fix small typo in regmap_bulk_write comment
regmap: debugfs: Fix seeking from the cache
Linus Torvalds [Thu, 24 Jan 2013 18:17:49 +0000 (10:17 -0800)]
Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine fixes from Vinod Koul:
"A few fixes on slave dmaengine. There are trivial fixes in imx-dma,
tegra-dma & ioat driver"
* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
dma: tegra: implement flags parameters for cyclic transfer
dmaengine: imx-dma: Disable use of hw_chain to fix sg_dma transfers.
ioat: Fix DMA memory sync direction correct flag
Miao Xie [Tue, 22 Jan 2013 10:49:00 +0000 (10:49 +0000)]
Btrfs: fix repeated delalloc work allocation
btrfs_start_delalloc_inodes() locks the delalloc_inodes list, fetches the
first inode, unlocks the list, triggers
btrfs_alloc_delalloc_work/btrfs_queue_worker for this inode, then locks the
list again and checks its head. But because the inode it just dealt with is
not deleted from the list, it fetches the same inode again. As a result, this
function allocates a huge number of btrfs_delalloc_work structures, and OOM
happens.
Fix this problem by splicing the delalloc list.
Reported-by: Alex Lyakas <alex.btrfs@zadarastorage.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
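The splice approach can be sketched with a plain singly linked list in user space. All types and names here are hypothetical and locking is omitted; the point is only that each entry leaves the list being walked before its work item is created.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the inode and the delalloc list. */
struct sketch_inode {
    int id;
    struct sketch_inode *next;
};

struct sketch_list {
    struct sketch_inode *head;
};

/* Move every entry to a private list and leave the shared one empty. */
static struct sketch_list sketch_splice_init(struct sketch_list *shared)
{
    struct sketch_list priv = { shared->head };
    shared->head = NULL;
    return priv;
}

/*
 * Walk the private list, detaching each inode before "queueing" work
 * for it, so no inode can be fetched twice.
 * Returns the number of work items that would have been allocated.
 */
static int sketch_start_delalloc(struct sketch_list *shared)
{
    struct sketch_list priv = sketch_splice_init(shared);
    int works = 0;

    while (priv.head) {
        struct sketch_inode *inode = priv.head;

        priv.head = inode->next;  /* detach first */
        inode->next = NULL;
        works++;                  /* stands in for one work allocation */
    }
    return works;
}
```

The number of allocations is bounded by the number of entries that were on the list at splice time.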
Miao Xie [Wed, 16 Jan 2013 11:27:17 +0000 (11:27 +0000)]
Btrfs: fix wrong max device number for single profile
The max device number of single profile is 1, not 0 (0 means 'as many as
possible'). Fix it.
Cc: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Miao Xie [Tue, 15 Jan 2013 06:29:12 +0000 (06:29 +0000)]
Btrfs: fix missed transaction->aborted check
First, although the current transaction->aborted check can stop the commit
early and avoid unnecessary operations, it happens too early: some transaction
handles have not ended yet, and those handles may set transaction->aborted
after the check.
Second, when we commit the transaction, we wake up some worker threads to
flush the space cache and inode cache. Those threads also allocate transaction
handles and may set transaction->aborted if a serious error happens.
So we need more checks for ->aborted when committing the transaction. Fix it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Miao Xie [Tue, 15 Jan 2013 06:27:25 +0000 (06:27 +0000)]
Btrfs: Add ACCESS_ONCE() to transaction->abort accesses
We may access and update transaction->aborted on different CPUs without a
lock, so we need the ACCESS_ONCE() wrapper to prevent the compiler from
creating unsolicited accesses and to make sure we get the right value.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
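For readers unfamiliar with the wrapper, a user-space sketch of the same idiom follows. The macro has the shape of the kernel's ACCESS_ONCE() of that era and relies on the GNU __typeof__ extension (GCC and Clang); the struct and helper functions are hypothetical.

```c
/* Force exactly one access through a volatile-qualified lvalue. */
#define SKETCH_ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the transaction structure. */
struct sketch_transaction {
    int aborted;
};

/* Read the flag once; the compiler may not re-read or cache it. */
static int sketch_transaction_aborted(struct sketch_transaction *t)
{
    return SKETCH_ACCESS_ONCE(t->aborted);
}

/* Store the error code with a single write. */
static void sketch_abort_transaction(struct sketch_transaction *t, int err)
{
    SKETCH_ACCESS_ONCE(t->aborted) = err;
}
```

The cast constrains only the compiler: it provides neither atomicity for wider types nor any ordering against other memory accesses.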
Josef Bacik [Tue, 22 Jan 2013 20:43:09 +0000 (15:43 -0500)]
Btrfs: put csums on the right ordered extent
I noticed a WARN_ON going off when adding csums because we were going over
the amount of csum bytes that should have been allowed for an ordered
extent. This is a leftover from when we used to hold the csums privately
for direct io, but now we use the normal ordered sum stuff so we need to
make sure and check if we've moved on to another extent so that the csums
are added to the right extent. Without this we could end up with csums for
bytenrs that don't have extents to cover them yet. Thanks,
Liu Bo [Sun, 6 Jan 2013 03:38:22 +0000 (03:38 +0000)]
Btrfs: use right range to find checksum for compressed extents
For compressed extents, the checksum range is covered by the disk length,
and the disk length differs from the ram length, so we need to use the disk
length instead to get the right checksum.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Josef Bacik [Tue, 18 Dec 2012 16:39:19 +0000 (11:39 -0500)]
Btrfs: fix panic when recovering tree log
A user reported a BUG_ON(ret) that occurred during tree log replay. Ret was
-EAGAIN, so what I think happened is that we removed an extent that covered
a bitmap entry and an extent entry. We remove the part from the bitmap and
return -EAGAIN and then search for the next piece we want to remove, which
happens to be an entire extent entry, so we just free the sucker and return.
The problem is ret is still set to -EAGAIN so we trip the BUG_ON(). The
user used btrfs-zero-log so I'm not 100% sure this is what happened so I've
added a WARN_ON() to catch the other possibility. Thanks,
Reported-by: Jan Steffens <jan.steffens@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Josef Bacik [Thu, 24 Jan 2013 17:02:07 +0000 (12:02 -0500)]
Btrfs: do not allow logged extents to be merged or removed
We drop the extent map tree lock while we're logging extents, so somebody
could come in and merge another extent into this one and screw up our
logging, or they could even remove us from the list which would keep us from
logging the extent or freeing our ref on it, so we need to make sure to not
clear LOGGING until after the extent is logged, and then we can merge it to
adjacent extents. Thanks,
Olof Johansson [Thu, 24 Jan 2013 16:12:24 +0000 (08:12 -0800)]
Merge branch 'vexpress/fixes' of git://git.linaro.org/people/pawelmoll/linux into fixes
From Pawel Moll:
- makes the V2P-CA15_A7 (a.k.a. TC2) work with 3.8 kernels
- improves vexpress-sysreg.c behaviour on arm64 platforms
* 'vexpress/fixes' of git://git.linaro.org/people/pawelmoll/linux:
mfd: vexpress-sysreg: Don't skip initialization on probe
ARM: vexpress: Enable A7 cores in V2P-CA15_A7's Device Tree
ARM: vexpress: extend the MPIDR range used for pen release check
Olof Johansson [Thu, 24 Jan 2013 15:49:49 +0000 (07:49 -0800)]
Merge tag 'at91-fixes' of git://github.com/at91linux/linux-at91 into fixes
From Nicolas Ferre:
Here are fixes for AT91 that are mainly related to device tree.
One RM9200 setup option is the only C code change.
Some documentation changes can clarify the pinctrl use.
Then, some defconfig modifications are allowing the affected platforms
to boot.
* tag 'at91-fixes' of git://github.com/at91linux/linux-at91:
ARM: at91/dts: correct comment in at91sam9x5.dtsi for mii
ARM: at91/at91_dt_defconfig: add at91sam9n12 SoC to DT defconfig
ARM: at91/at91_dt_defconfig: remove memory specification to cmdline
ARM: at91/dts: add macb mii pinctrl config for kizbox
ARM: at91: rm9200: remake the BGA as default version
ARM: at91: fix gpios on i2c-gpio for RM9200 DT
ARM: at91/at91sam9x5 DTS: add SCK USART pins
ARM: at91/at91sam9x5 DTS: correct wrong PIO BANK values on u(s)arts
ARM: at91/at91-pinctrl documentation: fix typo and add some details
Pawel Moll [Tue, 27 Nov 2012 16:48:50 +0000 (16:48 +0000)]
mfd: vexpress-sysreg: Don't skip initialization on probe
The vexpress-sysreg driver does not have to be initialized
early, when the platform doesn't require this. Unfortunately
in such case it wasn't initialized correctly - master site
lookup and config bridge registration were missing. Fixed now.
Pawel Moll [Thu, 24 Jan 2013 11:48:54 +0000 (11:48 +0000)]
ARM: vexpress: Enable A7 cores in V2P-CA15_A7's Device Tree
As the kernel is able to cope with multiple clusters,
uncomment the A7 cores in the Device Tree for V2P-CA15_A7
tile, making all 5 cores available to the user.
ARM: vexpress: extend the MPIDR range used for pen release check
In ARM multi-cluster systems the MPIDR affinity level 0 cannot be used as a
single cpu identifier, affinity levels 1 and 2 must be taken into account as
well.
This patch extends the MPIDR usage to affinity levels 1 and 2 in versatile
secondary cores start up code in order to compare the passed pen_release
value with the full-blown affinity mask.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
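Why affinity level 0 alone is not enough can be seen in a small sketch. MPIDR keeps affinity level 0 in bits [7:0], level 1 in bits [15:8] and level 2 in bits [23:16]; the macro and function names are hypothetical, and the comparison is a C model of the idea rather than the assembly in the patch.

```c
#include <stdint.h>

#define SKETCH_MPIDR_AFF0_MASK   0x000000ffu  /* affinity level 0 only */
#define SKETCH_MPIDR_AFF012_MASK 0x00ffffffu  /* affinity levels 0, 1 and 2 */

/*
 * A secondary CPU leaves the holding pen when its own id, taken from
 * MPIDR under the given mask, equals the pen_release value.
 */
static int sketch_is_released(uint32_t mpidr, uint32_t pen_release,
                              uint32_t mask)
{
    return (mpidr & mask) == pen_release;
}
```

CPU 0 of cluster 1 has affinity 0x100. With the level-0 mask it looks identical to CPU 0 of cluster 0 and would leave the pen when that CPU is released; with the wider mask the two are told apart.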
Olof Johansson [Thu, 24 Jan 2013 04:30:52 +0000 (20:30 -0800)]
Merge tag 'mvebu_fixes_for_v3.8-rc5' of git://git.infradead.org/users/jcooper/linux into fixes
From Jason Cooper:
mvebu fixes for v3.8-rc5
- fix memory leak in mvebu/clk-cpu.c
- use devm_ to correct/simplify error paths in mvsdio
- add missing #interrupt-cells property in kirkwood
* tag 'mvebu_fixes_for_v3.8-rc5' of git://git.infradead.org/users/jcooper/linux:
ARM: kirkwood: fix missing #interrupt-cells property
mmc: mvsdio: use devm_ API to simplify/correct error paths.
clk: mvebu/clk-cpu.c: fix memory leakage
Linus Torvalds [Thu, 24 Jan 2013 04:11:35 +0000 (20:11 -0800)]
Merge tag 'usb-3.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull more USB fixes from Greg Kroah-Hartman:
"Here are some more USB fixes for the 3.8-rc4 tree.
Some gadget driver fixes, and finally resolved the ehci-mxc driver
build issues (it's just some code moving around and being deleted)."
* tag 'usb-3.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: EHCI: fix build error in ehci-mxc
USB: EHCI: add a name for the platform-private field
USB: EHCI: fix incorrect configuration test
USB: EHCI: Move definition of EHCI_STATS to ehci.h
USB: UHCI: fix IRQ race during initialization
usb: gadget: FunctionFS: Fix missing braces in parse_opts
usb: dwc3: gadget: fix ep->maxburst for ep0
ARM: i.MX clock: Change the connection-id for fsl-usb2-udc
usb: gadget: fsl_mxc_udc: replace MX35_IO_ADDRESS to ioremap
usb: gadget: fsl-mxc-udc: replace cpu_is_xxx() with platform_device_id
usb: musb: cppi_dma: drop '__init' annotation
Linus Torvalds [Thu, 24 Jan 2013 04:10:48 +0000 (20:10 -0800)]
Merge tag 'char-misc-3.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull drivers/misc fix from Greg Kroah-Hartman:
"Here is a single revert for the ti-st misc driver, fixing problem that
was introduced in 3.7-rc1 that has been bothering people."
* tag 'char-misc-3.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Revert "drivers/misc/ti-st: remove gpio handling"
Linus Torvalds [Thu, 24 Jan 2013 04:09:58 +0000 (20:09 -0800)]
Merge tag 'tty-3.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull a TTY maintainer patch from Greg Kroah-Hartman:
"Just a MAINTAINERS update, now that Alan has left for a bit, I'll
continue to watch over the serial drivers."
* tag 'tty-3.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
MAINTAINERS: Someone needs to watch over the serial drivers
Linus Torvalds [Thu, 24 Jan 2013 04:07:12 +0000 (20:07 -0800)]
Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- gspca: add needed delay for I2C traffic for sonixb/sonixj cameras
- gspca: add one missing Kinect USB ID
- usbvideo: some regression fixes
- omap3isp: fix some build issues
- videobuf2: fix video output handling
- exynos s5p/m5mols: a few regression fixes.
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] uvcvideo: Set error_idx properly for S_EXT_CTRLS failures
[media] uvcvideo: Cleanup leftovers of partial revert
[media] uvcvideo: Return -EACCES when trying to set a read-only control
[media] omap3isp: Don't include <plat/cpu.h>
[media] s5p-mfc: Fix interrupt error handling routine
[media] s5p-fimc: Fix return value of __fimc_md_create_flite_source_links()
[media] m5mols: Fix typo in get_fmt callback
[media] v4l: vb2: Set data_offset to 0 for single-plane output buffers
[media] [FOR,v3.8] omap3isp: Don't include deleted OMAP plat/ header files
[media] gspca_sonixj: Add a small delay after i2c_w1
[media] gspca_sonixb: Properly wait between i2c writes
[media] gspca_kinect: add Kinect for Windows USB id
Linus Torvalds [Wed, 23 Jan 2013 21:31:15 +0000 (13:31 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k fixes from Geert Uytterhoeven:
"The asm-generic changeset has been ack'ed by Arnd."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Wire up finit_module
asm-generic/dma-mapping-broken.h: Provide dma_alloc_attrs()/dma_free_attrs()
m68k: Provide dma_alloc_attrs()/dma_free_attrs()
Linus Torvalds [Wed, 23 Jan 2013 21:28:17 +0000 (13:28 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64
Pull arm64 fixes from Catalin Marinas:
- ELF coredump fix (more registers dumped than what user space expects)
- SUBARCH name generation (s/aarch64/arm64/)
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
arm64: makefile: fix uname munging when setting ARCH on native machine
arm64: elf: fix core dumping to match what glibc expects
Alan Stern [Wed, 23 Jan 2013 18:26:15 +0000 (13:26 -0500)]
USB: EHCI: fix build error in ehci-mxc
This patch (as1643b) fixes a build error in ehci-hcd when compiling for
ARM with allmodconfig:
drivers/usb/host/ehci-hcd.c:1285:0: warning: "PLATFORM_DRIVER" redefined [enabled by default]
drivers/usb/host/ehci-hcd.c:1255:0: note: this is the location of the previous definition
drivers/usb/host/ehci-mxc.c:280:31: warning: 'ehci_mxc_driver' defined but not used [-Wunused-variable]
drivers/usb/host/ehci-hcd.c:1285:0: warning: "PLATFORM_DRIVER" redefined [enabled by default]
drivers/usb/host/ehci-hcd.c:1255:0: note: this is the location of the previous definition
The fix is to convert ehci-mxc over to the new "ehci-hcd is a library"
scheme so that it can coexist peacefully with the ehci-platform
driver. As part of the conversion the ehci_mxc_priv data structure,
which was allocated dynamically, is now placed where it belongs: in
the private area at the end of struct ehci_hcd.
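The "private area at the end of the structure" layout can be sketched in plain C with a flexible array member. This is a userspace illustration under assumptions, not the kernel code: struct host, struct glue_priv, host_alloc() and host_to_priv() are hypothetical stand-ins for struct ehci_hcd, ehci_mxc_priv and their accessors.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a host controller structure. */
struct host {
	int regs_mapped;
	/* Driver-private area, sized by the glue driver at allocation. */
	max_align_t priv[];
};

/* Hypothetical stand-in for the glue driver's private data. */
struct glue_priv {
	int clk_enabled;
	int phy_id;
};

/* Allocate the host and its private area as one zeroed block. */
static struct host *host_alloc(size_t priv_size)
{
	return calloc(1, sizeof(struct host) + priv_size);
}

/* The private data lives directly behind the host structure. */
static struct glue_priv *host_to_priv(struct host *h)
{
	return (struct glue_priv *)h->priv;
}
```

With this layout there is no separate allocation to track: a single free() of the host releases the private data as well.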
Linus Torvalds [Wed, 23 Jan 2013 17:42:46 +0000 (09:42 -0800)]
Merge tag 'sound-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Only a few small HD-audio fixes:
- Addition of new Conexant codec IDs
- Two one-liners to add fixups for Realtek codecs
- A last-minute regression fix for auto-mute with power-saving mode
(regressed since 3.8-rc1)"
* tag 'sound-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix inconsistent pin states after resume
ALSA: hda - Add Conexant CX20755/20756/20757 codec IDs
ALSA: hda - Add fixup for Acer AO725 laptop
ALSA: hda - Fix mute led for another HP machine
Takashi Iwai [Wed, 23 Jan 2013 14:58:40 +0000 (15:58 +0100)]
ALSA: hda - Fix inconsistent pin states after resume
The commit [26a6cb6c: ALSA: hda - Implement a poll loop for jacks as a
module parameter] introduced the polling jack detection code, but it
also moved the call of snd_hda_jack_set_dirty_all() in the resume path
after resume/init ops call. This caused a regression when the jack
state has been changed during power-down (e.g. in the power save
mode). Since no dirty flag is set, the driver doesn't probe the new jack
state but keeps using the cached value, so the pin state also remains
as if the jack were still plugged.
The fix is simply moving snd_hda_jack_set_dirty_all() to the original
position.
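The ordering problem can be illustrated with a small model of a cached jack state. This is a sketch under assumptions, not the HD-audio driver code: struct jack, jack_plugged() and resume() are hypothetical names, and the mark_dirty_first flag exists only to contrast the two orderings.

```c
#include <stdbool.h>

/* Hypothetical cached jack state; not the real snd_hda structures. */
struct jack {
	bool cached_plugged;   /* last state read from hardware */
	bool dirty;            /* true when the cache must be refreshed */
	bool hw_plugged;       /* stand-in for the hardware pin sense */
};

/* Return the jack state, probing hardware only when the cache is dirty. */
static bool jack_plugged(struct jack *j)
{
	if (j->dirty) {
		j->cached_plugged = j->hw_plugged;
		j->dirty = false;
	}
	return j->cached_plugged;
}

/* Hypothetical resume path; returns the state the init code observed. */
static bool resume(struct jack *j, bool mark_dirty_first)
{
	bool seen;

	if (mark_dirty_first)
		j->dirty = true;       /* original position: cache is refreshed */
	seen = jack_plugged(j);        /* init ops read the jack state */
	if (!mark_dirty_first)
		j->dirty = true;       /* too late: init already used stale data */
	return seen;
}
```

If the jack was unplugged during power-down, marking the cache dirty before the init step lets it see the new state, while marking it afterwards leaves the init step with the stale "plugged" value.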
Boris BREZILLON [Thu, 13 Dec 2012 14:03:08 +0000 (14:03 +0000)]
ARM: at91/dts: add macb mii pinctrl config for kizbox
This patch overrides default macb pinctrl config defined in
at91sam9260.dtsi (pinctrl_macb_rmii) with kizbox board config
(pinctrl_macb_rmii + pinctrl_macb_rmii_mii_alt).
Signed-off-by: Boris BREZILLON <linux-arm@overkiz.com>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
When it goes to error through line 144, the memory allocated to *devname is
not freed, and the caller doesn't free it either in line 250. So we free the
memory of *devname in function cifs_compose_mount_options() when it goes to
error.
Signed-off-by: Cong Ding <dinggnu@gmail.com>
CC: stable <stable@kernel.org>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
The reason is that it broke TI WiLink shared transport on Panda.
Also, callback functions should not be added to board files anymore,
so revert to implementing the power functions in the driver itself.
Additionally, changed a variable name ('status' to 'err') so that this
revert compiles properly.
Cong Ding [Tue, 15 Jan 2013 18:44:26 +0000 (19:44 +0100)]
clk: mvebu/clk-cpu.c: fix memory leakage
The variables cpuclk and clk_name should be properly freed when an error occurs.
Signed-off-by: Cong Ding <dinggnu@gmail.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
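The kind of error path this fix calls for can be sketched as follows. This is an illustration, not the code in clk-cpu.c: struct cpu_clk, struct clk_tables and clk_tables_init() are hypothetical, and the fail_second parameter is a hook added here only to simulate an allocation failure.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the per-CPU clock data in the commit. */
struct cpu_clk { int id; };

struct clk_tables {
	struct cpu_clk *cpuclk;
	char **clk_name;
};

/*
 * Allocate both tables, or neither. Returns 0 on success, -1 on failure.
 * fail_second simulates the second allocation failing.
 */
static int clk_tables_init(struct clk_tables *t, size_t ncpus, int fail_second)
{
	t->cpuclk = calloc(ncpus, sizeof(*t->cpuclk));
	if (!t->cpuclk)
		goto err_cpuclk;

	t->clk_name = fail_second ? NULL : calloc(ncpus, sizeof(*t->clk_name));
	if (!t->clk_name)
		goto err_clk_name;

	return 0;

err_clk_name:
	free(t->cpuclk);        /* undo the first allocation */
	t->cpuclk = NULL;
err_cpuclk:
	t->clk_name = NULL;
	return -1;
}
```

Unwinding through labels in reverse allocation order means every exit after a failure releases exactly what was allocated before it.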
Linus Torvalds [Wed, 23 Jan 2013 00:36:23 +0000 (16:36 -0800)]
Merge tag '3.8-pci-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"The most important is a fix for a pciehp deadlock that occurs when
unplugging a Thunderbolt adapter. We also applied the same fix to
shpchp, removed CONFIG_EXPERIMENTAL dependencies, fixed a
pcie_aspm=force problem, and fixed a refcount leak.
Details:
- Hotplug
PCI: pciehp: Use per-slot workqueues to avoid deadlock
PCI: shpchp: Make shpchp_wq non-ordered
PCI: shpchp: Handle push button event asynchronously
PCI: shpchp: Use per-slot workqueues to avoid deadlock
- Power management
PCI: Allow pcie_aspm=force even when FADT indicates it is unsupported
* tag '3.8-pci-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: remove depends on CONFIG_EXPERIMENTAL
PCI: Allow pcie_aspm=force even when FADT indicates it is unsupported
PCI: shpchp: Use per-slot workqueues to avoid deadlock
PCI: shpchp: Handle push button event asynchronously
PCI: shpchp: Make shpchp_wq non-ordered
PCI/AER: pci_get_domain_bus_and_slot() call missing required pci_dev_put()
PCI: pciehp: Use per-slot workqueues to avoid deadlock
Tejun Heo [Wed, 23 Jan 2013 00:15:15 +0000 (16:15 -0800)]
async: fix __lowest_in_progress()
Commit 083b804c4d3e ("async: use workqueue for worker pool") made it
possible that async jobs are moved from pending to running out-of-order.
While pending async jobs will be queued and dispatched for execution in
the same order, nothing guarantees they'll enter "1) move self to the
running queue" of async_run_entry_fn() in the same order.
Before the conversion, async implemented its own worker pool. An async
worker, upon being woken up, fetches the first item from the pending
list, which kept the executing lists sorted. The conversion to
workqueue was done by adding work_struct to each async_entry and async
just schedules the work item. The queueing and dispatching of such work
items are still in order but now each worker thread is associated with a
specific async_entry and moves that specific async_entry to the
executing list. So, depending on which worker reaches that point
earlier, which is non-deterministic, we may end up moving an async_entry
with larger cookie before one with smaller one.
This broke __lowest_in_progress(). running->domain may not be properly
sorted and is not guaranteed to contain lower cookies than pending list
when not empty. Fix it by ensuring sort-inserting to the running list
and always looking at both pending and running when trying to determine
the lowest cookie.
Over time, the async synchronization implementation became quite messy.
We better restructure it such that each async_entry is linked to two
lists - one global and one per domain - and not move it when execution
starts. There's no reason to distinguish pending and running. They
behave the same for synchronization purposes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
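The two parts of the fix described above, sort-inserting into the running list and consulting both lists for the lowest cookie, can be sketched in userspace C. This is a simplified model under assumptions: struct entry, sorted_insert() and lowest_in_progress() are hypothetical, and a singly linked list stands in for the kernel's list_head and async_entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified entry; the kernel uses list_head and async_entry. */
struct entry {
	uint64_t cookie;
	struct entry *next;
};

/* Insert e so that the list stays sorted by ascending cookie. */
static void sorted_insert(struct entry **head, struct entry *e)
{
	while (*head && (*head)->cookie < e->cookie)
		head = &(*head)->next;
	e->next = *head;
	*head = e;
}

/*
 * Lowest cookie still in progress, looking at both sorted lists.
 * Returns next_cookie when both lists are empty.
 */
static uint64_t lowest_in_progress(const struct entry *running,
				   const struct entry *pending,
				   uint64_t next_cookie)
{
	uint64_t lowest = next_cookie;

	if (running && running->cookie < lowest)
		lowest = running->cookie;
	if (pending && pending->cookie < lowest)
		lowest = pending->cookie;
	return lowest;
}
```

Because both lists are kept sorted, only their heads need to be compared, and an entry that reaches the running list out of order (cookie 5 before cookie 3, say) still ends up behind the lower cookie.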