git.karo-electronics.de Git - linux-beck.git/log

m68k: use tracehook_report_syscall_entry/exit for ColdFire MMU ptrace path

The existing ColdFire code (which is all non-mmu) for system call entry
and exit uses the more modern tracehook_report_syscall_entry()/exit()
into the ptrace code. Now that we are supporting ColdFire with MMU we
need the same hooks for these.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: ColdFire V4e MMU context support code

Add code to manage the context's of the ColdFire V4e MMU. This code is
mostly taken from the Freescale 2.6.35 kernel BSP for MMU enabled ColdFire.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: MMU enabled ColdFire needs 8k ELF alignment

Like the SUN3 hardware MMU the ColdFire MMU uses 8k pages. So we want
our ELF page size alingment to also be 8k. Modify the ELF alignment
setting.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: set ColdFire MMU page size

We use the ColdFire V4e MMU page size of 8KiB. Define PAGE_SHIFT
appropriately.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: define PAGE_OFFSET_RAW for ColdFire CPU with MMU enabled

The ColdFire CPU configurations need PAGE_OFFSET_RAW set to the base of
their RAM. It doesn't matter if they are running with the MMU enabled or
disabled, it is always set to the base of RAM.

We can keep the choices simple here and key of CONFIG_RAMBASE. If it is
defined we are on a plaftorm (ColdFire or other non-MMU systems) which
have a configurable RAM base, just use it.

Reported-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

m68k: add TLB flush support for the ColdFire V4e MMU hardware

The ColdFire V4e MMU is unlike any of the other m68k MMU hardware.
It needs its own TLB flush support code.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: modify ColdFire 54xx cache support for MMU enabled

Modify the cache setup for the ColdFire 54xx parts when running with
the MMU enabled.

We want to map the peripheral register space (MBAR region) as non
cacheable. And create an identity mapping for all of RAM for the
kernel.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: add cache support for V4e ColdFire cores running with MMU enabled

Add code to deal with instruction, data and branch caches of the V4e
ColdFire cores when they are running with the MMU enabled.

This code is loosely based on Freescales changes for the caches of the
V4e ColdFire in the 2.6.25 kernel BSP. That code was originally by
Kurt Mahan <kmahan@freescale.com> (now <kmahan@xmission.com>).

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: add ColdFire paging exception handling code

Add code to traps.c to handle MMU exceptions for the ColdFire.
Most of this code is from the 2.6.25 kernel BSP code released by
Freescale.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: add page table size definitions for ColdFire V4e MMU

Define the page table size and attributes for the ColdFire V4e MMU.
Also setup the vmalloc and kmap regions we will use.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: page table support definitions and code for ColdFire MMU

The ColdFire V4e MMU is nothing like any of the other m68k MMU's.
So we need to create a set of definitions and support routines
for the kernels paging functions.

This is largely taken from Freescales BSP code for this (though it
was a 2.6.25 kernel). I have cleaned it up alot from the original.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: set register a2 to current if MMU enabled on ColdFire

Virtual memory m68k systems build with register a2 dedicated to being the
current proc pointer (non-MMU don't do this). Add code to the ColdFire
interrupt and exception processing to set this on entry, and at context
switch time. We use the same GET_CURRENT() macro that MMU enabled code
uses - modifying it so that the assembler is ColdFire clean.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

m68k: add ColdFire 54xx CPU MMU memory init code

Add code to the 54xx ColdFire CPU init to setup memory ready for the m68k
paged memory start up.

Some of the RAM variables that were specific to the non-mmu code paths
now need to be used during this setup, so when CONFIG_MMU is enabled.
Move these out of page_no.h and into page.h.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

m68k: init the MMU hardware for the 54xx ColdFire

The 54xx ColdFire CPU family has an internal MMU. Up to now though we
have only supported running on them with the MMU disabled.

Add code to the 54xx ColdFire init sequence to initialize the bootmem
used by the usual MMU m68k code for paging init.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: use addr_limit checking for m68k CPUs that do no support address spaces

The ColdFire CPU family, and the original 68000, do not support separate
address spaces like the other 680x0 CPU types. Modify the set_fs()/get_fs()
functions and macros to use a thread_info addr_limit for address space
checking. This is pretty much what all other architectures that do not
support separate setable address spaces do.

Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: modify user space access functions to support ColdFire CPUs

Modify the user space access functions to support the ColdFire V4e cores
running with MMU enabled.

The ColdFire processors do not support the "moves" instruction used by
the traditional 680x0 processors for moving data into and out of another
address space. They only support the notion of a single address space,
and you use the usual "move" instruction to access that.

Create a new config symbol (CONFIG_CPU_HAS_ADDRESS_SPACES) to mark the
CPU types that support separate address spaces, and thus also support
the sfc/dfc registers and the "moves" instruction that go along with that.

The code is almost identical for user space access, so lets just use a
define to choose either the "move" or "moves" in the assembler code.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

m68k: add TASK definitions for ColdFires running with MMU

Add appropriate TASK_SIZE and TASK_UNMAPPED_BASE definitions for running
on ColdFire V4e cores with MMU enabled.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: make interrupt definitions conditional on correct CPU types

The interrupt handling support defines and code is not so much conditional
on an MMU being present (CONFIG_MMU), as it is on which type of CPU we are
building for. So make the code conditional on the CPU types instead. The
current irq.h is mostly specific to the interrupt code for the 680x0 CPUs,
so it should only be used for them.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: definitions for the ColdFire V4e MMU hardware

Basic register level definitions to support the internal MMU of the
V4e ColdFire cores.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: show ColdFire CPU/FPU/MMU type

Update the show_cpuinfo() code to display info about ColdFire cores.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68k: add machine and CPU definitions for ColdFire cores

Create machine and CPU definitions to support the ColdFire CPU family
members that have a virtual memory management unit.

The ColdFire V4e core contains an MMU, and it is quite different to
any other 68k family members.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Matt Waddel <mwaddel@yahoo.com>
Acked-by: Kurt Mahan <kmahan@xmission.com>

m68knommu: remove no longer used rom_length from Palm/Pilot start up code

Compiling for the m68knommu/68328 Palm/Pilot target you get:

LD vmlinux
arch/m68k/platform/68328/head.o: In function `L3':
(.text+0x170): undefined reference to `rom_length'

"rom_length" is not used any longer by any of the m68knommu code.
So remove it from here too.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>

m68knommu: fix broken boot logo inclusion

Compiling for the m68knommu/68328 Palm/Pilot target you get:

AS arch/m68k/platform/68328/head-pilot.o
arch/m68k/platform/68328/head-pilot.S:37:23: fatal error: bootlogo.rh: No such file or directory

The build for this target used to do a conversion on a C coded boot logo
and include this in the head assembler code. This got broken by changes to
the local Makefile.

Clean all this up by just including the C coded boot logo struct in the
C code. With the appropriate alignment attribute there is no difference
to the way it can be used.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>

m68k: consolidate the vmlinux.lds linker scripts

The merge of m68knommu left the linker scripts a little disorganized.
Some consistent naming and squashing two of scripts that just include
others can simplify things a lot.

So merge the two simple including scripts, and rename the nommu script
to be consistent with the existing m68k linker scripts.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>

m68knommu: remove unused anchor.h include file

The code that used the anchor.h include file has long been removed from
the kernel. Remove it too.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>

m68knommu: fix broken ColdFire slice timer read_clk() code

There is a race on reading the ColdFire slice timer current count and the
total clock count so far. Interrupts are off, and we may have just missed
getting a new timer wrap event interrupt. Check for this and adjust the
cycle count and current read count accordingly.

Also the slice timer counts down from the terminal count. So in read_clk()
we need take the current clock count away from the terminal count.

Reported-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>

m68knommu: disable cache early in startup for ColdFire

Disbale the CPU cache really early in the ColdFire startup code. We set
up some variables for RAM sizing and we want to make they stick in RAM.

Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>

m68k: handle presence of 64bit mul/div instructions cleanly

The traditional 68000 processors and the newer reduced instruction set
ColdFire processors do not support the 32*32->64 multiply or the 64/32->32
divide instructions. This is not a difference based on the presence of
a hardware MMU or not.

Create a new config symbol to mark that a CPU type doesn't support the
longer multiply/divide instructions. Use this then as a basis for using
the fast 64bit based divide (in div64.h) and for linking in the extra
libgcc functions that may be required (mulsi3, divsi3, etc).

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

m68k: simpler m68k and ColdFire CPU's can use generic csum code

We have two implementations of the IP checksuming code for the m68k arch.
One uses the more advanced instructions available in 68020 and above
processors, the other uses the simpler instructions available on the
original 68000 processors and the modern ColdFire processors.

This simpler code is pretty much the same as the generic lib implementation
of the IP csum functions. So lets just switch over to using that. That
means we can completely remove the checksum_no.c file, and only have the
local fast code used for the more complex 68k CPU family members.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>

m68k: make fp register stores consistent for m68k and ColdFire

There is no reason we can't make the saved fp registers the same for all
m68k types and ColdFire. There is a little wasted space, but the code
consistency and cleanliness is a big win.

sigcontext.h is an exported header, but currently there is no in-mainline
users of the !__uClinux__ and __mcoldfire__ case that this change effects.
Even better this change actually makes this structure consistent with
the out-of-mainline ColdFire/MMU code.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

m68knommu: no need to set register marker on traps

Commit 61619b12078dc8b85a3d4cbfa16f650daa341bd1 ("m68k: merge mmu and
non-mmu include/asm/entry.h files") made the trap entry code basically
the same for mmu and non-mmu builds. This means we no longer need code
to mark the stack frame as "system-call" type or other in the non-mmu
trap handling entry points. This is done in the SAVE_ALL_INT macro now.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>

m68k: support configure time command line for MMU m68k

The non-MMU builds of m68k allow a fixed kernel boot command line to
be configured at configure time. Allow this MMU builds as well.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>

m68k: print memory layout info in boot log

Output a table of the kernel memory regions at boot time.
This is taken directly from the ARM architecture code that does this.
The table looks like this:

Virtual kernel memory layout:
    vector  : 0x00000000 - 0x00000400   (   0 KiB)
    kmap    : 0xd0000000 - 0xe0000000   ( 256 MiB)
    vmalloc : 0xc0000000 - 0xcfffffff   ( 255 MiB)
    lowmem  : 0x00000000 - 0x02000000   (  32 MiB)
      .init : 0x00128000 - 0x00134000   (  48 KiB)
      .text : 0x00020000 - 0x00118d54   ( 996 KiB)
      .data : 0x00118d60 - 0x00126000   (  53 KiB)
      .bss  : 0x00134000 - 0x001413e0   (  53 KiB)

This has been very useful while debugging the ColdFire virtual memory
support code. But in general I think it is nice to know extacly where
the kernel has layed everything out on boot.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>

m68knommu: move definition of mach_gettod to where it is used

The mach_gettod function pointer is only called from the time_no.c
code. So move its actual definition to there too. It is currently in
setup_no.c for no particularly good reason.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>

m68k: selection of GENERIC_ATOMIC64 is not MMU specific

The selection of the CONFIG_GENERIC_ATOMIC64 option is not specific to the
MMU being present and enabled. It is a property of certain CPU families.
So select it based on those CPU types being selected.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>

m68k: remove thread_info struct from thread struct

Currently on m68k we have a comeplete thread_info structure stored inside
of the thread_struct, and we also have it in the initial part of the kernel
stack. Mostly the code currently uses the one inside of the thread_struct,
only using the "task" pointer from the stack based one.

This is wasteful and confusing, we should only have the single instance of
thread_info inside the stack page. And this is the norm for all other
architectures.

This change makes m68k handle thread_info consistently on both MMU enabled
and non-MMU setups.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>

m68k: remove duplicate asm offset for task thread.info

We have a duplicate name and definition for the offset of the thread.info
struct within the task struct in our asm-offsets.c code. Remove one of them,
and consolidate to use a single define, TASK_INFO.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

m68k: merge the init_task code for mmu and non-mmu targets

The init_task code can be the same for both mmu and non-mmu targets.
None of the alignment carried out in the the current init_task code
is necessary. The linker script takes care of aligning the init_thread
structure to a THREAD SIZE boundary, and that is all we need.

So use the init_task.c code for all target types, that makes m68k
code consistent with what most other architectures do.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

m68knommu: remove unused fasthandler declaration

The fasthandler code was removed long ago. Remove the now unused
declaration of it.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>

m68k: Fall back to __gpio_to_irq() for non-arch GPIOs

gpiolib provides __gpio_to_irq() to map gpiolib gpios to interrupts - hook
that up on m68k.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>

clocksource: m86k: Convert to clocksource_register_hz/khz

Updated to merge the valid bits of the two m68k patches.

This converts the m86k clocksources to use clocksource_register_hz/khz

This is untested, so any assistance in testing would be appreciated!

CC: Geert Uytterhoeven <geert@linux-m68k.org>
CC: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>

Linux 3.2-rc7

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
VFS: Fix race between CPU hotplug and lglocks

Merge tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux

for linus: writeback reason binary tracing format fix

* tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
writeback: show writeback reason with __print_symbolic

Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild

* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
kconfig: adapt update-po-config to new UML layout

Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] omap3isp: Fix crash caused by subdevs now having a pointer to devnodes

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: call d_instantiate after all ops are setup
Btrfs: fix worker lock misuse in find_worker

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix MSIQ HV call ordering in pci_sun4v_msiq_build_irq().

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  netfilter: xt_connbytes: handle negation correctly
  net: relax rcvbuf limits
  rps: fix insufficient bounds checking in store_rps_dev_flow_table_cnt()
  net: introduce DST_NOPEER dst flag
  mqprio: Avoid panic if no options are provided
  bridge: provide a mtu() method for fake_dst_ops

Merge branch 'nf' of git://1984.lsi.us.es/net

netfilter: xt_connbytes: handle negation correctly

"! --connbytes 23:42" should match if the packet/byte count is not in range.

As there is no explict "invert match" toggle in the match structure,
userspace swaps the from and to arguments
(i.e., as if "--connbytes 42:23" were given).

However, "what <= 23 && what >= 42" will always be false.

Change things so we use "||" in case "from" is larger than "to".

This change may look like it breaks backwards compatibility when "to" is 0.
However, older iptables binaries will refuse "connbytes 42:0",
and current releases treat it to mean "! --connbytes 0:42",
so we should be fine.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

Btrfs: call d_instantiate after all ops are setup

This closes races where btrfs is calling d_instantiate too soon during
inode creation. All of the callers of btrfs_add_nondir are updated to
instantiate after the inode is fully setup in memory.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix worker lock misuse in find_worker

Dan Carpenter noticed that we were doing a double unlock on the worker
lock, and sometimes picking a worker thread without the lock held.

This fixes both errors.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

net: relax rcvbuf limits

skb->truesize might be big even for a small packet.

Its even bigger after commit 87fb4b7b533 (net: more accurate skb
truesize) and big MTU.

We should allow queueing at least one packet per receiver, even with a
low RCVBUF setting.

Reported-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rps: fix insufficient bounds checking in store_rps_dev_flow_table_cnt()

Setting a large rps_flow_cnt like (1 << 30) on 32-bit platform will
cause a kernel oops due to insufficient bounds checking.

if (count > 1<<30) {
/* Enforce a limit to prevent overflow */
return -EINVAL;
}
count = roundup_pow_of_two(count);
table = vmalloc(RPS_DEV_FLOW_TABLE_SIZE(count));

Note that the macro RPS_DEV_FLOW_TABLE_SIZE(count) is defined as:

... + (count * sizeof(struct rps_dev_flow))

where sizeof(struct rps_dev_flow) is 8. (1 << 30) * 8 will overflow
32 bits.

This patch replaces the magic number (1 << 30) with a symbolic bound.

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: introduce DST_NOPEER dst flag

Chris Boot reported crashes occurring in ipv6_select_ident().

[  461.457562] RIP: 0010:[<ffffffff812dde61>]  [<ffffffff812dde61>]
ipv6_select_ident+0x31/0xa7

[  461.578229] Call Trace:
[  461.580742] <IRQ>
[  461.582870]  [<ffffffff812efa7f>] ? udp6_ufo_fragment+0x124/0x1a2
[  461.589054]  [<ffffffff812dbfe0>] ? ipv6_gso_segment+0xc0/0x155
[  461.595140]  [<ffffffff812700c6>] ? skb_gso_segment+0x208/0x28b
[  461.601198]  [<ffffffffa03f236b>] ? ipv6_confirm+0x146/0x15e
[nf_conntrack_ipv6]
[  461.608786]  [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77
[  461.614227]  [<ffffffff81271d64>] ? dev_hard_start_xmit+0x357/0x543
[  461.620659]  [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111
[  461.626440]  [<ffffffffa0379745>] ? br_parse_ip_options+0x19a/0x19a
[bridge]
[  461.633581]  [<ffffffff812722ff>] ? dev_queue_xmit+0x3af/0x459
[  461.639577]  [<ffffffffa03747d2>] ? br_dev_queue_push_xmit+0x72/0x76
[bridge]
[  461.646887]  [<ffffffffa03791e3>] ? br_nf_post_routing+0x17d/0x18f
[bridge]
[  461.653997]  [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77
[  461.659473]  [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge]
[  461.665485]  [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111
[  461.671234]  [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge]
[  461.677299]  [<ffffffffa0379215>] ?
nf_bridge_update_protocol+0x20/0x20 [bridge]
[  461.684891]  [<ffffffffa03bb0e5>] ? nf_ct_zone+0xa/0x17 [nf_conntrack]
[  461.691520]  [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge]
[  461.697572]  [<ffffffffa0374812>] ? NF_HOOK.constprop.8+0x3c/0x56
[bridge]
[  461.704616]  [<ffffffffa0379031>] ?
nf_bridge_push_encap_header+0x1c/0x26 [bridge]
[  461.712329]  [<ffffffffa037929f>] ? br_nf_forward_finish+0x8a/0x95
[bridge]
[  461.719490]  [<ffffffffa037900a>] ?
nf_bridge_pull_encap_header+0x1c/0x27 [bridge]
[  461.727223]  [<ffffffffa0379974>] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge]
[  461.734292]  [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77
[  461.739758]  [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge]
[  461.746203]  [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111
[  461.751950]  [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge]
[  461.758378]  [<ffffffffa037533a>] ? NF_HOOK.constprop.4+0x56/0x56
[bridge]

This is caused by bridge netfilter special dst_entry (fake_rtable), a
special shared entry, where attaching an inetpeer makes no sense.

Problem is present since commit 87c48fa3b46 (ipv6: make fragment
identifications less predictable)

Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and
__ip_select_ident() fallback to the 'no peer attached' handling.

Reported-by: Chris Boot <bootc@bootc.net>
Tested-by: Chris Boot <bootc@bootc.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mqprio: Avoid panic if no options are provided

Userspace may not provide TCA_OPTIONS, in fact tc currently does
so not do so if no arguments are specified on the command line.
Return EINVAL instead of panicing.

Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: provide a mtu() method for fake_dst_ops

Commit 618f9bc74a039da76 (net: Move mtu handling down to the protocol
depended handlers) forgot the bridge netfilter case, adding a NULL
dereference in ip_fragment().

Reported-by: Chris Boot <bootc@bootc.net>
CC: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-linus' of git://neil.brown.name/md

* 'for-linus' of git://neil.brown.name/md:
  md/bitmap: It is OK to clear bits during recovery.
  md: don't give up looking for spares on first failure-to-add
  md/raid5: ensure correct assessment of drives during degraded reshape.
  md/linear: fix hot-add of devices to linear arrays.

md/bitmap: It is OK to clear bits during recovery.

commit d0a4bb492772ce5c4bdfba3744a99ed6f6fb238f introduced a
regression which is annoying but fairly harmless.

When writing to an array that is undergoing recovery (a spare
in being integrated into the array), writing to the array will
set bits in the bitmap, but they will not be cleared when the
write completes.

For bits covering areas that have not been recovered yet this is not a
problem as the recovery will clear the bits.  However bits set in
already-recovered region will stay set and never be cleared.
This doesn't risk data integrity.  The only negatives are:
- next time there is a crash, more resyncing than necessary will
   be done.
- the bitmap doesn't look clean, which is confusing.

While an array is recovering we don't want to update the
'events_cleared' setting in the bitmap but we do still want to clear
bits that have very recently been set - providing they were written to
the recovering device.

So split those two needs - which previously both depended on 'success'
and always clear the bit of the write went to all devices.

Signed-off-by: NeilBrown <neilb@suse.de>

md: don't give up looking for spares on first failure-to-add

Before performing a recovery we try to remove any spares that
might not be working, then add any that might have become relevant.

Currently we abort on the first spare that cannot be added.
This is a false optimisation.
It is conceivable that - depending on rules in the personality - a
subsequent spare might be accepted.
Also the loop does other things like count the available spares and
reset the 'recovery_offset' value.

If we abort early these might not happen properly.

So remove the early abort.

In particular if you have an array what is undergoing recovery and
which has extra spares, then the recovery may not restart after as
reboot as the could of 'spares' might end up as zero.

Reported-by: Anssi Hannula <anssi.hannula@iki.fi>
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: ensure correct assessment of drives during degraded reshape.

While reshaping a degraded array (as when reshaping a RAID0 by first
converting it to a degraded RAID4) we currently get confused about
which devices are in_sync. In most cases we get it right, but in the
region that is being reshaped we need to treat non-failed devices as
in-sync when we have the data but haven't actually written it out yet.

Reported-by: Adam Kwolek <adam.kwolek@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md/linear: fix hot-add of devices to linear arrays.

commit d70ed2e4fafdbef0800e73942482bb075c21578b
broke hot-add to a linear array.
After that commit, metadata if not written to devices until they
have been fully integrated into the array as determined by
saved_raid_disk. That patch arranged to clear that field after
a recovery completed.

However for linear arrays, there is no recovery - the integration is
instantaneous. So we need to explicitly clear the saved_raid_disk
field.

Signed-off-by: NeilBrown <neilb@suse.de>

sparc64: Fix MSIQ HV call ordering in pci_sun4v_msiq_build_irq().

This silently was working for many years and stopped working on
Niagara-T3 machines.

We need to set the MSIQ to VALID before we can set it's state to IDLE.

On Niagara-T3, setting the state to IDLE first was causing HV_EINVAL
errors. The hypervisor documentation says, rather ambiguously, that
the MSIQ must be "initialized" before one can set the state.

I previously understood this to mean merely that a successful setconf()
operation has been performed on the MSIQ, which we have done at this
point. But it seems to also mean that it has been set VALID too.

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: Fix usb/isp1760 build on sparc
  usb: gadget: epautoconf: do not change number of streams
  usb: dwc3: core: fix cached revision on our structure
  usb: musb: fix reset issue with full speed device

Merge branch 'upstream-linus' of git://github.com/jgarzik/libata-dev

* 'upstream-linus' of git://github.com/jgarzik/libata-dev:
pata_of_platform: Add missing CONFIG_OF_IRQ dependency.

pata_of_platform: Add missing CONFIG_OF_IRQ dependency.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

ipv4: using prefetch requires including prefetch.h

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

VFS: Fix race between CPU hotplug and lglocks

Currently, the *_global_[un]lock_online() routines are not at all synchronized
with CPU hotplug. Soft-lockups detected as a consequence of this race was
reported earlier at https://lkml.org/lkml/2011/8/24/185. (Thanks to Cong Meng
for finding out that the root-cause of this issue is the race condition
between br_write_[un]lock() and CPU hotplug, which results in the lock states
getting messed up).

Fixing this race by just adding {get,put}_online_cpus() at appropriate places
in *_global_[un]lock_online() is not a good option, because, then suddenly
br_write_[un]lock() would become blocking, whereas they have been kept as
non-blocking all this time, and we would want to keep them that way.

So, overall, we want to ensure 3 things:
1. br_write_lock() and br_write_unlock() must remain as non-blocking.
2. The corresponding lock and unlock of the per-cpu spinlocks must not happen
   for different sets of CPUs.
3. Either prevent any new CPU online operation in between this lock-unlock, or
   ensure that the newly onlined CPU does not proceed with its corresponding
   per-cpu spinlock unlocked.

To achieve all this:
(a) We introduce a new spinlock that is taken by the *_global_lock_online()
    routine and released by the *_global_unlock_online() routine.
(b) We register a callback for CPU hotplug notifications, and this callback
    takes the same spinlock as above.
(c) We maintain a bitmap which is close to the cpu_online_mask, and once it is
    initialized in the lock_init() code, all future updates to it are done in
    the callback, under the above spinlock.
(d) The above bitmap is used (instead of cpu_online_mask) while locking and
    unlocking the per-cpu locks.

The callback takes the spinlock upon the CPU_UP_PREPARE event. So, if the
br_write_lock-unlock sequence is in progress, the callback keeps spinning,
thus preventing the CPU online operation till the lock-unlock sequence is
complete. This takes care of requirement (3).

The bitmap that we maintain remains unmodified throughout the lock-unlock
sequence, since all updates to it are managed by the callback, which takes
the same spinlock as the one taken by the lock code and released only by the
unlock routine. Combining this with (d) above, satisfies requirement (2).

Overall, since we use a spinlock (mentioned in (a)) to prevent CPU hotplug
operations from racing with br_write_lock-unlock, requirement (1) is also
taken care of.

By the way, it is to be noted that a CPU offline operation can actually run
in parallel with our lock-unlock sequence, because our callback doesn't react
to notifications earlier than CPU_DEAD (in order to maintain our bitmap
properly). And this means, since we use our own bitmap (which is stale, on
purpose) during the lock-unlock sequence, we could end up unlocking the
per-cpu lock of an offline CPU (because we had locked it earlier, when the
CPU was online), in order to satisfy requirement (2). But this is harmless,
though it looks a bit awkward.

Debugged-by: Cong Meng <mc@linux.vnet.ibm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net: Add a flow_cache_flush_deferred function
  ipv4: reintroduce route cache garbage collector
  net: have ipconfig not wait if no dev is available
  sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd
  asix: new device id
  davinci-cpdma: fix locking issue in cpdma_chan_stop
  sctp: fix incorrect overflow check on autoclose
  r8169: fix Config2 MSIEnable bit setting.
  llc: llc_cmsg_rcv was getting called after sk_eat_skb.
  net: bpf_jit: fix an off-one bug in x86_64 cond jump target
  iwlwifi: update SCD BC table for all SCD queues
  Revert "Bluetooth: Revert: Fix L2CAP connection establishment"
  Bluetooth: Clear RFCOMM session timer when disconnecting last channel
  Bluetooth: Prevent uninitialized data access in L2CAP configuration
  iwlwifi: allow to switch to HT40 if not associated
  iwlwifi: tx_sync only on PAN context
  mwifiex: avoid double list_del in command cancel path
  ath9k: fix max phy rate at rate control init
  nfc: signedness bug in __nci_request()
  iwlwifi: do not set the sequence control bit is not needed

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: atmel/ac97c: using software reset instead hardware reset if not available

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
  mfd: Include linux/io.h to jz4740-adc
  mfd: Use request_threaded_irq for twl4030-irq instead of irq_set_chained_handler
  mfd: Base interrupt for twl4030-irq must be one-shot
  mfd: Handle tps65910 clear-mask correctly
  mfd: add #ifdef CONFIG_DEBUG_FS guard for ab8500_debug_resources
  mfd: Fix twl-core oops while calling twl_i2c_* for unbound driver
  mfd: include linux/module.h for ab5500-debugfs
  mfd: Update wm8994 active device checks for WM1811
  mfd: Set tps6586x bits if new value is different from the old one
  mfd: Set da903x bits if new value is different from the old one
  mfd: Set adp5520 bits if new value is different from the old one
  mfd: Add missed free_irq in da903x_remove

vfs: __read_cache_page should use gfp argument rather than GFP_KERNEL

lockdep reports a deadlock in jfs because a special inode's rw semaphore
is taken recursively. The mapping's gfp mask is GFP_NOFS, but is not
used when __read_cache_page() calls add_to_page_cache_lru().

Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-greg' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus

* 'for-greg' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb:
  usb: gadget: epautoconf: do not change number of streams
  usb: dwc3: core: fix cached revision on our structure
  usb: musb: fix reset issue with full speed device

USB: Fix usb/isp1760 build on sparc

This commit:

commit 8f5d621543cb064d2989fc223d3c2bc61a43981e
Author: Joachim Foerster <joachim.foerster@missinglinkelectronics.com>
Date:   Mon Oct 10 18:06:54 2011 +0200

    usb/isp1760: Let OF bindings depend on general CONFIG_OF instead of PPC_OF .

    To be able to use the driver on other OF-aware architectures, too.
    And add necessary OF related #includes to fix compilation error.

Signed-off-by: Joachim Foerster <joachim.foerster@missinglinkelectronics.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
enabled the build on all CONFIG_OF architectures, but it cannot do
this.

This driver depends upon CONFIG_OF_IRQ but not all CONFIG_OF platforms
support that infrastructure, in particular Sparc does not so the
build fails.

Please push a patch like the following to Linus so that this code only
gets built where it actually should.

--------------------
usb/isp1760: Add missing CONFIG_OF_IRQ dependency on OF code.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

net: Add a flow_cache_flush_deferred function

flow_cach_flush() might sleep but can be called from
atomic context via the xfrm garbage collector. So add
a flow_cache_flush_deferred() function and use this if
the xfrm garbage colector is invoked from within the
packet path.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: reintroduce route cache garbage collector

Commit 2c8cec5c10b (ipv4: Cache learned PMTU information in inetpeer)
removed IP route cache garbage collector a bit too soon, as this gc was
responsible for expired routes cleanup, releasing their neighbour
reference.

As pointed out by Robert Gladewitz, recent kernels can fill and exhaust
their neighbour cache.

Reintroduce the garbage collection, since we'll have to wait our
neighbour lookups become refcount-less to not depend on this stuff.

Reported-by: Robert Gladewitz <gladewitz@gmx.de>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem

firmware: Refer to the co-maintained linux-firmware.git repository

David and I are sharing maintenance of this repository. Patches
should be sent to both of us.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.infradead.org/mtd-2.6

* git://git.infradead.org/mtd-2.6:
  mtd: plat_ram: call mtd_device_register only if partition data exists
  mtd: pxa2xx-flash.c: It used to fall back to provided table.
  mtd: gpmi: add missing include 'module.h'
  mtd: ndfc: fix typo in structure dereference

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: vub300: fix type of firmware_rom_wait_states module parameter
  Revert "mmc: enable runtime PM by default"
  mmc: sdhci: remove "state" argument from sdhci_suspend_host

SELinux: Fix RCU deref check warning in sel_netport_insert()

Fix the following bug in sel_netport_insert() where rcu_dereference() should
be rcu_dereference_protected() as sel_netport_lock is held.

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
security/selinux/netport.c:127 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
1 lock held by ossec-rootcheck/3323:
#0: (sel_netport_lock){+.....}, at: [<ffffffff8117d775>] sel_netport_sid+0xbb/0x226

stack backtrace:
Pid: 3323, comm: ossec-rootcheck Not tainted 3.1.0-rc8-fsdevel+ #1095
Call Trace:
[<ffffffff8105cfb7>] lockdep_rcu_dereference+0xa7/0xb0
[<ffffffff8117d871>] sel_netport_sid+0x1b7/0x226
[<ffffffff8117d6ba>] ? sel_netport_avc_callback+0xbc/0xbc
[<ffffffff8117556c>] selinux_socket_bind+0x115/0x230
[<ffffffff810a5388>] ? might_fault+0x4e/0x9e
[<ffffffff810a53d1>] ? might_fault+0x97/0x9e
[<ffffffff81171cf4>] security_socket_bind+0x11/0x13
[<ffffffff812ba967>] sys_bind+0x56/0x95
[<ffffffff81380dac>] ? sysret_check+0x27/0x62
[<ffffffff8105b767>] ? trace_hardirqs_on_caller+0x11e/0x155
[<ffffffff81076fcd>] ? audit_syscall_entry+0x17b/0x1ae
[<ffffffff811b5eae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff81380d7b>] system_call_fastpath+0x16/0x1b

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: stable@kernel.org
Signed-off-by: James Morris <jmorris@namei.org>

Merge branch 'evm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kasatkin/linux-digsig into for-linus

Merge branch 'for-3.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

* 'for-3.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroups: fix a css_set not found bug in cgroup_attach_proc

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, dumpstack: Fix code bytes breakage due to missing KERN_CONT

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time/clocksource: Fix kernel-doc warnings
  rtc: m41t80: Workaround broken alarm functionality
  rtc: Expire alarms after the time is set.

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
oprofile: Fix uninitialized memory access when writing to writing to oprofilefs

Merge branch 'stable/for-linus-fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

* 'stable/for-linus-fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"

Merge branch 'sh-fixes-for-linus' of git://github.com/pmundt/linux-sh

* 'sh-fixes-for-linus' of git://github.com/pmundt/linux-sh:
sh: fix build warning in board-sh7757lcr

Merge branch 'rmobile-fixes-for-linus' of git://github.com/pmundt/linux-sh

* 'rmobile-fixes-for-linus' of git://github.com/pmundt/linux-sh:
  ARM: mach-shmobile: SH73A0 external Ethernet fix
  ARM: mach-shmobile: AG5EVM GIC Sparse IRQ fix
  ARM: mach-shmobile: Kota2 TPU LED platform data
  ARM: mach-shmobile: Kota2 GIC Sparse IRQ fix
  ARM: mach-shmobile: Kota2 PINT fix

Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: Fix a regression in nfs_file_llseek()
  NFSv4: Do not accept delegated opens when a delegation recall is in effect
  NFSv4: Ensure correct locking when accessing the 'lock_states' list
  NFSv4.1: Ensure that we handle _all_ SEQUENCE status bits.
  NFSv4: Don't error if we handled it in nfs4_recovery_handle_error
  SUNRPC: Ensure we always bump the backlog queue in xprt_free_slot
  SUNRPC: Fix the execution time statistics in the face of RPC restarts

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  vmwgfx: Clip cliprects against screen boundaries in present and dirty
  vmwgfx: Resend the cursor after legacy modeset
  vmwgfx: Do better culling of presents
  vmwgfx: Refactor kms code to use vmw_user_lookup_handle helper
  vmwgfx: Add helper function to get surface or dmabuf
  vmwgfx: Refactor cursor update
  vmwgfx: Remove dmabuf check in present ioctl
  vmwgfx: Use the revised fifo hw version register when present

net: have ipconfig not wait if no dev is available

previous commit 3fb72f1e6e6165c5f495e8dc11c5bbd14c73385c
makes IP-Config wait for carrier on at least one network device.

Before waiting (predefined value 120s), check that at least one device
was successfully brought up. Otherwise (e.g. buggy bootloader
which does not set the MAC address) there is no point in waiting
for carrier.

Cc: Micha Nelissen <micha@neli.hopto.org>
Cc: Holger Brunck <holger.brunck@keymile.com>
Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd

When checking whether a DATA chunk fits into the estimated rwnd a
full sizeof(struct sk_buff) is added to the needed chunk size. This
quickly exhausts the available rwnd space and leads to packets being
sent which are much below the PMTU limit. This can lead to much worse
performance.

The reason for this behaviour was to avoid putting too much memory
pressure on the receiver. The concept is not completely irational
because a Linux receiver does in fact clone an skb for each DATA chunk
delivered. However, Linux also reserves half the available socket
buffer space for data structures therefore usage of it is already
accounted for.

When proposing to change this the last time it was noted that this
behaviour was introduced to solve a performance issue caused by rwnd
overusage in combination with small DATA chunks.

Trying to reproduce this I found that with the sk_buff overhead removed,
the performance would improve significantly unless socket buffer limits
are increased.

The following numbers have been gathered using a patched iperf
supporting SCTP over a live 1 Gbit ethernet network. The -l option
was used to limit DATA chunk sizes. The numbers listed are based on
the average of 3 test runs each. Default values have been used for
sk_(r|w)mem.

Chunk
Size    Unpatched     No Overhead
-------------------------------------
   4    15.2 Kbit [!]   12.2 Mbit [!]
   8    35.8 Kbit [!]   26.0 Mbit [!]
  16    95.5 Kbit [!]   54.4 Mbit [!]
  32   106.7 Mbit      102.3 Mbit
  64   189.2 Mbit      188.3 Mbit
128   331.2 Mbit      334.8 Mbit
256   537.7 Mbit      536.0 Mbit
512   766.9 Mbit      766.6 Mbit
1024   810.1 Mbit      808.6 Mbit

Signed-off-by: Thomas Graf <tgraf@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (31 commits)
  Revert "[media] af9015: limit I2C access to keep FW happy"
  [media] s5p-fimc: Fix camera input configuration in subdev operations
  [media] m5mols: Fix logic in sanity check
  [media] ati_remote: switch to single-byte scancodes
  [media] V4L: mt9m111: fix uninitialised mutex
  [media] V4L: omap1_camera: fix missing <linux/module.h> include
  [media] V4L: mt9t112: use after free in mt9t112_probe()
  [media] V4L: soc-camera: fix compiler warnings on 64-bit platforms
  [media] s5p_mfc_enc: fix s/H264/H263/ typo
  [media] omap_vout: Fix compile error in 3.1
  [media] au0828: add missing models 72101, 72201 & 72261 to the model matrix
  [media] au0828: add missing USB ID 2040:7213
  [media] au0828: add missing USB ID 2040:7260
  [media] [trivial] omap24xxcam-dma: Fix logical test
  [media] omap_vout: fix crash if no driver for a display
  [media] media: video: s5p-tv: fix build break
  [media] omap3isp: fix compilation of ispvideo.c
  [media] m5mols: Fix set_fmt to return proper pixel format code
  [media] s5p-fimc: Use correct fourcc for RGB565 colour format
  [media] s5p-fimc: Fail driver probing when sensor configuration is wrong
  ...

binary_sysctl(): fix memory leak

binary_sysctl() calls sysctl_getname() which allocates from names_cache
slab usin __getname()

The matching function to free the name is __putname(), and not putname()
which should be used only to match getname() allocations.

This is because when auditing is enabled, putname() calls audit_putname
*instead* (not in addition) to __putname().  Then, if a syscall is in
progress, audit_putname does not release the name - instead, it expects
the name to get released when the syscall completes, but that will happen
only if audit_getname() was called previously, i.e.  if the name was
allocated with getname() rather than the naked __getname().  So,
__getname() followed by putname() ends up leaking memory.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/vmalloc.c: remove static declaration of va from __get_vm_area_node

Static storage is not required for the struct vmap_area in
__get_vm_area_node.

Removing "static" to store this variable on the stack instead.

Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ipmi_watchdog: restore settings when BMC reset

If the BMC gets reset, it will return 0x80 response errors.

In less than a week
# grep "Error 80 on cmd 22" /var/log/kernel |wc -l
378681

In this case, it is probably a good idea to restore the IPMI settings.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Tested-by: Arkadiusz Miśkiewicz <a.miskiewicz@gmail.com>
Reported-by: Arkadiusz Miśkiewicz <a.miskiewicz@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

oom: fix integer overflow of points in oom_badness

An integer overflow will happen on 64bit archs if task's sum of rss,
swapents and nr_ptes exceeds (2^31)/1000 value.  This was introduced by
commit

f755a04 oom: use pte pages in OOM score

where the oom score computation was divided into several steps and it's no
longer computed as one expression in unsigned long(rss, swapents, nr_pte
are unsigned long), where the result value assigned to points(int) is in
range(1..1000).  So there could be an int overflow while computing

176          points *= 1000;

and points may have negative value. Meaning the oom score for a mem hog task
will be one.

196          if (points <= 0)
197                  return 1;

For example:
[ 3366]     0  3366 35390480 24303939   5       0             0 oom01
Out of memory: Kill process 3366 (oom01) score 1 or sacrifice child

Here the oom1 process consumes more than 24303939(rss)*4096~=92GB physical
memory, but it's oom score is one.

In this situation the mem hog task is skipped and oom killer kills another and
most probably innocent task with oom score greater than one.

The points variable should be of type long instead of int to prevent the
int overflow.

Signed-off-by: Frantisek Hrbata <fhrbata@redhat.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: <stable@vger.kernel.org> [2.6.36+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: keep root group unchanged if creation fails

If the request is to create non-root group and we fail to meet it, we
should leave the root unchanged.

Signed-off-by: Hillf Danton <dhillf@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>