git.karo-electronics.de Git - karo-tx-linux.git/log

vfs: turn do_path_lookup into wrapper around struct filename variant

...and make the user_path callers use that variant instead.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

audit: allow audit code to satisfy getname requests from its names_list

Currently, if we call getname() on a userland string more than once,
we'll get multiple copies of the string and multiple audit_names
records.

Add a function that will allow the audit_names code to satisfy getname
requests using info from the audit_names list, avoiding a new allocation
and audit_names records.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: define struct filename and have getname() return it

getname() is intended to copy pathname strings from userspace into a
kernel buffer. The result is just a string in kernel space. It would
however be quite helpful to be able to attach some ancillary info to
the string.

For instance, we could attach some audit-related info to reduce the
amount of audit-related processing needed. When auditing is enabled,
we could also call getname() on the string more than once and not
need to recopy it from userspace.

This patchset converts the getname()/putname() interfaces to return
a struct instead of a string. For now, the struct just tracks the
string in kernel space and the original userland pointer for it.

Later, we'll add other information to the struct as it becomes
convenient.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: unexport getname and putname symbols

I see no callers in module code.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

acct: constify the name arg to acct_on

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: allocate page instead of names_cache buffer in mount_block_root

First, it's incorrect to call putname() after __getname_gfp() since the
bare __getname_gfp() call skips the auditing code, while putname()
doesn't.

mount_block_root allocates a PATH_MAX buffer via __getname_gfp, and then
calls get_fs_names to fill the buffer. That function can call
get_filesystem_list which assumes that that buffer is a full page in
size. On arches where PAGE_SIZE != 4k, then this could potentially
overrun.

In practice, it's hard to imagine the list of filesystem names even
approaching 4k, but it's best to be safe. Just allocate a page for this
purpose instead.

With this, we can also remove the __getname_gfp() definition since there
are no more callers.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

audit: overhaul __audit_inode_child to accomodate retrying

In order to accomodate retrying path-based syscalls, we need to add a
new "type" argument to audit_inode_child. This will tell us whether
we're looking for a child entry that represents a create or a delete.

If we find a parent, don't automatically assume that we need to create a
new entry. Instead, use the information we have to try to find an
existing entry first. Update it if one is found and create a new one if
not.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

audit: optimize audit_compare_dname_path

In the cases where we already know the length of the parent, pass it as
a parm so we don't need to recompute it. In the cases where we don't
know the length, pass in AUDIT_NAME_FULL (-1) to indicate that it should
be determined.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

audit: make audit_compare_dname_path use parent_len helper

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

audit: remove dirlen argument to audit_compare_dname_path

All the callers set this to NULL now.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

audit: set the name_len in audit_inode for parent lookups

Currently, this gets set mostly by happenstance when we call into
audit_inode_child. While that might be a little more efficient, it seems
wrong. If the syscall ends up failing before audit_inode_child ever gets
called, then you'll have an audit_names record that shows the full path
but has the parent inode info attached.

Fix this by passing in a parent flag when we call audit_inode that gets
set to the value of LOOKUP_PARENT. We can then fix up the pathname for
the audit entry correctly from the get-go.

While we're at it, clean up the no-op macro for audit_inode in the
!CONFIG_AUDITSYSCALL case.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

audit: add a new "type" field to audit_names struct

For now, we just have two possibilities:

UNKNOWN: for a new audit_names record that we don't know anything about yet
NORMAL: for everything else

In later patches, we'll add other types so we can distinguish and update
records created under different circumstances.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

audit: reverse arguments to audit_inode_child

Most of the callers get called with an inode and dentry in the reverse
order. The compiler then has to reshuffle the arg registers and/or
stack in order to pass them on to audit_inode_child.

Reverse those arguments for a micro-optimization.

Reported-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

audit: no need to walk list in audit_inode if name is NULL

If name is NULL then the condition in the loop will never be true. Also,
with this change, we can eliminate the check for n->name == NULL since
the equivalence check will never be true if it is.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

audit: pass in dentry to audit_copy_inode wherever possible

In some cases, we were passing in NULL even when we have a dentry.

Reported-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

audit: remove unnecessary NULL ptr checks from do_path_lookup

As best I can tell, whenever retval == 0, nd->path.dentry and nd->inode
are also non-NULL. Eliminate those checks and the superfluous
audit_context check.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge branch 'for-linus' into for-next

vfs: bogus warnings in fs/namei.c

The follow_link() function always initializes its *p argument,
or returns an error, but when building with 'gcc -s', the compiler
gets confused by the __always_inline attribute to the function
and can no longer detect where the cookie was initialized.

The solution is to always initialize the pointer from follow_link,
even in the error path. When building with -O2, this has zero impact
on generated code and adds a single instruction in the error path
for a -Os build on ARM.

Without this patch, building with gcc-4.6 through gcc-4.8 and
CONFIG_CC_OPTIMIZE_FOR_SIZE results in:

fs/namei.c: In function 'link_path_walk':
fs/namei.c:649:24: warning: 'cookie' may be used uninitialized in this function [-Wuninitialized]
fs/namei.c:1544:9: note: 'cookie' was declared here
fs/namei.c: In function 'path_lookupat':
fs/namei.c:649:24: warning: 'cookie' may be used uninitialized in this function [-Wuninitialized]
fs/namei.c:1934:10: note: 'cookie' was declared here
fs/namei.c: In function 'path_openat':
fs/namei.c:649:24: warning: 'cookie' may be used uninitialized in this function [-Wuninitialized]
fs/namei.c:2899:9: note: 'cookie' was declared here

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

consitify do_mount() arguments

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge tag 'for-3.7-rc1' of git://gitorious.org/linux-pwm/linux-pwm

Pull pwm changes from Thierry Reding:
"All legacy PWM providers have now been moved to the PWM subsystem.
  The plan for 3.8 is to adapt all board files to provide a lookup table
  for PWM devices in order to get rid of the global namespace.
  Subsequently, users of the legacy pwm_request() and pwm_free()
  functions can be migrated to the new pwm_get() and pwm_put()
  functions.  Once this has been completed, the legacy API and the
  compatibility code in the core can be removed.

  In addition to the above, these changes also add support for
  configuring the polarity of a PWM signal (currently only supported on
  ECAP and EHRPWM) and include a much needed rework of the i.MX driver.
  Managed functions to obtain and release a PWM device (devm_pwm_get()
  and devm_pwm_put()) have been added and the pwm-backlight driver has
  been updated to use them.  If the PWM subsystem hasn't been enabled,
  dummy functions are provided that allow the subsystem to safely
  compile out.

  Some common checks on input parameters have been moved to the core and
  removed from the drivers.  Finally, a small fix corrects the
  description of the PWM specifier's second cell in the device tree
  representation."

* tag 'for-3.7-rc1' of git://gitorious.org/linux-pwm/linux-pwm: (23 commits)
  pwm: dt: Fix description of second PWM cell
  pwm: Check for negative duty-cycle and period
  pwm: Add Ingenic JZ4740 support
  MIPS: JZ4740: Export timer API
  pwm: Move PUV3 PWM driver to PWM framework
  unicore32: pwm: Use managed resource allocations
  unicore32: pwm: Remove unnecessary indirection
  unicore32: pwm: Use module_platform_driver()
  unicore32: pwm: Properly remap memory-mapped registers
  pwm-backlight: Use devm_pwm_get() instead of pwm_get()
  pwm: Move AB8500 PWM driver to PWM framework
  pwm: Fix compilation error when CONFIG_PWM is not defined
  pwm: i.MX: fix clock lookup
  pwm: i.MX: use per clock unconditionally
  pwm: i.MX: add devicetree support
  pwm: i.MX: Use module_platform_driver
  pwm: i.MX: add functions to enable/disable pwm.
  pwm: i.MX: remove unnecessary if in pwm_[en|dis]able
  pwm: i.MX: factor out SoC specific functions
  pwm: pwm-tiehrpwm: Add support for configuring polarity of PWM
  ...

Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds

Pull LED subsystem update from Bryan Wu.

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds: (24 commits)
  leds: add output driver configuration for pca9633 led driver
  leds: lm3642: Use regmap_update_bits() in lm3642_chip_init()
  leds: Add new LED driver for lm3642 chips
  leds-lp5523: Fix riskiness of the page fault
  leds-lp5523: turn off the LED engines on unloading the driver
  leds-lm3530: Fix smatch warnings
  leds-lm3530: Use devm_regulator_get function
  leds: leds-gpio: adopt pinctrl support
  leds: Add new LED driver for lm355x chips
  leds-lp5523: use the i2c device id rather than fixed name
  leds-lp5523: add new device id for LP55231
  leds-lp5523: support new LP55231 device
  leds: triggers: send uevent when changing triggers
  leds-lp5523: minor code style fixes
  leds-lp5523: change the return type of lp5523_set_mode()
  leds-lp5523: set the brightness to 0 forcely on removing the driver
  leds-lp5523: add channel name in the platform data
  leds: leds-gpio: Use of_get_child_count() helper
  leds: leds-gpio: Use platform_{get,set}_drvdata
  leds: leds-gpio: use of_match_ptr()
  ...

Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending

Pull scsi target updates from Nicholas Bellinger:
"Things have been calm for the most part with no new fabric drivers in
  flight for v3.7 (we're up to eight now !), so this update is primarily
  focused on addressing a few long-standing items within target-core and
  iscsi-target fabric code.

  The highlights include:

   - target: Simplify fabric sense data length handling (roland)
   - qla2xxx: Fix endianness of task management response code (roland)
   - target: fix truncation of mode data, support zero allocation length
     (paolo)
   - target: Properly support zero-length commands in normal processing
     path (paolo)
   - iscsi-target: Correctly set 0xffffffff field within ISCSI_OP_REJECT
     PDU (ronnie + nab)
   - iscsi-target: Add explicit set of cache_dynamic_acls=1 for TPG
     demo-mode (ronnie + nab)
   - target/file: Re-enable optional fd_buffered_io=1 operation (nab +
     hch)
   - iscsi-target: Add MaxXmitDataSegmenthLength forr target ->
     initiator MDRSL declaration (nab)
   - target: Add target_submit_cmd_map_sgls for SGL fabric memory
     passthrough (nab + hch)
   - tcm_loop: Convert I/O path to use target_submit_cmd_map_sgls (hch +
     nab)
   - tcm_vhost: Convert I/O path to use target_submit_cmd_map_sgls (nab
     + hch)

  The last series for adding a new target_submit_cmd_map_sgls() fabric
  caller (as requested by hch) that accepts pre-allocated SGL memory
  (using existing logic), along with converting tcm_loop + tcm_vhost has
  only been in -next for the last days, but has gotten enough review
  +testing and is clear enough a mechanical change that I think it's
  reasonable to merge for -rc1 code.

  Thanks again to everyone who contributed this round! Extra special
  thanks to Roland (PureStorage) for tracking down the qla2xxx target
  TMR response code endian issue, and to Paolo (Redhat) for resolving
  the long standing zero-length CDB issues within target-core between
  virtual and pSCSI backends."

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (44 commits)
  iscsi-target: Bump defaults for nopin_timeout + nopin_response_timeout values
  iscsit: proper endianess conversions
  iscsit: use the itt_t abstract type
  iscsit: add missing endianess conversion in iscsit_check_inaddr_any
  iscsit: remove incorrect unlock in iscsit_build_sendtargets_resp
  iscsit: mark various functions static
  target/iscsi: precedence bug in iscsit_set_dataout_sequence_values()
  target/usb-gadget: strlen() doesn't count the terminator
  target/usb-gadget: remove duplicate initialization
  tcm_vhost: Convert I/O path to use target_submit_cmd_map_sgls
  target: Add control CDB READ payload zero work-around
  tcm_loop: Convert I/O path to use target_submit_cmd_map_sgls
  target: Add target_submit_cmd_map_sgls for SGL fabric memory passthrough
  iscsi-target: Add explicit set of cache_dynamic_acls=1 for TPG demo-mode
  iscsi-target: Change iscsi_target_seq_pdu_list.c to honor MaxXmitDataSegmentLength
  iscsi-target: Add MaxXmitDataSegmentLength connection recovery check
  iscsi-target: Convert incoming PDU payload checks to MaxXmitDataSegmentLength
  iscsi-target: Enable MaxXmitDataSegmentLength operation in login path
  iscsi-target: Add base MaxXmitDataSegmentLength code
  target/file: Re-enable optional fd_buffered_io=1 operation
  ...

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull second s390 update from Martin Schwidefsky:
"The big thing in this pull request is the UAPI patch from David, and
  worth mentioning is the page table dumper.  The rest are small
  improvements and bug fixes."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/entry: fix svc number for TIF_SYSCALL system call restart
  s390/mm,vmem: fix vmem_add_mem()/vmem_remove_range()
  s390/vmalloc: have separate modules area
  s390/zcrypt: remove duplicated include from zcrypt_pcixcc.c
  s390/css_chars: remove superfluous ifdef
  s390/chsc: make headers usable
  s390/mm: let kernel text section always begin at 1MB
  s390/mm: fix mapping of read-only kernel text section
  s390/mm: add page table dumper
  s390: add support to start the kernel in 64 bit mode.
  s390/mm,pageattr: remove superfluous EXPORT_SYMBOLs
  s390/mm,pageattr: add more page table walk sanity checks
  s390/mm: fix pmd_huge() usage for kernel mapping
  s390/dcssblk: cleanup device attribute usage
  s390/mm: use pfmf instruction to initialize storage keys
  s390/facilities: cleanup PFMF and HPAGE machine facility detection
  UAPI: (Scripted) Disintegrate arch/s390/include/asm

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull nouveau drm fixes from Dave Airlie:
"Just a bunch of nouveau fixes, Ben wants to get some alternate
  versions into stable."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/nouveau/timer: bump ptimer's alarm delay from u32 to u64
  drm/nouveau/fan: fix a typo in PWM's input clock calculation
  drm/nv50/clk: wire up pll_calc hook
  drm/nouveau: remove unused _nouveau_parent_ctor
  drm/nouveau/bios: fix shadowing of ACPI ROMs larger than 64KiB

lglock: add DEFINE_STATIC_LGLOCK()

When the lglock doesn't need to be exported we can use
DEFINE_STATIC_LGLOCK().

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

lglock: make the per_cpu locks static

The per_cpu locks are not used outside the file which contains the
DEFINE_LGLOCK(), so we can make these symbols static.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

lglock: remove unused DEFINE_LGLOCK_LOCKDEP()

struct lglocks use their own lock_key/lock_dep_map which are defined in
struct lglock. DEFINE_LGLOCK_LOCKDEP() is unused, so remove it and save a
small piece of memory.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

MAX_LFS_FILESIZE definition for 64bit needs LL...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

tmpfs,ceph,gfs2,isofs,reiserfs,xfs: fix fh_len checking

Fuzzing with trinity oopsed on the 1st instruction of shmem_fh_to_dentry(),
u64 inum = fid->raw[2];
which is unhelpfully reported as at the end of shmem_alloc_inode():

BUG: unable to handle kernel paging request at ffff880061cd3000
IP: [<ffffffff812190d0>] shmem_alloc_inode+0x40/0x40
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Call Trace:
[<ffffffff81488649>] ? exportfs_decode_fh+0x79/0x2d0
[<ffffffff812d77c3>] do_handle_open+0x163/0x2c0
[<ffffffff812d792c>] sys_open_by_handle_at+0xc/0x10
[<ffffffff83a5f3f8>] tracesys+0xe1/0xe6

Right, tmpfs is being stupid to access fid->raw[2] before validating that
fh_len includes it: the buffer kmalloc'ed by do_sys_name_to_handle() may
fall at the end of a page, and the next page not be present.

But some other filesystems (ceph, gfs2, isofs, reiserfs, xfs) are being
careless about fh_len too, in fh_to_dentry() and/or fh_to_parent(), and
could oops in the same way: add the missing fh_len checks to those.

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Sage Weil <sage@inktank.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: drop lock/unlock super

Removed s_lock from super_block and removed lock/unlock super.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs: drop lock/unlock super

Removed lock/unlock super. Added a new private s_lock mutex.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

sysv: drop lock/unlock super

Removed lock/unlock super. Added a new private s_lock mutex.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

hpfs: drop lock/unlock super

Removed lock/unlock super.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Acked-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fat: drop lock/unlock super

Removed lock/unlock super. Added a new private s_lock mutex.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ext3: drop lock/unlock super

Removed lock/unlock super.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

exofs: drop lock/unlock super

Removed lock/unlock super.

Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Acked-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dup3: Return an error when oldfd == newfd.

I have tested the attached patch to fix the dup3 regression.

Rich.

From 0944e30e12dec6544b3602626b60ff412375c78f Mon Sep 17 00:00:00 2001
From: "Richard W.M. Jones" <rjones@redhat.com>
Date: Tue, 9 Oct 2012 14:42:45 +0100
Subject: [PATCH] dup3: Return an error when oldfd == newfd.

The following commit:

  commit fe17f22d7fd0e344ef6447238f799bb49f670c6f
  Author: Al Viro <viro@zeniv.linux.org.uk>
  Date:   Tue Aug 21 11:48:11 2012 -0400

    take purely descriptor-related stuff from fcntl.c to file.c

was supposed to be just code motion, but it dropped the following two
lines:

  if (unlikely(oldfd == newfd))
          return -EINVAL;

from the dup3 system call.  dup3 is not specified by POSIX, so Linux
can do what it likes.  However the POSIX proposal for dup3 [1] states
that it should return an error if oldfd == newfd.

[1] http://austingroupbugs.net/view.php?id=411

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs: handle failed audit_log_start properly

audit_log_start() may return NULL, this is unchecked by the caller in
audit_log_link_denied() and could cause a NULL ptr deref.

Introduced by commit a51d9eaa ("fs: add link restriction audit reporting").

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs: prevent use after free in auditing when symlink following was denied

Commit "fs: add link restriction audit reporting" has added auditing of failed
attempts to follow symlinks. Unfortunately, the auditing was being done after
the struct path structure was released earlier.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal

Pull generic execve() changes from Al Viro:
"This introduces the generic kernel_thread() and kernel_execve()
  functions, and switches x86, arm, alpha, um and s390 over to them."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (26 commits)
  s390: convert to generic kernel_execve()
  s390: switch to generic kernel_thread()
  s390: fold kernel_thread_helper() into ret_from_fork()
  s390: fold execve_tail() into start_thread(), convert to generic sys_execve()
  um: switch to generic kernel_thread()
  x86, um/x86: switch to generic sys_execve and kernel_execve
  x86: split ret_from_fork
  alpha: introduce ret_from_kernel_execve(), switch to generic kernel_execve()
  alpha: switch to generic kernel_thread()
  alpha: switch to generic sys_execve()
  arm: get rid of execve wrapper, switch to generic execve() implementation
  arm: optimized current_pt_regs()
  arm: introduce ret_from_kernel_execve(), switch to generic kernel_execve()
  arm: split ret_from_fork, simplify kernel_thread() [based on patch by rmk]
  generic sys_execve()
  generic kernel_execve()
  new helper: current_pt_regs()
  preparation for generic kernel_thread()
  um: kill thread->forking
  um: let signal_delivered() do SIGTRAP on singlestepping into handler
  ...

Merge branch 'for-linus-37rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml

Pull UML changes from Richard Weinberger:
"UML receives this time only cleanups.

  The most outstanding change is the 'include "foo.h"' do 'include
  <foo.h>' conversion done by Al Viro.

  It touches many files, that's why the diffstat is rather big."

* 'for-linus-37rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  typo in UserModeLinux-HOWTO
  hppfs: fix the return value of get_inode()
  hostfs: drop vmtruncate
  um: get rid of pointless include "..." where include <...> will do
  um: move sysrq.h out of include/shared
  um/x86: merge 32 and 64 bit variants of ptrace.h
  um/x86: merge 32 and 64bit variants of checksum.h

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking updates from David Miller:

1) UAPI changes for networking from David Howells

2) A netlink dump is an operation we can sleep within, and therefore we
    need to make sure the dump provider module doesn't disappear on us
    meanwhile.  Fix from Gao Feng.

3) Now that tunnels support GRO, we have to be more careful in
    skb_gro_reset_offset() otherwise we OOPS, from Eric Dumazet.

4) We can end up processing packets for VLANs we aren't actually
    configured to be on, fix from Florian Zumbiehl.

5) Fix routing cache removal regression in redirects and IPVS.  The
    core issue on the IPVS side is that it wants to rewrite who the
    nexthop is and we have to explicitly accomodate that case.  From
    Julian Anastasov.

6) Error code return fixes all over the networking drivers from Peter
    Senna Tschudin.

7) Fix routing cache removal regressions in IPSEC, from Steffen
    Klassert.

8) Fix deadlock in RDS during pings, from Jeff Liu.

9) Neighbour packet queue can trigger skb_under_panic() because we do
    not reset the network header of the SKB in the right spot.  From
    Ramesh Nagappa.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
  RDS: fix rds-ping spinlock recursion
  netdev/phy: Prototype of_mdio_find_bus()
  farsync: fix support for over 30 cards
  be2net: Remove code that stops further access to BE NIC based on UE bits
  pch_gbe: Fix build error by selecting all the possible dependencies.
  e1000e: add device IDs for i218
  ixgbe/ixgbevf: Limit maximum jumbo frame size to 9.5K to avoid Tx hangs
  ixgbevf: Set the netdev number of Tx queues
  UAPI: (Scripted) Disintegrate include/linux/tc_ematch
  UAPI: (Scripted) Disintegrate include/linux/tc_act
  UAPI: (Scripted) Disintegrate include/linux/netfilter_ipv6
  UAPI: (Scripted) Disintegrate include/linux/netfilter_ipv4
  UAPI: (Scripted) Disintegrate include/linux/netfilter_bridge
  UAPI: (Scripted) Disintegrate include/linux/netfilter_arp
  UAPI: (Scripted) Disintegrate include/linux/netfilter/ipset
  UAPI: (Scripted) Disintegrate include/linux/netfilter
  UAPI: (Scripted) Disintegrate include/linux/isdn
  UAPI: (Scripted) Disintegrate include/linux/caif
  net: fix typo in freescale/ucc_geth.c
  vxlan: fix more sparse warnings
  ...

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

Pull sparc update from David Miller:
"This is just the UAPI commits for sparc via David Howells."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
UAPI: (Scripted) Disintegrate arch/sparc/include/asm

Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma

Pull slave-dmaengine updates from Vinod Koul:
"This time we have Andy updates on dw_dmac which is attempting to make
  this IP block available as PCI and platform device though not fully
  complete this time.

  We also have TI EDMA moving the dma driver to use dmaengine APIs, also
  have a new driver for mmp-tdma, along with bunch of small updates.

  Now for your excitement the merge is little unusual here, while
  merging the auto merge on linux-next picks wrong choice for pl330
  (drivers/dma/pl330.c) and this causes build failure.  The correct
  resolution is in linux-next.  (DMA: PL330: Fix build error) I didn't
  back merge your tree this time as you are better than me so no point
  in doing that for me :)"

Fixed the pl330 conflict as in linux-next, along with trivial header
file conflicts due to changed includes.

* 'next' of git://git.infradead.org/users/vkoul/slave-dma: (29 commits)
  dma: tegra: fix interrupt name issue with apb dma.
  dw_dmac: fix a regression in dwc_prep_dma_memcpy
  dw_dmac: introduce software emulation of LLP transfers
  dw_dmac: autoconfigure data_width or get it via platform data
  dw_dmac: autoconfigure block_size or use platform data
  dw_dmac: get number of channels from hardware if possible
  dw_dmac: fill optional encoded parameters in register structure
  dw_dmac: mark dwc_dump_chan_regs as inline
  DMA: PL330: return ENOMEM instead of 0 from pl330_alloc_chan_resources
  DMA: PL330: Remove redundant runtime_suspend/resume functions
  DMA: PL330: Remove controller clock enable/disable
  dmaengine: use kmem_cache_zalloc instead of kmem_cache_alloc/memset
  DMA: PL330: Set the capability of pdm0 and pdm1 as DMA_PRIVATE
  ARM: EXYNOS: Set the capability of pdm0 and pdm1 as DMA_PRIVATE
  dma: tegra: use list_move_tail instead of list_del/list_add_tail
  mxs/dma: Enlarge the CCW descriptor area to 4 pages
  dw_dmac: utilize slave_id to pass request line
  dmaengine: mmp_tdma: add dt support
  dmaengine: mmp-pdma support
  spi: davici - make davinci select edma
  ...

Merge tag 'mmc-merge-for-3.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc

Pull MMC updates from Chris Ball:
"Core:
   - Add DT properties for card detection (broken-cd, cd-gpios,
     non-removable)
   - Don't poll non-removable devices
   - Fixup/rework eMMC sleep mode/"power off notify" feature
   - Support eMMC background operations (BKOPS).  To set the one-time
     programmable fuse that enables bkops on an eMMC that doesn't
     already have it set, you can use the "mmc bkops enable" command in:

       git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc-utils.git

  Drivers:
   - atmel-mci, dw_mmc, pxa-mci, dove, s3c, spear: Add device tree
     support
   - bfin_sdh: Add support for the controller in bf60x
   - dw_mmc: Support Samsung Exynos SoCs
   - eSDHC: Add ADMA support
   - sdhci: Support testing a cd-gpio (from slot-gpio) instead of
     presence bit
   - sdhci-pltfm: Support broken-cd DT property
   - tegra: Convert to only supporting DT (mach-tegra has gone DT-only)"

* tag 'mmc-merge-for-3.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (67 commits)
  mmc: core: Fixup broken suspend and eMMC4.5 power off notify
  mmc: sdhci-spear: Add clk_{un}prepare() support
  mmc: sdhci-spear: add device tree bindings
  mmc: sdhci-s3c: Add clk_(enable/disable) in runtime suspend/resume
  mmc: core: Replace MMC_CAP2_BROKEN_VOLTAGE with test for fixed regulator
  mmc: sdhci-pxav3: Use sdhci_get_of_property for parsing DT quirks
  mmc: dt: Support "broken-cd" property in sdhci-pltfm
  mmc: sdhci-s3c: fix the wrong number of max bus clocks
  mmc: sh-mmcif: avoid oops on spurious interrupts
  mmc: sh-mmcif: properly handle MMC_WRITE_MULTIPLE_BLOCK completion IRQ
  mmc: sdhci-s3c: Fix crash on module insertion for second time
  mmc: sdhci-s3c: Enable only required bus clock
  mmc: Revert "mmc: dw_mmc: Add check for IDMAC configuration"
  mmc: mxcmmc: fix bug that may block a data transfer forever
  mmc: omap_hsmmc: Pass on the suspend failure to the PM core
  mmc: atmel-mci: AP700x PDC is not connected to MCI
  mmc: atmel-mci: DMA can be used with other controllers
  mmc: mmci: use clk_prepare_enable and clk_disable_unprepare
  mmc: sdhci-s3c: Add device tree support
  mmc: dw_mmc: add support for exynos specific implementation of dw-mshc
  ...

Merge tag 'for-linus-20121009' of git://git.infradead.org/mtd-2.6

Pull MTD updates from David Woodhouse:

- Disable broken mtdchar mmap() on MMU systems
- Additional ECC tests for NAND flash, and some test cleanups
- New NAND and SPI chip support
- Fixes/cleanup for SH FLCTL NAND controller driver
- Improved hardware support for GPMI NAND controller
- Conversions to device-tree support for various drivers
- Removal of obsolete drivers (sbc8xxx, bcmring, etc.)
- New LPC32xx drivers for MLC and SLC NAND
- Further cleanup of NAND OOB/ECC handling
- UAPI cleanup merge from David Howells (just moving files, since MTD
   headers were sorted out long ago to separate user-visible from kernel
   bits)

* tag 'for-linus-20121009' of git://git.infradead.org/mtd-2.6: (168 commits)
  mtd: Disable mtdchar mmap on MMU systems
  UAPI: (Scripted) Disintegrate include/mtd
  mtd: nand: detect Samsung K9GBG08U0A, K9GAG08U0F ID
  mtd: nand: decode Hynix MLC, 6-byte ID length
  mtd: nand: increase max OOB size to 640
  mtd: nand: add generic READ ID length calculation functions
  mtd: nand: split simple ID decode into its own function
  mtd: nand: split extended ID decoding into its own function
  mtd: nand: split BB marker options decoding into its own function
  mtd: nand: remove redundant ID read
  mtd: nand: remove unnecessary variable
  mtd: docg4: add missing HAS_IOMEM dependency
  mtd: gpmi: initialize the timing registers only one time
  mtd: gpmi: add EDO feature for imx6q
  mtd: gpmi: do not set the default values for the extra clocks
  mtd: gpmi: simplify the DLL setting code
  mtd: gpmi: add a new field for HW_GPMI_CTRL1
  mtd: gpmi: do not get the clock frequency in gpmi_begin()
  mtd: gpmi: add a new field for HW_GPMI_TIMING1
  mtd: add helpers to get the supportted ONFI timing mode
  ...

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Pull btrfs update from Chris Mason:
"This is a large pull, with the bulk of the updates coming from:

   - Hole punching

   - send/receive fixes

   - fsync performance

   - Disk format extension allowing more hardlinks inside a single
     directory (btrfs-progs patch required to enable the compat bit for
     this one)

  I'm cooking more unrelated RAID code, but I wanted to make sure this
  original batch makes it in.  The largest updates here are relatively
  old and have been in testing for some time."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (121 commits)
  btrfs: init ref_index to zero in add_inode_ref
  Btrfs: remove repeated eb->pages check in, disk-io.c/csum_dirty_buffer
  Btrfs: fix page leakage
  Btrfs: do not warn_on when we cannot alloc a page for an extent buffer
  Btrfs: don't bug on enomem in readpage
  Btrfs: cleanup pages properly when ENOMEM in compression
  Btrfs: make filesystem read-only when submitting barrier fails
  Btrfs: detect corrupted filesystem after write I/O errors
  Btrfs: make compress and nodatacow mount options mutually exclusive
  btrfs: fix message printing
  Btrfs: don't bother committing delayed inode updates when fsyncing
  btrfs: move inline function code to header file
  Btrfs: remove unnecessary IS_ERR in bio_readpage_error()
  btrfs: remove unused function btrfs_insert_some_items()
  Btrfs: don't commit instead of overcommitting
  Btrfs: confirmation of value is added before trace_btrfs_get_extent() is called
  Btrfs: be smarter about dropping things from the tree log
  Btrfs: don't lookup csums for prealloc extents
  Btrfs: cache extent state when writing out dirty metadata pages
  Btrfs: do not hold the file extent leaf locked when adding extent item
  ...

Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6

Pull CIFS fixes from Steve French.

* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: reinstate the forcegid option
  Convert properly UTF-8 to UTF-16
  [CIFS] WARN_ON_ONCE if kernel_sendmsg() returns -ENOSPC

typo in UserModeLinux-HOWTO

[it seems that I sent it to the wrong maintainer at first... sorry for that]
copy_from_user was meant instead of copy_to_user.

Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

hppfs: fix the return value of get_inode()

In case of error, the function get_inode() returns ERR_PTR().
But the users hppfs_lookup() and hppfs_fill_super() use NULL
test for check the return value, not IS_ERR(), so we'd better
change the return value of get_inode() to NULL instead of
ERR_PTR().

dpatch engine is used to generated this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Richard Weinberger <richard@nod.at>

hostfs: drop vmtruncate

Removed vmtruncate.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: get rid of pointless include "..." where include <...> will do

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: move sysrq.h out of include/shared

never used by userland-side objects

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um/x86: merge 32 and 64 bit variants of ptrace.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

um/x86: merge 32 and 64bit variants of checksum.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

RDS: fix rds-ping spinlock recursion

This is the revised patch for fixing rds-ping spinlock recursion
according to Venkat's suggestions.

RDS ping/pong over TCP feature has been broken for years(2.6.39 to
3.6.0) since we have to set TCP cork and call kernel_sendmsg() between
ping/pong which both need to lock "struct sock *sk". However, this
lock has already been hold before rds_tcp_data_ready() callback is
triggerred. As a result, we always facing spinlock resursion which
would resulting in system panic.

Given that RDS ping is only used to test the connectivity and not for
serious performance measurements, we can queue the pong transmit to
rds_wq as a delayed response.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
CC: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
CC: David S. Miller <davem@davemloft.net>
CC: James Morris <james.l.morris@oracle.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netdev/phy: Prototype of_mdio_find_bus()

Ensure that of_mdio_find_bus() matches the prototype in the header (and
stop sparse complaining) by including the header with the prototype.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

farsync: fix support for over 30 cards

We're trying to fill a 64 bit bitmap but only the lower 30 shifts work
because the shift wraps around.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: Remove code that stops further access to BE NIC based on UE bits

On certain platforms, BE hardware could falsely indicate UE.
For BE family of NICs, do not set hw_error based on the UE bits.
If there was a real fatal error, the corresponding h/w block will
automatically go offline and stop traffic.

Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pch_gbe: Fix build error by selecting all the possible dependencies.

Fengguang reported a kernel build failure as following:
drivers/built-in.o: In function `pch_gbe_ioctl':
pch_gbe_main.c:(.text+0x510370): undefined reference to `pch_ch_control_write'
pch_gbe_main.c:(.text+0x510393): undefined reference to `pch_ch_control_write'
pch_gbe_main.c:(.text+0x5103b3): undefined reference to `pch_ch_control_write'
...

It's a regression by commit da1586461. The root cause is that
the CONFIG_PPS is not set there, consequently CONFIG_PTP_1588_CLOCK
can not be set anyway, which finally causes ptp_pch and pch_gbe_main
build failures.

As David prefers to use *select* to fix such module co-dependency issues,
this patch explicitly selects all the possible dependencies of PCH_PTP.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Haicheng Li <haicheng.lee@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'disintegrate-isdn-20121009' of git://git.infradead.org/users/dhowells/linux-headers

UAPI Disintegration 2012-10-09

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'disintegrate-net-20121009' of git://git.infradead.org/users/dhowells/linux-headers

UAPI Disintegration 2012-10-09

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux

Pulled mainline in order to get the UAPI infrastructure already
merged before I pull in David Howells's UAPI trees for networking.

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'disintegrate-sparc-20121009' of git://git.infradead.org/users/dhowells/linux-headers

UAPI Disintegration 2012-10-09

Signed-off-by: David S. Miller <davem@davemloft.net>

btrfs: init ref_index to zero in add_inode_ref

Signed-off-by: Chris Mason <chris.mason@fusionio.com>

mtd: Disable mtdchar mmap on MMU systems

This code was broken because it assumed that all MTD devices were map-based.
Disable it for now, until it can be fixed properly for the next merge window.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

Merge tag 'disintegrate-mtd-20121009' of git://git.infradead.org/users/dhowells/linux-headers

UAPI Disintegration 2012-10-09

Conflicts:
MAINTAINERS
arch/arm/configs/bcmring_defconfig
arch/arm/mach-imx/clk-imx51-imx53.c
drivers/mtd/nand/Kconfig
drivers/mtd/nand/bcm_umi_nand.c
drivers/mtd/nand/nand_bcm_umi.h
drivers/mtd/nand/orion_nand.c

Btrfs: remove repeated eb->pages check in, disk-io.c/csum_dirty_buffer

In csum_dirty_buffer, we first get eb from page->private.
Then we check if the page is the first page of eb. Later
we check it again. Remove the repeated check here.

Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>

Btrfs: fix page leakage

Alloc_dummy_extent_buffer will not free the first page in the eb array if we
fail to allocate a page, fix this. Thanks,

Reported-by: David Sterba <dave@jikos.cz>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: do not warn_on when we cannot alloc a page for an extent buffer

It's just annoying and the user will have gotten a nice OOM killer message
so they are already fully aware they are screwed :). Thanks,

Reported-by: Jérôme Poulin <jeromepoulin@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: don't bug on enomem in readpage

Get rid of the BUG_ON(ret == -ENOMEM) in __extent_read_full_page. Thanks,

Reported-by: Jérôme Poulin <jeromepoulin@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: cleanup pages properly when ENOMEM in compression

We were freeing non-existent pages which was causing a panic for a user who
was suffering from ENOMEM. This patch fixes the problem. Thanks,

Reported-by: Jérôme Poulin <jeromepoulin@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: make filesystem read-only when submitting barrier fails

So far the return code of barrier_all_devices() is ignored, which
means that errors are ignored. The result can be a corrupt
filesystem which is not consistent.
This commit adds code to evaluate the return code of
barrier_all_devices(). The normal btrfs_error() mechanism is used to
switch the filesystem into read-only mode when errors are detected.

In order to decide whether barrier_all_devices() should return
error or success, the number of disks that are allowed to fail the
barrier submission is calculated. This calculation accounts for the
worst RAID level of metadata, system and data. If single, dup or
RAID0 is in use, a single disk error is already considered to be
fatal. Otherwise a single disk error is tolerated.

The calculation of the number of disks that are tolerated to fail
the barrier operation is performed when the filesystem gets mounted,
when a balance operation is started and finished, and when devices
are added or removed.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>

Btrfs: detect corrupted filesystem after write I/O errors

In check-integrity, detect when a superblock is written that points
to blocks that have not been written to disk due to I/O write errors.

Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>

Btrfs: make compress and nodatacow mount options mutually exclusive

If a filesystem is mounted with compression and then remounted by adding nodatacow,
the compression is disabled but the compress flag is still visible.
Also, if a filesystem is mounted with nodatacow and then remounted with compression,
nodatacow flag is still present but it's not active.
This patch:
- removes compress flags and notifies that the compression has been disabled if the
filesystem is mounted with nodatacow
- removes nodatacow and nodatasum flags if mounted with compress.

Signed-off-by: Andrei Popa <andrei.popa@i-neo.ro>

btrfs: fix message printing

Fix various messages to include newline and module prefix.

Signed-off-by: Daniel J Blueman <daniel@quora.org>

Btrfs: don't bother committing delayed inode updates when fsyncing

We can just copy the in memory inode into the tree log directly, no sense in
updating the fs tree so we can copy it into the tree log tree. Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>

btrfs: move inline function code to header file

When building btrfs from kernel code, it will report:

fs/btrfs/extent_io.h:281: warning: 'extent_buffer_page' declared inline after being called
fs/btrfs/extent_io.h:281: warning: previous declaration of 'extent_buffer_page' was here
fs/btrfs/extent_io.h:280: warning: 'num_extent_pages' declared inline after being called
fs/btrfs/extent_io.h:280: warning: previous declaration of 'num_extent_pages' was here

because of the wrong declaration of inline functions.

Signed-off-by: Robin Dong <sanbai@taobao.com>

Btrfs: remove unnecessary IS_ERR in bio_readpage_error()

Because the value of extent_map is only a correct value or NULL,
so IS_ERR is unnecessary.

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>

btrfs: remove unused function btrfs_insert_some_items()

The function btrfs_insert_some_items() would not be called by any other functions,
so remove it.

Signed-off-by: Robin Dong <sanbai@taobao.com>

Btrfs: don't commit instead of overcommitting

I don't think we have the same problem that this was supposed to fix
originally since we can allocate chunks in the enospc path now. This code
is causing us to constantly commit the transaction as we get close to using
all of our available space in our currently allocated chunks, instead of
allocating another chunk and carrying on with life, which is not nice for
performance. Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: confirmation of value is added before trace_btrfs_get_extent() is called

We should confirm the value of extent_map before calling
trace_btrfs_get_extent() because the value of extent_map has the
possibility of NULL.

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>

Btrfs: be smarter about dropping things from the tree log

When we truncate existing items in the tree log we've been searching for
each individual item and removing them. This is unnecessary churn and
searching, just keep track of the slot we are on and how many items we need
to delete and delete them all at once. Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: don't lookup csums for prealloc extents

The tree logging stuff was looking up csums to copy over for prealloc
extents which is just work we don't need to be doing. Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: cache extent state when writing out dirty metadata pages

Everytime we write out dirty pages we search for an offset in the tree,
convert the bits in the state, and then when we wait we search for the
offset again and clear the bits.  So for every dirty range in the io tree we
are doing 4 rb searches, which is suboptimal.  With this patch we are only
doing 2 searches for every cycle (modulo weird things happening).  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: do not hold the file extent leaf locked when adding extent item

For some reason we unlock everything except the leaf we are on, set the path
blocking and then add the extent item for the extent we just finished
writing.  I can't for the life of me figure out why we would want to do
this, and the history doesn't really indicate that there was a real reason
for it, so just remove it.  This will reduce our tree lock contention on
heavy writes.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: do not async metadata csumming in certain situations

There are a coule scenarios where farming metadata csumming off to an async
thread doesn't help.  The first is if our processor supports crc32c, in
which case the csumming will be fast and so the overhead of the async model
is not worth the cost.  The other case is for our tree log.  We will be
making that stuff dirty and writing it out and waiting for it immediately.
Even with software crc32c this gives me a ~15% increase in speed with O_SYNC
workloads.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>

btrfs: fix min csum item size warnings in 32bit

commit 7ca4be45a0255ac8f08c05491c6add2dd87dd4f8 limited csum items to
PAGE_CACHE_SIZE. It used min() with incompatible types in 32bit which
generates warnings:

fs/btrfs/file-item.c: In function ‘btrfs_csum_file_blocks’:
fs/btrfs/file-item.c:717: warning: comparison of distinct pointer types lacks a cast

This uses min_t(u32,) to fix the warnings. u32 seemed reasonable
because btrfs_root->leafsize is u32 and PAGE_CACHE_SIZE is unsigned
long.

Signed-off-by: Zach Brown <zab@zabbo.net>

Btrfs: run delayed refs first when out of space

Running delayed refs is faster than running delalloc, so lets do that first
to try and reclaim space. This makes my fs_mark test about 20% faster.
Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>

Btrfs: fix orphan transaction on the freezed filesystem

With the following debug patch:

static int btrfs_freeze(struct super_block *sb)
{
+ struct btrfs_fs_info *fs_info = btrfs_sb(sb);
+ struct btrfs_transaction *trans;
+
+ spin_lock(&fs_info->trans_lock);
+ trans = fs_info->running_transaction;
+ if (trans) {
+ printk("Transid %llu, use_count %d, num_writer %d\n",
+ trans->transid, atomic_read(&trans->use_count),
+ atomic_read(&trans->num_writers));
+ }
+ spin_unlock(&fs_info->trans_lock);
return 0;
}

I found there was a orphan transaction after the freeze operation was done.

It is because the transaction may not be committed when the transaction handle
end even though it is the last handle of the current transaction. This design
avoid committing the transaction frequently, but also introduce the above
problem.

So I add btrfs_attach_transaction() which can catch the current transaction
and commit it. If there is no transaction, it will return ENOENT, and do not
anything.

This function also can be used to instead of btrfs_join_transaction_freeze()
because it don't increase the writer counter and don't start a new transaction,
so it also can fix the deadlock between sync and freeze.

Besides that, it is used to instead of btrfs_join_transaction() in
transaction_kthread(), because if there is no transaction, the transaction
kthread needn't anything.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>

Btrfs: add a type field for the transaction handle

This patch add a type field into the transaction handle structure,
in this way, we needn't implement various end-transaction functions
and can make the code more simple and readable.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>

Btrfs: fix memory leak in start_transaction()

This patch fixes memory leak of the transaction handle which happened
when starting transaction failed on a freezed fs.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>

btrfs: extended inode ref iteration

The iterate_irefs in backref.c is used to build path components from inode
refs. This patch adds code to iterate extended refs as well.

I had modify the callback function signature to abstract out some of the
differences between ref structures. iref_to_path() also needed similar
changes.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>

btrfs: extended inode refs

This patch adds basic support for extended inode refs. This includes support
for link and unlink of the refs, which basically gets us support for rename
as well.

Inode creation does not need changing - extended refs are only added after
the ref array is full.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>

Merge tag 'disintegrate-s390-20121009' of
git://git.infradead.org/users/dhowells/linux-headers

Pull UAPI patchset from David Howells:
"Can you merge the following branch into the s390 tree please.
This is to complete part of the UAPI disintegration for which the
preparatory patches were pulled recently."

Conflicts:
arch/s390/include/asm/chpid.h

s390/entry: fix svc number for TIF_SYSCALL system call restart

The load of the svc number in the TIF_SYSCALL restart path needs to be
done with an instruction that loads all 64 bits of %r1, 'lh' only loads
32 bits. If the upper half of %r1 is not zero and has the msb set,
entry64.S will try to execute an svc with a really large number.
What will be in the upper half of %r1 depends on the code generated by
gcc for the functions on the do_signal() callchain.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/mm,vmem: fix vmem_add_mem()/vmem_remove_range()

vmem_add_mem() should only then insert a large page if pmd_none() is true
for the specific entry. We might have a leftover from a previous mapping.
In addition make vmem_remove_range()'s page table walk code more complete
and fix a couple of potential endless loops (which can never happen :).

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/vmalloc: have separate modules area

Add a special module area on top of the vmalloc area, which may be only
used for modules and bpf jit generated code.
This makes sure that inter module branches will always happen without a
trampoline and in addition having all the code within a 2GB frame is
branch prediction unit friendly.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>

s390/zcrypt: remove duplicated include from zcrypt_pcixcc.c

Remove duplicated include.

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>

s390/css_chars: remove superfluous ifdef

No need for an ifdef __KERNEL__ since css_chars.h is not exported.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>