The aim of this patch is to use the newly enhanced ->dirty_inode()
super block operation to deal with atime updates, rather than
piggy backing that code into ->write_inode() as is currently
done.
The net result is a simplification of the code in various places
and a reduction of the number of gfs2_dinode_out() calls since
this is now implied by ->dirty_inode().
Some of the mark_inode_dirty() calls have been moved under glocks
in order to take advantage of then being able to avoid locking in
->dirty_inode() when we already have suitable locks.
One consequence is that generic_write_end() now correctly deals
with file size updates, so that we do not need a separate check
for that afterwards. This also, indirectly, means that fdatasync
should work correctly on GFS2 - the current code always syncs the
metadata whether it needs to or not.
Has survived testing with postmark (with and without atime) and
also fsx.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Journaled data requires that a complete flush of all dirty data for
the file is done, in order that the ail flush which comes after
will succeed.
Also the recently enhanced bug trap can trigger falsely in case
an ail flush from fsync races with a page read. This updates the
bug trap such that it will ignore buffers which are locked and
only trigger on dirty and/or pinned buffers when the ail flush
is run from fsync. The original bug trap is retained when ail
flush is run from ->go_sync()
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
If we have got far enough through the inode allocation code
path that an inode has already been allocated, then we must
call iput to dispose of it, if an error occurs during a
later part of the process. This will always be the final iput
since there will be no other references to the inode.
Unlike when the inode has been unlinked, its block state will
be GFS2_BLKST_INODE rather than GFS2_BLKST_UNLINKED so we need
to skip the test in ->evict_inode() for this one case in order
to ensure that it will be deallocated correctly. This patch adds
a new flag in order to ensure that this will happen correctly.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
We do not need to start a transaction unless the atime
check has proved positive. Also if we are going to flush
the complete ail list anyway, we might as well skip the
writeback for this specific inode's metadata, since that
will be done as part of the ail writeback process in an
order offering potentially more efficient I/O.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
The assert was being tested under the wrong lock, a
legacy of the original code. Also, if it does trigger,
the resulting information was not always a lot of help.
This moves the patch under the correct lock and also
prints out more useful information in tacking down the
source of the problem.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Now that the data writing is part of fsync proper, we can split
the waiting part out and do it later on. This reduces the
number of waits that we do during fsync on average.
There is also no need to take the i_mutex unless we are flushing
metadata to disk, so we can move that to within the metadata
flushing code.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Since there is now only a single caller to gfs2_dir_read_data()
and it has a number of constant arguments, we can factor
those out. Also some tests relating to the inode size were
being done twice.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
fib_rules: fix unresolved_rules counting
r8169: fix wrong eee setting for rlt8111evl
r8169: fix driver shutdown WoL regression.
ehea: Change maintainer to me
pptp: pptp_rcv_core() misses pskb_may_pull() call
tproxy: copy transparent flag when creating a time wait
pptp: fix skb leak in pptp_xmit()
bonding: use local function pointer of bond->recv_probe in bond_handle_frame
smsc911x: Add support for SMSC LAN89218
tg3: negate USE_PHYLIB flag check
netconsole: enable netconsole can make net_device refcnt incorrent
bluetooth: Properly clone LSM attributes to newly created child connections
l2tp: fix a potential skb leak in l2tp_xmit_skb()
bridge: fix hang on removal of bridge via netlink
x25: Prevent skb overreads when checking call user data
x25: Handle undersized/fragmented skbs
x25: Validate incoming call user data lengths
udplite: fast-path computation of checksum coverage
IPVS netns shutdown/startup dead-lock
netfilter: nf_conntrack: fix event flooding in GRE protocol tracker
Hugh Dickins [Wed, 19 Oct 2011 19:50:35 +0000 (12:50 -0700)]
mm: fix race between mremap and removing migration entry
I don't usually pay much attention to the stale "? " addresses in
stack backtraces, but this lucky report from Pawel Sikora hints that
mremap's move_ptes() has inadequate locking against page migration.
mremap's down_write of mmap_sem, together with i_mmap_mutex or lock,
and pagetable locks, were good enough before page migration (with its
requirement that every migration entry be found) came in, and enough
while migration always held mmap_sem; but not enough nowadays, when
there's memory hotremove and compaction.
The danger is that move_ptes() lets a migration entry dodge around
behind remove_migration_pte()'s back, so it's in the old location when
looking at the new, then in the new location when looking at the old.
Either mremap's move_ptes() must additionally take anon_vma lock(), or
migration's remove_migration_pte() must stop peeking for is_swap_entry()
before it takes pagetable lock.
Consensus chooses the latter: we prefer to add overhead to migration
than to mremapping, which gets used by JVMs and by exec stack setup.
Kjetil Oftedal [Wed, 19 Oct 2011 23:20:50 +0000 (16:20 -0700)]
sparc: Add alignment flag to PCI expansion resources
Currently no type of alignment is specified for PCI expansion roms while
parsing the openfirmware tree. This causes calls to pci_map_rom() to fail.
IORESOURCE_SIZEALIGN is the default alignment used for rom resouces in
pci/probe.c, and has been verified to work with various cards on a ultra 10.
Signed-off-By: Kjetil Oftedal <oftedal@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
françois romieu [Fri, 14 Oct 2011 00:57:45 +0000 (00:57 +0000)]
r8169: fix driver shutdown WoL regression.
Due to commit 92fc43b4159b518f5baae57301f26d770b0834c9 ("r8169: modify the
flow of the hw reset."), rtl8169_hw_reset stomps during driver shutdown on
RxConfig bits which are needed for WOL on some versions of the hardware.
As these bits were formerly set from the r81{0x, 68}_pll_power_down methods,
factor them out for use in the driver shutdown (rtl_shutdown) handler.
I favored __rtl8169_get_wol() -hardware state indication- over
RTL_FEATURE_WOL as the latter has become a good candidate for removal.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Hayes <hayeswang@realtek.com> Tested-by: Marc Ballarin <ballarin.marc@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Cc: Breno Leitao <leitao@linux.vnet.ibm.com> Acked-by: Breno Leitão <leitao@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 19 Oct 2011 13:43:24 +0000 (06:43 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon/kms/atom: fix handling of FB scratch indices
drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges (v2)
drm/radeon/kms/DCE4.1: ss is not supported on the internal pplls
drm/radeon/kms/DCE4.1: fix dig encoder to transmitter mapping
ttm: Fix error-path using an uninitialized value
Antonio Ospite [Wed, 12 Oct 2011 20:59:26 +0000 (17:59 -0300)]
[media] videodev: fix a NULL pointer dereference in v4l2_device_release()
The change in 8280b66 does not cover the case when v4l2_dev is already
NULL, fix that.
With a Kinect sensor, seen as an USB camera using GSPCA in this context,
a NULL pointer dereference BUG can be triggered by just unplugging the
device after the camera driver has been loaded.
Signed-off-by: Antonio Ospite <ospite@studenti.unina.it> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Alex Deucher [Wed, 19 Oct 2011 00:10:05 +0000 (20:10 -0400)]
drm/radeon/kms/atom: fix handling of FB scratch indices
FB scratch indices are dword indices, but we were treating
them as byte indices. As such, we were getting the wrong
FB scratch data for non-0 indices. Fix the indices and
guard the indexing against indices larger than the scratch
allocation.
Fixes memory corruption on some boards if data was written
past the end of the FB scratch array.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: Dave Airlie <airlied@redhat.com> Tested-by: Dave Airlie <airlied@redhat.com> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
pptp_rcv_core() is a nice example of a function assuming everything it
needs is available in skb head.
Reported-by: Bradley Peterson <despite@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
KOVACS Krisztian [Tue, 18 Oct 2011 10:17:35 +0000 (10:17 +0000)]
tproxy: copy transparent flag when creating a time wait
The transparent socket option setting was not copied to the time wait
socket when an inet socket was being replaced by a time wait socket. This
broke the --transparent option of the socket match and may have caused
that FIN packets belonging to sockets in FIN_WAIT2 or TIME_WAIT state
were being dropped by the packet filter.
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
Mitsuo Hayasaka [Wed, 12 Oct 2011 16:04:29 +0000 (16:04 +0000)]
bonding: use local function pointer of bond->recv_probe in bond_handle_frame
The bond->recv_probe is called in bond_handle_frame() when
a packet is received, but bond_close() sets it to NULL. So,
a panic occurs when both functions work in parallel.
Why this happen:
After null pointer check of bond->recv_probe, an sk_buff is
duplicated and bond->recv_probe is called in bond_handle_frame.
So, a panic occurs when bond_close() is called between the
check and call of bond->recv_probe.
Patch:
This patch uses a local function pointer of bond->recv_probe
in bond_handle_frame(). So, it can avoid the null pointer
dereference.
Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 11 Oct 2011 23:00:41 +0000 (23:00 +0000)]
tg3: negate USE_PHYLIB flag check
USE_PHYLIB flag in tg3_remove_one() is being checked incorrectly. This
results tg3_phy_fini->phy_disconnect is never called and when tg3 module
is removed.
In my case this resulted in panics in phy_state_machine calling function
phydev->adjust_link.
So correct this check.
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Acked-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Moore [Fri, 7 Oct 2011 09:40:59 +0000 (09:40 +0000)]
bluetooth: Properly clone LSM attributes to newly created child connections
The Bluetooth stack has internal connection handlers for all of the various
Bluetooth protocols, and unfortunately, they are currently lacking the LSM
hooks found in the core network stack's connection handlers. I say
unfortunately, because this can cause problems for users who have have an
LSM enabled and are using certain Bluetooth devices. See one problem
report below:
In order to keep things simple at this point in time, this patch fixes the
problem by cloning the parent socket's LSM attributes to the newly created
child socket. If we decide we need a more elaborate LSM marking mechanism
for Bluetooth (I somewhat doubt this) we can always revisit this decision
in the future.
Reported-by: James M. Cape <jcape@ignore-your.tv> Signed-off-by: Paul Moore <pmoore@redhat.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Need to cleanup bridge device timers and ports when being bridge
device is being removed via netlink.
This fixes the problem of observed when doing:
ip link add br0 type bridge
ip link set dev eth1 master br0
ip link set br0 up
ip link del br0
which would cause br0 to hang in unregister_netdev because
of leftover reference count.
Reported-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Where the first one is enabling a CLOCK_PROCESS_CPUTIME_ID timer, and
the second one is keeping up-to-date.
This problem was introduced by e8abccb7193 ("posix-cpu-timers: Cure
SMP accounting oddities").
Cure the problem by removing the cputimer->lock and rq->lock nesting,
this leaves concurrent enablers doing duplicate work, but the time
wasted should be on the same order otherwise wasted spinning on the
lock and the greater-than assignment filter should ensure we preserve
monotonicity.
Reported-by: Dave Jones <davej@redhat.com> Reported-by: Simon Kirby <sim@hostway.ca> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: stable@kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/1318928713.21167.4.camel@twins Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Matthew Daley [Fri, 14 Oct 2011 18:45:05 +0000 (18:45 +0000)]
x25: Prevent skb overreads when checking call user data
x25_find_listener does not check that the amount of call user data given
in the skb is big enough in per-socket comparisons, hence buffer
overreads may occur. Fix this by adding a check.
Signed-off-by: Matthew Daley <mattjd@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Andrew Hendry <andrew.hendry@gmail.com> Cc: stable <stable@kernel.org> Acked-by: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matthew Daley [Fri, 14 Oct 2011 18:45:04 +0000 (18:45 +0000)]
x25: Handle undersized/fragmented skbs
There are multiple locations in the X.25 packet layer where a skb is
assumed to be of at least a certain size and that all its data is
currently available at skb->data. These assumptions are not checked,
hence buffer overreads may occur. Use pskb_may_pull to check these
minimal size assumptions and ensure that data is available at skb->data
when necessary, as well as use skb_copy_bits where needed.
Signed-off-by: Matthew Daley <mattjd@gmail.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Andrew Hendry <andrew.hendry@gmail.com> Cc: stable <stable@kernel.org> Acked-by: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matthew Daley [Fri, 14 Oct 2011 18:45:03 +0000 (18:45 +0000)]
x25: Validate incoming call user data lengths
X.25 call user data is being copied in its entirety from incoming messages
without consideration to the size of the destination buffers, leading to
possible buffer overflows. Validate incoming call user data lengths before
these copies are performed.
It appears this issue was noticed some time ago, however nothing seemed to
come of it: see http://www.spinics.net/lists/linux-x25/msg00043.html and
commit 8db09f26f912f7c90c764806e804b558da520d4f.
Signed-off-by: Matthew Daley <mattjd@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Andrew Hendry <andrew.hendry@gmail.com> Cc: stable <stable@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 17 Oct 2011 23:07:30 +0000 (19:07 -0400)]
udplite: fast-path computation of checksum coverage
Commit 903ab86d195cca295379699299c5fc10beba31c7 of 1 March this year ("udp: Add
lockless transmit path") introduced a new fast TX path that broke the checksum
coverage computation of UDP-lite, which so far depended on up->len (only set
if the socket is locked and 0 in the fast path).
Fixed by providing both fast- and slow-path computation of checksum coverage.
The latter can be removed when UDP(-lite)v6 also uses a lockless transmit path.
Reported-by: Thomas Volkert <thomas@homer-conferencing.com> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 17 Oct 2011 15:24:24 +0000 (08:24 -0700)]
Avoid using variable-length arrays in kernel/sys.c
The size is always valid, but variable-length arrays generate worse code
for no good reason (unless the function happens to be inlined and the
compiler sees the length for the simple constant it is).
Also, there seems to be some code generation problem on POWER, where
Henrik Bakken reports that register r28 can get corrupted under some
subtle circumstances (interrupt happening at the wrong time?). That all
indicates some seriously broken compiler issues, but since variable
length arrays are bad regardless, there's little point in trying to
chase it down.
"Just don't do that, then".
Reported-by: Henrik Grindal Bakken <henribak@cisco.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Per the text in Documentation/SubmitChecklist as below, we should
explicitly have header linux/errno.h in localtimer.h for ENXIO
reference.
1: If you use a facility then #include the file that defines/declares
that facility. Don't depend on other header files pulling in ones
that you use.
Otherwise, we may run into some compiling error like the following one,
if any file includes localtimer.h without CONFIG_LOCAL_TIMERS defined.
arch/arm/include/asm/localtimer.h: In function ‘local_timer_setup’:
arch/arm/include/asm/localtimer.h:53:10: error: ‘ENXIO’ undeclared (first use in this function)
Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Will Deacon [Mon, 3 Oct 2011 17:30:53 +0000 (18:30 +0100)]
ARM: 7117/1: perf: fix HW_CACHE_* events on Cortex-A9
Using COHERENT_LINE_{MISS,HIT} for cache misses and references
respectively is completely wrong. Instead, use the L1D events which
are a better and more useful approximation despite ignoring instruction
traffic.
Reported-by: Alasdair Grant <alasdair.grant@arm.com> Reported-by: Matt Horsnell <matt.horsnell@arm.com> Reported-by: Michael Williams <michael.williams@arm.com> Cc: stable@kernel.org Cc: Jean Pihet <j-pihet@ti.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Fri, 14 Oct 2011 05:06:39 +0000 (17:06 +1200)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: revert to using a kthread for AIL pushing
xfs: force the log if we encounter pinned buffers in .iop_pushbuf
xfs: do not update xa_last_pushed_lsn for locked items
Mika Westerberg [Thu, 13 Oct 2011 09:04:20 +0000 (12:04 +0300)]
x86, mrst: use a temporary variable for SFI irq
SFI tables reside in RAM and should not be modified once they are
written. Current code went to set pentry->irq to zero which causes
subsequent reads to fail with invalid SFI table checksum. This will
break kexec as the second kernel fails to validate SFI tables.
To fix this we use temporary variable for irq number.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The w83627ehf driver is improperly reporting thermal diode sensors as
type 2, instead of 3. This caused "sensors" and possibly other
monitoring tools to report these sensors as "transistor" instead of
"thermal diode".
Furthermore, diode subtype selection (CPU vs. external) is only
supported by the original W83627EHF/EHG. All later models only support
CPU diode type, and some (NCT6776F) don't even have the register in
question so we should avoid reading from it.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: stable@kernel.org Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Hartmut Knaack [Mon, 10 Oct 2011 22:22:45 +0000 (00:22 +0200)]
gpio-pca953x: fix gpio_base
gpio_base was set to 0 if no system platform data or open firmware
platform data was provided. This led to conflicts, if any other gpiochip
with a gpiobase of 0 was instantiated already. Setting it to -1 will
automatically use the first one available.
Signed-off-by: Hartmut Knaack <knaack.h@gmx.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
gpio/omap: fix build error with certain OMAP1 configs
With commit f64ad1a0e21a, "gpio/omap: cleanup _set_gpio_wakeup(), remove
ifdefs", access to build time conditionally omitted 'suspend_wakeup'
member of the 'gpio_bank' structure has been placed unconditionally in
function _set_gpio_wakeup(), which is always built. This resulted in the
driver compilation broken for certain OMAP1, i.e., non-OMAP16xx,
configurations.
Really required or not in previously excluded cases, define this
structure member unconditionally as a fix.
Tested with a custom OMAP1510 only configuration.
Signed-off-by: Janusz Krzysztofik <jkrzyszt@tis.icnet.pl> Acked-by: Kevin Hilman <khilman@ti.com> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Chris Metcalf [Wed, 5 Oct 2011 21:09:29 +0000 (17:09 -0400)]
tile: revert change from <asm/atomic.h> to <linux/atomic.h> in asm files
The 32-bit TILEPro support uses some #defines in <asm/atomic_32.h>
for atomic support routines in assembly. To make this more explicit,
I've turned those includes into includes of <asm/atomic_32.h>, which
should hopefully make it clear that they shouldn't be bombed into
<linux/atomic.h> in any cleanups.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
mscan: too much data copied to CAN frame due to 16 bit accesses
gro: refetch inet6_protos[] after pulling ext headers
bnx2x: fix cl_id allocation for non-eth clients for NPAR mode
mlx4_en: fix endianness with blue frame support
There are a lot of file references to now moved or deleted files in the
whole tree, especially in documentation and Kconfig files. This patch
fixes the references in drivers/ide/.
Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net> Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matt Fleming [Thu, 11 Aug 2011 13:57:02 +0000 (14:57 +0100)]
sparc: Use set_current_blocked()
As described in e6fa16ab ("signal: sigprocmask() should do
retarget_shared_pending()") the modification of current->blocked is
incorrect as we need to check whether the signal we're about to block
is pending in the shared queue.
Cc: Oleg Nesterov <oleg@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we have a few issues with the way the workqueue code is used to
implement AIL pushing:
- it accidentally uses the same workqueue as the syncer action, and thus
can be prevented from running if there are enough sync actions active
in the system.
- it doesn't use the HIGHPRI flag to queue at the head of the queue of
work items
At this point I'm not confident enough in getting all the workqueue flags and
tweaks right to provide a perfectly reliable execution context for AIL
pushing, which is the most important piece in XFS to make forward progress
when the log fills.
Revert back to use a kthread per filesystem which fixes all the above issues
at the cost of having a task struct and stack around for each mounted
filesystem. In addition this also gives us much better ways to diagnose
any issues involving hung AIL pushing and removes a small amount of code.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
xfs: force the log if we encounter pinned buffers in .iop_pushbuf
We need to check for pinned buffers even in .iop_pushbuf given that inode
items flush into the same buffers that may be pinned directly due operations
on the unlinked inode list operating directly on buffers. To do this add a
return value to .iop_pushbuf that tells the AIL push about this and use
the existing log force mechanisms to unpin it.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
xfs: do not update xa_last_pushed_lsn for locked items
If an item was locked we should not update xa_last_pushed_lsn and thus skip
it when restarting the AIL scan as we need to be able to lock and write it
out as soon as possible. Otherwise heavy lock contention might starve AIL
pushing too easily, especially given the larger backoff once we moved
xa_last_pushed_lsn all the way to the target lsn.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Tested-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Chris Mason [Tue, 11 Oct 2011 15:41:40 +0000 (11:41 -0400)]
Btrfs: make sure not to defrag extents past i_size
The btrfs file defrag code will loop through the extents and
force COW on them. But there is a concurrent truncate in the middle of
the defrag, it might end up defragging the same range over and over
again.
The problem is that writepage won't go through and do anything on pages
past i_size, so the cow won't happen, so the file will appear to still
be fragmented. defrag will end up hitting the same extents again and
again.
In the worst case, the truncate can actually live lock with the defrag
because the defrag keeps creating new ordered extents which the truncate
code keeps waiting on.
The fix here is to make defrag check for i_size inside the main loop,
instead of just once before the looping starts.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Is caused by commit 3ae36655 ("x86-64: Rework vsyscall emulation and add
vsyscall= parameter") - the vsyscall emulation code is not fully cooked
yet as UML relies on some rather fragile SIGSEGV semantics.
Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default
to vsyscall=native for now, this patch implements that.
and then it'll go into a loop: writeback -> defrag -> writeback ...
It's because writeback writes [8K, 200K] and then writes [0, 8K].
I tried to make writeback know if the pages are dirtied by defrag,
but the patch was a bit intrusive. Here I simply set writeback_index
when we defrag a file.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
mscan: too much data copied to CAN frame due to 16 bit accesses
Due to the 16 bit access to mscan registers there's too much data copied to
the zero initialized CAN frame when having an odd number of bytes to copy.
This patch ensures that only the requested bytes are copied by using an
8 bit access for the remaining byte.
Reported-by: Andre Naujoks <nautsch@gmail.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Sun, 9 Oct 2011 23:57:36 +0000 (23:57 +0000)]
bnx2x: fix cl_id allocation for non-eth clients for NPAR mode
There are some consolidations of NPAR configuration
when FCoE and iSCSI L2 clients will get the same id,
in this case FCoE ring will be non-functional.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The doorbell register was being unconditionally swapped. In x86, that
meant it was being swapped to BE and written to the descriptor and to
memory, depending on the case of blue frame support or writing to
doorbell register. On PPC, this meant it was being swapped to LE and
then swapped back to BE while writing to the register. But in the blue
frame case, it was being written as LE to the descriptor.
The fix is not to swap doorbell unconditionally, write it to the
register as BE and convert it to BE when writing it to the descriptor.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Reported-by: Richard Hendrickson <richhend@us.ibm.com> Cc: Eli Cohen <eli@dev.mellanox.co.il> Cc: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 10 Oct 2011 02:48:27 +0000 (14:48 +1200)]
Merge branch 'fixes' of git://git.linaro.org/people/arnd/arm-soc
* 'fixes' of git://git.linaro.org/people/arnd/arm-soc:
ARM: mach-ux500: enable fix for ARM errata 754322
ARM: OMAP: musb: Remove a redundant omap4430_phy_init call in usb_musb_init
ARM: OMAP: Fix i2c init for twl4030
ARM: OMAP4: MMC: fix power and audio issue, decouple USBC1 from MMC1
Marc Dietrich [Fri, 7 Oct 2011 15:31:41 +0000 (08:31 -0700)]
ARM: tegra: fix compilation error due to mach/hardware.h removal
This fixes a compilation error in cpu-tegra.c which was introduced in dc8d966bccde ("ARM: convert PCI defines to variables") which removed the
now obsolete mach/hardware.h from the mach-tegra subtree.
Signed-off-by: Marc Dietrich <marvin24@gmx.de> Signed-off-by: Olof Johansson <olof@lixom.net> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 10 Oct 2011 02:43:06 +0000 (14:43 +1200)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon/kms: use hardcoded dig encoder to transmitter mapping for DCE4.1
drm/radeon/kms: fix dp_detect handling for DP bridge chips
drm/radeon/kms: retry aux transactions if there are status flags
Olof Johansson [Fri, 7 Oct 2011 20:27:48 +0000 (13:27 -0700)]
MAINTAINERS: Update tegra maintainer information
A couple of changes to the Tegra maintainership setup:
I'm very glad to bring on Stephen Warren on board as a maintainer. The
work he has done so far is excellent, and the fact that he works for
Nvidia means he has long-term interest in the platform.
Erik Gilling did an astounding amount of work on getting things up and
running but has been a silent partner on the maintainership side for a
while, and is stepping down. Thanks for your contributions so far, Erik.
Finally, update the git URL since I'll take over running the main repo
for a while.
Overall maintainership model isn't changing much at this time: We'll all
three review patches as appropriate, and one of us will collect the main
repo (me at this time).
Signed-off-by: Olof Johansson <olof@lixom.net> Cc: Erik Gilling <konkers@android.com> Acked-by: Colin Cross <ccross@android.com> Acked-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 10 Oct 2011 02:39:03 +0000 (14:39 +1200)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (29 commits)
MIPS: Call oops_enter, oops_exit in die
staging/octeon: Software should check the checksum of no tcp/udp packets
MIPS: Octeon: Enable C0_UserLocal probing.
MIPS: No branches in delay slots for huge pages in handle_tlbl
MIPS: Don't clobber CP0_STATUS value for CONFIG_MIPS_MT_SMTC
MIPS: Octeon: Select CONFIG_HOLES_IN_ZONE
MIPS: PM: Use struct syscore_ops instead of sysdevs for PM (v2)
MIPS: Compat: Use 32-bit wrapper for compat_sys_futex.
MIPS: Do not use EXTRA_CFLAGS
MIPS: Alchemy: DB1200: Disable cascade IRQ in handler
SERIAL: Lantiq: Set timeout in uart_port
MIPS: Lantiq: Fix setting the PCI bus speed on AR9
MIPS: Lantiq: Fix external interrupt sources
MIPS: tlbex: Fix build error in R3000 code.
MIPS: Alchemy: Include Au1100 in PM code.
MIPS: Alchemy: Fix typo in MAC0 registration
MIPS: MSP71xx: Fix build error.
MIPS: Handle __put_user() sleeping.
MIPS: Allow forced irq threading
MIPS: i8259: Mark cascade interrupt non-threaded
...
Steve French [Fri, 7 Oct 2011 04:14:07 +0000 (23:14 -0500)]
[CIFS] Fix first time message on mount, ntlmv2 upgrade delayed to 3.2
Microsoft has a bug with ntlmv2 that requires use of ntlmssp, but
we didn't get the required information on when/how to use ntlmssp to
old (but once very popular) legacy servers (various NT4 fixpacks
for example) until too late to merge for 3.1. Will upgrade
to NTLMv2 in NTLMSSP in 3.2
Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com>
The LEON MMU Model (SRMMU) does not implement MMu Table probing
in hardware, instead it is implemented in software. However the
software implementation does not return the PTE as it should which
always results in INVALID entires and the PROM mappings are not
inherited as they should during startup. The following patch
removes the masking of the PTE.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 6 Oct 2011 23:15:10 +0000 (16:15 -0700)]
Merge git://github.com/davem330/net
* git://github.com/davem330/net:
net: fix typos in Documentation/networking/scaling.txt
bridge: leave carrier on for empty bridge
netfilter: Use proper rwlock init function
tcp: properly update lost_cnt_hint during shifting
tcp: properly handle md5sig_pool references
macvlan/macvtap: Fix unicast between macvtap interfaces in bridge mode
Paul Menzel [Wed, 31 Aug 2011 15:07:10 +0000 (17:07 +0200)]
x86/PCI: use host bridge _CRS info on ASUS M2V-MX SE
In summary, this DMI quirk uses the _CRS info by default for the ASUS
M2V-MX SE by turning on `pci=use_crs` and is similar to the quirk
added by commit 2491762cfb47 ("x86/PCI: use host bridge _CRS info on
ASRock ALiveSATA2-GLAN") whose commit message should be read for further
information.
Since commit 3e3da00c01d0 ("x86/pci: AMD one chain system to use pci
read out res") Linux gives the following oops:
Trusting the ACPI _CRS information (`pci=use_crs`) fixes this problem.
$ dmesg | grep -i crs # with the quirk
PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
The match has to be against the DMI board entries though since the vendor entries are not populated.
DMI: System manufacturer System Product Name/M2V-MX SE, BIOS 0304 10/30/2007
This quirk should be removed when `pci=use_crs` is enabled for machines
from 2006 or earlier or some other solution is implemented.
Using coreboot [1] with this board the problem does not exist but this
quirk also does not affect it either. To be safe though the check is
tightened to only take effect when the BIOS from American Megatrends is
used.
15:13 < ruik> but coreboot does not need that
15:13 < ruik> because i have there only one root bus
15:13 < ruik> the audio is behind a bridge
$ sudo dmidecode
BIOS Information
Vendor: American Megatrends Inc.
Version: 0304
Release Date: 10/30/2007
This resolves a regression seen by some users of bridging.
Some users use the bridge like a dummy device.
They expect to be able to put an IPv6 address on the device
with no ports attached. Although there are better ways of doing
this, there is no reason to not allow it.
Note: the bridge still will reflect the state of ports in the
bridge if there are any added.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
Thomas Gleixner [Wed, 5 Oct 2011 03:24:43 +0000 (03:24 +0000)]
netfilter: Use proper rwlock init function
Replace the open coded initialization with the init function.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* git://bedivere.hansenpartnership.com/git/scsi-rc-fixes-2.6:
[SCSI] libsas: fix panic when single phy is disabled on a wide port
[SCSI] qla2xxx: Fix crash in qla2x00_abort_all_cmds() on unload
Alex Deucher [Tue, 4 Oct 2011 16:23:24 +0000 (12:23 -0400)]
drm/radeon/kms: fix dp_detect handling for DP bridge chips
The HPD pin is not reliable for detecting whether a monitor
is connected or not. Skip HPD and just use DDC or load
detection.
Fixes phantom VGA connected bugs.
[Michel: fixes phantom VGA bugs on his llano system.]
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Tue, 4 Oct 2011 21:23:15 +0000 (17:23 -0400)]
drm/radeon/kms: retry aux transactions if there are status flags
If there are error flags in the aux status, retry the transaction.
This makes aux much more reliable, especially on llano systems.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
Yan, Zheng [Sun, 2 Oct 2011 04:21:50 +0000 (04:21 +0000)]
tcp: properly update lost_cnt_hint during shifting
lost_skb_hint is used by tcp_mark_head_lost() to mark the first unhandled skb.
lost_cnt_hint is the number of packets or sacked packets before the lost_skb_hint;
When shifting a skb that is before the lost_skb_hint, if tcp_is_fack() is ture,
the skb has already been counted in the lost_cnt_hint; if tcp_is_fack() is false,
tcp_sacktag_one() will increase the lost_cnt_hint. So tcp_shifted_skb() does not
need to adjust the lost_cnt_hint by itself. When shifting a skb that is equal to
lost_skb_hint, the shifted packets will not be counted by tcp_mark_head_lost().
So tcp_shifted_skb() should adjust the lost_cnt_hint even tcp_is_fack(tp) is true.
Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_v4_clear_md5_list() assumes that multiple tcp md5sig peers
only hold one reference to md5sig_pool. but tcp_v4_md5_do_add()
increases use count of md5sig_pool for each peer. This patch
makes tcp_v4_md5_do_add() only increases use count for the first
tcp md5sig peer.
Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Ward [Sun, 18 Sep 2011 12:53:20 +0000 (12:53 +0000)]
macvlan/macvtap: Fix unicast between macvtap interfaces in bridge mode
Packets should always be forwarded to the lowerdev using dev_forward_skb.
vlan->forward is for packets being forwarded directly to another macvlan/
macvtap device (used for multicast in bridge mode).
Reported-and-tested-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 4 Oct 2011 17:37:06 +0000 (10:37 -0700)]
Merge git://github.com/davem330/net
* git://github.com/davem330/net:
pch_gbe: Fixed the issue on which a network freezes
pch_gbe: Fixed the issue on which PC was frozen when link was downed.
make PACKET_STATISTICS getsockopt report consistently between ring and non-ring
net: xen-netback: correctly restart Tx after a VM restore/migrate
bonding: properly stop queuing work when requested
can bcm: fix incomplete tx_setup fix
RDSRDMA: Fix cleanup of rds_iw_mr_pool
net: Documentation: Fix type of variables
ibmveth: Fix oops on request_irq failure
ipv6: nullify ipv6_ac_list and ipv6_fl_list when creating new socket
cxgb4: Fix EEH on IBM P7IOC
can bcm: fix tx_setup off-by-one errors
MAINTAINERS: tehuti: Alexander Indenbaum's address bounces
dp83640: reduce driver noise
ptp: fix L2 event message recognition
Linus Torvalds [Tue, 4 Oct 2011 16:59:22 +0000 (09:59 -0700)]
Merge branch 'fix/asoc' of git://github.com/tiwai/sound
* 'fix/asoc' of git://github.com/tiwai/sound:
ASoC: omap_mcpdm_remove cannot be __devexit
ASoC: Fix setting update bits for WM8753_LADC and WM8753_RADC
ASoC: use a valid device for dev_err() in Zylonite