john stultz [Fri, 2 Oct 2009 23:24:15 +0000 (16:24 -0700)]
time: Remove xtime_cache
With the prior logarithmic time accumulation patch, xtime will now
always be within one "tick" of the current time, instead of
possibly half a second off.
This removes the need for the xtime_cache value, which always
stored the time at the last interrupt, so this patch cleans that up
removing the xtime_cache related code.
This is a bit simpler, but still could use some wider testing.
Signed-off-by: John Stultz <johnstul@us.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: John Kacur <jkacur@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <1254525855.7741.95.camel@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
john stultz [Fri, 2 Oct 2009 23:17:53 +0000 (16:17 -0700)]
time: Implement logarithmic time accumulation
Accumulating one tick at a time works well unless we're using NOHZ.
Then it can be an issue, since we may have to run through the loop
a few thousand times, which can increase timer interrupt caused
latency.
The current solution was to accumulate in half-second intervals
with NOHZ. This kept the number of loops down, however it did
slightly change how we make NTP adjustments. While not an issue
with NTPd users, as NTPd makes adjustments over a longer period of
time, other adjtimex() users have noticed the half-second
granularity with which we can apply frequency changes to the clock.
For instance, if a application tries to apply a 100ppm frequency
correction for 20ms to correct a 2us offset, with NOHZ they either
get no correction, or a 50us correction.
Now, there will always be some granularity error for applying
frequency corrections. However with users sensitive to this error
have seen a 50-500x increase with NOHZ compared to running without
NOHZ.
So I figured I'd try another approach then just simply increasing
the interval. My approach is to consume the time interval
logarithmically. This reduces the number of times through the loop
needed keeping latency down, while still preserving the original
granularity error for adjtimex() changes.
Further, this change allows us to remove the xtime_cache code
(patch to follow), as xtime is always within one tick of the
current time, instead of the half-second updates it saw before.
An earlier version of this patch has been shipping to x86 users in
the RedHat MRG releases for awhile without issue, but I've reworked
this version to be even more careful about avoiding possible
overflows if the shift value gets too large.
Signed-off-by: John Stultz <johnstul@us.ibm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: John Kacur <jkacur@redhat.com> Cc: Clark Williams <williams@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <1254525473.7741.88.camel@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Sun, 4 Oct 2009 04:44:21 +0000 (21:44 -0700)]
tty: Avoid dropping ldisc_mutex over hangup tty re-initialization
A couple of people have hit the WARN_ON() in drivers/char/tty_io.c,
tty_open() that is unhappy about seeing the tty line discipline go away
during the tty hangup. See for example
http://bugzilla.kernel.org/show_bug.cgi?id=14255
and the reason is that we do the tty_ldisc_halt() outside the
ldisc_mutex in order to be able to flush the scheduled work without a
deadlock with vhangup_work.
However, it turns out that we can solve this particular case by
- using "cancel_delayed_work_sync()" in tty_ldisc_halt(), which waits
for just the particular work, rather than synchronizing with any
random outstanding pending work.
This won't deadlock, since the buf.work we synchronize with doesn't
care about the ldisc_mutex, it just flushes the tty ldisc buffers.
- realize that for this particular case, we don't need to wait for any
hangup work, because we are inside the hangup codepaths ourselves.
so as a result we can just drop the flush_scheduled_work() entirely, and
then move the tty_ldisc_halt() call to inside the mutex. That way we
never expose the partially torn down ldisc state to tty_open(), and hold
the ldisc_mutex over the whole sequence.
This patch fixes the m32r SMP kernel after 2.6.27.
A part of the following patch breaks m32r SMP operation.
> m32r: convert to generic helpers for IPI function calls
> commit 7b7426c8a615cf61df9a77b9df7d5b75d91e3fa0
In the above patch, a CALL_FUNC_SINGLE_IPI was newly introduced,
but the its IPI vector number was wrong in the patch code.
The m32r SMP kernel hanged-up during boot operation, because
the CPU_BOOT_IPI was called instead of CALL_FUNC_SINGLE_IPI
(CPU_BOOT_IPI had no side effect at that time because the 2nd
core had already been started up),
as a result, csd_unlock() was not called, then a dead lock
occurred in csd_lock_wait() after the detection of Compact Flash
memory as IDE generic disk.
Hirokazu Takata [Wed, 26 Aug 2009 04:04:33 +0000 (13:04 +0900)]
m32r: define ioread* and iowrite* macros
Define ioread* and iowrite* macros to fix the following build errors:
CC [M] drivers/uio/uio_smx.o
drivers/uio/uio_smx.c: In function 'smx_handler':
drivers/uio/uio_smx.c:31: error: implicit declaration of function 'ioread32'
drivers/uio/uio_smx.c:37: error: implicit declaration of function 'iowrite32'
Linus Torvalds [Sat, 3 Oct 2009 18:24:19 +0000 (11:24 -0700)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
[PATCH] ext4: retry failed direct IO allocations
ext4: Fix build warning in ext4_dirty_inode()
ext4: drop ext4dev compat
ext4: fix a BUG_ON crash by checking that page has buffers attached to it
ARM: 5728/1: Proper prefetch abort handling on ARMv6 and ARMv7
Currently, on ARMv6 and ARMv7, if an application tries to execute
code (or garbage) on non-executable page it hangs. It caused by
incorrect prefetch abort handling. Now every prefetch abort
processes as a translation fault.
To fix this we have to analyze instruction fault status register
to figure out reason why we've got the abort and process it
accordingly.
To make IFSR different from DFSR we set bit 31 which is reserved in
both IFSR and DFSR.
This patch also tries to protect from future hangs on unexpected
exceptions. An application will be killed if unexpected exception
type was received.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ARM: 5727/1: Pass IFSR register to do_PrefetchAbort()
Instruction fault status register, IFSR, was introduced on ARMv6 to
provide status information about the last insturction fault. It
needed for proper prefetch abort handling.
Now we have three prefetch abort model:
* legacy - for CPUs before ARMv6. They doesn't provide neither
IFSR nor IFAR. We simulate IFSR with section translation fault
status for them to generalize code;
* ARMv6 - provides IFSR, but not IFAR;
* ARMv7 - provides both IFSR and IFAR.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Greg Ungerer [Thu, 1 Oct 2009 23:45:28 +0000 (00:45 +0100)]
ARM: 5740/1: fix valid_phys_addr_range() range check
Commit 1522ac3ec95ff0230e7aa516f86b674fdf72866c
("Fix virtual to physical translation macro corner cases")
breaks the end of memory check in valid_phys_addr_range().
The modified expression results in the apparent /dev/mem size
being 2 bytes smaller than what it actually is.
This patch reworks the expression to correctly check the address,
while maintaining use of a valid address to __pa().
Signed-off-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
David Brown [Thu, 1 Oct 2009 16:43:29 +0000 (17:43 +0100)]
ARM: 5739/1: ARM: allow empty ATAG_CORE
From: David Brown <davidb@quicinc.com>
The ATAG_CORE is allowed to be empty. Although this is handled
by parse_tag_core(), __vet_atags during startup rejects this tag
unless it contains data. Allow the initial tag to be either the
full size, or empty.
Signed-off-by: David Brown <davidb@quicinc.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
EXPORT_* macros should follow immediately after the closing function
brace line.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Acked-by: Kristoffer Ericson <kristoffer.ericson@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (46 commits)
cnic: Fix NETDEV_UP event processing.
uvesafb/connector: Disallow unpliviged users to send netlink packets
pohmelfs/connector: Disallow unpliviged users to configure pohmelfs
dst/connector: Disallow unpliviged users to configure dst
dm/connector: Only process connector packages from privileged processes
connector: Removed the destruct_data callback since it is always kfree_skb()
connector/dm: Fixed a compilation warning
connector: Provide the sender's credentials to the callback
connector: Keep the skb in cn_callback_data
e1000e/igb/ixgbe: Don't report an error if devices don't support AER
net: Fix wrong sizeof
net: splice() from tcp to pipe should take into account O_NONBLOCK
net: Use sk_mark for routing lookup in more places
sky2: irqname based on pci address
skge: use unique IRQ name
IPv4 TCP fails to send window scale option when window scale is zero
net/ipv4/tcp.c: fix min() type mismatch warning
Kconfig: STRIP: Remove stale bits of STRIP help text
NET: mkiss: Fix typo
tg3: Remove prev_vlan_tag from struct tx_ring_info
...
Michael Chan [Fri, 2 Oct 2009 18:03:28 +0000 (11:03 -0700)]
cnic: Fix NETDEV_UP event processing.
This fixes the problem of not handling the NETDEV_UP event properly
during hot-plug or modprobe of bnx2 after cnic. The handling was
skipped by mistakenly using "else if" to check for the event.
Also update version to 2.0.1.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Philipp Reisner [Fri, 2 Oct 2009 02:40:07 +0000 (02:40 +0000)]
connector: Removed the destruct_data callback since it is always kfree_skb()
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Acked-by: Lars Ellenberg <lars.ellenberg@linbit.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Philipp Reisner [Fri, 2 Oct 2009 02:40:06 +0000 (02:40 +0000)]
connector/dm: Fixed a compilation warning
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Acked-by: Lars Ellenberg <lars.ellenberg@linbit.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Philipp Reisner [Fri, 2 Oct 2009 02:40:05 +0000 (02:40 +0000)]
connector: Provide the sender's credentials to the callback
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Acked-by: Lars Ellenberg <lars.ellenberg@linbit.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Philipp Reisner [Fri, 2 Oct 2009 02:40:04 +0000 (02:40 +0000)]
connector: Keep the skb in cn_callback_data
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Acked-by: Lars Ellenberg <lars.ellenberg@linbit.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Frans Pop [Fri, 2 Oct 2009 17:04:12 +0000 (10:04 -0700)]
e1000e/igb/ixgbe: Don't report an error if devices don't support AER
The only error returned by pci_{en,dis}able_pcie_error_reporting() is
-EIO which simply means that Advanced Error Reporting is not supported.
There is no need to report that, so remove the error check from e1000e,
igb and ixgbe.
Signed-off-by: Frans Pop <elendil@planet.nl> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 1 Oct 2009 22:26:00 +0000 (15:26 -0700)]
net: splice() from tcp to pipe should take into account O_NONBLOCK
tcp_splice_read() doesnt take into account socket's O_NONBLOCK flag
Before this patch :
splice(socket,0,pipe,0,128*1024,SPLICE_F_MOVE);
causes a random endless block (if pipe is full) and
splice(socket,0,pipe,0,128*1024,SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
will return 0 immediately if the TCP buffer is empty.
User application has no way to instruct splice() that socket should be in blocking mode
but pipe in nonblock more.
Many projects cannot use splice(tcp -> pipe) because of this flaw.
It doesn't make the splice itself necessarily nonblocking (because the
actual file descriptors that are spliced from/to may block unless they
have the O_NONBLOCK flag set), but it makes the splice pipe operations
nonblocking.
Linus intention was clear : let SPLICE_F_NONBLOCK control the splice pipe mode only
This patch instruct tcp_splice_read() to use the underlying file O_NONBLOCK
flag, as other socket operations do.
Takashi Iwai [Fri, 2 Oct 2009 12:06:08 +0000 (14:06 +0200)]
ALSA: usb - Use strlcat() correctly
Don't pass the advanced position to strlcat() but just gives the buffer
head position so that the max size limit can be checked correctly.
Introduced a new helper function to standaralize strlcat() calls.
Takashi Iwai [Fri, 2 Oct 2009 07:03:58 +0000 (09:03 +0200)]
ALSA: hda - Fix / improve ALC66x parser
The auto-parser for ALC662/663/272 codecs doesn't work properly when
a speaker is connected to mono NID 0x17, and doesn't handle the dynamic
DAC assignment properly.
This patch fixes the issues and also improves the assignment of DACs
so that HP and speakers can have independent volume controls.
Sven Eckelmann [Thu, 1 Oct 2009 18:06:39 +0000 (20:06 +0200)]
ALSA: ctxfi: Swapped SURROUND-SIDE mute
On Soundblaster X-FI Titenium with emu20k2 the SIDE and SURROUND mute
functions are swapped.
It was checked with 'speaker-test -c 8 -s 3' and (un)mute surround or
'speaker-test -c 8 -s 7' and (un)mute side. The volume seems not
to be affected and works as expected.
Signed-off-by: Sven Eckelmann <sven.eckelmann@gmx.de> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Daniel T Chen [Thu, 1 Oct 2009 22:56:30 +0000 (18:56 -0400)]
ALSA: intel8x0 - Mute External Amplifier by default for Sony VAIO VGN-B1VP
BugLink: https://bugs.launchpad.net/bugs/410933
This Sony VAIO model also needs External Amplifier unmuted for audible
playback, so make sure we set the inv_eapd quirk.
Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Linus Torvalds [Fri, 2 Oct 2009 03:23:15 +0000 (20:23 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: fix data space leak fix
Btrfs: remove duplicates of filemap_ helpers
Btrfs: take i_mutex before generic_write_checks
Btrfs: fix arguments to btrfs_wait_on_page_writeback_range
Btrfs: fix deadlock with free space handling and user transactions
Btrfs: fix error cases for ioctl transactions
Btrfs: Use CONFIG_BTRFS_POSIX_ACL to enable ACL code
Btrfs: introduce missing kfree
Btrfs: Fix setting umask when POSIX ACLs are not enabled
Btrfs: proper -ENOSPC handling
Otherwise the config function uses random data from the stack. This
didn't stick out because config is called once more in the chipselect
function with correct parameters.
Andy Spencer [Thu, 1 Oct 2009 22:44:27 +0000 (15:44 -0700)]
sscanf(): fix %*s%n
When using %*s, sscanf should honor conversion specifiers immediately
following the %*s. For example, the following code should find the
position of the end of the string "hello".
Ideally, sscanf would advance the fmt and str pointers the same as it
would without the *, but the code for that is rather complicated and is
not included in the patch.
Signed-off-by: Andy Spencer <andy753421@gmail.com> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chuck Ebbert [Thu, 1 Oct 2009 22:44:26 +0000 (15:44 -0700)]
serial: add parameter to force skipping the test for the TXEN bug
Allow users to force skipping the TXEN test at init time. Applies
to all serial ports. Intended for debugging only.
There is a blacklist for devices where we need to skip the test but the
list is not complete. This lets users force skipping the test so we can
determine if they need to be added to the list.
Some HP machines with weird serial consoles have this problem and there
may be more.
Signed-off-by: Chuck Ebbert <cebbert@redhat.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roel Kluin [Thu, 1 Oct 2009 22:44:24 +0000 (15:44 -0700)]
cyclades: fix read buffer overflow
irq is declared with size NR_CARDS (4), but the loop containing this
segment runs up until NR_ISA_ADDRS (16), possibly reading from irq[i] (and
trying to use the result)
Ben Dooks [Thu, 1 Oct 2009 22:44:21 +0000 (15:44 -0700)]
s3cmci: add better support for no card detect or write protect available
Add better support for omitting either the card detect or the write
protect GPIOs if the board does not support it. Add the fields
no_wprotect and no_detect to the platform data which when set indicate the
absence of the respective GPIOs.
Note, this also fixes a minor bug where it tries to free IRQ0 if there is
no detect gpio available.
Signed-off-by: Ben Dooks <ben@simtec.co.uk> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Thu, 1 Oct 2009 22:44:20 +0000 (15:44 -0700)]
s3cmci: make SDIO IRQ hardware IRQ support build-time configurable
We have found a couple of boards where the SDIO IRQ hardware support has
failed to work properly, and thus we should make it configurable whether
or not to be included in the driver.
Signed-off-by: Ben Dooks <ben@simtec.co.uk> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Thu, 1 Oct 2009 22:44:19 +0000 (15:44 -0700)]
s3cmci: DMA fixes
Fixes for the DMA transfer mode of the driver to try and improve the state
of the code:
- Ensure that dma_complete is set during the end of the command phase
so that transfers do not stall awaiting the completion
- Update the DMA debugging to provide a bit more useful information
such as how many DMA descriptors where not processed and print the
DMA addresses in hexadecimal.
- Fix the DMA channel request code to actually request DMA for the
S3CMCI block instead of whatever '0' signified.
- Add fallback to PIO if we cannot get the DMA channel, as many of the
devices with this block only have a limited number of DMA channels.
- Only try and claim and free the DMA channel if we are trying to use it.
This improves the driver DMA code to the point where it can now identify a
card and read the partition table. However the DMA can still stall when
trying to move data between the host and memory.
Signed-off-by: Ben Dooks <ben@simtec.co.uk> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Thu, 1 Oct 2009 22:44:18 +0000 (15:44 -0700)]
s3cmci: Kconfig selection for PIO/DMA/Both
Add a selection for the data transfer mode of the s3cmci driver, allowing
for either a configuration or rumtime selection of the use of the DMA or
PIO transfer code.
The PIO only mode is 476 bytes smaller than the driver with both methods
compiled in.
Signed-off-by: Ben Dooks <ben@simtec.co.uk> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Thu, 1 Oct 2009 22:44:18 +0000 (15:44 -0700)]
s3cmci: add SDIO IRQ support
The controller supports SDIO IRQ detection so add support for hardware
assisted SDIO interrupt detection for the SDIO core. This improves the
response time for SDIO interrupts and thus the transfer rate from devices
such as the Marvel 8686.
As a note, it does seem that the controller will miss an IRQ than is held
asserted, so there are some manual checks to see if the SDIO interrupt is
active after a transfer.
Major testing on the S3C2440.
Signed-off-by: Ben Dooks <ben@simtec.co.uk> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Thu, 1 Oct 2009 22:44:17 +0000 (15:44 -0700)]
s3cmci: add debugfs support for examining driver and hardware state
Export driver state and hardware register state via debugfs entries
created under a directory formed from dev_name() on the probed device when
CONFIG_DEBUG_FS is set.
Signed-off-by: Ben Dooks <ben@simtec.co.uk> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Thu, 1 Oct 2009 22:44:14 +0000 (15:44 -0700)]
s3cmci: update probe to use new platform id list
Use the platform id list to match the three different versions of the
hardware block that this driver supports.
This will change the prefix of the console messages produced by this
driver to be prefixed by s3c-mci instead of the hardware block name, such
as s3c2440-mci.
Signed-off-by: Ben Dooks <ben@simtec.co.uk> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memcg: some modification to softlimit under hierarchical memory reclaim.
This patch clean up/fixes for memcg's uncharge soft limit path.
Problems:
Now, res_counter_charge()/uncharge() handles softlimit information at
charge/uncharge and softlimit-check is done when event counter per memcg
goes over limit. Now, event counter per memcg is updated only when
memory usage is over soft limit. Here, considering hierarchical memcg
management, ancesotors should be taken care of.
Now, ancerstors(hierarchy) are handled in charge() but not in uncharge().
This is not good.
Prolems:
1. memcg's event counter incremented only when softlimit hits. That's bad.
It makes event counter hard to be reused for other purpose.
2. At uncharge, only the lowest level rescounter is handled. This is bug.
Because ancesotor's event counter is not incremented, children should
take care of them.
3. res_counter_uncharge()'s 3rd argument is NULL in most case.
ops under res_counter->lock should be small. No "if" sentense is better.
Fixes:
* Removed soft_limit_xx poitner and checks in charge and uncharge.
Do-check-only-when-necessary scheme works enough well without them.
* make event-counter of memcg incremented at every charge/uncharge.
(per-cpu area will be accessed soon anyway)
* All ancestors are checked at soft-limit-check. This is necessary because
ancesotor's event counter may never be modified. Then, they should be
checked at the same time.
Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__mem_cgroup_largest_soft_limit_node() returns a mem_cgroup_per_zone "mz"
with incremnted mz->mem->css's refcnt. Then, the caller of this function
has to call css_put(mz->mem->css).
But, mz can be !NULL even if "not found" i.e. without css_get(). By
this, css->refcnt will go down to minus.
This may cause various things...one of results will be
initite-loop in css_tryget() as this.
INFO: RCU detected CPU 0 stall (t=10000 jiffies)
sending NMI to all CPUs:
NMI backtrace for cpu 0
CPU 0:
<snip>
Albert Herranz [Thu, 1 Oct 2009 22:44:05 +0000 (15:44 -0700)]
sdio: pass whitelisted cis funce tuples to sdio drivers
Some manufacturers provide vendor information in non-vendor specific CIS
tuples. For example, Broadcom uses an Extended Function tuple to provide
the MAC address on some of their network cards, as in the case of the
Nintendo Wii WLAN daughter card.
This patch allows passing whitelisted FUNCE tuples unknown to the SDIO
core to a matching SDIO driver instead of rejecting them and failing.
Signed-off-by: Albert Herranz <albert_herranz@yahoo.es> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Samuel Thibault [Thu, 1 Oct 2009 22:44:02 +0000 (15:44 -0700)]
x86: fix csum_ipv6_magic asm memory clobber
Just like ip_fast_csum, the assembly snippet in csum_ipv6_magic needs a
memory clobber, as it is only passed the address of the buffer, not a
memory reference to the buffer itself.
This caused failures in Hurd's pfinetv4 when we tried to compile it with
gcc-4.3 (bogus checksums).
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: "David S. Miller" <davem@davemloft.net> Cc: Andi Kleen <andi@firstfloor.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mark Salter [Thu, 1 Oct 2009 22:44:01 +0000 (15:44 -0700)]
mn10300: fix kernel build failures when using gcc-4.x
Fix some build failures when using gcc-4.x for MN10300.
Firstly, __get_user() fails to build because the pointer points to a const and
__gu_val ends up being read-only:
In file included from include/linux/mempolicy.h:62,
from init/main.c:50:
include/linux/pagemap.h: In function 'fault_in_pages_readable':
include/linux/pagemap.h:394: error: read-only variable '__gu_val' used as 'asm' output
include/linux/pagemap.h:394: error: read-only variable '__gu_val' used as 'asm' output
include/linux/pagemap.h:394: error: read-only variable '__gu_val' used as 'asm' output
include/linux/pagemap.h:400: error: read-only variable '__gu_val' used as 'asm' output
include/linux/pagemap.h:400: error: read-only variable '__gu_val' used as 'asm' output
include/linux/pagemap.h:400: error: read-only variable '__gu_val' used as 'asm' output
make[1]: *** [init/main.o] Error 1
Secondly, gcc-4 doesn't allow casts of lvalues:
UPD include/linux/compile.h
arch/mn10300/kernel/rtc.c: In function 'calibrate_clock':
arch/mn10300/kernel/rtc.c:170: error: lvalue required as left operand of assignment
arch/mn10300/kernel/rtc.c:172: error: lvalue required as left operand of assignment
make[1]: *** [arch/mn10300/kernel/rtc.o] Error 1
These are seen with gcc 4.2.1.
Signed-off-by: Mark Salter <msalter@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Thu, 1 Oct 2009 22:43:59 +0000 (15:43 -0700)]
MAINTAINERS: ARM/Palm file patterns
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Marek Vasut <marek.vasut@gmail.com> Acked-by: Tomas Cech <sleep_walker@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In file included from include/linux/irq.h:12,
from include/asm-generic/hardirq.h:6,
from /usr/src/devel/arch/m68k/include/asm/hardirq_mm.h:6,
from /usr/src/devel/arch/m68k/include/asm/hardirq.h:4,
from include/linux/hardirq.h:10,
from /usr/src/devel/arch/m68k/include/asm/system_mm.h:69,
from /usr/src/devel/arch/m68k/include/asm/system.h:4,
from include/linux/list.h:7,
from include/linux/preempt.h:11,
from include/linux/spinlock.h:50,
from include/linux/seqlock.h:29,
from include/linux/time.h:8,
from include/linux/timex.h:56,
from include/linux/sched.h:56,
from arch/m68k/kernel/asm-offsets.c:14:
include/linux/smp.h:17: error: field 'list' has incomplete type
Cc: Christoph Hellwig <hch@lst.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Thu, 1 Oct 2009 22:43:56 +0000 (15:43 -0700)]
asm-generic/gpio.h: pull in linux/kernel.h for might_sleep()
The asm-generic/gpio.h header uses the might_sleep() macro but doesn't
include the header for it, so any source code that might include
linux/gpio.h before linux/kernel.h can easily lead to a build failure.
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Mundt [Thu, 1 Oct 2009 22:43:54 +0000 (15:43 -0700)]
module: fix up CONFIG_KALLSYMS=n build.
Starting from commit 4a4962263f07d14660849ec134ee42b63e95ea9a "reduce
symbol table for loaded modules (v2)", the kernel/module.c build is broken
with CONFIG_KALLSYMS disabled.
CC kernel/module.o
kernel/module.c:1995: warning: type defaults to 'int' in declaration of 'Elf_Hdr'
kernel/module.c:1995: error: expected ';', ',' or ')' before '*' token
kernel/module.c: In function 'load_module':
kernel/module.c:2203: error: 'strmap' undeclared (first use in this function)
kernel/module.c:2203: error: (Each undeclared identifier is reported only once
kernel/module.c:2203: error: for each function it appears in.)
kernel/module.c:2239: error: 'symoffs' undeclared (first use in this function)
kernel/module.c:2239: error: implicit declaration of function 'layout_symtab'
kernel/module.c:2240: error: 'stroffs' undeclared (first use in this function)
make[1]: *** [kernel/module.o] Error 1
make: *** [kernel/module.o] Error 2
There are three different issues:
- layout_symtab() takes a const Elf_Ehdr
- layout_symtab() needs to return a value
- symoffs/stroffs/strmap are referenced by the load_module() code
despite being ifdefed out, which seems unnecessary given the noop
behaviour of layout_symtab()/add_kallsyms() in the case of
CONFIG_KALLSYMS=n.
Signed-off-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Jan Beulich <jbeulich@novell.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Atis Elsts [Thu, 1 Oct 2009 22:16:49 +0000 (15:16 -0700)]
net: Use sk_mark for routing lookup in more places
This patch against v2.6.31 adds support for route lookup using sk_mark in some
more places. The benefits from this patch are the following.
First, SO_MARK option now has effect on UDP sockets too.
Second, ip_queue_xmit() and inet_sk_rebuild_header() could fail to do routing
lookup correctly if TCP sockets with SO_MARK were used.
Signed-off-by: Atis Elsts <atis@mikrotik.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Most network drivers request their IRQ when the interface is activated.
sky2 does it in ->probe() instead, because it can work with two-port
cards where the two net_devices use the same IRQ. This works fine most
of the time, except in some situations when the interface gets renamed.
Consider this example:
1. modprobe sky2
The card is detected as eth0 and requests IRQ 17. Directory
/proc/irq/17/eth0 is created.
2. There is an udev rule which says this interface should be called
eth1, so udev renames eth0 -> eth1.
3. modprobe 8139too
The Realtek card is detected as eth0. It will be using IRQ 17 too.
4. ip link set eth0 up
Now 8139too requests IRQ 17.
The result is:
WARNING: at fs/proc/generic.c:590 proc_register ...
proc_dir_entry '17/eth0' already registered
The fix is for sky2 to name the irq based on the pci device, as is done
by some other devices DRM, infiniband, ... ie. sky2@pci:0000:00:00
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Reviewed-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Schmidt [Thu, 1 Oct 2009 08:13:23 +0000 (08:13 +0000)]
skge: use unique IRQ name
Most network drivers request their IRQ when the interface is activated.
skge does it in ->probe() instead, because it can work with two-port
cards where the two net_devices use the same IRQ. This works fine most
of the time, except in some situations when the interface gets renamed.
Consider this example:
1. modprobe skge
The card is detected as eth0 and requests IRQ 17. Directory
/proc/irq/17/eth0 is created.
2. There is an udev rule which says this interface should be called
eth1, so udev renames eth0 -> eth1.
3. modprobe 8139too
The Realtek card is detected as eth0. It will be using IRQ 17 too.
4. ip link set eth0 up
Now 8139too requests IRQ 17.
The result is:
WARNING: at fs/proc/generic.c:590 proc_register ...
proc_dir_entry '17/eth0' already registered
...
And "ls /proc/irq/17" shows two subdirectories, both called eth0.
Fix it by using a unique name for skge's IRQ, based on the PCI address.
The naming from the example then looks like this:
$ grep skge /proc/interrupts
17: 169 IO-APIC-fasteoi skge@pci:0000:00:0a.0, eth0
irqbalance daemon will have to be taught to recognize "skge@" as an
Ethernet interrupt. This will be a one-liner addition in classify.c. I
will send a patch to irqbalance if this change is accepted.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ori Finkelman [Thu, 1 Oct 2009 06:41:59 +0000 (06:41 +0000)]
IPv4 TCP fails to send window scale option when window scale is zero
Acknowledge TCP window scale support by inserting the proper option in SYN/ACK
and SYN headers even if our window scale is zero.
This fixes the following observed behavior:
1. Client sends a SYN with TCP window scaling option and non zero window scale
value to a Linux box.
2. Linux box notes large receive window from client.
3. Linux decides on a zero value of window scale for its part.
4. Due to compare against requested window scale size option, Linux does not to
send windows scale TCP option header on SYN/ACK at all.
With the following result:
Client box thinks TCP window scaling is not supported, since SYN/ACK had no
TCP window scale option, while Linux thinks that TCP window scaling is
supported (and scale might be non zero), since SYN had TCP window scale
option and we have a mismatched idea between the client and server
regarding window sizes.
Probably it also fixes up the following bug (not observed in practice):
1. Linux box opens TCP connection to some server.
2. Linux decides on zero value of window scale.
3. Due to compare against computed window scale size option, Linux does
not to set windows scale TCP option header on SYN.
With the expected result that the server OS does not use window scale option
due to not receiving such an option in the SYN headers, leading to suboptimal
performance.
Signed-off-by: Gilad Ben-Yossef <gilad@codefidence.com> Signed-off-by: Ori Finkelman <ori@comsleep.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Matti Aarnio <matti.aarnio@zmailer.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The function virtnet_remove is used only wrapped by __devexit_p so define
it using __devexit.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Sam Ravnborg <sam@ravnborg.org> Cc: David S. Miller <davem@davemloft.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Alex Williamson <alex.williamson@hp.com> Cc: Mark McLoughlin <markmc@redhat.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
The function sgiseeq_remove is defined using __exit, so don't use
__devexit_p but __exit_p to wrap it.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: David S. Miller <davem@davemloft.net> Cc: Wang Chen <wangchen@cn.fujitsu.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Patrick McHardy <kaber@trash.net> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
The function meth_remove is defined using __exit, so don't use __devexit_p
but __exit_p to wrap it.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: David S. Miller <davem@davemloft.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Primary module parameter passed to bonding is pernament. That means if you
release the primary slave and enslave it again, it becomes the primary slave
again. But if you set primary slave via sysfs, the primary slave is only set
once and it's not remembered in bond->params structure. Therefore the setting is
lost after releasing the primary slave. This simple one-liner fixes this.
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Josef Bacik [Thu, 1 Oct 2009 21:10:23 +0000 (17:10 -0400)]
Btrfs: fix data space leak fix
There is a problem where page_mkwrite can be called on a dirtied page that
already has a delalloc range associated with it. The fix is to clear any
delalloc bits for the range we are dirtying so the space accounting gets
handled properly. This is the same thing we do in the normal write case, so we
are consistent across the board. With this patch we no longer leak reserved
space.
Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>