The dynamic tick allows the kernel to sleep for periods longer than a
single tick, but it does not limit the sleep time currently. In the
worst case the kernel could sleep longer than the wrap around time of
the time keeping clock source which would result in losing track of
time.
Prevent this by limiting it to the safe maximum sleep time of the
current time keeping clock source. The value is calculated when the
clock source is registered.
[ tglx: simplified the code a bit and massaged the commit msg ]
Signed-off-by: Jon Hunter <jon-hunter@ti.com> Cc: John Stultz <johnstul@us.ibm.com>
LKML-Reference: <1250617512-23567-2-git-send-email-jon-hunter@ti.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Based on Peter Zijlstras patch suggestion this enables recalculation of
the scheduler tunables in response of a change in the number of cpus. It
also adds a max of eight cpus that are considered in that scaling.
Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1259579808-11357-2-git-send-email-ehrhardt@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
> We allocate and zero cpu_isolated_map after the isolcpus
> __setup option has run. This means cpu_isolated_map always
> ends up empty and if CPUMASK_OFFSTACK is enabled we write to a
> cpumask that hasn't been allocated.
I introduced this regression in 49557e620339cb13 (sched: Fix
boot crash by zalloc()ing most of the cpu masks).
Use the bootmem allocator if they set isolcpus=, otherwise
allocate and zero like normal.
Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: peterz@infradead.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@kernel.org>
LKML-Reference: <200912021409.17013.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Anton Blanchard <anton@samba.org>
Add proper suspend/resume code for Juli@ cards. Based on ice1724
suspend/resume work of Igor Chernyshev.
Fixes bug https://bugtrack.alsa-project.org/alsa-bug/view.php?id=4413
Tested on linux-2.6.31.6
UEFI standard (version 2.3, May 2009, 5.3.1 GUID Format overview, page
95) defines that LBA is always based on the logical block size. It
means bdev_logical_block_size() (aka BLKSSZGET) for Linux.
This patch removes static sector size from EFI GPT parser.
The problem is reproducible with the latest GNU Parted:
The size of EFI GPT header is not static, but whole sector is
allocated for the header. The HeaderSize field must be greater
than 92 (= sizeof(struct gpt_header) and must be less than or
equal to the logical block size.
It means we have to read whole sector with the header, because the
header crc32 checksum is calculated according to HeaderSize.
For more details see UEFI standard (version 2.3, May 2009):
- 5.3.1 GUID Format overview, page 93
- Table 13. GUID Partition Table Header, page 96
commit d6d3f08b0fd998b647a05540cedd11a067b72867
(netfilter: xtables: conntrack match revision 2) does break the
v1 conntrack match iptables-save output in a subtle way.
Problem is as follows:
up = kmalloc(sizeof(*up), GFP_KERNEL);
[..]
/*
* The strategy here is to minimize the overhead of v1 matching,
* by prebuilding a v2 struct and putting the pointer into the
* v1 dataspace.
*/
memcpy(up, info, offsetof(typeof(*info), state_mask));
[..]
*(void **)info = up;
As the v2 struct pointer is saved in the match data space,
it clobbers the first structure member (->origsrc_addr).
Because the _v1 match function grabs this pointer and does not actually
look at the v1 origsrc, run time functionality does not break.
But iptables -nvL (or iptables-save) cannot know that v1 origsrc_addr
has been overloaded in this way:
(128.173... is the address to the v2 match structure).
To fix this, we take advantage of the fact that the v1 and v2 structures
are identical with exception of the last two structure members (u8 in v1,
u16 in v2).
We extract them as early as possible and prevent the v2 matching function
from looking at those two members directly.
Previously reported by Michel Messerschmidt via Ben Hutchings, also
see Debian Bug tracker #556587.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If docs are being built in a separate directory, xmlto and xsltproc
can't find included sources. Make links back to the source directory.
I would much prefer to have xmlto and xsltproc look in the source
directory for included entities but couldn't see how to do that. This
needs to be solved in some way for 2.6.32, even if this patch isn't the
right way to do it.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
For hardware limit to support TSOV6, just disable this feature Signed-off-by: Jie Yang <jie.yang@atheros.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
use common_task instead of reset_task and link_chg_task, so it fix "call cancel_work_sync
from the work itself".
Signed-off-by: Jie Yang <jie.yang@atheros.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Adds the device IDs and driver linking to allow the Asus Europa DVB-T
card to operate with these drivers.
The device has a SAA7134 chipset with a TD1316 Hybrid Tuner.
All inputs work on the card including switching between DVB-T and
Analogue TV, there is also no IR with this card.
[mchehab@redhat.com: CodingStyle fixes]
Signed-off-by: Danny Wood <danwood76@gmail.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Filesystem code usually destroys the option buffer while
parsing it. This leads to errors when the same buffer is
passed twice. In case we fill a new superblock do not call
remount.
This is needed to quite a warning that the debugfs code
causes every boot.
There is nothing arch specific to devtmpfs. This part crashes because the
kernel tries to modify the data read-only section which is write protected
on s390.
Properly handle version of the protocol where standard PS/2 packets
from trackpoint are stuffed into middle (byte 3-6) of the standard
ALPS packets when both the touchpad and trackpoint are used together.
The patch is based on work done by Matthew Chapman and additional
research done by David Kubicek and Erik Osterholm:
Many thanks to David Kubicek for his efforts in researching fine points
of this new version of the protocol, especially interaction between pad
and stick in these models.
Cc: Andy Isaacson <adi@hexapodia.org> Signed-off-by: Sebastian Kapfer <sebastian_kapfer@gmx.net> Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
DM6467 silicon revisions 3.x have variant field in JTAGID register as '1'.
This path adds entry for the same in dm646x_ids to be able to boot on boards
with 3.x revision chips.
Also modifies name for 'variant=0' (revisions 1.0, 1.1).
Signed-off-by: Hemant Pedanekar <hemantp@ti.com> Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
At least two revisions of the D-Link DWA 160 exist, called A1 and A2. A1
(USB-ID 07d1:3c10) is already listed in usb.c as D-Link DWA 160A. A2
(USB-ID 07d1:3a09) works if added to ar9170_usb_ids. I didn't do much
testing until now, but I was able to connect to APs using WPA or WEP and
transmit data.
Summary:
* Add model revision number to the comment for D-Link DWA 160 A1 (07d1:3c10)
* Add support for D-Link DWA 160 A2 (07d1:3a09)
Signed-off-by: Thomas Klute <thomas2.klute@uni-dortmund.de> Signed-off-by: John W. Linville <linville@tuxdriver.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Added device ids range for { 0x80 - 87 } , modified mpi/mpi2_cnfg.h containing
MPI2_MFGPAGE_DEVID_SAS2208_X.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Signed-off-by: Eric Moore <Eric.moore@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch adds the PCI IDs for the next generation chip to the
PCI_DEVICE_ID table.
Signed-off-by: Ajit Khaparde <ajitk@serverengines.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add new PCI ids to support next generation of BladeEngine device.
Signed-off-by: Ajit Khaparde <ajitk@serverengines.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We need buffer->len to remain valid to work out the correct address to
be unmapped. We therefore need to clear buffer->len after the unmap
operation.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 8bd108d adds preemption point after each opcode parse, then
a sleeping function called from invalid context bug was founded
during suspend/resume stage. this was fixed in commit abe1dfa by
don't cond_resched when irq_disabled. But recent commit 138d156 changes
the behaviour to don't cond_resched when in_atomic. This makes the
sleeping function called from invalid context bug happen again, which
is reported in http://lkml.org/lkml/2009/12/1/371.
This patch also fixes http://bugzilla.kernel.org/show_bug.cgi?id=14483
Reported-and-bisected-by: Larry Finger <Larry.Finger@lwfinger.net> Reported-and-bisected-by: Justin P. Mattock <justinmattock@gmail.com> Signed-off-by: Xiaotian Feng <dfeng@redhat.com> Acked-by: Alexey Starikovskiy <astarikovskiy@suse.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Marc reported that the BUG_ON in clockevents_notify() triggers on his
system. This happens because the kernel tries to remove an active
clock event device (used for broadcasting) from the device list.
The handling of devices which can be used as per cpu device and as a
global broadcast device is suboptimal.
The simplest solution for now (and for stable) is to check whether the
device is used as global broadcast device, but this needs to be
revisited.
[ tglx: restored the cpuweight check and massaged the changelog ]
Reported-by: Marc Dionne <marc.c.dionne@gmail.com> Tested-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
LKML-Reference: <1262834564-13033-1-git-send-email-dfeng@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Anton reported that perf record kept receiving events even after calling
ioctl(PERF_EVENT_IOC_DISABLE). It turns out that FORK,COMM and MMAP
events didn't respect the disabled state and kept flowing in.
Reported-by: Anton Blanchard <anton@samba.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Anton Blanchard <anton@samba.org>
LKML-Reference: <1263459187.4244.265.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A process that changes its comm field, does this on a per kernel
task struct basis. The timechart tool used, incorrectly, the pid
to track this, and should have used the tid instead...
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100116125319.34ac3edd@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In free_unmap_area_noflush(), va->flags is marked as VM_LAZY_FREE first, and
then vmap_lazy_nr is increased atomically.
But, in __purge_vmap_area_lazy(), while traversing of vmap_are_list, nr
is counted by checking VM_LAZY_FREE is set to va->flags. After counting
the variable nr, kernel reads vmap_lazy_nr atomically and checks a
BUG_ON condition whether nr is greater than vmap_lazy_nr to prevent
vmap_lazy_nr from being negative.
The problem is that, if interrupted right after marking VM_LAZY_FREE,
increment of vmap_lazy_nr can be delayed. Consequently, BUG_ON
condition can be met because nr is counted more than vmap_lazy_nr.
It is highly probable when vmalloc/vfree are called frequently. This
scenario have been verified by adding delay between marking VM_LAZY_FREE
and increasing vmap_lazy_nr in free_unmap_area_noflush().
Even the vmap_lazy_nr is for checking high watermark, it never be the
strict watermark. Although the BUG_ON condition is to prevent
vmap_lazy_nr from being negative, vmap_lazy_nr is signed variable. So,
it could go down to negative value temporarily.
Consequently, removing the BUG_ON condition is proper.
Remove entry for 2770:915d (usb digital camera with mass storage
support) from unusual_devs.h. The fix triggered by the entry causes
the file system on the camera to be completely inaccessible (no
partition table, the device is not mountable).
The patch works, but let me clarify a few things about it. All the
patch does is remove the entry for this device from the
drivers/usb/storage/unusual_devs.h, which is supposed to help with a
problem with the device's reported size (I think). I'm pretty sure it
was originally added for a reason, so I'm not sure removing it won't
cause other problems to reappear. Also, I should note that this
unusual_devs.h entry was present (and activating workarounds) in
2.6.29, but in that version everything works fine. Starting with
2.6.30, things no longer work.
Signed-off-by: Ryan May <rmay31@gmail.com> Cc: Rohan Hart <rohan.hart17@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Schlichter reported:
> X.org uses libpciaccess which tries to mmap with write combining enabled via
> /sys/bus/pci/devices/*/resource0_wc. Currently, when PAT is not enabled, the
> kernel does fall back to uncached mmap. Then libpciaccess thinks it succeeded
> mapping with write combining enabled and does not set up suited MTRR entries.
> ;-(
Instead of silently mapping pci mmap region as UC minus in the case
of !pat_enabled and wc request, we can return error. Eric Anholt mentioned
that caller (like X) typically follows up with UC minus pci mmap request and
if there is a free mtrr slot, caller will manage adding WC mtrr.
Jesse Barnes says:
> Older versions of libpciaccess will behave better if we do it that way
> (iirc it only allocates an MTRR if the resource_wc file doesn't exist or
> fails to get mapped).
Reported-by: Thomas Schlichter <thomas.schlichter@web.de> Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Eric Anholt <eric@anholt.net> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There exist multiple DDC buses for the SDVO cards with multiple outputs.
When we can't get the EDID by using the select DDC bus, we can try the other
possible DDC bus to see whether the EDID can be obtained.
For some SDVO cards based on conexant chip, we can't read the EDID if
we don't read the response after issuing SDVO DDC bus switch
command.
From the SDVO spec once when another I2C transaction is finished after
completing the I2C transaction of issuing the bus switch command, it
will be switched back to the SDVO internal state again. So we can't
initiate a new I2C transaction to read the response after issuing the
DDC bus switch command. Instead we should issue DDC bus switch command
and read the response in the same I2C transaction.
Based on patch originally by Jeff Mahoney <jeffm@suse.com>
enclosure_status is expected to be a NULL terminated array of strings
but isn't actually NULL terminated. When writing an invalid value to
/sys/class/enclosure/.../.../status, it goes off the end of the array
and Oopses.
Fix by making the assumption true and adding NULL at the end.
Reported-by: Artur Wojcik <artur.wojcik@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Merge of poll and irq modes accelerated EC transaction, so
that keyboard starts to suffer again. Add msleep(1) into
transaction path for the storm to allow keyboard controller
to do its job.
Split EC query handling into acknowledge and execution phase.
This allows much smaller pending query lattency and lowers chances
of EC going "wild" and losing events.
This patch (as1330) fixes a bug in khbud's handling of remote
wakeups. When a device sends a remote-wakeup request, the parent hub
(or the host controller driver, for directly attached devices) begins
the resume sequence and notifies khubd when the sequence finishes. At
this point the port's SUSPEND feature is automatically turned off.
However the device needs an additional 10-ms resume-recovery time
(TRSMRCY in the USB spec). Khubd does not wait for this delay if the
SUSPEND feature is off, and as a result some devices fail to behave
properly following a remote wakeup. This patch adds the missing
delay to the remote-wakeup path.
It also extends the resume-signalling delay used by ehci-hcd and
uhci-hcd from 20 ms (the value in the spec) to 25 ms (the value we use
for non-remote-wakeup resumes). The extra time appears to help some
devices.
This patch (as1321) fixes a problem with EHCI and UHCI root-hub
suspends: If the suspend occurs while a port is trying to resume, the
resume doesn't finish and simply gets lost. When remote wakeup is
enabled, this is undesirable behavior.
The patch checks first to see if any port resumes are in progress, and
if they are then it fails the root-hub suspend with -EBUSY.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1320) fixes two problems related to interrupt-URB
scheduling in ehci-hcd.
URBs with an interval of 2 or 4 microframes aren't handled.
For the time being, the patch reduces to interval to 1 uframe.
URBs are constrained to have an interval no larger than 1024
frames by usb_submit_urb(). But some EHCI controllers allow
use of a schedule as short as 256 frames; for these
controllers we may have to decrease the interval to the
actual schedule length.
The second problem isn't very significant since few devices expose
interrupt endpoints with an interval larger than 256 frames. But the
first problem is critical; it will prevent the kernel from working
with devices having interrupt intervals of 2 or 4 uframes.
Memory allocations with GFP_KERNEL can cause IO to a storage
device which can fail resulting in a need to reset the device.
Therefore GFP_KERNEL cannot be safely used between usb_lock_device()
and usb_unlock_device(). Replace by GFP_NOIO.
Signed-off-by: Oliver Neukum <oliver@neukum.org> Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Wacom claims that the WACF namespace will always be devoted to serial
Wacom tablets. Remove the existing entries and add a wildcard to avoid
having to update the kernel every time they add a new device.
Signed-off-by: Ping Cheng <pingc@wacom.com> Signed-off-by: Matthew Garrett <mjg@redhat.com> Tested-by: Ping Cheng <pingc@wacom.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is a quick patch up for the problem. It's not really fixing Nozomi
which completely fails to implement tty open/close semantics and all the
other needed stuff. Doing it right is a rather more invasive patch set and
not one that will backport.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Ecryptfs_open dereferences a pointer to the private lower file (the one
stored in the ecryptfs inode), without checking if the pointer is NULL.
Right afterward, it initializes that pointer if it is NULL. Swap order of
statements to first initialize. Bug discovered by Duckjin Kang.
After updating to 2.6.32 kernel, I started experiencing Oopses caused by
the asus_oled module. After quick investigation, I wrapped this simple
patch which fixes an Oops in by asus_oled module on 2.6.32.2 kernel,
caused by incorrect usage of strict_strtoul function call within
set_enabled and set_disabled functions. This can be triggered by simple
running the userspace client for asus_old (e.g., 'asusoled -e' or
'asusoled -d').
register_chrdev() hardcodes registering 256 minors, presumably to
avoid breaking old drivers. However, we need to register enough
minors so that we have all possible CPUs.
checkpatch warns on this patch, but the patch is correct: NR_CPUS here
is a static *upper bound* on the *maximum CPU index* (not *number of
CPUs!*) and that is what we want.
Reported-and-tested-by: Russ Anderson <rja@sgi.com> Cc: Tejun Heo <tj@kernel.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Takashi Iwai <tiwai@suse.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <tip-*@git.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If __block_prepare_write() was failed in block_write_begin(), the
allocated blocks can be outside of ->i_size.
But new truncate_pagecache() in vmtuncate() does nothing if new < old.
It means the above usage is not working anymore.
So, this patch fixes it by removing "new < old" check. It would need
more cleanup/change. But, now -rc and truncate working is in progress,
so, this tried to fix it minimum change.
83f9ac removed a call to effective_prio() in wake_up_new_task(), which
leads to tasks running at MAX_PRIO.
This is caused by the idle thread being set to MAX_PRIO before forking
off init. O(1) used that to make sure idle was always preempted, CFS
uses check_preempt_curr_idle() for that so we can savely remove this bit
of legacy code.
Reported-by: Mike Galbraith <efault@gmx.de> Tested-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1259754383.4003.610.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
dev_dbg outputs dev_name, which is released with device_unregister. This bug
resulted in output like this:
i2c Xy2�0: adapter [SMBus I801 adapter at 1880] unregistered
The right output would be:
i2c i2c-0: adapter [SMBus I801 adapter at 1880] unregistered
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
For chips like Niagara2 that have true overflow indications
in the %pcr (which we don't actually need and don't use)
the interrupt signal persists until the overflow bits are
cleared by an explicit %pcr write.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Relax stable-sched-clock architectures to not save/disable/restore
hardirqs in cpu_clock().
The background is that I was trying to resolve a sparc64 perf
issue when I discovered this problem.
On sparc64 I implement pseudo NMIs by simply running the kernel
at IRQ level 14 when local_irq_disable() is called, this allows
performance counter events to still come in at IRQ level 15.
This doesn't work if any code in an NMI handler does
local_irq_save() or local_irq_disable() since the "disable" will
kick us back to cpu IRQ level 14 thus letting NMIs back in and
we recurse.
The only path which that does that in the perf event IRQ
handling path is the code supporting frequency based events. It
uses cpu_clock().
cpu_clock() simply invokes sched_clock() with IRQs disabled.
And that's a fundamental bug all on it's own, particularly for
the HAVE_UNSTABLE_SCHED_CLOCK case. NMIs can thus get into the
sched_clock() code interrupting the local IRQ disable code
sections of it.
Furthermore, for the not-HAVE_UNSTABLE_SCHED_CLOCK case, the IRQ
disabling done by cpu_clock() is just pure overhead and
completely unnecessary.
So the core problem is that sched_clock() is not NMI safe, but
we are invoking it from NMI contexts in the perf events code
(via cpu_clock()).
A less important issue is the overhead of IRQ disabling when it
isn't necessary in cpu_clock().
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK architectures are not
affected by this patch.
Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20091213.182502.215092085.davem@davemloft.net> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Lenovo SL series laptop has a very similar DSDT with Asus laptops. We can
easily have the extra ACPI function support with little modification in
asus-laptop.c
Here is the hotkey enablement for Lenovo SL series laptop.
This patch will enable the following hotkey:
- Volumn Up
- Volumn Down
- Mute
- Screen Lock (Fn+F2)
- Battery Status (Fn+F3)
- WLAN switch (Fn+F5)
- Video output switch (Fn+F7)
- Touchpad switch (Fn+F8)
- Screen Magnifier (Fn+Space)
The following function of Lenovo SL laptop is still need to be enabled:
- Hotkey: KEY_SUSPEND (Fn+F4), KEY_SLEEP (Fn+F12), Dock Eject (Fn+F9)
- Rfkill for bluetooth and wlan
- LenovoCare LED
- Hwmon for fan speed
- Fingerprint scanner
- Active Protection System
Signed-off-by: Ike Panhc <ike.pan@canonical.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The cardbus code creates PCI devices without ever going through the
necessary fixup bits and pieces that normal PCI devices go through.
There's in fact a commented out call to pcibios_fixup_bus() in there,
it's commented because ... it doesn't work.
I could make pcibios_fixup_bus() do the right thing on powerpc easily
but I felt it cleaner instead to provide a specific hook pci_fixup_cardbus
for which a weak empty implementation is provided by the PCI core.
This fixes cardbus on powerbooks and probably all other PowerPC
platforms which was broken completely for ever on some platforms and
since 2.6.31 on others such as PowerBooks when we made the DMA ops
mandatory (since those are setup by the fixups).
Acked-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It can happen that write does not use all the blocks allocated in
write_begin either because of some filesystem error (like ENOSPC) or
because page with data to write has been removed from memory. We truncate
these blocks so that we don't have dangling blocks beyond i_size.
Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The constants used to specify ISINK ramp times for WM835x had the
wrong shifts so that the on times applied to the off ramp and vice
versa. The masks for the bitfields are correct.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
No need to set the security key when writing to it.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This fixes the problem of the initialization code not correctly
mapping the entire MMIO space on a UV system. A side effect is
the map_high() interface needed to be changed to accommodate
different address and size shifts.
Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Mike Habeck <habeck@sgi.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4B479202.7080705@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
EDAC MC0: INTERNAL ERROR: channel-b out of range (4 >= 4)
Kernel panic - not syncing: EDAC MC0: Uncorrected Error (XEN) Domain 0 crashed: 'noreboot' set - not rebooting.
This happens because FERR_NF_FBD bit 28 is not updated on i5000. Due to
that, both bits 28 and 29 may be equal to one, returning channel = 3. As
this value is invalid, EDAC core generates the panic.
Chris McDermott from IBM confirmed that hurricane chipset in IBM summit
platforms doesn't support logical flat mode. Irrespective of the other
things like apic_id's, total number of logical cpu's, Linux kernel
should default to physical mode for this system.
The 32-bit kernel does so using the OEM checks for the IBM summit
platform. Add a similar OEM platform check for the 64bit kernel too.
Otherwise the linux kernel boot can hang on this platform under certain
bios/platform settings.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Chris McDermott <lcm@linux.vnet.ibm.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit f2260e6b (page allocator: update NR_FREE_PAGES only as necessary)
made one minor regression. if __rmqueue() was failed, NR_FREE_PAGES stat
go wrong. this patch fixes it.
A) The current futex code doesn't handle private file mappings properly.
get_futex_key() uses PageAnon() to distinguish file and
anon, which can cause the following bad scenario:
1) thread-A call futex(private-mapping, FUTEX_WAIT), it
sleeps on file mapping object.
2) thread-B writes a variable and it makes it cow.
3) thread-B calls futex(private-mapping, FUTEX_WAKE), it
wakes up blocked thread on the anonymous page. (but it's nothing)
B) Current futex code doesn't handle zero page properly.
Read mode get_user_pages() can return zero page, but current
futex code doesn't handle it at all. Then, zero page makes
infinite loop internally.
The solution is to use write mode get_user_page() always for
page lookup. It prevents the lookup of both file page of private
mappings and zero page.
Performance concerns:
Probaly very little, because glibc always initialize variables
for futex before to call futex(). It means glibc users never see
the overhead of this patch.
Compatibility concerns:
This patch has few compatibility issues. After this patch,
FUTEX_WAIT require writable access to futex variables (read-only
mappings makes EFAULT). But practically it's not a problem,
glibc always initalizes variables for futexes explicitly - nobody
uses read-only mappings.
Add check if APIC is not disabled since thermal
monitoring depends on it. As only apic gets disabled
we should not try to install "thermal monitor" vector,
print out that thermal monitoring is enabled and etc...
Note that "Intel Correct Machine Check Interrupts" already
has such a check.
Also I decided to not add cpu_has_apic check into
mcheck_intel_therm_init since even if it'll call apic_read on
disabled apic -- it's safe here and allow us to save a few code
bytes.
queue_sector_alignment_offset returned the wrong value which caused
partitions to report an incorrect alignment_offset. Since offset
calculation is needed several places it has been split into a separate
helper function.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Tested-by: Mike Snitzer <snitzer@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On Ironlake, there is an interrupt master control bit. With the bit
disabled before clearing IIR, we do not need to handle extra interrupt
in a loop. This patch removes the loop in Ironlake interrupt handler.
It fixed irq lost issue on some Ironlake platforms.
Signed-off-by: Zou Nan hai <Nanhai.zou@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Current mem_cgroup_force_empty() only ensures mem->res.usage == 0 on
success. But this doesn't guarantee memcg's LRU is really empty, because
there are some cases in which !PageCgrupUsed pages exist on memcg's LRU.
For example:
- Pages can be uncharged by its owner process while they are on LRU.
- race between mem_cgroup_add_lru_list() and __mem_cgroup_uncharge_common().
So there can be a case in which the usage is zero but some of the LRUs are not empty.
OTOH, mem_cgroup_del_lru_list(), which can be called asynchronously with
rmdir, accesses the mem_cgroup, so this access can cause a problem if it
races with rmdir because the mem_cgroup might have been freed by rmdir.
Actually, I saw a bug which seems to be caused by this race.
The problem here is pages on LRU may contain pointer to stale memcg. To
make res->usage to be 0, all pages on memcg must be uncharged or moved to
another(parent) memcg. Moved page_cgroup have already removed from
original LRU, but uncharged page_cgroup contains pointer to memcg withou
PCG_USED bit. (This asynchronous LRU work is for improving performance.)
If PCG_USED bit is not set, page_cgroup will never be added to memcg's
LRU. So, about pages not on LRU, they never access stale pointer. Then,
what we have to take care of is page_cgroup _on_ LRU list. This patch
fixes this problem by making mem_cgroup_force_empty() visit all LRUs
before exiting its loop and guarantee there are no pages on its LRU.
Fix divide by zero and broken output. Commit 600ce1a0fa ("fix clock
setting for Samsung SoC Framebuffer") introduced a mandatory refresh
parameter to the platform data for the S3C framebuffer but did not
introduce any validation code, causing existing platforms (none of which
have refresh set) to divide by zero whenever the framebuffer is
configured, generating warnings and unusable output.
Ben Dooks noted several problems with the patch:
- The platform data supplies the pixclk directly and should already
have taken care of the refresh rate.
- The addition of a window ID parameter doesn't help since only the
root framebuffer can control the pixclk.
- pixclk is specified in picoseconds (rather than Hz) as the patch
assumed.
and suggests reverting the commit so do that. Without fixing this no
mainline user of the driver will produce output.
[akpm@linux-foundation.org: don't revert the correct bit] Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: InKi Dae <inki.dae@samsung.com> Cc: Kyungmin Park <kmpark@infradead.org> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
inotify will WARN() if it finds that the idr and the fsnotify internals
somehow got out of sync. It was only supposed to do this once but due
to this stupid bug it would warn every single time a problem was
detected.
Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since commit 7e790dd5fc937bc8d2400c30a05e32a9e9eef276 ("inotify: fix
error paths in inotify_update_watch") inotify changed the manor in which
it gave watch descriptors back to userspace. Previous to this commit
inotify acted like the following:
inotify_add_watch(X, Y, Z) = 1
inotify_rm_watch(X, 1);
inotify_add_watch(X, Y, Z) = 2
but after this patch inotify would return watch descriptors like so:
inotify_add_watch(X, Y, Z) = 1
inotify_rm_watch(X, 1);
inotify_add_watch(X, Y, Z) = 1
which I saw as equivalent to opening an fd where
open(file) = 1;
close(1);
open(file) = 1;
seemed perfectly reasonable. The issue is that quite a bit of userspace
apparently relies on the behavior in which watch descriptors will not be
quickly reused. KDE relies on it, I know some selinux packages rely on
it, and I have heard complaints from other random sources such as debian
bug 558981.
Although the man page implies what we do is ok, we broke userspace so
this patch almost reverts us to the old behavior. It is still slightly
racey and I have patches that would fix that, but they are rather large
and this will fix it for all real world cases. The race is as follows:
- task1 creates a watch and blocks in idr_new_watch() before it updates
the hint.
- task2 creates a watch and updates the hint.
- task1 updates the hint with it's older wd
- task removes the watch created by task2
- task adds a new watch and will reuse the wd originally given to task2
it requires moving some locking around the hint (last_wd) but this should
solve it for the real world and be -stable safe.
As a side effect this patch papers over a bug in the lib/idr code which
is causing a large number WARN's to pop on people's system and many
reports in kerneloops.org. I'm working on the root cause of that idr
bug seperately but this should make inotify immune to that issue.
Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Some BIOSes fail to initialise the GTT, which will cause DMA faults when
the IOMMU is enabled. We need to clear the whole thing to point at the
scratch page, not just the part that Linux is going to use.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
[anholt: Note that this may also help with stability in the presence of
driver bugs, by not drawing to memory we don't own] Signed-off-by: Eric Anholt <eric@anholt.net> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Discovered by Olli Jarva and Tuomo Untinen from the CROSS
project at Codenomicon Ltd.
Just like in CVE-2007-4567, we can't rely upon skb_dst() being
non-NULL at this point. We fixed that in commit e76b2b2567b83448c2ee85a896433b96150c92e6 ("[IPV6]: Do no rely on
skb->dst before it is assigned.")
Complicating analysis further, this bug can only trigger when network
namespaces are enabled in the build. When namespaces are turned off,
the dev_net() does not evaluate it's argument, so the dereference
would not occur.
So, for a long time, namespaces couldn't be turned on unless SYSFS was
disabled. Therefore, this code has largely been disabled except by
people turning it on explicitly for namespace development.
With help from Eugene Teo <eugene@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
powerpc applies relocations to the kcrctab. They're absolute symbols,
but it's not completely unreasonable: other archs may too, but the
relocation is often 0.
Inspired-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Several leaks in audit_tree didn't get caught by commit 318b6d3d7ddbcad3d6867e630711b8a705d873d7, including the leak on normal
exit in case of multiple rules refering to the same chunk.
... aka "Al had badly fscked up when writing that thing and nobody
noticed until Eric had fixed leaks that used to mask the breakage".
The function essentially creates a copy of old array sans one element
and replaces the references to elements of original (they are on cyclic
lists) with those to corresponding elements of new one. After that the
old one is fair game for freeing.
First of all, there's a dumb braino: when we get to list_replace_init we
use indices for wrong arrays - position in new one with the old array
and vice versa.
Another bug is more subtle - termination condition is wrong if the
element to be excluded happens to be the last one. We shouldn't go
until we fill the new array, we should go until we'd finished the old
one. Otherwise the element we are trying to kill will remain on the
cyclic lists...
That crap used to be masked by several leaks, so it was not quite
trivial to hit. Eric had fixed some of those leaks a while ago and the
shit had hit the fan...
Here is the description of the first of
those patches (the other two just fixed
bugs added by that patch):
Since I removed the master netdev, we've been
keeping internal queues only, and even before
that we never told the networking stack above
the virtual interfaces about congestion. This
means that packets are queued in mac80211 and
the upper layers never know, possibly leading
to memory exhaustion and other problems.
This patch makes all interfaces multiqueue and
uses ndo_select_queue to put the packets into
queues per AC. Additionally, when the driver
stops a queue, we now stop all corresponding
queues for the virtual interfaces as well.
The injection case will use VO by default for
non-data frames, and BE for data frames, but
downgrade any data frames according to ACM. It
needs to be fleshed out in the future to allow
chosing the queue/AC in radiotap.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Cc: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mike Frysinger [Fri, 8 Jan 2010 05:40:42 +0000 (00:40 -0500)]
kernel/sysctl.c: fix stable merge error in NOMMU mmap_min_addr
Stable commit 0399123f3dcce1a515d021107ec0fb4413ca3efa didn't match the
original upstream commit. The CONFIG_MMU check was added much too early
in the list disabling a lot of proc entries in the process.
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>