* pa[idp->layers] should be cleared even if it's not used by
sub_alloc() because it's used by idr_mark_full().
* The original condition check also assigned pa[l] to p, which the new
code didn't do, thus leaving p pointing at the wrong layer.
Both problems have been fixed, and the idr code has received a good amount of
testing using a userland test setup in which a simple bitmap allocator is run
in parallel to verify the result of idr allocation.
The bug this patch fixes is caused by the sub_alloc() optimization path
bypassing the out-of-room condition check and restarting the allocation loop
with a starting value higher than the maximum allowed value. For a detailed
description, please read the commit message of 859ddf09.
Signed-off-by: Tejun Heo <tj@kernel.org>
Based-on-patch-from: Eric Paris <eparis@redhat.com> Reported-by: Eric Paris <eparis@redhat.com> Tested-by: Stefan Lippers-Hollmann <s.l-h@gmx.de> Tested-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch prevents calling set_wep_key() with a zero key length. That fixes a
long-standing regression since commit c0380693520b1a1e4f756799a0edc379378b462a
"airo: clean up WEP key operations". Additionally, print a call trace when
someone tries to use improper parameters, and remove the key.len = 0
assignment, because it is in an impossible code path.
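A hedged sketch of the kind of guard described above; the function signature is approximated from the changelog, not copied from the driver:

  /* Hedged sketch: refuse a zero-length key and dump a call trace so the
   * broken caller can be identified; the rest of set_wep_key() is elided. */
  static int set_wep_key(struct airo_info *ai, u16 index, const char *key,
                         u16 keylen, int perm, int lock)
  {
          if (WARN_ON(keylen == 0))
                  return -1;

          /* ... program the key into the firmware as before ... */
          return 0;
  }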
Reported-by: Chris Siebenmann <cks-rhbugzilla@cs.toronto.edu> Bisected-by: Chris Siebenmann <cks-rhbugzilla@cs.toronto.edu> Tested-by: Chris Siebenmann <cks@cs.toronto.edu> Cc: Dan Williams <dcbw@redhat.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Some devices do not react to a control request (seen on APC UPSs), resulting in
a slow stream of "generic-usb ... control queue full" messages. Therefore the
request needs a timeout.
Signed-off-by: Oliver Neukum <oliver@neukum.org> Signed-off-by: David Fries <david@fries.net> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
These touchscreens are mounted onto HP TouchSmart and the Dell Studio One
19. Without a quirk they report a wrong button set and the x/y coordinates
through ABS_Z/ABS_RX, confusing the higher levels (most notably X.Org's
evdev driver).
Device id 0x003 covers models 1900, 2150, and 2700 [1] though testing could
only be performed on a model 1900.
There were multiple reports indicating that the vendor messed up horribly
and the same VID/PID combination is used for completely different devices,
some of them requiring the blacklist entry and others not.
Remove the blacklist entry for this combination of VID/PID completely, and let
the user decide and unbind the driver via sysfs if needed. The proper fix would
be fixing the vendor.
This fixes inefficient page-by-page reads on POSIX_FADV_RANDOM.
POSIX_FADV_RANDOM used to set ra_pages=0, which leads to poor performance:
a 16K read will be carried out in 4 _sync_ 1-page reads.
In other places, ra_pages==0 means
- it's ramfs/tmpfs/hugetlbfs/sysfs/configfs
- some IO error happened
where multi-page read IO won't help or should be avoided.
POSIX_FADV_RANDOM actually wants different semantics: to disable the
*heuristic* readahead algorithm, and to use a dumb one which faithfully
submits read IO for whatever the application requests.
So introduce a flag FMODE_RANDOM for POSIX_FADV_RANDOM.
Note that the random hint is not likely to help random read performance
noticeably. And it may be too permissive on huge request sizes (its IO
size is not limited by read_ahead_kb).
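For illustration, a small self-contained userspace program (not part of the patch) that exercises the hint; with FMODE_RANDOM in place, the 16K request below should go out as one read instead of four synchronous 1-page reads:

  #define _XOPEN_SOURCE 600
  #include <fcntl.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>

  int main(int argc, char **argv)
  {
          int fd, err;
          char buf[16384];
          ssize_t n;

          if (argc < 2) {
                  fprintf(stderr, "usage: %s <file>\n", argv[0]);
                  return 1;
          }
          fd = open(argv[1], O_RDONLY);
          if (fd < 0) {
                  perror("open");
                  return 1;
          }
          /* Declare random access for the whole file. */
          err = posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM);
          if (err)
                  fprintf(stderr, "posix_fadvise: %s\n", strerror(err));

          n = pread(fd, buf, sizeof(buf), 0);   /* one 16K request */
          printf("read %zd bytes\n", n);
          close(fd);
          return 0;
  }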
In Quentin's report (http://lkml.org/lkml/2009/12/24/145), the overall
(NFS read) performance of the application increased by 313%!
Tested-by: Quentin Barnes <qbarnes+nfs@yahoo-inc.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: <qbarnes+nfs@yahoo-inc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If you read the mail to Oliver Neukum on the linux-usb list, then you know
that I found a cure for the mysterious problem of the MR97310a CIF "type 1"
cameras freezing up and refusing to stream when hooked up to a machine with a
UHCI controller.
Namely, the cure is that if the camera is an mr97310a CIF type 1 camera, you
have to send it 0xa0, 0x00. Somehow, this is a timing reset command, or some
such. It un-blocks whatever was previously stopping the CIF type 1 cameras
from working on the UHCI-based machines.
Signed-off-by: Theodore Kilgore <kilgota@auburn.edu> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Make addba_resp_timer aware of the HT_AGG_STATE_REQ_STOP_BA_MSK mask
so that when ___ieee80211_stop_tx_ba_session() is issued the timer
will quit. Otherwise, when a suspend happens before the timer expires,
the timer handler will be called immediately after resume and
mess up the driver status.
Signed-off-by: Zhu Yi <yi.zhu@intel.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The driver hangs when doing `rmmod mpt2sas` if there are any
IR volumes present. The hang is due to the scsi midlayer trying to access the
IR volumes after the driver releases controller resources. Perhaps when
scsi_remove_host is called, the scsi midlayer is sending some request.
This doesn't occur for bare drives because the driver is already reporting
those drives deleted prior to calling mpt2sas_base_detach.
To solve this issue, we need to delete the volumes as well.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Reviewed-by: Eric Moore <eric.moore@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ACPI deep C-state entry had a long-standing bug/missing feature, wherein we
were sending resched IPIs when an idle CPU is in an mwait-based deep C-state.
Only mwait-based C1 was using the write to the monitored address to wake up
the mwait'ing CPU.
This patch changes the code to retain the TS_POLLING bit if we are entering an
mwait-based deep C-state.
The patch has been verified to reduce the number of resched IPIs in general
and also improves performance/power on workloads with low system utilization
(i.e., when mwait-based deep C-states are being used).
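A hedged sketch of the idea, based on the changelog rather than the actual diff (the entry_method check and constant follow the 2.6.32-era ACPI idle code from memory):

  if (cx->entry_method != ACPI_CSTATE_FFH) {
          /* Not MWAIT-based: a resched IPI is needed to wake us, so drop
           * TS_POLLING before going idle. */
          current_thread_info()->status &= ~TS_POLLING;
          /* TS_POLLING-cleared state must be visible before we test
           * NEED_RESCHED. */
          smp_mb();
          if (unlikely(need_resched())) {
                  current_thread_info()->status |= TS_POLLING;
                  local_irq_enable();
                  return 0;       /* bail out, per the surrounding function */
          }
  }
  /* MWAIT-based deep C-states keep TS_POLLING set, so a remote CPU wakes us
   * by writing to the monitored address instead of sending an IPI. */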
Fixes "netperf ~50% regression with 2.6.33-rc1, bisect to 1b9508f"
http://marc.info/?l=linux-kernel&m=126441481427331&w=4
Reported-by: Lin Ming <ming.m.lin@intel.com> Tested-by: Alex Shi <alex.shi@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We broke "acpi=ht" in 2.6.32 by disabling MADT parsing
for acpi=disabled. e5b8fc6ac158f65598f58dba2c0d52ba3b412f52
This also broke systems which invoked acpi=ht via DMI blacklist.
acpi=ht is a really ugly hack,
but restore it for those that still use it.
http://bugzilla.kernel.org/show_bug.cgi?id=14886
Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We realized when we broke acpi=ht
http://bugzilla.kernel.org/show_bug.cgi?id=14886
that acpi=ht is not needed on this box
and folks have been using acpi=force on it anyway.
Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The ibmphp driver currently maps only 1KB of the ebda memory area into the
kernel address space during driver initialization. This causes a kernel oops
when the driver is modprobe'd and it accesses memory beyond 1KB within the
ebda segment. The first byte of the ebda segment actually stores the length of
the ebda region in kilobytes. Hence, make use of the length parameter and map
the entire ebda region.
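A hedged sketch of the remapping idea; `ebda_seg` and the exact error handling are assumptions, not a copy of the driver code:

  void __iomem *io_mem;
  unsigned int ebda_sz;

  /* Map 1KB first, just enough to read the size byte. */
  io_mem = ioremap(ebda_seg << 4, 1024);
  if (!io_mem)
          return -ENOMEM;
  ebda_sz = readb(io_mem);        /* first byte = EBDA length in KiB */
  iounmap(io_mem);
  if (!ebda_sz)
          return -ENOMEM;

  /* Now map the whole region so later accesses stay in bounds. */
  io_mem = ioremap(ebda_seg << 4, ebda_sz * 1024);
  if (!io_mem)
          return -ENOMEM;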
Mike Cui reported that his system with an NVIDIA MCP79 (aka MCP7A)
chipset stopped working with 2.6.32. The problem appears to be that
2.6.32 now enables the FPDMA auto-activate optimization in the ahci
driver. The drive works fine with this enabled on an Intel AHCI so
this appears to be a chipset bug. Since MCP79 is a fairly recent
NVIDIA chipset and we don't have any info on whether any other NVIDIA
chipsets have this issue, disable FPDMA AA optimization on all NVIDIA
AHCI controllers for now.
Should address http://bugzilla.kernel.org/show_bug.cgi?id=14922
Signed-off-by: Robert Hancock <hancockrwd@gmail.com>
While-we-investigate-issue-this-patch-looks-good-to-me-by:
Prajakta Gudadhe <pgudadhe@nvidia.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This fixes corrupted CIPSO packets when SELinux categories greater than 127
are used. The bug occurred on the second (and later) loops through the
while; the inner for loop through the ebitmap->maps array used the same
index as the NetLabel catmap->bitmap array, even though the NetLabel bitmap
is twice as long as the SELinux bitmap.
Signed-off-by: Joshua Roys <joshua.roys@gtri.gatech.edu> Acked-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Check that ieee80211_is_data_qos() is true for the frame control before
counting the number of tfds that can be freed; tfds_in_queue is only
incremented when ieee80211_is_data_qos() is true before transmit, so it
should only be decremented if the type matches.
Remove the ieee80211_is_data_qos() check of frame_ctrl in tx_resp to avoid
invalid information passed from the uCode.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The HT extension channel settings require priv->staging_rxon.channel to be
accurate. However, iwl_set_rxon_ht was being called before iwl_set_rxon_channel
and thus HT40 could be broken unless another call to iwl_mac_config came in.
This problem was recently introduced by "iwlwifi: Fix to set correct ht
configuration"
The particular setting in which I noticed this was monitor mode:
iwconfig wlan0 mode monitor
ifconfig wlan0 up
./iw wlan0 set channel 64 HT40-
#./iw wlan0 set channel 64 HT40-
tcpdump -i wlan0 -y IEEE802_11_RADIO
would only catch HT40 packets if I issued the IW command twice.
From visual inspection, iwl_set_rxon_channel does not depend on
iwl_set_rxon_ht, so simply swapping them should be safe and fixes this problem.
Signed-off-by: Daniel Halperin <dhalperi@cs.washington.edu> Acked-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When receiving reply_tx and ready to decrement the count of the number of
tfds in the queue, do error checking to prevent an error condition in which
tfds_in_queue becomes a negative number.
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
803bf5ec259941936262d10ecc84511b76a20921 ("fs/exec.c: restrict initial
stack space expansion to rlimit") attempts to limit the initial stack to
20*PAGE_SIZE. Unfortunately, in attempting to ensure the stack is not
reduced in size, we ended up not changing the stack at all.
This size reduction check is not necessary as the expand_stack call does
this already.
This caused a regression in UML resulting in most guest processes being
killed.
Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Jouni Malinen <j@w1.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Move I2C IR initialization from just after I2C bus setup to right
before non-I2C IR initialization. This avoids the case where an I2C IR
device is blocking audio support (at least the PV951 suffers from
this). It is also more logical to group IR support together,
regardless of the connectivity.
This fixes bug #15184:
http://bugzilla.kernel.org/show_bug.cgi?id=15184
This patch fixes two bugs that revolve around the miscalculation and
misuse of the variable 'overhead_size'. 'overhead_size' is the size of
the various header structures used during communication.
The first bug is the use of 'sizeof' with the pointer of a structure
instead of the structure itself - resulting in the wrong size being
computed. This is then used in a check to see if the payload
(data_size) would be too large for the preallocated structure. Since the
bug produces a smaller value for the overhead, it was possible for the
structure to be breached. (Although the current users of the code do
not currently send enough data to trigger this bug.)
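A self-contained illustration of that first bug pattern (the structure here is a stand-in, not the real cn_msg layout):

  #include <stdio.h>

  struct cn_msg {                 /* stand-in; the real struct differs */
          unsigned char header[20];
  };

  int main(void)
  {
          struct cn_msg *msg = NULL;

          /* Bug pattern: sizeof(msg) is the size of the *pointer* (8 bytes
           * on x86_64), not the size of the header the overhead calculation
           * was meant to account for. */
          printf("sizeof(msg)  = %zu (pointer)\n", sizeof(msg));
          printf("sizeof(*msg) = %zu (structure)\n", sizeof(*msg));
          return 0;
  }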
The second bug is that the 'overhead_size' value is used to compute how
much of the preallocated space should be cleared before populating it
with fresh data. This should have simply been 'sizeof(struct cn_msg)'
not overhead_size. The fact that 'overhead_size' was computed
incorrectly made this problem "less bad" - leaving only a pointer's
worth of space at the end uncleared. Thus, this bug was never producing
a bad result, but still needs to be fixed - especially now that the
value is computed correctly.
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
iwl_set_rxon_ht() only gets called in iwl_post_associate(), which can result
in an incorrect HT configuration. Add the call in iwl_mac_config() when the
IEEE80211_CONF_CHANGE_CHANNEL flag is set, to re-configure and send the RXON
command.
We only reply to a probe request if either the requested SSID is the
broadcast SSID or the requested SSID matches our own SSID. The latter
case was not properly handled, since we were replying to a different
SSID with the same length as our own SSID.
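A hedged sketch of the intended check; the helper and its arguments are illustrative, not the mac80211 code itself:

  #include <linux/string.h>
  #include <linux/types.h>

  /* Reply only to the broadcast SSID or to an exact match of our SSID. */
  static bool should_reply_to_probe_req(const u8 *req_ssid, size_t req_len,
                                        const u8 *our_ssid, size_t our_len)
  {
          if (req_len == 0)               /* broadcast (wildcard) SSID */
                  return true;

          /* Comparing the length alone is what caused the bug; the
           * contents must match as well. */
          return req_len == our_len &&
                 memcmp(req_ssid, our_ssid, our_len) == 0;
  }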
Signed-off-by: Benoit Papillault <benoit.papillault@free.fr> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Currently, PAE frames are not assigned proper sequence numbers.
Since sending PAE frames as part of aggregates breaks
crypto with several APs, they are sent as normal MPDUs.
Fix the sequence number issue by updating the frame with the
internal sequence number.
Tested-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Sujith <Sujith.Manoharan@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit c7ab5ef9bcd281135c21b4732c9be779585181be entitled "b43: implement
short slot and basic rate handling" reduced the transmit throughput for
my BCM4311 device from 18 Mb/s to 0.7 Mb/s. The basic rate handling
portion is OK, the problem is in the short slot handling.
Prior to this change, the short slot enable/disable routines were never
called. Experimentation showed that the critical part was changing the
value at offset 0x0010 in the shared memory. This is supposed to contain
the 802.11 Slot Time in usec, but if it is changed from its initial value
of zero, performance is destroyed. On the other hand, changing the value
in the MMIO register corresponding to the Interframe Slot Time increased
performance from 18 to 22 Mb/s. A BCM4306/3 also shows dramatic
improvement of the transmit rate from 5.3 to 19.0 Mb/s.
Other changes in the patch include removal of the magic number for the
MMIO register, and allowing the slot time to be set for any PHY operating
in the 2.4 GHz band. Previously, the routine was executed only for G PHYs.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The i_blocks field of an eCryptfs inode cannot be trusted, but
generic_fillattr() uses it to instantiate the blocks field of a stat()
syscall when a filesystem doesn't implement its own getattr(). Users
have noticed that the output of du is incorrect on newly created files.
This patch creates ecryptfs_getattr() which calls into the lower
filesystem's getattr() so that eCryptfs can use its kstat.blocks value
after calling generic_fillattr(). It is important to note that the
block count includes the eCryptfs metadata stored in the beginning of
the lower file plus any padding used to fill an extent before
encryption.
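A hedged sketch of the described approach, reconstructed from the description rather than copied from the patch:

  /* Report our own attributes, but take the block count from the lower
   * filesystem, which knows the real on-disk usage. */
  static int ecryptfs_getattr(struct vfsmount *mnt, struct dentry *dentry,
                              struct kstat *stat)
  {
          struct kstat lower_stat;
          int rc;

          rc = vfs_getattr(ecryptfs_dentry_to_lower_mnt(dentry),
                           ecryptfs_dentry_to_lower(dentry), &lower_stat);
          if (!rc) {
                  generic_fillattr(dentry->d_inode, stat);
                  stat->blocks = lower_stat.blocks;
          }
          return rc;
  }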
The cached read and write paths initialize fattr->time_start in their
setup procedures. The value of fattr->time_start is propagated to
read_cache_jiffies by nfs_update_inode(). Subsequent calls to
nfs_attribute_timeout() will then use a good time stamp when
computing the attribute cache timeout, and squelch unneeded GETATTR
calls.
Since the direct I/O paths erroneously leave the inode's
fattr->time_start field set to zero, read_cache_jiffies for that inode
is set to zero after any direct read or write operation. This
triggers an otw GETATTR or ACCESS call to update the file's attribute
and access caches properly, even when the NFS READ or WRITE replies
have usable post-op attributes.
Make sure the direct read and write setup code performs the same fattr
initialization as the cached I/O paths to prevent unnecessary GETATTR
calls.
This was likely introduced by commit 0e574af1 in 2.6.15, which appears
to add new nfs_fattr_init() call sites in the cached read and write
paths, but not in the equivalent places in fs/nfs/direct.c. A
subsequent commit in the same series, 33801147, introduces the
fattr->time_start field.
Interestingly, the direct write reschedule path already has a call to
nfs_fattr_init() in the right place.
For usec delays use udelay instead of scheduling; this should
allow reclocking to happen faster. The scheduling was also the cause
of reported 33s delays at bootup on certain systems.
Fixes: freedesktop.org bug 25506
Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since the rewrite of the CPU idle governor in 2.6.32, two laptops have
surfaced where the BIOS advertises a C2 power state, but for some reason
this state is not functioning (as verified in both cases by powertop
before the patch in .32).
The old governor had the accidental behavior that if a non-working state
was chosen too many times, it would end up falling back to C1. The new
governor works differently and this accidental behavior is no longer
there; the result is a high temperature on these two machines.
This patch adds these 2 machines to the DMI table for C state anomalies;
by just not using C2 both these machines are better off (the TSC can be
used instead of the pm timer, giving a performance boost for example).
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Reported-by: <akwatts@ymail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I notice that the processcompl_compat() function seems to be leaking the
'struct async *as' in the error paths.
I think that the calling convention is fundamentally buggered. The
caller is the one that did the "reap_as()" to get the as thing, the
caller should be the one to free it too.
Freeing it in the caller also means that it very clearly always gets
freed, and avoids the need for any "free in the error case too".
From: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Marcus Meissner <meissner@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is currently a bug in sysfs_sd_setattr, inherited from
sysfs_setattr in 2.6.32, where the first time we set the attributes
on a sysfs file we allocate backing store but do not set the
backing store attributes, resulting in overly restrictive
permissions on sysfs files.
The fix is to simply modify the code so that it always executes
when we update the sysfs attributes, as we did in 2.6.31 and earlier.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Tested-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When controlling an industrial radio modem it can be necessary to
manipulate the handshake lines in order to control the radio modem's
transmitter, from userspace.
The transmitter should not be turned off before all characters have been
transmitted. serial8250_tx_empty() was reporting that all characters were
transmitted before they actually were.
===
Discovered in parallel with more testing and analysis by Kees Schoenmakers
as follows:
I ran into a NetMos 9835 serial PCI board which behaves a little
differently from the standard. This type of expansion board is very common.
"Standard" 8250-compatible devices clear the "UART_LSR_TEMT" bit together
with the "UART_LSR_THRE" bit when writing data to the device.
The NetMos device does it slightly differently.
I believe that the TEMT bit is coupled to the shift register. The problem
is that if one calls serial8250_tx_empty very quickly after writing data
to the device, it returns the wrong information.
My patch makes the test more robust (and solves the problem) and it does
not affect the already correct devices.
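A hedged sketch of the more robust test; the locking and the BOTH_EMPTY macro follow the 8250 driver's conventions from memory, not the patch text:

  #define BOTH_EMPTY (UART_LSR_TEMT | UART_LSR_THRE)

  static unsigned int serial8250_tx_empty(struct uart_port *port)
  {
          struct uart_8250_port *up = (struct uart_8250_port *)port;
          unsigned long flags;
          unsigned int lsr;

          spin_lock_irqsave(&up->port.lock, flags);
          lsr = serial_in(up, UART_LSR);
          spin_unlock_irqrestore(&up->port.lock, flags);

          /* Empty only when both the holding register (THRE) and the shift
           * register (TEMT) are empty, which copes with clones that clear
           * TEMT late. */
          return (lsr & BOTH_EMPTY) == BOTH_EMPTY ? TIOCSER_TEMT : 0;
  }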
Alan:
We may yet need to quirk this, but now we know which chips, we have a
way to do that should we find this breaks some other 8250 clone with
dodgy THRE.
Signed-off-by: Dick Hollenbeck <dick@softplc.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: Kees Schoenmakers <k.schoenmakers@sigmae.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As the release of substreams may be done asynchronously from the
disconnection, the close callback needs to check the shutdown flag before
actually accessing the usb interface.
This patch moves the initialization of the iommu-api out of
the dma-ops initialization code. This ensures that the
iommu-api is initialized even with iommu=pt.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reported-and-tested-by: Ciprian Dorin Craciun <ciprian.craciun@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Acer G725 shares the same suspend problem with the HP laptops which
lose ATA devices on resume. New firmware which fixes the problem is
already available. Add G725 with old firmwares to the broken suspend
list.
flush_dcache_page() must be called after (!ATA_TFLAG_WRITE) the
data copying to avoid D-cache aliasing with user space or I-D cache
coherency issues (when reading data from an ATA device using PIO,
the kernel dirties the D-cache but there is no flush_dcache_page()
required on Harvard architectures).
This patch adds support for automatically muting the speakers when headphones
are inserted, as well as relabelling the headphone widgets from the
non-standard "HP" to the standard "Headphone" for the mb5 model.
Replace the zero-division warning message with WARN_ON_ONCE() per the
advice by Linus. This shouldn't happen, but if it does, the bug may
well trigger often due to buggy IRQs.
pte_write() should check whether the permissions include either the user
or kernel write permission bits. Likewise, pte_wrprotect() needs to
remove both the kernel and user write bits.
Without this patch handle_tlbmiss() doesn't handle faulting in pages
from the P3 area (our vmalloc space) because of a write. Mappings of the
P3 space have the _PAGE_EXT_KERN_WRITE bit but not _PAGE_EXT_USER_WRITE.
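A hedged, simplified sketch of the intended semantics; the real sh X2 TLB headers split these extended bits across the 64-bit PTE, which is glossed over here:

  /* Writable if either the user or the kernel write bit is set. */
  static inline int pte_write(pte_t pte)
  {
          return pte_val(pte) & (_PAGE_EXT_USER_WRITE | _PAGE_EXT_KERN_WRITE);
  }

  /* Write-protecting must strip both write bits, not just the user one. */
  static inline pte_t pte_wrprotect(pte_t pte)
  {
          return __pte(pte_val(pte) &
                       ~(_PAGE_EXT_USER_WRITE | _PAGE_EXT_KERN_WRITE));
  }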
Signed-off-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
write_kmem() used to assume vwrite() always returns the full buffer length.
However, vwrite() can now return 0 to indicate a memory hole. This creates a
bug in which "buf" is not advanced accordingly.
Fix it to simply ignore the return value, and hence the memory hole.
This also aligns the kmem read/write implementation with mem(4):
"References to nonexistent locations cause errors to be returned." Here we
return -ENXIO (inspired by Hugh) if no bytes have been transferred to/from
user space, otherwise return partial read/write results.
As the padlock driver for SHA uses a software fallback to perform
partial hashing, it must implement custom import/export functions.
Otherwise hmac, which depends on import/export for prehashing, will
not work with padlock-sha.
Reported-by: Wolfgang Walter <wolfgang.walter@stwm.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When we set up a buffer for the display plane, we check for any pending
required GPU flush and possibly perform an interruptible wait for the flush
to complete. But that wait is quite likely to fail if signals are received
by the X process, which will then fail the modeset process and put the
display engine in an inconsistent state. The result could be a blank screen
or a CPU hang, and the DDX driver always turns the outputs' DPMS on
afterwards, whether the modeset failed or not.
So this creates a new helper for setting up the display plane buffer, which
uses an uninterruptible wait when a flush is needed.
This should fix bugs like https://bugs.freedesktop.org/show_bug.cgi?id=24009.
It also fixes the mode switch stress test on Ironlake.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This just waits until the hw passed the current ring position with
cmd execution. This slightly changes the existing i915_wait_request
function to make uninterruptible waiting possible - no point in
returning to userspace while mucking around with the overlay, that
piece of hw is just too fragile.
Also replace a magic 0 with the symbolic constant (and kill the then
superfluous comment) while I was looking at the code.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This one reverts 9e3a6d155ed0a7636b926a798dd7221ea107b274.
As reported by http://bugzilla.kernel.org/show_bug.cgi?id=14485,
this dump will cause a hang problem on some machines. If something
really needs this kind of full register dump, it could be done
within intel-gpu-tools.
Cc: Ben Gamari <bgamari.foss@gmail.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As noticed by Jon Masters <jonathan@jonmasters.org>, the conntrack hash
size is global and not per namespace, but modifiable at runtime through
/sys/module/nf_conntrack/hashsize. Changing the hash size will only
resize the hash in the current namespace however, so other namespaces
will use an invalid hash size. This can cause crashes when enlarging
the hashsize, or false negative lookups when shrinking it.
Move the hash size into the per-namespace data and only use the global
hash size to initialize the per-namespace value when instantiating a
new namespace. Additionally restrict hash resizing to init_net for
now as other namespaces are not handled currently.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Expectation hashtable size was simply glued to a variable with no code
to rehash expectations, so it was a bug to allow writing to it.
Make "expect_hashsize" readonly.
nf_conntrack_cachep is currently shared by all netns instances, but
because of SLAB_DESTROY_BY_RCU special semantics, this is wrong.
If we use a shared slab cache, one object can instantly move from
one hash table (netns ONE) to another one (netns TWO), and a concurrent
reader (doing a lookup in netns ONE, 'finding' an object of netns TWO)
can be fooled without notice, because no RCU grace period has to be
observed between object freeing and its reuse.
We don't have this problem with UDP/TCP slab caches because TCP/UDP
hashtables are global to the machine (and each object has a pointer to
its netns).
If we use per netns conntrack hash tables, we also *must* use per netns
conntrack slab caches, to guarantee an object can not escape from one
namespace to another one.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
[Patrick: added unique slab name allocation] Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As discovered by Jon Masters <jonathan@jonmasters.org>, the "untracked"
conntrack, which is located in the data section, might be accidentally
freed when a new namespace is instantiated while the untracked conntrack
is attached to a skb, because the reference count is re-initialized.
The best fix would be to use a separate untracked conntrack per
namespace since it includes a namespace pointer. Unfortunately this is
not possible without larger changes since the namespace is not easily
available everywhere we need it. For now move the untracked conntrack
initialization to the init_net setup function to make sure the reference
count is not re-initialized and handle cleanup in the init_net cleanup
function to make sure namespaces can exit properly while the untracked
conntrack is in use in other namespaces.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
An unfortunate "WARNING" in the message amd64_edac dumps when the system
doesn't support DRAM ECC or ECC checking is not enabled in the BIOS
used to trigger kerneloops which qualified the message as an OOPS thus
misleading the users. See, e.g.
When suspending, tpm_infineon calls the generic suspend function of the
TPM framework. However, the TPM framework does not return and the system
hangs upon suspend. When sending the necessary command "TPM_SaveState"
directly within the driver, suspending and resuming works fine.
The current kvm wallclock does not consider total_sleep_time, which can cause
a wrong wallclock in the guest after a host suspend/resume. This patch solves
the issue by counting total_sleep_time to get the correct host boot time.
Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A DVB demultiplexer device can be used to set up either a PES filter or
a section filter. In the former case, the ts field of the feed union of
struct dmxdev_filter is used, in the latter case the sec field of the
same union is used.
The ts field is a struct list_head, and is currently initialized in the
open() method of the demux device. When for a given demuxer a section
filter is set up, the sec field is played with, thus if a PES filter
needs to be set up after that the ts field will be corrupted, causing a
kernel oops.
This fix moves the list head initialization to
dvb_dmxdev_pes_filter_set(), so that the ts field is properly
initialized every time a PES filter is set up.
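A hedged sketch of where the initialization moves to; the surrounding function body is elided and the signature is reproduced from memory:

  static int dvb_dmxdev_pes_filter_set(struct dmxdev *dmxdev,
                                       struct dmxdev_filter *dmxdevfilter,
                                       struct dmx_pes_filter_params *params)
  {
          /* ... parameter validation ... */

          /* (Re)initialize the TS feed list on every PES filter setup, so
           * earlier section-filter use of the union cannot leave a corrupted
           * list head behind. */
          INIT_LIST_HEAD(&dmxdevfilter->feed.ts);

          /* ... set up the PES feed as before ... */
          return 0;
  }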
Signed-off-by: Francesco Lavra <francescolavra@interfree.it> Reviewed-by: Andy Walls <awalls@radix.net> Tested-by: hermann pitton <hermann-pitton@arcor.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This code was written long ago when it was not possible to
reshape a degraded array. Now it is possible, so the current level of
degraded-ness needs to be taken into account. Also, newly added
devices should only reduce degradedness if they are deemed to be
in-sync.
In particular, if you convert a RAID5 to a RAID6, and increase the
number of devices at the same time, then the 5->6 conversion will
make the array degraded so the current code will produce a wrong
value for 'degraded' - "-1" to be precise.
If the reshape runs to completion end_reshape will calculate a correct
new value for 'degraded', but if a device fails during the reshape an
incorrect decision might be made based on the incorrect value of
"degraded".
This patch is suitable for 2.6.32-stable and if they are still open,
2.6.31-stable and 2.6.30-stable as well.
It was recently pointed out that the NFSERR_SERVERFAULT error, which is
designed to inform the user of a serious internal error on the server, was
being mapped to an error value that is internal to the kernel.
This patch maps it to the error EREMOTEIO, which is exported to userland
through errno.h.
The VM/VFS does not allow mapping->a_ops->invalidatepage() to fail.
Unfortunately, nfs_wb_page_cancel() may fail if a fatal signal occurs.
Since the NFS code assumes that the page stays mapped for as long as the
writeback is active, we can end up Oopsing (among other things).
The only safe fix here is to convert nfs_wait_on_request(), so as to make
it uninterruptible (as is already the case with wait_on_page_writeback()).
cifs_from_ucs2 returns the length of the converted name, including the
length of the NULL terminator. We don't want to include the NULL
terminator in the dentry name length however since that'll throw off the
hash calculation for the dentry cache.
I believe that this is the root cause of several problems that have
cropped up recently that seem to be papered over with the "noserverino"
mount option. More confirmation of that would be good, but this is
clearly a bug and it fixes at least one reproducible problem that
was reported.
This patch fixes at least this reproducer in this kernel.org bug:
This bug means that when limiting the stack to less than 20*PAGE_SIZE (eg.
80K on 4K pages or 'ulimit -s 79') all processes will be killed before
they start. This is particularly bad with 64K pages, where a ulimit below
1280K will kill every process.
To test, do:
'ulimit -s 15; ls'
before and after the patch is applied. Before it's applied, 'ls' should
be killed. After the patch is applied, 'ls' should no longer be killed.
A stack limit of 15KB is used since it's small enough to trigger 20*PAGE_SIZE.
Also, 15KB is not a multiple of PAGE_SIZE, which is a trickier case to handle
correctly with this code.
4K pages should be fine to test with.
[kosaki.motohiro@jp.fujitsu.com: cleanup]
[akpm@linux-foundation.org: cleanup cleanup] Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Americo Wang <xiyou.wangcong@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We want to be sure that the compiler fetches the limit variable only
once, so add helpers for fetching current and maximal resource
limits which do that.
Add them to sched.h (instead of resource.h) due to the circular dependency
sched.h->resource.h->task_struct.
An alternative would be to create a separate res_access.h or similar.
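A hedged sketch of such helpers (names are illustrative; ACCESS_ONCE() is what forces the single fetch):

  static inline unsigned long task_rlimit(const struct task_struct *tsk,
                                          unsigned int limit)
  {
          return ACCESS_ONCE(tsk->signal->rlim[limit].rlim_cur);
  }

  static inline unsigned long task_rlimit_max(const struct task_struct *tsk,
                                              unsigned int limit)
  {
          return ACCESS_ONCE(tsk->signal->rlim[limit].rlim_max);
  }

  static inline unsigned long rlimit(unsigned int limit)
  {
          return task_rlimit(current, limit);
  }

  static inline unsigned long rlimit_max(unsigned int limit)
  {
          return task_rlimit_max(current, limit);
  }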
It is possible (and expected) for there to be holes in the h->drv[]
array, that is, some elements may be NULL pointers. cciss_seq_show
needs to be made aware of this possibility to avoid an Oops.
To reproduce the Oops which this fixes:
1) Create two "arrays" in the Array Configuration Utility and
several logical drives on each array.
2) cat /proc/driver/cciss/cciss* in an infinite loop
3) delete some of the logical drives in the first "array."
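A hedged sketch of the guard described above; the seq_file boilerplate and the drive-printing code are elided:

  static int cciss_seq_show(struct seq_file *seq, void *v)
  {
          ctlr_info_t *h = seq->private;
          loff_t *pos = v;
          drive_info_struct *drv = h->drv[*pos];

          /* A logical drive deleted while /proc is being read leaves a NULL
           * hole in h->drv[]; skip it instead of dereferencing it. */
          if (drv == NULL)
                  return 0;

          /* ... print the logical drive's geometry as before ... */
          return 0;
  }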
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>