Commit 48b32a3553a54740d236b79a90f20147a25875e3 ("reiserfs: use generic
xattr handlers") introduced a problem that causes corruption when extended
attributes are replaced with a smaller value.
The issue is that the reiserfs_setattr to shrink the xattr file was moved
from before the write to after the write.
The root issue has always been in the reiserfs xattr code, but was papered
over by the fact that in the shrink case, the file would just be expanded
again while the xattr was written.
The end result is that the last 8 bytes of xattr data are lost.
Commit 677c9b2e393a0cd203bd54e9c18b012b2c73305a ("reiserfs: remove
privroot hiding in lookup") removed the magic from the lookup code to hide
the .reiserfs_priv directory since it was getting loaded at mount-time
instead. The intent was that the entry would be hidden from the user via
a poisoned d_compare, but this was faulty.
This introduced a security issue where unprivileged users could access and
modify extended attributes or ACLs belonging to other users, including
root.
This patch resolves the issue by properly hiding .reiserfs_priv. This was
the intent of the xattr poisoning code, but it appears to have never
worked as expected. This is fixed by using d_revalidate instead of
d_compare.
This patch makes -oexpose_privroot a no-op. I'm fine leaving it this way.
The effort involved in working out the corner cases wrt permissions and
caching outweigh the benefit of the feature.
Signed-off-by: Jeff Mahoney <jeffm@suse.com> Acked-by: Edward Shishkin <edward.shishkin@gmail.com> Reported-by: Matt McCutchen <matt@mattmccutchen.net> Tested-by: Matt McCutchen <matt@mattmccutchen.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The original code doesn't take into consideration that the value of
MIXART_BA0_SIZE - pos can be less than zero which would lead to a large
unsigned value for "count".
Also I moved the check that read size is a multiple of 4 bytes below
the code that adjusts "count".
Add missing newline to dev_warn() message string. This is more of an issue
with older kernels that don't automatically add a newline if it was missing
from the end of the previous line.
Signed-off-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Ever for 32-bit with sufficiently high NR_CPUS, and starting
with commit 789d03f584484af85dbdc64935270c8e45f36ef7 also for
64-bit, the statically allocated early fixmap page tables were
not covering FIX_OHCI1394_BASE, leading to a boot time crash
when "ohci1394_dma=early" was used. Despite this entry not being
a permanently used one, it needs to be moved into the permanent
range since it has to be close to FIX_DBGP_BASE and
FIX_EARLYCON_MEM_BASE.
This patch (as1352) fixes a bug in the way isochronous input data is
returned to userspace for usbfs transfers. The entire buffer must be
copied, not just the first actual_length bytes, because the individual
packets will be discontiguous if any of them are short.
Reported-by: Markus Rechberger <mrechberger@gmail.com> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 71fe804b6d5 (mempolicy: use struct mempolicy pointer in
shmem_sb_info) added mpol=local mount option. but its feature is broken
since it was born. because such code always return 1 (i.e. mount
failure).
Fix an 'oops' when a tmpfs mount point is mounted with the mpol=default
mempolicy.
Upon remounting a tmpfs mount point with 'mpol=default' option, the mount
code crashed with a null pointer dereference. The initial problem report
was on 2.6.27, but the problem exists in mainline 2.6.34-rc as well. On
examining the code, we see that mpol_new returns NULL if default mempolicy
was requested. This 'NULL' mempolicy is accessed to store the node mask
resulting in oops.
this patch fixes a memory leak which occurs when an em28xx card with DVB
extension is unplugged or its DVB extension driver is unloaded. In
dvb_fini(), dev->dvb must be freed before being set to NULL, as is done
in dvb_init() in case of error.
Note that this bug is also present in the latest stable kernel release.
Modify uid check in do_coredump so as to not apply it in the case of
pipes.
This just got noticed in testing. The end of do_coredump validates the
uid of the inode for the created file against the uid of the crashing
process to ensure that no one can pre-create a core file with different
ownership and grab the information contained in the core when they
shouldn' tbe able to. This causes failures when using pipes for a core
dumps if the crashing process is not root, which is the uid of the pipe
when it is created.
The fix is simple. Since the check for matching uid's isn't relevant for
pipes (a process can't create a pipe that the uermodehelper code will open
anyway), we can just just skip it in the event ispipe is non-zero
Reverts a pipe-affecting change which was accidentally made in
Commit 221af7f87b9 ("Split 'flush_old_exec' into two functions") split
the function at the point of no return - ie right where there were no
more error cases to check. That made sense from a technical standpoint,
but when we then also combined it with the actual personality setting
going in between flush_old_exec() and setup_new_exec(), it needs to be a
bit more careful.
In particular, we need to make sure that we really flush the old
personality bits in the 'flush' stage, rather than later in the 'setup'
stage, since otherwise we might be flushing the _new_ personality state
that we're just setting up.
So this moves the flags and personality flushing (and 'flush_thread()',
which is the arch-specific function that generally resets lazy FP state
etc) of the old process into flush_old_exec(), so that it doesn't affect
any state that execve() is setting up for the new process environment.
This was reported by Michal Simek as breaking his Microblaze qemu
environment.
Reported-and-tested-by: Michal Simek <michal.simek@petalogix.com> Cc: Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Now that the previous commit made it possible to do the personality
setting at the point of no return, we do just that for ELF binaries.
And suddenly all the reasons for that insane TIF_ABI_PENDING bit go
away, and we can just make SET_PERSONALITY() just do the obvious thing
for a 32-bit compat process.
Everything becomes much more straightforward this way.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Here are the sparc bits to remove TIF_ABI_PENDING now that
set_personality() is called at the appropriate place in exec.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
'flush_old_exec()' is the point of no return when doing an execve(), and
it is pretty badly misnamed. It doesn't just flush the old executable
environment, it also starts up the new one.
Which is very inconvenient for things like setting up the new
personality, because we want the new personality to affect the starting
of the new environment, but at the same time we do _not_ want the new
personality to take effect if flushing the old one fails.
As a result, the x86-64 '32-bit' personality is actually done using this
insane "I'm going to change the ABI, but I haven't done it yet" bit
(TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the
personality, but just the "pending" bit, so that "flush_thread()" can do
the actual personality magic.
This patch in no way changes any of that insanity, but it does split the
'flush_old_exec()' function up into a preparatory part that can fail
(still called flush_old_exec()), and a new part that will actually set
up the new exec environment (setup_new_exec()). All callers are changed
to trivially comply with the new world order.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Do not set current->mm->mmap to NULL in 32-bit emulation on 64-bit
load_aout_binary after flush_old_exec as it would destroy already
set brpm mapping with arguments.
Thomas Renninger <trenn@suse.de> reported on IBM x3330
booting a latest kernel on this machine results in:
PCI: PCI BIOS revision 2.10 entry at 0xfd61c, last bus=1
PCI: Using configuration type 1 for base access bio: create slab <bio-0> at 0
ACPI: SCI (IRQ30) allocation failed
ACPI Exception: AE_NOT_ACQUIRED, Unable to install System Control Interrupt handler (20090903/evevent-161)
ACPI: Unable to start the ACPI Interpreter
x86/pci: update pirq_enable_irq() to setup io apic routing
it turns out we need to set irq routing for the sci on ioapic1 early.
-v2: make it work without sparseirq too.
-v3: fix checkpatch.pl warning, and cc to stable
Reported-by: Thomas Renninger <trenn@suse.de> Bisected-by: Thomas Renninger <trenn@suse.de> Tested-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <1265793639-15071-2-git-send-email-yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The driver hangs when doing `rmmod mpt2sas` if there are any
IR volumes present.The hang is due the scsi midlayer trying to access the
IR volumes after the driver releases controller resources. Perhaps when
scsi_remove_host is called,the scsi mid layer is sending some request.
This doesn't occur for bare drives becuase the driver is already reporting
those drives deleted prior to calling mpt2sas_base_detach.
To solve this issue, we need to delete the volumes as well.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com> Reviewed-by: Eric Moore <eric.moore@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Patch prevents call set_wep_key() with zero key length. That fix long
standing regression since commit c0380693520b1a1e4f756799a0edc379378b462a
"airo: clean up WEP key operations". Additionally print call trace when
someone will try to use improper parameters, and remove key.len = 0
assignment, because it is in not possible code path.
Reported-by: Chris Siebenmann <cks-rhbugzilla@cs.toronto.edu> Bisected-by: Chris Siebenmann <cks-rhbugzilla@cs.toronto.edu> Tested-by: Chris Siebenmann <cks@cs.toronto.edu> Cc: Dan Williams <dcbw@redhat.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
ACPI deep C-state entry had a long standing bug/missing feature, wherein we were sending
resched IPIs when an idle CPU is in mwait based deep C-state. Only mwait based C1 was using
the write to the monitored address to wake up mwait'ing CPU.
This patch changes the code to retain TS_POLLING bit if we are entering an mwait based
deep C-state.
The patch has been verified to reduce the number of resched IPIs in general and also
improves the performance/power on workloads with low system utilization (i.e., when mwait based
deep C-states are being used).
Fixes "netperf ~50% regression with 2.6.33-rc1, bisect to 1b9508f"
http://marc.info/?l=linux-kernel&m=126441481427331&w=4
Reported-by: Lin Ming <ming.m.lin@intel.com> Tested-by: Alex Shi <alex.shi@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When controlling an industrial radio modem it can be necessary to
manipulate the handshake lines in order to control the radio modem's
transmitter, from userspace.
The transmitter should not be turned off before all characters have been
transmitted. serial8250_tx_empty() was reporting that all characters were
transmitted before they actually were.
===
Discovered in parallel with more testing and analysis by Kees Schoenmakers
as follows:
I ran into an NetMos 9835 serial pci board which behaves a little
different than the standard. This type of expansion board is very common.
"Standard" 8250 compatible devices clear the 'UART_LST_TEMT" bit together
with the "UART_LSR_THRE" bit when writing data to the device.
The NetMos device does it slightly different
I believe that the TEMT bit is coupled to the shift register. The problem
is that after writing data to the device and very quickly after that one
does call serial8250_tx_empty, it returns the wrong information.
My patch makes the test more robust (and solves the problem) and it does
not affect the already correct devices.
Alan:
We may yet need to quirk this but now we know which chips we have a
way to do that should we find this breaks some other 8250 clone with
dodgy THRE.
Signed-off-by: Dick Hollenbeck <dick@softplc.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: Kees Schoenmakers <k.schoenmakers@sigmae.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit c7ab5ef9bcd281135c21b4732c9be779585181be entitled "b43: implement
short slot and basic rate handling" reduced the transmit throughput for
my BCM4311 device from 18 Mb/s to 0.7 Mb/s. The basic rate handling
portion is OK, the problem is in the short slot handling.
Prior to this change, the short slot enable/disable routines were never
called. Experimentation showed that the critical part was changing the
value at offset 0x0010 in the shared memory. This is supposed to contain
the 802.11 Slot Time in usec, but if it is changed from its initial value
of zero, performance is destroyed. On the other hand, changing the value
in the MMIO register corresponding to the Interframe Slot Time increased
performance from 18 to 22 Mb/s. A BCM4306/3 also shows dramatic
improvement of the transmit rate from 5.3 to 19.0 Mb/s.
Other changes in the patch include removal of the magic number for the
MMIO register, and allowing the slot time to be set for any PHY operating
in the 2.4 GHz band. Previously, the routine was executed only for G PHYs.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On my AMD780V chipset, hda_intel.c can crash the kernel with a divide by
zero
for as-yet unknown reasons. A simple check for zero prevents it, though
the problem that causes it remains. Since the workaround is harmless and
won't affect anyone except victims of this bug, it should be safe;
moreover,
because this crash can be triggered by a user-mode application, there are
denial of service implications on the systems affected by the bug without
the patch.
Almost all r128's private ioctls require that the CCE state has
already been initialised. However, most do not test that this has
been done, and will proceed to dereference a null pointer. This may
result in a security vulnerability, since some ioctls are
unprivileged.
This adds a macro for the common initialisation test and changes all
ioctl implementations that require prior initialisation to use that
macro.
Also, r128_do_init_cce() does not test that the CCE state has not
been initialised already. Repeated initialisation may lead to a crash
or resource leak. This adds that test.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
cifs_from_ucs2 returns the length of the converted name, including the
length of the NULL terminator. We don't want to include the NULL
terminator in the dentry name length however since that'll throw off the
hash calculation for the dentry cache.
I believe that this is the root cause of several problems that have
cropped up recently that seem to be papered over with the "noserverino"
mount option. More confirmation of that would be good, but this is
clearly a bug and it fixes at least one reproducible problem that
was reported.
This patch fixes at least this reproducer in this kernel.org bug:
Sizing of memory allocations shouldn't depend on the number of physical
pages found in a system, as that generally includes (perhaps a huge amount
of) non-RAM pages. The amount of what actually is usable as storage
should instead be used as a basis here.
Some of the calculations (i.e. those not intending to use high memory)
should likely even use (totalram_pages - totalhigh_pages).
Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Dave Airlie <airlied@linux.ie> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: "David S. Miller" <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The duplicate events should be coalesced into a single event. But those two
events do not be coalesced into a single event, due to some bad check in
event_compare(). It can not match the two NULL inodes as the same event.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mask off FS_EVENT_ON_CHILD in dnotify_handle_event(). Otherwise, when there
is more than one watch on a directory and dnotify_should_send_event()
succeeds, events with FS_EVENT_ON_CHILD set will trigger all watches and cause
spurious events.
if (fork()) {
/* This triggers a create event */
creat(file, 0600);
/* This triggers a create and delete event (!) */
unlink(file);
} else {
sleep(1);
rmdir(tmpdir);
}
return 0;
}
Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Different motherboards have different PNP declarations for
W83781D/W83782D chips. Some declare the whole range of I/O ports (8
ports), some declare only the useful ports (2 ports at offset 5) and
some declare fancy ranges, for example 4 ports at offset 4. To
properly handle all cases, request all ports individually for probing.
After we have determined that we really have a W83781D or W83782D
chip, the useful port range will be requested again, as a single
block.
I did not see a board which needs this yet, but I know of one for lm78
driver and I'd like to keep the logic of these two drivers in sync.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Different motherboards have different PNP declarations for LM78/LM79
chips. Some declare the whole range of I/O ports (8 ports), some
declare only the useful ports (2 ports at offset 5) and some declare
fancy ranges, for example 4 ports at offset 4. To properly handle all
cases, request all ports individually for probing. After we have
determined that we really have an LM78 or LM79 chip, the useful port
range will be requested again, as a single block.
This fixes the driver on the Olivetti M3000 DT 540, at least.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When /dev/watchdog gets opened a second time we return -EBUSY, but
we already have got a kref then, so we end up leaking our data struct.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The #define ADT7462_VOLT_COUNT is wrong, it should be 13 not 12. All the
for loops that use this as a limit count are of the typical form, "for
(n = 0; n < ADT7462_VOLT_COUNT; n++)", so to loop through all voltages
w/o missing the last one it is necessary for the count to be one greater
than it is. (Specifically, you will miss the +1.5V 3GPIO input with count
= 12 vs. 13.)
Signed-off-by: Ray Copeland <ray.copeland@aprius.com> Acked-by: "Darrick J. Wong" <djwong@us.ibm.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The default streaming control timeout was found by Ondrej Zary to be too low
for some Logitech webcams. With kernel 2.6.22 and newer they would timeout
during initialization unles the audio function was initialized before the
video function.
Add a module parameter to set the streaming control timeout and increase the
default value from 1000ms to 3000ms to fix the above problem.
Thanks to Ondrej Zary for investigating the issue and providing an initial
patch.
dev_dbg outputs dev_name, which is released with device_unregister. This bug
resulted in output like this:
i2c Xy2�0: adapter [SMBus I801 adapter at 1880] unregistered
The right output would be:
i2c i2c-0: adapter [SMBus I801 adapter at 1880] unregistered
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
USB: usbfs: properly clean up the as structure on error paths
I notice that the processcompl_compat() function seems to be leaking the
'struct async *as' in the error paths.
I think that the calling convention is fundamentally buggered. The
caller is the one that did the "reap_as()" to get the as thing, the
caller should be the one to free it too.
Freeing it in the caller also means that it very clearly always gets
freed, and avoids the need for any "free in the error case too".
From: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Marcus Meissner <meissner@suse.de> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Jeff Mahoney <jeffm@suse.com>
This reverts commit 703625118069 ("tty: fix race in tty_fasync") and
commit b04da8bfdfbb ("fnctl: f_modown should call write_lock_irqsave/
restore") that tried to fix up some of the fallout but was incomplete.
It turns out that we really cannot hold 'tty->ctrl_lock' over calling
__f_setown, because not only did that cause problems with interrupt
disables (which the second commit fixed), it also causes a potential
ABBA deadlock due to lock ordering.
Thanks to Tetsuo Handa for following up on the issue, and running
lockdep to show the problem. It goes roughly like this:
- f_getown gets filp->f_owner.lock for reading without interrupts
disabled, so an interrupt that happens while that lock is held can
cause a lockdep chain from f_owner.lock -> sighand->siglock.
- at the same time, the tty->ctrl_lock -> f_owner.lock chain that
commit 703625118069 introduced, together with the pre-existing
sighand->siglock -> tty->ctrl_lock chain means that we have a lock
dependency the other way too.
So instead of extending tty->ctrl_lock over the whole __f_setown() call,
we now just take a reference to the 'pid' structure while holding the
lock, and then release it after having done the __f_setown. That still
guarantees that 'struct pid' won't go away from under us, which is all
we really ever needed.
Commit 703625118069f9f8960d356676662d3db5a9d116 exposed that f_modown()
should call write_lock_irqsave instead of just write_lock_irq so that
because a caller could have a spinlock held and it would not be good to
renable interrupts.
Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Tavis Ormandy <taviso@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have apparently had a memory leak since 7ca7e564e049d8b350ec9d958ff25eaa24226352 "ipc: store ipcs into IDRs" in
2007. The idr of which 3 exist for each ipc namespace is never freed.
This patch simply frees them when the ipcns is freed. I don't believe any
idr_remove() are done from rcu (and could therefore be delayed until after
this idr_destroy()), so the patch should be safe. Some quick testing
showed no harm, and the memory leak fixed.
Caught by kmemleak.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We need buffer->len to remain valid to work out the correct address to
be unmapped. We therefore need to clear buffer->len after the unmap
operation.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
UEFI standard (version 2.3, May 2009, 5.3.1 GUID Format overview, page
95) defines that LBA is always based on the logical block size. It
means bdev_logical_block_size() (aka BLKSSZGET) for Linux.
This patch removes static sector size from EFI GPT parser.
The problem is reproducible with the latest GNU Parted:
The size of EFI GPT header is not static, but whole sector is
allocated for the header. The HeaderSize field must be greater
than 92 (= sizeof(struct gpt_header) and must be less than or
equal to the logical block size.
It means we have to read whole sector with the header, because the
header crc32 checksum is calculated according to HeaderSize.
For more details see UEFI standard (version 2.3, May 2009):
- 5.3.1 GUID Format overview, page 93
- Table 13. GUID Partition Table Header, page 96
Properly handle version of the protocol where standard PS/2 packets
from trackpoint are stuffed into middle (byte 3-6) of the standard
ALPS packets when both the touchpad and trackpoint are used together.
The patch is based on work done by Matthew Chapman and additional
research done by David Kubicek and Erik Osterholm:
Many thanks to David Kubicek for his efforts in researching fine points
of this new version of the protocol, especially interaction between pad
and stick in these models.
Cc: Andy Isaacson <adi@hexapodia.org> Signed-off-by: Sebastian Kapfer <sebastian_kapfer@gmx.net> Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In free_unmap_area_noflush(), va->flags is marked as VM_LAZY_FREE first, and
then vmap_lazy_nr is increased atomically.
But, in __purge_vmap_area_lazy(), while traversing of vmap_are_list, nr
is counted by checking VM_LAZY_FREE is set to va->flags. After counting
the variable nr, kernel reads vmap_lazy_nr atomically and checks a
BUG_ON condition whether nr is greater than vmap_lazy_nr to prevent
vmap_lazy_nr from being negative.
The problem is that, if interrupted right after marking VM_LAZY_FREE,
increment of vmap_lazy_nr can be delayed. Consequently, BUG_ON
condition can be met because nr is counted more than vmap_lazy_nr.
It is highly probable when vmalloc/vfree are called frequently. This
scenario have been verified by adding delay between marking VM_LAZY_FREE
and increasing vmap_lazy_nr in free_unmap_area_noflush().
Even the vmap_lazy_nr is for checking high watermark, it never be the
strict watermark. Although the BUG_ON condition is to prevent
vmap_lazy_nr from being negative, vmap_lazy_nr is signed variable. So,
it could go down to negative value temporarily.
Consequently, removing the BUG_ON condition is proper.
Remove entry for 2770:915d (usb digital camera with mass storage
support) from unusual_devs.h. The fix triggered by the entry causes
the file system on the camera to be completely inaccessible (no
partition table, the device is not mountable).
The patch works, but let me clarify a few things about it. All the
patch does is remove the entry for this device from the
drivers/usb/storage/unusual_devs.h, which is supposed to help with a
problem with the device's reported size (I think). I'm pretty sure it
was originally added for a reason, so I'm not sure removing it won't
cause other problems to reappear. Also, I should note that this
unusual_devs.h entry was present (and activating workarounds) in
2.6.29, but in that version everything works fine. Starting with
2.6.30, things no longer work.
Signed-off-by: Ryan May <rmay31@gmail.com> Cc: Rohan Hart <rohan.hart17@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Thomas Schlichter reported:
> X.org uses libpciaccess which tries to mmap with write combining enabled via
> /sys/bus/pci/devices/*/resource0_wc. Currently, when PAT is not enabled, the
> kernel does fall back to uncached mmap. Then libpciaccess thinks it succeeded
> mapping with write combining enabled and does not set up suited MTRR entries.
> ;-(
Instead of silently mapping pci mmap region as UC minus in the case
of !pat_enabled and wc request, we can return error. Eric Anholt mentioned
that caller (like X) typically follows up with UC minus pci mmap request and
if there is a free mtrr slot, caller will manage adding WC mtrr.
Jesse Barnes says:
> Older versions of libpciaccess will behave better if we do it that way
> (iirc it only allocates an MTRR if the resource_wc file doesn't exist or
> fails to get mapped).
Reported-by: Thomas Schlichter <thomas.schlichter@web.de> Signed-off-by: Thomas Schlichter <thomas.schlichter@web.de> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Eric Anholt <eric@anholt.net> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Based on patch originally by Jeff Mahoney <jeffm@suse.com>
enclosure_status is expected to be a NULL terminated array of strings
but isn't actually NULL terminated. When writing an invalid value to
/sys/class/enclosure/.../.../status, it goes off the end of the array
and Oopses.
Fix by making the assumption true and adding NULL at the end.
Reported-by: Artur Wojcik <artur.wojcik@intel.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1321) fixes a problem with EHCI and UHCI root-hub
suspends: If the suspend occurs while a port is trying to resume, the
resume doesn't finish and simply gets lost. When remote wakeup is
enabled, this is undesirable behavior.
The patch checks first to see if any port resumes are in progress, and
if they are then it fails the root-hub suspend with -EBUSY.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1320) fixes two problems related to interrupt-URB
scheduling in ehci-hcd.
URBs with an interval of 2 or 4 microframes aren't handled.
For the time being, the patch reduces to interval to 1 uframe.
URBs are constrained to have an interval no larger than 1024
frames by usb_submit_urb(). But some EHCI controllers allow
use of a schedule as short as 256 frames; for these
controllers we may have to decrease the interval to the
actual schedule length.
The second problem isn't very significant since few devices expose
interrupt endpoints with an interval larger than 256 frames. But the
first problem is critical; it will prevent the kernel from working
with devices having interrupt intervals of 2 or 4 uframes.
Memory allocations with GFP_KERNEL can cause IO to a storage
device which can fail resulting in a need to reset the device.
Therefore GFP_KERNEL cannot be safely used between usb_lock_device()
and usb_unlock_device(). Replace by GFP_NOIO.
Signed-off-by: Oliver Neukum <oliver@neukum.org> Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1330) fixes a bug in khbud's handling of remote
wakeups. When a device sends a remote-wakeup request, the parent hub
(or the host controller driver, for directly attached devices) begins
the resume sequence and notifies khubd when the sequence finishes. At
this point the port's SUSPEND feature is automatically turned off.
However the device needs an additional 10-ms resume-recovery time
(TRSMRCY in the USB spec). Khubd does not wait for this delay if the
SUSPEND feature is off, and as a result some devices fail to behave
properly following a remote wakeup. This patch adds the missing
delay to the remote-wakeup path.
It also extends the resume-signalling delay used by ehci-hcd and
uhci-hcd from 20 ms (the value in the spec) to 25 ms (the value we use
for non-remote-wakeup resumes). The extra time appears to help some
devices.
Wacom claims that the WACF namespace will always be devoted to serial
Wacom tablets. Remove the existing entries and add a wildcard to avoid
having to update the kernel every time they add a new device.
Signed-off-by: Ping Cheng <pingc@wacom.com> Signed-off-by: Matthew Garrett <mjg@redhat.com> Tested-by: Ping Cheng <pingc@wacom.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This is a quick patch up for the problem. It's not really fixing Nozomi
which completely fails to implement tty open/close semantics and all the
other needed stuff. Doing it right is a rather more invasive patch set and
not one that will backport.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Ecryptfs_open dereferences a pointer to the private lower file (the one
stored in the ecryptfs inode), without checking if the pointer is NULL.
Right afterward, it initializes that pointer if it is NULL. Swap order of
statements to first initialize. Bug discovered by Duckjin Kang.
It can happen that write does not use all the blocks allocated in
write_begin either because of some filesystem error (like ENOSPC) or
because page with data to write has been removed from memory. We truncate
these blocks so that we don't have dangling blocks beyond i_size.
Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Chris McDermott from IBM confirmed that hurricane chipset in IBM summit
platforms doesn't support logical flat mode. Irrespective of the other
things like apic_id's, total number of logical cpu's, Linux kernel
should default to physical mode for this system.
The 32-bit kernel does so using the OEM checks for the IBM summit
platform. Add a similar OEM platform check for the 64bit kernel too.
Otherwise the linux kernel boot can hang on this platform under certain
bios/platform settings.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Chris McDermott <lcm@linux.vnet.ibm.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit f2260e6b (page allocator: update NR_FREE_PAGES only as necessary)
made one minor regression. if __rmqueue() was failed, NR_FREE_PAGES stat
go wrong. this patch fixes it.
EDAC MC0: INTERNAL ERROR: channel-b out of range (4 >= 4)
Kernel panic - not syncing: EDAC MC0: Uncorrected Error (XEN) Domain 0 crashed: 'noreboot' set - not rebooting.
This happens because FERR_NF_FBD bit 28 is not updated on i5000. Due to
that, both bits 28 and 29 may be equal to one, returning channel = 3. As
this value is invalid, EDAC core generates the panic.
inotify will WARN() if it finds that the idr and the fsnotify internals
somehow got out of sync. It was only supposed to do this once but due
to this stupid bug it would warn every single time a problem was
detected.
Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since commit 7e790dd5fc937bc8d2400c30a05e32a9e9eef276 ("inotify: fix
error paths in inotify_update_watch") inotify changed the manor in which
it gave watch descriptors back to userspace. Previous to this commit
inotify acted like the following:
inotify_add_watch(X, Y, Z) = 1
inotify_rm_watch(X, 1);
inotify_add_watch(X, Y, Z) = 2
but after this patch inotify would return watch descriptors like so:
inotify_add_watch(X, Y, Z) = 1
inotify_rm_watch(X, 1);
inotify_add_watch(X, Y, Z) = 1
which I saw as equivalent to opening an fd where
open(file) = 1;
close(1);
open(file) = 1;
seemed perfectly reasonable. The issue is that quite a bit of userspace
apparently relies on the behavior in which watch descriptors will not be
quickly reused. KDE relies on it, I know some selinux packages rely on
it, and I have heard complaints from other random sources such as debian
bug 558981.
Although the man page implies what we do is ok, we broke userspace so
this patch almost reverts us to the old behavior. It is still slightly
racey and I have patches that would fix that, but they are rather large
and this will fix it for all real world cases. The race is as follows:
- task1 creates a watch and blocks in idr_new_watch() before it updates
the hint.
- task2 creates a watch and updates the hint.
- task1 updates the hint with it's older wd
- task removes the watch created by task2
- task adds a new watch and will reuse the wd originally given to task2
it requires moving some locking around the hint (last_wd) but this should
solve it for the real world and be -stable safe.
As a side effect this patch papers over a bug in the lib/idr code which
is causing a large number WARN's to pop on people's system and many
reports in kerneloops.org. I'm working on the root cause of that idr
bug seperately but this should make inotify immune to that issue.
Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As the release of substreams may be done asynchronously from the
disconnection, close callback needs to check the shutdown flag before
actually accessing the usb interface.
While we are never normally passed an instruction that exceeds 15 bytes,
smp games can cause us to attempt to interpret one, which will cause
large latencies in non-preempt hosts.
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On Tue, Feb 02, 2010 at 02:57:14PM -0800, Greg KH (gregkh@suse.de) wrote:
> > There are at least two ways to fix it: using a big cannon and a small
> > one. The former way is to disable notification registration, since it is
> > not used by anyone at all. Second way is to check whether calling
> > process is root and its destination group is -1 (kind of priveledged
> > one) before command is dispatched to workqueue.
>
> Well if no one is using it, removing it makes the most sense, right?
>
> No objection from me, care to make up a patch either way for this?
Getting it is not used, let's drop support for notifications about
(un)registered events from connector.
Another option was to check credentials on receiving, but we can always
restore it without bugs if needed, but genetlink has a wider code base
and none complained, that userspace can not get notification when some
other clients were (un)registered.
Kudos for Sebastian Krahmer <krahmer@suse.de>, who found a bug in the
code.
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Marc reported that the BUG_ON in clockevents_notify() triggers on his
system. This happens because the kernel tries to remove an active
clock event device (used for broadcasting) from the device list.
The handling of devices which can be used as per cpu device and as a
global broadcast device is suboptimal.
The simplest solution for now (and for stable) is to check whether the
device is used as global broadcast device, but this needs to be
revisited.
[ tglx: restored the cpuweight check and massaged the changelog ]
Reported-by: Marc Dionne <marc.c.dionne@gmail.com> Tested-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
LKML-Reference: <1262834564-13033-1-git-send-email-dfeng@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit 8bf3d79bc401ca417ccf9fc076d3295d1a71dbf5 enabled EEPROM
checksum checks to avoid bogus bug reports but failed to address
updating the code to consider devices with custom EEPROM sizes.
Devices with custom sized EEPROMs have the upper limit size stuffed
in the EEPROM. Use this as the upper limit instead of the static
default size. In case of a checksum error also provide back the
max size and whether or not this was the default size or a custom
one. If the EEPROM is busted we add a failsafe check to ensure
we don't loop forever or try to read bogus areas of hardware.
This closes bug 14874
http://bugzilla.kernel.org/show_bug.cgi?id=14874
Cc: stable@kernel.org Cc: David Quan <david.quan@atheros.com> Cc: Stephen Beahm <stephenbeahm@comcast.net> Reported-by: Joshua Covington <joshuacov@googlemail.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
We want to be sure that compiler fetches the limit variable only
once, so add helpers for fetching current and maximal resource
limits which do that.
Add them to sched.h (instead of resource.h) due to circular dependency
sched.h->resource.h->task_struct
Alternative would be to create a separate res_access.h or similar.
On managed CPUs the cpufreq.c core will call driver->exit(cpu) on the
managed cpus and powernow_k8 will free the core's data.
Later driver->get(cpu) function might get called trying to read out the
current freq of a managed cpu and the NULL pointer check does not work on
the freed object -> better set it to NULL.
->get() is unsigned and must return 0 as invalid frequency.
Signed-off-by: Thomas Renninger <trenn@suse.de> Tested-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
It is possible (and expected) for there to be holes in the h->drv[]
array, that is, some elements may be NULL pointers. cciss_seq_show
needs to be made aware of this possibility to avoid an Oops.
To reproduce the Oops which this fixes:
1) Create two "arrays" in the Array Configuratino Utility and
several logical drives on each array.
2) cat /proc/driver/cciss/cciss* in an infinite loop
3) delete some of the logical drives in the first "array."
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Don't pass current RLIMIT_RTTIME to update_rlimit_cpu() in
selinux_bprm_committing_creds, since update_rlimit_cpu expects
RLIMIT_CPU limit.
Use proper rlim[RLIMIT_CPU].rlim_cur instead to fix that.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Acked-by: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Eric Paris <eparis@parisplace.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This fixes a futex key reference count bug in futex_lock_pi(),
where a key's reference count is incremented twice but decremented
only once, causing the backing object to not be released.
If the futex is created in a temporary file in an ext3 file system,
this bug causes the file's inode to become an "undead" orphan,
which causes an oops from a BUG_ON() in ext3_put_super() when the
file system is unmounted. glibc's test suite is known to trigger this,
see <http://bugzilla.kernel.org/show_bug.cgi?id=14256>.
The bug is a regression from 2.6.28-git3, namely Peter Zijlstra's 38d47c1b7075bd7ec3881141bb3629da58f88dab "[PATCH] futex: rely on
get_user_pages() for shared futexes". That commit made get_futex_key()
also increment the reference count of the futex key, and updated its
callers to decrement the key's reference count before returning.
Unfortunately the normal exit path in futex_lock_pi() wasn't corrected:
the reference count is incremented by get_futex_key() and queue_lock(),
but the normal exit path only decrements once, via unqueue_me_pi().
The fix is to put_futex_key() after unqueue_me_pi(), since 2.6.31
this is easily done by 'goto out_put_key' rather than 'goto out'.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Darren Hart <dvhltc@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If the owner of a PI futex dies we fix up the pi_state and set
pi_state->owner to NULL. When a malicious or just sloppy programmed
user space application sets the futex value to 0 e.g. by calling
pthread_mutex_init(), then the futex can be acquired again. A new
waiter manages to enqueue itself on the pi_state w/o damage, but on
unlock the kernel dereferences pi_state->owner and oopses.
Prevent this by checking pi_state->owner in the unlock path. If
pi_state->owner is not current we know that user space manipulated the
futex value. Ignore the mess and return -EINVAL.
This catches the above case and also the case where a task hijacks the
futex by setting the tid value and then tries to unlock it.
Reported-by: Jermome Marchand <jmarchan@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Darren Hart <dvhltc@us.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The WARN_ON in lookup_pi_state which complains about a mismatch
between pi_state->owner->pid and the pid which we retrieved from the
user space futex is completely bogus.
The code just emits the warning and then continues despite the fact
that it detected an inconsistent state of the futex. A conveniant way
for user space to spam the syslog.
Replace the WARN_ON by a consistency check. If the values do not match
return -EINVAL and let user space deal with the mess it created.
This also fixes the missing task_pid_vnr() when we compare the
pi_state->owner pid with the futex value.
Reported-by: Jermome Marchand <jmarchan@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Darren Hart <dvhltc@us.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We incorrectly depended on the 'node_state/node_isset()' functions
testing the node range, rather than checking it explicitly. That's not
reliable, even if it might often happen to work. So do the proper
explicit test.
Signed-off-by: Gustavo Maciel Dias Vieira <gustavo@sagui.org> Signed-off-by: Len Brown <len.brown@intel.com> Cc: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>