The user-space hibernation sends a wrong notification after the image
restoration because of thinko for the file flag check. RDONLY
corresponds to hibernation and WRONLY to restoration, confusingly.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In ib_uverbs_poll_cq() code there is a potential integer overflow if
userspace passes in a large cmd.ne. The calls to kmalloc() would
allocate smaller buffers than intended, leading to memory corruption.
There iss also an information leak if resp wasn't all used.
Unprivileged userspace may call this function, although only if an
RDMA device that uses this function is present.
Fix this by copying CQ entries one at a time, which avoids the
allocation entirely, and also by moving this copying into a function
that makes sure to initialize all memory copied to userspace.
Special thanks to Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
for his help and advice.
Signed-off-by: Dan Carpenter <error27@gmail.com>
[ Monkey around with things a bit to avoid bad code generation by gcc
when designated initializers are used. - Roland ]
Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When matching error address to the range contained by one memory node,
we're in valid range when node interleaving
1. is disabled, or
2. enabled and when the address bits we interleave on match the
interleave selector on this node (see the "Node Interleaving" section in
the BKDG for an enlightening example).
Thus, when we early-exit, we need to reverse the compound logic
statement properly.
When an xprt is created, it has a refcount of 1, and XPT_BUSY is set.
The refcount is *not* owned by the thread that created the xprt
(as is clear from the fact that creators never put the reference).
Rather, it is owned by the absence of XPT_DEAD. Once XPT_DEAD is set,
(And XPT_BUSY is clear) that initial reference is dropped and the xprt
can be freed.
So when a creator clears XPT_BUSY it is dropping its only reference and
so must not touch the xprt again.
However svc_recv, after calling ->xpo_accept (and so getting an XPT_BUSY
reference on a new xprt), calls svc_xprt_recieved. This clears
XPT_BUSY and then svc_xprt_enqueue - this last without owning a reference.
This is dangerous and has been seen to leave svc_xprt_enqueue working
with an xprt containing garbage.
So we need to hold an extra counted reference over that call to
svc_xprt_received.
For safety, any time we clear XPT_BUSY and then use the xprt again, we
first get a reference, and the put it again afterwards.
Note that svc_close_all does not need this extra protection as there are
no threads running, and the final free can only be called asynchronously
from such a thread.
Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The commit 129a84de2347002f09721cda3155ccfd19fade40 (locks: fix F_GETLK
regression (failure to find conflicts)) fixed the posix_test_lock()
function by itself, however, its usage in NFS changed by the commit 9d6a8c5c213e34c475e72b245a8eb709258e968c (locks: give posix_test_lock
same interface as ->lock) remained broken - subsequent NFS-specific
locking code received F_UNLCK instead of the user-specified lock type.
To fix the problem, fl->fl_type needs to be saved before the
posix_test_lock() call and restored if no local conflicts were reported.
If vfs_getattr in fill_post_wcc returns an error, we don't
set fh_post_change.
For NFSv4, this can result in set_change_info triggering a BUG_ON.
i.e. fh_post_saved being zero isn't really a bug.
So:
- instead of BUGging when fh_post_saved is zero, just clear ->atomic.
- if vfs_getattr fails in fill_post_wcc, take a copy of i_ctime anyway.
This will be used i seg_change_info, but not overly trusted.
- While we are there, remove the pointless 'if' statements in set_change_info.
There is no harm setting all the values.
Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
After a few unsuccessful NFS mount attempts in which the client and
server cannot agree on an authentication flavor both support, the
client panics. nfs_umount() is invoked in the kernel in this case.
Turns out nfs_umount()'s UMNT RPC invocation causes the RPC client to
write off the end of the rpc_clnt's iostat array. This is because the
mount client's nrprocs field is initialized with the count of defined
procedures (two: MNT and UMNT), rather than the size of the client's
proc array (four).
The fix is to use the same initialization technique used by most other
upper layer clients in the kernel.
Introduced by commit 0b524123, which failed to update nrprocs when
support was added for UMNT in the kernel.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=24302 BugLink: http://bugs.launchpad.net/bugs/683938 Reported-by: Stefan Bader <stefan.bader@canonical.com> Tested-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This fixes a bug as seen on 2.6.32 based kernels where timers got
enqueued on offline cpus.
If a cpu goes offline it might still have pending timers. These will
be migrated during CPU_DEAD handling after the cpu is offline.
However while the cpu is going offline it will schedule the idle task
which will then call tick_nohz_stop_sched_tick().
That function in turn will call get_next_timer_intterupt() to figure
out if the tick of the cpu can be stopped or not. If it turns out that
the next tick is just one jiffy off (delta_jiffies == 1)
tick_nohz_stop_sched_tick() incorrectly assumes that the tick should
not stop and takes an early exit and thus it won't update the load
balancer cpu.
Just afterwards the cpu will be killed and the load balancer cpu could
be the offline cpu.
On 2.6.32 based kernel get_nohz_load_balancer() gets called to decide
on which cpu a timer should be enqueued (see __mod_timer()). Which
leads to the possibility that timers get enqueued on an offline cpu.
These will never expire and can cause a system hang.
This has been observed 2.6.32 kernels. On current kernels
__mod_timer() uses get_nohz_timer_target() which doesn't have that
problem. However there might be other problems because of the too
early exit tick_nohz_stop_sched_tick() in case a cpu goes offline.
The easiest and probably safest fix seems to be to let
get_next_timer_interrupt() just lie and let it say there isn't any
pending timer if the current cpu is offline.
I also thought of moving migrate_[hr]timers() from CPU_DEAD to
CPU_DYING, but seeing that there already have been fixes at least in
the hrtimer code in this area I'm afraid that this could add new
subtle bugs.
This patch fixes a hang observed with 2.6.32 kernels where timers got enqueued
on offline cpus.
printk_needs_cpu() may return 1 if called on offline cpus. When a cpu gets
offlined it schedules the idle process which, before killing its own cpu, will
call tick_nohz_stop_sched_tick(). That function in turn will call
printk_needs_cpu() in order to check if the local tick can be disabled. On
offline cpus this function should naturally return 0 since regardless if the
tick gets disabled or not the cpu will be dead short after. That is besides the
fact that __cpu_disable() should already have made sure that no interrupts on
the offlined cpu will be delivered anyway.
In this case it prevents tick_nohz_stop_sched_tick() to call
select_nohz_load_balancer(). No idea if that really is a problem. However what
made me debug this is that on 2.6.32 the function get_nohz_load_balancer() is
used within __mod_timer() to select a cpu on which a timer gets enqueued. If
printk_needs_cpu() returns 1 then the nohz_load_balancer cpu doesn't get
updated when a cpu gets offlined. It may contain the cpu number of an offline
cpu. In turn timers get enqueued on an offline cpu and not very surprisingly
they never expire and cause system hangs.
This has been observed 2.6.32 kernels. On current kernels __mod_timer() uses
get_nohz_timer_target() which doesn't have that problem. However there might be
other problems because of the too early exit tick_nohz_stop_sched_tick() in
case a cpu goes offline.
Easiest way to fix this is just to test if the current cpu is offline and call
printk_tick() directly which clears the condition.
Alternatively I tried a cpu hotplug notifier which would clear the condition,
however between calling the notifier function and printk_needs_cpu() something
could have called printk() again and the problem is back again. This seems to
be the safest fix.
BugLink: https://launchpad.net/bugs/595482
The original reporter states that audible playback from the internal
speaker is inaudible despite the hardware being properly detected. To
work around this symptom, he uses the model=lg quirk to properly enable
both playback, capture, and jack sense. Another user corroborates this
workaround on separate hardware. Add this PCI SSID to the quirk table
to enable it for further LG P1 Expresses.
Reported-and-tested-by: Philip Peitsch <philip.peitsch@gmail.com> Tested-by: nikhov Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Under heavy load conditions, our set of xpc messages may become exhausted.
The code handles this correctly with the exception of the management code
which hits a NULL pointer dereference.
Signed-off-by: Robin Holt <holt@sgi.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Many times while the initial connection is being made, the contacted
partition will send back both the ACTIVATING and the ACTIVE
remote_act_state changes in very close succescion. The 1/4 second delay
in the make first contact loop is large enough to nearly always miss the
ACTIVATING state change.
Since either state indicates the remote partition has acknowledged our
state change, accept either.
Signed-off-by: Robin Holt <holt@sgi.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This was a difficult bug to trip. XPC was in the middle of sending an
acknowledgement for a received message.
In xpc_received_payload_uv():
.
ret = xpc_send_gru_msg(ch->sn.uv.cached_notify_gru_mq_desc, msg,
sizeof(struct xpc_notify_mq_msghdr_uv));
if (ret != xpSuccess)
XPC_DEACTIVATE_PARTITION(&xpc_partitions[ch->partid], ret);
msg->hdr.msg_slot_number += ch->remote_nentries;
at the point in xpc_send_gru_msg() where the hardware has dispatched the
acknowledgement, the remote side is able to reuse the message structure
and send a message with a different slot number. This problem is made
worse by interrupts.
The adjustment of msg_slot_number and the BUG_ON in
xpc_handle_notify_mq_msg_uv() which verifies the msg_slot_number is
consistent are only used for debug purposes. Since a fix for this that
preserves the debug functionality would either have to infringe upon the
payload or allocate another structure just for debug, I decided to remove
it entirely.
Signed-off-by: Robin Holt <holt@sgi.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We leak at least 32bits of kernel memory to user land in tc dump,
because we dont init all fields (capab ?) of the dumped structure.
Use C99 initializers so that holes and non explicit fields are zeroed.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: dann frazier <dannf@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On each machine check all registers are revalidated. The save area for
the clock comparator however only contains the upper most seven bytes
of the former contents, if valid.
Therefore the machine check handler uses a store clock instruction to
get the current time and writes that to the clock comparator register
which in turn will generate an immediate timer interrupt.
However within the lowcore the expected time of the next timer
interrupt is stored. If the interrupt happens before that time the
handler won't be called. In turn the clock comparator won't be
reprogrammed and therefore the interrupt condition stays pending which
causes an interrupt loop until the expected time is reached.
On NOHZ machines this can result in unresponsive machines since the
time of the next expected interrupted can be a couple of days in the
future.
To fix this just revalidate the clock comparator register with the
expected value.
In addition the special handling for udelay must be changed as well.
This helps protect us from overflow issues down in the
individual protocol sendmsg/recvmsg handlers. Once
we hit INT_MAX we truncate out the rest of the iovec
by setting the iov_len members to zero.
This works because:
1) For SOCK_STREAM and SOCK_SEQPACKET sockets, partial
writes are allowed and the application will just continue
with another write to send the rest of the data.
2) For datagram oriented sockets, where there must be a
one-to-one correspondance between write() calls and
packets on the wire, INT_MAX is going to be far larger
than the packet size limit the protocol is going to
check for and signal with -EMSGSIZE.
Based upon a patch by Linus Torvalds.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In rds_cmsg_rdma_args(), the user-provided args->nr_local value is
restricted to less than UINT_MAX. This seems to need a tighter upper
bound, since the calculation of total iov_size can overflow, resulting
in a small sock_kmalloc() allocation. This would probably just result
in walking off the heap and crashing when calling rds_rdma_pages() with
a high count value. If it somehow doesn't crash here, then memory
corruption could occur soon after.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
rc2 kernel crashes when booting second cpu on this CONFIG_VMSPLIT_2G_OPT
laptop: whereas cloning from kernel to low mappings pgd range does need
to limit by both KERNEL_PGD_PTRS and KERNEL_PGD_BOUNDARY, cloning kernel
pgd range itself must not be limited by the smaller KERNEL_PGD_BOUNDARY.
Signed-off-by: Hugh Dickins <hughd@google.com>
LKML-Reference: <alpine.LSU.2.00.1008242235120.2515@sister.anvils> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch fixes machine crashes which occur when heavily exercising the
CPU hotplug codepaths on a 32-bit kernel. These crashes are caused by
AMD Erratum 383 and result in a fatal machine check exception. Here's
the scenario:
1. On 32-bit, the swapper_pg_dir page table is used as the initial page
table for booting a secondary CPU.
2. To make this work, swapper_pg_dir needs a direct mapping of physical
memory in it (the low mappings). By adding those low, large page (2M)
mappings (PAE kernel), we create the necessary conditions for Erratum
383 to occur.
3. Other CPUs which do not participate in the off- and onlining game may
use swapper_pg_dir while the low mappings are present (when leave_mm is
called). For all steps below, the CPU referred to is a CPU that is using
swapper_pg_dir, and not the CPU which is being onlined.
4. The presence of the low mappings in swapper_pg_dir can result
in TLB entries for addresses below __PAGE_OFFSET to be established
speculatively. These TLB entries are marked global and large.
5. When the CPU with such TLB entry switches to another page table, this
TLB entry remains because it is global.
6. The process then generates an access to an address covered by the
above TLB entry but there is a permission mismatch - the TLB entry
covers a large global page not accessible to userspace.
7. Due to this permission mismatch a new 4kb, user TLB entry gets
established. Further, Erratum 383 provides for a small window of time
where both TLB entries are present. This results in an uncorrectable
machine check exception signalling a TLB multimatch which panics the
machine.
There are two ways to fix this issue:
1. Always do a global TLB flush when a new cr3 is loaded and the
old page table was swapper_pg_dir. I consider this a hack hard
to understand and with performance implications
2. Do not use swapper_pg_dir to boot secondary CPUs like 64-bit
does.
This patch implements solution 2. It introduces a trampoline_pg_dir
which has the same layout as swapper_pg_dir with low_mappings. This page
table is used as the initial page table of the booting CPU. Later in the
bringup process, it switches to swapper_pg_dir and does a global TLB
flush. This fixes the crashes in our test cases.
-v2: switch to swapper_pg_dir right after entering start_secondary() so
that we are able to access percpu data which might not be mapped in the
trampoline page table.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
LKML-Reference: <20100816123833.GB28147@aftab> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On certain VIA chipsets AES-CBC requires the input/output to be
a multiple of 64 bytes. We had a workaround for this but it was
buggy as it sent the whole input for processing when it is meant
to only send the initial number of blocks which makes the rest
a multiple of 64 bytes.
As expected this causes memory corruption whenever the workaround
kicks in.
Reported-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
On parsing malformed X.25 facilities, decrementing the remaining length
may cause it to underflow. Since the length is an unsigned integer,
this will result in the loop continuing until the kernel crashes.
This patch adds checks to ensure decrementing the remaining length does
not cause it to wrap around.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The FBIOGET_VBLANK device ioctl allows unprivileged users to read 16
bytes of uninitialized stack memory, because the "reserved" member of
the fb_vblank struct declared on the stack is not altered or zeroed
before being copied back to the user. This patch takes care of it.
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Signed-off-by: Andy Walls <awalls@md.metrocast.net> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Reported-by: Dan Rosenberg <drosenberg@vsecurity.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> CC: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-of-by: Andrew Hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Dmitry Torokhov [Thu, 4 Nov 2010 16:12:44 +0000 (09:12 -0700)]
Input: i8042 - add Sony VAIO VPCZ122GX to nomux list
[Note that the mainline will not have this particular fix but rather
will blacklist entire VAIO line based off DMI board name. For stable
I am being a bit more cautious and blacklist one particular product.]
Trying to query/activate active multiplexing mode on this VAIO makes
both keyboard and touchpad inoperable. Futher kernels will blacklist
entire VAIO line, however here we blacklist just one particular model.
At least one 5986:0241 webcam model includes vendor-specific descriptors
at the end of its streaming interface descriptors. Print an information
UVC_TRACE_DESCR message and try to continue parsing the descriptors
rather than bailing out with an error.
Enable the EFI framebuffer on 14 more Macs, including the iMac11,1
iMac10,1 iMac8,1 Macmini3,1 Macmini4,1 MacBook5,1 MacBook6,1 MacBook7,1
MacBookPro2,2 MacBookPro5,2 MacBookPro5,3 MacBookPro6,1 MacBookPro6,2 and
MacBookPro7,1
Information gathered from various user submissions.
This is a patch for the EFI framebuffer driver to enable the framebuffer
of the NVIDIA 9400M as found in MacBook Pro (MBP) 5,1 and up. The
framebuffer of the NVIDIA graphic cards are located at the following
addresses in memory:
9400M: 0xC0010000
9600M GT: 0xB0030000
The patch delivered right here only provides the memory location of the
framebuffer of the 9400M device. The 9600M GT is not covered. It is
assumed that the 9400M is used when powered up the MBP.
The information which device is currently powered and in use is stored in
the 64 bytes large EFI variable "gpu-power-prefs". More specifically,
byte 0x3B indicates whether 9600M GT (0x00) or 9400M (0x01) is online.
The PCI bus IDs are the following:
9400M: PCI 03:00:00
9600M GT: PCI 02:00:00
The EFI variables can be easily read-out and manipulated with "rEFIt", an
MBP specific bootloader tool. For more information on how handle rEFIt
and EFI variables please consult "http://refit.sourceforge.net" and
"http://ubuntuforums.org/archive/index.php/t-1076879.html".
IMPORTANT NOTE: The information on how to activate the 9400M device given
at "ubuntuforums.org" is not correct, since it states
Actually, the assignment of the values and the PCI bus IDs are swapped.
Suggestions:
------------
To cover framebuffers of both 9400M and 9600M GT, I would suggest to
implement a conditional on "gpu-power-prefs". Depending on the value of
byte 0x3B, the according framebuffer is selected. However, this requires
kernel access to the EFI variables.
[akpm@linux-foundation.org: rename optname, per Peter Jones] Signed-off-by: Thomas Gerlach <t.m.gerlach@freenet.de> Acked-by: Peter Jones <pjones@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: maximilian attems <max@stro.at> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If another cpu does a very wide munmap() on the signal frame area,
it can tear down the page table hierarchy from underneath us.
Borrow an idea from the 64-bit fault path's get_user_insn(), and
disable cross call interrupts during the page table traversal
to lock them in place while we operate.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Robin Holt [Wed, 20 Oct 2010 02:03:37 +0000 (02:03 +0000)]
Limit sysctl_tcp_mem and sysctl_udp_mem initializers to prevent integer overflows.
[ Upstream fixed this in a different way as parts of the commits: 8d987e5c7510 (net: avoid limits overflow) a9febbb4bd13 (sysctl: min/max bounds are optional) 27b3d80a7b6a (sysctl: fix min/max handling in __do_proc_doulongvec_minmax())
-DaveM ]
On a 16TB x86_64 machine, sysctl_tcp_mem[2], sysctl_udp_mem[2], and
sysctl_sctp_mem[2] can integer overflow. Set limit such that they are
maximized without overflowing.
The rx_recycle queue is global per device but can be accesed by many
napi handlers at the same time, so it needs full skb_queue primitives
(with locking). Otherwise, various crashes caused by broken skbs are
possible.
This patch resolves, at least partly, bugzilla bug 19692. (Because of
some doubts that there could be still something around which is hard
to reproduce my proposal is to leave this bug opened for a month.)
Reported-by: emin ak <eminak71@gmail.com> Tested-by: emin ak <eminak71@gmail.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> CC: Andy Fleming <afleming@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
IP: [<ffffffffa0f0a625>] hidraw_write+0x3b/0x116 [hid]
[...]
This is reproducible by disconnecting the device while userspace writes
to dev node in a loop and doesn't check return values in order to exit
the loop.
That commit covered almost all of them but act_police still had a leak.
opt.limit and opt.capab aren't zeroed out before the structure is
passed out.
This patch uses the C99 initializers to zero everything unused out.
Signed-off-by: Jeff Mahoney <jeffm@suse.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The find_next_bit, find_first_bit, find_next_zero_bit
and find_first_zero_bit functions were not properly
clamping to the maxbit argument at the bit level. They
were instead only checking maxbit at the byte level.
To fix this, add a compare and a conditional move
instruction to the end of the common bit-within-the-
byte code used by all the functions and be sure not to
clobber the maxbit argument before it is used.
Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org> Tested-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: James Jones <jajones@nvidia.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit 8b592783 added a Thumb-2 variant of usracc which, when it is
called with \rept=2, calls usraccoff once with an offset of 0 and
secondly with a hard-coded offset of 4 in order to avoid incrementing
the pointer again. If \inc != 4 then we will store the data to the wrong
offset from \ptr. Luckily, the only caller that passes \rept=2 to this
function is __clear_user so we haven't been actively corrupting user data.
This patch fixes usracc to pass \inc instead of #4 to usraccoff
when it is called a second time.
Reported-by: Tony Thompson <tony.thompson@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
As pointed out by Linus, commit dab5855 ("perf_counter: Add mmap event hooks to
mprotect()") is fundamentally wrong as mprotect_fixup() can free 'vma' due to
merging. Fix the problem by moving perf_event_mmap() hook to
mprotect_fixup().
Note: there's another successful return path from mprotect_fixup() if old
flags equal to new flags. We don't, however, need to call
perf_event_mmap() there because 'perf' already knows the VMA is
executable.
Reported-by: Dave Jones <davej@redhat.com> Analyzed-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ingo Molnar <mingo@elte.hu> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A single uninitialized padding byte is leaked to userspace.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Depending on processor speed, page size, and the amount of memory a
process is allowed to amass, cleanup of a large VM may freeze the system
for many seconds. This can result in a watchdog timeout.
Make sure other tasks receive some service when cleaning up large VMs.
Signed-off-by: Steven J. Magnani <steve@digidescorp.com> Cc: Greg Ungerer <gerg@snapgear.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
According to the comment describing ops_lock in the definition of struct
backlight_device and when comparing with other functions in backlight.c
the mutex must be hold when checking ops to be non-NULL.
Fixes a problem added by c835ee7f4154992e6 ("backlight: Add suspend/resume
support to the backlight core") in Jan 2009.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Richard Purdie <rpurdie@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Disable the winch irq early to make sure we don't take an interrupt part
way through the freeing of the handler data, resulting in a crash on
shutdown:
Signed-off-by: Will Newton <will.newton@gmail.com> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If a user manages to trigger an oops with fs set to KERNEL_DS, fs is not
otherwise reset before do_exit(). do_exit may later (via mm_release in
fork.c) do a put_user to a user-controlled address, potentially allowing
a user to leverage an oops into a controlled write into kernel memory.
This is only triggerable in the presence of another bug, but this
potentially turns a lot of DoS bugs into privilege escalations, so it's
worth fixing. I have proof-of-concept code which uses this bug along
with CVE-2010-3849 to write a zero to an arbitrary kernel address, so
I've tested that this is not theoretical.
A more logical place to put this fix might be when we know an oops has
occurred, before we call do_exit(), but that would involve changing
every architecture, in multiple places.
The attribute cache for a file was not being cleared when a file is opened
with O_TRUNC.
If the filesystem's open operation truncates the file ("atomic_o_trunc"
feature flag is set) then the kernel should invalidate the cached st_mtime
and st_ctime attributes.
Also i_size should be explicitly be set to zero as it is used sometimes
without refreshing the cache.
Signed-off-by: Ken Sumrall <ksumrall@android.com> Cc: Anfei <anfei.zhou@gmail.com> Cc: "Anand V. Avati" <avati@gluster.com> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
RTS and DTR should not be modified based on CRTSCTS when calling
set_termios.
Modem control lines are raised at port open by the tty layer and should stay
raised regardless of whether hardware flow control is enabled or not.
This is in conformance with the way serial ports work today and many
applications depend on this behaviour to be able to talk to hardware
implementing hardware flow control (without the applications actually using
it).
Hardware which expects different behaviour on these lines can always
use TIOCMSET/TIOCMBI[SC] after port open to change them.
Reported-by: Daniel Mack <daniel@caiaq.de> Reported-by: Dave Mielke <dave@mielke.cc> Signed-off-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1435) fixes an obscure and unlikely race in ehci-hcd.
When an async URB is unlinked, the corresponding QH is removed from
the async list. If the QH's endpoint is then disabled while the URB
is being given back, ehci_endpoint_disable() won't find the QH on the
async list, causing it to believe that the QH has been lost. This
will lead to a memory leak at best and quite possibly to an oops.
The solution is to trust usbcore not to lose track of endpoints. If
the QH isn't on the async list then it doesn't need to be taken off
the list, but the driver should still wait for the QH to become IDLE
before disabling it.
In theory this fixes Bugzilla #20182. In fact the race is so rare
that it's not possible to tell whether the bug is still present.
However, adding delays and making other changes to force the race
seems to show that the patch works.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de> CC: David Brownell <david-b@pacbell.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Structure usbdevfs_connectinfo is copied to userland with padding byted
after "slow" field uninitialized. It leads to leaking of contents of
kernel stack memory.
Structure iowarrior_info is copied to userland with padding byted
between "serial" and "revision" fields uninitialized. It leads to
leaking of contents of kernel stack memory.
When huawei datacard with PID 0x14AC is insterted into Linux system, the
present kernel will load the "option" driver to all the interfaces. But
actually, some interfaces run as other function and do not need "option"
driver.
In this path, we modify the id_tables, when the PID is 0x14ac ,VID is
0x12d1, Only when the interface's Class is 0xff,Subclass is 0xff, Pro is
0xff, it does need "option" driver.
Signed-off-by: ma rui <m00150988@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add the USB IDs for the Milkymist One FTDI-based JTAG/serial adapter
(http://projects.qi-hardware.com/index.php/p/mmone-jtag-serial-cable/)
to the ftdi_sio driver and disable the first serial channel (used as
JTAG from userspace).
Some Apple machines have identical DMI data but different memory
configurations for the video. Given that, check that the address in our
table is actually within the range of a PCI BAR on a VGA device in the
machine.
This also fixes up the return value from set_system(), which has always
been wrong, but never resulted in bad behavior since there's only ever
been one matching entry in the dmi table.
The patch
1) stops people's machines from crashing when we get their display wrong,
which seems to be unfortunately inevitable,
2) allows us to support identical dmi data with differing video memory
configurations
This also adds me as the efifb maintainer, since I've effectively been
acting as such for quite some time.
Signed-off-by: Peter Jones <pjones@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: maximilian attems <max@stro.at> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This code seems to be asking for a shared read/write mapping of 16MB worth of
BAR0 starting at file offset 0, and letting the kernel assign a starting
address. Unfortunately, this -EINVAL causes X not to start. Looking into
dmesg, there's a complaint like so:
process "Xorg" tried to map 0x01000000 bytes at page 0x00000000 on 0000:07:00.0 BAR 0 (start 0x 96000000, size 0x 1000000)
It looks like the logic here is set up such that when the mmap call comes via
sysfs, the check in pci_mmap_fits wants vma->vm_pgoff to be between the
resource's start and end address, and the end of the vma to be no farther than
the end. However, the sysfs PCI resource files always start at offset zero,
which means that this test always fails for programs that mmap the sysfs files.
Given the comment in the original commit 3b519e4ea618b6943a82931630872907f9ac2c2b, I _think_ the old procfs files
require that the file offset be equal to the resource's base address when
mmapping.
I think what we want here is for pci_start to be 0 when mmap_api ==
PCI_MMAP_PROCFS. The following patch makes that change, after which the Matrox
and Mach64 X drivers work again.
Acked-by: Martin Wilck <martin.wilck@ts.fujitsu.com> Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The checks for valid mmaps of PCI resources made through /proc/bus/pci files
that were introduced in 9eff02e2042f96fb2aedd02e032eca1c5333d767 have several
problems:
1. mmap() calls on /proc/bus/pci files are made with real file offsets > 0,
whereas under /sys/bus/pci/devices, the start of the resource corresponds
to offset 0. This may lead to false negatives in pci_mmap_fits(), which
implicitly assumes the /sys/bus/pci/devices layout.
2. The loop in proc_bus_pci_mmap doesn't skip empty resouces. This leads
to false positives, because pci_mmap_fits() doesn't treat empty resources
correctly (the calculated size is 1 << (8*sizeof(resource_size_t)-PAGE_SHIFT)
in this case!).
3. If a user maps resources with BAR > 0, pci_mmap_fits will emit bogus
WARNINGS for the first resources that don't fit until the correct one is found.
On many controllers the first 2-4 BARs are used, and the others are empty.
In this case, an mmap attempt will first fail on the non-empty BARs
(including the "right" BAR because of 1.) and emit bogus WARNINGS because
of 3., and finally succeed on the first empty BAR because of 2.
This is certainly not the intended behaviour.
This patch addresses all 3 issues.
Updated with an enum type for the additional parameter for pci_mmap_fits().
SCSI commands may be issued between __scsi_add_device() and dev->sdev
assignment, so it's unsafe for ata_qc_complete() to dereference
dev->sdev->locked without checking whether it's NULL or not. Fix it.
It makes sense for a BO to move after a process has requested
exclusive RW access on it (e.g. because the BO used to be located in
unmappable VRAM and we intercepted the CPU access from the fault
handler).
If we let the ghost object inherit cpu_writers from the original
object, ttm_bo_release_list() will raise a kernel BUG when the ghost
object is destroyed. This can be reproduced with the nouveau driver on
nv5x.
Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com> Reviewed-by: Jerome Glisse <jglisse@redhat.com> Tested-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Francisco Jerez <currojerez@riseup.net> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If the iovec is being set up in a way that causes uaddr + PAGE_SIZE
to overflow, we could end up attempting to map a huge number of
pages. Check for this invalid input type.
70 hours into some stress tests of a 2.6.32-based enterprise kernel, we
ran into a NULL dereference in here:
int block_is_partially_uptodate(struct page *page, read_descriptor_t *desc,
unsigned long from)
{
----> struct inode *inode = page->mapping->host;
It looks like page->mapping was the culprit. (xmon trace is below).
After closer examination, I realized that do_generic_file_read() does a
find_get_page(), and eventually locks the page before calling
block_is_partially_uptodate(). However, it doesn't revalidate the
page->mapping after the page is locked. So, there's a small window
between the find_get_page() and ->is_partially_uptodate() where the page
could get truncated and page->mapping cleared.
We _have_ a reference, so it can't get reclaimed, but it certainly
can be truncated.
I think the correct thing is to check page->mapping after the
trylock_page(), and jump out if it got truncated. This patch has been
running in the test environment for a month or so now, and we have not
seen this bug pop up again.
Per task latencytop accumulator prematurely terminates due to erroneous
placement of latency_record_count. It should be incremented whenever a
new record is allocated instead of increment on every latencytop event.
Also fix search iterator to only search known record events instead of
blindly searching all pre-allocated space.
Signed-off-by: Ken Chen <kenchen@google.com> Reviewed-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
BugLink: https://launchpad.net/bugs/683695
The original reporter states that headphone jacks do not appear to
work. Upon inspecting his codec dump, and upon further testing, it is
confirmed that the "alienware" model quirk is correct.
Reported-and-tested-by: Cody Thierauf Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
BugLink: https://launchpad.net/bugs/669279
The original reporter states: "The Master mixer does not change the
volume from the headphone output (which is affected by the headphone
mixer). Instead it only seems to control the on-board speaker volume.
This confuses PulseAudio greatly as the Master channel is merged into
the volume mix."
Fix this symptom by applying the hp_only quirk for the reporter's SSID.
The fix is applicable to all stable kernels.
Reported-and-tested-by: Ben Gamari <bgamari@gmail.com> Signed-off-by: Daniel T Chen <crimsun@ubuntu.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When handling an AR buffer that has been completely filled, we assumed
that its descriptor will not be read by the controller and can be
overwritten. However, when the last received packet happens to end at
the end of the buffer, the controller might not yet have moved on to the
next buffer and might read the branch address later. If we overwrite
and free the page before that, the DMA context will either go dead
because of an invalid Z value, or go off into some random memory.
To fix this, ensure that the descriptor does not get overwritten by
using only the actual buffer instead of the entire page for reassembling
the split packet. Furthermore, to avoid freeing the page too early,
move on to the next buffer only when some data in it guarantees that the
controller has moved on.
This should eliminate the remaining firewire-net problems.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Tested-by: Maxim Levitsky <maximlevitsky@gmail.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When the controller had to split a received asynchronous packet into two
buffers, the driver tries to reassemble it by copying both parts into
the first page. However, if size + rest > PAGE_SIZE, i.e., if the yet
unhandled packets before the split packet, the split packet itself, and
any received packets after the split packet are together larger than one
page, then the memory after the first page would get overwritten.
To fix this, do not try to copy the data of all unhandled packets at
once, but copy the possibly needed data every time when handling
a packet.
This gets rid of most of the infamous crashes and data corruptions when
using firewire-net.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Tested-by: Maxim Levitsky <maximlevitsky@gmail.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If a device exposes a sparsely populated configuration ROM,
firewire-core's sysfs interface and character device file interface
showed random data in the gaps between config ROM blocks. Fix this by
zero-initialization of the config ROM reader's scratch buffer.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: maximilian attems <max@stro.at> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
A userspace client got to see uninitialized stack-allocated memory if it
specified an _IOC_READ type of ioctl and an argument size larger than
expected by firewire-core's ioctl handlers (but not larger than the
core's union ioctl_arg).
Fix this by clearing the requested buffer size to zero, but only at _IOR
ioctls. This way, there is almost no runtime penalty to legitimate
ioctls. The only legitimate _IOR is FW_CDEV_IOC_GET_CYCLE_TIMER with 12
or 16 bytes to memset.
[Another way to fix this would be strict checking of argument size (and
possibly direction) vs. command number. However, we then need a lookup
table, and we need to allow for slight size deviations in case of 32bit
userland on 64bit kernel.]
Reported-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
[ Backported to 2.6.32 firewire core -maks ] Signed-off-by: maximilian attems <max@stro.at> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We now use load_gs_index() to load gs safely; unfortunately this also
changes MSR_KERNEL_GS_BASE, which we managed separately. This resulted
in confusion and breakage running 32-bit host userspace on a 64-bit kernel.
Fix by
- saving guest MSR_KERNEL_GS_BASE before we we reload the host's gs
- doing the host save/load unconditionally, instead of only when in guest
long mode
Things can be cleaned up further, but this is the minmal fix for now.
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
[bwh: Backport to 2.6.32] Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>