- Moved fan pwm register array pointers into per-instance data.
- Only read fan pwm data for installed/supported fans.
- Update fan max output and fan step output information from data in
registers.
- Create max_output and step_output attribute files only if respective
fan pwm registers exist.
This bug was introduced in:
commit 81adee47dfb608df3ad0b91d230fb3cef75f0060
Author: Eric W. Biederman <ebiederm@aristanetworks.com>
Date: Sun Nov 8 00:53:51 2009 -0800
net: Support specifying the network namespace upon device creation.
There is no good reason to not support userspace specifying the
network namespace during device creation, and it makes it easier
to create a network device and pass it to a child network namespace
with a well known name.
We have to be careful to ensure that the target network namespace
for the new device exists through the life of the call. To keep
that logic clear I have factored out the network namespace grabbing
logic into rtnl_link_get_net.
In addtion we need to continue to pass the source network namespace
to the rtnl_link_ops.newlink method so that we can find the base
device source network namespace.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Where apparently I forgot to add error handling to the path where we create
a new network device in a new network namespace, and pass in an invalid pid.
Cc: stable@kernel.org Reported-by: Ed Swierk <eswierk@bigswitch.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andi Kleen <ak@linux.intel.com>
| Mount-cache hash table entries: 512
| slab error in verify_redzone_free(): cache `idr_layer_cache': memory outside object was overwritten
| Backtrace:
| [<c0227088>] (dump_backtrace+0x0/0x110) from [<c0431afc>] (dump_stack+0x18/0x1c)
| [<c0431ae4>] (dump_stack+0x0/0x1c) from [<c0293304>] (__slab_error+0x28/0x30)
| [<c02932dc>] (__slab_error+0x0/0x30) from [<c0293a74>] (cache_free_debugcheck+0x1c0/0x2b8)
| [<c02938b4>] (cache_free_debugcheck+0x0/0x2b8) from [<c0293f78>] (kmem_cache_free+0x3c/0xc0)
| [<c0293f3c>] (kmem_cache_free+0x0/0xc0) from [<c032b1c8>] (ida_get_new_above+0x19c/0x1c0)
| [<c032b02c>] (ida_get_new_above+0x0/0x1c0) from [<c02af7ec>] (alloc_vfsmnt+0x54/0x144)
| [<c02af798>] (alloc_vfsmnt+0x0/0x144) from [<c0299830>] (vfs_kern_mount+0x30/0xec)
| [<c0299800>] (vfs_kern_mount+0x0/0xec) from [<c0299908>] (kern_mount_data+0x1c/0x20)
| [<c02998ec>] (kern_mount_data+0x0/0x20) from [<c02146c4>] (sysfs_init+0x68/0xc8)
| [<c021465c>] (sysfs_init+0x0/0xc8) from [<c02137d4>] (mnt_init+0x90/0x1b0)
| [<c0213744>] (mnt_init+0x0/0x1b0) from [<c0213388>] (vfs_caches_init+0x100/0x140)
| [<c0213288>] (vfs_caches_init+0x0/0x140) from [<c0208c0c>] (start_kernel+0x2e8/0x368)
| [<c0208924>] (start_kernel+0x0/0x368) from [<c0208034>] (__enable_mmu+0x0/0x2c)
| c0113268: redzone 1:0xd84156c5c032b3ac, redzone 2:0xd84156c5635688c0.
| slab error in cache_alloc_debugcheck_after(): cache `idr_layer_cache': double free, or memory outside object was overwritten
| ...
| c011307c: redzone 1:0x9f91102ffffffff, redzone 2:0x9f911029d74e35b
| slab: Internal list corruption detected in cache 'idr_layer_cache'(24), slabp c0113000(16). Hexdump:
|
| 000: 20 4f 10 c0 20 4f 10 c0 7c 00 00 00 7c 30 11 c0
| 010: 10 00 00 00 10 00 00 00 00 00 c9 17 fe ff ff ff
| 020: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
| 030: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
| 040: fe ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff
| 050: fe ff ff ff fe ff ff ff fe ff ff ff 11 00 00 00
| 060: 12 00 00 00 13 00 00 00 14 00 00 00 15 00 00 00
| 070: 16 00 00 00 17 00 00 00 c0 88 56 63
| kernel BUG at /home/rmk/git/linux-2.6-rmk/mm/slab.c:2928!
Reference: https://lkml.org/lkml/2011/2/7/238 Cc: <stable@kernel.org> # 2.6.35.y and later Reported-and-analyzed-by: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andi Kleen <ak@linux.intel.com>
When destroying inherited events, we need to destroy groups too,
otherwise the event iteration in perf_event_exit_task_context() will
miss group siblings and we leak events with all the consequences.
When an endpoint stalls, the xHCI driver must move the endpoint ring's
dequeue pointer past the stalled transfer. To do that, the driver issues
a Set TR Dequeue Pointer command, which will complete some time later.
Takashi was having issues with USB 1.1 audio devices that stalled, and his
analysis of the code was that the old code would not update the xHCI
driver's ring dequeue pointer after the command completes. However, the
dequeue pointer is set in xhci_find_new_dequeue_state(), just before the
set command is issued to the hardware.
Setting the dequeue pointer before the Set TR Dequeue Pointer command
completes is a dangerous thing to do, since the xHCI hardware can fail the
command. Instead, store the new dequeue pointer in the xhci_virt_ep
structure, and update the ring's dequeue pointer when the Set TR dequeue
pointer command completes.
While we're at it, make sure we can't queue another Set TR Dequeue Command
while the first one is still being processed. This just won't work with
the internal xHCI state code. I'm still not sure if this is the right
thing to do, since we might have a case where a driver queues multiple
URBs to a control ring, one of the URBs Stalls, and then the driver tries
to cancel the second URB. There may be a race condition there where the
xHCI driver might try to issue multiple Set TR Dequeue Pointer commands,
but I would have to think very hard about how the Stop Endpoint and
cancellation code works. Keep the fix simple until when/if we run into
that case.
This patch should be queued to kernels all the way back to 2.6.31.
The document says:
|2.1 Problem description
| When at least two USB devices are simultaneously running, it is observed that
| sometimes the INT corresponding to one of the USB devices stops occurring. This may
| be observed sometimes with USB-to-serial or USB-to-network devices.
| The problem is not noticed when only USB mass storage devices are running.
|2.2 Implication
| This issue is because of the clearing of the respective Done Map bit on reading the ATL
| PTD Done Map register when an INT is generated by another PTD completion, but is not
| found set on that read access. In this situation, the respective Done Map bit will remain
| reset and no further INT will be asserted so the data transfer corresponding to that USB
| device will stop.
|2.3 Workaround
| An SOF INT can be used instead of an ATL INT with polling on Done bits. A time-out can
| be implemented and if a certain Done bit is never set, verification of the PTD completion
| can be done by reading PTD contents (valid bit).
| This is a proven workaround implemented in software.
Russell King run into this with an USB-to-serial converter. This patch
implements his suggestion to enable the high frequent SOF interrupt only
at the time we have ATL packages queued. It goes even one step further
and enables the SOF interrupt only if we have more than one ATL packet
queued at the same time.
Cc: <stable@kernel.org> # [2.6.35.x, 2.6.36.x, 2.6.37.x] Tested-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
This follows wireless-testing 9236d838c920e90708570d9bbd7bb82d30a38130
("cfg80211: fix extension channel checks to initiate communication") and
fixes accidental case fall-through. Without this fix, HT40 is entirely
blocked.
Signed-off-by: Mark Mentovai <mark@moxienet.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: stable@kernel.org Acked-by: Luis R. Rodriguez <lrodriguez@atheros.com Signed-off-by: John W. Linville <linville@tuxdriver.com>
Chuck Ebbert [Thu, 31 Mar 2011 18:58:52 +0000 (11:58 -0700)]
revert misc: uss720.c: add another vendor/product ID
In 2.6.35.10:
[ 122.146074] usb 2-1: new full speed USB device using uhci_hcd and address 2
[ 122.325102] usb 2-1: New USB device found, idVendor=050d, idProduct=0002
[ 122.325110] usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 122.325117] usb 2-1: Product: IEEE-1284 Controller
[ 122.325121] usb 2-1: Manufacturer: Belk USB Printing Support
[ 123.531167] usblp0: USB Bidirectional printer dev 2 if 0 alt 1 proto 2 vid
0x050D pid 0x0002
[ 123.531208] usbcore: registered new interface driver usblp
In 2.6.35.11:
[ 8046.227051] usb 2-1: new full speed USB device using uhci_hcd and address 6
[ 8046.408083] usb 2-1: New USB device found, idVendor=050d, idProduct=0002
[ 8046.408088] usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 8046.408092] usb 2-1: Product: IEEE-1284 Controller
[ 8046.408094] usb 2-1: Manufacturer: Belk USB Printing Support
[ 8047.552140] get_1284_register timeout
[ 8047.554102] uss720: async_complete: urb error -104
[repeats]
[ 8047.556111] uss720: async_complete: urb error -32
[sequence repeats]
[unplug connector]
[ 8485.688067] parport0: fix this legacy no-device port driver!
[ 8485.688427] uss720: async_complete: urb error -32
Blacklisting the uss720 driver fixes the problem.
From 0a67b7cf26d73ed1dbea7e99d63673b5c4aa479e Mon Sep 17 00:00:00 2001
From: Thomas Sailer <t.sailer@alumni.ethz.ch>
Date: Tue, 14 Dec 2010 16:04:05 +0100
Subject: [PATCH] USB: misc: uss720.c: add another vendor/product ID
stored a ref to the current cred struct in struct scm_cookie. This was fine
with AF_UNIX as that calls scm_destroy() from its packet sending functions, but
AF_NETLINK, which also uses scm_send(), does not call scm_destroy() - meaning
that the copied credentials leak each time SCM data is sent over a netlink
socket.
This can be triggered quite simply on a Fedora 13 or 14 userspace with the
2.6.35.11 kernel (or something based off of that) by calling:
#!/bin/bash
for ((i=0; i<100; i++))
do
su - -c /bin/true
cut -d: -f1 /proc/slabinfo | grep 'cred\|key\|task_struct'
cat /proc/keys | wc -l
done
This leaks the session key that pam_keyinit creates for 'su -', which appears
in /proc/keys as being revoked (has the R flag set against it) afterward su is
called.
Furthermore, if CONFIG_SLAB=y, then the cred and key slab object usage counts
can be viewed and seen to increase. The key slab increases by one object per
loop, and this can be seen after the system has had a couple of minutes to
stand after the script above has been run on it.
If the system is working correctly, the key and cred counts should return to
roughly what they were before.
af_netlink: Add needed scm_destroy after scm_send.
scm_send occasionally allocates state in the scm_cookie, so I have
modified netlink_sendmsg to guarantee that when scm_send succeeds
scm_destory will be called to free that state.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
netfilter: arpt_mangle: fix return values of checkentry
In 135367b "netfilter: xtables: change xt_target.checkentry return type",
the type returned by checkentry was changed from boolean to int, but the
return values where not adjusted.
arptables: Input/output error
This broke arptables with the mangle target since it returns true
under success, which is interpreted by xtables as >0, thus
returning EIO.
The following Linux kernels are affected:
* 2.6.35.9
* 2.6.36.4
* 2.6.37.3
Cc: stable@kernel.org Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Andi Kleen <ak@linux.intel.com>
(cherry picked from commit 9d0db8b6b1da9e3d4c696ef29449700c58d589db)
Randy Dunlap has reported that building classmate-laptop fails when
CONFIG_RFKILL=m and CONFIG_ACPI_CMPC=y. He suggested depending on
RFKILL, but, then, it will not be possible to select classmate-laptop
when RFKILL is off. There's no known problem with building and using
classmate-laptop with RFKILL off. So depend on RFKILL or RFKILL=n.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: platform-driver-x86@vger.kernel.org Cc: Daniel Oliveira Nascimento <don@syst.com.br>
The default hibernation image size is currently hard coded and euqal
to 500 MB, which is not a reasonable default on many contemporary
systems. Make it equal 2/5 of the total RAM size (this is slightly
below the maximum, i.e. 1/2 of the total RAM size, and seems to be
generally suitable).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Tested-by: M. Vefa Bicakci <bicave@superonline.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
The pointer '(*auth_tok_key)' is set to NULL in case request_key()
fails, in order to prevent its use by functions calling
ecryptfs_keyring_auth_tok_for_sig().
During device discovery, scsi mid layer sends INQUIRY command to LUN
0. If the LUN 0 is not mapped to host, it creates a temporary
scsi_device with LUN id 0 and sends REPORT_LUNS command to it. After
the REPORT_LUNS succeeds, it walks through the LUN table and adds each
LUN found to sysfs. At the end of REPORT_LUNS lun table scan, it will
delete the temporary scsi_device of LUN 0.
When scsi devices are added to sysfs, it calls add_dev function of all
the registered class interfaces. If ses driver has been registered,
ses_intf_add() of ses module will be called. This function calls
scsi_device_enclosure() to check the inquiry data for EncServ
bit. Since inquiry was not allocated for temporary LUN 0 scsi_device,
it will cause NULL pointer exception.
To fix the problem, sdev->inquiry is checked for NULL before reading it.
enclosure page 7 gives us the "pretty" names of the enclosure slots.
Without a page 7, we can still use the enclosure code as long as we
make up numeric names for the slots. Unfortunately, the current code
fails to add any devices because the check for page 10 is in the wrong
place if we have no page 7. Fix it so that devices show up even if
the enclosure has no page 7.
This field is used to determine the inactivity time. When in AP mode,
hostapd uses it for kicking out inactive clients after a while. Without this
patch, hostapd immediately deauthenticates a new client if it checks the
inactivity time before the client sends its first data frame.
Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
User-controllable indexes for voice and channel values may cause reading
and writing beyond the bounds of their respective arrays, leading to
potentially exploitable memory corruption. Validate these indexes.
Under certain workloads a command may seem to get lost. IOW, the Smart Array
thinks all commands have been completed but we still have commands in our
completion queue. This may lead to system instability, filesystems going
read-only, or even panics depending on the affected filesystem. We add an
extra read to force the write to complete.
Testing shows this extra read avoids the problem.
Signed-off-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
Rmmod myri10ge crash at free_netdev() -> netif_napi_del(), because napi
structures are already deallocated. To fix call netif_napi_del() before
kfree() at myri10ge_free_slices().
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
The maximum kilobytes of locked memory that an unprivileged user
can reserve is of 512 kB = 128 pages by default, scaled to the
number of onlined CPUs, which fits well with the tools that use
128 data pages by default.
However tools actually use 129 pages, because they need one more
for the user control page. Thus the default mlock threshold is
not sufficient for the default tools needs and we always end up
to evaluate the constant mlock rlimit policy, which doesn't have
this scaling with the number of online CPUs.
Hence, on systems that have more than 16 CPUs, we overlap the
rlimit threshold and fail to mmap:
$ perf record ls
Error: failed to mmap with 1 (Operation not permitted)
Just increase the max unprivileged mlock threshold by one page
so that it supports well perf tools even after 16 CPUs.
This patch fixes a race between snd_card_file_remove() and
snd_card_disconnect(). When the card is added to shutdown_files list
in snd_card_disconnect(), but it's freed in snd_card_file_remove() at
the same time, the shutdown_files list gets corrupted. The list member
must be freed in snd_card_file_remove() as well.
The commit 5a8cfb4e8ae317d283f84122ed20faa069c5e0c4
ALSA: hda - Use ALC_INIT_DEFAULT for really default initialization
changed to use the default initialization method for ALC889, but
this caused a regression on SPDIF output on some machines.
This seems due to the COEF setup included in the default init procedure.
For making SPDIF working again, the COEF-setup has to be avoided for
the id 0889.
The dcdbas driver can do an I/O write to cause a SMI to occur. The SMI handler
looks at certain registers and memory locations, so the SMI needs to happen
immediately. On some systems I/O writes are posted, though, causing the SMI to
happen well after the "outb" occurred, which causes random failures. Following
the "outb" with an "inb" forces the write to go through even if it is posted.
While trying to track down some NFS problems with BTRFS, I kept noticing I was
getting -EACCESS for no apparent reason. Eric Paris and printk() helped me
figure out that it was SELinux that was giving me grief, with the following
denial
Turns out this is because in d_obtain_alias if we can't find an alias we create
one and do all the normal instantiation stuff, but we don't do the
security_d_instantiate.
Usually we are protected from getting a hashed dentry that hasn't yet run
security_d_instantiate() by the parent's i_mutex, but obviously this isn't an
option there, so in order to deal with the case that a second thread comes in
and finds our new dentry before we get to run security_d_instantiate(), we go
ahead and call it if we find a dentry already. Eric assures me that this is ok
as the code checks to see if the dentry has been initialized already so calling
security_d_instantiate() against the same dentry multiple times is ok. With
this patch I'm no longer getting errant -EACCESS values.
Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If we call xs_close(), we're in one of two situations:
- Autoclose, which means we don't expect to resend a request
- bind+connect failed, which probably means the port is in use
A virtualized display device is usually viewed with the vncviewer
application, either by 'xm vnc domU' or with vncviewer localhost:port.
vncviewer and the RFB protocol provides absolute coordinates to the
virtual display. These coordinates are either passed through to a PV
guest or converted to relative coordinates for a HVM guest.
A PV guest receives these coordinates and passes them to the kernels
evdev driver. There it can be picked up by applications such as the
xorg-input drivers. Using absolute coordinates avoids issues such as
guest mouse pointer not tracking host mouse pointer due to wrong mouse
acceleration settings in the guests X display.
Advertise either absolute or relative coordinates to the input system
and the evdev driver, depending on what dom0 provides. The xorg-input
driver prefers relative coordinates even if a devices provides both.
Fix potential null-pointer exception on disconnect introduced by commit 11ea859d64b69a747d6b060b9ed1520eab1161fe (USB: additional power savings
for cdc-acm devices that support remote wakeup).
Only access acm->dev after making sure it is non-null in control urb
completion handler.
Prevent read urbs from being resubmitted from tasklet after port close.
The receive tasklet was not disabled on port close, which could lead to
corruption of receive lists on consecutive port open. In particular,
read urbs could be re-submitted before port open, added to free list in
open, and then added a second time to the free list in the completion
handler.
My testprog do a lot of bitbang - after hours i got following warning and my machine lockups:
WARNING: at /build/buildd/linux-2.6.38/lib/kref.c:34
After debugging uss720 driver i discovered that the completion callback was called before
usb_submit_urb returns. The callback frees the request structure that is krefed on return by
usb_submit_urb.
Signed-off-by: Peter Holik <peter@holik.at> Acked-by: Thomas Sailer <t.sailer@alumni.ethz.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
This patch (as1453) fixes a long-standing bug in the ehci-hcd driver.
There is no need to set the Halt bit in the overlay region for an
unlinked or blocked QH. Contrary to what the comment says, setting
the Halt bit does not cause the QH to be patched later; that decision
(made in qh_refresh()) depends only on whether the QH is currently
pointing to a valid qTD. Likewise, setting the Halt bit does not
prevent completions from activating the QH while it is "stopped"; they
are prevented by the fact that qh_completions() temporarily changes
qh->qh_state to QH_STATE_COMPLETING.
On the other hand, there are circumstances in which the QH will be
reactivated _without_ being patched; this happens after an URB beyond
the head of the queue is unlinked. Setting the Halt bit will then
cause the hardware to see the QH with both the Active and Halt bits
set, an invalid combination that will prevent the queue from
advancing and may even crash some controllers.
Apparently the only reason this hasn't been reported before is that
unlinking URBs from the middle of a running queue is quite uncommon.
However Test 17, recently added to the usbtest driver, does exactly
this, and it confirms the presence of the bug.
In short, there is no reason to set the Halt bit for an unlinked or
blocked QH, and there is a very good reason not to set it. Therefore
the code that sets it is removed.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Tested-by: Andiry Xu <andiry.xu@amd.com> CC: David Brownell <david-b@pacbell.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
The scheme used to index format in uvc_fixup_video_ctrl() is not robust:
format index is based on descriptor ordering, which does not necessarily
match bFormatIndex ordering. Searching for first matching format will
prevent uvc_fixup_video_ctrl() from using the wrong format/frame to make
adjustments.
Use mask 0x10 for "soft cursor" detection on in function tile_cursor.
(Tile Blitting Operation in framebuffer console).
The old mask 0x01 for vc_cursor_type detects CUR_NONE, CUR_LOWER_THIRD
and every second mode value as "software cursor". This hides the cursor
for these modes (cursor.mode = 0). But, only CUR_NONE or "software cursor"
should hide the cursor.
See also 0x10 in functions add_softcursor, bit_cursor and cw_cursor.
Signed-off-by: Henry Nestler <henry.nestler@gmail.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
While mm->start_stack was protected from cross-uid viewing (commit f83ce3e6b02d5 ("proc: avoid information leaks to non-privileged
processes")), the start_code and end_code values were not. This would
allow the text location of a PIE binary to leak, defeating ASLR.
Note that the value "1" is used instead of "0" for a protected value since
"ps", "killall", and likely other readers of /proc/pid/stat, take
start_code of "0" to mean a kernel thread and will misbehave. Thanks to
Brad Spengler for pointing this out.
The current code fails to print the "[heap]" marking if the heap is split
into multiple mappings.
Fix the check so that the marking is displayed in all possible cases:
1. vma matches exactly the heap
2. the heap vma is merged e.g. with bss
3. the heap vma is splitted e.g. due to locked pages
Test cases. In all cases, the process should have mapping(s) with
[heap] marking:
It looks like the bug has been there forever, and since it only results in
some information missing from a procfile, it does not fulfil the -stable
"critical issue" criteria.
Orphan cleanup is currently executed even if the file system has some
number of unknown ROCOMPAT features, which deletes inodes and frees
blocks, which could be very bad for some RO_COMPAT features.
This patch skips the orphan cleanup if it contains readonly compatible
features not known by this ext3 implementation, which would prevent
the fs from being mounted (or remounted) readwrite.
Signed-off-by: Amir Goldstein <amir73il@users.sf.net> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
Userland should be able to trust the pid and uid of the sender of a
signal if the si_code is SI_TKILL.
Unfortunately, the kernel has historically allowed sigqueueinfo() to
send any si_code at all (as long as it was negative - to distinguish it
from kernel-generated signals like SIGILL etc), so it could spoof a
SI_TKILL with incorrect siginfo values.
Happily, it looks like glibc has always set si_code to the appropriate
SI_QUEUE, so there are probably no actual user code that ever uses
anything but the appropriate SI_QUEUE flag.
So just tighten the check for si_code (we used to allow any negative
value), and add a (one-time) warning in case there are binaries out
there that might depend on using other si_code values.
If a device doesn't support power management (pm_cap == 0) but it is
acpi_pci_power_manageable() because there is a _PS0 method declared for
it and _EJ0 is also declared for the slot then nobody is going to set
current_state = PCI_D0 for this device. This is what I think it is
happening:
pci_enable_device
|
__pci_enable_device_flags
/* here we do not set current_state because !pm_cap */
|
do_pci_enable_device
|
pci_set_power_state
|
__pci_start_power_transition
|
pci_platform_power_transition
/* platform_pci_power_manageable() calls acpi_pci_power_manageable that
* returns true */
|
platform_pci_set_power_state
/* acpi_pci_set_power_state gets called and does nothing because the
* acpi device has _EJ0, see the comment "If the ACPI device has _EJ0,
* ignore the device" */
at this point if we refer to the commit message that introduced the
comment above (10b3dcae0f275e2546e55303d64ddbb58cec7599), it is up to
the hotplug driver to set the state to D0.
However AFAICT the pci hotplug driver never does, in fact
drivers/pci/hotplug/acpiphp_glue.c:register_slot sets the slot flags to
(SLOT_ENABLED | SLOT_POWEREDON) but it does not set the pci device
current state to PCI_D0.
So my proposed fix is also to set current_state = PCI_D0 in
register_slot.
Comments are very welcome.
Up to 2.6.22, you could use remap_file_pages(2) on a tmpfs file or a
shared mapping of /dev/zero or a shared anonymous mapping. In 2.6.23 we
disabled it by default, but set VM_CAN_NONLINEAR to enable it on safe
mappings. We made sure to set it in shmem_mmap() for tmpfs files, but
missed it in shmem_zero_setup() for the others. Fix that at last.
The test program below will hang because io_getevents() uses
add_wait_queue_exclusive(), which means the wake_up() in io_destroy() only
wakes up one of the threads. Fix this by using wake_up_all() in the aio
code paths where we want to make sure no one gets stuck.
An integer overflow occurs in the calculation of RHlinear when the
relative humidity is greater than around 30%. The consequence is a subtle
(but noticeable) error in the resulting humidity measurement.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jonathan Cameron <jic23@cam.ac.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The latest binutils (2.21.0.20110302/Ubuntu) breaks the build
yet another time, under CONFIG_XEN=y due to a .size directive that
refers to a slightly differently named (hence, to the now very
strict and unforgiving assembler, non-existent) symbol.
[ mingo:
This unnecessary build breakage caused by new binutils
version 2.21 gets escallated back several kernel releases spanning
several years of Linux history, affecting over 130,000 upstream
kernel commits (!), on CONFIG_XEN=y 64-bit kernels (i.e. essentially
affecting all major Linux distro kernel configs).
Git annotate tells us that this slight debug symbol code mismatch
bug has been introduced in 2008 in commit 3d75e1b8:
Human reviewers almost never catch such small mismatches, and binutils
never even warned about it either.
This new binutils version thus breaks the Xen build on all upstream kernels
since v2.6.27, out of the blue.
This makes a straightforward Git bisection of all 64-bit Xen-enabled kernels
impossible on such binutils, for a bisection window of over hundred
thousand historic commits. (!)
This is a major fail on the side of binutils and binutils needs to turn
this show-stopper build failure into a warning ASAP. ]
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm> Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Jan Beulich <jbeulich@novell.com> Cc: H.J. Lu <hjl.tools@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <kees.cook@canonical.com>
LKML-Reference: <1299877178-26063-1-git-send-email-heukelum@fastmail.fm> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
During redetection of a SDIO card, a request for a new card RCA
was submitted to the card, but was then overwritten by the old RCA.
This caused the card to be deselected instead of selected when using
the incorrect RCA. This bug's been present since the "oldcard"
handling was introduced in 2.6.32.
Mike Galbraith reported finding a lockup ("perma-spin bug") where the
cpumask passed to smp_call_function_many was cleared by other cpu(s)
while a cpu was preparing its call_data block, resulting in no cpu to
clear the last ref and unlock the block.
Having cpus clear their bit asynchronously could be useful on a mask of
cpus that might have a translation context, or cpus that need a push to
complete an rcu window.
Instead of adding a BUG_ON and requiring yet another cpumask copy, just
detect the race and handle it.
Note: arch_send_call_function_ipi_mask must still handle an empty
cpumask because the data block is globally visible before the that arch
callback is made. And (obviously) there are no guarantees to which cpus
are notified if the mask is changed during the call; only cpus that were
online and had their mask bit set during the whole call are guaranteed
to be called.
Reported-by: Mike Galbraith <efault@gmx.de> Reported-by: Jan Beulich <JBeulich@novell.com> Acked-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
According to intel CPU manual, every time PGD entry is changed in i386 PAE
mode, we need do a full TLB flush. Current code follows this and there is
comment for this too in the code.
But current code misses the multi-threaded case. A changed page table
might be used by several CPUs, every such CPU should flush TLB. Usually
this isn't a problem, because we prepopulate all PGD entries at process
fork. But when the process does munmap and follows new mmap, this issue
will be triggered.
When it happens, some CPUs keep doing page faults:
Paul McKenney's review pointed out two problems with the barriers in the
2.6.38 update to the smp call function many code.
First, a barrier that would force the func and info members of data to
be visible before their consumption in the interrupt handler was
missing. This can be solved by adding a smp_wmb between setting the
func and info members and setting setting the cpumask; this will pair
with the existing and required smp_rmb ordering the cpumask read before
the read of refs. This placement avoids the need a second smp_rmb in
the interrupt handler which would be executed on each of the N cpus
executing the call request. (I was thinking this barrier was present
but was not).
Second, the previous write to refs (establishing the zero that we the
interrupt handler was testing from all cpus) was performed by a third
party cpu. This would invoke transitivity which, as a recient or
concurrent addition to memory-barriers.txt now explicitly states, would
require a full smp_mb().
However, we know the cpumask will only be set by one cpu (the data
owner) and any preivous iteration of the mask would have cleared by the
reading cpu. By redundantly writing refs to 0 on the owning cpu before
the smp_wmb, the write to refs will follow the same path as the writes
that set the cpumask, which in turn allows us to keep the barrier in the
interrupt handler a smp_rmb instead of promoting it to a smp_mb (which
will be be executed by N cpus for each of the possible M elements on the
list).
I moved and expanded the comment about our (ab)use of the rcu list
primitives for the concurrent walk earlier into this function. I
considered moving the first two paragraphs to the queue list head and
lock, but felt it would have been too disconected from the code.
Cc: Paul McKinney <paulmck@linux.vnet.ibm.com> Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
Peter pointed out there was nothing preventing the list_del_rcu in
smp_call_function_interrupt from running before the list_add_rcu in
smp_call_function_many.
Fix this by not setting refs until we have gotten the lock for the list.
Take advantage of the wmb in list_add_rcu to save an explicit additional
one.
I tried to force this race with a udelay before the lock & list_add and
by mixing all 64 online cpus with just 3 random cpus in the mask, but
was unsuccessful. Still, inspection shows a valid race, and the fix is
a extension of the existing protection window in the current code.
Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
When ext3_dx_add_entry() has to split an index node, it has to ensure that
name_len of dx_node's fake_dirent is also zero, because otherwise e2fsck
won't recognise it as an intermediate htree node and consider the htree to
be corrupted.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
Events on POWER7 can roll back if a speculative event doesn't
eventually complete. Unfortunately in some rare cases they will
raise a performance monitor exception. We need to catch this to
ensure we reset the PMC. In all cases the PMC will be 256 or less
cycles from overflow.
This fixes a race in which the task->tk_callback() puts the rpc_task
to sleep, setting a new callback. Under certain circumstances, the current
code may end up executing the task->tk_action before it gets round to the
callback.
Commit 280c73d ("PCI: centralize the capabilities code in
pci-sysfs.c") changed the initialisation of the "rom" and "vpd"
attributes, and made the failure path for the "vpd" attribute
incorrect. We must free the new attribute structure (attr), but
instead we currently free dev->vpd->attr. That will normally be NULL,
resulting in a memory leak, but it might be a stale pointer, resulting
in a double-free.
Some broken BIOSes on ICH4 chipset report an ACPI region which is in
conflict with legacy IDE ports when ACPI is disabled. Even though the
regions overlap, IDE ports are working correctly (we cannot find out
the decoding rules on chipsets).
So the only problem is the reported region itself, if we don't reserve
the region in the quirk everything works as expected.
This patch avoids reserving any quirk regions below PCIBIOS_MIN_IO
which is 0x1000. Some regions might be (and are by a fast google
query) below this border, but the only difference is that they won't
be reserved anymore. They should still work though the same as before.
The conflicts look like (1f.0 is bridge, 1f.1 is IDE ctrl):
pci 0000:00:1f.1: address space collision: [io 0x0170-0x0177] conflicts with 0000:00:1f.0 [io 0x0100-0x017f]
At 0x0100 a 128 bytes long ACPI region is reported in the quirk for
ICH4. ata_piix then fails to find disks because the IDE legacy ports
are zeroed:
ata_piix 0000:00:1f.1: device not available (can't reserve [io 0x0000-0x0007])
Per ICH4 and ICH6 specs, ACPI and GPIO regions are valid iff ACPI_EN
and GPIO_EN bits are set to 1. Add checks for these bits into the
quirks prior to the region creation.
When the mux for digital mic is different from the mux for other mics,
the current auto-parser doesn't handle them in a right way but provides
only one mic. This patch fixes the issue.
When an endpoint stalls, we need to update the xHCI host's internal
dequeue pointer to move it past the stalled transfer. This includes
updating the cycle bit (TRB ownership bit) if we have moved the dequeue
pointer past a link TRB with the toggle cycle bit set.
When we're trying to find the new dequeue segment, find_trb_seg() is
supposed to keep track of whether we've passed any link TRBs with the
toggle cycle bit set. However, this while loop's body
Will never get executed if the ring only contains one segment.
find_trb_seg() will return immediately, without updating the new cycle
bit. Since find_trb_seg() has no idea where in the segment the TD that
stalled was, make the caller, xhci_find_new_dequeue_state(), check for
this special case and update the cycle bit accordingly.
This patch should be queued to kernels all the way back to 2.6.31.
I picked up a new DAK-780EX(professional digitl reverb/mix system),
which use CH341T chipset to communication with computer on 3/2011
and the CH341T's vendor code is 1a86
Looking up the CH341T's vendor and product id's I see:
1a86 QinHeng Electronics
5523 CH341 in serial mode, usb to serial port converter
CH341T,CH341 are the products of the same company, maybe
have some common hardware, and I test the ch341.c works
well with CH341T
There are few places where we are checking for macversion and revsions
before RTC is powered ON. However we are reading the macversion and
revisions only after RTC is powered ON and so both macversion and
revisions are actully zero and this leads to incorrect srev checks
Incorrect srev checks can cause registers to be configured wrongly and can
cause unexpected behavior. Fixing this seems to address the ASPM issue that
we have observed. The laptop becomes very slow and hangs mostly with ASPM L1
enabled without this fix.
fix this by reading the macversion and revisisons even before we start
using them. There is no reason why should we delay reading this info
until RTC is powered on as this is just a register information.
Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
Commit 7f74f8f28a2bd9db9404f7d364e2097a0c42cc12
(x86 quirk: Fix polarity for IRQ0 pin2 override on SB800
systems) introduced a regression. It removed some SB600 specific
code to determine the revision ID without adapting a
corresponding revision ID check for SB600.
When processing a SIDR REQ, the ib_cm allocates a new cm_id. The
refcount of the cm_id is initialized to 1. However, cm_process_work
will decrement the refcount after invoking all callbacks. The result
is that the cm_id will end up with refcount set to 0 by the end of the
sidr req handler.
If a user tries to destroy the cm_id, the destruction will proceed,
under the incorrect assumption that no other threads are referencing
the cm_id. This can lead to a crash when the cm callback thread tries
to access the cm_id.
This problem was noticed as part of a larger investigation with kernel
crashes in the rdma_cm when running on a real time OS.
Signed-off-by: Sean Hefty <sean.hefty@intel.com> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
They were able to reproduce the crash multiple times with the
following details:
Crash seems to always happen on the:
mutex_unlock(&conn_id->handler_mutex);
as conn_id looks to have been freed during this code path.
An examination of the code shows that a race exists in the request
handlers. When a new connection request is received, the rdma_cm
allocates a new connection identifier. This identifier has a single
reference count on it. If a user calls rdma_destroy_id() from another
thread after receiving a callback, rdma_destroy_id will proceed to
destroy the id and free the associated memory. However, the request
handlers may still be in the process of running. When control returns
to the request handlers, they can attempt to access the newly created
identifiers.
Fix this by holding a reference on the newly created rdma_cm_id until
the request handler is through accessing it.
Signed-off-by: Sean Hefty <sean.hefty@intel.com> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
Avoid removing all of memory and panicing when "mem={invalid}"
is specified, e.g. mem=blahblah, mem=0, or mem=nopentium (on
platforms other than x86_32).
Signed-off-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> BugLink: http://bugs.launchpad.net/bugs/553464 Cc: Yinghai Lu <yinghai@kernel.org> Cc: Len Brown <len.brown@intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl>
LKML-Reference: <1296783486-23033-1-git-send-email-kamal@canonical.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When the fuction graph tracer starts, it needs to make a special
stack for each task to save the real return values of the tasks.
All running tasks have this stack created, as well as any new
tasks.
On CPU hot plug, the new idle task will allocate a stack as well
when init_idle() is called. The problem is that cpu hotplug does
not create a new idle_task. Instead it uses the idle task that
existed when the cpu went down.
ftrace_graph_init_task() will add a new ret_stack to the task
that is given to it. Because a clone will make the task
have a stack of its parent it does not check if the task's
ret_stack is already NULL or not. When the CPU hotplug code
starts a CPU up again, it will allocate a new stack even
though one already existed for it.
The solution is to treat the idle_task specially. In fact, the
function_graph code already does, just not at init_idle().
Instead of using the ftrace_graph_init_task() for the idle task,
which that function expects the task to be a clone, have a
separate ftrace_graph_init_idle_task(). Also, we will create a
per_cpu ret_stack that is used by the idle task. When we call
ftrace_graph_init_idle_task() it will check if the idle task's
ret_stack is NULL, if it is, then it will assign it the per_cpu
ret_stack.
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
mm_fault_error() should not execute oom-killer, if page fault
occurs in kernel space. E.g. in copy_from_user()/copy_to_user().
This would happen if we find ourselves in OOM on a
copy_to_user(), or a copy_from_user() which faults.
Without this patch, the kernels hangs up in copy_from_user(),
because OOM killer sends SIG_KILL to current process, but it
can't handle a signal while in syscall, then the kernel returns
to copy_from_user(), reexcute current command and provokes
page_fault again.
With this patch the kernel return -EFAULT from copy_from_user().
The code, which checks that page fault occurred in kernel space,
has been copied from do_sigbus().
This situation is handled by the same way on powerpc, xtensa,
tile, ...
When au1000_eth probes the MII bus for PHY address, if we do not set
au1000_eth platform data's phy_search_highest_address, the MII probing
logic will exit early and will assume a valid PHY is found at address 0.
For MTX-1, the PHY is at address 31, and without this patch, the link
detection/speed/duplex would not work correctly.
ata_qc_complete() contains special handling for certain commands. For
example, it schedules EH for device revalidation after certain
configurations are changed. These shouldn't be applied to EH
commands but they were.
In most cases, it doesn't cause an actual problem because EH doesn't
issue any command which would trigger special handling; however, ACPI
can issue such commands via _GTF which can cause weird interactions.
Restructure ata_qc_complete() such that EH commands are always passed
on to __ata_qc_complete().
stable: Please apply to -stable only after 2.6.38 is released.
Add necessary alias to autoload ip6ip6 tunnel module.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
Since a8f80e8ff94ecba629542d9b4b5f5a8ee3eb565c any process with
CAP_NET_ADMIN may load any module from /lib/modules/. This doesn't mean
that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are
limited to /lib/modules/**. However, CAP_NET_ADMIN capability shouldn't
allow anybody load any module not related to networking.
This patch restricts an ability of autoloading modules to netdev modules
with explicit aliases. This fixes CVE-2011-1019.
Arnd Bergmann suggested to leave untouched the old pre-v2.6.32 behavior
of loading netdev modules by name (without any prefix) for processes
with CAP_SYS_MODULE to maintain the compatibility with network scripts
that use autoloading netdev modules by aliases like "eth0", "wlan0".
Currently there are only three users of the feature in the upstream
kernel: ipip, ip_gre and sit.
root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) --
root@albatros:~# grep Cap /proc/$$/status
CapInh: 0000000000000000
CapPrm: fffffff800001000
CapEff: fffffff800001000
CapBnd: fffffff800001000
root@albatros:~# modprobe xfs
FATAL: Error inserting xfs
(/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted
root@albatros:~# lsmod | grep xfs
root@albatros:~# ifconfig xfs
xfs: error fetching interface information: Device not found
root@albatros:~# lsmod | grep xfs
root@albatros:~# lsmod | grep sit
root@albatros:~# ifconfig sit
sit: error fetching interface information: Device not found
root@albatros:~# lsmod | grep sit
root@albatros:~# ifconfig sit0
sit0 Link encap:IPv6-in-IPv4
NOARP MTU:1480 Metric:1
root@albatros:~# lsmod | grep sit
sit 10457 0
tunnel4 2957 1 sit
For CAP_SYS_MODULE module loading is still relaxed:
I found that one of the 8168c chipsets (concretely XID 1c4000c0) starts
generating RxFIFO overflow errors. The result is an infinite loop in
interrupt handler as the RxFIFOOver is handled only for ...MAC_VER_11.
With the workaround everything goes fine.
Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Hayes <hayeswang@realtek.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Like many other places, we have to check that the array index is
within allowed limits, or otherwise, a kernel oops and other nastiness
can ensue when we access memory beyond the end of the array.
The problem goes back to v2.6.30-rc1~1372~1342~31 where nf_log_bind
was decoupled from nf_log_register.
Reported-by: Miguel Di Ciurcio Filho <miguel.filho@gmail.com>,
via irc.freenode.net/#netfilter Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
When CPU hotplug is used, some CPUs may be offline at the time a kexec is
performed. The subsequent kernel may expect these CPUs to be already running,
and will declare them stuck. On pseries, there's also a soft-offline (cede)
state that CPUs may be in; this can also cause problems as the kexeced kernel
may ask RTAS if they're online -- and RTAS would say they are. The CPU will
either appear stuck, or will cause a crash as we replace its cede loop beneath
it.
This patch kicks each present offline CPU awake before the kexec, so that
none are forever lost to these assumptions in the subsequent kernel.
Now, the behaviour is that all available CPUs that were offlined are now
online & usable after the kexec. This mimics the behaviour of a full reboot
(on which all CPUs will be restarted).
Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Kamalesh babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
cc: Anton Blanchard <anton@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Robert Swiecki reported a BUG_ON(page_mapped) from a fuzzer, punching
a hole with madvise(,, MADV_REMOVE). That path is under mutex, and
cannot be explained by lack of serialization in unmap_mapping_range().
Reviewing the code, I found one place where vm_truncate_count handling
should have been updated, when I switched at the last minute from one
way of managing the restart_addr to another: mremap move changes the
virtual addresses, so it ought to adjust the restart_addr.
But rather than exporting the notion of restart_addr from memory.c, or
converting to restart_pgoff throughout, simply reset vm_truncate_count
to 0 to force a rescan if mremap move races with preempted truncation.
We have no confirmation that this fixes Robert's BUG,
but it is a fix that's worth making anyway.
We have found a hardware erratum on 82599 hardware that can lead to
unpredictable behavior when Header Splitting mode is enabled. So
we are no longer enabling this feature on affected hardware.
Please see the 82599 Specification Update for more information.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
However the code in rxrpc_instantiate strips it away:
data += sizeof(kver);
datalen -= sizeof(kver);
Removing kif_version fixes my problem.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
When we get oplock break notification we should set the appropriate
value of OplockLevel field in oplock break acknowledge according to
the oplock level held by the client in this time. As we only can have
level II oplock or no oplock in the case of oplock break, we should be
aware only about clientCanCacheRead field in cifsInodeInfo structure.
Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
NETDEV_NOTIFY_PEER is an explicit request by the driver to send a link
notification while NETDEV_UP/NETDEV_CHANGEADDR generate link
notifications as a sort of side effect.
In the later cases the sysctl option is present because link
notification events can have undesired effects e.g. if the link is
flapping. I don't think this applies in the case of an explicit
request from a driver.
This patch makes NETDEV_NOTIFY_PEER unconditional, if preferred we
could add a new sysctl for this case which defaults to on.
This change causes Xen post-migration ARP notifications (which cause
switches to relearn their MAC tables etc) to be sent by default.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andi Kleen <ak@linux.intel.com>
[reported to solve hyperv live migration problem - gkh] Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Mike Surcouf <mike@surcouf.co.uk> Cc: Hank Janssen <hjanssen@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If the iowarrior devices in this case statement support more than 8 bytes
per report, it is possible to write past the end of a kernel heap allocation.
This will probably never be possible, but change the allocation to be more
defensive anyway.
For some time is known that ASPM is causing troubles on r8169, i.e. make
device randomly stop working without any errors in dmesg.
Currently Tomi Leppikangas reports that system with r8169 device hangs
with MCE errors when ASPM is enabled:
https://bugzilla.redhat.com/show_bug.cgi?id=642861#c4
Lets disable ASPM for r8169 devices at all, to avoid problems with
r8169 PCIe devices at least for some users.
Reported-by: Tomi Leppikangas <tomi.leppikangas@gmail.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
When support for 82577/82578 was added[1] in 2.6.31, PHY wakeup was in-
advertently enabled (even though it does not function properly) on ICH10
LOMs. This patch makes it so that the ICH10 LOMs use MAC wakeup instead
as was done with the initial support for those devices (i.e. 82567LM-3,
82567LF-3 and 82567V-4).
This fixes a bug in the order of dccp_rcv_state_process() that still permitted
reception even after closing the socket. A Reset after close thus causes a NULL
pointer dereference by not preventing operations on an already torn-down socket.
dccp_v4_do_rcv()
|
| state other than OPEN
v
dccp_rcv_state_process()
|
| DCCP_PKT_RESET
v
dccp_rcv_reset()
|
v
dccp_time_wait()
WARNING: at net/ipv4/inet_timewait_sock.c:141 __inet_twsk_hashdance+0x48/0x128()
Modules linked in: arc4 ecb carl9170 rt2870sta(C) mac80211 r8712u(C) crc_ccitt ah
[<c0038850>] (unwind_backtrace+0x0/0xec) from [<c0055364>] (warn_slowpath_common)
[<c0055364>] (warn_slowpath_common+0x4c/0x64) from [<c0055398>] (warn_slowpath_n)
[<c0055398>] (warn_slowpath_null+0x1c/0x24) from [<c02b72d0>] (__inet_twsk_hashd)
[<c02b72d0>] (__inet_twsk_hashdance+0x48/0x128) from [<c031caa0>] (dccp_time_wai)
[<c031caa0>] (dccp_time_wait+0x40/0xc8) from [<c031c15c>] (dccp_rcv_state_proces)
[<c031c15c>] (dccp_rcv_state_process+0x120/0x538) from [<c032609c>] (dccp_v4_do_)
[<c032609c>] (dccp_v4_do_rcv+0x11c/0x14c) from [<c0286594>] (release_sock+0xac/0)
[<c0286594>] (release_sock+0xac/0x110) from [<c031fd34>] (dccp_close+0x28c/0x380)
[<c031fd34>] (dccp_close+0x28c/0x380) from [<c02d9a78>] (inet_release+0x64/0x70)
The fix is by testing the socket state first. Receiving a packet in Closed state
now also produces the required "No connection" Reset reply of RFC 4340, 8.3.1.
Reported-and-tested-by: Johan Hovold <jhovold@gmail.com> Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reported-by: Mark Davis Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>
vfs_rename_other() does not lock renamed inode with i_mutex. Thus changing
i_nlink in a non-atomic manner (which happens in ext2_rename()) can corrupt
it as reported and analyzed by Josh.
In fact, there is no good reason to mess with i_nlink of the moved file.
We did it presumably to simulate linking into the new directory and unlinking
from an old one. But the practical effect of this is disputable because fsck
can possibly treat file as being properly linked into both directories without
writing any error which is confusing. So we just stop increment-decrement
games with i_nlink which also fixes the corruption.
CC: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Josh Hunt <johunt@akamai.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andi Kleen <ak@linux.intel.com>