Gwendal Grignou [Fri, 18 Jan 2013 18:56:43 +0000 (10:56 -0800)]
[libata] Set proper SK when CK_COND is set.
When the user application sends a ATA_12 or ATA_16 PASSTHROUGH
scsi command, put the task file register in the sense data with the
proper Sense Key. Instead of NO SENSE, set RECOVERED, as
specified in [SAT2]12.2.5 Table 92.
Tested:
Using udev ata_id to generate a passthrough command, IDENTIFY:
before:
sd 0:0:0:0: [sda] CDB: ATA command pass through(12)/Blank: \
a1 08 2e 00 01 00 00 00 00 ec 00 00
sd 0:0:0:0: [sda] Sense Key : No Sense [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
72 00 00 00 00 00 00 0e 09 0c 00 00 00 00 00 3f
00 18 00 a6 e0 50
after
sd 0:0:0:0: [sda] CDB: ATA command pass through(12)/Blank: \
a1 08 2e 00 01 00 00 00 00 ec 00 00
sd 0:0:0:0: [sda] Sense Key : Recovered Error [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
72 01 00 1d 00 00 00 0e 09 0c 00 00 00 01 00 00
00 00 00 00 00 50
Signed-off-by: Gwendal Grignou <gwendal@google.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Vladimir Barinov [Wed, 20 Feb 2013 20:10:29 +0000 (23:10 +0300)]
libata: add R-Car SATA driver
Add Renesas R-Car on-chip 3Gbps SATA controller driver.
Signed-off-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
[Sergei: few bugs fixed, significant cleanup] Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Aaron Lu [Wed, 23 Jan 2013 07:09:33 +0000 (15:09 +0800)]
[SCSI] remove can_power_off flag from scsi_device
Commit 166a2967b45ede2e2e56f3ede3cd32053dc17812 "libata: tell scsi layer
device supports runtime power off" introduced the can_power_off flag for
scsi_device and is used to support ZPODD implementation in SCSI layer.
Since ZPODD is now implemented in ATA layer, that flag is no longer
needed, so remove it.
Signed-off-by: Aaron Lu <aaron.lu@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Aaron Lu [Wed, 23 Jan 2013 07:09:32 +0000 (15:09 +0800)]
[libata] scsi: no poll when ODD is powered off
When the ODD is powered off, any action the user did to the ODD that
would generate a media event will trigger an ACPI interrupt, so the
poll for media event is no longer necessary. And the poll will also
cause a runtime status change, which will stop the ODD from staying in
powered off state, so the poll should better be stopped.
But since we don't have access to the gendisk structure in LLDs, here
comes the disk_events_disable_depth for scsi device. This field is a
hint set by LLDs to convey information to upper layer drivers. A value
of 0 means media poll is necessary for the device, while values above 0
means media poll is not needed and should better be skipped. So we can
increase its value when we are to power off the ODD in ATA layer and
decrease its value when the ODD is powered on, effectively silence the
media events poll.
Signed-off-by: Aaron Lu <aaron.lu@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Aaron Lu [Wed, 23 Jan 2013 07:09:31 +0000 (15:09 +0800)]
[SCSI] sr: support runtime pm
This patch adds runtime pm support for sr.
It did this by increasing the runtime usage_count of the device when
its block device is accessed. And decreasing the runtime usage_count
of the device when the access is done.
If there is media inside, runtime suspend is not allowed as we don't
always know if the ODD is being used or not.
The idea is discussed here:
http://thread.gmane.org/gmane.linux.acpi.devel/55243/focus=52703
and the restriction to check media inside is discussed here:
http://thread.gmane.org/gmane.linux.ide/53665/focus=58836
Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Aaron Lu [Fri, 25 Jan 2013 06:32:25 +0000 (14:32 +0800)]
[libata] PM code cleanup for ata port
For system freeze, if the port is already runtime suspended, leave it
alone and just return. The port will be resumed on thaw before it will
be used.
And since we will call get_noresume for every device during prepare
phase, and the port is resumed during thaw phase, it can't be in runtime
suspended state during the poweroff phase. So remove the
runtime_suspended check in poweroff callback.
And for all suspend(freeze/suspend/poweroff/etc.), there is no need to
touch the device, so set no_autopsy and no_recovery for them all.
Signed-off-by: Aaron Lu <aaron.lu@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Aaron Lu [Fri, 25 Jan 2013 06:29:35 +0000 (14:29 +0800)]
[libata] pm: differentiate system and runtime pm for ata port
We need to do different things for system PM and runtime PM, e.g. we do
not need to enable runtime wake for ZPODD when we are doing system
suspend, etc.
Currently, we use PMSG_SUSPEND for both system suspend and runtime
suspend and PMSG_ON for both system resume and runtime resume. Change
this by using PMSG_AUTO_SUSPEND for runtime suspend and PMSG_AUTO_RESUME
for runtime resume. And since PMSG_ON means no transition, it is changed
to PMSG_RESUME for ata port's system resume.
The ata_acpi_set_state is modified accordingly, and the sata case and
pata case is seperated for easy reading.
Signed-off-by: Aaron Lu <aaron.lu@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Aaron Lu [Tue, 15 Jan 2013 09:21:04 +0000 (17:21 +0800)]
libata: do not suspend port if normal ODD is attached
For ODDs, the upper layer will poll for media change every few
seconds, which will make it enter and leave suspend state very
often. And as each suspend will also cause a hard/soft reset,
the gain of runtime suspend is very little while the ODD may
malfunction after constantly being reset. So the idle callback
here will not proceed to suspend if a non-ZPODD capable ODD is
attached to the port.
Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Aaron Lu [Tue, 15 Jan 2013 09:21:02 +0000 (17:21 +0800)]
libata: expose pm qos flags for ata device
Expose pm qos flags to user space so that user has a chance to disable
ZPODD feature, if he/she has a broken platform or devices or simply does
not like this feature.
This flag is exposed to user space only for ZPODD devices.
Due to this flag, it is possible the ODD is ZP ready but we didn't power
it off. So the zp_ready flag will need to be cleared whenever we found
the ODD is not in ZP ready state. Previously, once zp_ready is set, the
ODD will always be powered off and the flag will be cleared in
post_poweron. But this is no longer the case now.
Signed-off-by: Aaron Lu <aaron.lu@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Aaron Lu [Tue, 15 Jan 2013 09:21:01 +0000 (17:21 +0800)]
libata: handle power transition of ODD
When ata port is runtime suspended, it will check if the ODD attched to
it is a zero power(ZP) capable ODD and if the ZP capable ODD is in zero
power ready state. And if this is not the case, the highest acpi state
will be limited to ACPI_STATE_D3_HOT to avoid powering off the ODD. And
if the ODD can be powered off, runtime wake capability needs to be
enabled and powered_off flag will be set to let resume code knows that
the ODD was in powered off state.
And on resume, before it is powered on, if it was powered off during
suspend, runtime wake capability needs to be disabled. After it is
recovered, the ODD is considered functional, post power on processing
like eject tray if the ODD is drawer type is done, and several ZPODD
related fields will also be reset.
Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Aaron Lu [Tue, 15 Jan 2013 09:21:00 +0000 (17:21 +0800)]
libata: check zero power ready status for ZPODD
Per the Mount Fuji spec, the ODD is considered zero power ready when:
- For slot type ODD, no media inside;
- For tray type ODD, no media inside and tray closed.
The information can be retrieved by either the returned information of
command GET_EVENT_STATUS_NOTIFICATION(the command is used to poll for
media event) or sense code.
The information provided by the media status byte is not accurate, it
is possible that after a new disc is just inserted, the status byte
still returns media not present. So this information can not be used as
the deciding factor, we use sense code to decide if zpready status is
true.
When we first sensed the ODD in the zero power ready state, the
zp_sampled will be set and timestamp will be recoreded. And after ODD
stayed in this state for some pre-defined period, the ODD is considered
as power off ready and the zp_ready flag will be set. The zp_ready flag
serves as the deciding factor other code will use to see if power off is
OK for the ODD.
The Mount Fuji spec suggests a delay should be used here, to avoid the
case user ejects the ODD and then instantly inserts a new one again, so
that we can avoid a power transition. And some ODDs may be slow to place
its head to the home position after disc is ejected, so a delay here is
generally a good idea. And the delay time can be changed via the module
param zpodd_poweroff_delay.
The zero power ready status check is performed in the ata port's runtime
suspend code path, when port is not frozen yet, as we need to issue some
IOs to the ODD.
Signed-off-by: Aaron Lu <aaron.lu@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Aaron Lu [Tue, 15 Jan 2013 09:20:59 +0000 (17:20 +0800)]
libata: move acpi notification code to zpodd
Since the ata acpi notification code introduced in commit 3bd46600a7a7e938c54df8cdbac9910668c7dfb0 is solely for ZPODD, and we
now have a dedicated place for it, move these code there.
And the ata_acpi_add_pm_notifier code is changed a little bit in that it
is now invoked when scsi device is not bound with ACPI yet, so the way
to get the acpi handle is different with the previous version. And the
ata_acpi_add/remove_pm_notifier is also simplified a little bit in that
it doesn't check if the acpi_device for the handle exists or not as the
odd_can_poweroff function already checked that.
Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Aaron Lu [Tue, 15 Jan 2013 09:20:58 +0000 (17:20 +0800)]
libata: identify and init ZPODD devices
The ODD can be enabled for ZPODD if the following three conditions are
satisfied:
1 The ODD supports device attention;
2 The platform can runtime power off the ODD through ACPI;
3 The ODD is either slot type or drawer type.
For such ODDs, zpodd_init is called and a new structure is allocated for
it to store ZPODD related stuffs.
And the zpodd_dev_enabled function is used to test if ZPODD is currently
enabled for this ODD.
A new config CONFIG_SATA_ZPODD is added to selectively build ZPODD code.
Signed-off-by: Aaron Lu <aaron.lu@intel.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
David Milburn [Mon, 14 Jan 2013 15:59:30 +0000 (09:59 -0600)]
libata: export host controller number thru /sys
As low-level drivers register their host controller(s), keep track
of the number of controllers and export thru /sys in a <host.port>
format so that udev can better match up port numbers with a
specific controller.
Shane Huang [Mon, 17 Dec 2012 15:18:59 +0000 (23:18 +0800)]
[libata] replace sata_settings with devslp_timing
NCQ capability was used to check availability of SATA Settings page
from Identify Device Data Log, which contains DevSlp timing variables.
It does not work on some HDDs and leads to error messages.
IDENTIFY word 78 bit 5(Hardware Feature Control) can't work either
because it is only the sufficient condition of Identify Device data
log, not the necessary condition.
This patch replaced ata_device->sata_settings with ->devslp_timing
to only save DevSlp timing variables(8 bytes), instead of the whole
SATA Settings page(512 bytes).
Linus Torvalds [Fri, 11 Jan 2013 22:56:41 +0000 (14:56 -0800)]
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull a hwmon patch from Guenter Roeck:
"Fix build error in vexpress driver"
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (vexpress) Fix build error seen if CONFIG_OF_DEVICE is not set
Linus Torvalds [Fri, 11 Jan 2013 22:55:15 +0000 (14:55 -0800)]
Merge branch 'akpm' (incoming fixes from Andrew)
Merge misc fixes from Andrew Morton:
"The audit fixes have been floating around for a while - Al and Eric
aren't responding to either myself or Kees so I asked Kees to
re-review them and here they are."
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
lib/rbtree.c: avoid the use of non-static __always_inline
MAINTAINERS: Omar had moved
mm: compaction: partially revert capture of suitable high-order page
linux/audit.h: move ptrace.h include to kernel header
kernel/audit.c: avoid negative sleep durations
audit: catch possible NULL audit buffers
audit: create explicit AUDIT_SECCOMP event type
MAINTAINERS: fix a status pattern
MAINTAINERS: fix arch/arm/plat-omap/include/plat/omap_hwmod.h
mm: thp: acquire the anon_vma rwsem for write during split
mm: mmap: annotate vm_lock_anon_vma locking properly for lockdep
lockdep, rwsem: provide down_write_nest_lock()
arch/mn10300/Kconfig: select CONFIG_GENERIC_ATOMIC64
mm: bootmem: fix free_all_bootmem_core() with odd bitmap alignment
mm: use aligned zone start for pfn_to_bitidx calculation
fs/exec.c: work around icc miscompilation
mm: compaction: fix echo 1 > compact_memory return error issue
mm: memblock: fix wrong memmove size in memblock_merge_regions()
drivers/video/ssd1307fb.c: fix bit order bug in the byte translation function
mm: migrate: check page_count of THP before migrating
...
lib/rbtree.c: avoid the use of non-static __always_inline
lib/rbtree.c declared __rb_erase_color() as __always_inline void, and
then exported it with EXPORT_SYMBOL.
This was because __rb_erase_color() must be exported for augmented
rbtree users, but it must also be inlined into rb_erase() so that the
dummy callback can get optimized out of that call site.
(Actually with a modern compiler, none of the dummy callback functions
should even be generated as separate text functions).
The above usage is legal C, but it was unusual enough for some compilers
to warn about it. This change makes things more explicit, with a static
__always_inline ____rb_erase_color function for use in rb_erase(), and a
separate non-inline __rb_erase_color function for use in
rb_erase_augmented call sites.
Signed-off-by: Michel Lespinasse <walken@google.com> Reported-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chen Gang [Fri, 11 Jan 2013 22:32:17 +0000 (14:32 -0800)]
MAINTAINERS: Omar had moved
Signed-off-by: Chen Gang <gang.chen@asianux.com> Cc: Omar Ramirez Luna <omar.ramirez@ti.com> Cc: Omar Ramirez Luna <omar.ramirez@copitl.com> Cc: David Miller <davem@davemloft.net> Cc: Greg KH <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Fri, 11 Jan 2013 22:32:16 +0000 (14:32 -0800)]
mm: compaction: partially revert capture of suitable high-order page
Eric Wong reported on 3.7 and 3.8-rc2 that ppoll() got stuck when
waiting for POLLIN on a local TCP socket. It was easier to trigger if
there was disk IO and dirty pages at the same time and he bisected it to
commit 1fb3f8ca0e92 ("mm: compaction: capture a suitable high-order page
immediately when it is made available").
The intention of that patch was to improve high-order allocations under
memory pressure after changes made to reclaim in 3.6 drastically hurt
THP allocations but the approach was flawed. For Eric, the problem was
that page->pfmemalloc was not being cleared for captured pages leading
to a poor interaction with swap-over-NFS support causing the packets to
be dropped. However, I identified a few more problems with the patch
including the fact that it can increase contention on zone->lock in some
cases which could result in async direct compaction being aborted early.
In retrospect the capture patch took the wrong approach. What it should
have done is mark the pageblock being migrated as MIGRATE_ISOLATE if it
was allocating for THP and avoided races that way. While the patch was
showing to improve allocation success rates at the time, the benefit is
marginal given the relative complexity and it should be revisited from
scratch in the context of the other reclaim-related changes that have
taken place since the patch was first written and tested. This patch
partially reverts commit 1fb3f8ca0e92 ("mm: compaction: capture a
suitable high-order page immediately when it is made available").
Reported-and-tested-by: Eric Wong <normalperson@yhbt.net> Tested-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Fri, 11 Jan 2013 22:32:13 +0000 (14:32 -0800)]
linux/audit.h: move ptrace.h include to kernel header
While the kernel internals want pt_regs (and so it includes
linux/ptrace.h), the user version of audit.h does not need it. So move
the include out of the uapi version.
This avoids issues where people want the audit defines and userland
ptrace api. Including both the kernel ptrace and the userland ptrace
headers can easily lead to failure.
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Fri, 11 Jan 2013 22:32:11 +0000 (14:32 -0800)]
kernel/audit.c: avoid negative sleep durations
audit_log_start() performs the same jiffies comparison in two places.
If sufficient time has elapsed between the two comparisons, the second
one produces a negative sleep duration:
schedule_timeout: wrong timeout value fffffffffffffff0
Pid: 6606, comm: trinity-child1 Not tainted 3.8.0-rc1+ #43
Call Trace:
schedule_timeout+0x305/0x340
audit_log_start+0x311/0x470
audit_log_exit+0x4b/0xfb0
__audit_syscall_exit+0x25f/0x2c0
sysret_audit+0x17/0x21
Fix it by performing the comparison a single time.
Reported-by: Dave Jones <davej@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Fri, 11 Jan 2013 22:32:07 +0000 (14:32 -0800)]
audit: catch possible NULL audit buffers
It's possible for audit_log_start() to return NULL. Handle it in the
various callers.
Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Cc: Jeff Layton <jlayton@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Julien Tinnes <jln@google.com> Cc: Will Drewry <wad@google.com> Cc: Steve Grubb <sgrubb@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Fri, 11 Jan 2013 22:32:05 +0000 (14:32 -0800)]
audit: create explicit AUDIT_SECCOMP event type
The seccomp path was using AUDIT_ANOM_ABEND from when seccomp mode 1
could only kill a process. While we still want to make sure an audit
record is forced on a kill, this should use a separate record type since
seccomp mode 2 introduces other behaviors.
In the case of "handled" behaviors (process wasn't killed), only emit a
record if the process is under inspection. This change also fixes
userspace examination of seccomp audit events, since it was considered
malformed due to missing fields of the AUDIT_ANOM_ABEND event type.
Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Cc: Jeff Layton <jlayton@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Julien Tinnes <jln@google.com> Acked-by: Will Drewry <wad@chromium.org> Acked-by: Steve Grubb <sgrubb@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Beregalov and Alex Xu reported similar bugs and Hillf Danton
identified that commit 5a505085f043 ("mm/rmap: Convert the struct
anon_vma::mutex to an rwsem") and commit 4fc3f1d66b1e ("mm/rmap,
migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable")
were likely the problem. Reverting these commits was reported to solve
the problem for Alexander.
Despite the reason for these commits, NUMA balancing is not the direct
source of the problem. split_huge_page() expects the anon_vma lock to
be exclusive to serialise the whole split operation. Ordinarily it is
expected that the anon_vma lock would only be required when updating the
avcs but THP also uses the anon_vma rwsem for collapse and split
operations where the page lock or compound lock cannot be used (as the
page is changing from base to THP or vice versa) and the page table
locks are insufficient.
This patch takes the anon_vma lock for write to serialise against parallel
split_huge_page as THP expected before the conversion to rwsem.
Reported-and-tested-by: Zhouping Liu <zliu@redhat.com> Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Reported-by: Alex Xu <alex_y_xu@yahoo.ca> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
which is incomplete, and causes the false positive report from lockdep
below.
Annotate the fact that mmap_sem is used as an outter lock to serialize
taking of all the anon_vma rwsems at once no matter the order, using the
down_write_nest_lock() primitive.
This patch fixes this lockdep report:
=============================================
[ INFO: possible recursive locking detected ] 3.8.0-rc2-00036-g5f73896 #171 Not tainted
---------------------------------------------
qemu-kvm/2315 is trying to acquire lock:
(&anon_vma->rwsem){+.+...}, at: mm_take_all_locks+0x149/0x1b0
but task is already holding lock:
(&anon_vma->rwsem){+.+...}, at: mm_take_all_locks+0x149/0x1b0
other info that might help us debug this:
Possible unsafe locking scenario:
Jiri Kosina [Fri, 11 Jan 2013 22:31:56 +0000 (14:31 -0800)]
lockdep, rwsem: provide down_write_nest_lock()
down_write_nest_lock() provides a means to annotate locking scenario
where an outer lock is guaranteed to serialize the order nested locks
are being acquired.
This is analogoue to already existing mutex_lock_nest_lock() and
spin_lock_nest_lock().
Max Filippov [Fri, 11 Jan 2013 22:31:52 +0000 (14:31 -0800)]
mm: bootmem: fix free_all_bootmem_core() with odd bitmap alignment
Currently free_all_bootmem_core ignores that node_min_pfn may be not
multiple of BITS_PER_LONG. Eg commit 6dccdcbe2c3e ("mm: bootmem: fix
checking the bitmap when finally freeing bootmem") shifts vec by lower
bits of start instead of lower bits of idx. Also
if (IS_ALIGNED(start, BITS_PER_LONG) && vec == ~0UL)
assumes that vec bit 0 corresponds to start pfn, which is only true when
node_min_pfn is a multiple of BITS_PER_LONG. Also loop in the else
clause can double-free pages (e.g. with node_min_pfn == start == 1,
map[0] == ~0 on 32-bit machine page 32 will be double-freed).
This bug causes the following message during xtensa kernel boot:
bootmem::free_all_bootmem_core nid=0 start=1 end=8000
BUG: Bad page state in process swapper pfn:00001
page:d04bd020 count:0 mapcount:-127 mapping: (null) index:0x2
page flags: 0x0()
Call Trace:
bad_page+0x8c/0x9c
free_pages_prepare+0x5e/0x88
free_hot_cold_page+0xc/0xa0
__free_pages+0x24/0x38
__free_pages_bootmem+0x54/0x56
free_all_bootmem_core$part$11+0xeb/0x138
free_all_bootmem+0x46/0x58
mem_init+0x25/0xa4
start_kernel+0x11e/0x25c
should_never_return+0x0/0x3be7
The fix is the following:
- always align vec so that its bit 0 corresponds to start
- provide BITS_PER_LONG bits in vec, if those bits are available in the
map
- don't free pages past next start position in the else clause.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Cc: Gavin Shan <shangw@linux.vnet.ibm.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Prasad Koya <prasad.koya@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Laura Abbott [Fri, 11 Jan 2013 22:31:51 +0000 (14:31 -0800)]
mm: use aligned zone start for pfn_to_bitidx calculation
The current calculation in pfn_to_bitidx assumes that (pfn -
zone->zone_start_pfn) >> pageblock_order will return the same bit for
all pfn in a pageblock. If zone_start_pfn is not aligned to
pageblock_nr_pages, this may not always be correct.
Consider the following with pageblock order = 10, zone start 2MB:
This means that calling {get,set}_pageblock_migratetype on a single page
will not set the migratetype for the full block. Fix this by rounding
down zone_start_pfn when doing the bitidx calculation.
For our use case, the effects of this bug were mostly tied to the fact
that CMA allocations would either take a long time or fail to happen.
Depending on the driver using CMA, this could result in anything from
visual glitches to application failures.
Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Acked-by: Mel Gorman <mgorman@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Xi Wang [Fri, 11 Jan 2013 22:31:48 +0000 (14:31 -0800)]
fs/exec.c: work around icc miscompilation
The tricky problem is this check:
if (i++ >= max)
icc (mis)optimizes this check as:
if (++i > max)
The check now becomes a no-op since max is MAX_ARG_STRINGS (0x7FFFFFFF).
This is "allowed" by the C standard, assuming i++ never overflows,
because signed integer overflow is undefined behavior. This
optimization effectively reverts the previous commit 362e6663ef23
("exec.c, compat.c: fix count(), compat_count() bounds checking") that
tries to fix the check.
This patch simply moves ++ after the check.
Signed-off-by: Xi Wang <xi.wang@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lin Feng [Fri, 11 Jan 2013 22:31:44 +0000 (14:31 -0800)]
mm: memblock: fix wrong memmove size in memblock_merge_regions()
The memmove span covers from (next+1) to the end of the array, and the
index of next is (i+1), so the index of (next+1) is (i+2). So the size
of remaining array elements is (type->cnt - (i + 2)).
Since the remaining elements of the memblock array are move forward by
one element and there is only one additional element caused by this bug.
So there won't be any write overflow here but read overflow. It may
read one more element out of the array address if the array happens to
be full. Commonly it doesn't matter at all but if the array happens to
be located at the end a memblock, it may cause a invalid read operation
for the physical address doesn't exist.
There are 2 *happens to be* here, so I think the probability is quite
low, I don't know if any guy is haunted by this bug before.
Mostly I think it's user-invisible.
Signed-off-by: Lin Feng <linfeng@cn.fujitsu.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Maxime Ripard [Fri, 11 Jan 2013 22:31:43 +0000 (14:31 -0800)]
drivers/video/ssd1307fb.c: fix bit order bug in the byte translation function
This was leading to a strange behaviour when using the fbcon driver on
top of this one: the letters were in the right order, but each letter
had a vertical symmetry.
This was because the addressing was right for the byte, but the
addressing of each individual bit was inverted.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Cc: Brian Lilly <brian@crystalfontz.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: Thomas Petazzoni <thomas@free-electrons.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Fri, 11 Jan 2013 22:31:40 +0000 (14:31 -0800)]
mm: migrate: check page_count of THP before migrating
Hugh Dickins pointed out that migrate_misplaced_transhuge_page() does
not check page_count before migrating like base page migration and
khugepage. He could not see why this was safe and he is right.
The potential impact of the bug is avoided due to the limitations of
NUMA balancing. The page_mapcount() check ensures that only a single
address space is using this page and as THPs are typically private it
should not be possible for another address space to fault it in
parallel. If the address space has one associated task then it's
difficult to have both a GUP pin and be referencing the page at the same
time. If there are multiple tasks then a buggy scenario requires that
another thread be accessing the page while the direct IO is in flight.
This is dodgy behaviour as there is a possibility of corruption with or
without THP migration. It would be
While we happen to be safe for the most part it is shoddy to depend on
such "safety" so this patch checks the page count similar to anonymous
pages. Note that this does not mean that the page_mapcount() check can
go away. If we were to remove the page_mapcount() check the the THP
would have to be unmapped from all referencing PTEs, replaced with
migration PTEs and restored properly afterwards.
WARNING: drivers/rtc/rtc-da9055.o(.text+0xa71): Section mismatch in reference from the function da9055_rtc_probe() to the function .init.text:da9055_rtc_device_init()
Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Decotigny [Fri, 11 Jan 2013 22:31:36 +0000 (14:31 -0800)]
lib: cpu_rmap: avoid flushing all workqueues
In some cases, free_irq_cpu_rmap() is called while holding a lock (eg
rtnl). This can lead to deadlocks, because it invokes
flush_scheduled_work() which ends up waiting for whole system workqueue
to flush, but some pending works might try to acquire the lock we are
already holding.
This commit uses reference-counting to replace
irq_run_affinity_notifiers(). It also removes
irq_run_affinity_notifiers() altogether.
[akpm@linux-foundation.org: eliminate free_cpu_rmap, rename cpu_rmap_reclaim() to cpu_rmap_release(), propagate kref_put() retval from cpu_rmap_put()] Signed-off-by: David Decotigny <decot@googlers.com> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 11 Jan 2013 20:09:04 +0000 (12:09 -0800)]
Merge tag 'nfs-for-3.8-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfix from Trond Myklebust:
- Fix a socket lock leak in net/sunrpc/xprt.c
* tag 'nfs-for-3.8-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: Ensure we release the socket write lock if the rpc_task exits early
Linus Torvalds [Fri, 11 Jan 2013 17:05:28 +0000 (09:05 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull intel DRM fixes from Dave Airlie:
"Just intel fixes, including getting the Ironlake systems back to the
state they were in for 3.6."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/i915: Revert shrinker changes from "Track unbound pages"
drm/i915: Use pixel size for computing linear offsets into a sprite
drm/i915: Add DEBUG messages to all intel_create_user_framebuffer error paths
drm/i915: The sprite scaler on Ironlake also support YUV planes
drm: Only evict the blocks required to create the requested hole
drm/i915: Treat crtc->mode.clock == 0 as disabled
Revert "drm/i915: no lvds quirk for Zotac ZDBOX SD ID12/ID13"
drm/i915; Only increment the user-pin-count after successfully pinning the bo
Mel Gorman [Fri, 11 Jan 2013 09:27:01 +0000 (09:27 +0000)]
mm: compaction: Partially revert capture of suitable high-order page
Eric Wong reported on 3.7 and 3.8-rc2 that ppoll() got stuck when
waiting for POLLIN on a local TCP socket. It was easier to trigger if
there was disk IO and dirty pages at the same time and he bisected it to
commit 1fb3f8ca0e92 ("mm: compaction: capture a suitable high-order page
immediately when it is made available").
The intention of that patch was to improve high-order allocations under
memory pressure after changes made to reclaim in 3.6 drastically hurt
THP allocations but the approach was flawed. For Eric, the problem was
that page->pfmemalloc was not being cleared for captured pages leading
to a poor interaction with swap-over-NFS support causing the packets to
be dropped. However, I identified a few more problems with the patch
including the fact that it can increase contention on zone->lock in some
cases which could result in async direct compaction being aborted early.
In retrospect the capture patch took the wrong approach. What it should
have done is mark the pageblock being migrated as MIGRATE_ISOLATE if it
was allocating for THP and avoided races that way. While the patch was
showing to improve allocation success rates at the time, the benefit is
marginal given the relative complexity and it should be revisited from
scratch in the context of the other reclaim-related changes that have
taken place since the patch was first written and tested. This patch
partially reverts commit 1fb3f8ca "mm: compaction: capture a suitable
high-order page immediately when it is made available".
Reported-and-tested-by: Eric Wong <normalperson@yhbt.net> Tested-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Thu, 10 Jan 2013 01:13:00 +0000 (17:13 -0800)]
seq_file: fix new kernel-doc warnings
Fix kernel-doc warnings in fs/seq_file.c:
Warning(fs/seq_file.c:304): No description found for parameter 'whence'
Warning(fs/seq_file.c:304): Excess function parameter 'origin' description in 'seq_lseek'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Thu, 10 Jan 2013 01:12:45 +0000 (17:12 -0800)]
audit: fix auditfilter.c kernel-doc warnings
Fix new kernel-doc warning in auditfilter.c:
Warning(kernel/auditfilter.c:1157): Excess function parameter 'uid' description in 'audit_receive_filter'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Cc: linux-audit@redhat.com (subscribers-only) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Thu, 10 Jan 2013 01:12:35 +0000 (17:12 -0800)]
nfs: fix sunrpc/clnt.c kernel-doc warnings
Fix new kernel-doc warnings in clnt.c:
Warning(net/sunrpc/clnt.c:561): No description found for parameter 'flavor'
Warning(net/sunrpc/clnt.c:561): Excess function parameter 'auth' description in 'rpc_clone_client_set_auth'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: linux-nfs@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Airlie [Thu, 10 Jan 2013 21:47:25 +0000 (07:47 +1000)]
Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel
Daniel writes:
"Pretty much all just major fixes:
- 2 pieces of duct-tape for the ilk bug.
- Sprite regression fixes from Chris.
- OOPS fix for a div-by-zero from Chris, regression due to the modeset
rework in 3.7, now brought to light by a benign change in 3.8.
- Fix interrupted bo pinning, used to work around CS coherency issues on
i830/i845 (kernel also has a w/a newly in 3.8, but pinning is more efficient if
possible)."
Linus Torvalds [Thu, 10 Jan 2013 17:09:41 +0000 (09:09 -0800)]
Merge branch 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86
Pull x86 platform driver bugfixes from Matthew Garrett.
* 'for_linus' of git://cavan.codon.org.uk/platform-drivers-x86:
asus-laptop: Fix potential invalid pointer dereference
Update MAINTAINERS entry
asus-laptop: Do not call HWRS on init
sony-laptop: fix SNC buffer calls when SN06 returns Integers
samsung-laptop: Add quirk for broken acpi_video backlight on N250P
acer-wmi: add Aspire 5741G touchpad toggle key
acer-wmi: change to emit touchpad on off key
acer-wmi: fix obj is NULL but dereferenced
MAINTAINERS: change the mail address of acer-wmi/msi-laptop maintainer
Linus Torvalds [Thu, 10 Jan 2013 17:05:18 +0000 (09:05 -0800)]
Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM bugfixes from Marcelo Tosatti.
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: use dynamic percpu allocations for shared msrs area
KVM: PPC: Book3S HV: Fix compilation without CONFIG_PPC_POWERNV
powerpc: Corrected include header path in kvm_para.h
Add rcu user eqs exception hooks for async page fault
Linus Torvalds [Thu, 10 Jan 2013 17:03:16 +0000 (09:03 -0800)]
Merge tag 'trace-3.8-rc2-regression-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing regression fix from Steven Rostedt:
"A change that came in this merge window broke the writing to the
trace_options file. It causes garbage to be read during the compare
of option names, and breaks setting options via the trace_options
file, although options can still be set via the options/<option>
files."
* tag 'trace-3.8-rc2-regression-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix regression of trace_options file setting
Closer inspection of that patch revealed a bunch of unrelated changes
in the shrinker:
- The shrinker count is now in pages instead of objects.
- For counting the shrinkable objects the old code only looked at the
inactive list, the new code looks at all bounds objects (including
pinned ones). That is obviously in addition to the new unbound list.
- The shrinker cound is no longer scaled with
sysctl_vfs_cache_pressure. Note though that with the default tuning
value of vfs_cache_pressue = 100 this doesn't affect the shrinker
behaviour.
- When actually shrinking objects, the old code first dropped
purgeable objects, then normal (inactive) objects. Only then did it,
in a last-ditch effort idle the gpu and evict everything. The new
code omits the intermediate step of evicting normal inactive
objects.
Safe for the first change, which seems benign, and the shrinker count
scaling, which is a bit a different story, the endresult of all these
changes is that the shrinker is _much_ more likely to fall back to the
last-ditch resort of idling the gpu and evicting everything. The old
code could only do that if something else evicted lots of objects
meanwhile (since without any other changes the nr_to_scan will be
smaller than the object count).
Reverting the vfs_cache_pressure behaviour itself is a bit bogus: Only
dentry/inode object caches should scale their shrinker counts with
vfs_cache_pressure. Originally I've had that change reverted, too. But
Chris Wilson insisted that it's too bogus and shouldn't again see the
light of day.
Hence revert all these other changes and restore the old shrinker
behaviour, with the minor adjustment that we now first scan the
unbound list, then the inactive list for each object category
(purgeable or normal).
A similar patch has been tested by a few people affected by the gen4/5
hangs which started to appear in 3.7, which some people bisected to
the "drm/i915: Track unbound pages" commit. But just disabling the
unbound logic alone didn't change things at all.
Note that this patch doesn't fix the referenced bugs, it only hides
the underlying bug(s) well enough to restore pre-3.7 behaviour. The
key to achieve that is to massively reduce the likelyhood of going
into a full gpu stall and evicting everything.
v2: Reword commit message a bit, taking Chris Wilson's comment into
account.
v3: On Chris Wilson's insistency, do not reinstate the rather bogus
vfs_cache_pressure change.
Tested-by: Greg KH <gregkh@linuxfoundation.org> Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=55984
References: https://bugs.freedesktop.org/show_bug.cgi?id=57122
References: https://bugs.freedesktop.org/show_bug.cgi?id=56916
References: https://bugs.freedesktop.org/show_bug.cgi?id=57136 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Linus Torvalds [Thu, 10 Jan 2013 16:20:15 +0000 (08:20 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 patches from Martin Schwidefsky:
"Add the finit_module system call, fix the irq statistics in
/proc/stat, fix a s390dbf lockdep problem, a patch revert for a
problem that is not 100% understood yet, and a few patches to
fix warnings."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/pci: define read*_relaxed functions
s390/topology: export cpu_topology
s390/pm: export pm_power_off
s390/pci: define isa_dma_bridge_buggy
s390/3215: partially revert tty close handling fix
s390/irq: count cpu restart events
s390/irq: remove split irq fields from /proc/stat
s390/irq: enable irq sum accounting for /proc/stat again
s390/syscalls: wire up finit_module syscall
s390/pci: remove dead code
s390/smp: fix section mismatch for smp_add_present_cpu()
s390/debug: Fix s390dbf lockdep problem in debug_(un)register_view()
Guenter Roeck [Mon, 31 Dec 2012 08:04:59 +0000 (00:04 -0800)]
hwmon: (vexpress) Fix build error seen if CONFIG_OF_DEVICE is not set
Fix:
vexpress.c: In function ‘vexpress_hwmon_name_show’:
vexpress.c:34:2: error: implicit declaration of function
‘of_get_property’ [-Werror=implicit-function-declaration]
vexpress.c:34:27: warning: initialization makes pointer from integer
without a cast [enabled by default]
vexpress.c: In function ‘vexpress_hwmon_label_show’:
vexpress.c:43:22: warning: initialization makes pointer from integer
without a cast [enabled by default]
Seen if CONFIG_OF_DEVICE is not defined. of_get_property is declared in
of.h which is only included by of_device.h if CONFIG_OF_DEVICE is defined.
of.h needs to be included directly.
Steven Rostedt [Thu, 10 Jan 2013 01:54:17 +0000 (20:54 -0500)]
tracing: Fix regression of trace_options file setting
The latest change to allow trace options to be set on the command
line also broke the trace_options file.
The zeroing of the last byte of the option name that is echoed into
the trace_option file was removed with the consolidation of some
of the code. The compare between the option and what was written to
the trace_options file fails because the string holding the data
written doesn't terminate with a null character.
A zero needs to be added to the end of the string copied from
user space.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Linus Torvalds [Wed, 9 Jan 2013 16:58:57 +0000 (08:58 -0800)]
Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM fixes from Russell King.
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7616/1: cache-l2x0: aurora: Use writel_relaxed instead of writel
ARM: 7615/1: cache-l2x0: aurora: Invalidate during clean operation with WT enable
ARM: 7614/1: mm: fix wrong branch from Cortex-A9 to PJ4b
ARM: 7612/1: imx: Do not select some errata that depends on !ARCH_MULTIPLATFORM
ARM: 7611/1: VIC: fix bug in VIC irqdomain code
ARM: 7610/1: versatile: bump IRQ numbers
ARM: 7609/1: disable errata work-arounds which access secure registers
ARM: 7608/1: l2x0: Only set .set_debug on PL310 r3p0 and earlier
Linus Torvalds [Wed, 9 Jan 2013 16:43:56 +0000 (08:43 -0800)]
Merge tag 'edac_fixes_for_3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
Pull EDAC fixes from Borislav Petkov:
"Two error path fixes causing a crash and a Kconfig fix for an issue
which spilled all EDAC suboptions into the 'Device Drivers' menu."
* tag 'edac_fixes_for_3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC: Cleanup device deregistering path
EDAC: Fix EDAC Kconfig menu
EDAC: Fix kernel panic on module unloading
The check for a pmd being in the process of being split was dropped by
mistake by commit d10e63f29488 ("mm: numa: Create basic numa page
hinting infrastructure"). Put it back.
Reported-by: Dave Jones <davej@redhat.com> Debugged-by: Hillf Danton <dhillf@gmail.com> Acked-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Kirill Shutemov <kirill@shutemov.name> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Marc Dionne [Wed, 9 Jan 2013 14:16:30 +0000 (14:16 +0000)]
cred: Remove tgcred pointer from struct cred
Commit 3a50597de863 ("KEYS: Make the session and process keyrings
per-thread") removed the definition of the thread_group_cred structure,
but left a now unused pointer in struct cred.
Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Wilson [Wed, 19 Dec 2012 12:14:22 +0000 (12:14 +0000)]
drm/i915: Use pixel size for computing linear offsets into a sprite
This fixes an original bug in the sprite code that miscomputed the
source offset into a linear YUV packed framebuffer, that was magnified
into an oops with
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.com> Cc: Damien Lespiau <damien.lespiau@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Linus Torvalds [Wed, 9 Jan 2013 02:53:56 +0000 (18:53 -0800)]
Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"People are back from the holiday breaks, and it shows. Here are a
bunch of fixes for a number of platforms:
- A couple of small fixes for Nomadik
- A larger set of changes for kirkwood/mvebu
- uart driver selection, dt clocks, gpio-poweroff fixups, a few
__init annotation fixes and some error handling improvement in
their xor dma driver.
- i.MX had a couple of minor fixes (and a critical one for flexcan2
clock setup)
- MXS has a small board fix and a framebuffer bugfix
- A set of fixes for Samsung Exynos, fixing default bootargs and some
Exynos5440 clock issues
- A set of OMAP changes including PM fixes and a few sparse warning
fixups
All in all a bit more positive code delta than we'd ideally want to
see here, mostly from the OMAP PM changes, but nothing overly crazy."
* tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (44 commits)
ARM: clps711x: Fix bad merge of clockevents setup
ARM: highbank: save and restore L2 cache and GIC on suspend
ARM: highbank: add a power request clear
ARM: highbank: fix secondary boot and hotplug
ARM: highbank: fix typos with hignbank in power request functions
ARM: dts: fix highbank cpu mpidr values
ARM: dts: add device_type prop to cpu nodes on Calxeda platforms
ARM: mx5: Fix MX53 flexcan2 clock
ARM: OMAP2+: am33xx-hwmod: Fix wrongly terminated am33xx_usbss_mpu_irqs array
pinctrl: mvebu: make pdma clock on dove mandatory
ARM: Dove: Add pinctrl clock to DT
dma: mv_xor: fix error handling for clocks
dma: mv_xor: fix error handling of mv_xor_channel_add()
arm: mvebu: Add missing ; for cpu node.
arm: mvebu: Armada XP MV78230 has only three Ethernet interfaces
arm: mvebu: Armada XP MV78230 has two cores, not one
clk: mvebu: Remove inappropriate __init tagging
ARM: Kirkwood: Use fixed-regulator instead of board gpio call
ARM: Kirkwood: Fix missing sdio clock
ARM: Kirkwood: Switch TWSI1 of 88f6282 to DT clock providers
...
Linus Torvalds [Wed, 9 Jan 2013 00:08:10 +0000 (16:08 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm update from Dave Airlie:
"Exynos and Radeon mostly, with a dma-buf and ttm fix thrown in.
It's a bit big but its mostly exynos license fix ups and I'd rather
not hold those up since its legally stuff.
Radeon has a couple of fixes from dma engine work, TTM is just a
locking fix, and dma-buf fix has been hanging around and I finally got
a chance to review it."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (30 commits)
drm/ttm: fix fence locking in ttm_buffer_object_transfer
drm/prime: drop reference on imported dma-buf come from gem
drm/radeon: add quirk for d3 delay during switcheroo poweron for apple macbooks
drm/exynos: move finish page flip to a common place
drm/exynos: fimd: modify condition in fimd resume
drm/radeon: fix DMA CS parser for r6xx linear copy packet
drm/radeon: split r6xx and r7xx copy_dma functions
drm/exynos: Use devm_clk_get in exynos_drm_gsc.c
drm/exynos: Remove redundant NULL check in exynos_drm_gsc.c
drm/exynos: Remove explicit freeing using devm_* APIs in exynos_drm_gsc.c
drm/exynos: Use devm_clk_get in exynos_drm_rotator.c
drm/exynos: Remove redundant NULL check in exynos_drm_rotator.c
drm/exynos: Remove unnecessary devm_* freeing APIs in exynos_drm_rotator.c
drm/exynos: Use devm_clk_get in exynos_drm_fimc.c
drm/exynos: Remove redundant NULL check
drm/exynos: Remove explicit freeing using devm_* APIs in exynos_drm_fimc.c
drm/exynos: Use devm_kzalloc in exynos_drm_ipp.c
drm/exynos: fix gem buffer allocation type checking
drm/exynos: remove needless parenthesis.
drm/exynos: fix incorrect interrupt induced by m2m operation.
...
Trond Myklebust [Mon, 7 Jan 2013 19:30:46 +0000 (14:30 -0500)]
SUNRPC: Ensure we release the socket write lock if the rpc_task exits early
If the rpc_task exits while holding the socket write lock before it has
allocated an rpc slot, then the usual mechanism for releasing the write
lock in xprt_release() is defeated.
The problem occurs if the call to xprt_lock_write() initially fails, so
that the rpc_task is put on the xprt->sending wait queue. If the task
exits after being assigned the lock by __xprt_lock_write_func, but
before it has retried the call to xprt_lock_and_alloc_slot(), then
it calls xprt_release() while holding the write lock, but will
immediately exit due to the test for task->tk_rqstp != NULL.
Olof Johansson [Tue, 8 Jan 2013 17:49:50 +0000 (09:49 -0800)]
Merge tag 'omap-for-v3.8-rc2/fixes-signed-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
From Tony Lindgren:
The biggest change is a fix to deal with different power state
on omap2 registers that causes issues trying to use common PM code.
Also fix few incorrect registers, and an issue for omap1 USB, and
few sparse fixes for issues that sneaked in with all the clean-up.
Olof Johansson [Tue, 8 Jan 2013 17:42:52 +0000 (09:42 -0800)]
Merge branch 'v3.8-samsung-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into fixes
From Kukjin Kim:
Most of them are EXYNOS5440 fixes which are for changing uart console,
cpu id (typo) and silent complaining gpio error in kernel boot.
* 'v3.8-samsung-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
ARM: EXYNOS: skip the clock initialization for exynos5440
ARM: EXYNOS: enable PINCTRL for EXYNOS5440
ARM: dts: use uart port1 for console on exynos4210-smdkv310
ARM: dts: use uart port0 for console on exynos5440-ssdk5440
ARM: SAMSUNG: fix the cpu id for EXYNOS5440
ARM: EXYNOS: Revise HDMI resource size
Olof Johansson [Tue, 8 Jan 2013 16:39:27 +0000 (08:39 -0800)]
Merge tag 'mxs-fixes-3.8' of git://git.linaro.org/people/shawnguo/linux-2.6 into fixes
From Shawn Guo:
I have to send one critical mxsfb fix through arm-soc, as FB maintainer
is unresponsive for quite a while. People start complaining the missing
of such an important fix.
* tag 'mxs-fixes-3.8' of git://git.linaro.org/people/shawnguo/linux-2.6:
video: mxsfb: fix crash when unblanking the display
ARM: dts: imx23-olinuxino: Fix IOMUX settings
Olof Johansson [Tue, 8 Jan 2013 16:39:00 +0000 (08:39 -0800)]
Merge tag 'imx-fixes-3.8' of git://git.linaro.org/people/shawnguo/linux-2.6 into fixes
From Shawn Guo:
It includes one critical fix - wrong flexcan2 clock will hang system
when the port gets brought up. The other two are non-critical fixes,
which are sent together here, since it's still early -rc stage.
* tag 'imx-fixes-3.8' of git://git.linaro.org/people/shawnguo/linux-2.6:
ARM: mx5: Fix MX53 flexcan2 clock
ARM: dts: imx31-bug: Fix manufacturer compatible string
clk: imx: Remove 'clock-output-names' from the examples
Linus Torvalds [Tue, 8 Jan 2013 15:33:41 +0000 (07:33 -0800)]
Merge tag 'sound-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Nothing too exciting here, just a few regression and trivial fixes,
and new quirks for HD-audio and USB-audio.
- HD-audio mute LED mode enum fix
- Fix kernel panic of Digidesign Mbox2 usb-audio quirk (which was new
in 3.8-rc1)
- Creative BT-D1 usb-audio quirk
- mute LED fixup for HP Pavillion 17 laptop"
* tag 'sound-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - add mute LED for HP Pavilion 17 (Realtek codec)
ALSA: au88x0: fix incorrect left shift
sound: oss/pas2: Fix possible access out of array
ALSA: usb-audio: Fix kernel panic of Digidesign Mbox2 quirk
ALSA: usb-audio: Add support for Creative BT-D1 via usb sound quirks
ALSA: hda - Switch "On" and "Off" for "Mute-LED Mode" kcontrol
7) Fix high order page allocation failures in netfilter xt_recent, from
Eric Dumazet.
8) mac802154 needs to use netif_rx_ni() instead of netif_rx() because
mac802154_process_data() can execute in process rather than
interrupt context. From Alexander Aring.
9) Fix splice handling of MSG_SENDPAGE_NOTLAST, otherwise we elide one
tcp_push() too many. From Eric Dumazet and Willy Tarreau.
10) Fix skb->truesize tracking in XEN netfront driver, from Ian
Campbell.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits)
xen/netfront: improve truesize tracking
ipv4: fix NULL checking in devinet_ioctl()
tcp: fix MSG_SENDPAGE_NOTLAST logic
net/ipv4/ipconfig: really display the BOOTP/DHCP server's address.
ip-sysctl: fix spelling errors
mac802154: fix NOHZ local_softirq_pending 08 warning
ipv6: document ndisc_notify in networking/ip-sysctl.txt
ath9k: Fix Kconfig for ATH9K_HTC
netfilter: xt_recent: avoid high order page allocations
netfilter: fix missing dependencies for the NOTRACK target
netfilter: ip6t_NPT: fix IPv6 NTP checksum calculation
bridge: add empty br_mdb_init() and br_mdb_uninit() definitions.
vxlan: allow live mac address change
bridge: Correctly unregister MDB rtnetlink handlers
brcmfmac: fix parsing rsn ie for ap mode.
brcmsmac: add copyright information for Canonical
rtlwifi: rtl8723ae: Fix warning for unchecked pci_map_single() call
rtlwifi: rtl8192se: Fix warning for unchecked pci_map_single() call
rtlwifi: rtl8192de: Fix warning for unchecked pci_map_single() call
rtlwifi: rtl8192ce: Fix warning for unchecked pci_map_single() call
...
Chris Wilson [Wed, 19 Dec 2012 16:51:06 +0000 (16:51 +0000)]
drm: Only evict the blocks required to create the requested hole
Avoid clobbering adjacent blocks if they happen to expire earlier and
amalgamate together to form the requested hole.
In passing this fixes a regression from
commit ea7b1dd44867e9cd6bac67e7c9fc3f128b5b255c
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Fri Feb 18 17:59:12 2011 +0100
drm: mm: track free areas implicitly
which swaps the end address for size (with a potential overflow) and
effectively causes the eviction code to clobber almost all earlier
buffers above the evictee.
v2: Check the original hole not the adjusted as the coloring may confuse
us when later searching for the overlapping nodes. Also make sure that
we do apply the range restriction and color adjustment in the same
order for both scanning, searching and insertion.
v3: Send the version that was actually tested.
Note that this seems to be ducttape of decent quality ot paper over
some of our unbind related gpu hangs reported since 3.7. It is not
fully effective though, and certainly doesn't fix the underlying bug.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
[danvet: Added note plus bugzilla link and tested-by.] Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55984 Tested-by: Norbert Preining <preining@logic.at> Acked-by: Dave Airlie <airlied@gmail.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Heiko Carstens [Mon, 7 Jan 2013 12:56:17 +0000 (13:56 +0100)]
s390/pm: export pm_power_off
Export pm_power_off symbol. Needed by at least one of the new device
drivers that come with CONFIG_PCI.
And all other architectures export that symbol as well.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Mon, 7 Jan 2013 12:52:53 +0000 (13:52 +0100)]
s390/pci: define isa_dma_bridge_buggy
Define isa_dma_bridge_buggy. Needed to make pci quirks compile:
drivers/pci/quirks.c: In function ‘quirk_isa_dma_hangs’:
drivers/pci/quirks.c:88:7: error: ‘isa_dma_bridge_buggy’ undeclared (first use in this function)
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Thu, 3 Jan 2013 13:31:53 +0000 (14:31 +0100)]
s390/3215: partially revert tty close handling fix
Partially revert ae289dc1f "s390/3215: fix tty close handling", since this
leads sometimes to hanging agetty processes and therefore systems that get
stuck while starting.
This was magically fixed (bisected) by a common code patch from Alan Cox: 36b3c070 "tty: Move the handling of the tty release logic", however it was
unrelated.
Since the removed code worked for a decade, nobody knows anymore why it was
in there in the first place and debugging the observed hang is non-trivial
(at least for me :) ), let's just re-add the removed code before we see
other side effects.
Heiko Carstens [Wed, 2 Jan 2013 14:18:18 +0000 (15:18 +0100)]
s390/irq: remove split irq fields from /proc/stat
Now that irq sum accounting for /proc/stat's "intr" line works again we
have the oddity that the sum field (first field) contains only the sum
of the second (external irqs) and third field (I/O interrupts).
The reason for that is that these two fields are already sums of all other
fields. So if we would sum up everything we would count every interrupt
twice.
This is broken since the split interrupt accounting was merged two years
ago: 052ff461c8427629aee887ccc27478fc7373237c "[S390] irq: have detailed
statistics for interrupt types".
To fix this remove the split interrupt fields from /proc/stat's "intr"
line again and only have them in /proc/interrupts.
This restores the old behaviour, seems to be the only sane fix and mimics
a behaviour from other architectures where /proc/interrupts also contains
more than /proc/stat's "intr" line does.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Wed, 2 Jan 2013 13:01:23 +0000 (14:01 +0100)]
s390/irq: enable irq sum accounting for /proc/stat again
For more than two years, since f2c66cd8eeddedb440f33bc0f5cec1ed7ae376cb
"/proc/stat: scalability of irq num per cpu" the output of /proc/stat is
broken.
The first field in the "intr" line should contain the sum of all interrupts,
however since the above mentioned change it is always zero.
The reason for that is that a per cpu irq sum variable had been introduced
which got incremented when calling kstat_incr_irqs_this_cpu(). However
on s390 we directly incremented only the per cpu per irq counter by accessing
the array element via kstat_cpu(smp_processor_id()).irqs[...].
So fix this and use the kstat_incr_irqs_this_cpu() wrapper which increments
both: the per cpu per irq counter and the per cpu irq sum counter.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Fri, 28 Dec 2012 12:15:36 +0000 (13:15 +0100)]
s390/pci: remove dead code
Get rid of these:
arch/s390/pci/pci_dma.c:16:29: warning: ‘zpci_ioat_dt’ defined but not used [-Wunused-variable]
arch/s390/pci/pci.c:164:12: warning: ‘zpci_store_fib’ defined but not used [-Wunused-function]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Thu, 27 Dec 2012 13:03:36 +0000 (14:03 +0100)]
s390/smp: fix section mismatch for smp_add_present_cpu()
Fixes this section mismatch:
WARNING: vmlinux.o(.text+0x145e4): Section mismatch in reference from the function
smp_add_present_cpu() to the function .cpuinit.text:register_cpu()
The function smp_add_present_cpu() references
the function __cpuinit register_cpu().
This is often because smp_add_present_cpu lacks a __cpuinit
annotation or the annotation of register_cpu is wrong.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Holzheu [Thu, 20 Dec 2012 16:30:52 +0000 (17:30 +0100)]
s390/debug: Fix s390dbf lockdep problem in debug_(un)register_view()
The debug_register/unregister_view() functions call debugfs_remove()
while holding the debug_info spinlock. Because debugfs_remove() takes
a mutex and therefore can sleep this is not allowed. To fix the problem
we give up the debug_info lock before calling debugfs_remove().
The following shows the lockdep message:
[ INFO: possible circular locking dependency detected ]
-------------------------------------------------------
rmmod/4379 is trying to acquire lock:
(&sb->s_type->i_mutex_key#2){+.+.+.}, at: [<00000000003acae2>] debugfs_remove+0x5e/0xa
but task is already holding lock:
(&(&rc->lock)->rlock){-.-...}, at: [<000000000010a5ae>] debug_unregister_view+0x3a/0xd
Daniel Vetter [Tue, 18 Dec 2012 21:25:11 +0000 (22:25 +0100)]
drm/ttm: fix fence locking in ttm_buffer_object_transfer
Noticed while reviewing the fence locking in the radeon pageflip
handler.
v2: Instead of grabbing the bdev->fence_lock in object_transfer just
move the single callsite of that function a few lines, so that it is
protected by the fence_lock. Suggested by Jerome Glisse.
v3: Fix typo in commit message.
Reviewed-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>
Olof Johansson [Tue, 8 Jan 2013 05:09:17 +0000 (21:09 -0800)]
Merge tag 'mvebu_fixes_for_v3.8' of git://git.infradead.org/users/jcooper/linux into fixes
From Jason Cooper:
fixes for mvebu/kirkwood v3.8
- use correct uart driver for mvebu boards
- add a missing DT clocks
- gpio-poweroff level vs. edge triggering, use gpio_is_valid()
- remove an inappropriate __init, modules need to access function.
- various DT fixes
- error handling in mv_xor
* tag 'mvebu_fixes_for_v3.8' of git://git.infradead.org/users/jcooper/linux:
pinctrl: mvebu: make pdma clock on dove mandatory
ARM: Dove: Add pinctrl clock to DT
dma: mv_xor: fix error handling for clocks
dma: mv_xor: fix error handling of mv_xor_channel_add()
arm: mvebu: Add missing ; for cpu node.
arm: mvebu: Armada XP MV78230 has only three Ethernet interfaces
arm: mvebu: Armada XP MV78230 has two cores, not one
clk: mvebu: Remove inappropriate __init tagging
ARM: Kirkwood: Use fixed-regulator instead of board gpio call
ARM: Kirkwood: Fix missing sdio clock
ARM: Kirkwood: Switch TWSI1 of 88f6282 to DT clock providers
Power: gpio-poweroff: Fix documentation and gpio_is_valid
ARM: Kirkwood: Fix missing clk for USB device.
arm: mvebu: Use dw-apb-uart instead of ns16650 as UART driver
Olof Johansson [Sat, 5 Jan 2013 02:45:08 +0000 (18:45 -0800)]
Merge tag 'nomadik-fixes-for-arm-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-nomadik into fixes
From Linus Walleij:
Two fixes to the Nomadik:
- Delete a dangling include
- Bump IRQ numbers to offset at 32
* tag 'nomadik-fixes-for-arm-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-nomadik:
ARM: nomadik: bump the IRQ numbers again
ARM: nomadik: delete dangling include