When reading/writing a subpage (When HW ECC is not available/enabled)
for number of bytes not aligned to 4, the mis-aligned bytes are handled
first (by cpu copy method) before enabling the Prefetch engine to/from
'p'(start of buffer 'buf'). Then it reads/writes rest of the bytes with
the help of Prefetch engine, if available, or again using cpu copy method.
Currently, reading/writing of rest of bytes, is not done correctly since
its trying to read/write again to/from begining of buffer 'buf',
overwriting the mis-aligned bytes.
Read & write using prefetch engine got broken in commit '2c01946c'.
We never hit a scenario of not getting 'gpmc_prefetch_enable' call
success. So, problem did not get caught up.
In commit c7b28e25cb9beb943aead770ff14551b55fa8c79 the initialization of
the backblockbits was accidentally removed. This patch returns it back,
because otherwise some NAND drivers are broken.
This problem was reported by "Saxena, Parth <parth.saxena@ti.com>" here:
http://lists.infradead.org/pipermail/linux-mtd/2011-April/035221.html
Currently mtdconcat is broken for NAND. An attemtpt to create
JFFS2 filesystem on concatenation of several NAND devices fails
with OOB write errors. This patch fixes that problem.
Signed-off-by: Felix Radensky <felix@embedded-sol.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
9fd097b149 (block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe
drivers) removed DISK_EVENT_MEDIA_CHANGE from legacy/fringe block
drivers which have inadequate ->check_events(). Combined with earlier
change 7c88a168da (block: don't propagate unlisted DISK_EVENTs to
userland), this enables using ->check_events() for internal processing
while avoiding enabling in-kernel block event polling which can lead
to infinite event loop.
Unfortunately, this made many drivers including floppy without any bit
set in disk->events and ->async_events in which case disk_add_events()
simply skipped allocation of disk->ev, which disables whole event
handling. As ->check_events() is still used during open processing
for revalidation, this can lead to open failure.
This patch always allocates disk->ev if ->check_events is implemented.
In the long term, it would make sense to simply include the event
structure inline into genhd as it's now used by virtually all block
devices.
blk_cleanup_queue() calls elevator_exit() and after this, we can't
touch the elevator without oopsing. __elv_next_request() must check
for this state because in the refcounted queue model, we can still
call it after blk_cleanup_queue() has been called.
This was reported as causing an oops attributable to scsi.
In some cases we would end up stacking discard_zeroes_data incorrectly.
Fix this by enabling the feature by default for stacking drivers and
clearing it for low-level drivers. Incorporating a device that does not
support dzd will then cause the feature to be disabled in the stacking
driver.
Also ensure that the maximum discard value does not overflow when
exported in sysfs and return 0 in the alignment and dzd fields for
devices that don't support discard.
Reported-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In some drives, flush requests are non-queueable. When flush request is
running, normal read/write requests can't run. If block layer dispatches
such request, driver can't handle it and requeue it. Tejun suggested we
can hold the queue when flush is running. This can avoid unnecessary
requeue. Also this can improve performance. For example, we have
request flush1, write1, flush 2. flush1 is dispatched, then queue is
hold, write1 isn't inserted to queue. After flush1 is finished, flush2
will be dispatched. Since disk cache is already clean, flush2 will be
finished very soon, so looks like flush2 is folded to flush1.
flush request isn't queueable in some drives. Add a flag to let driver
notify block layer about this. We can optimize flush performance with the
knowledge.
02e352287a4 (block: rescan partitions on invalidated devices on
-ENOMEDIA too) relocated partition rescan above explicit bd_set_size()
to simplify condition check. As rescan_partitions() does its own bdev
size setting, this doesn't break anything; however,
rescan_partitions() prints out the following messages when adjusting
bdev size, which can be confusing.
sda: detected capacity change from 0 to 146815737856
sdb: detected capacity change from 0 to 146815737856
This patch restores the original order and remove the warning
messages.
stable: Please apply together with 02e352287a4 (block: rescan
partitions on invalidated devices on -ENOMEDIA too).
Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Tony Luck <tony.luck@gmail.com> Tested-by: Tony Luck <tony.luck@gmail.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
In the bio completion routine, we should not be setting
PageUptodate at all -- it's set at sys_write() time, and is
unaffected by success/failure of the write to disk.
This can cause a page corruption bug when the file system's
block size is less than the architecture's VM page size.
if we have only written a single block -- we might end up
setting the page's PageUptodate flag, indicating that page
is completely read into memory, which may not be true.
This could cause subsequent reads to get bad data.
This commit also takes the opportunity to clean up error
handling in ext4_end_bio(), and remove some extraneous code:
- fixes ext4_end_bio() to set AS_EIO in the
page->mapping->flags on error, which was left out by
mistake. This is needed so that fsync() will
return an error if there was an I/O error.
- remove the clear_buffer_dirty() call on unmapped
buffers for each page.
- consolidate page/buffer error handling in a single
section.
The function iwl_is_any_associated() was intended
to check both contexts, but due to an oversight
it only checks the BSS context. This leads to a
problem with scanning since the passive dwell
time isn't restricted appropriately and a scan
that includes passive channels will never finish
if only the PAN context is associated since the
default dwell time of 120ms won't fit into the
normal 100 TU DTIM interval.
Fix the function by using for_each_context() and
also reorganise the other functions a bit to take
advantage of each other making the code easier to
read.
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Commit 0837e3242c73566fc1c0196b4ec61779c25ffc93 fixes a situation on POWER7
where events can roll back if a specualtive event doesn't actually complete.
This can raise a performance monitor exception. We need to catch this to ensure
that we reset the PMC. In all cases the PMC will be less than 256 cycles from
overflow.
This patch lifts Anton's fix for the problem in perf and applies it to oprofile
as well.
Signed-off-by: Eric B Munson <emunson@mgebm.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The XB113 cards are single band, 5 GHz-only, but the
default settings were configured to assume it was dual
band. Users of these cards then would see 2.4 GHz channels
but you would never get any scan results from these channels
given that the radio is not present.
Cc: Fiona Cain <Fiona.Cain@atheros.com> Cc: Ray Li <ray.li@greenwavereality.com> Cc: Kathy Giori <kathy.giori@atheros.com> Cc: Aeolus Yang <aeolus.yang@atheros.com> Cc: Dan Friedman <dan.friedman@atheros.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
With AR9003 at about ~ 10 feet from an AP that uses RTS / CTS you
will be able to associate but not not get data through given that
the power for the rates used was set too low. This increases the
power and permits data connectivity at longer distances from
access points when connected with HT40. Without this you will not
get any data through when associated to APs configured in HT40
at about more than 10 feet away.
Cc: Fiona Cain <fcain@atheros.com> Cc: Zhen Xie <Zhen.Xie@Atheros.com> Cc: Kathy Giori <kathy.giori@atheros.com> Cc: Neha Choksi <neha.choksi@atheros.com> Cc: Wayne Daniel <wayne.daniel@atheros.com> Cc: Gaurav Jauhar <gaurav.jauhar@atheros.com> Cc: Samira Naraghi <samira.naraghi@atheros.com> CC: Ashok Chennupati <ashok.chennupati@atheros.com> Cc: Lance Zimmerman <lance.zimmerman@atheros.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
At present the noise floor calibration is processed in supported
control and extension chains rather than required chains.
Unnccesarily doing nfcal in all supported chains leads to
invalid nf readings on extn chains and these invalid values
got updated into history buffer. While loading those values
from history buffer is moving the chip to deaf state.
This issue was observed in AR9002/AR9003 chips while doing
associate/dissociate in HT40 mode and interface up/down
in iterative manner. After some iterations, the chip was moved
to deaf state. Somehow the pci devices are recovered by poll work
after chip reset. Raading the nf values in all supported extension chains
when the hw is not yet configured in HT40 mode results invalid values.
Signed-off-by: Rajkumar Manoharan <rmanoharan@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
CPUID leaf 7, subleaf 0 returns the maximum subleaf in EAX, not the
number of subleaves. Since so far only subleaf 0 is defined (and only
the EBX bitfield) we do not need to qualify the test.
Commit 1fc711f7ffb01089efc58042cfdbac8573d1b59a (powerpc/kexec: Fix race
in kexec shutdown) moved the write to signal the cpu had exited the kernel
from before the transition to real mode in kexec_smp_wait to kexec_wait.
Unfornately it missed that kexec_wait is used both by cpus leaving the
kernel and by secondary slave cpus that were not allocated a paca for
what ever reason -- they could be beyond nr_cpus or not described in
the current device tree for whatever reason (for example, kexec-load
was not refreshed after a cpu hotplug operation). Cpus coming through
that path they will write to paca[NR_CPUS] which is beyond the space
allocated for the paca data and overwrite memory not allocated to pacas
but very likely still real mode accessable).
Move the write back to kexec_smp_wait, which is used only by cpus that
found their paca, but after the transition to real mode.
Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Starting with 1426d5a3bd07589534286375998c0c8c6fdc5260 (powerpc:
Dynamically allocate pacas) the space for pacas beyond cpu_possible
is freed, but we failed to update the loop in crash.c.
This patch ensures qla82xx_watchdog is not being run for the vport. It also
makes sure that beacon ON is not done for the vport, as it will lead to the
waking up of the dpc thread again and again.
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <jbottomley@parallels.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The firmware spec has the fcp_data_dseg_len defined as a 32-bit
value, while the corresponding field in the driver structure has
it defined as a 16-bit value.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com> Signed-off-by: James Bottomley <jbottomley@parallels.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
If function tracing is enabled, a read of the filter files will
cause the call to stop_machine to update the function trace sites.
It should only call stop_machine on write.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
recvmmsg fails on a raw socket with EINVAL. The reason for this is
packet_recvmsg checks the incoming flags:
err = -EINVAL;
if (flags & ~(MSG_PEEK|MSG_DONTWAIT|MSG_TRUNC|MSG_CMSG_COMPAT|MSG_ERRQUEUE))
goto out;
This patch strips out MSG_WAITFORONE when calling recvmmsg which
fixes the issue.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When a CPU is taken offline in an SMP system, cpufreq_remove_dev()
nulls out the per-cpu policy before cpufreq_stats_free_table() can
make use of it. cpufreq_stats_free_table() then skips the
call to sysfs_remove_group(), leaving about 100 bytes of sysfs-related
memory unclaimed each time a CPU-removal occurs. Break up
cpu_stats_free_table into sysfs and table portions, and
call the sysfs portion early.
Signed-off-by: Steven Finney <steven.finney@palm.com> Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When we discover CPUs that are affected by each other's
frequency/voltage transitions, the first CPU gets a sysfs directory
created, and rest of the siblings get symlinks. Currently, when we
hotplug off only the first CPU, all of the symlinks and the sysfs
directory gets removed. Even though rest of the siblings are still
online and functional, they are orphaned, and no longer governed by
cpufreq.
This patch, given the above scenario, creates a sysfs directory for
the first sibling and symlinks for the rest of the siblings.
Please note the recursive call, it was rather too ugly to roll it
out. And the removal of redundant NULL setting (it is already taken
care of near the top of the function).
Signed-off-by: Jacob Shin <jacob.shin@amd.com> Acked-by: Mark Langsdorf <mark.langsdorf@amd.com> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kmemleak frees objects via RCU and when CONFIG_DEBUG_OBJECTS_RCU_HEAD
is enabled, the RCU callback triggers a call to free_object() in
lib/debugobjects.c. Since kmemleak is initialised before debug objects
initialisation, it may result in a kernel panic during booting. This
patch moves the kmemleak_init() call after debug_objects_mem_init().
The kmemleak_seq_next() function tries to get an object (and increment
its use count) before returning it. If it could not get the last object
during list traversal (because it may have been freed), the function
should return NULL rather than a pointer to such object that it did not
get.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Acked-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We reserved the numbers a long time ago, but never wired them up in the
syscall table as they need TIF_RESTORE_SIGMASK, which we only got last year
in commit cb6831d5d3099e772a510eb3e1ed0760ccffb45e ("m68k: Switch to saner
sigsuspend()")
Linus Torvalds [Wed, 18 May 2011 23:50:28 +0000 (16:50 -0700)]
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
configfs: Fix race between configfs_readdir() and configfs_d_iput()
configfs: Don't try to d_delete() negative dentries.
ocfs2/dlm: Target node death during resource migration leads to thread spin
ocfs2: Skip mount recovery for hard-ro mounts
ocfs2/cluster: Heartbeat mismatch message improved
ocfs2/cluster: Increase the live threshold for global heartbeat
ocfs2/dlm: Use negotiated o2dlm protocol version
ocfs2: skip existing hole when removing the last extent_rec in punching-hole codes.
ocfs2: Initialize data_ac (might be used uninitialized)
Linus Torvalds [Wed, 18 May 2011 20:25:57 +0000 (13:25 -0700)]
Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
drivercore: revert addition of of_match to struct device
of: fix race when matching drivers
Grant Likely [Wed, 18 May 2011 17:19:24 +0000 (11:19 -0600)]
drivercore: revert addition of of_match to struct device
Commit b826291c, "drivercore/dt: add a match table pointer to struct
device" added an of_match pointer to struct device to cache the
of_match_table entry discovered at driver match time. This was unsafe
because matching is not an atomic operation with probing a driver. If
two or more drivers are attempted to be matched to a driver at the
same time, then the cached matching entry pointer could get
overwritten.
This patch reverts the of_match cache pointer and reworks all users to
call of_match_device() directly instead.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Milton Miller [Wed, 18 May 2011 15:27:39 +0000 (10:27 -0500)]
of: fix race when matching drivers
If two drivers are probing devices at the same time, both will write
their match table result to the dev->of_match cache at the same time.
Only write the result if the device matches.
In a thread titled "SBus devices sometimes detected, sometimes not",
Meelis reported his SBus hme was not detected about 50% of the time.
From the debug suggested by Grant it was obvious another driver matched
some devices between the call to match the hme and the hme discovery
failling.
Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Milton Miller <miltonm@bga.com>
[grant.likely: modified to only call of_match_device() once] Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Linus Torvalds [Wed, 18 May 2011 13:49:02 +0000 (06:49 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: don't delay blk_run_queue_async
scsi: remove performance regression due to async queue run
blk-throttle: Use task_subsys_state() to determine a task's blkio_cgroup
block: rescan partitions on invalidated devices on -ENOMEDIA too
cdrom: always check_disk_change() on open
block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe drivers
Florian Fainelli [Fri, 13 May 2011 15:41:21 +0000 (17:41 +0200)]
MIPS: AR7: Fix GPIO register size for Titan variant.
The 'size' variable contains the correct register size for both AR7
and Titan, but we never used it to ioremap the correct register size.
This problem only shows up on Titan.
[ralf@linux-mips.org: Fixed the fix. The original patch as in patchwork
recognizes the problem correctly then fails to fix it ...]
This is the MIPS portion of Joe Perches <joe@perches.com>'s
https://patchwork.linux-mips.org/patch/2172/ which seems to have been
lost in time and space.
Joel Becker [Wed, 18 May 2011 11:08:16 +0000 (04:08 -0700)]
configfs: Fix race between configfs_readdir() and configfs_d_iput()
configfs_readdir() will use the existing inode numbers of inodes in the
dcache, but it makes them up for attribute files that aren't currently
instantiated. There is a race where a closing attribute file can be
tearing down at the same time as configfs_readdir() is trying to get its
inode number.
We want to get the inode number of open attribute files, because they
should match while instantiated. We can't lock down the transition
where dentry->d_inode is set to NULL, so we just check for NULL there.
We can, however, ensure that an inode we find isn't iput() in
configfs_d_iput() until after we've accessed it.
Joel Becker [Tue, 22 Feb 2011 09:09:49 +0000 (01:09 -0800)]
configfs: Don't try to d_delete() negative dentries.
When configfs is faking mkdir() on its subsystem or default group
objects, it starts by adding a negative dentry. It then tries to
instantiate the group. If that should fail, it must clean up after
itself.
I was using d_delete() here, but configfs_attach_group() promises to
return an empty dentry on error. d_delete() explodes with the entry
dentry. Let's try d_drop() instead. The unhashing is what we want for
our dentry.
Shaohua Li [Wed, 18 May 2011 09:22:43 +0000 (11:22 +0200)]
block: don't delay blk_run_queue_async
Let's check a scenario:
1. blk_delay_queue(q, SCSI_QUEUE_DELAY);
2. blk_run_queue_async();
the second one will became a noop, because q->delay_work already has
WORK_STRUCT_PENDING_BIT set, so the delayed work will still run after
SCSI_QUEUE_DELAY. But blk_run_queue_async actually hopes the delayed
work runs immediately.
Fix this by doing a cancel on potentially pending delayed work
before queuing an immediate run of the workqueue.
Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Linus Torvalds [Wed, 18 May 2011 10:13:46 +0000 (03:13 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf evlist: Fix per thread mmap setup
perf tools: Honour the cpu list parameter when also monitoring a thread list
kprobes, x86: Disable irqs during optimized callback
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: fix cifsConvertToUCS() for the mapchars case
cifs: add fallback in is_path_accessible for old servers
os_dump_core() uses abort() to terminate UML in case of an fatal error.
glibc's abort() calls raise(SIGABRT) which makes use of tgkill().
tgkill() has no effect within UML's kernel threads because they are not
pthreads. As fallback abort() executes an invalid instruction to
terminate the process. Therefore UML gets killed by SIGSEGV and leaves a
ugly log entry in the host's kernel ring buffer.
To get rid of this we use our own abort routine.
Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ZONE_CONGESTED should be a state of global memory reclaim. If not, a busy
memcg sets this and give unnecessary throttoling in wait_iff_congested()
against memory recalim in other contexts. This makes system performance
bad.
I'll think about "memcg is congested!" flag is required or not, later.
But this fix is required first.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: Ying Han <yinghan@google.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix switch initialization to ensure that all switches have default routing
disabled. This guarantees that no unexpected RapidIO packets arrive to
the default port set by reset and there is no default routing destination
until it is properly configured by software.
This update also unifies handling of unmapped destinations by tsi57x, IDT
Gen1 and IDT Gen2 switches.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com> Cc: Kumar Gala <galak@kernel.crashing.org> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Li Yang <leoli@freescale.com> Cc: Thomas Moll <thomas.moll@sysgo.com> Cc: <stable@kernel.org> [2.6.37+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Layton [Tue, 17 May 2011 19:28:21 +0000 (15:28 -0400)]
cifs: fix cifsConvertToUCS() for the mapchars case
As Metze pointed out, commit 84cdf74e broke mapchars option:
Commit "cifs: fix unaligned accesses in cifsConvertToUCS"
(84cdf74e8096a10dd6acbb870dd404b92f07a756) does multiple steps
in just one commit (moving the function and changing it without
testing).
put_unaligned_le16(temp, &target[j]); is never called for any
codepoint the goes via the 'default' switch statement. As a result
we put just zero (or maybe uninitialized) bytes into the target
buffer.
His proposed patch looks correct, but doesn't apply to the current head
of the tree. This patch should also fix it.
Cc: <stable@kernel.org> # .38.x: 581ade4: cifs: clean up various nits in unicode routines (try #2) Reported-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
Jeff Layton [Tue, 17 May 2011 10:40:30 +0000 (06:40 -0400)]
cifs: add fallback in is_path_accessible for old servers
The is_path_accessible check uses a QPathInfo call, which isn't
supported by ancient win9x era servers. Fall back to an older
SMBQueryInfo call if it fails with the magic error codes.
Cc: stable@kernel.org Reported-and-Tested-by: Sandro Bonazzola <sandro.bonazzola@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
Borislav Petkov [Tue, 17 May 2011 12:55:19 +0000 (14:55 +0200)]
x86, AMD: Fix ARAT feature setting again
Trying to enable the local APIC timer on early K8 revisions
uncovers a number of other issues with it, in conjunction with
the C1E enter path on AMD. Fixing those causes much more churn
and troubles than the benefit of using that timer brings so
don't enable it on K8 at all, falling back to the original
functionality the kernel had wrt to that.
Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com> Cc: Boris Ostrovsky <Boris.Ostrovsky@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Cc: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: Nick Bowler <nbowler@elliptictech.com> Cc: Joerg-Volker-Peetz <jvpeetz@web.de> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1305636919-31165-3-git-send-email-bp@amd64.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
Moving the lower endpoint of the Erratum 400 check to accomodate
earlier K8 revisions (A-E) opens a can of worms which is simply
not worth to fix properly by tweaking the errata checking
framework:
* missing IntPenging MSR on revisions < CG cause #GP:
Jens Axboe [Tue, 17 May 2011 09:04:44 +0000 (11:04 +0200)]
scsi: remove performance regression due to async queue run
Commit c21e6beb removed our queue request_fn re-enter
protection, and defaulted to always running the queues from
kblockd to be safe. This was a known potential slow down,
but should be safe.
Unfortunately this is causing big performance regressions for
some, so we need to improve this logic. Looking into the details
of the re-enter, the real issue is on requeue of requests.
Requeue of requests upon seeing a BUSY condition from the device
ends up re-running the queue, causing traces like this:
potentially causing the issue we want to avoid. So special
case the requeue re-run of the queue, but improve it to offload
the entire run of local queue and starved queue from a single
workqueue callback. This is a lot better than potentially
kicking off a workqueue run for each device seen.
This also fixes the issue of the local device going into recursion,
since the above mentioned commit never moved that queue run out
of line.
Linus Torvalds [Tue, 17 May 2011 01:36:47 +0000 (18:36 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
Revert "mmc: fix a race between card-detect rescan and clock-gate work instances"
Yinghai Lu [Sat, 14 May 2011 01:06:17 +0000 (18:06 -0700)]
PCI: Clear bridge resource flags if requested size is 0
During pci remove/rescan testing found:
pci 0000:c0:03.0: PCI bridge to [bus c4-c9]
pci 0000:c0:03.0: bridge window [io 0x1000-0x0fff]
pci 0000:c0:03.0: bridge window [mem 0xf0000000-0xf00fffff]
pci 0000:c0:03.0: bridge window [mem 0xfc180000000-0xfc197ffffff 64bit pref]
pci 0000:c0:03.0: device not available (can't reserve [io 0x1000-0x0fff])
pci 0000:c0:03.0: Error enabling bridge (-22), continuing
pci 0000:c0:03.0: enabling bus mastering
pci 0000:c0:03.0: setting latency timer to 64
pcieport 0000:c0:03.0: device not available (can't reserve [io 0x1000-0x0fff])
pcieport: probe of 0000:c0:03.0 failed with error -22
This bug was caused by commit c8adf9a3e873 ("PCI: pre-allocate
additional resources to devices only after successful allocation of
essential resources.")
After that commit, pci_hotplug_io_size is changed to additional_io_size
from minium size. So it will not go through resource_size(res) != 0
path, and will not be reset.
The root cause is: pci_bridge_check_ranges will set RESOURCE_IO flag for
pci bridge, and later if children do not need IO resource. those bridge
resources will not need to be allocated. but flags is still there.
that will confuse the the pci_enable_bridges later.
for (list = head->next; list; list = list->next) {
res = list->res;
idx = res - &list->dev->resource[0];
if (resource_size(res) && pci_assign_resource(list->dev, idx)) {
...
reset_resource(res);
}
}
}
At last, We have to clear the flags in pbus_size_mem/io when requested
size == 0 and !add_head. becasue this case it will not go through
adjust_resources_sorted().
Just make size1 = size0 when !add_head. it will make flags get cleared.
At the same time when requested size == 0, add_size != 0, will still
have in head and add_list. because we do not clear the flags for it.
After this, we will get right result:
pci 0000:c0:03.0: PCI bridge to [bus c4-c9]
pci 0000:c0:03.0: bridge window [io disabled]
pci 0000:c0:03.0: bridge window [mem 0xf0000000-0xf00fffff]
pci 0000:c0:03.0: bridge window [mem 0xfc180000000-0xfc197ffffff 64bit pref]
pci 0000:c0:03.0: enabling bus mastering
pci 0000:c0:03.0: setting latency timer to 64
pcieport 0000:c0:03.0: setting latency timer to 64
pcieport 0000:c0:03.0: irq 160 for MSI/MSI-X
pcieport 0000:c0:03.0: Signaling PME through PCIe PME interrupt
pci 0000:c4:00.0: Signaling PME through PCIe PME interrupt
pcie_pme 0000:c0:03.0:pcie01: service driver pcie_pme loaded
aer 0000:c0:03.0:pcie02: service driver aer loaded
pciehp 0000:c0:03.0:pcie04: Hotplug Controller:
v3: more simple fix. also fix one typo in pbus_size_mem
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Gleixner [Mon, 16 May 2011 09:07:48 +0000 (11:07 +0200)]
tick: Clear broadcast active bit when switching to oneshot
The first cpu which switches from periodic to oneshot mode switches
also the broadcast device into oneshot mode. The broadcast device
serves as a backup for per cpu timers which stop in deeper
C-states. To avoid starvation of the cpus which might be in idle and
depend on broadcast mode it marks the other cpus as broadcast active
and sets the brodcast expiry value of those cpus to the next tick.
The oneshot mode broadcast bit for the other cpus is sticky and gets
only cleared when those cpus exit idle. If a cpu was not idle while
the bit got set in consequence the bit prevents that the broadcast
device is armed on behalf of that cpu when it enters idle for the
first time after it switched to oneshot mode.
In most cases that goes unnoticed as one of the other cpus has usually
a timer pending which keeps the broadcast device armed with a short
timeout. Now if the only cpu which has a short timer active has the
bit set then the broadcast device will not be armed on behalf of that
cpu and will fire way after the expected timer expiry. In the case of
Christians bug report it took ~145 seconds which is about half of the
wrap around time of HPET (the limit for that device) due to the fact
that all other cpus had no timers armed which expired before the 145
seconds timeframe.
The solution is simply to clear the broadcast active bit
unconditionally when a cpu switches to oneshot mode after the first
cpu switched the broadcast device over. It's not idle at that point
otherwise it would not be executing that code.
[ I fundamentally hate that broadcast crap. Why the heck thought some
folks that when going into deep idle it's a brilliant concept to
switch off the last device which brings the cpu back from that
state? ]
Thanks to Christian for providing all the valuable debug information!
Michał Mirosław [Mon, 16 May 2011 19:14:21 +0000 (15:14 -0400)]
net: Change netdev_fix_features messages loglevel
Those reduced to DEBUG can possibly be triggered by unprivileged processes
and are nothing exceptional. Illegal checksum combinations can only be
caused by driver bug, so promote those messages to WARN.
Since GSO without SG will now only cause DEBUG message from
netdev_fix_features(), remove the workaround from register_netdevice().
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Jarosch [Mon, 16 May 2011 06:28:15 +0000 (06:28 +0000)]
vmxnet3: Fix inconsistent LRO state after initialization
During initialization of vmxnet3, the state of LRO
gets out of sync with netdev->features.
This leads to very poor TCP performance in a IP forwarding
setup and is hitting many VMware users.
Simplified call sequence:
1. vmxnet3_declare_features() initializes "adapter->lro" to true.
2. The kernel automatically disables LRO if IP forwarding is enabled,
so vmxnet3_set_flags() gets called. This also updates netdev->features.
3. Now vmxnet3_setup_driver_shared() is called. "adapter->lro" is still
set to true and LRO gets enabled again, even though
netdev->features shows it's disabled.
Fix it by updating "adapter->lro", too.
The private vmxnet3 adapter flags are scheduled for removal
in net-next, see commit a0d2730c9571aeba793cb5d3009094ee1d8fda35
"net: vmxnet3: convert to hw_features".
Patch applies to 2.6.37 / 2.6.38 and 2.6.39-rc6.
Please CC: comments.
Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Mon, 16 May 2011 06:13:49 +0000 (06:13 +0000)]
sfc: Fix oops in register dump after mapping change
Commit 747df2258b1b9a2e25929ef496262c339c380009 ('sfc: Always map MCDI
shared memory as uncacheable') introduced a separate mapping for the
MCDI shared memory (MC_TREG_SMEM). This means we can no longer easily
include it in the register dump. Since it is not particularly useful
in debugging, substitute a recognisable dummy value.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 16 May 2011 15:55:49 +0000 (08:55 -0700)]
Merge branch 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
OMAP3: set the core dpll clk rate in its set_rate function
omap: iommu: Return IRQ_HANDLED in fault handler when no fault occured
Linus Torvalds [Mon, 16 May 2011 15:47:31 +0000 (08:47 -0700)]
Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm: Take lock around probes for drm_fb_helper_hotplug_event
drm/i915: Revert i915.semaphore=1 default from 47ae63e0
vga_switcheroo: don't toggle-switch devices
drm/radeon/kms: add some evergreen/ni safe regs
drm/radeon/kms: fix extended lvds info parsing
drm/radeon/kms: fix tiling reg on fusion
Vivek Goyal [Mon, 16 May 2011 13:24:08 +0000 (15:24 +0200)]
blk-throttle: Use task_subsys_state() to determine a task's blkio_cgroup
Currentlly we first map the task to cgroup and then cgroup to
blkio_cgroup. There is a more direct way to get to blkio_cgroup
from task using task_subsys_state(). Use that.
The real reason for the fix is that it also avoids a race in generic
cgroup code. During remount/umount rebind_subsystems() is called and
it can do following with and rcu protection.
cgrp->subsys[i] = NULL;
That means if somebody got hold of cgroup under rcu and then it tried
to do cgroup->subsys[] to get to blkio_cgroup, it would get NULL which
is wrong. I was running into this race condition with ltp running on a
upstream derived kernel and that lead to crash.
So ideally we should also fix cgroup generic code to wait for rcu
grace period before setting pointer to NULL. Li Zefan is not very keen
on introducing synchronize_wait() as he thinks it will slow
down moun/remount/umount operations.
So for the time being atleast fix the kernel crash by taking a more
direct route to blkio_cgroup.
One tester had reported a crash while running LTP on a derived kernel
and with this fix crash is no more seen while the test has been
running for over 6 days.
Youquan Song [Thu, 21 Apr 2011 16:22:43 +0000 (00:22 +0800)]
x86, apic: Fix spurious error interrupts triggering on all non-boot APs
This patch fixes a bug reported by a customer, who found
that many unreasonable error interrupts reported on all
non-boot CPUs (APs) during the system boot stage.
According to Chapter 10 of Intel Software Developer Manual
Volume 3A, Local APIC may signal an illegal vector error when
an LVT entry is set as an illegal vector value (0~15) under
FIXED delivery mode (bits 8-11 is 0), regardless of whether
the mask bit is set or an interrupt actually happen. These
errors are seen as error interrupts.
The initial value of thermal LVT entries on all APs always reads
0x10000 because APs are woken up by BSP issuing INIT-SIPI-SIPI
sequence to them and LVT registers are reset to 0s except for
the mask bits which are set to 1s when APs receive INIT IPI.
When the BIOS takes over the thermal throttling interrupt,
the LVT thermal deliver mode should be SMI and it is required
from the kernel to keep AP's LVT thermal monitoring register
programmed as such as well.
This issue happens when BIOS does not take over thermal throttling
interrupt, AP's LVT thermal monitor register will be restored to
0x10000 which means vector 0 and fixed deliver mode, so all APs will
signal illegal vector error interrupts.
This patch check if interrupt delivery mode is not fixed mode before
restoring AP's LVT thermal monitor register.
Signed-off-by: Youquan Song <youquan.song@intel.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Yong Wang <yong.y.wang@intel.com> Cc: hpa@linux.intel.com Cc: joe@perches.com Cc: jbaron@redhat.com Cc: trenn@suse.de Cc: kent.liu@intel.com Cc: chaohong.guo@intel.com Cc: <stable@kernel.org> # As far back as possible Link: http://lkml.kernel.org/r/1303402963-17738-1-git-send-email-youquan.song@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
Chris Wilson [Fri, 22 Apr 2011 10:03:57 +0000 (11:03 +0100)]
drm: Take lock around probes for drm_fb_helper_hotplug_event
We need to hold the dev->mode_config.mutex whilst detecting the output
status. But we also need to drop it for the call into
drm_fb_helper_single_fb_probe(), which indirectly acquires the lock when
attaching the fbcon.
Failure to do so exposes a race with normal output probing. Detected by
adding some warnings that the mutex is held to the backend detect routines:
Reported-by: Frederik Himpe <fhimpe@telenet.be>
References: https://bugs.freedesktop.org/show_bug.cgi?id=36394 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>
Andy Lutomirski [Fri, 13 May 2011 16:14:54 +0000 (12:14 -0400)]
drm/i915: Revert i915.semaphore=1 default from 47ae63e0
My Q67 / i7-2600 box has rev09 Sandy Bridge graphics. It hangs
instantly when GNOME loads and it hangs so hard the reset button
doesn't work. Setting i915.semaphore=0 fixes it.
Hans Schillstrom [Sun, 15 May 2011 15:20:29 +0000 (17:20 +0200)]
IPVS: fix netns if reading ip_vs_* procfs entries
Without this patch every access to ip_vs in procfs will increase
the netns count i.e. an unbalanced get_net()/put_net().
(ipvsadm commands also use procfs.)
The result is you can't exit a netns if reading ip_vs_* procfs entries.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The commit 6b1e960fdbd75dcd9bcc3ba5ff8898ff1ad30b6e
bridge: Reset IPCB when entering IP stack on NF_FORWARD
broke forwarding of IPV6 packets in bridge because it would
call bp_parse_ip_options with an IPV6 packet.
Reported-by: Noah Meyerhans <noahm@debian.org> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The PERF_EVENT_IOC_SET_OUTPUT ioctl was returning -EINVAL when using
--pid when monitoring multithreaded apps, as we can only share a ring
buffer for events on the same thread if not doing per cpu.
Fix it by using per thread ring buffers.
Tested with:
[root@felicio ~]# tuna -t 26131 -CP | nl
1 thread ctxt_switches
2 pid SCHED_ rtpri affinity voluntary nonvoluntary cmd
3 26131 OTHER 0 0,1 108142762397830 chromium-browse
4 642 OTHER 0 0,1 14688 0 chromium-browse
5 26148 OTHER 0 0,1 713602 115479 chromium-browse
6 26149 OTHER 0 0,1 801958 2262 chromium-browse
7 26150 OTHER 0 0,1 1271128 248 chromium-browse
8 26151 OTHER 0 0,1 3 0 chromium-browse
9 27049 OTHER 0 0,1 36796 9 chromium-browse
10 618 OTHER 0 0,1 14711 0 chromium-browse
11 661 OTHER 0 0,1 14593 0 chromium-browse
12 29048 OTHER 0 0,1 28125 0 chromium-browse
13 26143 OTHER 0 0,1 2202789 781 chromium-browse
[root@felicio ~]#
So 11 threads under pid 26131, then:
[root@felicio ~]# perf record -F 50000 --pid 26131
11 mmaps, one per thread since we didn't specify any CPU list, so we need one
mmap per thread and:
[root@felicio ~]# perf record -F 50000 --pid 26131
^M
^C[ perf record: Woken up 79 times to write data ]
[ perf record: Captured and wrote 20.614 MB perf.data (~900639 samples) ]
Ok, one of the threads, 26151 was quiescent, so no samples there, but all the
others are there.
Then, if I specify one CPU:
[root@felicio ~]# perf record -F 50000 --pid 26131 --cpu 1
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.680 MB perf.data (~29730 samples) ]
Just one mmap, as now we can use just one per-cpu buffer instead of the
per-thread needed in the previous case.
For global profiling:
[root@felicio ~]# perf record -F 50000 -a
^C[ perf record: Woken up 26 times to write data ]
[ perf record: Captured and wrote 7.128 MB perf.data (~311412 samples) ]
[root@felicio ~]# perf record -F 50000 --tid 26148
^C[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.330 MB perf.data (~14426 samples) ]
Li Zefan [Fri, 15 Apr 2011 03:03:17 +0000 (03:03 +0000)]
Btrfs: fix FS_IOC_SETFLAGS ioctl
Steps to reproduce the bug:
- Call FS_IOC_SETLFAGS ioctl with flags=FS_COMPR_FL
- Call FS_IOC_SETFLAGS ioctl with flags=0
- Call FS_IOC_GETFLAGS ioctl, and you'll see FS_COMPR_FL is still set!
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Li Zefan [Fri, 15 Apr 2011 03:02:49 +0000 (03:02 +0000)]
fs: remove FS_COW_FL
FS_COW_FL and FS_NOCOW_FL were newly introduced to control per file
COW in btrfs, but FS_NOCOW_FL is sufficient.
The fact is we don't have corresponding BTRFS_INODE_COW flag.
COW is default, and FS_NOCOW_FL can be used to switch off COW for
a single file.
If we mount btrfs with nodatacow, a newly created file will be set with
the FS_NOCOW_FL flag. So to turn on COW for it, we can just clear the
FS_NOCOW_FL flag.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
When a btrfs disk is created by mixed data & metadata option, it will have no
pure data or pure metadata space info.
In btrfs's for-linus branch, commit 78b1ea13838039cd88afdd62519b40b344d6c920
(Btrfs: fix OOPS of empty filesystem after balance) initializes space infos at
the very beginning. The problem is this initialization does not take the mixed
case into account, which will cause btrfs will easily get into ENOSPC in mixed
case.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
If posix_acl_from_xattr() returns an error code, a negative address is
dereferenced causing an oops; fix by checking for error code first.
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Reviewed-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Hugh Dickins [Sat, 14 May 2011 19:06:42 +0000 (12:06 -0700)]
tmpfs: fix race between swapoff and writepage
Shame on me! Commit b1dea800ac39 "tmpfs: fix race between umount and
writepage" fixed the advertized race, but introduced another: as even
its comment makes clear, we cannot safely rely on a peek at list_empty()
while holding no lock - until info->swapped is set, shmem_unuse_inode()
may delete any formerly-swapped inode from the shmem_swaplist, which
in this case would leave a swap area impossible to swapoff.
Although I don't relish taking the mutex every time, I don't care much
for the alternatives either; and at least the peek at list_empty() in
shmem_evict_inode() (a hotter path since most inodes would never have
been swapped) remains safe, because we already truncated the whole file.
Tejun Heo [Mon, 9 May 2011 14:04:11 +0000 (16:04 +0200)]
libata: fix oops when LPM is used with PMP
ae01b2493c (libata: Implement ATA_FLAG_NO_DIPM and apply it to mcp65)
added ATA_FLAG_NO_DIPM and made ata_eh_set_lpm() check the flag.
However, @ap is NULL if @link points to a PMP link and thus the
unconditional @ap->flags dereference leads to the following oops.
stable: ATA_FLAG_NO_DIPM was added during 2.6.39 cycle but was
backported to 2.6.37 and 38. This is a fix for that and thus
also applicable to 2.6.37 and 38.
Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: "Nathan A. Mourey II" <nmoureyii@ne.rr.com>
LKML-Reference: <1304555277.2059.2.camel@localhost.localdomain> Cc: Connor H <cmdkhh@gmail.com> Cc: stable@kernel.org Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
The commits causes command timeouts on AC plug/unplug. It isn't yet
clear why. As the commit was for a single rather obscure controller,
revert the change for now.
The problem was reported and bisected by Gu Rui in bug#34692.
https://bugzilla.kernel.org/show_bug.cgi?id=34692
Also, reported by Rafael and Michael in the following thread.
Bruno Prémont [Sat, 14 May 2011 10:24:15 +0000 (12:24 +0200)]
Further fbcon sanity checking
This moves the
if (num_registered_fb == FB_MAX)
return -ENXIO;
check _AFTER_ the call to do_remove_conflicting_framebuffers() as this
would (now in a safe way) allow a native driver to replace the
conflicting one even if all slots in registered_fb[] are taken.
This also prevents unregistering a framebuffer that is no longer
registered (vga16f will unregister at module unload time even if the
frame buffer had been unregistered earlier due to being found
conflicting).
Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 13 May 2011 23:16:41 +0000 (16:16 -0700)]
fbmem: fix remove_conflicting_framebuffers races
When a register_framebuffer() call results in us removing old
conflicting framebuffers, the new registration_lock doesn't protect that
situation. And we can't just add the same locking to the function,
because these functions call each other: register_framebuffer() calls
remove_conflicting_framebuffers, which in turn calls
unregister_framebuffer for any conflicting entry.
In order to fix it, this just creates wrapper functions around all three
functions and makes the versions that actually do the work be called
"do_xxx()", leaving just the wrapper that gets the lock and calls the
worker function.
So the rule becomes simply that "do_xxxx()" has to be called with the
lock held, and now do_register_framebuffer() can just call
do_remove_conflicting_framebuffers(), and that in turn can call
_do_unregister_framebuffer(), and there is no deadlock, and we can hold
the registration lock over the whole sequence, fixing the races.
It also makes error cases simpler, and fixes one situation where we
would return from unregister_framebuffer() without releasing the lock,
pointed out by Bruno Prémont.
Tested-by: Bruno Prémont <bonbons@linux-vserver.org> Tested-by: Anca Emanuel <anca.emanuel@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 14 May 2011 00:29:03 +0000 (17:29 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6:
alpha: Wire up syscalls new to 2.6.39
alpha: convert to clocksource_register_hz