Ingo Molnar [Thu, 9 Aug 2007 09:16:47 +0000 (11:16 +0200)]
sched: remove 'now' use from assignments
change all 'now' timestamp uses in assignments to rq->clock.
( this is an identity transformation that causes no functionality change:
all such new rq->clock is necessarily preceded by an update_rq_clock()
call. )
Ingo Molnar [Thu, 9 Aug 2007 09:16:46 +0000 (11:16 +0200)]
sched: add [__]update_rq_clock(rq)
add the [__]update_rq_clock(rq) functions. (No change in functionality,
just reorganization to prepare for elimination of the heavy 64-bit
timestamp-passing in the scheduler.)
Peter Williams [Thu, 9 Aug 2007 09:16:46 +0000 (11:16 +0200)]
sched: fix bug in balance_tasks()
There are two problems with balance_tasks() and how it used:
1. The variables best_prio and best_prio_seen (inherited from the old
move_tasks()) were only required to handle problems caused by the
active/expired arrays, the order in which they were processed and the
possibility that the task with the highest priority could be on either.
These issues are no longer present and the extra overhead associated
with their use is unnecessary (and possibly wrong).
2. In the absence of CONFIG_FAIR_GROUP_SCHED being set, the same
this_best_prio variable needs to be used by all scheduling classes or
there is a risk of moving too much load. E.g. if the highest priority
task on this at the beginning is a fairly low priority task and the rt
class migrates a task (during its turn) then that moved task becomes the
new highest priority task on this_rq but when the sched_fair class
initializes its copy of this_best_prio it will get the priority of the
original highest priority task as, due to the run queue locks being
held, the reschedule triggered by pull_task() will not have taken place.
This could result in inappropriate overriding of skip_for_load and
excessive load being moved.
The attached patch addresses these problems by deleting all reference to
best_prio and best_prio_seen and making this_best_prio a reference
parameter to the various functions involved.
load_balance_fair() has also been modified so that this_best_prio is
only reset (in the loop) if CONFIG_FAIR_GROUP_SCHED is set. This should
preserve the effect of helping spread groups' higher priority tasks
around the available CPUs while improving system performance when
CONFIG_FAIR_GROUP_SCHED isn't set.
Signed-off-by: Peter Williams <pwil3058@bigpond.net.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Josh Triplett [Thu, 9 Aug 2007 09:16:46 +0000 (11:16 +0200)]
sched: mark print_cfs_stats static
sched_fair.c defines print_cfs_stats, and sched_debug.c uses it, but sched.c
includes both sched_fair.c and sched_debug.c, so all the references to
print_cfs_stats occur in the same compilation unit. Thus, mark
print_cfs_stats static.
Eliminates a sparse warning:
warning: symbol 'print_cfs_stats' was not declared. Should it be static?
Ulrich Drepper [Thu, 9 Aug 2007 09:16:46 +0000 (11:16 +0200)]
sched: clean up sched_getaffinity()
here's another tiny cleanup. The generated code is not affected (gcc is
smart enough) but for people looking over the code it is just irritating
to have the extra conditional.
Peter Williams [Thu, 9 Aug 2007 09:16:46 +0000 (11:16 +0200)]
sched: simplify move_tasks()
The move_tasks() function is currently multiplexed with two distinct
capabilities:
1. attempt to move a specified amount of weighted load from one run
queue to another; and
2. attempt to move a specified number of tasks from one run queue to
another.
The first of these capabilities is used in two places, load_balance()
and load_balance_idle(), and in both of these cases the return value of
move_tasks() is used purely to decide if tasks/load were moved and no
notice of the actual number of tasks moved is taken.
The second capability is used in exactly one place,
active_load_balance(), to attempt to move exactly one task and, as
before, the return value is only used as an indicator of success or failure.
This multiplexing of sched_task() was introduced, by me, as part of the
smpnice patches and was motivated by the fact that the alternative, one
function to move specified load and one to move a single task, would
have led to two functions of roughly the same complexity as the old
move_tasks() (or the new balance_tasks()). However, the new modular
design of the new CFS scheduler allows a simpler solution to be adopted
and this patch addresses that solution by:
1. adding a new function, move_one_task(), to be used by
active_load_balance(); and
2. making move_tasks() a single purpose function that tries to move a
specified weighted load and returns 1 for success and 0 for failure.
One of the consequences of these changes is that neither move_one_task()
or the new move_tasks() care how many tasks sched_class.load_balance()
moves and this enables its interface to be simplified by returning the
amount of load moved as its result and removing the load_moved pointer
from the argument list. This helps simplify the new move_tasks() and
slightly reduces the amount of work done in each of
sched_class.load_balance()'s implementations.
Further simplification, e.g. changes to balance_tasks(), are possible
but (slightly) complicated by the special needs of load_balance_fair()
so I've left them to a later patch (if this one gets accepted).
NB Since move_tasks() gets called with two run queue locks held even
small reductions in overhead are worthwhile.
[ mingo@elte.hu ]
this change also reduces code size nicely:
text data bss dec hex filename
39216 3618 24 42858 a76a sched.o.before
39173 3618 24 42815 a73f sched.o.after
Signed-off-by: Peter Williams <pwil3058@bigpond.net.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Thu, 9 Aug 2007 09:16:45 +0000 (11:16 +0200)]
sched: reorder update_cpu_load(rq) with the ->task_tick() call
Peter Williams suggested to flip the order of update_cpu_load(rq) with
the ->task_tick() call. This is a NOP for the current scheduler (the
two functions are independent of each other), ->task_tick() might
create some state for update_cpu_load() in the future (or in PlugSched).
David S. Miller [Thu, 9 Aug 2007 00:32:33 +0000 (17:32 -0700)]
[SPARC64]: Fix memory leak when cpu hotplugging.
Every time a cpu is added via hotplug, we allocate the per-cpu MONDO
queues but we never free them up. Freeing isn't easy since the first
cpu gets this memory from bootmem.
Therefore, the simplest thing to do to fix this bug is to allocate the
queues for all possible cpus at boot time.
Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Tue, 7 Aug 2007 23:01:46 +0000 (00:01 +0100)]
fix oops in __audit_signal_info()
The check for audit_signals is misplaced and the check for
audit_dummy_context() is missing; as the result, if we send a signal to
auditd from task with NULL ->audit_context while we have audit_signals
!= 0 we end up with an oops.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix estimation of maxRTT. The original code ignores rtt measurements
during slow start (via the check tp->snd_ssthresh < 0xFFFF) yet this
is probably a good time to try to estimate max rtt as delayed acking
is disabled and slow start will only exit on a loss which presumably
corresponds to a maxrtt measurement. Second, the original code (via
the check htcp_ccount(ca) > 3) ignores rtt data during what it
estimates to be the first 3 round-trip times. This seems like an
unnecessary check now that the RCV timestamp are no longer used
for rtt estimation.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETFILTER]: ctnetlink: return EEXIST instead of EINVAL for existing nat'ed conntracks
ctnetlink must return EEXIST for existing nat'ed conntracks instead of
EINVAL. Only return EINVAL if we try to update a conntrack with NAT
handlings (that is not allowed).
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Juhl [Wed, 8 Aug 2007 01:10:54 +0000 (18:10 -0700)]
[NETFILTER]: ipt_recent: avoid a possible NULL pointer deref in recent_seq_open()
If the call to seq_open() returns != 0 then the code calls
kfree(st) but then on the very next line proceeds to
dereference the pointer - not good.
Problem spotted by the Coverity checker.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Moore [Wed, 8 Aug 2007 00:53:10 +0000 (17:53 -0700)]
[NetLabel]: add missing rcu_dereference() calls in the LSM domain mapping hash table
The LSM domain mapping head table pointer was not being referenced via the RCU
safe dereferencing function, rcu_dereference(). This patch adds those missing
calls to the NetLabel code.
This has been tested using recent linux-2.6 git kernels with no visible
regressions.
Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
> >> Looks like memset() is zeroing wrong nr of bytes.
> >
> > Good catch, however, I think we can just remove this memset altogether
> > since the memory gets allocated via kzalloc.
>
> Correct, that memset() is superfluous.
Paul Mundt [Wed, 1 Aug 2007 06:48:55 +0000 (15:48 +0900)]
net: smc91x: Build fixes for general sh boards.
SH boards in general only wire this up in 8 or 16-bit mode, and
as we never had the wrappers for 32-bit mode defined, SMC_CAN_USE_32BIT
caused build failure for the non-Solution Engine boards. This gets it
building again.
Also kill off the straggling set_irq_type() definition, this is left
over cruft that was missed when the rest of it switched to IRQ flags.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
--
Do not allow cached open for O_RDONLY or O_WRONLY unless the file has been
previously opened in these modes.
Also Fix the calculation of the mode in nfs4_close_prepare. We should only
issue an OPEN_DOWNGRADE if we're sure that we will still be holding the
correct open modes. This may not be the case if we've been doing delegated
opens.
Finally, there is no need to adjust the open mode bit flags in
nfs4_close_done(): that has already been done in nfs4_close_prepare().
NFSv4: Fix a locking regression in nfs4_set_mode_locked()
We don't really need to clear &state->inode_states inside
nfs4_set_mode_locked, and doing so without holding the inode->i_lock would
in any case be a bug...
We need to grab the inode->i_lock atomically with the last reference put in
order to remove the open context that is being freed from the
nfsi->open_files list.
Fix by converting the kref to a standard atomic counter and then using
atomic_dec_and_lock()...
Thanks to Arnd Bergmann for pointing out the problem.
The commit 4ada539ed77c7a2bbcb75cafbbd7bd8d2b9bef7b lead to the unpleasant
possibility of an asynchronous rpc_task being required to call
rpciod_down() when it is complete. This again means that the rpciod
workqueue may get to call destroy_workqueue on itself -> hang...
Change rpciod_up/rpciod_down to just get/put the module, and then
create/destroy the workqueues on module load/unload.
Rusty Russell [Mon, 6 Aug 2007 00:48:18 +0000 (10:48 +1000)]
Enable lguest drivers in Kconfig
Lguest drivers need to default to "Y" otherwise they're never selected
for new builds. (We don't bother prompting, because they're less than
4k combined, and implied by selecting lguest support).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Avi Kivity [Sun, 5 Aug 2007 07:16:11 +0000 (10:16 +0300)]
KVM: x86 emulator: fix debug reg mov instructions
More fallout from the writeback fixes: debug register transfer
instructions do their own writeback and thus need to disable the general
writeback mechanism.
This fixes oopses and some guest failures on AMD machines (the Intel
variant decodes the instruction in hardware and thus does not need
emulation).
Cc: Alistair John Strachan <alistair@devzero.co.uk> Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 7 Aug 2007 00:52:56 +0000 (17:52 -0700)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[NETFILTER]: Add xt_statistic.h to the header list for usermode programs
[BNX2]: Fix suspend/resume problem.
[TG3]: Fix suspend/resume problem.
Dave Airlie [Mon, 6 Aug 2007 23:09:51 +0000 (09:09 +1000)]
drm/i915: Fix i965 secured batchbuffer usage
This 965G and above chipsets moved the batch buffer non-secure bits to
another place. This means that previous drm's allowed in-secure batchbuffers
to be submitted to the hardware from non-privileged users who are logged
into X and and have access to direct rendering.
Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Francois Romieu [Wed, 1 Aug 2007 22:00:48 +0000 (00:00 +0200)]
r8169: avoid needless NAPI poll scheduling
Theory : though needless, it should not have hurt.
Practice: it does not play nice with DEBUG_SHIRQ + LOCKDEP + UP
(see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=242572).
The patch makes sense in itself but I should dig why it has an effect
on #242572 (assuming that NAPI do not change in a near future).
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Edward Hsu <edward_hsu@realtek.com.tw>
Daniel Drake [Fri, 27 Jul 2007 13:43:24 +0000 (15:43 +0200)]
[PATCH] mac80211: don't allow scanning in monitor mode
zd1211rw gets confused when the user asks for a scan when the device is
in monitor mode. This patch tightens up the SIWSCAN handler to deny the scan
under these conditions.
Signed-off-by: Daniel Drake <dsd@gentoo.org> Signed-off-by: Jiri Benc <jbenc@suse.cz> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Michael Buesch [Tue, 31 Jul 2007 18:41:04 +0000 (20:41 +0200)]
[PATCH] softmac: Fix deadlock of wx_set_essid with assoc work
The essid wireless extension does deadlock against the assoc mutex,
as we don't unlock the assoc mutex when flushing the workqueue, which
also holds the lock.
Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
While filling the control set the driver tests for a PSPOLL frame.
But it tested only the subtype of the packet. The full type needs
to be tested to identify those packets reliably.
[dsd@gentoo.org: backport to mainline] Signed-off-by: Ulrich Kunitz <kune@deine-taler.de> Signed-off-by: Daniel Drake <dsd@gentoo.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Michael Wu [Mon, 16 Jul 2007 00:09:55 +0000 (17:09 -0700)]
[PATCH] rtl8187: ensure priv->hwaddr is always valid
conf->mac_addr is not guaranteed to be set. This ensures priv->hwaddr is
always set to a valid mac address. Thanks to Johannes Berg
<johannes@sipsolutions.net> for finding this problem.
Signed-off-by: Michael Wu <flamingice@sourmilk.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Russell King [Mon, 6 Aug 2007 15:10:54 +0000 (16:10 +0100)]
[ARM] pata_icside: fix the FIXMEs
Alan Cox suggested that the solution to the FIXMEs in pata_icside is
to use a private postreset method to detect the lack of devices on a
port, and in such a case, disable the interrupt for the port.
This patch implements such a method, and removes the hard coded
disable of port 0. Tested as working.
Acked-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[CRYPTO] api: fix writting into unallocated memory in setkey_aligned
setkey_unaligned() commited in ca7c39385ce1a7b44894a4b225a4608624e90730
overwrites unallocated memory in the following memset() because
I used the wrong buffer length.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Dan Williams [Thu, 2 Aug 2007 16:08:51 +0000 (17:08 +0100)]
[ARM] 4541/1: iop: defconfig updates
With the availability of the iop-adma driver iop platforms can now use
their offload engines for md-raid5 (copy+xor) and net-dma (tcp receive
copy) offload.
Cc: Lennert Buytenhek <kernel@wantstofly.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Salyzyn, Mark [Thu, 2 Aug 2007 19:38:59 +0000 (15:38 -0400)]
[SCSI] aacraid: prevent panic on adapter resource failure
If the driver fails to allocate the contiguous (DMAable) memory for
system reasons, we fail to load the instance, but then we try to free
the <nul> allocation in the cleanup code and we get a panic in
pci_free_consistent(). This is reported against an older kernel, hope
this is relevant for latest/greatest.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
check_condition code-path was similar but more
complicated to Reset. It went like this:
1. extra space was allocated at aha152x_scdata for mirroring
scsi_cmnd members.
2. At aha152x_internal_queue() every not check_condition
(REQUEST_SENSE) command was copied to above members in
case of error.
3. At busfree_run() in the DONE_CS phase if a Status of
SAM_STAT_CHECK_CONDITION was detected. The command was
re-queued Internally using aha152x_internal_queue(,,check_condition,)
The old command members are over written with the
REQUEST_SENSE info.
4. At busfree_run() in the DONE_CS phase again. If it is a
check_condition command, info was restored from mirror
made at first call to aha152x_internal_queue() (see 2)
and the command is completed.
What I did is:
1. Allocate less space in aha152x_scdata only for the 16-byte
original command. (which is actually not needed by scsi-ml
anymore at this stage. But this is to much knowledge of scsi-ml)
2. If Status == SAM_STAT_CHECK_CONDITION, then like before
re-queue a REQUEST_SENSE command. But only now save original
command members. (Less of them)
3. In aha152x_internal_queue(), just like for Reset, use the
check_condition hint to set differently the working members.
execute the command.
4. At busfree_run() in the DONE_CS phase again. restore needed
members.
While at it. This patch fixes a BUG. Old code when sending
a REQUEST_SENSE for a failed command. Would than return with
cmd->resid == 0 which was the status of the REQUEST_SENSE.
The failing command resid was lost. And when would resid
be interesting if not on a failing command?
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
What Reset code was doing: Save command's important/dangerous
Info on stack. NULL those members from scsi_cmnd.
Issue a Reset. wait for it to finish than restore members
and return.
What I do is save or NULL nothing. But use the "resetting"
hint in aha152x_internal_queue() to NULL out working members
and leave struct scsi_cmnd alone.
The indent here looks funny but it will change/drop in last
patch and it is clear this way what changed.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
[SCSI] aha152x: preliminary fixes and some comments
hunk by hunk:
- CHECK_CONDITION is what happens to cmnd->status >> 1
or after status_byte() macro. But here it is used
directly on status which means 0x1 which is an undefined
bit in the standard. And is a status that will never
return from a target.
- in busfree_run at the DONE_SC phase we have 3 distinct
operation:
1-if(DONE_SC->SCp.phase & check_condition)
The REQUEST_SENSE command return.
- Restore original command
- Than continue to operation 3.
2-if(DONE_SC->SCp.Status==SAM_STAT_CHECK_CONDITION)
A regular command returned with a status.
- Internally re-Q a REQUEST_SENSE.
- Do not do operation 3.
3-
- Complete the command and return it to scsi-ml
So the 0x2 in both these operations (1,2) means the scsi
check-condition status, hence SAM_STAT_CHECK_CONDITION
- Here the code asks about !(DONE_SC->SCp.Status & not_issued)
but "not_issued" is an enum belonging to the "phase" member
and not to the Status returned from target. The reason this
works is because not_issued==1 and Also CHECK_CONDITION==1
(remember from hunk 1). So actually the code was asking
!(DONE_SC->SCp.Status & CHECK_CONDITION). Which means
"Has the status been read from target yet?"
Staus is read at status_run(). "not_issued" is
cleared in seldo_run() which is usually earlier than
status_run().
So this patch does nothing as far as assembly is concerned
but it does let the reader understand what is going on.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
James Bottomley [Fri, 3 Aug 2007 21:41:11 +0000 (16:41 -0500)]
[SCSI] sd: disentangle barriers in SCSI
Our current implementation has a generic set of barrier functions that
go through the SCSI driver model. Realistically, this is unnecessary,
because the only device that can use barriers (sd) can set the flush
functions up at probe or revalidate time. This patch pulls the barrier
functions out of the mid layer and scsi driver model and relocates them
directly in sd.
Acked-by: Tejun Heo <htejun@gmail.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>