git.karo-electronics.de Git - linux-beck.git/log

Li, Shaohua [Wed, 23 Mar 2011 07:30:34 +0000 (08:30 +0100)]

cfq-iosched: removing unnecessary think time checking

Remove think time checking. A high-thinktime queue might mean the queue
dispatches several requests and then goes away. Limiting such a queue seems
meaningless, and removing the check also simplifies the code. This was
suggested by Vivek.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Justin TerAvest [Wed, 23 Mar 2011 07:25:44 +0000 (08:25 +0100)]

cfq-iosched: Don't clear queue stats when preempt.

For v2, I added back lines to cfq_preempt_queue() that were removed
during updates for accounting unaccounted_time. Thanks for pointing out
that I'd missed these, Vivek.

Previous commit "cfq-iosched: Don't set active queue in preempt" wrongly
cleared stats for preempting queues when it shouldn't have, because when
we choose a queue to preempt, it still isn't necessarily scheduled next.

Thanks to Vivek Goyal for figuring this out and understanding how the
preemption code works.

Signed-off-by: Justin TerAvest <teravest@google.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Vivek Goyal [Tue, 22 Mar 2011 20:54:29 +0000 (21:54 +0100)]

blk-throttle: Reset group slice when limits are changed

Lina reported that if throttle limits are initially very high and then
dropped, no new bio might be dispatched for a long time. The reason is
that after dropping the limits we don't reset the existing slice, so the
rate calculation uses the new low rate while still accounting the bios
dispatched at the high rate. To fix it, reset the slice upon rate change.

https://lkml.org/lkml/2011/3/10/298

Another problem with a very high limit is that we never queued the
bio on the throtl service tree. That means we kept on extending the
group slice but never trimmed it. Fix that as well by regularly
trimming the slice even if the bio is not being queued up.
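
The stale-slice problem can be illustrated with a small user-space model. This is only a sketch under simplifying assumptions, not the blk-throttle code: struct toy_slice, toy_wait_ms() and toy_reset_slice() are hypothetical names, and the wait is modelled as the time needed for the already-dispatched bytes to fit the current rate.

```c
#include <assert.h>

/* Hypothetical model of a throttle slice (not the kernel structure). */
struct toy_slice {
    unsigned long start_ms;        /* when the slice began */
    unsigned long long bytes_disp; /* bytes dispatched within the slice */
};

/*
 * Milliseconds the next bio has to wait so that bytes_disp fits within
 * the rate bps (bytes per second) measured over the whole slice.
 */
static unsigned long toy_wait_ms(const struct toy_slice *s,
                                 unsigned long long bps,
                                 unsigned long now_ms)
{
    assert(bps > 0);
    unsigned long elapsed = now_ms - s->start_ms;
    /* time the slice would need to cover bytes_disp at the given rate */
    unsigned long long needed = s->bytes_disp * 1000ULL / bps;

    return needed > elapsed ? (unsigned long)(needed - elapsed) : 0;
}

/* Resetting the slice forgets bytes that were dispatched at the old rate. */
static void toy_reset_slice(struct toy_slice *s, unsigned long now_ms)
{
    s->start_ms = now_ms;
    s->bytes_disp = 0;
}
```

In this model, 100 MiB dispatched during the first second under a high limit translates into a 99 second wait once the limit drops to 1 MiB/s, whereas resetting the slice at the time of the change brings the wait back to zero.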

Reported-by: Lina Lu <lulina_nuaa@foxmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Justin TerAvest [Tue, 22 Mar 2011 20:26:54 +0000 (21:26 +0100)]

blk-cgroup: Only give unaccounted_time under debug

This change moves unaccounted_time to only be reported when
CONFIG_DEBUG_BLK_CGROUP is true.

Signed-off-by: Justin TerAvest <teravest@google.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Justin TerAvest [Tue, 22 Mar 2011 20:26:49 +0000 (21:26 +0100)]

cfq-iosched: Don't set active queue in preempt

Commit "Add unaccounted time to timeslice_used" changed the behavior of
cfq_preempt_queue to set cfqq active. Vivek pointed out that other
preemption rules might get involved, so we shouldn't manually set which
queue is active.

This cleans up the code to just clear the queue stats at preemption
time.

Signed-off-by: Justin TerAvest <teravest@google.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Shaohua Li [Tue, 22 Mar 2011 07:35:35 +0000 (08:35 +0100)]

block: fix non-atomic access to genhd inflight structures

After the stack plugging introduction, these are called lockless.
Ensure that the counters are updated atomically.
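
As a rough user-space analogy for the atomic update of in-flight counters, the sketch below uses C11 atomics instead of the kernel's own atomic primitives. struct toy_part and the toy_part_*() helpers are hypothetical names invented for this illustration; the two-slot layout (read, write) is an assumption.

```c
#include <stdatomic.h>

/* Hypothetical per-partition counters: slot 0 for reads, slot 1 for writes. */
struct toy_part {
    atomic_int in_flight[2];
};

static void toy_part_init(struct toy_part *p)
{
    atomic_init(&p->in_flight[0], 0);
    atomic_init(&p->in_flight[1], 0);
}

/* Safe to call without holding a lock: the read-modify-write is atomic. */
static void toy_part_inc(struct toy_part *p, int rw)
{
    atomic_fetch_add(&p->in_flight[rw], 1);
}

static void toy_part_dec(struct toy_part *p, int rw)
{
    atomic_fetch_sub(&p->in_flight[rw], 1);
}

/* Sum of both slots; each slot is read atomically. */
static int toy_part_in_flight(struct toy_part *p)
{
    return atomic_load(&p->in_flight[0]) + atomic_load(&p->in_flight[1]);
}
```

With plain int counters, two lockless callers could both read the same old value and lose an increment; the atomic read-modify-write avoids that.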

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Mon, 21 Mar 2011 09:14:27 +0000 (10:14 +0100)]

block: attempt to merge with existing requests on plug flush

One of the disadvantages of on-stack plugging is that we potentially
lose out on merging since all pending IO isn't always visible to
everybody. When we flush the on-stack plugs, right now we don't do
any checks to see if potential merge candidates could be utilized.

Correct this by adding a new insert variant, ELEVATOR_INSERT_SORT_MERGE.
It works just like ELEVATOR_INSERT_SORT, but first checks whether we can
merge with an existing request before doing the insertion (if we fail
merging).
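
The "merge first, otherwise sorted insert" idea can be sketched in user space as follows. This is a toy model, not the elevator code: struct toy_req, struct toy_sorted_queue and the toy_*() functions are hypothetical, requests are assumed not to overlap, and the queue is a small array kept sorted by start sector.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_REQS 32

/* A request covering sectors [start, start + nr). */
struct toy_req {
    long start;
    long nr;
};

struct toy_sorted_queue {
    struct toy_req reqs[TOY_MAX_REQS];
    size_t count;                     /* entries are kept sorted by start */
};

/* Try to glue r onto an existing request; returns 1 if it was merged. */
static int toy_try_merge(struct toy_sorted_queue *q, struct toy_req r)
{
    for (size_t i = 0; i < q->count; i++) {
        struct toy_req *e = &q->reqs[i];

        if (e->start + e->nr == r.start) {   /* back merge: r follows e */
            e->nr += r.nr;
            return 1;
        }
        if (r.start + r.nr == e->start) {    /* front merge: r precedes e */
            e->start = r.start;
            e->nr += r.nr;
            return 1;
        }
    }
    return 0;
}

/* Plain sorted insertion, the analogue of a sort-only insert. */
static void toy_insert_sort(struct toy_sorted_queue *q, struct toy_req r)
{
    assert(q->count < TOY_MAX_REQS);
    size_t i = q->count;

    while (i > 0 && q->reqs[i - 1].start > r.start) {
        q->reqs[i] = q->reqs[i - 1];
        i--;
    }
    q->reqs[i] = r;
    q->count++;
}

/* Merge if possible; fall back to sorted insertion when merging fails. */
static void toy_insert_sort_merge(struct toy_sorted_queue *q, struct toy_req r)
{
    if (!toy_try_merge(q, r))
        toy_insert_sort(q, r);
}
```

For example, inserting sectors 100+8 and 200+8 and then 108+8 leaves two requests in the queue, the first one grown to 100+16, instead of three separate requests.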

This fixes a regression with multiple processes issuing IO that
can be merged.

Thanks to Shaohua Li <shaohua.li@intel.com> for testing and fixing
an accounting bug.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Dan Carpenter [Sat, 19 Mar 2011 12:53:31 +0000 (13:53 +0100)]

block: NULL dereference on error path in __blkdev_get()

"disk" is always NULL when we goto out. There was a check for this
before, but it was removed in 69e02c59a7d9 "block: Don't check events
while open is in progress".

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@carl>

Justin TerAvest [Thu, 17 Mar 2011 15:12:36 +0000 (16:12 +0100)]

cfq-iosched: Don't update group weights when on service tree

Version 3 is updated to apply to for-2.6.39/core.

For version 2, I took Vivek's advice and made sure we update the group
weight from cfq_group_service_tree_add().

If a weight was updated while a group is on the service tree, the
calculation for the total weight of the service tree can be adjusted
improperly, which either leads to bad service tree weights, or
potentially crashes (if total_weight becomes 0).

This patch defers updates to the weight until a group is off the service
tree.
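
The deferral described above can be modelled in a few lines of user-space C. This is only a sketch: the struct and field names (toy_group, new_weight, needs_update, toy_tree) are assumptions made for the illustration, and only cfq_group_service_tree_add() is named in the commit message itself.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_group {
    unsigned int weight;      /* weight currently counted in the tree total */
    unsigned int new_weight;  /* pending weight requested by the user */
    bool needs_update;
    bool on_tree;
};

struct toy_tree {
    unsigned int total_weight;
};

/* Record the request only; weight stays untouched while on the tree. */
static void toy_set_weight(struct toy_group *g, unsigned int w)
{
    g->new_weight = w;
    g->needs_update = true;
}

/* Pending weight changes are applied here, before the group is counted. */
static void toy_tree_add(struct toy_tree *t, struct toy_group *g)
{
    assert(!g->on_tree);
    if (g->needs_update) {
        g->weight = g->new_weight;
        g->needs_update = false;
    }
    t->total_weight += g->weight;
    g->on_tree = true;
}

/* Subtracts exactly what toy_tree_add() added, so the total stays consistent. */
static void toy_tree_del(struct toy_tree *t, struct toy_group *g)
{
    assert(g->on_tree && t->total_weight >= g->weight);
    t->total_weight -= g->weight;
    g->on_tree = false;
}
```

Because the weight that is subtracted on removal is always the one that was added, the total cannot drift or underflow when the user changes the weight while the group is queued.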

Signed-off-by: Justin TerAvest <teravest@google.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Thu, 17 Mar 2011 10:13:12 +0000 (11:13 +0100)]

fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away

We don't have proper reference counting for this yet, so we run into
cases where the device is pulled and we OOPS on flushing the fs data.
This happens even though the dirty inodes have already been
migrated to the default_backing_dev_info.

Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Tested-by: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Martin K. Petersen [Thu, 17 Mar 2011 10:11:05 +0000 (11:11 +0100)]

block: Require subsystems to explicitly allocate bio_set integrity mempool

MD and DM create a new bio_set for every metadevice. Each bio_set has an
integrity mempool attached regardless of whether the metadevice is
capable of passing integrity metadata. This is a waste of memory.

Instead we defer the allocation decision to MD and DM since we know at
metadevice creation time whether integrity passthrough is needed or not.

Automatic integrity mempool allocation can then be removed from
bioset_create() and we make an explicit integrity allocation for the
fs_bio_set.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Zdenek Kabelac <zkabelac@redhat.com>
Acked-by: Mike Snitzer <snizer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Thu, 17 Mar 2011 10:01:52 +0000 (11:01 +0100)]

jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging

'write_op' was still used, even though it was always WRITE_SYNC now.
Add plugging around the cases where it submits IO, and flush them
before we end up waiting for that IO.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Thu, 17 Mar 2011 09:56:45 +0000 (10:56 +0100)]

jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging

'write_op' was still used, even though it was always WRITE_SYNC now.
Add plugging around the cases where it submits IO, and flush them
before we end up waiting for that IO.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Thu, 17 Mar 2011 09:51:40 +0000 (10:51 +0100)]

fs: make fsync_buffers_list() plug

It used WRITE_SYNC_PLUG before and potentially submits a batch
of IO, so let's enable plugging for this case.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Shaohua Li [Thu, 17 Mar 2011 09:47:06 +0000 (10:47 +0100)]

mm: make generic_writepages() use plugging

This recovers a performance regression caused by the removal
of the per-device plugging.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Justin TerAvest [Sat, 12 Mar 2011 15:54:00 +0000 (16:54 +0100)]

blk-cgroup: Add unaccounted time to timeslice_used.

There are two kinds of time that tasks are not charged for: the first
seek and the extra time slice used over the allocated timeslice. Both
of these are exported as a new unaccounted_time stat.

I think it would be good to have this reported in 'time' as well, but
that is probably a separate discussion.

Signed-off-by: Justin TerAvest <teravest@google.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Fri, 11 Mar 2011 19:17:08 +0000 (20:17 +0100)]

block: fixup plugging stubs for !CONFIG_BLOCK

They used an older prototype, fix it up.

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Tao Ma [Fri, 11 Mar 2011 19:13:54 +0000 (20:13 +0100)]

block: remove obsolete comments for blkdev_issue_zeroout.

barrier is already removed, so remove the obsolete comments
in blkdev_issue_zeroout.

Cc: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Tao Ma [Fri, 11 Mar 2011 19:11:59 +0000 (20:11 +0100)]

blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.

In blk_add_trace_rq, we only took the low 2 bits of the
request's cmd_flags and did some checking for discard,
so most of the other flags (e.g., REQ_SYNC) are missing.

For example, with a sync write after blkparse we get:
  8,16   1        1     0.001776503  7509  A  WS 1349632 + 1024 <- (8,17) 1347584
  8,16   1        2     0.001776813  7509  Q  WS 1349632 + 1024 [dd]
  8,16   1        3     0.001780395  7509  G  WS 1349632 + 1024 [dd]
  8,16   1        5     0.001783186  7509  I   W 1349632 + 1024 [dd]
  8,16   1       11     0.001816987  7509  D   W 1349632 + 1024 [dd]
  8,16   0        2     0.006218192     0  C   W 1349632 + 1024 [0]

Since now we have integrated the flags of both bio and request,
it is safe to pass rq->cmd_flags directly to __blk_add_trace.

With this patch, after a sync write we get:
  8,16   1        1     0.001776900  5425  A  WS 1189888 + 1024 <- (8,17) 1187840
  8,16   1        2     0.001777179  5425  Q  WS 1189888 + 1024 [dd]
  8,16   1        3     0.001780797  5425  G  WS 1189888 + 1024 [dd]
  8,16   1        5     0.001783402  5425  I  WS 1189888 + 1024 [dd]
  8,16   1       11     0.001817468  5425  D  WS 1189888 + 1024 [dd]
  8,16   0        2     0.005640709     0  C  WS 1189888 + 1024 [0]
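
The effect of masking off everything but the low two bits can be shown with a tiny sketch. The bit positions below are hypothetical values picked only for this illustration, not the kernel's REQ_* definitions, and the toy_*() helpers are likewise invented.

```c
/* Hypothetical flag bits, chosen only for this sketch. */
#define TOY_REQ_WRITE (1u << 0)
#define TOY_REQ_SYNC  (1u << 4)

/* Old behaviour in this model: keep only the low two bits. */
static unsigned int toy_trace_flags_old(unsigned int cmd_flags)
{
    return cmd_flags & 0x3u;
}

/* New behaviour in this model: pass the flags through unchanged. */
static unsigned int toy_trace_flags_new(unsigned int cmd_flags)
{
    return cmd_flags;
}
```

A sync write passed through the old helper keeps the write bit but loses the sync bit, which corresponds to the "W" instead of "WS" seen in the I, D and C events of the first trace.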

Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Thu, 10 Mar 2011 07:58:35 +0000 (08:58 +0100)]

Merge branch 'for-2.6.39/stack-plug' into for-2.6.39/core

Conflicts:
block/blk-core.c
block/blk-flush.c
drivers/md/raid1.c
drivers/md/raid10.c
drivers/md/raid5.c
fs/nilfs2/btnode.c
fs/nilfs2/mdt.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Vivek Goyal [Wed, 9 Mar 2011 07:27:37 +0000 (08:27 +0100)]

blk-throttle: Use blk_plug in throttle dispatch

Use plug in throttle dispatch also as we are dispatching a bunch of
bios in throttle context and some of them might merge.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Wed, 9 Mar 2011 10:56:30 +0000 (11:56 +0100)]

block: kill off REQ_UNPLUG

With the plugging now being explicitly controlled by the
submitter, callers need not pass down unplugging hints
to the block layer. If they want to unplug, it's because they
manually plugged on their own - in which case, they should just
unplug at will.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Thu, 3 Mar 2011 01:12:18 +0000 (20:12 -0500)]

aio: remove request submission batching

This should be useless now that we have on-stack plugging. So let's just
kill it.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Shaohua Li [Thu, 1 Jul 2010 05:55:01 +0000 (07:55 +0200)]

fs: make aio plug

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Tue, 22 Jun 2010 10:52:14 +0000 (12:52 +0200)]

fs: make mpage read/write_pages() plug

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Mon, 19 Apr 2010 08:04:38 +0000 (10:04 +0200)]

read-ahead: use plugging

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Thu, 24 Jun 2010 13:05:37 +0000 (15:05 +0200)]

fs: make generic file read/write functions plug

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Thu, 10 Mar 2011 07:52:07 +0000 (08:52 +0100)]

block: remove per-queue plugging

Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So let's kill off the old plugging along with aops->sync_page().

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Tue, 8 Mar 2011 12:19:51 +0000 (13:19 +0100)]

block: initial patch for on-stack per-task plugging

This patch adds support for creating a queuing context outside
of the queue itself. This enables us to batch up pieces of IO
before grabbing the block device queue lock and submitting them to
the IO scheduler.

The context is created on the stack of the process and assigned in
the task structure, so that we can auto-unplug it if we hit a schedule
event.

The current queue plugging happens implicitly if IO is submitted to
an empty device, yet callers have to remember to unplug that IO when
they are going to wait for it. This is an ugly API and has caused bugs
in the past. Additionally, it requires hacks in the vm (->sync_page()
callback) to handle that logic. By switching to an explicit plugging
scheme we make the API a lot nicer and can get rid of the ->sync_page()
hack in the vm.
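
The batching idea behind an explicit, caller-owned plug can be sketched as a user-space toy. This does not reproduce the kernel interface: struct toy_queue, struct toy_plug and the toy_*() functions are hypothetical, the "lock" is just a counter, and the automatic unplug on a schedule event is not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PLUG_MAX 16

/* Stand-in for a device queue; lock_taken counts lock round trips. */
struct toy_queue {
    int lock_taken;
    int submitted;
};

/* Stand-in for the plug: lives on the caller's stack and batches IO. */
struct toy_plug {
    struct toy_queue *q;
    long pending[TOY_PLUG_MAX];
    size_t count;
};

static void toy_start_plug(struct toy_plug *plug, struct toy_queue *q)
{
    plug->q = q;
    plug->count = 0;
}

/* Hand the whole batch to the queue under a single lock acquisition. */
static void toy_flush_plug(struct toy_plug *plug)
{
    if (plug->count == 0)
        return;
    plug->q->lock_taken++;
    plug->q->submitted += (int)plug->count;
    plug->count = 0;
}

/* Queue IO privately; the device queue is not touched until a flush. */
static void toy_submit(struct toy_plug *plug, long sector)
{
    assert(plug->q != NULL);
    if (plug->count == TOY_PLUG_MAX)
        toy_flush_plug(plug);         /* batch full: flush early */
    plug->pending[plug->count++] = sector;
}

/* The explicit unplug: the caller flushes before it waits for the IO. */
static void toy_finish_plug(struct toy_plug *plug)
{
    toy_flush_plug(plug);
}
```

A caller declares a struct toy_plug as a local variable, calls toy_start_plug(), submits its IO, and calls toy_finish_plug() before waiting. Eight submissions then cost one lock round trip instead of eight in this model.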

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Fri, 16 Apr 2010 19:13:15 +0000 (21:13 +0200)]

scsi: convert to blk_delay_queue()

It was always abuse to reuse the plugging infrastructure for this,
convert it to the (new) real API for delaying queueing a bit. A
default delay of 3 msec is defined, to match the previous
behaviour.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Jens Axboe [Fri, 16 Apr 2010 19:11:21 +0000 (21:11 +0200)]

ide-cd: convert to blk_delay_queue() for a short pause

It was always abuse to reuse the plugging infrastructure for this,
convert it to the (new) real API for delaying queueing a bit.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Acked-by: David S. Miller <davem@davemloft.net>

Jens Axboe [Wed, 2 Mar 2011 16:08:00 +0000 (11:08 -0500)]

block: add API for delaying work/request_fn a little bit

Currently we use plugging for that, but as plugging is going away,
we need an alternative mechanism.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>

Tejun Heo [Wed, 9 Mar 2011 18:54:29 +0000 (19:54 +0100)]

staging: Convert to bdops->check_events()

Convert two staging drivers - blkvsc_drv and cyasblkdev_block - from
->media_changed() to ->check_events(). The former always indicated
media changed while the latter always indicated media not changed.
Not sure what the drivers are trying to achieve but keep the original
behavior.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kay Sievers <kay.sievers@vrfy.org>
