Right now the kernel still has one piece of rbd management
duplicated from the rbd command line tool: snapshot creation.
There's nothing special about snapshot creation that makes it
advantageous to do from the kernel, so I'd like to remove the
create_snap sysfs interface. That is,
/sys/bus/rbd/devices/<id>/create_snap
would be removed.
Does anyone rely on the sysfs interface for creating rbd
snapshots? If so, how hard would it be to replace with:
rbd snap create pool/image@snap
Is there any benefit to the sysfs interface that I'm missing?
Josh
This patch implements this proposal, removing the code that
implements the "snap_create" sysfs interface for rbd images.
As a result, quite a lot of other supporting code goes away.
[elder@inktank.com: commented out rbd_req_sync_exec() to avoid warning]
Suggested-by: Josh Durgin <josh.durgin@inktank.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
(based on commit 02cdb02ceab1f3dd9ac2bc899fc51f0e0e744782)
Alex Elder [Fri, 7 Dec 2012 15:57:58 +0000 (09:57 -0600)]
libceph: avoid using freed osd in __kick_osd_requests()
If an osd has no requests and no linger requests, __reset_osd()
will just remove it with a call to __remove_osd(). That drops
a reference to the osd, and therefore the osd may have been free
by the time __reset_osd() returns. That function offers no
indication this may have occurred, and as a result the osd will
continue to be used even when it's no longer valid.
Change__reset_osd() so it returns an error (ENODEV) when it
deletes the osd being reset. And change __kick_osd_requests() so it
returns immediately (before referencing osd again) if __reset_osd()
returns *any* error.
Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 685a7555ca69030739ddb57a47f0ea8ea80196a4) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sage Weil [Mon, 29 Oct 2012 18:01:42 +0000 (11:01 -0700)]
libceph: fix osdmap decode error paths
Ensure that we set the err value correctly so that we do not pass a 0
value to ERR_PTR and confuse the calling code. (In particular,
osd_client.c handle_map() will BUG(!newmap)).
Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
(cherry picked from commit 0ed7285e0001b960c888e5455ae982025210ed3d) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We should not set con->state to CLOSED here; that happens in
ceph_fault() in the caller, where it first asserts that the state
is not yet CLOSED. Avoids a BUG when the features don't match.
Since the fail_protocol() has become a trivial wrapper, replace
calls to it with direct calls to reset_connection().
Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
(cherry picked from commit 0fa6ebc600bc8e830551aee47a0e929e818a1868) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Elder [Wed, 26 Dec 2012 16:43:57 +0000 (10:43 -0600)]
libceph: WARN, don't BUG on unexpected connection states
A number of assertions in the ceph messenger are implemented with
BUG_ON(), killing the system if connection's state doesn't match
what's expected. At this point our state model is (evidently) not
well understood enough for these assertions to trigger a BUG().
Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...)
so we learn about these issues without killing the machine.
We now recognize that a connection fault can occur due to a socket
closure at any time, regardless of the state of the connection. So
there is really nothing we can assert about the state of the
connection at that point so eliminate that assertion.
Reported-by: Ugis <ugis22@gmail.com> Tested-by: Ugis <ugis22@gmail.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 122070a2ffc91f87fe8e8493eb0ac61986c5557c) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Elder [Wed, 26 Dec 2012 20:31:40 +0000 (14:31 -0600)]
libceph: always reset osds when kicking
When ceph_osdc_handle_map() is called to process a new osd map,
kick_requests() is called to ensure all affected requests are
updated if necessary to reflect changes in the osd map. This
happens in two cases: whenever an incremental map update is
processed; and when a full map update (or the last one if there is
more than one) gets processed.
In the former case, the kick_requests() call is followed immediately
by a call to reset_changed_osds() to ensure any connections to osds
affected by the map change are reset. But for full map updates
this isn't done.
Both cases should be doing this osd reset.
Rather than duplicating the reset_changed_osds() call, move it into
the end of kick_requests().
Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit e6d50f67a6b1a6252a616e6e629473b5c4277218) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Elder [Wed, 19 Dec 2012 21:52:36 +0000 (15:52 -0600)]
libceph: move linger requests sooner in kick_requests()
The kick_requests() function is called by ceph_osdc_handle_map()
when an osd map change has been indicated. Its purpose is to
re-queue any request whose target osd is different from what it
was when it was originally sent.
It is structured as two loops, one for incomplete but registered
requests, and a second for handling completed linger requests.
As a special case, in the first loop if a request marked to linger
has not yet completed, it is moved from the request list to the
linger list. This is as a quick and dirty way to have the second
loop handle sending the request along with all the other linger
requests.
Because of the way it's done now, however, this quick and dirty
solution can result in these incomplete linger requests never
getting re-sent as desired. The problem lies in the fact that
the second loop only arranges for a linger request to be sent
if it appears its target osd has changed. This is the proper
handling for *completed* linger requests (it avoids issuing
the same linger request twice to the same osd).
But although the linger requests added to the list in the first loop
may have been sent, they have not yet completed, so they need to be
re-sent regardless of whether their target osd has changed.
The first required fix is we need to avoid calling __map_request()
on any incomplete linger request. Otherwise the subsequent
__map_request() call in the second loop will find the target osd
has not changed and will therefore not re-send the request.
Second, we need to be sure that a sent but incomplete linger request
gets re-sent. If the target osd is the same with the new osd map as
it was when the request was originally sent, this won't happen.
This can be fixed through careful handling when we move these
requests from the request list to the linger list, by unregistering
the request *before* it is registered as a linger request. This
works because a side-effect of unregistering the request is to make
the request's r_osd pointer be NULL, and *that* will ensure the
second loop actually re-sends the linger request.
Processing of such a request is done at that point, so continue with
the next one once it's been moved.
Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit ab60b16d3c31b9bd9fd5b39f97dc42c52a50b67d) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Elder [Thu, 6 Dec 2012 13:22:04 +0000 (07:22 -0600)]
libceph: register request before unregister linger
In kick_requests(), we need to register the request before we
unregister the linger request. Otherwise the unregister will
reset the request's osd pointer to NULL.
Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit c89ce05e0c5a01a256100ac6a6019f276bdd1ca6) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Elder [Mon, 17 Dec 2012 18:23:48 +0000 (12:23 -0600)]
libceph: don't use rb_init_node() in ceph_osdc_alloc_request()
The red-black node in the ceph osd request structure is initialized
in ceph_osdc_alloc_request() using rbd_init_node(). We do need to
initialize this, because in __unregister_request() we call
RB_EMPTY_NODE(), which expects the node it's checking to have
been initialized. But rb_init_node() is apparently overkill, and
may in fact be on its way out. So use RB_CLEAR_NODE() instead.
For a little more background, see this commit: 4c199a93 rbtree: empty nodes have no color"
Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit a978fa20fb657548561dddbfb605fe43654f0825) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Elder [Mon, 17 Dec 2012 18:23:48 +0000 (12:23 -0600)]
libceph: init event->node in ceph_osdc_create_event()
The red-black node node in the ceph osd event structure is not
initialized in create_osdc_create_event(). Because this node can
be the subject of a RB_EMPTY_NODE() call later on, we should ensure
the node is initialized properly for that.
Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 3ee5234df68d253c415ba4f2db72ad250d9c21a9) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Elder [Thu, 6 Dec 2012 13:22:04 +0000 (07:22 -0600)]
libceph: init osd->o_node in create_osd()
The red-black node node in the ceph osd structure is not initialized
in create_osd(). Because this node can be the subject of a
RB_EMPTY_NODE() call later on, we should ensure the node is
initialized properly for that. Add a call to RB_CLEAR_NODE()
initialize it.
Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit f407731d12214e7686819018f3a1e9d7b6f83a02) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Elder [Fri, 14 Dec 2012 22:47:41 +0000 (16:47 -0600)]
libceph: report connection fault with warning
When a connection's socket disconnects, or if there's a protocol
error of some kind on the connection, a fault is signaled and
the connection is reset (closed and reopened, basically). We
currently get an error message on the log whenever this occurs.
A ceph connection will attempt to reestablish a socket connection
repeatedly if a fault occurs. This means that these error messages
will get repeatedly added to the log, which is undesirable.
Change the error message to be a warning, so they don't get
logged by default.
Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 28362986f8743124b3a0fda20a8ed3e80309cce1) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Elder [Sat, 8 Dec 2012 01:50:07 +0000 (19:50 -0600)]
libceph: socket can close in any connection state
A connection's socket can close for any reason, independent of the
state of the connection (and without irrespective of the connection
mutex). As a result, the connectino can be in pretty much any state
at the time its socket is closed.
Handle those other cases at the top of con_work(). Pull this whole
block of code into a separate function to reduce the clutter.
Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 7bb21d68c535ad8be38e14a715632ae398b37ac1) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sage Weil [Tue, 25 Sep 2012 04:01:02 +0000 (21:01 -0700)]
ceph: propagate layout error on osd request creation
If we are creating an osd request and get an invalid layout, return
an EINVAL to the caller. We switch up the return to have an error
code instead of NULL implying -ENOMEM.
Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>
(cherry picked from commit 6816282dab3a72efe8c0d182c1bc2960d87f4322) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Josh Durgin [Mon, 5 Dec 2011 22:03:05 +0000 (14:03 -0800)]
rbd: use reference counting for the snap context
This prevents a race between requests with a given snap context and
header updates that free it. The osd client was already expecting the
snap context to be reference counted, since it get()s it in
ceph_osdc_build_request and put()s it when the request completes.
Also remove the second down_read()/up_read() on header_rwsem in
rbd_do_request, which wasn't actually preventing this race or
protecting any other data.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Reviewed-by: Alex Elder <elder@inktank.com>
(cherry picked from commit d1d25646543134d756a02ffe4e02073faa761f2c) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Josh Durgin [Tue, 22 Nov 2011 02:14:25 +0000 (18:14 -0800)]
rbd: return errors for mapped but deleted snapshot
When a snapshot is deleted, the OSD will return ENOENT when reading
from it. This is normally interpreted as a hole by rbd, which will
return zeroes. To minimize the time in which this can happen, stop
requests early when we are notified that our snapshot no longer
exists.
Sage Weil [Mon, 30 Jul 2012 23:21:17 +0000 (16:21 -0700)]
ceph: close old con before reopening on mds reconnect
When we detect a mds session reset, close the old ceph_connection before
reopening it. This ensures we clean up the old socket properly and keep
the ceph_connection state correct.
Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
(cherry picked from commit a53aab645c82f0146e35684b34692c69b5118121) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This was meant to be the purpose of the
intel_crtc_wait_for_pending_flips() function which is called whilst
preparing the CRTC for a modeset or before disabling. However, as Ville
Syrjala pointed out, we set the pending flip notification on the old
framebuffer that is no longer attached to the CRTC by the time we come
to flush the pending operations. Instead, we can simply wait on the
pending unpin work to be finished on this CRTC, knowning that the
hardware has therefore finished modifying the registers, before proceeding
with our direct access.
Fixes i-g-t/flip_test on non-pch platforms. pch platforms simply
schedule the flip immediately when the pipe is disabled, leading
to other funny issues.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Added i-g-t note and cc: stable] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Julien Cristau <jcristau@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
... since finish_page_flip needs the vblank timestamp generated
in drm_handle_vblank. Somehow all the gmch platforms get it right,
but all the pch platform irq handlers get is wrong. Hooray for copy&
pasting!
Currently this gets papered over by a gross hack in finish_page_flip.
A second patch will remove that.
Note that without this, the new timestamp sanity checks in flip_test
occasionally get tripped up, hence the cc: stable tag.
Reviewed-by: mario.kleiner@tuebingen.mpg.de Tested-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[bwh: Backported to 3.2: no loop over pipes in ivybridge_irq_handler(),
so make a similar change to that in ironlake_irq_handler()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Julien Cristau <jcristau@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
I can't even find how I figured this might be needed anymore. But sure
enough, the value I'm reading back on platforms doesn't match what the
docs recommends.
It seemed to fix Chris' GT1 in limited testing as well.
Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[bwh: Backported to 3.2: open-code _MASKED_BIT_{ENABLE,DISABLE}] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Julien Cristau <jcristau@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pin-leaks persist and we get the perennial bug reports of machine
lockups to the BUG_ON(pin_count==MAX). If we instead loudly report that
the object cannot be pinned at that time it should prevent the driver from
locking up, and hopefully restore a semblance of working whilst still
leaving us a OOPS to debug.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Julien Cristau <jcristau@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Avoid constant wakeups caused by noisy irq lines when we don't even care
about the irq. This should be particularly useful for i945g/gm where the
hotplug has been disabled:
v2: While at it, remove the bogus hotplug_active read, and do not mask
hotplug_active[0] before checking whether the irq is needed, per discussion
with Daniel on IRC.
Note that gen3 is the only platform where we've got the bit
definitions right, hence the workaround of disabling sdvo hotplug
support on i945g/gm is not due to misdiagnosis of broken hotplug irq
handling ...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: add some blurb about sdvo hotplug fail on i945g/gm I've
wondered about while reviewing.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[bwh: Backported to 3.2:
- Adjust context
- Handle all three cases in i915_driver_irq_postinstall() as there
are not separate functions for gen3 and gen4+
- Carry on using IS_SDVOB() in intel_sdvo_init()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Julien Cristau <jcristau@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On IVB and older, we basically have two registers: the control and the
data register. We write a few consecutitve times to the control
register, and we need these writes to arrive exactly in the specified
order.
Also, when we're changing the data register, we need to guarantee that
anything written to the control register already arrived (since
changing the control register can change where the data register
points to). Also, we need to make sure all the writes to the data
register happen exactly in the specified order, and we also *can't*
read the data register during this process, since reading and/or
writing it will change the place it points to.
So invoke the "better safe than sorry" rule and just be careful and
put barriers everywhere :)
On HSW we still have a control register that we write many times, but
we have many data registers.
Demanded-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[bwh: Backported to 3.2:
- There are only two write_infoframe functions to be modified
- The other VIDEO_DIP_CTL writes are in entirely different functions] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Julien Cristau <jcristau@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Selecting ATOM_PPLL_INVALID should be equivalent as the
DCPLL or PPLL0 are already programmed for the DISPCLK, but
the preferred method is to always specify the PLL selected.
SetPixelClock will check the parameters and skip the
programming if the PLL is already set up.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Julien Cristau <jcristau@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
... we will botch up the bit17 swizzling. Furthermore tiled pwrite is
a (now) unused slowpath, so no one really cares.
This fixes the last swizzling issues I have with i-g-t on my bit17
swizzling i915G. No regression, it's been broken since the dawn of
gem, but it's nice for regression tracking when really _all_ i-g-t
tests work.
Actually this is not true, Chris Wilson noticed while reviewing this
patch that the commit
During modeset we have to disable the pipe to reconfigure its timings
and maybe its size. Userspace may have queued up command buffers that
depend upon the pipe running in a certain configuration and so the
commands may become confused across the modeset. At the moment, we use a
less than satisfactory kick-scanline-waits should the GPU hang during
the modeset. It should be more reliable to wait for the pending
operations to complete first, even though we still have a window for
userspace to submit a broken command buffer during the modeset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Julien Cristau <jcristau@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
I have seen a number of "blt ring initialization failed" messages
where the ctl or start registers are not the correct value. Upon further
inspection, if the code just waited a little bit, it would read the
correct value. Adding the wait_for to these reads should eliminate the
issue.
Signed-off-by: Sean Paul <seanpaul@chromium.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Julien Cristau <jcristau@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently i915 driver checks [PCH_]LVDS register bits to decide
whether to set up the dual-link or the single-link mode. This relies
implicitly on that BIOS initializes the register properly at boot.
However, BIOS doesn't initialize it always. When the machine is
booted with the closed lid, BIOS skips the LVDS reg initialization.
This ends up in blank output on a machine with a dual-link LVDS when
you open the lid after the boot.
This patch adds a workaround for that problem by checking the initial
LVDS register value in VBT.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37742 Tested-By: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Julien Cristau <jcristau@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Empirical evidence suggests that we need to: On at least one ivb
machine when running the hangman i-g-t test, the rings don't properly
initialize properly - the RING_START registers seems to be stuck at
all zeros.
Holding forcewake around this register init sequences makes chip reset
reliable again. Note that this is not the first such issue:
added delay loops to make RING_START and RING_CTL initialization
reliable on the blt ring at boot-up. So I guess it won't hurt if we do
this unconditionally for all force_wake needing gpus.
To avoid copy&pasting of the HAS_FORCE_WAKE check I've added a new
intel_info bit for that.
v2: Fixup missing commas in static struct and properly handling the
error case in init_ring_common, both noticed by Jani Nikula.
Reported-and-tested-by: Yang Guang <guang.a.yang@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50522 Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[bwh: Backported to 3.2:
- drop changes to Haswell device information
- NEEDS_FORCE_WAKE didn't refer to Valley View anyway] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
[jcristau: further context adjustments for 3.4] Signed-off-by: Julien Cristau <jcristau@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Or at least plug another gapping hole. Apparrently hw desingers only
moved the bit field, but did not bother ot re-enumerate the planes
when adding support for a 3rd pipe.
Discovered by i-g-t/flip_test.
This may or may not fix the reference bugzilla, because that one
smells like we have still larger fish to fry.
v2: Fixup the impossible case to catch programming errors, noticed by
Chris Wilson.
References: https://bugs.freedesktop.org/show_bug.cgi?id=50069 Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Julien Cristau <jcristau@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drm/i915: set phase sync pointer override enable before setting phase sync pointer
It turns out that this w/a isn't actually required on cpt/ppt and
positively harmful on ivb/ppt when using fdi B/C links - it results in
a black screen occasionally, with seemingfully everything working as
it should. The only failure indication I've found in the hw is that
eventually (but not right after the modeset completes) a pipe underrun
is signalled.
Big thanks to Arthur Runyan for all the ideas for registers to check
and changes to test, otherwise I couldn't ever have tracked this down!
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: "Runyan, Arthur J" <arthur.j.runyan@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: CAI Qian <caiqian@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In gfs2_trans_add_bh(), gfs2 was testing if a there was a bd attached to the
buffer without having the gfs2_log_lock held. It was then assuming it would
stay attached for the rest of the function. However, without either the log
lock being held of the buffer locked, __gfs2_ail_flush() could detach bd at any
time. This patch moves the locking before the test. If there isn't a bd
already attached, gfs2 can safely allocate one and attach it before locking.
There is no way that the newly allocated bd could be on the ail list,
and thus no way for __gfs2_ail_flush() to detach it.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A high speed control or bulk endpoint may have bInterval set to zero,
which means it does not NAK. If bInterval is non-zero, it means the
endpoint NAKs at a rate of 2^(bInterval - 1).
The xHCI code to compute the NAK interval does not handle the special
case of zero properly. The current code unconditionally subtracts one
from bInterval and uses it as an exponent. This causes a very large
bInterval to be used, and warning messages like these will be printed:
usb 1-1: ep 0x1 - rounding interval to 32768 microframes, ep desc says 0 microframes
This may cause the xHCI host hardware to reject the Configure Endpoint
command, which means the HS device will be unusable under xHCI ports.
This patch should be backported to kernels as old as 2.6.31, that contain
commit dfa49c4ad120a784ef1ff0717168aa79f55a483a "USB: xhci - fix math in
xhci_get_endpoint_interval()".
Reported-by: Vincent Pelletier <plr.vincent@gmail.com> Suggested-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some touchscreens have buggy firmware which claims
remote wakeup to be enabled after a reset. They nevertheless
crash if the feature is cleared by the host.
Add a check for reset resume before checking for
an enabled remote wakeup feature. On compliant
devices the feature must be cleared after a reset anyway.
Signed-off-by: Oliver Neukum <oneukum@suse.de> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The USB core hub thread (khubd) is designed with external USB hubs in
mind. It expects that if a port status change bit is set, the hub will
continue to send a notification through the hub status data transfer.
Basically, it expects hub notifications to be level-triggered.
The xHCI host controller is designed to be edge-triggered on the logical
'OR' of all the port status change bits. When all port status change
bits are clear, and a new change bit is set, the xHC will generate a
Port Status Change Event. If another change bit is set in the same port
status register before the first bit is cleared, it will not send
another event.
This means that the hub code may lose port status changes because of
race conditions between clearing change bits. The user sees this as a
"dead port" that doesn't react to device connects.
The fix is to turn on port polling whenever a new change bit is set.
Once the USB core issues a hub status request that shows that no change
bits are set in any USB ports, turn off port polling.
We can't allow the USB core to poll the roothub for port events during
host suspend because if the PCI host is in D3cold, the port registers
will be all f's. Instead, stop the port polling timer, and
unconditionally restart it when the host resumes. If there are no port
change bits set after the resume, the first call to hub_status_data will
disable polling.
This patch should be backported to stable kernels with the first xHCI
support, 2.6.31 and newer, that include the commit 0f2a79300a1471cf92ab43af165ea13555c8b0a5 "USB: xhci: Root hub support."
There will be merge conflicts because the check for HC_STATE_SUSPENDED
was moved into xhci_suspend in 3.8.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
An empty port can transition to either Inactive or Compliance Mode if a
newly connected USB 3.0 device fails to link train. In that case, we
issue a warm reset. Some devices, such as John's Roseweil eusb3
enclosure, slip back into Compliance Mode after the warm reset.
The current warm reset code does not check for device connect status on
warm reset completion, and it incorrectly reports the warm reset
succeeded. This causes the USB core to attempt to send a Set Address
control transfer to a port in Compliance Mode, which will always fail.
Make hub_port_wait_reset check the current connect status and link state
after the warm reset completes. Return a failure status if the device
is disconnected or the link state is Compliance Mode or SS.Inactive.
Make hub_events disable the port if warm reset fails. This will disable
the port, and then bring it back into the RxDetect state. Make the USB
core ignore the connect change until the device reconnects.
Note that this patch does NOT handle connected devices slipping into the
Inactive state very well. This is a concern, because devices can go
into the Inactive state on U1/U2 exit failure. However, the fix for
that case is too large for stable, so it will be submitted in a separate
patch.
The port reset code bails out early if the current connect status is
cleared (device disconnected). If we're issuing a hot reset, it may
also look at the link state before the reset is finished.
Section 10.14.2.6 of the USB 3.0 spec says that when a port enters the
Error state or Resetting state, the port connection bit retains the
value from the previous state. Therefore we can't trust it until the
reset finishes. Also, the xHCI spec section 4.19.1.2.5 says software
shall ignore the link state while the port is resetting, as it can be in
an unknown state.
The port state during reset is also unknown for USB 2.0 hubs. The hub
sends a reset signal by driving the bus into an SE0 state. This
overwhelms the "connect" signal from the device, so the port can't tell
whether anything is connected or not.
Fix the port reset code to ignore the port link state and current
connect bit until the reset finishes, and USB_PORT_STAT_RESET is
cleared.
Remove the check for USB_PORT_STAT_C_BH_RESET in the warm reset case,
because it's redundant. When the warm reset finishes, the port reset
bit will be cleared at the same time USB_PORT_STAT_C_BH_RESET is set.
Remove the now-redundant check for a cleared USB_PORT_STAT_RESET bit
in the code to deal with the finished reset.
This patch should be backported to all stable kernels.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
John's NEC 0.96 xHCI host controller needs a longer timeout for a warm
reset to complete. The logs show it takes 650ms to complete the warm
reset, so extend the hub reset timeout to 800ms to be on the safe side.
If hot and warm reset fails, or a port remains in the Compliance Mode,
the USB core needs to be able to disable a USB 3.0 port. Unlike USB 2.0
ports, once the port is placed into the Disabled link state, it will not
report any new device connects. To get device connect notifications, we
need to put the link into the Disabled state, and then the RxDetect
state.
The xHCI driver needs to atomically clear all change bits on USB 3.0
port disable, so that we get Port Status Change Events for future port
changes. We could technically do this in the USB core instead of in the
xHCI roothub code, since the port state machine can't advance out of the
disabled state until we set the link state to RxDetect. However,
external USB 3.0 hubs don't need this code. They are level-triggered,
not edge-triggered like xHCI, so they will continue to send interrupt
events when any change bit is set. Therefore it doesn't make sense to
put this code in the USB core.
This patch is part of a series to fix several reports of infinite loops
on device enumeration failure. This includes John, when he boots with
a USB 3.0 device (Roseweil eusb3 enclosure) attached to his NEC 0.96
host controller. The fix requires warm reset support, so it does not
make sense to backport this patch to stable kernels without warm reset
support.
When the USB core finishes reseting a USB device, the xHCI driver sends
a Reset Device command to the host. The xHC then updates its internal
representation of the USB device to the 'Default' device state. If the
device was already in the Default state, the xHC will complete the
command with an error status.
If a device needs to be reset several times during enumeration, the
second reset will always fail because of the xHCI Reset Device command.
This can cause issues during enumeration.
For example, usb_reset_and_verify_device calls into hub_port_init in a
loop. Say that on the first call into hub_port_init, the device is
successfully reset, but doesn't respond to several set address control
transfers. Then the port will be disabled, but the udev will remain in
tact. usb_reset_and_verify_device will call into hub_port_init again.
On the second call into hub_port_init, the device will be reset, and the
xHCI driver will issue a Reset Device command. This command will fail
(because the device is already in the Default state), and
usb_reset_and_verify_device will fail. The port will be disabled, and
the device won't be able to enumerate.
Fix this by ignoring the return value of the HCD reset_device callback.
USB 3.0 hubs and roothubs will automatically transition a failed hot
reset to a warm (BH) reset. In that case, the warm reset change bit
will be set, and the link state change bit may also be set. Change
hub_port_finish_reset to unconditionally clear those change bits for USB
3.0 hubs. If these bits are not cleared, we may lose port change events
from the roothub.
Commit 2a44e499 ("drm/nouveau/disp: introduce proper init/fini, separate
from create/destroy") started to call display init routines on pre-nv50
hardware on module load. But LVDS init code sets driver state in a way
which prevents modesetting code from operating properly.
nv04_display_init calls nv04_dfp_restore, which sets encoder->last_dpms to
NV_DPMS_CLEARED.
nv04_lvds_dpms checks last_dpms mode (which is NV_DPMS_CLEARED) and wrongly
assumes it's a "powersaving mode", the new one (DRM_MODE_DPMS_OFF) is too,
so it skips calling some crucial lvds scripts.
Reported-by: Chris Paulson-Ellis <chris@edesix.com> Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 5c8a86e10a7c164f44537fabdc169fd8b4e7a440 (usb: musb: drop unneeded
musb_debug trickery) erroneously removed '\n' from the driver's banner.
Concatenate all the banner substrings while adding it back...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we do have endpoints named like "ep-a" then bEndpointAddress is
counted internally by the gadget framework.
If we do have endpoints named like "ep-1" then bEndpointAddress is
assigned from the digit after "ep-".
If we do have both, then it is likely that after we used up the
"generic" endpoints we will use the digits and thus assign one
bEndpointAddress to multiple endpoints.
This theory can be proofed by using the completely enabled g_multi.
Without this patch, the mass storage won't enumerate and times out
because it shares endpoints with RNDIS.
This patch also adds fills up the endpoints list so we have in total
endpoints 1 to 15 in + out available while some of them are restricted
to certain types like BULK or ISO. Without this change the nokia gadget
won't load because the system does not provide enough (BULK) endpoints
but it did before ep-a - ep-f were removed.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Simple fix to add support for Crucible Technologies COMET Caller ID
USB decoder - a device containing FTDI USB/Serial converter chip,
handling 1200bps CallerID messages decoded from the phone line -
adding correct USB PID is sufficient.
Tested to apply cleanly and work flawlessly against 3.6.9, 3.7.0-rc8
and 3.8.0-rc3 on both amd64 and x86 arches.
Signed-off-by: Tomasz Mloduchowski <q@qdot.me> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Recent versions of udev cause synchronous firmware loading from the
probe routine to fail because the request to user space would time
out. The original fix for b43 (commit 6b6fa58) moved the firmware
load from the probe routine to a work queue, but it still used synchronous
firmware loading. This method is OK when b43 is built as a module;
however, it fails when the driver is compiled into the kernel.
This version changes the code to load the initial firmware file
using request_firmware_nowait(). A completion event is used to
hold the work queue until that file is available. This driver
reads several firmware files - the remainder can be read synchronously.
On some test systems, the async read fails; however, a following synch
read works, thus the async failure falls through to the sync try.
Reported-and-Tested by: Felix Janda <felix.janda@posteo.de> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
wait_event_interruptible function returns -ERESTARTSYS if it's
interrupted by a signal. Driver should check the return value
and handle this case properly.
In mwifiex_wait_queue_complete() routine, as we are now checking
wait_event_interruptible return value, the condition check is not
required. Also, we have removed mwifiex_cancel_pending_ioctl()
call to avoid a chance of sending second command to FW by other path
as soon as we clear current command node. FW can not handle two
commands simultaneously.
Signed-off-by: Bing Zhao <bzhao@marvell.com> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is a very old bug, but there's nothing that prevents the
timer from running while the module is being removed when we
only do del_timer() instead of del_timer_sync().
The timer should normally not be running at this point, but
it's not clearly impossible (or we could just remove this.)
Tested-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Those rn50 chip are often connected to console remoting hw and load
detection often fails with those. Just don't try to load detect and
report connect.
Éric Piel reported a kernel oops in the "comedi_test" module. It was a
NULL pointer dereference within `waveform_ai_interrupt()` (actually a
timer function) that sometimes occurred when a running asynchronous
command is cancelled (either by the `COMEDI_CANCEL` ioctl or by closing
the device file).
This seems to be a race between the caller of `waveform_ai_cancel()`
which on return from that function goes and tears down the running
command, and the timer function which uses the command. In particular,
`async->cmd.chanlist` gets freed (and the pointer set to NULL) by
`do_become_nonbusy()` in "comedi_fops.c" but a previously scheduled
`waveform_ai_interrupt()` timer function will dereference that pointer
regardless, leading to the oops.
Fix it by replacing the `del_timer()` call in `waveform_ai_cancel()`
with `del_timer_sync()`.
The minimum period was set to 357 ns, while the divider for these boards is 50
ns. This prevented to output at maximum speed as ni_ao_cmdtest() would return
357 but would not accept it.
Not sure why it was set to 357 ns (this was done before the git history,
which starts 5 years ago). My guess is that it comes from reading the
specification stating a 2.8 MHz rate (~ 357 ns). The latest
specification states a 2.86 MHz rate (~ 350 ns), which makes a lot
more sense.
When a low-level comedi driver auto-configures a device, a `struct
comedi_dev_file_info` is allocated (as well as a `struct
comedi_device`) by `comedi_alloc_board_minor()`. A pointer to the
hardware `struct device` is stored as a cookie in the `struct
comedi_dev_file_info`. When the low-level comedi driver
auto-unconfigures the device, `comedi_auto_unconfig()` uses the cookie
to find the `struct comedi_dev_file_info` so it can detach the comedi
device from the driver, clean it up and free it.
A problem arises if the user manually unconfigures and reconfigures the
comedi device using the `COMEDI_DEVCONFIG` ioctl so that is no longer
associated with the original hardware device. The problem is that the
cookie is not cleared, so that a call to `comedi_auto_unconfig()` from
the low-level driver will still find it, detach it, clean it up and free
it.
Stop this problem occurring by always clearing the `hardware_device`
cookie in the `struct comedi_dev_file_info` whenever the
`COMEDI_DEVCONFIG` ioctl call is successful.
Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes some code that implements a work-around to a hardware bug in
the ac97 controller on the pxa27x. A bug in the controller's warm reset
functionality requires that the mfp used by the controller as the AC97_nRESET
line be temporarily reconfigured as a generic output gpio (AF0) and manually
held high for the duration of the warm reset cycle. This is what was done in
the original code, but it was broken long ago by commit fb1bf8cd
([ARM] pxa: introduce processor specific pxa27x_assert_ac97reset())
which changed the mfp to a GPIO input instead of a high output.
The fix requires the ac97 controller to obtain the gpio via gpio_request_one(),
with arguments that configure the gpio as an output initially driven high.
Tested on a palm treo 680 machine. Reportedly, this broken code only prevents a
warm reset on hardware that lacks a pull-up on the line, which appears to be the
case for me.
Signed-off-by: Mike Dunn <mikedunn@newsguy.com> Signed-off-by: Igor Grinberg <grinberg@compulab.co.il> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
appears in the kernel log. Through trial-and-error (the pxa270 developer's
manual is mostly incoherent on the topic of ac97 reset), I got cold reset to
complete by setting the WARM_RST bit in the GCR register (and later noticed that
pxa3xx does this for cold reset as well). Also, a timeout loop is needed to
wait for the reset to complete.
Tested on a palm treo 680 machine.
Signed-off-by: Mike Dunn <mikedunn@newsguy.com> Acked-by: Igor Grinberg <grinberg@compulab.co.il> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is a left-over from when udl_get_edid returned the amount of bytes
successfully read, which it no longer does.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The buffer passed to usb_control_msg may end up in scatter-gather list, and
may thus not be on the stack. Having it on the stack usually works on x86, but
not on other archs.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
udldrmfb only reads the main EDID block, and if that advertises extensions
the drm_edid code expects them to be present, and starts reading beyond the
buffer udldrmfb passes it.
Although it may be possible to read more EDID info with the udl we simpy don't
know how, and even if trial and error gets it working on one device, that is
no guarantee it will work on other revisions. So this patch does a simple fix
in the form of patching the EDID info to report 0 extension blocks, this
fixes udldrmfb only doing 1024x768 on monitors with EDID extension blocks.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to the defines in wm2200.h:
/*
* R1284 (0x504) - Audio IF 1_5
*/
We should not left shift 1 bit for fmt_val when setting dai format.
Signed-off-by: Axel Lin <axel.lin@ingics.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The USB recovery mode present in i.MX23 ROM emulates USB HID. It needs this
quirk to behave properly.
Even if the official branding of the chip is Freescale i.MX23, I named it
Sigmatel STMP3780 since that's what the chip really is and it even reports
itself as STMP3780.
EPOLL_CTL_MOD sets the interest mask before calling f_op->poll() to
ensure events are not missed. Since the modifications to the interest
mask are not protected by the same lock as ep_poll_callback, we need to
ensure the change is visible to other CPUs calling ep_poll_callback.
We also need to ensure f_op->poll() has an up-to-date view of past
events which occured before we modified the interest mask. So this
barrier also pairs with the barrier in wq_has_sleeper().
This should guarantee either ep_poll_callback or f_op->poll() (or both)
will notice the readiness of a recently-ready/modified item.
This issue was encountered by Andreas Voellmy and Junchang(Jason) Wang in:
http://thread.gmane.org/gmane.linux.kernel/1408782/
Signed-off-by: Eric Wong <normalperson@yhbt.net> Cc: Hans Verkuil <hans.verkuil@cisco.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andreas Voellmy <andreas.voellmy@yale.edu> Tested-by: "Junchang(Jason) Wang" <junchang.wang@yale.edu> Cc: netdev@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If count is less than the size of a register then we may hit integer
wraparound when trying to move backwards to check if we're still in
the buffer. Instead move the position forwards to check if it's still
in the buffer, we are unlikely to be able to allocate a buffer
sufficiently big to overflow here.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Way Access Filter in recent AMD CPUs may hurt the performance of
some workloads, caused by aliasing issues in the L1 cache.
This patch disables it on the affected CPUs.
The issue is similar to that one of last year:
http://lkml.indiana.edu/hypermail/linux/kernel/1107.3/00041.html
This new patch does not replace the old one, we just need another
quirk for newer CPUs.
The performance penalty without the patch depends on the
circumstances, but is a bit less than the last year's 3%.
The workloads affected would be those that access code from the same
physical page under different virtual addresses, so different
processes using the same libraries with ASLR or multiple instances of
PIE-binaries. The code needs to be accessed simultaneously from both
cores of the same compute unit.
More details can be found here:
http://developer.amd.com/Assets/SharedL1InstructionCacheonAMD15hCPU.pdf
CPUs affected are anything with the core known as Piledriver.
That includes the new parts of the AMD A-Series (aka Trinity) and the
just released new CPUs of the FX-Series (aka Vishera).
The model numbering is a bit odd here: FX CPUs have model 2,
A-Series has model 10h, with possible extensions to 1Fh. Hence the
range of model ids.
On COW, a new hugepage is allocated and charged to the memcg. If the
system is oom or the charge to the memcg fails, however, the fault
handler will return VM_FAULT_OOM which results in an oom kill.
Instead, it's possible to fallback to splitting the hugepage so that the
COW results only in an order-0 page being allocated and charged to the
memcg which has a higher liklihood to succeed. This is expensive
because the hugepage must be split in the page fault handler, but it is
much better than unnecessarily oom killing a process.
Signed-off-by: David Rientjes <rientjes@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Johannes Weiner <jweiner@redhat.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Incrementing lenExtents even while writing to a hole is bad
for performance as calls to udf_discard_prealloc and
udf_truncate_tail_extent would not return from start if
isize != lenExtents
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Shuah Khan <shuah.khan@hp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
blk_alloc_queue has already done a bdi_init, so do not bdi_init
again in aoeblk_gdalloc. The extra call causes list corruption
in the per-CPU backing dev info stats lists.
Affected users see console WARNINGs about list_del corruption on
percpu_counter_destroy when doing "rmmod aoe" or "aoeflush -a"
when AoE targets have been detected and initialized by the
system.
The patch below applies to v3.6.11, with its v47 aoe driver. It
is expected to apply to all currently maintained stable kernels
except 3.7.y. A related but different fix has been posted for
3.7.y.
References:
RedHat bugzilla ticket with original report
https://bugzilla.redhat.com/show_bug.cgi?id=853064
LKML discussion of bug and fix
http://thread.gmane.org/gmane.linux.kernel/1416336/focus=1416497
Commit c278531d39 added a warning when ext4_flush_unwritten_io() is
called without i_mutex being taken. It had previously not been taken
during orphan cleanup since races weren't possible at that point in
the mount process, but as a result of this c278531d39, we will now see
a kernel WARN_ON in this case. Take the i_mutex in
ext4_orphan_cleanup() to suppress this warning.
Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a journal-less ext4 filesystem is mounted on a read-only block
device (blockdev --setro will do), each remount (for other, unrelated,
flags, like suid=>nosuid etc) results in a series of scary messages
from kernel telling about I/O errors on the device.
This is becauese of the following code ext4_remount():
if (sbi->s_journal == NULL)
ext4_commit_super(sb, 1);
at the end of remount procedure, which forces writing (flushing) of
a superblock regardless whenever it is dirty or not, if the filesystem
is readonly or not, and whenever the device itself is readonly or not.
We only need call ext4_commit_super when the file system had been
previously mounted read/write.
Thanks to Eric Sandeen for help in diagnosing this issue.
The following race is possible between start_this_handle() and someone
calling jbd2_journal_flush().
Process A Process B
start_this_handle().
if (journal->j_barrier_count) # false
if (!journal->j_running_transaction) { #true
read_unlock(&journal->j_state_lock);
jbd2_journal_lock_updates()
jbd2_journal_flush()
write_lock(&journal->j_state_lock);
if (journal->j_running_transaction) {
# false
... wait for committing trans ...
write_unlock(&journal->j_state_lock);
...
write_lock(&journal->j_state_lock);
if (!journal->j_running_transaction) { # true
jbd2_get_transaction(journal, new_transaction);
write_unlock(&journal->j_state_lock);
goto repeat; # eventually blocks on j_barrier_count > 0
...
J_ASSERT(!journal->j_running_transaction);
# fails
We fix the race by rechecking j_barrier_count after reacquiring j_state_lock
in exclusive mode.
Reported-by: yjwsignal@empal.com Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently we allow enabling dioread_nolock mount option on remount for
filesystems where blocksize < PAGE_CACHE_SIZE. This isn't really
supported so fix the bug by moving the check for blocksize !=
PAGE_CACHE_SIZE into parse_options(). Change the original PAGE_SIZE to
PAGE_CACHE_SIZE along the way because that's what we are really
interested in.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The VDCTRL4 register does not provide the MXS SET/CLR/TOGGLE feature.
The write in mxsfb_disable_controller() sets the data_cnt for the LCD
DMA to 0 which obviously means the max. count for the LCD DMA and
leads to overwriting arbitrary memory when the display is unblanked.