Jani Nikula [Fri, 3 Jun 2016 14:57:05 +0000 (17:57 +0300)]
drm/i915/dsi: fix bxt split screen and color issue
Fix the failure mode where the display appears split, or shifted about
2/3 of the screen, and the color components are cycled. Turns out we
were missing the crucial BXT_DEFEATURE_DPI_FIFO_CTR bit in the
EOT_DISABLE register.
Per bspec, with the bit set, the "mipi_dpf_vblank_start" signal is
asserted only when the complete frame is transferred in the DPHY line
and also the DPI FIFO is flushed out at the end of each frame.
The problem was mitigated by keeping the panel fitter enabled, but that
only limited the issue to a shift of about 0..10 pixels. With the fix
here, the panel fitter workaround does not seem to be needed at all.
While at it, set BXT_DPHY_DEFEATURE_EN in EOT_DISABLE register which is
also needed per the BXT DSI mode set sequence.
Issue: VIZ-7610 Cc: Mika Kahola <mika.kahola@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1464965825-31035-1-git-send-email-jani.nikula@intel.com
Ville Syrjälä [Tue, 31 May 2016 09:08:34 +0000 (12:08 +0300)]
drm/i915: Extract physical display dimensions from VBT
The VBT has these mysterious H/V image sizes as part of the display
timings. Looking at some dumps those appear to be the physical
dimensions in mm. Which makes sense since the timing descriptor matches
the format used by EDID detailed timing descriptor, which defines these
as "H/V Addressable Video Image Size in mm".
So let's use that information from the panel fixed mode to get the
physical dimensions for LVDS/eDP/DSI displays. And with that we can
fill out the display_info so that userspace can get at it via
GetConnector.
v2: Use (hi<<8)|lo instead of broken (hi<<4)+lo
Handle LVDS and eDP too
Chris Wilson [Wed, 1 Jun 2016 17:08:43 +0000 (18:08 +0100)]
drm/i915: Silence "unexpected child device config size" for VBT on 845g
My old 845g complains that the child_device_size inside its VBT,
version 110, is incorrect. Let's fiddle with the version matching such
that it works with this VBT (i.e. treat BIOS v110 as having the same size
as v108).
Fixes [drm:intel_bios_init] *ERROR* Unexpected child device config
size 27 (expected 33 for VBT version 110)
Whether this is correct, no one knows - but it works for this particular
machine.
Daniel Vetter [Thu, 2 Jun 2016 07:54:12 +0000 (09:54 +0200)]
Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued
Git got absolutely destroyed with all our cherry-picking from
drm-intel-next-queued to various branches. It ended up inserting
intel_crtc_page_flip 2x even in intel_display.c.
Backmerge to get back to sanity.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Dave Airlie [Wed, 1 Jun 2016 21:58:36 +0000 (07:58 +1000)]
Merge branch 'drm-intel-next' of git://anongit.freedesktop.org/drm-intel into drm-next
drm-intel-next-2016-05-22:
- cmd-parser support for direct reg->reg loads (Ken Graunke)
- better handle DP++ smart dongles (Ville)
- bxt guc fw loading support (Nick Hoathe)
- remove a bunch of struct typedefs from dpll code (Ander)
- tons of small work all over to avoid casting between drm_device and the i915
dev struct (Tvrtko&Chris)
- untangle request retiring from other operations, also fixes reset stat corner
cases (Chris)
- skl atomic watermark support from Matt Roper, yay!
- various wm handling bugfixes from Ville
- big pile of cdclck rework for bxt/skl (Ville)
- CABC (Content Adaptive Brigthness Control) for dsi panels (Jani&Deepak M)
- nonblocking atomic commits for plane-only updates (Maarten Lankhorst)
- bunch of PSR fixes&improvements
- untangle our map/pin/sg_iter code a bit (Dave Gordon)
drm-intel-next-2016-05-08:
- refactor stolen quirks to share code between early quirks and i915 (Joonas)
- refactor gem BO/vma funcstion (Tvrtko&Dave)
- backlight over DPCD support (Yetunde Abedisi)
- more dsi panel sequence support (Jani)
- lots of refactoring around handling iomaps, vma, ring access and related
topics culmulating in removing the duplicated request tracking in the execlist
code (Chris & Tvrtko) includes a small patch for core iomapping code
- hw state readout for bxt dsi (Ramalingam C)
- cdclk cleanups (Ville)
- dedupe chv pll code a bit (Ander)
- enable semaphores on gen8+ for legacy submission, to be able to have a direct
comparison against execlist on the same platform (Chris) Not meant to be used
for anything else but performance tuning
- lvds border bit hw state checker fix (Jani)
- rpm vs. shrinker/oom-notifier fixes (Praveen Paneri)
- l3 tuning (Imre)
- revert mst dp audio, it's totally non-functional and crash-y (Lyude)
- first official dmc for kbl (Rodrigo)
- and tons of small things all over as usual
* 'drm-intel-next' of git://anongit.freedesktop.org/drm-intel: (194 commits)
drm/i915: Revert async unpin and nonblocking atomic commit
drm/i915: Update DRIVER_DATE to 20160522
drm/i915: Inline sg_next() for the optimised SGL iterator
drm/i915: Introduce & use new lightweight SGL iterators
drm/i915: optimise i915_gem_object_map() for small objects
drm/i915: refactor i915_gem_object_pin_map()
drm/i915/psr: Implement PSR2 w/a for gen9
drm/i915/psr: Use ->get_aux_send_ctl functions
drm/i915/psr: Order DP aux transactions correctly
drm/i915/psr: Make idle_frames sensible again
drm/i915/psr: Try to program link training times correctly
drm/i915/userptr: Convert to drm_i915_private
drm/i915: Allow nonblocking update of pageflips.
drm/i915: Check for unpin correctness.
Reapply "drm/i915: Avoid stalling on pending flips for legacy cursor updates"
drm/i915: Make unpin async.
drm/i915: Prepare connectors for nonblocking checks.
drm/i915: Pass atomic states to fbc update functions.
drm/i915: Remove reset_counter from intel_crtc.
drm/i915: Remove queue_flip pointer.
...
Dave Airlie [Wed, 1 Jun 2016 21:50:23 +0000 (07:50 +1000)]
Merge tag 'topic/drm-misc-2016-06-01' of git://anongit.freedesktop.org/drm-intel into drm-next
Frist -misc pull for 4.8, with pretty much just random all over plus a few
more lockless gem BO patches acked/reviewed by driver maintainers.
I'm starting a bit earlier this time around because there's a few invasive
patch series to land (nonblocking atomic prep work, fence prep work,
rst/sphinx kerneldoc finally happening) and I need a baseline with all the
branches merged.
* tag 'topic/drm-misc-2016-06-01' of git://anongit.freedesktop.org/drm-intel: (21 commits)
drm/vc4: Use lockless gem BO free callback
drm/vc4: Use drm_gem_object_unreference_unlocked
drm: Initialize a linear gamma table by default
drm/vgem: Use lockless gem BO free callback
drm/qxl: Don't set a gamma table size
drm/msm: Nuke dummy gamma_set/get functions
drm/cirrus: Drop redundnant gamma size check
drm/fb-helper: Remove dead code in setcolreg
drm/mediatek: Use lockless gem BO free callback
drm/hisilicon: Use lockless gem BO free callback
drm/hlcd: Use lockless gem BO free callback
vga_switcheroo: Support deferred probing of audio clients
vga_switcheroo: Add helper for deferred probing
virtio-gpu: fix output lookup
drm/doc: Unify KMS Locking docs
drm/atomic-helper: Do not call ->mode_fixup for CRTC which will be disabled
Fix annoyingly awkward typo in drm_edid_load.c
drm/doc: Drop vblank_disable_allow wording
drm: use seqlock for vblank time/count
drm/mm: avoid possible null pointer dereference
...
Kumar, Mahesh [Thu, 19 May 2016 22:03:01 +0000 (15:03 -0700)]
drm/i915/skl+: Use scaling amount for plane data rate calculation (v4)
if downscaling is enabled plane data rate increases according to scaling
amount. take scaling amount under consideration while calculating plane
data rate
v2: Address Matt's comments, where data rate was overridden because of
missing else.
v3 (by Matt):
- Add braces to 'else' branch to match kernel coding style
- Adjust final calculation now that skl_plane_downscale_amount()
returns 16.16 fixed point value instead of a decimal fixed point
v4 (by Matt):
- Avoid integer overflow by making sure final multiplication is
treated as 64-bit.
downscale amount = max[1, src_h/dst_h] * max[1, src_w/dst_w]
if 90/270 rotation use rotated width & height
v2: use intel_plane_state->visible instead of (fb == NULL) as per Matt's
comment.
v3 (by Matt):
- Keep downscale amount in 16.16 fixed point rather than converting to
decimal fixed point.
- Store adjusted plane pixel rate in plane state instead of the plane
parameters structure that we no longer use.
v4 (by Matt):
- Significant rebasing onto latest atomic watermark work
- Don't bother storing plane pixel rate in state; just calculate it
right before the calls that make use of it.
- Fix downscale calculations to actually use width values when
computing downscale_w rather than copy/pasted height values.
don't always use 8 ddb as minimum, instead calculate using proper
algorithm.
v2: optimizations as per Matt's comments.
v3 (by Matt):
- Fix boolean logic for !fb test in skl_ddb_min_alloc()
- Adjust negative tiling format comparisons in skl_ddb_min_alloc() to
improve readability.
v4 (by Matt):
- Rebase onto recent atomic watermark changes
- Slight tweaks to code flow to make the logic more closely match the
description in the bspec.
v5 (by Matt):
- Handle minimum scanline calculation properly for 4 & 8 bpp formats.
8bpp isn't actually possible right now, but it's listed in the bspec
so I've included it here for forward compatibility (similar to how
we have logic for NV12).
Matt Roper [Mon, 16 May 2016 22:51:58 +0000 (15:51 -0700)]
drm/i915: Don't try to calculate relative data rates during hw readout
We don't actually read out full plane state during driver startup (only
whether the primary plane is enabled/disabled), so all of the src/dest
rectangles are invalid at this point. However this calculation was
needless anyway since we re-calculate them from scratch on the very
first atomic transaction after boot anyway.
Chris Wilson [Wed, 1 Jun 2016 07:27:50 +0000 (08:27 +0100)]
drm/i915: Only ignore eDP ports that are connected
If the VBT says that a certain port should be eDP (and hence fused off
from HDMI), but in reality it isn't, we need to try and acquire the HDMI
connection instead. So only trust the VBT edp setting if we can connect
to an eDP device on that port.
Fixes: d2182a6608 (drm/i915: Don't register HDMI connectors for eDP ports on VLV/CHV)
References: https://bugs.freedesktop.org/show_bug.cgi?id=96288 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Phidias Chiang <phidias.chiang@canonical.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1464766070-31623-1-git-send-email-chris@chris-wilson.co.uk
"drm/i915: Allow nonblocking update of pageflips" should have been
split up, misses a proper commit message and seems to cause issues in
the legacy page_flip path as demonstrated by kms_flip.
"drm/i915: Make unpin async" doesn't handle the unthrottled cursor
updates correctly, leading to an apparent pin count leak. This is
caught by the WARN_ON in i915_gem_object_do_pin which screams if we
have more than DRM_I915_GEM_OBJECT_MAX_PIN_COUNT pins.
Unfortuantely we can't just revert these two because this patch series
came with a built-in bisect breakage in the form of temporarily
removing the unthrottled cursor update hack for legacy cursor ioctl.
Therefore there's no other option than to revert the entire pile :(
There's one tiny conflict in intel_drv.h due to other patches, nothing
serious.
Normally I'd wait a bit longer with doing a maintainer revert, but
since the minimal set of patches we need to revert (due to the bisect
breakage) is so big, time is running out fast. And very soon
(especially after a few attempts at fixing issues) it'll be really
hard to revert things cleanly.
Lessons learned:
- Not a good idea to rush the review (done by someone fairly new to
the area) and not make sure domain experts had a chance to read it.
- Patches should be properly split up. I only looked at the two
patches that should be reverted in detail, but both look like the
mix up different things in one patch.
- Patches really should have proper commit messages. Especially when
doing more than one thing, and especially when touching critical and
tricky core code.
- Building a patch series and r-b stamping it when it has a built-in
bisect breakage is not a good idea.
- I also think we need to stop building up technical debt by
postponing atomic igt testcases even longer. I think it's clear that
there's enough corner cases in this beast that we really need to
have the testcases _before_ the next step lands.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Patrik Jakobsson <patrik.jakobsson@linux.intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
drm/i915: Update GEN6_PMINTRMSK setup with GuC enabled
On Loading, GuC sets PM interrupts routing (bit 31) and clears ARAT
expired interrupt (bit 9). Host turbo also updates this register
in RPS flows. This patch ensures bit 31 and bit 9 setup by GuC persists.
ARAT timer interrupt is needed in GuC for various features. It also
facilitates halting GuC and hence achieving RC6. PM interrupt routing
will not impact RPS interrupt reception by host as GuC will redirect
them.
This patch fixes igt test pm_rc6_residency that was failing with guc
load/submission enabled. Tested with SKL GuC v6.1 and BXT GuC v5.1 and v8.7.
v2: i915_irq/i915_pm decoupling from intel_guc. (ChrisW)
v3: restructuring the mask update and rebase w.r.t Ville's patch. (ChrisW)
v4: Updating the pm_intr_keep during direct_interrupts_to_guc. (Sagar)
Cc: Chris Harris <chris.harris@intel.com> Cc: Zhe Wang <zhe1.wang@intel.com> Cc: Deepak S <deepak.s@intel.com> Cc: Satyanantha, Rama Gopal M <rama.gopal.m.satyanantha@intel.com> Cc: Akash Goel <akash.goel@intel.com>
Testcase: igt/pm_rc6_residency Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Tested-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1464683307-19475-1-git-send-email-sagar.a.kamble@intel.com
Daniel Vetter [Mon, 30 May 2016 17:53:06 +0000 (19:53 +0200)]
drm/vc4: Use drm_gem_object_unreference_unlocked
Since my last struct_mutex crusade someone escaped!
This already has the advantage that for the common case when someone
else holds a ref the unref won't even acquire dev->struct_mutex. And
I'm working on code to allow drivers to completely opt-out of any and
all dev->struct_mutex usage, but that only works if they use the
_unlocked variants everywhere.
Daniel Vetter [Wed, 30 Mar 2016 09:51:16 +0000 (11:51 +0200)]
drm: Initialize a linear gamma table by default
Code stolen from gma500.
This is just a minor bit of safety code that I spotted and figured it
might be useful if we put it into the core. This is to make the
get_gamma ioctl reflect likely reality even before the first set_gamma
ioctl call.
v2 on irc: Extend commit message per Maarten's suggestions.
Daniel Vetter [Wed, 30 Mar 2016 09:51:21 +0000 (11:51 +0200)]
drm/msm: Nuke dummy gamma_set/get functions
Again the fbdev emulation gamma_set/get functions are only needed for
drivers that try to also use 8bpp paletted mode. Which msm doesn't, so
this is dead code. Let's rip it out.
Daniel Vetter [Wed, 30 Mar 2016 09:51:17 +0000 (11:51 +0200)]
drm/fb-helper: Remove dead code in setcolreg
DRM fbdev emulation only supports pallete_color with depth == 8, and
truecolor with depth > 8. Handling depth == 16 for palettes is hence
dead code, let's remove it.
Lukas Wunner [Tue, 31 May 2016 09:13:27 +0000 (11:13 +0200)]
vga_switcheroo: Support deferred probing of audio clients
Daniel Vetter pointed out that vga_switcheroo_client_probe_defer() could
be needed by audio clients as well. To avoid mistakes when someone adds
conditions for these in the future, constrain the single existing
condition to VGA clients by checking for PCI_BASE_CLASS_DISPLAY. This
encompasses both PCI_CLASS_DISPLAY_VGA as well as PCI_CLASS_DISPLAY_3D,
which is used by some Nvidia Optimus GPUs.
Any future checks for audio clients should then be constrained to
PCI_BASE_CLASS_MULTIMEDIA.
v6: Spun out from commit introducing vga_switcheroo_client_probe_defer()
to keep it a pure refactoring change. (Emil Velikov, Jani Nikula)
Lukas Wunner [Tue, 31 May 2016 09:13:27 +0000 (11:13 +0200)]
vga_switcheroo: Add helper for deferred probing
So far we've got one condition when DRM drivers need to defer probing
on a dual GPU system and it's coded separately into each of the relevant
drivers. As suggested by Daniel Vetter, deduplicate that code in the
drivers and move it to a new vga_switcheroo helper. This yields better
encapsulation of concepts and lets us add further checks in a central
place. (The existing check pertains to pre-retina MacBook Pros and an
additional check is expected to be needed for retinas.)
One might be tempted to check deferred probing conditions in
vga_switcheroo_register_client(), but this is usually called fairly late
during driver load. The GPU is fully brought up and ready for switching
at that point. On boot the ->probe hook is potentially called dozens of
times until it finally succeeds, and each time we'd repeat bringup and
teardown of the GPU, lengthening boot time considerably and cluttering
logfiles. A separate helper is therefore needed which can be called
right at the beginning of the ->probe hook.
Note that amdgpu currently does not call this helper as the AMD GPUs
built into MacBook Pros are only supported by radeon so far.
v2: This helper could eventually be used by audio clients as well,
so rephrase kerneldoc to refer to "client" instead of "GPU"
and move the single existing check in an if block specific
to PCI_CLASS_DISPLAY_VGA devices. Move documentation on
that check from kerneldoc to a comment. (Daniel Vetter)
v3: Mandate in kerneldoc that registration of client shall only
happen after calling this helper. (Daniel Vetter)
v4: Rebase on 412c8f7de011 ("drm/radeon: Return -EPROBE_DEFER when
amdkfd not loaded")
v5: Some Optimus GPUs use PCI_CLASS_DISPLAY_3D, make sure those are
matched as well. (Emil Velikov)
v6: The if-condition referring to PCI_BASE_CLASS_DISPLAY may be
considered a functional change. Move to a separate commit to
keep this a pure refactoring change. (Emil Velikov, Jani Nikula)
Ville Syrjälä [Fri, 27 May 2016 17:59:23 +0000 (20:59 +0300)]
drm/i915: Give meaningful names to all the planes
Let's name our planes in a way that makes sense wrt. the spec:
- skl+ -> "plane 1A", "plane 2A", "plane 1C", "cursor A" etc.
- g4x+ -> "primary A", "primary B", "sprite A", "cursor C" etc.
- pre-g4x -> "plane A", "cursor B" etc.
v2: Rebase on top of the fixed/cleaned error paths
Use a local 'name' variable to make things easier
v3: Pass the name as a function argument to drm_universal_plane_init() (Jani)
v3: Pass the printf style string to drm_universal_plane_init()
Ville Syrjälä [Fri, 27 May 2016 17:59:21 +0000 (20:59 +0300)]
drm/i915: Set crtc->name to "pipe A", "pipe B", etc.
v2: Fix intel_crtc leak on failure to allocate the name
Use a local 'name' variable to make things easier
v3: Pass the name as a function arguemnt to drm_crtc_init_with_planes() (Jani)
v4: Pass the printf style format string to drm_crtc_init_with_planes()
v5: Drop spurious code changes
Liu Ying [Fri, 27 May 2016 09:35:54 +0000 (17:35 +0800)]
drm/atomic-helper: Do not call ->mode_fixup for CRTC which will be disabled
When a CRTC is going to be disabled, it's state may contain a display mode
with zeroed content. This could be reproduced by HDMI cable hotplug out
operation with legacy fbdev support in dual display cases. It would confuse
driver's CRTC callback ->mode_fixup and make the total state be rejected.
So, let's don't call the callback for the CRTC.
George Spelvin [Sun, 29 May 2016 05:26:41 +0000 (01:26 -0400)]
Rename other copy of hash_string to hashlen_string
The original name was simply hash_string(), but that conflicted with a
function with that name in drivers/base/power/trace.c, and I decided
that calling it "hashlen_" was better anyway.
But you have to do it in two places.
[ This caused build errors for architectures that don't define
CONFIG_DCACHE_WORD_ACCESS - Linus ]
Mikulas Patocka [Tue, 24 May 2016 20:49:18 +0000 (22:49 +0200)]
hpfs: implement the show_options method
The HPFS filesystem used generic_show_options to produce string that is
displayed in /proc/mounts. However, there is a problem that the options
may disappear after remount. If we mount the filesystem with option1
and then remount it with option2, /proc/mounts should show both option1
and option2, however it only shows option2 because the whole option
string is replaced with replace_mount_options in hpfs_remount_fs.
To fix this bug, implement the hpfs_show_options function that prints
options that are currently selected.
Mikulas Patocka [Tue, 24 May 2016 20:48:33 +0000 (22:48 +0200)]
affs: fix remount failure when there are no options changed
Commit c8f33d0bec99 ("affs: kstrdup() memory handling") checks if the
kstrdup function returns NULL due to out-of-memory condition.
However, if we are remounting a filesystem with no change to
filesystem-specific options, the parameter data is NULL. In this case,
kstrdup returns NULL (because it was passed NULL parameter), although no
out of memory condition exists. The mount syscall then fails with
ENOMEM.
This patch fixes the bug. We fail with ENOMEM only if data is non-NULL.
The patch also changes the call to replace_mount_options - if we didn't
pass any filesystem-specific options, we don't call
replace_mount_options (thus we don't erase existing reported options).
Mikulas Patocka [Tue, 24 May 2016 20:47:00 +0000 (22:47 +0200)]
hpfs: fix remount failure when there are no options changed
Commit ce657611baf9 ("hpfs: kstrdup() out of memory handling") checks if
the kstrdup function returns NULL due to out-of-memory condition.
However, if we are remounting a filesystem with no change to
filesystem-specific options, the parameter data is NULL. In this case,
kstrdup returns NULL (because it was passed NULL parameter), although no
out of memory condition exists. The mount syscall then fails with
ENOMEM.
This patch fixes the bug. We fail with ENOMEM only if data is non-NULL.
The patch also changes the call to replace_mount_options - if we didn't
pass any filesystem-specific options, we don't call
replace_mount_options (thus we don't erase existing reported options).
Fixes: ce657611baf9 ("hpfs: kstrdup() out of memory handling") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
VDSO:
- Build microMIPS VDSO for microMIPS kernels.
- Fix aliasing warning by building with `-fno-strict-aliasing' for
debugging but also tracing them might result in recursion.
Misc:
- Add missing FROZEN hotplug notifier transitions.
- Fix clk binding example for varioius PIC32 devices.
- Fix cpu interrupt controller node-names in the DT files.
- Fix XPA CPU feature separation.
- Fix write_gc0_* macros when writing zero.
- Add inline asm encoding helpers.
- Add missing VZ accessor microMIPS encodings.
- Fix little endian microMIPS MSA encodings.
- Add 64-bit HTW fields and fix its configuration.
- Fix sigreturn via VDSO on microMIPS kernel.
- Lots of typo fixes.
- Add definitions of SegCtl registers and use them"
Linus Torvalds [Sat, 28 May 2016 23:15:25 +0000 (16:15 -0700)]
Merge branch 'hash' of git://ftp.sciencehorizons.net/linux
Pull string hash improvements from George Spelvin:
"This series does several related things:
- Makes the dcache hash (fs/namei.c) useful for general kernel use.
(Thanks to Bruce for noticing the zero-length corner case)
- Converts the string hashes in <linux/sunrpc/svcauth.h> to use the
above.
- Avoids 64-bit multiplies in hash_64() on 32-bit platforms. Two
32-bit multiplies will do well enough.
- Rids the world of the bad hash multipliers in hash_32.
This finishes the job started in commit 689de1d6ca95 ("Minimal
fix-up of bad hashing behavior of hash_64()")
The vast majority of Linux architectures have hardware support for
32x32-bit multiply and so derive no benefit from "simplified"
multipliers.
The few processors that do not (68000, h8/300 and some models of
Microblaze) have arch-specific implementations added. Those
patches are last in the series.
- Overhauls the dcache hash mixing.
The patch in commit 0fed3ac866ea ("namei: Improve hash mixing if
CONFIG_DCACHE_WORD_ACCESS") was an off-the-cuff suggestion.
Replaced with a much more careful design that's simultaneously
faster and better. (My own invention, as there was noting suitable
in the literature I could find. Comments welcome!)
- Modify the hash_name() loop to skip the initial HASH_MIX(). This
would let us salt the hash if we ever wanted to.
- Sort out partial_name_hash().
The hash function is declared as using a long state, even though
it's truncated to 32 bits at the end and the extra internal state
contributes nothing to the result. And some callers do odd things:
- fs/hfs/string.c only allocates 32 bits of state
- fs/hfsplus/unicode.c uses it to hash 16-bit unicode symbols not bytes
- Modify bytemask_from_count to handle inputs of 1..sizeof(long)
rather than 0..sizeof(long)-1. This would simplify users other
than full_name_hash"
Special thanks to Bruce Fields for testing and finding bugs in v1. (I
learned some humbling lessons about "obviously correct" code.)
On the arch-specific front, the m68k assembly has been tested in a
standalone test harness, I've been in contact with the Microblaze
maintainers who mostly don't care, as the hardware multiplier is never
omitted in real-world applications, and I haven't heard anything from
the H8/300 world"
* 'hash' of git://ftp.sciencehorizons.net/linux:
h8300: Add <asm/hash.h>
microblaze: Add <asm/hash.h>
m68k: Add <asm/hash.h>
<linux/hash.h>: Add support for architecture-specific functions
fs/namei.c: Improve dcache hash function
Eliminate bad hash multipliers from hash_32() and hash_64()
Change hash_64() return value to 32 bits
<linux/sunrpc/svcauth.h>: Define hash_str() in terms of hashlen_string()
fs/namei.c: Add hashlen_string() function
Pull out string hash to <linux/stringhash.h>
George Spelvin [Wed, 25 May 2016 18:19:49 +0000 (14:19 -0400)]
h8300: Add <asm/hash.h>
This will improve the performance of hash_32() and hash_64(), but due
to complete lack of multi-bit shift instructions on H8, performance will
still be bad in surrounding code.
Designing H8-specific hash algorithms to work around that is a separate
project. (But if the maintainers would like to get in touch...)
Signed-off-by: George Spelvin <linux@sciencehorizons.net> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp
George Spelvin [Wed, 25 May 2016 15:06:09 +0000 (11:06 -0400)]
microblaze: Add <asm/hash.h>
Microblaze is an FPGA soft core that can be configured various ways.
If it is configured without a multiplier, the standard __hash_32()
will require a call to __mulsi3, which is a slow software loop.
Instead, use a shift-and-add sequence for the constant multiply.
GCC knows how to do this, but it's not as clever as some.
Signed-off-by: George Spelvin <linux@sciencehorizons.net> Cc: Alistair Francis <alistair.francis@xilinx.com> Cc: Michal Simek <michal.simek@xilinx.com>
George Spelvin [Fri, 27 May 2016 02:11:51 +0000 (22:11 -0400)]
<linux/hash.h>: Add support for architecture-specific functions
This is just the infrastructure; there are no users yet.
This is modelled on CONFIG_ARCH_RANDOM; a CONFIG_ symbol declares
the existence of <asm/hash.h>.
That file may define its own versions of various functions, and define
HAVE_* symbols (no CONFIG_ prefix!) to suppress the generic ones.
Included is a self-test (in lib/test_hash.c) that verifies the basics.
It is NOT in general required that the arch-specific functions compute
the same thing as the generic, but if a HAVE_* symbol is defined with
the value 1, then equality is tested.
Signed-off-by: George Spelvin <linux@sciencehorizons.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Andreas Schwab <schwab@linux-m68k.org> Cc: Philippe De Muyter <phdm@macq.eu> Cc: linux-m68k@lists.linux-m68k.org Cc: Alistair Francis <alistai@xilinx.com> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: uclinux-h8-devel@lists.sourceforge.jp
George Spelvin [Mon, 23 May 2016 11:43:58 +0000 (07:43 -0400)]
fs/namei.c: Improve dcache hash function
Patch 0fed3ac866 improved the hash mixing, but the function is slower
than necessary; there's a 7-instruction dependency chain (10 on x86)
each loop iteration.
Word-at-a-time access is a very tight loop (which is good, because
link_path_walk() is one of the hottest code paths in the entire kernel),
and the hash mixing function must not have a longer latency to avoid
slowing it down.
There do not appear to be any published fast hash functions that:
1) Operate on the input a word at a time, and
2) Don't need to know the length of the input beforehand, and
3) Have a single iterated mixing function, not needing conditional
branches or unrolling to distinguish different loop iterations.
One of the algorithms which comes closest is Yann Collet's xxHash, but
that's two dependent multiplies per word, which is too much.
The key insights in this design are:
1) Barring expensive ops like multiplies, to diffuse one input bit
across 64 bits of hash state takes at least log2(64) = 6 sequentially
dependent instructions. That is more cycles than we'd like.
2) An operation like "hash ^= hash << 13" requires a second temporary
register anyway, and on a 2-operand machine like x86, it's three
instructions.
3) A better use of a second register is to hold a two-word hash state.
With careful design, no temporaries are needed at all, so it doesn't
increase register pressure. And this gets rid of register copying
on 2-operand machines, so the code is smaller and faster.
4) Using two words of state weakens the requirement for one-round mixing;
we now have two rounds of mixing before cancellation is possible.
5) A two-word hash state also allows operations on both halves to be
done in parallel, so on a superscalar processor we get more mixing
in fewer cycles.
I ended up using a mixing function inspired by the ChaCha and Speck
round functions. It is 6 simple instructions and 3 cycles per iteration
(assuming multiply by 9 can be done by an "lea" instruction):
x ^= *input++;
y ^= x; x = ROL(x, K1);
x += y; y = ROL(y, K2);
y *= 9;
Not only is this reversible, two consecutive rounds are reversible:
if you are given the initial and final states, but not the intermediate
state, it is possible to compute both input words. This means that at
least 3 words of input are required to create a collision.
(It also has the property, used by hash_name() to avoid a branch, that
it hashes all-zero to all-zero.)
The rotate constants K1 and K2 were found by experiment. The search took
a sample of random initial states (I used 1023) and considered the effect
of flipping each of the 64 input bits on each of the 128 output bits two
rounds later. Each of the 8192 pairs can be considered a biased coin, and
adding up the Shannon entropy of all of them produces a score.
The best-scoring shifts also did well in other tests (flipping bits in y,
trying 3 or 4 rounds of mixing, flipping all 64*63/2 pairs of input bits),
so the choice was made with the additional constraint that the sum of the
shifts is odd and not too close to the word size.
The final state is then folded into a 32-bit hash value by a less carefully
optimized multiply-based scheme. This also has to be fast, as pathname
components tend to be short (the most common case is one iteration!), but
there's some room for latency, as there is a fair bit of intervening logic
before the hash value is used for anything.
(Performance verified with "bonnie++ -s 0 -n 1536:-2" on tmpfs. I need
a better benchmark; the numbers seem to show a slight dip in performance
between 4.6.0 and this patch, but they're too noisy to quote.)
Special thanks to Bruce fields for diligent testing which uncovered a
nasty fencepost error in an earlier version of this patch.
[checkpatch.pl formatting complaints noted and respectfully disagreed with.]
Signed-off-by: George Spelvin <linux@sciencehorizons.net> Tested-by: J. Bruce Fields <bfields@redhat.com>
George Spelvin [Fri, 27 May 2016 03:00:23 +0000 (23:00 -0400)]
Eliminate bad hash multipliers from hash_32() and hash_64()
The "simplified" prime multipliers made very bad hash functions, so get rid
of them. This completes the work of 689de1d6ca.
To avoid the inefficiency which was the motivation for the "simplified"
multipliers, hash_64() on 32-bit systems is changed to use a different
algorithm. It makes two calls to hash_32() instead.
drivers/media/usb/dvb-usb-v2/af9015.c uses the old GOLDEN_RATIO_PRIME_32
for some horrible reason, so it inherits a copy of the old definition.
Signed-off-by: George Spelvin <linux@sciencehorizons.net> Cc: Antti Palosaari <crope@iki.fi> Cc: Mauro Carvalho Chehab <m.chehab@samsung.com>
George Spelvin [Fri, 27 May 2016 02:22:01 +0000 (22:22 -0400)]
Change hash_64() return value to 32 bits
That's all that's ever asked for, and it makes the return
type of hash_long() consistent.
It also allows (upcoming patch) an optimized implementation
of hash_64 on 32-bit machines.
I tried adding a BUILD_BUG_ON to ensure the number of bits requested
was never more than 32 (most callers use a compile-time constant), but
adding <linux/bug.h> to <linux/hash.h> breaks the tools/perf compiler
unless tools/perf/MANIFEST is updated, and understanding that code base
well enough to update it is too much trouble. I did the rest of an
allyesconfig build with such a check, and nothing tripped.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
George Spelvin [Fri, 20 May 2016 17:31:33 +0000 (13:31 -0400)]
<linux/sunrpc/svcauth.h>: Define hash_str() in terms of hashlen_string()
Finally, the first use of previous two patches: eliminate the
separate ad-hoc string hash functions in the sunrpc code.
Now hash_str() is a wrapper around hash_string(), and hash_mem() is
likewise a wrapper around full_name_hash().
Note that sunrpc code *does* call hash_mem() with a zero length, which
is why the previous patch needed to handle that in full_name_hash().
(Thanks, Bruce, for finding that!)
This also eliminates the only caller of hash_long which asks for
more than 32 bits of output.
The comment about the quality of hashlen_string() and full_name_hash()
is jumping the gun by a few patches; they aren't very impressive now,
but will be improved greatly later in the series.
Signed-off-by: George Spelvin <linux@sciencehorizons.net> Tested-by: J. Bruce Fields <bfields@redhat.com> Acked-by: J. Bruce Fields <bfields@redhat.com> Cc: Jeff Layton <jlayton@poochiereds.net> Cc: linux-nfs@vger.kernel.org
George Spelvin [Fri, 20 May 2016 12:41:37 +0000 (08:41 -0400)]
fs/namei.c: Add hashlen_string() function
We'd like to make more use of the highly-optimized dcache hash functions
throughout the kernel, rather than have every subsystem create its own,
and a function that hashes basic null-terminated strings is required
for that.
(The name is to emphasize that it returns both hash and length.)
It's actually useful in the dcache itself, specifically d_alloc_name().
Other uses in the next patch.
full_name_hash() is also tweaked to make it more generally useful:
1) Take a "char *" rather than "unsigned char *" argument, to
be consistent with hash_name().
2) Handle zero-length inputs. If we want more callers, we don't want
to make them worry about corner cases.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Linus Torvalds [Sat, 28 May 2016 19:38:50 +0000 (12:38 -0700)]
Merge branch 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fix from Wolfram Sang:
"A fix for a regression introduced yesterday.
The regression didn't show up here locally because I did not have
PAGE_POISONING enabled. And buildbots discovered this only after it
hit your tree. Thanks to Dan for the quick response"
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: dev: use after free in detach
Linus Torvalds [Sat, 28 May 2016 19:32:01 +0000 (12:32 -0700)]
Merge tag 'chrome-platform' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform
Pull chrome platform updates from Olof Johansson
"A handful of Chrome driver and binding changes this merge window:
- a few patches to fix probing and configuration of pstore
- a few patches adding Elan touchpad registration on a few devices
- EC changes: a security fix dealing with max message sizes and
addition of compat_ioctl support.
- keyboard backlight control support
There was also an accidential duplicate registration of trackpads on
'Leon', which was reverted just recently"
* tag 'chrome-platform' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform:
Revert "platform/chrome: chromeos_laptop: Add Leon Touch"
platform/chrome: chromeos_laptop - Add Elan touchpad for Wolf
platform/chrome: chromeos_laptop - Add elan trackpad option for C720
platform/chrome: cros_ec_dev - Populate compat_ioctl
platform/chrome: cros_ec_lightbar - use name instead of ID to hide lightbar attributes
platform/chrome: cros_ec_dev - Fix security issue
platform/chrome: Add Chrome OS keyboard backlight LEDs support
platform/chrome: use to_platform_device()
platform/chrome: pstore: Move to larger record size.
platform/chrome: pstore: probe for ramoops buffer using acpi
platform/chrome: chromeos_laptop: Add Leon Touch
Linus Torvalds [Sat, 28 May 2016 19:23:12 +0000 (12:23 -0700)]
Merge tag 'sound-4.7-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull more sound updates from Takashi Iwai:
"This is the second update round for 4.7-rc1. Most of changes are
about the pending ASoC updates and fixes, including a few new drivers.
Below are some highlights:
ASoC:
- New drivers for MAX98371 and TAS5720
- SPI support for TLV320AIC32x4, along with the module split
- TDM support for STI Uniperf IPs
- Remaining topology API fixes / updates
HDA:
- A couple of Dell quirks and new Realtek codec support"
* tag 'sound-4.7-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (63 commits)
ALSA: hda - Fix headset mic detection problem for one Dell machine
spi: spi-ep93xx: Fix the PTR_ERR() argument
ALSA: hda/realtek - Add support for ALC295/ALC3254
ASoC: kirkwood: fix build failure
ALSA: hda - Fix headphone noise on Dell XPS 13 9360
ASoC: ak4642: Enable cache usage to fix crashes on resume
ASoC: twl6040: Disconnect AUX output pads on digital mute
ASoC: tlv320aic32x4: Properly implement the positive and negative pins into the mixers
rcar: src: skip disabled-SRC nodes
ASoC: max98371 Remove duplicate entry in max98371_reg
ASoC: twl6040: Select LPPLL during standby
ASoC: rsnd: don't use prohibited number to PDMACHCRn.SRS
ASoC: simple-card: Add pm callbacks to platform driver
ASoC: pxa: Fix module autoload for platform drivers
ASoC: topology: Fix memory leak in widget creation
ASoC: Add max98371 codec driver
ASoC: rsnd: count .probe/.remove for rsnd_mod_call()
ASoC: topology: Check size mismatch of ABI objects before parsing
ASoC: topology: Check failure to create a widget
ASoC: add support for TAS5720 digital amplifier
...
Linus Torvalds [Sat, 28 May 2016 19:04:17 +0000 (12:04 -0700)]
Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
"Here are the outstanding target pending updates for v4.7-rc1.
The highlights this round include:
- Allow external PR/ALUA metadata path be defined at runtime via top
level configfs attribute (Lee)
- Fix target session shutdown bug for ib_srpt multi-channel (hch)
- Make TFO close_session() and shutdown_session() optional (hch)
- Drop se_sess->sess_kref + convert tcm_qla2xxx to internal kref
(hch)
- Add tcm_qla2xxx endpoint attribute for basic FC jammer (Laurence)
- Refactor iscsi-target RX/TX PDU encode/decode into common code
(Varun)
- Extend iscsit_transport with xmit_pdu, release_cmd, get_rx_pdu,
validate_parameters, and get_r2t_ttt for generic ISO offload
(Varun)
- Initial merge of cxgb iscsi-segment offload target driver (Varun)
The bulk of the changes are Chelsio's new driver, along with a number
of iscsi-target common code improvements made by Varun + Co along the
way"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (29 commits)
iscsi-target: Fix early sk_data_ready LOGIN_FLAGS_READY race
cxgbit: Use type ISCSI_CXGBIT + cxgbit tpg_np attribute
iscsi-target: Convert transport drivers to signal rdma_shutdown
iscsi-target: Make iscsi_tpg_np driver show/store use generic code
tcm_qla2xxx Add SCSI command jammer/discard capability
iscsi-target: graceful disconnect on invalid mapping to iovec
target: need_to_release is always false, remove redundant check and kfree
target: remove sess_kref and ->shutdown_session
iscsi-target: remove usage of ->shutdown_session
tcm_qla2xxx: introduce a private sess_kref
target: make close_session optional
target: make ->shutdown_session optional
target: remove acl_stop
target: consolidate and fix session shutdown
cxgbit: add files for cxgbit.ko
iscsi-target: export symbols
iscsi-target: call complete on conn_logout_comp
iscsi-target: clear tx_thread_active
iscsi-target: add new offload transport type
iscsi-target: use conn_transport->transport_type in text rsp
...
Linus Torvalds [Sat, 28 May 2016 18:04:16 +0000 (11:04 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull more rdma updates from Doug Ledford:
"This is the second group of code for the 4.7 merge window. It looks
large, but only in one sense. I'll get to that in a minute. The list
of changes here breaks down as follows:
- Dynamic counter infrastructure in the IB drivers
This is a sysfs based code to allow free form access to the
hardware counters RDMA devices might support so drivers don't need
to code this up repeatedly themselves
- SendOnlyFullMember multicast support
- IB router support
- A couple misc fixes
- The big item on the list: hfi1 driver updates, plus moving the hfi1
driver out of staging
There was a group of 15 patches in the hfi1 list that I thought I had
in the first pull request but they weren't. So that added to the
length of the hfi1 section here.
As far as these go, everything but the hfi1 is pretty straight
forward.
The hfi1 is, if you recall, the driver that Al had complaints about
how it used the write/writev interfaces in an overloaded fashion. The
write portion of their interface behaved like the write handler in the
IB stack proper and did bi-directional communications. The writev
interface, on the other hand, only accepts SDMA request structures.
The completions for those structures are sent back via an entirely
different event mechanism.
With the security patch, we put security checks on the write
interface, however, we also knew they would be going away soon. Now,
we've converted the write handler in the hfi1 driver to use ioctls
from the IB reserved magic area for its bidirectional communications.
With that change, Intel has addressed all of the items originally on
their TODO when they went into staging (as well as many items added to
the list later).
As such, I moved them out, and since they were the last item in the
staging/rdma directory, and I don't have immediate plans to use the
staging area again, I removed the staging/rdma area.
Because of the move out of staging, as well as a series of 5 patches
in the hfi1 driver that removed code people thought should be done in
a different way and was optional to begin with (a snoop debug
interface, an eeprom driver for an eeprom connected directory to their
hfi1 chip and not via an i2c bus, and a few other things like that),
the line count, especially the removal count, is high"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (56 commits)
staging/rdma: Remove the entire rdma subdirectory of staging
IB/core: Make device counter infrastructure dynamic
IB/hfi1: Fix pio map initialization
IB/hfi1: Correct 8051 link parameter settings
IB/hfi1: Update pkey table properly after link down or FM start
IB/rdamvt: Fix rdmavt s_ack_queue sizing
IB/rdmavt: Max atomic value should be a u8
IB/hfi1: Fix hard lockup due to not using save/restore spin lock
IB/hfi1: Add tracing support for send with invalidate opcode
IB/hfi1, qib: Add ieth to the packet header definitions
IB/hfi1: Move driver out of staging
IB/hfi1: Do not free hfi1 cdev parent structure early
IB/hfi1: Add trace message in user IOCTL handling
IB/hfi1: Remove write(), use ioctl() for user cmds
IB/hfi1: Add ioctl() interface for user commands
IB/hfi1: Remove unused user command
IB/hfi1: Remove snoop/diag interface
IB/hfi1: Remove EPROM functionality from data device
IB/hfi1: Remove UI char device
IB/hfi1: Remove multiple device cdev
...
Board "Leon" is otherwise known as "Toshiba CB35" and we already have
the entry that supports that board as of this commit : 963cb6f platform/chrome: chromeos_laptop - Add Toshiba CB35 Touch
Remove this duplicate.
Signed-off-by: Benson Leung <bleung@chromium.org> Signed-off-by: Olof Johansson <olof@lixom.net>
Dan Carpenter [Sat, 28 May 2016 05:01:46 +0000 (08:01 +0300)]
i2c: dev: use after free in detach
The call to put_i2c_dev() frees "i2c_dev" so there is a use after
free when we call cdev_del(&i2c_dev->cdev).
Fixes: d6760b14d4a1 ('i2c: dev: switch from register_chrdev to cdev API') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
The corresponding FROZEN hotplug notifier transitions used on
suspend/resume are ignored. Therefore the switch case action argument
is masked with the frozen hotplug notifier transition mask.
James Hogan [Tue, 24 May 2016 08:35:11 +0000 (09:35 +0100)]
MIPS: Build microMIPS VDSO for microMIPS kernels
MicroMIPS kernels may be expected to run on microMIPS only cores which
don't support the normal MIPS instruction set, so be sure to pass the
-mmicromips flag through to the VDSO cflags.
Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.4.x-
Patchwork: https://patchwork.linux-mips.org/patch/13349/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
James Hogan [Tue, 24 May 2016 08:35:10 +0000 (09:35 +0100)]
MIPS: Fix sigreturn via VDSO on microMIPS kernel
In microMIPS kernels, handle_signal() sets the isa16 mode bit in the
vdso address so that the sigreturn trampolines (which are offset from
the VDSO) get executed as microMIPS.
However commit ebb5e78cc634 ("MIPS: Initial implementation of a VDSO")
changed the offsets to come from the VDSO image, which already have the
isa16 mode bit set correctly since they're extracted from the VDSO
shared library symbol table.
Drop the isa16 mode bit handling from handle_signal() to fix sigreturn
for cores which support both microMIPS and normal MIPS. This doesn't fix
microMIPS only cores, since the VDSO is still built for normal MIPS, but
thats a separate problem.
Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.4.x-
Patchwork: https://patchwork.linux-mips.org/patch/13348/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Antony Pavlov [Mon, 23 May 2016 11:39:00 +0000 (14:39 +0300)]
MIPS: devicetree: fix cpu interrupt controller node-names
Here is the quote from [1]:
The unit-address must match the first address specified
in the reg property of the node. If the node has no reg property,
the @ and unit-address must be omitted and the node-name alone
differentiates the node from other nodes at the same level
This patch adjusts MIPS dts-files and devicetree binding
documentation in accordance with [1].
[1] Power.org(tm) Standard for Embedded Power Architecture(tm)
Platform Requirements (ePAPR). Version 1.1 – 08 April 2011.
Chapter 2.2.1.1 Node Name Requirements
Signed-off-by: Antony Pavlov <antonynpavlov@gmail.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ian Campbell <ijc+devicetree@hellion.org.uk> Cc: Kumar Gala <galak@codeaurora.org> Cc: linux-mips@linux-mips.org Cc: devicetree@vger.kernel.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13345/ Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Avoid an aliasing issue causing a build error in VDSO:
In file included from include/linux/srcu.h:34:0,
from include/linux/notifier.h:15,
from ./arch/mips/include/asm/uprobes.h:9,
from include/linux/uprobes.h:61,
from include/linux/mm_types.h:13,
from ./arch/mips/include/asm/vdso.h:14,
from arch/mips/vdso/vdso.h:27,
from arch/mips/vdso/gettimeofday.c:11:
include/linux/workqueue.h: In function 'work_static':
include/linux/workqueue.h:186:2: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
return *work_data_bits(work) & WORK_STRUCT_STATIC;
^
cc1: all warnings being treated as errors
make[2]: *** [arch/mips/vdso/gettimeofday.o] Error 1
with a CONFIG_DEBUG_OBJECTS_WORK configuration and GCC 5.2.0. Include
`-fno-strict-aliasing' along with compiler options used, as required for
kernel code, fixing a problem present since the introduction of VDSO
with commit ebb5e78cc634 ("MIPS: Initial implementation of a VDSO").
Thanks to Tejun for diagnosing this properly!
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com> Reviewed-by: James Hogan <james.hogan@imgtec.com> Fixes: ebb5e78cc634 ("MIPS: Initial implementation of a VDSO") Cc: Tejun Heo <tj@kernel.org> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org # v4.3+
Patchwork: https://patchwork.linux-mips.org/patch/13357/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Harvey Hunt [Wed, 25 May 2016 10:06:35 +0000 (11:06 +0100)]
MIPS: lib: Mark intrinsics notrace
On certain MIPS32 devices, the ftrace tracer "function_graph" uses
__lshrdi3() during the capturing of trace data. ftrace then attempts to
trace __lshrdi3() which leads to infinite recursion and a stack overflow.
Fix this by marking __lshrdi3() as notrace. Mark the other compiler
intrinsics as notrace in case the compiler decides to use them in the
ftrace path.
James Hogan [Fri, 27 May 2016 21:25:23 +0000 (22:25 +0100)]
MIPS: Fix 64-bit HTW configuration
The Hardware page Table Walker (HTW) is being misconfigured on 64-bit
kernels. The PWSize.PS (pointer size) bit determines whether pointers
within directories are loaded as 32-bit or 64-bit addresses, but was
never being set to 1 for 64-bit kernels where the unsigned long in pgd_t
is 64-bits wide.
This actually reduces rather than improves performance when the HTW is
enabled on P6600 since the HTW is initiated lots, but walks are all
aborted due I think to bad intermediate pointers.
Since we were already taking the width of the PTEs into account by
setting PWSize.PTEW, which is the left shift applied to the page table
index *in addition to* the native pointer size, we also need to reduce
PTEW by 1 when PS=1. This is done by calculating PTEW based on the
relative size of pte_t compared to pgd_t.
Finally in order for the HTW to be used when PS=1, the appropriate
XK/XS/XU bits corresponding to the different 64-bit segments need to be
set in PWCtl. We enable only XU for now to enable walking for XUSeg.
Supporting walking for XKSeg would be a bit more involved so is left for
a future patch. It would either require the use of a per-CPU top level
base directory if supported by the HTW (a bit like pgd_current but with
a second entry pointing at swapper_pg_dir), or the HTW would prepend bit
63 of the address to the global directory index which doesn't really
match how we split user and kernel page directories.
Fixes: cab25bc7537b ("MIPS: Extend hardware table walking support to MIPS64") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13364/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
James Hogan [Fri, 27 May 2016 21:25:22 +0000 (22:25 +0100)]
MIPS: Add 64-bit HTW fields
Add field definitions for some of the 64-bit specific Hardware page
Table Walker (HTW) register fields in PWSize and PWCtl, in preparation
for fixing the 64-bit HTW configuration.
Also print these fields out along with the others in print_htw_config().
Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13363/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
James Hogan [Fri, 20 May 2016 22:28:41 +0000 (23:28 +0100)]
MIPS: Simplify DSP instruction encoding macros
Simplify the DSP instruction wrapper macros which use explicit encodings
for microMIPS and normal MIPS by using the new encoding macros and
removing duplication.
To me this makes it easier to read since it is much shorter, but it also
ensures .insn is used, preventing objdump disassembling the microMIPS
code as normal MIPS.
James Hogan [Fri, 20 May 2016 22:28:40 +0000 (23:28 +0100)]
MIPS: Add missing tlbinvf/XPA microMIPS encodings
Hardcoded MIPS instruction encodings are provided for tlbinvf, mfhc0 &
mthc0 instructions, but microMIPS encodings are missing. I doubt any
microMIPS cores exist at present which support these instructions, but
the microMIPS encodings exist, and microMIPS cores may support them in
the future. Add the missing microMIPS encodings using the new macros.
James Hogan [Fri, 20 May 2016 22:28:39 +0000 (23:28 +0100)]
MIPS: Fix little endian microMIPS MSA encodings
When the toolchain doesn't support MSA we encode MSA instructions
explicitly in assembly. Unfortunately we use .word for both MIPS and
microMIPS encodings which is wrong, since 32-bit microMIPS instructions
are made up from a pair of halfwords.
- The most significant halfword always comes first, so for little endian
builds the halves will be emitted in the wrong order.
- 32-bit alignment isn't guaranteed, so the assembler may insert a
16-bit nop instruction to pad the instruction stream to a 32-bit
boundary.
Use the new instruction encoding macros to encode microMIPS MSA
instructions correctly.
James Hogan [Fri, 20 May 2016 22:28:38 +0000 (23:28 +0100)]
MIPS: Add missing VZ accessor microMIPS encodings
Toolchains may be used which support microMIPS but not VZ instructions
(i.e. binutis 2.22 & 2.23), so extend the explicitly encoded versions of
the guest COP0 register & guest TLB access macros to support microMIPS
encodings too, using the new macros.
This prevents non-microMIPS instructions being executed in microMIPS
mode during CPU probe on cores supporting VZ (e.g. M5150), which cause
reserved instruction exceptions early during boot.
Fixes: bad50d79255a ("MIPS: Fix VZ probe gas errors with binutils <2.24") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13311/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
James Hogan [Fri, 20 May 2016 22:28:37 +0000 (23:28 +0100)]
MIPS: Add inline asm encoding helpers
To allow simplification of macros which use inline assembly to
explicitly encode instructions, add a few simple abstractions to
mipsregs.h which expand to specific microMIPS or normal MIPS encodings
depending on what type of kernel is being built:
_ASM_INSN_IF_MIPS(_enc) : Emit a 32bit MIPS instruction if microMIPS is
not enabled.
_ASM_INSN32_IF_MM(_enc) : Emit a 32bit microMIPS instruction if enabled.
_ASM_INSN16_IF_MM(_enc) : Emit a 16bit microMIPS instruction if enabled.
The macros can be used one after another since the MIPS / microMIPS
macros are mutually exclusive, for example: