]> git.karo-electronics.de Git - karo-tx-linux.git/log
karo-tx-linux.git
8 years agolibceph: use the right footer size when skipping a message
Ilya Dryomov [Fri, 19 Feb 2016 10:38:57 +0000 (11:38 +0100)]
libceph: use the right footer size when skipping a message

commit dbc0d3caff5b7591e0cf8e34ca686ca6f4479ee1 upstream.

ceph_msg_footer is 21 bytes long, while ceph_msg_footer_old is only 13.
Don't skip too much when CEPH_FEATURE_MSG_AUTH isn't negotiated.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agolibceph: don't bail early from try_read() when skipping a message
Ilya Dryomov [Wed, 17 Feb 2016 19:04:08 +0000 (20:04 +0100)]
libceph: don't bail early from try_read() when skipping a message

commit e7a88e82fe380459b864e05b372638aeacb0f52d upstream.

The contract between try_read() and try_write() is that when called
each processes as much data as possible.  When instructed by osd_client
to skip a message, try_read() is violating this contract by returning
after receiving and discarding a single message instead of checking for
more.  try_write() then gets a chance to write out more requests,
generating more replies/skips for try_read() to handle, forcing the
messenger into a starvation loop.

Reported-by: Varada Kari <Varada.Kari@sandisk.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Varada Kari <Varada.Kari@sandisk.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agolibceph: fix ceph_msg_revoke()
Ilya Dryomov [Mon, 28 Dec 2015 10:18:34 +0000 (13:18 +0300)]
libceph: fix ceph_msg_revoke()

commit 67645d7619738e51c668ca69f097cb90b5470422 upstream.

There are a number of problems with revoking a "was sending" message:

(1) We never make any attempt to revoke data - only kvecs contibute to
con->out_skip.  However, once the header (envelope) is written to the
socket, our peer learns data_len and sets itself to expect at least
data_len bytes to follow front or front+middle.  If ceph_msg_revoke()
is called while the messenger is sending message's data portion,
anything we send after that call is counted by the OSD towards the now
revoked message's data portion.  The effects vary, the most common one
is the eventual hang - higher layers get stuck waiting for the reply to
the message that was sent out after ceph_msg_revoke() returned and
treated by the OSD as a bunch of data bytes.  This is what Matt ran
into.

(2) Flat out zeroing con->out_kvec_bytes worth of bytes to handle kvecs
is wrong.  If ceph_msg_revoke() is called before the tag is sent out or
while the messenger is sending the header, we will get a connection
reset, either due to a bad tag (0 is not a valid tag) or a bad header
CRC, which kind of defeats the purpose of revoke.  Currently the kernel
client refuses to work with header CRCs disabled, but that will likely
change in the future, making this even worse.

(3) con->out_skip is not reset on connection reset, leading to one or
more spurious connection resets if we happen to get a real one between
con->out_skip is set in ceph_msg_revoke() and before it's cleared in
write_partial_skip().

Fixing (1) and (3) is trivial.  The idea behind fixing (2) is to never
zero the tag or the header, i.e. send out tag+header regardless of when
ceph_msg_revoke() is called.  That way the header is always correct, no
unnecessary resets are induced and revoke stands ready for disabled
CRCs.  Since ceph_msg_revoke() rips out con->out_msg, introduce a new
"message out temp" and copy the header into it before sending.

Reported-by: Matt Conner <matt.conner@keepertech.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Matt Conner <matt.conner@keepertech.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoseccomp: always propagate NO_NEW_PRIVS on tsync
Jann Horn [Sat, 26 Dec 2015 05:00:48 +0000 (06:00 +0100)]
seccomp: always propagate NO_NEW_PRIVS on tsync

commit 103502a35cfce0710909da874f092cb44823ca03 upstream.

Before this patch, a process with some permissive seccomp filter
that was applied by root without NO_NEW_PRIVS was able to add
more filters to itself without setting NO_NEW_PRIVS by setting
the new filter from a throwaway thread with NO_NEW_PRIVS.

Signed-off-by: Jann Horn <jann@thejh.net>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agocpufreq: Fix NULL reference crash while accessing policy->governor_data
Viresh Kumar [Mon, 25 Jan 2016 17:03:46 +0000 (22:33 +0530)]
cpufreq: Fix NULL reference crash while accessing policy->governor_data

commit e4b133cc4b30b48d488e4e4fffb132f173ce4358 upstream.

There is a race discovered by Juri, where we are able to:
- create and read a sysfs file before policy->governor_data is being set
  to a non NULL value.
  OR
- set policy->governor_data to NULL, and reading a file before being
  destroyed.

And so such a crash is reported:

Unable to handle kernel NULL pointer dereference at virtual address 0000000c
pgd = edfc8000
[0000000c] *pgd=bfc8c835
Internal error: Oops: 17 [#1] SMP ARM
Modules linked in:
CPU: 4 PID: 1730 Comm: cat Not tainted 4.5.0-rc1+ #463
Hardware name: ARM-Versatile Express
task: ee8e8480 ti: ee930000 task.ti: ee930000
PC is at show_ignore_nice_load_gov_pol+0x24/0x34
LR is at show+0x4c/0x60
pc : [<c058f1bc>]    lr : [<c058ae88>]    psr: a0070013
sp : ee931dd0  ip : ee931de0  fp : ee931ddc
r10: ee4bc290  r9 : 00001000  r8 : ef2cb000
r7 : ee4bc200  r6 : ef2cb000  r5 : c0af57b0  r4 : ee4bc2e0
r3 : 00000000  r2 : 00000000  r1 : c0928df4  r0 : ef2cb000
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: adfc806a  DAC: 00000051
Process cat (pid: 1730, stack limit = 0xee930210)
Stack: (0xee931dd0 to 0xee932000)
1dc0:                                     ee931dfc ee931de0 c058ae88 c058f1a4
1de0: edce3bc0 c07bfca4 edce3ac0 00001000 ee931e24 ee931e00 c01fcb90 c058ae48
1e00: 00000001 edce3bc0 00000000 00000001 ee931e50 ee8ff480 ee931e34 ee931e28
1e20: c01fb33c c01fcb0c ee931e8c ee931e38 c01a5210 c01fb314 ee931e9c ee931e48
1e40: 00000000 edce3bf0 befe4a00 ee931f78 00000000 00000000 000001e4 00000000
1e60: c00545a8 edce3ac0 00001000 00001000 befe4a00 ee931f78 00000000 00001000
1e80: ee931ed4 ee931e90 c01fbed8 c01a5038 ed085a58 00020000 00000000 00000000
1ea0: c0ad72e4 ee931f78 ee8ff488 ee8ff480 c077f3fc 00001000 befe4a00 ee931f78
1ec0: 00000000 00001000 ee931f44 ee931ed8 c017c328 c01fbdc4 00001000 00000000
1ee0: ee8ff480 00001000 ee931f44 ee931ef8 c017c65c c03deb10 ee931fac ee931f08
1f00: c0009270 c001f290 c0a8d968 ef2cb000 ef2cb000 ee8ff480 00000020 ee8ff480
1f20: ee8ff480 befe4a00 00001000 ee931f78 00000000 00000000 ee931f74 ee931f48
1f40: c017d1ec c017c2f8 c019c724 c019c684 ee8ff480 ee8ff480 00001000 befe4a00
1f60: 00000000 00000000 ee931fa4 ee931f78 c017d2a8 c017d160 00000000 00000000
1f80: 000a9f20 00001000 befe4a00 00000003 c000ffe4 ee930000 00000000 ee931fa8
1fa0: c000fe40 c017d264 000a9f20 00001000 00000003 befe4a00 00001000 00000000
Unable to handle kernel NULL pointer dereference at virtual address 0000000c
1fc0: 000a9f20 00001000 befe4a00 00000003 00000000 00000000 00000003 00000001
pgd = edfc4000
[0000000c] *pgd=bfcac835
1fe0: 00000000 befe49dc 000197f8 b6e35dfc 60070010 00000003 3065b49d 134ac2c9

[<c058f1bc>] (show_ignore_nice_load_gov_pol) from [<c058ae88>] (show+0x4c/0x60)
[<c058ae88>] (show) from [<c01fcb90>] (sysfs_kf_seq_show+0x90/0xfc)
[<c01fcb90>] (sysfs_kf_seq_show) from [<c01fb33c>] (kernfs_seq_show+0x34/0x38)
[<c01fb33c>] (kernfs_seq_show) from [<c01a5210>] (seq_read+0x1e4/0x4e4)
[<c01a5210>] (seq_read) from [<c01fbed8>] (kernfs_fop_read+0x120/0x1a0)
[<c01fbed8>] (kernfs_fop_read) from [<c017c328>] (__vfs_read+0x3c/0xe0)
[<c017c328>] (__vfs_read) from [<c017d1ec>] (vfs_read+0x98/0x104)
[<c017d1ec>] (vfs_read) from [<c017d2a8>] (SyS_read+0x50/0x90)
[<c017d2a8>] (SyS_read) from [<c000fe40>] (ret_fast_syscall+0x0/0x1c)
Code: e5903044 e1a00001 e3081df4 e34c1092 (e593300c)
---[ end trace 5994b9a5111f35ee ]---

Fix that by making sure, policy->governor_data is updated at the right
places only.

Reported-and-tested-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agocpufreq: pxa2xx: fix pxa_cpufreq_change_voltage prototype
Arnd Bergmann [Mon, 25 Jan 2016 15:44:38 +0000 (16:44 +0100)]
cpufreq: pxa2xx: fix pxa_cpufreq_change_voltage prototype

commit fb2a24a1c6457d21df9fae0dd66b20c63ba56077 upstream.

There are two definitions of pxa_cpufreq_change_voltage, with slightly
different prototypes after one of them had its argument marked 'const'.
Now the other one (for !CONFIG_REGULATOR) produces a harmless warning:

drivers/cpufreq/pxa2xx-cpufreq.c: In function 'pxa_set_target':
drivers/cpufreq/pxa2xx-cpufreq.c:291:36: warning: passing argument 1 of 'pxa_cpufreq_change_voltage' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
   ret = pxa_cpufreq_change_voltage(&pxa_freq_settings[idx]);
                                    ^
drivers/cpufreq/pxa2xx-cpufreq.c:205:12: note: expected 'struct pxa_freqs *' but argument is of type 'const struct pxa_freqs *'
 static int pxa_cpufreq_change_voltage(struct pxa_freqs *pxa_freq)
            ^

This changes the prototype in the same way as the other, which
avoids the warning.

Fixes: 03c229906311 (cpufreq: pxa: make pxa_freqs arrays const)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agohwmon: (ads1015) Handle negative conversion values correctly
Peter Rosin [Thu, 18 Feb 2016 13:07:52 +0000 (14:07 +0100)]
hwmon: (ads1015) Handle negative conversion values correctly

commit acc146943957d7418a6846f06e029b2c5e87e0d5 upstream.

Make the divisor signed as DIV_ROUND_CLOSEST is undefined for negative
dividends when the divisor is unsigned.

Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agohwmon: (gpio-fan) Remove un-necessary speed_index lookup for thermal hook
Nishanth Menon [Sat, 20 Feb 2016 00:09:51 +0000 (18:09 -0600)]
hwmon: (gpio-fan) Remove un-necessary speed_index lookup for thermal hook

commit 000e0949148382c4962489593a2f05504c2a6771 upstream.

Thermal hook gpio_fan_get_cur_state is only interested in knowing
the current speed index that was setup in the system, this is
already available as part of fan_data->speed_index which is always
set by set_fan_speed. Using get_fan_speed_index is useful when we
have no idea about the fan speed configuration (for example during
fan_ctrl_init).

When thermal framework invokes
gpio_fan_get_cur_state=>get_fan_speed_index via gpio_fan_get_cur_state
especially in a polled configuration for thermal governor, we
basically hog the i2c interface to the extent that other functions
fail to get any traffic out :(.

Instead, just provide the last state set in the driver - since the gpio
fan driver is responsible for the fan state immaterial of override, the
fan_data->speed_index should accurately reflect the state.

Fixes: b5cf88e46bad ("(gpio-fan): Add thermal control hooks")
Reported-by: Tony Lindgren <tony@atomide.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agohwmon: (dell-smm) Blacklist Dell Studio XPS 8000
Thorsten Leemhuis [Sun, 17 Jan 2016 15:03:04 +0000 (16:03 +0100)]
hwmon: (dell-smm) Blacklist Dell Studio XPS 8000

commit 6220f4ebd7b4db499238c2dc91268a9c473fd01c upstream.

Since Linux 4.0 the CPU fan speed is going up and down on Dell Studio
XPS 8000 and 8100 for unknown reasons. The 8100 was already
blacklisted in commit a4b45b25f18d ("hwmon: (dell-smm) Blacklist
Dell Studio XPS 8100"). This patch blacklists the XPS 8000.

Without further debugging on the affected machine, it is not possible
to find the problem. For more details see
https://bugzilla.kernel.org/show_bug.cgi?id=100121

Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Acked-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoThermal: do thermal zone update after a cooling device registered
Chen Yu [Fri, 30 Oct 2015 08:32:10 +0000 (16:32 +0800)]
Thermal: do thermal zone update after a cooling device registered

commit 4511f7166a2deb5f7a578cf87fd2fe1ae83527e3 upstream.

When a new cooling device is registered, we need to update the
thermal zone to set the new registered cooling device to a proper
state.

This fixes a problem that the system is cool, while the fan devices
are left running on full speed after boot, if fan device is registered
after thermal zone device.

Here is the history of why current patch looks like this:
https://patchwork.kernel.org/patch/7273041/

Reference:https://bugzilla.kernel.org/show_bug.cgi?id=92431
Tested-by: Manuel Krause <manuelkrause@netscape.net>
Tested-by: szegad <szegadlo@poczta.onet.pl>
Tested-by: prash <prash.n.rao@gmail.com>
Tested-by: amish <ammdispose-arch@yahoo.com>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoThermal: handle thermal zone device properly during system sleep
Zhang Rui [Fri, 30 Oct 2015 08:31:58 +0000 (16:31 +0800)]
Thermal: handle thermal zone device properly during system sleep

commit ff140fea847e1c2002a220571ab106c2456ed252 upstream.

Current thermal code does not handle system sleep well because
1. the cooling device cooling state may be changed during suspend
2. the previous temperature reading becomes invalid after resumed because
   it is got before system sleep
3. updating thermal zone device during suspending/resuming
   is wrong because some devices may have already been suspended
   or may have not been resumed.

Thus, the proper way to do this is to cancel all thermal zone
device update requirements during suspend/resume, and after all
the devices have been resumed, reset and update every registered
thermal zone devices.

This also fixes a regression introduced by:
Commit 19593a1fb1f6 ("ACPI / fan: convert to platform driver")
Because, with above commit applied, all the fan devices are attached
to the acpi_general_pm_domain, and they are turned on by the pm_domain
automatically after resume, without the awareness of thermal core.

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=78201
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=91411
Tested-by: Manuel Krause <manuelkrause@netscape.net>
Tested-by: szegad <szegadlo@poczta.onet.pl>
Tested-by: prash <prash.n.rao@gmail.com>
Tested-by: amish <ammdispose-arch@yahoo.com>
Tested-by: Matthias <morpheusxyz123@yahoo.de>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoThermal: initialize thermal zone device correctly
Zhang Rui [Fri, 30 Oct 2015 08:31:47 +0000 (16:31 +0800)]
Thermal: initialize thermal zone device correctly

commit bb431ba26c5cd0a17c941ca6c3a195a3a6d5d461 upstream.

After thermal zone device registered, as we have not read any
temperature before, thus tz->temperature should not be 0,
which actually means 0C, and thermal trend is not available.
In this case, we need specially handling for the first
thermal_zone_device_update().

Both thermal core framework and step_wise governor is
enhanced to handle this. And since the step_wise governor
is the only one that uses trends, so it's the only thermal
governor that needs to be updated.

Tested-by: Manuel Krause <manuelkrause@netscape.net>
Tested-by: szegad <szegadlo@poczta.onet.pl>
Tested-by: prash <prash.n.rao@gmail.com>
Tested-by: amish <ammdispose-arch@yahoo.com>
Tested-by: Matthias <morpheusxyz123@yahoo.de>
Reviewed-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoIB/mlx5: Expose correct maximum number of CQE capacity
Leon Romanovsky [Thu, 14 Jan 2016 06:11:40 +0000 (08:11 +0200)]
IB/mlx5: Expose correct maximum number of CQE capacity

commit 9f17768611ebf81dfac69948dd12622b6f2e45fc upstream.

Maximum number of EQE capacity per CQ was mistakenly exposed
as CQE. Fix that.

Fixes: 938fe83c8dcb ("net/mlx5_core: New device capabilities handling")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoIB/qib: Support creating qps with GFP_NOIO flag
Vinit Agnihotri [Mon, 11 Jan 2016 17:57:25 +0000 (12:57 -0500)]
IB/qib: Support creating qps with GFP_NOIO flag

commit fbbeb8632bf0b46ab44cfcedc4654cd7831b7161 upstream.

The current code is problematic when the QP creation and ipoib is used to
support NFS and NFS desires to do IO for paging purposes. In that case, the
GFP_KERNEL allocation in qib_qp.c causes a deadlock in tight memory
situations.

This fix adds support to create queue pair with GFP_NOIO flag for connected
mode only to cleanly fail the create queue pair in those situations.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Vinit Agnihotri <vinit.abhay.agnihotri@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoIB/qib: fix mcast detach when qp not attached
Mike Marciniszyn [Thu, 7 Jan 2016 21:44:10 +0000 (16:44 -0500)]
IB/qib: fix mcast detach when qp not attached

commit 09dc9cd6528f5b52bcbd3292a6312e762c85260f upstream.

The code produces the following trace:

[1750924.419007] general protection fault: 0000 [#3] SMP
[1750924.420364] Modules linked in: nfnetlink autofs4 rpcsec_gss_krb5 nfsv4
dcdbas rfcomm bnep bluetooth nfsd auth_rpcgss nfs_acl dm_multipath nfs lockd
scsi_dh sunrpc fscache radeon ttm drm_kms_helper drm serio_raw parport_pc
ppdev i2c_algo_bit lpc_ich ipmi_si ib_mthca ib_qib dca lp parport ib_ipoib
mac_hid ib_cm i3000_edac ib_sa ib_uverbs edac_core ib_umad ib_mad ib_core
ib_addr tg3 ptp dm_mirror dm_region_hash dm_log psmouse pps_core
[1750924.420364] CPU: 1 PID: 8401 Comm: python Tainted: G D
3.13.0-39-generic #66-Ubuntu
[1750924.420364] Hardware name: Dell Computer Corporation PowerEdge
860/0XM089, BIOS A04 07/24/2007
[1750924.420364] task: ffff8800366a9800 ti: ffff88007af1c000 task.ti:
ffff88007af1c000
[1750924.420364] RIP: 0010:[<ffffffffa0131d51>] [<ffffffffa0131d51>]
qib_mcast_qp_free+0x11/0x50 [ib_qib]
[1750924.420364] RSP: 0018:ffff88007af1dd70  EFLAGS: 00010246
[1750924.420364] RAX: 0000000000000001 RBX: ffff88007b822688 RCX:
000000000000000f
[1750924.420364] RDX: ffff88007b822688 RSI: ffff8800366c15a0 RDI:
6764697200000000
[1750924.420364] RBP: ffff88007af1dd78 R08: 0000000000000001 R09:
0000000000000000
[1750924.420364] R10: 0000000000000011 R11: 0000000000000246 R12:
ffff88007baa1d98
[1750924.420364] R13: ffff88003ecab000 R14: ffff88007b822660 R15:
0000000000000000
[1750924.420364] FS:  00007ffff7fd8740(0000) GS:ffff88007fc80000(0000)
knlGS:0000000000000000
[1750924.420364] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1750924.420364] CR2: 00007ffff597c750 CR3: 000000006860b000 CR4:
00000000000007e0
[1750924.420364] Stack:
[1750924.420364]  ffff88007b822688 ffff88007af1ddf0 ffffffffa0132429
000000007af1de20
[1750924.420364]  ffff88007baa1dc8 ffff88007baa0000 ffff88007af1de70
ffffffffa00cb313
[1750924.420364]  00007fffffffde88 0000000000000000 0000000000000008
ffff88003ecab000
[1750924.420364] Call Trace:
[1750924.420364]  [<ffffffffa0132429>] qib_multicast_detach+0x1e9/0x350
[ib_qib]
[1750924.568035]  [<ffffffffa00cb313>] ? ib_uverbs_modify_qp+0x323/0x3d0
[ib_uverbs]
[1750924.568035]  [<ffffffffa0092d61>] ib_detach_mcast+0x31/0x50 [ib_core]
[1750924.568035]  [<ffffffffa00cc213>] ib_uverbs_detach_mcast+0x93/0x170
[ib_uverbs]
[1750924.568035]  [<ffffffffa00c61f6>] ib_uverbs_write+0xc6/0x2c0 [ib_uverbs]
[1750924.568035]  [<ffffffff81312e68>] ? apparmor_file_permission+0x18/0x20
[1750924.568035]  [<ffffffff812d4cd3>] ? security_file_permission+0x23/0xa0
[1750924.568035]  [<ffffffff811bd214>] vfs_write+0xb4/0x1f0
[1750924.568035]  [<ffffffff811bdc49>] SyS_write+0x49/0xa0
[1750924.568035]  [<ffffffff8172f7ed>] system_call_fastpath+0x1a/0x1f
[1750924.568035] Code: 66 2e 0f 1f 84 00 00 00 00 00 31 c0 5d c3 66 2e 0f 1f
84 00 00 00 00 00 66 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 8b 7f 10
<f0> ff 8f 40 01 00 00 74 0e 48 89 df e8 8e f8 06 e1 5b 5d c3 0f
[1750924.568035] RIP  [<ffffffffa0131d51>] qib_mcast_qp_free+0x11/0x50
[ib_qib]
[1750924.568035]  RSP <ffff88007af1dd70>
[1750924.650439] ---[ end trace 73d5d4b3f8ad4851 ]

The fix is to note the qib_mcast_qp that was found.   If none is found, then
return EINVAL indicating the error.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoIB/cm: Fix a recently introduced deadlock
Bart Van Assche [Fri, 1 Jan 2016 12:17:46 +0000 (13:17 +0100)]
IB/cm: Fix a recently introduced deadlock

commit 4bfdf635c668869c69fd18ece37ec66fb6f38fcf upstream.

ib_send_cm_drep() calls cm_enter_timewait() while holding a spinlock
that can be locked from inside an interrupt handler. Hence do not
enable interrupts inside cm_enter_timewait() if called with interrupts
disabled.

This patch fixes e.g. the following deadlock:
Acked-by: Erez Shitrit <erezsh@mellanox.com>
=================================
[ INFO: inconsistent lock state ]
4.4.0-rc7+ #1 Tainted: G            E
---------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/8/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
(&(&cm_id_priv->lock)->rlock){?.+...}, at: [<ffffffffa036eec4>] cm_establish+0x
74/0x1b0 [ib_cm]
{HARDIRQ-ON-W} state was registered at:
  [<ffffffff810a3c11>] mark_held_locks+0x71/0x90
  [<ffffffff810a3e87>] trace_hardirqs_on_caller+0xa7/0x1c0
  [<ffffffff810a3fad>] trace_hardirqs_on+0xd/0x10
  [<ffffffff8151c40b>] _raw_spin_unlock_irq+0x2b/0x40
  [<ffffffffa036ea8e>] cm_enter_timewait+0xae/0x100 [ib_cm]
  [<ffffffffa036ff76>] ib_send_cm_drep+0xb6/0x190 [ib_cm]
  [<ffffffffa052ed08>] srp_cm_handler+0x128/0x1a0 [ib_srp]
  [<ffffffffa0370340>] cm_process_work+0x20/0xf0 [ib_cm]
  [<ffffffffa0371335>] cm_dreq_handler+0x135/0x2c0 [ib_cm]
  [<ffffffffa03733c5>] cm_work_handler+0x75/0xd0 [ib_cm]
  [<ffffffff8107184d>] process_one_work+0x1bd/0x460
  [<ffffffff81073148>] worker_thread+0x118/0x420
  [<ffffffff81078454>] kthread+0xe4/0x100
  [<ffffffff8151cbbf>] ret_from_fork+0x3f/0x70
irq event stamp: 1672286
hardirqs last  enabled at (1672283): [<ffffffff81408ec0>] poll_idle+0x10/0x80
hardirqs last disabled at (1672284): [<ffffffff8151d304>] common_interrupt+0x84/0x89
softirqs last  enabled at (1672286): [<ffffffff8105b4dc>] _local_bh_enable+0x1c/0x50
softirqs last disabled at (1672285): [<ffffffff8105b697>] irq_enter+0x47/0x70

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&cm_id_priv->lock)->rlock);
  <Interrupt>
    lock(&(&cm_id_priv->lock)->rlock);

 *** DEADLOCK ***

no locks held by swapper/8/0.

stack backtrace:
CPU: 8 PID: 0 Comm: swapper/8 Tainted: G            E   4.4.0-rc7+ #1
Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 11/17/2014
 ffff88045af5e950 ffff88046e503a88 ffffffff81251c1b 0000000000000007
 0000000000000006 0000000000000003 ffff88045af5ddc0 ffff88046e503ad8
 ffffffff810a32f4 0000000000000000 0000000000000000 0000000000000001
Call Trace:
 <IRQ>  [<ffffffff81251c1b>] dump_stack+0x4f/0x74
 [<ffffffff810a32f4>] print_usage_bug+0x184/0x190
 [<ffffffff810a36e2>] mark_lock_irq+0xf2/0x290
 [<ffffffff810a3995>] mark_lock+0x115/0x1b0
 [<ffffffff810a3b8c>] mark_irqflags+0x15c/0x170
 [<ffffffff810a4fef>] __lock_acquire+0x1ef/0x560
 [<ffffffff810a53c2>] lock_acquire+0x62/0x80
 [<ffffffff8151bd33>] _raw_spin_lock_irqsave+0x43/0x60
 [<ffffffffa036eec4>] cm_establish+0x74/0x1b0 [ib_cm]
 [<ffffffffa036f031>] ib_cm_notify+0x31/0x100 [ib_cm]
 [<ffffffffa0637f24>] srpt_qp_event+0x54/0xd0 [ib_srpt]
 [<ffffffffa0196052>] mlx4_ib_qp_event+0x72/0xc0 [mlx4_ib]
 [<ffffffffa00775b9>] mlx4_qp_event+0x69/0xd0 [mlx4_core]
 [<ffffffffa006000e>] mlx4_eq_int+0x51e/0xd50 [mlx4_core]
 [<ffffffffa006084f>] mlx4_msi_x_interrupt+0xf/0x20 [mlx4_core]
 [<ffffffff810b67b0>] handle_irq_event_percpu+0x40/0x110
 [<ffffffff810b68bf>] handle_irq_event+0x3f/0x70
 [<ffffffff810ba7f9>] handle_edge_irq+0x79/0x120
 [<ffffffff81007f3d>] handle_irq+0x5d/0x130
 [<ffffffff810071fd>] do_IRQ+0x6d/0x130
 [<ffffffff8151d309>] common_interrupt+0x89/0x89
 <EOI>  [<ffffffff8140895f>] cpuidle_enter_state+0xcf/0x200
 [<ffffffff81408aa2>] cpuidle_enter+0x12/0x20
 [<ffffffff810990d6>] call_cpuidle+0x36/0x60
 [<ffffffff81099163>] cpuidle_idle_call+0x63/0x110
 [<ffffffff8109930a>] cpu_idle_loop+0xfa/0x130
 [<ffffffff8109934e>] cpu_startup_entry+0xe/0x10
 [<ffffffff8103c443>] start_secondary+0x83/0x90

Fixes: commit be4b499323bf ("IB/cm: Do not queue work to a device that's going away")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodmaengine: dw: disable BLOCK IRQs for non-cyclic xfer
Andy Shevchenko [Wed, 10 Feb 2016 13:59:42 +0000 (15:59 +0200)]
dmaengine: dw: disable BLOCK IRQs for non-cyclic xfer

commit ee1cdcdae59563535485a5f56ee72c894ab7d7ad upstream.

The commit 2895b2cad6e7 ("dmaengine: dw: fix cyclic transfer callbacks")
re-enabled BLOCK interrupts with regard to make cyclic transfers work. However,
this change becomes a regression for non-cyclic transfers as interrupt counters
under stress test had been grown enormously (approximately per 4-5 bytes in the
UART loop back test).

Taking into consideration above enable BLOCK interrupts if and only if channel
is programmed to perform cyclic transfer.

Fixes: 2895b2cad6e7 ("dmaengine: dw: fix cyclic transfer callbacks")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mans Rullgard <mans@mansr.com>
Tested-by: Mans Rullgard <mans@mansr.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodmaengine: at_xdmac: fix resume for cyclic transfers
Songjun Wu [Mon, 18 Jan 2016 10:14:44 +0000 (11:14 +0100)]
dmaengine: at_xdmac: fix resume for cyclic transfers

commit 611dcadb01c89d1d3521450c05a4ded332e5a32d upstream.

When having cyclic transfers, the channel was paused when performing
suspend but was not correctly resumed.

Signed-off-by: Songjun Wu <songjun.wu@atmel.com>
Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Fixes: e1f7c9eee707 ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver")
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodmaengine: dw: fix cyclic transfer callbacks
Mans Rullgard [Mon, 11 Jan 2016 13:04:29 +0000 (13:04 +0000)]
dmaengine: dw: fix cyclic transfer callbacks

commit 2895b2cad6e7a95104cf396e5330054453382ae1 upstream.

Cyclic transfer callbacks rely on block completion interrupts which were
disabled in commit ff7b05f29fd4 ("dmaengine/dw_dmac: Don't handle block
interrupts").  This re-enables block interrupts so the cyclic callbacks
can work.  Other transfer types are not affected as they set the INT_EN
bit only on the last block.

Fixes: ff7b05f29fd4 ("dmaengine/dw_dmac: Don't handle block interrupts")
Signed-off-by: Mans Rullgard <mans@mansr.com>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodmaengine: dw: fix cyclic transfer setup
Mans Rullgard [Mon, 11 Jan 2016 13:04:28 +0000 (13:04 +0000)]
dmaengine: dw: fix cyclic transfer setup

commit df3bb8a0e619d501cd13334c3e0586edcdcbc716 upstream.

Commit 61e183f83069 ("dmaengine/dw_dmac: Reconfigure interrupt and
chan_cfg register on resume") moved some channel initialisation to
a new function which must be called before starting a transfer.

This updates dw_dma_cyclic_start() to use dwc_dostart() like the other
modes, thus ensuring dwc_initialize() gets called and removing some code
duplication.

Fixes: 61e183f83069 ("dmaengine/dw_dmac: Reconfigure interrupt and chan_cfg register on resume")
Signed-off-by: Mans Rullgard <mans@mansr.com>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonfit: fix multi-interface dimm handling, acpi6.1 compatibility
Dan Williams [Fri, 5 Feb 2016 00:51:00 +0000 (16:51 -0800)]
nfit: fix multi-interface dimm handling, acpi6.1 compatibility

commit 6697b2cf69d4363266ca47eaebc49ef13dabc1c9 upstream.

ACPI 6.1 clarified that multi-interface dimms require multiple control
region entries (DCRs) per dimm.  Previously we were assuming that a
control region is only present when block-data-windows are present.
This implementation was done with an eye to be compatibility with the
looser ACPI 6.0 interpretation of this table.

1/ When coalescing the memory device (MEMDEV) tables for a single dimm,
coalesce on device_handle rather than control region index.

2/ Whenever we disocver a control region with non-zero block windows
re-scan for block-data-window (BDW) entries.

We may need to revisit this if a DIMM ever implements a format interface
outside of blk or pmem, but that is not on the foreseeable horizon.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoACPI / PCI / hotplug: unlock in error path in acpiphp_enable_slot()
Insu Yun [Sat, 23 Jan 2016 20:44:19 +0000 (15:44 -0500)]
ACPI / PCI / hotplug: unlock in error path in acpiphp_enable_slot()

commit 2c3033a0664dfae91e1dee7fabac10f24354b958 upstream.

In acpiphp_enable_slot(), there is a missing unlock path
when error occurred.  It needs to be unlocked before returning
an error.

Signed-off-by: Insu Yun <wuninsu@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoACPI: Revert "ACPI / video: Add Dell Inspiron 5737 to the blacklist"
Hans de Goede [Fri, 22 Jan 2016 10:41:05 +0000 (11:41 +0100)]
ACPI: Revert "ACPI / video: Add Dell Inspiron 5737 to the blacklist"

commit b186b4dcb79b1914c3dadb27ac72dafaa4267998 upstream.

The quirk to get "acpi_backlight=vendor" behavior by default on the
Dell Inspiron 5737 was added before we started doing
"acpi_backlight=native" by default on Win8 ready machines.

Since we now avoid using acpi-video as backlight driver on these machines
by default (using the native driver instead) we no longer need this quirk.

Moreover the vendor driver does not work after a suspend/resume where
as the native driver does.

This reverts commit 08a56226d847 (ACPI / video: Add Dell Inspiron 5737
to the blacklist).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=111061
Reported-and-tested-by: erusan@gmail.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Satellite R830
Hans de Goede [Thu, 14 Jan 2016 13:24:39 +0000 (14:24 +0100)]
ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Satellite R830

commit b21f2e81bd3fd8ed260590e72901254bca2193cd upstream.

The Toshiba Satellite R830 needs disable_backlight_sysfs_if=1, just like
the Toshiba Portege R830. Add a quirk for this.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=21012
Tested-by: To Do <entodoays@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Portege R700
Hans de Goede [Mon, 11 Jan 2016 13:46:17 +0000 (14:46 +0100)]
ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Portege R700

commit de588b8ff057d4de0751f337b930f90ca522bab2 upstream.

The Toshiba Portege R700 needs disable_backlight_sysfs_if=1, just like
the Toshiba Portege R830. Add a quirk for this.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=21012
Tested-by: Emma Reisz <emmareisz@outlook.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agolib: sw842: select crc32
Arnd Bergmann [Wed, 13 Jan 2016 22:24:02 +0000 (23:24 +0100)]
lib: sw842: select crc32

commit 5b57167749274961baf15ed1f05a4996b3ab0487 upstream.

The sw842 library code was merged in linux-4.1 and causes a very rare randconfig
failure when CONFIG_CRC32 is not set:

    lib/built-in.o: In function `sw842_compress':
    oid_registry.c:(.text+0x12ddc): undefined reference to `crc32_be'
    lib/built-in.o: In function `sw842_decompress':
    oid_registry.c:(.text+0x137e4): undefined reference to `crc32_be'

This adds an explict 'select CRC32' statement, similar to what the other users
of the crc32 code have. In practice, CRC32 is always enabled anyway because
over 100 other symbols select it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 2da572c959dd ("lib: add software 842 compression/decompression")
Acked-by: Dan Streetman <ddstreet@ieee.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agouapi: update install list after nvme.h rename
Mike Frysinger [Mon, 11 Jan 2016 01:14:11 +0000 (20:14 -0500)]
uapi: update install list after nvme.h rename

commit a9cf8284b45110a4d98aea180a89c857e53bf850 upstream.

Commit 9d99a8dda154 ("nvme: move hardware structures out of the uapi
version of nvme.h") renamed nvme.h to nvme_ioctl.h, but the uapi list
still refers to nvme.h.  People trying to install the headers hit a
failure as the header no longer exists.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoideapad-laptop: Add Lenovo Yoga 700 to no_hw_rfkill dmi list
Josh Boyer [Sun, 24 Jan 2016 15:46:42 +0000 (10:46 -0500)]
ideapad-laptop: Add Lenovo Yoga 700 to no_hw_rfkill dmi list

commit 6b31de3e698582fe0b8f7f4bab15831b73204800 upstream.

Like the Yoga 900 models the Lenovo Yoga 700 does not have a
hw rfkill switch, and trying to read the hw rfkill switch through the
ideapad module causes it to always reported blocking breaking wifi.

This commit adds the Lenovo Yoga 700 to the no_hw_rfkill dmi list, fixing
the wifi breakage.

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1295272
Tested-by: <dinyar.rabady+spam@gmail.com>
Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoideapad-laptop: Add Lenovo ideapad Y700-17ISK to no_hw_rfkill dmi list
Josh Boyer [Thu, 10 Dec 2015 02:12:52 +0000 (21:12 -0500)]
ideapad-laptop: Add Lenovo ideapad Y700-17ISK to no_hw_rfkill dmi list

commit edde316acb5f07c04abf09a92f59db5d2efd14e2 upstream.

One of the newest ideapad models also lacks a physical hw rfkill switch,
and trying to read the hw rfkill switch through the ideapad module
causes it to always reported blocking breaking wifi.

Fix it by adding this model to the DMI list.

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1286293
Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agotoshiba_acpi: Fix blank screen at boot if transflective backlight is supported
Azael Avalos [Mon, 16 Nov 2015 03:32:47 +0000 (20:32 -0700)]
toshiba_acpi: Fix blank screen at boot if transflective backlight is supported

commit bae5336f0aaedffa115dab9cb3d8a4e4aed3a26a upstream.

If transflective backlight is supported and the brightness is zero
(lowest brightness level), the set_lcd_brightness function will activate
the transflective backlight, making the LCD appear to be turned off.

This patch fixes the issue by incrementing the brightness level, and
by doing so, avoiding the activation of the tranflective backlight.

Reported-and-tested-by: Fabian Koester <fabian.koester@bringnow.com>
Signed-off-by: Azael Avalos <coproscefalo@gmail.com>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agomake sure that freeing shmem fast symlinks is RCU-delayed
Al Viro [Fri, 22 Jan 2016 23:08:52 +0000 (18:08 -0500)]
make sure that freeing shmem fast symlinks is RCU-delayed

commit 3ed47db34f480df7caf44436e3e63e555351ae9a upstream.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon/pm: adjust display configuration after powerstate
Alex Deucher [Fri, 19 Feb 2016 23:05:10 +0000 (18:05 -0500)]
drm/radeon/pm: adjust display configuration after powerstate

commit 39d4275058baf53e89203407bf3841ff2c74fa32 upstream.

set_power_state defaults to no displays, so we need to update
the display configuration after setting up the powerstate on the
first call. In most cases this is not an issue since ends up
getting called multiple times at any given modeset and the proper
order is achieved in the display changed handling at the top of
the function.

Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Jordan Lazare <Jordan.Lazare@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: Don't hang in radeon_flip_work_func on disabled crtc. (v2)
Mario Kleiner [Fri, 19 Feb 2016 01:06:38 +0000 (02:06 +0100)]
drm/radeon: Don't hang in radeon_flip_work_func on disabled crtc. (v2)

commit 2b8341b3f917c108b47f6a8a771a40d226c57883 upstream.

This fixes a regression introduced in Linux 4.4.

Limit the amount of time radeon_flip_work_func can
delay programming a page flip, by both limiting the
maximum amount of time per wait cycle and the maximum
number of wait cycles. Continue the flip if the limit
is exceeded, even if that may result in a visual or
timing glitch.

This is to prevent a hang of page flips, as reported
in fdo bug #93746: Disconnecting a DisplayPort display
in parallel to a kms pageflip getting queued can cause
the following hang of page flips and thereby an unusable
desktop:

1. kms pageflip ioctl() queues pageflip -> queues execution
   of radeon_flip_work_func.

2. Hotunplug of display causes the driver to DPMS OFF
   the unplugged display. Display engine shuts down,
   scanout no longer moves, but stays at its resting
   position at start line of vblank.

3. radeon_flip_work_func executes while crtc is off, and
   due to the non-moving scanout position, the new flip
   delay code introduced into Linux 4.4 by
   commit 5b5561b3660d ("drm/radeon: Fixup hw vblank counter/ts..")
   enters an infinite wait loop.

4. After reconnecting the display, the pageflip continues
   to hang in 3. and the display doesn't update its view
   of the desktop.

This patch fixes the Linux 4.4 regression from fdo bug #93746

<https://bugs.freedesktop.org/show_bug.cgi?id=93746>

v2: Skip wait immediately if !radeon_crtc->enabled, as
    suggested by Michel.

Reported-by: Bernd Steinhauser <linux@bernd-steinhauser.de>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Tested-by: Bernd Steinhauser <linux@bernd-steinhauser.de>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm: Fix treatment of drm_vblank_offdelay in drm_vblank_on() (v2)
Mario Kleiner [Fri, 12 Feb 2016 19:30:30 +0000 (20:30 +0100)]
drm: Fix treatment of drm_vblank_offdelay in drm_vblank_on() (v2)

commit bb74fc1bf3072bd3ab4ed5f43afd287a63baf2d7 upstream.

drm_vblank_offdelay can have three different types of values:

< 0 is to be always treated the same as dev->vblank_disable_immediate
= 0 is to be treated as "never disable vblanks"
> 0 is to be treated as disable immediate if kms driver wants it
    that way via dev->vblank_disable_immediate. Otherwise it is
    a disable timeout in msecs.

This got broken in Linux 3.18+ for the implementation of
drm_vblank_on. If the user specified a value of zero which should
always reenable vblank irqs in this function, a kms driver could
override the users choice by setting vblank_disable_immediate
to true. This patch fixes the regression and keeps the user in
control.

v2: Only reenable vblank if there are clients left or the user
    requested to "never disable vblanks" via offdelay 0. Enabling
    vblanks even in the "delayed disable" case (offdelay > 0) was
    specifically added by Ville in commit cd19e52aee922
    ("drm: Kick start vblank interrupts at drm_vblank_on()"),
    but after discussion it turns out that this was done by accident.

    Citing Ville: "I think it just ended up as a mess due to changing
    some of the semantics of offdelay<0 vs. offdelay==0 vs.
    disable_immediate during the review of the series. So yeah, given
    how drm_vblank_put() works now, I'd just make this check for
    offdelay==0."

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: michel@daenzer.net
Cc: vbabka@suse.cz
Cc: ville.syrjala@linux.intel.com
Cc: daniel.vetter@ffwll.ch
Cc: dri-devel@lists.freedesktop.org
Cc: alexander.deucher@amd.com
Cc: christian.koenig@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm: Fix drm_vblank_pre/post_modeset regression from Linux 4.4
Mario Kleiner [Fri, 12 Feb 2016 19:30:29 +0000 (20:30 +0100)]
drm: Fix drm_vblank_pre/post_modeset regression from Linux 4.4

commit c61934ed9a0e3911a9935df26858726a7ec35ec0 upstream.

Changes to drm_update_vblank_count() in Linux 4.4 broke the
behaviour of the pre/post modeset functions as the new update
code doesn't deal with hw vblank counter resets inbetween calls
to drm_vblank_pre_modeset an drm_vblank_post_modeset, as it
should.

This causes mistreatment of such hw counter resets as counter
wraparound, and thereby large forward jumps of the software
vblank counter which in turn cause vblank event dispatching
and vblank waits to fail/hang --> userspace clients hang.

This symptom was reported on radeon-kms to cause a infinite
hang of KDE Plasma 5 shell's login procedure, preventing users
from logging in.

Fix this by detecting when drm_update_vblank_count() is called
inside a pre->post modeset interval. If so, clamp valid vblank
increments to the safe values 0 and 1, pretty much restoring
the update behavior of the old update code of Linux 4.3 and
earlier. Also reset the last recorded hw vblank count at call
to drm_vblank_post_modeset() to be safe against hw that after
modesetting, dpms on etc. only fires its first vblank irq after
drm_vblank_post_modeset() was already called.

Reported-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: Vlastimil Babka <vbabka@suse.cz>
Cc: michel@daenzer.net
Cc: vbabka@suse.cz
Cc: ville.syrjala@linux.intel.com
Cc: daniel.vetter@ffwll.ch
Cc: dri-devel@lists.freedesktop.org
Cc: alexander.deucher@amd.com
Cc: christian.koenig@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm: Prevent vblank counter bumps > 1 with active vblank clients. (v2)
Mario Kleiner [Fri, 12 Feb 2016 19:30:28 +0000 (20:30 +0100)]
drm: Prevent vblank counter bumps > 1 with active vblank clients. (v2)

commit 99b8e71597fadd6b2ac85e6e10f221f79dd9c1c1 upstream.

This fixes a regression introduced by the new drm_update_vblank_count()
implementation in Linux 4.4:

Restrict the bump of the software vblank counter in drm_update_vblank_count()
to a safe maximum value of +1 whenever there is the possibility that
concurrent readers of vblank timestamps could be active at the moment,
as the current implementation of the timestamp caching and updating is
not safe against concurrent readers for calls to store_vblank() with a
bump of anything but +1. A bump != 1 would very likely return corrupted
timestamps to userspace, because the same slot in the cache could
be concurrently written by store_vblank() and read by one of those
readers in a non-atomic fashion and without the read-retry logic
detecting this collision.

Concurrent readers can exist while drm_update_vblank_count() is called
from the drm_vblank_off() or drm_vblank_on() functions or other non-vblank-
irq callers. However, all those calls are happening with the vbl_lock
locked thereby preventing a drm_vblank_get(), so the vblank refcount
can't increase while drm_update_vblank_count() is executing. Therefore
a zero vblank refcount during execution of that function signals that
is safe for arbitrary counter bumps if called from outside vblank irq,
whereas a non-zero count is not safe.

Whenever the function is called from vblank irq, we have to assume concurrent
readers could show up any time during its execution, even if the refcount
is currently zero, as vblank irqs are usually only enabled due to the
presence of readers, and because when it is called from vblank irq it
can't hold the vbl_lock to protect it from sudden bumps in vblank refcount.
Therefore also restrict bumps to +1 when the function is called from vblank
irq.

Such bumps of more than +1 can happen at other times than reenabling
vblank irqs, e.g., when regular vblank interrupts get delayed by more
than 1 frame due to long held locks, long irq off periods, realtime
preemption on RT kernels, or system management interrupts.

A better solution would be to rewrite the timestamp caching to use
full seqlocks to allow concurrent writes and reads for arbitrary
vblank counter increments.

v2: Add code comment that this is essentially a hack and should
    be replaced by a full seqlock implementation for caching of
    timestamps.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: michel@daenzer.net
Cc: vbabka@suse.cz
Cc: ville.syrjala@linux.intel.com
Cc: daniel.vetter@ffwll.ch
Cc: dri-devel@lists.freedesktop.org
Cc: alexander.deucher@amd.com
Cc: christian.koenig@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agodrm: No-Op redundant calls to drm_vblank_off() (v2)
Mario Kleiner [Fri, 12 Feb 2016 19:30:27 +0000 (20:30 +0100)]
drm: No-Op redundant calls to drm_vblank_off() (v2)

commit e8235891b33799d597ff4ab5e45afe173a65da30 upstream.

Otherwise if a kms driver calls into drm_vblank_off() more than once
before calling drm_vblank_on() again, the redundant calls to
vblank_disable_and_save() will call drm_update_vblank_count()
while hw vblank counters and vblank timestamping are in a undefined
state during modesets, dpms off etc.

At least with the legacy drm helpers it is not unusual to
get multiple calls to drm_vblank_off and drm_vblank_on, e.g.,
half a dozen calls to drm_vblank_off and two calls to drm_vblank_on
were observed on radeon-kms during dpms-off -> dpms-on transition.

We don't no-op calls from atomic modesetting drivers, as they
should do a proper job of tracking hw state.

Fixes large jumps of the software maintained vblank counter due to
the hardware vblank counter resetting to zero during dpms off or
modeset, e.g., if radeon-kms is modified to use drm_vblank_off/on
instead of drm_vblank_pre/post_modeset().

This fixes a regression caused by the changes made to
drm_update_vblank_count() in Linux 4.4.

v2: Don't no-op on atomic modesetting drivers, per suggestion
    of Daniel Vetter.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: michel@daenzer.net
Cc: vbabka@suse.cz
Cc: ville.syrjala@linux.intel.com
Cc: alexander.deucher@amd.com
Cc: christian.koenig@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: use post-decrement in error handling
Rasmus Villemoes [Mon, 15 Feb 2016 18:41:47 +0000 (19:41 +0100)]
drm/radeon: use post-decrement in error handling

commit bc3f5d8c4ca01555820617eb3b6c0857e4df710d upstream.

We need to use post-decrement to get the pci_map_page undone also for
i==0, and to avoid some very unpleasant behaviour if pci_map_page
failed already at i==0.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/qxl: use kmalloc_array to alloc reloc_info in qxl_process_single_command
Gerd Hoffmann [Tue, 16 Feb 2016 13:25:00 +0000 (14:25 +0100)]
drm/qxl: use kmalloc_array to alloc reloc_info in qxl_process_single_command

commit 34855706c30d52b0a744da44348b5d1cc39fbe51 upstream.

This avoids integer overflows on 32bit machines when calculating
reloc_info size, as reported by Alan Cox.

Cc: gnomes@lxorguk.ukuu.org.uk
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915: fix error path in intel_setup_gmbus()
Rasmus Villemoes [Tue, 9 Feb 2016 20:11:13 +0000 (21:11 +0100)]
drm/i915: fix error path in intel_setup_gmbus()

commit ed3f9fd1e865975ceefdb2a43b453e090b1fd787 upstream.

This fails to undo the setup for pin==0; moreover, something
interesting happens if the setup failed already at pin==0.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Fixes: f899fc64cda8 ("drm/i915: use GMBUS to manage i2c links")
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455048677-19882-3-git-send-email-linux@rasmusvillemoes.dk
(cherry picked from commit 2417c8c03f508841b85bf61acc91836b7b0e2560)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915/dsi: don't pass arbitrary data to sideband
Jani Nikula [Thu, 4 Feb 2016 10:50:50 +0000 (12:50 +0200)]
drm/i915/dsi: don't pass arbitrary data to sideband

commit 26f6f2d301c1fb46acb1138ee155125815239b0d upstream.

Since sequence block v2 the second byte contains flags other than just
pull up/down. Don't pass arbitrary data to the sideband interface.

The rest may or may not work for sequence block v2, but there should be
no harm done.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/ebe3c2eee623afc4b3a134533b01f8d591d13f32.1454582914.git.jani.nikula@intel.com
(cherry picked from commit 4e1c63e3761b84ec7d87c75b58bbc8bcf18e98ee)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915/dsi: defend gpio table against out of bounds access
Jani Nikula [Thu, 4 Feb 2016 10:50:49 +0000 (12:50 +0200)]
drm/i915/dsi: defend gpio table against out of bounds access

commit 4db3a2448ec8902310acb78de39b6227a9a56ac8 upstream.

Do not blindly trust the VBT data used for indexing.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/cc32d40c2b47f2d2151811855ac2c3dabab1d57d.1454582914.git.jani.nikula@intel.com
(cherry picked from commit 5d2d0a12d3d08bf50434f0b5947bb73bac04b941)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915/skl: Don't skip mst encoders in skl_ddi_pll_select()
Lyude [Tue, 2 Feb 2016 15:49:43 +0000 (10:49 -0500)]
drm/i915/skl: Don't skip mst encoders in skl_ddi_pll_select()

commit 3d849b02336be103d312c1574d6f7314d5c0bc9f upstream.

We don't actually check for INTEL_OUTPUT_DP_MST at all in here, as a
result we skip assigning a DPLL to any DP MST ports, which makes link
training fail:

[ 1442.933896] [drm:intel_power_well_enable] enabling DDI D power well
[ 1442.933905] [drm:skl_set_power_well] Enabling DDI D power well
[ 1442.933957] [drm:intel_mst_pre_enable_dp] 0
[ 1442.935474] [drm:intel_dp_set_signal_levels] Using signal levels 00000000
[ 1442.935477] [drm:intel_dp_set_signal_levels] Using vswing level 0
[ 1442.935480] [drm:intel_dp_set_signal_levels] Using pre-emphasis level 0
[ 1442.936190] [drm:intel_dp_set_signal_levels] Using signal levels 05000000
[ 1442.936193] [drm:intel_dp_set_signal_levels] Using vswing level 1
[ 1442.936195] [drm:intel_dp_set_signal_levels] Using pre-emphasis level 1
[ 1442.936858] [drm:intel_dp_set_signal_levels] Using signal levels 08000000
[ 1442.936862] [drm:intel_dp_set_signal_levels] Using vswing level 2

[ 1442.998253] [drm:intel_dp_link_training_clock_recovery [i915]] *ERROR* too many full retries, give up
[ 1442.998512] [drm:intel_dp_start_link_train [i915]] *ERROR* failed to train DP, aborting

After which the pipe state goes completely out of sync:

[   70.075596] [drm:check_crtc_state] [CRTC:25]
[   70.075696] [drm:intel_pipe_config_compare [i915]] *ERROR* mismatch in ddi_pll_sel (expected 0x00000000, found 0x00000001)
[   70.075747] [drm:intel_pipe_config_compare [i915]] *ERROR* mismatch in shared_dpll (expected -1, found 0)
[   70.075798] [drm:intel_pipe_config_compare [i915]] *ERROR* mismatch in dpll_hw_state.ctrl1 (expected 0x00000000, found 0x00000021)
[   70.075840] [drm:intel_pipe_config_compare [i915]] *ERROR* mismatch in dpll_hw_state.cfgcr1 (expected 0x00000000, found 0x80400173)
[   70.075884] [drm:intel_pipe_config_compare [i915]] *ERROR* mismatch in dpll_hw_state.cfgcr2 (expected 0x00000000, found 0x000003a5)
[   70.075954] [drm:intel_pipe_config_compare [i915]] *ERROR* mismatch in base.adjusted_mode.crtc_clock (expected 262750, found 72256)
[   70.075999] [drm:intel_pipe_config_compare [i915]] *ERROR* mismatch in port_clock (expected 540000, found 148500)

And if you're especially lucky, it keeps going downhill:

[   83.309256] Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
[   83.309265]
[   83.309265] =================================
[   83.309266] [ INFO: inconsistent lock state ]
[   83.309267] 4.5.0-rc1Lyude-Test #265 Not tainted
[   83.309267] ---------------------------------
[   83.309268] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[   83.309270] Xorg/1194 [HC0[1]:SC0[0]:HE1:SE1] takes:
[   83.309293]  (&(&dev_priv->uncore.lock)->rlock){?.-...}, at: [<ffffffffa02a6073>] gen9_write32+0x63/0x400 [i915]
[   83.309293] {IN-HARDIRQ-W} state was registered at:
[   83.309297]   [<ffffffff810e84f4>] __lock_acquire+0x9c4/0x1d00
[   83.309299]   [<ffffffff810ea1be>] lock_acquire+0xce/0x1c0
[   83.309302]   [<ffffffff8177d936>] _raw_spin_lock_irqsave+0x56/0x90
[   83.309321]   [<ffffffffa02a5492>] gen9_read32+0x52/0x3d0 [i915]
[   83.309332]   [<ffffffffa024beea>] gen8_irq_handler+0x27a/0x6a0 [i915]
[   83.309337]   [<ffffffff810fdbc1>] handle_irq_event_percpu+0x41/0x300
[   83.309339]   [<ffffffff810fdeb9>] handle_irq_event+0x39/0x60
[   83.309341]   [<ffffffff811010b4>] handle_edge_irq+0x74/0x130
[   83.309344]   [<ffffffff81009073>] handle_irq+0x73/0x120
[   83.309346]   [<ffffffff817805f1>] do_IRQ+0x61/0x120
[   83.309348]   [<ffffffff8177e6d6>] ret_from_intr+0x0/0x20
[   83.309351]   [<ffffffff815f5105>] cpuidle_enter_state+0x105/0x330
[   83.309353]   [<ffffffff815f5367>] cpuidle_enter+0x17/0x20
[   83.309356]   [<ffffffff810dbe1a>] call_cpuidle+0x2a/0x50
[   83.309358]   [<ffffffff810dc1dd>] cpu_startup_entry+0x26d/0x3a0
[   83.309360]   [<ffffffff817701da>] rest_init+0x13a/0x140
[   83.309363]   [<ffffffff81f2af8e>] start_kernel+0x475/0x482
[   83.309365]   [<ffffffff81f2a315>] x86_64_start_reservations+0x2a/0x2c
[   83.309367]   [<ffffffff81f2a452>] x86_64_start_kernel+0x13b/0x14a

Fixes: 82d354370189 ("drm/i915/skl: Implementation of SKL DPLL programming")
Signed-off-by: Lyude <cpaul@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1454428183-994-1-git-send-email-cpaul@redhat.com
(cherry picked from commit 78385cb398748debb7ea2e36d6d2001830c172bc)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915: Don't reject primary plane windowing with color keying enabled on SKL+
Ville Syrjälä [Fri, 15 Jan 2016 18:46:53 +0000 (20:46 +0200)]
drm/i915: Don't reject primary plane windowing with color keying enabled on SKL+

commit 6f94b6dd006909a5ef6435cc0af557e945240f48 upstream.

On SKL+ plane scaling is mutually exclusive with color keying. The code
check for this, but during some refactoring the code got changed to
also reject primary plane windowing when color keying is used. There is
no such restriction in the hardware, so restore the original logic.

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: 061e4b8d650a ("drm/i915: clean up atomic plane check functions, v2.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1452883613-28549-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
(cherry picked from commit 693bdc28a733dba68b86af295e7509812fec35d9)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915/dp: fall back to 18 bpp when sink capability is unknown
Jani Nikula [Wed, 13 Jan 2016 14:35:20 +0000 (16:35 +0200)]
drm/i915/dp: fall back to 18 bpp when sink capability is unknown

commit 5efd407674068dede403551bea3b0b134c32513a upstream.

Per DP spec, the source device should fall back to 18 bpp, VESA range
RGB when the sink capability is unknown. Fix the color depth
clamping. 18 bpp color depth should ensure full color range in automatic
mode.

The clamping has been HDMI specific since its introduction in

commit 996a2239f93b03c5972923f04b097f65565c5bed
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Apr 19 11:24:34 2013 +0200

    drm/i915: Disable high-bpc on pre-1.4 EDID screens

Reported-and-tested-by: Dihan Wickremasuriya <nayomal@gmail.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=105331
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1452695720-7076-1-git-send-email-jani.nikula@intel.com
(cherry picked from commit 013dd9e038723bbd2aa67be51847384b75be8253)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915: Make sure DC writes are coherent on flush.
Francisco Jerez [Thu, 14 Jan 2016 02:59:39 +0000 (18:59 -0800)]
drm/i915: Make sure DC writes are coherent on flush.

commit 935a0ff0e1ea62a116848c0a187b13838f7b9cee upstream.

We need to set the DC FLUSH PIPE_CONTROL bit on Gen7+ to guarantee
that writes performed via the HDC are visible in memory.  Fixes an
intermittent failure in a Piglit test that writes to a BO from a
shader using GL atomic counters (implemented as HDC untyped atomics)
and then expects the memory to read back the same value after mapping
it on the CPU.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91298
Tested-by: Mark Janes <mark.a.janes@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1452740379-3194-1-git-send-email-currojerez@riseup.net
(cherry picked from commit 965fd602a6436f689f4f2fe40a6789582778ccd5)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915: Init power domains early in driver load
Daniel Vetter [Wed, 13 Jan 2016 10:55:28 +0000 (11:55 +0100)]
drm/i915: Init power domains early in driver load

commit f5949141a21ee16edf1beaf95cbae7e419171ab5 upstream.

Since

commit ac9b8236551d1177fd07b56aef9b565d1864420d
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Fri Nov 27 18:55:26 2015 +0200

    drm/i915: Introduce a gmbus power domain

gmbus also needs the power domain infrastructure right from the start,
since as soon as we register the i2c controllers someone can use them.

v2: Adjust cleanup paths too (Chris).

v3: Rebase onto -nightly (totally bogus tree I had lying around) and
also move dpio init head (Ville).

v4: Ville instead suggested to move gmbus setup later in the sequence,
since it's only needed by the modeset code.

v5: Move even close to the actual user, right next to the comment that
states where we really need gmbus (and interrupts!).

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Meelis Roos <mroos@linux.ee>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: ac9b8236551d ("drm/i915: Introduce a gmbus power domain")
References: http://www.spinics.net/lists/intel-gfx/msg83075.html
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1452682528-19437-1-git-send-email-daniel.vetter@ffwll.ch
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915: intel_hpd_init(): Fix suspend/resume reprobing
Lyude [Thu, 7 Jan 2016 15:43:28 +0000 (10:43 -0500)]
drm/i915: intel_hpd_init(): Fix suspend/resume reprobing

commit 2dc2f761dea65069485110d24eaa5b0d5d808b07 upstream.

This fixes reprobing of display connectors on resume.  After some
talking with danvet on IRC, I learned that calling
drm_helper_hpd_irq_event() does actually trigger a full reprobe of each
connector's status. It turns out this is the actual reason reprobing on
resume hasn't been working (this was observed on a T440s):

- We call hpd_init()
- We check each connector for a couple of things before marking
  connector->polled with DRM_CONNECTOR_POLL_HPD, one of which is an
  active encoder. Of course, a disconnected port won't have an
  active encoder, so we don't add the flag to any of the
  connectors.
- We call drm_helper_hpd_irq_event()
- drm_helper_irq_event() checks each connector for the
  DRM_CONNECTOR_POLL_HPD flag. The only one that has it is eDP-1,
  so we skip reprobing each connector except that one.

In addition, we also now avoid setting connector->polled to
DRM_CONNECTOR_POLL_HPD for MST connectors, since their reprobing is
handled by the mst helpers. This is probably what was originally
intended to happen here.

Changes since V1:
* Use the explanation of the issue as the commit message instead
* Change the title of the commit, since this does more then just stop a
  check for an encoder now
* Add "Fixes" line for the patch that introduced this regression
* Don't enable DRM_CONNECTOR_POLL_HPD for mst connectors

Changes since V2:
* Put patch changelog above Signed-off-by
* Follow Daniel Vetter's suggestion for making the code here a bit more
  legible

Fixes: 0e32b39ceed6 ("drm/i915: add DP 1.2 MST support (v0.7)")
Signed-off-by: Lyude <cpaul@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1452181408-14777-1-git-send-email-cpaul@redhat.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit 07c519134417d92c2e1a536e2b66d4ffff4b3be0)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/i915: Restore inhibiting the load of the default context
Chris Wilson [Fri, 27 Nov 2015 13:28:55 +0000 (13:28 +0000)]
drm/i915: Restore inhibiting the load of the default context

commit 06ef83a705a98da63797a5a570220b6ca36febd4 upstream.

Following a GPU reset, we may leave the context in a poorly defined
state, and reloading from that context will leave the GPU flummoxed. For
secondary contexts, this will lead to that context being banned - but
currently it is also causing the default context to become banned,
leading to turmoil in the shared state.

This is a regression from

commit 6702cf16e0ba8b0129f5aa1b6609d4e9c70bc13b [v4.1]
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Mon Mar 16 16:00:58 2015 +0000

    drm/i915: Initialize all contexts

which quietly introduced the removal of the MI_RESTORE_INHIBIT on the
default context.

v2: Mark the global default context as uninitialized on GPU reset so
that the context-local workarounds are reloaded upon re-enabling.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1448630935-27377-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: This seems to fix a gpu hand on after the first resume,
resulting in any future suspend operation failing with -EIO because
the gpu seems to be in a funky state. Somehow this patch fixes that.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit 42f1cae8c079bcceb3cff079fddc3ff8852c788f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm: fix missing reference counting decrease
Insu Yun [Mon, 1 Feb 2016 16:08:29 +0000 (11:08 -0500)]
drm: fix missing reference counting decrease

commit dabe19540af9e563d526113bb102e1b9b9fa73f9 upstream.

In drm_dp_mst_allocate_vcpi, it returns true in two paths,
but in one path, there is no reference couting decrease.

Signed-off-by: Insu Yun <wuninsu@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: hold reference to fences in radeon_sa_bo_new
Nicolai Hähnle [Fri, 5 Feb 2016 19:35:53 +0000 (14:35 -0500)]
drm/radeon: hold reference to fences in radeon_sa_bo_new

commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream.

An arbitrary amount of time can pass between spin_unlock and
radeon_fence_wait_any, so we need to ensure that nobody frees the
fences from under us.

Based on the analogous fix for amdgpu.

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: mask out WC from BO on unsupported arches
Oded Gabbay [Sat, 30 Jan 2016 05:59:33 +0000 (07:59 +0200)]
drm/radeon: mask out WC from BO on unsupported arches

commit c5244987394648913ae1a03879c58058a2fc2cee upstream.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm: add helper to check for wc memory support
Dave Airlie [Sat, 30 Jan 2016 05:59:32 +0000 (07:59 +0200)]
drm: add helper to check for wc memory support

commit 4b0e4e4af6c6dc8354dcb72182d52c1bc55f12fc upstream.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: fix DP audio support for APU with DCE4.1 display engine
Slava Grigorev [Tue, 26 Jan 2016 22:35:57 +0000 (17:35 -0500)]
drm/radeon: fix DP audio support for APU with DCE4.1 display engine

commit fe6fc1f132b4300c1f6defd43a5d673eb60a820d upstream.

Properly setup the DFS divider for DP audio for DCE4.1.

Signed-off-by: Slava Grigorev <slava.grigorev@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: Add a common function for DFS handling
Slava Grigorev [Tue, 26 Jan 2016 21:56:25 +0000 (16:56 -0500)]
drm/radeon: Add a common function for DFS handling

commit a64c9dab1c4d05c87ec8a1cb9b48915816462143 upstream.

Move encoding of DFS (digital frequency synthesizer) divider into a
separate function and improve calculation precision.

Signed-off-by: Slava Grigorev <slava.grigorev@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: cleaned up VCO output settings for DP audio
Slava Grigorev [Tue, 26 Jan 2016 21:45:10 +0000 (16:45 -0500)]
drm/radeon: cleaned up VCO output settings for DP audio

commit c9a392eac18409f51a071520cf508c0b4ad990e2 upstream.

This is preparation for the fixes in the following patches.

Signed-off-by: Slava Grigorev <slava.grigorev@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: properly byte swap vce firmware setup
Alex Deucher [Fri, 22 Jan 2016 05:13:15 +0000 (00:13 -0500)]
drm/radeon: properly byte swap vce firmware setup

commit cc78eb22885bba64445cde438ba098de0104920f upstream.

Firmware is LE.  Need to properly byteswap some of the fields
so they are interpreted correctly by the driver on BE systems.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: clean up fujitsu quirks
Alex Deucher [Thu, 17 Dec 2015 17:52:17 +0000 (12:52 -0500)]
drm/radeon: clean up fujitsu quirks

commit 0eb1c3d4084eeb6fb3a703f88d6ce1521f8fcdd1 upstream.

Combine the two quirks.

bug:
https://bugzilla.kernel.org/show_bug.cgi?id=109481

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: Fix "slow" audio over DP on DCE8+
Slava Grigorev [Thu, 17 Dec 2015 16:09:58 +0000 (11:09 -0500)]
drm/radeon: Fix "slow" audio over DP on DCE8+

commit ac4a9350abddc51ccb897abf0d9f3fd592b97e0b upstream.

DP audio is derived from the dfs clock.

Signed-off-by: Slava Grigorev <slava.grigorev@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: call hpd_irq_event on resume
Alex Deucher [Tue, 24 Nov 2015 19:32:44 +0000 (14:32 -0500)]
drm/radeon: call hpd_irq_event on resume

commit dbb17a21c131eca94eb31136eee9a7fe5aff00d9 upstream.

Need to call this on resume if displays changes during
suspend in order to properly be notified of changes.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/radeon: Fix off-by-one errors in radeon_vm_bo_set_addr
Felix Kuehling [Mon, 23 Nov 2015 22:39:11 +0000 (17:39 -0500)]
drm/radeon: Fix off-by-one errors in radeon_vm_bo_set_addr

commit 42ef344c0994cc453477afdc7a8eadc578ed0257 upstream.

eoffset is sometimes treated as the last address inside the address
range, and sometimes as the first address outside the range. This
was resulting in errors when a test filled up the entire address
space. Make it consistent to always be the last address within the
range. Also fixed related errors when checking the VA limit and in
radeon_vm_fence_pts.

Signed-off-by: Felix.Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/dp/mst: deallocate payload on port destruction
Mykola Lysenko [Wed, 27 Jan 2016 14:39:36 +0000 (09:39 -0500)]
drm/dp/mst: deallocate payload on port destruction

commit 91a25e463130c8e19bdb42f2d827836c7937992e upstream.

This is needed to properly deallocate port payload
after downstream branch get unplugged.

In order to do this unplugged MST topology should
be preserved, to find first alive port on path to
unplugged MST topology, and send payload deallocation
request to branch device of found port.

For this mstb and port kref's are used in reversed
order to track when port and branch memory could be
freed.

Added additional functions to find appropriate mstb
as described above.

Signed-off-by: Mykola Lysenko <Mykola.Lysenko@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/dp/mst: Reverse order of MST enable and clearing VC payload table.
Andrey Grodzovsky [Fri, 22 Jan 2016 22:07:29 +0000 (17:07 -0500)]
drm/dp/mst: Reverse order of MST enable and clearing VC payload table.

commit c175cd16df272119534058f28cbd5eeac6ff2d24 upstream.

On DELL U3014 if you clear the table before enabling MST it sometimes
hangs the receiver.

Signed-off-by: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/dp/mst: move GUID storage from mgr, port to only mst branch
Hersen Wu [Fri, 22 Jan 2016 22:07:28 +0000 (17:07 -0500)]
drm/dp/mst: move GUID storage from mgr, port to only mst branch

commit 5e93b8208d3c419b515fb75e2601931c027e12ab upstream.

Previous implementation does not handle case below: boot up one MST branch
to DP connector of ASIC. After boot up, hot plug 2nd MST branch to DP output
of 1st MST, GUID is not created for 2nd MST branch. When downstream port of
2nd MST branch send upstream request, it fails because 2nd MST branch GUID
is not available.

New Implementation: only create GUID for MST branch and save it within Branch.

Signed-off-by: Hersen Wu <hersenxs.wu@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/dp/mst: Calculate MST PBN with 31.32 fixed point
Harry Wentland [Fri, 22 Jan 2016 22:07:26 +0000 (17:07 -0500)]
drm/dp/mst: Calculate MST PBN with 31.32 fixed point

commit a9ebb3e46c7ef6112c0da466ef0954673ad36832 upstream.

Our PBN value overflows the 20 bits integer part of the 20.12
fixed point. We need to use 31.32 fixed point to avoid this.

This happens with display clocks larger than 293122 (at 24 bpp),
which we see with the Sharp (and similar) 4k tiled displays.

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm: Add drm_fixp_from_fraction and drm_fixp2int_ceil
Harry Wentland [Fri, 22 Jan 2016 22:07:25 +0000 (17:07 -0500)]
drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil

commit 64566b5e767f9bc3161055ca1b443a51afb52aad upstream.

drm_fixp_from_fraction allows us to create a fixed point directly
from a fraction, rather than creating fixed point values and dividing
later. This avoids overflow of our 64 bit value for large numbers.

drm_fixp2int_ceil allows us to return the ceiling of our fixed point
value.

[airlied: squash Jordan's fix]
32-bit-build-fix: Jordan Lazare <Jordan.Lazare@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/dp/mst: fix in RAD element access
Mykola Lysenko [Fri, 25 Dec 2015 08:14:48 +0000 (16:14 +0800)]
drm/dp/mst: fix in RAD element access

commit 7a11a334aa6af4c65c6a0d81b60c97fc18673532 upstream.

This is needed to receive correct port
number from RAD, so MSTB could be found

Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Mykola Lysenko <Mykola.Lysenko@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/dp/mst: fix in MSTB RAD initialization
Mykola Lysenko [Fri, 25 Dec 2015 08:14:47 +0000 (16:14 +0800)]
drm/dp/mst: fix in MSTB RAD initialization

commit 75af4c8c4c0f60d7ad135419805798f144e9baf9 upstream.

This fix is needed to support more then two
branch displays, so RAD address consist at
least of 2 elements

Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Mykola Lysenko <Mykola.Lysenko@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/dp/mst: always send reply for UP request
Mykola Lysenko [Fri, 18 Dec 2015 22:14:43 +0000 (17:14 -0500)]
drm/dp/mst: always send reply for UP request

commit 1f16ee7fa13649f4e55aa48ad31c3eb0722a62d3 upstream.

We should always send reply for UP request in order
to make downstream device clean-up resources appropriately.

Issue was that reply for UP request was sent only once.

Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Mykola Lysenko <Mykola.Lysenko@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/dp/mst: process broadcast messages correctly
Mykola Lysenko [Fri, 18 Dec 2015 22:14:42 +0000 (17:14 -0500)]
drm/dp/mst: process broadcast messages correctly

commit bd9343208704fcc70a5b919f228a7d26ae472727 upstream.

In case broadcast message received in UP request,
RAD cannot be used to identify message originator.
Message should be parsed, originator should be found
by GUID from parsed message.

Also reply with broadcast in case broadcast message
received (for now it is always broadcast)

Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Mykola Lysenko <Mykola.Lysenko@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/nouveau: platform: Fix deferred probe
Thierry Reding [Wed, 24 Feb 2016 17:34:43 +0000 (18:34 +0100)]
drm/nouveau: platform: Fix deferred probe

commit 870571a5698b2e9d0f4d2e5c6245967b582aab45 upstream.

The error cleanup paths aren't quite correct and will crash upon
deferred probe.

Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/nouveau/disp/dp: ensure sink is powered up before attempting link training
Ben Skeggs [Wed, 17 Feb 2016 22:14:19 +0000 (08:14 +1000)]
drm/nouveau/disp/dp: ensure sink is powered up before attempting link training

commit 95664e66fad964c3dd7945d6edfb1d0931844664 upstream.

This can happen under some annoying circumstances, and is a quick fix
until more substantial changes can be made.

Fixed eDP mode changes on (at least) the Lenovo P50.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/nouveau/display: Enable vblank irqs after display engine is on again.
Mario Kleiner [Fri, 12 Feb 2016 19:30:32 +0000 (20:30 +0100)]
drm/nouveau/display: Enable vblank irqs after display engine is on again.

commit ff683df7bf34f90766a50c7e7454e219aef2710e upstream.

In the display resume path, move the calls to drm_vblank_on()
after the point when the display engine is running again.

Since changes were made to drm_update_vblank_count() in Linux 4.4+
to emulate hw vblank counters via vblank timestamping, the function
drm_vblank_on() now needs working high precision vblank timestamping
and therefore working scanout position queries at time of call.
These don't work before the display engine gets restarted, causing
miscalculation of vblank counter increments and thereby large forward
jumps in vblank count at display resume. These jumps can cause client
hangs on resume, or desktop hangs in the case of composited desktops.

Fix this Linux 4.4 regression by reordering calls accordingly.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: ville.syrjala@linux.intel.com
Cc: daniel.vetter@ffwll.ch
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/nouveau/kms: take mode_config mutex in connector hotplug path
Ben Skeggs [Thu, 7 Jan 2016 22:56:51 +0000 (08:56 +1000)]
drm/nouveau/kms: take mode_config mutex in connector hotplug path

commit 0a882cadbc63fd2da3994af7115b4ada2fcbd638 upstream.

fdo#93634

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu/pm: adjust display configuration after powerstate
Alex Deucher [Fri, 19 Feb 2016 22:55:31 +0000 (17:55 -0500)]
drm/amdgpu/pm: adjust display configuration after powerstate

commit 8e7cedc6f7fe762ffe6e348502be34b11fa79298 upstream.

set_power_state defaults to no displays, so we need to update
the display configuration after setting up the powerstate on the
first call. In most cases this is not an issue since ends up
getting called multiple times at any given modeset and the proper
order is achieved in the display changed handling at the top of
the function.

Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Jordan Lazare <Jordan.Lazare@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: Don't hang in amdgpu_flip_work_func on disabled crtc.
Mario Kleiner [Fri, 19 Feb 2016 01:06:39 +0000 (02:06 +0100)]
drm/amdgpu: Don't hang in amdgpu_flip_work_func on disabled crtc.

commit e1d09dc0ccc6c91e3916476f636edb76da1f65bb upstream.

This fixes a regression introduced in Linux 4.4.

This is a port of the same fix for radeon-kms in the
patch "drm/radeon: Don't hang in radeon_flip_work_func
on disabled crtc. (v2)"

Limit the amount of time amdgpu_flip_work_func can
delay programming a page flip, by both limiting the
maximum amount of time per wait cycle and the maximum
number of wait cycles. Continue the flip if the limit
is exceeded, even if that may result in a visual or
timing glitch.

This is to prevent a hang of page flips, as reported
in fdo bug #93746: Disconnecting a DisplayPort display
in parallel to a kms pageflip getting queued can cause
the following hang of page flips and thereby an unusable
desktop:

1. kms pageflip ioctl() queues pageflip -> queues execution
   of amdgpu_flip_work_func.

2. Hotunplug of display causes the driver to DPMS OFF
   the unplugged display. Display engine shuts down,
   scanout no longer moves, but stays at its resting
   position at start line of vblank.

3. amdgpu_flip_work_func executes while crtc is off, and
   due to the non-moving scanout position, the new flip
   delay code introduced into Linux 4.4 by
   commit 8e36f9d33c13 ("drm/amdgpu: Fixup hw vblank counter/ts..")
   enters an infinite wait loop.

4. After reconnecting the display, the pageflip continues
   to hang in 3. and the display doesn't update its view
   of the desktop.

This patch fixes the Linux 4.4 regression from fdo bug #93746

<https://bugs.freedesktop.org/show_bug.cgi?id=93746>

Reported-by: Bernd Steinhauser <linux@bernd-steinhauser.de>
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 years agodrm/amdgpu: use post-decrement in error handling
Rasmus Villemoes [Mon, 15 Feb 2016 18:41:45 +0000 (19:41 +0100)]
drm/amdgpu: use post-decrement in error handling

commit 09ccbb74b6718ad4d1290de3f5669212c0ac7d4b upstream.

We need to use post-decrement to get the pci_map_page undone also for
i==0, and to avoid some very unpleasant behaviour if pci_map_page
failed already at i==0.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: fix issue with overlapping userptrs
Christian König [Mon, 8 Feb 2016 09:57:22 +0000 (10:57 +0100)]
drm/amdgpu: fix issue with overlapping userptrs

commit cc1de6e800c253172334f8774c419dc64401cd2e upstream.

Otherwise we could try to evict overlapping userptr BOs in get_user_pages(),
leading to a possible circular locking dependency.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: hold reference to fences in amdgpu_sa_bo_new (v2)
Nicolai Hähnle [Fri, 5 Feb 2016 15:59:43 +0000 (10:59 -0500)]
drm/amdgpu: hold reference to fences in amdgpu_sa_bo_new (v2)

commit a8d81b36267366603771431747438d18f32ae2d5 upstream.

An arbitrary amount of time can pass between spin_unlock and
fence_wait_any_timeout, so we need to ensure that nobody frees the
fences from under us.

A stress test (rapidly starting and killing hundreds of glxgears
instances) ran into a deadlock in fence_wait_any_timeout after
about an hour, and this race condition appears to be a plausible
cause.

v2: agd: rebase on upstream

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: remove unnecessary forward declaration
Nicolai Hähnle [Fri, 5 Feb 2016 15:49:50 +0000 (10:49 -0500)]
drm/amdgpu: remove unnecessary forward declaration

commit b19763d0d867eb863953500a5c87f2fd663863b8 upstream.

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: fix s4 resume
Flora Cui [Thu, 4 Feb 2016 07:10:08 +0000 (15:10 +0800)]
drm/amdgpu: fix s4 resume

commit ca19852884c8937eed89560f924f5a34cfcc22af upstream.

No need to re-init asic if it's already been initialized.
Skip IB tests since kernel processes are frozen in thaw.

Signed-off-by: Flora Cui <Flora.Cui@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: remove exp hardware support from iceland
Alex Deucher [Tue, 2 Feb 2016 21:24:20 +0000 (16:24 -0500)]
drm/amdgpu: remove exp hardware support from iceland

commit dba280b20bfd1c2bed8a07ce3f75a6da8ba7d247 upstream.

It's working now.

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=92270

Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: don't load MEC2 on topaz
Alex Deucher [Tue, 2 Feb 2016 16:15:41 +0000 (11:15 -0500)]
drm/amdgpu: don't load MEC2 on topaz

commit 97dde76a30c2e67fa5fb9cb6a4072c0178c9df26 upstream.

Not validated.

Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: drop topaz support from gmc8 module
Alex Deucher [Tue, 2 Feb 2016 15:57:30 +0000 (10:57 -0500)]
drm/amdgpu: drop topaz support from gmc8 module

commit 8878d8548ac7fae43cd6d82579f966eb8825e282 upstream.

topaz is actually gmc7.

Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: pull topaz gmc bits into gmc_v7
Alex Deucher [Tue, 2 Feb 2016 15:56:15 +0000 (10:56 -0500)]
drm/amdgpu: pull topaz gmc bits into gmc_v7

commit 72b459c8f716ef03a8a0c78078547ce64d8d29a2 upstream.

Add the topaz golden settings into the gmc7 module.

Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above
Ken Wang [Wed, 3 Feb 2016 11:17:53 +0000 (19:17 +0800)]
drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above

commit 8f3c162961fc2d92ec73a66496aab69eb2e19c36 upstream.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Ken Wang <Qingqing.Wang@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: iceland use CI based MC IP
Ken Wang [Wed, 3 Feb 2016 11:16:54 +0000 (19:16 +0800)]
drm/amdgpu: iceland use CI based MC IP

commit 429c45deae6e57f1bb91bfb05b671063fb0cef60 upstream.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Ken Wang <Qingqing.Wang@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: move gmc7 support out of CIK dependency
Alex Deucher [Tue, 2 Feb 2016 15:59:53 +0000 (10:59 -0500)]
drm/amdgpu: move gmc7 support out of CIK dependency

commit e42d85261680edfc350a6c2a86b7fbb44a85014b upstream.

It's used by iceland which is VI.

Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: no need to load MC firmware on fiji
Alex Deucher [Thu, 28 Jan 2016 21:27:41 +0000 (16:27 -0500)]
drm/amdgpu: no need to load MC firmware on fiji

commit ad32152eb26043d165eed9406cb9e2f7011f6b10 upstream.

Vbios does this for us on asic_init.

Reviewed-by: Ken Wang >Qingqing.Wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: fix amdgpu_bo_pin_restricted VRAM placing v2
Christian König [Tue, 19 Jan 2016 11:48:14 +0000 (12:48 +0100)]
drm/amdgpu: fix amdgpu_bo_pin_restricted VRAM placing v2

commit 78d0e182b6c1f5336f6e8cbb197f403276dabc7f upstream.

We could pin BOs into invisible VRAM otherwise.

v2: make logic more readable as suggested by Michel

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Rex Zhu <Rex.Zhu@amd.com> (v1)
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: fix tonga smu resume
Alex Deucher [Thu, 14 Jan 2016 18:48:24 +0000 (13:48 -0500)]
drm/amdgpu: fix tonga smu resume

commit e160e4db833c7e8587ec3c88efaed0d84f1bcf42 upstream.

Need to make sure smu buffers are pinned on resume.  This
matches what Fiji does.

Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: fix lost sync_to if scheduler is enabled.
Chunming Zhou [Wed, 13 Jan 2016 04:55:18 +0000 (12:55 +0800)]
drm/amdgpu: fix lost sync_to if scheduler is enabled.

commit 888c9e33e4c5a503285921046c621f7c73199d2f upstream.

when scheduler is enabled, the semaphore isn't used at all.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: call hpd_irq_event on resume
Alex Deucher [Tue, 24 Nov 2015 19:30:56 +0000 (14:30 -0500)]
drm/amdgpu: call hpd_irq_event on resume

commit 54fb2a5cd0baf8e97d743de411e2f832d1afa68d upstream.

Need to call this on resume if displays changes during
suspend in order to properly be notified of changes.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/amdgpu: Fix off-by-one errors in amdgpu_vm_bo_map
Felix Kuehling [Mon, 23 Nov 2015 22:43:48 +0000 (17:43 -0500)]
drm/amdgpu: Fix off-by-one errors in amdgpu_vm_bo_map

commit 005ae95e6ec119c64e2d16eb65a94c49e1dcf9f0 upstream.

eaddr is sometimes treated as the last address inside the address
range, and sometimes as the first address outside the range. This
was resulting in errors when a test filled up the entire address
space. Make it consistent to always be the last address within the
range.

Signed-off-by: Felix.Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/vmwgfx: respect 'nomodeset'
Rob Clark [Wed, 15 Oct 2014 19:00:47 +0000 (15:00 -0400)]
drm/vmwgfx: respect 'nomodeset'

commit 96c5d076f0a5e2023ecdb44d8261f87641ee71e0 upstream.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/vmwgfx: Fix a width / pitch mismatch on framebuffer updates
Thomas Hellstrom [Fri, 8 Jan 2016 19:29:40 +0000 (20:29 +0100)]
drm/vmwgfx: Fix a width / pitch mismatch on framebuffer updates

commit a50e2bf5a0f674d62b69f51f6935a30e82bd015c upstream.

When the framebuffer is a vmwgfx dma buffer and a proxy surface is
created, the vmw_kms_update_proxy() function requires that the proxy
surface width and the framebuffer pitch are compatible, otherwise
display corruption occurs as seen in gnome-shell/native with software
3D. Since the framebuffer pitch is determined by user-space, allocate
a proxy surface the width of which is based on the framebuffer pitch
rather than on the framebuffer width.

Reported-by: Raphael Hertzog <buxy@kali.org>
Tested-by: Mati Aharoni <muts@kali.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrm/vmwgfx: Fix an incorrect lock check
Thomas Hellstrom [Fri, 8 Jan 2016 19:29:39 +0000 (20:29 +0100)]
drm/vmwgfx: Fix an incorrect lock check

commit fb89ac5102ae2875d685c847e6b5dbc141622d43 upstream.

With CONFIG_SMP=n and CONFIG_DEBUG_SPINLOCK=y the vmwgfx kernel module
would unconditionally throw a bug when checking for a held spinlock
in the command buffer code. Fix this by using a lockdep check.

Reported-and-tested-by: Tetsuo Handa <penguin-kernel@i-love-sakura.ne.jp>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agovirtio_pci: fix use after free on release
Michael S. Tsirkin [Thu, 14 Jan 2016 14:00:41 +0000 (16:00 +0200)]
virtio_pci: fix use after free on release

commit 2989be09a8a9d62a785137586ad941f916e08f83 upstream.

KASan detected a use-after-free error in virtio-pci remove code. In
virtio_pci_remove(), vp_dev is still used after being freed in
unregister_virtio_device() (in virtio_pci_release_dev() more
precisely).

To fix, keep a reference until cleanup is done.

Fixes: 63bd62a08ca4 ("virtio_pci: defer kfree until release callback")
Reported-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agovirtio_balloon: fix race between migration and ballooning
Minchan Kim [Sun, 27 Dec 2015 23:35:13 +0000 (08:35 +0900)]
virtio_balloon: fix race between migration and ballooning

commit 21ea9fb69e7c4b1b1559c3e410943d3ff248ffcb upstream.

In balloon_page_dequeue, pages_lock should cover the loop
(ie, list_for_each_entry_safe). Otherwise, the cursor page could
be isolated by compaction and then list_del by isolation could
poison the page->lru.{prev,next} so the loop finally could
access wrong address like this. This patch fixes the bug.

general protection fault: 0000 [#1] SMP
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 2 PID: 82 Comm: vballoon Not tainted 4.4.0-rc5-mm1-access_bit+ #1906
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff8800a7ff0000 ti: ffff8800a7fec000 task.ti: ffff8800a7fec000
RIP: 0010:[<ffffffff8115e754>]  [<ffffffff8115e754>] balloon_page_dequeue+0x54/0x130
RSP: 0018:ffff8800a7fefdc0  EFLAGS: 00010246
RAX: ffff88013fff9a70 RBX: ffffea000056fe00 RCX: 0000000000002b7d
RDX: ffff88013fff9a70 RSI: ffffea000056fe00 RDI: ffff88013fff9a68
RBP: ffff8800a7fefde8 R08: ffffea000056fda0 R09: 0000000000000000
R10: ffff8800a7fefd90 R11: 0000000000000001 R12: dead0000000000e0
R13: ffffea000056fe20 R14: ffff880138809070 R15: ffff880138809060
FS:  0000000000000000(0000) GS:ffff88013fc40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f229c10e000 CR3: 00000000b8b53000 CR4: 00000000000006a0
Stack:
 0000000000000100 ffff880138809088 ffff880138809000 ffff880138809060
 0000000000000046 ffff8800a7fefe28 ffffffff812c86d3 ffff880138809020
 ffff880138809000 fffffffffff91900 0000000000000100 ffff880138809060
Call Trace:
 [<ffffffff812c86d3>] leak_balloon+0x93/0x1a0
 [<ffffffff812c8bc7>] balloon+0x217/0x2a0
 [<ffffffff8143739e>] ? __schedule+0x31e/0x8b0
 [<ffffffff81078160>] ? abort_exclusive_wait+0xb0/0xb0
 [<ffffffff812c89b0>] ? update_balloon_stats+0xf0/0xf0
 [<ffffffff8105b6e9>] kthread+0xc9/0xe0
 [<ffffffff8105b620>] ? kthread_park+0x60/0x60
 [<ffffffff8143b4af>] ret_from_fork+0x3f/0x70
 [<ffffffff8105b620>] ? kthread_park+0x60/0x60
Code: 8d 60 e0 0f 84 af 00 00 00 48 8b 43 20 a8 01 75 3b 48 89 d8 f0 0f ba 28 00 72 10 48 8b 03 f6 c4 08 75 2f 48 89 df e8 8c 83 f9 ff <49> 8b 44 24 20 4d 8d 6c 24 20 48 83 e8 20 4d 39 f5 74 7a 4c 89
RIP  [<ffffffff8115e754>] balloon_page_dequeue+0x54/0x130
 RSP <ffff8800a7fefdc0>
---[ end trace 43cf28060d708d5f ]---
Kernel panic - not syncing: Fatal exception
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled

Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agovirtio_balloon: fix race by fill and leak
Minchan Kim [Sun, 27 Dec 2015 23:35:12 +0000 (08:35 +0900)]
virtio_balloon: fix race by fill and leak

commit f68b992bbb474641881932c61c92dcfa6f5b3689 upstream.

During my compaction-related stuff, I encountered a bug
with ballooning.

With repeated inflating and deflating cycle, guest memory(
ie, cat /proc/meminfo | grep MemTotal) is decreased and
couldn't be recovered.

The reason is balloon_lock doesn't cover release_pages_balloon
so struct virtio_balloon fields could be overwritten by race
of fill_balloon(e,g, vb->*pfns could be critical).

This patch fixes it in my test.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>