scsi: sg: only check for dxfer_len greater than 256M
Don't make any assumptions on the sg_io_hdr_t::dxfer_direction or the
sg_io_hdr_t::dxferp in order to determine if it is a valid request. The
only way we can check for bad requests is by checking if the length
exceeds 256M.
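A minimal sketch of the resulting check, assuming the SZ_256M constant from include/linux/sizes.h; the exact placement in sg_common_write() is illustrative:

    /* sketch: the only sanity check left on the transfer */
    if (hp->dxfer_len >= SZ_256M)
            return -EINVAL;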
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: 28676d869bbb ("scsi: sg: check for valid direction before starting the request") Reported-by: Jason L Tibbitts III <tibbs@math.uh.edu> Tested-by: Jason L Tibbitts III <tibbs@math.uh.edu> Suggested-by: Doug Gilbert <dgilbert@interlog.com> Cc: Doug Gilbert <dgilbert@interlog.com> Cc: <stable@vger.kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
[qed_sp_fcoe_func_start:150(sp-0-3b:00.02)]Cannot satisfy CQ amount. CQs
requested 8, CQs available 6. Aborting function start
[qed_fcoe_start:821()]Failed to start fcoe
[__qedf_probe:3041]:6: Cannot start FCoE function.
The reason is a newly introduced check in the qed main part. This change
also provides the information about how many CQs are available, so we
simply limit the number of requested CQs.
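A hedged sketch of the clamping described above; the variable names are placeholders, not the exact qedf/qed fields:

    /* illustrative only: never request more CQs than the qed core reports */
    num_cqs = min(num_cqs_requested, num_cqs_available);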
Fixes: 3c5da9427802 ("qed: Share additional information with qedf") Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Thomas Gleixner [Mon, 24 Jul 2017 10:53:00 +0000 (12:53 +0200)]
scsi: bnx2i: Simplify cpu hotplug code
The CPU hotplug related code of this driver can be simplified by:
1) Consolidating the callbacks into a single state. The CPU thread can be
torn down on the CPU which goes offline. There is no point in delaying
that to the CPU dead state.
2) Let the core code invoke the online/offline callbacks and remove the
extra for_each_online_cpu() loops.
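A rough sketch of point 1), assuming the generic cpuhp_setup_state() API with a dynamically allocated state; the callback names are illustrative:

    static enum cpuhp_state bnx2i_online_state;

    static int __init bnx2i_mod_init(void)
    {
            int ret;

            /* single dynamic state; the core invokes the callbacks per CPU */
            ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "scsi/bnx2i:online",
                                    bnx2i_cpu_online, bnx2i_cpu_offline);
            if (ret < 0)
                    return ret;
            bnx2i_online_state = ret;
            return 0;
    }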
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Thomas Gleixner [Mon, 24 Jul 2017 10:52:59 +0000 (12:52 +0200)]
scsi: bnx2fc: Simplify CPU hotplug code
The CPU hotplug related code of this driver can be simplified by:
1) Consolidating the callbacks into a single state. The CPU thread can be
torn down on the CPU which goes offline. There is no point in delaying
that to the CPU dead state.
2) Let the core code invoke the online/offline callbacks and remove the
extra for_each_online_cpu() loops.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Thomas Gleixner [Mon, 24 Jul 2017 10:52:58 +0000 (12:52 +0200)]
scsi: bnx2i: Prevent recursive cpuhotplug locking
The BNX2I module init/exit code installs/removes the hotplug callbacks with
the cpu hotplug lock held. This worked with the old CPU locking
implementation which allowed recursive locking, but with the new percpu
rwsem based mechanism this is no longer allowed.
Use the _cpuslocked() variants to fix this.
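A minimal sketch of the pattern, assuming the _cpuslocked setup helper that accompanies the percpu rwsem conversion; the state name and callbacks are illustrative:

    cpus_read_lock();
    /* per-CPU thread setup done while CPUs cannot change */
    ret = cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ONLINE_DYN,
                                               "scsi/bnx2i:online",
                                               bnx2i_cpu_online,
                                               bnx2i_cpu_offline);
    cpus_read_unlock();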
Reported-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The BNX2FC module init/exit code installs/removes the hotplug callbacks with
the cpu hotplug lock held. This worked with the old CPU locking
implementation which allowed recursive locking, but with the new percpu
rwsem based mechanism this is no longer allowed.
Use the _cpuslocked() variants to fix this.
Reported-by: kernel test robot <fengguang.wu@intel.com> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Thomas Gleixner [Mon, 24 Jul 2017 10:52:56 +0000 (12:52 +0200)]
scsi: bnx2fc: Plug CPU hotplug race
bnx2fc_process_new_cqes() has protection against CPU hotplug, which relies
on the per cpu thread pointer. This protection is racy because it happens
only partially with the per cpu fp_work_lock held.
If the CPU is unplugged after the lock is dropped, the wakeup code can
dereference a NULL pointer or access freed and potentially reused memory.
Restructure the code so the thread check and wakeup happens with the
fp_work_lock held.
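A simplified sketch of the restructured flow; treat it as illustrative rather than the exact diff:

    spin_lock_bh(&fps->fp_work_lock);
    if (fps->iothread) {
            /* queue the work and wake the thread while still holding the lock,
             * so a concurrent CPU unplug cannot free the thread under us */
            list_add_tail(&work->list, &fps->work_list);
            wake_up_process(fps->iothread);
            spin_unlock_bh(&fps->fp_work_lock);
    } else {
            spin_unlock_bh(&fps->fp_work_lock);
            bnx2fc_process_cq_compl(tgt, wqe);      /* handle the CQE inline */
    }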
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: lpfc: fix linking against modular NVMe support
When LPFC is built-in but NVMe is a loadable module, we fail to link the
kernel:
drivers/scsi/built-in.o: In function `lpfc_nvme_create_localport':
(.text+0x156a82): undefined reference to `nvme_fc_register_localport'
drivers/scsi/built-in.o: In function `lpfc_nvme_destroy_localport':
(.text+0x156eaa): undefined reference to `nvme_fc_unregister_remoteport'
We can avoid this either by forcing lpfc to be a module, or by disabling
NVMe support in this case. This implements the former.
Fixes: 7d7080335f8d ("scsi: lpfc: Finalize Kconfig options for nvme") Cc: stable@vger.kernel.org Link: https://patchwork.kernel.org/patch/9636569/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hannes Reinecke [Mon, 24 Jul 2017 10:09:36 +0000 (12:09 +0200)]
scsi: scsi_transport_fc: return -EBUSY for deleted vport
When trying to delete a vport via 'vport_delete' sysfs attribute we
should be checking if the port is already in state VPORT_DELETING; if so
there's no need to do anything.
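A minimal sketch of the check described by the subject, assuming the FC_VPORT_DELETING flag on the fc_vport:

    /* illustrative: refuse a second delete of a vport already going away */
    if (vport->flags & FC_VPORT_DELETING)
            return -EBUSY;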
Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: libcxgbi: add check for valid cxgbi_task_data
In the error case it is possible that ->cleanup_task() gets called without
->alloc_pdu() having been called; in that case cxgbi_task_data is not valid,
so add a check for valid cxgbi_task_data in cxgbi_cleanup_task().
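A minimal sketch of the guard; the accessor used to fetch the per-task data is illustrative:

    void cxgbi_cleanup_task(struct iscsi_task *task)
    {
            struct cxgbi_task_data *tdata = iscsi_task_cxgbi_data(task);

            /* ->alloc_pdu() never ran for this task, nothing to clean up */
            if (!tdata)
                    return;
            /* normal cleanup path continues here */
    }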
Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Jakub Kicinski [Wed, 19 Jul 2017 01:58:34 +0000 (18:58 -0700)]
scsi: aic7xxx: fix firmware build with O=path
Building firmware with O=path has apparently been broken in aic7xxx forever.
The message of the previous commit to the Makefile (from 2008) already
mentions this unfortunate state of affairs. Fix this, mostly to make
randconfig builds more reliable.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Shu Wang <shuwang@redhat.com> Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
qedi uses iscsi_boot_sysfs to export the targets used for boot to
sysfs. Select the config option to make sure the module is built.
This addresses the compile time issue,
drivers/scsi/qedi/qedi_main.o: In function `qedi_remove':
qedi_main.c:(.text+0x3bbd): undefined reference to `iscsi_boot_destroy_kset'
drivers/scsi/qedi/qedi_main.o: In function `__qedi_probe.constprop.0':
qedi_main.c:(.text+0x577a): undefined reference to `iscsi_boot_create_target'
qedi_main.c:(.text+0x5807): undefined reference to `iscsi_boot_create_target'
qedi_main.c:(.text+0x587f): undefined reference to `iscsi_boot_create_initiator'
qedi_main.c:(.text+0x58f3): undefined reference to `iscsi_boot_create_ethernet'
qedi_main.c:(.text+0x5927): undefined reference to `iscsi_boot_destroy_kset'
qedi_main.c:(.text+0x5d7b): undefined reference to `iscsi_boot_create_host_kset'
[mkp: fixed whitespace]
Signed-off-by: Nilesh Javali <nilesh.javali@cavium.com> Fixes: c57ec8fb7c02 ("scsi: qedi: Add support for Boot from SAN over iSCSI offload") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
scsi: sg: fix static checker warning in sg_is_valid_dxfer
dxfer_len is an unsigned int and we always assign a value > 0 to it, so
it doesn't make any sense to check if it is < 0. We can't really check
dxferp either, as we have both NULL and non-NULL cases in the possible
call paths.
So just return true for SG_DXFER_FROM_DEV transfer in
sg_is_valid_dxfer().
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reported-by: Colin Ian King <colin.king@canonical.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yadan Fan [Fri, 23 Jun 2017 09:40:06 +0000 (17:40 +0800)]
scsi: smartpqi: limit transfer length to 1MB
The smartpqi firmware will bypass the cache for any request larger than
1MB, so we should cap the request size to avoid any performance
degradation in kernels later than v4.3.
This degradation was introduced by commit d2be537c3ba3568acd79cd178327b842e60d035e,
which changed max_sectors_kb to 1280k, but the hardware is able to
work fine with it, so the true fix should be in the smartpqi driver.
Signed-off-by: Yadan Fan <ydfan@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yadan Fan [Fri, 23 Jun 2017 09:40:05 +0000 (17:40 +0800)]
scsi: hpsa: limit transfer length to 1MB
The hpsa firmware will bypass the cache for any request larger than 1MB,
so we should cap the request size to avoid any performance degradation
in kernels later than v4.3.
This degradation was introduced by commit d2be537c3ba3568acd79cd178327b842e60d035e,
which changed max_sectors_kb to 1280k, but the hardware is able to work
fine with it, so the true fix should be in the hpsa driver.
Signed-off-by: Yadan Fan <ydfan@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Dan Carpenter [Wed, 12 Jul 2017 07:30:22 +0000 (10:30 +0300)]
scsi: libfc: pass an error pointer to fc_disc_error()
This patch is basically to silence a static checker warning.
drivers/scsi/libfc/fc_disc.c:326 fc_disc_error()
warn: passing a valid pointer to 'PTR_ERR'
It doesn't affect runtime because it treats -ENOMEM and a valid pointer
the same. But the documentation says we should be passing an error
pointer.
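A sketch of the kind of change described, assuming the failure path where a frame allocation failed:

    /* pass a real error pointer so PTR_ERR() inside fc_disc_error() makes sense */
    fc_disc_error(disc, ERR_PTR(-ENOMEM));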
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Johannes Thumshirn <jth@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Colin Ian King [Tue, 11 Jul 2017 12:11:44 +0000 (13:11 +0100)]
scsi: hisi_sas: make several const arrays static
Don't populate various tables on the stack but make them static const.
Makes the object code smaller by over 280 bytes:
Before:
text data bss dec hex filename
39887 5080 64 45031 afe7 hisi_sas_v2_hw.o
After:
text data bss dec hex filename
39318 5368 64 44750 aece hisi_sas_v2_hw.o
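A generic sketch of the transformation (values are illustrative):

    /* before: the table is rebuilt on the stack on every call */
    u32 poll_delay_ms[] = { 1, 10, 100, 1000 };

    /* after: emitted once into .rodata, shrinking the generated code */
    static const u32 poll_delay_ms[] = { 1, 10, 100, 1000 };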
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Dan Carpenter [Mon, 10 Jul 2017 08:47:40 +0000 (11:47 +0300)]
scsi: qla2xxx: Off by one in qlt_ctio_to_cmd()
There are "req->num_outstanding_cmds" elements in the
req->outstanding_cmds[] array so the > here should be >=.
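A sketch of the corrected bounds check (simplified):

    /* valid indices are 0 .. num_outstanding_cmds - 1, so reject >= */
    if (handle >= req->num_outstanding_cmds)
            return NULL;
    cmd = req->outstanding_cmds[handle];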
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
SG_DXFER_FROM_DEV transfers do not necessarily have a dxferp as we set
it to NULL for the old sg_io read/write interface, but must have a
length bigger than 0. This fixes a regression introduced by commit 28676d869bbb ("scsi: sg: check for valid direction before starting the
request")
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: 28676d869bbb ("scsi: sg: check for valid direction before starting the request") Reported-by: Chris Clayton <chris2553@googlemail.com> Tested-by: Chris Clayton <chris2553@googlemail.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Paolo Bonzini [Wed, 5 Jul 2017 08:30:56 +0000 (10:30 +0200)]
scsi: virtio_scsi: always read VPD pages for multiqueue too
Multi-queue virtio-scsi uses a different scsi_host_template struct. Add
the .device_alloc field there, too.
Fixes: 25d1d50e23275e141e3a3fe06c25a99f4c4bf4e0 Cc: stable@vger.kernel.org Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Trivial fix to spelling mistake in QEDF_INFO message and remove
duplicated "since" (thanks to Tyrel Datwyler for spotting the latter
issue).
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Colin Ian King [Mon, 3 Jul 2017 10:24:02 +0000 (11:24 +0100)]
scsi: qedi: fix another spelling mistake: "alloction" -> "allocation"
Trivial fix to spelling mistake in QEDF_ERR message. I should have also
included this in a previous fix, but I only just spotted this one.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Manish Rangankar <Manish.Rangankar@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Dan Carpenter [Fri, 30 Jun 2017 08:01:06 +0000 (11:01 +0300)]
scsi: cxlflash: return -EFAULT if copy_from_user() fails
The copy_from/to_user() functions return the number of bytes remaining
to be copied but we had intended to return -EFAULT here.
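A generic sketch of the corrected pattern (structure and variable names illustrative):

    /* copy_from_user() returns the number of bytes NOT copied, not an errno */
    if (copy_from_user(&dbg, (void __user *)arg, sizeof(dbg))) {
            rc = -EFAULT;
            goto out;
    }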
Fixes: bc88ac47d5cb ("scsi: cxlflash: Support AFU debug") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Nilesh Javali [Tue, 27 Jun 2017 09:26:56 +0000 (02:26 -0700)]
scsi: qedi: Add support for Boot from SAN over iSCSI offload
This patch adds support for Boot from SAN over iSCSI offload. The iSCSI
boot information in the NVRAM is populated under
/sys/firmware/iscsi_bootX/ using qed NVM-image reading API and further
exported to open-iscsi to perform iSCSI login enabling boot over offload
iSCSI interface in a Boot from SAN environment.
Signed-off-by: Arun Easi <arun.easi@cavium.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@cavium.com> Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com> Signed-off-by: Nilesh Javali <nilesh.javali@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull more block updates from Jens Axboe:
"This is a followup for block changes, that didn't make the initial
pull request. It's a bit of a mixed bag, this contains:
- A followup pull request from Sagi for NVMe. Outside of fixups for
NVMe, it also includes a series for ensuring that we properly
quiesce hardware queues when browsing live tags.
- Set of integrity fixes from Dmitry (mostly), fixing various issues
for folks using DIF/DIX.
- Fix for a bug introduced in cciss, with the req init changes. From
Christoph.
- Fix for a bug in BFQ, from Paolo.
- Two followup fixes for lightnvm/pblk from Javier.
- Depth fix from Ming for blk-mq-sched.
- Also from Ming, performance fix for mtip32xx that was introduced
with the dynamic initialization of commands"
* 'for-linus' of git://git.kernel.dk/linux-block: (44 commits)
block: call bio_uninit in bio_endio
nvmet: avoid unneeded assignment of submit_bio return value
nvme-pci: add module parameter for io queue depth
nvme-pci: compile warnings in nvme_alloc_host_mem()
nvmet_fc: Accept variable pad lengths on Create Association LS
nvme_fc/nvmet_fc: revise Create Association descriptor length
lightnvm: pblk: remove unnecessary checks
lightnvm: pblk: control I/O flow also on tear down
cciss: initialize struct scsi_req
null_blk: fix error flow for shared tags during module_init
block: Fix __blkdev_issue_zeroout loop
nvme-rdma: unconditionally recycle the request mr
nvme: split nvme_uninit_ctrl into stop and uninit
virtio_blk: quiesce/unquiesce live IO when entering PM states
mtip32xx: quiesce request queues to make sure no submissions are inflight
nbd: quiesce request queues to make sure no submissions are inflight
nvme: kick requeue list when requeueing a request instead of when starting the queues
nvme-pci: quiesce/unquiesce admin_q instead of start/stop its hw queues
nvme-loop: quiesce/unquiesce admin_q instead of start/stop its hw queues
nvme-fc: quiesce/unquiesce admin_q instead of start/stop its hw queues
...
Merge tag 'smb3-security-fixes-for-4.13' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes and sane default from Steve French:
"Upgrade default dialect to more secure SMB3 from older cifs dialect"
* tag 'smb3-security-fixes-for-4.13' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Clean up unused variables in smb2pdu.c
[SMB3] Improve security, move default dialect to SMB3 from old CIFS
[SMB3] Remove ifdef since SMB3 (and later) now STRONGLY preferred
CIFS: Reconnect expired SMB sessions
CIFS: Display SMB2 error codes in the hex format
cifs: Use smb 2 - 3 and cifsacl mount options setacl function
cifs: prototype declaration and definition to set acl for smb 2 - 3 and cifsacl mount options
Merge tag 'ceph-for-4.13-rc1' of git://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"The main item here is support for v12.y.z ("Luminous") clusters:
RESEND_ON_SPLIT, RADOS_BACKOFF, OSDMAP_PG_UPMAP and CRUSH_CHOOSE_ARGS
feature bits, and various other changes in the RADOS client protocol.
On top of that we have a new fsc mount option to allow supplying
fscache uniquifier (similar to NFS) and the usual pile of filesystem
fixes from Zheng"
* tag 'ceph-for-4.13-rc1' of git://github.com/ceph/ceph-client: (44 commits)
libceph: advertise support for NEW_OSDOP_ENCODING and SERVER_LUMINOUS
libceph: osd_state is 32 bits wide in luminous
crush: remove an obsolete comment
crush: crush_init_workspace starts with struct crush_work
libceph, crush: per-pool crush_choose_arg_map for crush_do_rule()
crush: implement weight and id overrides for straw2
libceph: apply_upmap()
libceph: compute actual pgid in ceph_pg_to_up_acting_osds()
libceph: pg_upmap[_items] infrastructure
libceph: ceph_decode_skip_* helpers
libceph: kill __{insert,lookup,remove}_pg_mapping()
libceph: introduce and switch to decode_pg_mapping()
libceph: don't pass pgid by value
libceph: respect RADOS_BACKOFF backoffs
libceph: make DEFINE_RB_* helpers more general
libceph: avoid unnecessary pi lookups in calc_target()
libceph: use target pi for calc_target() calculations
libceph: always populate t->target_{oid,oloc} in calc_target()
libceph: make sure need_resend targets reflect latest map
libceph: delete from need_resend_linger before check_linger_pool_dne()
...
- Cleanups and improvements for sama5d4, intel-mid_wdt, s3c2410_wdt,
orion_wdt, gpio_wdt, it87_wdt, meson_wdt, davinci_wdt, bcm47xx_wdt,
zx2967_wdt, cadence_wdt
* git://www.linux-watchdog.org/linux-watchdog: (32 commits)
watchdog: introduce watchdog_worker_should_ping helper
watchdog: uniphier: add UniPhier watchdog driver
dt-bindings: watchdog: add description for UniPhier WDT controller
watchdog: cadence_wdt: make of_device_ids const.
watchdog: zx2967: constify zx2967_wdt_ops.
watchdog: bcm47xx_wdt: constify bcm47xx_wdt_hard_ops and bcm47xx_wdt_soft_ops
watchdog: davinci: Add missing clk_disable_unprepare().
watchdog: davinci: Handle return value of clk_prepare_enable
watchdog: meson: Handle return value of clk_prepare_enable
watchdog: it87: Add support for various Super-IO chips
watchdog: it87: Use infrastructure to stop watchdog on reboot
watchdog: it87: Drop support for resetting watchdog though CIR and Game port
watchdog: it87: Convert to use watchdog core infrastructure
watchdog: it87: Drop FSF mailing address
watchdog: dw_wdt: get reset lines from dt
watchdog: bindings: dw_wdt: add reset lines
watchdog: w83627hf: Add support for NCT6793D and NCT6795D
watchdog: core: add option to avoid early handling of watchdog
watchdog: f71808e_wdt: Add F71868 support
watchdog: Add STM32 IWDG driver
...
Merge tag 'chrome-platform-for-linus-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform
Pull chrome platform updates from Benson Leung:
"Changes in this pull request are around catching up cros_ec with the
internal chromeos-kernel versions of cros_ec, cros_ec_lpc, and
cros_ec_lightbar.
Also, switching maintainership from olof to bleung"
* tag 'chrome-platform-for-linus-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform:
platform/chrome : Add myself as Maintainer
platform/chrome: cros_ec_lightbar - hide unused PM functions
cros_ec: Don't signal wake event for non-wake host events
cros_ec: Fix deadlock when EC is not responsive at probe
cros_ec: Don't return error when checking command version
platform/chrome: cros_ec_lightbar - Avoid I2C xfer to EC during suspend
platform/chrome: cros_ec_lightbar - Add userspace lightbar control bit to EC
platform/chrome: cros_ec_lightbar - Control of suspend/resume lightbar sequence
platform/chrome: cros_ec_lightbar - Add lightbar program feature to sysfs
platform/chrome: cros_ec_lpc: Add MKBP events support over ACPI
platform/chrome: cros_ec_lpc: Add power management ops
platform/chrome: cros_ec_lpc: Add support for GOOG004 ACPI device
platform/chrome: cros_ec_lpc: Add support for mec1322 EC
platform/chrome: cros_ec_lpc: Add R/W helpers to LPC protocol variants
mfd: cros_ec: Add support for dumping panic information
cros_ec_debugfs: Pass proper struct sizes to cros_ec_cmd_xfer()
mfd: cros_ec: add debugfs, console log file
mfd: cros_ec: Add EC console read structures definitions
mfd: cros_ec: Add helper for event notifier.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (115 commits)
kernel/exit.c: avoid undefined behaviour when calling wait4()
kernel/signal.c: avoid undefined behaviour in kill_something_info
binfmt_elf: safely increment argv pointers
s390: reduce ELF_ET_DYN_BASE
powerpc: move ELF_ET_DYN_BASE to 4GB / 4MB
arm64: move ELF_ET_DYN_BASE to 4GB / 4MB
arm: move ELF_ET_DYN_BASE to 4MB
binfmt_elf: use ELF_ET_DYN_BASE only for PIE
fs, epoll: short circuit fetching events if thread has been killed
checkpatch: improve multi-line alignment test
checkpatch: improve macro reuse test
checkpatch: change format of --color argument to --color[=WHEN]
checkpatch: silence perl 5.26.0 unescaped left brace warnings
checkpatch: improve tests for multiple line function definitions
checkpatch: remove false warning for commit reference
checkpatch: fix stepping through statements with $stat and ctx_statement_block
checkpatch: [HLP]LIST_HEAD is also declaration
checkpatch: warn when a MAINTAINERS entry isn't [A-Z]:\t
checkpatch: improve the unnecessary OOM message test
lib/bsearch.c: micro-optimize pivot position calculation
...
When building the argv/envp pointers, the envp is needlessly
pre-incremented instead of just continuing after the argv pointers are
finished. In some (likely impossible) race where the strings could be
changed from userspace between copy_strings() and here, it might be
possible to confuse the envp position. Instead, just use sp like
everything else.
Link: http://lkml.kernel.org/r/20170622173838.GA43308@beast Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Rik van Riel <riel@redhat.com> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Qualys Security Advisory <qsa@qualys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit
address space for 32-bit pointers. On 32-bit use 4MB, which is the
traditional x86 minimum load location, likely to avoid historically
requiring a 4MB page table entry when only a portion of the first 4MB
would be used (since the NULL address is avoided). For s390 the
position could be 0x10000, but that is needlessly close to the NULL
address.
Link: http://lkml.kernel.org/r/1498154792-49952-5-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit
address space for 32-bit pointers. On 32-bit use 4MB, which is the
traditional x86 minimum load location, likely to avoid historically
requiring a 4MB page table entry when only a portion of the first 4MB
would be used (since the NULL address is avoided).
Link: http://lkml.kernel.org/r/1498154792-49952-4-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit
address space for 32-bit pointers. On 32-bit use 4MB, to match ARM.
This could be 0x8000, the standard ET_EXEC load address, but that is
needlessly close to the NULL address, and anyone running arm compat PIE
will have an MMU, so the tight mapping is not needed.
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
4MB is chosen here mainly to have parity with x86, where this is the
traditional minimum load location, likely to avoid historically
requiring a 4MB page table entry when only a portion of the first 4MB
would be used (since the NULL address is avoided).
For ARM the position could be 0x8000, the standard ET_EXEC load address,
but that is needlessly close to the NULL address, and anyone running PIE
on 32-bit ARM will have an MMU, so the tight mapping is not needed.
Link: http://lkml.kernel.org/r/1498154792-49952-2-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Qualys Security Advisory <qsa@qualys.com> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ELF_ET_DYN_BASE position was originally intended to keep loaders
away from ET_EXEC binaries. (For example, running "/lib/ld-linux.so.2
/bin/cat" might cause the subsequent load of /bin/cat into where the
loader had been loaded.)
With the advent of PIE (ET_DYN binaries with an INTERP Program Header),
ELF_ET_DYN_BASE continued to be used since the kernel was only looking
at ET_DYN. However, since ELF_ET_DYN_BASE is traditionally set at the
top 1/3rd of the TASK_SIZE, a substantial portion of the address space
is unused.
For 32-bit tasks when RLIMIT_STACK is set to RLIM_INFINITY, programs are
loaded above the mmap region. This means they can be made to collide
(CVE-2017-1000370) or nearly collide (CVE-2017-1000371) with
pathological stack regions.
Lowering ELF_ET_DYN_BASE solves both by moving programs below the mmap
region in all cases, and will now additionally avoid programs falling
back to the mmap region by enforcing MAP_FIXED for program loads (i.e.
if it would have collided with the stack, now it will fail to load
instead of falling back to the mmap region).
To allow for a lower ELF_ET_DYN_BASE, loaders (ET_DYN without INTERP)
are loaded into the mmap region, leaving space available for either an
ET_EXEC binary with a fixed location or PIE being loaded into mmap by
the loader. Only PIE programs are loaded offset from ELF_ET_DYN_BASE,
which means architectures can now safely lower their values without risk
of loaders colliding with their subsequently loaded programs.
For 64-bit, ELF_ET_DYN_BASE is best set to 4GB to allow runtimes to use
the entire 32-bit address space for 32-bit pointers.
Thanks to PaX Team, Daniel Micay, and Rik van Riel for inspiration and
suggestions on how to implement this solution.
Fixes: d1fd836dcf00 ("mm: split ET_DYN ASLR from mmap ASLR") Link: http://lkml.kernel.org/r/20170621173201.GA114489@beast Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Rik van Riel <riel@redhat.com> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Qualys Security Advisory <qsa@qualys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Pratyush Anand <panand@redhat.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Mon, 10 Jul 2017 22:52:33 +0000 (15:52 -0700)]
fs, epoll: short circuit fetching events if thread has been killed
We've encountered zombies that are waiting for a thread to exit that are
looping in ep_poll() almost endlessly although there is a pending
SIGKILL as a result of a group exit.
This happens because we always find ep_events_available() and fetch more
events and never are able to check for signal_pending() that would break
from the loop and return -EINTR.
Special case fatal signals and break immediately to guarantee that we do
not loop to fetch more events and delay making a timely exit.
It would also be possible to simply move the check for signal_pending()
higher than checking for ep_events_available(), but there have been no
reports of delayed signal handling other than SIGKILL preventing zombies
from exiting that would be fixed by this.
It fixes an issue for us where we have witnessed zombies sticking around
for at least O(minutes), but considering the code has been like this
forever and nobody else that I have found has complained, I would simply
queue it up for 4.12.
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1705031722350.76784@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jan Kara <jack@suse.cz> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
John Brooks [Mon, 10 Jul 2017 22:52:24 +0000 (15:52 -0700)]
checkpatch: change format of --color argument to --color[=WHEN]
The boolean --color argument did not offer the ability to force
colourized output even if stdout is not a terminal. Change the format
of the argument to the familiar --color[=WHEN] construct as seen in
common Linux utilities such as git, ls and dmesg, which allows the user
to specify whether to colourize output "always", "never", or "auto" (only
when the output is a terminal). The default is "auto".
The old command-line uses of --color and --no-color are unchanged.
checkpatch: silence perl 5.26.0 unescaped left brace warnings
As of perl 5, version 26, subversion 0 (v5.26.0), some new warnings appear
when running checkpatch.
Unescaped left brace in regex is deprecated here (and will be fatal in
Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.\s*){
<-- HERE \s*/ at scripts/checkpatch.pl line 3544.
Unescaped left brace in regex is deprecated here (and will be fatal in
Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.\s*){
<-- HERE \s*/ at scripts/checkpatch.pl line 3885.
Unescaped left brace in regex is deprecated here (and will be fatal in
Perl 5.30), passed through in regex; marked by <-- HERE in
m/^(\+.*(?:do|\))){ <-- HERE / at scripts/checkpatch.pl line 4374.
It seems perfectly reasonable to do as the warning suggests and simply
escape the left brace in these three locations.
Link: http://lkml.kernel.org/r/20170607060135.17384-1-cyrilbur@gmail.com Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Acked-by: Joe Perches <joe@perches.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
lib/bsearch.c: micro-optimize pivot position calculation
There is a slightly faster way (in terms of the number of instructions
used) to calculate the position of the middle element while preserving
integer-overflow safety.
./scripts/bloat-o-meter lib/bsearch.o.old lib/bsearch.o.new
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-24 (-24)
function old new delta
bsearch 122 98 -24
Test: int array of size 100001, elements [0..100000]; gcc 7.1, -Os, x86_64.
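The overflow safety being preserved is the classic midpoint idiom; a generic illustration, not the exact lib/bsearch.c diff:

    /* (low + high) / 2 can overflow; this form cannot */
    size_t mid = low + (high - low) / 2;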
Michal Hocko [Mon, 10 Jul 2017 22:51:55 +0000 (15:51 -0700)]
lib/rhashtable.c: use kvzalloc() in bucket_table_alloc() when possible
bucket_table_alloc() can currently be called with GFP_KERNEL or
GFP_ATOMIC. For the former we basically have an open-coded kvzalloc(),
while the latter only uses kzalloc(). Let's simplify the code a bit by
dropping the open-coded path and replacing it with kvzalloc().
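A sketch of the simplified allocation (size calculation and field names abbreviated):

    size = sizeof(*tbl) + nbuckets * sizeof(tbl->buckets[0]);
    tbl = kvzalloc(size, gfp);  /* kmalloc first, vmalloc fallback for GFP_KERNEL */
    if (tbl == NULL)
            return NULL;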
Link: http://lkml.kernel.org/r/20170531155145.17111-3-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: Thomas Graf <tgraf@suug.ch> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
... such that a user can specify visiting all the nodes in the tree
(intersects with the world). This is a nice opposite from the very
basic default query which is a single point.
The patch also helps on embedded archs which generally only like "int". On
arm an "and 0xff" is generated, which is wasteful because all values used
in comparisons are positive.
Matthew Wilcox [Mon, 10 Jul 2017 22:51:35 +0000 (15:51 -0700)]
bitmap: use memcmp optimisation in more situations
Commit 7dd968163f7c ("bitmap: bitmap_equal memcmp optimization") was
rather more restrictive than necessary; we can use memcmp() to implement
bitmap_equal() as long as the number of bits can be proved to be a
multiple of 8. And architectures other than s390 may be able to make
good use of this optimisation.
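A sketch of the relaxed condition; close to, but not claimed to be, the exact upstream helper:

    static inline int bitmap_equal(const unsigned long *src1,
                                   const unsigned long *src2, unsigned int nbits)
    {
            if (small_const_nbits(nbits))
                    return !((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
            /* any provably constant multiple of 8 bits can be compared bytewise */
            if (__builtin_constant_p(nbits & 7) && IS_ALIGNED(nbits, 8))
                    return !memcmp(src1, src2, nbits / 8);
            return __bitmap_equal(src1, src2, nbits);
    }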
Matthew Wilcox [Mon, 10 Jul 2017 22:51:32 +0000 (15:51 -0700)]
include/linux/bitmap.h: turn bitmap_set and bitmap_clear into memset when possible
Several callers have constant 'start' and an 'nbits' that is a multiple
of 8, so we can turn them into calls to memset. We don't need the
entirety of 'start' and 'nbits' to be constant, we just need to know
whether they're divisible by 8.
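A sketch of the memset specialisation for bitmap_set(); the clear variant is symmetric with 0x00:

    static inline void bitmap_set(unsigned long *map, unsigned int start,
                                  unsigned int nbits)
    {
            /* only start & 7 and nbits & 7 need to be compile-time constants */
            if (__builtin_constant_p(start & 7) && IS_ALIGNED(start, 8) &&
                __builtin_constant_p(nbits & 7) && IS_ALIGNED(nbits, 8))
                    memset((char *)map + start / 8, 0xff, nbits / 8);
            else
                    __bitmap_set(map, start, nbits);
    }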
Link: http://lkml.kernel.org/r/20170628153221.11322-4-willy@infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthew Wilcox [Mon, 10 Jul 2017 22:51:29 +0000 (15:51 -0700)]
bitmap: optimise bitmap_set and bitmap_clear of a single bit
We have eight users calling bitmap_clear for a single bit and seventeen
calling bitmap_set for a single bit. Rather than fix all of them to
call __clear_bit or __set_bit, turn bitmap_clear and bitmap_set into
inline functions and make this special case efficient.
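A sketch of the single-bit fast path for the clear side; the set side is analogous with __set_bit():

    static inline void bitmap_clear(unsigned long *map, unsigned int start,
                                    unsigned int nbits)
    {
            if (__builtin_constant_p(nbits) && nbits == 1)
                    __clear_bit(start, map);
            else
                    __bitmap_clear(map, start, nbits);
    }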
Link: http://lkml.kernel.org/r/20170628153221.11322-3-willy@infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthew Wilcox [Mon, 10 Jul 2017 22:51:26 +0000 (15:51 -0700)]
lib/test_bitmap.c: add optimisation tests
Patch series "Bitmap optimisations", v2.
These three bitmap patches use more efficient specialisations when the
compiler can figure out that it's safe to do so. Thanks to Rasmus's
eagle eyes, a nasty bug in v1 was avoided, and I've added a test case
which would have caught it.
This patch (of 4):
This version of the test is actually a no-op; the next patch will enable
it.
Link: http://lkml.kernel.org/r/20170628153221.11322-2-willy@infradead.org Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
MAINTAINERS: give proc sysctl some maintainer love
We poke at proc sysctl enough that really we should declare it
maintained. We'll just be Cc'd and sending updates / ACK'ing changes
through akpm's tree.
Link: http://lkml.kernel.org/r/20170524231305.8649-1-mcgrof@kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by <linux/sysfs.h> work with
const attribute_group. So mark the non-const structs as const.
File size before:
text data bss dec hex filename
1120 544 16 1680 690 kernel/ksysfs.o
File size After adding 'const':
text data bss dec hex filename
1160 480 16 1656 678 kernel/ksysfs.o
Link: http://lkml.kernel.org/r/aa224b3cc923fdbb3edd0c41b2c639c85408c9e8.1498737347.git.arvind.yadav.cs@gmail.com Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Dave Young <dyoung@redhat.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Petr Tesarik <ptesarik@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The global variable 'rd_size' is declared as 'int' in source file
arch/arm/kernel/atags_parse.c and as 'unsigned long' in
drivers/block/brd.c. Fix this inconsistency.
Additionally, remove the declarations of rd_image_start, rd_prompt and
rd_doload from parse_tag_ramdisk() since these duplicate existing
declarations in <linux/initrd.h>.
Link: http://lkml.kernel.org/r/20170627065024.12347-1-bart.vanassche@wdc.com Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jan Kara <jack@suse.cz> Cc: Jason Yan <yanaijie@huawei.com> Cc: Zhaohongjiang <zhaohongjiang@huawei.com> Cc: Miao Xie <miaoxie@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ian Abbott [Mon, 10 Jul 2017 22:51:07 +0000 (15:51 -0700)]
bug: split BUILD_BUG stuff out into <linux/build_bug.h>
Including <linux/bug.h> pulls in a lot of bloat from <asm/bug.h> and
<asm-generic/bug.h> that is not needed to call the BUILD_BUG() family of
macros. Split them out into their own header, <linux/build_bug.h>.
Also correct some checkpatch.pl errors for the BUILD_BUG_ON_ZERO() and
BUILD_BUG_ON_NULL() macros by adding parentheses around the bitfield
widths that begin with a minus sign.
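For reference, a sketch of the parenthesised form, essentially the shape of the macros after the cleanup:

    /* force a compile error if e is true; evaluates to 0 otherwise */
    #define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:(-!!(e)); }))
    #define BUILD_BUG_ON_NULL(e) ((void *)sizeof(struct { int:(-!!(e)); }))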
Link: http://lkml.kernel.org/r/20170525120316.24473-6-abbotti@mev.co.uk Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Acked-by: Michal Nazarewicz <mina86@mina86.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ian Abbott [Mon, 10 Jul 2017 22:50:58 +0000 (15:50 -0700)]
linux/bug.h: correct formatting of block comment
Correct these checkpatch.pl warnings:
|WARNING: Block comments use * on subsequent lines
|#34: FILE: include/linux/bug.h:34:
|+/* Force a compilation error if condition is true, but also produce a
|+ result (of value 0 and type size_t), so the expression can be used
|WARNING: Block comments use a trailing */ on a separate line
|#36: FILE: include/linux/bug.h:36:
|+ aren't permitted). */
Link: http://lkml.kernel.org/r/20170525120316.24473-3-abbotti@mev.co.uk Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Acked-by: Michal Nazarewicz <mina86@mina86.com> Cc: Kees Cook <keescook@chromium.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ian Abbott [Mon, 10 Jul 2017 22:50:55 +0000 (15:50 -0700)]
asm-generic/bug.h: declare struct pt_regs; before function prototype
This series of patches splits BUILD_BUG related macros out of
"include/linux/bug.h" into new file "include/linux/build_bug.h" (patch
5), and changes the pointer type checking in the `container_of()` macro
to deal with pointers of array type better (patch 6). Patches 1 to 4
are prerequisites.
Patches 2, 3, 4, and 5 have been inserted since the previous version of
this patch series. Patch 6 here corresponds to v3 and v4's patch 2.
Patch 1 was a prerequisite in v3 of this series to avoid a lot of
warnings when <linux/bug.h> was included by <linux/kernel.h>. That is
no longer relevant for v5 of the series, but I left it in because it was
acked by Arnd Bergmann and Michal Nazarewicz.
Patches 2, 3, and 4 are some checkpatch clean-ups on
"include/linux/bug.h" before splitting out the BUILD_BUG stuff in patch
5.
Patch 5 splits the BUILD_BUG related macros out of "include/linux/bug.h"
into new file "include/linux/build_bug.h" because including
<linux/bug.h> in "include/linux/kernel.h" would result in build failures
due to circular dependencies.
Patch 6 changes the pointer type checking by `container_of()` to avoid
some incompatible pointer warnings when the dereferenced pointer has
array type.
1) asm-generic/bug.h: declare struct pt_regs; before function prototype
2) linux/bug.h: correct formatting of block comment
3) linux/bug.h: correct "(foo*)" should be "(foo *)"
4) linux/bug.h: correct "space required before that '-'"
5) bug: split BUILD_BUG stuff out into <linux/build_bug.h>
6) kernel.h: handle pointers to arrays better in container_of()
This patch (of 6):
The declaration of `__warn()` has `struct pt_regs *regs` as one of its
parameters. This can result in compiler warnings if `struct pt_regs` is not
already declared. Add an empty declaration of `struct pt_regs` to avoid
the warnings.
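A minimal sketch of the change; the __warn() prototype shown is approximate:

    struct pt_regs;    /* empty forward declaration, silences the warning */
    struct warn_args;

    void __warn(const char *file, int line, void *caller, unsigned taint,
                struct pt_regs *regs, struct warn_args *args);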
Link: http://lkml.kernel.org/r/20170525120316.24473-2-abbotti@mev.co.uk Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michal Nazarewicz <mina86@mina86.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Deacon [Mon, 10 Jul 2017 22:50:49 +0000 (15:50 -0700)]
frv: cmpxchg: implement cmpxchg64()
FRV supports 64-bit cmpxchg, which is provided by the arch code as
__cmpxchg_64 and subsequently used to implement atomic64_cmpxchg.
This patch hooks up the generic cmpxchg64 API using the same function,
which also provides default definitions of the relaxed, acquire and
release variants. This fixes the build when COMPILE_TEST=y and
IOMMU_IO_PGTABLE_LPAE=y.
Link: http://lkml.kernel.org/r/1499084670-6996-1-git-send-email-will.deacon@arm.com Signed-off-by: Will Deacon <will.deacon@arm.com> Reported-by: kbuild test robot <fengguang.wu@intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The arch uses a verbatim copy of the asm-generic version and does not
add any own implementations to the header, so use asm-generic/fb.h
instead of duplicating code.
Link: http://lkml.kernel.org/r/20170517083307.1697-1-tklauser@distanz.ch Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
frv's asm/device.h is merely including asm-generic/device.h. Thus, the
arch specific header can be omitted and the generic header can be used
directly.
Link: http://lkml.kernel.org/r/20170517124915.26904-1-tklauser@distanz.ch Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KASAN currently doesn't work with memory hotplug because hotplugged memory
doesn't have any shadow memory. So any access to hotplugged memory
would cause a crash on the shadow check.
Use a memory hotplug notifier to allocate and map shadow memory when the
hotplugged memory is going online, and free the shadow after the memory is
offlined.
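A rough sketch of the notifier wiring; the kasan_*_shadow_range helpers are hypothetical names standing in for the real shadow map/unmap implementation:

    static int kasan_mem_notifier(struct notifier_block *nb,
                                  unsigned long action, void *data)
    {
            struct memory_notify *mem = data;

            switch (action) {
            case MEM_GOING_ONLINE:
                    /* hypothetical helper: allocate + map shadow for the range */
                    if (kasan_map_shadow_range(mem->start_pfn, mem->nr_pages))
                            return NOTIFY_BAD;
                    break;
            case MEM_OFFLINE:
                    /* hypothetical helper: release the shadow again */
                    kasan_unmap_shadow_range(mem->start_pfn, mem->nr_pages);
                    break;
            }
            return NOTIFY_OK;
    }

    static int __init kasan_memhotplug_init(void)
    {
            hotplug_memory_notifier(kasan_mem_notifier, 0);
            return 0;
    }
    core_initcall(kasan_memhotplug_init);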
Link: http://lkml.kernel.org/r/20170601162338.23540-4-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Potapenko <glider@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We used to read several bytes of the shadow memory in advance. Therefore
additional shadow memory was mapped to prevent a crash if a speculative
load happened near the end of the mapped shadow memory.
Now we don't have such speculative loads, so we no longer need to map
additional shadow memory.
Link: http://lkml.kernel.org/r/20170601162338.23540-3-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We used to read several bytes of the shadow memory in advance. Therefore
additional shadow memory was mapped to prevent a crash if a speculative
load happened near the end of the mapped shadow memory.
Now we don't have such speculative loads, so we no longer need to map
additional shadow memory.
Link: http://lkml.kernel.org/r/20170601162338.23540-2-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Potapenko <glider@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For some unaligned memory accesses we have to check an additional byte of
the shadow memory. Currently we load that byte speculatively to have
only a single load + branch on the optimistic fast path.
However, this approach has some downsides:
- It's an unaligned access, so this prevents porting KASAN to
architectures which don't support unaligned accesses.
- We have to map an additional shadow page to prevent a crash if the
speculative load happens near the end of the mapped memory. This would
significantly complicate upcoming memory hotplug support.
I wasn't able to notice any performance degradation with this patch. So
these speculative loads are just a pain with no gain; let's remove them.
Link: http://lkml.kernel.org/r/20170601162338.23540-1-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 40f9fb8cffc6 ("mm/zsmalloc: support allocating obj with size of
ZS_MAX_ALLOC_SIZE") fixes a size calculation error that prevented
zsmalloc from allocating an object of the maximal size (ZS_MAX_ALLOC_SIZE).
I think, however, the fix is needlessly complicated.
This patch replaces the dynamic calculation of zs_size_classes at init
time with a compile-time calculation that uses the DIV_ROUND_UP() macro
already used in get_size_class_index().
[akpm@linux-foundation.org: use min_t] Link: http://lkml.kernel.org/r/20170630114859.1979-1-jmarchan@redhat.com Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Mahendran Ganesh <opensource.ganesh@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
attribute_groups are not supposed to change at runtime. All functions
working with attribute_groups provided by <linux/sysfs.h> work with
const attribute_group. So mark the non-const structs as const.
File size before:
text data bss dec hex filename
8293 841 4 9138 23b2 drivers/block/zram/zram_drv.o
File size After adding 'const':
text data bss dec hex filename
8357 777 4 9138 23b2 drivers/block/zram/zram_drv.o
Michal Hocko [Mon, 10 Jul 2017 22:50:12 +0000 (15:50 -0700)]
mm: disallow early_pfn_to_nid on configurations which do not implement it
early_pfn_to_nid will return node 0 if both HAVE_ARCH_EARLY_PFN_TO_NID
and HAVE_MEMBLOCK_NODE_MAP are disabled. It seems we are safe now
because all architectures which support NUMA define one of them (with the
exception of alpha, which however has CONFIG_NUMA marked as broken), so
this works as expected. It can get silently and subtly broken too
easily, though. Make sure we fail the compilation if NUMA is enabled
and there is no proper implementation for this function. If that ever
happens, we know that the specific configuration is invalid and the fix
should either disable NUMA or enable one of the above configs.
Link: http://lkml.kernel.org/r/20170704075803.15979-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Joonsoo Kim <js1304@gmail.com> Cc: Yang Shi <yang.shi@linaro.org> Cc: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Gleixner [Mon, 10 Jul 2017 22:50:09 +0000 (15:50 -0700)]
mm/memory-hotplug: switch locking to a percpu rwsem
Andrey reported a potential deadlock with the memory hotplug lock and
the cpu hotplug lock.
The reason is that memory hotplug takes the memory hotplug lock and then
calls stop_machine(), which calls get_online_cpus(). That's the reverse
lock order to get_online_cpus(); get_online_mems(); in mm/slab_common.c.
The problem has been there forever. The reason why this was never
reported is that the cpu hotplug locking had a homebrewn recursive
reader-writer semaphore construct which, due to the recursion, evaded
full lockdep coverage. The memory hotplug code copied that construct
verbatim and therefore has similar issues.
Three steps to fix this:
1) Convert the memory hotplug locking to a per cpu rwsem so the
potential issues get reported proper by lockdep.
2) Lock the online cpus in mem_hotplug_begin() before taking the memory
hotplug rwsem and use stop_machine_cpuslocked() in the page_alloc
code to avoid recursive locking.
3) The cpu hotplug locking in #2 causes a recursive locking of the cpu
hotplug lock via __offline_pages() -> lru_add_drain_all(). Solve this
by invoking lru_add_drain_all_cpuslocked() instead.
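Steps 1) and 2) end up with the lock ordering sketched below (simplified):

    void mem_hotplug_begin(void)
    {
            cpus_read_lock();                       /* cpu hotplug lock first */
            percpu_down_write(&mem_hotplug_lock);   /* then the memory hotplug rwsem */
    }

    void mem_hotplug_done(void)
    {
            percpu_up_write(&mem_hotplug_lock);
            cpus_read_unlock();
    }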
Link: http://lkml.kernel.org/r/20170704093421.506836322@linutronix.de Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Gleixner [Mon, 10 Jul 2017 22:50:06 +0000 (15:50 -0700)]
mm: swap: provide lru_add_drain_all_cpuslocked()
The rework of the cpu hotplug locking unearthed potential deadlocks with
the memory hotplug locking code.
The solution for these is to rework the memory hotplug locking code as
well and take the cpu hotplug lock before the memory hotplug lock in
mem_hotplug_begin(), but this will cause a recursive locking of the cpu
hotplug lock when the memory hotplug code calls lru_add_drain_all().
Split out the inner workings of lru_add_drain_all() into
lru_add_drain_all_cpuslocked() so this function can be invoked from the
memory hotplug code with the cpu hotplug lock held.
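A minimal sketch of the split (assumed shape; the actual draining logic is
unchanged and lives in the _cpuslocked variant):

  void lru_add_drain_all(void)
  {
          get_online_cpus();
          /* former body of lru_add_drain_all(), now callable with the lock held */
          lru_add_drain_all_cpuslocked();
          put_online_cpus();
  }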
Link: http://lkml.kernel.org/r/20170704093421.419329357@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__list_lru_walk_one() acquires the nlru spin lock (nlru->lock) for a long
duration if there are many items in the lru list. As per the
current code, it can hold the spin lock for up to UINT_MAX
entries at a time, so if the lru list has a large number of items a
"BUG: spinlock lockup suspected" splat is observed.
Fix this lockup by reducing the number of entries shrunk from
the lru list to 1024 at a time. Also, add cond_resched() before
processing the lru list again.
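A minimal sketch of the batched walk on the caller side (isolate_fn and
dispose_list() are placeholders for the existing isolate callback and
disposal code, not identifiers taken from the patch):

  do {
          LIST_HEAD(dispose);

          /* isolate at most 1024 entries so nlru->lock is dropped regularly */
          list_lru_walk(lru, isolate_fn, &dispose, 1024);
          dispose_list(&dispose);
          cond_resched();
  } while (list_lru_count(lru) > 0);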
Link: http://marc.info/?t=149722864900001&r=1&w=2 Link: http://lkml.kernel.org/r/1498707575-2472-1-git-send-email-stummala@codeaurora.org Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Suggested-by: Jan Kara <jack@suse.cz> Suggested-by: Vladimir Davydov <vdavydov.dev@gmail.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Alexander Polakov <apolyakov@beget.ru> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/list_lru.c: fix list_lru_count_node() to be race free
list_lru_count_node() iterates over all memcgs to get the total number of
entries on the node, but it can race with memcg_drain_all_list_lrus(),
which migrates the entries from a dead cgroup to another. This can cause
list_lru_count_node() to return an incorrect number of entries.
Fix this by keeping track of the number of entries per node and simply
returning it in list_lru_count_node().
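A minimal sketch of the counting side (assumed shape; nr_items stands for
the new per-node counter, maintained under nlru->lock on add/del/drain):

  unsigned long list_lru_count_node(struct list_lru *lru, int nid)
  {
          struct list_lru_node *nlru = &lru->node[nid];

          /* no memcg iteration, so no race with memcg_drain_all_list_lrus() */
          return nlru->nr_items;
  }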
Link: http://lkml.kernel.org/r/1498707555-30525-1-git-send-email-stummala@codeaurora.org Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Jan Kara <jack@suse.cz> Cc: Alexander Polakov <apolyakov@beget.ru> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/mmap.c: expand_downwards: don't require the gap if !vm_prev
expand_stack(vma) fails if address < stack_guard_gap even if there is no
vma->vm_prev. I don't think this makes sense, and we didn't do this
before the recent commit 1be7107fbe18 ("mm: larger stack guard gap,
between vmas").
We do not need a gap in this case; any address is fine as long as
security_mmap_addr() doesn't object.
This also simplifies the code: we know that address >= prev->vm_end and
thus underflow is not possible.
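A minimal sketch of the resulting check in expand_downwards() (simplified;
only the gap enforcement is shown):

  /* Enforce the guard gap only if there is a previous vma to keep a gap from */
  prev = vma->vm_prev;
  if (prev && !(prev->vm_flags & VM_GROWSDOWN)) {
          if (address - prev->vm_end < stack_guard_gap)
                  return -ENOMEM;
  }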
Link: http://lkml.kernel.org/r/20170628175258.GA24881@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Hugh Dickins <hughd@google.com> Cc: Larry Woodman <lwoodman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Mon, 10 Jul 2017 22:49:51 +0000 (15:49 -0700)]
mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
Commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas") has
introduced a regression in some Rust and Java environments which are
trying to implement their own stack guard page. They are punching a new
MAP_FIXED mapping inside the existing stack vma.
This will confuse expand_{downwards,upwards} into thinking that the
stack expansion would in fact get us too close to an existing non-stack
vma, which is correct behavior wrt safety. It is a real regression on
the other hand.
Let's work around the problem by considering a PROT_NONE mapping as part
of the stack. This is a gross hack, but overflowing into such a mapping
would trap anyway and we can only hope that userspace knows what it is
doing and handles it properly.
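A minimal sketch of the workaround in the gap check (assumed shape): the gap
is only enforced against a neighbouring vma that has some access rights, so
a PROT_NONE mapping is treated as part of the stack.

  prev = vma->vm_prev;
  if (prev && !(prev->vm_flags & VM_GROWSDOWN) &&
      (prev->vm_flags & (VM_READ | VM_WRITE | VM_EXEC))) {
          if (address - prev->vm_end < stack_guard_gap)
                  return -ENOMEM;
  }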
Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas") Link: http://lkml.kernel.org/r/20170705182849.GA18027@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Debugged-by: Vlastimil Babka <vbabka@suse.cz> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Willy Tarreau <w@1wt.eu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/balloon_compaction.c: enqueue zero page to balloon device
Presently, pages in the balloon device have random content, and these
pages will be scanned by ksmd on the host; they usually cannot be merged.
Enqueueing zeroed pages resolves this problem.
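One possible shape of the change in balloon_page_enqueue() (a sketch;
clear_highpage() is used here as the zeroing primitive, the exact upstream
diff may differ):

  /* zero the page before it joins the balloon so ksmd on the host can merge it */
  clear_highpage(page);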
Link: http://lkml.kernel.org/r/1498698637-26389-1-git-send-email-zhenwei.pi@youruncloud.com Signed-off-by: zhenwei.pi <zhenwei.pi@youruncloud.com> Cc: Gioh Kim <gi-oh.kim@profitbricks.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Rafael Aquini <aquini@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The align_offset parameter is used by bitmap_find_next_zero_area_off()
to represent the offset of map's base from the previous alignment
boundary; the function ensures that the returned index, plus the
align_offset, honors the specified align_mask.
The logic introduced by commit b5be83e308f7 ("mm: cma: align to physical
address, not CMA region position") has the cma driver calculate the
offset to the *next* alignment boundary. In most cases, the base
alignment is greater than that specified when making allocations,
resulting in a zero offset whether we align up or down. In the example
given with the commit, the base alignment (8MB) was half the requested
alignment (16MB) so the math also happened to work since the offset is
8MB in both directions. However, when requesting allocations with an
alignment greater than twice that of the base, the returned index would
not be correctly aligned.
Also, the align_order arguments of cma_bitmap_aligned_mask() and
cma_bitmap_aligned_offset() should not be negative so the argument type
was made unsigned.
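A minimal sketch of the corrected helper (assumed to mirror the
description; base_pfn and order_per_bit are the existing struct cma
fields): compute the base's offset from the previous alignment boundary
rather than the distance to the next one.

  static unsigned long cma_bitmap_aligned_offset(const struct cma *cma,
                                                 unsigned int align_order)
  {
          /* offset of the base from the previous align boundary, in bitmap units */
          return (cma->base_pfn & ((1UL << align_order) - 1))
                  >> cma->order_per_bit;
  }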
Fixes: b5be83e308f7 ("mm: cma: align to physical address, not CMA region position") Link: http://lkml.kernel.org/r/20170628170742.2895-1-opendmb@gmail.com Signed-off-by: Angus Clark <angus@angusclark.org> Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Gregory Fong <gregory.0xf0@gmail.com> Cc: Doug Berger <opendmb@gmail.com> Cc: Angus Clark <angus@angusclark.org> Cc: Laura Abbott <labbott@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Shiraz Hashim <shashim@codeaurora.org> Cc: Jaewon Kim <jaewon31.kim@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>