Sebastian Ott [Tue, 23 Oct 2012 18:28:37 +0000 (20:28 +0200)]
s390/dasd: fix multi-line printks with multiple KERN_<level>s
Do not use more than one KERN_<level> per printk.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Thu, 18 Oct 2012 16:10:06 +0000 (18:10 +0200)]
s390/traps: preinitialize program check table
Preinitialize the program check table, so we can put it into the
read-only data section.
Also use only four byte entries for the table, since each program
check handler resides within the first 2GB. Therefore this reduces
the size of the table by 50% on 64 bit builds.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Wed, 17 Oct 2012 10:18:05 +0000 (12:18 +0200)]
s390/mm,vmemmap: use 1MB frames for vmemmap
Use 1MB frames for vmemmap if EDAT1 is available in order to
reduce TLB pressure
Always use a 1MB frame even if its only partially needed for
struct pages. Otherwise we would end up with a mix of large
frame and page mappings, because vmemmap_populate gets called
for each section (256MB -> 3.5MB memmap) separately.
Worst case is that we would waste 512KB.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Linus Torvalds [Fri, 23 Nov 2012 07:45:34 +0000 (21:45 -1000)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
"This fixes recent regression where /dev/input/mice got assigned wrong
device node which messed up setups with static /dev, and a regression
in ads7846 GPIO debounce setup."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
ARM - OMAP: ads7846: fix pendown debounce setting
Input: ads7846 - enable pendown GPIO debounce time setting
Input: mousedev - move /dev/input/mice to the correct minor
Input: MT - document new 'flags' argument of input_mt_init_slots()
Linus Torvalds [Thu, 22 Nov 2012 19:22:13 +0000 (09:22 -1000)]
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A few more fixes for final 3.7. Two dealing with pinmux setup on
OMAP, and one dealing with TV output on DaVinci. And one small
MAINTAINER update."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: davinci: dm644x: fix out range signal for ED
ARM: OMAP4: TWL: mux sys_drm_msecure as output for PMIC
ARM: OMAP3: igep0020: Set WIFI/BT GPIO pins in correct mux mode
ARM: OMAP: Add maintainer entry for IGEP machines
Linus Torvalds [Thu, 22 Nov 2012 19:16:29 +0000 (09:16 -1000)]
Merge tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6
Pull PARISC fixes from James Bottomley:
"This is two bug fixes: one fixes a loophole where rt_sigprocmask()
with the wrong values panics the box (Denial of Service) and the other
fixes an aliasing problem with get_shared_area() which could cause
data corruption.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
* tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6:
[PARISC] fix user-triggerable panic on parisc
[PARISC] fix virtual aliasing issue in get_shared_area()
Linus Torvalds [Thu, 22 Nov 2012 19:14:54 +0000 (09:14 -1000)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a set of four bug fixes.
The isci one is an obvious thinko (using request buffer instead of
response buffer) which causes a command to fail.
The three others are DIF/DIX updates which are required because
they're part of a series of ten patches, the other seven of which went
into the block layer during the merge window meaning our current
DIF/DIX implementation is broken without these three.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
[SCSI] sd: Implement support for WRITE SAME
[SCSI] sd: Permit merged discard requests
[SCSI] Add a report opcode helper
[SCSI] isci: copy fis 0x34 response into proper buffer
Linus Torvalds [Thu, 22 Nov 2012 19:14:24 +0000 (09:14 -1000)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie.
Small fixes for (mostly Nouveau, some radeon) regressions.
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/nouveau: use the correct fence implementation for nv50
drm/radeon: add new SI pci id
radeon: add AGPMode 1 quirk for RV250
drm/radeon: properly track the crtc not_enabled case evergreen_mc_stop()
drm/nouveau/bios: fix DCB v1.5 parsing
drm/nouveau: add missing pll_calc calls
drm/nouveau: fix crash with noaccel=1
drm/nv40: allocate ctxprog with kmalloc
drm/nvc0/disp: fix thinko in vblank regression fix..
Dave Airlie [Thu, 22 Nov 2012 03:21:46 +0000 (13:21 +1000)]
Merge branch 'drm-fixes-3.7' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Alex writes:
A couple more small fixes for 3.7:
- another evergreen_mc fix
- add an AGP quirk for an old RV250
- new pci id.
* 'drm-fixes-3.7' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: add new SI pci id
radeon: add AGPMode 1 quirk for RV250
drm/radeon: properly track the crtc not_enabled case evergreen_mc_stop()
Commit 1fb3f8ca0e92 ("mm: compaction: capture a suitable high-order page
immediately when it is made available") took split_free_page() and
reused it for the compaction code. It does something curious with
capture_free_page() (previously known as split_free_page()):
int capture_free_page(struct page *page, int alloc_order,
...
__mod_zone_page_state(zone, NR_FREE_PAGES, -(1UL << order));
Note that expand() puts the pages _back_ in the allocator, but it does
not bump NR_FREE_PAGES. We "return" 'alloc_order' worth of pages, but
we accounted for removing 'order' in the __mod_zone_page_state() call.
For the old split_page()-style use (order==alloc_order) the bug will not
trigger. But, when called from the compaction code where we
occasionally get a larger page out of the buddy allocator than we need,
we will run in to this.
This patch simply changes the NR_FREE_PAGES manipulation to the correct
'alloc_order' instead of 'order'.
I've been able to repeatedly trigger this in my testing environment.
The amount "leaked" very closely tracks the imbalance I see in buddy
pages vs. NR_FREE_PAGES. I have confirmed that this patch fixes the
imbalance
1) inet6_csk_update_pmtu() must return NULL or non-NULL, so translate
ERR_PTR to NULL, as needed. Fix from Eric Dumazet.
2) Fix copy&paste error in IRDA sir_dev ->set_speed method invocation,
it was testing the NULL'ness of a different method to guard the
call. Fix from Alexander Shiyan.
3) Fix build regression of xilinx driver, from Jeff Mahoney.
4) Make XEN netfront (like XEN netback) handle compound pages in SKBs
properly. From Ian Campbell.
5) Fix inverted logic of team_dev_queue_xmit() return value checks,
from Jiri Pirko and Dan Carpenter.
6) dma_poll_create() no longer allows a NULL device argument, breaking
both ixp4xx drivers. Fix from Xi Wang.
7) ne2000 driver doesn't hook up the parent device properly, breaking
udev matching. Fix from Alan Cox.
8) Locking and memory leak fixes in Near Field Communications layer.
From Thierry Escande, Szymon Janc, and Waldemar Rymarkiewicz.
9) sis900 resume regression, sis900_set_mode() is being called with the
iomem pointer instead of the expected device private. Fix from
Francois Romieu.
10) Fix IBSS regression caused by uninitializing the ibss-internals
before performing an emptyness check, from Simon WUnderlich.
11) Fix SNIFFER mode regression in iwlwifi driver, from Johannes Berg.
12) Fix task wedges in mwifiex_cmd_timeout_func(), from Bing Zhao.
13) Add back wireless sysfs directory, too much stuff depends upon it
being there (actually I'd say it never should have been removed to
begin with). From Johannes Berg.
14) Fix hang introduced by suspend/resume changes in ath9k. Fix from
Sujith Manoharan.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (24 commits)
team: bcast: convert return value of team_dev_queue_xmit() to bool correctly
bonding: Bonding driver does not consider the gso_max_size/gso_max_segs setting of slave devices.
xen/netfront: handle compound page fragments on transmit
net: fix build failure in xilinx
irda: sir_dev: Fix copy/paste typo
ipv6: fix inet6_csk_update_pmtu() return value
ixp4xx_hss: avoid calling dma_pool_create() with NULL dev
ixp4xx_eth: avoid calling dma_pool_create() with NULL dev
ne2000: add the right platform device
of/net/mdio-gpio: Fix pdev->id issue when using devicetrees.
NFC: Fix pn533 target mode memory leak
NFC: pn533: Fix mem leak in pn533_in_dep_link_up
NFC: pn533: Fix use after free
NFC: pn533: Fix missing lock while operating on commands list
NFC: Fix nfc_llcp_local chained list insertion
ath9k_hw: Fix regression in device reset
sis900: fix sis900_set_mode call parameters.
iwlwifi: don't WARN when a non empty queue is disabled
wireless: add back sysfs directory
mwifiex: report error to MMC core if we cannot suspend
...
Olof Johansson [Wed, 21 Nov 2012 21:56:36 +0000 (13:56 -0800)]
Merge tag 'omap-for-v3.7-rc5/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
From Tony Lindgren:
Few more regression fixes related to u-boot only muxing
essential pins.
* tag 'omap-for-v3.7-rc5/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP4: TWL: mux sys_drm_msecure as output for PMIC
ARM: OMAP3: igep0020: Set WIFI/BT GPIO pins in correct mux mode
ARM: OMAP: Add maintainer entry for IGEP machines
It also revealed a bug in the OMAP GPIO handling code which prevented
the GPIO debounce clock to be disabled and CORE transition to low power
states.
Commit c9c55d9 (gpio/omap: fix off-mode bug: clear debounce settings on
free/reset) fixes the OMAP GPIO handling code by making sure that the
GPIO debounce clock gets disabled if no GPIO is requested from current
bank.
While fixing the OMAP GPIO handling code (in the right way), the above
commit makes the gpio_request->set_debounce->free sequence invalid as
after freeing the GPIO, the debounce settings are lost.
Fix the debounce settings by moving the debounce initialization to the
actual GPIO requesting code - the ads7846 driver.
Signed-off-by: Igor Grinberg <grinberg@compulab.co.il> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Igor Grinberg [Wed, 21 Nov 2012 07:00:10 +0000 (23:00 -0800)]
Input: ads7846 - enable pendown GPIO debounce time setting
Some platforms need the pendown GPIO debounce time setting programmed.
Since the pendown GPIO is handled by the driver, the debounce time
should also be handled along with the pendown GPIO request.
Signed-off-by: Igor Grinberg <grinberg@compulab.co.il> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Jiri Pirko [Wed, 21 Nov 2012 02:34:45 +0000 (02:34 +0000)]
team: bcast: convert return value of team_dev_queue_xmit() to bool correctly
The thing is that team_dev_queue_xmit() returns NET_XMIT_* or -E*.
bc_trasmit() should return true in case all went well. So use ! to get
correct retval from team_dev_queue_xmit() result.
This bug caused iface statistics to be badly computed.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
Chun-Yi Lee [Wed, 21 Nov 2012 11:26:09 +0000 (11:26 +0000)]
sign-file: fix the perl warning message when extracting ASN.1
There have the following warning message when running modules install
for sign ko files:
# make modules_install
...
INSTALL drivers/input/touchscreen/pcap_ts.ko
Found = in conditional, should be == at scripts/sign-file line 164.
Found = in conditional, should be == at scripts/sign-file line 161.
Found = in conditional, should be == at scripts/sign-file line 159.
This patch change replace '=' by '==' in elsif conditions for avoid the
above warning messages.
Signed-off-by: Chun-Yi Lee <jlee@suse.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sarveshwar Bandi [Wed, 21 Nov 2012 04:35:03 +0000 (04:35 +0000)]
bonding: Bonding driver does not consider the gso_max_size/gso_max_segs setting of slave devices.
Patch sets the lowest gso_max_size and gso_max_segs values of the slave devices during enslave and detach.
Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Wed, 21 Nov 2012 02:02:16 +0000 (02:02 +0000)]
xen/netfront: handle compound page fragments on transmit
An SKB paged fragment can consist of a compound page with order > 0.
However the netchannel protocol deals only in PAGE_SIZE frames.
Handle this in xennet_make_frags by iterating over the frames which
make up the page.
This is the netfront equivalent to 6a8ed462f16b for netback.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: netdev@vger.kernel.org Cc: xen-devel@lists.xen.org Cc: Eric Dumazet <edumazet@google.com> Cc: Konrad Rzeszutek Wilk <konrad@kernel.org> Cc: ANNIE LI <annie.li@oracle.com> Cc: Sander Eikelenboom <linux@eikelenboom.it> Cc: Stefan Bader <stefan.bader@canonical.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville [Wed, 21 Nov 2012 15:42:23 +0000 (10:42 -0500)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
John W. Linville says:
====================
This is a batch of fixes intended for 3.7...
Included are two pulls. Regarding the mac80211 tree, Johannes says:
"Please pull my mac80211.git tree (see below) to get two more fixes for
3.7. Both fix regressions introduced *before* this cycle that weren't
noticed until now, one for IBSS not cleaning up properly and the other
to add back the "wireless" sysfs directory for Fedora's startup scripts."
Regarding the iwlwifi tree, Johannes says:
"Please also pull my iwlwifi.git tree, I have two fixes: one to remove a
spurious warning that can actually trigger in legitimate situations, and
the other to fix a regression from when monitor mode was changed to use
the "sniffer" firmware mode."
Also included is an nfc tree pull. Samuel says:
"We mostly have pn533 fixes here, 2 memory leaks and an early unlocking fix.
Moreover, we also have an LLCP adapter linked list insertion fix."
On top of that, a few more bits... Albert Pool adds a USB ID
to rtlwifi. Bing Zhao provides two mwifiex fixes -- one to fix
a system hang during a command timeout, and the other to properly
report a suspend error to the MMC core. Finally, Sujith Manoharan
fixes a thinko that would trigger an ath9k hang during device reset.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch below does what Paul McKenney suggested in the previous thread.
Signed-off-by: Dave Jones <davej@redhat.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Paul Moore <paul@paul-moore.com> Cc: Eric Paris <eparis@parisplace.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Morris <james.l.morris@oracle.com>
Linus Torvalds [Wed, 21 Nov 2012 04:53:26 +0000 (18:53 -1000)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull missed powerpc fixes from Benjamin Herrenschmidt:
"Here are small 52xx fixes that Anatolij asked me to pull a while back
and that I completely missed. The stuff is local to that platform
code, and was in next for a while, so it should still go into 3.7."
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/mpc5200: move lpbfifo node and fix its interrupt property
powerpc: 52xx: nop out unsupported critical IRQs
powerpc/pcm030: add pcm030-audio-fabric to dts
Linus Torvalds [Wed, 21 Nov 2012 04:52:01 +0000 (18:52 -1000)]
Merge tag 'stable/for-linus-3.7-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen bug-fix from Konrad Rzeszutek Wilk:
- Fix regression introduced by commit ceb90fa0a800 ("xen/privcmd: add
PRIVCMD_MMAPBATCH_V2 ioctl").
* tag 'stable/for-linus-3.7-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/privcmd: Correctly return success from IOCTL_PRIVCMD_MMAPBATCH
Linus Torvalds [Wed, 21 Nov 2012 04:50:07 +0000 (18:50 -1000)]
Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM maintainership update from Avi Kivity:
"After many years of maintaining KVM, I am moving on. It was a real
pleasure for me to work with so many talented and dedicated hackers on
this project.
Replacing me will be one of those talented and dedicated hackers,
Gleb, who has authored hundreds of patches in and around KVM."
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: taking co-maintenance
KVM: Retire as maintainer
Linus Torvalds [Wed, 21 Nov 2012 04:49:32 +0000 (18:49 -1000)]
Merge tag 'iommu-fixes-v3.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
"Some fixes and a MAINTAINERS update to remove my lost AMD email
address from the file. The fixes take care of a resource leak and a
problem on VT-d with the new IOMMU group code."
* tag 'iommu-fixes-v3.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
intel-iommu: Fix lookup in add device
iommu/tegra-smmu.c: fix dentry reference leak in smmu_debugfs_stats_show().
iommu/amd: Update MAINTAINERS entry
Linus Torvalds [Wed, 21 Nov 2012 04:48:25 +0000 (18:48 -1000)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull reiserfs and ext3 fixes from Jan Kara:
"Fixes of reiserfs deadlocks when quotas are enabled (locking there was
completely busted by BKL conversion) and also one small ext3 fix in
the trim interface."
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
ext3: Avoid underflow of in ext3_trim_fs()
reiserfs: Move quota calls out of write lock
reiserfs: Protect reiserfs_quota_write() with write lock
reiserfs: Protect reiserfs_quota_on() with write lock
reiserfs: Fix lock ordering during remount
Mats Petersson [Fri, 16 Nov 2012 18:36:49 +0000 (18:36 +0000)]
xen/privcmd: Correctly return success from IOCTL_PRIVCMD_MMAPBATCH
This is a regression introduced by ceb90fa0 (xen/privcmd: add
PRIVCMD_MMAPBATCH_V2 ioctl). It broke xentrace as it used
xc_map_foreign() instead of xc_map_foreign_bulk().
Most code-paths prefer the MMAPBATCH_V2, so this wasn't very obvious
that it broke. The return value is set early on to -EINVAL, and if all
goes well, the "set top bits of the MFN's" never gets called, so the
return value is still EINVAL when the function gets to the end, causing
the caller to think it went wrong (which it didn't!)
Now also including Andres "move the ret = -EINVAL into the error handling
path, as this avoids other similar errors in future.
Signed-off-by: Mats Petersson <mats.petersson@citrix.com> Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Merge remote-tracking branch 'agust/merge' into merge
Anatolij 52xx updates:
Patch for pcm030 device tree fixing the probe() in pcm030-audio-fabric
driver. Changes to this driver have been merged in 3.7-rc1 via ASoC
tree, but this required device tree patch was submitted separately to
the linux-ppc list and is still missing in mainline. Without this patch
the probe() in pcm030-audio-fabric driver wrongly returns -ENODEV.
A patch from Wolfram fixing wrong invalid critical irq warnings for
all mpc5200 boards.
Another patch for all mpc5200 device trees fixing wrong L1 cell in
the LPB FIFO interrupt property and moving the LPB FIFO node to the
common mpc5200b.dtsi file so that this common node will be present
in all mpc5200 device trees.
Jeff Mahoney [Tue, 20 Nov 2012 10:23:13 +0000 (10:23 +0000)]
net: fix build failure in xilinx
Commit 71c6c837 (drivers/net: fix tasklet misuse issue) introduced a
build failure in the xilinx driver. axienet_dma_err_handler isn't
declared before its use in axienet_open.
This patch provides the prototype before axienet_open.
Cc: Xiaotian Feng <dannyfeng@tencent.com> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 20 Nov 2012 20:14:51 +0000 (15:14 -0500)]
ipv6: fix inet6_csk_update_pmtu() return value
In case of error, inet6_csk_update_pmtu() should consistently
return NULL.
Bug added in commit 35ad9b9cf7d8a
(ipv6: Add helper inet6_csk_update_pmtu().)
Reported-by: LluÃs Batlle i Rossell <viric@viric.name> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Bolle [Mon, 19 Nov 2012 20:17:31 +0000 (21:17 +0100)]
radeon: add AGPMode 1 quirk for RV250
The Intel 82855PM host bridge / Mobility FireGL 9000 RV250 combination
in an (outdated) ThinkPad T41 needs AGPMode 1 for suspend/resume (under
KMS, that is). So add a quirk for it.
(Change R250 to RV250 in comment for preceding quirk too.)
Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
of/net/mdio-gpio: Fix pdev->id issue when using devicetrees.
When the mdio-gpio driver is probed via device trees, the platform
device id is set as -1, However the pdev->id is re-used as bus-id for
while creating mdio gpio bus.
So
For device tree case the mdio-gpio bus name appears as "gpio-ffffffff"
where as
for non-device tree case the bus name appears as "gpio-<bus-num>"
Which means the bus_id is fixed in device tree case, so we can't have
two mdio gpio buses via device trees. Assigning a logical bus number
via device tree solves the problem and the bus name is much consistent
with non-device tree bus name.
Without this patch
1. we can't support two mdio-gpio buses via device trees.
2. we should always pass gpio-ffffffff as bus name to phy_connect, very
different to non-device tree bus name.
So, setting up the bus_id via aliases from device tree is the right
solution and other drivers do similar thing.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Szymon Janc [Mon, 29 Oct 2012 13:04:43 +0000 (14:04 +0100)]
NFC: pn533: Fix use after free
cmd was freed in pn533_dep_link_up regardless of
pn533_send_cmd_frame_async return code. Cmd is passed as argument to
pn533_in_dep_link_up_complete callback and should be freed there.
Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Szymon Janc [Thu, 25 Oct 2012 15:29:45 +0000 (17:29 +0200)]
NFC: pn533: Fix missing lock while operating on commands list
In pn533_wq_cmd command was removed from list without cmd_lock held
(race with pn533_send_cmd_frame_async) which could lead to list
corruption. Delete command from list before releasing lock.
Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Lukas Czerner [Thu, 11 Oct 2012 10:28:38 +0000 (12:28 +0200)]
ext3: Avoid underflow of in ext3_trim_fs()
Currently if len argument in ext3_trim_fs() is smaller than one block,
the 'end' variable underflow. Avoid that by returning EINVAL if len is
smaller than file system block.
Also remove useless unlikely().
Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
Jan Kara [Tue, 13 Nov 2012 16:05:14 +0000 (17:05 +0100)]
reiserfs: Move quota calls out of write lock
Calls into highlevel quota code cannot happen under the write lock. These
calls take dqio_mutex which ranks above write lock. So drop write lock
before calling back into quota code.
CC: stable@vger.kernel.org # >= 3.0 Signed-off-by: Jan Kara <jack@suse.cz>
Jan Kara [Tue, 13 Nov 2012 17:25:38 +0000 (18:25 +0100)]
reiserfs: Protect reiserfs_quota_write() with write lock
Calls into reiserfs journalling code and reiserfs_get_block() need to
be protected with write lock. We remove write lock around calls to high
level quota code in the next patch so these paths would suddently become
unprotected.
CC: stable@vger.kernel.org # >= 3.0 Signed-off-by: Jan Kara <jack@suse.cz>
Jan Kara [Tue, 13 Nov 2012 15:34:17 +0000 (16:34 +0100)]
reiserfs: Protect reiserfs_quota_on() with write lock
In reiserfs_quota_on() we do quite some work - for example unpacking
tail of a quota file. Thus we have to hold write lock until a moment
we call back into the quota code.
CC: stable@vger.kernel.org # >= 3.0 Signed-off-by: Jan Kara <jack@suse.cz>
Jan Kara [Tue, 13 Nov 2012 13:55:52 +0000 (14:55 +0100)]
reiserfs: Fix lock ordering during remount
When remounting reiserfs dquot_suspend() or dquot_resume() can be called.
These functions take dqonoff_mutex which ranks above write lock so we have
to drop it before calling into quota code.
CC: stable@vger.kernel.org # >= 3.0 Signed-off-by: Jan Kara <jack@suse.cz>
Lad, Prabhakar [Wed, 3 Oct 2012 06:35:00 +0000 (12:05 +0530)]
ARM: davinci: dm644x: fix out range signal for ED
Fix the video clock setting when custom timings are used with
pclock <= 27MHz. Existing video clock selection uses PLL2 mode
which results in a 54MHz clock whereas using the MXI mode results
in a 27MHz clock (which is the one actually desired).
This bug affects the Enhanced Definition (ED) support on DM644x.
Without this patch, out-range signals errors are were observed on
the TV when viewing ED. An out-of-range signal is often caused when
the field rate is above the rate that the television will handle.
Signed-off-by: Lad, Prabhakar <prabhakar.lad@ti.com> Signed-off-by: Manjunath Hadli <manjunath.hadli@ti.com> Cc: Sekhar Nori <nsekhar@ti.com>
[nsekhar@ti.com: reword commit message based on on-list discussion] Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Dmitry Torokhov [Fri, 16 Nov 2012 17:14:12 +0000 (09:14 -0800)]
Input: mousedev - move /dev/input/mice to the correct minor
When doing conversion to dynamic input numbers I inadvertently moved
/dev/input/mice from c,13,63 to c,13,31. We need to fix this so that
setups with statically populated /dev continue working.
Tested-by: Krzysztof Mazur <krzysiek@podlesie.net> Tested-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Tested-by: Jan Janssen <medhefgo@web.de> Cc: Daniele Venzano <venza@brownhat.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Marcin Slusarz [Sat, 17 Nov 2012 20:33:15 +0000 (21:33 +0100)]
drm/nouveau/bios: fix DCB v1.5 parsing
memcmp->nv_strncmp conversion, in addition to name change, should have
inverted the return value.
But nv_strncmp does not act like strncmp - it does not check for string
terminator, returns true/false instead of -1/0/1 and has different
parameters order.
Let's rename it to nv_memcmp and let it act like memcmp.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Marcin Slusarz [Sun, 11 Nov 2012 18:58:52 +0000 (19:58 +0100)]
drm/nv40: allocate ctxprog with kmalloc
Some archs defconfigs have CONFIG_FRAME_WARN set to 1024, which lead to this
warning:
drivers/gpu/drm/nouveau/core/engine/graph/ctxnv40.c: warning: the frame size
of 1184 bytes is larger than 1024 bytes
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Al Viro [Sun, 18 Nov 2012 19:19:00 +0000 (19:19 +0000)]
fanotify: fix FAN_Q_OVERFLOW case of fanotify_read()
If the FAN_Q_OVERFLOW bit set in event->mask, the fanotify event
metadata will not contain a valid file descriptor, but
copy_event_to_user() didn't check for that, and unconditionally does a
fd_install() on the file descriptor.
Which in turn will cause a BUG_ON() in __fd_install().
Introduced by commit 352e3b249284 ("fanotify: sanitize failure exits in
copy_event_to_user()")
Mea culpa - missed that path ;-/
Reported-by: Alex Shi <lkml.alex@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 18 Nov 2012 19:13:48 +0000 (09:13 -1000)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc VFS fixes from Al Viro:
"Remove a bogus BUG_ON() that can trigger spuriously + alpha bits of
do_mount() constification I'd missed during the merge window."
This pull request came in a week ago, I missed it for some reason.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
kill bogus BUG_ON() in do_close_on_exec()
missing const in alpha callers of do_mount()
Linus Torvalds [Sun, 18 Nov 2012 18:32:59 +0000 (08:32 -1000)]
Merge tag 'gpio-fixes-for-v3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull last minute GPIO fixes from Linus Walleij:
- Disable blinking on the Orion GPIO driver
- Two Kconfig-style fixes to avoid broken builds
* tag 'gpio-fixes-for-v3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio-mcp23s08: Build I2C support even when CONFIG_I2C=m
gpio: adnp: Depend on OF_GPIO instead of OF
mvebu-gpio: Disable blinking when enabling a GPIO for output
Linus Torvalds [Sun, 18 Nov 2012 18:29:34 +0000 (08:29 -1000)]
Merge tag 'for-linus-v3.7-rc7' of git://oss.sgi.com/xfs/xfs
Pull xfs bugfixes from Ben Myers:
- fix attr tree double split corruption
- fix broken error handling in xfs_vm_writepage
- drop buffer io reference when a bad bio is built
* tag 'for-linus-v3.7-rc7' of git://oss.sgi.com/xfs/xfs:
xfs: drop buffer io reference when a bad bio is built
xfs: fix broken error handling in xfs_vm_writepage
xfs: fix attr tree double split corruption
Linus Torvalds [Sun, 18 Nov 2012 18:26:35 +0000 (08:26 -1000)]
Merge tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
Pull libata fixes from Jeff Garzik:
"If you were going to shoot me for not sending these earlier, you would
be right. -rc6 beat me by ~2 hours it seems, and they really should
have gone out long before that.
These have been in libata-dev.git for a day or so (unfortunately
linux-next is on vacation). The main one is #1, with the others being
minor bits. #1 has multiple tested-by, and can be considered a
regression fix IMO.
1) Fix ACPI oops:
https://bugzilla.kernel.org/show_bug.cgi?id=48211
2) Temporary WARN_ONCE() debugging patch for further ACPI debugging.
The code already oopses here, and so this merely gives slightly
better info. Related to
https://bugzilla.kernel.org/show_bug.cgi?id=49151
which has been bisected down to a patch that _exposes_ a latest
bug, but said bisection target does not actually appear to be the
root cause itself.
3) sata_svw: fix longstanding error recovery bug, which was
preventing kdump, by adding missing DMA-start bit check. Core
code was already checking DMA-start, but ancillary, less-used
routines were not. Fixed.
4) sata_highbank: fix minor __init/__devinit warning
5) Fix minor warning, if CONFIG_PM is set, but CONFIG_PM_SLEEP is not
set
* tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] PM callbacks should be conditionally compiled on CONFIG_PM_SLEEP
sata_svw: check DMA start bit before reset
libata debugging: Warn when unable to find timing descriptor based on xfer_mode
sata_highbank: mark ahci_highbank_probe as __devinit
pata_arasan: Initialize cf clock to 166MHz
libata-acpi: Fix NULL ptr derference in ata_acpi_dev_handle
Andreas Schwab [Sat, 17 Nov 2012 21:27:04 +0000 (22:27 +0100)]
m68k: fix sigset_t accessor functions
The sigaddset/sigdelset/sigismember functions that are implemented with
bitfield insn cannot allow the sigset argument to be placed in a data
register since the sigset is wider than 32 bits. Remove the "d"
constraint from the asm statements.
The effect of the bug is that sending RT signals does not work, the signal
number is truncated modulo 32.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: stable@vger.kernel.org
Daniel M. Weeks [Wed, 7 Nov 2012 04:51:05 +0000 (23:51 -0500)]
gpio-mcp23s08: Build I2C support even when CONFIG_I2C=m
The driver has both SPI and I2C pieces. The appropriate pieces are built based
on whether SPI and/or I2C is/are enabled. However, it was only checking if I2C
was built-in, never if it was built as a module. This patch checks for either
since building both this driver and I2C as modules is possible.
Signed-off-by: Daniel M. Weeks <dan@danweeks.net> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Thierry Reding [Thu, 1 Nov 2012 10:22:11 +0000 (11:22 +0100)]
gpio: adnp: Depend on OF_GPIO instead of OF
The driver accesses the of_node field of struct gpio_chip, which is only
available if OF_GPIO is selected. This solves a build issue on SPARC
which conflicts with OF_GPIO and therefore does not provide this field.
Jamie Lentin [Sun, 28 Oct 2012 12:23:24 +0000 (12:23 +0000)]
mvebu-gpio: Disable blinking when enabling a GPIO for output
The plat-orion GPIO driver would disable any pin blinking whenever
using a pin for output. Do the same here, as a blinking LED will
continue to blink regardless of what the GPIO pin level is.
Dave Chinner [Mon, 12 Nov 2012 11:09:46 +0000 (22:09 +1100)]
xfs: drop buffer io reference when a bad bio is built
Error handling in xfs_buf_ioapply_map() does not handle IO reference
counts correctly. We increment the b_io_remaining count before
building the bio, but then fail to decrement it in the failure case.
This leads to the buffer never running IO completion and releasing
the reference that the IO holds, so at unmount we can leak the
buffer. This leak is captured by this assert failure during unmount:
This is not a new bug - the b_io_remaining accounting has had this
problem for a long, long time - it's just very hard to get a
zero length bio being built by this code...
Further, the buffer IO error can be overwritten on a multi-segment
buffer by subsequent bio completions for partial sections of the
buffer. Hence we should only set the buffer error status if the
buffer is not already carrying an error status. This ensures that a
partial IO error on a multi-segment buffer will not be lost. This
part of the problem is a regression, however.
cc: <stable@vger.kernel.org> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Mon, 12 Nov 2012 11:09:45 +0000 (22:09 +1100)]
xfs: fix broken error handling in xfs_vm_writepage
When we shut down the filesystem, it might first be detected in
writeback when we are allocating a inode size transaction. This
happens after we have moved all the pages into the writeback state
and unlocked them. Unfortunately, if we fail to set up the
transaction we then abort writeback and try to invalidate the
current page. This then triggers are BUG() in block_invalidatepage()
because we are trying to invalidate an unlocked page.
Fixing this is a bit of a chicken and egg problem - we can't
allocate the transaction until we've clustered all the pages into
the IO and we know the size of it (i.e. whether the last block of
the IO is beyond the current EOF or not). However, we don't want to
hold pages locked for long periods of time, especially while we lock
other pages to cluster them into the write.
To fix this, we need to make a clear delineation in writeback where
errors can only be handled by IO completion processing. That is,
once we have marked a page for writeback and unlocked it, we have to
report errors via IO completion because we've already started the
IO. We may not have submitted any IO, but we've changed the page
state to indicate that it is under IO so we must now use the IO
completion path to report errors.
To do this, add an error field to xfs_submit_ioend() to pass it the
error that occurred during the building on the ioend chain. When
this is non-zero, mark each ioend with the error and call
xfs_finish_ioend() directly rather than building bios. This will
immediately push the ioends through completion processing with the
error that has occurred.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Mon, 12 Nov 2012 11:09:44 +0000 (22:09 +1100)]
xfs: fix attr tree double split corruption
In certain circumstances, a double split of an attribute tree is
needed to insert or replace an attribute. In rare situations, this
can go wrong, leaving the attribute tree corrupted. In this case,
the attr being replaced is the last attr in a leaf node, and the
replacement is larger so doesn't fit in the same leaf node.
When we have the initial condition of a node format attribute
btree with two leaves at index 1 and 2. Call them L1 and L2. The
leaf L1 is completely full, there is not a single byte of free space
in it. L2 is mostly empty. The attribute being replaced - call it X
- is the last attribute in L1.
The way an attribute replace is executed is that the replacement
attribute - call it Y - is first inserted into the tree, but has an
INCOMPLETE flag set on it so that list traversals ignore it. Once
this transaction is committed, a second transaction it run to
atomically mark Y as COMPLETE and X as INCOMPLETE, so that a
traversal will now find Y and skip X. Once that transaction is
committed, attribute X is then removed.
So now we go to replace X, and see that L1:fsp = 0 - it is full so
we can't insert Y in the same leaf. So we record the the location of
attribute X so we can track it for later use, then we split L1 into
L1 and L3 and reblance across the two leafs. We end with:
And we track that the original attribute is now at L3:0.
We then try to insert Y into L1 again, and find that there isn't
enough room because the new attribute is larger than the old one.
Hence we have to split again to make room for Y. We end up with
this:
And now we have the new (incomplete) attribute @ L4:0, and the
original attribute at L3:0. At this point, the first transaction is
committed, and we move to the flipping of the flags.
This is where we are supposed to end up with this:
But that doesn't happen properly - the attribute tracking indexes
are not pointing to the right locations. What we end up with is both
the old attribute to be removed pointing at L4:0 and the new
attribute at L4:1. On a debug kernel, this assert fails like so:
because the new attribute location does not exist. On a production
kernel, this goes unnoticed and the code proceeds ahead merrily and
removes L4 because it thinks that is the block that is no longer
needed. This leaves the hash index node pointing to entries
L1, L4 and L2, but only blocks L1, L3 and L2 to exist. Further, the
leaf level sibling list is L1 <-> L4 <-> L2, but L4 is now free
space, and so everything is busted. This corruption is caused by the
removal of the old attribute triggering a join - it joins everything
correctly but then frees the wrong block.
xfs_repair will report something like:
bad sibling back pointer for block 4 in attribute fork for inode 131
problem with attribute contents in inode 131
would clear attr fork
bad nblocks 8 for inode 131, would reset to 3
bad anextents 4 for inode 131, would reset to 0
The problem lies in the assignment of the old/new blocks for
tracking purposes when the double leaf split occurs. The first split
tries to place the new attribute inside the current leaf (i.e.
"inleaf == true") and moves the old attribute (X) to the new block.
This sets up the old block/index to L1:X, and newly allocated
block to L3:0. It then moves attr X to the new block and tries to
insert attr Y at the old index. That fails, so it splits again.
With the second split, the rebalance ends up placing the new attr in
the second new block - L4:0 - and this is where the code goes wrong.
What is does is it sets both the new and old block index to the
second new block. Hence it inserts attr Y at the right place (L4:0)
but overwrites the current location of the attr to replace that is
held in the new block index (currently L3:0). It over writes it with
L4:1 - the index we later assert fail on.
Hopefully this table will show this in a foramt that is a bit easier
to understand:
Split old attr index new attr index
vanilla patched vanilla patched
before 1st L1:26 L1:26 N/A N/A
after 1st L3:0 L3:0 L1:26 L1:26
after 2nd L4:0 L3:0 L4:1 L4:0
^^^^ ^^^^
wrong wrong
The fix is surprisingly simple, for all this analysis - just stop
the rebalance on the out-of leaf case from overwriting the new attr
index - it's already correct for the double split case.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Linus Torvalds [Fri, 16 Nov 2012 23:26:38 +0000 (15:26 -0800)]
Merge branch 'akpm' (Fixes from Andrew)
Merge misc fixes from Andrew Morton.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (12 patches)
revert "mm: fix-up zone present pages"
tmpfs: change final i_blocks BUG to WARNING
tmpfs: fix shmem_getpage_gfp() VM_BUG_ON
mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address
mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"
rapidio: fix kernel-doc warnings
swapfile: fix name leak in swapoff
memcg: fix hotplugged memory zone oops
mips, arc: fix build failure
memcg: oom: fix totalpages calculation for memory.swappiness==0
mm: fix build warning for uninitialized value
mm: add anon_vma_lock to validate_mm()
Andrew Morton [Fri, 16 Nov 2012 22:15:06 +0000 (14:15 -0800)]
revert "mm: fix-up zone present pages"
Revert commit 7f1290f2f2a4 ("mm: fix-up zone present pages")
That patch tried to fix a issue when calculating zone->present_pages,
but it caused a regression on 32bit systems with HIGHMEM. With that
change, reset_zone_present_pages() resets all zone->present_pages to
zero, and fixup_zone_present_pages() is called to recalculate
zone->present_pages when the boot allocator frees core memory pages into
buddy allocator. Because highmem pages are not freed by bootmem
allocator, all highmem zones' present_pages becomes zero.
Various options for improving the situation are being discussed but for
now, let's return to the 3.6 code.
Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Petr Tesarik <ptesarik@suse.cz> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Chris Clayton <chris2553@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Fri, 16 Nov 2012 22:15:04 +0000 (14:15 -0800)]
tmpfs: change final i_blocks BUG to WARNING
Under a particular load on one machine, I have hit shmem_evict_inode()'s
BUG_ON(inode->i_blocks), enough times to narrow it down to a particular
race between swapout and eviction.
It comes from the "if (freed > 0)" asymmetry in shmem_recalc_inode(),
and the lack of coherent locking between mapping's nrpages and shmem's
swapped count. There's a window in shmem_writepage(), between lowering
nrpages in shmem_delete_from_page_cache() and then raising swapped
count, when the freed count appears to be +1 when it should be 0, and
then the asymmetry stops it from being corrected with -1 before hitting
the BUG.
One answer is coherent locking: using tree_lock throughout, without
info->lock; reasonable, but the raw_spin_lock in percpu_counter_add() on
used_blocks makes that messier than expected. Another answer may be a
further effort to eliminate the weird shmem_recalc_inode() altogether,
but previous attempts at that failed.
So far undecided, but for now change the BUG_ON to WARN_ON: in usual
circumstances it remains a useful consistency check.
Thanks to Johannes for pointing to truncation: free_swap_and_cache()
only does a trylock on the page, so the page lock we've held since
before confirming swap is not enough to protect against truncation.
What cleanup is needed in this case? Just delete_from_swap_cache(),
which takes care of the memcg uncharge.
Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Dave Jones <davej@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Deacon [Fri, 16 Nov 2012 22:15:00 +0000 (14:15 -0800)]
mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address
kmap_to_page returns the corresponding struct page for a virtual address
of an arbitrary mapping. This works by checking whether the address
falls in the pkmap region and using the pkmap page tables instead of the
linear mapping if appropriate.
Unfortunately, the bounds checking means that PKMAP_ADDR(LAST_PKMAP) is
incorrectly treated as a highmem address and we can end up walking off
the end of pkmap_page_table and subsequently passing junk to pte_page.
This patch fixes the bound check to stay within the pkmap tables.
Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Fri, 16 Nov 2012 22:14:59 +0000 (14:14 -0800)]
mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"
Jiri Slaby reported the following:
(It's an effective revert of "mm: vmscan: scale number of pages
reclaimed by reclaim/compaction based on failures".) Given kswapd
had hours of runtime in ps/top output yesterday in the morning
and after the revert it's now 2 minutes in sum for the last 24h,
I would say, it's gone.
The intention of the patch in question was to compensate for the loss of
lumpy reclaim. Part of the reason lumpy reclaim worked is because it
aggressively reclaimed pages and this patch was meant to be a sane
compromise.
When compaction fails, it gets deferred and both compaction and
reclaim/compaction is deferred avoid excessive reclaim. However, since
commit c654345924f7 ("mm: remove __GFP_NO_KSWAPD"), kswapd is woken up
each time and continues reclaiming which was not taken into account when
the patch was developed.
Attempts to address the problem ended up just changing the shape of the
problem instead of fixing it. The release window gets closer and while
a THP allocation failing is not a major problem, kswapd chewing up a lot
of CPU is.
This patch reverts commit 83fde0f22872 ("mm: vmscan: scale number of
pages reclaimed by reclaim/compaction based on failures") and will be
revisited in the future.
Randy Dunlap [Fri, 16 Nov 2012 22:14:56 +0000 (14:14 -0800)]
rapidio: fix kernel-doc warnings
Fix rapidio kernel-doc warnings:
Warning(drivers/rapidio/rio.c:415): No description found for parameter 'local'
Warning(drivers/rapidio/rio.c:415): Excess function parameter 'lstart' description in 'rio_map_inb_region'
Warning(include/linux/rio.h:290): No description found for parameter 'switches'
Warning(include/linux/rio.h:290): No description found for parameter 'destid_table'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Matt Porter <mporter@kernel.crashing.org> Acked-by: Alexandre Bounine <alexandre.bounine@idt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Fri, 16 Nov 2012 22:14:54 +0000 (14:14 -0800)]
memcg: fix hotplugged memory zone oops
When MEMCG is configured on (even when it's disabled by boot option),
when adding or removing a page to/from its lru list, the zone pointer
used for stats updates is nowadays taken from the struct lruvec. (On
many configurations, calculating zone from page is slower.)
But we have no code to update all the lruvecs (per zone, per memcg) when
a memory node is hotadded. Here's an extract from the oops which
results when running numactl to bind a program to a newly onlined node:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000f60
IP: __mod_zone_page_state+0x9/0x60
Pid: 1219, comm: numactl Not tainted 3.6.0-rc5+ #180 Bochs Bochs
Process numactl (pid: 1219, threadinfo ffff880039abc000, task ffff8800383c4ce0)
Call Trace:
__pagevec_lru_add_fn+0xdf/0x140
pagevec_lru_move_fn+0xb1/0x100
__pagevec_lru_add+0x1c/0x30
lru_add_drain_cpu+0xa3/0x130
lru_add_drain+0x2f/0x40
...
The natural solution might be to use a memcg callback whenever memory is
hotadded; but that solution has not been scoped out, and it happens that
we do have an easy location at which to update lruvec->zone. The lruvec
pointer is discovered either by mem_cgroup_zone_lruvec() or by
mem_cgroup_page_lruvec(), and both of those do know the right zone.
So check and set lruvec->zone in those; and remove the inadequate
attempt to set lruvec->zone from lruvec_init(), which is called before
NODE_DATA(node) has been allocated in such cases.
Ah, there was one exceptionr. For no particularly good reason,
mem_cgroup_force_empty_list() has its own code for deciding lruvec.
Change it to use the standard mem_cgroup_zone_lruvec() and
mem_cgroup_get_lru_size() too. In fact it was already safe against such
an oops (the lru lists in danger could only be empty), but we're better
proofed against future changes this way.
I've marked this for stable (3.6) since we introduced the problem in 3.5
(now closed to stable); but I have no idea if this is the only fix
needed to get memory hotadd working with memcg in 3.6, and received no
answer when I enquired twice before.
Michal Hocko [Fri, 16 Nov 2012 22:14:49 +0000 (14:14 -0800)]
memcg: oom: fix totalpages calculation for memory.swappiness==0
oom_badness() takes a totalpages argument which says how many pages are
available and it uses it as a base for the score calculation. The value
is calculated by mem_cgroup_get_limit which considers both limit and
total_swap_pages (resp. memsw portion of it).
This is usually correct but since fe35004fbf9e ("mm: avoid swapping out
with swappiness==0") we do not swap when swappiness is 0 which means
that we cannot really use up all the totalpages pages. This in turn
confuses oom score calculation if the memcg limit is much smaller than
the available swap because the used memory (capped by the limit) is
negligible comparing to totalpages so the resulting score is too small
if adj!=0 (typically task with CAP_SYS_ADMIN or non zero oom_score_adj).
A wrong process might be selected as result.
The problem can be worked around by checking mem_cgroup_swappiness==0
and not considering swap at all in such a case.
Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Fri, 16 Nov 2012 22:14:48 +0000 (14:14 -0800)]
mm: fix build warning for uninitialized value
do_wp_page() sets mmun_called if mmun_start and mmun_end were
initialized and, if so, may call mmu_notifier_invalidate_range_end()
with these values. This doesn't prevent gcc from emitting a build
warning though:
mm/memory.c: In function `do_wp_page':
mm/memory.c:2530: warning: `mmun_start' may be used uninitialized in this function
mm/memory.c:2531: warning: `mmun_end' may be used uninitialized in this function
It's much easier to initialize the variables to impossible values and do
a simple comparison to determine if they were initialized to remove the
bool entirely.
Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>