Eric Dumazet [Wed, 25 Jan 2012 03:56:30 +0000 (03:56 +0000)]
be2net: allocate more headroom in incoming skbs
Allocation of 64 bytes in skb headroom is not enough if we have to pull
ethernet + ipv6 + tcp headers, and/or an extra tunneling header.
It's currently not noticed because netdev_alloc_skb_ip_align(64) gives us
more room, thanks to power-of-two kmalloc() roundups.
Make sure we ask for 128 bytes so that side effects of upcoming patches
from Ian Campbell don't decrease benet rx performance, because of extra
skb head reallocations.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Cc: Vasundhara Volam <vasundhara.volam@emulex.com> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
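As a rough illustration of the arithmetic behind the be2net change above, the sketch below adds up minimum header sizes. The helper names (`headers_len`, `fits`) are hypothetical and exist only for this example; the sizes are the standard fixed header lengths without VLAN tags, IPv6 extension headers, or tunneling.

```c
#include <stddef.h>

/* Standard fixed header sizes in bytes. */
enum {
    ETH_HDR_LEN  = 14,  /* Ethernet II header, no VLAN tag */
    IPV6_HDR_LEN = 40,  /* fixed IPv6 header, no extension headers */
    TCP_HDR_MIN  = 20,  /* TCP header without options */
    TCP_HDR_MAX  = 60   /* TCP header with the maximum 40 bytes of options */
};

/* Hypothetical helper: bytes needed to hold Ethernet + IPv6 + TCP
 * headers plus an optional tunneling header. */
static size_t headers_len(size_t tcp_hdr_len, size_t tunnel_hdr_len)
{
    return ETH_HDR_LEN + IPV6_HDR_LEN + tcp_hdr_len + tunnel_hdr_len;
}

/* Hypothetical helper: returns 1 if `room` bytes hold those headers
 * without a reallocation, 0 otherwise. */
static int fits(size_t room, size_t tcp_hdr_len, size_t tunnel_hdr_len)
{
    return headers_len(tcp_hdr_len, tunnel_hdr_len) <= room;
}
```

Even the smallest Ethernet + IPv6 + TCP combination needs 74 bytes, which is more than 64, while 128 bytes leaves room for full TCP options (114 bytes) and a small tunneling header.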
Ariel Elior [Thu, 26 Jan 2012 06:01:54 +0000 (06:01 +0000)]
bnx2x: Update version to 1.72.0 and copyrights
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:53 +0000 (06:01 +0000)]
bnx2x: Recoverable and unrecoverable error statistics
Add statistics for tracking parity errors from which we successfully
recovered and those which were deemed unrecoverable.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:52 +0000 (06:01 +0000)]
bnx2x: Recovery flow bug fixes
1. Sample mcp pulse and mcp sequence in nic load instead of in init_one
as they may change by the time we want to use them.
2. Allow cnic to access the device during nic load (by adding a new "LOADING" state
to the recovery flow). This prevents the unnecessary cnic timeout that resulted
from cnic attempting to access the device because the nic is loading, but being
blocked because of the Recovery state.
3. Issue a 'fake' driver load command to mcp when the last driver unloads to prevent
mcp from taking ownership. When recovery is complete, unload the fake driver to
allow mcp to initialize the hardware before the first driver loads.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:51 +0000 (06:01 +0000)]
bnx2x: Track active PFs with bitmap
The recovery register (to which a hardware lock has been added in the previous
patch) is used, amongst other things, to track the active PFs. The old
implementation, which used a per-path counter, is not viable in a virtualized
environment where a PF may increment the counter and then have the kernel
crash around it, preventing the counter from ever reaching zero.
In the new implementation the scenario described will result in the PF timing
out against the mcp, which will clear the PF's bit in the bitmask, allowing
the recovery process to proceed.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
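The difference between a counter and a bitmap can be sketched in a few lines. This is a minimal userspace model, assuming one bit per PF in a 32-bit word; the type and function names are hypothetical and do not reflect the actual bnx2x register layout or API.

```c
#include <stdint.h>

/* Hypothetical stand-in for the shared recovery register: one bit per PF.
 * Callers are assumed to pass pf < 32. */
typedef uint32_t pf_bitmap_t;

/* Mark a PF as active. */
static pf_bitmap_t pf_set_active(pf_bitmap_t map, unsigned pf)
{
    return map | (UINT32_C(1) << pf);
}

/* Clear a PF's bit. Clearing is idempotent, so a third party (such as the
 * mcp after a timeout) can clear a crashed PF's bit safely, which a shared
 * counter cannot express. */
static pf_bitmap_t pf_clear_active(pf_bitmap_t map, unsigned pf)
{
    return map & ~(UINT32_C(1) << pf);
}

/* Recovery may proceed only once no PF is marked active. */
static int pf_any_active(pf_bitmap_t map)
{
    return map != 0;
}
```

With a counter, nobody can tell which PF failed to decrement; with a bitmap, the bit identifies the owner, so it can be cleared on that PF's behalf.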
Ariel Elior [Thu, 26 Jan 2012 06:01:50 +0000 (06:01 +0000)]
bnx2x: Lock PF-common resources
Use hardware locks to protect resources common to several Physical Functions. In
a virtualized environment the RTNL lock only protects a PF's driver against
the PFs sharing its VM with regard to device resources. Other PFs may reside
in other VMs under other OSs, and are not subject to the lock. Such resources
which were previously protected implicitly by the RTNL lock must now be
protected explicitly with dedicated HW locks.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:49 +0000 (06:01 +0000)]
bnx2x: Loaded Firmware Version Validation
In a virtualized environment it is possible for a loading driver to discover
that Firmware is already loaded to the device, and that this FW does not match
its own. This can happen, for example, if different Physical Functions are
assigned to different VMs in which different driver versions are loaded. The
code in this patch ensures that only drivers with matching FW are loaded over
the device, and that in the case described above, where the Firmware version
doesn't match, the driver load is aborted.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:48 +0000 (06:01 +0000)]
bnx2x: Function Level Reset Final Cleanup
1. Fix bug where return value is ignored
2. Improve printouts
3. Fix typos
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:47 +0000 (06:01 +0000)]
bnx2x: Obtain Bus Device Function from register
BDF was obtained from the kernel, but since in a virtualized environment
(e.g. physical device assignment in KVM) the function number may
not be the real one, the info must be obtained from the device.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:46 +0000 (06:01 +0000)]
bnx2x: Removing indirect register access
In virtualized environments indirect access to the device may not be supported
(depending on the Hypervisor type). Indirect device access was used since in
some hardware contexts (i.e. certain chipset and BIOS) every access the driver
makes across the pci is followed by a BIOS initiated Zero Length Read to the
same address. When accessing widebus registers this zero length read corrupts
the serialization of the read/write sequence, resulting in errors. To avoid
this problem widebus registers are always accessed via the DMAE or the indirect
interface. However, the 57712x and 578xx devices intercept the zero length read
and so using the indirect interface with these devices is not necessary. Since
PDA is only supported for 57712x and 578xx the indirect access to device was
restricted to 57710 and 57711x.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Thu, 26 Jan 2012 06:01:45 +0000 (06:01 +0000)]
bnx2x: Support Queue Per Cos in 5771xx devices
Enable the use of up to three hardware queues for transmission. The queues
are always dequeued round robin (i.e. strict priority, PFC and ETS are not
supported). This does allow the allocation of a separate HW queue for low
volume, high priority traffic which will be serviced more promptly.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Fri, 16 Dec 2011 00:46:22 +0000 (00:46 +0000)]
e1000e: 82574/82583 Tx hang workaround
On 82574/82583, there is a hardware bug which might cause a Tx hang when
the internal buffer is full. Setting this bit enables a hardware fix to
work around the issue.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:17 +0000 (00:46 +0000)]
e1000e: use hardware default values for Transmit Control register
This code snippet is simply writing default values to the register which is
unnecessary since the values are programmed into the register by default.
There is a special case for 80003es2lan needing the Retransmit on Late
Collision bit set but that is also done in e1000_init_hw_80003es2lan().
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:12 +0000 (00:46 +0000)]
e1000e: use default settings for Tx Inter Packet Gap timer
Use the default hardware values for TIPG except for 80003es2lan(*). The
code that is removed in this patch is either unnecessarily writing the TIPG
register with the hardware default values for some devices (82571/2/3/4) or
writing the wrong value for others (ICH/PCH LOMs). The only change in
functionality is setting the correct default TIPG for the latter devices.
(*) The correct value for 80003es2lan is already set properly in
e1000_init_hw_80003es2lan() and e1000_cfg_kmrn_{10_100|1000}_80003es2lan(),
and the unused flag FLAG_TIPG_MEDIUM_FOR_80003ESLAN is removed.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:06 +0000 (00:46 +0000)]
e1000e: 82579: workaround for link drop issue
When connected to certain switches, the 82579 PHY might drop link
unexpectedly. Work around the issue by setting the Mean Square Error
higher than the hardware default.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:46:01 +0000 (00:46 +0000)]
e1000e: always set transmit descriptor control registers the same
The hardware erratum workaround where the TXDCTL register must be the same
setting for both queues should always be done.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:45:56 +0000 (00:45 +0000)]
e1000e: default IntMode based on kernel config & available hardware support
Based on a patch from Prabhakar Kushwaha <prabhakar@freescale.com>, set
appropriate default interrupt mode dependent on whether CONFIG_PCI_MSI
is enabled in the kernel configuration and if the hardware supports
MSI-X. Set the module parameter log message accordingly.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Cc: Jin Qing <b24347@freescale.com> Cc: Prabhakar Kushwaha <prabhakar@freescale.com> Cc: Jin Qing <b24347@freescale.com> Cc: Kumar Gala <galak@kernel.crashing.org> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
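The selection described in the IntMode commit above might look roughly like the sketch below. The numeric values follow the e1000e IntMode module parameter convention (0 = legacy, 1 = MSI, 2 = MSI-X); the function name and the exact fallback order are assumptions made for illustration, not taken from the patch.

```c
/* Interrupt modes, numbered as in the e1000e IntMode module parameter. */
enum int_mode {
    INT_MODE_LEGACY = 0,
    INT_MODE_MSI    = 1,
    INT_MODE_MSIX   = 2
};

/* Hypothetical helper: pick a default interrupt mode.
 * pci_msi_enabled models whether CONFIG_PCI_MSI is set in the kernel
 * configuration; hw_has_msix models whether the device supports MSI-X.
 * The fallback order is an assumption. */
static enum int_mode default_int_mode(int pci_msi_enabled, int hw_has_msix)
{
    if (!pci_msi_enabled)
        return INT_MODE_LEGACY;   /* no MSI support compiled in */
    return hw_has_msix ? INT_MODE_MSIX : INT_MODE_MSI;
}
```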
Bruce Allan [Fri, 16 Dec 2011 00:45:51 +0000 (00:45 +0000)]
e1000e: re-factor ethtool get/set ring parameter
Make it more like how igb does it, with some additional error checking.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:45:45 +0000 (00:45 +0000)]
e1000e: pass pointer to ring struct instead of adapter struct
For ring-specific functions, pass a pointer to the ring struct instead of a
pointer to the adapter struct.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 16 Dec 2011 00:45:40 +0000 (00:45 +0000)]
e1000e: convert head, tail and itr_register offsets to __iomem pointers
The Tx/Rx head and tail registers and itr_register are always at known
addresses based on the __iomem address at which the PCI region (from BAR 0)
is mapped and known offsets within the region for each of these registers.
Store and use the full address rather than just the region offset to reduce
unnecessary address calculations. Also, change current u8 __iomem pointers
to void __iomem pointers.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
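The idea of storing the full register address instead of an offset can be modelled in userspace as follows. This is only a sketch: a plain `uint32_t` array stands in for the mapped BAR 0 region, and the struct and function names are hypothetical rather than the e1000e ones.

```c
#include <stdint.h>

/* Before: the ring stores only the register's byte offset in the region. */
struct ring_by_offset {
    uint32_t tail;              /* byte offset of the tail register */
};

/* After: the ring stores the full address, computed once at setup time. */
struct ring_by_addr {
    volatile uint32_t *tail;    /* base + offset, resolved up front */
};

/* Resolve the register address once (byte_off is assumed 4-byte aligned). */
static void ring_by_addr_init(struct ring_by_addr *r, uint32_t *base,
                              uint32_t byte_off)
{
    r->tail = base + byte_off / sizeof(uint32_t);
}

/* Old style: every write repeats the base + offset calculation. */
static void write_tail_by_offset(uint32_t *base,
                                 const struct ring_by_offset *r, uint32_t val)
{
    volatile uint32_t *reg = base + r->tail / sizeof(uint32_t);
    *reg = val;
}

/* New style: the write just dereferences the stored pointer. */
static void write_tail_by_addr(const struct ring_by_addr *r, uint32_t val)
{
    *r->tail = val;
}
```

Both writers touch the same register; the second one simply avoids redoing the address arithmetic on every access and no longer needs the base pointer passed in.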
Bruce Allan [Fri, 16 Dec 2011 00:45:35 +0000 (00:45 +0000)]
e1000e: re-enable alternate MAC address for all devices which support it
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Wed, 11 Jan 2012 01:26:50 +0000 (01:26 +0000)]
e1000e: add Receive Packet Steering (RPS) support
Enable RPS by default. Disallow jumbo frames when both receive checksum
and receive hashing are enabled because the hardware cannot do both IP
payload checksum (enabled when receive checksum is enabled when using
packet split which is used for jumbo frames) and provide RSS hash at the
same time.
v2: added ethtool command to query flow hashing behavior per Ben Hutchings
and changed the type of rsskey to cleanup the setting of the register
array and avoid unnecessary casts (as pointed out by Joe Perches).
The long error messages are not changed since there is nothing in
the kernel ./Documentation that suggests the preferred method for
dealing with long messages other than to never break strings; leaving
them as-is for now.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Thu, 5 Jan 2012 00:34:05 +0000 (00:34 +0000)]
e1000e: cleanup Rx checksum offload code
1) cleanup whitespace in e1000_rx_checksum() function header comment
2) do not check hardware checksum when Rx checksum is disabled
3) reduce duplicated calls to le16_to_cpu() by just using it within
e1000_rx_checksum() instead of in each call to the function
v2: use swab16 instead of le16_to_cpu & htons and corrected type for the
passed-in csum
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
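One way to see why a single byte swap can stand in for a little-endian conversion followed by htons() is the userspace sketch below. The `_sketch` functions are illustrative stand-ins for the kernel helpers, and the assumption that the original code combined the two conversions in exactly this order is ours, not stated in the commit.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Userspace stand-in for the kernel's swab16(): swap the two bytes. */
static uint16_t swab16_sketch(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Userspace stand-in for le16_to_cpu(): interpret the in-memory bytes
 * of `raw` as a little-endian 16-bit value, whatever the host order. */
static uint16_t le16_to_cpu_sketch(uint16_t raw)
{
    uint8_t b[2];

    memcpy(b, &raw, sizeof(b));
    return (uint16_t)(b[0] | (b[1] << 8));
}

/* Little-endian to host order, then host to network (big-endian) order. */
static uint16_t via_le16_and_htons(uint16_t raw)
{
    return htons(le16_to_cpu_sketch(raw));
}
```

On a little-endian host the first step is a no-op and htons() swaps; on a big-endian host the first step swaps and htons() is a no-op. Either way the net effect is one byte swap.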
David Miller [Tue, 24 Jan 2012 13:15:57 +0000 (13:15 +0000)]
infiniband: cxgb4: Convert import_ep() over to dst_neigh_lookup().
Now we must provide the IP destination address, and a reference has
to be dropped when we're done with the entry.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Thu, 19 Jan 2012 15:37:22 +0000 (15:37 +0000)]
net: usb: qmi_wwan: New driver for Huawei QMI based WWAN devices
Some WWAN LTE/3G devices based on chipsets from Qualcomm provide
near standard CDC ECM interfaces in addition to the usual serial
interfaces. The Huawei E392/E398 are examples of such devices.
These typically cannot be fully configured using AT commands
over a serial interface. It is necessary to speak the proprietary
Qualcomm MSM Interface (QMI) protocol to the device to enable the
ethernet proxy functionality.
The devices embed the QMI protocol in CDC on the control interface,
using standard CDC commands and notifications. They do not otherwise
use CDC commands for the ethernet function. This driver does
therefore not need access to any other aspects of the control
interface than the descriptors attached to it.
Another driver, cdc-wdm, will provide userspace access to the
QMI protocol independently of this driver. To facilitate this,
this driver avoids binding to the control interface, and uses
only the associated data interface after parsing the common CDC
functional descriptors on the control interface.
You will want both the cdc-wdm and option drivers as companions to
this driver, to have full access to all interfaces and protocols
exported by the device.
Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
This driver adds support for Xilinx 10/100/1000 AXI Ethernet.
It can be used, for instance, on Xilinx boards with a Microblaze
architecture like the ML605.
The patch is against the latest net-next tree and checkpatch clean.
Signed-off-by: Ariane Keller <ariane.keller@tik.ee.ethz.ch> Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Tue, 24 Jan 2012 21:59:31 +0000 (21:59 +0000)]
bnx2x: unlock before returning an error
We introduced a new return here but forgot to drop the lock.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 24 Jan 2012 19:47:21 +0000 (19:47 +0000)]
vmxnet3: cleanup tso headers manipulation
Use existing helpers to clarify skb headers manipulation.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Shreyas Bhatewara <sbhatewara@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
add polling interface to xen-netfront device to support netconsole
This patch also alters the spin_lock usage to use irqsave variant.
Documentation/networking/netdevices.txt states that start_xmit
can be called with interrupts disabled by netconsole and therefore using
the irqsave/restore locking in this function looks correct.
Signed-off-by: Tina.Yang <tina.yang@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Zhenzhong.Duan <zhenzhong.duan@oracle.com> Tested-by: gurudas.pai <gurudas.pai@oracle.com>
[v1: Copy-n-pasted Ian Campbell comments] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
14) skge carrier assertion and DMA mapping fixes from Stephen Hemminger.
15) Congestion recovery undo performed at the wrong spot in BIC and CUBIC
congestion control modules, fix from Neal Cardwell.
16) Ethtool ETHTOOL_GSSET_INFO is unnecessarily restrictive, from Michał Mirosław.
17) Fix triggerable race in ipv6 sysctl handling, from Francesco Ruggeri.
18) Statistics bug fixes in mlx4 from Eugenia Emantayev.
19) rds locking bug fix during info dumps, from yours truly.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (67 commits)
rds: Make rds_sock_lock BH rather than IRQ safe.
netprio_cgroup.h: dont include module.h from other includes
net: flow_dissector.c missing include linux/export.h
team: send only changed options/ports via netlink
net/hyperv: fix possible memory leak in do_set_multicast()
drivers/net: dsa/mv88e6xxx.c files need linux/module.h
stmmac: added PCI identifiers
llc: Fix race condition in llc_ui_recvmsg
stmmac: fix phy naming inconsistency
dsa: Add reporting of silicon revision for Marvell 88E6123/88E6161/88E6165 switches.
tg3: fix ipv6 header length computation
skge: add byte queue limit support
mv643xx_eth: Add Rx Discard and Rx Overrun statistics
bnx2x: fix compilation error with SOE in fw_dump
bnx2x: handle CHIP_REVISION during init_one
bnx2x: allow user to change ring size in ISCSI SD mode
bnx2x: fix Big-Endianess in ethtool -t
bnx2x: fixed ethtool statistics for MF modes
bnx2x: credit-leakage fixup on vlan_mac_del_all
macvlan: fix a possible use after free
...
David S. Miller [Thu, 12 Jan 2012 00:46:32 +0000 (16:46 -0800)]
ipv4: Remove bogus checks of rt_gateway being zero.
It can never actually happen. rt_gateway is either the fully resolved
flow lookup key's destination address, or the non-zero FIB entry gateway
address.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 24 Jan 2012 22:03:44 +0000 (17:03 -0500)]
rds: Make rds_sock_lock BH rather than IRQ safe.
rds_sock_info() triggers locking warnings because we try to perform a
local_bh_enable() (via sock_i_ino()) while hardware interrupts are
disabled (via taking rds_sock_lock).
There is no reason for rds_sock_lock to be a hardware IRQ disabling
lock; none of these access paths run in hardware interrupt context.
Therefore making it a BH disabling lock is safe and sufficient to
fix this bug.
Reported-by: Kumar Sanghvi <kumaras@chelsio.com> Reported-by: Josh Boyer <jwboyer@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Gortmaker [Tue, 24 Jan 2012 11:33:19 +0000 (11:33 +0000)]
netprio_cgroup.h: dont include module.h from other includes
A considerable effort was invested in wiping out module.h
from being present in all the other standard includes. This
one leaked back in, but once again isn't strictly necessary,
so remove it.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 24 Jan 2012 05:16:00 +0000 (05:16 +0000)]
team: send only changed options/ports via netlink
This patch changes event message behaviour to send only updated records
instead of the whole list. This fixes a bug in which userspace receives stale
data when multiple events occur in a row.
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Tue, 24 Jan 2012 10:21:28 +0000 (10:21 +0000)]
net/hyperv: fix possible memory leak in do_set_multicast()
do_set_multicast() may not free the memory allocated in
netvsc_set_multicast_list().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Gortmaker [Tue, 24 Jan 2012 10:41:40 +0000 (10:41 +0000)]
drivers/net: dsa/mv88e6xxx.c files need linux/module.h
An implicit instance of module.h leaked back into existence
and was masking the fact that these drivers weren't including
it explicitly themselves. Fix the drivers before we remove
the implicit include path via the net/netprio_cgroup.h file.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
STM has a device ID within its own VENDOR space, and it is being
used in the STA2X11 I/O Hub.
Signed-off-by: Alessandro Rubini <rubini@gnudd.com> Acked-by: Giancarlo Asnaghi <giancarlo.asnaghi@st.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
After commit "db8857b stmmac: use an unique MDIO bus name" my
device stopped being probed because two different names were being
used in different places. This fixes the inconsistency.
Signed-off-by: Alessandro Rubini <rubini@gnudd.com> Acked-by: Giancarlo Asnaghi <giancarlo.asnaghi@st.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Florian Fainelli <florian@openwrt.org> Acked-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 24 Jan 2012 20:12:40 +0000 (12:12 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: Pass information that quota is stored in system file to userspace
ext2: protect inode changes in the SETVERSION and SETFLAGS ioctls
jbd: Issue cache flush after checkpointing
Mark Brown [Tue, 24 Jan 2012 11:17:26 +0000 (11:17 +0000)]
regulator: Fix documentation for of_node parameter of regulator_register()
Commit 5bc75a886353 ("kernel-doc: fix new warning in regulator core")
added documentation for of_node to address a warning but the
documentation didn't explain what the parameter is for so would be
likely to be unhelpful for users. Clarify that.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 23 Jan 2012 23:11:27 +0000 (15:11 -0800)]
Merge tag 'pm-fixes-for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Power management fixes for 3.3
Two fixes for regressions introduced during the merge window, one fix for
a long-standing obscure issue in the computation of hibernate image size
and two small PM documentation fixes.
* tag 'pm-fixes-for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / Sleep: Fix read_unlock_usermodehelper() call.
PM / Hibernate: Rewrite unlock_system_sleep() to fix s2disk regression
PM / Hibernate: Correct additional pages number calculation
PM / Documentation: Fix minor issue in freezing_of_tasks.txt
PM / Documentation: Fix spelling mistake in basic-pm-debugging.txt
Linus Torvalds [Mon, 23 Jan 2012 22:50:30 +0000 (14:50 -0800)]
Merge tag 'arm-soc-imx-move' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Consolidate i.MX 5 platforms to be under the new shared i.MX 3/5/6 tree.
* tag 'arm-soc-imx-move' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM i.MX: Update defconfig
ARM i.MX: Merge i.MX5 support into mach-imx
ARM i.MX5: remove unnecessary includes from board files
Fix up fairly trivial conflicts due to various changes nearby in
arch/arm/{mach,plat}-imx/{Kconfig,Makefile}
Pull request had been sent to the wrong email address, but happened
before the merge window closed. I'm merging the MX 5 consolidation,
since it apparently will help the next development window and will avoid
conflicts later as per Arnd.
Commit b298d289
"PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()"
added read_unlock_usermodehelper() but read_unlock_usermodehelper() is called
without read_lock_usermodehelper() when kmalloc() failed.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Eric Dumazet [Mon, 23 Jan 2012 01:22:09 +0000 (01:22 +0000)]
tg3: fix ipv6 header length computation
tg3_start_xmit() makes the wrong assumption for TSOV6 that skb->head
doesn't include any payload data.
	if (skb_is_gso_v6(skb))
		hdr_len = skb_headlen(skb) - ETH_HLEN;
This is not true anymore after commit f07d960df3 (tcp: avoid frag
allocation for small frames).
We should instead use: skb_transport_offset(skb) + tcp_hdrlen(skb)
It's also true for IPv4.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Matt Carlson <mcarlson@broadcom.com> CC: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
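The difference between the two header-length calculations in the tg3 fix above can be shown with a simplified model. `struct pkt_sketch` and both helpers are hypothetical; they only mirror the three quantities the commit message mentions and are not the kernel's `struct sk_buff` API. Both helpers return the total header length including the Ethernet header, so the numbers are directly comparable.

```c
#include <stddef.h>

/* Hypothetical, simplified view of a packet; not struct sk_buff. */
struct pkt_sketch {
    size_t headlen;           /* bytes in the linear area, as skb_headlen() reports */
    size_t transport_offset;  /* offset of the TCP header from the start of the frame */
    size_t tcp_hdrlen;        /* TCP header length including options */
};

/* Old assumption: the linear area holds headers only, so its length
 * is taken to be the header length. */
static size_t hdr_len_from_headlen(const struct pkt_sketch *p)
{
    return p->headlen;
}

/* Fixed approach: derive the header length from where the TCP header
 * starts and how long it is, regardless of what else is linear. */
static size_t hdr_len_from_offsets(const struct pkt_sketch *p)
{
    return p->transport_offset + p->tcp_hdrlen;
}
```

For a frame with Ethernet (14) + IPv6 (40) + TCP with 12 bytes of options (32), the headers total 86 bytes. If the linear area also carries 88 bytes of payload, the first helper reports 174 while the second still reports 86.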
This also changes the cleanup logic slightly to aggregate
completed notifications for multiple packets.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paulius Zaleckas [Mon, 23 Jan 2012 01:16:35 +0000 (01:16 +0000)]
mv643xx_eth: Add Rx Discard and Rx Overrun statistics
These statistics helped me a lot while tracking down what was losing
packets in my setup.
I added these stats to the MIB group since they are very similar,
just located in other registers.
I have tested this patch on 88F6281 SoC.
Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 23 Jan 2012 07:31:56 +0000 (07:31 +0000)]
bnx2x: fix compilation error with SOE in fw_dump
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ariel Elior [Mon, 23 Jan 2012 07:31:55 +0000 (07:31 +0000)]
bnx2x: handle CHIP_REVISION during init_one
The macro `CHIP_IS_E1x' requires `bp' to be initialized.
As `bp' is not yet initialized during this phase of `bnx2x_init_dev',
it accessed uninitialized fields in the struct.
Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Mon, 23 Jan 2012 07:31:54 +0000 (07:31 +0000)]
bnx2x: allow user to change ring size in ISCSI SD mode
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Mon, 23 Jan 2012 07:31:53 +0000 (07:31 +0000)]
bnx2x: fix Big-Endianess in ethtool -t
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 23 Jan 2012 07:31:52 +0000 (07:31 +0000)]
bnx2x: fixed ethtool statistics for MF modes
Previously, in MF modes `ethtool -S' lacked some of the statistics
which appeared in non-MF modes. This has been fixed.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 23 Jan 2012 07:31:51 +0000 (07:31 +0000)]
bnx2x: credit-leakage fixup on vlan_mac_del_all
Upon insertion of elements into the execution queue, it is validated
that there are enough credits to support additional vlan-macs,
and the credits are consumed. However, when removing a pending
command in `bnx2x_vlan_mac_del_all' the consumed credits are not
released, which might cause leakage and eventually the inability to
add new vlan-macs in certain scenarios.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
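The credit leak described above can be reproduced with a toy credit pool. This is a minimal sketch with hypothetical names (`credit_pool`, `credit_get`, `credit_put`, `drop_pending`); it models only the accounting, not the bnx2x execution queue.

```c
/* Hypothetical credit pool; names are illustrative, not the bnx2x API. */
struct credit_pool {
    int available;
};

/* Consume one credit when a command is queued.
 * Returns 0 on success, -1 if no credit is left. */
static int credit_get(struct credit_pool *p)
{
    if (p->available <= 0)
        return -1;
    p->available--;
    return 0;
}

/* Return one credit, e.g. when a pending command is removed
 * before it ever executes. */
static void credit_put(struct credit_pool *p)
{
    p->available++;
}

/* Drop `pending` queued commands. With release == 0 the credits they
 * consumed are never returned (the leak); with release != 0 they are. */
static void drop_pending(struct credit_pool *p, int pending, int release)
{
    while (pending-- > 0) {
        if (release)
            credit_put(p);
    }
}
```

Once the pool is drained by commands that are later dropped without a matching release, every subsequent `credit_get()` fails, which corresponds to the inability to add new vlan-macs mentioned in the commit.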
Eric Dumazet [Mon, 23 Jan 2012 05:38:59 +0000 (05:38 +0000)]
macvlan: fix a possible use after free
Commit bc416d9768 (macvlan: handle fragmented multicast frames) added a
possible use after free in macvlan_handle_frame(), since
ip_check_defrag() uses pskb_may_pull() : skb header can be reallocated.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ben Greear <greearb@candelatech.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 23 Jan 2012 18:08:08 +0000 (10:08 -0800)]
Merge branch 'kernel-doc' from Randy Dunlap
The usual kernel-doc fixups from Randy. Some of them David acked as
merged in his tree; these are the random left-overs.
* kernel-doc:
docbook: fix sched source file names in device-drivers book
docbook: change iomap source filename in deviceiobook
docbook: don't use serial_core.h in device-drivers book
kernel-doc: fix kernel-doc warnings in sched
kernel-doc: fix new warnings in cfg80211.h
kernel-doc: fix new warning in usb.h
kernel-doc: fix new warnings in device.h
kernel-doc: fix new warnings in debugfs
kernel-doc: fix new warning in regulator core
kernel-doc: fix new warnings in pci
kernel-doc: fix new warnings in driver-core
kernel-doc: fix new warnings in auditsc.c
scripts/kernel-doc: fix fatal error caused by cfg80211.h
Linus Torvalds [Mon, 23 Jan 2012 17:27:54 +0000 (09:27 -0800)]
Merge branch 'akpm'
Quoth Andrew:
"Random fixes. And a simple new LED driver which I'm trying to sneak
in while you're not looking."
Sneaking successful.
* akpm:
score: fix off-by-one index into syscall table
mm: fix rss count leakage during migration
SHM_UNLOCK: fix Unevictable pages stranded after swap
SHM_UNLOCK: fix long unpreemptible section
kdump: define KEXEC_NOTE_BYTES arch specific for s390x
mm/hugetlb.c: undo change to page mapcount in fault handler
mm: memcg: update the correct soft limit tree during migration
proc: clear_refs: do not clear reserved pages
drivers/video/backlight/l4f00242t03.c: return proper error in l4f00242t03_probe if regulator_get() fails
drivers/video/backlight/adp88x0_bl.c: fix bit testing logic
kprobes: initialize before using a hlist
ipc/mqueue: simplify reading msgqueue limit
leds: add led driver for Bachmann's ot200
mm: __count_immobile_pages(): make sure the node is online
mm: fix NULL ptr dereference in __count_immobile_pages
mm: fix warnings regarding enum migrate_mode
Takashi Iwai [Mon, 23 Jan 2012 17:23:36 +0000 (18:23 +0100)]
ALSA: hda - Fix silent outputs from docking-station jacks of Dell laptops
The recent change of the power-widget handling for IDT codecs caused
silent output from the docking-station line-out jack. This was
partially fixed by the commit f2cbba7602383cd9cdd21f0a5d0b8bd1aad47b33
"ALSA: hda - Fix the lost power-setup of seconary pins after PM resume".
But the line-out on the docking-station is still silent when booted
with the jack plugged in, even with this fix.
The remaining bug is that the power-widget is set off in stac92xx_init()
because the pins in cfg->line_out_pins[] aren't checked there properly
but only hp_pins[] are checked in is_nid_hp_pin().
This patch fixes the problem by checking both HP and line-out pins
and leaving the power-map correctly.
Linus Torvalds [Mon, 23 Jan 2012 16:59:49 +0000 (08:59 -0800)]
Merge git://git.samba.org/sfrench/cifs-2.6
* git://git.samba.org/sfrench/cifs-2.6:
CIFS: Rename *UCS* functions to *UTF16*
[CIFS] ACL and FSCACHE support no longer EXPERIMENTAL
[CIFS] Fix build break with multiuser patch when LANMAN disabled
cifs: warn about impending deprecation of legacy MultiuserMount code
cifs: fetch credentials out of keyring for non-krb5 auth multiuser mounts
cifs: sanitize username handling
keys: add a "logon" key type
cifs: lower default wsize when unix extensions are not used
cifs: better instrumentation for coalesce_t2
cifs: integer overflow in parse_dacl()
cifs: Fix sparse warning when calling cifs_strtoUCS
CIFS: Add descriptions to the brlock cache functions
Randy Dunlap [Sat, 21 Jan 2012 19:03:13 +0000 (11:03 -0800)]
kernel-doc: fix kernel-doc warnings in sched
Fix new kernel-doc notation warnings:
Warning(include/linux/sched.h:2094): No description found for parameter 'p'
Warning(include/linux/sched.h:2094): Excess function parameter 'tsk' description in 'is_idle_task'
Warning(kernel/sched/cpupri.c:139): No description found for parameter 'newpri'
Warning(kernel/sched/cpupri.c:139): Excess function parameter 'pri' description in 'cpupri_set'
Warning(kernel/sched/cpupri.c:208): Excess function parameter 'bootmem' description in 'cpupri_init'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Sat, 21 Jan 2012 19:03:00 +0000 (11:03 -0800)]
kernel-doc: fix new warnings in cfg80211.h
Fix new kernel-doc warnings:
Warning(include/net/cfg80211.h:1165): No description found for parameter 'channel_type'
Warning(include/net/cfg80211.h:2090): No description found for parameter 'probe_resp_offload'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: linux-wireless@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Sat, 21 Jan 2012 19:02:51 +0000 (11:02 -0800)]
kernel-doc: fix new warnings in device.h
Fix new kernel-doc warnings:
Warning(include/linux/device.h:299): No description found for parameter 'name'
Warning(include/linux/device.h:299): No description found for parameter 'subsys'
Warning(include/linux/device.h:299): No description found for parameter 'node'
Warning(include/linux/device.h:299): No description found for parameter 'add_dev'
Warning(include/linux/device.h:299): No description found for parameter 'remove_dev'
Warning(include/linux/device.h:685): No description found for parameter 'id'
Warning(include/linux/device.h:1009): No description found for parameter '__driver'
Warning(include/linux/device.h:1009): No description found for parameter '__register'
Warning(include/linux/device.h:1009): No description found for parameter '__unregister'
Randy Dunlap [Sat, 21 Jan 2012 19:02:42 +0000 (11:02 -0800)]
kernel-doc: fix new warnings in debugfs
Fix new kernel-doc warnings:
Warning(fs/debugfs/file.c:556): No description found for parameter 'nregs'
Warning(fs/debugfs/file.c:556): Excess function parameter 'mregs' description in 'debugfs_print_regs32'
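The two warning kinds pair up here: a renamed parameter leaves its old name documented ("Excess function parameter") and its new name undocumented ("No description found"). A rough Python sketch of that comparison follows. It is not the logic of scripts/kernel-doc itself (which is Perl); the helper name doc_mismatches and the sample parameter lists are made up for illustration.

```python
def doc_mismatches(documented, actual):
    """Return (missing, excess) parameter names, preserving order.

    missing: present in the prototype but not described in the comment.
    excess:  described in the comment but absent from the prototype.
    """
    missing = [p for p in actual if p not in documented]
    excess = [p for p in documented if p not in actual]
    return missing, excess


# Hypothetical parameter lists shaped like the debugfs_print_regs32 case:
# the comment still describes 'mregs' while the prototype uses 'nregs'.
documented = ["s", "regs", "mregs", "base", "prefix"]
actual = ["s", "regs", "nregs", "base", "prefix"]
missing, excess = doc_mismatches(documented, actual)
```

Fixing the comment so both lists come back empty silences both warnings at once.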
Randy Dunlap [Sat, 21 Jan 2012 19:02:35 +0000 (11:02 -0800)]
kernel-doc: fix new warnings in pci
Fix new kernel-doc warnings:
Warning(drivers/pci/pci.c:2811): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2811): Excess function parameter 'pdev' description in 'pci_intx_mask_supported'
Warning(drivers/pci/pci.c:2894): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2894): Excess function parameter 'pdev' description in 'pci_check_and_mask_intx'
Warning(drivers/pci/pci.c:2908): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2908): Excess function parameter 'pdev' description in 'pci_check_and_unmask_intx'
Randy Dunlap [Sat, 21 Jan 2012 19:02:28 +0000 (11:02 -0800)]
kernel-doc: fix new warnings in driver-core
Fix new kernel-doc warnings:
Warning(drivers/base/bus.c:925): No description found for parameter 'key'
Warning(drivers/base/bus.c:1241): No description found for parameter 'subsys'
Warning(drivers/base/bus.c:1241): No description found for parameter 'groups'
Randy Dunlap [Sat, 21 Jan 2012 19:02:24 +0000 (11:02 -0800)]
kernel-doc: fix new warnings in auditsc.c
Fix new kernel-doc warnings in auditsc.c:
Warning(kernel/auditsc.c:1875): No description found for parameter 'success'
Warning(kernel/auditsc.c:1875): No description found for parameter 'return_code'
Warning(kernel/auditsc.c:1875): Excess function parameter 'pt_regs' description in '__audit_syscall_exit'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Sat, 21 Jan 2012 18:31:54 +0000 (10:31 -0800)]
scripts/kernel-doc: fix fatal error caused by cfg80211.h
include/net/cfg80211.h uses __must_check in functions that
have kernel-doc notation. This was confusing scripts/kernel-doc,
so have scripts/kernel-doc ignore "__must_check".
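As an illustration of what "ignore" means here, the annotation can be dropped from a prototype before it is parsed. The sketch below is in Python rather than the Perl that scripts/kernel-doc uses, and both the helper strip_must_check and the sample prototype are hypothetical.

```python
import re


def strip_must_check(prototype):
    """Drop the __must_check annotation and collapse leftover whitespace."""
    cleaned = re.sub(r"\b__must_check\b", " ", prototype)
    return re.sub(r"\s+", " ", cleaned).strip()


# Made-up prototype; not taken from cfg80211.h.
proto = "int __must_check example_register(struct example_dev *dev)"
cleaned = strip_must_check(proto)
```

After stripping, the prototype has the plain "type name(args)" shape a simple parser expects.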
Dan Rosenberg [Fri, 20 Jan 2012 22:34:27 +0000 (14:34 -0800)]
score: fix off-by-one index into syscall table
If the provided system call number is equal to __NR_syscalls, the
current check will pass and a function pointer just after the system
call table may be called, since sys_call_table is an array with total
size __NR_syscalls.
Whether or not this is a security bug depends on what the compiler puts
immediately after the system call table. It's likely that this won't do
anything bad because there is an additional NULL check on the syscall
entry, but if there happens to be a non-NULL value immediately after the
system call table, this may result in local privilege escalation.
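The off-by-one can be shown with a toy bounds check. This is only a Python model of the comparison described above, not the score entry code; NR_SYSCALLS, the table contents and both helper names are stand-ins.

```python
NR_SYSCALLS = 4  # stand-in for __NR_syscalls; the real value is arch-specific

# A table with exactly NR_SYSCALLS entries: valid indices are 0..NR_SYSCALLS-1.
sys_call_table = ["sys_a", "sys_b", "sys_c", "sys_d"]


def in_range_buggy(nr):
    # Off by one: nr == NR_SYSCALLS passes and would index one past the end.
    return 0 <= nr <= NR_SYSCALLS


def in_range_fixed(nr):
    # Only indices that really exist in the table are accepted.
    return 0 <= nr < NR_SYSCALLS
```

With the buggy check, nr == NR_SYSCALLS is accepted even though sys_call_table has no such slot; the fixed check rejects it.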
mm: fix rss count leakage during migration
Memory migration fills a pte with a migration entry and it doesn't
update the rss counters. Then it replaces the migration entry with the
new page (or the old one if migration failed). But between these two
passes this pte can be unmapped, or a task can fork a child and it will
get a copy of this migration entry. Nobody accounts for this in the rss
counters.
This patch properly adjusts rss counters for migration entries in
zap_pte_range() and copy_one_pte(). Thus we avoid extra atomic
operations on the migration fast-path.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Hugh Dickins <hughd@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Fri, 20 Jan 2012 22:34:21 +0000 (14:34 -0800)]
SHM_UNLOCK: fix Unevictable pages stranded after swap
Commit cc39c6a9bbde ("mm: account skipped entries to avoid looping in
find_get_pages") correctly fixed an infinite loop; but left a problem
that find_get_pages() on shmem would return 0 (appearing to callers to
mean end of tree) when it meets a run of nr_pages swap entries.
The only uses of find_get_pages() on shmem are via pagevec_lookup(),
called from invalidate_mapping_pages(), and from shmctl SHM_UNLOCK's
scan_mapping_unevictable_pages(). The first is already commented, and
not worth worrying about; but the second can leave pages on the
Unevictable list after an unusual sequence of swapping and locking.
Fix that by using shmem_find_get_pages_and_swap() (then ignoring the
swap) instead of pagevec_lookup().
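A toy model may help show why a lookup that returns 0 on a run of swap entries strands pages. It is written in Python with a plain list standing in for the mapping; the SWAP marker, the helper names and the batch size are all assumptions of the sketch, not kernel interfaces.

```python
SWAP = "swap"  # marker for a swap entry in the toy mapping


def lookup_pages(mapping, start, nr):
    """Scan at most nr slots from start; return only real pages.

    Mirrors the problem case: a window holding nothing but swap
    entries yields an empty result, which looks like end of tree.
    """
    window = mapping[start:start + nr]
    return [(start + i, e) for i, e in enumerate(window) if e != SWAP]


def lookup_pages_and_swap(mapping, start, nr):
    """Scan at most nr slots from start; return pages and swap entries."""
    return [(start + i, e) for i, e in enumerate(mapping[start:start + nr])]


def scan(mapping, lookup, nr=2):
    """Collect every real page, stopping when lookup returns nothing."""
    found, index = [], 0
    while True:
        batch = lookup(mapping, index, nr)
        if not batch:
            break
        found += [e for _, e in batch if e != SWAP]  # ignore the swap
        index = batch[-1][0] + 1
    return found


mapping = ["p0", SWAP, SWAP, SWAP, "p4"]
stranded = scan(mapping, lookup_pages)          # stops early, misses "p4"
complete = scan(mapping, lookup_pages_and_swap)  # reaches "p4"
```

The second lookup keeps the scan moving past the swap entries, which is the role shmem_find_get_pages_and_swap() plays in the fix.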
But I don't want to contaminate vmscan.c with shmem internals, nor
shmem.c with LRU locking. So move scan_mapping_unevictable_pages() into
shmem.c, renaming it shmem_unlock_mapping(); and rename
check_move_unevictable_page() to check_move_unevictable_pages(), looping
down an array of pages, oftentimes under the same lock.
Leave out the "rotate unevictable list" block: that's a leftover from
when this was used for /proc/sys/vm/scan_unevictable_pages, whose flawed
handling involved looking at pages at tail of LRU.
Was there significance to the sequence first ClearPageUnevictable, then
test page_evictable, then SetPageUnevictable here? I think not, we're
under LRU lock, and have no barriers between those.
Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Rik van Riel <riel@redhat.com> Cc: Shaohua Li <shaohua.li@intel.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michel Lespinasse <walken@google.com> Cc: <stable@vger.kernel.org> [back to 3.1 but will need respins] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Fri, 20 Jan 2012 22:34:19 +0000 (14:34 -0800)]
SHM_UNLOCK: fix long unpreemptible section
scan_mapping_unevictable_pages() is used to make SysV SHM_LOCKed pages
evictable again once the shared memory is unlocked. It does this with
pagevec_lookup()s across the whole object (which might occupy most of
memory), and takes 300ms to unlock 7GB here. A cond_resched() every
PAGEVEC_SIZE pages would be good.
However, KOSAKI-san points out that this is called under shmem.c's
info->lock, and it's also under shm.c's shm_lock(), both spinlocks.
There is no strong reason for that: we need to take these pages off the
unevictable list soonish, but those locks are not required for it.
So move the call to scan_mapping_unevictable_pages() from shmem.c's
unlock handling up to shm.c's unlock handling. Remove the recently
added barrier, not needed now we have spin_unlock() before the scan.
Use get_file(), with subsequent fput(), to make sure we have a reference
to mapping throughout scan_mapping_unevictable_pages(): that's something
that was previously guaranteed by the shm_lock().
Remove shmctl's lru_add_drain_all(): we don't fault in pages at SHM_LOCK
time, and we lazily discover them to be Unevictable later, so it serves
no purpose for SHM_LOCK; and serves no purpose for SHM_UNLOCK, since
pages still on pagevec are not marked Unevictable.
The original code avoided redundant rescans by checking VM_LOCKED flag
at its level: now avoid them by checking shp's SHM_LOCKED.
The original code called scan_mapping_unevictable_pages() on a locked
area at shm_destroy() time: perhaps we once had accounting cross-checks
which required that, but not now, so skip the overhead and just let
inode eviction deal with them.
Put check_move_unevictable_page() and scan_mapping_unevictable_pages()
under CONFIG_SHMEM (with stub for the TINY case when ramfs is used),
more as comment than to save space; comment them used for SHM_UNLOCK.
Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Rik van Riel <riel@redhat.com> Cc: Shaohua Li <shaohua.li@intel.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michael Holzheu [Fri, 20 Jan 2012 22:34:16 +0000 (14:34 -0800)]
kdump: define KEXEC_NOTE_BYTES arch specific for s390x
kdump only allocates memory for the prstatus ELF note. For s390x,
besides prstatus, multiple ELF notes for various different register
types are stored. Therefore the currently allocated memory is not
sufficient. With this patch the KEXEC_NOTE_BYTES macro can be defined
by architecture code and for s390x it is set to the correct size now.
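The size arithmetic can be sketched as follows. The layout assumed here is the usual ELF note format (a 12-byte header, then name and descriptor each padded to 4 bytes) plus one empty terminating note header. The descriptor sizes in the usage lines are placeholders, not the real s390x values.

```python
def align4(n):
    """Round n up to a multiple of 4, as ELF note fields are padded."""
    return (n + 3) & ~3


ELF_NOTE_HEADER = 12  # three 32-bit words: namesz, descsz, type


def note_bytes(name, desc_size):
    """Size of one ELF note: header + padded name (with NUL) + padded desc."""
    return ELF_NOTE_HEADER + align4(len(name) + 1) + align4(desc_size)


def notes_bytes(notes):
    """Total size of several notes plus an empty terminating note header."""
    return sum(note_bytes(n, d) for n, d in notes) + ELF_NOTE_HEADER


# Placeholder descriptor sizes, for illustration only.
prstatus_only = notes_bytes([("CORE", 336)])
with_extra_note = notes_bytes([("CORE", 336), ("LINUX", 8)])
```

Any buffer sized for the prstatus note alone is too small as soon as a second note is appended, which is why the macro needs a per-architecture value.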
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hillf Danton [Fri, 20 Jan 2012 22:34:13 +0000 (14:34 -0800)]
mm/hugetlb.c: undo change to page mapcount in fault handler
Page mapcount should be updated only if we are sure that the page ends
up in the page table; otherwise we would leak if we couldn't COW due to
reservations or if idx is out of bounds.
Signed-off-by: Hillf Danton <dhillf@gmail.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Fri, 20 Jan 2012 22:34:12 +0000 (14:34 -0800)]
mm: memcg: update the correct soft limit tree during migration
end_migration() passes the old page instead of the new page to commit
the charge. This page descriptor is not used for committing itself,
though, since we also pass the (correct) page_cgroup descriptor. But
it's used to find the soft limit tree through the page's zone, so the
soft limit tree of the old page's zone is updated instead of that of the
new page's, which might get slightly out of date until the next charge
reaches the ratelimit point.
This glitch has been present since 5564e88 ("memcg: condense
page_cgroup-to-page lookup points").
This fixes a bug that I introduced in 2.6.38. It's benign enough (to my
knowledge) that we probably don't want this for stable.
Reported-by: Hugh Dickins <hughd@google.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Deacon [Fri, 20 Jan 2012 22:34:09 +0000 (14:34 -0800)]
proc: clear_refs: do not clear reserved pages
/proc/pid/clear_refs is used to clear the Referenced and YOUNG bits for
pages and corresponding page table entries of the task with PID pid, which
includes any special mappings inserted into the page tables in order to
provide things like vDSOs and user helper functions.
On ARM this causes a problem because the vectors page is mapped as a
global mapping and since ec706dab ("ARM: add a vma entry for the user
accessible vector page"), a VMA is also inserted into each task for this
page to aid unwinding through signals and syscall restarts. Since the
vectors page is required for handling faults, clearing the YOUNG bit (and
subsequently writing a faulting pte) means that we lose the vectors page
*globally* and cannot fault it back in. This results in a system deadlock
on the next exception.
To see this problem in action, just run:
$ echo 1 > /proc/self/clear_refs
on an ARM platform (as any user) and watch your system hang. I think this
has been the case since 2.6.37.
This patch avoids clearing the aforementioned bits for reserved pages,
therefore leaving the vectors page intact on ARM. Since reserved pages
are not candidates for swap, this change should not have any impact on the
usefulness of clear_refs.
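In outline, the change amounts to skipping reserved pages while walking the task's mappings. The Python sketch below models that with dictionaries; the field names and the clear_refs helper are illustrative only and do not correspond to the kernel's page flags API.

```python
def clear_refs(pages):
    """Clear the young/referenced flags on every non-reserved page.

    Returns the number of pages touched.
    """
    touched = 0
    for page in pages:
        if page["reserved"]:
            continue  # e.g. the ARM vectors page: leave it alone
        page["young"] = False
        page["referenced"] = False
        touched += 1
    return touched


pages = [
    {"name": "anon", "reserved": False, "young": True, "referenced": True},
    {"name": "vectors", "reserved": True, "young": True, "referenced": True},
]
touched = clear_refs(pages)
```

The reserved entry keeps its flags, so a page needed for fault handling is never made to fault.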
Signed-off-by: Will Deacon <will.deacon@arm.com> Reported-by: Moussa Ba <moussaba@micron.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Russell King <rmk@arm.linux.org.uk> Acked-by: Nicolas Pitre <nico@linaro.org> Cc: Matt Mackall <mpm@selenic.com> Cc: <stable@vger.kernel.org> [2.6.37+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Axel Lin [Fri, 20 Jan 2012 22:34:06 +0000 (14:34 -0800)]
drivers/video/backlight/l4f00242t03.c: return proper error in l4f00242t03_probe if regulator_get() fails
Signed-off-by: Axel Lin <axel.lin@gmail.com> Acked-by: Alberto Panizzo <alberto@amarulasolutions.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Axel Lin [Fri, 20 Jan 2012 22:34:05 +0000 (14:34 -0800)]
drivers/video/backlight/adp88x0_bl.c: fix bit testing logic
We need to write the new value if its bit_mask fields are not equal to
those of the old value. It does not make sense to write the new value
only when all the bit_mask bits are zero.
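For the set-bits case, the difference between the two tests can be modelled as below. This is a Python sketch of the read-modify-write decision only, not the driver's code; the helper names are invented and the register is just an integer.

```python
def set_bits_buggy(reg_val, bit_mask):
    """Return (new_value, wrote). Writes only if no mask bit was set."""
    if (reg_val & bit_mask) == 0:
        return reg_val | bit_mask, True
    return reg_val, False


def set_bits_fixed(reg_val, bit_mask):
    """Return (new_value, wrote). Writes unless every mask bit is already set."""
    if (reg_val & bit_mask) != bit_mask:
        return reg_val | bit_mask, True
    return reg_val, False


# One of the two mask bits is already set in the register.
buggy = set_bits_buggy(0b0001, 0b0011)
fixed = set_bits_fixed(0b0001, 0b0011)
```

With a partially set field, the buggy test skips the write and the remaining bit is never set; the fixed test writes whenever the result would differ.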
Signed-off-by: Axel Lin <axel.lin@gmail.com> Cc: Michael Hennerich <michael.hennerich@analog.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kprobes: initialize before using a hlist
Commit ef53d9c5e ("kprobes: improve kretprobe scalability with hashed
locking") introduced a bug where we can potentially leak
kretprobe_instances since we initialize a hlist head after having used
it.
Initialize the hlist head before using it.
Reported-by: Jim Keniston <jkenisto@us.ibm.com> Acked-by: Jim Keniston <jkenisto@us.ibm.com> Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Srinivasa D S <srinivasa@in.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
leds: add led driver for Bachmann's ot200
Add support for leds on Bachmann's ot200 visualisation device. The
device has three leds on the back panel (led_err, led_init and led_run)
and can handle up to seven leds on the front panel.
The driver was written by Linutronix on behalf of Bachmann electronic
GmbH. It incorporates feedback from Lars-Peter Clausen.
[akpm@linux-foundation.org: add dependency on HAS_IOMEM] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Fri, 20 Jan 2012 22:33:58 +0000 (14:33 -0800)]
mm: __count_immobile_pages(): make sure the node is online
page_zone() requires an online node, otherwise we are accessing NULL
NODE_DATA. This is not an issue at the moment because node_zones are
located at the beginning of the structure, but this might change in the
future, so better be careful about that.
Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Fri, 20 Jan 2012 22:33:55 +0000 (14:33 -0800)]
mm: fix NULL ptr dereference in __count_immobile_pages
Fix the following NULL ptr dereference caused by
cat /sys/devices/system/memory/memory0/removable
Pid: 13979, comm: sed Not tainted 3.0.13-0.5-default #1 IBM BladeCenter LS21 -[7971PAM]-/Server Blade
RIP: __count_immobile_pages+0x4/0x100
Process sed (pid: 13979, threadinfo ffff880221c36000, task ffff88022e788480)
Call Trace:
is_pageblock_removable_nolock+0x34/0x40
is_mem_section_removable+0x74/0xf0
show_mem_removable+0x41/0x70
sysfs_read_file+0xfe/0x1c0
vfs_read+0xc7/0x130
sys_read+0x53/0xa0
system_call_fastpath+0x16/0x1b
We are crashing because we are trying to dereference NULL zone which
came from pfn=0 (struct page ffffea0000000000). According to the boot
log this page is marked reserved:
e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
and early_node_map confirms that:
early_node_map[3] active PFN ranges
1: 0x00000010 -> 0x0000009c
1: 0x00000100 -> 0x000bffa3
1: 0x00100000 -> 0x00240000
The problem is that memory_present works in PAGE_SECTION_MASK aligned
blocks so the reserved range sneaks into the section as well. This
also means that free_area_init_node will not take care of those reserved
pages and they stay uninitialized.
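The alignment effect can be checked with a little arithmetic. The sketch assumes 32768 pages per section (128 MiB sections with 4 KiB pages, which I believe is the x86_64 layout); section_span is a made-up helper, and the PFNs come from the early_node_map output above.

```python
PAGES_PER_SECTION = 1 << 15  # assumed: 128 MiB sections with 4 KiB pages


def section_span(start_pfn, end_pfn):
    """PFN range [lo, hi) of the sections touched by [start_pfn, end_pfn)."""
    lo = start_pfn & ~(PAGES_PER_SECTION - 1)
    hi = (end_pfn + PAGES_PER_SECTION - 1) & ~(PAGES_PER_SECTION - 1)
    return lo, hi


# First active range from the early_node_map output above.
lo, hi = section_span(0x10, 0x9C)
reserved_but_present = range(lo, 0x10)  # PFNs 0x0..0xf
```

Although the active range starts at PFN 0x10, the section marked present starts at PFN 0, so the sixteen reserved pages below 0x10 belong to a present section without ever being initialized.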
When we try to read the removable status we walk through all available
sections and hope that the zone is valid for all pages in the section.
But this is not true in this case as the zone and nid are not initialized.
We have only one node in this particular case and it is marked as node=1
(rather than 0) and that made the problem visible because page_to_nid will
return 0 and there are no zones on the node.
Let's check that the zone is valid and that the given pfn falls into its
boundaries, and mark the section not removable otherwise. This might cause some
false positives, probably, but we do not have any sane way to find out
whether the page is reserved by the platform or it is just not used for
whatever other reasons.
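The check described above might look roughly like this in a Python model. Zone is a toy record with only the two fields the test needs, not the kernel's struct zone, and may_be_removable is a hypothetical name.

```python
from collections import namedtuple

# Toy stand-in for the zone fields the check needs; not the kernel struct.
Zone = namedtuple("Zone", ["start_pfn", "spanned_pages"])


def may_be_removable(zone, pfn):
    """Report not removable when the zone is missing or does not span pfn."""
    if zone is None:
        return False
    return zone.start_pfn <= pfn < zone.start_pfn + zone.spanned_pages


# A zone covering the first active range from the report (0x10..0x9c).
zone = Zone(start_pfn=0x10, spanned_pages=0x9C - 0x10)
```

pfn=0 falls outside the zone, so the section is reported as not removable instead of being dereferenced through an uninitialized zone.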
Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Mel Gorman <mgorman@suse.de> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Fri, 20 Jan 2012 22:33:53 +0000 (14:33 -0800)]
mm: fix warnings regarding enum migrate_mode
sparc64 allmodconfig:
In file included from include/linux/compat.h:15,
from /usr/src/25/arch/sparc/include/asm/siginfo.h:19,
from include/linux/signal.h:5,
from include/linux/sched.h:73,
from arch/sparc/kernel/asm-offsets.c:13:
include/linux/fs.h:618: warning: parameter has incomplete type
It seems that my sparc64 compiler (gcc-3.4.5) doesn't like the forward
declaration of enums.
Fix this by moving the "enum migrate_mode" definition into its own header
file.
Acked-by: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Dave Jones <davej@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Andy Isaacson <adi@hexapodia.org> Cc: Nai Xia <nai.xia@gmail.com> Cc: Johannes Weiner <jweiner@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>