John Fastabend [Thu, 10 Mar 2011 12:06:12 +0000 (12:06 +0000)]
ixgbe: DCB, PFC not cleared until reset occurs
The PFC configuration is not cleared until the device is reset. This
has not been a problem because setting DCB attributes forced a
hardware reset. Now that we no longer require this reset to occur
PFC remains configured even after being disabled until the
device is reset.
This removes a goto in the PFC hardware set routines for 82598 and
82599 devices that was short circuiting the clear.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Lior Levy [Fri, 11 Mar 2011 02:03:07 +0000 (02:03 +0000)]
ixgbe: add support for VF Transmit rate limit using iproute2
Implemented ixgbe_ndo_set_vf_bw function which is being used by iproute2
tool. In addition, updated ixgbe_ndo_get_vf_config function to show the
actual rate limit to the user.
The rate limitation can be configured only when the link is up and the
link speed is 10Gb.
The rate limit value can be 0 or ranged between 11 and actual link
speed measured in Mbps. A value of '0' disables the rate limit for
this specific VF.
iproute2 usage will be 'ip link set ethX vf Y rate Z'.
After the command is made, the rate will be changed instantly.
To view the current rate limit, use 'ip link show ethX'.
The rates will be zeroed only upon driver reload or a link speed change.
This feature is being supported by 82599 and X540 devices.
Signed-off-by: Lior Levy <lior.levy@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Wed, 9 Mar 2011 04:46:16 +0000 (04:46 +0000)]
ixgbe: DCB, set minimum bandwidth per traffic class
DCB provides a guaranteed bandwidth in the case with 0%
bandwidth then no bandwidth is guaranteed. However the
traffic class should still be able to transmit traffic.
For this to work the traffic class must be given the
minimum credits required to send a frame.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Sat, 5 Mar 2011 01:28:07 +0000 (01:28 +0000)]
ixgbe: update PHY code to support 100Mbps as well as 1G/10G
This change updates the PHY setup code to support 100Mbps capable PHYs
as well as 10G and 1Gbps.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Fri, 4 Mar 2011 03:20:59 +0000 (03:20 +0000)]
ixgbe: remove timer reset to 0 on timeout
The VF mailbox polling for acks and messages would reset the timer to zero
on a timeout. Under heavy load a timeout may actually occur without being
the result of an error and when this occurs it is not practical to perform
a full VF driver reset on every message timeout. Instead, just return an
error (which is already done) and the VF driver will have an opportunity
to retry the operation.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Acked-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Wed, 23 Feb 2011 05:58:25 +0000 (05:58 +0000)]
ixgbe: DCB during ifup use correct CEE or IEEE mode
DCB settings are cleared in the hardware across link events
during ifup ixgbe reprograms the hardware for DCB if it is
enabled. Now that we have two modes CEE or IEEE we need to
use the correct set of configuration data.
This patch checks the dcbx_cap bits and then enables the
device in the correct mode.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support to use the priority assignment
table in the ieee_ets structure to map priorities to
traffic classes. Previously ixgbe only supported a
1:1 mapping. Now we can enable and disable hardware
DCB support when multiple traffic classes are actually
being used. This allows the default case all priorities
mapped to traffic class 0 to work in normal hardware
mode and utilize the full packet buffer.
This patch does not address putting the hardware in
4TC mode so packet buffer space may be underutilized
in this case. A follow up patch can address this
optimization. But at least we have the hooks to do
this now.
Also CEE will behave as it always has and map priorities
1:1 with traffic classes.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ixgbe: DCB, abstract out dcb_config from DCB hardware configuration
However the case when CEE link strict and group strict
are set was missed and are currently being mapped
incorrectly in some configurations.
This patch resolves this.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Wed, 23 Feb 2011 05:58:08 +0000 (05:58 +0000)]
ixgbe: DCB: enable RSS to be used with DCB
RSS had previously been disabled when DCB was enabled because
DCB was single queued per traffic class. Now that DCB implements
multiple Tx/Rx rings per traffic class enable RSS.
Here RSS hashes across the queues in the traffic class.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain.@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Wed, 23 Feb 2011 05:58:03 +0000 (05:58 +0000)]
ixgbe: enable ndo_tc_setup
This patch adds the ndo_tc_setup to ixgbe. By default we set
the device to use strict priority.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain.@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Tue, 8 Mar 2011 03:44:52 +0000 (03:44 +0000)]
ixgbe: DCB, use multiple Tx rings per traffic class
This enables multiple {Tx|Rx} rings per traffic class while in DCB
mode. In order to get this working as expected the tc_to_tx net
device mapping is configured as well as the prio_tc_map.
skb priorities are mapped across a range of queue pairs to get
a distribution per traffic class. The maximum number of
queue pairs used while in DCB mode is capped at 64. The hardware
max is actually 128 queues but 64 is sufficient for now and
allocating more seemed a bit excessive. It is easy enough to
increase the cap later if need be.
To get the 802.1Q priority tags inserted correctly ixgbe was
previously using the skb queue_mapping field to directly set
the 802.1Q priority. This no longer works because we have removed
the 1:1 mapping between queues and traffic class. Each ring
is aligned with an 802.1Qaz traffic class so here we add an
extra field to the ring struct to identify the 802.1Q traffic
class. This uses an extra byte of the ixgbe_ring struct
fortunately there was a 2byte hole,
Now we can set the VLAN priority directly and it will be
correct. User space can indicate the 802.1Qaz priority
using the SO_PRIORITY setsocket() option and QOS layer will
steer the skb to the correct rings. Additionally using
the multiq qdisc with a queue_mapping action works as
well.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Wed, 23 Feb 2011 05:57:52 +0000 (05:57 +0000)]
ixgbe: DCB remove ixgbe_fcoe_getapp routine
Remove ixgbe_fcoe_getapp() and use the generic kernel
routine instead. Also add application priority to the
kernel maintained list on setapp so applications and
stacks can query the value.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Wed, 23 Feb 2011 05:57:47 +0000 (05:57 +0000)]
ixgbe: DCB, implement ieee_setapp dcbnl ops
Implement ieee_setapp dcbnl ops in ixgbe. This is required
to setup FCoE which requires dedicated resources. If the
app data is not for FCoE then no action is taken in ixgbe
except to add it to the dcb_app_list.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Tue, 1 Mar 2011 05:25:35 +0000 (05:25 +0000)]
ixgbe: DCB, implement capabilities flags
This implements dcbnl get and set capabilities ops. The
devices supported by ixgbe can be configured to run in
IEEE or CEE modes but not both.
With the DCBX set capabilities bit we add an explicit
signal that must be used to toggle between these modes.
This patch adds logic to fail the CEE command set_hw_all()
which programs the device with a CEE configuration if
the CEE caps bit is not set. Similarly, IEEE set
commands will fail if the IEEE caps bit is not set. We
allow most CEE config set commands to occur because they
do not touch the hardware until set_hw_all() is called.
The one exception to the above is the {set|get}app routines.
These must always be protected by caps bits to ensure
side effects do not corrupt the current configured mode.
By requiring the caps bit to be set correctly we can
maintain a consistent configuration in the hardware
for CEE or IEEE modes and prevent partial hardware
configurations that may occur if user space does
not send a complete IEEE or CEE configurations.
It is expected that user space will signal a DCBX mode
before programming device.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Carolyn Wyborny [Sat, 12 Mar 2011 04:43:54 +0000 (20:43 -0800)]
igb: Add DMA Coalescing feature to driver
This patch add DMA Coalescing which is a power-saving feature that
coalesces DMA writes in order to stay in a low-power state as much
as possible. Feature is disabled by default.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Carolyn Wyborny [Sat, 12 Mar 2011 04:43:18 +0000 (20:43 -0800)]
igb: Update NVM functions to work with i350 devices
This patch adds functions and functions pointers to accommodate
differences between NVM interfaces and options for i350 devices,
82580 devices and the rest.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Wed, 2 Mar 2011 06:52:30 +0000 (06:52 +0000)]
ixgbevf: Fix Version String
The kernel version string is off by a major version number since
new silicon was just introduced and also uses the wrong format for
the version postfix.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Sat, 26 Feb 2011 03:12:19 +0000 (03:12 +0000)]
e1000e: bump version number
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 4 Mar 2011 09:07:01 +0000 (09:07 +0000)]
e1000e: do not suggest the driver supports Wake-on-ARP
The driver doesn't support Wake-on-ARP, so don't advertise through ethtool
that it does.
Cleanup some coding style issues in the same functions.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 25 Feb 2011 07:44:51 +0000 (07:44 +0000)]
e1000e: disable jumbo frames on 82579 when MACsec enabled in EEPROM
If/when an OEM enables MACsec in the 82579 EEPROM, disable jumbo frames
support in the driver due to an interoperability issue in hardware that
prevents jumbo packets from being transmitted or received.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 25 Feb 2011 06:25:18 +0000 (06:25 +0000)]
e1000e: do not toggle LANPHYPC value bit when PHY reset is blocked
When PHY reset is intentionally blocked on 82577/8/9, do not toggle the
LANPHYPC value bit (essentially performing a hard power reset of the
device) otherwise the PHY can be put into an unknown state.
Cleanup whitespace in the same function.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 25 Feb 2011 06:58:03 +0000 (06:58 +0000)]
e1000e: extend EEE LPI timer to prevent dropped link
The link can be unexpectedly dropped when the timer for entering EEE low-
power-idle quiet state expires too soon. The timer needs to be extended
from 196usec to 200usec after every LCD (PHY) reset to prevent this from
happening.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 25 Feb 2011 06:36:25 +0000 (06:36 +0000)]
e1000e: extend timeout for ethtool link test diagnostic
With some PHYs supported by this driver, link establishment can take a
little longer when connected to certain switches. Extend the timeout to
reduce the number of false diagnostic failures, and cleanup a code style
issue in the same function.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Fri, 25 Feb 2011 07:09:37 +0000 (07:09 +0000)]
e1000e: magic number cleanup - ETH_ALEN
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bruce Allan [Thu, 10 Feb 2011 08:17:21 +0000 (08:17 +0000)]
e1000e: use dev_kfree_skb_irq() instead of dev_kfree_skb()
Based on a report and patch originally submitted by Prasanna Panchamukhi.
Use dev_kfree_skb_irq() in e1000_clean_jumbo_rx_irq() since this latter
function is called only in interrupt context. This avoids "Warning:
kfree_skb on hard IRQ" messages.
Cc: "Prasanna S. Panchamukhi" <prasanna.panchamukhi@riverbed.com> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Lior Levy [Wed, 2 Mar 2011 06:42:37 +0000 (06:42 +0000)]
ixgbevf: remove Tx hang detection
Removed Tx hang detection mechanism from ixgbevf.
This mechanism has no affect and can cause false alarm messages in some
cases. Especially when VF Tx rate limit is turned on.
The same mechanism was removed recently from igbvf.
Signed-off-by: Lior Levy <lior.levy@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Thu, 27 Jan 2011 09:14:18 +0000 (09:14 +0000)]
ixgb: convert to new VLAN model
Based on a patch from Jesse Gross <jesse@nicira.com>
This switches the ixgb driver to use the new VLAN interfaces.
In doing this, it completes the work begun in ae54496f9e8d40c89e5668205c181dccfa9ecda1 allowing the use of
hardware VLAN insertion without having a VLAN group configured.
CC: Jesse Gross <jesse@nicira.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Randy Dunlap [Thu, 10 Mar 2011 21:45:57 +0000 (13:45 -0800)]
net: bridge builtin vs. ipv6 modular
When configs BRIDGE=y and IPV6=m, this build error occurs:
br_multicast.c:(.text+0xa3341): undefined reference to `ipv6_dev_get_saddr'
BRIDGE_IGMP_SNOOPING is boolean; if it were tristate, then adding
depends on IPV6 || IPV6=n
to BRIDGE_IGMP_SNOOPING would be a good fix. As it is currently,
making BRIDGE depend on the IPV6 config works.
Reported-by: Patrick Schaaf <netdev@bof.de> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Doe, YiCheng [Thu, 10 Mar 2011 20:00:21 +0000 (14:00 -0600)]
ipmi: Fix IPMI errors due to timing problems
This patch fixes an issue in OpenIPMI module where sometimes an ABORT command
is sent after sending an IPMI request to BMC causing the IPMI request to fail.
Signed-off-by: YiCheng Doe <yicheng.doe@hp.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Acked-by: Tom Mingarelli <thomas.mingarelli@hp.com> Tested-by: Andy Cress <andy.cress@us.kontron.com> Tested-by: Mika Lansirine <Mika.Lansirinne@stonesoft.com> Tested-by: Brian De Wolf <bldewolf@csupomona.edu> Cc: Jean Michel Audet <Jean-Michel.Audet@ca.Kontron.com> Cc: Jozef Sudelsky <jozef.sudolsky@elbiahosting.sk> Acked-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 10 Mar 2011 21:16:01 +0000 (13:16 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
fs/dcache: allow d_obtain_alias() to return unhashed dentries
Check for immutable/append flag in fallocate path
sysctl: the include of rcupdate.h is only needed in the kernel
fat: fix d_revalidate oopsen on NFS exports
jfs: fix d_revalidate oopsen on NFS exports
ocfs2: fix d_revalidate oopsen on NFS exports
gfs2: fix d_revalidate oopsen on NFS exports
fuse: fix d_revalidate oopsen on NFS exports
ceph: fix d_revalidate oopsen on NFS exports
reiserfs xattr ->d_revalidate() shouldn't care about RCU
/proc/self is never going to be invalidated...
Linus Torvalds [Thu, 10 Mar 2011 21:09:26 +0000 (13:09 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, UV: Initialize the broadcast assist unit base destination node id properly
x86, numa: Fix numa_emulation code with memory-less node0
x86, build: Make sure mkpiggy fails on read error
Linus Torvalds [Thu, 10 Mar 2011 21:08:59 +0000 (13:08 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix sched rt group scheduling when hierachy is enabled
Linus Torvalds [Thu, 10 Mar 2011 21:07:38 +0000 (13:07 -0800)]
Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf symbols: Avoid resolving [kernel.kallsyms] to real path for buildid cache
perf symbols: Fix vmlinux path when not using --symfs
drm/i915: Do not handle backlight combination mode specially
since this commit introduced other regressions due to untouched LBPC
register, e.g. the backlight dimmed after resume.
In addition to the revert, this patch includes a fix for the original
issue (weird backlight levels) by removing the wrong bit shift for
computing the current backlight level.
Also, including typo fixes (lpbc -> lbpc).
the df's will show that the inode is not freed on the filesystem until
the last step, when it could have been freed after killing the client's
tail -f. On-disk data won't be deallocated either, leading to possible
spurious ENOSPC.
This occurs because when the client does the close, it arrives in a
compound with a putfh and a close, processed like:
- putfh: look up the filehandle. The only alias found for the
inode will be DCACHE_UNHASHED alias referenced by the filp
this, so it creates a new DCACHE_DISCONECTED dentry and
returns that instead.
- close: closes the existing filp, which is destroyed
immediately by dput() since it's DCACHE_UNHASHED.
- end of the compound: release the reference
to the current filehandle, and dput() the new
DCACHE_DISCONECTED dentry, which gets put on the
unused list instead of being destroyed immediately.
Nick Piggin suggested fixing this by allowing d_obtain_alias to return
the unhashed dentry that is referenced by the filp, instead of making it
create a new dentry.
Leave __d_find_alias() alone to avoid changing behavior of other
callers.
Also nfsd doesn't need all the checks of __d_find_alias(); any dentry,
hashed or unhashed, disconnected or not, should work.
Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Matt Carlson [Wed, 9 Mar 2011 16:58:25 +0000 (16:58 +0000)]
tg3: Remove 5750 PCI code
The 5750 ASIC rev was never released as a PCI device. It only exists as
a PCIe device. This patch removes the code that supports the former
configuration.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Carlson [Wed, 9 Mar 2011 16:58:24 +0000 (16:58 +0000)]
tg3: Move tg3_init_link_config to tg3_phy_probe
This patch moves the function that initializes the link configuration
closer to the place where the rest of the phy code is initialized.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Carlson [Wed, 9 Mar 2011 16:58:23 +0000 (16:58 +0000)]
tg3: Refine VAux decision process
In the near future, the VAux switching decision process is going to get
more complicated. This patch refines and consolidates the existing
algorithm in anticipation of the new scheme.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Carlson [Wed, 9 Mar 2011 16:58:22 +0000 (16:58 +0000)]
tg3: cleanup pci device table vars
Commit 895950c2a6565d9eefda4a38b00fa28537e39fcb, entitled
"tg3: Use DEFINE_PCI_DEVICE_TABLE" moved two pci device tables into the
global address space, but didn't declare them static and didn't prefix
them with "tg3_". This patch fixes those problems.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Carlson [Wed, 9 Mar 2011 16:58:21 +0000 (16:58 +0000)]
tg3: Add code to verify RODATA checksum of VPD
This patch adds code to verify the checksum stored in the "RV" info
keyword of the RODATA VPD section.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Carlson [Wed, 9 Mar 2011 16:58:20 +0000 (16:58 +0000)]
tg3: Fix NVRAM selftest
The tg3 NVRAM selftest actually fails when validating the checksum of
the legacy NVRAM format. However, the test still reported success
because the last update of the return code was a success from the NVRAM
reads. This patch fixes the code so that the error return code defaults
to a failure status. Then the patch fixes the reason why the checsum
validation failed.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Carlson [Wed, 9 Mar 2011 16:58:19 +0000 (16:58 +0000)]
tg3: Add missed 5719 workaround change
Commit 2866d956fe0ad8fc8d8a7c54104ccc879b49406d, entitled
"tg3: Expand 5719 workaround" extended a 5719 A0 workaround to all
revisions of the chip. There was a change that should have been a
part of that patch that was missed. This patch adds the missing
piece.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Reviewed-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Marco Stornelli [Sat, 5 Mar 2011 10:10:19 +0000 (11:10 +0100)]
Check for immutable/append flag in fallocate path
In the fallocate path the kernel doesn't check for the immutable/append
flag. It's possible to have a race condition in this scenario: an
application open a file in read/write and it does something, meanwhile
root set the immutable flag on the file, the application at that point
can call fallocate with success. In addition, we don't allow to do any
unreserve operation on an append only file but only the reserve one.
Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
David S. Miller [Thu, 10 Mar 2011 04:57:50 +0000 (20:57 -0800)]
ipv4: Optimize flow initialization in fib_validate_source().
Like in commit 44713b67db10c774f14280c129b0d5fd13c70cf2
("ipv4: Optimize flow initialization in output route lookup."
we can optimize the on-stack flow setup to only initialize
the members which are actually used.
Otherwise we bzero the entire structure, then initialize
explicitly the first half of it.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 10 Mar 2011 04:42:07 +0000 (20:42 -0800)]
ipv4: Optimize flow initialization in input route lookup.
Like in commit 44713b67db10c774f14280c129b0d5fd13c70cf2
("ipv4: Optimize flow initialization in output route lookup."
we can optimize the on-stack flow setup to only initialize
the members which are actually used.
Otherwise we bzero the entire structure, then initialize
explicitly the first half of it.
Signed-off-by: David S. Miller <davem@davemloft.net>
The reason we do that is to make sure we never bind an inetpeer to a
prefixed route.
The logic turned on here has existed in the tree for many years,
but was always off due to a protecting CPP define. So perhaps
it's no surprise that there is a logic bug here.
The problem is that we canot clone a route that is already a
host route (ie. has DST_HOST set). Because if we do, an identical
entry already exists in the routing tree and therefore the
ip6_rt_ins() call is going to fail.
This sets off a series of failures and high cpu usage, because when
ip6_rt_ins() fails we loop retrying this operation a few times in
order to handle a race between two threads trying to clone and insert
the same host route at the same time.
Fix this by simply using the route as-is when DST_HOST is set.
Reported-by: slash@ac.auone-net.jp Reported-by: Ernst Sjöstrand <ernstp@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 10 Mar 2011 00:46:06 +0000 (16:46 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/pseries: Disable VPNH feature
powerpc/iseries: Fix early init access to lppaca
Vasiliy Kulikov [Tue, 1 Mar 2011 21:33:13 +0000 (00:33 +0300)]
net: don't allow CAP_NET_ADMIN to load non-netdev kernel modules
Since a8f80e8ff94ecba629542d9b4b5f5a8ee3eb565c any process with
CAP_NET_ADMIN may load any module from /lib/modules/. This doesn't mean
that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are
limited to /lib/modules/**. However, CAP_NET_ADMIN capability shouldn't
allow anybody load any module not related to networking.
This patch restricts an ability of autoloading modules to netdev modules
with explicit aliases. This fixes CVE-2011-1019.
Arnd Bergmann suggested to leave untouched the old pre-v2.6.32 behavior
of loading netdev modules by name (without any prefix) for processes
with CAP_SYS_MODULE to maintain the compatibility with network scripts
that use autoloading netdev modules by aliases like "eth0", "wlan0".
Currently there are only three users of the feature in the upstream
kernel: ipip, ip_gre and sit.
root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) --
root@albatros:~# grep Cap /proc/$$/status
CapInh: 0000000000000000
CapPrm: fffffff800001000
CapEff: fffffff800001000
CapBnd: fffffff800001000
root@albatros:~# modprobe xfs
FATAL: Error inserting xfs
(/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted
root@albatros:~# lsmod | grep xfs
root@albatros:~# ifconfig xfs
xfs: error fetching interface information: Device not found
root@albatros:~# lsmod | grep xfs
root@albatros:~# lsmod | grep sit
root@albatros:~# ifconfig sit
sit: error fetching interface information: Device not found
root@albatros:~# lsmod | grep sit
root@albatros:~# ifconfig sit0
sit0 Link encap:IPv6-in-IPv4
NOARP MTU:1480 Metric:1
root@albatros:~# lsmod | grep sit
sit 10457 0
tunnel4 2957 1 sit
For CAP_SYS_MODULE module loading is still relaxed:
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: James Morris <jmorris@namei.org>
The problem is that iSeries very early boot code, which generates
the device-tree and runs before our normal early initializations
does need access the lppaca's very early, before the PACA array is
initialized, and in fact even before the boot PACA has been
initialized (it contains all 0's at this stage).
However, the first patch above makes that code use the new
llpaca_of(cpu) accessor, which itself is changed by the second patch to
use the PACA array.
We fix that by reverting iSeries to directly dereferencing the array. In
addition, we fix all iterators in the iSeries code to always skip CPU
whose number is above 63 which is the maximum size of that array and
the maximum number of supported CPUs on these machines.
Additionally, we make sure the boot_paca is properly initialized
in our early startup code.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Linus Torvalds [Wed, 9 Mar 2011 22:52:09 +0000 (14:52 -0800)]
Merge branch 'for-2.6.38' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.38' of git://linux-nfs.org/~bfields/linux:
nfsd: wrong index used in inner loop
nfsd4: fix bad pointer on failure to find delegation
NFSD: fix decode_cb_sequence4resok
Daniel Turull [Wed, 9 Mar 2011 22:11:00 +0000 (14:11 -0800)]
pktgen: fix errata in show results
The units in show_results in pktgen were not correct.
The results are in usec but it was displayed nsec.
Reported-by: Jong-won Lee <ljw@handong.edu> Signed-off-by: Daniel Turull <daniel.turull@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
tcp: ioctl type SIOCOUTQNSD returns amount of data not sent
In contrast to SIOCOUTQ which returns the amount of data sent
but not yet acknowledged plus data not yet sent this patch only
returns the data not sent.
For various methods of live streaming bitrate control it may
be helpful to know how much data are in the tcp outqueue are
not sent yet.
Signed-off-by: Mario Schuknecht <m.schuknecht@dresearch.de> Signed-off-by: Steffen Sledz <sledz@dresearch.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 9 Mar 2011 22:03:59 +0000 (14:03 -0800)]
Merge branch 'for-2639-rc7/i2c-fixes' of git://git.fluff.org/bjdooks/linux
* 'for-2639-rc7/i2c-fixes' of git://git.fluff.org/bjdooks/linux:
i2c-eg20t: include slab.h for memory allocations
i2c-ocores: Fix pointer type mismatch error
i2c-omap: Program I2C_WE on OMAP4 to enable i2c wakeup
Linus Torvalds [Wed, 9 Mar 2011 21:55:51 +0000 (13:55 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
nd->inode is not set on the second attempt in path_walk()
unfuck proc_sysctl ->d_compare()
minimal fix for do_filp_open() race
Amerigo Wang [Sun, 6 Mar 2011 21:58:46 +0000 (21:58 +0000)]
bonding: move procfs code into bond_procfs.c
V2: Move #ifdef CONFIG_PROC_FS into bonding.h, as suggested by David.
bond_main.c is bloating, separate the procfs code out,
move them to bond_procfs.c
Signed-off-by: WANG Cong <amwang@redhat.com> Reviewed-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 9 Mar 2011 21:27:16 +0000 (13:27 -0800)]
ipv4: Fix erroneous uses of ifa_address.
In usual cases ifa_address == ifa_local, but in the case where
SIOCSIFDSTADDR sets the destination address on a point-to-point
link, ifa_address gets set to that destination address.
Therefore we should use ifa_local when we want the local interface
address.
There were two cases where the selection was done incorrectly:
1) When devinet_ioctl() does matching, it checks ifa_address even
though gifconf correct reported ifa_local to the user
2) IN_DEV_ARP_NOTIFY handling sends a gratuitous ARP using
ifa_address instead of ifa_local.
Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Halperin [Wed, 9 Mar 2011 11:10:18 +0000 (03:10 -0800)]
mac80211: update minstrel_ht sample rate when probe is set
Waiting until the status is received can cause the same rate to be
probed multiple times consecutively.
Cc: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Daniel Halperin <dhalperi@cs.washington.edu> Signed-off-by: John W. Linville <linville@tuxdriver.com>
net/wireless: add COUNTRY to to regulatory device uevent
Regulatory devices issue change uevents to inform userspace of a need
to call the crda tool; however these can often be sent before udevd is
running, and were not previously included in the results of
udevadm trigger (which requests a new change event using the /uevent
attribute of the sysfs object).
Add a uevent function to the device type which includes the COUNTRY
information from the last request if it has yet to be processed, the
case of multiple requests is already handled in the code by checking
whether an unprocessed one is queued in the same manner and refusing
to queue a new one.
The existing udev rule continues to work as before.
Signed-off-by: Scott James Remnant <keybuk@google.com> Acked-By: Kay Sievers <kay.sievers@vrfy.org> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Phonet: kill the ST-Ericsson pipe controller Kconfig
This is now a run-time choice so that a single kernel can support both
old and new generation ISI modems. Support for manually enabling the
pipe flow is removed as it did not work properly, does not fit well
with the socket API, and I am not aware of any use at the moment.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Phonet: support active connection without pipe controller on modem
This provides support for newer ISI modems with no need for the
earlier experimental compile-time alternative choice. With this,
we can now use the same kernel and userspace with both types of
modems.
This also avoids confusing two different and incompatible state
machines, actively connected vs accepted sockets, and adds
connection response error handling (processing "SYN/RST" of sorts).
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Phonet: provide pipe socket option to retrieve the pipe identifier
User-space sometimes needs this information. In particular, the GPRS
context or the AT commands pipe setups may use the pipe handle as a
reference.
This removes the settable pipe handle with CONFIG_PHONET_PIPECTRLR.
It did not handle error cases correctly. Furthermore, the kernel
*could* implement a smart scheme for allocating handles (if ever
needed), but userspace really cannot.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Phonet: allocate sock from accept syscall rather than soft IRQ
This moves most of the accept logic to process context like other
socket stacks do. Then we can use a few more common socket helpers
and simplify a bit.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>