Don Skidmore [Thu, 4 Aug 2011 02:07:48 +0000 (02:07 +0000)]
ixgbe: add ECC warning for legacy interrupts
Noticed that the legacy Interrupt handler didn't have the same
ECC warning as did the MSI. So this patch adds it.
Signed-off-by: Don Skidmore <donald.c.skidmore> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Wed, 17 Aug 2011 10:15:21 +0000 (10:15 +0000)]
ixgbe: cleanup ixgbe_setup_gpie() for X540
The X540 thermal sensor interrupt isn't a General Purpose Interrupt
so doesn't need to be enabled in ixgbe_setup_gpie(). Likewise X540 doesn't
use the SDP0 for thermal sensor so it doesn't need to be enabled for any
device other than 82599.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Sat, 20 Aug 2011 04:49:45 +0000 (04:49 +0000)]
ixgbe add thermal sensor support for x540 hardware
Add code to enable thermal sensors for the x540 hardware, as well as a
thermal interrupt check which will exit with a critical message of a
thermal overheat is detected. Intent of code allows other mac types to
be added with different configuration in the future.
Fixed in this version is the addition of setting the temp_sensor
capable flag which was previously only set for a specific mac.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Tue, 23 Aug 2011 03:14:22 +0000 (03:14 +0000)]
ixgbe: update {P}FC thresholds to account for X540 and loopback
Revise high and low threshold marks wrt flow control to account
for the X540 devices and latency introduced by the loopback
switch.
Without this it was in theory possible to drop frames on a
supposedly lossless link with X540 or SR-IOV enabled.
Previously we used a magic number in a define to calculate the
threshold values. This made it difficult to sort out exactly
which latencies were or were not being accounted for. Here
I was overly explicit and tried to used #define names that would
be recognizable after reading the IEEE 802.1Qbb specification.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Vasu Dev [Thu, 18 Aug 2011 06:20:07 +0000 (06:20 +0000)]
ixgbe: disable LLI for FCoE
Disable LLI for FCoE since regular interrupt
and their moderation rate works slightly better
for FCoE also.
Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch is meant to help cleanup the interrupt throttle rate logic by
storing the interrupt throttle rate as a value in microseconds instead of
interrupts per second. The advantage to this approach is that the value
can now be stored in an 16 bit field and doesn't require as much math to
flip the value back and forth since the hardware already used microseconds
when setting the rate.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Dean Nelson [Fri, 16 Sep 2011 16:52:54 +0000 (16:52 +0000)]
e1000: don't enable dma receives until after dma address has been setup
Doing an 'ifconfig ethN down' followed by an 'ifconfig ethN up' on a qemu-kvm
guest system configured with two e1000 NICs can result in an 'unable to handle
kernel paging request at 0000000100000000' or 'bad page map in process ...' or
something similar.
These result from a 4096-byte page being corrupted with the following two-word
pattern (16-bytes) repeated throughout the entire page:
0x0000000000000000
0x0000000100000000
There can be other bits set as well. What is a constant is that the 2nd word
has the 32nd bit set. So one could see:
Which came from from a process' page table I dumped out when the marked line
was seen as bad by print_bad_pte().
The repeating pattern represents the e1000's two-word receive descriptor:
struct e1000_rx_desc {
__le64 buffer_addr; /* Address of the descriptor's data buffer */
__le16 length; /* Length of data DMAed into data buffer */
__le16 csum; /* Packet checksum */
u8 status; /* Descriptor status */
u8 errors; /* Descriptor Errors */
__le16 special;
};
And the 32nd bit of the 2nd word maps to the 'u8 status' member, and
corresponds to E1000_RXD_STAT_DD which indicates the descriptor is done.
The corruption appears to result from the following...
. An 'ifconfig ethN down' gets us into e1000_close(), which through a number
of subfunctions results in:
1. E1000_RCTL_EN being cleared in RCTL register. [e1000_down()]
2. dma_free_coherent() being called. [e1000_free_rx_resources()]
. An 'ifconfig ethN up' gets us into e1000_open(), which through a number of
subfunctions results in:
1. dma_alloc_coherent() being called. [e1000_setup_rx_resources()]
2. E1000_RCTL_EN being set in RCTL register. [e1000_setup_rctl()]
3. E1000_RCTL_EN being cleared in RCTL register. [e1000_configure_rx()]
4. RDLEN, RDBAH and RDBAL registers being set to reflect the dma page
allocated in step 1. [e1000_configure_rx()]
5. E1000_RCTL_EN being set in RCTL register. [e1000_configure_rx()]
During the 'ifconfig ethN up' there is a window opened, starting in step 2
where the receives are enabled up until they are disabled in step 3, in which
the address of the receive descriptor dma page known by the NIC is still the
previous one which was freed during the 'ifconfig ethN down'. If this memory
has been reallocated for some other use and the NIC feels so inclined, it will
write to that former dma page with predictably unpleasant results.
I realize that in the guest, we're dealing with an e1000 NIC that is software
emulated by qemu-kvm. The problem doesn't appear to occur on bare-metal. Andy
suspects that this is because in the emulator link-up is essentially instant
and traffic can start flowing immediately. Whereas on bare-metal, link-up
usually seems to take at least a few milliseconds. And this might be enough
to prevent traffic from flowing into the device inside the window where
E1000_RCTL_EN is set.
So perhaps a modification needs to be made to the qemu-kvm e1000 NIC emulator
to delay the link-up. But in defense of the emulator, it seems like a bad idea
to enable dma operations before the address of the memory to be involved has
been made known.
The following patch no longer enables receives in e1000_setup_rctl() but leaves
them however they were. It only enables receives in e1000_configure_rx(), and
only after the dma address has been made known to the hardware.
There are two places where e1000_setup_rctl() gets called. The one in
e1000_configure() is followed immediately by a call to e1000_configure_rx(), so
there's really no change functionally (except for the removal of the problem
window. The other is in __e1000_shutdown() and is not followed by a call to
e1000_configure_rx(), so there is a change functionally. But consider...
. An 'ifconfig ethN down' (just as described above).
. A 'suspend' of the system, which (I'm assuming) will find its way into
e1000_suspend() which calls __e1000_shutdown() resulting in:
1. E1000_RCTL_EN being set in RCTL register. [e1000_setup_rctl()]
And again we've re-opened the problem window for some unknown amount of time.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Oliver Hartkopp [Wed, 28 Sep 2011 02:50:11 +0000 (02:50 +0000)]
candev: allow SJW user setting for bittiming calculation
This patch adds support for SJW user settings to not set the synchronization
jump width (SJW) to 1 in any case when using the in-kernel bittiming
calculation.
The ip-tool from iproute2 already supports to pass the user defined SJW
value. The given SJW value is sanitized with the controller specific sjw_max
and the calculated tseg2 value. As the SJW can have values up to 4 providing
this value will lead to the maximum possible SJW automatically. A higher SJW
allows higher controller oscillator tolerances.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Hartkopp [Fri, 23 Sep 2011 06:59:48 +0000 (06:59 +0000)]
can/sja1000: add driver for EMS PCMCIA card
This patch adds the driver for the SJA1000 based PCMCIA card 'CPC-Card' from
EMS Dr. Thomas Wuensche (http://www.ems-wuensche.de).
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Acked-by: Markus Plessing <plessing@ems-wuensche.com> Signed-off-by: David S. Miller <davem@davemloft.net>
connector: add comm change event report to proc connector
Add an event to monitor comm value changes of tasks. Such an event
becomes vital, if someone desires to control threads of a process in
different manner.
A natural characteristic of threads is its comm value, and helpfully
application developers have an opportunity to change it in runtime.
Reporting about such events via proc connector allows to fine-grain
monitoring and control potentials, for instance a process control daemon
listening to proc connector and following comm value policies can place
specific threads to assigned cgroup partitions.
It might be possible to achieve a pale partial one-shot likeness without
this update, if an application changes comm value of a thread generator
task beforehand, then a new thread is cloned, and after that proc
connector listener gets the fork event and reads new thread's comm value
from procfs stat file, but this change visibly simplifies and extends the
matter.
Signed-off-by: Vladimir Zapolskiy <vzapolskiy@gmail.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 19 Sep 2011 05:52:27 +0000 (05:52 +0000)]
af_unix: dont send SCM_CREDENTIALS by default
Since commit 7361c36c5224 (af_unix: Allow credentials to work across
user and pid namespaces) af_unix performance dropped a lot.
This is because we now take a reference on pid and cred in each write(),
and release them in read(), usually done from another process,
eventually from another cpu. This triggers false sharing.
This patch includes SCM_CREDENTIALS information in a af_unix message/skb
only if requested by the sender, [man 7 unix for details how to include
ancillary data using sendmsg() system call]
Note: This might break buggy applications that expected SCM_CREDENTIAL
from an unaware write() system call, and receiver not using SO_PASSCRED
socket option.
If SOCK_PASSCRED is set on source or destination socket, we still
include credentials for mere write() syscalls.
Performance boost in hackbench : more than 50% gain on a 16 thread
machine (2 quad-core cpus, 2 threads per core)
hackbench 20 thread 2000
4.228 sec instead of 9.102 sec
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Most boards with SysKonnect/Marvell Ethernet have only a single port.
For the single port case, use the standard Ethernet driver convention
of allocating IRQ when device is brought up rather than at probe time.
This patch also adds some additional read after writes to avoid any
PCI posting problems when setting the IRQ mask.
The error handling of dual port cards is also changed. If second port
can not be brought up, then just fail. No point in continuing, since
the failure is most certainly because of out of memory.
It is worth noting that the dual port skge device has a single irq but two
seperate status rings and therefore has two NAPI objects, one for
each port.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This fix provides a newly flashed FW version (appended, in braces)
along with the currently running FW version via ethtool. The newly
flashed version runs only after a system reset.
Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
enic: Add support for port profile association on a enic SRIOV VF
This patch touchs most of the enic port profile handling code.
Tried to break it into sub patches without success.
The patch mainly does the following:
- Port profile operations for a SRIOV VF are modified to work
only via its PF
- Changes the port profile static struct in struct enic to a pointer.
This is because a SRIOV PF has to now hold the port profile information
for all its VF's
- Moved address registration for VF's during port profile ASSOCIATE time
- Most changes in port profile handling code are changes related to indexing
into the port profile struct array of a PF for the VF port profile
information
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: Sujith Sankar <ssujith@cisco.com> Signed-off-by: Christian Benvenuti <benve@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds helper functions to use PF as proxy for SRIOV VF firmware
commands.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: Sujith Sankar <ssujith@cisco.com> Signed-off-by: Christian Benvenuti <benve@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support to enable SRIOV on enic devices. Enic SRIOV VF's are dynamic vnics and will use the same driver code as dynamic vnics.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com> Signed-off-by: Sujith Sankar <ssujith@cisco.com> Signed-off-by: Christian Benvenuti <benve@cisco.com> Signed-off-by: David Wang <dwang2@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 22 Sep 2011 20:02:19 +0000 (20:02 +0000)]
tcp: ECN blackhole should not force quickack mode
While playing with a new ADSL box at home, I discovered that ECN
blackhole can trigger suboptimal quickack mode on linux : We send one
ACK for each incoming data frame, without any delay and eventual
piggyback.
This is because TCP_ECN_check_ce() considers that if no ECT is seen on a
segment, this is because this segment was a retransmit.
Refine this heuristic and apply it only if we seen ECT in a previous
segment, to detect ECN blackhole at IP level.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jamal Hadi Salim <jhs@mojatatu.com> CC: Jerry Chu <hkchu@google.com> CC: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> CC: Jim Gettys <jg@freedesktop.org> CC: Dave Taht <dave.taht@gmail.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
Most sky2 hardware only has a single port, although some variations of the
chip support two interfaces. For the single port case, use the standard
Ethernet driver convention of allocating IRQ when device is brought up
rather than at probe time.
Also, change the error handling of dual port cards so that if second
port can not be brought up, then just fail. No point in continuing, since
the failure is most certainly because of out of memory.
The dual port sky2 device has a single irq and a single status ring,
therefore it has a single NAPI object shared by both ports.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
xen/pciback: Add flag indicating device has been assigned by Xen
Device drivers that create and destroy SR-IOV virtual functions via
calls to pci_enable_sriov() and pci_disable_sriov can cause catastrophic
failures if they attempt to destroy VFs while they are assigned to
guest virtual machines. By adding a flag for use by the Xen PCI back
to indicate that a device is assigned a device driver can check that
flag and avoid destroying VFs while they are assigned and avoid system
failures.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Currently ehea ndo_get_stats can sleep in two places, in a hcall
and in a GFP_KERNEL alloc, which is not correct.
This patch creates a delayed workqueue that grabs the information each 1
sec from the hardware, and place it into the device structure, so that,
.ndo_get_stats quickly returns the device structure statistics block.
Signed-off-by: Breno Leitao <brenohl@br.ibm.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Cochran [Tue, 20 Sep 2011 01:43:16 +0000 (01:43 +0000)]
dp83640: add time stamp insertion for sync messages
This commit adds one step support to the phyter. When enabled, the
hardware does not provide time stamps for transmitted sync messages but
instead inserts the stamp into the outgoing packet.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Cochran [Tue, 20 Sep 2011 01:43:15 +0000 (01:43 +0000)]
net: introduce ptp one step time stamp mode for sync packets
The IEEE 1588 standard (PTP) has a provision for a "one step" mode, where
time stamps on outgoing event packets are inserted into the packet by the
hardware on the fly. This patch adds a new flag for the SIOCSHWTSTAMP
ioctl that lets user space programs request this mode.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Cochran [Tue, 20 Sep 2011 01:43:14 +0000 (01:43 +0000)]
dp83640: enable six external events and one periodic output
This patch enables six external event channels and one periodic output.
One GPIO is reserved for synchronizing multiple PHYs. The assignment
of GPIO functions can be changed via a module parameter.
The code supports multiple simultaneous events by inducing a PTP clock
event for every channel marked in the PHY's extended status word.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Signed-off-by: David S. Miller <davem@davemloft.net>
Argument list to CDRP function has become unmanageably long. Fix it by properly
declaring a struct that encompasses all the input and output parameters.
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ameen Rahman <ameen.rahman@qlogic.com> Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
In function fec_enet_mii_init(), it uses non-zero pdev->id as part
of the condition to check the second fec instance (fec1). This works
before the driver supports device tree probe. But in case of device
tree probe, pdev->id is -1 which is also non-zero, so the logic becomes
broken when device tree probe gets supported.
The patch change the logic to check "pdev->id > 0" as the part of the
condition for identifying fec1.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
net/fec: fec_reset_phy() does not need to always succeed
FEC can work without a phy reset on some platforms, which means not
very platform necessarily have a phy-reset gpio encoded in device tree.
Even on the platforms that have the gpio, FEC can work without
resetting phy for some cases, e.g. boot loader has done that.
So it makes more sense to have the phy-reset-gpio request failure as
a debug message rather than a warning, and get fec_reset_phy() return
void since the caller does not check the return anyway.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Fri, 23 Sep 2011 02:11:30 +0000 (02:11 +0000)]
ixgb: finish conversion to ndo_fix_features
Finish conversion to unified ethtool ops: convert get_flags.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The SEEQ drivers should depend on HAS_IOMEM to prevent compile breakage
on !HAS_IOMEM architectures:
drivers/net/ethernet/seeq/seeq8005.c: In function 'seeq8005_probe1':
drivers/net/ethernet/seeq/seeq8005.c:179:2: error:
implicit declaration of function 'inw' [-Werror=implicit-function-declaration]
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Emil Tantilov [Tue, 16 Aug 2011 08:04:11 +0000 (08:04 +0000)]
ixgbe: remove global reset to the MAC
Reloading FW during resets can cause issues. Remove the full reset
as it is not needed.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Tue, 16 Aug 2011 07:34:18 +0000 (07:34 +0000)]
ixgbe: add WOL support for X540
Add support for WOL as determined by the EEPROM.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Tue, 16 Aug 2011 04:35:11 +0000 (04:35 +0000)]
ixgbe: avoid HW lockup when adapter is reset with Tx work pending
This change is meant to avoid a hardware lockup when Tx work is still
pending and we request a reset.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Thu, 4 Aug 2011 07:15:55 +0000 (07:15 +0000)]
ixgbe: dcb, set priority to traffic class mappings
This patch adds support for configuring the priority to
traffic class mapping.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Thu, 4 Aug 2011 09:28:30 +0000 (09:28 +0000)]
ixgbe: cleanup X540 interrupt enablement
We don't need SFP+ plugable support for X540 hardware (copper only) so
don't enable the SFP+ interrupts.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
John Fastabend [Thu, 4 Aug 2011 05:47:07 +0000 (05:47 +0000)]
ixgbe: DCB, do not call set_state() from IEEE mode
The DCB CEE command set_state() will complete successfully
but is misleading because it enables IEEE mode. After
this patch the command is failed.
And IEEE PFC/ETS is managed from ieee paths now instead
of using CEE primitives.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Wed, 24 Aug 2011 02:37:55 +0000 (02:37 +0000)]
ixgbe: Reconfigure SR-IOV Init
Use the PCI device flag indicating if a VF is assigned to a guest VM
to guard against destroying VFs upon driver removal. Implement
additional feature to detect if VFs already exist when the driver
is loaded and if so configure them and set the driver state to
SR-IOV enabled.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Rose [Fri, 22 Jul 2011 05:46:07 +0000 (05:46 +0000)]
pci: Add flag indicating device has been assigned by KVM
Device drivers that create and destroy SR-IOV virtual functions via
calls to pci_enable_sriov() and pci_disable_sriov can cause catastrophic
failures if they attempt to destroy VFs while they are assigned to
guest virtual machines. By adding a flag for use by the KVM module
to indicate that a device is assigned a device driver can check that
flag and avoid destroying VFs while they are assigned and avoid system
failures.
CC: Ian Campbell <ijc@hellion.org.uk> CC: Konrad Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ian Campbell [Wed, 21 Sep 2011 21:53:28 +0000 (21:53 +0000)]
vmxnet3: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Shreyas Bhatewara <sbhatewara@vmware.com> Cc: "VMware, Inc." <pv-drivers@vmware.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Wed, 21 Sep 2011 21:53:27 +0000 (21:53 +0000)]
virtionet: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: virtualization@lists.linux-foundation.org Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Wed, 21 Sep 2011 21:53:26 +0000 (21:53 +0000)]
via-velocity: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Francois Romieu <romieu@fr.zoreil.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Wed, 21 Sep 2011 21:53:25 +0000 (21:53 +0000)]
typhoon: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: David Dillow <dave@thedillows.org> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Wed, 21 Sep 2011 21:53:23 +0000 (21:53 +0000)]
tehuti: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Alexander Indenbaum <baum@tehutinetworks.net> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Wed, 21 Sep 2011 21:53:20 +0000 (21:53 +0000)]
stmmac: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Wed, 21 Sep 2011 21:53:18 +0000 (21:53 +0000)]
sky2: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Stephen Hemminger <shemminger@linux-foundation.org> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Wed, 21 Sep 2011 21:53:17 +0000 (21:53 +0000)]
skge: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Stephen Hemminger <shemminger@linux-foundation.org> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Wed, 21 Sep 2011 21:53:16 +0000 (21:53 +0000)]
sfc: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com> Cc: Steve Hodgson <shodgson@solarflare.com> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Wed, 21 Sep 2011 21:53:15 +0000 (21:53 +0000)]
s2io: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jon Mason <jdmason@kudzu.us> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Francois Romieu [Tue, 3 May 2011 14:38:29 +0000 (16:38 +0200)]
r8169: jumbo fixes.
- fix features : jumbo frames and checksumming can not be used at the
same time.
- introduce hw_jumbo_{enable / disable} helpers. Their content has been
creatively extracted from Realtek's own drivers. As an illustration,
it would be nice to know how/if the MaxTxPacketSize register operates
when the device can work with a 9k jumbo frame as its documentation
(8168c) can not be applied beyond ~7k.
- rtl_tx_performance_tweak is moved forward. No change.
8168d and above allow jumbo frames beyond 8k. Bump the received
packet length check before enabling jumbo frames on these chipsets.
Frame length indication covers bits 0..13 of the first Rx descriptor
32 bits for the 8169 and 8168. I only have authoritative documentation
for the allowed use of the extra (13) bit with the 8169 and 8168c.
Realtek's drivers use the same mask for the 816x and the fast ethernet
only 810x.
If register_netdev() fails now, then we call mutex_unlock(&bnad->conf_mutex);
on the error path, but it's already unlocked. So we acquire the lock in error
path which will be later unlocked after the cleanup.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Rasesh Mody <rmody@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Lasse Collin [Wed, 21 Sep 2011 14:30:50 +0000 (17:30 +0300)]
XZ: Fix incorrect XZ_BUF_ERROR
xz_dec_run() could incorrectly return XZ_BUF_ERROR if all of the
following was true:
- The caller knows how many bytes of output to expect and only provides
that much output space.
- When the last output bytes are decoded, the caller-provided input
buffer ends right before the LZMA2 end of payload marker. So LZMA2
won't provide more output anymore, but it won't know it yet and thus
won't return XZ_STREAM_END yet.
- A BCJ filter is in use and it hasn't left any unfiltered bytes in the
temp buffer. This can happen with any BCJ filter, but in practice
it's more likely with filters other than the x86 BCJ.
This fixes <https://bugzilla.redhat.com/show_bug.cgi?id=735408> where
Squashfs thinks that a valid file system is corrupt.
This also fixes a similar bug in single-call mode where the uncompressed
size of a block using BCJ + LZMA2 was 0 bytes and caller provided no
output space. Many empty .xz files don't contain any blocks and thus
don't trigger this bug.
This also tweaks a closely related detail: xz_dec_bcj_run() could call
xz_dec_lzma2_run() to decode into temp buffer when it was known to be
useless. This was harmless although it wasted a minuscule number of CPU
cycles.
* git://github.com/davem330/net: (27 commits)
xfrm: Perform a replay check after return from async codepaths
fib:fix BUG_ON in fib_nl_newrule when add new fib rule
ixgbe: fix possible null buffer error
tg3: fix VLAN tagging regression
net: pxa168: Fix build errors by including interrupt.h
netconsole: switch init_netconsole() to late_initcall
gianfar: Fix overflow check and return value for gfar_get_cls_all()
ppp_generic: fix multilink fragment MTU calculation (again)
GRETH: avoid overwrite IP-stack's IP-frags checksum
GRETH: RX/TX bytes were never increased
ipv6: fix a possible double free
b43: Fix beacon problem in ad-hoc mode
Bluetooth: add support for 2011 mac mini
Bluetooth: Add MacBookAir4,1 support
Bluetooth: Fixed BT ST Channel reg order
r8169: do not enable the TBI for anything but the original 8169.
r8169: remove erroneous processing of always set bit.
r8169: fix WOL setting for 8105 and 8111evl
r8169: add MODULE_FIRMWARE for the firmware of 8111evl
r8169: fix the reset setting for 8111evl
...
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
* 'for-linus' of git://git.kernel.dk/linux-block:
floppy: use del_timer_sync() in init cleanup
blk-cgroup: be able to remove the record of unplugged device
block: Don't check QUEUE_FLAG_SAME_COMP in __blk_complete_request
mm: Add comment explaining task state setting in bdi_forker_thread()
mm: Cleanup clearing of BDI_pending bit in bdi_forker_thread()
block: simplify force plug flush code a little bit
block: change force plug flush call order
block: Fix queue_flag update when rq_affinity goes from 2 to 1
block: separate priority boosting from REQ_META
block: remove READ_META and WRITE_META
xen-blkback: fixed indentation and comments
xen-blkback: Don't disconnect backend until state switched to XenbusStateClosed.
init: carefully handle loglevel option on kernel cmdline.
When a malformed loglevel value (for example "${abc}") is passed on the
kernel cmdline, the loglevel itself is being set to 0.
That then suppresses all following messages, including all the errors
and crashes caused by other malformed cmdline options. This could make
debugging process quite tricky.
This patch leaves the previous value of loglevel if the new value is
incorrect and reports an error code in this case.
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@sysgo.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Hansen [Tue, 20 Sep 2011 22:19:41 +0000 (15:19 -0700)]
teach /proc/$pid/numa_maps about transparent hugepages
This is modeled after the smaps code.
It detects transparent hugepages and then does a single gather_stats()
for the page as a whole. This has two benifits:
1. It is more efficient since it does many pages in a single shot.
2. It does not have to break down the huge page.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Hansen [Tue, 20 Sep 2011 22:19:39 +0000 (15:19 -0700)]
break out numa_maps gather_pte_stats() checks
gather_pte_stats() does a number of checks on a target page
to see whether it should even be considered for statistics.
This breaks that code out in to a separate function so that
we can use it in the transparent hugepage case in the next
patch.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Acked-by: Hugh Dickins <hughd@google.com> Reviewed-by: Christoph Lameter <cl@gentwo.org> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Hansen [Tue, 20 Sep 2011 22:19:38 +0000 (15:19 -0700)]
make /proc/$pid/numa_maps gather_stats() take variable page size
We need to teach the numa_maps code about transparent huge pages. The
first step is to teach gather_stats() that the pte it is dealing with
might represent more than one page.
Note that will we use this in a moment for transparent huge pages since
they have use a single pmd_t which _acts_ as a "surrogate" for a bunch
of smaller pte_t's.
I'm a _bit_ unhappy that this interface counts in hugetlbfs page sizes
for hugetlbfs pages and PAGE_SIZE for normal pages. That means that to
figure out how many _bytes_ "dirty=1" means, you must first know the
hugetlbfs page size. That's easier said than done especially if you
don't have visibility in to the mount.
But, that's probably a discussion for another day especially since it
would change behavior to fix it. But, just in case anyone wonders why
this patch only passes a '1' in the hugetlb case...
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Finn Thain [Tue, 13 Sep 2011 07:30:25 +0000 (07:30 +0000)]
macmace, macsonic: cleanup
We check ether_type before registering the platform device in
arch/m68k/mac/config.c. Doing the same test again in the driver is
redundant so remove it.
Multiple probes should not happen since the conversion to platform devices,
so lose that test too.
Then macmace.c need not include macintosh.h, so remove that and irq.h and
include linux/interrupt.h explicitly.
Tested on PowerBook 520, Quadra 660av, LC 630.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
xfrm: Perform a replay check after return from async codepaths
When asyncronous crypto algorithms are used, there might be many
packets that passed the xfrm replay check, but the replay advance
function is not called yet for these packets. So the replay check
function would accept a replay of all of these packets. Also the
system might crash if there are more packets in async processing
than the size of the anti replay window, because the replay advance
function would try to update the replay window beyond the bounds.
This pach adds a second replay check after resuming from the async
processing to fix these issues.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Martin [Tue, 13 Sep 2011 00:49:29 +0000 (00:49 +0000)]
net/smsc911x: Correctly configure 16-bit register access from DT
The SMSC911X_USE_16BIT needs to be set when using 16-bit register
access. However, currently no flag is set if the device tree
doesn't specify 32-bit access, resulting in a BUG() and a non-
working driver when 16-bit register access is configured for
smsc911x in the DT.
This patch should set the SMSC911X_USE_16BIT flag in a manner
consistent with the documented DT bindings.
Signed-off-by: Dave Martin <dave.martin@linaro.org> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
fib:fix BUG_ON in fib_nl_newrule when add new fib rule
add new fib rule can cause BUG_ON happen
the reproduce shell is
ip rule add pref 38
ip rule add pref 38
ip rule add to 192.168.3.0/24 goto 38
ip rule del pref 38
ip rule add to 192.168.3.0/24 goto 38
ip rule add pref 38
then the BUG_ON will happen
del BUG_ON and use (ctarget == NULL) identify whether this rule is unresolved
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch add the peak_pci driver for the PCAN PCI/PCIe cards (1, 2, 3
or 4 channels) from PEAK Systems (http://www.peak-system.com).
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
It seems that at least one PPC machine would occasionally give a (valid) 0 as
the return value from dma_map, this caused the ixgbe code to not work
correctly. A fix is pending in the PPC tree to not return 0 from dma map, but
we can also fix the driver to make sure we don't mess up in other arches as
well.
This patch is applicable to all current stable kernels.
Reported-by: Neil Horman <nhorman@redhat.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> CC: Alexander Duyck <alexander.h.duyck@intel.com> CC: stable@kernel.org Tested-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
broke VLAN tagging on outbound packets.
It ifdef'ed BCM_KERNEL_SUPPORTS_8021Q, but this
is not set anywhere. So vlan never gets set, and
all packets are sent with vlan=0.
v2: We can just remove the test. vlan_tx_tag_present
is valid regardless of whether the 802.1q module
is built.
Tested on BCM5721 rev 11.
Signed-off-by: Kasper Pedersen <kernel@kasperkp.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge branch 'fixes' of git://git.linaro.org/people/arnd/arm-soc
* 'fixes' of git://git.linaro.org/people/arnd/arm-soc:
mach-integrator: fix VGA base regression
arm/dt: Tegra: Update SDHCI nodes to match bindings
ARM: EXYNOS4: fix incorrect pad configuration for keypad row lines
ARM: SAMSUNG: fix to prevent declaring duplicated
ARM: SAMSUNG: fix watchdog reset issue with clk_get()
ARM: S3C64XX: Remove un-used code backlight code on SMDK6410
ARM: EXYNOS4: restart clocksource while system resumes
ARM: EXYNOS4: Fix routing timer interrupt to offline CPU
ARM: EXYNOS4: Fix return type of local_timer_setup()
ARM: EXYNOS4: Fix wrong pll type for vpll
ARM: Dove: fix second SPI initialization call
After commit c5f5c4db3938 ("staging: zcache: fix crash on high memory
swap") cleancache crashes on the first successful get. This was caused
by a remaining virt_to_page() call in zcache_pampd_get_data_and_free()
that only gets run in the cleancache path.
The patch converts the virt_to_page() to struct page casting like was
done for other instances in c5f5c4db3938.
Makes the Integrator/AP freeze completely. I appears that
this is due to the VGA base address being assigned at PCI
init time, while this base is needed earlier than that.
Moving the initialization of the base address to the
.map_io function solves this problem.
Cc: Rob Herring <rob.herring@calxeda.com> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Acked-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This patch adds the IC+ IP101A Single port 10/100 PHY
and supports the APS (i.e. power saving mode while link is down)
for both IP1001 and IP101A (where this mode is supported).
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/pxa168_eth.c: In function 'pxa168_eth_collect_events':
drivers/net/pxa168_eth.c:866: error: 'IRQ_NONE' undeclared (first use in this function)
drivers/net/pxa168_eth.c:866: error: (Each undeclared identifier is reported only once
drivers/net/pxa168_eth.c:866: error: for each function it appears in.)
drivers/net/pxa168_eth.c: At top level:
drivers/net/pxa168_eth.c:913: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'pxa168_eth_int_handler'
drivers/net/pxa168_eth.c: In function 'pxa168_eth_open':
drivers/net/pxa168_eth.c:1133: error: implicit declaration of function 'request_irq'
drivers/net/pxa168_eth.c:1133: error: 'pxa168_eth_int_handler' undeclared (first use in this function)
drivers/net/pxa168_eth.c:1134: error: 'IRQF_DISABLED' undeclared (first use in this function)
drivers/net/pxa168_eth.c:1160: error: implicit declaration of function 'free_irq'
Signed-off-by: Tanmay Upadhyay <tanmay.upadhyay@einfochips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Lin Ming [Tue, 20 Sep 2011 19:45:07 +0000 (15:45 -0400)]
netconsole: switch init_netconsole() to late_initcall
Commit 88491d8(drivers/net: Kconfig & Makefile cleanup) causes a
regression that netconsole does not work if netconsole and network
device driver are build into kernel, because netconsole is linked
before network device driver.
Andrew Morton suggested to fix this with initcall ordering.
Fixes it by switching init_netconsole() to late_initcall.
Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Tue, 6 Sep 2011 12:44:25 +0000 (12:44 +0000)]
gianfar: Fix overflow check and return value for gfar_get_cls_all()
This function may currently fill one entry beyond the end of the
array it is given. It also doesn't return an error code in case
it does detect overflow.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Henry Wong [Sun, 18 Sep 2011 13:41:49 +0000 (13:41 +0000)]
ppp_generic: fix multilink fragment MTU calculation (again)
When using MLPPP, the maximum size of a fragment is incorrectly
calculated with an offset of -2.
This patch reverses the changes in the patch found here:
http://marc.info/?l=linux-netdev&m=123541324010539&w=2
The value of hdrlen includes the size of both the 2-byte PPP protocol
field and the 2- or 4-byte multilink header (2+4=6 for long sequence
numbers, 2+2=4 for short sequence numbers). Section 2 of RFC1661 says
that the MRU that is negotiated (i.e., the MTU of the sending system)
includes only the PPP payload but not the protocol field, thus the
correct MTU should be the link's MTU minus the multilink header (mtu -
(hdrlen-2)).
The incorrect calculation causes Linux to fragment packets to a size two
bytes smaller than the allowed MTU. While not technically illegal, this
behaviour confounds MRU-tuning to avoid PPP-layer fragmentation.
Signed-off-by: Henry Wong <henry@stuffedcow.net> Signed-off-by: David S. Miller <davem@davemloft.net>
The GRETH GBIT core does not do checksum offloading for IP
segmentation. This patch adds a check in the xmit function to
determine if the stack has calculated the checksum for us.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net>