Suzuki Poulose [Tue, 21 Aug 2012 01:42:43 +0000 (01:42 +0000)]
powerpc: Export memory limit via device tree
The powerpc kernel doesn't export the memory limit enforced by 'mem='
kernel parameter. This is required for building the ELF header in
kexec-tools to limit the vmcore to capture only the used memory. On
powerpc the kexec-tools depends on the device-tree for memory related
information, unlike /proc/iomem on the x86.
Without this information, the kexec-tools assumes the entire System
RAM and vmcore creates an unnecessarily larger dump.
This patch exports the memory limit, if present, via
chosen/linux,memory-limit
property, so that the vmcore can be limited to the memory limit.
The prom_init seems to export this value in the same node. But doesn't
really
appear there. Also the memory_limit gets adjusted with the processing of
crashkernel= parameter. This patch makes sure we get the actual limit.
The kexec-tools will use the value to limit the 'end' of the memory
regions.
Tested this patch on ppc64 and ppc32(ppc440) with a kexec-tools
patch by Mahesh.
Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Tested-by: Mahesh J. Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Suzuki Poulose [Tue, 21 Aug 2012 01:42:33 +0000 (01:42 +0000)]
powerpc: Change memory_limit from phys_addr_t to unsigned long long
There are some device-tree nodes, whose values are of type phys_addr_t.
The phys_addr_t is variable sized based on the CONFIG_PHSY_T_64BIT.
Change these to a fixed unsigned long long for consistency.
This patch does the change only for memory_limit.
The following is a list of such variables which need the change:
1) kernel_end, crashk_size - in arch/powerpc/kernel/machine_kexec.c
2) (struct resource *)crashk_res.start - We could export a local static
variable from machine_kexec.c.
Changing the above values might break the kexec-tools. So, I will
fix kexec-tools first to handle the different sized values and then change
the above.
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc: Fix build dependencies for c files requiring libfdt.h
Several files in obj-plat depend on libfdt header file. Sometimes
when building one can see the following issue. This patch adds
libfdt as dependency to those object files
| In file included from arch/powerpc/boot/treeboot-iss4xx.c:33:0:
| arch/powerpc/boot/libfdt.h:854:1: error: unterminated comment
| In file included from arch/powerpc/boot/treeboot-iss4xx.c:33:0:
| arch/powerpc/boot/libfdt.h:1:0: error: unterminated #ifndef
| BOOTCC arch/powerpc/boot/inffast.o
| make[1]: *** [arch/powerpc/boot/treeboot-iss4xx.o] Error 1
| make[1]: *** Waiting for unfinished jobs....
| BOOTCC arch/powerpc/boot/inflate.o
| make: *** [uImage] Error 2
| ERROR: oe_runmake failed
| ERROR: Function failed: do_compile (see /srv/home/pokybuild/yocto-autobuilder/yocto-slave/p1022ds/build/build/tmp/work/p1022ds-poky-linux-gnuspe/linux-qoriq-sdk-3.0.34-r5/temp/log.do_compile.2167 for further information)
NOTE: recipe linux-qoriq-sdk-3.0.34-r5: task do_compile: Failed
Signed-off-by: Matthew McClintock <msm@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Carl E. Love [Fri, 3 Aug 2012 03:02:17 +0000 (03:02 +0000)]
powerpc/oprofile: Fix marked events support on Power7+ not set.
Starting with Power 7+ we need to check for marked events if the SIAR
register is valid, i.e. it contains the correct address of the instruction
at the time the performance counter overflowed. The mmcra register on
Power 7+, contains a new bit to indicate that the contents of the SIAR
is valid. If the event is not marked, then the sample is recorded
independently of the SIAR valid bit setting. For older processors, there
is no SIAR valid bit to check so the samples are always recorded. This is
done by forcing the cntr_marked_events bit mask to zero. The code will
always record the sample in this case since the bit mask says the event is
not a marked event even if it really is a marked event.
Signed-off-by: Carl Love <cel@us.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 4 Jun 2012 16:47:03 +0000 (16:47 +0000)]
powerpc/pseries: Round up MSI-X requests
The pseries firmware currently refuses any non power of two MSI-X
request. Unfortunately most network drivers end up asking for that
because they want a power of two for RX queues and one or two extra
for everything else.
This patch rounds up the firmware request to the next power of two
if the quota allows it. If this fails we fall back to using the
original request size.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Gavin Shan [Sun, 3 Jun 2012 22:15:25 +0000 (22:15 +0000)]
powerpc/pci: Save P2P bridge resource if possible
When PCI probe flag PCI_REASSIGN_ALL_RSRC has been passed into PCI
core, it's hoped that all resources to be reassigned by PCI core.
As to particular P2P (PCI-to-PCI) bridge, the size of the corresponding
BAR (I/O, MMIO, prefetchable MMIO) is calculated by the resources
required by the PCI devices behind the P2P bridge. That means that
the information like start/end address retrieved from the hardware
registers of the P2P bridge is meainingless in the case. However,
we still count that in and the BARs might have been configured by
firmware with non-zero size. That leads to space waste.
The patch explicitly sets the size of P2P bridge BARs to zero in
case that resource reassignment is expected with PCI probe flag
PCI_REASSIGN_ALL_RSRC. In the result, it will save overall resource
required by the system without waste.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
In file included from drivers/atm/fore200e.c:70:0:
drivers/atm/fore200e.h:263:3: error: redefinition of typedef 'opcode_t' with different type
arch/powerpc/include/asm/probes.h:25:13: note: previous declaration of 'opcode_t' was here
Fix the namespace clash by making opcode_t in probes.h to ppc_opcode_t.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc: Don't use __put_user() in patch_instruction
patch_instruction() can be called very early on ppc32, when the kernel
isn't yet running at it's linked address. That can cause the !
is_kernel_addr() test in __put_user() to trip and call might_sleep()
which is very bad at that point during boot.
Use a lower level function instead for now, at least until we get to
rework ppc32 boot process to do the code patching later, like ppc64
does.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Paul Mackerras [Tue, 4 Sep 2012 18:33:08 +0000 (18:33 +0000)]
powerpc: Make sure IPI handlers see data written by IPI senders
We have been observing hangs, both of KVM guest vcpu tasks and more
generally, where a process that is woken doesn't properly wake up and
continue to run, but instead sticks in TASK_WAKING state. This
happens because the update of rq->wake_list in ttwu_queue_remote()
is not ordered with the update of ipi_message in
smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in
scheduler_ipi() is not ordered with the reading of ipi_message in
smp_ipi_demux(). Thus it is possible for the IPI receiver not to see
the updated rq->wake_list and therefore conclude that there is nothing
for it to do.
In order to make sure that anything done before smp_send_reschedule()
is ordered before anything done in the resulting call to scheduler_ipi(),
this adds barriers in smp_muxed_message_pass() and smp_ipi_demux().
The barrier in smp_muxed_message_pass() is a full barrier to ensure that
there is a full ordering between the smp_send_reschedule() caller and
scheduler_ipi(). In smp_ipi_demux(), we use xchg() rather than
xchg_local() because xchg() includes release and acquire barriers.
Using xchg() rather than xchg_local() makes sense given that
ipi_message is not just accessed locally.
This moves the barrier between setting the message and calling the
cause_ipi() function into the individual cause_ipi implementations.
Most of them -- those that used outb, out_8 or similar -- already had
a full barrier because out_8 etc. include a sync before the MMIO
store. This adds an explicit barrier in the two remaining cases.
These changes made no measurable difference to the speed of IPIs as
measured using a simple ping-pong latency test across two CPUs on
different cores of a POWER7 machine.
The analysis of the reason why processes were not waking up properly
is due to Milton Miller.
Cc: stable@vger.kernel.org # v3.0+ Reported-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Mon, 3 Sep 2012 16:51:10 +0000 (16:51 +0000)]
powerpc: Restore correct DSCR in context switch
During a context switch we always restore the per thread DSCR value.
If we aren't doing explicit DSCR management
(ie thread.dscr_inherit == 0) and the default DSCR changed while
the process has been sleeping we end up with the wrong value.
Check thread.dscr_inherit and select the default DSCR or per thread
DSCR as required.
This was found with the following test case, when running with
more threads than CPUs (ie forcing context switching):
Anton Blanchard [Mon, 3 Sep 2012 16:49:47 +0000 (16:49 +0000)]
powerpc: Fix DSCR inheritance in copy_thread()
If the default DSCR is non zero we set thread.dscr_inherit in
copy_thread() meaning the new thread and all its children will ignore
future updates to the default DSCR. This is not intended and is
a change in behaviour that a number of our users have hit.
We just need to inherit thread.dscr and thread.dscr_inherit from
the parent which ends up being much simpler.
Anton Blanchard [Mon, 3 Sep 2012 16:48:46 +0000 (16:48 +0000)]
powerpc: Keep thread.dscr and thread.dscr_inherit in sync
When we update the DSCR either via emulation of mtspr(DSCR) or via
a change to dscr_default in sysfs we don't update thread.dscr.
We will eventually update it at context switch time but there is
a period where thread.dscr is incorrect.
If we fork at this point we will copy the old value of thread.dscr
into the child. To avoid this, always keep thread.dscr in sync with
reality.
Anton Blanchard [Mon, 3 Sep 2012 16:47:56 +0000 (16:47 +0000)]
powerpc: Update DSCR on all CPUs when writing sysfs dscr_default
Writing to dscr_default in sysfs doesn't actually change the DSCR -
we rely on a context switch on each CPU to do the work. There is no
guarantee we will get a context switch in a reasonable amount of time
so fire off an IPI to force an immediate change.
This issue was found with the following test case:
Paul Mackerras [Thu, 26 Jul 2012 18:51:09 +0000 (18:51 +0000)]
powerpc/powernv: Always go into nap mode when CPU is offline
The CPU hotplug code for the powernv platform currently only puts
offline CPUs into nap mode if the powersave_nap variable is set.
However, HV-style KVM on this platform requires secondary CPU threads
to be offline and in nap mode. Since we know nap mode works just
fine on all POWER7 machines, and the only machines that support the
powernv platform are POWER7 machines, this changes the code to
always put offline CPUs into nap mode, regardless of powersave_nap.
Powersave_nap still controls whether or not CPUs go into nap mode
when idle, as before.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Paul Mackerras [Thu, 26 Jul 2012 13:56:11 +0000 (13:56 +0000)]
powerpc: Give hypervisor decrementer interrupts their own handler
At the moment the handler for hypervisor decrementer interrupts is
the same as for decrementer interrupts, i.e. timer_interrupt().
This is bogus; if we ever do get a hypervisor decrementer interrupt
it won't have anything to do with the next timer event. In fact
the only time we get hypervisor decrementer interrupts is when one
is left pending on exit from a KVM guest.
When we get a hypervisor decrementer interrupt we don't need to do
anything special to clear it, since they are edge-triggered on the
transition of HDEC from 0 to -1. Thus this adds an empty handler
function for them. We don't need to have them masked when interrupts
are soft-disabled, so we use STD_EXCEPTION_HV instead of
MASKABLE_EXCEPTION_HV.
Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Mihai Caraman [Mon, 6 Aug 2012 03:27:07 +0000 (03:27 +0000)]
powerpc/booke64: Use SPRG0/3 scratch for bolted TLB miss & crit int
Embedded.Hypervisor category defines GSPRG0..3 physical registers for guests.
Avoid SPRG4-7 usage as scratch in host exception handlers, otherwise guest
SPRG4-7 registers will be clobbered.
For bolted TLB miss exception handlers, which is the version currently
supported by KVM, use SPRN_SPRG_GEN_SCRATCH aka SPRG0 instead of
SPRN_SPRG_TLB_SCRATCH aka SPRG6. Keep using TLB PACA slots to fit in one
64-byte cache line.
For critical exception handlers use SPRG3 instead of SPRG7. Provide a routine
to store and restore user-visible SPRGs. This will be subsequently used
to restore VDSO information in SPRG3. Add EX_R13 to paca slots to free up
SPRG3 and change the critical exception epilog to use it.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Mihai Caraman [Mon, 6 Aug 2012 03:27:06 +0000 (03:27 +0000)]
powerpc/booke64: Eemove mfspr srr1 duplicate in exception prolog
Refactor exception prolog to get rid of mfspr srr1 duplicate. This was
introduced by KVM integration, with DO_KVM macro logic expecting srr1 value
earlier in r11.
Reserve r11 to hold srr1's value also required at the end of the prolog and
free up r10 to serve as spare in addition macros.
For syscalls case this change does not add any performance penalty. For irq
soft-disabled case the change adds a store/load of conditional register value
to/from a paca slot. Paca slots fit in one 64-byte cache line so these
additional operations have little impact on performance.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Mihai Caraman [Mon, 6 Aug 2012 03:27:05 +0000 (03:27 +0000)]
powerpc/booke64: Add DO_KVM kernel hooks
Hook DO_KVM macro into 64-bit booke for KVM integration. Extend interrupt
handlers' parameter list with interrupt vector numbers to accomodate the macro.
Only the bolted version of tlb miss handers is addressed now.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Mihai Caraman [Mon, 6 Aug 2012 03:27:04 +0000 (03:27 +0000)]
powerpc/booke64: Use GSRR registers in Guest Doorbell interrupts
Guest Doorbell interrupts use guest save and restore registers. Add a new
Guest Doorbell exception type to accommodate GSRR0/1 SPRs usage in exception
prolog and fix the exception handler.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Mihai Caraman [Mon, 6 Aug 2012 03:27:03 +0000 (03:27 +0000)]
powerpc/booke64: Fix machine check handler to use the right prolog
Machine check exception handler was using a wrong prolog. Hypervisors like
KVM which are called early from the exception handler rely on the interrupt
source.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add thread_struct.trap_nr and use it to store the last exception
the thread experienced. In this patch, we populate the field at
various places where we force_sig_info() to the process.
This is also used in uprobes to determine if the probed instruction
caused an exception.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Sun, 19 Aug 2012 21:44:01 +0000 (21:44 +0000)]
powerpc: Rename 64-bit PVR constants to PVR_foo
We have an old FIXME in reg.h which points out that we should standardise
on PVR_foo for our PVR #defines. Currently we use PVR_ on 32-bit and PV_
on 64-bit.
So do that rename and remove the FIXME.
Seeing as we're touching all but one usage of __is_processor(), rename it
to something less ugly and more indicative of what it does, which is
simply to check the PVR version.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The pseries hvterm driver only registers a udbg backend (for xmon and
other low level debugging mechanisms) when hvc0 is recognized as the
firmware console at boot time, not if it's detected later on, for
example because the firmware is using a graphics card.
This can make debugging challenging especially under X11, and there's
really no good reason for that limitation, so let's hookup udbg
whenever hvc0 is detected instead.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
hvc_console has two methods to instanciate the consoles.
hvc_instanciate is meant to be called at early boot, while hvc_alloc is
called for more dynamically probed objects.
Currently, it only deals with adding kernel consoles in the former case,
which means for example that if a console only uses dynamic probing, it
will never be usable as a kernel console even when specifying
console=hvc0 explicitly, which could be considered annoying...
More specifically, on pseries, we only do the early instanciate for the
console currently used by the firmware, so if you have your firmware
configured to go to a video card, for example, you cannot get your
kernel console, oops messages, etc... on your serial port or hypervisor
console, which would be handy to deal with oopses.
This fixes it by checking if hvc_console.flags & CON_ENABLED is set when
registering a new dynamic console, and if not, redo the index check and
re-register the console if the index matches, allowing console=hvcN to
work.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
1) NLA_PUT* --> nla_put_* conversion got one case wrong in
nfnetlink_log, fix from Patrick McHardy.
2) Missed error return check in ipw2100 driver, from Julia Lawall.
3) PMTU updates in ipv4 were setting the expiry time incorrectly, fix
from Eric Dumazet.
4) SFC driver erroneously reversed src and dst when reporting filters
via ethtool.
5) Memory leak in CAN protocol and wrong setting of IRQF_SHARED in
sja1000 can platform driver, from Alexey Khoroshilov and Sven
Schmitt.
6) Fix multicast traffic scaling regression in ipv4_dst_destroy, only
take the lock when we really need to. From Eric Dumazet.
7) Fix non-root process spoofing in netlink, from Pablo Neira Ayuso.
8) CWND reduction in TCP is done incorrectly during non-SACK recovery,
fix from Yuchung Cheng.
9) Revert netpoll change, and fix what was actually a driver specific
problem. From Amerigo Wang. This should cure bootup hangs with
netconsole some people reported.
10) Fix xen-netfront invoking __skb_fill_page_desc() with a NULL page
pointer. From Ian Campbell.
11) SIP NAT fix for expectiontation creation, from Pablo Neira Ayuso.
12) __ip_rt_update_pmtu() needs RCU locking, from Eric Dumazet.
13) Fix usbnet deadlock on resume, can't use GFP_KERNEL in this
situation. From Oliver Neukum.
14) The davinci ethernet driver triggers an OOPS on removal because it
frees an MDIO object before unregistering it. Fix from Bin Liu.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
net: qmi_wwan: add several new Gobi devices
fddi: 64 bit bug in smt_add_para()
net: ethernet: fix kernel OOPS when remove davinci_mdio module
net/xfrm/xfrm_state.c: fix error return code
net: ipv6: fix error return code
net: qmi_wwan: new device: Foxconn/Novatel E396
usbnet: fix deadlock in resume
cs89x0 : packet reception not working
netfilter: nf_conntrack: fix racy timer handling with reliable events
bnx2x: Correct the ndo_poll_controller call
bnx2x: Move netif_napi_add to the open call
ipv4: must use rcu protection while calling fib_lookup
bnx2x: fix 57840_MF pci id
net: ipv4: ipmr_expire_timer causes crash when removing net namespace
e1000e: DoS while TSO enabled caused by link partner with small MSS
l2tp: avoid to use synchronize_rcu in tunnel free function
gianfar: fix default tx vlan offload feature flag
netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation
xen-netfront: use __pskb_pull_tail to ensure linear area is big enough on RX
netfilter: nfnetlink_log: fix error return code in init path
...
Gobi devices are composite, needing both the qcserial and
qmi_wwan drivers to support all functions. Re-syncing the
list of supported devices with qcserial.
Cc: Aleksander Morgado <aleksander@lanedo.com> Cc: Thomas Tuttle <ttuttle@chromium.org> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@tempietto.lan>
Dan Carpenter [Sat, 1 Sep 2012 09:57:40 +0000 (09:57 +0000)]
fddi: 64 bit bug in smt_add_para()
The intent was to set 4 bytes of data so that's why the sp_len is set
to 4 on the next line. The cast to u_long pointer clears 8 bytes
on 64 bit arches.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@tempietto.lan>
John Stultz [Fri, 31 Aug 2012 17:30:06 +0000 (13:30 -0400)]
time: Move ktime_t overflow checking into timespec_valid_strict
Andreas Bombe reported that the added ktime_t overflow checking added to
timespec_valid in commit 4e8b14526ca7 ("time: Improve sanity checking of
timekeeping inputs") was causing problems with X.org because it caused
timeouts larger then KTIME_T to be invalid.
Previously, these large timeouts would be clamped to KTIME_MAX and would
never expire, which is valid.
This patch splits the ktime_t overflow checking into a new
timespec_valid_strict function, and converts the timekeeping codes
internal checking to use this more strict function.
Reported-and-tested-by: Andreas Bombe <aeb@debian.org> Cc: Zhouping Liu <zliu@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6
Pull PARISC fixes from James Bottomley:
"This is a set of two bug fixes. One is the ATOMIC problem which is
now causing a compile failure in certain situations. The other is
mishandling of PER_LINUX32 which may also cause user visible effects.
Signed-off-by: James Bottomley <JBottomley@Parallels.com>"
* tag 'parisc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6:
[PARISC] fix personality flag check in copy_thread()
[PARISC] Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the casts
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"A couple of s390 bug fixes for 3.5-rc4"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/32: Don't clobber personality flags on exec
s390/smp: add missing smp_store_status() for !SMP
s390/dasd: fix ioctl return value
s390: Always use "long" for ssize_t to match size_t
Cc: Dan Williams <dcbw@redhat.com> Cc: Bjørn Mork <bjorn@mork.no> Cc: Ben Chan <benchan@google.com> Signed-off-by: Aleksander Morgado <aleksander@lanedo.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Neukum [Sun, 26 Aug 2012 20:41:38 +0000 (20:41 +0000)]
usbnet: fix deadlock in resume
A usbnet device can share a multifunction device
with a storage device. If the storage device is autoresumed
the usbnet devices also needs to be autoresumed. Allocating
memory with GFP_KERNEL can deadlock in this case.
netfilter: nf_conntrack: fix racy timer handling with reliable events
Existing code assumes that del_timer returns true for alive conntrack
entries. However, this is not true if reliable events are enabled.
In that case, del_timer may return true for entries that were
just inserted in the dying list. Note that packets / ctnetlink may
hold references to conntrack entries that were just inserted to such
list.
This patch fixes the issue by adding an independent timer for
event delivery. This increases the size of the ecache extension.
Still we can revisit this later and use variable size extensions
to allocate this area on demand.
Tested-by: Oliver Smith <olipro@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Merav Sicron [Mon, 27 Aug 2012 03:26:19 +0000 (03:26 +0000)]
bnx2x: Move netif_napi_add to the open call
Move netif_napi_add for all queues from the probe call to the open call, to
avoid the case that napi objects are added for queues that may eventually not
be initialized and activated. With the former behavior, the driver could crash
when netpoll was calling ndo_poll_controller.
Signed-off-by: Merav Sicron <meravs@broadcom.com> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sun, 26 Aug 2012 00:35:45 +0000 (00:35 +0000)]
bnx2x: fix 57840_MF pci id
Commit c3def943c7117d42caaed3478731ea7c3c87190e have added support for
new pci ids of the 57840 board, while failing to change the obsolete value
in 'pci_ids.h'.
This patch does so, allowing the probe of such devices.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ipv4: ipmr_expire_timer causes crash when removing net namespace
When tearing down a net namespace, ipv4 mr_table structures are freed
without first deactivating their timers. This can result in a crash in
run_timer_softirq.
This patch mimics the corresponding behaviour in ipv6.
Locking and synchronization seem to be adequate.
We are about to kfree mrt, so existing code should already make sure that
no other references to mrt are pending or can be created by incoming traffic.
The functions invoked here do not cause new references to mrt or other
race conditions to be created.
Invoking del_timer_sync guarantees that ipmr_expire_timer is inactive.
Both ipmr_expire_process (whose completion we may have to wait in
del_timer_sync) and mroute_clean_tables internally use mfc_unres_lock
or other synchronizations when needed, and they both only modify mrt.
Tested in Linux 3.4.8.
Signed-off-by: Francesco Ruggeri <fruggeri@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Fri, 24 Aug 2012 20:38:11 +0000 (20:38 +0000)]
e1000e: DoS while TSO enabled caused by link partner with small MSS
With a low enough MSS on the link partner and TSO enabled locally, the
networking stack can periodically send a very large (e.g. 64KB) TCP
message for which the driver will attempt to use more Tx descriptors than
are available by default in the Tx ring. This is due to a workaround in
the code that imposes a limit of only 4 MSS-sized segments per descriptor
which appears to be a carry-over from the older e1000 driver and may be
applicable only to some older PCI or PCIx parts which are not supported in
e1000e. When the driver gets a message that is too large to fit across the
configured number of Tx descriptors, it stops the upper stack from queueing
any more and gets stuck in this state. After a timeout, the upper stack
assumes the adapter is hung and calls the driver to reset it.
Remove the unnecessary limitation of using up to only 4 MSS-sized segments
per Tx descriptor, and put in a hard failure test to catch when attempting
to check for message sizes larger than would fit in the whole Tx ring.
Refactor the remaining logic that limits the size of data per Tx descriptor
from a seemingly arbitrary 8KB to a limit based on the dynamic size of the
Tx packet buffer as described in the hardware specification.
Also, fix the logic in the check for space in the Tx ring for the next
largest possible packet after the current one has been successfully queued
for transmit, and use the appropriate defines for default ring sizes in
e1000_probe instead of magic values.
This issue goes back to the introduction of e1000e in 2.6.24 when it was
split off from e1000.
Reported-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Cc: Stable <stable@vger.kernel.org> [2.6.24+] Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Thu, 23 Aug 2012 21:46:25 +0000 (21:46 +0000)]
gianfar: fix default tx vlan offload feature flag
Commit -
"b852b72 gianfar: fix bug caused by 87c288c6e9aa31720b72e2bc2d665e24e1653c3e"
disables by default (on mac init) the hw vlan tag insertion.
The "features" flags were not updated to reflect this, and
"ethtool -K" shows tx-vlan-offload to be "on" by default.
Cc: Sebastian Poehn <sebastian.poehn@belden.com> Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell [Wed, 22 Aug 2012 00:26:47 +0000 (00:26 +0000)]
xen-netfront: use __pskb_pull_tail to ensure linear area is big enough on RX
I'm slightly concerned by the "only in exceptional circumstances"
comment on __pskb_pull_tail but the structure of an skb just created
by netfront shouldn't hit any of the especially slow cases.
This approach still does slightly more work than the old way, since if
we pull up the entire first frag we now have to shuffle everything
down where before we just received into the right place in the first
place.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Mel Gorman <mgorman@suse.de> Cc: xen-devel@lists.xensource.com Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 30 Aug 2012 16:11:33 +0000 (09:11 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"A bunch of scattered fixes ati/intel/nouveau, couple of core ones,
nothing too shocking or different."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm: Add EDID_QUIRK_FORCE_REDUCED_BLANKING for ASUS VW222S
gma500: Consider CRTC initially active.
drm/radeon: fix dig encoder selection on DCE61
drm/radeon: fix double free in radeon_gpu_reset
drm/radeon: force dma32 to fix regression rs4xx,rs6xx,rs740
drm/radeon: rework panel mode setup
drm/radeon/atom: powergating fixes for DCE6
drm/radeon/atom: rework DIG modesetting on DCE3+
drm/radeon: don't disable plls that are in use by other crtcs
drm/radeon: add proper checking of RESOLVE_BOX command for r600-r700
drm/radeon: initialize tracked CS state
drm/radeon: fix reading CB_COLORn_MASK from the CS
drm/nvc0/copy: check PUNITS to determine which copy engines are disabled
i915: Quirk no_lvds on Gigabyte GA-D525TUD ITX motherboard
drm/i915: Use the correct size of the GTT for placing the per-process entries
drm: Check for invalid cursor flags
drm: Initialize object type when using DRM_MODE() macro
drm/i915: fix color order for BGR formats on IVB
drm/i915: fix wrong order of parameters in port checking functions
Heiko Carstens [Tue, 28 Aug 2012 08:02:08 +0000 (10:02 +0200)]
s390/32: Don't clobber personality flags on exec
In native 32 bit mode the personality flags were not correctly inherited.
This is the s390 version of 59e4c3a2 "powerpc/32: Don't clobber personality
flags on exec".
Reported-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=17629 Signed-off-by: Paul Menzel <paulepanter@users.sourceforge.net> Cc: <dri-devel@lists.freedesktop.org> Cc: Adam Jackson <ajax@redhat.com> Cc: Ian Pilcher <arequipeno@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 30 Aug 2012 00:35:34 +0000 (10:35 +1000)]
Merge branch 'drm-fixes-3.6' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Alex writes:
Highlights:
- fix a gart regression on older IGP chips
- more MSAA fixes
- fix a double free in gpu reset code
- modesetting fixes
- trinity dig encoder fix.
* 'drm-fixes-3.6' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: fix dig encoder selection on DCE61
drm/radeon: fix double free in radeon_gpu_reset
drm/radeon: force dma32 to fix regression rs4xx,rs6xx,rs740
drm/radeon: rework panel mode setup
drm/radeon/atom: powergating fixes for DCE6
drm/radeon/atom: rework DIG modesetting on DCE3+
drm/radeon: don't disable plls that are in use by other crtcs
drm/radeon: add proper checking of RESOLVE_BOX command for r600-r700
drm/radeon: initialize tracked CS state
drm/radeon: fix reading CB_COLORn_MASK from the CS
Forest Bond [Mon, 13 Aug 2012 16:31:24 +0000 (16:31 +0000)]
gma500: Consider CRTC initially active.
[this one ideally should make 3.6 - it fixes the very annoying mode setting bug]
This causes the pipe to be forced off prior to initial mode set, which
roughly mirrors the behavior of the i915 driver. It fixes initial mode
setting on my Intel DN2800MT (Cedarview) board. Without it, mode
setting triggers an out-of-range error from the monitor for most modes,
but only on initial configuration (i.e. they can be configured
successfully from userspace after that).
Signed-off-by: Forest Bond <forest.bond@rapidrollout.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
This patch reverts the offending commit and fixes be_poll() by
avoid disabling BH there, this is okay because be_poll()
can be called either by poll_napi() which already disables
IRQ, or by net_rx_action() which already disables BH.
Reported-by: Andrew Morton <akpm@linux-foundation.org> Reported-by: Sylvain Munaut <s.munaut@whatever-company.com> Cc: Sylvain Munaut <s.munaut@whatever-company.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Miller <davem@davemloft.net> Cc: Sathya Perla <sathya.perla@emulex.com> Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com> Cc: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: Cong Wang <amwang@redhat.com> Tested-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 29 Aug 2012 18:36:22 +0000 (11:36 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"I've split out the big send/receive update from my last pull request
and now have just the fixes in my for-linus branch. The send/recv
branch will wander over to linux-next shortly though.
The largest patches in this pull are Josef's patches to fix DIO
locking problems and his patch to fix a crash during balance. They
are both well tested.
The rest are smaller fixes that we've had queued. The last rc came
out while I was hacking new and exciting ways to recover from a
misplaced rm -rf on my dev box, so these missed rc3."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (25 commits)
Btrfs: fix that repair code is spuriously executed for transid failures
Btrfs: fix ordered extent leak when failing to start a transaction
Btrfs: fix a dio write regression
Btrfs: fix deadlock with freeze and sync V2
Btrfs: revert checksum error statistic which can cause a BUG()
Btrfs: remove superblock writing after fatal error
Btrfs: allow delayed refs to be merged
Btrfs: fix enospc problems when deleting a subvol
Btrfs: fix wrong mtime and ctime when creating snapshots
Btrfs: fix race in run_clustered_refs
Btrfs: don't run __tree_mod_log_free_eb on leaves
Btrfs: increase the size of the free space cache
Btrfs: barrier before waitqueue_active
Btrfs: fix deadlock in wait_for_more_refs
btrfs: fix second lock in btrfs_delete_delayed_items()
Btrfs: don't allocate a seperate csums array for direct reads
Btrfs: do not strdup non existent strings
Btrfs: do not use missing devices when showing devname
Btrfs: fix that error value is changed by mistake
Btrfs: lock extents as we map them in DIO
...
David Rientjes [Wed, 29 Aug 2012 02:57:21 +0000 (19:57 -0700)]
mm, slab: lock the correct nodelist after reenabling irqs
cache_grow() can reenable irqs so the cpu (and node) can change, so ensure
that we take list_lock on the correct nodelist.
This fixes an issue with commit 072bb0aa5e06 ("mm: sl[au]b: add
knowledge of PFMEMALLOC reserve pages") where list_lock for the wrong
node was taken after growing the cache.
Reported-and-tested-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christian König [Wed, 29 Aug 2012 11:24:15 +0000 (13:24 +0200)]
drm/radeon: fix double free in radeon_gpu_reset
radeon_ring_restore is freeing the memory for the saved
ring data. We need to remember that, otherwise we try to
restore the ring data again on the next try. Additional
to that it shouldn't try the reset infinitely if we have
saved ring data.
Signed-off-by: Christian König <deathsimple@vodafone.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Jerome Glisse [Tue, 28 Aug 2012 20:50:22 +0000 (16:50 -0400)]
drm/radeon: force dma32 to fix regression rs4xx,rs6xx,rs740
It seems some of those IGP dislike non dma32 page despite what
documentation says. Fix regression since we allowed non dma32
pages. It seems it only affect some revision of those IGP chips
as we don't know which one just force dma32 for all of them.
Alex Deucher [Mon, 27 Aug 2012 21:48:18 +0000 (17:48 -0400)]
drm/radeon: rework panel mode setup
Adjust the panel mode setup to match the behavior
of the vbios. Rather than checking for specific
bridge chip ids, just check the eDP configuration register.
This saves extra aux transactions and works across
DP bridge chips without requiring additional per chip
id checking.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 24 Aug 2012 22:21:21 +0000 (18:21 -0400)]
drm/radeon/atom: powergating fixes for DCE6
Power gating is per crtc pair, but the powergating registers
should be called individually. The hw handles power up/down
properly. The pair is powered up if either crtc in the pair
is powered up and the pair is not powered down until both
crtcs in the pair are powered down. This simplifies
programming and should save additional power as the previous
code never actually power gated the crtc pair.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Alex Deucher [Wed, 22 Aug 2012 13:54:56 +0000 (09:54 -0400)]
drm/radeon/atom: rework DIG modesetting on DCE3+
The ordering is important and the current drm code
wasn't cutting it for modern DIG encoders. We need
to have information about crtc before setting up
the encoders so I've shifted the ordering a bit.
Probably we'll need a full rework akin to danvet's
recent intel patchs. This patch fixes numerous
issues with DP bridge chips and makes link training
much more reliable.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Marek Olšák [Fri, 24 Aug 2012 12:27:36 +0000 (14:27 +0200)]
drm/radeon: add proper checking of RESOLVE_BOX command for r600-r700
Checking of the second colorbuffer was skipped on r700, because
CB_TARGET_MASK was 0xf. With r600, CB_TARGET_MASK is changed to 0xff,
so we must set the number of samples of the second colorbuffer to 1 in order
to pass the CS checker.
The DRM version is bumped, because RESOLVE_BOX is always rejected without this
fix on r600.
Signed-off-by: Marek Olšák <maraeo@gmail.com> Reviewed-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Dave Airlie [Wed, 29 Aug 2012 10:09:23 +0000 (20:09 +1000)]
Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes
Daniel writes:
"Just a few smaller things:
- Fix up a pipe vs. plane confusion from a refactoring, fixes a regression
from 3.1 (Anhua Xu).
- Fix ivb sprite pixel formats (Vijay).
- Fixup ppgtt pde placement for machines where the Bios artifically limits
the availbale gtt space in the name of ... product differentiation
(Chris). This fixes an oops.
- Yet another no_lvds quirk entry."
* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
i915: Quirk no_lvds on Gigabyte GA-D525TUD ITX motherboard
drm/i915: Use the correct size of the GTT for placing the per-process entries
drm/i915: fix color order for BGR formats on IVB
drm/i915: fix wrong order of parameters in port checking functions