John Kacur [Wed, 7 Oct 2009 18:19:32 +0000 (20:19 +0200)]
x86, cpuid: Remove the bkl from cpuid_open()
Most of the variables are local to the function. It IS possible that
for struct cpuinfo_x86 *c c could point to the same area. However,
this is used read only.
Signed-off-by: John Kacur <jkacur@redhat.com>
LKML-Reference: <alpine.LFD.2.00.0910072016190.15183@localhost.localdomain> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Remove the big kernel lock from msr_open() as it doesn't protect
anything there.
The only racy event that can happen here is a concurrent cpu shutdown.
So let's look at what could be racy during/after the above event:
- The cpu_online() check is racy, but the bkl doesn't help about
that anyway it disables preemption but we may be chcking another
cpu than the current one.
Also the cpu can still become offlined between open and read calls.
- The cpu_data(cpu) returns a safe pointer too. It won't be released on
cpu offlining. But some fields can be changed from
arch/x86/kernel/smpboot.c:remove_siblinginfo() :
- phys_proc_id
- cpu_core_id
Those are not read from msr_open(). What we are checking is the
x86_capability that is left untouched on offlining.
So this removal looks safe.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: John Kacur <jkacur@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Sven-Thorsten Dietrich <sdietrich@suse.de>
LKML-Reference: <1254944602-7382-1-git-send-email-fweisbec@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Merge branch 'x86-percpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-percpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, percpu: Collect hot percpu variables into one cacheline
x86, percpu: Fix DECLARE/DEFINE_PER_CPU_PAGE_ALIGNED()
x86, percpu: Add 'percpu_read_stable()' interface for cacheable accesses
Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, highmem_32.c: Clean up comment
x86, pgtable.h: Clean up types
x86: Clean up dump_pagetable()
Merge branch 'x86-kbuild-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-kbuild-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Simplify the Makefile in a minor way through use of cc-ifversion
Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-64: move clts into batch cpu state updates when preloading fpu
x86-64: move unlazy_fpu() into lazy cpu state part of context switch
x86-32: make sure clts is batched during context switch
x86: split out core __math_state_restore
Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Decrease the level of some NUMA messages to KERN_DEBUG
Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits)
x86: Fix code patching for paravirt-alternatives on 486
x86, msr: change msr-reg.o to obj-y, and export its symbols
x86: Use hard_smp_processor_id() to get apic id for AMD K8 cpus
x86, sched: Workaround broken sched domain creation for AMD Magny-Cours
x86, mcheck: Use correct cpumask for shared bank4
x86, cacheinfo: Fixup L3 cache information for AMD multi-node processors
x86: Fix CPU llc_shared_map information for AMD Magny-Cours
x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit too
x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.h
x86, msr: fix msr-reg.S compilation with gas 2.16.1
x86, msr: Export the register-setting MSR functions via /dev/*/msr
x86, msr: Create _on_cpu helpers for {rw,wr}msr_safe_regs()
x86, msr: Have the _safe MSR functions return -EIO, not -EFAULT
x86, msr: CFI annotations, cleanups for msr-reg.S
x86, asm: Make _ASM_EXTABLE() usable from assembly code
x86, asm: Add 32-bit versions of the combined CFI macros
x86, AMD: Disable wrongly set X86_FEATURE_LAHF_LM CPUID bit
x86, msr: Rewrite AMD rd/wrmsr variants
x86, msr: Add rd/wrmsr interfaces with preset registers
x86: add specific support for Intel Atom architecture
...
Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Make memtype_seq_ops const
x86: uv: Clean up uv_ptc_init(), use proc_create()
x86: Use printk_once()
x86/cpu: Clean up various files a bit
x86: Remove duplicated #include
x86, ipi: Clean up safe_smp_processor_id() by using the cpu_has_apic() macro helper
x86: Clean up idt_descr and idt_tableby using NR_VECTORS instead of hardcoded number
x86: Further clean up of mtrr/generic.c
x86: Clean up mtrr/main.c
x86: Clean up mtrr/state.c
x86: Clean up mtrr/mtrr.h
x86: Clean up mtrr/if.c
x86: Clean up mtrr/generic.c
x86: Clean up mtrr/cyrix.c
x86: Clean up mtrr/cleanup.c
x86: Clean up mtrr/centaur.c
x86: Clean up mtrr/amd.c:
x86: ds.c fix invalid assignment
Merge branch 'x86-asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: remove all now-duplicate header files
x86: convert termios.h to the asm-generic version
x86: convert almost generic headers to asm-generic version
x86: convert trivial headers to asm-generic version
x86: add copies of some headers to convert to asm-generic
Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits)
ACPI, x86: expose some IO-APIC routines when CONFIG_ACPI=n
x86, apic: Slim down stack usage in early_init_lapic_mapping()
x86, ioapic: Get rid of needless check and simplify ioapic_setup_resources()
x86, ioapic: Define IO_APIC_DEFAULT_PHYS_BASE constant
x86: Fix x86_model test in es7000_apic_is_cluster()
x86, apic: Move dmar_table_init() out of enable_IR()
x86, ioapic: Panic on irq-pin binding only if needed
x86/apic: Enable x2APIC without interrupt remapping under KVM
x86, apic: Drop redundant bit assignment
x86, ioapic: Throw BUG instead of NULL dereference
x86, ioapic: Introduce for_each_irq_pin() helper
x86: Remove superfluous NULL pointer check in destroy_irq()
x86/ioapic.c: unify ioapic_retrigger_irq()
x86/ioapic.c: convert __target_IO_APIC_irq to conventional for() loop
x86/ioapic.c: clean up replace_pin_at_irq_node logic and comments
x86/ioapic.c: convert replace_pin_at_irq_node to conventional for() loop
x86/ioapic.c: simplify add_pin_to_irq_node()
x86/ioapic.c: convert io_apic_level_ack_pending loop to normal for() loop
x86/ioapic.c: move lost comment to what seems like appropriate place
x86/ioapic.c: remove redundant declaration of irq_pin_list
...
* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (87 commits)
NFSv4: Disallow 'mount -t nfs4 -overs=2' and 'mount -t nfs4 -overs=3'
NFS: Allow the "nfs" file system type to support NFSv4
NFS: Move details of nfs4_get_sb() to a helper
NFS: Refactor NFSv4 text-based mount option validation
NFS: Mount option parser should detect missing "port="
NFS: out of date comment regarding O_EXCL above nfs3_proc_create()
NFS: Handle a zero-length auth flavor list
SUNRPC: Ensure that sunrpc gets initialised before nfs, lockd, etc...
nfs: fix compile error in rpc_pipefs.h
nfs: Remove reference to generic_osync_inode from a comment
SUNRPC: cache must take a reference to the cache detail's module on open()
NFS: Use the DNS resolver in the mount code.
NFS: Add a dns resolver for use with NFSv4 referrals and migration
SUNRPC: Fix a typo in cache_pipefs_files
nfs: nfs4xdr: optimize low level decoding
nfs: nfs4xdr: get rid of READ_BUF
nfs: nfs4xdr: simplify decode_exchange_id by reusing decode_opaque_inline
nfs: nfs4xdr: get rid of COPYMEM
nfs: nfs4xdr: introduce decode_sessionid helper
nfs: nfs4xdr: introduce decode_verifier helper
...
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (25 commits)
pata_rz1000: use printk_once
ahci: kill @force_restart and refine CLO for ahci_kick_engine()
pata_cs5535: add pci id for AMD based CS5535 controllers
ahci: Add AMD SB900 SATA/IDE controller device IDs
drivers/ata: use resource_size
sata_fsl: Defer non-ncq commands when ncq commands active
libata: add SATA PMP revision information for spec 1.2
libata: fix off-by-one error in ata_tf_read_block()
ahci: Gigabyte GA-MA69VM-S2 can't do 64bit DMA
ahci: make ahci_asus_m2a_vm_32bit_only() quirk more generic
dmi: extend dmi_get_year() to dmi_get_date()
dmi: fix date handling in dmi_get_year()
libata: unbreak TPM filtering by reorganizing ata_scsi_pass_thru()
sata_sis: convert to slave_link
sata_sil24: always set protocol override for non-ATAPI data commands
libata: Export AHCI capabilities
libata: Delegate nonrot flag setting to SCSI
[libata] Add pata_rdc driver for RDC ATA devices
drivers/ata: Remove unnecessary semicolons
libata: remove spindown skipping and warning
...
Merge branch 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (105 commits)
ring-buffer: only enable ring_buffer_swap_cpu when needed
ring-buffer: check for swapped buffers in start of committing
tracing: report error in trace if we fail to swap latency buffer
tracing: add trace_array_printk for internal tracers to use
tracing: pass around ring buffer instead of tracer
tracing: make tracing_reset safe for external use
tracing: use timestamp to determine start of latency traces
tracing: Remove mentioning of legacy latency_trace file from documentation
tracing/filters: Defer pred allocation, fix memory leak
tracing: remove users of tracing_reset
tracing: disable buffers and synchronize_sched before resetting
tracing: disable update max tracer while reading trace
tracing: print out start and stop in latency traces
ring-buffer: disable all cpu buffers when one finds a problem
ring-buffer: do not count discarded events
ring-buffer: remove ring_buffer_event_discard
ring-buffer: fix ring_buffer_read crossing pages
ring-buffer: remove unnecessary cpu_relax
ring-buffer: do not swap buffers during a commit
ring-buffer: do not reset while in a commit
...
Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (64 commits)
sched: Fix sched::sched_stat_wait tracepoint field
sched: Disable NEW_FAIR_SLEEPERS for now
sched: Keep kthreads at default priority
sched: Re-tune the scheduler latency defaults to decrease worst-case latencies
sched: Turn off child_runs_first
sched: Ensure that a child can't gain time over it's parent after fork()
sched: enable SD_WAKE_IDLE
sched: Deal with low-load in wake_affine()
sched: Remove short cut from select_task_rq_fair()
sched: Turn on SD_BALANCE_NEWIDLE
sched: Clean up topology.h
sched: Fix dynamic power-balancing crash
sched: Remove reciprocal for cpu_power
sched: Try to deal with low capacity, fix update_sd_power_savings_stats()
sched: Try to deal with low capacity
sched: Scale down cpu_power due to RT tasks
sched: Implement dynamic cpu_power
sched: Add smt_gain
sched: Update the cpu_power sum during load-balance
sched: Add SD_PREFER_SIBLING
...
Merge branch 'perfcounters-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
perf tools: Avoid unnecessary work in directory lookups
perf stat: Clean up statistics calculations a bit more
perf stat: More advanced variance computation
perf stat: Use stddev_mean in stead of stddev
perf stat: Remove the limit on repeat
perf stat: Change noise calculation to use stddev
x86, perf_counter, bts: Do not allow kernel BTS tracing for now
x86, perf_counter, bts: Correct pointer-to-u64 casts
x86, perf_counter, bts: Fail if BTS is not available
perf_counter: Fix output-sharing error path
perf trace: Fix read_string()
perf trace: Print out in nanoseconds
perf tools: Seek to the end of the header area
perf trace: Fix parsing of perf.data
perf trace: Sample timestamps as well
perf_counter: Introduce new (non-)paranoia level to allow raw tracepoint access
perf trace: Sample the CPU too
perf tools: Work around strict aliasing related warnings
perf tools: Clean up warnings list in the Makefile
perf tools: Complete support for dynamic strings
...
Merge branch 'irq-threaded-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-threaded-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Do not mask oneshot edge type interrupts
genirq: Support nested threaded irq handling
genirq: Add buslock support
genirq: Add oneshot support
Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (28 commits)
rcu: Move end of special early-boot RCU operation earlier
rcu: Changes from reviews: avoid casts, fix/add warnings, improve comments
rcu: Create rcutree plugins to handle hotplug CPU for multi-level trees
rcu: Remove lockdep annotations from RCU's _notrace() API members
rcu: Add #ifdef to suppress __rcu_offline_cpu() warning in !HOTPLUG_CPU builds
rcu: Add CPU-offline processing for single-node configurations
rcu: Add "notrace" to RCU function headers used by ftrace
rcu: Remove CONFIG_PREEMPT_RCU
rcu: Merge preemptable-RCU functionality into hierarchical RCU
rcu: Simplify rcu_pending()/rcu_check_callbacks() API
rcu: Use debugfs_remove_recursive() simplify code.
rcu: Merge per-RCU-flavor initialization into pre-existing macro
rcu: Fix online/offline indication for rcudata.csv trace file
rcu: Consolidate sparse and lockdep declarations in include/linux/rcupdate.h
rcu: Renamings to increase RCU clarity
rcu: Move private definitions from include/linux/rcutree.h to kernel/rcutree.h
rcu: Expunge lingering references to CONFIG_CLASSIC_RCU, optimize on !SMP
rcu: Delay rcu_barrier() wait until beginning of next CPU-hotunplug operation.
rcu: Fix typo in rcu_irq_exit() comment header
rcu: Make rcupreempt_trace.c look at offline CPUs
...
Merge branch 'core-printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-printk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
printk: Fix "printk: Enable the use of more than one CON_BOOT (early console)"
printk: Restore previous console_loglevel when re-enabling logging
printk: Ensure that "console enabled" messages are printed on the console
printk: Enable the use of more than one CON_BOOT (early console)
Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (59 commits)
x86/gart: Do not select AGP for GART_IOMMU
x86/amd-iommu: Initialize passthrough mode when requested
x86/amd-iommu: Don't detach device from pt domain on driver unbind
x86/amd-iommu: Make sure a device is assigned in passthrough mode
x86/amd-iommu: Align locking between attach_device and detach_device
x86/amd-iommu: Fix device table write order
x86/amd-iommu: Add passthrough mode initialization functions
x86/amd-iommu: Add core functions for pd allocation/freeing
x86/dma: Mark iommu_pass_through as __read_mostly
x86/amd-iommu: Change iommu_map_page to support multiple page sizes
x86/amd-iommu: Support higher level PTEs in iommu_page_unmap
x86/amd-iommu: Remove old page table handling macros
x86/amd-iommu: Use 2-level page tables for dma_ops domains
x86/amd-iommu: Remove bus_addr check in iommu_map_page
x86/amd-iommu: Remove last usages of IOMMU_PTE_L0_INDEX
x86/amd-iommu: Change alloc_pte to support 64 bit address space
x86/amd-iommu: Introduce increase_address_space function
x86/amd-iommu: Flush domains if address space size was increased
x86/amd-iommu: Introduce set_dte_entry function
x86/amd-iommu: Add a gneric version of amd_iommu_flush_all_devices
...
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (102 commits)
crypto: sha-s390 - Fix warnings in import function
crypto: vmac - New hash algorithm for intel_txt support
crypto: api - Do not displace newly registered algorithms
crypto: ansi_cprng - Fix module initialization
crypto: xcbc - Fix alignment calculation of xcbc_tfm_ctx
crypto: fips - Depend on ansi_cprng
crypto: blkcipher - Do not use eseqiv on stream ciphers
crypto: ctr - Use chainiv on raw counter mode
Revert crypto: fips - Select CPRNG
crypto: rng - Fix typo
crypto: talitos - add support for 36 bit addressing
crypto: talitos - align locks on cache lines
crypto: talitos - simplify hmac data size calculation
crypto: mv_cesa - Add support for Orion5X crypto engine
crypto: cryptd - Add support to access underlaying shash
crypto: gcm - Use GHASH digest algorithm
crypto: ghash - Add GHASH digest algorithm for GCM
crypto: authenc - Convert to ahash
crypto: api - Fix aligned ctx helper
crypto: hmac - Prehash ipad/opad
...
Merge branch 'writeback' of git://git.kernel.dk/linux-2.6-block
* 'writeback' of git://git.kernel.dk/linux-2.6-block:
writeback: check for registered bdi in flusher add and inode dirty
writeback: add name to backing_dev_info
writeback: add some debug inode list counters to bdi stats
writeback: get rid of pdflush completely
writeback: switch to per-bdi threads for flushing data
writeback: move dirty inodes from super_block to backing_dev_info
writeback: get rid of generic_sync_sb_inodes() export
Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (54 commits)
[S390] tape: Use pr_xxx instead of dev_xxx in shared driver code
[S390] Wire up page fault events for software perf counters.
[S390] Remove smp_cpu_not_running.
[S390] Get rid of cpuid.h header file.
[S390] Limit cpu detection to 256 physical cpus.
[S390] tape: Fix device online messages
[S390] Enable guest page hinting by default.
[S390] use generic scatterlist.h
[S390] s390dbf: Add description for usage of "%s" in sprintf events
[S390] Initialize __LC_THREAD_INFO early.
[S390] fix recursive locking on page_table_lock
[S390] kvm: use console_initcall() to initialize s390 virtio console
[S390] tape: reversed order of labels
[S390] hypfs: Use "%u" instead of "%d" for unsigned ints in snprintf
[S390] kernel: Print an error message if kernel NSS cannot be defined
[S390] zcrypt: Free ap_device if dev_set_name fails.
[S390] zcrypt: Use spin_lock_bh in suspend callback
[S390] xpram: Remove checksum validation for suspend/resume
[S390] vmur: Invalid allocation sequence for vmur class
[S390] hypfs: remove useless variable qname
...
Merge branch 'kmemleak' of git://linux-arm.org/linux-2.6
* 'kmemleak' of git://linux-arm.org/linux-2.6:
kmemleak: Improve the "Early log buffer exceeded" error message
kmemleak: fix sparse warning for static declarations
kmemleak: fix sparse warning over overshadowed flags
kmemleak: move common painting code together
kmemleak: add clear command support
kmemleak: use bool for true/false questions
kmemleak: Do no create the clean-up thread during kmemleak_disable()
kmemleak: Scan all thread stacks
kmemleak: Don't scan uninitialized memory when kmemcheck is enabled
kmemleak: Ignore the aperture memory hole on x86_64
kmemleak: Printing of the objects hex dump
kmemleak: Do not report alloc_bootmem blocks as leaks
kmemleak: Save the stack trace for early allocations
kmemleak: Mark the early log buffer as __initdata
kmemleak: Dump object information on request
kmemleak: Allow rescheduling during an object scanning
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (48 commits)
RDMA/iwcm: Reject the connection when the cm_id is destroyed
RDMA/cxgb3: Clean up properly on FW mismatch failures
RDMA/cxgb3: Don't ignore insert_handle() failures
MAINTAINERS: InfiniBand/RDMA mailing list transition to vger
IB/mad: Allow tuning of QP0 and QP1 sizes
IB/mad: Fix possible lock-lock-timer deadlock
RDMA/nes: Map MTU to IB_MTU_* and correctly report link state
RDMA/nes: Rework the disconn routine for terminate and flushing
RDMA/nes: Use the flush code to fill in cqe error
RDMA/nes: Make poll_cq return correct number of wqes during flush
RDMA/nes: Use flush mechanism to set status for wqe in error
RDMA/nes: Implement Terminate Packet
RDMA/nes: Add CQ error handling
RDMA/nes: Clean out CQ completions when QP is destroyed
RDMA/nes: Change memory allocation for cqp request to GFP_ATOMIC
RDMA/nes: Allocate work item for disconnect event handling
RDMA/nes: Update refcnt during disconnect
IB/mthca: Don't allow userspace open while recovering from catastrophic error
IB/mthca: Distinguish multiple devices in /proc/interrupts
IB/mthca: Annotate CQ locking
...
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (57 commits)
binfmt_elf: fix PT_INTERP bss handling
TPM: Fixup boot probe timeout for tpm_tis driver
sysfs: Add labeling support for sysfs
LSM/SELinux: inode_{get,set,notify}secctx hooks to access LSM security context information.
VFS: Factor out part of vfs_setxattr so it can be called from the SELinux hook for inode_setsecctx.
KEYS: Add missing linux/tracehook.h #inclusions
KEYS: Fix default security_session_to_parent()
Security/SELinux: includecheck fix kernel/sysctl.c
KEYS: security_cred_alloc_blank() should return int under all circumstances
IMA: open new file for read
KEYS: Add a keyctl to install a process's session keyring on its parent [try #6]
KEYS: Extend TIF_NOTIFY_RESUME to (almost) all architectures [try #6]
KEYS: Do some whitespace cleanups [try #6]
KEYS: Make /proc/keys use keyid not numread as file position [try #6]
KEYS: Add garbage collection for dead, revoked and expired keys. [try #6]
KEYS: Flag dead keys to induce EKEYREVOKED [try #6]
KEYS: Allow keyctl_revoke() on keys that have SETATTR but not WRITE perm [try #6]
KEYS: Deal with dead-type keys appropriately [try #6]
CRED: Add some configurable debugging [try #6]
selinux: Support for the new TUN LSM hooks
...
Michael Holzheu [Fri, 11 Sep 2009 08:29:07 +0000 (10:29 +0200)]
[S390] tape: Use pr_xxx instead of dev_xxx in shared driver code
For messages from the tape core that is shared between the 3590 and 34xx
tape disciplines, we want to have the "tape" prefix instead of "tape_3590"
or "tape_34xx". In order to fix this, we now use the pr_xxx printk macros.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Merge cpuid.h header file into cpu.h.
While at it convert from typedef to struct declaration and also
convert cio code to use proper lowcore structure instead of casts.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Holzheu [Fri, 11 Sep 2009 08:29:02 +0000 (10:29 +0200)]
[S390] tape: Fix device online messages
Currently, when a tape device is set online and no cartridge is loaded, we
get the messages "The tape cartridge has been successfully unloaded" and
"Determining the size of the recorded area". These messages are not correct.
To fix this, we now print the "cartridge loaded/unloaded" messages only,
when the load/unload event really occurs. In addition to that, the message
"Determining the size of the recorded area" is only printed, if a cartridge
is loaded.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
"lockdep: Fix backtraces" reveales a bug in early setup code: when
lockdep tries to save a stack backtrace before setup_arch has been
called the lowcore pointer for the current thread info pointer isn't
initialized yet.
However our save stack backtrace code relies on it. If the pointer
isn't initialized the saved backtrace will have zero entries.
lockdep however relies (correctly) on the fact that that cannot
happen.
A write access to some random memory region is the result.
Fix this by initializing the thread info pointer early.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The page_table_lock is already acquired in __pmd_alloc (mm/memory.c) and
it tries to populate the pud/pgd with a new pmd allocated. If another
thread populates it before we get a chance, we free the pmd using
pmd_free().
On s390x, pmd_free(even pud_free ) is #defined to crst_table_free(),
which acquires the page_table_lock to protect the crst_table index updates.
Hence this ends up in a recursive locking of the page_table_lock.
The solution suggested by Dave Hansen is to use a new spin lock in the mmu
context to protect the access to the crst_list and the pgtable_list.
Reported-by: Suzuki Poulose <suzuki@in.ibm.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[S390] kvm: use console_initcall() to initialize s390 virtio console
Use a console_initcall() to initialize the s390 virtio console and
clean up s390 console initialization in setup.c.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Holzheu [Fri, 11 Sep 2009 08:28:50 +0000 (10:28 +0200)]
[S390] xpram: Remove checksum validation for suspend/resume
Currently in the suspend process checksums for the XPRAM partitions are
created and stored. During the resume process it is checked,
if the checksums are still the same. If this is not the case, a kernel panic
is triggered. Unfortunately this prevents XPRAM from beeing used as suspend
device, because in this case after the checksum has been created, the
memory image is written to XPRAM and therefore the contents of the suspend
partition is changed. In order to allow XPRAM to be used as suspend device,
this patch removes the checksum validation.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Holzheu [Fri, 11 Sep 2009 08:28:49 +0000 (10:28 +0200)]
[S390] vmur: Invalid allocation sequence for vmur class
The vmur class is allocated after the CCW driver is registered
and it is destroyed before the CCW driver is unregistered.
This is not the correct sequence, because the vmur class can be used
via driver core callbacks that are triggered during the CCW driver
deregistration. For Example:
1. vmur device is online
2. vmur module is unloaded
[S390] kernel: always keep machine flags in lowcore
Eleminate the local variable machine_flags and always change machine
flags directly in the lowcore.
This avoids confusion about when and why the two variables have to be
synchronized.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Nelson Elhage [Fri, 11 Sep 2009 08:28:44 +0000 (10:28 +0200)]
[S390] clean up linker script using new linker script macros.
Note that this patch moves .data.init_task inside _edata. In
addition, the alignment of .init.ramfs changes: It is now PAGE_ALIGNED
and __initramfs_end is arbitrarily aligned; Previously it was
only aligned to a 0x100-byte boundary, and always ended on an even
byte.
This change results in fewer output sections and in some data being
reordered, but should have no functional effect.
Signed-off-by: Nelson Elhage <nelhage@ksplice.com> Signed-off-by: Tim Abbott <tabbott@ksplice.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-s390@vger.kernel.org Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Tim Abbott [Fri, 11 Sep 2009 08:28:43 +0000 (10:28 +0200)]
[S390] Use macros for .data.page_aligned.
.data.page_aligned should not need a separate output section, so as
part of this cleanup I moved into the .data output section in the
linker scripts in order to eliminate unnecessary references to the
section name.
Remove the reference to .data.idt, since nothing is put into the
.data.idt section on the s390 architecture. It looks like Cyrill
Gorcunov posted a patch to remove the .data.idt code on s390
previously:
CCing him and the people who acked that patch in case there's a reason
it wasn't applied.
Signed-off-by: Tim Abbott <tabbott@ksplice.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[S390] move (io|sysc)_restore_trace_psw into .data section
The sysc_restore_trace_psw and io_restore_trace_psw storage locations
are created in the .text section. When creating and IPLing from a named
saved system (NSS), writing to these locations causes a protection exception
(because the .text section is mapped as shared read-only in the NSS).
To permit write access, move the storage locations into the .data section.
[S390] kernel: Append scpdata to kernel boot command line
Append scpdata to the kernel boot command line. If scpdata starts
with the equal sign (=), the kernel boot command line is replaced.
(For consistency with zIPL and IPL PARM parameters.)
To use scpdata for the kernel boot command line, scpdata must consist
of ascii characters only. If scpdata contains other characters,
scpdata is not appended to the kernel boot command line.
In addition, re-IPL is extended for setting scpdata for the next
Linux reboot.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Frank Munzert [Fri, 11 Sep 2009 08:28:39 +0000 (10:28 +0200)]
[S390] tape: use init_timer_on_stack() rather than init_timer()
With CONFIG_DEBUG_OBJECTS_TIMERS=y "chccwdev --online" for a tape device
will fail with message "ODEBUG: object is on stack, but not annotated".
We now use init_timer_on_stack.
Signed-off-by: Frank Munzert <munzert@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Introduce get_clock_monotonic() function which can be used to get a
(fast) timestamp. Resolution is the same as for get_clock(). The
only difference is that the timestamps are monotonic and don't jump
backward or forward.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Stefan Haberland [Fri, 11 Sep 2009 08:28:30 +0000 (10:28 +0200)]
[S390] dasd: fix message naming
This patch fixes message naming so that generic dasd messages do not
contain the device discipline. For this purpose the dev_ makros are
replaced by pr_ makros for generic dasd messages.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Fri, 11 Sep 2009 08:28:27 +0000 (10:28 +0200)]
[S390] cio: remove ccw_device init_name
We used the init_name to set the console ccw_device's name early
at the boot stage. This patch moves the name setting (for all ccw
devices) to the point where we actually register the device. At this
time we can do dynamic allocations and therefore use dev_set_name.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Fri, 11 Sep 2009 08:28:26 +0000 (10:28 +0200)]
[S390] cio: move final put_device to ccw_device_unregister
We use a test_and_clear_bit to prevent a device from being
unregistered twice. Unfortunately in this cases the "final"
put_device (from device_initialize) was issued more than once,
resulting in an use after free error. Fix this by moving this
put_device to ccw_device_unregister.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Fri, 11 Sep 2009 08:28:25 +0000 (10:28 +0200)]
[S390] cio: remove subchannel init_name
We used the init_name to set the console subchannels name early
at the boot stage. With the patch cio: fix memleak in subchannel validation
we moved the name setting to the point where we actually register the
console subchannel. At this time we can do dynamic allocations and therefore
use dev_set_name.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Fri, 11 Sep 2009 08:28:24 +0000 (10:28 +0200)]
[S390] cio: fix memleak in subchannel validation
When scanning for new subchannels we have a code path where we allocate
memory for a struct subchannel, set the device name (which is dynamically
allocated now) and do a check if the underlying device is blacklisted - if
so we free the subchannel structure.
Since we have not set up refcounting at this stage, the device name's memory
is lost. Fix this by moving the dev_set_name after the blacklist test.
Note: With this patch the init_name for the console subchannel becomes
virtually obsolete.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Fri, 11 Sep 2009 08:28:23 +0000 (10:28 +0200)]
[S390] cio: fix use after free in s390 debug feature
When using s390dbf with "%s" in sprintf format strings the string itself
is not copied to the dbf buffer.
Since in this case only pointers are stored in the s390dbf, we should
not use dev_name - which is bound to the lifetime of the device.
Reading this entry from s390dbf after the device was released will cause
an use after free error.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Ernst [Fri, 11 Sep 2009 08:28:21 +0000 (10:28 +0200)]
[S390] cio: failing set online/offline processing.
When unit checks trigger sensing the device state is set to W4SENSE
until sense completion; then the device state is set back to
ONLINE. If a unit check occurs while set online or set offline
requests are processed then it might happen that the device's
temporary W4SENSE state causes these functions to terminate,
leaving the device in an inconsistent state when the state is set
back to ONLINE later on so that the device cannot be set online or
offline any longer.
To solve this, set online/offline and related rollback or error
routines are processed only if the device is in a final or
DISCONNECTED state.
Signed-off-by: Michael Ernst <mernst@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Fri, 11 Sep 2009 08:28:20 +0000 (10:28 +0200)]
[S390] cio: ensure to hold a reference for deferred deregistration
Ensure to always hold an extra device reference for scheduling a
subchannel deregistration, by moving the get_device to
ccw_device_schedule_sch_unregister. This fixes an use after free
error in ccw_device_call_sch_unregister where put_device was called
on an already freed device structure.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Fri, 11 Sep 2009 08:28:17 +0000 (10:28 +0200)]
[S390] cio: fix not oper handling after failed [on|off]line processing
If online/offline processing of a ccw device fails, resulting in not
operational state, notify the driver and unregister the device in case
the driver dosn't want to keep it.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[S390] cio: move scsw helper functions to header file
All scsw helper functions are very short and usage of them shouldn't
result in function calls. Therefore we move them to a separate header
file.
Also saves a lot of EXPORT_SYMBOLs.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Path verification events occurring for offline devices are currently
ignored. As a result, offline devices are not removed, even though
they might no longer be accessible (for example because the last path
to the device was varied offline). Fix this by scheduling a status
evaluation for the affected subchannel when a path verification event
occurs.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
writeback: switch to per-bdi threads for flushing data
This gets rid of pdflush for bdi writeout and kupdated style cleaning.
pdflush writeout suffers from lack of locality and also requires more
threads to handle the same workload, since it has to work in a
non-blocking fashion against each queue. This also introduces lumpy
behaviour and potential request starvation, since pdflush can be starved
for queue access if others are accessing it. A sample ffsb workload that
does random writes to files is about 8% faster here on a simple SATA drive
during the benchmark phase. File layout also seems a LOT more smooth in
vmstat:
A 10 disk test with btrfs performs 26% faster with per-bdi flushing. A
SSD based writeback test on XFS performs over 20% better as well, with
the throughput being very stable around 1GB/sec, where pdflush only
manages 750MB/sec and fluctuates wildly while doing so. Random buffered
writes to many files behave a lot better as well, as does random mmap'ed
writes.
A separate thread is added to sync the super blocks. In the long term,
adding sync_supers_bdi() functionality could get rid of this thread again.
writeback: move dirty inodes from super_block to backing_dev_info
This is a first step at introducing per-bdi flusher threads. We should
have no change in behaviour, although sb_has_dirty_inodes() is now
ridiculously expensive, as there's no easy way to answer that question.
Not a huge problem, since it'll be deleted in subsequent patches.