Mark Fasheh [Fri, 22 Jun 2007 22:45:27 +0000 (15:45 -0700)]
ocfs2: simplify deallocation locking
Deallocation of suballocator blocks, most notably extent blocks, might
involve multiple suballocator inodes.
The locking for this can get extremely complicated, especially when the
suballocator inodes to delete from aren't known until deep within an
unrelated codepath.
Implement a simple scheme for recording the blocks to be unlinked so that
the actual deallocation can be done in a context which won't deadlock.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Mark Fasheh [Mon, 18 Jun 2007 18:12:36 +0000 (11:12 -0700)]
ocfs2: harden buffer check during mapping of page blocks
We don't want to submit buffer_new blocks for read i/o. This actually won't
happen right now because those requests during an allocating write are all nicely
aligned. It's probably a good idea to provide an explicit check though.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Mark Fasheh [Wed, 9 May 2007 00:47:32 +0000 (17:47 -0700)]
ocfs2: rework ocfs2_buffered_write_cluster()
Use some ideas from the new-aops patch series and turn
ocfs2_buffered_write_cluster() into a 2 stage operation with the caller
copying data in between. The code now understands multiple cluster writes as
a result of having to deal with a full page write for greater than 4k pages.
This sets us up to easily call into the write path during ->page_mkwrite().
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Sunil Mushran [Tue, 19 Jun 2007 00:00:24 +0000 (17:00 -0700)]
ocfs2: Add "preferred slot" mount option
ocfs2 will attempt to assign the node the slot# provided in the mount
option. Failure to assign the preferred slot is not an error. This small
feature can be useful for automated testing.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Joel Becker [Tue, 6 Feb 2007 23:45:39 +0000 (15:45 -0800)]
ocfs2: Wake up a starting region if it gets killed in the background.
Tell o2cb_region_dev_write() to wake up if rmdir(2) happens on the
heartbeat region while it is starting up. Then o2hb_region_dev_write()
can check to see if it is alive and act accordingly. This prevents a hang
(not being woken) and a crash (if it's woken by a signal).
Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Joel Becker [Tue, 19 Jun 2007 18:34:03 +0000 (11:34 -0700)]
ocfs2: live heartbeat depends on the local node configuration
Removing the local node configuration out from underneath a running
heartbeat is "bad". Provide an API in the ocfs2 nodemanager to request
a configfs dependancy on the local node, then use it in heartbeat.
Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Joel Becker [Fri, 15 Jun 2007 04:40:49 +0000 (21:40 -0700)]
ocfs2: Depend on configfs heartbeat items.
ocfs2 mounts require a heartbeat region. Use the new configfs_depend_item()
facility to actually depend on them so they can't go away from under us.
First, teach cluster/nodemanager.c to depend an item on the o2cb subsystem.
Then teach o2hb_register_callbacks to take a UUID and depend on the
appropriate region. Finally, teach all users of o2hb to pass a UUID or
NULL if they don't require a pin.
Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Joel Becker [Tue, 19 Jun 2007 01:06:09 +0000 (18:06 -0700)]
configfs: config item dependancies.
Sometimes other drivers depend on particular configfs items. For
example, ocfs2 mounts depend on a heartbeat region item. If that
region item is removed with rmdir(2), the ocfs2 mount must BUG or go
readonly. Not happy.
This provides two additional API calls: configfs_depend_item() and
configfs_undepend_item(). A client driver can call
configfs_depend_item() on an existing item to tell configfs that it is
depended on. configfs will then return -EBUSY from rmdir(2) for that
item. When the item is no longer depended on, the client driver calls
configfs_undepend_item() on it.
These API cannot be called underneath any configfs callbacks, as
they will conflict. They can block and allocate. A client driver
probably shouldn't calling them of its own gumption. Rather it should
be providing an API that external subsystems call.
How does this work? Imagine the ocfs2 mount process. When it mounts,
it asks for a heart region item. This is done via a call into the
heartbeat code. Inside the heartbeat code, the region item is looked
up. Here, the heartbeat code calls configfs_depend_item(). If it
succeeds, then heartbeat knows the region is safe to give to ocfs2.
If it fails, it was being torn down anyway, and heartbeat can gracefully
pass up an error.
[ Fixed some bad whitespace in configfs.txt. --Mark ]
Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Joel Becker [Sat, 7 Oct 2006 00:33:23 +0000 (17:33 -0700)]
configfs: accessing item hierarchy during rmdir(2)
Add a notification callback, ops->disconnect_notify(). It has the same
prototype as ->drop_item(), but it will be called just before the item
linkage is broken. This way, configfs users who want to do work while
the object is still in the heirarchy have a chance.
Client drivers will still need to config_item_put() in their
->drop_item(), if they implement it. They need do nothing in
->disconnect_notify(). They don't have to provide it if they don't
care. But someone who wants to be notified before ci_parent is set to
NULL can now be notified.
Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Johannes Berg [Fri, 22 Jun 2007 09:20:00 +0000 (11:20 +0200)]
[PATCH] configsfs buffer: use mutex
Seems copied from sysfs, but I don't see a reason here nor there to use
a semaphore instead of a mutex. Convert.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Joel Becker [Sat, 7 Jul 2007 06:33:17 +0000 (23:33 -0700)]
configfs: Convert subsystem semaphore to mutex
Convert the su_sem member of struct configfs_subsystem to a struct
mutex, as that's what it is. Also convert all the users and update
Documentation/configfs.txt and Documentation/configfs_example.c
accordingly.
[PATCH] configfs+dlm: Rename config_group_find_obj and state semantics clearly
Configfs being based upon sysfs code, config_group_find_obj() is probably
so named because of the similar kset_find_obj() in sysfs. However,
"kobject"s in sysfs become "config_item"s in configfs, so let's call it
config_group_find_item() instead, for sake of uniformity, and make
corresponding change in the users of this function.
BTW a crucial difference between kset_find_obj and config_group_find_item
is in locking expectations. kset_find_obj does its locking by itself, but
config_group_find_item expects the *caller* to do the locking. The reason
for this: kset's have their own locks, config_group's don't but instead
rely on the subsystem mutex. And, subsystem needn't necessarily be around
when config_group_find_item() is called.
So let's state these locking semantics explicitly, and rectify the comment,
otherwise bugs could continue to occur in future, as they did in the past
(refer commit d82b8191e238 in gfs2-2.6-fixes.git).
[ I also took the opportunity to fix some bad whitespace and
double-empty lines. --Joel ]
[PATCH] configfs+dlm: Separate out __CONFIGFS_ATTR into configfs.h
fs/dlm/config.c contains a useful generic macro called __CONFIGFS_ATTR
that is similar to sysfs' __ATTR macro that makes defining attributes
easy for any user of configfs. Separate it out into configfs.h so that
other users (forthcoming in dynamic netconsole patchset) can use it too.
Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in> Cc: David Teigland <teigland@redhat.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Satyam Sharma [Wed, 27 Jun 2007 10:32:14 +0000 (16:02 +0530)]
configfs: misc cleanups
1. item.c:config_item_cleanup() is a private function (only called by
config_item_release() in same file). However, it is spuriously
exported in include/linux/configfs.h, so remove that export and make
it static in item.c. Also, it is no longer exported / interface
function, so no need to give comment for this function (the comment
was stating obvious thing, anyway).
2. Kernel-doc comment format does not allow empty line between end of
comment and start of function (declaration line). There were several
such spurious empty lines in item.c, so fix them.
Joel Becker [Fri, 22 Jun 2007 20:07:02 +0000 (13:07 -0700)]
configfs: consistent attribute size
The attribute store/show code currently limits attributes at PAGE_SIZE.
This code comes from sysfs, where it still works that way.
However, PAGE_SIZE is not constant. A 16k attribute string works on
ia64 but not on x86. Really a subsystem shouldn't allow different
attribute sizes based on platform.
As such, limit all simple attributes to 4k. This works on all
platforms, and is consistent with all current code.
Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
mmc: at91_mci: fix hanging and rework to match flowcharts
mmc: at91_mci typo
sdhci: Fix "Unexpected interrupt" handling
mmc: fix silly copy-and-paste error
mmc: move layer init and workqueue to core file
mmc: refactor host class handling
mmc: refactor bus operations
sdhci: add ene controller id
mmc: bounce requests for simple hosts
Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (40 commits)
bonding/bond_main.c: make 2 functions static
ps3: gigabit ethernet driver for PS3, take3
[netdrvr] Fix dependencies for ax88796 ne2k clone driver
eHEA: Capability flag for DLPAR support
Remove sk98lin ethernet driver.
sunhme.c:quattro_pci_find() must be __devinit
bonding / ipv6: no addrconf for slaves separately from master
atl1: remove write-only var in tx handler
macmace: use "unsigned long flags;"
Cleanup usbnet_probe() return value handling
netxen: deinline and sparse fix
eeprom_93cx6: shorten pulse timing to match spec (bis)
phylib: Add Marvell 88E1112 phy id
phylib: cleanup marvell.c a bit
AX88796 network driver
IOC3: Switch to pci refcounting safe APIs
e100: Fix Tyan motherboard e100 not receiving IPMI commands
QE Ethernet driver writes to wrong register to mask interrupts
rrunner.c:rr_init() must be __devinit
tokenring/3c359.c:xl_init() must be __devinit
...
Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: (32 commits)
[libata] sata_mv: print out additional chip info during probe
[libata] Use ATA_UDMAx standard masks when filling driver's udma_mask info
[libata] AHCI: Add support for Marvell AHCI-like chips (initially 6145)
[libata] Clean up driver udma_mask initializers
libata: Support chips with 64K PRD quirk
Add a PCI ID for santa rosa's PATA controller.
sata_sil24: sil24_interrupt() micro-optimisation
Add irq_flags to struct pata_platform_info
sata_promise: cleanups
[libata] pata_ixp4xx: kill unused var
ata_piix: fix pio/mwdma programming
[libata] ahci: minor internal cleanups
[ATA] Add named constant for ATAPI command DEVICE RESET
[libata] sata_sx4, sata_via: minor documentation updates
[libata] ahci: minor internal cleanups
[libata] ahci: Factor out SATA port init into a separate function
[libata] pata_sil680: minor cleanups from benh
[libata] sata_sx4: named constant cleanup
[libata] pata_ixp4xx: convert to new EH
[libata] pdc_adma: Reorder initializers with a couple structs
...
Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
[S390] vmlogrdr function annotation.
[S390] s390: rename CPU_IDLE to S390_CPU_IDLE
[S390] cio: Remove prototype for non-existing function cmf_reset().
[S390] zcrypt: fix request timeout handling
[S390] system call optimization.
[S390] dasd: Avoid compile warnings on !CONFIG_DASD_PROFILE
[S390] Remove volatile from atomic_t
[S390] Program check in diag 210 under 31 bit
[S390] Bogomips calculation for 64 bit.
[S390] smp: Merge smp_count_cpus() and smp_get_save_areas().
[S390] zcore: Fix __user annotation.
[S390] fixed cdl-format detection.
[S390] sclp: Test facility list before executing a service call.
[S390] sclp: introduce some new interfaces.
[S390] Fixed comment typo.
[S390] vmcp cleanup
Merge branch 'splice-2.6.23' of git://git.kernel.dk/data/git/linux-2.6-block
* 'splice-2.6.23' of git://git.kernel.dk/data/git/linux-2.6-block:
pipe: add documentation and comments
pipe: change the ->pin() operation to ->confirm()
Remove remnants of sendfile()
xip sendfile removal
splice: completely document external interface with kerneldoc
sendfile: remove bad_sendfile() from bad_file_ops
shmem: convert to using splice instead of sendfile()
relay: use splice_to_pipe() instead of open-coding the pipe loop
pipe: allow passing around of ops private pointer
splice: divorce the splice structure/function definitions from the pipe header
splice: relay support
sendfile: convert nfsd to splice_direct_to_actor()
sendfile: convert nfs to using splice_read()
loop: convert to using splice_direct_to_actor() instead of sendfile()
splice: add void cookie to the actor data
sendfile: kill generic_file_sendfile()
sendfile: remove .sendfile from filesystems that use generic_file_sendfile()
sys_sendfile: switch to using ->splice_read, if available
vmsplice: add vmsplice-to-user support
splice: abstract out actor data
Merge branch 'trivial-2.6.23' of git://git.kernel.dk/data/git/linux-2.6-block
* 'trivial-2.6.23' of git://git.kernel.dk/data/git/linux-2.6-block:
Documentation/block/barrier.txt is not in sync with the actual code: - blk_queue_ordered() no longer has a gfp_mask parameter - blk_queue_ordered_locked() no longer exists - sd_prepare_flush() looks slightly different
Use list_for_each_entry() instead of list_for_each() in the block device
Make a "menuconfig" out of the Kconfig objects "menu, ..., endmenu",
block/Kconfig already has its own "menuconfig" so remove these
Use menuconfigs instead of menus, so the whole menu can be disabled at once
cfq-iosched: fix async queue behaviour
unexport bio_{,un}map_user
Remove legacy CDROM drivers
[PATCH] fix request->cmd == INT cases
cciss: add new controller support for P700m
[PATCH] Remove acsi.c
[BLOCK] drop unnecessary bvec rewinding from flush_dry_bio_endio
[PATCH] cdrom_sysctl_info fix
blk_hw_contig_segment(): bad segment size checks
[TRIVIAL PATCH] Kill blk_congestion_wait() stub for !CONFIG_BLOCK
Adrian Bunk [Mon, 9 Jul 2007 18:51:12 +0000 (11:51 -0700)]
bonding/bond_main.c: make 2 functions static
Signed-off-by: Adrian Bunk <bunk@stusta.de> Cc: Chad Tindel <ctindel@users.sourceforge.net> Cc: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
This is the third submission of the network driver for PS3.
The differences from the previous one are:
- renamed source file names so that their prefix can match
with the module name
- added cbe-oss-dev@ozlabs.org line for MAINTAINER file
- changed some in copyright comments
If there are no more comments, please apply for 2.6.23.
Thank you
--
Subject: PS3: Ethernet driver
From: Masakazu Mokuno <mokuno@sm.sony.co.jp>
Add Gigabit Ethernet support for the PS3 game console. The module will
be called ps3_gelic.
Jay Vosburgh [Mon, 9 Jul 2007 17:42:47 +0000 (10:42 -0700)]
bonding / ipv6: no addrconf for slaves separately from master
At present, when a device is enslaved to bonding, if ipv6 is
active then addrconf will be initated on the slave (because it is closed
then opened during the enslavement processing). This causes DAD and RS
packets to be sent from the slave. These packets in turn can confuse
switches that perform ipv6 snooping, causing them to incorrectly update
their forwarding tables (if, e.g., the slave being added is an inactve
backup that won't be used right away) and direct traffic away from the
active slave to a backup slave (where the incoming packets will be
dropped).
This patch alters the behavior so that addrconf will only run on
the master device itself. I believe this is logically correct, as it
prevents slaves from having an IPv6 identity independent from the
master. This is consistent with the IPv4 behavior for bonding.
This is accomplished by (a) having bonding set IFF_SLAVE sooner
in the enslavement processing than currently occurs (before open, not
after), and (b) having ipv6 addrconf ignore UP and CHANGE events on
slave devices.
The eql driver also uses the IFF_SLAVE flag. I inspected eql,
and I believe this change is reasonable for its usage of IFF_SLAVE, but
I did not test it.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Peter Korsgaard [Mon, 2 Jul 2007 22:46:42 +0000 (00:46 +0200)]
Cleanup usbnet_probe() return value handling
usbnet_probe() handles a positive return value from the driver bind()
function as success, but will later only setup the status handler if the
return value was zero, leading to confusion. Patch adjusts this to accept
positive values as success in both checks.
Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Get rid of dubious casts to (void *) which causes a sparse warning.
And move largeish function from inline to the one file that uses the code,
the compiler can then decide to inline it.
Compile tested only.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Olof Johansson [Tue, 3 Jul 2007 21:23:46 +0000 (16:23 -0500)]
phylib: cleanup marvell.c a bit
Simplify the marvell driver init a bit: Make the supported devices an
array instead of explicitly registering each structure. This makes it
considerably easier to add new devices down the road.
Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Robert P. J. Day [Tue, 10 Jul 2007 10:37:56 +0000 (06:37 -0400)]
[MIPS] PNX8550: Cleanup proc code.
Here's a slightly cleaner way of creating the /proc structure for the
pnx8850. mostly, it creates a directory with default mode 555, since the
one you're creating is mode 444, which is somewhat unusual for a directory
under /proc.
[MIPS] Change names of local variables to silence sparse
This patch is an workaround for these sparse warnings:
linux/include/linux/calc64.h:25:17: warning: symbol '__quot' shadows an earlier one
linux/include/linux/calc64.h:25:17: originally declared here
linux/include/linux/calc64.h:25:17: warning: symbol '__mod' shadows an earlier one
linux/include/linux/calc64.h:25:17: originally declared here
[MIPS] Add debugfs files to show fpuemu statistics
Export contents of struct mips_fpu_emulator_stats via debugfs.
There is no way to read these statistics for now but they (at least
the "emulated" count) might be sometimes useful for performance tuning
on FPU-less CPUs.
[MIPS] rbtx4938: Fix secondary PCIC and glue internal NICs
* Fix pci ops for secondary PCIC
* Do not reserve 1MB for PCI MEM region (leave PCIBIOS_MIN_MEM zero)
* Use platform_device to provide ethernet addresses for internal NICs.
(background: TX49XX SoCs include PCI NIC (TC35815 compatible)
connected via its internal PCI bus, but the NIC's PROM interface is
not connected to SEEPROM. So we must provide its ethernet address
by another way.)
* Check return value of early_read_config_word()
Atsushi Nemoto [Fri, 29 Jun 2007 13:34:53 +0000 (22:34 +0900)]
[MIPS] tc35815: Load MAC address via platform_device
TX49XX SoCs include PCI NIC (TC35815 compatible) connected via its
internal PCI bus, but the NIC's PROM interface is not connected to
SEEPROM. So we must provide its ethernet address by another way.
Atsushi Nemoto [Mon, 25 Jun 2007 16:14:01 +0000 (01:14 +0900)]
[MIPS] Make ioremap() work on TX39/49 special unmapped segment
TX39XX and TX49XX have "reserved" segment in CKSEG3 area.
0xff000000-0xff3fffff on TX49XX and 0xff000000-0xfffeffff on TX39XX
are reserved (unmapped, uncached). Controllers on these SoCs are
placed in this segment.
This patch add plat_ioremap() and plat_iounmap() to override default
behavior and implement these hooks for TX39/TX49.
- use RTC_CLASS instead of GEN_RTC
- get rid of ds1216 in favour of a RTC_CLASS driver
- use correct console device for older RM400
- use physical addresses for 82596 device
- use 128 byte L1 cache line size (this is needed because most of the
SNI caches are using 128 L2 cache lines)
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
[MIPS] Enable support for the userlocal hardware register
Which will cut down the cost of RDHWR $29 which is used to obtain the
TLS pointer and so far being emulated in software down to a single cycle
operation.
This is an optimised implementation of early printk() for the DECstation.
After the recent conversion to a MIPS-specific generic routine using a
character-by-character output the performance dropped significantly.
This change reverts to the previous speed -- even at 9600 bps of the
serial console the difference is visible with a naked eye; I presume for a
framebuffer it is even worse (it may depend on exactly which one is used
though).
Additionally the change includes a fix for a problem that the old
implementation had -- the format used would not actually limit the length
of the string output. This new implementation uses a local buffer to deal
with it -- even with this additional copying it is much faster than the
generic function.
Plus this driver is registered much earlier than the generic one,
allowing one to see critical messages, such as one about an incorrect CPU
setting used, that are produced beforehand. :-)
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Franck Bui-Huu [Mon, 4 Jun 2007 15:46:35 +0000 (17:46 +0200)]
[MIPS] Fix PHYS_OFFSET for 64-bits kernels with 32-bits symbols
The current implementation of __pa() for 64-bits kernels with 32-bits
symbols is broken. In this configuration, we need 2 values for
PAGE_OFFSET, one in XKPHYS and the other in CKSEG0 space.
When the value in CKSEG0 space is used, it doesn't take into account
of PHYS_OFFSET. Even worse we can't redefine this value.
The patch restores CPHYSADDR() but in __pa()'s implementation because
it removes the need of 2 PAGE_OFFSET.
OTOH, CPHYSADDR() is quite bad when dealing with mapped kernels. So
this patch assumes there's no need to deal with such kernel in 64-bits
world.
Franck Bui-Huu [Mon, 4 Jun 2007 15:46:33 +0000 (17:46 +0200)]
[MIPS] Make PAGE_OFFSET aware of PHYS_OFFSET
For platforms that use PHYS_OFFSET and do not use a mapped kernel,
this patch automatically adds PHYS_OFFSET into PAGE_OFFSET.
Therefore there are no more needs for them to redefine PAGE_OFFSET.
For mapped kernel, they need to redefine PAGE_OFFSET anyways.
No point in adding yet another #ifdef for Loongson since all this mask is
being used for is converting an XKPHYS address into a physical address
anyway. So replace all definitions by one with the highest architectural
possible value.