Samuel Jero [Mon, 25 Jul 2011 02:57:49 +0000 (20:57 -0600)]
dccp ccid-2: check Ack Ratio when reducing cwnd
This patch causes CCID-2 to check the Ack Ratio after reducing the congestion
window. If the Ack Ratio is greater than the congestion window, it is
reduced. This prevents timeouts caused by an Ack Ratio larger than the
congestion window.
In this situation, we choose to set the Ack Ratio to half the congestion window
(or one if that's zero) so that if we lose one ack we don't trigger a timeout.
Signed-off-by: Samuel Jero <sj323707@ohio.edu> Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
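The rule described in this commit can be sketched as a small standalone helper. This is a minimal illustration only: clamp_ack_ratio() is a hypothetical name, not the function the patch adds, and the real code changes the Ack Ratio through feature negotiation rather than returning a value.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel function): return the Ack Ratio to
 * use after the congestion window has been reduced to `cwnd`.
 */
static uint32_t clamp_ack_ratio(uint32_t ack_ratio, uint32_t cwnd)
{
    if (ack_ratio > cwnd) {
        /* Half the window, so that one lost ack does not cause a timeout. */
        ack_ratio = cwnd / 2;
        if (ack_ratio == 0)
            ack_ratio = 1;  /* fall back to one if half the window is zero */
    }
    return ack_ratio;
}
```

An Ack Ratio that is already at or below cwnd is returned unchanged.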
Samuel Jero [Mon, 25 Jul 2011 03:05:16 +0000 (21:05 -0600)]
dccp ccid-2: increment cwnd correctly
This patch fixes an issue where CCID-2 will not increase the congestion
window for numerous RTTs after an idle period, application-limited period,
or a loss once the algorithm is in Congestion Avoidance.
What happens is that, when CCID-2 is in Congestion Avoidance mode, it will
increase hc->tx_packets_acked by one for every packet and will increment cwnd
every cwnd packets. However, if there is now an idle period in the connection,
cwnd will be reduced, possibly below the slow start threshold. This will
cause the connection to go into Slow Start. However, in Slow Start CCID-2
performs this test to increment cwnd every second ack:
++hc->tx_packets_acked == 2
Unfortunately, this test is incorrect if cwnd before the idle period
was larger than 2 and tx_packets_acked was close to cwnd. For example:
cwnd=50 and tx_packets_acked=45.
In this case, the current code will keep incrementing tx_packets_acked until it
equals two, which will only happen once tx_packets_acked (an unsigned 32-bit
integer) overflows.
My fix is simply to change that test to check for tx_packets_acked greater than
or equal to two in slow start.
Signed-off-by: Samuel Jero <sj323707@ohio.edu> Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
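The difference between the two tests can be reproduced outside the kernel. The sketch below is a simplified model, assuming the counter is reset to zero whenever cwnd is incremented; struct cc_state and both function names are hypothetical and only the comparison operator differs between them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the relevant CCID-2 sender state. */
struct cc_state {
    uint32_t cwnd;
    uint32_t packets_acked;
};

/* Old slow-start test: fires only when the counter is exactly 2. */
static bool slow_start_ack_old(struct cc_state *s)
{
    if (++s->packets_acked == 2) {
        s->cwnd++;
        s->packets_acked = 0;
        return true;
    }
    return false;
}

/* Fixed test: a stale counter left over from Congestion Avoidance still fires. */
static bool slow_start_ack_fixed(struct cc_state *s)
{
    if (++s->packets_acked >= 2) {
        s->cwnd++;
        s->packets_acked = 0;
        return true;
    }
    return false;
}
```

Starting from packets_acked=45, the old variant never increments cwnd until the counter wraps, while the fixed variant increments it on the next ack and resets the counter.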
Samuel Jero [Mon, 25 Jul 2011 02:49:19 +0000 (20:49 -0600)]
dccp ccid-2: prevent cwnd > Sequence Window
Add a check to prevent CCID-2 from increasing the cwnd greater than the
Sequence Window.
When the congestion window becomes bigger than the Sequence Window, CCID-2
will attempt to keep more data in the network than the DCCP Sequence Window
code considers possible. This results in the Sequence Window code issuing
a Sync, thereby inducing needless overhead. Further, if this occurs at the
sender, CCID-2 will never detect the problem because the Acks it receives
will indicate no losses. I have seen this cause a drop of 1/3rd in throughput
for a connection.
Also add code to adjust the Sequence Window to be about 5 times the number of
packets in the network (RFC 4340, 7.5.2) and to adjust the Ack Ratio so that
the remote Sequence Window will hold about 5 times the number of packets in
the network. This allows the congestion window to increase correctly without
being limited by the Sequence Window.
Signed-off-by: Samuel Jero <sj323707@ohio.edu> Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
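The two constraints on the sender side can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: the helper names and the DEMO_WIN_FACTOR constant are hypothetical, and the Ack Ratio adjustment for the remote Sequence Window is left out.

```c
#include <stdint.h>

/* "About 5 times the number of packets in the network" (RFC 4340, 7.5.2). */
#define DEMO_WIN_FACTOR 5u

/* Never let the congestion window exceed the Sequence Window. */
static uint32_t cap_cwnd(uint32_t cwnd, uint32_t seq_window)
{
    return cwnd > seq_window ? seq_window : cwnd;
}

/* Sequence Window the sender would aim for, given the current cwnd. */
static uint32_t target_seq_window(uint32_t cwnd)
{
    return DEMO_WIN_FACTOR * cwnd;
}
```

With a Sequence Window of 100, a cwnd of 120 would be capped at 100, and a cwnd of 20 corresponds to a target Sequence Window of 100.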
dccp ccid-2: use feature-negotiation to report Ack Ratio changes
This uses the new feature-negotiation framework to signal Ack Ratio changes,
as required by RFC 4341, sec. 6.1.2.
That raises some problems with CCID-2, which at the moment can not cope
gracefully with Ack Ratios > 1. Since these issues are not directly related
to feature negotiation, they are marked by a FIXME.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Samuel Jero <sj323707@ohio.edu> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.uk>
Samuel Jero [Mon, 25 Jul 2011 03:06:37 +0000 (21:06 -0600)]
dccp: send Confirm options only once
If a connection is in the OPEN state, remove feature negotiation Confirm
options from the list of options after sending them once; as such options
are NOT supposed to be retransmitted and are ONLY supposed to be sent in
response to a Change option (RFC 4340 6.2).
Signed-off-by: Samuel Jero <sj323707@ohio.edu> Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
dccp: support for the exchange of NN options in established state 1/2
In contrast to static feature negotiation at the beginning of a connection, this
patch introduces support for exchange of dynamically changing options.
Such an update/exchange is necessary in at least two cases:
* CCID-2's Ack Ratio (RFC 4341, 6.1.2) which changes during the connection;
* Sequence Window values that, as per RFC 4340, 7.5.2, should be sent "as
the connection progresses".
Both are non-negotiable (NN) features, which means that no new capabilities
are negotiated, but rather that changes in known parameters are brought
up-to-date at either end.
These characteristics are reflected by the implementation:
* only NN options can be exchanged after connection setup;
* an ack is scheduled directly after activation to speed up the update;
* CCIDs may request changes to an NN feature even if a negotiation for that
feature is already underway: this is required by CCID-2, where changes in
cwnd necessitate Ack Ratio changes, such that the previous Ack Ratio (which
is still being negotiated) would cause irrecoverable RTO timeouts (thanks
to work by Samuel Jero).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Samuel Jero <sj323707@ohio.edu> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.uk>
Joe Perches [Fri, 29 Jul 2011 16:38:14 +0000 (16:38 +0000)]
ipg: Use current logging styles
Add and use pr_fmt, pr_<level> and netdev_<level>.
Convert printks with %[n].[n]x to %nx to be shorter
and clearer.
Consolidate printks to use a single printk rather
than continued printks without KERN_CONT.
Removed unnecessary trailing periods.
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Now printing states of essential registers once fw hang has been detected.
Bumped up the driver version to 5.0.22
Signed-off-by: Sritej Velaga <sritej.velaga@qlogic.com> Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Place for gathering FW dump template has been moved to the FW restart path
so that the driver can check if a newer FW version is available and in that case
it replaces the existing FW dump template with the newer template.
Signed-off-by: Sritej Velaga <sritej.velaga@qlogic.com> Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Driver should not check for heart beat anymore when FW is hung, rather it
should restart the FW.
Signed-off-by: Sritej Velaga <sritej.velaga@qlogic.com> Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Change details:
- Split the hw interface into common and asic specific to support new asic
in the future.
- Fix bfa_ioc_ct_isr_mode_set() to also include the case that we are already
in the desired msix mode.
Signed-off-by: Rasesh Mody <rmody@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Change details:
- ioc->cna is always set to 1 for eth functions, remove the check that
asserts IOC is in CNA mode in bfa_ioc_firmware_lock() and
bfa_ioc_firmware_unlock() in bfa_ioc_ct.c.
Signed-off-by: Rasesh Mody <rmody@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for 64-bit stats interface, the following cleanups help
streamline the code:
1) made some more rx/tx stats stored by driver 64 bit
2) made some HW stats (err/drop counters) stored in be_drv_stats 32 bit to
keep the code simple as BE provides 32-bit counters only.
3) removed duplication of netdev stats in ethtool
4) removed some unnecessary stats and fixed some names
Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 1 Aug 2011 00:31:44 +0000 (14:31 -1000)]
Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: (46 commits)
mfd: Fix mismatch in twl4030 mutex lock-unlock
mfd: twl6030-pwm.c needs MODULE_LICENSE
mfd: Fix the omap-usb-host clock API usage on usbhs_disable()
mfd: Acknowledge WM8994 IRQs before reporting
mfd: Acknowlege all WM831x IRQs before we handle them
mfd: Avoid two assignments if failures happen in tps65910_i2c_probe
regulator: Storing tps65912 error codes in u8
mfd: Don't leak init_data in tps65910_i2c_probe
regulator: aat2870: Add AAT2870 regulator driver
backlight: Add AAT2870 backlight driver
mfd: Add AAT2870 mfd driver
mfd: Remove dead code from max8997-irq
mfd: Move TPS55910 Kconfig option
mfd: Fix missing stmpe kerneldoc
mfd: Fix off-by-one value range checking for tps65912_i2c_write
mfd: Add devices for WM831x clocking module
mfd: Remove comp{1,2}_threshold sysfs entries in tps65911_comparator_remove
mfd: Don't ask about the TPS65912 core driver in Kconfig
mfd: Fix off by one in WM831x IRQ code
mfd: Add tps65921 support from twl-core
...
Linus Torvalds [Mon, 1 Aug 2011 00:30:59 +0000 (14:30 -1000)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k/math-emu: Remove unnecessary code
m68k/math-emu: Remove commented out old code
m68k: Kill warning in setup_arch() when compiling for Sun3
m68k/atari: Prefix GPIO_{IN,OUT} with CODEC_
sparc: iounmap() and *_free_coherent() - Use lookup_resource()
m68k/atari: Reserve some ST-RAM early on for device buffer use
m68k/amiga: Chip RAM - Use lookup_resource()
resources: Add lookup_resource()
sparc: _sparc_find_resource() should check for exact matches
m68k/amiga: Chip RAM - Offset resource end by CHIP_PHYSADDR
m68k/amiga: Chip RAM - Use resource_size() to fix off-by-one error
m68k/amiga: Chip RAM - Change chipavail to an atomic_t
m68k/amiga: Chip RAM - Always allocate from the start of memory
m68k/amiga: Chip RAM - Convert from printk() to pr_*()
m68k/amiga: Chip RAM - Use tabs for indentation
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: Hemanth V <hemanthv@ti.com> Acked-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
mfd: Fix the omap-usb-host clock API usage on usbhs_disable()
The usbhs_disable function was invoking clk_enable() instead of
clk_disable(), thus only increasing the clock usage counter and
preventing this particular clock from ever being turned off.
Because of this, the power domain of the OMAP4 USB Host subsystem
would never reach lower power states. This patch calls clk_disable()
in the usbhs_disable function instead.
Signed-off-by: Keshava Munegowda <keshava_mgowda@ti.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
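The effect of the mismatched call is easy to see with a mock usage counter. The sketch below does not use the kernel clock API; struct mock_clk and all function names are hypothetical stand-ins that only model the reference counting described above.

```c
/* Mock clock with a usage counter; it is "on" while the counter is non-zero. */
struct mock_clk {
    int usecount;
};

static void mock_clk_enable(struct mock_clk *c)
{
    c->usecount++;
}

static void mock_clk_disable(struct mock_clk *c)
{
    if (c->usecount > 0)
        c->usecount--;
}

static int mock_clk_is_on(const struct mock_clk *c)
{
    return c->usecount > 0;
}

/* Buggy disable path: enables again, so the counter only ever grows. */
static void buggy_disable(struct mock_clk *c)
{
    mock_clk_enable(c);
}

/* Fixed disable path: balances the earlier enable. */
static void fixed_disable(struct mock_clk *c)
{
    mock_clk_disable(c);
}
```

After one enable followed by the buggy disable the counter is 2 and the clock stays on; with the fixed path it drops back to 0.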
Mark Brown [Wed, 20 Jul 2011 16:05:13 +0000 (17:05 +0100)]
mfd: Acknowlege all WM831x IRQs before we handle them
Ensure that we never have a window where we've handled an interrupt (and
therefore need to be notified of new events) but haven't yet told the
interrupt controller that this is the case (so any new events will be
discarded).
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
mfd: Avoid two assignments if failures happen in tps65910_i2c_probe
In drivers/mfd/tps65910.c:tps65910_i2c_probe() there's potential for a
tiny optimization.
We assign to init_data->irq and init_data->irq_base long before we
need them, and there are two potential exits from the function before
they are needed.
Moving the assignments below these two potential exits means we
completely avoid doing them in these two (failure) cases.
Jin Park [Mon, 4 Jul 2011 08:43:42 +0000 (17:43 +0900)]
regulator: aat2870: Add AAT2870 regulator driver
Add regulator driver for AnalogicTech AAT2870.
Signed-off-by: Jin Park <jinyoungp@nvidia.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We either hit one of the cases or the default in the switch statement
in get_i2c(), so the 'return ERR_PTR(-EINVAL);' at the end of the
function is just dead code - remove it.
Signed-off-by: Jesper Juhl <jj@chaosbits.net> Acked-by: MyungJoo Ham <myungjoo.ham@samsung.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Having another TPS chip at the end of the Kconfig when all its
relatives are grouped together in their own section seems totally
counter-intuitive. Move it, also in the Makefile.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Mark Brown [Wed, 22 Jun 2011 13:53:58 +0000 (14:53 +0100)]
mfd: Don't ask about the TPS65912 core driver in Kconfig
The user has to select the I2C and SPI drivers individually and they select
the core driver for the device so there's no point in presenting the user
with an option for the core driver.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The GPIO IRQs aren't the first IRQs defined, so we need to subtract the base
for the GPIOs as well to use them as array indexes.
Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
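The double subtraction can be illustrated with a made-up IRQ layout. Everything named DEMO_* below is hypothetical and does not reflect the real WM831x IRQ numbering; the point is only that both the chip's IRQ base and the offset of the first GPIO IRQ must be removed before indexing a per-GPIO array.

```c
/* Hypothetical chip-relative IRQ layout: three non-GPIO IRQs come first. */
enum {
    DEMO_IRQ_TEMP   = 0,
    DEMO_IRQ_RTC    = 1,
    DEMO_IRQ_PWR    = 2,
    DEMO_IRQ_GPIO_1 = 3,   /* first GPIO IRQ */
    DEMO_NUM_GPIOS  = 4
};

/* Map a system IRQ number to an index into a per-GPIO array, or -1. */
static int demo_gpio_index(int irq, int irq_base)
{
    int rel = irq - irq_base;          /* chip-relative IRQ number */
    int idx = rel - DEMO_IRQ_GPIO_1;   /* GPIO-relative index */

    if (idx < 0 || idx >= DEMO_NUM_GPIOS)
        return -1;                     /* not a GPIO IRQ */
    return idx;
}
```

Subtracting only irq_base would give 3 for the first GPIO IRQ and index past the intended slot.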
Mark Brown [Mon, 20 Jun 2011 10:47:55 +0000 (11:47 +0100)]
mfd: Implement tps65910 IRQ cleanup
The tps65910_irq_exit() cleanup function was generating a warning from
sparse due to the lack of a prototype. This wasn't causing GCC warnings
as the driver wasn't cleaning up its IRQs on exit at all so there was no
use of an unprototyped function.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Linus Walleij [Thu, 9 Jun 2011 21:57:57 +0000 (23:57 +0200)]
mfd: Clean-up ab8500 register file
This adds a previously undefined test register and removes a
number of double-defined accessory detect registers (they are
already defined higher up in the file).
Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Linus Walleij [Thu, 9 Jun 2011 21:57:45 +0000 (23:57 +0200)]
mfd: Update ab8500 subdevice list
This synchronizes the subdevice entries for the AB8500 MFD driver
with the latest development of subdrivers for things like battery
charging and temperature monitoring.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Margarita Olaya [Thu, 9 Jun 2011 19:50:27 +0000 (14:50 -0500)]
tps65912: add regulator driver
The tps65912 consists of 4 DCDCs and 10 LDOs. The output voltages can be
configured via the SPI or I2C interface; they are meant to supply power
to the main processor and other components.
Signed-off-by: Margarita Olaya Cabrera <magi@slimlogic.co.uk> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Liam Girdwood <lrg@ti.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Peter Huewe [Mon, 6 Jun 2011 20:43:32 +0000 (22:43 +0200)]
mfd: Use kstrtoul_from_user in ab8500
This patch replaces the code for getting an unsigned long from a
userspace buffer with a simple call to kstrtoul_from_user.
This makes it easier to read and less error prone.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Peter Huewe [Mon, 6 Jun 2011 20:43:31 +0000 (22:43 +0200)]
mfd: Use kstrtoul_from_user in ab3550
This patch replaces the code for getting an unsigned long from a
userspace buffer with a simple call to kstrtoul_from_user.
This makes it easier to read and less error prone.
Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Mark Brown [Tue, 7 Jun 2011 10:49:42 +0000 (11:49 +0100)]
mfd: Fix WM8994 IRQ register cache restore on resume
When the byte swap was factored out into the per-register I/O functions
the register restore for the IRQ mask cache (which we use and store in
CPU native format for the interrupt handler) was not updated to do a byte
swap when it uses the bulk I/O. Fix this by writing the cache out one
register at a time.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Mark Brown [Tue, 7 Jun 2011 10:47:28 +0000 (11:47 +0100)]
mfd: Support multiple active WM831x AUXADC conversions
The WM831x AUXADC hardware can schedule multiple conversions at once,
allowing higher performance when more than one source is in use as we
can have the hardware start new conversions without having to wait for
a register write.
Take advantage of this in the interrupt driven case, maintaining a list of
callers that are waiting for AUXADC conversions and completing them all
simultaneously. The external interface of the AUXADC is not changed so
there will be limited use of the feature immediately.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Sascha Hauer [Thu, 2 Jun 2011 18:18:54 +0000 (19:18 +0100)]
mfd: Allocate wm835x irq descs dynamically
This allows boards to leave the irq_base field uninitialized and
prevents them having to reserve irqs in the platform.
pdata can be optional for irq support now. Without pdata the
driver allocates some free irq range. With pdata and irq_base > 0
the driver allocates exactly the specified irq.
Without pdata the irq defaults to IRQF_TRIGGER_LOW.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Mark Brown [Thu, 2 Jun 2011 18:18:53 +0000 (19:18 +0100)]
mfd: Refactor wm831x AUXADC handling into a separate file
In preparation for some additional work on the wm831x AUXADC code move the
support into a separate file. This is a simple code motion patch, there
should be no functional changes.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Mark Brown [Thu, 2 Jun 2011 18:18:52 +0000 (19:18 +0100)]
mfd: Read wm831x AUXADC conversion results before acknowledging interrupt
Ensure that there's no possibility of losing an AUXADC interrupt by reading
the conversion result in the IRQ handler when using interrupts. Otherwise
it's possible that under very heavy load a new conversion could be initiated
before the acknowledgement for a previous interrupt has happened.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Mark Brown [Thu, 2 Jun 2011 18:18:51 +0000 (19:18 +0100)]
mfd: Support dynamic allocation of IRQ range for wm831x
Use irq_allocate_desc() to get the IRQ range, which turns into a noop on
non-sparse systems. Since all existing users are non-sparse there should
be no compatibility issues.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Mark Brown [Thu, 2 Jun 2011 18:18:50 +0000 (19:18 +0100)]
mfd: Only register wm831x RTC device if the 32.768kHz crystal is enabled
The RTC uses the 32.768kHz crystal so if it's not enabled (and it can only
be enabled via OTP or InstantConfig, not runtime software) the RTC can't
function.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Mark Brown [Thu, 2 Jun 2011 18:18:49 +0000 (19:18 +0100)]
mfd: Allow touchscreen to be disabled on wm831x devices
Allow platform data to flag the touchscreen as disabled so that if the
touch driver is built in we don't end up causing lots of work by spuriously
detecting touchscreen activity on systems where it isn't in use.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Mark Brown [Thu, 2 Jun 2011 18:18:47 +0000 (19:18 +0100)]
mfd: Fix bus lock interaction for WM831x IRQ set_type() operation
The WM831x IRQ set_type() operation is doing a direct register write when
called but since set_type() is called with the bus lock held this isn't
legal and could cause deadlocks in the IRQ core.
Fix this by posting the updates into an array and syncing in the
bus_sync_unlock() callback.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
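The post-then-sync pattern described here can be sketched in isolation. This is a model only: struct demo_irq_chip and the demo_* functions are hypothetical, a plain array stands in for the device registers, and no real locking or IRQ core callbacks are involved.

```c
#include <stdbool.h>

#define DEMO_NUM_IRQS 4   /* hypothetical size for the sketch */

struct demo_irq_chip {
    int  hw_reg[DEMO_NUM_IRQS];    /* stands in for device registers */
    int  pending[DEMO_NUM_IRQS];   /* values posted by set_type() */
    bool dirty[DEMO_NUM_IRQS];     /* which entries need flushing */
    int  hw_writes;                /* counts simulated register writes */
};

/* Called with the bus lock held: only record the request, no I/O. */
static void demo_set_type(struct demo_irq_chip *c, int idx, int trigger)
{
    c->pending[idx] = trigger;
    c->dirty[idx] = true;
}

/* Called when the bus lock is released: flush the posted updates. */
static void demo_bus_sync_unlock(struct demo_irq_chip *c)
{
    for (int i = 0; i < DEMO_NUM_IRQS; i++) {
        if (!c->dirty[i])
            continue;
        c->hw_reg[i] = c->pending[i];
        c->hw_writes++;
        c->dirty[i] = false;
    }
}
```

No register write happens inside demo_set_type(); the single write is deferred to demo_bus_sync_unlock(), and a second sync with nothing posted writes nothing.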
* git://git.infradead.org/battery-2.6:
gpio-charger: Fix checking return value of request_any_context_irq
power_supply: MAX17042: Support additional properties
max8903_charger: Allow platform data to be __initdata
power_supply: Add charger driver for MAX8998/LP3974
power_supply: Add charger driver for MAX8997/8966
max17042_battery: Remove obsolete cleanup for clientdata
twl4030_charger: Fix warnings
wm831x_power: Support multiple instances
wm831x_backup: Support multiple instances
apm_power: Fix style error in macros
s3c_adc_battery: Fix annotation for s3c_adc_battery_probe()
bq20z75: Enable detection after registering
bq20z75: Add support for external notification
* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/cpupowerutils:
cpupower: Do detect IDA (opportunistic processor performance) via cpuid
cpupower: Show Intel turbo ratio support via ./cpupower frequency-info
cpupowerutils: increase MAX_LINE_LEN
cpupower: Rename package from cpupowerutils to cpupower
cpupowerutils: Rename: libcpufreq->libcpupower
cpupowerutils: use kernel version-derived version string
cpupowerutils: utils - ConfigStyle bugfixes
cpupowerutils: helpers - ConfigStyle bugfixes
cpupowerutils: idle_monitor - ConfigStyle bugfixes
cpupowerutils: lib - ConfigStyle bugfixes
cpupowerutils: bench - ConfigStyle bugfixes
cpupowerutils: do not update po files on each and every compile
cpupowerutils: remove ccdv, use kernel quiet/verbose mechanism
cpupowerutils: use COPYING, CREDITS from top-level directory
cpupowerutils - cpufrequtils extended with quite some features
* git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
smc91c92_cs.c: fix bogus compiler warning
orinoco_cs: be more careful when matching cards with ID 0x0156:0x0002
hostap_cs: support cards with "Version 01.02" as third product ID
pcmcia: add PCMCIA_DEVICE_MANF_CARD_PROD_ID3
pxa2xx pcmcia - stargate 2 use gpio array.
pcmcia: pxa2xx: remove empty socket_init / socket_resume functions.
drivers:pcmcia:soc_common: make socket_init and socket_suspend optional
Fred Isaman [Sun, 31 Jul 2011 00:52:54 +0000 (20:52 -0400)]
pnfsblock: bl_write_pagelist
Note: When the upper layer's read/write request cannot be fulfilled, the block
layout driver shouldn't silently mark the page as error. It should do
what can be done and leave the rest to the upper layer. To do so, we
should set rdata/wdata->res.count properly.
When the upper layer re-sends the read/write request to finish the remainder
of the request, pgbase is the position where we should start.
[pnfsblock: bl_write_pagelist support functions]
[pnfsblock: bl_write_pagelist adjust for missing PG_USE_PNFS] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: handle errors when read or write pagelist.] Signed-off-by: Zhang Jingwang <yyalone@gmail.com>
[pnfs-block: use new write_pagelist api] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu>
[SQUASHME: pnfsblock: mds_offset is set in the generic layer] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com>
[pnfsblock: mark IO error with NFS_LAYOUT_{RW|RO}_FAILED] Signed-off-by: Peng Tao <peng_tao@emc.com>
[pnfsblock: SQUASHME: adjust to API change] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: fixup blksize alignment in bl_setup_layoutcommit] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com>
[pnfsblock: bl_write_pagelist adjust for missing PG_USE_PNFS] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: handle errors when read or write pagelist.] Signed-off-by: Zhang Jingwang <yyalone@gmail.com>
[pnfs-block: use new write_pagelist api] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fred Isaman [Sun, 31 Jul 2011 00:52:53 +0000 (20:52 -0400)]
pnfsblock: bl_read_pagelist
Note: When the upper layer's read/write request cannot be fulfilled, the block
layout driver shouldn't silently mark the page as error. It should do
what can be done and leave the rest to the upper layer. To do so, we
should set rdata/wdata->res.count properly.
When the upper layer re-sends the read/write request to finish the remainder
of the request, pgbase is the position where we should start.
[pnfsblock: mark IO error with NFS_LAYOUT_{RW|RO}_FAILED] Signed-off-by: Peng Tao <peng_tao@emc.com>
[pnfsblock: read path error handling] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: handle errors when read or write pagelist.] Signed-off-by: Zhang Jingwang <yyalone@gmail.com>
[pnfs-block: use new read_pagelist api] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fred Isaman [Sun, 31 Jul 2011 00:52:52 +0000 (20:52 -0400)]
pnfsblock: cleanup_layoutcommit
In the blocklayout driver, two things happen
during layoutcommit/cleanup:
1. the modified extents are encoded.
2. On cleanup the extents are put back on the layout rw
extents list, for reads.
In the new system where actual xdr encoding is done in
encode_layoutcommit() directly into xdr buffer, these are
the new commit stages:
1. On setup_layoutcommit, the range is adjusted as before
and a structure is allocated for communication with
bl_encode_layoutcommit && bl_cleanup_layoutcommit
(Generic layer provides a void-star to hang it on)
2. bl_encode_layoutcommit is called to do the actual
encoding directly into xdr. The commit-extent-list is not
freed and is stored on above structure.
FIXME: The code is not yet converted to the new XDR cleanup
3. On cleanup the commit-extent-list is put back by a call
to set_to_rw() as before, but with no need for XDR decoding
of the list as before. And the commit-extent-list is freed.
Finally allocated structure is freed.
[rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()] Signed-off-by: Jim Rees <rees@umich.edu>
[pnfsblock: introduce bl_committing list] Signed-off-by: Peng Tao <peng_tao@emc.com>
[pnfsblock: SQUASHME: adjust to API change] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[blocklayout: encode_layoutcommit implementation] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
[pnfsblock: fix bug setting up layoutcommit.] Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
[pnfsblock: cleanup_layoutcommit wants a status parameter] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fred Isaman [Sun, 31 Jul 2011 00:52:51 +0000 (20:52 -0400)]
pnfsblock: encode_layoutcommit
In the blocklayout driver, two things happen
during layoutcommit/cleanup:
1. the modified extents are encoded.
2. On cleanup the extents are put back on the layout rw
extents list, for reads.
In the new system where actual xdr encoding is done in
encode_layoutcommit() directly into xdr buffer, these are
the new commit stages:
1. On setup_layoutcommit, the range is adjusted as before
and a structure is allocated for communication with
bl_encode_layoutcommit && bl_cleanup_layoutcommit
(Generic layer provides a void-star to hang it on)
2. bl_encode_layoutcommit is called to do the actual
encoding directly into xdr. The commit-extent-list is not
freed and is stored on above structure.
FIXME: The code is not yet converted to the new XDR cleanup
3. On cleanup the commit-extent-list is put back by a call
to set_to_rw() as before, but with no need for XDR decoding
of the list as before. And the commit-extent-list is freed.
Finally allocated structure is freed.
[rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()]
[pnfsblock: get rid of deprecated xdr macros] Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[blocklayout: encode_layoutcommit implementation] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
[pnfsblock: fix bug setting up layoutcommit.] Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
[pnfsblock: prevent commit list corruption]
[pnfsblock: fix layoutcommit with an empty opaque] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Fred Isaman [Sun, 31 Jul 2011 00:52:49 +0000 (20:52 -0400)]
pnfsblock: add extent manipulation functions
Adds working implementations of various support functions
to handle INVAL extents, needed by writes, such as
bl_mark_sectors_init and bl_is_sector_init.
Fred Isaman [Sun, 31 Jul 2011 00:52:48 +0000 (20:52 -0400)]
pnfsblock: bl_find_get_extent
Implement bl_find_get_extent(), one of the core extent manipulation
routines.
[pnfsblock: Lookup list entry of layouts and tags in reverse order] Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Jim Rees <rees@umich.edu>
pnfsblock: fix print format warnings for sector_t and size_t
gcc spews warnings about these on x86_64, e.g.:
fs/nfs/blocklayout/blocklayout.c:74: warning: format ‘%Lu’ expects type ‘long long unsigned int’, but argument 2 has type ‘sector_t’
fs/nfs/blocklayout/blocklayout.c:388: warning: format ‘%d’ expects type ‘int’, but argument 5 has type ‘size_t’
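A common way to silence this class of warning is to cast the sector value to unsigned long long and print it with %llu, and to print size_t values with %zu. The sketch below is a userspace illustration under that assumption; demo_sector_t and format_extent() are hypothetical names, and the commit itself may have used different format specifiers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's sector_t, whose width varies. */
typedef uint64_t demo_sector_t;

/* Format a start sector and a byte count without format-type warnings. */
static int format_extent(char *buf, size_t len,
                         demo_sector_t start, size_t count)
{
    /* Cast to a type with a fixed conversion specifier; use %zu for size_t. */
    return snprintf(buf, len, "start=%llu count=%zu",
                    (unsigned long long)start, count);
}
```

The explicit cast keeps the format string correct regardless of how wide the underlying sector type is on a given architecture.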
Fred Isaman [Sun, 31 Jul 2011 00:52:46 +0000 (20:52 -0400)]
pnfsblock: call and parse getdevicelist
Call GETDEVICELIST during mount, then call and parse GETDEVICEINFO
for each device returned.
[pnfsblock: get rid of deprecated xdr macros] Signed-off-by: Jim Rees <rees@umich.edu>
[pnfsblock: fix pnfs_deviceid references] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
[pnfsblock: fix print format warnings for sector_t and size_t]
[pnfs-block: #include <linux/vmalloc.h>]
[pnfsblock: no PNFS_NFS_SERVER] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[pnfsblock: fix bug determining size of striped volume]
[pnfsblock: fix oops when using multiple devices] Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com>
[pnfsblock: get rid of vmap and deviceid->area structure] Signed-off-by: Peng Tao <peng_tao@emc.com> Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>