Darrick J. Wong [Sun, 29 Apr 2012 22:45:10 +0000 (18:45 -0400)]
ext4: make block group checksums use metadata_csum algorithm
metadata_csum supersedes uninit_bg. Convert the ROCOMPAT uninit_bg
flag check to a helper function that covers both, and make the
checksum calculation algorithm use either crc16 or the metadata_csum
chosen algorithm depending on which flag is set. Print a warning if
we try to mount a filesystem with both feature flags set.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Darrick J. Wong [Sun, 29 Apr 2012 22:43:10 +0000 (18:43 -0400)]
ext4: Calculate and verify checksums of extended attribute blocks
Calculate and verify the checksums of extended attribute blocks. This
only applies to separate EA blocks that are pointed to by
inode->i_file_acl (i.e. external EA blocks); the checksum lives in
the EA header.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Darrick J. Wong [Sun, 29 Apr 2012 22:41:10 +0000 (18:41 -0400)]
ext4: calculate and verify checksums of directory leaf blocks
Calculate and verify the checksums for directory leaf blocks
(i.e. blocks that only contain actual directory entries). The
checksum lives in what looks to be an unused directory entry with a 0
name_len at the end of the block. This scheme is not used for
internal htree nodes because the mechanism in place there only costs
one dx_entry, whereas the "empty" directory entry would cost two
dx_entries.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Darrick J. Wong [Sun, 29 Apr 2012 22:39:10 +0000 (18:39 -0400)]
ext4: Calculate and verify checksums for htree nodes
Calculate and verify the checksum for directory index tree (htree)
node blocks. The checksum is stored in the last 4 bytes of the htree
block and requires the dx_entry array to stop 1 dx_entry short of the
end of the block.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Darrick J. Wong [Sun, 29 Apr 2012 22:37:10 +0000 (18:37 -0400)]
ext4: verify and calculate checksums for extent tree blocks
Calculate and verify the checksum for each extent tree block. The
checksum is located in the space immediately after the last possible
ext4_extent in the block. The space is is typically the last 4-8
bytes in the block.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Darrick J. Wong [Sun, 29 Apr 2012 22:31:10 +0000 (18:31 -0400)]
ext4: calculate and verify inode checksums
This patch introduces to ext4 the ability to calculate and verify
inode checksums. This requires the use of a new ro compatibility flag
and some accompanying e2fsprogs patches to provide the relevant
features in tune2fs and e2fsck. The inode generation changes have
been integrated into this patch.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Darrick J. Wong [Sun, 29 Apr 2012 22:29:10 +0000 (18:29 -0400)]
ext4: calculate and verify superblock checksum
Calculate and verify the superblock checksum. Since the UUID and
block group number are embedded in each copy of the superblock, we
need only checksum the entire block. Refactor some of the code to
eliminate open-coding of the checksum update call.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Darrick J. Wong [Sun, 29 Apr 2012 22:21:10 +0000 (18:21 -0400)]
ext4: create a new BH_Verified flag to avoid unnecessary metadata validation
Create a new BH_Verified flag to indicate that we've verified all the
data in a buffer_head for correctness. This allows us to bypass
expensive verification steps when they are not necessary without
missing them when they are.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Merge tag 'pm-for-3.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael J. Wysocki:
"Fix for an issue causing hibernation to hang on systems with highmem
(that practically means i386) due to broken memory management (bug
introduced in 3.2, so -stable material) and PM documentation update
making the freezer documentation follow the code again after some
recent updates."
* tag 'pm-for-3.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / Freezer / Docs: Update documentation about freezing of tasks
PM / Hibernate: fix the number of pages used for hibernate/thaw buffering
PM / QoS: Create device constraints objects on notifier registration
The current behavior of dev_pm_qos_add_notifier() makes device PM QoS
notifiers less than useful. Namely, it silently returns success when
called before any PM QoS constraints are added for the device, so the
caller will assume that the notifier has been registered, but when
someone actually adds some nontrivial constraints for the device
eventually, the previous callers of dev_pm_qos_add_notifier()
will not know about that and their notifier routines will not be
executed (contrary to their expectations).
To address this problem make dev_pm_qos_add_notifier() create the
constraints object for the device if it is not present when the
routine is called.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by : markgross <markgross@thegnar.org>
PM / Runtime: Remove device fields related to suspend time, v2
After the previous changes in default_stop_ok() and
default_power_down_ok() for PM domains, there are two fields in
struct dev_pm_info that aren't necessary any more, suspend_time
and max_time_suspended_ns.
Remove those fields along with all of the code that accesses them,
which simplifies the runtime PM framework quite a bit.
PM / Domains: Rework default domain power off governor function, v2
The existing default domain power down governor function for PM
domains, default_power_down_ok(), is supposed to check whether or not
the PM QoS latency constraints of the devices in the domain will be
violated if the domain is turned off by pm_genpd_poweroff().
However, the computations carried out by it don't reflect the
definition of the PM QoS latency constrait in
Documentation/ABI/testing/sysfs-devices-power.
Make default_power_down_ok() follow the definition of the PM QoS
latency constrait. In particular, make it only take latencies into
account, because it doesn't matter how much time has elapsed since
the domain's devices were suspended for the computation.
Remove the break_even_ns and power_off_time fields from
struct generic_pm_domain, because they are not necessary any more.
The existing default device stop governor function for PM domains,
default_stop_ok(), is supposed to check whether or not the device's
PM QoS latency constraint will be violated if the device is stopped
by pm_genpd_runtime_suspend(). However, the computations carried out
by it don't reflect the definition of the PM QoS latency constrait in
Documentation/ABI/testing/sysfs-devices-power.
Make default_stop_ok() follow the definition of the PM QoS latency
constrait. In particular, make it take the device's start and stop
latencies correctly.
Add a new field, effective_constraint_ns, to struct gpd_timing_data
and use it to store the difference between the device's PM QoS
constraint and its resume latency for use by the device's parent
(the effective_constraint_ns values for the children are used for
computing the parent's one along with its PM QoS constraint).
Remove the break_even_ns field from struct gpd_timing_data, because
it's not used any more.
epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
wakeup_source will be active to prevent suspend. This can be used to
handle wakeup events from a driver that support poll, e.g. input, if
that driver wakes up the waitqueue passed to epoll before allowing
suspend.
The current implementation uses an extra wakeup_source when
ep_scan_ready_list runs. This can cause problems if a single thread
is polling on wakeup events and frequent non-wakeup events (events
usually arrive during thread freezing) using the same epoll file.
Signed-off-by: Arve Hjønnevåg <arve@android.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
PM / Sleep: Add user space interface for manipulating wakeup sources, v3
Android allows user space to manipulate wakelocks using two
sysfs file located in /sys/power/, wake_lock and wake_unlock.
Writing a wakelock name and optionally a timeout to the wake_lock
file causes the wakelock whose name was written to be acquired (it
is created before is necessary), optionally with the given timeout.
Writing the name of a wakelock to wake_unlock causes that wakelock
to be released.
Implement an analogous interface for user space using wakeup sources.
Add the /sys/power/wake_lock and /sys/power/wake_unlock files
allowing user space to create, activate and deactivate wakeup
sources, such that writing a name and optionally a timeout to
wake_lock causes the wakeup source of that name to be activated,
optionally with the given timeout. If that wakeup source doesn't
exist, it will be created and then activated. Writing a name to
wake_unlock causes the wakeup source of that name, if there is one,
to be deactivated. Wakeup sources created with the help of
wake_lock that haven't been used for more than 5 minutes are garbage
collected and destroyed. Moreover, there can be only WL_NUMBER_LIMIT
wakeup sources created with the help of wake_lock present at a time.
The data type used to track wakeup sources created by user space is
called "struct wakelock" to indicate the origins of this feature.
This version of the patch includes an rbtree manipulation fix from John Stultz.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NeilBrown <neilb@suse.de>
Android uses one wakelock statistics that is only necessary for
opportunistic sleep. Namely, the prevent_suspend_time field
accumulates the total time the given wakelock has been locked
while "automatic suspend" was enabled. Add an analogous field,
prevent_sleep_time, to wakeup sources and make it behave in a similar
way.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce a mechanism by which the kernel can trigger global
transitions to a sleep state chosen by user space if there are no
active wakeup sources.
It consists of a new sysfs attribute, /sys/power/autosleep, that
can be written one of the strings returned by reads from
/sys/power/state, an ordered workqueue and a work item carrying out
the "suspend" operations. If a string representing the system's
sleep state is written to /sys/power/autosleep, the work item
triggering transitions to that state is queued up and it requeues
itself after every execution until user space writes "off" to
/sys/power/autosleep.
That work item enables the detection of wakeup events using the
functions already defined in drivers/base/power/wakeup.c (with one
small modification) and calls either pm_suspend(), or hibernate() to
put the system into a sleep state. If a wakeup event is reported
while the transition is in progress, it will abort the transition and
the "system suspend" work item will be queued up again.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: NeilBrown <neilb@suse.de>