hfsplus: correct usage of HFSPLUS_ATTR_MAX_STRLEN for non-English attributes
HFSPLUS_ATTR_MAX_STRLEN (=127) is the limit on the number of unicode
characters (UTF-16BE) in an attribute name storable in the HFS+ file
system. Almost all current uses of it are wrong with respect to the
conversion between NLS and on-disk strings.
Except for one use in calling hfsplus_asc2uni() (which should stay the
same) and the uses in calling hfsplus_uni2asc() (which were corrected in
the earlier patch in this series concerning usage of hfsplus_uni2asc()),
all the other uses take one of two forms: char buffers of that size, and
bound checks against that size.
Conversion between the on-disk unicode representation and NLS char strings
(in whichever direction) always needs to accommodate the worst-case NLS
conversion, so all char buffers of that size need to be scaled up by a
factor of NLS_MAX_CHARSET_SIZE.
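The sizing argument can be checked in userspace. The following is a
minimal Python sketch, not kernel code; it assumes NLS_MAX_CHARSET_SIZE
is 6 (the value in include/linux/nls.h), and worst_case_nls_bytes() is a
hypothetical helper that leaves out any terminating NUL.

```python
HFSPLUS_ATTR_MAX_STRLEN = 127  # UTF-16 units per attribute name (from the text)
NLS_MAX_CHARSET_SIZE = 6       # assumed value, as in include/linux/nls.h


def worst_case_nls_bytes(units=HFSPLUS_ATTR_MAX_STRLEN):
    """Upper bound on the NLS byte length of a name of `units` UTF-16 units."""
    return NLS_MAX_CHARSET_SIZE * units


# A name made of 127 CJK characters, each one UTF-16 unit on disk.
name = "\u6d4b" * HFSPLUS_ATTR_MAX_STRLEN
utf16_units = len(name.encode("utf-16-be")) // 2
utf8_bytes = len(name.encode("utf-8"))

print(utf16_units, utf8_bytes, worst_case_nls_bytes())
```

A buffer of only 127 chars cannot hold the 381 UTF-8 bytes of this name,
while a buffer scaled by NLS_MAX_CHARSET_SIZE can.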
The bound checks are all wrong, since they compare an NLS byte length
derived from strlen() to a limit counted in unicode characters.
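A minimal Python sketch of the unit mismatch, again as a userspace
illustration rather than the kernel code: old_style_check() and
utf16_units() are hypothetical helpers, and the name is 55 repetitions of
one CJK character rather than the actual test string.

```python
HFSPLUS_ATTR_MAX_STRLEN = 127  # limit in UTF-16 units (from the text)


def old_style_check(nls_name: bytes) -> bool:
    """Mimic the flawed check: True means 'rejected as too long'."""
    return len(nls_name) > HFSPLUS_ATTR_MAX_STRLEN


def utf16_units(name: str) -> int:
    """Number of UTF-16 units the name would occupy on disk."""
    return len(name.encode("utf-16-be")) // 2


name = "user." + "\u6d4b" * 55   # 5 ASCII + 55 CJK characters
nls_name = name.encode("utf-8")  # what strlen() would measure in a UTF-8 locale

print(len(nls_name), utf16_units(name), old_style_check(nls_name))
```

The byte length (170) exceeds 127, so the check rejects a name that needs
only 60 UTF-16 units on disk.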
It turns out that all the bound checks do is protect hfsplus_asc2uni(),
which can fail if the input is too large. As far as attributes are
concerned, it is used in only one place, hfsplus_attr_build_key(), which
is in turn used by hfsplus_find_attr(), hfsplus_create_attr() and
hfsplus_delete_attr(). Thus making sure that errors from
hfsplus_asc2uni() are caught in hfsplus_attr_build_key() and propagated is
sufficient to replace all the bound checks.
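The single-choke-point idea can be sketched as follows. This is a Python
model of the call chain only: the *_sketch functions are hypothetical
stand-ins, not the kernel functions, and the -ENAMETOOLONG return value
is an assumption about how the conversion reports oversized input.

```python
import errno

HFSPLUS_ATTR_MAX_STRLEN = 127  # limit in UTF-16 units (from the text)


def asc2uni_sketch(name, max_units):
    """Convert to UTF-16BE; return (0, data) or (negative errno, b"")."""
    data = name.encode("utf-16-be")
    if len(data) // 2 > max_units:
        return -errno.ENAMETOOLONG, b""  # assumed error code
    return 0, data


def build_key_sketch(name):
    """The one place that converts a name; catches and propagates errors."""
    err, data = asc2uni_sketch(name, HFSPLUS_ATTR_MAX_STRLEN)
    if err:
        return err, None
    return 0, {"key_name": data}


def find_attr_sketch(name):
    """A caller needs no length check of its own, only error propagation."""
    err, key = build_key_sketch(name)
    if err:
        return err
    return 0


print(find_attr_sketch("user." + "\u6d4b" * 55))  # within the limit
print(find_attr_sketch("x" * 128) < 0)            # one unit too long
```

Because the limit is enforced where the unicode length is actually known,
the callers cannot get the units wrong.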
Unpropagated errors from hfsplus_asc2uni() in the file catalog code were
addressed recently in an independent patch "hfsplus: fix longname handling"
by Sougata Santra.
Before this patch, trying to set a 55 CJK character (in a UTF-8
locale, > 127/3=42) attribute plus user prefix fails with:
(= "pointlessly long attribute for testing", elaborate Chinese in
UTF-8 encoding).
However, it is not possible to set double the size (110 + 5 is still
under 127) in a UTF-8 locale:
    $ setfattr -n user.`cat testing-string testing-string` -v \
        `cat testing-string testing-string` testing-string
    setfattr: testing-string: Numerical result out of range
110 CJK characters in UTF-8 take 330 bytes, and the generic get/set
attribute system call code in linux/fs/xattr.c imposes a 255 byte limit on
attribute names. One can use a combination of iconv to encode content,
changing the terminal locale for viewing, and an
nls=cp932/cp936/cp949/cp950 mount option to make full use of
127-unicode-character attribute names in a double-byte locale.
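The byte arithmetic can be reproduced with a short Python sketch. It
assumes the 255 byte limit is XATTR_NAME_MAX, uses Python's "gbk" codec
as a stand-in for the cp936 NLS table, and builds the name from one
repeated CJK character rather than the actual test string.

```python
XATTR_NAME_MAX = 255           # assumed name of the 255 byte limit
HFSPLUS_ATTR_MAX_STRLEN = 127  # limit in UTF-16 units (from the text)

name = "user." + "\u6d4b" * 110  # 5 ASCII + 110 CJK characters

units = len(name.encode("utf-16-be")) // 2
utf8_len = len(name.encode("utf-8"))  # 3 bytes per CJK character
gbk_len = len(name.encode("gbk"))     # 2 bytes per CJK character

print(units <= HFSPLUS_ATTR_MAX_STRLEN)  # fits on disk
print(utf8_len, utf8_len <= XATTR_NAME_MAX)
print(gbk_len, gbk_len <= XATTR_NAME_MAX)
```

The same 115-unit name is 335 bytes in UTF-8, over the system call limit,
but 225 bytes in a double-byte encoding, under it.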
Also, as additional information, it is possible to (mis-)use the unicode
half-width/full-width forms (U+FFxx) to write attribute names which look
like English but are not actually ASCII.
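A minimal Python sketch of such a look-alike name; to_fullwidth() is a
hypothetical helper that shifts printable ASCII into the full-width block
by adding 0xFEE0 to each code point.

```python
import unicodedata


def to_fullwidth(text):
    """Map printable ASCII (U+0021..U+007E) to full-width forms (U+FF01..U+FF5E)."""
    out = []
    for ch in text:
        cp = ord(ch)
        out.append(chr(cp + 0xFEE0) if 0x21 <= cp <= 0x7E else ch)
    return "".join(out)


lookalike = to_fullwidth("comment")

print(lookalike)
print(lookalike.isascii())                         # not ASCII
print(len(lookalike.encode("utf-16-be")) // 2)     # UTF-16 units on disk
print(len(lookalike.encode("utf-8")))              # bytes in a UTF-8 locale
print(unicodedata.normalize("NFKC", lookalike))    # folds back to ASCII
```

Each full-width character still costs one UTF-16 unit on disk but three
bytes in UTF-8, so such names hit the byte-based limits just as CJK names
do.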
Thanks to Anton Altaparmakov for reviewing the earlier ideas behind this
change.
Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Sougata Santra <sougata@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>