[Webkit-unassigned] [Bug 120030] input/textarea: Count text length for maxLength check with the standard way

bugzilla-daemon at webkit.org
Mon Aug 19 16:43:52 PDT 2013


https://bugs.webkit.org/show_bug.cgi?id=120030





--- Comment #2 from Ryosuke Niwa <rniwa at webkit.org>  2013-08-19 16:43:22 PST ---
This behavior was implemented in https://bugs.webkit.org/show_bug.cgi?id=7622 following https://bugs.webkit.org/show_bug.cgi?id=6987#c11:

 Comment #11 From Darin Adler 2006-03-05 15:49:06 PST
(From update of attachment 6878)
One major difference between this maxLength implementation and the one I did in KWQTextField is that this one limits you to a certain number of UTF-16 code units, whereas the one in KWQTextField limits you to a certain number of "composed character sequences". That means that an e with an umlaut over it counts as 1 character even though it can be two Unicode characters in a row (the e followed by the non-spacing umlaut). Likewise, a single Japanese character outside the "BMP", which requires two UTF-16 code units (a "surrogate pair") to encode, also counts as a single character.

The code that deals with this in KWQTextField is _KWQ_numComposedCharacterSequences and _KWQ_truncateToNumComposedCharacterSequences:.

We will need to replicate this, although I guess it's fine not to at first, but I'd like to see another bug about that.

To tell if a character is half of a surrogate pair, you use macros in <unicode/utf16.h>, such as U16_LENGTH or U16_IS_LEAD. Telling whether a character is going to combine with the one before it is more difficult. There's code in CoreFoundation that does this analysis, and I presume there's some way to do it with ICU, but I don't know what that is.
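
As a rough illustration of the surrogate-pair half of the problem, the sketch below counts code points in a UTF-16 buffer so that a well-formed surrogate pair counts once. The helper names (is_lead_surrogate, is_trail_surrogate, count_code_points) are hypothetical and are not ICU's macros; they only apply a comparable bit test to a single code unit so the example builds without ICU. It does not handle combining marks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers (not ICU's macros): bit tests on one UTF-16 code unit. */
static int is_lead_surrogate(uint16_t u)  { return (u & 0xFC00) == 0xD800; }
static int is_trail_surrogate(uint16_t u) { return (u & 0xFC00) == 0xDC00; }

/*
 * Count code points in a UTF-16 buffer of len code units.
 * A lead surrogate immediately followed by a trail surrogate counts once.
 * An unpaired surrogate is counted as one code point on its own.
 */
static size_t count_code_points(const uint16_t *s, size_t len)
{
    size_t count = 0;
    for (size_t i = 0; i < len; ++i) {
        if (is_lead_surrogate(s[i]) && i + 1 < len && is_trail_surrogate(s[i + 1]))
            ++i; /* skip the trail half of the pair */
        ++count;
    }
    return count;
}
```

For example, the buffer { 0x0061, 0xD842, 0xDFB7 } ("a" followed by U+20BB7) has three code units but two code points.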

In addition to determining such things, code will have to be careful not to do math on the length of strings, since composing means that "length of A plus length of B" is not necessarily the same as "length of A plus B".
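
The point about length arithmetic can be made concrete with a small sketch. It rests on a loud simplifying assumption: only U+0300..U+036F (the Combining Diacritical Marks block) are treated as combining, whereas real code would consult Unicode data through a library such as ICU or CoreFoundation. The names is_combining_mark and count_sequences are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: only U+0300..U+036F are treated as combining marks here. */
static int is_combining_mark(uint16_t u) { return u >= 0x0300 && u <= 0x036F; }

/*
 * Approximate count of composed character sequences in a UTF-16 buffer.
 * A code unit continues the previous sequence if it is a combining mark
 * that follows something, or a trail surrogate completing a pair.
 * Every other code unit starts a new sequence.
 */
static size_t count_sequences(const uint16_t *s, size_t len)
{
    size_t count = 0;
    for (size_t i = 0; i < len; ++i) {
        int continues = i > 0 &&
            (is_combining_mark(s[i]) ||
             ((s[i] & 0xFC00) == 0xDC00 && (s[i - 1] & 0xFC00) == 0xD800));
        if (!continues)
            ++count;
    }
    return count;
}
```

With A = { 0x0065 } ("e") and B = { 0x0308 } (combining diaeresis), this counter gives 1 for A and 1 for B taken separately, but 1 for the concatenation { 0x0065, 0x0308 }, so adding the two lengths would overcount.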
