[Webkit-unassigned] [Bug 78856] visual word movement: Using ICU break iterator to simplify implementation

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Tue Feb 21 12:14:09 PST 2012


https://bugs.webkit.org/show_bug.cgi?id=78856


Xiaomei Ji <xji at chromium.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #127480|0                           |1
        is obsolete|                            |




--- Comment #6 from Xiaomei Ji <xji at chromium.org>  2012-02-21 12:14:07 PST ---
Created an attachment (id=128018)
 --> (https://bugs.webkit.org/attachment.cgi?id=128018&action=review)
patch w/ layout test

Shuffled the code to the beginning of the file for readability only.

(In reply to comment #3)
> > it reduces the complexity of old implementation in the cost of performance.
> 
> This patch breaks regression tests. Is your plan to also make sure that nothing breaks, in addition to not regressing performance?

You are talking about the regression in move-by-word-visually-single-space-inline-element.html, right?
There are 5 regressions and 5 progressions in this file.

There are 2 kinds of regression, both caused by my using logical word breaks to imply visual word breaks, which might not be correct. But apart from using space as the word separator (which is totally wrong for languages that do not separate words with spaces, such as CJK languages), I have not been able to think of a solution yet.

The first case is the regression in tests 11, 13, and 17.
Using test 11 as an example, the logical text is: <div dir=rtl>ABC DEF <span dir=ltr>abc def</span>OPQ RST</div>.
Logically, the position after "def|" is not a word break.
The text is visually displayed as "TSR QPOabc def FED CBA". When the cursor is at "TSR QPOabc def| FED CBA", we use the logical word break position to imply the visual word break position. Since that position is not a word break logically, it won't be considered a word break visually.

The second case is the regression in tests 14 and 18.
Using test 14 as an example, the logical text is: <div dir=ltr>ABC DEF <span>abc def</span>OPQ RST</div>.
The text is visually displayed as "FED CBA abc defTSR QPO".
The position "FED CBA abc def|TSR QPO" is at (InlineBox: "OPQ RST", offset: 7). It corresponds to the logical end of the text, which is considered a word break logically, so it is considered a word break visually.
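
The two cases above can be reproduced with a small toy model. This is only a sketch under stated assumptions, not the patch's code: logical_breaks() is a whitespace-based stand-in for the ICU word break iterator, layout() is a simplified reordering that only handles these two examples (it is not the Unicode Bidirectional Algorithm), and the run lists are hand-split approximations of the inline boxes. All function and variable names are hypothetical.

```python
def logical_breaks(text):
    """Offsets where a word starts or ends (whitespace stand-in for ICU)."""
    breaks = {0, len(text)}
    for i in range(1, len(text)):
        if text[i - 1].isspace() != text[i].isspace():
            breaks.add(i)
    return breaks


def layout(runs, para_rtl):
    """Return the visual string and the visual left edge of each run.

    Simplification: runs keep logical order in an LTR paragraph and are
    laid out in reverse order in an RTL paragraph; RTL runs are mirrored.
    """
    order = list(range(len(runs)))
    if para_rtl:
        order.reverse()
    visual, left = "", {}
    for i in order:
        text, rtl = runs[i]
        left[i] = len(visual)
        visual += text[::-1] if rtl else text
    return visual, left


def describe_caret(runs, para_rtl, run_index, offset):
    """Map a caret given as (run, offset) to (visual picture, is logical break)."""
    visual, left = layout(runs, para_rtl)
    text, rtl = runs[run_index]
    # In an RTL run, offset 0 sits at the right edge of the box.
    x = left[run_index] + (len(text) - offset if rtl else offset)
    logical_text = "".join(t for t, _ in runs)
    logical = sum(len(t) for t, _ in runs[:run_index]) + offset
    return visual[:x] + "|" + visual[x:], logical in logical_breaks(logical_text)


# Uppercase stands for RTL characters, as in the layout test.
# Each run is (text, is_rtl); the split into runs is an assumption.
TEST_11 = [("ABC DEF ", True), ("abc def", False), ("OPQ RST", True)]
TEST_14 = [("ABC DEF", True), (" ", False), ("abc def", False), ("OPQ RST", True)]

# Test 11 (dir=rtl): caret after "def" in the "abc def" run.
print(describe_caret(TEST_11, True, 1, 7))
# ('TSR QPOabc def| FED CBA', False)

# Test 14 (dir=ltr): caret at offset 7 of the "OPQ RST" run.
print(describe_caret(TEST_14, False, 3, 7))
# ('FED CBA abc def|TSR QPO', True)
```

In the first call the caret visually sits next to a space but is not a logical break; in the second it visually sits inside "defTSR" but maps to the logical end of the text, which is a break.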

Apart from those 5 test cases, all other changes (either in the test file itself or in the expected results) are progressions.

More cases (that I have not added yet) where the new code yields the wrong result are those where arrow-left/arrow-right movement itself yields the wrong result.
For example, for the case "abc \u202bABC def\u202c xyz", the ICU word break iterator treats "def\u202c|" as a word break position, but the arrow keys are not able to reach that position.
Another example is <div contenteditable id="multiple_space_wrong_result">abc ששש def <span dir=rtl>שנב  opq סטז</span>  uvw ששש xyz</div>, in which arrow-left gets stuck in an infinite loop.
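
For the "\u202c" example above, one way to see why the position is awkward is to check which offsets sit immediately after an invisible bidi formatting character. The snippet below is only an illustration using Python's unicodedata module; it does not call ICU, and follows_bidi_control() is a hypothetical helper, not something in the patch.

```python
import unicodedata

# Bidi classes of the explicit embedding, override and isolate controls.
BIDI_CONTROLS = {"LRE", "RLE", "PDF", "LRO", "RLO", "LRI", "RLI", "FSI", "PDI"}


def follows_bidi_control(text, offset):
    """True if the character just before `offset` is an explicit bidi control."""
    return offset > 0 and unicodedata.bidirectional(text[offset - 1]) in BIDI_CONTROLS


text = "abc \u202bABC def\u202c xyz"

after_def = text.index("def") + len("def")   # offset right after "def"
after_pdf = after_def + 1                    # offset right after U+202C

print(after_def, follows_bidi_control(text, after_def))   # 12 False
print(after_pdf, follows_bidi_control(text, after_pdf))   # 13 True
```

Offsets 12 and 13 are distinct logically but are separated only by an invisible control character, which fits the observation that a break reported at "def\u202c|" cannot be reached with the arrow keys.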
