[Webkit-unassigned] [Bug 25298] Ctrl + Right/Left arrow move forward/backward through document instead of right/left in RTL text

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Wed Jun 16 12:48:28 PDT 2010


https://bugs.webkit.org/show_bug.cgi?id=25298





--- Comment #17 from Xiaomei Ji <xji at chromium.org>  2010-06-16 12:48:28 PST ---
Thanks for the review!
Please see my replies inline.

(In reply to comment #16)
> (From update of attachment 58093 [details])
> Thanks for your patch! I have some comments on this version:
> 
> Does it really make sense to use a word break iterator on a string in visual order? I would think that the word break iterator is meant to be use on a logically-ordered string. For example, for languages where word breaks are based on a dictionary, like Thai, I don’t think this will work.

I also had a bit concern on whether it is correct to use work break iterator on a sting in visual order.

Yes, you are right that work break iterator is meant to be used on a logically-ordered string. 
But is it true that visual order is different from logical order only for RTL languages and those languages use space as word delimiter? If so, the approach is probably ok.

For dictionary based word break language, such as Chinese, since the visual order is the same as the logical order, the approach should work. 
Is Thai different from other dictionary-based-word-break languages?


> 
> > +    Node* de = d->documentElement();
> 
> Please use a longer name for this variable, like “documentElement”.
> 
> > +    while (box) {
> > +        node = box->renderer()->node();
> > +        // Get strings in visual order.
> > +        if (box->direction() == LTR) {
> > +            range->setStart(node, box->caretMinOffset(), ec);
> > +            startNode ? range->setEnd(node, offset, ec) : range->setEnd(node, box->caretMaxOffset(), ec);
> 
> Normally the ternary operator is only used in WebKit when the result is used (for example, as a parameter to a function or assigned to a variable). In other cases, an if statement is used.
> 
> > +            SimplifiedBackwardsTextIterator it(range.get());
> 
> I am confused by the way you set up the range and use of a backwards text iterator. I am really not sure how adjusting only one end of the range makes sense; the code seems to make some assumptions about how the text iterator works and what it returns first (you never advance the iterator) which it should probably not be making.
> 
> > +            string.prepend(it.characters(), it.length());
> > +        } else {
> > +            startNode ? range->setStart(node, offset, ec) : range->setStart(node, box->caretMinOffset(), ec);
> > +            range->setEnd(node, box->caretMaxOffset(), ec);
> > +            SimplifiedBackwardsTextIterator it(range.get());
> > +            for (int i = 0; i < it.length(); ++i)
> > +                string.prepend(it.characters()[i]);
> 
> Like I said, I doubt that reversing the string will always give the correct result. Moreover, you cannot reverse a string like this, because it will break surrogate pairs (characters that are encoded by two UTF-16 characters).
> 
> > +        prev = previousWordPositionBoundary(string.data(), string.size(), string.size(), DontHaveMoreContext, needMoreContext);
> 
> Is it really true that you couldn’t provide more context to the break iterator if needed?

I was just making it simple.

> 
> > +VisiblePosition rightWordPosition(const VisiblePosition& c)
> 
> A lot of the code in this function is identical or similar to leftWordPosition(). The common code should not be repeated. It should be factored out (probably into file-static functions).
> 
> I think it’s great that you’ve added so many tests! Do you know if the results in these tests match the behavior of the native text system on Mac OS X (that is, does it behave like TextEdit?)? What about other platforms’ native text systems?

It does not (completely) match native behavior in Windows,  OS X, or Linux.

In Widows, arrow key follows logical order, same for ctrl-arrow key.

In Linux, from my test, looks like ctrl-arrow follows logical order too, although arrow key follows visual order. There is a bug filed on it: https://bugzilla.gnome.org/show_bug.cgi?id=136059

In OS X, if the writing direction is RTL, both arrow and alt-arrow follows visual order for pure RTL text, but alt-arrow does not work correctly for pure English.

> 
> It’s especially good that you have these tests and the expected results, because I think you need to take a significantly different approach to this bug, and the tests will allow you to verify that the new approach produces the same (or better) results. It seems like the behavior you were going for here is to return the (visually) nearest position to the left (respectively, to the right) that is a word boundary. A naive algorithm could just iterate over word boundaries (logically) forward and backwards along the line, then pick the one nearest to the starting position to the left (resp. to the right). Does that sounds like a correct description of the desired behavior? If so, perhaps the right thing to do would be to start with an implementation of the naive algorithm, then see how it can be made reasonably efficient.

Good suggestion.

But since caret positions are not always continuous, especially at the boundary,
then, the problem is how to "pick the one nearest to the starting position".
For example:
<div contenteditable dir=rtl>abc def שנב סטז uvw xyz</div> 

the position will be:
.....(8) (7)a(1)b(2)c(3) (4)d(5)e(6)f(0)

the word boundaries are: 0, 4, 3, 7, 8....

when the caret is between 'a' and 'b', whose position is 1, how to make ctrl-left-arrow pick position 7 instead of position 4?

An alternative is to use VisiblePosition::left() to advance caret and check it against the word break positions.

Another one is to use startOfWord/endOfWord to find the boundary, then, only do further work when the caret is already at the boundary.

Or for ctrl-left-arrow,
1. move caret left by VisiblePosition::left()
2. return startOfWord() if the current box is LTR,
or return endOfWord() if the current box is RTL.
3. plus some tweaks if necessary.
Same applies for ctrl-right-arrow.

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.


More information about the webkit-unassigned mailing list