[Webkit-unassigned] [Bug 25321] RLE/LRE characters are removed from any text entered into a text box (either using a keyboard or by pasting)

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Thu Oct 14 17:31:59 PDT 2010


https://bugs.webkit.org/show_bug.cgi?id=25321





--- Comment #15 from Xiaomei Ji <xji at chromium.org>  2010-10-14 17:31:58 PST ---
Thanks for the review! Please see my replies inlined.

(In reply to comment #14)
> (From update of attachment 67049 [details])
> Thanks for tackling this bug!  Here are some preliminary feedbacks.
> 
> > WebCore/dom/Position.cpp:556
> > +        if (renderer->isText() && mode == PreserveUnicodeBidiControlCharacter) {
> 
> Can we do:
> if (!renderer->isText())
>   continue;
> instead of checking renderer->isText() here

Done.

> 
> > WebCore/dom/Position.cpp:557
> > +            unsigned textOffset = currentPos.offsetInLeafNode();
> 
> There's a code below that does the exact same thing.  Only difference being checking whether or not there's rendered text.  Can we somehow combine these two code?

Combined.


> 
> > WebCore/dom/Position.cpp:567
> >          if (renderer->isText() && toRenderText(renderer)->firstTextBox()) {
> 
> and here?  You can file a separate bug to do this cleanup first to make the patch smaller.

Done.


> 
> > WebCore/dom/Position.cpp:693
> > +        if (renderer->isText() &&  mode == PreserveUnicodeBidiControlCharacter) {
> 
> Same comment goes here.

Done.


> 
> > WebCore/dom/Position.cpp:792
> > +        if (!nodeIsUserSelectNone(node())) {
> > +            bool rendered = inRenderedText();
> > +            // In the case of PreserveUnicodeBidiControlCharacter, consider a position begins with unicode bidi control
> > +            // character as a candidate as well.
> > +            return mode == PreserveUnicodeBidiControlCharacter ? rendered || startWithUnicodeBidiControlCharacter() : rendered;
> > +        }
> > +        return false;
> 
> (I think) you can simplify this as:
> return !nodeIsUserSelectNone(node()) && (inRenderedText()
>     || (mode == PreserveUnicodeBidiControlCharacter ? startWithUnicodeBidiControlCharacter() : true))

"true" should be changed to "false".
Changed to:
 return !nodeIsUserSelectNone(node())
            && (inRenderedText() || mode == PreserveUnicodeBidiControlCharacter && startWithUnicodeBidiControlCharacter())



> 
> > WebCore/dom/Position.cpp:813
> > +bool Position::startWithUnicodeBidiControlCharacter() const
> 
> This function name does not seem to match what the function does.

It checks whether the current position starts with continuous unicode bidi control characters.
It has 3 scenarios that I will explain in detail below.  


> 
> > WebCore/dom/Position.cpp:830
> > +        for (int i = m_offset; i < static_cast<int>(static_cast<Text*>(node())->length()); ++i) {
> > +            UChar ch = static_cast<Text*>(node())->data()[i];
> > +            if (!isBidiControlCharacter(ch))
> > +                return false;
> > +        }
> > +        return true;
> 
> e.g. this code seems to return true only if everything in the text box was bidi control character, which doesn't match what the function name implies.  Could you clarify what you're trying to do in this function?

This is the first scenario:
Unicode control characters are the only characters entered in a content editable area.
At this time, there is no text node yet (since unicode control characters are not included in text node).

This happens when you enter a unicode control character as the first character in a editing area (and when you continuing entering unicode control characters only).


The 2nd scenario is unicode control characters entered between 2 text boxes.
For example,  abc + Control_Characters + def.
When the position is at Control_Characters, it is before text box "def", if the characters between position and "d" are all unicode control characters, the position is considered as a valid candidate position.


> 
> > WebCore/dom/Position.cpp:858
> > +        // Check against unicode bidi control characters after text box end.
> > +        if (m_offset > static_cast<int>(box->start() + box->len())) {
> > +            int i = static_cast<int>(box->start() + box->len());
> > +            for (; i < m_offset; i++) {
> > +                UChar ch = static_cast<Text*>(node())->data()[i];
> > +                if (!isBidiControlCharacter(ch))
> > +                    break;
> > +            }
> > +            if (i == m_offset)
> > +                return true;
> > +        }
> 
> I don't understand why we need to check the control characters after the text box.  Is that because control characters after the text box influence whatever proceeds it?  Could you clarify?

This is the 3rd scenario,
it covers when Unicode control characters are the ending characters in editing area (I just realized that it might have overlap with the 2nd scenario. It might be enough to run this only after the end of the last text box).

For example, abc + Control_Characters,
if position is at control_characters, if the characters between "c" and position are all unicode control characters, that position should be considered as valid too.

I am open to the suggestion of a good function name.


> 
> > WebCore/dom/Position.h:70
> > +    enum EditingMode {
> 
> I think EditingMode is very general name, and can easily be confused with editing behavior and other things.  Please rename it to something more specific like UnicodeBidiControlCharacterPreservationMode or BidiControlCharacterMode if you think the formar is too long.

changed to UnicodeBidiControlCharacterPreservationMode

> 
> > WebCore/dom/Position.h:164
> > +    Position upstream(EditingBoundaryCrossingRule = CannotCrossEditingBoundary, EditingMode = NotPreserveUnicodeBidiControlCharacter) const;
> > +    Position downstream(EditingBoundaryCrossingRule = CannotCrossEditingBoundary, EditingMode = NotPreserveUnicodeBidiControlCharacter) const;
> 
> Is there a case where we don't want to preserve the bidi control character?  i.e. are there cases where we intentionally remove them?

upstream() and downstream() (or their callers) are used in many places. In other non-editing mode, I think they should still be ignored.
One case to ignore bidi control character is for caret movement. If the caret is at the beginning of a block element containing unicode bidi control character as the first character, move caret to right should ignore unicode control character and move caret after the first visible character.

Please refer to editing/selection/home-end (test case 16)  and editing/selection/extend-selection (test case 17) for detail.

> 
> > WebCore/editing/VisiblePosition.cpp:1
> > +
> 
> You probably didn't intend to insert a blank line here, did you?

You are right. removed.

> 
> > WebCore/editing/VisiblePosition.cpp:48
> > +VisiblePosition::VisiblePosition(const Position &pos, EAffinity affinity, Position::EditingMode editMode)
> > +    : m_editMode(editMode)    
> 
> Again, editMode sounds too general.

Changed to bidiPreservationMode.

> 
> > WebCore/editing/VisibleSelection.cpp:171
> > +        s = m_start.downstream(Position::CannotCrossEditingBoundary, m_editMode);
> > +        e = m_end.upstream(Position::CannotCrossEditingBoundary, m_editMode);
> 
> You may consider changing the order of EditingBoundaryCrossingRule and EditingMode (or whatever you rename to) so that you don't have to keep repeating CannotCrossEditingBoundary here and there.  As far as I searched, only RenderObject::createVisiblePosition calls upstream / downstream with CanCrossEditingBoundary.

Good suggestion. Done.

> 
> > LayoutTests/editing/inserting/insert-paste-bidi-control.html:89
> > +runEditingTest();
> 
> Do we need editing delegates dump & rendering results for this test?  If we're only concerned about bidi control characters being preserved, then we can use dump-as-text test.  i.e. you call runDumpAsTextEditingTest() here instead of the regular runEditingTest.  If that's not sufficient, then you can add a little script above to check the existence of bidi control characters and print PASS/FAIL.

Changed to dumpAsText. But removed the test for <textarea> since <textarea> is not part of the DOM tree (user can not traverse to it from root). Typing into a text area does not alter the DOM, so the result is not showed up in dumpAsText() (You can not get the text entered in <textarea> using innerText of innerHTML).

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.



More information about the webkit-unassigned mailing list