[Webkit-unassigned] [Bug 45271] HTML parser should provide script column position within HTML document to JavaScript engine

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Thu Sep 23 14:03:54 PDT 2010


https://bugs.webkit.org/show_bug.cgi?id=45271





--- Comment #20 from Peter Rybin <peter.rybin at gmail.com>  2010-09-23 14:03:54 PST ---
(From update of attachment 68076)
View in context: https://bugs.webkit.org/attachment.cgi?id=68076&action=review

Hi Adam

I am splitting this long patch into 2 parts because it seems to be too inconvenient.
The upcoming part is about introducing TextPosition structure with 0/1-based numbers. I need it for passing column data from parser to JS engine.

Last time I failed to answer your inline comments. I'm sorry about it. I believe I addressed all your comments.

>>> JavaScriptCore/JavaScriptCore.gypi:388
>>> +            'wtf/HTMLPosition.cpp',
>> 
>> This isn't the right place for these files.  WTF doesn't know anything about HTML.  They should probably go in WebCore/html/parser.
> 
> 

Actually this sturcutre is going to be used wider than in WebCore/html/parser.
Since it's actually not HTML-specific, I renamed it into TextPosition.

>> JavaScriptCore/wtf/HTMLPosition.h:86
>> +/* An int wrapper that always reminds you that the number should be 0-based */
> 
> We generally use C++ style comments.

Done

>> WebCore/bindings/js/ScriptSourceCode.h:46
>> +        const HTMLPosition1& startPosition = HTMLPosition1::minimumPosition())
> 
> Please put this on one line.

Done

>> WebCore/bindings/v8/ScriptController.cpp:263
>> +HTMLPosition0 ScriptController::eventHandlerPosition() const
> 
> What is HTMLPosition0 ?

It's a variant of generic HTMLPosition for zero-based numbers. There's also HTMLPosition1. I chose this super-short form, don't know whether it's okay though.
Now all known as 'TextPosition*'
I hope to merge types  TextPosition0 + TextPosition1 => TextPosition  once we don't care about internal representation.

>> WebCore/dom/XMLDocumentParserLibxml2.cpp:1372
>> +    // TODO(peter.rybin): the implementation probably return 1-based int, but method should return 0-based.
> 
> WebKit uses FIXME instead of TODO(name).  Also, this is not a complete sentence.

Done.

>> WebCore/dom/XMLDocumentParserLibxml2.cpp:1374
>> +    printf("line number %d\n", res);
> 
> We don't usually call printf.  Also, "res" is a pretty unintelligible variable name.

I'm sorry, this shouldn't have stayed here.
Done

>> WebCore/dom/XMLDocumentParserLibxml2.cpp:1391
>> +            WTF::ZeroBasedNumber::fromZeroBasedInt(context->input->col - 1), 0);
> 
> This multi-line if should use curly braces.
> 
> This code doem't make much sense.  It looks like context->input is one-based.  Shouldn't we create a one-based number from that and then covert that to a zero-based number instead of subtracting manually?

Done
I did like the second part of comment.

>> WebCore/html/parser/HTMLDocumentParser.cpp:84
>> +    , m_treeBuilder(HTMLTreeBuilder::create(m_tokenizer.get(), this, document, reportErrors))
> 
> This change is not correct.  The HTMLTreeBuilder isn't allowed to know about the HTMLDocumentParser.  There are going to be situations in the future where the HTMLTreeBuilder is used without an HTMLDocumentParser.

Done
I dropped actual position providing out of patch scope for now.

>> WebCore/html/parser/HTMLDocumentParser.cpp:421
>> +    int generation = m_input.getInsertionLevel();
> 
> Generally, we don't use a "get" prefix for getters.  We just call the methods by the thing they return.  Why isn't m_input.current() returning a HTMLPosition?

Done.
My main concern was that HTMLPosition has 3 field while only 2 of them would be used here.
I dropped actual position providing out of patch scope for now.

>> WebCore/html/parser/HTMLInputStream.h:53
>> +        : m_last(&m_first), m_insertionLevel(0)
> 
> The HTMLInputStream should not have an insertionLevel.  That's a concept of the HTMLDocumentParser.  (Also, it's called a nesting level.)

Done

>> WebCore/html/parser/HTMLInputStream.h:145
>> +        m_inputStream->setInsertionLevel(m_insertionLevel + 1);
> 
> This is redundantly tracking the nestingLevel, which is already tracked carefully in the HTMLDocumentParser.

Done

>> WebCore/html/parser/HTMLTreeBuilder.cpp:2748
>> +    HTMLPosition0 position = m_scriptableDocumentParser->htmlPosition();
> 
> How can a ScriptableDocumentParser have an HTMLPosition?  ScriptableDocumentParser does not necessary have anything to do with HTML.  This makes no sense.

The script needs this type. I renamed it into TextPosition now, meaning that it's a position in any text document (HTML, XML etc)

>> WebCore/platform/text/SegmentedString.cpp:239
>> +        }
> 
> Wrong indent.

Done. Sorry, early patch upload.

>> WebCore/platform/text/SegmentedString.cpp:254
>> +}
> 
> Why not just return a position?

I was afraid that HTMLPosition has 3 fields while I need only 2 of them here.

>> WebCore/platform/text/SegmentedString.h:166
>> +            }
> 
> I'm confused.  I though I did all this work already for the view source parser.  Also, this code explicitly uses += to avoid a branch here.  Did you run the parsing benchmark???

I dropped this part out of patch for now.

However I can't find a way of putting 'numberOfCharactersConsumed()' call out of the branch.
I'm probably showing myself stupid, but I can't read your 'I did all this work already for the view source parser'. Unfortunately I couldn't find anything related to columns in HTMLViewSourceParser or related classes.

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.



More information about the webkit-unassigned mailing list