[Webkit-unassigned] [Bug 45271] HTML parser should provide script column position within HTML document to JavaScript engine

Mon Sep 20 10:41:19 PDT 2010

https://bugs.webkit.org/show_bug.cgi?id=45271

Adam Barth <abarth at webkit.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #68076|review?                     |review-
               Flag|                            |

--- Comment #15 from Adam Barth <abarth at webkit.org>  2010-09-20 10:41:19 PST ---
(From update of attachment 68076)
View in context: https://bugs.webkit.org/attachment.cgi?id=68076&action=review

I like the approach with the specific types, but there are a lot of errors in this patch.  Also please post html-parser benchmark scores with your patch.  This code is highly performance sensitive.

> JavaScriptCore/JavaScriptCore.gypi:388
> +            'wtf/HTMLPosition.h',
> +            'wtf/HTMLPosition.cpp',

This isn't the right place for these files.  WTF doesn't know anything about HTML.  They should probably go in WebCore/html/parser.

> JavaScriptCore/wtf/HTMLPosition.h:86
> +/* An int wrapper that always reminds you that the number should be 0-based */

We generally use C++ style comments.

> WebCore/bindings/js/ScriptSourceCode.h:46
> +    ScriptSourceCode(const String& source, const KURL& url = KURL(),
> +        const HTMLPosition1& startPosition = HTMLPosition1::minimumPosition())

Please put this on one line.

> WebCore/bindings/v8/ScriptController.cpp:263
> +HTMLPosition0 ScriptController::eventHandlerPosition() const

What is HTMLPosition0 ?

> WebCore/dom/XMLDocumentParserLibxml2.cpp:1372
> +    // TODO(peter.rybin): the implementation probably return 1-based int, but method should return 0-based.

WebKit uses FIXME instead of TODO(name).  Also, this is not a complete sentence.

> WebCore/dom/XMLDocumentParserLibxml2.cpp:1374
> +    int res = (context() ? context()->input->line : 1) - 1;
> +    printf("line number %d\n", res);

We don't usually call printf.  Also, "res" is a pretty unintelligible variable name.

> WebCore/dom/XMLDocumentParserLibxml2.cpp:1391
> +    if (context)
> +        return HTMLPosition0(WTF::ZeroBasedNumber::fromZeroBasedInt(context->input->line - 1),
> +            WTF::ZeroBasedNumber::fromZeroBasedInt(context->input->col - 1), 0);

This multi-line if should use curly braces.

This code doem't make much sense.  It looks like context->input is one-based.  Shouldn't we create a one-based number from that and then covert that to a zero-based number instead of subtracting manually?

> WebCore/html/parser/HTMLDocumentParser.cpp:84
> -    , m_treeBuilder(HTMLTreeBuilder::create(m_tokenizer.get(), document, reportErrors))
> +    , m_treeBuilder(HTMLTreeBuilder::create(m_tokenizer.get(), this, document, reportErrors))

This change is not correct.  The HTMLTreeBuilder isn't allowed to know about the HTMLDocumentParser.  There are going to be situations in the future where the HTMLTreeBuilder is used without an HTMLDocumentParser.

> WebCore/html/parser/HTMLDocumentParser.cpp:421
> +    int line = m_input.current().getCurrentLine();
> +    int column = m_input.current().getCurrentColumn();
> +    int generation = m_input.getInsertionLevel();

Generally, we don't use a "get" prefix for getters.  We just call the methods by the thing they return.  Why isn't m_input.current() returning a HTMLPosition?

> WebCore/html/parser/HTMLInputStream.h:53
> -        : m_last(&m_first)
> +        : m_last(&m_first), m_insertionLevel(0)

The HTMLInputStream should not have an insertionLevel.  That's a concept of the HTMLDocumentParser.  (Also, it's called a nesting level.)

> WebCore/html/parser/HTMLInputStream.h:145
> +        m_insertionLevel = m_inputStream->insertionLevel();
> +        m_inputStream->setInsertionLevel(m_insertionLevel + 1);

This is redundantly tracking the nestingLevel, which is already tracked carefully in the HTMLDocumentParser.

> WebCore/html/parser/HTMLTreeBuilder.cpp:2748
> +    HTMLPosition0 position = m_scriptableDocumentParser->htmlPosition();

How can a ScriptableDocumentParser have an HTMLPosition?  ScriptableDocumentParser does not necessary have anything to do with HTML.  This makes no sense.

> WebCore/platform/text/SegmentedString.cpp:239
> +        if (*m_currentString.m_current++ == '\n' && m_currentString.doNotExcludeLineNumbers()) {
> +          ++lineNumber;
> +          ++m_currentLine;
> +          m_lineBeginCharactersConsumedValue = numberOfCharactersConsumed();
> +        }

Wrong indent.

> WebCore/platform/text/SegmentedString.cpp:254
> +int SegmentedString::currentLine() const
> +{
> +    return m_currentLine;
> +}
> +
> +int SegmentedString::currentColumn() const
> +{
> +    return numberOfCharactersConsumed() - m_lineBeginCharactersConsumedValue;
> +}

Why not just return a position?

> WebCore/platform/text/SegmentedString.h:166
> +            if (m_currentString.doNotExcludeLineNumbers()) {
> +                ++lineNumber;
> +                ++m_currentLine;
> +                m_lineBeginCharactersConsumedValue = numberOfCharactersConsumed();
> +            }

I'm confused.  I though I did all this work already for the view source parser.  Also, this code explicitly uses += to avoid a branch here.  Did you run the parsing benchmark???

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.