[Webkit-unassigned] [Bug 131704] Simple ES6 feature:String prototype additions

Fri Aug 29 09:54:07 PDT 2014

https://bugs.webkit.org/show_bug.cgi?id=131704

Darin Adler <darin at apple.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #237340|review?, commit-queue?      |review-, commit-queue-
               Flag|                            |

--- Comment #31 from Darin Adler <darin at apple.com>  2014-08-29 09:54:12 PST ---
(From update of attachment 237340)
View in context: https://bugs.webkit.org/attachment.cgi?id=237340&action=review

I appreciate your continued work on this. It’s almost right, but there are various things wrong.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1565
> +    if (exec->hadException())
> +        return JSValue::encode(jsUndefined());

Is the ordering right here? Does this need to be done before the checking of the argument?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1572
> +    if (a0.isUndefinedOrNull())
> +        return JSValue::encode(jsBoolean(false));

This is new. Is it correct? Is a null value supposed to turn into a false result or turn into the string "null"? What about undefined? What about a missing argument? What does the specification say?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1578
> +    if (exec->hadException())
> +        return JSValue::encode(jsUndefined());

It’s OK to include these two lines of code, but they are not needed. There is no way for the JavaScript code to detect whether we do the startsWith or not. The return value is undefined if we have an exception.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1591
> +    if (exec->hadException())
> +        return JSValue::encode(jsUndefined());

Is the ordering right here? Does this need to be done before the checking of the argument?

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1598
> +    if (a0.isUndefinedOrNull())
> +        return JSValue::encode(jsBoolean(false));

Same question as above.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1607
> +    if (exec->hadException())
> +        return JSValue::encode(jsUndefined());

Same point about not needing these two lines of code.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1627
> +    if (a0.isUndefinedOrNull())
> +        return JSValue::encode(jsBoolean(false));

Same question again.

> Source/JavaScriptCore/runtime/StringPrototype.cpp:1633
> +    if (exec->hadException())
> +        return JSValue::encode(jsUndefined());

Same point about not needing these two lines of code.

> Source/WTF/wtf/text/StringImpl.cpp:1414
> +    ASSERT(startOffset <= length());
> +    ASSERT(matchString.length() <= length());
> +    ASSERT(startOffset + matchString.length() <= length());

It’s not right to assert this. All the other functions in this class clamp these values. They don’t treat it as illegal to pass a large value. Putting an assertion in doesn’t fix the overflow issue.

> Source/WTF/wtf/text/StringImpl.cpp:1417
> +    if (matchString.length() + startOffset > length())
> +        return false;

Here’s how we write this and avoid overflow:

    if (startOffset > length())
        return false;
    if (matchString.length() > length())
        return false;
    if (matchString.length() + startOffset > length())
        return false;

No assertions. But also, I suggest we put these checks into equalInner rather than here. This function can just be a forwarder to equalInner.

> Source/WTF/wtf/text/StringImpl.cpp:1423
> +    return equalInner(this, startOffset, reinterpret_cast<const char*>(matchString.characters16()), 
> +        matchString.length(), caseSensitive);

This is wrong. We can’t just cast a run of 16-bit characters to a run of 8-bit characters. The code won’t compare all the characters! And it will convert individual bytes of the characters to lowercase if asked to not be case sensitive.

> Source/WTF/wtf/text/StringImpl.cpp:1452
> +    int startOffset = endOffset - matchString.length();

I am still not sure that I understand what an end offset is, so it’s really hard for me to understand what this function is trying to do.

But this is not the right way to write this to correctly range check and avoid overflow. We should instead just write this:

    if (endOffset > matchString.length())
        return false;
    unsigned startOffset = matchString.length() - endOffset;

Don’t involve signed arithmetic at all.

And we will need these checks:

    if (startOffset > length())
        return false;
    if (matchString.length() > length())
        return false;
    if (matchString.length() + startOffset > length())
        return false;

But I suggest putting them inside equalInner.

>> Source/WTF/wtf/text/StringImpl.cpp:1454
>> +    ASSERT(!(endOffset <= matchString.length() ^ startOffset <= 0));
> 
> This reads as if 'endOffset' is greater or equals than 'matchLength', 'startOffset' should be positive or zero. On the other hand, if 'endOffset' is lower or equals than 'matchLength', 'startOffset' should be negative.

Fine, but it’s not appropriate to assert these things.

> Source/WTF/wtf/text/StringImpl.cpp:1457
> +    if (startOffset < 0)
> +        return false;

Please just leave this out.

> Source/WTF/wtf/text/StringImpl.cpp:1463
> +    return equalInner(this, startOffset, reinterpret_cast<const char*>(matchString.characters16()),
> +        matchString.length(), caseSensitive);

Again, wrong handling of 16-bit characters, and not enough test coverage to see that’s what’s happening.

> LayoutTests/js/script-tests/string-contains.js:52
> +// If int32.valueOf() is called, string.valueOf() should not be called.

These tests are on the right track, but here’s what I would suggest:

Make three objects that all have valueOf on them (for the string itself, the offset, and the match string), each with separate side effects (appending to a string is one good way to do it) and then have a test to make sure the side effects all happen in the right order. For example, the use "A", "B", and "C" and make sure the side effects string is "ABC".

Then for each of the three objects, replace with one that raises an exception and make sure that the exception is propagated properly and only the side effects that are supposed to come first happen. So you’d get an exception and the string is empty, then "A", then "AB".

> LayoutTests/js/string-contains-expected.txt:14
> +PASS 'foo bar'.contains() is false

No test coverage for contains(null) or contains(undefined) or even contains(<number>). In particular we would want test cases like: "null".contains(null) and "xnullx".contains(null) and "xundefinedx".contains(undefined) and "xundefinedx".contains().

>> LayoutTests/js/string-contains-expected.txt:42
>> +PASS 'foo bar'.contains('bar', 0x7fffffff) is false
> 
> I didn't know what's the best way to validate an ASSERT. All functions return 0 when value of startOffset is negative. Now looking at it, startsWith and contains are returning false because the offset is larger than the length of the string. In the case of endsWith I think is working as 0x80000010 - 3 = 0x80000007 which is a negative number in int32_t.
> 
> I'm not sure about the idiom for checking for an overflow in startsWith and contains. I think it should check that the sum of the two values should be greater than any of them:
> 
> ASSERT(startOffset + matchString.length() >= startOffset);

If we ever hit an ASSERT due to JavaScript code, then the ASSERT is wrong or there is a bug we need to fix. ASSERT is used to state something that can never be true unless there is a mistake in the WebKit code.

So there is no way to “validate” an ASSERT, and this assertion indicates the patch is not yet ready.

> LayoutTests/js/string-contains-expected.txt:51
> +TEST COMPLETE

No test cases at all that check the 16-bit-character code path. That’s a problem.

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.