[webkit-reviews] review granted: [Bug 179000] [ESNext][BigInt] Implement BigInt literals and JSBigInt : [Attachment 328927] Patch

Sun Dec 10 18:29:37 PST 2017

Darin Adler <darin at apple.com> has granted Caio Lima <ticaiolima at gmail.com>'s
request for review:
Bug 179000: [ESNext][BigInt] Implement BigInt literals and JSBigInt
https://bugs.webkit.org/show_bug.cgi?id=179000

Attachment 328927: Patch

https://bugs.webkit.org/attachment.cgi?id=328927&action=review

--- Comment #52 from Darin Adler <darin at apple.com> ---
Comment on attachment 328927
  --> https://bugs.webkit.org/attachment.cgi?id=328927
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=328927&action=review

There is a substantial number of tests here, but I am not sure they cover all
the code. There is a lot of code!

> JSTests/ChangeLog:22
> +	   * stress/big-int-literal-line-terminator.js: Added.
> +	   (assert):
> +	   * stress/big-int-literals.js: Added.
> +	   (assert):
> +	   (assertThrowSyntaxError):
> +	   (catch):
> +	   * stress/big-int-operations-error.js: Added.
> +	   (assert):
> +	   (assertThrowTypeError):
> +	   * stress/big-int-type-of.js: Added.
> +	   (assert):
> +	   * stress/big-int-white-space-trailing-leading.js: Added.
> +	   (assert):
> +	   (assertThrowSyntaxError):

Although the script puts them in, I don’t think we need to put function lists
for newly added files into the change log.

> Source/JavaScriptCore/bytecompiler/BytecodeGenerator.cpp:3149
> +    auto it = m_bigIntMap.find(entryKey);
> +    if (it != m_bigIntMap.end())
> +	   return it->value;
> +    
> +    JSBigInt* bigIntInMap = JSBigInt::parseInt(nullptr, *vm(),
identifier.string(), radix);
> +    // FIXME: [ESNext] Enables a way to throw an error on ByteCodeGenerator
step
> +    // https://bugs.webkit.org/show_bug.cgi?id=180139
> +    RELEASE_ASSERT(bigIntInMap);
> +    addConstantValue(bigIntInMap);
> +
> +    auto addedEntry = m_bigIntMap.add(entryKey, bigIntInMap);
> +    ASSERT(addedEntry.isNewEntry);
> +
> +    return addedEntry.iterator->value;

This is double hashing, once in find and a second time in add. It’s more
efficient to hash only once, and the best way to do that is normally to use the
ensure function instead of a find/add pair. Can we do that?

> Source/JavaScriptCore/bytecompiler/BytecodeGenerator.h:1009
> +		   return key.first->existingSymbolAwareHash() +
WTF::intHash(key.second);

Adding two hashes together is not the best way to combine two hashes. Instead,
please try this:

    return computeHash(key.first->existingSymbolAwareHash(), key.second);

That uses my newly added Hasher class so you might have to include Hasher.h.

> Source/JavaScriptCore/bytecompiler/BytecodeGenerator.h:1014
> +		   return a.first == b.first && a.second == b.second;

This can just be "return a == b"; std::pair knows how to do ==.

> Source/JavaScriptCore/parser/Lexer.cpp:1560
> +	   return std::optional<Variant<double, const
Identifier*>>(binaryValue);

Should not need to be explicit about the std::optional part. This should work:

    return Variant<double, const Identifier*> { binaryValue };

Same thing in many other cases below.

> Source/JavaScriptCore/parser/Lexer.cpp:2168
> +	       if (!parseNumberResult ||
WTF::holds_alternative<double>(*parseNumberResult))
> +		   tokenData->doubleValue = parseNumberResult ?
WTF::get<double>(*parseNumberResult) : 0;

I think this would be more elegant with two separate if statements rather than
using the trinary to give us a zero.

Or possibly you could try one of these?

    auto parseNumberResult = parseBinary().value_or(0);
    auto parseNumberResult = parseBinary().value_or(0.0);

Not sure if that will compile without casting the 0 to the Variant type which
would make it pretty ugly.

Same pattern recurs below.

> Source/JavaScriptCore/parser/Nodes.h:332
> +    class BigIntNode : public ConstantNode {

Could mark this class final. I think you should do it.

> Source/JavaScriptCore/parser/Nodes.h:339
> +	   bool isBigInt() const override { return true; }
> +	   JSValue jsValue(BytecodeGenerator&) const override;

We should mark these final instead of override, I think.

> Source/JavaScriptCore/runtime/JSBigInt.cpp:80
> +	   void* data = static_cast<void*>(dataStorage());
> +	   memset(data, 0, length() * DigitSize);

No need for the static_cast, nor for the local variable. Can just pass
dataStorage() to memset. Don’t forget to remove the unneeded braces too.

> Source/JavaScriptCore/runtime/JSBigInt.cpp:119
> +std::optional<uint8_t> JSBigInt::toStringOneDigit()

I think a better name would be singleDigitValueForString. Calling it toString
is not great because it returns the value for a digit, not a string, and not
even a numeric digit character for a string, which is usually what "digit"
means. Worse, Digit means something different in JSBigInt. Might want something
in the name that makes it clear it’s base 10; not sure.

> Source/JavaScriptCore/runtime/JSBigInt.cpp:371
> +	   RELEASE_ASSERT(!(carry + high));

Why exactly does it need to be RELEASE_ASSERT instead of ASSERT here?

> Source/JavaScriptCore/runtime/JSBigInt.cpp:429
> +// base-N string representation of a number. To increase accuracy, the array

I think you mean to increase precision?

> Source/JavaScriptCore/runtime/JSBigInt.cpp:451
> +#if USE(JSVALUE64)
> +    int leadingZeros = clz64(lastDigit);
> +#else
> +    int leadingZeros = clz32(lastDigit);
> +#endif

Could the if here be based on sizeof(lastDigit) instead of USE? I think that
would be clearer. The actual definition of Digit is based on uintptr_t, not on
USE(JSVALUE64).

Or we could have an overloaded function so we don’t need an #if here at all.

> Source/JavaScriptCore/runtime/JSBigInt.cpp:474
> +String JSBigInt::toStringGeneric(ExecState* exec, JSBigInt* x, int radix)

In new code I am suggesting we call this "state" rather than "exec", since the
former is a noun and a word and the latter is a non-word and an adjective. In
many cases we are using ExecState& instead of ExecState*, and I would have done
that here.

> Source/JavaScriptCore/runtime/JSBigInt.cpp:476
> +    StringVector<LChar> resultString;

Should one this down past the out-of-memory check. Also not 100% sure that we
should use a StringVector and adopt it; OK for now I guess.

> Source/JavaScriptCore/runtime/JSBigInt.cpp:490
> +	   return StringImpl::adopt(WTFMove(resultString));

This should just return the null string; we don’t want to wastefully allocate
memory when we are throwing an error.

> Source/JavaScriptCore/runtime/JSBigInt.cpp:574
> +    Digit* data = trimmedBigInt->dataStorage();
> +    for (int i = 0; i < newLength; i++)
> +	   data[i] = digit(i);

I would write this instead:

    std::copy(dataStorage(), dataStorage() + newLength,
trimmedBigInt->dataStorage());

> Source/JavaScriptCore/runtime/JSBigInt.cpp:589
> +    size_t chars = static_cast<size_t>(charcount);

Should not require a static_cast.

> Source/JavaScriptCore/runtime/JSBigInt.cpp:608
> +    if (exec)
> +	   throwOutOfMemoryError(exec, scope);

What is this "if (exec)" about? Do we allow calling this with an exec of null?

> Source/JavaScriptCore/runtime/JSBigInt.cpp:616
> +    JSBigInt* thisObject = jsCast<JSBigInt*>(cell);
> +    size_t bufferEstimatedSize = thisObject->m_length * DigitSize;
> +    return Base::estimatedSize(cell) + bufferEstimatedSize;

Seems like this works well as a one-liner:

    return Base::estimatedSize(cell) + jsCast<JSBigInt*>(cell)->m_length *
digitSize;

> Source/JavaScriptCore/runtime/JSBigInt.h:59
> +public:
> +
> +    enum class InitializationType { None, WithZero };
> +    
> +    JSBigInt(VM&, Structure*, int length);
> +
> +    void initialize(InitializationType);
> +
> +    static void visitChildren(JSCell*, SlotVisitor&);
> +
> +    static size_t estimatedSize(JSCell*);
> +
> +    static size_t allocationSize(int length);
> +
> +    static Structure* createStructure(VM&, JSGlobalObject*, JSValue
prototype);
> +
> +    static JSBigInt* createZero(ExecState*, VM&);
> +
> +    static JSBigInt* createWithLength(ExecState*, VM&, int length);

Some more grouping here would make this easier to read. For example:

    group InitializationType with initialize
    group estimatedSize with allocationSize
    group createZero with createWithLength
    group toStringOneDigit with toString

> Source/JavaScriptCore/runtime/JSBigInt.h:79
> +    JS_EXPORT_PRIVATE static bool equalToBigInt(JSBigInt* x, JSBigInt* y);

No need to name these arguments.

Not sure why we need to name this with the type in the name.

> Source/JavaScriptCore/runtime/JSBigInt.h:97
> +    static const int BitsPerByte = 8;
> +    static const int DigitSize = sizeof(Digit);
> +    static const int DigitBits = DigitSize * BitsPerByte;
> +    static const int HalfDigitBits = DigitBits / 2;
> +    static const Digit HalfDigitMask = (1ull << HalfDigitBits) - 1;
> +    static const int MaxInt = 0x7FFFFFFF;
> +    
> +    // The maximum length that the current implementation supports would be
> +    // MaxInt / DigitBits. However, we use a lower limit for now, because
> +    // raising it later is easier than lowering it.
> +    // Support up to 1 million bits.
> +    static const int MaxLength = 1024 * 1024 / (sizeof(void*) *
BitsPerByte);

WebKit coding style does not use capital letters for constants; we use that for
types. These should probably all be constexpr.

Also seems like sizeof(Digit) is clearer than DigitSize, so not 100% sure we
should have a named constant for that one.

> Source/JavaScriptCore/runtime/JSBigInt.h:107
> +    static Digit digitAdd(Digit a, Digit b, Digit& carry);
> +    static Digit digitSub(Digit a, Digit b, Digit& borrow);
> +    static Digit digitMul(Digit a, Digit b, Digit& high);

No need for the "a" and "b" names here.

> Source/JavaScriptCore/runtime/JSBigInt.h:116
> +    template <typename CharType>
> +    static JSBigInt* parseInt(ExecState* exec, VM& vm, CharType* data, int
length, int startIndex, int radix, bool allowEmptyString = true)

Can this go into the cpp file please? We can declare the template in the header
and only define it in the source file as long as it’s only used inside the
source file.

> Source/JavaScriptCore/runtime/JSBigInt.h:177
> +    static size_t offsetOfData()
> +    {
> +	   return WTF::roundUpToMultipleOf<sizeof(Digit)>(sizeof(JSBigInt));
> +    }

If these private functions are not used outside the .cpp file, please move them
inside to cut down on recompilation and for clarity.

> Source/JavaScriptCore/runtime/JSBigInt.h:181
> +	   return bitwise_cast<Digit*>(bitwise_cast<char*>(this) +
offsetOfData());

There is no need to use bitwise_cast here. Should use reinterpret_cast in a
case like this where it’s just about different pointer types. We should reserve
bitwise_cast for cases where its different semantic is needed.

> Source/JavaScriptCore/runtime/JSBigInt.h:188
> +	   Digit* dataPtr = dataStorage();
> +	   return dataPtr[n];

No need for the local variable here.

> Source/JavaScriptCore/runtime/JSBigInt.h:195
> +	   Digit* digitPtr = dataStorage();
> +	   digitPtr[n] = value;

Ditto.

> Source/JavaScriptCore/runtime/JSBigInt.h:200
> +	   return length * DigitSize;

What guarantees we can’t overflow here?

> Source/JavaScriptCore/runtime/JSCJSValue.cpp:386
> +	   std::optional<uint8_t> digit = bigInt->toStringOneDigit();
> +	   if (digit)
> +	       return vm.smallStrings.singleCharacterString(*digit + '0');

Better to declare inside the if statement:

    if (auto digit = bigInt->toStringOneDigit())