[Webkit-unassigned] [Bug 259267] New: SerializedScriptValue could be faster

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Mon Jul 17 01:23:14 PDT 2023


https://bugs.webkit.org/show_bug.cgi?id=259267

            Bug ID: 259267
           Summary: SerializedScriptValue could be faster
           Product: WebKit
           Version: WebKit Nightly Build
          Hardware: Unspecified
                OS: Unspecified
            Status: NEW
          Severity: Normal
          Priority: P2
         Component: WebCore JavaScript
          Assignee: webkit-unassigned at lists.webkit.org
          Reporter: jarred at jarredsumner.com

There are a number of fast paths for JSON.stringify, JSON.parse, Object.entries, and more that could likely be re-used in WebCore::SerializedScriptValue, with some modifications.

Some examples

Instead of `getOwnPropertyNames`, in a number of cases `structure->forEachProperty(lambda)` could be used to iterate through an object quicker. The hard part of doing that here is dealing with stack overflow as carefully as the current code, but maybe a similar idea used in JSON.stringify could be reused here (check soft recursion limit)


```
                indexStack.append(0);
                propertyStack.append(PropertyNameArray(vm, PropertyNameMode::Strings, PrivateSymbolMode::Exclude));
                inObject->methodTable()->getOwnPropertyNames(inObject, m_lexicalGlobalObject, propertyStack.last(), DontEnumPropertiesMode::Exclude);
```

https://github.com/WebKit/WebKit/blob/7707bd0020f85ef8ab11f353f64b963d5f59f92e/Source/WebCore/bindings/js/SerializedScriptValue.cpp#L2391-L2394



The code for reading from arrays could use the indexing mode to find out if it only contains int32 or double and then have a faster path for reading/writing arrays of int32 or double


Identifiers do a look up in the AtomString table, which means for property names, the string caching is duplicated with the calls to Identifier::fromString(vm, string), and also hashed an extra time.

Storing an Identifier in CachedString may improve deserialization performance. 

```
case ObjectStartVisitMember: {
            CachedStringRef cachedString;
            bool wasTerminator = false;
            if (!readStringData(cachedString, wasTerminator)) {
                if (!wasTerminator)
                    goto error;

                JSObject* outObject = outputObjectStack.last();
                outValue = outObject;
                outputObjectStack.removeLast();
                break;
            }

            if (JSValue terminal = readTerminal()) {
                putProperty(outputObjectStack.last(), Identifier::fromString(vm, cachedString->string()), terminal);
                goto objectStartVisitMember;
            }
            stateStack.append(ObjectEndVisitMember);
            propertyNameStack.append(Identifier::fromString(vm, cachedString->string()));
            goto stateUnknown;
        }
```

Making the change to directly read them as Identifier yielded a 20% performance improvement to a deserialization microbenchmark in Bun: https://github.com/oven-sh/bun/pull/3655/files#diff-644fa389dd06a2df5a14ad4c234d546fd452170a9d57d28f4919d929fed14710R2973-R3003


After:

cpu: Apple M1 Max
runtime: bun 0.6.15 (arm64-darwin)

benchmark        time (avg)             (min … max)       p75       p99      p995
--------------------------------------------------- -----------------------------
deserialize    8.54 µs/iter     (8.31 µs … 8.84 µs)   8.66 µs   8.84 µs   8.84 µs

Before:

cpu: Apple M1 Max
runtime: bun 0.6.15 (arm64-darwin)

benchmark        time (avg)             (min … max)       p75       p99      p995
--------------------------------------------------- -----------------------------
deserialize   10.01 µs/iter    (9.78 µs … 10.32 µs)  10.16 µs  10.32 µs  10.32 µs
```

There are more things that can be done here. constructEmptyObject() could pass the count of the number of properties (if that's available) when the size is less than the inline size limit. Structures potentially should be cached in a temporary buffer if it turns out that two objects have the same set of keys (ideally those would be in the constant pool too). If they were serialized that way, putDirectOffset could be used (for the named properties) and it would be very fast to clone final objects

-- 
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.webkit.org/pipermail/webkit-unassigned/attachments/20230717/f7c1a917/attachment.htm>


More information about the webkit-unassigned mailing list