[Webkit-unassigned] [Bug 250112] New: Possible 8x perf improvement to UTF8 -> UTF16 text encoding perf

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Wed Jan 4 16:38:42 PST 2023


https://bugs.webkit.org/show_bug.cgi?id=250112

            Bug ID: 250112
           Summary: Possible 8x perf improvement to UTF8 -> UTF16 text
                    encoding perf
           Product: WebKit
           Version: WebKit Nightly Build
          Hardware: Unspecified
                OS: Unspecified
            Status: NEW
          Severity: Normal
          Priority: P2
         Component: JavaScriptCore
          Assignee: webkit-unassigned at lists.webkit.org
          Reporter: jarred at jarredsumner.com

simdutf is a fast text encoding/decoding library that supports UTF-8 -> UTF-16 and UTF-16 -> UTF-8 transcoding, as well as ASCII validation. GitHub: https://github.com/simdutf/simdutf

Bun uses simdutf for TextEncoder & TextDecoder (in the happy path).

For this code:
```
var decoder = new TextDecoder();
var encoder = new TextEncoder();
var buf = encoder.encode(
    "not all ascii �� �� 1-2-3-4 abcdefghklmnopqrstuvwxyz".repeat(9999)
  ),
  buf1 = buf.slice(0, buf.length - 1),
  buf2 = buf.slice(0, buf.length - 2);
var decoded = decoder.decode(buf),
  decoded1 = decoder.decode(buf1),
  decoded2 = decoder.decode(buf2);

console.time("TextDecoder.decode");
decoder.decode(buf);
console.timeEnd("TextDecoder.decode");
console.time("TextDecoder.decode");
decoder.decode(buf1);
console.timeEnd("TextDecoder.decode");
console.time("TextDecoder.decode");
decoder.decode(buf2);
console.timeEnd("TextDecoder.decode");
console.time("TextEncoder.encode");
encoder.encode(decoded);
console.timeEnd("TextEncoder.encode");
console.time("TextEncoder.encode");
encoder.encode(decoded1);
console.timeEnd("TextEncoder.encode");
console.time("TextEncoder.encode");
encoder.encode(decoded2);
console.timeEnd("TextEncoder.encode");

```
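A single console.time sample per call is noisy, so the comparison may be easier to reproduce with a warm-up phase and a median over several runs. The following is a minimal sketch of that idea, not the harness used for the numbers in this report. The helper name `medianTimeMs`, the run counts, and the placeholder non-ASCII characters in the test string are all assumptions made for illustration.

```javascript
// Hypothetical helper (not part of the original report): returns the median
// wall-clock time in milliseconds of calling fn, after a few warm-up calls.
function medianTimeMs(fn, runs = 21, warmup = 5) {
  for (let i = 0; i < warmup; i++) fn(); // warm up before measuring
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Placeholder non-ASCII text; NOT the exact string from the report.
const text =
  "not all ascii \u00e9 \u4e2d 1-2-3-4 abcdefghklmnopqrstuvwxyz".repeat(9999);
const bytes = encoder.encode(text);

const encodeMs = medianTimeMs(() => encoder.encode(text));
const decodeMs = medianTimeMs(() => decoder.decode(bytes));

console.log(`TextEncoder.encode median: ${encodeMs.toFixed(3)}ms`);
console.log(`TextDecoder.decode median: ${decodeMs.toFixed(3)}ms`);
```

The median is used rather than the mean so that one slow run (for example, a garbage collection pause) does not dominate the result.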

On macOS arm64 in Bun v0.4.1:

```
[0.69ms] TextDecoder.decode
[0.54ms] TextDecoder.decode
[0.59ms] TextDecoder.decode
[0.21ms] TextEncoder.encode
[0.20ms] TextEncoder.encode
[0.25ms] TextEncoder.encode
```

Safari Technology Preview:

```
TextDecoder.decode: 0.621ms
TextDecoder.decode: 0.589ms
TextDecoder.decode: 0.604ms
TextEncoder.encode: 2.317ms
TextEncoder.encode: 1.945ms
TextEncoder.encode: 1.966ms
```

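For strings containing non-ASCII characters, TextEncoder.encode runs roughly 8x to 11x faster in Bun than in Safari Technology Preview (about 0.2ms versus about 2ms in the runs above), and that's mostly because Bun is using simdutf. TextDecoder.decode timings are roughly the same in both.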
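The speedup can be checked from the TextEncoder.encode timings listed above. The sketch below only does the arithmetic on those six numbers; the variable names are made up for illustration.

```javascript
// Timings in milliseconds, copied from the TextEncoder.encode runs above.
const bunMs = [0.21, 0.20, 0.25];
const safariMs = [2.317, 1.945, 1.966];

// Per-run speedup: Safari time divided by Bun time for the same input.
const speedups = safariMs.map((ms, i) => ms / bunMs[i]);

// Overall speedup: total Safari time divided by total Bun time.
const sum = (xs) => xs.reduce((a, b) => a + b, 0);
const overall = sum(safariMs) / sum(bunMs);

console.log(speedups.map((x) => x.toFixed(1) + "x").join(", "));
console.log("overall: " + overall.toFixed(1) + "x");
```

With only three samples per engine these ratios are indicative rather than precise.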
