[jsc-dev] bmalloc freelist from JIT code
Filip Pizlo
fpizlo at apple.com
Thu Nov 2 12:52:31 PDT 2017
> On Nov 2, 2017, at 12:37 PM, Yusuke SUZUKI <utatane.tea at gmail.com> wrote:
>
> Hello JSC folks!
>
> Recently, I introduced a String method intrinsic, StringSlice, into DFG.
> And I'm planning to add StringSubstring and StringSubstr too, which can share a large part of the StringSlice implementation.
>
> While doing that, I noticed that these functions could become very efficient if we could allocate bmalloc-managed objects from JIT code by using a freelist. If we had such a mechanism, we could allocate the substringed StringImpl in JIT code. The objects' size should be the same at a given call site, so a simple free list for a specific size works fine (sizeof(StringImpl) + substring slot).
>
> We already have JSRopeString. Since JSRopeString can hold an owner JSString, we can represent a substringed string with JSRopeString + JSString. But our runtime and WebCore rely strongly on WTFString / StringImpl, so they typically attempt to resolve these ropes. So JSRopeString is typically useful when it is used in a series of + operators, and it is not so useful if we use it for slice/substring etc.
I don’t think we should remove JSRopeString’s substring capability for the same reason that we shouldn’t remove JSRopeString in general. We can always expand the set of clients that know how to use JSRopeString directly. I think there are already enough such clients that it’s profitable for perf.
It’s OK to have both JSRopeString substring support and WTFString substring support.
>
> So, my question is, is it feasible to consider adding a freelist mechanism to bmalloc that can be used from JIT?
Have you determined if it’s necessary to inline calls to malloc in the DFG? Please keep in mind that inlining allocation paths is rarely profitable. For example, we don’t even inline bmalloc in C++ code, even though we probably could. It just does not make that big of a difference.
I think that we should set the bar extremely high for having the JIT inline code from bmalloc. I wouldn’t want us to do that for microbenchmarks. If it speeds up something real (ARES-6, JetStream, Speedometer, etc.) then it makes sense.
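For concreteness, a fixed-size free list of the kind being asked about might look roughly like the sketch below. It is illustrative only: FixedSizeFreeList and its members are hypothetical names, not bmalloc or JSC API, and the slow path simply calls malloc. The point is that the fast path in allocate() is just a load, a branch, and a store, which is the part a JIT would inline.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical illustration only: none of these names exist in bmalloc or JSC.
class FixedSizeFreeList {
    // A free cell stores the link to the next free cell in its own memory.
    struct Cell { Cell* next; };

public:
    explicit FixedSizeFreeList(std::size_t cellSize)
        : m_cellSize(cellSize < sizeof(Cell) ? sizeof(Cell) : cellSize)
    {
    }

    ~FixedSizeFreeList()
    {
        // Return any cells still on the list to the system allocator.
        while (m_head) {
            Cell* next = m_head->next;
            std::free(m_head);
            m_head = next;
        }
    }

    FixedSizeFreeList(const FixedSizeFreeList&) = delete;
    FixedSizeFreeList& operator=(const FixedSizeFreeList&) = delete;

    // Fast path: pop the head of the list. Falls back to the slow path when empty.
    void* allocate()
    {
        Cell* cell = m_head;
        if (!cell)
            return allocateSlow();
        m_head = cell->next;
        return cell;
    }

    // Push the cell back onto the list so the next allocate() reuses it.
    void deallocate(void* p)
    {
        m_head = new (p) Cell { m_head };
    }

private:
    // Slow path (assumption: plain malloc stands in for the real allocator).
    void* allocateSlow()
    {
        void* p = std::malloc(m_cellSize);
        if (!p)
            throw std::bad_alloc();
        return p;
    }

    Cell* m_head { nullptr };
    std::size_t m_cellSize;
};
```

Whether inlining those few instructions beats an ordinary call into the allocator is exactly what would need to be measured on a real benchmark.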
A better direction long-term would be to allocate strings in the GC all the time. Then, StringImpl and JSString would become one object. This would require teaching the GC to handle allocation from any thread (the FreeList portion of MarkedAllocator should be per-thread), and to respect reference counts (an input constraint can do that). I think we should talk more about how to do that, if you want to get some more string perf improvements. :-)
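The per-thread free list idea can be sketched as follows. This is a simplified illustration under stated assumptions, not MarkedAllocator: FreeCell, threadFreeList, allocateCell, and freeCell are hypothetical names, there is a single assumed size class, malloc stands in for the GC's refill path, and reference counting is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical names; this is not MarkedAllocator or any JSC type.
struct FreeCell { FreeCell* next; };

constexpr std::size_t cellSize = 64; // assumed single size class

// Each thread has its own list head, so the fast path needs no lock or atomic.
thread_local FreeCell* threadFreeList = nullptr;

void* allocateCell()
{
    FreeCell* cell = threadFreeList;
    if (!cell) {
        // Refill path (assumption: malloc stands in for asking the GC for more cells).
        void* p = std::malloc(cellSize);
        if (!p)
            throw std::bad_alloc();
        return p;
    }
    threadFreeList = cell->next;
    return cell;
}

// Simplification: the cell goes onto the list of whichever thread frees it.
void freeCell(void* p)
{
    threadFreeList = new (p) FreeCell { threadFreeList };
}
```

A real collector would refill and sweep these lists itself rather than relying on explicit frees.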
-Filip