mjs at apple.com
Tue Oct 1 15:27:26 PDT 2013
On Oct 1, 2013, at 3:05 PM, Geoffrey Garen <ggaren at apple.com> wrote:
>>> (3) Find a fast thread-specific data API on the canonical GTK platform.
>> Threading for GTK+ on non-Mac/non-Windows platforms is essentially
> To access thread-specific data using pthreads, you first need to take a lock and call pthread_key_create(). Since the whole point of thread-specific data is to avoid taking a lock, the API is useless.
The normal way to do it is to create the key with pthread_once, which does not in general take a lock. (The alternative is an out-of-band prior initializer, but that wouldn't work for malloc.)
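For concreteness, here is a minimal sketch of the pthread_once pattern, not code taken from FastMalloc or any other allocator. The names ThreadCache, cacheKey, and threadCache() are hypothetical, and the per-thread object is allocated with calloc purely for brevity; a real malloc could not call into the system allocator here and would need its own bootstrap allocation.

```cpp
#include <pthread.h>
#include <cstdlib>

// Hypothetical per-thread state; the name and field are illustrative only.
struct ThreadCache {
    unsigned long allocationCount;
};

static pthread_key_t cacheKey;
static pthread_once_t cacheKeyOnce = PTHREAD_ONCE_INIT;

// Called by pthreads at thread exit for any non-null value stored under the key.
static void destroyThreadCache(void* value)
{
    std::free(value);
}

// pthread_once guarantees this runs exactly once per process,
// even if several threads reach threadCache() at the same time.
static void createCacheKey()
{
    if (pthread_key_create(&cacheKey, destroyThreadCache) != 0)
        std::abort();
}

// Returns the calling thread's cache, creating it on first use.
ThreadCache* threadCache()
{
    pthread_once(&cacheKeyOnce, createCacheKey);

    void* value = pthread_getspecific(cacheKey);
    if (!value) {
        // Assumption for the sketch: calloc stands in for a bootstrap allocator.
        value = std::calloc(1, sizeof(ThreadCache));
        if (!value || pthread_setspecific(cacheKey, value) != 0)
            std::abort();
    }
    return static_cast<ThreadCache*>(value);
}
```

Every call after the first on a given thread reduces to one pthread_once check plus one pthread_getspecific call, which is the per-call cost under discussion.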
> You’ll need an alternative to the cross-platform pthread API for accessing thread-specific data. Otherwise, the cost of that API will dominate any other cost, and it won’t be worth our time to try to optimize other things.
FastMalloc uses vanilla pthread_getspecific() all the time (at least once on every malloc call) on platforms that don't have a faster form of thread-specific data (such as pthread_getspecific on Mac or __thread on Windows). While it makes a difference, FastMalloc still tends to be faster overall than system malloc implementations. So I suspect it would work OK for the new malloc as well. Probably the easiest way to find out is to test.
C++11 also introduces the thread_local keyword, which, where supported, is likely easier to optimize than function-call-based APIs.
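A rough sketch of what the thread_local version could look like follows. The names perThreadAllocations, recordAllocation(), and countOnNewThread() are made up for illustration, and whether a given compiler and platform turn the access into something cheaper than a function call is an assumption that would need to be measured.

```cpp
#include <thread>

// Hypothetical per-thread counter; each thread gets its own zero-initialized copy.
thread_local unsigned long perThreadAllocations = 0;

// Bumps the calling thread's counter and returns the new value.
// No key creation and no explicit lookup call appear in the source.
unsigned long recordAllocation()
{
    return ++perThreadAllocations;
}

// Calls recordAllocation() `calls` times on a fresh thread and
// returns that thread's final count (0 if `calls` is 0).
unsigned long countOnNewThread(unsigned calls)
{
    unsigned long result = 0;
    std::thread worker([&result, calls] {
        for (unsigned i = 0; i < calls; ++i)
            result = recordAllocation();
    });
    worker.join();
    return result;
}
```

The new thread starts counting from zero regardless of what the calling thread has recorded, since the two threads do not share the variable.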