[webkit-dev] Parallel JavaScript: Why a separate ParallelArray types

Mon Apr 15 19:05:52 PDT 2013

On Apr 12, 2013, at 10:26 PM, Filip Pizlo <fpizlo at apple.com> wrote:

> 
> You have all of the same problems as with threads and locks.  The two are logically equivalent.  You can imagine a lock as being a message queue that always has a token in it when the lock is unlocked, and the token is read out to "lock" the lock.  You can imagine a shared memory location as being a task that controls that location, and you send messages to it to read and write the location.

The key difference in practice seems to be that message passing can be made nonblocking and asynchronous, whereas locking typically is not.

> 
>> - No possibility of deadlock
> 
> There is a possibility of deadlock.  You can have one task waiting for a message from another, and that other waiting for a message from the first.

Yes, you can have pseudo-deadlocks like this, but unlike actually blocking on a mutex in serial code on the main thread, this will not cause the web page as a whole to outright hang. This is the only distinction I meant to draw. If my webpage was blocked on a hypothetical JS mutex on the main thread, then I (as the user) can't click on a link or select text or what have you. If it's waiting for a message reply that never comes, then I almost certainly can.

> 
> But I think that deadlocks are a red herring.  For a locked program to experience deadlock, you need an inversion of lock nesting.  But not only can this be detected dynamically - you can throw an exception as soon as it happens - it also seems to happen infrequently.  I've seen people scare each other over them at programming language conferences, but I think that deadlocks are sufficiently rare in practice that it's not worth worrying too much about.

I had the (perhaps incorrect) impression that there was no reliable way to avoid lock inversion other than globally ensuring that the program as a whole always grabs locks in the same order. If there is a better solution, then fine. But even a good solution would entail the potential of blocking the WebProcess main thread for a long time, and message passing doesn't have that effect.

> 
> 
>> - No performance penalty from correctly supporting fine-grained concurrent access to arbitrary mutable objects
> 
> There is a huge performance penalty.  Message queues involve the same underlying synchronization primitives as locks.  

There's no penalty to single-threaded code from message passing, though. With shared state however: If every JS array had to safely handle concurrent access from multiple threads (at the very least preventing memory corruption), that would be a huge performance penalty for single-threaded code. Or if you only turn on the checks when a second thread is created, then you have a performance cliff, and the checks would likely still harm code that doesn't actually access data structures concurrently.

Message passing (at least in the Workers style) is really only suitable for course-grained concurrency, and indeed, it does impose a significant communication overhead.

>> 
>> The first is particularly important as you really do not want the web page's UI thread to lock up, even if the page itself is not making progress.
> 
> But message passing also leads to lock-up.  You could have someone waiting for a message that hasn't been sent yet.

I clarified above what I meant here.

> 
> Of course you could try to side-step this by mandating that there is no synchronous message receive operation on the main thread, and instead all messages arrive back asynchronously.

Indeed, there is no synchronous message receive on any thread in Web Workers.

>  But there is no reason why you couldn't do that with locks: you could have an async lock operation:
> 
> lock.acquireAsynchronously(function() { … do things once lock is available … });

Can a lock-based model with only asynchronous locking be made efficient enough for real use? (If you offer the synchronous version at all, then the worst programmers will use it, once again leading to blocking). I would guess it ends up only good enough for course-grained concurrency, at which point it does not have much advantage compared to message passing.

> 
> But I'm not sure that this would even be necessary.  If the web page's UI gets stuck because of an error made by the programmer, then that's not necessarily the programming language designer's problem to solve.  

If a mistake of type X with consequence Y is possible in Model A but not in Model B, that's an advantage for Model B. The Web has set higher expectations of protecting the user from the consequences of badly written webpages.

> I don't believe that a deadlock is any more likely than a long-running computation or a run-away memory leak.  In fact in my experience it is less likely, particularly if you provide people with a sensible API for good locking discipline like:
> 
> lock.acquire(function() { … do things … });
> 
> And specifically disallow manual lock acquire/release except using this block-scoped idiom.  The only way to deadlock in that world is to have inverted lock nesting, which can be detected by the runtime, and you can throw an exception in all offending threads.
> 
> Also, you could do a libdispatch style, where there is no "lock" per se but instead you just have queues that you put tasks on.  I like that, too.  It's also formally equivalent to both locks and message queues, but much more similar to threads and locks in practice.

How would you use a dispatch queue model to concurrently access the same data structure from different threads? If you can't do this (safely), then it's really no different from Web Workers. If you can, then I expect this involves locking. libdispatch itself has semaphores for this purpose.

> 
>> 
>> I believe message passing as a concurrent programming model is much less prone to severe errors than shared mutable state + locking.
> 
> I don't think there is evidence to support this claim.  In fact there is evidence to support the opposite: hardly anybody uses languages that only have message passing (Erlang was I think the only "popular" one ever, for a very loose definition of "popular"), but languages that have shared mutable state are used widely (C++, Java, and many others).

These languages are popular, but I believe most complex projects using them with threading a long tail of nasty but hard-to-diagnose threadsafety bugs.

Regards,
Maciej

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.webkit.org/pipermail/webkit-dev/attachments/20130415/b3fd8439/attachment.html>