[webkit-dev] Parallel JavaScript: Why a separate ParallelArray types

Jarred Nicholls jarred at webkit.org
Fri Apr 12 13:43:10 PDT 2013

Argh, sending from the right address this time.

On Fri, Apr 12, 2013 at 4:39 PM, Jarred Nicholls
<jarred.nicholls at gmail.com>wrote:

> On Fri, Apr 12, 2013 at 2:54 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>> On Apr 12, 2013, at 8:36 AM, "Hudson, Rick" <rick.hudson at intel.com>
>> wrote:
>> > I'm assuming that we agree that JavaScript must evolve to leverage the
>> hardware's power stingy parallelism.
>> True, but there is also the question of how high of a priority we should
>> give to this relative to other project goals, and also what opportunity
>> costs are introduced from any particular implementation of parallelism that
>> we choose.  What you are proposing will only work for code that performs
>> non-effectful operations.  This is a limited form of parallelism, and I'm
>> not sure that it's a general enough programming model to warrant adding new
>> things to the platform.
>> > For completeness there seems to be the following approaches.
>> >
>> > 1)    Add nothing to the language and rely on increasingly
>> sophisticated compilers to detect opportunities to safely parallelize
>> computations within the current semantics.
>> > 2)    Add methods to Array that provide parallel semantics.
>> > 3)    Add syntax and keywords such as parallelFor to introduce
>> parallelism.
>> > 4)    Add a parallel type (ParallelArray) and associated methods that
>> provide parallel semantics.
>> You forgot the obvious ones:
>> 5) Message passing
>> 6) Shared mutable memory with either threads and locks, or something
>> along those lines.
>> We sort of already have (5) but it is half baked; I think there is
>> opportunity to improve it.
>> We don't have (6).  (6) subsumes all of (1), (2), (3), (4), and (5) as it
>> would allow the programmer to roll all of those primitives in JS.
>> We should not reject shared mutable memory out of hand.  It is the
>> prevailing technique for implementing concurrency and parallelism in the
>> other languages that people use.
>> >
>> > The _nothing_ approach would have to deal with array methods, such as
>> map and reduce, that have sequential semantics. For example reduce is
>> really reduce right (or fold right) and these semantics defeat
>> parallelization. For example the EIC signature for the function/closure
>> passed to reduce assumes E === C[I] but the value E could be the result of
>> previous invocation of the function.  This approach seems to be strictly
>> worse than the methods approach. A similar nothing approach is to analyze
>> loops to mechanically discover opportunities for parallelization. While
>> there exists decades of literature that have been full of optimism the
>> reality is that this approach has not achieved wide spread success.
>> Part of that lack of success stems from the fact that most code that does
>> things to arrays has subtle loop-carried side-effects.  The closures passed
>> to ParallelArray methods could have those, also; and if they did then you
>> wouldn't be able to soundly parallelize.
>> You haven't convinced me that ParallelArray will be any more successful
>> than the full autoparallelization approach.  You also haven't convinced me
>> that it will be any more successful than other eclectic ideas, like OpenMP,
>> for example.  I don't think that such annotation-based approaches have been
>> particularly successful in other languages.  The languages that have seen
>> the most success in parallel development are precisely those that give
>> *complete* control to the programmer, such as Java, which gives you threads
>> and locks combined with a sensible memory model - and allows you to craft
>> your algorithm in the way that best suits your problem domain.  Why is this
>> approach being rejected?
>> >
>> > The _methods_ approach side steps the semantics issue raised in the
>> nothing approach but does not compose well with the original methods. For
>> example Array has several inherently sequential  methods, such as push and
>> pop, that do not have a reasonable parallel semantics since they force
>> mutation.
>> > That said this approach has its proponents. Nico
>> http://smallcultfollowing.com/babysteps/blog/2013/02/26/splitting-the-pjs-api/suggested that we add unorderedMap and unorderedReduce to indicate that the
>> scheduling of the computation be unordered. While the idea is compelling on
>> the surface it is likely to lead to the development of algorithms that are
>> at first sequential and then hacked to make them parallel. This approach
>> will impede the goal of having programmers develop parallel algorithms and
>> the appropriate parallel data structures. The methods approach would miss a
>> real opportunity here to move programmers away from the sequential models
>> and on to parallel models.
>> ParallelArray also misses this opportunity.  The only thing that
>> ParallelArray does is adds methods that could have been added to Array, and
>> requires you to say, when you write your code, if you want parallelization
>> or not.  The only problem this solves is inferring the "is it profitable to
>> parallelize" question.  I'm not sure that inferring the answer to this
>> question is hard enough to warrant a new type.
>> >
>> > The _syntax_ approach suggests that we follow the lead of Amp and add
>> parallelFor and siblings to JavaScript. This approach seems to suffer from
>> not being backwardly compatible and like the methods approach misses an
>> opportunity to move programmers away from their sequential mindset.
>> >
>> > The _parallel type_ approach provides a clear separation from the
>> semantics of Array and in doing so does a much better job of encouraging
>> programmers to build parallel algorithms and data structures instead of
>> attempting to parallelize sequential algorithms.
>> The methods approach also provides such a clear separation.
>> Just adding the word "Parallel" to the name of a new type and saying that
>> this is the only type that benefits from parallelism does not constitute
>> positive encouragement of any kind.  Quite the opposite, it seems the only
>> thing it encourages is for VM engineers to give up on the inference
>> approach entirely.
>> > The contract between the programmer and the system can be better
>> defined if ParallelArray exists independently of Array and the two
>> prototypes could develop independently. That is not to say that we
>> shouldn't follow the principle of least surprise and leverage the
>> programmer's familiarity with the Array methods signatures. For example the
>> kernel/callback function should follow the same conventions found in the
>> Array.map signature such as the call back (kernel) function formals being
>> element, index, array.
>> But this is a double-edged sword: having a separate array type with
>> separate semantics incurs opportunity costs for optimization.  As the two
>> prototypes diverge, there will be optimizations that get implemented less
>> well in one than in the other.
>> Supporting more types in a compiler is never free.
>> >
>> > Furthermore the parallel type approach is a clear indication to the SDE
>> and JIT implementers that the programmer wishes to execute this code in
>> parallel. The SDE will be able to assist the programmer when they don't
>> follow the contract, for example by throwing when it encounters a kernel
>> with side effects.
>> But how will developers debug this exception?  Likely by using something
>> like the Inspector's exception view which highlights parts of the code that
>> threw.
>> We can do the same thing for autoparallelization in the "methods"
>> approach: we can highlight those array method calls that were parallelized,
>> as well as those that didn't.
>> I don't buy your argument that ParallelArray handles this developer
>> scenario any better than the other approaches.  Whether the failure case
>> throws an exception, or not, either way we can highlight the fact that one
>> kernel or another wasn't parallelized.
>> > While the parallel type approach could be implemented on top of the
>> methods or even the syntax approach information about the intent of the
>> programmer would be lost to the JIT and the SDE. In summary the real value
>> in adding ParallelArray as its own type is that it helps move the developer
>> structure his programs and choose his algorithms with parallelism in mind
>> and convey that intent clearly.
>> This argument can be used in favor of any parallel or concurrent
>> programming model.  If I run s/ParallelArray as its own type/threads and
>> locks/ on your statement, I get a convincing argument for the traditional
>> approach to shared memory parallelism.
>> I do believe that parallelism, and concurrency, are features that should
>> be added to JavaScript.  But I think we should aim higher than adding a
>> limited-use restricted programming model for this purpose.
>> -Filip
> TL;DR: I'd rather see ParallelArray be a library that is built on top of
> lower-level primitives.
> For as little worth as it is, I agree with you Filip that providing
> low-level primitives would be best in terms of a foundation for many
> parallel programming models.  In terms of actually supporting shared
> mutable memory and locks, it would be a challenge for the engines to become
> thread-safe (internally) across contexts that share memory cells.  I'd
> envision the sharing of mutable memory being an opt-in semantic, marking a
> piece of memory as being shared/mutable from multiple contexts.
> The fact is, parallelism comes in many different forms and use cases.
>  Performing side-effect homogenous tasks over an Array of data is one type
> of problem (and which can be GPU accelerated easily).  Performing separate
> heterogenous operations on shared (or unshared) memory can be different
> more general problem altogether.  I'd tend to prefer a solution that
> provides the primitives to solve all such cases, and can be further
> abstracted & simplified by user-land code to implement different parallel
> programming models.  There are others out there that would vehemently
> disagree with that approach, and their objections are understandable.
>> _______________________________________________
>> webkit-dev mailing list
>> webkit-dev at lists.webkit.org
>> https://lists.webkit.org/mailman/listinfo/webkit-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.webkit.org/pipermail/webkit-dev/attachments/20130412/c757006e/attachment.html>

More information about the webkit-dev mailing list