[webkit-dev] parallel layout research

lmeyerov lmeyerov at eecs.berkeley.edu
Sat Aug 1 00:58:19 PDT 2009




>  It reminds me a lot of the work Badros did circa 2000 [1].
> 

That work added linear constraints to CSS and a variety of other languages.
It did not show how to do flow-based layout with linear constraints:
ultimately, the utility and applicability of their techniques to actual
layout systems was never really understood (e.g., can we cleanly implement
and specify CSS with linear constraints? quadratic ones?). I don't recall
the end-to-end solving time, but, in my experience, it's not the constraint
solving of stuff like finding what point of font (which is essentially what
they support). The way I think about that work is that there is awkwardness
with strictly flow-based layout and Badros showed a useful primitive --
linearly constrained variables -- to help with some of it.



> they've implemented an abstract language for describing the CSS
> spec for parameterizing layout.
> 

Most of my implementation work has actually been on tweaking out selectors
as a way to get used to multicore performance concerns in this space :) I'm
working on language subsets, not full ones, in order to experiment with
actual implementations and iterate over many algorithm ideas.

For layout, I started with formalizing a subset of CSS 2.1 -- essentially
inlines, inline blocks, blocks, and left-floats with no
margins/padding/explicit positioning. Given that subset, I found places for
parallelism at the node-level (with speculation that floats don't have an
impact at the local level, which seems founded). Next I'll be bouncing
between expanding it and implementing and then tweaking it for various
memory systems.



>  They suggest that this model does layout of the CSS 2.1 spec in time
> similar to the hand-tuned sequential engine in Gecko [2]
> using Cilk++ to handle low-level paralleism, they've seen
> significant speedups beyond that for multi-core [2] 
> 

That work is now moving into the implementation phase: the slides were for a
model showing a minimum of what we'd expect if solving constraints was the
bottleneck (so parallelizing a tree of slow constraints). Interesting,
they're not -- it's stuff like dealing with text and images that come into
play -- so I actually expect a much better speedup. Because of this, my
first implementation step was to actually reexamine how to call into a font
library like freetype2 for better caching and multicore behavior.



> I think parallel layout is an interesting topic to explore, in light of
> where CPUs are going... We have had a lot of success with improving layout
> performance given test cases that demonstrate slowness.
> 

Worth mentioning our interest is half in traditional browser settings
(faster slashdot) and half in the mobile space. For the former, you may or
may not care if opening a page is as fast as hitting the back button (though
there are studies to show this has a big impact on usability in practice),
but, for the latter, we want to do at least a magnitude better *everywhere*.
Moh has had luck with firefox in terms of better memory usage, but I think
that's telling: you need to take advantage of the hardware with the
algorithms, whether it be CPU or L1.



> We probably can't use code in Cilk++ directly,
> 

TBB might be a more viable route in terms of licensing (and more amenable to
performance tools). Currently, I benefit from Cilk most in the early
prototyping phases -- I then rewrite in TBB because the compiler seems to do
better and I can do better profiling.


I'll be back in the bay area starting mid-September and would love to meet
anybody interested in this stuff. There are approaches that might work
(e.g., STM or rewriting in a pure language), but at least we're starting to
get proof that multicore helps for at least one approach :) Finally, it's
worth noting that I also get significant speedups by being careful about
l1/l2 usage -- I doubt that style of optimization works well in a huge
codebase with many contributors, so parallelization (with a sane framework
like cilk/tbb) might better scale with the code over time.
-- 
View this message in context: http://www.nabble.com/parallel-layout-research-tp24761520p24765531.html
Sent from the Webkit mailing list archive at Nabble.com.



More information about the webkit-dev mailing list