[Webkit-unassigned] [Bug 63531] Turn PreloadScanner implementation to a thread based solution

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Wed Jun 29 06:02:06 PDT 2011


https://bugs.webkit.org/show_bug.cgi?id=63531





--- Comment #9 from Zoltan Horvath <zoltan at webkit.org>  2011-06-29 06:02:05 PST ---
(In reply to comment #2) Adam
> ...
> > +static String typeAttrString;
> 
> This looks like static initializers, we aren't allowed.

Yeap, this is just a hack to make these strings comparable on the thread, although we should find a better way to do this. I didn't want to initialize the entire QualifiedName 'array' on the thread.(In reply to comment #3)


(In reply to comment #3) Tony
> Really excited to see you working on this.

I like this topic! ;)

> I've used web-page-replay for benchmarking preload scanner changes in the past. Methanol seems similar, but wpr allows for network simulation so that bandwidth contention and RTT can be modeled. Instructions here:
> http://code.google.com/p/web-page-replay/wiki/ChromeBenchmark

I'm going to set up wpr and make measurements soon!

> I haven't looked at the patch yet, but can you describe briefly what you hope the benefit will be. Is this just offloading the CPU time spent parsing or does it actually allow us to discover subresources earlier?

One thing is decrease CPU load on the main thread. On the other side we can save the time that preloadscanner spends on the main thread. Ideally only the resourcerequest will coast on main thread. Does it make sense and is it possible to modify resource downloading to support download from worker threads?

(In reply to comment #4) Tony
> > Is this just offloading the CPU time spent parsing or does it actually allow us to discover subresources earlier?
> 
> BTW, I ask because there are different ways this could evolve. If the concern is the CPU we spend running the preload scanner, an alternative to consider is reusing the tokens the scanner generates instead of discarding them (https://bugs.webkit.org/show_bug.cgi?id=57376).

Unfortunately, with the threaded approach the reusing of the tokens doesn't seems to be trivial. 

> A potential solution would be to always run the preload scanner first thing in insert()/append(), then pass the tokens to the real parser for it to actually run the scripts and build the tree. 

I have just rived this sentence off... My solution does the same, but it doesn't reuse tokens.


(In reply to comment #5) Adam
> Another way this can improve performance is the following situation:
> 
> <script src="external-script-that-comes-back-quickly.js"></script>
> ... lots of stuff to scan ...
> 
> In this case, we'll be blocked inside the preload scanner when the script comes back from the network.  The time between that point in time and when the preload scanner yields will be wasted and add to overall latency.

In the case of html-parser test html5.html looks like the same, so this can be the reason for long runtime, so as I see html-parser test is testing preloadscanner instead of the html5-parser. 

(In reply to comment #7) Antti
> In the worst case preload scanner will end up tokenizing the document text twice. Considering the triviality of the parsing compared the real parser (and  the lack of tree building), the cost should always be fraction of the time spend in real parsing.
> 
> Either these results indicate serious bug in preload scanning or you are mismeasuring.

I will upload a patch here to disable preloadscanner, could you make a quick measurement for me to confirm or deny me?

(In reply to comment #6) Alexey 
> Using a dedicated thread just for this seems extremely wasteful. On Mac, tasks should probably just go to libdispatch, and other platforms would likely need some kind of thread pool.

We would be happy with using of a generic thread pool as well. We pushed Parallel Jobs Framework in the trunk that supports libdispatch also, what do you think about the modification of it to a more generic implementation that supports long/short live threads and use that?

Thanks for the comments guys!

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.



More information about the webkit-unassigned mailing list