[webkit-qt] Webkit as javascript+css aware html parser

Tarandeep Singh tarandeep at gmail.com
Thu Mar 25 10:17:38 PDT 2010


On Thu, Mar 25, 2010 at 3:10 AM, Benjamin Poulain <
benjamin.poulain at nokia.com> wrote:

> ext Tarandeep Singh wrote:
>
>> I am working on WebKit (Qt port) and trying to use it as a JavaScript+CSS
>> aware HTML parser.
>> Has anyone worked on something similar and care to join me?
>>
>
> I am not sure of what you mean by "JavaScript+CSS aware HTML parser" but be
> careful to have something clean in order to have it integrated.
>
>
Let me clarify: many web pages generate the content that the user sees
via JavaScript and/or CSS. If we simply fetch the page source and build a
DOM from the HTML tags, we won't get that content. So there is a need for a
parser that not only constructs the DOM from the HTML tags but also runs the
scripts and applies the style information to modify the DOM.
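
As a minimal illustration of the gap (a sketch in Python's standard library, not WebKit code), the snippet below parses a made-up page whose only visible text is written by a script. The page content, the class name VisibleTextParser and the helper static_visible_text are assumptions invented for this example. A parser that does not execute JavaScript finds no visible text at all:

```python
from html.parser import HTMLParser

# Hypothetical page: the visible text is produced by a script at load time.
PAGE = """<html><body>
<div id="content"></div>
<script>
document.getElementById("content").textContent = "Hello from JavaScript";
</script>
</body></html>"""


class VisibleTextParser(HTMLParser):
    """Collects text outside <script>/<style>, as a static parser would see it."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not script or style source.
        if self._skip_depth == 0 and data.strip():
            self.text.append(data.strip())


def static_visible_text(html):
    """Return the text fragments found without running any script."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return parser.text


if __name__ == "__main__":
    # The script is never executed, so the <div> stays empty.
    print(static_visible_text(PAGE))
```

A browser engine would run the script and end up with the text inside the div; that final DOM is what the static pass cannot see.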

In a nutshell, I want what a browser does, but without the actual
rendering. WebKit has all the required components; I just need to trim it
down to the bare minimum: HTML parsing, JavaScript + CSS execution => final DOM.

I would like to run this trimmed-down version of WebKit with a crawler, which
adds another requirement: it should be able to run without an X server.
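
Until such a build exists, one stopgap a crawler could use, assuming the xvfb-run wrapper from Xvfb is installed, is to start the tool under a virtual framebuffer whenever no display is available. This does not remove the X dependency; it only supplies a virtual display. The function name headless_command and the ./dom_dumper binary are hypothetical, made up for this sketch:

```python
import os


def headless_command(cmd, env=None):
    """Return cmd as-is if DISPLAY is set, otherwise wrapped in xvfb-run."""
    env = os.environ if env is None else env
    if env.get("DISPLAY"):
        return list(cmd)
    # "xvfb-run -a" picks a free virtual display number automatically.
    return ["xvfb-run", "-a"] + list(cmd)


if __name__ == "__main__":
    # ./dom_dumper is a placeholder for whatever binary dumps the final DOM.
    print(headless_command(["./dom_dumper", "http://example.com/"], env={}))
```

The returned list can be passed straight to subprocess.run by the crawler.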

-Tarandeep

> For example, Zoltan did a JavaScript parser that performs better than the
> current one (https://bugs.webkit.org/show_bug.cgi?id=34019), but it was
> not integrated due to its structure.
>
>
>> Also, where can I get a high-level architecture overview of WebKit (control
>> flow etc.)? After searching for some time on Google, this is the best I
>> could find on WebKit architecture-
>>
>
> There is no such thing. You can find some useful information on the wiki,
> such as some interesting code paths, for example:
> http://trac.webkit.org/wiki/CodePaths
>
> cheers,
> Benjamin
>