[webkit-help] Building a simple web crawler with webkit?

Ryan Leavengood leavengood at gmail.com
Tue Aug 25 18:03:13 PDT 2009

On Tue, Aug 25, 2009 at 5:13 PM, Dan<dan at dancryer.com> wrote:
> Just posted this to webkit-dev, and was advised that this is a better list
> for the question. Sorry if this is a little vague... but, does anyone have
> any general guidance as to where I'd start with webkit if I wanted to build
> a headless web client, along the lines of a crawler / bot, on top of it?

The first question I have is: what platform do you want this to run
on? WebKit runs on many platforms and toolkits (Mac, Linux, Windows,
Haiku, GTK, Qt, wxWidgets, etc.). If you only care about the Mac, then
just worry about the Mac port and the WebKit API exposed there. The
same goes for Windows. If you want something more multi-platform, you
may want to look at the GTK or Qt ports.

> Would I be best to use individual parts of the code, or implement a browser
> and hide the UI side of it?

You should be able to drive WebKit through one of the WebKit APIs (each
platform and toolkit implements its own), and you would not
necessarily need to have a UI. Take a look in the WebKit directory of
the source tree to get an idea of what is exposed at that level. Apple
also has documentation about their WebKit API here:


Note that you may need to delve a bit deeper into WebCore to do things
like traverse the links inside a page. Then again, maybe not, since it
looks like at least Apple's API exposes the DOM in Objective-C.
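
To make the link-traversal part a little more concrete, here is a rough
Python sketch of the crawl loop itself. It does not use WebKit at all:
the standard library's html.parser stands in for the DOM that a WebKit
port would hand you, and `fetch` is a hypothetical callable (URL in,
HTML string or None out) that you would replace with whatever page
loading your chosen port provides. The names `LinkCollector`,
`extract_links` and `crawl` are made up for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags (a stand-in for a DOM query)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute URLs, with fragments stripped, for each link in html."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return [urldefrag(urljoin(base_url, href)).url for href in collector.hrefs]


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl starting at start_url.

    fetch is a hypothetical callable: URL -> HTML string, or None on failure.
    Returns the successfully fetched URLs in visit order.
    """
    seen = {start_url}          # every URL ever queued, to avoid revisits
    queue = deque([start_url])  # frontier of URLs still to fetch
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # skip pages that could not be loaded
        visited.append(url)
        for link in extract_links(html, url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

With a real engine behind it, `extract_links` would instead ask the
loaded page's DOM for its anchor elements, which also picks up links
that scripts add after load. The queue and the `seen` set would stay
the same either way.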

