[webkit-gtk] Operation webkit_web_view_new() crashed when in multithreading

Niranjan Rao nhrdls at gmail.com
Wed Nov 19 10:09:22 PST 2014


We also use webkit for what essentially boils down to crawling, because 
of the scenarios liuyang06 described. Ajax-driven sites make life very 
interesting, as you are never sure when the content you are interested 
in is going to show up. With plain curl/wget-like tools or calls this 
becomes very hard to manage, especially if you are working with sites 
that need authentication and use javascript heavily. Simple actions such 
as submitting a form get very complicated, because we have to reverse 
engineer the site and see what actually gets submitted after the 
javascript processing is done. It's much easier to build robotic actions 
that say: set text on this input, click there, let javascript massage 
the data and submit, and then read this content when it appears.
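
A rough sketch of the "read this content when it appears" step, in 
Python for brevity. The helper name wait_for and its parameters are 
made up for illustration and are not part of any WebKitGTK API; the 
condition callable stands in for whatever check you run against the 
page. In a real GTK program the polling would have to be driven from 
the main loop (for example a timeout callback) rather than a blocking 
sleep, since the page only loads while the main loop runs.

```python
import time


def wait_for(condition, timeout=10.0, interval=0.1):
    """Poll `condition` until it returns a truthy value.

    `condition` is a hypothetical zero-argument callable, e.g. one that
    looks up an element on the page and returns its text, or None if the
    element is not there yet. Raises TimeoutError if nothing truthy is
    returned before `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("content did not appear in time")
        # Blocking sleep: fine for a sketch, not for a GTK main loop.
        time.sleep(interval)
```

A robotic action list would then be: set the input, click, and call 
wait_for with a check for the element you expect javascript to produce.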

Agreed that webkit is heavy for these operations, but after 
experimenting with a lot of the sites we want to process and the tools 
that were/are available, we concluded it was the best technology. With 
Xvfb it works perfectly. My next goal is to experiment with the network 
process model and see if we can reduce resource consumption a little 
more.

On 11/18/2014 09:01 PM, Robert Schroll wrote:
> On Tue, Nov 18, 2014 at 8:56 PM, 刘阳 <liuyang06 at hc360.com> wrote:
>> But, as you know, more and more websites rely on dynamic loading 
>> via javascript.
>> It may add DOM nodes to the HTML depending on what the user does or 
>> types. Therefore, I want to write a program that acts as
>> a real user with WebKitGtk, without a GUI.
>
> I admit I've never used it myself, but it sounds like you're looking 
> for Ghost.py: https://github.com/jeanphix/Ghost.py
>
> Robert