[webkit-gtk] Operation webkit_web_view_new() crashed when in multithreading
Niranjan Rao
nhrdls at gmail.com
Wed Nov 19 10:09:22 PST 2014
We also use WebKit for what essentially boils down to crawling, because
of the scenarios liuyang06 described. Ajax-driven sites make life very
interesting, as you are never sure when the content you are interested in
is going to show up. With normal curl/wget-like tools and calls it becomes
very hard to manage, especially if you are working with sites that need
authentication and use JavaScript heavily. Simple actions such as
submitting a form get very complicated, because we have to reverse engineer
the site and see what actually gets submitted after JavaScript processing
is done. It's much easier to build robotic actions: set text on this
input, click there, let JavaScript massage the data and submit, and then
read this content when it appears.
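
The "read this content when it appears" step can be sketched as a small polling helper. This is only an illustration, not code from our crawler and not a WebKit API: wait_until, FakePage and find_text are hypothetical names, and FakePage merely simulates content that shows up after a few polls. In a real GTK application the check would normally be driven from the main loop (for example by a timer) rather than by a blocking sleep.

```python
import time


def wait_until(probe, timeout=10.0, interval=0.25):
    """Call probe() repeatedly until it returns something other than None.

    Raises TimeoutError if nothing shows up within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("content did not appear within %.1f s" % timeout)
        time.sleep(interval)


class FakePage:
    """Hypothetical stand-in for a page whose content is filled in by Ajax."""

    def __init__(self, ready_after):
        self.polls = 0
        self.ready_after = ready_after

    def find_text(self, selector):
        # Pretend the element only exists after `ready_after` lookups.
        self.polls += 1
        if self.polls >= self.ready_after:
            return "42 results"
        return None


if __name__ == "__main__":
    page = FakePage(ready_after=3)
    text = wait_until(lambda: page.find_text("#results"), timeout=2.0, interval=0.05)
    print(text, "after", page.polls, "polls")
```

The timeout matters in practice: a site that never delivers the expected content should fail the step instead of hanging the crawl.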
Agreed that WebKit is heavy for these operations, but after
experimenting with a lot of the sites we want to process and the tools
that were/are available, we concluded it was the best technology. With
Xvfb it works perfectly. My next goal is to experiment with the network
process model and see if we can reduce resource consumption a little more.
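
For anyone who has not run a GUI toolkit under Xvfb before, here is a rough sketch of wrapping a crawler command with xvfb-run so it gets a virtual display. The crawler name and its argument are made up for illustration, and the screen geometry is just an assumed default; only the xvfb-run options --auto-servernum and --server-args are real.

```python
import shlex


def xvfb_command(crawler_argv, screen="1280x1024x24"):
    """Build an argv list that runs crawler_argv under a virtual X display."""
    return [
        "xvfb-run",
        "--auto-servernum",                      # pick a free display number
        "--server-args=-screen 0 " + screen,     # geometry is an assumption
    ] + list(crawler_argv)


if __name__ == "__main__":
    # "./my-crawler" is a hypothetical binary, not a real tool.
    argv = xvfb_command(["./my-crawler", "http://example.com/"])
    print(" ".join(shlex.quote(a) for a in argv))
    # To actually launch it: subprocess.run(argv, check=True)
```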
On 11/18/2014 09:01 PM, Robert Schroll wrote:
> On Tue, Nov 18, 2014 at 8:56 PM, 刘阳 <liuyang06 at hc360.com> wrote:
>> But, as you know, more and more websites load content dynamically
>> with JavaScript, and may add DOM nodes to the HTML depending on what
>> the user does or types. Therefore, I want to write a program that
>> behaves like a real user with WebKitGTK, without a GUI.
>
> I admit I've never used it myself, but it sounds like you're looking
> for Ghost.py: https://github.com/jeanphix/Ghost.py
>
> Robert