[webkit-dev] Save Page - Ideas

zaheer ahmad zaheer.mot at gmail.com
Wed Nov 5 05:25:37 PST 2008


Hi Darin, Dave,
Thanks for your input.

I have a few comments on the patches:

1. 7211 - The patch seems to serialize only the HTML. What about external
resources? Chrome seems to rely on the cache for these resources, but that
may not work if you want to use save page as a feature to store and view
content offline.

2. 7168 - This patch uses the archive resource, which seems to cache all
loaded resources, including external references, in memory. I haven't
measured the extra memory, but it could be an issue for resource-constrained
devices. Storing to the file system may be an option, at the cost of a
slightly slower load time?
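
To make the concern in point 1 concrete: for offline viewing, the saved HTML
has to point at local copies of its external resources instead of relying on
the cache. Below is a rough Python sketch of that rewriting step. It is not
taken from either patch. The names rewrite_for_offline, local_name and
page_files are made up for illustration, and the regular expression is a
simplification (it handles only a quoted src/href on img, script and link
tags); a real implementation would walk the DOM. It does not fetch anything.
It only returns the rewritten markup plus a map of which URL should be stored
under which local path, so each resource can be written to disk one at a time
rather than held in memory.

```python
import hashlib
import posixpath
import re
from urllib.parse import urljoin, urlsplit

# Simplified matcher (an assumption of this sketch): one quoted src/href
# attribute on <img>, <script> or <link>. Anchors (<a href>) are left alone.
_RESOURCE_ATTR = re.compile(
    r"(?P<prefix><(?:img|script|link)\b[^>]*?\b(?:src|href)\s*=\s*)"
    r"(?P<quote>[\"'])(?P<url>.*?)(?P=quote)",
    re.IGNORECASE | re.DOTALL,
)


def local_name(absolute_url: str) -> str:
    """Derive a stable file name; the hash prefix avoids name collisions."""
    base = posixpath.basename(urlsplit(absolute_url).path) or "resource"
    digest = hashlib.sha1(absolute_url.encode("utf-8")).hexdigest()[:8]
    return f"{digest}_{base}"


def rewrite_for_offline(html: str, page_url: str, files_dir: str):
    """Return (rewritten_html, {absolute_url: relative_local_path})."""
    resources = {}

    def replace(match):
        # Resolve relative references against the page's own URL.
        absolute = urljoin(page_url, match.group("url"))
        if urlsplit(absolute).scheme not in ("http", "https"):
            return match.group(0)  # leave data:, javascript:, etc. untouched
        relative = posixpath.join(files_dir, local_name(absolute))
        resources[absolute] = relative
        quote = match.group("quote")
        return f"{match.group('prefix')}{quote}{relative}{quote}"

    return _RESOURCE_ATTR.sub(replace, html), resources


if __name__ == "__main__":
    page = '<img src="/a/logo.png"><a href="next.html">next</a>'
    new_html, to_save = rewrite_for_offline(
        page, "http://example.com/dir/page.html", "page_files"
    )
    print(new_html)
    for url, path in to_save.items():
        print(url, "->", path)
```

The caller would then download (or copy from the cache) each URL in the map
to its local path, which keeps memory use bounded by the largest single
resource instead of the whole page.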

Thanks,
Zaheer



On Thu, Oct 30, 2008 at 10:17 PM, Darin Fisher <darin at google.com> wrote:

>
>
> On Thu, Oct 30, 2008 at 9:33 AM, David Kilzer <ddkilzer at webkit.org> wrote:
>
>> On Thu, 10/30/08, zaheer ahmad <zaheer.mot at gmail.com> wrote:
>>
>> > I am working on implementing save page functionality. It looks like
>> > it's not already supported in the core.
>>
>> Apple's Mac port saves ".webarchive" files.  The format is specific to the
>> CoreFoundation framework, but there is platform-specific code that does this
>> nevertheless.
>>
>> > Following are some high-level ideas; I am not sure whether some or all
>> > of these are the right approaches to this problem:
>> >
>> > - Write the page data to the file system as and when it is received.
>> > This is not optimal, though, since it incurs constant overhead on page
>> > load.
>>
>> Don't do this.
>>
>> > - APIs to retrieve the source (HTML, JS, CSS) and image/object data
>> > (in original form) from the document. I think the parsers/loaders
>> > handle the data incrementally and throw away the parsed text. Please
>> > validate my understanding here.
>>
>> There should be API to do this already.  Look at how content for
>> .webarchive files is retrieved.
>>
>> > - Parse the HTML and convert all absolute/relative URIs to relative URIs
>> > on the file system.
>>
>> Bug 7211: Support save as "Web page, complete" in Firefox format
>> https://bugs.webkit.org/show_bug.cgi?id=7211
>
>
> We have code to support this feature in the Chromium code base.  You can
> find it here:
>
> http://src.chromium.org/viewvc/chrome/trunk/src/webkit/glue/dom_serializer.h?view=markup
>
> http://src.chromium.org/viewvc/chrome/trunk/src/webkit/glue/dom_serializer.cc?view=markup
>
> It is something we would love to one day see as part of WebKit.
>
> -Darin
>
>
>>
>> > - Any other optimized storage methods, e.g. storing the entire page as a
>> > single file using multipart content.
>>
>> Bug 7169: Support exporting of MHTML web archives
>> https://bugs.webkit.org/show_bug.cgi?id=7169
>>
>> I would strongly encourage you to reuse an existing format rather than
>> inventing your own.  (In my opinion the Firefox format is preferred because
>> it's readable by all web browsers.)
>>
>> Dave
>>
>>
>> _______________________________________________
>> webkit-dev mailing list
>> webkit-dev at lists.webkit.org
>> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
>>
>
>