[Webkit-unassigned] [Bug 112081] Printing to PDF should produce internal links when HTML has internal links

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Thu Mar 14 22:18:54 PDT 2013


https://bugs.webkit.org/show_bug.cgi?id=112081





--- Comment #16 from David Lattimore <dml+webkit at google.com>  2013-03-14 22:21:19 PST ---
(From update of attachment 192865)
View in context: https://bugs.webkit.org/attachment.cgi?id=192865&action=review

>> Source/WebCore/ChangeLog:9
>> +        have any existing tests. Manual testing was performed in Chromium
> 
> That said, adding a way to test PDF generation would be a valuable project.

Agreed. Not sure I have sufficient familiarity with WebKit to know where and how to write such a test. Ideally it'd run WebKit headlessly on an input HTML file, render to PDF then make assertions about the contents of the PDF. I've got no idea how straightforward it would be to run WebKit in this manner.

As an interim measure, is there somewhere I can put my test file, in case the code is changed later and someone wants to manually test that it still works? Or should I just attach it to this bug?

>> Source/WebCore/ChangeLog:17
>> +        (WebCore):
> 
> It's just a bug of prepare-ChangeLog that this cruft is added. The purpose of generating file and name links is to simplify adding per-function comments to ChangeLogs, and unhelpful garbage should be removed (yes, a lot of people don't do that, sadly).

Makes sense. Removed.

>> Source/WebCore/dom/Document.h:1200
>> +    HashMap<String, Element*>& usedAnchors() { return m_usedAnchors; }
> 
> Looks like this could be const (on both sides). But see other comments.

Code removed.

>> Source/WebCore/dom/Document.h:1465
>> +    HashMap<String, Element*> m_usedAnchors;
> 
> This is very surprising to have as a permanent member of Document. This causes a serious bug: if one generates PDF twice, and an element is removed in between the calls, a dangling pointer is left in the HashMap, and then dereferenced.
> 
> The HashMap should at least be cleared when no longer needed, but I think that I have a better suggestion below.

I've removed all code from Document.

>> Source/WebCore/page/FrameView.cpp:3434
>> +            p->addDestinationAtPoint(i->key, IntPoint(0, y));
> 
> Why no horizontal offset? The user can have the document zoomed in, and then scrolling to (0, y) may not reveal the element.
> 
> Can the result of absoluteBoundingBoxRect() have negative coordinates? Perhaps we should clamp to 0. Ditto for being out of page bounds on the other side.

I now check that the point is inside the page's rectangle and ignore it if it isn't. I think Skia does this as well, but it doesn't hurt to do it in WebKit as well.

>>> Source/WebCore/rendering/RenderObject.cpp:1124
>>> +            document()->addUsedAnchor(name, element);
>> 
>> Would it work if we just called addDestinationAtPoint here? The same destination could be added several times, but I don't see any reason why that would be a problem. We are just using an intermediate map instead of one in PDF context.
> 
> Unfortunately not. GraphicsContext::addDestinationAtPoint needs to be called on the GraphicsContext for the page containing the destination. If we called it here, the link would only work if the destination was on the same page as the link. It's currently being called for all pages and then dropped when the point falls outside the bounds of the page (i.e. all but the correct page). I'll have a look and see if I can find another way of structuring it without storing stuff on Document. If you have any ideas, let me know.

You asked if a new GraphicsContext is created for each page... I'd assumed that one was - the headless code I'm using does that, but I see that PrintContext::spoolAllPagesWithBoundaries doesn't do that - it instead reuses the same GraphicsContext. In any case, at the Skia level, we definitely have a different SkPDFDevice/SkCanvas for each page and the destinations need to get written to the correct SkPDFDevice for the page that the destination appears on. So unless there is some way to get access to a random page's SkCanvas, the destinations need to be put out while the page containing those destinations is being rendered.

Instead, I've added code to FrameView that finds links, gets the elements they point to and outputs them as destinations if they're on the current page.

-- 
Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.


More information about the webkit-unassigned mailing list