[Webkit-unassigned] [Bug 9001] New: Javascript stops running before replacement page data arrives

Fri May 19 09:56:45 PDT 2006

http://bugzilla.opendarwin.org/show_bug.cgi?id=9001

           Summary: Javascript stops running before replacement page data
                    arrives
           Product: WebKit
           Version: 420+ (nightly)
          Platform: Macintosh
        OS/Version: Mac OS X 10.4
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Page Loading
        AssignedTo: webkit-unassigned at opendarwin.org
        ReportedBy: ggaren at apple.com

This bug is also in Radar as <rdar://4477632>

* SUMMARY
I'm working on a combination of JavaScript and CGI that gives a progress bar
while a file is being uploaded.  I can't do it in the background with
XMLHttpRequest, because Javascript doesn't allow access to file contents. 
(Grrr.)  Thus, the only workaround is to track progress on the server and use
client-side code to poll it periodically during the form submission.

Problem is... it doesn't work in Safari.  It works fine in Firefox, but in
Safari as soon as you click a link or start a form submission, all Javascript
execution stops cold.  It's really annoying, and the only workaround I can find
(using a pop-up window for status reporting) is a bit of a kludge.

I've attached a simple test page that demonstrates the problem.  To test this,
load the page timertest.html, then click the link to start either the internal
(same page) or external (pop-up) timer, then click to start the test.  In
Firefox, you get he expected behavior—the number continues to increment once a
second while the new page tries to load.  In Safari, the number stops changing
as soon as you click the test link.

-------------------------------------------

<GMT15-Mar-2006 01:21:05GMT> David Gatwood:
I just checked, and this mechanism also works on IE5 (Mac), Netscape 7.1 (Mac),
and Opera (Mac), as well as IE 6 (Win) and Netscape 7.1 (Win).  So it looks
like Safari is the only major browser where this code doesn't work.

<GMT22-Mar-2006 23:35:35GMT> David Gatwood:
Proposed patch attached.  The problem was caused by the call to
Frame::pauseTimeouts from within saveDocumentToPageCache, which was, in turn,
called from WebFrame's _createPageCacheForItem method.  Because this was being
called before the data source was committed, the javascript code stopped
functioning, causing my JavaScript progress bar to stand still.

This patch wraps the _createPageCacheForItem message in a test for
(_private->state == WebFrameStateCommittedPage) instead of (_private->state ==
WebFrameStateProvisional), thus causing the JavaScript code to continue
executing until the last practical moment.

This two line change fixes the misbehavior noted.  Backwards/forwards caching
appears to still work, at least in very limited testing.

<GMT28-Mar-2006 23:54:11GMT> Alice Liu:
Don't worry David - the milestone being "Later" doesn't preclude your fix, if
approved, from being integrated into Leopard. =)

Geoff please review David's patch. 

<GMT30-Mar-2006 00:39:58GMT> Geoff Garen:
David,

It's great that you took this bug into your own hands. However, I don't think
the fix you proposed will work. A load becomes committed when its receives its
first byte of data. At that time, it's almost certain that isLoading is true.
Therefore, your patch would result in WebKit never adding items to the
back/forward cache. 

I confirmed this theory by setting a breakpoint on _createPageCacheForItem and
browsing around www.google.com.

I'm assigning this back to you in case you want to take another shot at it.

<GMT30-Mar-2006 03:49:06GMT> David Gatwood:
At the point where this happens, previousItem is null for the first page, so we
obviously should be looking at currentItem.  However, when I do this, backwards
caching appears to work, but going back forwards results in a bus error.  So it
looks like _createPageCacheForItem will have to take an alternate data source,
cached from the previous one, and if there is no previous data source, no cache
entry should get stored.

It is definitely using the caching with these changes, as I can disconnect from
the network and go backwards and forwards without any problems.

I ended up fixing a NULL pointer dereference in scheduleRelayout in which it
checks to see if m_frame->document is NULL, but doesn't check to see if m_frame
is NULL, and for some reason, it is.  I'm not sure why m_frame is ending up
being NULL, though, and that concerns me that this may be introducing some very
subtle regression in back/forward cache behavior.  I sure can't see it if it
is, though.

I've attached the updated bug as javascript_hang-revised.patch.

I'm not sure how concerned to be about the m_frame issue.  With the updated
changes, it still passes all the regression tests except two that failed before
I started touching things.  Let me know what you think.

<GMT01-Apr-2006 23:09:32GMT> Geoff Garen:
This patch worries me. I don't think either of us fully understands its
repercussions. It's very easy to cause nasty regressions in this party of the
code (I speak from experience here), and none of our regression tests cover it.
(In fact, DumpRenderTree runs with the back/forward cache turned off.)

Some comments:
1. I think you only need to track previousDataSource and previousDocumentView.
You can use those later to discover whether the previous data source was
loading, etc.

2. The commented-out code in Window::Alert is just confusing.

3. Was the change to CursorMac.mm intentional? What does it do?

4. To match "_private->previousDataSource = NULL;", you should probably use
"[_private->previousDataSource release];" instead of "[dataSource release];" In
fact, it would probably be clearer if you just used
"_private->previousDataSource" instead of "dataSource".

5. The log is probably only appropriate in the case where
_private->previousDataSource exists. The fact that we don't save pages to the
b/f cache when you first arrive is not news; it's only if a page fails to save
when you leave that's news. So "else if (_private->previousDataSource) {" is
probably a better way to go.

Generally, this approach seems sound. I don't think we can land it until we
know why m_frame is now NULL in a situation where it wasn't before, though.

<GMT03-Apr-2006 18:29:40GMT> David Gatwood:
Re #1: good point.  That will make things a lot cleaner.

Ignore #2.  This was me trying to figure out whether the script execution had
actually stopped or the results simply were not being displayed.

Ignore #3, also.  This was me trying to work around a compile problem of some
sort that turned out to just be a a case of me using the wrong gcc version.... 
(BTW, the webkit pages should really address which compiler to use.  :-)

Will take care of #4 and #5 and will try to figure out what's causing the null
pointer regression.

<GMT04-Apr-2006 01:39:48GMT> David Gatwood:
Okay, I'm still not sure whether the failure in that call would actually cause
any problems or not, but this revised patch avoids the question entirely by
moving the caching code earlier in the chain of events that lead to committing
the new data---far enough back that backing data hasn't been obliterated, but
far enough forward that it doesn't stop the Javascript interpreter until data
has actually been received.

I've attached the updated patch as javascript_hang-revised-3.patch.

<GMT04-Apr-2006 16:54:55GMT> Geoff Garen:
I'm confused. javascript_hang-revised-3.patch doesn't seem to address #4 or #5,
and it still includes the NULL check for m_frame. 

Generally, I'm reluctant to add things to _checkLoadCompleteForThisFrame that
do more than check whether the load is complete and notify the right folks.
It's true that some code is guilty of that already, but we really need to clean
that up, clarifying load responsibilities and making regressions due to poorly
understood dependencies less likely.

<GMT04-Apr-2006 19:02:47GMT> David Gatwood:
I'm a little confused by your second issue.  I didn't touch
_checkLoadCompleteForThisFrame.  The call to doCache is at the start of
_transitionToCommitted.

Oh, son of a... I just figured out why  m_frame was NULL when the caching code
was executed later....  After that one line change, we can go back to caching
it where I tried to cache it in the original patch.  *slams head into wall
repeatedly*

New (greatly simplified) patch attached as javascript_hang-revised-6.patch.

<GMT05-Apr-2006 14:54:08GMT> Geoff Garen:
if (_private->previousDataSource &&

In Objective-C, messages sent to nil objects just return nil, so the nil check
is unnecessary. 

[_private->bridge canCachePage]

What does this call do? Should it be called the previous bridge now, not the
current one?

         else {
-            LOG(PageCache, "NOT saving page to back/forward cache, %@\n",
[[self dataSource] _URL]);
+            if (_private->previousDataSource) {
+                   LOG(PageCache, "NOT saving page to back/forward cache,
%@\n", [_private->previousDataSource _URL]);
+           }

The WebCore style guidelines say only to use braces around ifs if the ifs
enclose more than one line. So you can eliminate three lines of code here by
nixing the 2 braces and making the else \n if into an else if.

Why do we refuse to cache when _private->quickRedirectComing is true? With your
new scheme, do we need to know if a quick redirect *was* coming for the
previous page?

_canCachePage is the most confusing indirection. Let's just get rid of it and
put the one line of code it encloses at the call site. (Not new to your patch,
but it would be a great thing to clean up along the way.)

+       /* DAG: Delayed caching code until the last possible moment
+          for more consistent JavaScript behavior. */

We usually put this kind of exposition in the ChangeLog. (You can generate a
skeleton ChangeLog with the prepare-ChangeLog script, then fill in the blanks.
You'll need one before this patch can land.)

<GMT05-Apr-2006 18:58:50GMT> David Gatwood:

Cool about the null call being safe.  I'm used to C++ where calling a function
with a NULL "this" pointer isn't a bright idea.  :-)

Re: the bridge's canCachePage... since it's calling through the bridge into the
C++ bigs, that call is probably using the wrong m_frame by this point in the
code no matter what we do.  Thus, I'm thinking that the cacheable status of the
old page should be stored in _private->previousCanCachePage instead of the web
view's notion (which isn't going to change anyway).

Re: the quick redirect, yes, this is still needed, I think.  If the previous
page contained a meta refresh with... I think a second or less... something
like that... then the page doesn't go in the cache because it is considered
worthless... or at least that's the way I heard it explained to me.  But the
previous page still was loaded, so the old page info should still be stored in
_private->previousWhatever, so if we don't check that, we risk breaking that
caching behavior.

Re: style, as long as I'm fixing coding style in that spot, I'm moving the
'else' statements to be on the same line as the prior closing brace, as per
recommended style.

It is a little confusing visually to have a nested else statement without
braces followed immediately by the closing brace of the enclosing 'if'
statement, though.  I'm wondering if maybe that should be an exception where
even a single line 'else' clause should have braces anyway.  Just a thought.

Anyway, updated patch attached.

<GMT19-May-2006 16:52:15GMT> Geoff Garen:
I still don't think this is quite right. I'm going to post your patch on
bugzilla to get more eyes on it.

-- 
Configure bugmail: http://bugzilla.opendarwin.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.