[webkit-dev] WebKit produces wrong result on autocomplete, after user uses back button

Benjie Chen benjie at lablife.org
Tue Mar 16 14:03:40 PDT 2010


Current WebKit-based browsers (as of 3/16/2010), e.g. Safari and
Chrome, exhibit the following bugs. Perhaps someone can take a look.
Thanks.


Bug 1: If a page A has multiple form elements F1 and F2, and the first
(in order of appearance in HTML) form, F1, has autocomplete set to
"off" (i.e. <form ... autocomplete="off">), but F2 has autocomplete
set to "on" (default behavior), then after navigating away from page
A, and then hitting browser back button to go back to page A, F1 and
F2 may be auto-completed incorrectly. In particular, if F1 and F2 both
have input elements with the same name and type, say N and T (i.e.
<input name="N" type="T" ...>), then when navigating back to page A
using the back button, F1.N's value will be autocompleted with F2.N's
value.

Bug 2: First, browser hits page A, and server returns an HTML page
with form elements F1 and F2 (both forms have autocomplete set to on).
Then, user navigates away from page A, and subsequently returns to
page A using the browser back button. On the second visit to page A,
WebKit issues another request for A to the server (this differs from
Firefox's behavior, where the back button issues no additional request
to the server). If the server returns a different HTML page (e.g. because
the user has been logged out), with form elements F3 and F4 that are
different from F1 and F2, but consisting of input elements with the
same name and type, then F3 and F4 will be autocompleted with F1 and
F2 input element values, even for input elements of type hidden and
submit.


WORKAROUND

Bug 1: never use autocomplete="off" unless you have this set for ALL
the forms on the same HTML page.

Bug 2: case-specific, no good generic solution. We found an acceptable
workaround by including hidden forms to make sure the two versions of
page A have similar forms; the first version has F1, F2, and F3, and the
second one has F1, F2', and F3, where F2' is a hidden version of F2. If
we don't include F2', then the second version of page A has only F1 and
F3, and F3 will be autocompleted with F2's element values, even for
hidden and submit elements in F3.


ANALYSIS of WebKit CODE

These two bugs occur in the same part of the code, but can probably be
considered two separate bugs. The code is in the WebCore subdirectory
of the WebKit code tree.


Bug 1: in Document::formElementsState, input elements that have
autocomplete turned ON (checked via
HTMLInputElement::saveFormControlState) have their states saved in a
vector. However, in
HTMLFormControlElementWithState::finishParsingChildren, every form
element, regardless of whether autocomplete is ON or OFF, restores
state from the aforementioned vector. This results in bug 1.

Bug 1 Fix: this should be a fairly straightforward fix -
finishParsingChildren should not restore state if the element has
autocomplete turned OFF.
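
As a rough illustration of bug 1 and the proposed fix, here is a small
Python model (not WebKit code). The Input class, the queue keyed by
name and type, and the skip_autocomplete_off flag are all assumptions
made for this sketch; the real logic lives in the C++ functions named
above and differs in detail.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class Input:
    form: str           # owning form, e.g. "F1" (hypothetical label)
    name: str
    type: str
    value: str
    autocomplete: bool  # effective setting inherited from the form


def save_state(inputs):
    """Save values only for inputs whose autocomplete is on."""
    state = defaultdict(deque)
    for el in inputs:
        if el.autocomplete:
            state[(el.name, el.type)].append(el.value)
    return state


def restore_state(inputs, state, skip_autocomplete_off):
    """Restore in document order; the flag toggles the proposed fix."""
    for el in inputs:
        if skip_autocomplete_off and not el.autocomplete:
            continue
        queue = state.get((el.name, el.type))
        if queue:
            el.value = queue.popleft()


def reload_page(skip_autocomplete_off):
    # First visit: the user typed into both forms.
    visited = [Input("F1", "N", "text", "typed-in-F1", False),
               Input("F2", "N", "text", "typed-in-F2", True)]
    state = save_state(visited)
    # Back button: a fresh copy of the page with empty default values.
    fresh = [Input("F1", "N", "text", "", False),
             Input("F2", "N", "text", "", True)]
    restore_state(fresh, state, skip_autocomplete_off)
    return [el.value for el in fresh]
```

In this model, reload_page(False) leaves F1.N holding the value typed
into F2.N, which is the behavior described for bug 1, while
reload_page(True) restores only F2.N.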

Disclaimer: I don't develop on Mac. I only use it, and we develop a
website. I only browsed the WebKit code today. Hence, I have not
created or tested a patch.


Bug 2. This is much more complex.

I assume that, in a design decision unrelated to autocomplete, WebKit
is designed to re-fetch page A whenever the user uses the back button
to go back in history to page A.

(I'd be interested in hearing about this too)

Fundamentally, WebKit makes the incorrect assumption that the second
fetch of page A results in the same HTML, or at least the same set of
forms, as the first fetch. If this is not the case, then the
autocomplete logic no longer produces the correct/expected behavior.

When WebKit saves state for a page, it calls
Document::formElementsState, which simply creates a map of pairs, and
puts each input element's name+type and value pair into the map. If
two input elements in two separate forms have the same name and type,
both the values are saved.

For example, say page A has forms F1 and F2, and F1 has input elements
with names a1 and a2, with types t1 and t2, with values v1 and v2
respectively. F2 has input elements with names a3 and a2, with types
t1 and t2, and values v3 and v4, respectively. WebKit saves the state
of this page as (in JSON notation)

{  "a1,t1" : [ v1 ], "a2,t2" : [ v2, v4 ], "a3,t1" : [ v3 ]  }

If the user revisits page A using the browser back button, WebKit will
try to autocomplete forms on the new version of page A, fetched from
the server, using the above state. If the new version of page A has
exactly the same forms as the last, then everything works. If not,
then WebKit produces incorrect behavior. For example, assume that the
second time page A is fetched, the server returns just one form F3, and
F3 has input elements with names a4 and a2, with types t1 and t2. Then
F3's a2 element will be populated with v2, saved from the previous
page.

(Note: the actual logic for storing and recovering state, as used in
the code, is slightly different, but the idea is the same)
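
The example above can be modelled in a few lines of Python. This is
only a sketch of the idea, not WebKit's implementation: the tuple
layout for input elements, the function names, and the placeholder
values "fresh4" and "fresh2" are assumptions made for illustration.

```python
from collections import defaultdict, deque


def save_state(forms):
    """forms: list of forms, each a list of (name, type, value) tuples."""
    state = defaultdict(deque)
    for form in forms:
        for name, type_, value in form:
            state[(name, type_)].append(value)
    return state


def restore_state(forms, state):
    """Fill each input from the saved queue for its (name, type) key."""
    restored = []
    for form in forms:
        new_form = []
        for name, type_, value in form:
            queue = state.get((name, type_))
            if queue:
                value = queue.popleft()
            new_form.append((name, type_, value))
        restored.append(new_form)
    return restored


# First fetch of page A: forms F1 and F2.
first = [[("a1", "t1", "v1"), ("a2", "t2", "v2")],
         [("a3", "t1", "v3"), ("a2", "t2", "v4")]]
state = save_state(first)
saved = {key: list(values) for key, values in state.items()}

# Second fetch of page A: a single, different form F3.
second = [[("a4", "t1", "fresh4"), ("a2", "t2", "fresh2")]]
result = restore_state(second, state)
```

Here saved matches the map shown above, and F3's a2 element ends up
with v2 from the old F1 even though F3 is an unrelated form.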

This problem manifests itself on websites where user sessions may
expire, and after a session expires, hitting page A may produce
slightly different HTML. E.g. it may give you a "please login" form, or
it may give you roughly the same content, but with a login form on top
in place of a user-data search form. In these cases, the visible text
input elements, hidden input elements, and submit input elements may
all have their values changed by WebKit.

Bug 2 Fix: This is difficult, because WebKit re-fetches page A when
the user uses the back button. If the new version of page A is different
from the old version, WebKit cannot easily match a form's state from the
old version of the page to some form, if it even exists, on the new
version. You can't realistically require all forms to have the same
DOM id, and even if you do, that's still not entirely correct since
DOM ids are required to be unique within an HTML page, but need not
be unique across separate HTML documents.

The only fix I can think of is this: when you save the state from the
first time you visit page A, take an MD5 or SHA1 hash of the page, and
store that with the input element state. When you go back to page A,
only restore state if the MD5 or SHA1 hash is the same.
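
A minimal Python sketch of that idea follows. It assumes the page is
available as a string and that form values are held in a plain dict;
page_digest, save_page_state, and restore_page_state are hypothetical
names for this illustration, not WebKit functions.

```python
import hashlib


def page_digest(html: str) -> str:
    """SHA1 of the page source, used as a fingerprint of the page."""
    return hashlib.sha1(html.encode("utf-8")).hexdigest()


def save_page_state(html, values):
    """Store the form values together with the digest of the page."""
    return {"digest": page_digest(html), "values": dict(values)}


def restore_page_state(html, saved, defaults):
    """Restore saved values only if the re-fetched page is unchanged."""
    if page_digest(html) != saved["digest"]:
        return dict(defaults)  # page differs: keep server-provided values
    merged = dict(defaults)
    merged.update(saved["values"])
    return merged
```

One trade-off of hashing the whole page: any change in the re-fetched
HTML, even one unrelated to the forms, would disable restoration
entirely.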

