[webkit-dev] DOM tree traversal on detached nodes

Kentaro Hara haraken at chromium.org
Wed Jun 6 18:19:08 PDT 2012


> The semantics will fix the FIXME <http://code.google.com/codesearch#OAMlx_jo-ck/src/third_party/WebKit/Source/WebCore/bindings/js/JSNodeCustom.cpp&exact_package=chromium&q=jsnodecustom&type=cs&l=112> in
> JSC GC.

Sorry, the link was wrong. Here is the correct one:
<http://code.google.com/codesearch#OAMlx_jo-ck/src/third_party/WebKit/Source/WebCore/bindings/js/JSNodeCustom.cpp&exact_package=chromium&q=jsnodecustom&type=cs&l=91>



On Thu, Jun 7, 2012 at 10:14 AM, Kentaro Hara <haraken at chromium.org> wrote:

> [Summary]
>
> What values should span.parentNode and span.firstChild return in the
> following code? (test html <http://haraken.info/null/ref_count2.html>).
>
>   div = document.createElement("div");
>   document.body.appendChild(div);
>   div.innerHTML = '<p><p><p><span id="span"><br><br><br>text</span></p></p></p>';
>   span = document.getElementById("span");
>   div.innerHTML = "";
>   alert(span);  // <span>
>   alert(span.parentNode);  // ???
>   alert(span.firstChild);  // ???
>
> (a) span.parentNode = <p>, span.firstChild = <br>
> (b) span.parentNode = null, span.firstChild = <br>
> (c) span.parentNode = <p>, span.firstChild = null
> (d) span.parentNode = null, span.firstChild = null
> (e) Any value is OK (i.e. the behavior is UNDEFINED)
>
>
> [Behavior in browsers]
>
> Safari 5.1.7: (b)
> Chrome 20.0: (b)
> Firefox 12.0: (a)
> Opera 11.64: (a)
> IE 9: (d)
>
>
> [How WebKit behaves as (b)]
>
> The behavior is caused by the reference counting algorithm of Node objects
> (TreeShared.h <http://code.google.com/codesearch#OAMlx_jo-ck/src/third_party/WebKit/Source/WebCore/platform/TreeShared.h&exact_package=chromium&q=treeshared.h&type=cs>).
> In the TreeShared algorithm, a Node X is destructed if the ref-count of X
> is 0 and X's parent is NULL. So div.innerHTML = "" causes the following
> steps:
>
> (0) The ref-counts of the three <p>s are 0.
> (1) div.innerHTML = "" is executed.
> (2) The parent of the first <p> becomes NULL. The first <p> is destructed.
> (3) The parent of the second <p> becomes NULL. The second <p> is
> destructed.
> (4) The parent of the third <p> becomes NULL. The third <p> is destructed.
>
> On the other hand, <br><br><br> are not destructed because <span> is
> referenced from the JS side and thus the parent of the first <br> does not
> become NULL. Note that "X is destructed if the ref-count of X is 0 and X's
> parent is NULL" implies that "If X has a ref count, then all the nodes
> under X are kept alive". That's why <p><p><p> are destructed but
> <br><br><br> are not destructed.
>
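
The steps above can be reproduced with a small toy model in plain JavaScript. This is only a sketch of the rule "X is destructed if the ref-count of X is 0 and X's parent is NULL", not the actual TreeShared.h code. ToyNode, ref(), destroyIfUnused() and removeAllChildren() are hypothetical names made up for illustration, and the three <p>s are modelled as nested, as in the description above.

```javascript
// Toy model of the TreeShared rule (hypothetical names, not WebKit code).
const destroyed = []; // names of destructed nodes, in order

class ToyNode {
  constructor(name) {
    this.name = name;
    this.refCount = 0; // references from outside the tree (e.g. a JS wrapper)
    this.parent = null;
    this.children = [];
  }
  appendChild(child) {
    child.parent = this;
    this.children.push(child);
    return child;
  }
  ref() {
    this.refCount++;
  }
  // The rule: destruct X only if its ref-count is 0 and its parent is null.
  destroyIfUnused() {
    if (this.refCount > 0 || this.parent !== null) return;
    destroyed.push(this.name);
    this.removeAllChildren(); // a destructed node lets go of its children
  }
  removeAllChildren() {
    const kids = this.children;
    this.children = [];
    for (const kid of kids) {
      kid.parent = null;
      kid.destroyIfUnused();
    }
  }
}

const div = new ToyNode("div");
div.ref(); // stand-in for the document keeping <div> alive
const p1 = div.appendChild(new ToyNode("p1"));
const p2 = p1.appendChild(new ToyNode("p2"));
const p3 = p2.appendChild(new ToyNode("p3"));
const span = p3.appendChild(new ToyNode("span"));
span.appendChild(new ToyNode("br"));
span.ref(); // JS holds a reference to <span>

div.removeAllChildren(); // models div.innerHTML = ""

console.log(destroyed.join(","));   // p1,p2,p3
console.log(span.parent);           // null
console.log(span.children[0].name); // br
```

In this model <span> loses its parent but keeps its children, which is behavior (b).
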
>
> [Other weird behaviors]
>
> Behavior (b) is weird, and it causes other subtle issues, for example in
> editing. Consider the following code (test html
> <http://haraken.info/null/ref_count4.html>):
>
>   <div contentEditable>
>   a<p>b<p>c<p>d<span id="span">e<br>f<br>g<br>h</p>i</p>j</p>k
>   </div>
>   </body>
>   <script>
>   span = document.getElementById("span");
>   setTimeout(function () {
>     // Please manually delete the text in <div> within 10 seconds
>     alert("span = " + span);  // <span>
>     alert("span.parentNode = " + span.parentNode);  // <p>
>     alert("span.parentNode.parentNode = " + span.parentNode.parentNode);  // null
>     alert("span.firstChild = " + span.firstChild);  // "e"
>   }, 10000);
>   </script>
>
> I am not sure why span.parentNode returns <p> but
> span.parentNode.parentNode returns null. Maybe an undo stack keeps a
> reference to <p>?
>
> Here is another example. Under behavior (b), the following result makes
> sense (test html <http://haraken.info/null/ref_count3.html>):
>
>   <html><body><p><span id="span"><br></span></p></body>
>   <script>
>   span = document.getElementById("span");
>   document.body.innerHTML = "";
>   alert("span = " + span);  // <span>
>   alert("span.parentNode = " + span.parentNode);  // null
>   alert("span.firstChild = " + span.firstChild);  // <br>
>   </script>
>   </html>
>
> However, if we omit </span> and </p>, the result changes (test html
> <http://haraken.info/null/ref_count3-2.html>):
>
>   <html><body><p><span id="span"><br></body>
>   <script>
>   span = document.getElementById("span");
>   document.body.innerHTML = "";
>   alert("span = " + span);  // <span>
>   alert("span.parentNode = " + span.parentNode);  // <p>
>   alert("span.firstChild = " + span.firstChild);  // <br>
>   </script>
>   </html>
>
> Maybe the HTML parser has a list of not-yet-closed tags and the tag entry
> keeps a reference to <p>? I am not sure.
>
> Anyway, the point is that the behavior (b) is UNDEFINED from the
> perspective of JS programmers. The behavior depends on what JS objects are
> being used and what data structures are implicitly being allocated in
> WebCore.
>
>
> [Discussion]
>
> First of all, it seems that the behavior is not defined in the spec.
>
> IMHO, (a) would be the best semantics. The semantics of (a) is very
> straightforward from the perspective of JS programmers, i.e. "Reachable DOM
> nodes from JS are kept alive". On the other hand, (b) and (c) are not good
> in that the semantics is UNDEFINED (i.e. the semantics depends on the
> implementation details). Consequently, in terms of the semantics, WebKit
> might want to change the current behavior to (a).
>
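
For comparison, semantics (a) can be sketched as a plain reachability check: every node that can be reached from a JS-held node through parent or child links stays alive. Again this is only a toy model under that assumption, not how WebKit, JSC or V8 implement anything. SketchNode, detach() and reachableFrom() are hypothetical names.

```javascript
// Toy model of semantics (a): "Reachable DOM nodes from JS are kept alive".
// All names here are hypothetical and exist only for this sketch.
class SketchNode {
  constructor(name) {
    this.name = name;
    this.parent = null;
    this.children = [];
  }
  appendChild(child) {
    child.parent = this;
    this.children.push(child);
    return child;
  }
  // Cut only the link between this node and its parent.
  detach() {
    if (this.parent === null) return;
    const siblings = this.parent.children;
    siblings.splice(siblings.indexOf(this), 1);
    this.parent = null;
  }
}

// Collect every node reachable from the roots via parent or child links.
function reachableFrom(roots) {
  const alive = new Set();
  const stack = [...roots];
  while (stack.length > 0) {
    const node = stack.pop();
    if (alive.has(node)) continue;
    alive.add(node);
    if (node.parent !== null) stack.push(node.parent);
    stack.push(...node.children);
  }
  return alive;
}

const div = new SketchNode("div");
const p = div.appendChild(new SketchNode("p"));
const span = p.appendChild(new SketchNode("span"));
span.appendChild(new SketchNode("br"));

p.detach(); // models div.innerHTML = ""

const alive = reachableFrom([span]); // JS holds only <span>
const names = [...alive].map((node) => node.name).sort();
console.log(names.join(","));  // br,p,span
console.log(span.parent.name); // p
```

Here <p> survives because it is reachable from <span>, so span.parentNode stays <p> and span.firstChild stays <br>, regardless of what else happens to hold a reference.
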
> That being said, I am not sure if the semantics is practically important
> in the real world. As explained above, behavior (b) can indeed cause many
> weird bugs, but only in "rare" cases. In fact, although IE, Firefox,
> Opera, Chrome and Safari have been behaving differently, the current
> confusing semantics has not caused a big practical issue. This would imply
> that the behavior does not matter in the real world. If you know of any
> bugs caused by the behavior, I would be very happy to hear about them.
>
>
> [Why I am discussing this]
>
> I've been designing V8 GC for DOM objects. I investigated a couple of
> ideas, one of which requires the semantics that "Reachable DOM nodes are
> kept alive". The semantics is required to reclaim DOM objects safely in the
> current generational V8 GC. In addition, I would emphasize that the
> semantics will also simplify JSC GC. The semantics will fix the FIXME
> <http://code.google.com/codesearch#OAMlx_jo-ck/src/third_party/WebKit/Source/WebCore/bindings/js/JSNodeCustom.cpp&exact_package=chromium&q=jsnodecustom&type=cs&l=112>
> in JSC GC. Another benefit of the semantics is that it will naturally solve
> the weird WebKit behaviors that I explained above. If we could reach a
> consensus that the behavior (a) is expected, I would like to discuss how to
> achieve the behavior (a) without extra overhead.
>
>
> In conclusion, my question is... what behavior is expected?
>
>
> Thanks!
>
> --
> Kentaro Hara, Tokyo, Japan (http://haraken.info)
>



-- 
Kentaro Hara, Tokyo, Japan (http://haraken.info)