[webkit-dev] Smart Pointer Analysis Tool for WebKit

Ryosuke Niwa rniwa at webkit.org
Wed Sep 16 23:32:12 PDT 2020


Hi all,

I’ve been working with Geoff (ggaren) and Jan Korous from Apple's compiler
team to create a static analyzer which detects dangerous use of ref counted
objects, and we’re looking for folks who are interested in trying out this
tool and helping us refine the tool. Please let me know if you’re
interested in using this tool & try applying code changes to WebKit. Our
goal is to eventually integrate this tool into buildbot and EWS so that we
can catch dangerous code before they get committed.

*What is Dangerous?*

So *what is* a *dangerous* use of ref counted objects you may ask? It’s any
use of which we can’t trivially conclude that it doesn’t lead to a
use-after-free.

For now, we don’t detect dangerous use of non-ref counted objects including
ones that can vend WeakPtr. It’s on us, humans, to decide which objects
need to be ref counted or need to be CanMakeWeakPtr.

Consider the following example. This code most certainly will lead to a
use-after-free of “parent” in the third line because the code doesn’t keep
the parent alive. Because isContentEditable updates style in some cases, it
could lead to arbitrary script execution which could remove the parent from
the document.

Node* parent = element->parentElement();
if (parent->isContentEditable())
    parent->scrollIntoView();

In general, relying on a complex data structure such as DOM tree to keep
RefCounted objects alive while we call a non-trivial function is not safe.
All it takes for the code to have a use-after-free is for someone to start
updating style, layout, etc… inside the function either directly or
indirectly. And we don’t want to make WebKit development really hard by
forcing anyone who modifies a function to check every caller of the
function and their callers, etc… to make sure it’s safe to do so.

For this reason, it’s *dangerous* to store a raw pointer or a reference to
a ref counted object as a local variable and use it after calling a
non-trivial function. We did a similar analysis of a number of other
patterns and usage of ref counted objects in WebKit and came up with the
following basic rules for using ref counted objects in a safe manner. We’re
hoping that these rules would be eventually incorporated into our coding
style guideline: https://webkit.org/code-style-guidelines/

*Rules for Using Ref Counted Objects*


   1. Every data member to a ref counted object must use either Ref,
   RefPtr, or WeakPtr. webkit.NoUncountedMemberChecker
   <https://github.com/llvm/llvm-project/blob/master/clang/docs/analyzer/checkers.rst#id17>
   2. Every ref counted base class, if it has a derived class, must define
   a virtual destructor. webkit.RefCntblBaseVirtualDtor
   <https://github.com/llvm/llvm-project/blob/master/clang/docs/analyzer/checkers.rst#id16>
   3. Every ref counted object passed to a non-trivial function as an
   argument (including "this" pointer) should be stored as a Ref or RefPtr in
   the caller’s local scope unless it's an argument to the caller itself by
   transitive property [1]. alpha.webkit.UncountedCallArgsChecker
   <https://github.com/llvm/llvm-project/blob/master/clang/docs/analyzer/checkers.rst#id21>
   4. Every ref counted object must be captured using Ref or RefPtr for
   lambda. webkit.UncountedLambdaCapturesChecker
   <https://github.com/llvm/llvm-project/blob/master/clang/docs/analyzer/checkers.rst#id18>
   5. Local variables - we’re still working on this (
   https://reviews.llvm.org/D83259)


Below, I've dissected each one of these rules with the real warning emitted
by the analysis tool in development. Please let me know if you have any
comments / concerns.

------------------------------
(1) is pretty trivial. Every ref counted data member should be stored using
Ref, RefPtr, or WeakPtr since it would be not trivially obvious that life
cycles of two or more objects are correctly tied or managed together.

------------------------------
(2) is also pretty easy to understand. In the following example, if someone
destroys an instance of B using Ref<A>, then it would result in an
undefined behavior.

struct A : public RefCounted<A> {
        Vector<int> someData;
};

struct B : public A {
        Vector<int> otherData;
};

------------------------------

For (3), passing a ref counted object as a raw pointer or reference to a
function as an argument, the tool may generate a warning like this:

Source/WebCore/html/FormAssociatedElement.cpp:181:13: warning: [WebKit]
call argument is a raw pointers to a ref-countable type
[-Wfunction-arg-ptr-to-refcntbl]
    setForm(findAssociatedForm(&asHTMLElement(), originalForm.get()));
            ^

This warns that void setForm(HTMLFormElement*) is called with the result
of findAssociatedForm, which returns HTMLFormElement* without storing it
anywhere. If setForm can somehow cause HTMLFormElement to be deleted, then
this can result in the use-after-free in setForm.

void FormAssociatedElement::resetFormOwner()
{
    RefPtr<HTMLFormElement> originalForm = m_form.get();
    setForm(findAssociatedForm(&asHTMLElement(), originalForm.get()));
    HTMLElement& element = asHTMLElement();
    auto* newForm = m_form.get();
    if (newForm && newForm != originalForm && newForm->isConnected())
        element.document().didAssociateFormControl(element);
}

Why, you may ask, we don't put HTMLFormElement* in a Ref or RefPtr in
setForm instead? This is because a lot of code has calls to a function with
a local variable. Because the callee doesn’t know whether the caller has
already stored each argument in Ref / RefPtr or not, we need to be safe and
store them again in Ref / RefPtr, resulting in an unnecessary ref-churn.
Additionally, if we took the approach of the callee being responsible for
keeping every argument alive, then every non-trivial member function of a
ref counted object must have "protectedThis" at the beginning of the
function, which would be particularly wasteful if those member functions
are sometimes called by other member functions.

Once this rule is applied everywhere, for example, we can get rid of
“protectedThis” from our codebase because it would be redundant.
Furthermore, this rule transitively allows a raw pointer or a reference
passed in as an argument to be used as arguments to call another function
without storing it in Ref or RefPtr.

For example, the following function satisfies this rule because both
oldDocument and newDocument are raw references passed to this function as
arguments, and therefore the caller of this function must have already
stored as Ref / RefPtr as local variables.

void HTMLMediaElement::didMoveToNewDocument(Document& oldDocument,
Document& newDocument)
{
    ASSERT_WITH_SECURITY_IMPLICATION(&document() == &newDocument);
    if (m_shouldDelayLoadEvent) {
        oldDocument.decrementLoadEventDelayCount();
        newDocument.incrementLoadEventDelayCount();
    }

    unregisterWithDocument(oldDocument);
    registerWithDocument(newDocument);

    HTMLElement::didMoveToNewDocument(oldDocument, newDocument);
    updateShouldAutoplay();
}

------------------------------

For (4), capturing a ref counted object as a raw pointer or a reference,
the tool may generate a warning like this:

Source/WebKit/UIProcess/API/APIHTTPCookieStore.cpp:115:74: warning:
[WebKit] captured a raw pointer to a ref-countable type
[-Wlambda-capture-ptr-to-refcntbl]
    cookieManager->deleteCookie(m_owningDataStore->sessionID(), cookie,
[pool = WTFMove(pool), completionHandler = WTFMove(completionHandler)]()
mutable {
                                                                         ^
This is warning about webPage object getting captured using a raw reference
in the following function:

void HTTPCookieStore::deleteCookie(const WebCore::Cookie& cookie,
CompletionHandler<void()>&& completionHandler)
{
    auto* pool = m_owningDataStore->processPoolForCookieStorageOperations();
    if (!pool) {
        if (m_owningDataStore->sessionID() ==
PAL::SessionID::defaultSessionID() && !cookie.session)
            deleteCookieFromDefaultUIProcessCookieStore(cookie);
        else
            m_owningDataStore->removePendingCookie(cookie);

        RunLoop::main().dispatch([completionHandler =
WTFMove(completionHandler)] () mutable {
            completionHandler();
        });
        return;
    }

    auto* cookieManager = pool->supplement<WebKit::WebCookieManagerProxy>();
    cookieManager->deleteCookie(m_owningDataStore->sessionID(), cookie, [pool
= WTFMove(pool), completionHandler = WTFMove(completionHandler)]() mutable {
        completionHandler();
    });
}

Here, we should have stored WebProcessPool in Ref instead.

- R. Niwa
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.webkit.org/pipermail/webkit-dev/attachments/20200916/093a674a/attachment.htm>


More information about the webkit-dev mailing list