[webkit-dev] Smart Pointer Analysis Tool for WebKit

Konstantin Tokarev annulen at yandex.ru
Thu Sep 17 02:10:49 PDT 2020



17.09.2020, 09:35, "Ryosuke Niwa" <rniwa at webkit.org>:
> Hi all,
>
> I’ve been working with Geoff (ggaren) and Jan Korous from Apple's compiler team to create a static analyzer which detects dangerous use of ref counted objects, and we’re looking for folks who are interested in trying out this tool and helping us refine the tool. Please let me know if you’re interested in using this tool & try applying code changes to WebKit. Our goal is to eventually integrate this tool into buildbot and EWS so that we can catch dangerous code before they get committed.
>
> What is Dangerous?
>
> So what is a dangerous use of ref counted objects you may ask? It’s any use of which we can’t trivially conclude that it doesn’t lead to a use-after-free.
>
> For now, we don’t detect dangerous use of non-ref counted objects including ones that can vend WeakPtr. It’s on us, humans, to decide which objects need to be ref counted or need to be CanMakeWeakPtr.
>
> Consider the following example. This code most certainly will lead to a use-after-free of “parent” in the third line because the code doesn’t keep the parent alive. Because isContentEditable updates style in some cases, it could lead to arbitrary script execution which could remove the parent from the document.
>
> Node* parent = element->parentElement();
> if (parent->isContentEditable())
>     parent->scrollIntoView();
>
> In general, relying on a complex data structure such as DOM tree to keep RefCounted objects alive while we call a non-trivial function is not safe. All it takes for the code to have a use-after-free is for someone to start updating style, layout, etc… inside the function either directly or indirectly. And we don’t want to make WebKit development really hard by forcing anyone who modifies a function to check every caller of the function and their callers, etc… to make sure it’s safe to do so.
>
> For this reason, it’s dangerous to store a raw pointer or a reference to a ref counted object as a local variable and use it after calling a non-trivial function. We did a similar analysis of a number of other patterns and usage of ref counted objects in WebKit and came up with the following basic rules for using ref counted objects in a safe manner. We’re hoping that these rules would be eventually incorporated into our coding style guideline: https://webkit.org/code-style-guidelines/
>
> Rules for Using Ref Counted Objects
>
> * Every data member to a ref counted object must use either Ref, RefPtr, or WeakPtr. webkit.NoUncountedMemberChecker
> * Every ref counted base class, if it has a derived class, must define a virtual destructor. webkit.RefCntblBaseVirtualDtor
> * Every ref counted object passed to a non-trivial function as an argument (including "this" pointer) should be stored as a Ref or RefPtr in the caller’s local scope unless it's an argument to the caller itself by transitive property [1]. alpha.webkit.UncountedCallArgsChecker
> * Every ref counted object must be captured using Ref or RefPtr for lambda. webkit.UncountedLambdaCapturesChecker
> * Local variables - we’re still working on this (https://reviews.llvm.org/D83259)
>
> Below, I've dissected each one of these rules with the real warning emitted by the analysis tool in development. Please let me know if you have any comments / concerns.
>
> ----------------------------------------
> (1) is pretty trivial. Every ref counted data member should be stored using Ref, RefPtr, or WeakPtr since it would be not trivially obvious that life cycles of two or more objects are correctly tied or managed together.
>
> ----------------------------------------
> (2) is also pretty easy to understand. In the following example, if someone destroys an instance of B using Ref<A>, then it would result in an undefined behavior.
>
> struct A : public RefCounted<A> {
>         Vector<int> someData;
> };
>
> struct B : public A {
>         Vector<int> otherData;
> };
>
> ----------------------------------------
> For (3), passing a ref counted object as a raw pointer or reference to a function as an argument, the tool may generate a warning like this:
>
> Source/WebCore/html/FormAssociatedElement.cpp:181:13: warning: [WebKit] call argument is a raw pointers to a ref-countable type [-Wfunction-arg-ptr-to-refcntbl]
>     setForm(findAssociatedForm(&asHTMLElement(), originalForm.get()));
>             ^
>
> This warns that void setForm(HTMLFormElement*) is called with the result of findAssociatedForm, which returns HTMLFormElement* without storing it anywhere. If setForm can somehow cause HTMLFormElement to be deleted, then this can result in the use-after-free in setForm.
>
> void FormAssociatedElement::resetFormOwner()
> {
>     RefPtr<HTMLFormElement> originalForm = m_form.get();
>     setForm(findAssociatedForm(&asHTMLElement(), originalForm.get()));
>     HTMLElement& element = asHTMLElement();
>     auto* newForm = m_form.get();
>     if (newForm && newForm != originalForm && newForm->isConnected())
>         element.document().didAssociateFormControl(element);
> }
>
> Why, you may ask, we don't put HTMLFormElement* in a Ref or RefPtr in setForm instead? This is because a lot of code has calls to a function with a local variable. Because the callee doesn’t know whether the caller has already stored each argument in Ref / RefPtr or not, we need to be safe and store them again in Ref / RefPtr, resulting in an unnecessary ref-churn. Additionally, if we took the approach of the callee being responsible for keeping every argument alive, then every non-trivial member function of a ref counted object must have "protectedThis" at the beginning of the function, which would be particularly wasteful if those member functions are sometimes called by other member functions.
>
> Once this rule is applied everywhere, for example, we can get rid of “protectedThis” from our codebase because it would be redundant. Furthermore, this rule transitively allows a raw pointer or a reference passed in as an argument to be used as arguments to call another function without storing it in Ref or RefPtr.
>
> For example, the following function satisfies this rule because both oldDocument and newDocument are raw references passed to this function as arguments, and therefore the caller of this function must have already stored as Ref / RefPtr as local variables.
>
> void HTMLMediaElement::didMoveToNewDocument(Document& oldDocument, Document& newDocument)
> {
>     ASSERT_WITH_SECURITY_IMPLICATION(&document() == &newDocument);
>     if (m_shouldDelayLoadEvent) {
>         oldDocument.decrementLoadEventDelayCount();
>         newDocument.incrementLoadEventDelayCount();
>     }
>
>     unregisterWithDocument(oldDocument);
>     registerWithDocument(newDocument);
>
>     HTMLElement::didMoveToNewDocument(oldDocument, newDocument);
>     updateShouldAutoplay();
> }
>
> ----------------------------------------
>
> For (4), capturing a ref counted object as a raw pointer or a reference, the tool may generate a warning like this:
>
> Source/WebKit/UIProcess/API/APIHTTPCookieStore.cpp:115:74: warning: [WebKit] captured a raw pointer to a ref-countable type [-Wlambda-capture-ptr-to-refcntbl]
>     cookieManager->deleteCookie(m_owningDataStore->sessionID(), cookie, [pool = WTFMove(pool), completionHandler = WTFMove(completionHandler)]() mutable {
>                                                                          ^
> This is warning about webPage object getting captured using a raw reference in the following function:
>
> void HTTPCookieStore::deleteCookie(const WebCore::Cookie& cookie, CompletionHandler<void()>&& completionHandler)
> {
>     auto* pool = m_owningDataStore->processPoolForCookieStorageOperations();
>     if (!pool) {
>         if (m_owningDataStore->sessionID() == PAL::SessionID::defaultSessionID() && !cookie.session)
>             deleteCookieFromDefaultUIProcessCookieStore(cookie);
>         else
>             m_owningDataStore->removePendingCookie(cookie);
>
>         RunLoop::main().dispatch([completionHandler = WTFMove(completionHandler)] () mutable {
>             completionHandler();
>         });
>         return;
>     }
>
>     auto* cookieManager = pool->supplement<WebKit::WebCookieManagerProxy>();
>     cookieManager->deleteCookie(m_owningDataStore->sessionID(), cookie, [pool = WTFMove(pool), completionHandler = WTFMove(completionHandler)]() mutable {
>         completionHandler();
>     });
> }
>
> Here, we should have stored WebProcessPool in Ref instead.

Sounds great! A few questions:
* Do I understand correctly that analyzer is a part of upstream clang and can work on any platform?
* Does it require WebKit trunk or can work with older branches?
* Any plans to detect other kinds of misuses like circular references?


-- 
Regards,
Konstantin


More information about the webkit-dev mailing list