[webkit-dev] Lets use PassRefPtr for arguments less; lets use RefPtr for locals and data members more

Sat Jun 18 22:15:50 PDT 2011

On Jun 18, 2011, at 5:52 PM, Darin Adler wrote:

> 1:
> 
> Recently, Alexey has encouraged me to use PassRefPtr less for function arguments.
> 
> The PassRefPtr optimization pays off when the object in question is possibly being handed off from one PassRefPtr to another. For an argument, that can happen in two ways: 1) The argument can be the result of a function that returns a PassRefPtr. 2) The argument can be the result of calling release on a local or data member that is a RefPtr. In both of those cases, we are transferring ownership.
> 
> Mechanically speaking, PassRefPtr only pays off if we are actually getting that optimization. If we are passing a raw pointer, then using PassRefPtr for the function argument type doesn’t help much. It just puts a ref at the call site, a ref that otherwise would happen inside the function. It may even cause a bit of code bloat if there is more than one call site.
> 
> Conceptually speaking, PassRefPtr only pays off if the context is a clear transfer of ownership. Passing an object that the recipient *might* later choose to take shared ownership of is not enough. Clients are always welcome to take shared ownership of something passed with a raw pointer.
> 
> Because there are also costs to PassRefPtr, we should reserve PassRefPtr arguments for cases where the optimization will really pay off and for where the function definitely is taking ownership. Those functions do exist, but many current uses of PassRefPtr for arguments do not qualify.

A few thoughts on this:

- The benefit of PassRefPtr at any individual call site is probably too small to be measurable, but I believe that taken together the savings add up to somewhere in the ballpark of 0.2%-0.5% on some benchmarks by avoiding refcount churn.

- I think having a rule for using PassRefPtr for arguments that depends on how callers use the function is likely to be hard to use in practice, since it requires global knowledge of all callers to pick the right function signature. The rule I prefer is that if a function takes ownership of the argument, whether or not we currently have a caller that gives away ownership, the function should take PassRefPtr. If it does not take ownership, it should take a raw pointer. That way you can decide the right thing to do based solely on the definition of the function itself.

- Longer term we can replace PassRefPtr with use of C++0x rvalue references. I wonder if it's possible to do this in a way where we use rvalue references on compilers that support them and classic PassRefPtr elsewhere.
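
To make the ownership rule and the rvalue-reference idea concrete, here is a minimal sketch. SketchRefPtr, Node, Holder, and g_refCalls are hypothetical stand-ins written for this example, not WebKit's actual RefPtr or PassRefPtr. The global counter exists only so that refcount churn is observable. The sketch assumes a compiler with C++0x move semantics.

```cpp
#include <cassert>
#include <utility>

int g_refCalls = 0;  // counts every ref() so churn is observable (illustration only)

// Hypothetical intrusively refcounted object.
class Node {
public:
    void ref() { ++m_refCount; ++g_refCalls; }
    void deref() { if (--m_refCount == 0) delete this; }
    int refCount() const { return m_refCount; }
private:
    int m_refCount = 0;
};

// Hypothetical minimal owning pointer; moving it transfers ownership
// without touching the refcount.
template <typename T>
class SketchRefPtr {
public:
    SketchRefPtr() : m_ptr(nullptr) {}
    SketchRefPtr(T* p) : m_ptr(p) { if (m_ptr) m_ptr->ref(); }
    SketchRefPtr(const SketchRefPtr& o) : m_ptr(o.m_ptr) { if (m_ptr) m_ptr->ref(); }
    SketchRefPtr(SketchRefPtr&& o) : m_ptr(o.m_ptr) { o.m_ptr = nullptr; }
    ~SketchRefPtr() { if (m_ptr) m_ptr->deref(); }
    // Copy-and-swap: the by-value parameter is built by copy or by move.
    SketchRefPtr& operator=(SketchRefPtr o) { std::swap(m_ptr, o.m_ptr); return *this; }
    T* get() const { return m_ptr; }
private:
    T* m_ptr;
};

class Holder {
public:
    // Takes ownership, so the parameter is an owning type regardless of
    // how today's callers happen to call it.
    void setNode(SketchRefPtr<Node> node) { m_node = std::move(node); }

    // Does not take ownership, so a raw pointer is enough.
    static bool isSet(const Node* node) { return node != nullptr; }

    Node* node() const { return m_node.get(); }
private:
    SketchRefPtr<Node> m_node;
};
```

A caller that gives away its reference with `holder.setNode(std::move(local))` causes no additional ref() call, while a caller that passes a raw pointer pays for one ref() at the call site instead of inside the function. Either way the signature is chosen from the function's own behavior.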

> 
> 2:
> 
> Recently, I’ve noticed that many bugs simply would cease to exist if we used RefPtr more and raw pointer less for things like local variables and data members.
> 
> The time it’s safe to use a raw pointer for a local variable or data member for a reference counted object pointer is when there is some guarantee that someone else is holding a reference. It can be difficult to have such a guarantee and those guarantees are fragile as code executes. Nowhere is that more clear than in loader-related code.
> 
> Some are loath to use RefPtr for data members because they are concerned about reference cycles. Generally speaking, we can eliminate the worry about reference cycles by making destruction include a “closing” process, which can null out all the references and break cycles. If we find it’s necessary we can also look into an efficient WeakPtr implementation. Some have been enthusiastic about this in the past. I have been a bit less so because I don’t know of an efficient implementation strategy.
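
A minimal sketch of the "closing" idea quoted above, using std::shared_ptr as a stand-in for RefPtr (an assumption made for brevity; RefPtr is intrusive and shared_ptr is not). Frame, Loader, makeCycle, close, and g_liveObjects are hypothetical names for this example and do not correspond to WebKit's classes.

```cpp
#include <cassert>
#include <memory>
#include <utility>

int g_liveObjects = 0;  // illustration only: tracks how many objects are alive

struct Loader;

struct Frame {
    std::shared_ptr<Loader> loader;  // owning reference, frame -> loader
    Frame() { ++g_liveObjects; }
    ~Frame() { --g_liveObjects; }
    void close();                    // nulls out references to break the cycle
};

struct Loader {
    std::shared_ptr<Frame> frame;    // owning back-reference, loader -> frame
    Loader() { ++g_liveObjects; }
    ~Loader() { --g_liveObjects; }
};

void Frame::close() {
    // Move our reference into a local first, so that no member of this
    // object is touched after the back-reference is dropped.
    std::shared_ptr<Loader> doomed = std::move(loader);
    if (doomed)
        doomed->frame.reset();
    // doomed goes out of scope here and releases the loader.
}

// Builds a frame and a loader that hold owning references to each other.
std::shared_ptr<Frame> makeCycle() {
    std::shared_ptr<Frame> frame = std::make_shared<Frame>();
    frame->loader = std::make_shared<Loader>();
    frame->loader->frame = frame;
    return frame;
}
```

Without the call to close(), dropping the last outside reference would leave the pair alive, because each object still holds a reference to the other.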

We could probably use RefPtr for at least some data members in the DOM, but it would be tricky to avoid introducing needless refcount churn. I guess this gets back to the point about PassRefPtr.

Using RefPtr for locals is probably a good idea in some cases, but it can also lead to refcount churn. Refcount churn tends to slow a program down through death by a thousand cuts, so it's hard to find the source of the problem after the fact. I've wondered at times whether it might be a good idea to use a RefPtr<T>& (note the reference) for cases where it's desirable to be guaranteed that someone holds a ref but you don't want to add an extra ref yourself.
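
A rough sketch of the reference idea, again using std::shared_ptr as a stand-in for RefPtr (an assumption; the two are not the same class). Node and both function names are hypothetical. The sketch uses a const reference, which is one possible reading of RefPtr<T>&.

```cpp
#include <cassert>
#include <memory>

struct Node {
    int value = 0;
};

// By value: the parameter is a copy, so the count is bumped on entry
// and dropped again on exit. That pair is the churn.
long countDuringCallByValue(std::shared_ptr<Node> node) {
    return node.use_count();
}

// By reference: the signature shows that the caller holds an owning
// pointer, but the callee adds no reference of its own.
long countDuringCallByReference(const std::shared_ptr<Node>& node) {
    return node.use_count();
}
```

With a single owner, the by-value version observes a count of 2 during the call and the by-reference version observes 1. The guarantee is only as strong as the referenced smart pointer: if that pointer is reset or reassigned during the call, the object can still go away.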

> Conclusion:
> 
> A specific example where an argument has type PassRefPtr, but that does not seem like the correct design, is the node argument in the constructors of the various HTMLCollection classes.

That's probably so, but I'd also guess the cost is smaller than the cost of maintaining a clear understanding of why PassRefPtr should not be used here but should be used in other similar cases. I would still prefer to adhere to the simpler rule even though PassRefPtr doesn't add anything in this specific case.

PassRefPtr also makes it harder to accidentally put an incoming pointer into a raw pointer data member or local variable, so it has a potential typechecking benefit even when it isn't helping your performance.

Regards,
Maciej