[webkit-dev] Memory profiler

Wed Feb 4 08:32:28 PST 2009

Thanks for your comments.

You're right about the IDL files; only an attribute has to be added.
That will make the JSObject report its own size plus the size reported
by the implementation object. I think at first we should start by implementing
the "approximateSize" function for the most memory-consuming objects. I
think that for the JavaScript developer the approximate size can be as
useful as the actual size, as long as bigger objects are always reported
as bigger than smaller ones.
For example, DOM elements can compute their size by looking at the text
elements and images that they contain (objects that have a
JSDomWrapper can be excluded from this search, so that objects are not
counted twice). That can be implemented in a higher-level class and
overridden for resource-heavy objects.
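
A rough sketch of that idea, under stated assumptions: Node, TextNode,
ImageNode, ownSize() and hasWrapper() are hypothetical stand-ins, not
existing WebCore classes or functions. The base class does the walk over
children and skips the ones that have their own wrapper; the subclasses
only add the memory they hold outside of sizeof().

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DOM node; not WebCore's real Node class.
class Node {
public:
    explicit Node(bool hasWrapper = false) : m_hasWrapper(hasWrapper) {}
    virtual ~Node() = default;

    // True when the node has its own JS wrapper, so the collector will
    // report it separately.
    bool hasWrapper() const { return m_hasWrapper; }

    void appendChild(std::unique_ptr<Node> child)
    {
        m_children.push_back(std::move(child));
    }

    // This node plus every child that is not reported on its own.
    std::size_t approximateSize() const
    {
        std::size_t total = ownSize();
        for (const auto& child : m_children) {
            if (!child->hasWrapper())
                total += child->approximateSize();
        }
        return total;
    }

protected:
    // Overridden by objects that hold memory outside of sizeof().
    virtual std::size_t ownSize() const { return sizeof(Node); }

private:
    bool m_hasWrapper;
    std::vector<std::unique_ptr<Node>> m_children;
};

class TextNode : public Node {
public:
    explicit TextNode(std::string text, bool hasWrapper = false)
        : Node(hasWrapper), m_text(std::move(text)) {}

protected:
    std::size_t ownSize() const override
    {
        return sizeof(TextNode) + m_text.capacity();
    }

private:
    std::string m_text;
};

class ImageNode : public Node {
public:
    explicit ImageNode(std::size_t decodedBytes, bool hasWrapper = false)
        : Node(hasWrapper), m_decodedBytes(decodedBytes) {}

protected:
    std::size_t ownSize() const override
    {
        return sizeof(ImageNode) + m_decodedBytes;
    }

private:
    std::size_t m_decodedBytes; // size of the decoded pixel data
};
```

The numbers are only estimates (allocator overhead is ignored, for
example), but an element holding a large image will always be reported
as bigger than an empty one.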

The object ID is useful to track objects from one snapshot to another.
It can be maintained in a separate hashtable using the pointer
from the collector as a key. The JSCell constructor and destructor will
maintain that table. Each entry will also contain a memory buffer
representing the callstack dumped when the object was created. The
debugger already has hooks for maintaining the callstack; I think they
can be used to dump a callstack when an object is created. A callstack
can even be shared among objects that are created on appropriate
callstacks (when they are created in the same function, the parent
callstack can be reused). A timestamp can be useful in order to separate
older objects.
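
To make the table more concrete, here is a minimal sketch. All names
(ObjectRegistry, ObjectRecord, didCreate, didDestroy) are made up for
illustration, and the callstack is simply a list of strings; the real
hooks in the JSCell constructor and destructor would look different.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical representation of a callstack: one string per frame.
using CallStack = std::vector<std::string>;

struct ObjectRecord {
    std::uint64_t id;                               // stable across snapshots
    std::chrono::steady_clock::time_point created;  // to separate older objects
    std::shared_ptr<const CallStack> callStack;     // may be shared by many objects
};

class ObjectRegistry {
public:
    // Would be called from the cell constructor (assumption).
    std::uint64_t didCreate(const void* cell, std::shared_ptr<const CallStack> stack)
    {
        ObjectRecord record { m_nextId++, std::chrono::steady_clock::now(), std::move(stack) };
        m_records[cell] = record;
        return record.id;
    }

    // Would be called from the cell destructor, so that a reused address
    // never inherits the ID of a dead object.
    void didDestroy(const void* cell) { m_records.erase(cell); }

    // Returns nullptr when the pointer is not tracked.
    const ObjectRecord* find(const void* cell) const
    {
        auto it = m_records.find(cell);
        return it == m_records.end() ? nullptr : &it->second;
    }

    std::size_t liveCount() const { return m_records.size(); }

private:
    std::uint64_t m_nextId = 1;
    std::unordered_map<const void*, ObjectRecord> m_records;
};
```

Because the table lives outside the cells, the objects themselves do not
grow, and the whole thing can be skipped when the profiler is not running.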

The data can be exposed in the Inspector tool, since that uses a separate
heap to store data and doesn't affect the snapshots. It can take
snapshots of memory and compare them to previous ones. One pattern
of bad practice in JavaScript is initializing everything from the
start even though it is not going to be used; comparing snapshots
should make that kind of object stand out.
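
Comparing two snapshots could look roughly like this, assuming a snapshot
is just a map from object ID to approximate size. Snapshot, SnapshotDiff
and compareSnapshots are hypothetical names, not an existing Inspector
API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical snapshot format: object ID -> approximate size in bytes.
using Snapshot = std::map<std::uint64_t, std::size_t>;

struct SnapshotDiff {
    std::vector<std::uint64_t> added;     // only in the newer snapshot
    std::vector<std::uint64_t> removed;   // only in the older snapshot
    std::vector<std::uint64_t> survived;  // present in both
};

SnapshotDiff compareSnapshots(const Snapshot& before, const Snapshot& after)
{
    SnapshotDiff diff;
    for (const auto& entry : after) {
        if (before.count(entry.first))
            diff.survived.push_back(entry.first);
        else
            diff.added.push_back(entry.first);
    }
    for (const auto& entry : before) {
        if (!after.count(entry.first))
            diff.removed.push_back(entry.first);
    }
    return diff;
}
```

Objects that show up in "survived" over many snapshots are the ones that
were created early and never collected.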

> There are also plenty of objects that don't have constructors.

I don't understand how that would affect the memory profiler. Do you
mean that JSCell's constructor or "new" operator will not be called?

I think the first steps will be to have a minimal version of the
profiler in the JavaScript shell and a way to test it automatically,
and then to implement some of the size() functions in WebCore (images,
text elements, DOM elements) and expose the data to WebInspector.

Thanks,
Raul

On Wed, Feb 4, 2009 at 2:27 AM, Darin Adler <darin at apple.com> wrote:
> Some quick thoughts on your message.
>
> On Feb 3, 2009, at 1:07 PM, Raul wrote:
>
>> In WebCore the glue code is generated using python script from an IDL
>> file.
>
> A perl script.
>
>> The script can be changed to create ClassInfo structures that also contain
>> the sizeof().
>
> The sizeof on the DOM classes from WebCore won't include all the memory
> accounted for by the element. Many of these elements have auxiliary data
> structures such as vectors or hash maps.
>
>> Special cases that use large amount of memory like String, HashTables,
>> Arrays, Functions, Images, Canvas, CSS, DOM Elements can be added a size()
>> function.
>
> It's going to be challenging to write and maintain these functions. Also,
> the name "size()" won't be available, since it's already used in the DOM, so
> this will have to have a more specific name.
>
>> This function will be declared in the IDL file and the python script will
>> generate a special function in the JSObject glue code.
>
> The IDL file declares the public interface to a DOM class. It wouldn't make
> sense to add something internal like this to the IDL file as a new function.
> It might make sense to add something to the IDL file that's a property
> saying "has custom size function", but this would not be a function in the
> IDL file. It'd be a keyword modifying the class's interface definition.
>
>> When a new JSCell is created its CallStack can be recorded in a hashtable
>> for later use. The CallStack can be stored in a per thread global variable
>> as (WebKit is using a collector for each page, so the "new" operator also
>> have an ExecState argument that can be used to get the callstack). When it
>> is destroyed, the hashtable should be updated accordingly.
>
> This seems like it would be slow. It's important that we find a way to do
> this without slowing down normal operation. And even when the profiler is
> running, we need to make sure it doesn't make things unusably slow.
>
>> Each JSObject should have an unique id per session.
>
> I don't understand this. Where would this be stored?
>
>> Another issue in javascript is that objects defined by the user code do
>> not hold a ClassName. Their ClassInfo is simply "Object". Another virtual
>> function can be added on JSCell to return the actual type. In this case the
>> type can be the Function name that generated the object (javascript
>> constructor) or the FileName.js at LineNumber in case of anonymous functions.
>
> There are also plenty of objects that don't have constructors.
>
> Most of what you say here sounds fine. I'm not much of an expert on how our
> tools are done generally. It seems critical to do this project in a way that
> would produce results early. We don't want to hook up a lot of machinery
> before we are using it for anything.
>
> You should also think about how to test this and how we'd maintain its
> correctness in the future. It sounds like there would be a lot of
> handwritten code that could easily get out of sync with how much memory is
> actually used.
>
>    -- Darin
>
>