[webkit-dev] Proposal for refactoring DOM Storage

Jeremy Orlow jorlow at chromium.org
Wed May 13 19:44:35 PDT 2009


I've been researching, prototyping, and generally thinking about
https://bugs.webkit.org/show_bug.cgi?id=25376 for a while now.  I think I
now know what needs to be done and the least painful way to get there.  I've
written up a design doc which is available here:
http://docs.google.com/Doc?id=dhs4g97m_8cwths74m

If you'd like write permissions to it so you can add comments inline (via
Ctrl-M), shoot me an email.  If you'd rather reply inline via email, feel
free to do that as well.

J

===================================

WebCore DOM Storage Refactoring Design Doc

Overview and The Need:

See https://bugs.webkit.org/show_bug.cgi?id=25376 for the related bug.

The current design of DOM Storage (i.e. window.localStorage and
window.sessionStorage) within WebCore is fairly incompatible with
multi-process browsers like Chromium.  This can be fixed with a clean
frontend/backend split within DOM Storage, allowing multi-process browsers
to implement a proxy layer between the two, and having all the frontends
share one backend.  The design should not assume that pages within the same
origin are in the same process, that a cloned (see the SessionStorage spec)
top level browsing context will remain in the same process, or that all
children of a top level browsing context are in the same process.

Note that the current DOM Storage implementation is such that memory is
rarely ever reclaimed (currently only from SessionStorage and only when tabs
are closed).  The refactored backend should be designed so it's practical to
reclaim resources when tabs close, child processes crash, users navigate
away from pages, etc.  That said, actually reclaiming resources is
considered "future work".

High level plan:

There are 2 main classes that will be added to WebCore:
StorageBackend and StorageEventManager.  StorageBackend will be a singleton
that can be replaced with a proxy class.  The StorageEventManager is
instantiated in a way that can easily be overridden by a StorageBackend
proxy class.  Currently, the LocalStorage and SessionStorage classes keep
track of StorageAreas and own the syncing threads/queues.  This
functionality will be moved into the StorageBackend and the
StorageSyncManager (which is owned by the StorageBackend).  The StorageArea
classes' event dispatching will be moved into the StorageEventManager class
since the events may originate from another process via a proxy.
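
A rough sketch of the "singleton that can be replaced with a proxy" idea, written in Python for brevity rather than WebCore's C++.  The names set_backend and StorageBackendProxy.sent are hypothetical and only illustrate the shape; the proxy here records calls instead of doing real IPC.

```python
class StorageBackend:
    """Default in-process backend (illustrative only, not WebCore code)."""

    _instance = None  # the process-wide singleton

    def __init__(self):
        self._areas = {}

    @staticmethod
    def backend():
        # Lazily create the in-process backend unless something else
        # (e.g. a proxy) was installed first.
        if StorageBackend._instance is None:
            StorageBackend._instance = StorageBackend()
        return StorageBackend._instance

    @staticmethod
    def set_backend(replacement):
        # Hypothetical embedder hook: install a proxy in place of the default.
        StorageBackend._instance = replacement

    def create_local_storage(self, page_group, origin):
        return self._areas.setdefault((page_group, origin), {})


class StorageBackendProxy(StorageBackend):
    """Stand-in for a multi-process proxy; logs what it would send over IPC."""

    def __init__(self):
        super().__init__()
        self.sent = []

    def create_local_storage(self, page_group, origin):
        self.sent.append(("createLocalStorage", page_group, origin))
        return super().create_local_storage(page_group, origin)
```

Callers only ever go through StorageBackend.backend(), so a multi-process embedder could swap in its proxy once at startup without touching the frontend code.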

Since the Local and SessionStorage code is going to be more and more similar
as time goes on, some of the classes (like LocalStorageArea and
SessionStorageArea) will probably be combined and the behavior of the class
will be determined explicitly rather than via polymorphism.  For example,
the StorageArea will have a flag that says whether or not the in memory map
is backed by a database.
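
To illustrate "explicitly rather than via polymorphism", here is a small Python sketch of a single StorageArea class whose behavior is selected by a flag.  The attribute and method names (backed_by_database, keys_to_sync, the two factory methods) are assumptions made for the example.

```python
class StorageArea:
    """One class for both storage types; a flag replaces the subclasses."""

    def __init__(self, area_id, backed_by_database):
        self.area_id = area_id
        self.backed_by_database = backed_by_database
        self.items = {}
        self.keys_to_sync = set()

    @classmethod
    def create_local_storage_area(cls, area_id):
        # localStorage: the in-memory map is persisted to a database.
        return cls(area_id, backed_by_database=True)

    @classmethod
    def create_session_storage_area(cls, area_id):
        # sessionStorage: memory only in this sketch.
        return cls(area_id, backed_by_database=False)

    def set_item(self, key, value):
        self.items[key] = value
        if self.backed_by_database:
            # Only database-backed areas track keys that need syncing.
            self.keys_to_sync.add(key)
```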

The actual work will be split into as many patches as possible so the work
is easy to verify as correct.  The performance of Local/SessionStorage
should not be significantly affected at any point.  For example, there are
no new lookup tables required except within Proxy classes.

Stage 1: Move LocalStorage and SessionStorage logic into Backend and
EventManager.

Stage 2: Create a StorageSyncManager class that abstracts all the
synchronization work.  Combine LocalStorageArea and SessionStorageArea.
Rename LocalStorageThread/Task to StorageSyncThread/Task.

Stage 3: Add hooks for multi-process setups.


Traces through the new design:

To help make things clear, here's what would
happen in a single-process environment for a page that simply does the
following: window.localStorage.setItem('key', 'value')

*Javascript bindings convert window.localStorage to DOMWindow.localStorage()*

DOMWindow.localStorage()
  // StorageBackend::backend() is a singleton that returns a proxy for
  // multi-process setups.
  storage_area = StorageBackend::backend()->createLocalStorage(page_group, security_origin)
  // normal stuff + registers the storage area with the event manager
  storage = Storage::create(frame, storage_area)

// In a multi-process environment, parts of this are on the backend and
// parts are on the frontend
StorageBackend.createLocalStorage(page_group, security_origin)
  id = createUniqueIdFromPageGroupAndSecurityOrigin(page_group, security_origin)
  if storageAreaMap.contains(id):
    storage_area = storageAreaMap.get(id)
  else:
    // constructor just initializes instance variables
    storage_area = StorageArea::createLocalStorageArea(id, eventDispatcher, storageSyncManager)
    // imports are scheduled before writes to disk
    storageSyncManager.scheduleImport(storage_area)
    storageAreaMap.set(id, storage_area)
  return storage_area.release()


*Javascript bindings call setItem on the returned object*

Storage.setItem(key, value, exception_code_out)
  return m_storageArea.setItem(m_frame, key, value, exception_code_out)

// In a multi-process environment, DONE ON THE BACKEND
StorageArea.setItem(frame, key, value, exception_code_out)
  // Abstracting out the manager allows implementations like Chromium
  // freedom to use its existing threads/queues
  storageSyncManager.blockUntilImportComplete()
  old_value = storageMap.get(key)  // remember the previous value for the event
  storageMap.set(key, value)
  dirty = true
  setOfKeysToSync.add(key)
  storageSyncManager.scheduleToWrite(this)
  eventDispatcher.event(type == LocalStorageType, key, old_value, value, url, frame)

// IMPORTANT: for implementations that don't implement window proxies per
// the spec, frame will be null...this is why storage instances must register
// themselves with the event dispatcher
EventDispatcher::event(broadcast, key, old_value, new_value, url, frame)
  for storage in registeredStorageInstances:
    page = storage.getFrame().page()
    // do the existing event dispatch stuff
    // an if statement based on the storage type decides if we're
    // dispatching to all pages in this page's page group or not
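
The whole single-process trace can be approximated by a small runnable Python sketch.  This is only an illustration under simplifying assumptions: the sync manager just logs what it would schedule, the id is a plain (page_group, origin) tuple, and the dispatcher delivers to every registered Storage on the same area instead of applying the real page-group rules.  All snake_case names are made up for the example.

```python
class StorageSyncManager:
    """Logs scheduled work instead of running a real sync thread."""

    def __init__(self):
        self.log = []

    def schedule_import(self, area):
        self.log.append(("import", area.area_id))

    def block_until_import_complete(self):
        pass  # nothing to wait for in this in-memory sketch

    def schedule_to_write(self, area):
        self.log.append(("write", area.area_id))


class EventDispatcher:
    def __init__(self):
        self.registered = []

    def register(self, storage):
        self.registered.append(storage)

    def event(self, area, key, old_value, new_value, url):
        # Simplification: notify every registered Storage sharing the area.
        for storage in self.registered:
            if storage.area is area:
                storage.received.append((key, old_value, new_value, url))


class StorageArea:
    def __init__(self, area_id, dispatcher, sync_manager):
        self.area_id = area_id
        self.dispatcher = dispatcher
        self.sync_manager = sync_manager
        self.items = {}
        self.keys_to_sync = set()

    def set_item(self, key, value, url):
        self.sync_manager.block_until_import_complete()
        old_value = self.items.get(key)
        self.items[key] = value
        self.keys_to_sync.add(key)
        self.sync_manager.schedule_to_write(self)
        self.dispatcher.event(self, key, old_value, value, url)


class StorageBackend:
    def __init__(self):
        self.sync_manager = StorageSyncManager()
        self.dispatcher = EventDispatcher()
        self.areas = {}

    def create_local_storage(self, page_group, origin):
        area_id = (page_group, origin)
        area = self.areas.get(area_id)
        if area is None:
            area = StorageArea(area_id, self.dispatcher, self.sync_manager)
            self.sync_manager.schedule_import(area)  # import before any write
            self.areas[area_id] = area
        return area


class Storage:
    """Frontend object; registers itself so it can receive events."""

    def __init__(self, backend, area, url):
        self.area = area
        self.url = url
        self.received = []
        backend.dispatcher.register(self)

    def set_item(self, key, value):
        self.area.set_item(key, value, self.url)


backend = StorageBackend()
origin = "http://example.com"
tab_a = Storage(backend, backend.create_local_storage("default", origin), origin + "/a")
tab_b = Storage(backend, backend.create_local_storage("default", origin), origin + "/b")
tab_a.set_item("key", "value")
```

Both tabs end up sharing one StorageArea, the import is logged once before the write, and tab_b's received list holds the resulting event.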

Changes for multi-process environments:

There are comments in the above for
the 2 bits of logic that'd happen in the Backend.  Basically anything having
to do with the actual storage data and syncing/importing happens in the
backend process.  Anything having to do with event dispatching happens in
the frontend process.  The StorageBackendProxy will keep track of frontend
process objects (like StorageAreaProxies and the StorageEventManager) and
will handle routing across the IPC layers.  The backend will maintain a
StorageEventManager for each origin/groupName combination and will route
events sent to the frontend process to the proper manager.
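
One possible shape for the event routing, sketched in Python.  It assumes the table keyed by origin/groupName lives in the frontend-side StorageBackendProxy and that an IPC message arrives as a plain dict; neither detail is settled by this doc, and the method names are hypothetical.

```python
class StorageEventManager:
    """Frontend-side manager; records events instead of firing DOM events."""

    def __init__(self):
        self.delivered = []

    def dispatch(self, key, old_value, new_value, url):
        self.delivered.append((key, old_value, new_value, url))


class StorageBackendProxy:
    def __init__(self):
        # One event manager per (origin, group_name) combination.
        self._managers = {}

    def event_manager_for(self, origin, group_name):
        route = (origin, group_name)
        if route not in self._managers:
            self._managers[route] = StorageEventManager()
        return self._managers[route]

    def on_ipc_event(self, message):
        # Route an event that came from the backend process to the manager
        # for its origin/groupName; drop it if nothing here is interested.
        manager = self._managers.get((message["origin"], message["group_name"]))
        if manager is None:
            return False
        manager.dispatch(message["key"], message["old_value"],
                         message["new_value"], message["url"])
        return True
```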

Future Work:

Quota support
DocumentEvent.createEvent support
  (http://www.w3.org/TR/DOM-Level-3-Events/events.html#Events-DocumentEvent-createEvent)
Better resource reclamation:
  Session storage written to disk when unused for a while (for example,
  early on in the history for long lived tabs)
  Evicting localStorage when low on memory and page is not active
  etc
Proxy classes doing caching