[webkit-gtk] WebsiteDataStore API proposal for handling data

Carlos Garcia Campos cgarcia at igalia.com
Mon Jun 22 03:29:41 PDT 2015

As I said in the other thread, the WebsiteDataStore class also allows
retrieving the origins/domains that have website data and deleting
website data for any particular origin/domain.

The internal API uses the same methods for all kinds of data (local
storage, disk cache, memory cache, etc.), using flags to specify the
kind of data you are interested in. So, it's just 3 methods:

void fetchData(WebsiteDataTypes, std::function<void (Vector<WebsiteDataRecord>)> completionHandler);
void removeData(WebsiteDataTypes, std::chrono::system_clock::time_point modifiedSince, std::function<void ()> completionHandler);
void removeData(WebsiteDataTypes, const Vector<WebsiteDataRecord>&, std::function<void ()> completionHandler);

This requires a WebsiteDataRecord struct to know which type each result
refers to. We could do something similar by adding a boxed type for the
WebsiteDataRecord struct and just two methods, fetch_data and remove_data
(I wouldn't expose remove_modified_since for now). This API makes it
very easy to add new data types, since it would only be a matter of
adding a new flag, and it could be more efficient when getting or
removing different kinds of data at the same time. However, I think this
kind of single API for everything is usually more confusing and more
difficult to use (and it requires the extra boxed type), and operations
on website data are not frequent enough to need optimizing. So, an
alternative could be to use a different method for every kind of data.
For example:

void
webkit_website_data_store_get_origins_with_local_storage (WebKitWebsiteDataManager *manager,
                                                          GCancellable             *cancellable,
                                                          GAsyncReadyCallback       callback,
                                                          gpointer                  user_data);
GList *
webkit_website_data_store_get_origins_with_local_storage_finish (WebKitWebsiteDataManager *manager,
                                                                 GAsyncResult             *result,
                                                                 GError                  **error);
void
webkit_website_data_store_get_domains_with_plugin_data (WebKitWebsiteDataManager *manager,
                                                        GCancellable             *cancellable,
                                                        GAsyncReadyCallback       callback,
                                                        gpointer                  user_data);
gchar **
webkit_website_data_store_get_domains_with_plugin_data_finish (WebKitWebsiteDataManager *manager,
                                                               GAsyncResult             *result,
                                                               GError                  **error);
void
webkit_website_data_store_get_origins_with_memory_cache (WebKitWebsiteDataManager *manager,
                                                         GCancellable             *cancellable,
                                                         GAsyncReadyCallback       callback,
                                                         gpointer                  user_data);
GList *
webkit_website_data_store_get_origins_with_memory_cache_finish (WebKitWebsiteDataManager *manager,
                                                                GAsyncResult             *result,
                                                                GError                  **error);
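
To illustrate how a start/finish pair like the ones above is meant to be used, here is a minimal sketch in plain C. It is not WebKitGTK or GIO code: every Demo* and demo_* name is hypothetical, the GCancellable and GError parameters are left out, and the callback is invoked inline rather than later from a main loop, so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, NOT WebKitGTK or GLib types. */
typedef struct {
    const char **origins;   /* NULL-terminated list owned by the manager */
} DemoManager;

typedef struct {
    const char **origins;   /* result carried from the operation to _finish */
} DemoResult;

typedef void (*DemoReadyCallback)(DemoManager *manager,
                                  DemoResult  *result,
                                  void        *user_data);

/* Start half. A real GIO-style method would return immediately and run the
 * callback later from the main loop; this sketch simply calls it inline. */
void demo_get_origins_with_local_storage(DemoManager      *manager,
                                         DemoReadyCallback callback,
                                         void             *user_data)
{
    assert(manager != NULL && callback != NULL);
    DemoResult result = { manager->origins };
    callback(manager, &result, user_data);
}

/* Finish half. Called from inside the callback to collect the result. */
const char **demo_get_origins_with_local_storage_finish(DemoManager *manager,
                                                        DemoResult  *result)
{
    assert(manager != NULL && result != NULL);
    return result->origins;
}

/* Example callback: stores the number of returned origins in *user_data. */
void demo_count_origins_cb(DemoManager *manager, DemoResult *result, void *user_data)
{
    size_t *count = (size_t *)user_data;
    const char **origins = demo_get_origins_with_local_storage_finish(manager, result);

    *count = 0;
    for (size_t i = 0; origins != NULL && origins[i] != NULL; i++)
        (*count)++;
}
```

With one such pair per data type, the caller always knows the element type of the result (origins or domains) from the method name alone, at the cost of repeating the same boilerplate for every type.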

There would be one async method for every flag:

enum WebsiteDataTypes {
    WebsiteDataTypeCookies = 1 << 0,
    WebsiteDataTypeDiskCache = 1 << 1,
    WebsiteDataTypeMemoryCache = 1 << 2,
    WebsiteDataTypeOfflineWebApplicationCache = 1 << 3,
    WebsiteDataTypeSessionStorage = 1 << 4,
    WebsiteDataTypeLocalStorage = 1 << 5,
    WebsiteDataTypeWebSQLDatabases = 1 << 6,
    WebsiteDataTypeIndexedDBDatabases = 1 << 7,
    WebsiteDataTypeMediaKeys = 1 << 8,
    WebsiteDataTypePlugInData = 1 << 9,
};

Except for cookies since we already have the WebKitCookieManager.
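
For comparison, the reason a flags-based API can handle several kinds of data in one call is that values of the form 1 << n combine with bitwise OR. A minimal sketch in plain C follows; the DEMO_* names, the DemoRecord struct and demo_count_matching() are hypothetical and only mimic the layout of the internal enum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag names; the values follow the 1 << n layout shown above. */
typedef enum {
    DEMO_DATA_DISK_CACHE    = 1 << 1,
    DEMO_DATA_MEMORY_CACHE  = 1 << 2,
    DEMO_DATA_LOCAL_STORAGE = 1 << 5
} DemoDataTypes;

/* One record: an origin plus the set of data types stored for it. */
typedef struct {
    const char *origin;
    unsigned    types;   /* bitwise OR of DemoDataTypes values */
} DemoRecord;

/* Counts the records that hold at least one of the requested data types. */
size_t demo_count_matching(const DemoRecord *records, size_t n, unsigned wanted)
{
    assert(records != NULL || n == 0);

    size_t matches = 0;
    for (size_t i = 0; i < n; i++) {
        if (records[i].types & wanted)
            matches++;
    }
    return matches;
}
```

A caller interested in both caches would pass DEMO_DATA_DISK_CACHE | DEMO_DATA_MEMORY_CACHE, which is also why each result needs a record carrying its own type mask.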

Methods that work with domains will return a char** like the cookie
manager does, and the ones working with origins will return a GList of
WebKitSecurityOrigin. This means we also need to expose SecurityOrigin
in the API. It could probably be a boxed type.

The methods to delete data would be similar: one for every type,
receiving a list of domains or origins, and another one for every type
to remove all data (like the cookie manager). I'm not sure if the remove
methods should also use the async pattern. The internal API has a
completion handler, but only as a notification; it doesn't say whether
the removal was successful. Users could use the async ready callback
just to update the UI after a removal, for example. The cookie manager
doesn't have the callback either, so not using the async pattern would
be consistent with it.

An alternative approach could be to group some of the data types
together, but using an enum instead of flags, so you would still need to
call the method for every type, for example:

typedef enum {
} WebKitWebsiteDataCacheType;

void
webkit_website_data_store_get_origins_with_cache (WebKitWebsiteDataManager  *manager,
                                                  WebKitWebsiteDataCacheType cache_type,
                                                  GCancellable              *cancellable,
                                                  GAsyncReadyCallback        callback,
                                                  gpointer                   user_data);
GList *
webkit_website_data_store_get_origins_with_cache_finish (WebKitWebsiteDataManager *manager,
                                                         GAsyncResult             *result,
                                                         GError                  **error);

get_origins_with_cache sounds a bit weird to me; maybe for caches we
could use get_origins_in_cache instead.

Other types that could be grouped:

 - databases: WebSQL, IndexedDB
 - storage data: local storage, session storage.
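
As a rough sketch of what grouping with an enum implies, the plain C below maps one public enum value to exactly one internal flag, so a caller who wants several cache types has to make one call per type. The enum values, flag names and functions are all hypothetical; the proposal above does not list the members of WebKitWebsiteDataCacheType.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical public enum values; not taken from the proposal. */
typedef enum {
    DEMO_CACHE_TYPE_MEMORY,
    DEMO_CACHE_TYPE_DISK,
    DEMO_CACHE_TYPE_OFFLINE_APPLICATION
} DemoCacheType;

/* Hypothetical internal flag bits in the 1 << n style of the internal API. */
enum {
    DEMO_FLAG_DISK_CACHE                = 1 << 1,
    DEMO_FLAG_MEMORY_CACHE              = 1 << 2,
    DEMO_FLAG_OFFLINE_APPLICATION_CACHE = 1 << 3
};

/* Maps exactly one enum value to exactly one internal flag.
 * Returns 0 for a value outside the enum. */
unsigned demo_cache_type_to_flag(DemoCacheType type)
{
    switch (type) {
    case DEMO_CACHE_TYPE_MEMORY:
        return DEMO_FLAG_MEMORY_CACHE;
    case DEMO_CACHE_TYPE_DISK:
        return DEMO_FLAG_DISK_CACHE;
    case DEMO_CACHE_TYPE_OFFLINE_APPLICATION:
        return DEMO_FLAG_OFFLINE_APPLICATION_CACHE;
    }
    return 0;
}

/* A caller covering several cache types loops, one lookup per type,
 * because an enum value cannot be OR-ed with another the way flags can. */
unsigned demo_flags_for_types(const DemoCacheType *types, size_t n)
{
    assert(types != NULL || n == 0);

    unsigned flags = 0;
    for (size_t i = 0; i < n; i++)
        flags |= demo_cache_type_to_flag(types[i]);
    return flags;
}
```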

And a completely different approach could be to expose different
managers for every data type or group of data types, similar to the
current cookie manager. For example WebKitCacheDataManager,
WebKitDatabaseDataManager, WebKitStorageDataManager,
WebKitPluginDataManager, etc. The advantage of this approach is that we
could add specific methods to any manager that don't make sense for the
others. Those managers could be accessed through
WebKitWebsiteDataManager and created on demand in the getters (we could
move the cookie manager there too, for consistency).
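
Creating the managers on demand in the getters could look roughly like the plain C sketch below. The Demo* types and functions are hypothetical stand-ins (a real implementation would presumably use GObject and reference counting); the point is only that each sub-manager is allocated the first time its getter is called and reused afterwards.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sub-managers; the members are placeholders. */
typedef struct { int unused; } DemoCacheManager;
typedef struct { int unused; } DemoStorageManager;

/* Hypothetical parent object owning the lazily created sub-managers. */
typedef struct {
    DemoCacheManager   *cache_manager;     /* NULL until first requested */
    DemoStorageManager *storage_manager;   /* NULL until first requested */
} DemoWebsiteDataManager;

/* Getter that creates the cache manager on first use and then reuses it.
 * Returns NULL only if the allocation fails. */
DemoCacheManager *demo_get_cache_manager(DemoWebsiteDataManager *manager)
{
    assert(manager != NULL);
    if (manager->cache_manager == NULL)
        manager->cache_manager = (DemoCacheManager *)calloc(1, sizeof(DemoCacheManager));
    return manager->cache_manager;
}

/* Same pattern for the storage manager. */
DemoStorageManager *demo_get_storage_manager(DemoWebsiteDataManager *manager)
{
    assert(manager != NULL);
    if (manager->storage_manager == NULL)
        manager->storage_manager = (DemoStorageManager *)calloc(1, sizeof(DemoStorageManager));
    return manager->storage_manager;
}

/* Releases whichever sub-managers were actually created. */
void demo_website_data_manager_clear(DemoWebsiteDataManager *manager)
{
    assert(manager != NULL);
    free(manager->cache_manager);
    free(manager->storage_manager);
    manager->cache_manager = NULL;
    manager->storage_manager = NULL;
}
```

An application that only ever touches caches never pays for the other managers, and specific methods can be added to one manager without affecting the rest.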

So, in summary, I think we have at least the following options:

 a) Use a single API for all data types, with flags, like the internal API
 b) Use a different method for every data type
 c) Use a different method for every group of data types
 d) Use a manager class for every group of data types and move
    WebKitCookieManager too.

Any other option? Anything else I haven't considered? Which one do you
think is better?

And whatever option we choose, we also need to decide whether to use
async callbacks for remove methods or not and how to expose the

Carlos Garcia Campos