[webkit-dev] Security Origins

Mon Jun 1 20:29:28 PDT 2009

On Mon, Jun 1, 2009 at 7:35 PM, Sam Weinig <sam.weinig at gmail.com> wrote:

>
>
> On Mon, Jun 1, 2009 at 3:50 PM, Jeremy Orlow <jorlow at chromium.org> wrote:
>
>> I have 2 questions about SecurityOrigins.
>>
>>
>> First of all, in SecurityOrigin::databaseIdentifier() (in
>> http://trac.webkit.org/browser/trunk/WebCore/page/SecurityOrigin.h) the
>> following comment appears: "Serialize the security origin for storage in the
>> database. This format is deprecated and should be used only for
>> compatibility with old databases; use toString() and createFromString()
>> instead."  This was done in http://trac.webkit.org/changeset/32597
>>
>> Despite the comments in the change log, I don't fully understand what the
>> intention here is.  DatabaseTracker and LocalStorage still use
>> databaseIdentifier() which makes sense since, unless I'm totally missing
>> something, the toString() format produces file names that are invalid on
>> most file systems.  At the same time, databaseIdentifier omits some of the
>> logic found in toString (which might be the cause of bugs like
>> https://bugs.webkit.org/show_bug.cgi?id=20701 ...I haven't confirmed
>> yet).
>>
>> So, maybe instead of databaseIdentifier() being "deprecated", it should be
>> updated to be more like a toString that uses file system safe characters?
>>
>
> It is deprecated in the sense that only Database code should use it.  We
> can't make Database code use toString() because there are existing clients
> that depend on the identifier to remain the same.
>

I don't think this is an issue (will comment further later on), but does
this mean there's no migration strategy _if_ it was found that that names
needed to change in the future?  In other words, does this mean that the
current format will need to be grandfathered in "forever"?

>
>
>> In addition, the creation of SecurityOrigin objects doesn't quite seem
>> right.  There are many places (like in DatabaseTracker) where security
>> origins are created based on the file, port, and protocol.  This is fine
>> when all you're doing is .equals (which the HashMap's
>> HashArg=SecurityOriginHash prarmeter does for you), but it seems like it
>> could be inefficient (many SecurityOrigin objects describing the same
>> origin) and could potentially lead to bugs (since not all of the properties
>> are serialized during toString()/databaseIdentifier()).
>>
>> It's surprising to me that there isn't one central store of SecurityOrigin
>> objects (maybe with weak references?) to eliminate the possibility of 2
>> SecurityOrigin objects describing the same origin.
>>
>
> I suspect SecurityOrigins can be optimized, but as of yet they have not
> appeared on profiles.  I would like to improve SecurityOrigins to better
> represent the concepts of effective script origin and origins representing
> resources more effectively.
>
>
>>
>>
>> In general, I suppose I'm just trying to understand the thought behind
>> SecurityOrigin objects and how they're used in the source code...because it
>> doesn't seem very natural to me.
>>
>
> SecurityOrigin was created to cleanup the way we do same-origin checks in
> the JavaScript bindings, and as such, carry some of that baggage (eg. domain
> relaxing).  The main design constraint then was that it was a cheap compare
> for two origins representing the same document (pointer compare) as that is
> common case.  The code evolved from there.
>

Makes sense.  Note that SecurityOrigin::equal does not currently do a fast
path check of "this == other" (to just do a pointer compare as you
mentioned) so this is probably low hanging fruit to make things a bit
faster.  (Though, as you said, it's not showing up on current profiles...so
it's probably not a huge deal.)

On Mon, Jun 1, 2009 at 7:09 PM, Adam Barth <abarth at webkit.org> wrote:

> Hi Jeremy,
>
> Some of this has evolved and could likely be cleaned up.
>
> 2009/6/1 Jeremy Orlow <jorlow at chromium.org>:
> > First of all, in SecurityOrigin::databaseIdentifier() (in
> http://trac.webkit.org/browser/trunk/WebCore/page/SecurityOrigin.h) the
> following comment appears: "Serialize the security origin for storage in the
> database. This format is deprecated and should be used only for
> compatibility with old databases; use toString() and createFromString()
> instead."  This was done in http://trac.webkit.org/changeset/32597
>
> There are a bunch of problems with
> SecurityOrigin::databaseIdentifier(), not the least of which is that
> "_" can't really be used as a delimiter because some intranet host
> names contain the character in their names.  I'd rather we didn't used
> databaseIdentifier() and had a single string representation we used
> everywhere.

If this is the only issue, the parsing code could work around it.  There are
3 parts to the identifier: the protocol (should never have a _ in it,
right?), the domain, and the port (once again, should never have a _).  It
seems as though the parsing code should look for the first _, the last _,
and then assume everything in the middle is the domain.

>
> > Despite the comments in the change log, I don't fully understand what the
> intention here is.  DatabaseTracker and LocalStorage still use
> databaseIdentifier() which makes sense since, unless I'm totally missing
> something, the toString() format produces file names that are invalid on
> most file systems.
>
> I don't see why these string representations need to be used for file
> names.  The advantages of the toString() serialization are (1) we need
> it for postMessage and cross-origin XMLHttpRequest, and (2) we already
> have an awesome parser for it called KURL.

databaseIdentifier() is currently used in the meta-data table (used for
quotas) of HTML 5 databases and the path of LocalStorage SQLite databases.
 I suppose we could run the toString() representation through a function to
make it safe for file systems, but given that we'll need to support the
current naming scheme (forever?) I don't really see what the point would
be...especially given my parsing suggestion above.

>
> > At the same time, databaseIdentifier omits some of the logic found in
> toString (which might be the cause of bugs like
> https://bugs.webkit.org/show_bug.cgi?id=20701 ...I haven't confirmed yet).
>
> I'll look at that bug once I have Internet access (currently on the train).

Don't bother.  This is the localStorage file:// bug you closed for me
tonight.  :-)

>
> > So, maybe instead of databaseIdentifier() being "deprecated", it should
> be updated to be more like a toString that uses file system safe characters?
>
> Late time I looked into this, I was told we couldn't change
> databaseIdentifier() because users had files stored on their computer
> that depended on the serialization.
>
> > In addition, the creation of SecurityOrigin objects doesn't quite seem
> right.  There are many places (like in DatabaseTracker) where security
> origins are created based on the file, port, and protocol.  This is fine
> when all you're doing is .equals (which the HashMap's
> HashArg=SecurityOriginHash prarmeter does for you), but it seems like it
> could be inefficient (many SecurityOrigin objects describing the same
> origin) and could potentially lead to bugs (since not all of the properties
> are serialized during toString()/databaseIdentifier()).
>
> I suspect we can improve this.
>
> > It's surprising to me that there isn't one central store of
> SecurityOrigin objects (maybe with weak references?) to eliminate the
> possibility of 2 SecurityOrigin objects describing the same origin.
>
> The problem is that SecurityOrigin objects are mutable.  For example,
> a SecurityOrigin mutates if you set its document.domain property
> (called "domain").  Also, certain SecurityOrigin objects might be
> granted certain privileges, and we might not be able to do that if
> everyone shared the same object.
>
> > In general, I suppose I'm just trying to understand the thought behind
> SecurityOrigin objects and how they're used in the source code...because it
> doesn't seem very natural to me.
>
> It might be productive to invent a class "StorageOrigin" or some such
> that is immutable, and therefore can be shared and moved across
> threads.  That way the storage system wouldn't be tempted to use
> SecurityOrigin features that don't make sense for storage (e.g.,
> grantLoadLocalResources).

Doesn't seem like a top priority, but maybe it's worth filing a low priority
bug for it anyway.  Although they are 2 somewhat distinct use cases and I
see the possibility for misuse and bad assumptions, I'm not terribly worried
about it.

Thanks for the feedback!
Jeremy
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.webkit.org/pipermail/webkit-dev/attachments/20090601/c5d8f880/attachment-0001.html>