localStorage quota limit
All -

I've been discussing the localStorage quota limit over on this bug with Jeremy Orlow:

https://bugs.webkit.org/show_bug.cgi?id=31791

To recap from the discussions on that bug:

Jeremy has implemented the localStorage quota on the latest WebKit builds. This caused my usage of localStorage to fail, because as a JS programmer, I assumed that 5MB meant '5 million characters' of storage. This assumption holds true on Firefox 3.5.x+ and IE8, but fails on WebKit, since it stores things into localStorage as UTF-16.

One option we discussed on that bug was getting the spec folks to alter the spec in one of three ways:

- specify the quota in terms of 'characters' (or Strings, or whatever), thereby abstracting away the encoding problem entirely.
- specify UTF-8 so that 'MB = characters'.
- specify a JS API through which the encoding could be specified.

Jeremy wasn't too taken with any of these proposals, and in any case they probably need to be taken up with the W3C group defining this stuff, not here.

In any case, as Jeremy states in comment #5 of the bug report, "the spec's mentioning of 5mb is really just an example". And when I filed this bug on Mozilla's Bugzilla tracker:

https://bugzilla.mozilla.org/show_bug.cgi?id=461684

another comment there points out the same thing. (Note that this bug was originally filed to see if the Mozilla guys would raise their quota to 10MB to match IE8 and, since they don't use a double-byte encoding, I was really asking for '10 million characters' there :-)).

Given that, an increase from 5MB to 10MB would solve my immediate problem. And, without going back to the spec folks, I'm not sure that much more can be done here.

Jeremy wanted me to post to get the discussion started (and hopefully attain some consensus :-) ), so let's discuss :-).

Thanks in advance!

Cheers,

- Bill
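For concreteness, here is the arithmetic behind the '5MB vs. 5 million characters' mismatch, as a minimal sketch; the helper functions are invented for illustration and are not taken from WebKit or any other browser:

```javascript
// Rough byte cost of a JS string under the two storage encodings
// discussed in this thread. Helper names are hypothetical.

// UTF-16: every code unit costs 2 bytes, so byte count is simply
// twice the string's length.
function utf16Bytes(s) {
  return s.length * 2;
}

// UTF-8: 1 byte for U+0000-U+007F, 2 bytes up to U+07FF, 3 bytes for
// the rest of the BMP (including lone surrogates, if encoded naively),
// and 4 bytes for a valid surrogate pair (one astral code point).
function utf8Bytes(s) {
  let bytes = 0;
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c < 0x80) bytes += 1;
    else if (c < 0x800) bytes += 2;
    else if (c >= 0xd800 && c <= 0xdbff && i + 1 < s.length &&
             s.charCodeAt(i + 1) >= 0xdc00 && s.charCodeAt(i + 1) <= 0xdfff) {
      bytes += 4; // valid surrogate pair
      i++;        // consume the low surrogate too
    } else bytes += 3;
  }
  return bytes;
}

const ascii = "x".repeat(1000);
console.log(utf16Bytes(ascii)); // 2000
console.log(utf8Bytes(ascii));  // 1000
```

So an ASCII-only payload that fits in a 5MB quota under UTF-8 consumes twice that under UTF-16, which is exactly why the same data hit WebKit's limit but not Firefox's or IE8's.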
Could WebKit configure the localStorage database(s) to use UTF-8 text encoding for string values?

On Sun, Nov 29, 2009 at 8:38 AM, William Edney <bedney@technicalpursuit.com> wrote:
_______________________________________________ webkit-dev mailing list webkit-dev@lists.webkit.org http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
This would probably be a performance win, since it would reduce the amount of disk I/O. (Note, it doesn't mean that 5 million characters could be stored, since a UTF-8 character might be multi-byte.)

-Darin

On Wed, Dec 2, 2009 at 9:30 AM, Michael Nordman <michaeln@google.com> wrote:
On Dec 2, 2009, at 9:49 AM, Darin Fisher wrote:
This would probably be a performance win since it would reduce the amount of disk i/o.
(Note, it doesn't mean that 5 million characters could be stored since a UTF-8 character might be multi-byte.)
Currently the database can store invalid UTF-16 as well as valid UTF-16. Conversion from UTF-16 to UTF-8 might not be able to preserve invalid UTF-16 sequences. I don't understand how the other platforms handle this. Perhaps the specification needs to be clearer on whether invalid UTF-16 is allowed.

-- Darin
+ hixie

I don't know as much about encoding types as I should. How can you construct invalid UTF-16 sequences? What does Firefox or IE do with these when put into LocalStorage?

One thing I just considered is that our LocalStorage implementation loads the entire LocalStorage database for an origin into memory all at once. It does this on a background thread, but it's only loaded on the first access to |window.localStorage|, and we block on it finishing loading when you first try to use it. So if |alert(window.localStorage.foo);| is the first usage of LocalStorage, it (and thus the main thread) will block on the whole thing being loaded into memory. This can certainly be optimized, but I'm pointing this out because the bigger the DB is, the worse the worst-case load time is. (Making this better is on my todo list, but not as near the top as I'd like.)

In case you're wondering, you can mitigate this by doing 'var storage = window.localStorage;' as early as possible in your script and waiting as long as possible to actually use window.localStorage.

J

On Wed, Dec 2, 2009 at 10:01 AM, Darin Adler <darin@apple.com> wrote:
On Dec 2, 2009, at 10:48 AM, Jeremy Orlow wrote:
How can you construct invalid UTF-16 sequences?
IE chokes ("invalid procedure call or argument") and Firefox mangles the data for LocalStorage (but works fine for SessionStorage).

On Wed, Dec 2, 2009 at 10:54 AM, Darin Adler <darin@apple.com> wrote:
On Dec 2, 2009, at 10:48 AM, Jeremy Orlow wrote:
How can you construct invalid UTF-16 sequences?
http://unicode.org/faq/utf_bom.html#utf16-7
-- Darin
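As a minimal sketch of what that FAQ entry describes: JavaScript strings are sequences of arbitrary 16-bit code units, so an invalid sequence is trivial to construct by writing a surrogate code unit without its partner:

```javascript
// A high surrogate (U+D800-U+DBFF) is only valid UTF-16 when followed
// by a low surrogate (U+DC00-U+DFFF). JavaScript lets you create one
// on its own.
const lone = "\uD800";        // unpaired high surrogate: invalid UTF-16
console.log(lone.length);     // 1

const valid = "\uD83D\uDE00"; // properly paired surrogates: U+1F600
console.log(valid.length);    // 2 code units, but one code point

// Passing a string like `lone` to localStorage.setItem() is what
// exposed the cross-browser differences described in this thread
// (IE throws, Firefox mangles, WebKit at the time stored it as-is).
```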
Arguably, seems like a bug that invalid string values are let thru the door to start with?

Since users can't effectively store invalid UTF-16 character sequences in FF or IE, is there really any downside to using UTF-8 text encoding in WebKit? @Jeremy, this isn't a matter of letting users choose the text encoding; this is entirely an implementation detail of WebStorage.

Downsides:

* The code change to get UTF-8 by default in new databases: tiny.
* Migrating pre-existing databases to the new encoding: somewhat of a hassle. But maybe it doesn't need to be done; pre-existing files could continue to use UTF-16, while newly created DBs could use UTF-8 (the text encoding is chosen at database creation time and stuck that way forever thereafter).
* It's possible that some app is already depending on the ability to store invalid character sequences (on the iPhone, say), and this would be a breaking change for that app.

The preload-everything characteristic is a separate issue altogether.

On Wed, Dec 2, 2009 at 11:15 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
On Dec 2, 2009, at 12:06 PM, Michael Nordman wrote:
Arguably, seems like a bug that invalid string values are let thru the door to start with?
ECMAScript Strings are essentially sequences of arbitrary 16-bit values. Sometimes Web apps take advantage of this to use a String as a hacky way to represent binary data. I don't think we should reject such strings arbitrarily.
Since users can't effectively store invalid UTF16 character sequences in FF or IE,
I tend to think this is a bug in FF/IE. Nothing in the spec gives license to reject particular Strings.
is there really any downside to using UTF-8 text encoding in WebKit? @Jeremy, this isn't a matter of letting users choose the text encoding, this is entirely an implementation detail of WebStorage.
I think it would be fine to use a more compact encoding opportunistically, as long as we can still handle an arbitrary JavaScript String. Perhaps we should use UTF-8 if and only if the conversion succeeds, or perhaps even use Latin1 as the alternative.
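The "use UTF-8 if and only if the conversion succeeds" test amounts to scanning for unpaired surrogates. A minimal sketch of such a check (the function name is invented for illustration; modern JS engines later standardized the equivalent as String.prototype.isWellFormed):

```javascript
// Returns true when `s` contains no unpaired surrogates, i.e. when a
// lossless conversion to standard UTF-8 is possible.
function isWellFormedUTF16(s) {
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c >= 0xdc00 && c <= 0xdfff) return false; // low surrogate first
    if (c >= 0xd800 && c <= 0xdbff) {             // high surrogate...
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next < 0xdc00 || next > 0xdfff) return false; // ...unpaired
      i++; // skip the low surrogate that completes the pair
    }
  }
  return true;
}

console.log(isWellFormedUTF16("plain text"));   // true
console.log(isWellFormedUTF16("\uD83D\uDE00")); // true (valid pair)
console.log(isWellFormedUTF16("\uD800"));       // false (lone high surrogate)
```

A store could run this check on write and pick UTF-8 when it passes, falling back to UTF-16 (or Latin-1) otherwise.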
On Wed, 2 Dec 2009, Michael Nordman wrote:
Arguably, seems like a bug that invalid string values are let thru the door to start with?
Yeah, I should make the spec throw SYNTAX_ERR if there are any unpaired surrogates, the same way WebSocket does. I'll file a bug.

-- Ian Hickson U+1047E )\._.,--....,'``. fL http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
What about Maciej's comment? JS strings are often used to store binary values. Obviously, if people stick to octets, then it should be fine, but perhaps some folks leverage all 16 bits?

-Darin

On Wed, Dec 2, 2009 at 5:03 PM, Ian Hickson <ian@hixie.ch> wrote:
On Dec 2, 2009, at 8:14 PM, Darin Fisher wrote:
What about Maciej's comment? JS strings are often used to store binary values. Obviously, if people stick to octets, then it should be fine, but perhaps some folks leverage all 16 bits?
I think some people do use JavaScript strings this way, though not necessarily with LocalStorage. This kind of use will probably become obsolete when we add a proper way to store binary data to the platform.

Most Web-related APIs are fully accepting of JavaScript strings that are not proper UTF-16. I don't see a strong reason to make LocalStorage an exception. It does make sense for WebSocket to be an exception, since in that case charset transcoding is required by the protocol, and since it is desirable in that case to prevent any funny business that may trip up the server.

Also, looking at UTF-16 more closely, it seems like all UTF-16 can be transcoded to UTF-8 and round-tripped if one is willing to allow technically invalid UTF-8 that encodes unpaired characters in the surrogate range as if they were characters. It's not clear to me why Firefox or IE choose to reject instead of doing this. This also removes my original objection to storing strings as UTF-8.

Regards, Maciej
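The round-tripping described above can be sketched concretely: encode each unpaired surrogate code unit as if it were an ordinary 3-byte BMP code point. The output is technically invalid UTF-8 for such inputs, by design, but loses no information. Function names here are invented for illustration:

```javascript
// Encode a JS string (arbitrary 16-bit code units) to a byte array.
// Valid surrogate pairs become 4-byte sequences; unpaired surrogates
// are encoded as plain 3-byte code points (invalid UTF-8, lossless).
function encodeLossless(s) {
  const out = [];
  for (let i = 0; i < s.length; i++) {
    let cp = s.charCodeAt(i);
    const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
    if (cp >= 0xd800 && cp <= 0xdbff && next >= 0xdc00 && next <= 0xdfff) {
      cp = 0x10000 + ((cp - 0xd800) << 10) + (next - 0xdc00); // valid pair
      i++;
    }
    if (cp < 0x80) out.push(cp);
    else if (cp < 0x800) out.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
    else if (cp < 0x10000) out.push(0xe0 | (cp >> 12),
                                    0x80 | ((cp >> 6) & 0x3f),
                                    0x80 | (cp & 0x3f));
    else out.push(0xf0 | (cp >> 18), 0x80 | ((cp >> 12) & 0x3f),
                  0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
  }
  return out;
}

// Decode back to a JS string; code points in the surrogate range come
// back as the lone surrogate code units they originally were.
function decodeLossless(bytes) {
  let s = "";
  for (let i = 0; i < bytes.length; ) {
    const b = bytes[i];
    let cp, n;
    if (b < 0x80) { cp = b; n = 1; }
    else if (b < 0xe0) { cp = b & 0x1f; n = 2; }
    else if (b < 0xf0) { cp = b & 0x0f; n = 3; }
    else { cp = b & 0x07; n = 4; }
    for (let j = 1; j < n; j++) cp = (cp << 6) | (bytes[i + j] & 0x3f);
    s += String.fromCodePoint(cp); // accepts surrogate code points
    i += n;
  }
  return s;
}
```

For example, the lone high surrogate U+D800 encodes to the bytes ED A0 80, which a strict UTF-8 decoder would reject but this scheme round-trips exactly.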
On Wed, Dec 2, 2009 at 8:44 PM, Maciej Stachowiak <mjs@apple.com> wrote:
On Dec 2, 2009, at 8:14 PM, Darin Fisher wrote:
What about Maciej's comment? JS strings are often used to store binary values. Obviously, if people stick to octets, then it should be fine, but perhaps some folks leverage all 16 bits?
I think some people do use JavaScript strings this way, though not necessarily with LocalStorage. This kind of use will probably become obsolete when we add a proper way to store binary data from the platform.
Most Web-related APIs are fully accepting of JavaScript strings that are not proper UTF-16. I don't see a strong reason to make LocalStorage an exception. It does make sense for WebSocket to be an exception, since in that case charset transcoding is required by the protocol, and since it is desirable in that case to prevent any funny business that may trip up the server..
Also, looking at UTF-16 more closely, it seems like all UTF-16 can be transcoded to UTF-8 and round-tripped if one is willing to allow technically invalid UTF-8 that encodes unpaired characters in the surrogate range as if they were characters. It's not clear to me why Firefox or IE choose to reject instead of doing this. This also removes my original objection to storing strings as UTF-8.
I think it is typical for UTF-16 to UTF-8 conversion to involve the intermediate step of forming a Unicode code point. If that cannot be done, then conversion fails. This may actually be a security thing. If something expects UTF-8, it is safer to ensure that it gets valid UTF-8 (even if that involves loss of information).

-Darin
On Dec 2, 2009, at 9:07 PM, Darin Fisher wrote:
I think it is typical for UTF-16 to UTF-8 conversion to involve the intermediate step of forming a Unicode code point. If that cannot be done, then conversion fails. This may actually be a security thing. If something expects UTF-8, it is safer to ensure that it gets valid UTF-8 (even if that involves loss of information).
These security considerations seem important for WebSocket, where the protocol uses UTF-8 per spec, but not for the internal storage representation of JavaScript strings in LocalStorage (where observable input and output are both possibly-invalid UTF-16).

Regards, Maciej
On Wed, Dec 2, 2009 at 10:20 PM, Maciej Stachowiak <mjs@apple.com> wrote:
These security considerations seem important for WebSocket where the protocol uses UTF-8 per spec, but not for the internal storage representation of JavaScript strings in LocalStorage (where observable input and output are both possibly-invalid UTF-16).
Agreed. I was responding to your statement: "It's not clear to me why Firefox or IE choose to reject instead of doing this." It seems likely to me that neither Firefox nor IE made a concerted choice to treat bad UTF-16 this way. It is probably just a consequence of using the default UTF-16 to UTF-8 converter, which likely behaves as I described.

-Darin
On Dec 2, 2009, at 10:51 PM, Darin Fisher wrote:
Agreed. I was responding to your statement: "It's not clear to me why Firefox or IE choose to reject instead of doing this." It seems likely to me that neither Firefox nor IE made a concerted choice to treat bad UTF-16 this way. It is probably just a consequence of using the default UTF-16 to UTF-8 converter, which likely behaves as I described.
That makes sense. - Maciej
On Wed, 2 Dec 2009, Maciej Stachowiak wrote:
I think some people do use JavaScript strings this way, though not necessarily with LocalStorage. This kind of use will probably become obsolete when we add a proper way to store binary data from the platform.
Most Web-related APIs are fully accepting of JavaScript strings that are not proper UTF-16. I don't see a strong reason to make LocalStorage an exception.
I recommend raising these points on: http://www.w3.org/Bugs/Public/show_bug.cgi?id=8425

-- Ian Hickson U+1047E )\._.,--....,'``. fL http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
This is why LocalStorage quota should remain relatively small. (/me holds back urges to bitch about the LocalStorage spec.)

If people want more storage space, then the DB should be used, which can more efficiently accommodate large amounts of data.

-Darin

On Wed, Dec 2, 2009 at 10:48 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
After thinking about it a bit, I guess I feel like we should do nothing.

I'm pretty against letting users set the encoding type for LocalStorage. That sounds like a lot of complexity for not a lot of benefit. Plus it'll cause problems when multiple web apps are in the same origin (and require different encodings), or when you're using one JS library that assumes one encoding and another that assumes another.

Converting to UTF-8 seems problematic.

Increasing the limit encourages heavier use of LocalStorage, which I'm not in favor of. Darin's right that if you want to store a lot of data and/or have more control over it, Web SQL Database is probably what you should be using.

J

On Wed, Dec 2, 2009 at 11:19 AM, Darin Fisher <darin@chromium.org> wrote:
This is why LocalStorage quota should remain relatively small. (/me holds back urges to bitch about the LocalStorage spec.)
If people want more storage space, then DB should be used, which can more efficiently accommodate large amounts of data.
-Darin
participants (7)

- Darin Adler
- Darin Fisher
- Ian Hickson
- Jeremy Orlow
- Maciej Stachowiak
- Michael Nordman
- William Edney