[webkit-dev] size_t vs unsigned in WTF::Vector API ?

Filip Pizlo fpizlo at apple.com
Wed Nov 19 13:58:10 PST 2014


> On Nov 19, 2014, at 1:50 PM, Alexey Proskuryakov <ap at webkit.org> wrote:
> 
> 
> That makes good sense for the internal implementation. Do you think the class API should also use a custom type, or should it use size_t?

I would propose having an Int53 type in JSC, because that is now a thing in ECMAScript.  ArrayBuffer would use it.

Int53 could have an implicit conversion to size_t, but not the other way around.
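
A minimal sketch of what such a type might look like, assuming C++17 (for std::optional) and a 64-bit size_t. The class shape and the names tryCreate, checkedAdd and maxValue are hypothetical and only illustrate the one-way conversion; this is not existing JSC or WTF code.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical sketch; not an existing JSC or WTF class.
class Int53 {
public:
    // Largest integer a double represents exactly: 2^53 - 1.
    static constexpr uint64_t maxValue = (uint64_t { 1 } << 53) - 1;

    // Checked factory: the only way in from a raw integer.
    static std::optional<Int53> tryCreate(uint64_t value)
    {
        if (value > maxValue)
            return std::nullopt;
        return Int53(value);
    }

    // Checked addition: yields nullopt rather than exceeding 53 bits.
    std::optional<Int53> checkedAdd(Int53 other) const
    {
        // Both operands are <= 2^53 - 1, so the uint64_t sum cannot wrap.
        return tryCreate(m_value + other.m_value);
    }

    // Implicit widening to size_t (assumed to be 64 bits wide here).
    operator size_t() const { return static_cast<size_t>(m_value); }

private:
    // Private and explicit: a size_t never converts to Int53 implicitly.
    explicit Int53(uint64_t value) : m_value(value) { }
    uint64_t m_value;
};

static_assert(sizeof(size_t) >= 8, "sketch assumes a 64-bit size_t");
```

With this shape, `size_t n = someInt53;` compiles, while `Int53 x = someSizeT;` does not; callers have to go through the checked factory and handle the failure case.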

> 
> With Vector though, I don't know how we would differentiate code paths that need large allocations from ones that don't. Nearly anything that is exposed as a JS API or deals with the external world can hit sizes over 4GB. That's not out of reach in most scenarios, not even for resources loaded from the network.

Can you provide an example?

- Except for ArrayBuffers, other kinds of JS arrays are still semantically limited to length <= 2^32 - 1.

- Anytime we have an array of structured data where individual elements are large-ish, a vector of size UINT_MAX would not imply 4GB.  It would imply something much larger.  All it takes is Vector<void*> for example - and even with the size being an unsigned, the size limit is actually 32GB on 64-bit.  So, I don't think it's accurate to speak of a 4GB limit since this has nothing to do with bytes.
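
The arithmetic behind the second point can be checked at compile time. This is only an illustration, assuming a 32-bit unsigned and 8-byte pointers; maxPayloadBytes and GiB are hypothetical names, not part of WTF.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Bytes of payload in a vector holding UINT_MAX elements of type T.
template<typename T>
constexpr uint64_t maxPayloadBytes()
{
    return static_cast<uint64_t>(UINT_MAX) * sizeof(T);
}

constexpr uint64_t GiB = uint64_t { 1 } << 30;

static_assert(UINT_MAX == 0xFFFFFFFFu, "sketch assumes a 32-bit unsigned");
static_assert(sizeof(void*) == 8, "sketch assumes 8-byte pointers");

// One-byte elements: just under 4 GiB.
static_assert(maxPayloadBytes<char>() == 4 * GiB - 1, "");
// Pointer-width elements: just under 32 GiB.
static_assert(maxPayloadBytes<void*>() == 32 * GiB - 8, "");
```

So the limit imposed by an unsigned size is on element count; the byte limit scales with sizeof(T).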

-Filip


> 
> - Alexey
> 
> 
> On Nov 19, 2014, at 1:19 PM, Filip Pizlo <fpizlo at apple.com> wrote:
> 
>> ArrayBuffers are very special because they are part of ECMAScript.
>> 
>> We use unsigned for the length, because once upon a time, that would have been the right type; these days the right type would be a 53-bit integer.  So, size_t would be wrong.  I believe that ArrayBuffers should be changed to use a very special size type; probably it wouldn't even be a primitive type but a class that carefully checked that you never overflowed 53 bits.
>> 
>> -Filip
>> 
>> 
>>> On Nov 19, 2014, at 12:54 PM, Alexey Proskuryakov <ap at webkit.org> wrote:
>>> 
>>> 
>>> This is not exactly about Vector, but if one uses FileReader.prototype.readAsArrayBuffer() on a large file, I think that it overflows ArrayBuffer. WebKit actually crashes when uploading multi-gigabyte files to YouTube, Google Drive and other similar services, although I haven't checked whether it's because of ArrayBuffers, or because of a use of int/unsigned in another code path.
>>> 
>>> I think that we should use size_t everywhere except for perhaps a few key places where memory impact is critical (and of course we need explicit checks when casting down to an unsigned). Or perhaps the rule can be even simpler, and unsigned may never be used for indices and sizes, period.
>>> 
>>> - Alexey
>>> 
>>> 
>>> On Nov 19, 2014, at 12:32 PM, Filip Pizlo <fpizlo at apple.com> wrote:
>>> 
>>>> Whatever we do, the clients of Vector should be consistent about what type they use.  I'm actually fine with Vector internally using unsigned even if the API uses size_t, but right now we have lots of code that uses a mix of size_t and unsigned when indexing into Vectors.  That's confusing.
>>>> 
>>>> If I picked one type to use for all Vector indices, it would be unsigned rather than size_t.  Vector being limited to unsigned doesn't imply 4GB unless you do Vector<char>.  Usually we have Vectors of pointer-width things, which means 32GB on 64-bit systems (UINT_MAX * sizeof(void*)).  Even in a world where we had more than 32GB of memory to use within a single web process, I would hope that we wouldn't use it all on a single Vector and that if we did, we would treat that one specially for a bunch of other sensible reasons (among them being that allocating a contiguous slab of virtual memory of that size is rather taxing).  So, size_t would buy us almost nothing since if we had a vector grow large enough to need it, we would be having a bad time already.
>>>> 
>>>> I wonder: are there cases that anyone remembers where we have tried to use Vector to store more than UINT_MAX things?  Another interesting question is: What's the largest number of things that we store into any Vector?  We could use such a metric to project how big Vectors might get in the future.
>>>> 
>>>> -Filip
>>>> 
>>>> 
>>>>> On Nov 19, 2014, at 12:20 PM, Chris Dumez <cdumez at apple.com> wrote:
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> I recently started updating the WTF::Vector API to use unsigned types instead of size_t [1][2], because:
>>>>> - WTF::Vector already uses an unsigned type for size/capacity internally to save memory on 64-bit, causing a mismatch between the API and the internal representation [3]
>>>>> - Some reviewers have asked me to use unsigned for loop counters iterating over vectors (which looks unsafe as the Vector API, e.g. size(), returns a size_t).
>>>>> - I heard from Joseph that this type mismatch is forcing us (and other projects using WTF) to disable some build time warnings
>>>>> - The few people I talked to before making that change said we should do it
>>>>> 
>>>>> However, Alexey recently raised concerns about this change: it doesn't "strike him as the right direction. 4GB is not much, and we should have more of WebKit work with the right data types, not less."
>>>>> I did not initially realize that this change was controversial, but now that it seems it is, I thought I would raise the question on webkit-dev to see what people think about this.
>>>>> 
>>>>> Kr,
>>>>> --
>>>>> Chris Dumez - Apple Inc.
>>>>> Cupertino, CA
>>>>> 
>>>>> 
>>>>> [1] http://trac.webkit.org/changeset/176275
>>>>> [2] http://trac.webkit.org/changeset/176293
>>>>> [3] http://trac.webkit.org/changeset/148891
>>>>> _______________________________________________
>>>>> webkit-dev mailing list
>>>>> webkit-dev at lists.webkit.org
>>>>> https://lists.webkit.org/mailman/listinfo/webkit-dev
>>>> 
>>> 
>>> 
>> 
> 
> 
