[webkit-dev] size_t vs unsigned in WTF::Vector API ?

Benjamin Poulain benjamin at webkit.org
Wed Nov 19 13:47:33 PST 2014


I agree with Filip and Antti.

The internal type of Vector is an orthogonal issue.

Keeping the old "size_t" API with the "unsigned" internal has caused a 
mess of types in the codebase and some confusion for contributors. We 
should fix the API and clean up our code.

There are cases were we can expect large vectors. It would be reasonable 
to have the storage type as a template argument of Vector. But in most 
cases we have a lot of very small vectors, not a small amount of large 
vectors.
We could also have a BigVector class that would be optimized for bigger 
allocations.

Benjamin

On 11/19/14 12:54 PM, Alexey Proskuryakov wrote:
>
> This is not exactly about Vector, but if one
> uses FileReader.prototype.readAsArrayBuffer() on a large file, I think
> that it overflows ArrayBuffer. WebKit actually crashes when uploading
> multi-gigabyte files to YouTube, Google Drive and other similar
> services, although I haven't checked whether it's because of
> ArrayBuffers, or because of a use of int/unsigned in another code path.
>
> I think that we should use size_t everywhere except for perhaps a few
> key places where memory impact is critical (and of course we need
> explicit checks when casting down to an unsigned). Or perhaps the rule
> can be even simpler, and unsigned may never be used for indices and
> sizes, period.
>
> - Alexey
>
>
> 19 нояб. 2014 г., в 12:32, Filip Pizlo <fpizlo at apple.com
> <mailto:fpizlo at apple.com>> написал(а):
>
>> Whatever we do, the clients of Vector should be consistent about what
>> type they use.  I'm actually fine with Vector internally using
>> unsigned even if the API uses size_t, but right now we have lots of
>> code that uses a mix of size_t and unsigned when indexing into
>> Vectors.  That's confusing.
>>
>> If I picked one type to use for all Vector indices, it would be
>> unsigned rather than size_t.  Vector being limited to unsigned doesn't
>> imply 4GB unless you do Vector<char>.  Usually we have Vectors of
>> pointer-width things, which means 32GB on 64-bit systems (UINT_MAX *
>> sizeof(void*)).  Even in a world where we had more than 32GB of memory
>> to use within a single web process, I would hope that we wouldn't use
>> it all on a single Vector and that if we did, we would treat that one
>> specially for a bunch of other sensible reasons (among them being that
>> allocating a contiguous slab of virtual memory of that size is rather
>> taxing).  So, size_t would buy us almost nothing since if we had a
>> vector grow large enough to need it, we would be having a bad time
>> already.
>>
>> I wonder: are there cases that anyone remembers where we have tried to
>> use Vector to store more than UINT_MAX things?  Another interesting
>> question is: What's the largest number of things that we store into
>> any Vector?  We could use such a metric to project how big Vectors
>> might get in the future.
>>
>> -Filip
>>
>>
>>> On Nov 19, 2014, at 12:20 PM, Chris Dumez <cdumez at apple.com
>>> <mailto:cdumez at apple.com>> wrote:
>>>
>>> Hi all,
>>>
>>> I recently started updating the WTF::Vector API to use unsigned types
>>> instead of size_t [1][2], because:
>>> - WTF::Vector is already using unsigned type for size/capacity
>>> internally to save memory on 64-bit, causing a mismatch between the
>>> API and the internal representation [3]
>>> - Some reviewers have asked me to use unsigned for loop counters
>>> iterating over vectors (which looks unsafe as the Vector API, e.g.
>>> size(), returns a size_t).
>>> - I heard from Joseph that this type mismatch is forcing us (and
>>> other projects using WTF) to disable some build time warnings
>>> - The few people I talked to before making that change said we should
>>> do it
>>>
>>> However, Alexey recently raised concerns about this change. it
>>> doesn't "strike him as the right direction. 4Gb is not much, and we
>>> should have more of WebKit work with the right data types, not less.”.
>>> I did not initially realize that this change was controversial, but
>>> now that it seems it is, I thought I would raise the question on
>>> webkit-dev to see what people think about this.
>>>
>>> Kr,
>>> --
>>> Chris Dumez - Apple Inc.
>>> Cupertino, CA
>>>
>>>
>>> [1] http://trac.webkit.org/changeset/176275
>>> [2] http://trac.webkit.org/changeset/176293
>>> [3] http://trac.webkit.org/changeset/148891
>>>
>>> _______________________________________________
>>> webkit-dev mailing list
>>> webkit-dev at lists.webkit.org <mailto:webkit-dev at lists.webkit.org>
>>> https://lists.webkit.org/mailman/listinfo/webkit-dev
>>
>> _______________________________________________
>> webkit-dev mailing list
>> webkit-dev at lists.webkit.org <mailto:webkit-dev at lists.webkit.org>
>> https://lists.webkit.org/mailman/listinfo/webkit-dev
>
>
>
>
> _______________________________________________
> webkit-dev mailing list
> webkit-dev at lists.webkit.org
> https://lists.webkit.org/mailman/listinfo/webkit-dev
>



More information about the webkit-dev mailing list