[webkit-dev] size_t vs unsigned in WTF::Vector API ?

Antti Koivisto koivisto at iki.fi
Mon Nov 24 01:28:07 PST 2014


I don't think this is really a 32bit vs 64bit platform issue. The vast
majority of 64bit systems our users have (that is iOS devices) can't use
memory buffers sized anywhere near the 32bit limit even in theory. Also
when using Vector auto-grow capabilities (which is really the point of
using a vector instead of just allocating a buffer) you need way more
memory than the actual data size. Growing a Vector<uint8_t> beyond 4GB has
a peak allocation of 9GB.
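
A rough sketch of where the 9GB figure comes from. It assumes a growth
policy of about 1.25x (old capacity plus a quarter), which is the factor
that reproduces the number above; the helper names are hypothetical and
are not taken from the Vector implementation. During a grow, the old and
the new buffer are both live until the elements have been copied over.

```cpp
#include <cstdint>

// Hypothetical helper: capacity after one growth step, ASSUMING a 1.25x
// policy (old capacity plus one quarter). Not the actual WTF::Vector code.
constexpr std::uint64_t grownCapacity(std::uint64_t oldCapacity)
{
    return oldCapacity + oldCapacity / 4;
}

// While elements are copied, the old and the new buffer exist at the same
// time, so the peak is the sum of both capacities (for 1-byte elements).
constexpr std::uint64_t peakBytesDuringGrowth(std::uint64_t oldCapacity)
{
    return oldCapacity + grownCapacity(oldCapacity);
}

constexpr std::uint64_t GB = 1ull << 30;

// A full 4GB buffer of uint8_t grows to 5GB, and 4GB + 5GB are live at once.
static_assert(grownCapacity(4 * GB) == 5 * GB, "4GB grows to 5GB");
static_assert(peakBytesDuringGrowth(4 * GB) == 9 * GB, "peak is 9GB");
```

With a 2x policy instead, the same calculation would give 4GB + 8GB, so
the exact figure depends on the growth factor assumed here.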

Are there any examples of Vectors in the current code base where we would
usefully fix an actual problem, even in the medium term, by switching to
64bit storage sizes? I don't think any exist. Cases like file uploads
should stream the data or use some mapped-file backed buffer type that is
not a Vector.

With this in mind I think the right direction is to make the Vector API
match the implementation and just start using unsigned indexes everywhere.
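
To illustrate the mismatch in question, here is a minimal sketch of what
happens when an API takes a 64bit size while the storage holds a 32bit
one. The aliases ApiSize and StoredSize and both helpers are hypothetical
stand-ins (fixed-width types so the result is the same on any platform);
this is not code from Vector itself.

```cpp
#include <cstdint>

// Hypothetical stand-ins: ApiSize plays the role of a 64-bit size_t in the
// public API, StoredSize the role of the unsigned kept in the storage.
using ApiSize = std::uint64_t;
using StoredSize = std::uint32_t;

// What an unchecked narrowing from the API type to the stored type does:
// the value is reduced modulo 2^32.
constexpr StoredSize narrowUnchecked(ApiSize requested)
{
    return static_cast<StoredSize>(requested);
}

// True when the requested size survives the narrowing unchanged.
constexpr bool fitsInStorage(ApiSize requested)
{
    return narrowUnchecked(requested) == requested;
}

static_assert(fitsInStorage(1000), "small sizes are representable");
static_assert(!fitsInStorage(ApiSize(1) << 32), "4G elements do not fit");
static_assert(narrowUnchecked((ApiSize(1) << 32) + 5) == 5,
    "an unchecked conversion silently wraps");
```

When the API type and the stored type are the same, there is no
narrowing step that needs a check like fitsInStorage in the first place.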


   antti

On Fri, Nov 21, 2014 at 2:59 AM, Maciej Stachowiak <mjs at apple.com> wrote:

>
> On Nov 20, 2014, at 4:51 PM, Maciej Stachowiak <mjs at apple.com> wrote:
>
>
> On Nov 20, 2014, at 9:26 AM, Alexey Proskuryakov <ap at webkit.org> wrote:
>
>
> On Nov 19, 2014, at 2:58 PM, Alexey Proskuryakov <ap at webkit.org> wrote:
>
> These and related uses are all over the place - see also Vectors
> in FormDataBuilder, data returned from
> FrameLoader::loadResourceSynchronously, plug-in code that loads from
> network, SharedBuffer etc.
>
>
> Another way to say this is: we need to deal with large data arrays
> throughout loading code. We do not really need full size vectors in most
> other code - it's sort of OK for HTML parser or for image decoder to fail
> catastrophically when there is too much data fed to it.
>
> This is a somewhat questionable design, but if we are going to stick to it,
> then magnitude checks should be centralized, not sprinkled throughout the
> code. We should not make this check explicitly when feeding a network
> resource to the parser, for example.
>
> A 64-bit API for Vector solves this nearly flawlessly. We do not perform
> the checks manually every time we use a Vector, Vector does them for us.
>
> Other options are:
>
> - uint64_t everywhere. This way, we'll solve practical problems with large
> resources once and for all. Also, this may prove to be necessary to solve
> even YouTube/Google Drive uploads, I do not know that yet.
>
> - size_t everywhere. Same effect on 64-bit platforms, while 32-bit ones
> will still be limited. I like this option, because it won't make us pay the
> memory and performance cost on old crippled 32-bit machines, which are
> unlikely to be used for manipulating huge volumes of data anyway.
>
>
> We probably want YouTube upload of large files to work on 32-bit machines.
> Though presumably if you want to upload a file larger than the virtual
> address space, you need to represent it in some way other than a Vector.
>
>
> Thinking about this more - I think file sizes should probably be an off_t,
> not a size_t, so on 32-bit platforms some care must still be taken in the
> conversion between file sizes and vector sizes.
>
> In general, I like the idea of Vector having a size_t API and a choice of
> smaller or larger internal size. We might also want to change other
> containers with size-related interfaces to match.
>
> Regards,
> Maciej
>
>
>
> _______________________________________________
> webkit-dev mailing list
> webkit-dev at lists.webkit.org
> https://lists.webkit.org/mailman/listinfo/webkit-dev
>
>