[webkit-dev] SerializedScriptValue: signed vs unsigned char

Adam Barth abarth at webkit.org
Wed Feb 6 17:58:25 PST 2013


On Wed, Feb 6, 2013 at 4:59 PM, Alec Flett <alecflett at chromium.org> wrote:

> On Wed, Feb 6, 2013 at 4:48 PM, Maciej Stachowiak <mjs at apple.com> wrote:
>
>> I think we should continue to use uint8_t instead of char as the primary
>> way to represent a raw byte in WebKit. First, it's good to distinguish raw
>> data from C strings at the type system level, and second, the unpredictable
>> signedness of char is actively bad for byte-oriented processing. Another
>> library making a different choice doesn't overcome these reasons.
>>
> To be fair, there hasn't been a convention in WebKit at all.  uint8_t was
> chosen for SerializedScriptValue roughly two months ago, with specific
> IndexedDB support in mind: https://bugs.webkit.org/show_bug.cgi?id=104354 -
> this usage is not widespread, and in fact the only consumer of this type is
> IndexedDB.
>
>
>> If there are particular libraries we want to use which have different
>> conventions, the adaptation should be done at the level of interfacing the
>> library. Changing WebKit's conventions because of one optional dependency
>> does not make sense to me.
>>
>>
> Maybe more simply: Vector<uint8_t> was chosen very recently to replace
> String, in support of cleaning up IndexedDB code. IndexedDB would like to
> use Vector<char> now for further cleanup. Would you feel the same if we
> were switching from String to Vector<char>?
>

Yeah, we use char all over WebCore to represent a byte, including in
SharedBuffer.  We should use it for SerializedScriptValue for consistency.

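As a rough illustration of keeping the adaptation at the interface, here is
a minimal sketch. It uses std::vector as a stand-in for WTF::Vector, and
consumeChars and consumeBytes are hypothetical names, not WebKit or LevelDB
APIs. The only point is that a uint8_t buffer can be handed to a char-based
consumer without copying or changing the caller's type.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a char-based consumer (a buffer class or a
// storage library that takes const char* plus a length).
std::string consumeChars(const char* data, std::size_t length)
{
    return std::string(data, length);
}

// Adapter kept at the interface: callers continue to hold uint8_t data.
std::string consumeBytes(const std::vector<uint8_t>& bytes)
{
    // Reading object bytes through a char pointer is allowed by the C++
    // aliasing rules, so this cast does not copy or reinterpret the values.
    return consumeChars(reinterpret_cast<const char*>(bytes.data()), bytes.size());
}
```

The cast is confined to one function, so the choice of element type on
either side stays a local decision.
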
If we think that uint8_t is better than char to represent bytes, then we
should make that change globally in WebCore separately.

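For reference, the signedness concern quoted above can be sketched as
follows. The helper names are made up for illustration. Whether plain char
is signed is implementation-defined, so widening a char holding a byte
value above 0x7f can give a negative number on some platforms, while
uint8_t always widens to 0..255.

```cpp
#include <cstdint>

// Widening a plain char: the result depends on the platform's char
// signedness (for the byte 0xE9 it is either 233 or a negative value).
inline int widenChar(char c) { return c; }

// Widening a uint8_t: always in the range 0..255.
inline int widenByte(uint8_t b) { return b; }

// Portable way to read a char as a byte value: go through unsigned char.
inline int widenCharAsByte(char c) { return static_cast<unsigned char>(c); }
```

Code that only stores and forwards bytes is unaffected; the difference
shows up when bytes are compared, shifted, or used as indices.
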
Adam



> On Wed, Feb 6, 2013 at 4:48 PM, Maciej Stachowiak <mjs at apple.com> wrote:
>
>>
>> I think we should continue to use uint8_t instead of char as the primary
>> way to represent a raw byte in WebKit. First, it's good to distinguish raw
>> data from C strings at the type system level, and second, the unpredictable
>> signedness of char is actively bad for byte-oriented processing. Another
>> library making a different choice doesn't overcome these reasons.
>>
>> If there are particular libraries we want to use which have different
>> conventions, the adaptation should be done at the level of interfacing the
>> library. Changing WebKit's conventions because of one optional dependency
>> does not make sense to me.
>>
>> Regards,
>> Maciej
>>
>> On Feb 6, 2013, at 4:35 PM, Alec Flett <alecflett at chromium.org> wrote:
>>
>> OK, so something else has come up: SharedBuffer. SharedBuffer has an
>> adoptVector method that allows you to adopt Vector<char>. Some of the
>> stuff I'm using that interacts with LevelDB also deals with
>> SharedBuffer, so I've had to do some nasty casting from Vector<uint8_t>
>> to Vector<char> to be able to call SharedBuffer.adoptVector().
>>
>> And again, we could tweak SharedBuffer to accept Vector<unsigned char>
>> but now we have two subsystems (LevelDB, and SharedBuffer) that seem to
>> prefer "char" as raw data.
>>
>> Personally outside of WebKit I tend to see more "char*" as the common
>> denominator for raw bytes.
>>
>> Further, there are no subsystems that actually depend on
>> SerializedScriptValue using uint8_t - it was just what we decided to use
>> when (ironically) we were hooking up IndexedDB to JSC, just a month or so
>> ago.
>>
>> So far Benjamin objected, and then seems to have rescinded. Glenn, do you
>> depend on SerializedScriptValue's current method signatures?
>>
>> Alec
>>
>>
>> On Mon, Feb 4, 2013 at 5:14 PM, Benjamin Poulain <benjamin at webkit.org>
>>  wrote:
>>
>>> On Mon, Feb 4, 2013 at 4:54 PM, Alec Flett <alecflett at chromium.org>
>>>  wrote:
>>>
>>>> Well, nobody is explicitly using LChar with SerializedScriptValue
>>>> (maybe it should, maybe that's another issue)  but I guess this is why I'm
>>>> asking - I'm happy to just deal with this in IDB with some ugly
>>>> reinterpret_casts here and there (ok maybe not happy, but satisfied enough)
>>>> if folks prefer that. I don't personally find uint8_t to be any more
>>>> intuitive than char, but it sounds like some do. Never mind...
>>>>
>>>
>>> Well, since you never use character types and only raw data, just ignore
>>> my comment.
>>>
>>> As far as I know, it is already common to use signed char for raw data
>>> (in the network stack for example).
>>>
>>> Benjamin
>>>
>>
>>
>> _______________________________________________
>> webkit-dev mailing list
>> webkit-dev at lists.webkit.org
>> https://lists.webkit.org/mailman/listinfo/webkit-dev
>>
>>
>>
>