[webkit-dev] Bit fields cause Purify to report UMRs

Anyang Ren anyang.ren at gmail.com
Thu May 3 10:30:23 PDT 2007

On 5/3/07, Oliver Hunt <oliver at apple.com> wrote:
> The problem is that the tool is lacking, not the compiler.  The compiler
> codegen *works* it is the tool that doesn't.

Right.  I will stop speculating what code the compiler generates.  Here is
the disassembly of the code that Visual C++ 2005 Service Pack 1 generates
for the constructor in question.  (I listed the members of DeprecatedStringData
and their offsets first for your reference.)

struct DeprecatedStringData
    unsigned refCount;                  // offset 0
    unsigned _length;                   // offset 4
    mutable DeprecatedChar *_unicode;   // offset 8
    mutable char *_ascii;               // offset 12 (0Ch)
    unsigned _maxUnicode : 30;          // offset 16 (10h)
    bool _isUnicodeValid : 1;
           // offset 20 (14h)
    bool _isHeapAllocated : 1;          // offset 20 (14h)
    unsigned _maxAscii : 31;            // offset 24 (18h)
    bool _isAsciiValid : 1;             // offset 28 (1Ch)
    char _internalBuffer[20];           // offset 32 (20h)

--- deprecatedstring.cpp ---------

DeprecatedStringData::DeprecatedStringData(DeprecatedStringData &o)

    : refCount(1)
    , _length(o._length)

    , _unicode(o._unicode)

    , _ascii(o._ascii)

    , _maxUnicode(o._maxUnicode)

    , _isUnicodeValid(o._isUnicodeValid)
    , _isHeapAllocated(0)

    , _maxAscii(o._maxAscii)
    , _isAsciiValid(o._isAsciiValid)

102A98D0  push        ebx
102A98D1  push        ebp
102A98D2  push        esi
102A98D3  mov         esi,ecx
102A98D5  mov         dword ptr [esi],1
102A98DB  push        edi
102A98DC  mov         edi,dword ptr [esp+14h]
102A98E0  mov         eax,dword ptr [edi+4]
102A98E3  mov         dword ptr [esi+4],eax
102A98E6  mov         ecx,dword ptr [edi+8]
102A98E9  mov         dword ptr [esi+8],ecx
102A98EC  mov         edx,dword ptr [edi+0Ch]
102A98EF  mov         dword ptr [esi+0Ch],edx
102A98F2  mov         eax,dword ptr [edi+10h]
102A98F5  xor         eax,dword ptr [esi+10h]
102A98F8  mov         dl,byte ptr [esi+14h]
102A98FB  and         eax,3FFFFFFFh
102A9900  xor         dword ptr [esi+10h],eax
102A9903  movzx       ecx,byte ptr [edi+14h]
102A9907  and         cl,1
102A990A  and         dl,0FCh
102A990D  or          cl,dl
102A990F  mov         byte ptr [esi+14h],cl
102A9912  mov         eax,dword ptr [edi+18h]
102A9915  xor         eax,dword ptr [esi+18h]
102A9918  lea         ebp,[edi+1Dh]
102A991B  and         eax,7FFFFFFFh
102A9920  xor         dword ptr [esi+18h],eax
102A9923  movzx       ecx,byte ptr [edi+1Ch]
102A9927  xor         cl,byte ptr [esi+1Ch]
102A992A  and         cl,1
102A992D  xor         byte ptr [esi+1Ch],cl
    // Handle the case where either the Unicode or 8-bit pointer was

    // pointing to the internal buffer. We need to point at the
    // internal buffer in the new object, and copy the characters.
    if (_unicode == reinterpret_cast<DeprecatedChar *>(o._internalBuffer)) {
102A9930  cmp         dword ptr [esi+8],ebp
102A9933  jne
WebCore::DeprecatedStringData::DeprecatedStringData+8Ah (102A995Ah)
        if (_isUnicodeValid) {
102A9935  test        byte ptr [esi+14h],1
102A9939  je
WebCore::DeprecatedStringData::DeprecatedStringData+83h (102A9953h)

> > Regarding the "incorrect" structure size on Windows because MSVC
> > packs 'unsigned' and 'bool' bit fields separately, would you consider
> > declaring all bit fields as 'unsigned'?
> Hmmm -- there are probably occasions where we want signed bitfields
> (i have
> considered using them in a particularly ugly area of our svg impl)

Sorry I wasn't clear.  The classes I've looked at that have the bit field
packing problem all have only "unsigned" and "bool" bit fields.  I
meant to ask -- would you consider declaring the bit fields that are
intended to pack into one word with the same type (e.g., "unsigned")?

> Anyhoo, how is it incorrect? does sizeof return a different size from
> the actual
> size in memory, or are the structs just bigger than they need to be?

The structs are bigger than they need to be.  Using DeprecatedStringData
as an example, the comment and the ordering of the bit fields
(30, 1, 1, 31, 1) both suggest that the intention of the author was to
pack the bit fields into two 32-bit words.

// Keep this struct to <= 46 bytes, that's what the system will allocate.
// Will be rounded up to a multiple of 4, so we're stuck at 44.

struct DeprecatedStringData
    unsigned refCount;
    unsigned _length;
    mutable DeprecatedChar *_unicode;
    mutable char *_ascii;
    unsigned _maxUnicode : 30;
    bool _isUnicodeValid : 1;
    bool _isHeapAllocated : 1;
    unsigned _maxAscii : 31;
    bool _isAsciiValid : 1;
    char _internalBuffer[20];

Anyang Ren
Open source developer

More information about the webkit-dev mailing list