[Webkit-unassigned] [Bug 24906] 0x5C of EUC-JP is not Yen Sign but U+005C

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Mon Mar 8 17:53:14 PST 2010


Darin Adler <darin at apple.com> changed:

           What    |Removed                     |Added
  Attachment #50090|review?                     |review-
               Flag|                            |

--- Comment #44 from Darin Adler <darin at apple.com>  2010-03-08 17:53:13 PST ---
(From update of attachment 50090)
Looks good.

> -    String str = renderer->text();
> +    String str = renderer->nonDisplayString();

I don't understand the meaning here. What is a "non-display string"? A display
string is a string intended for display to the end user. But "non-display"
doesn't tell me anything.

Why would this code path in particular use nonDisplayString?

> +    , m_needTranscode(false)

For booleans data members we normally use words that work in a sentence like
this "font <xxx>". But "font need transcode" is not good grammar. I suggest
naming this m_needsTranscoding.

> +    bool needTranscode() const { return m_needTranscode; }

And this should be needsTranscoding() for the same reason.

> -    static const char* const a = atomicCanonicalTextEncodingName("Shift_JIS_X0213-2000");
> -    static const char* const b = atomicCanonicalTextEncodingName("EUC-JP");
> -    return (m_name == a || m_name == b) ? 0x00A5 : '\\';
> +    DEFINE_STATIC_LOCAL(HashSet<const char*>, set, ());
> +    if (set.isEmpty()) {
> +        addEncodingName(set, "Shift_JIS");
> +        addEncodingName(set, "Shift_JIS_X0213-2000");
> +        addEncodingName(set, "EUC-JP");
> +        addEncodingName(set, "ISO-2022-JP");
> +        addEncodingName(set, "ISO-2022-JP-2");
> +    }
> +    return m_name && set.contains(m_name) ? 0x00A5 : '\\';

This fix seems separate from the rest of the patch and should be landed
separately if possible. And needs a test case.

> +        bool useBackslashAsCurrencySymbol() const
> +        {
> +            return m_backslashAsCurrencySymbol != '\\';
> +        }

Again, boolean functions need to finish a sentence "text encoding <xxx>" and
"text encoding use backslash as currency symbol" does not make sense. It should
be usesBackslashAsCurrencySymbol().

>          PassRefPtr<StringImpl> displayString(PassRefPtr<StringImpl> str) const
>          {
> -            if (m_backslashAsCurrencySymbol == '\\' || !str)
> +            if (!useBackslashAsCurrencySymbol() || !str)
>                  return str;
>              return str->replace('\\', m_backslashAsCurrencySymbol);
>          }
>          void displayBuffer(UChar* characters, unsigned len) const
>          {
> -            if (m_backslashAsCurrencySymbol == '\\')
> +            if (!useBackslashAsCurrencySymbol())

I don't see how these changes are related to the major change here. The code is
the same as before, but uses a helper function. This should be landed
separately so the patch isn't so huge. And doesn't seem important.

> +#include "config.h"
> +
> +#include "FontTranscoder.h"

There should not be a space here.

> +    // IE's default font for Japanese encodings is known hacked.

This comment is unclear. What specifically do you mean by "known hacked"? I
think you should say specifically in a way that is specific about what it does
rather than making a value judgement "hacked".

> +    return NoConvertion;

There is no such word as "convertion". Instead it should be "conversion".

> +bool FontTranscoder::needTranscode(const AtomicString& fontFamily, const TextEncoding* encoding, bool* outUseOriginalTextAsNonDisplayString) const

Again, functions that return booleans should ideally be "font transcoder <xxx>"
and "font transcoder need transcode" does not make sense. I don't have a
specific name to suggest.

> +FontTranscoder* fontTranscoder()
> +{
> +    static FontTranscoder* transcoder = new FontTranscoder();
> +    return transcoder;
> +}

Normally we would return a reference from a function like this instead of a
pointer. Also, it's typically "new FontTranscoder" without parentheses.

> +#include "AtomicString.h"
> +#include "AtomicStringHash.h"
> +#include "PlatformString.h"
> +#include <wtf/HashMap.h>

There is no need to include "PlatformString.h" if you are also including

> +class FontTranscoder {

This should derive from Noncopyable.

> +    bool needTranscode(const AtomicString& fontFamily, const TextEncoding* encoding = 0, bool* outUseOriginalTextAsNonDisplayString = 0) const;

The argument name "encoding" should be omitted here.

> +    ConverterType converterType(const AtomicString& fontFamily, const TextEncoding* encoding) const;

The argument name "encoding" should be omitted here.

> +     , m_needTranscode(false)

"render text need transcode" is not good. But m_needsTranscoding might be OK.

> +     , m_useOriginalTextAsNonDisplayString(false)

I still don't understand what a non-display string is, so I can't suggest a
better name here.

> +        String transcoded(m_text);
> +        fontTranscoder()->convert(transcoded, fontFamily, encoding);
> +        m_text = transcoded.impl();

We really should change m_text to be a String instead of a RefPtr<StringImpl>.
The two are almost the same thing.

We need test cases covering *searching* for the symbols in these various
encodings and fonts. In general, I would like to see a test case that covers
each call site. The search code is one example that uses the TextIterator call

review- because I'm sure you'll want to change at least one of the things I
mentioned above.

Configure bugmail: https://bugs.webkit.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

More information about the webkit-unassigned mailing list