[Webkit-unassigned] [Bug 24906] New: 0x5C of EUC-JP is not Yen Sign but U+005C

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Fri Mar 27 19:09:28 PDT 2009


https://bugs.webkit.org/show_bug.cgi?id=24906

           Summary: 0x5C of EUC-JP is not Yen Sign but U+005C
           Product: WebKit
           Version: 525.x (Safari 3.2)
          Platform: PC
        OS/Version: Windows XP
            Status: UNCONFIRMED
          Severity: Normal
          Priority: P2
         Component: Text
        AssignedTo: webkit-unassigned at lists.webkit.org
        ReportedBy: naruse at airemix.jp


WebKit currently converts 0x5C in EUC-JP to U+00A5.

But according to the IANA character set definition, 0x00-0x7F in EUC-JP is US-ASCII.
http://www.iana.org/assignments/character-sets

CP51932 (the de facto encoding table for EUC-JP, which
Internet Explorer and Firefox use) also converts
0x00-0x7F of EUC-JP to U+0000-U+007F.
http://legacy-encoding.sourceforge.jp/wiki/index.php?cp51932 (written in
Japanese)
http://code.google.com/p/chromium/issues/detail?id=3094 (written in English)

So WebKit's behavior goes against both the de facto and the de jure standard.
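
To make the expected mapping concrete, here is a minimal sketch of the rule both standards agree on for the single-byte range. The helper name decodeEucJpSingleByte is hypothetical and is not part of WebKit or ICU; it covers only bytes 0x00-0x7F and deliberately does not handle multibyte sequences.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only (not WebKit or ICU code).
// Decodes the single-byte range of EUC-JP: 0x00-0x7F is US-ASCII, so each
// byte maps to the code point with the same value.
// Returns -1 for bytes above 0x7F, which are multibyte lead bytes or
// invalid bytes and are outside the scope of this sketch.
constexpr std::int32_t decodeEucJpSingleByte(std::uint8_t byte)
{
    return byte <= 0x7F ? static_cast<std::int32_t>(byte) : -1;
}

// 0x5C must decode to U+005C (REVERSE SOLIDUS), not U+00A5 (YEN SIGN).
static_assert(decodeEucJpSingleByte(0x5C) == 0x005C,
              "0x5C in EUC-JP is U+005C");
```

Under this rule a decoder never produces U+00A5 from the byte 0x5C; whether a Yen Sign is drawn for U+005C is then left to the font.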

The following patch fixes this:
Index: WebCore/platform/text/TextEncoding.cpp
===================================================================
--- WebCore/platform/text/TextEncoding.cpp      (revision 42061)
+++ WebCore/platform/text/TextEncoding.cpp      (working copy)
@@ -157,9 +157,13 @@ UChar TextEncoding::backslashAsCurrencySymbol() co

     // The text encodings below treat backslash as a currency symbol.
     // See http://blogs.msdn.com/michkap/archive/2005/09/17/469941.aspx for more information.
+    // But in CP932, 0x5C is mapped to U+005C
+    // (and displayed with the Yen Sign glyph in Japanese fonts),
+    // and Shift_JIS should follow CP932.
+    // 0x00-0x7F of EUC-JP is US-ASCII.
+    // So 0x5C is U+00A5 only in Shift_JIS_X0213-2000.
     static const char* const a = atomicCanonicalTextEncodingName("Shift_JIS_X0213-2000");
-    static const char* const b = atomicCanonicalTextEncodingName("EUC-JP");
-    return (m_name == a || m_name == b) ? 0x00A5 : '\\';
+    return (m_name == a) ? 0x00A5 : '\\';
 }

 bool TextEncoding::isNonByteBasedEncoding() const
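
The intended behavior after the patch can be sketched as a standalone function. This is an illustration under stated assumptions, not the WebKit implementation: backslashAsCurrencySymbolFor is a hypothetical name, and it compares encoding names as plain strings, whereas the real code compares the pointers returned by atomicCanonicalTextEncodingName.

```cpp
#include <string>

// Hypothetical stand-in for TextEncoding::backslashAsCurrencySymbol()
// as it would behave with the patch applied. The caller is assumed to
// pass an already-canonicalized encoding name.
char16_t backslashAsCurrencySymbolFor(const std::string& canonicalName)
{
    // Only Shift_JIS_X0213-2000 maps 0x5C to U+00A5 (YEN SIGN).
    // Every other encoding, including EUC-JP, keeps U+005C.
    return canonicalName == "Shift_JIS_X0213-2000"
        ? static_cast<char16_t>(0x00A5)
        : u'\\';
}
```

With this logic, passing "EUC-JP" yields U+005C, while "Shift_JIS_X0213-2000" still yields U+00A5.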

