[Webkit-unassigned] [Bug 24906] 0x5C of EUC-JP is not Yen Sign but U+005C

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Tue Apr 7 02:01:37 PDT 2009


https://bugs.webkit.org/show_bug.cgi?id=24906





------- Comment #25 from naruse at airemix.jp  2009-04-07 02:01 PDT -------
(In reply to comment #21)
> Internally, I'm getting a different input - it would not be acceptable to break
> the display of yen sign.

EUC-JP is defined in the Unix context.
EUC is a framework for multi-national language systems.
EUC has four code sets: code set 0 is always US-ASCII, and the others are
language specific.
(described in "UNIX System V リリース 4 国際化機能 (MNLS) 機能説明書", the
 UNIX System V Release 4 Internationalization Features (MNLS) functional
 description)
http://www.amazon.co.jp/gp/switch-language/product/4320097165/ref=dp_change_lang?ie=UTF8&language=en%5FJP
(Historically, implementations have been allowed to display 0x5C as a yen sign,
 but the specs say that 0x5C is US-ASCII.)
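
The code set layout above can be sketched as follows. This is only an
illustration: the function name euc_jp_code_set is hypothetical, the byte
ranges are the usual EUC-JP layout, and Python's "euc_jp" codec stands in for a
decoder (it is not WebKit's implementation).

```python
def euc_jp_code_set(lead_byte: int) -> int:
    """Return the EUC code set (0-3) selected by a character's first byte.

    Hypothetical helper for illustration; ranges follow the usual EUC-JP layout.
    """
    if lead_byte < 0x80:
        return 0  # code set 0: one byte in the 7-bit range
    if lead_byte == 0x8E:
        return 2  # code set 2: SS2 prefix, half-width katakana
    if lead_byte == 0x8F:
        return 3  # code set 3: SS3 prefix, JIS X 0212
    if 0xA1 <= lead_byte <= 0xFE:
        return 1  # code set 1: two bytes, JIS X 0208
    raise ValueError(f"0x{lead_byte:02X} cannot start an EUC-JP character")


# Byte 0x5C falls in code set 0 and decodes to U+005C.
backslash = b"\x5c".decode("euc_jp")

# Hiragana A (U+3042) is in code set 1; half-width katakana A (U+FF71) is in code set 2.
hiragana_a = "\u3042".encode("euc_jp")
katakana_a = "\uff71".encode("euc_jp")

print(euc_jp_code_set(0x5C), hex(ord(backslash)))
print(euc_jp_code_set(hiragana_a[0]), euc_jp_code_set(katakana_a[0]))
```

Note that this only shows which code point the byte maps to. Which glyph is
drawn for U+005C is decided by the font, not by the decoder.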

If you are talking about the JIS versions of EUC:
EUC-JISX0213 (and EUC-JP-2004) are defined by JIS.
But they also define the 7-bit range as ISO/IEC 646 IRV.


Which spec have the people read who say that the glyph for 0x5C must be a yen
sign? And if there is such a spec, is it more authoritative than the IANA
Character Sets registry, UNIX System V Release 4 MNLS, and JIS?

> Really, if it were ok, Microsoft would have removed
> this quirk from its fonts years ago, I'm sure it's a pain for them to have it.

I know you love the yen sign glyph, and as a Japanese I thank you for it.
But can you imagine data that expects 0x5C to be the backslash glyph?
As I explained above, many standards define the 7-bit range as US-ASCII,
and much data expects that.
(Backslash is usually used as an escape character in EUC-JP text,
 so its glyph matters less than the yen sign's.
 But logically, it is just as important.)
For example, on the following page, you would expect "\n" to be displayed as
backslash-n.
http://cms.phys.s.u-tokyo.ac.jp/~naoki/CIPINTRO/CBEG/cbeg6.html
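
A minimal sketch of that kind of data, assuming a made-up line of C source
stored as EUC-JP (the sample string is hypothetical and is not taken from the
page above). Python's "euc_jp" codec is used only for illustration.

```python
import unicodedata

# Hypothetical sample: a C printf call whose string literal contains Japanese
# text followed by the escape sequence backslash-n, stored as EUC-JP bytes.
source_text = 'printf("\u3053\u3093\u306b\u3061\u306f\\n");'
source_bytes = source_text.encode("euc_jp")

# In EUC-JP every byte of a multi-byte character is 0x80 or above, so 0x5C can
# only ever be the single-byte character from code set 0.
escape_byte_count = source_bytes.count(b"\x5c")

decoded = source_bytes.decode("euc_jp")
pos = decoded.index('n"')            # the "n" of the escape sequence
escape_char = decoded[pos - 1]       # the character right before it

print(escape_byte_count, hex(ord(escape_char)), unicodedata.name(escape_char))
```

The escape character comes back as U+005C, so a reader of the source would
expect it to be drawn as a backslash.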

If Microsoft removed this quirk from its fonts,
it would affect not only 0x5C in EUC-JP but also U+005C (see my example).
And that is already the case if you use Arial Unicode MS.
http://en.wikipedia.org/wiki/Arial_Unicode_MS
Moreover, on Ubuntu (which uses VL Gothic) or Mac (which uses Hiragino), it is
a backslash glyph.

> Currently, this bug doesn't have any examples of pages that work in IE
> properly, but do not work in Safari (http://doc.okkez.net/187/view/spec/regexp
> looks in IE exactly as it does in Safari, even though the rendering can be
> considered wrong).

I have added more examples for IE, Firefox on Mac, and Safari on Mac.
Do you need more explanation or examples?

