[Webkit-unassigned] [Bug 24906] 0x5C of EUC-JP is not Yen Sign but U+005C

bugzilla-daemon at webkit.org
Sun Mar 29 11:42:06 PDT 2009


https://bugs.webkit.org/show_bug.cgi?id=24906





------- Comment #13 from naruse at airemix.jp  2009-03-29 11:42 PDT -------
(In reply to comment #12)
> One problem is that on non-Windows platforms, the fonts are not hacked, so a
> lot of pages that display money amounts using the 0x5c character in MS fonts
> would presumably suddenly look wrong if we took this patch as is.

In a non-Japanese user's environment, that may be true.
But Japanese users use Japanese fonts.

For example, the IPA font is a 'hacked' font.
http://ossipedia.ipa.go.jp/ipafont/
And VL Gothic is a normal font.
http://dicey.org/vlgothic/

We can choose a suitable font.

This is because the problem derives from JIS X 0201.
JIS X 0201 is a variant of ISO 646, and in JIS X 0201, 0x5C is the yen sign.
So this is a long-standing and widespread problem, and we have both backslash
fonts and yen sign fonts.
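
To make the difference concrete, here is a minimal sketch in Python. The
helper names (decode_ascii, decode_jisx0201_roman) are hypothetical and exist
only for this illustration. It assumes the usual description of the JIS X 0201
Roman set: the same as US-ASCII except that 0x5C is the yen sign and 0x7E is
the overline.

```python
# JIS X 0201 Roman differs from US-ASCII at two code points (assumed here).
JISX0201_ROMAN_DIFFS = {
    0x5C: "\u00A5",  # YEN SIGN where ASCII has REVERSE SOLIDUS
    0x7E: "\u203E",  # OVERLINE where ASCII has TILDE
}


def decode_ascii(data: bytes) -> str:
    """Interpret 7-bit bytes as US-ASCII."""
    return data.decode("ascii")


def decode_jisx0201_roman(data: bytes) -> str:
    """Interpret 7-bit bytes as JIS X 0201 Roman (hypothetical helper)."""
    chars = []
    for byte in data:
        if byte > 0x7F:
            raise ValueError(f"not a 7-bit byte: 0x{byte:02X}")
        chars.append(JISX0201_ROMAN_DIFFS.get(byte, chr(byte)))
    return "".join(chars)


if __name__ == "__main__":
    price = b"\x5c500"  # the same bytes, read under two different standards
    for decode in (decode_ascii, decode_jisx0201_roman):
        text = decode(price)
        print(decode.__name__, [f"U+{ord(c):04X}" for c in text])
```

The byte is the same in both cases; only the standard used to read it
changes which character it denotes.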

> Another issue is that if the user searches for "Y5" (I'm using Y instead of yen
> sign here to please bugzilla), the browser wouldn't find "Y5" on the page - and
> that's clearly not desired behavior. Similarly, a money amount would suddenly
> look broken when copied from a Web page to another application that uses a
> different font.

The widespread Japanese keyboard, JP106, has keys labeled with a yen sign
and with a backslash.
http://www.stanford.edu/class/cs140/projects/pintos/specs/kbd/jp106.jpg
But both of them are assigned to 0x5C of CP932 (which is mapped to U+005C).

So Japanese users usually search for the character (backslash or yen sign) as
U+005C.
In other words, the problem ``the browser wouldn't find "Y5" on the page''
is a *current* problem, not a separate issue.

# In Internet Explorer on Japanese Windows,
# both U+005C and U+00A5 fall back to 0x5C of CP932.
# So 0x5C matches both U+005C and U+00A5.
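
The matching behavior in the note above can be sketched as follows. This is
only an illustration of the idea, not how Internet Explorer implements it,
and the function names (fold_yen, find_folded) are hypothetical. Both sides
of the comparison are folded to U+005C for matching only; the stored text is
left unchanged.

```python
YEN_SIGN = "\u00A5"
REVERSE_SOLIDUS = "\u005C"


def fold_yen(text: str) -> str:
    """Fold U+00A5 to U+005C, mimicking the CP932 fallback for matching."""
    return text.replace(YEN_SIGN, REVERSE_SOLIDUS)


def find_folded(haystack: str, needle: str) -> int:
    """Return the index of needle in haystack, treating U+00A5 as U+005C.

    The fold replaces one character with one character, so the returned
    index is also valid for the original, unfolded haystack.
    """
    return fold_yen(haystack).find(fold_yen(needle))


if __name__ == "__main__":
    page = "Price: \\5"      # page text containing U+005C
    query = YEN_SIGN + "5"   # user query containing U+00A5
    print(page.find(query))          # plain search
    print(find_folded(page, query))  # folded search
```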

> Technically, it is possible to make the currency glyph substitution happen only
> for rendering purposes, but then, search and copy will be broken. This is what
> makes me think that per-font transcoding is the only practical way to improve
> the current behavior.

I want no substitution and no per-font transcoding.
We know that 0x5C of EUC-JP may be a backslash (US-ASCII) or a yen sign (JIS X
0201).
Moreover, in Japan, U+005C is also both a backslash and a yen sign.
This is not only a problem of EUC-JP; Shift_JIS and Unicode in Japan have
the same problem.
But we have workarounds.
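
As a rough check of the point that the ambiguity survives into Unicode, the
sketch below decodes the single byte 0x5C with three Japanese codecs shipped
with CPython. As far as I know, all three yield U+005C; whether that code
point is then drawn as a backslash or a yen sign is decided by the font, not
by the decoder. The function name decode_0x5c is hypothetical.

```python
RAW = b"\x5c"
CODECS = ("euc_jp", "shift_jis", "cp932")


def decode_0x5c() -> dict:
    """Decode byte 0x5C with each codec and return the resulting characters."""
    return {codec: RAW.decode(codec) for codec in CODECS}


if __name__ == "__main__":
    for codec, char in decode_0x5c().items():
        print(f"{codec}: U+{ord(char):04X}")
```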

Such substitutions and per-font transcodings only make the problem more
difficult.
0x5C of EUC-JP is neither purely a yen sign nor purely a backslash.
It unifies both.
So DON'T let the browser separate them.

