<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>[197781] trunk</title>
</head>
<body>
<style type="text/css"><!--
#msg dl.meta { border: 1px #006 solid; background: #369; padding: 6px; color: #fff; }
#msg dl.meta dt { float: left; width: 6em; font-weight: bold; }
#msg dt:after { content:':';}
#msg dl, #msg dt, #msg ul, #msg li, #header, #footer, #logmsg { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; }
#msg dl a { font-weight: bold}
#msg dl a:link { color:#fc3; }
#msg dl a:active { color:#ff0; }
#msg dl a:visited { color:#cc6; }
h3 { font-family: verdana,arial,helvetica,sans-serif; font-size: 10pt; font-weight: bold; }
#msg pre { overflow: auto; background: #ffc; border: 1px #fa0 solid; padding: 6px; }
#logmsg { background: #ffc; border: 1px #fa0 solid; padding: 1em 1em 0 1em; }
#logmsg p, #logmsg pre, #logmsg blockquote { margin: 0 0 1em 0; }
#logmsg p, #logmsg li, #logmsg dt, #logmsg dd { line-height: 14pt; }
#logmsg h1, #logmsg h2, #logmsg h3, #logmsg h4, #logmsg h5, #logmsg h6 { margin: .5em 0; }
#logmsg h1:first-child, #logmsg h2:first-child, #logmsg h3:first-child, #logmsg h4:first-child, #logmsg h5:first-child, #logmsg h6:first-child { margin-top: 0; }
#logmsg ul, #logmsg ol { padding: 0; list-style-position: inside; margin: 0 0 0 1em; }
#logmsg ul { text-indent: -1em; padding-left: 1em; }#logmsg ol { text-indent: -1.5em; padding-left: 1.5em; }
#logmsg > ul, #logmsg > ol { margin: 0 0 1em 0; }
#logmsg pre { background: #eee; padding: 1em; }
#logmsg blockquote { border: 1px solid #fa0; border-left-width: 10px; padding: 1em 1em 0 1em; background: white;}
#logmsg dl { margin: 0; }
#logmsg dt { font-weight: bold; }
#logmsg dd { margin: 0; padding: 0 0 0.5em 0; }
#logmsg dd:before { content:'\00bb';}
#logmsg table { border-spacing: 0px; border-collapse: collapse; border-top: 4px solid #fa0; border-bottom: 1px solid #fa0; background: #fff; }
#logmsg table th { text-align: left; font-weight: normal; padding: 0.2em 0.5em; border-top: 1px dotted #fa0; }
#logmsg table td { text-align: right; border-top: 1px dotted #fa0; padding: 0.2em 0.5em; }
#logmsg table thead th { text-align: center; border-bottom: 1px solid #fa0; }
#logmsg table th.Corner { text-align: left; }
#logmsg hr { border: none 0; border-top: 2px dashed #fa0; height: 1px; }
#header, #footer { color: #fff; background: #636; border: 1px #300 solid; padding: 6px; }
#patch { width: 100%; }
#patch h4 {font-family: verdana,arial,helvetica,sans-serif;font-size:10pt;padding:8px;background:#369;color:#fff;margin:0;}
#patch .propset h4, #patch .binary h4 {margin:0;}
#patch pre {padding:0;line-height:1.2em;margin:0;}
#patch .diff {width:100%;background:#eee;padding: 0 0 10px 0;overflow:auto;}
#patch .propset .diff, #patch .binary .diff {padding:10px 0;}
#patch span {display:block;padding:0 10px;}
#patch .modfile, #patch .addfile, #patch .delfile, #patch .propset, #patch .binary, #patch .copfile {border:1px solid #ccc;margin:10px 0;}
#patch ins {background:#dfd;text-decoration:none;display:block;padding:0 10px;}
#patch del {background:#fdd;text-decoration:none;display:block;padding:0 10px;}
#patch .lines, .info {color:#888;background:#fff;}
--></style>
<div id="msg">
<dl class="meta">
<dt>Revision</dt> <dd><a href="http://trac.webkit.org/projects/webkit/changeset/197781">197781</a></dd>
<dt>Author</dt> <dd>msaboff@apple.com</dd>
<dt>Date</dt> <dd>2016-03-08 10:35:58 -0800 (Tue, 08 Mar 2016)</dd>
</dl>
<h3>Log Message</h3>
<pre>[ES6] Regular Expression canonicalization tables for Unicode need to be updated to use Unicode CaseFolding.txt
https://bugs.webkit.org/show_bug.cgi?id=155114
Reviewed by Darin Adler.
Source/JavaScriptCore:
Extracted out the Unicode canonicalization table creation from
YarrCanonicalizeUnicode.js into a new Python script, generateYarrCanonicalizeUnicode.
That script generates the Unicode tables as the file YarrCanonicalizeUnicode.cpp in
DerivedSources/JavaScriptCore.
Updated the processing of ignore case to make the ASCII short cuts dependent on whether
or not we are a Unicode pattern.
Renamed yarr/YarrCanonicalizeUnicode.{cpp,js} back to their prior names,
YarrCanonicalizeUCS2.{cpp,js}.
Renamed yarr/YarrCanonicalizeUnicode.h to YarrCanonicalize.h as it declares both the
legacy UCS2 and Unicode tables.
* CMakeLists.txt:
* DerivedSources.make:
* JavaScriptCore.xcodeproj/project.pbxproj:
* generateYarrCanonicalizeUnicode: Added.
* ucd: Added.
* ucd/CaseFolding.txt: Added. The current version, 8.0, of the Unicode CaseFolding table.
* yarr/YarrCanonicalizeUCS2.cpp: Copied from Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.cpp.
* yarr/YarrCanonicalize.h: Copied from Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.h.
* yarr/YarrCanonicalizeUCS2.js: Copied from Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.js.
(printHeader):
* yarr/YarrCanonicalizeUnicode.cpp: Removed.
* yarr/YarrCanonicalizeUnicode.h: Removed.
* yarr/YarrCanonicalizeUnicode.js: Removed.
* yarr/YarrInterpreter.cpp:
(JSC::Yarr::Interpreter::tryConsumeBackReference):
* yarr/YarrJIT.cpp:
* yarr/YarrPattern.cpp:
(JSC::Yarr::CharacterClassConstructor::putChar):
LayoutTests:
Updated test cases.
* js/regexp-unicode-expected.txt:
* js/script-tests/regexp-unicode.js:</pre>
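<p>As a rough illustration of the table generation described in the log message, the following Python 3 sketch keeps only the common (C) and simple (S) entries of a CaseFolding.txt-style input and groups code points by their folded value. The sample lines, the <code>build_groups</code> name, and the use of sets are assumptions made for this example; it is not the committed generateYarrCanonicalizeUnicode script.</p>

```python
import re
from collections import defaultdict

# Sample input in CaseFolding.txt layout: code; status; mapping; # name
# (a hand-picked excerpt for illustration, not the full 8.0 table)
SAMPLE = """\
004B; C; 006B; # LATIN CAPITAL LETTER K
0053; C; 0073; # LATIN CAPITAL LETTER S
00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
017F; C; 0073; # LATIN SMALL LETTER LONG S
1E9A; F; 0061 02BE; # LATIN SMALL LETTER A WITH RIGHT HALF RING
212A; C; 006B; # KELVIN SIGN
"""

# Accept only status C (common) or S (simple); F and T lines do not match.
LINE_RE = re.compile(r"([0-9A-F]+)\s*;\s*[CS]\s*;\s*([0-9A-F]+)", re.IGNORECASE)


def build_groups(text):
    """Map each folded code point to the set of code points that fold to it."""
    groups = defaultdict(set)
    for line in text.splitlines():
        line = line.split("#", 1)[0].rstrip()  # drop trailing comment
        match = LINE_RE.match(line)
        if not match:
            continue
        code = int(match.group(1), 16)
        mapping = int(match.group(2), 16)
        # Assumption of this sketch: the folded value belongs to its own group.
        groups[mapping].update((code, mapping))
    return groups


groups = build_groups(SAMPLE)
for mapping in sorted(groups):
    members = ", ".join("U+%04X" % c for c in sorted(groups[mapping]))
    print("U+%04X: %s" % (mapping, members))
```

<p>With this sample, U+0053, U+0073 and U+017F end up in one group, while the F-only entries U+00DF and U+1E9A are left out.</p>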
<h3>Modified Paths</h3>
<ul>
<li><a href="#trunkLayoutTestsChangeLog">trunk/LayoutTests/ChangeLog</a></li>
<li><a href="#trunkLayoutTestsjsregexpunicodeexpectedtxt">trunk/LayoutTests/js/regexp-unicode-expected.txt</a></li>
<li><a href="#trunkLayoutTestsjsscripttestsregexpunicodejs">trunk/LayoutTests/js/script-tests/regexp-unicode.js</a></li>
<li><a href="#trunkSourceJavaScriptCoreCMakeListstxt">trunk/Source/JavaScriptCore/CMakeLists.txt</a></li>
<li><a href="#trunkSourceJavaScriptCoreChangeLog">trunk/Source/JavaScriptCore/ChangeLog</a></li>
<li><a href="#trunkSourceJavaScriptCoreDerivedSourcesmake">trunk/Source/JavaScriptCore/DerivedSources.make</a></li>
<li><a href="#trunkSourceJavaScriptCoreJavaScriptCorexcodeprojprojectpbxproj">trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrInterpretercpp">trunk/Source/JavaScriptCore/yarr/YarrInterpreter.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrJITcpp">trunk/Source/JavaScriptCore/yarr/YarrJIT.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrPatterncpp">trunk/Source/JavaScriptCore/yarr/YarrPattern.cpp</a></li>
</ul>
<h3>Added Paths</h3>
<ul>
<li><a href="#trunkSourceJavaScriptCoregenerateYarrCanonicalizeUnicode">trunk/Source/JavaScriptCore/generateYarrCanonicalizeUnicode</a></li>
<li>trunk/Source/JavaScriptCore/ucd/</li>
<li><a href="#trunkSourceJavaScriptCoreucdCaseFoldingtxt">trunk/Source/JavaScriptCore/ucd/CaseFolding.txt</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrCanonicalizeh">trunk/Source/JavaScriptCore/yarr/YarrCanonicalize.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrCanonicalizeUCS2cpp">trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrCanonicalizeUCS2js">trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.js</a></li>
</ul>
<h3>Removed Paths</h3>
<ul>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodecpp">trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.cpp</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodeh">trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.h</a></li>
<li><a href="#trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodejs">trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.js</a></li>
</ul>
</div>
<div id="patch">
<h3>Diff</h3>
<a id="trunkLayoutTestsChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/LayoutTests/ChangeLog (197780 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/LayoutTests/ChangeLog        2016-03-08 18:19:51 UTC (rev 197780)
+++ trunk/LayoutTests/ChangeLog        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -1,3 +1,15 @@
</span><ins>+2016-03-08 Michael Saboff &lt;msaboff@apple.com&gt;
+
+ [ES6] Regular Expression canonicalization tables for Unicode need to be updated to use Unicode CaseFolding.txt
+ https://bugs.webkit.org/show_bug.cgi?id=155114
+
+ Reviewed by Darin Adler.
+
+ Updated test cases.
+
+ * js/regexp-unicode-expected.txt:
+ * js/script-tests/regexp-unicode.js:
+
</ins><span class="cx"> 2016-03-08 Commit Queue &lt;commit-queue@webkit.org&gt;
</span><span class="cx">
</span><span class="cx"> Unreviewed, rolling out r197765.
</span></span></pre></div>
<a id="trunkLayoutTestsjsregexpunicodeexpectedtxt"></a>
<div class="modfile"><h4>Modified: trunk/LayoutTests/js/regexp-unicode-expected.txt (197780 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/LayoutTests/js/regexp-unicode-expected.txt        2016-03-08 18:19:51 UTC (rev 197780)
+++ trunk/LayoutTests/js/regexp-unicode-expected.txt        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -89,10 +89,10 @@
</span><span class="cx"> PASS match6[0] is "a𐐒𐐒b𐐺𐐒"
</span><span class="cx"> PASS match6[1] is undefined.
</span><span class="cx"> PASS match6[2] is "𐐒𐐒"
</span><del>-PASS /ẚbc/ui.test("abc") is true
-PASS /abc/ui.test("ẚbc") is true
-PASS /texẗ/ui.test("text") is true
-PASS /text/ui.test("ẗext") is true
</del><ins>+PASS /ſtop/ui.test("stop") is true
+PASS /stop/ui.test("ſtop") is true
+PASS /Kelvin/ui.test("kelvin") is true
+PASS /KELVIN/ui.test("Kelvin") is true
</ins><span class="cx"> PASS /\u{1}/.test("u") is true
</span><span class="cx"> PASS /\u{4}/.test("u") is false
</span><span class="cx"> PASS /\u{4}/.test("uuuu") is true
</span></span></pre></div>
<a id="trunkLayoutTestsjsscripttestsregexpunicodejs"></a>
<div class="modfile"><h4>Modified: trunk/LayoutTests/js/script-tests/regexp-unicode.js (197780 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/LayoutTests/js/script-tests/regexp-unicode.js        2016-03-08 18:19:51 UTC (rev 197780)
+++ trunk/LayoutTests/js/script-tests/regexp-unicode.js        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -137,10 +137,10 @@
</span><span class="cx"> shouldBe('match6[2]', '"\u{10412}\u{10412}"');
</span><span class="cx">
</span><span class="cx"> // Check unicode case insensitive matches
</span><del>-shouldBeTrue('/\u1e9Abc/ui.test("abc")');
-shouldBeTrue('/abc/ui.test("\u1e9Abc")');
-shouldBeTrue('/tex\u1e97/ui.test("text")');
-shouldBeTrue('/text/ui.test("\u1e97ext")');
</del><ins>+shouldBeTrue('/\u017ftop/ui.test("stop")');
+shouldBeTrue('/stop/ui.test("\u017ftop")');
+shouldBeTrue('/\u212aelvin/ui.test("kelvin")');
+shouldBeTrue('/KELVIN/ui.test("\u212aelvin")');
</ins><span class="cx">
</span><span class="cx"> // Verify that without the unicode flag, \u{} doesn't parse to a unicode escapes, but to a counted match of the character 'u'.
</span><span class="cx"> shouldBeTrue('/\\u{1}/.test("u")');
</span></span></pre></div>
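<p>The replaced test cases can be read against simple case folding. In CaseFolding.txt, U+017F (long s) and U+212A (Kelvin sign) carry common (C) mappings to "s" and "k", whereas U+1E9A and U+1E97 only carry full (F) mappings to multi-character sequences, which may be why the earlier cases were dropped. A minimal Python 3 sketch, assuming a hand-written two-entry fold table and a hypothetical <code>equal_ignoring_case</code> helper rather than the generated Yarr tables:</p>

```python
# Hypothetical excerpt of simple (status C) foldings; not the generated table.
SIMPLE_FOLD = {
    0x017F: 0x0073,  # LATIN SMALL LETTER LONG S -> s
    0x212A: 0x006B,  # KELVIN SIGN -> k
}


def fold(ch):
    """Fold one character using the excerpt above, with an ASCII fallback."""
    cp = ord(ch)
    if cp in SIMPLE_FOLD:
        return chr(SIMPLE_FOLD[cp])
    # For this sketch, only ASCII letters are lowercased; everything else
    # is treated as folding to itself.
    return ch.lower() if cp in range(0x80) else ch


def equal_ignoring_case(a, b):
    """Compare two strings code point by code point after simple folding."""
    return len(a) == len(b) and all(fold(x) == fold(y) for x, y in zip(a, b))


print(equal_ignoring_case("\u017ftop", "stop"))
print(equal_ignoring_case("KELVIN", "\u212aelvin"))
print(equal_ignoring_case("\u1e9abc", "abc"))
```

<p>Because the comparison is one code point to one code point, a character whose only folding is a multi-character sequence cannot match a single ASCII letter in this model.</p>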
<a id="trunkSourceJavaScriptCoreCMakeListstxt"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/CMakeLists.txt (197780 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/CMakeLists.txt        2016-03-08 18:19:51 UTC (rev 197780)
+++ trunk/Source/JavaScriptCore/CMakeLists.txt        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -831,7 +831,7 @@
</span><span class="cx"> wasm/WASMReader.cpp
</span><span class="cx">
</span><span class="cx"> yarr/RegularExpression.cpp
</span><del>- yarr/YarrCanonicalizeUnicode.cpp
</del><ins>+ yarr/YarrCanonicalizeUCS2.cpp
</ins><span class="cx"> yarr/YarrInterpreter.cpp
</span><span class="cx"> yarr/YarrJIT.cpp
</span><span class="cx"> yarr/YarrPattern.cpp
</span><span class="lines">@@ -1082,7 +1082,17 @@
</span><span class="cx"> VERBATIM)
</span><span class="cx"> ADD_SOURCE_DEPENDENCIES(${CMAKE_CURRENT_SOURCE_DIR}/yarr/YarrPattern.cpp ${DERIVED_SOURCES_JAVASCRIPTCORE_DIR}/RegExpJitTables.h)
</span><span class="cx">
</span><ins>+add_custom_command(
+ OUTPUT ${DERIVED_SOURCES_JAVASCRIPTCORE_DIR}/YarrCanonicalizeUnicode.cpp
+ MAIN_DEPENDENCY ${JAVASCRIPTCORE_DIR}/generateYarrCanonicalizeUnicode
+ DEPENDS ${JAVASCRIPTCORE_DIR}/ucd/CaseFolding.txt
+ COMMAND ${PYTHON_EXECUTABLE} ${JAVASCRIPTCORE_DIR}/generateYarrCanonicalizeUnicode ${JAVASCRIPTCORE_DIR}/ucd/CaseFolding.txt ${DERIVED_SOURCES_JAVASCRIPTCORE_DIR}/YarrCanonicalizeUnicode.cpp
+ VERBATIM)
</ins><span class="cx">
</span><ins>+list(APPEND JavaScriptCore_SOURCES
+ ${DERIVED_SOURCES_JAVASCRIPTCORE_DIR}/YarrCanonicalizeUnicode.cpp
+)
+
</ins><span class="cx"> #GENERATOR: "KeywordLookup.h": keyword decision tree used by the lexer
</span><span class="cx"> add_custom_command(
</span><span class="cx"> OUTPUT ${DERIVED_SOURCES_JAVASCRIPTCORE_DIR}/KeywordLookup.h
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreChangeLog"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/ChangeLog (197780 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/ChangeLog        2016-03-08 18:19:51 UTC (rev 197780)
+++ trunk/Source/JavaScriptCore/ChangeLog        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -1,3 +1,42 @@
</span><ins>+2016-03-08 Michael Saboff &lt;msaboff@apple.com&gt;
+
+ [ES6] Regular Expression canonicalization tables for Unicode need to be updated to use Unicode CaseFolding.txt
+ https://bugs.webkit.org/show_bug.cgi?id=155114
+
+ Reviewed by Darin Adler.
+
+ Extracted out the Unicode canonicalization table creation from
+ YarrCanonicalizeUnicode.js into a new Python script, generateYarrCanonicalizeUnicode.
+ That script generates the Unicode tables as the file YarrCanonicalizeUnicode.cpp in
+ DerivedSources/JavaScriptCore.
+
+ Updated the processing of ignore case to make the ASCII short cuts dependent on whether
+ or not we are a Unicode pattern.
+
+ Renamed yarr/YarrCanonicalizeUnicode.{cpp,js} back to their prior names,
+ YarrCanonicalizeUCS2.{cpp,js}.
+ Renamed yarr/YarrCanonicalizeUnicode.h to YarrCanonicalize.h as it declares both the
+ legacy UCS2 and Unicode tables.
+
+ * CMakeLists.txt:
+ * DerivedSources.make:
+ * JavaScriptCore.xcodeproj/project.pbxproj:
+ * generateYarrCanonicalizeUnicode: Added.
+ * ucd: Added.
+ * ucd/CaseFolding.txt: Added. The current version, 8.0, of the Unicode CaseFolding table.
+ * yarr/YarrCanonicalizeUCS2.cpp: Copied from Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.cpp.
+ * yarr/YarrCanonicalize.h: Copied from Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.h.
+ * yarr/YarrCanonicalizeUCS2.js: Copied from Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.js.
+ (printHeader):
+ * yarr/YarrCanonicalizeUnicode.cpp: Removed.
+ * yarr/YarrCanonicalizeUnicode.h: Removed.
+ * yarr/YarrCanonicalizeUnicode.js: Removed.
+ * yarr/YarrInterpreter.cpp:
+ (JSC::Yarr::Interpreter::tryConsumeBackReference):
+ * yarr/YarrJIT.cpp:
+ * yarr/YarrPattern.cpp:
+ (JSC::Yarr::CharacterClassConstructor::putChar):
+
</ins><span class="cx"> 2016-03-08 Andreas Kling &lt;akling@apple.com&gt;
</span><span class="cx">
</span><span class="cx"> WeakBlock::visit() should check for a WeakHandleOwner before consulting mark bits.
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreDerivedSourcesmake"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/DerivedSources.make (197780 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/DerivedSources.make        2016-03-08 18:19:51 UTC (rev 197780)
+++ trunk/Source/JavaScriptCore/DerivedSources.make        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -61,6 +61,7 @@
</span><span class="cx"> KeywordLookup.h \
</span><span class="cx"> RegExpJitTables.h \
</span><span class="cx"> AirOpcode.h \
</span><ins>+ YarrCanonicalizeUnicode.cpp \
</ins><span class="cx"> #
</span><span class="cx">
</span><span class="cx"> # JavaScript builtins.
</span><span class="lines">@@ -272,6 +273,9 @@
</span><span class="cx"> AirOpcode.h: $(JavaScriptCore)/b3/air/opcode_generator.rb $(JavaScriptCore)/b3/air/AirOpcode.opcodes
</span><span class="cx">         $(RUBY) $^
</span><span class="cx">
</span><ins>+YarrCanonicalizeUnicode.cpp: $(JavaScriptCore)/generateYarrCanonicalizeUnicode $(JavaScriptCore)/ucd/CaseFolding.txt
+        $(PYTHON) $(JavaScriptCore)/generateYarrCanonicalizeUnicode $(JavaScriptCore)/ucd/CaseFolding.txt ./YarrCanonicalizeUnicode.cpp
+
</ins><span class="cx"> # Dynamically-defined targets are listed below. Static targets belong up top.
</span><span class="cx">
</span><span class="cx"> all : \
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreJavaScriptCorexcodeprojprojectpbxproj"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj (197780 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj        2016-03-08 18:19:51 UTC (rev 197780)
+++ trunk/Source/JavaScriptCore/JavaScriptCore.xcodeproj/project.pbxproj        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -1223,6 +1223,7 @@
</span><span class="cx">                 65C0285C1717966800351E35 /* ARMv7DOpcode.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 65C0285A1717966800351E35 /* ARMv7DOpcode.cpp */; };
</span><span class="cx">                 65C0285D1717966800351E35 /* ARMv7DOpcode.h in Headers */ = {isa = PBXBuildFile; fileRef = 65C0285B1717966800351E35 /* ARMv7DOpcode.h */; };
</span><span class="cx">                 65FB5117184EEE7000C12B70 /* ProtoCallFrame.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 65FB5116184EE9BC00C12B70 /* ProtoCallFrame.cpp */; };
</span><ins>+                65FB63A41C8EA09C0020719B /* YarrCanonicalizeUnicode.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 65A946141C8E9F6F00A7209A /* YarrCanonicalizeUnicode.cpp */; };
</ins><span class="cx">                 6AD2CB4D19B9140100065719 /* DebuggerEvalEnabler.h in Headers */ = {isa = PBXBuildFile; fileRef = 6AD2CB4C19B9140100065719 /* DebuggerEvalEnabler.h */; settings = {ATTRIBUTES = (Private, ); }; };
</span><span class="cx">                 70113D4B1A8DB093003848C4 /* IteratorOperations.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 70113D491A8DB093003848C4 /* IteratorOperations.cpp */; };
</span><span class="cx">                 70113D4C1A8DB093003848C4 /* IteratorOperations.h in Headers */ = {isa = PBXBuildFile; fileRef = 70113D4A1A8DB093003848C4 /* IteratorOperations.h */; settings = {ATTRIBUTES = (Private, ); }; };
</span><span class="lines">@@ -1326,7 +1327,7 @@
</span><span class="cx">                 862553D116136DA9009F17D0 /* JSProxy.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 862553CE16136AA5009F17D0 /* JSProxy.cpp */; };
</span><span class="cx">                 862553D216136E1A009F17D0 /* JSProxy.h in Headers */ = {isa = PBXBuildFile; fileRef = 862553CF16136AA5009F17D0 /* JSProxy.h */; settings = {ATTRIBUTES = (Private, ); }; };
</span><span class="cx">                 863B23E00FC6118900703AA4 /* MacroAssemblerCodeRef.h in Headers */ = {isa = PBXBuildFile; fileRef = 863B23DF0FC60E6200703AA4 /* MacroAssemblerCodeRef.h */; settings = {ATTRIBUTES = (Private, ); }; };
</span><del>-                863C6D9C1521111A00585E4E /* YarrCanonicalizeUnicode.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 863C6D981521111200585E4E /* YarrCanonicalizeUnicode.cpp */; };
</del><ins>+                863C6D9C1521111A00585E4E /* YarrCanonicalizeUCS2.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 863C6D981521111200585E4E /* YarrCanonicalizeUCS2.cpp */; };
</ins><span class="cx">                 8642C510151C06A90046D4EF /* RegExpCachedResult.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 86F75EFB151C062F007C9BA3 /* RegExpCachedResult.cpp */; };
</span><span class="cx">                 8642C512151C083D0046D4EF /* RegExpMatchesArray.cpp in Sources */ = {isa = PBXBuildFile; fileRef = 86F75EFD151C062F007C9BA3 /* RegExpMatchesArray.cpp */; };
</span><span class="cx">                 865A30F1135007E100CDB49E /* JSCJSValueInlines.h in Headers */ = {isa = PBXBuildFile; fileRef = 865A30F0135007E100CDB49E /* JSCJSValueInlines.h */; settings = {ATTRIBUTES = (Private, ); }; };
</span><span class="lines">@@ -3362,6 +3363,8 @@
</span><span class="cx">                 658D3A5519638268003C45D6 /* VMEntryRecord.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; lineEnding = 0; path = VMEntryRecord.h; sourceTree = "<group>"; xcLanguageSpecificationIdentifier = xcode.lang.objcpp; };
</span><span class="cx">                 65987F2C167FE84B003C2F8D /* DFGOSRExitCompilationInfo.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = DFGOSRExitCompilationInfo.h; path = dfg/DFGOSRExitCompilationInfo.h; sourceTree = "<group>"; };
</span><span class="cx">                 65987F2F16828A7E003C2F8D /* UnusedPointer.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = UnusedPointer.h; sourceTree = "<group>"; };
</span><ins>+                65A946131C8E9F2000A7209A /* generateYarrCanonicalizeUnicode */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.script.python; path = generateYarrCanonicalizeUnicode; sourceTree = "<group>"; };
+                65A946141C8E9F6F00A7209A /* YarrCanonicalizeUnicode.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = YarrCanonicalizeUnicode.cpp; sourceTree = "<group>"; };
</ins><span class="cx">                 65B8392C1BACA92A0044E824 /* CachedRecovery.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = CachedRecovery.h; sourceTree = "<group>"; };
</span><span class="cx">                 65B8392D1BACA9D30044E824 /* CachedRecovery.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = CachedRecovery.cpp; sourceTree = "<group>"; };
</span><span class="cx">                 65C0284F171795E200351E35 /* ARMv7Disassembler.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = ARMv7Disassembler.cpp; path = disassembler/ARMv7Disassembler.cpp; sourceTree = "<group>"; };
</span><span class="lines">@@ -3501,9 +3504,9 @@
</span><span class="cx">                 862553CE16136AA5009F17D0 /* JSProxy.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = JSProxy.cpp; sourceTree = "<group>"; };
</span><span class="cx">                 862553CF16136AA5009F17D0 /* JSProxy.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = JSProxy.h; sourceTree = "<group>"; };
</span><span class="cx">                 863B23DF0FC60E6200703AA4 /* MacroAssemblerCodeRef.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = MacroAssemblerCodeRef.h; sourceTree = "<group>"; };
</span><del>-                863C6D981521111200585E4E /* YarrCanonicalizeUnicode.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = YarrCanonicalizeUnicode.cpp; path = yarr/YarrCanonicalizeUnicode.cpp; sourceTree = "<group>"; };
-                863C6D991521111200585E4E /* YarrCanonicalizeUnicode.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = YarrCanonicalizeUnicode.h; path = yarr/YarrCanonicalizeUnicode.h; sourceTree = "<group>"; };
-                863C6D9A1521111200585E4E /* YarrCanonicalizeUnicode.js */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.javascript; name = YarrCanonicalizeUnicode.js; path = yarr/YarrCanonicalizeUnicode.js; sourceTree = "<group>"; };
</del><ins>+                863C6D981521111200585E4E /* YarrCanonicalizeUCS2.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; name = YarrCanonicalizeUCS2.cpp; path = yarr/YarrCanonicalizeUCS2.cpp; sourceTree = "<group>"; };
+                863C6D991521111200585E4E /* YarrCanonicalize.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = YarrCanonicalize.h; path = yarr/YarrCanonicalize.h; sourceTree = "<group>"; };
+                863C6D9A1521111200585E4E /* YarrCanonicalizeUCS2.js */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.javascript; name = YarrCanonicalizeUCS2.js; path = yarr/YarrCanonicalizeUCS2.js; sourceTree = "<group>"; };
</ins><span class="cx">                 8640923B156EED3B00566CB2 /* ARM64Assembler.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = ARM64Assembler.h; sourceTree = "<group>"; };
</span><span class="cx">                 8640923C156EED3B00566CB2 /* MacroAssemblerARM64.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = MacroAssemblerARM64.h; sourceTree = "<group>"; };
</span><span class="cx">                 865A30F0135007E100CDB49E /* JSCJSValueInlines.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = JSCJSValueInlines.h; sourceTree = "<group>"; };
</span><span class="lines">@@ -4437,6 +4440,7 @@
</span><span class="cx">                 0867D691FE84028FC02AAC07 /* JavaScriptCore */ = {
</span><span class="cx">                         isa = PBXGroup;
</span><span class="cx">                         children = (
</span><ins>+                                65A946131C8E9F2000A7209A /* generateYarrCanonicalizeUnicode */,
</ins><span class="cx">                                 8604F4F2143A6C4400B295F5 /* ChangeLog */,
</span><span class="cx">                                 F68EBB8C0255D4C601FF60F7 /* config.h */,
</span><span class="cx">                                 F692A8540255597D01FF60F7 /* create_hash_table */,
</span><span class="lines">@@ -5362,6 +5366,7 @@
</span><span class="cx">                                 996B73131BD9FA2C00331B84 /* SymbolConstructor.lut.h */,
</span><span class="cx">                                 996B73141BD9FA2C00331B84 /* SymbolPrototype.lut.h */,
</span><span class="cx">                                 5D53727D0E1C55EC0021E549 /* TracingDtrace.h */,
</span><ins>+                                65A946141C8E9F6F00A7209A /* YarrCanonicalizeUnicode.cpp */,
</ins><span class="cx">                         );
</span><span class="cx">                         name = "Derived Sources";
</span><span class="cx">                         path = DerivedSources/JavaScriptCore;
</span><span class="lines">@@ -6025,9 +6030,9 @@
</span><span class="cx">                                 A57D23EB1891B5540031C7FA /* RegularExpression.cpp */,
</span><span class="cx">                                 A57D23EC1891B5540031C7FA /* RegularExpression.h */,
</span><span class="cx">                                 451539B812DC994500EF7AC4 /* Yarr.h */,
</span><del>-                                863C6D981521111200585E4E /* YarrCanonicalizeUnicode.cpp */,
-                                863C6D991521111200585E4E /* YarrCanonicalizeUnicode.h */,
-                                863C6D9A1521111200585E4E /* YarrCanonicalizeUnicode.js */,
</del><ins>+                                863C6D981521111200585E4E /* YarrCanonicalizeUCS2.cpp */,
+                                863C6D991521111200585E4E /* YarrCanonicalize.h */,
+                                863C6D9A1521111200585E4E /* YarrCanonicalizeUCS2.js */,
</ins><span class="cx">                                 86704B7D12DBA33700A9FE7B /* YarrInterpreter.cpp */,
</span><span class="cx">                                 86704B7E12DBA33700A9FE7B /* YarrInterpreter.h */,
</span><span class="cx">                                 86704B7F12DBA33700A9FE7B /* YarrJIT.cpp */,
</span><span class="lines">@@ -9349,7 +9354,8 @@
</span><span class="cx">                                 0FC8150B14043C0E00CFA603 /* WriteBarrierSupport.cpp in Sources */,
</span><span class="cx">                                 A7E5AB3A1799E4B200D2833D /* X86Disassembler.cpp in Sources */,
</span><span class="cx">                                 0F2BBD971C5FF3F50023EF23 /* B3Variable.cpp in Sources */,
</span><del>-                                863C6D9C1521111A00585E4E /* YarrCanonicalizeUnicode.cpp in Sources */,
</del><ins>+                                863C6D9C1521111A00585E4E /* YarrCanonicalizeUCS2.cpp in Sources */,
+                                65FB63A41C8EA09C0020719B /* YarrCanonicalizeUnicode.cpp in Sources */,
</ins><span class="cx">                                 86704B8412DBA33700A9FE7B /* YarrInterpreter.cpp in Sources */,
</span><span class="cx">                                 86704B8612DBA33700A9FE7B /* YarrJIT.cpp in Sources */,
</span><span class="cx">                                 86704B8912DBA33700A9FE7B /* YarrPattern.cpp in Sources */,
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoregenerateYarrCanonicalizeUnicode"></a>
<div class="addfile"><h4>Added: trunk/Source/JavaScriptCore/generateYarrCanonicalizeUnicode (0 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/generateYarrCanonicalizeUnicode         (rev 0)
+++ trunk/Source/JavaScriptCore/generateYarrCanonicalizeUnicode        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -0,0 +1,200 @@
</span><ins>+#! /usr/bin/python
+
+# Copyright (C) 2016 Apple Inc. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# 1. Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# 2. Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in the
+# documentation and/or other materials provided with the distribution.
+#
+# THIS SOFTWARE IS PROVIDED BY APPLE AND ITS CONTRIBUTORS "AS IS" AND ANY
+# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+# DISCLAIMED. IN NO EVENT SHALL APPLE OR ITS CONTRIBUTORS BE LIABLE FOR ANY
+# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+# THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+# This tool processes the Unicode Character Database file CaseFolding.txt to create
+# canonicalization table as described in ECMAScript 6 standard in section
+# "21.2.2.8.2 Runtime Semantics: Canonicalize()", step 2.
+
+import optparse
+import re
+import sys
+from sets import Set
+
+header = """/*
+* Copyright (C) 2016 Apple Inc. All rights reserved.
+*
+* Redistribution and use in source and binary forms, with or without
+* modification, are permitted provided that the following conditions
+* are met:
+*
+* 1. Redistributions of source code must retain the above copyright
+* notice, this list of conditions and the following disclaimer.
+* 2. Redistributions in binary form must reproduce the above copyright
+* notice, this list of conditions and the following disclaimer in the
+* documentation and/or other materials provided with the distribution.
+*
+* THIS SOFTWARE IS PROVIDED BY APPLE AND ITS CONTRIBUTORS "AS IS" AND ANY
+* EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+* WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+* DISCLAIMED. IN NO EVENT SHALL APPLE OR ITS CONTRIBUTORS BE LIABLE FOR ANY
+* DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+* (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+* LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+* ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+// DO NOT EDIT! - This file was generated by generateYarrCanonicalizeUnicode
+
+#include "config.h"
+#include "YarrCanonicalize.h"
+
+namespace JSC { namespace Yarr {
+
+"""
+
+footer = """} } // JSC::Yarr
+"""
+
+MaxUnicode = 0x10ffff
+commonAndSimpleLinesRE = re.compile(r"(?P<code>[0-9A-F]+)\s*;\s*[CS]\s*;\s*(?P<mapping>[0-9A-F]+)", re.IGNORECASE)
+
+def openOrExit(path, mode):
+    try:
+        return open(path, mode)
+    except IOError as e:
+        print "I/O error opening {0}, ({1}): {2}".format(path, e.errno, e.strerror)
+        exit(1)
+
+class Canonicalize:
+    def __init__(self):
+        self.canonicalGroups = {};
+
+    def addMapping(self, code, mapping):
+        if mapping not in self.canonicalGroups:
+            self.canonicalGroups[mapping] = []
+        self.canonicalGroups[mapping].append(code)
+
+    def readCaseFolding(self, file):
+        codesSeen = Set()
+        for line in file:
+            line = line.split('#', 1)[0]
+            line = line.rstrip()
+            if (not len(line)):
+                continue
+
+            fields = commonAndSimpleLinesRE.match(line)
+            if (not fields):
+                continue
+
+            code = int(fields.group('code'), 16)
+            mapping = int(fields.group('mapping'), 16)
+
+            codesSeen.add(code)
+            self.addMapping(code, mapping)
+
+        for i in range(MaxUnicode + 1):
+            if i in codesSeen:
+                continue;
+
+            self.addMapping(i, i)
+
+    def createTables(self, file):
+        typeInfo = [""] * (MaxUnicode + 1)
+        characterSets = []
+
+        for mapping in sorted(self.canonicalGroups.keys()):
+            characters = self.canonicalGroups[mapping]
+            if len(characters) == 1:
+                typeInfo[characters[0]] = "CanonicalizeUnique:0"
+            else:
+                characters.sort()
+                if len(characters) > 2:
+                    for ch in characters:
+                        typeInfo[ch] = "CanonicalizeSet:%d" % len(characterSets)
+                    characterSets.append(characters)
+                else:
+                    low = characters[0]
+                    high = characters[1]
+                    delta = high - low
+                    if delta == 1:
+                        type = "CanonicalizeAlternatingUnaligned:0" if low & 1 else "CanonicalizeAlternatingAligned:0"
+                        typeInfo[low] = type
+                        typeInfo[high] = type
+                    else:
+                        typeInfo[low] = "CanonicalizeRangeLo:%d" % delta
+                        typeInfo[high] = "CanonicalizeRangeHi:%d" % delta
+
+        rangeInfo = []
+        end = 0
+        while end <= MaxUnicode:
+            begin = end
+            type = typeInfo[end]
+            while end < MaxUnicode and typeInfo[end + 1] == type:
+                end = end + 1
+            rangeInfo.append({"begin": begin, "end": end, "type": type})
+            end = end + 1
+
+        for i in range(len(characterSets)):
+            characters = ""
+            set = characterSets[i]
+            for ch in set:
+                characters = characters + "0x{character:04x}, ".format(character=ch)
+            file.write("const UChar32 unicodeCharacterSet{index:d}[] = {{ {characters}0 }};\n".format(index=i, characters=characters))
+
+        file.write("\n")
+        file.write("static const size_t UNICODE_CANONICALIZATION_SETS = {setCount:d};\n".format(setCount=len(characterSets)))
+        file.write("const UChar32* const unicodeCharacterSetInfo[UNICODE_CANONICALIZATION_SETS] = {\n")
+
+        for i in range(len(characterSets)):
+            file.write(" unicodeCharacterSet{setNumber:d},\n".format(setNumber=i))
+
+        file.write("};\n")
+        file.write("\n")
+        file.write("const size_t UNICODE_CANONICALIZATION_RANGES = {rangeCount:d};\n".format(rangeCount=len(rangeInfo)))
+        file.write("const CanonicalizationRange unicodeRangeInfo[UNICODE_CANONICALIZATION_RANGES] = {\n")
+
+        for info in rangeInfo:
+            typeAndValue = info["type"].split(":")
+            file.write(" {{ 0x{begin:04x}, 0x{end:04x}, 0x{value:04x}, {type} }},\n".format(begin=info["begin"], end=info["end"], value=int(typeAndValue[1]), type=typeAndValue[0]))
+
+        file.write("};\n")
+        file.write("\n")
+
+
+if __name__ == "__main__":
+    parser = optparse.OptionParser(usage = "usage: %prog <CaseFolding.txt> <YarrCanonicalizeUnicode.h>")
+    (options, args) = parser.parse_args()
+
+    if len(args) != 2:
+        parser.error("<CaseFolding.txt> <YarrCanonicalizeUnicode.h>")
+
+    caseFoldingTxtPath = args[0]
+    canonicalizeHPath = args[1]
+    caseFoldingTxtFile = openOrExit(caseFoldingTxtPath, "r")
+    canonicalizeHFile = openOrExit(canonicalizeHPath, "wb")
+
+    canonicalize = Canonicalize()
+    canonicalize.readCaseFolding(caseFoldingTxtFile)
+
+    canonicalizeHFile.write(header);
+    canonicalize.createTables(canonicalizeHFile)
+    canonicalizeHFile.write(footer);
+
+    caseFoldingTxtFile.close()
+    canonicalizeHFile.close()
+
+    exit(0)
</ins></span></pre></div>
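<p>Illustrative note, not part of this commit: a minimal Python 3 sketch of the grouping and classification steps that the script above performs in readCaseFolding and createTables. The names LINE_RE, fold_groups, classify and SAMPLE are hypothetical and exist only for this example. As a simplifying assumption, the sketch adds only each fold target to its group, whereas the script walks every code point up to 0x10ffff. The sample lines are copied from the CaseFolding.txt data added in this revision.</p>

```python
import re
from collections import defaultdict

# Same pattern shape as the script's commonAndSimpleLinesRE:
# only lines with status C (common) or S (simple) match.
LINE_RE = re.compile(
    r"(?P<code>[0-9A-F]+)\s*;\s*[CS]\s*;\s*(?P<mapping>[0-9A-F]+)",
    re.IGNORECASE)

def fold_groups(lines):
    """Group code points by the code point they fold to."""
    groups = defaultdict(list)
    seen = set()
    for line in lines:
        line = line.split("#", 1)[0].rstrip()  # drop trailing comment
        fields = LINE_RE.match(line)
        if not fields:
            continue  # blank, comment-only, or status F/T
        code = int(fields.group("code"), 16)
        mapping = int(fields.group("mapping"), 16)
        seen.add(code)
        groups[mapping].append(code)
    # Simplification: an unlisted fold target maps to itself.
    for mapping in list(groups):
        if mapping not in seen:
            groups[mapping].append(mapping)
    return {m: sorted(c) for m, c in groups.items()}

def classify(members):
    """Return (type name, value) using the script's rules for one group."""
    if len(members) == 1:
        return ("CanonicalizeUnique", 0)
    if len(members) > 2:
        return ("CanonicalizeSet", 0)
    low, high = members
    delta = high - low
    if delta == 1:
        if low & 1:
            return ("CanonicalizeAlternatingUnaligned", 0)
        return ("CanonicalizeAlternatingAligned", 0)
    # The script tags the low member RangeLo and the high member RangeHi.
    return ("CanonicalizeRangeLo/Hi", delta)

SAMPLE = [
    "0041; C; 0061; # LATIN CAPITAL LETTER A",
    "0049; T; 0131; # LATIN CAPITAL LETTER I",
    "0053; C; 0073; # LATIN CAPITAL LETTER S",
    "00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S",
    "0100; C; 0101; # LATIN CAPITAL LETTER A WITH MACRON",
    "0139; C; 013A; # LATIN CAPITAL LETTER L WITH ACUTE",
    "017F; C; 0073; # LATIN SMALL LETTER LONG S",
]

if __name__ == "__main__":
    for mapping, members in sorted(fold_groups(SAMPLE).items()):
        print(hex(mapping), [hex(m) for m in members], classify(members))
```

<p>In this sample the T and F lines are skipped, so U+0131 never becomes a fold target, while U+0053, U+0073 and U+017F end up in one three-member group.</p>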
<a id="trunkSourceJavaScriptCoreucdCaseFoldingtxt"></a>
<div class="addfile"><h4>Added: trunk/Source/JavaScriptCore/ucd/CaseFolding.txt (0 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/ucd/CaseFolding.txt         (rev 0)
+++ trunk/Source/JavaScriptCore/ucd/CaseFolding.txt        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -0,0 +1,1414 @@
</span><ins>+# CaseFolding-8.0.0.txt
+# Date: 2015-01-13, 18:16:36 GMT [MD]
+#
+# Unicode Character Database
+# Copyright (c) 1991-2015 Unicode, Inc.
+# For terms of use, see http://www.unicode.org/terms_of_use.html
+# For documentation, see http://www.unicode.org/reports/tr44/
+#
+# Case Folding Properties
+#
+# This file is a supplement to the UnicodeData file.
+# It provides a case folding mapping generated from the Unicode Character Database.
+# If all characters are mapped according to the full mapping below, then
+# case differences (according to UnicodeData.txt and SpecialCasing.txt)
+# are eliminated.
+#
+# The data supports both implementations that require simple case foldings
+# (where string lengths don't change), and implementations that allow full case folding
+# (where string lengths may grow). Note that where they can be supported, the
+# full case foldings are superior: for example, they allow "MASSE" and "Maße" to match.
+#
+# All code points not listed in this file map to themselves.
+#
+# NOTE: case folding does not preserve normalization formats!
+#
+# For information on case folding, including how to have case folding
+# preserve normalization formats, see Section 3.13 Default Case Algorithms in
+# The Unicode Standard.
+#
+# ================================================================================
+# Format
+# ================================================================================
+# The entries in this file are in the following machine-readable format:
+#
+# <code>; <status>; <mapping>; # <name>
+#
+# The status field is:
+# C: common case folding, common mappings shared by both simple and full mappings.
+# F: full case folding, mappings that cause strings to grow in length. Multiple characters are separated by spaces.
+# S: simple case folding, mappings to single characters where different from F.
+# T: special case for uppercase I and dotted uppercase I
+# - For non-Turkic languages, this mapping is normally not used.
+# - For Turkic languages (tr, az), this mapping can be used instead of the normal mapping for these characters.
+# Note that the Turkic mappings do not maintain canonical equivalence without additional processing.
+# See the discussions of case mapping in the Unicode Standard for more information.
+#
+# Usage:
+# A. To do a simple case folding, use the mappings with status C + S.
+# B. To do a full case folding, use the mappings with status C + F.
+#
+# The mappings with status T can be used or omitted depending on the desired case-folding
+# behavior. (The default option is to exclude them.)
+#
+# =================================================================
+
+# Property: Case_Folding
+
+# All code points not explicitly listed for Case_Folding
+# have the value C for the status field, and the code point itself for the mapping field.
+
+# =================================================================
+0041; C; 0061; # LATIN CAPITAL LETTER A
+0042; C; 0062; # LATIN CAPITAL LETTER B
+0043; C; 0063; # LATIN CAPITAL LETTER C
+0044; C; 0064; # LATIN CAPITAL LETTER D
+0045; C; 0065; # LATIN CAPITAL LETTER E
+0046; C; 0066; # LATIN CAPITAL LETTER F
+0047; C; 0067; # LATIN CAPITAL LETTER G
+0048; C; 0068; # LATIN CAPITAL LETTER H
+0049; C; 0069; # LATIN CAPITAL LETTER I
+0049; T; 0131; # LATIN CAPITAL LETTER I
+004A; C; 006A; # LATIN CAPITAL LETTER J
+004B; C; 006B; # LATIN CAPITAL LETTER K
+004C; C; 006C; # LATIN CAPITAL LETTER L
+004D; C; 006D; # LATIN CAPITAL LETTER M
+004E; C; 006E; # LATIN CAPITAL LETTER N
+004F; C; 006F; # LATIN CAPITAL LETTER O
+0050; C; 0070; # LATIN CAPITAL LETTER P
+0051; C; 0071; # LATIN CAPITAL LETTER Q
+0052; C; 0072; # LATIN CAPITAL LETTER R
+0053; C; 0073; # LATIN CAPITAL LETTER S
+0054; C; 0074; # LATIN CAPITAL LETTER T
+0055; C; 0075; # LATIN CAPITAL LETTER U
+0056; C; 0076; # LATIN CAPITAL LETTER V
+0057; C; 0077; # LATIN CAPITAL LETTER W
+0058; C; 0078; # LATIN CAPITAL LETTER X
+0059; C; 0079; # LATIN CAPITAL LETTER Y
+005A; C; 007A; # LATIN CAPITAL LETTER Z
+00B5; C; 03BC; # MICRO SIGN
+00C0; C; 00E0; # LATIN CAPITAL LETTER A WITH GRAVE
+00C1; C; 00E1; # LATIN CAPITAL LETTER A WITH ACUTE
+00C2; C; 00E2; # LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+00C3; C; 00E3; # LATIN CAPITAL LETTER A WITH TILDE
+00C4; C; 00E4; # LATIN CAPITAL LETTER A WITH DIAERESIS
+00C5; C; 00E5; # LATIN CAPITAL LETTER A WITH RING ABOVE
+00C6; C; 00E6; # LATIN CAPITAL LETTER AE
+00C7; C; 00E7; # LATIN CAPITAL LETTER C WITH CEDILLA
+00C8; C; 00E8; # LATIN CAPITAL LETTER E WITH GRAVE
+00C9; C; 00E9; # LATIN CAPITAL LETTER E WITH ACUTE
+00CA; C; 00EA; # LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+00CB; C; 00EB; # LATIN CAPITAL LETTER E WITH DIAERESIS
+00CC; C; 00EC; # LATIN CAPITAL LETTER I WITH GRAVE
+00CD; C; 00ED; # LATIN CAPITAL LETTER I WITH ACUTE
+00CE; C; 00EE; # LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+00CF; C; 00EF; # LATIN CAPITAL LETTER I WITH DIAERESIS
+00D0; C; 00F0; # LATIN CAPITAL LETTER ETH
+00D1; C; 00F1; # LATIN CAPITAL LETTER N WITH TILDE
+00D2; C; 00F2; # LATIN CAPITAL LETTER O WITH GRAVE
+00D3; C; 00F3; # LATIN CAPITAL LETTER O WITH ACUTE
+00D4; C; 00F4; # LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+00D5; C; 00F5; # LATIN CAPITAL LETTER O WITH TILDE
+00D6; C; 00F6; # LATIN CAPITAL LETTER O WITH DIAERESIS
+00D8; C; 00F8; # LATIN CAPITAL LETTER O WITH STROKE
+00D9; C; 00F9; # LATIN CAPITAL LETTER U WITH GRAVE
+00DA; C; 00FA; # LATIN CAPITAL LETTER U WITH ACUTE
+00DB; C; 00FB; # LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+00DC; C; 00FC; # LATIN CAPITAL LETTER U WITH DIAERESIS
+00DD; C; 00FD; # LATIN CAPITAL LETTER Y WITH ACUTE
+00DE; C; 00FE; # LATIN CAPITAL LETTER THORN
+00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
+0100; C; 0101; # LATIN CAPITAL LETTER A WITH MACRON
+0102; C; 0103; # LATIN CAPITAL LETTER A WITH BREVE
+0104; C; 0105; # LATIN CAPITAL LETTER A WITH OGONEK
+0106; C; 0107; # LATIN CAPITAL LETTER C WITH ACUTE
+0108; C; 0109; # LATIN CAPITAL LETTER C WITH CIRCUMFLEX
+010A; C; 010B; # LATIN CAPITAL LETTER C WITH DOT ABOVE
+010C; C; 010D; # LATIN CAPITAL LETTER C WITH CARON
+010E; C; 010F; # LATIN CAPITAL LETTER D WITH CARON
+0110; C; 0111; # LATIN CAPITAL LETTER D WITH STROKE
+0112; C; 0113; # LATIN CAPITAL LETTER E WITH MACRON
+0114; C; 0115; # LATIN CAPITAL LETTER E WITH BREVE
+0116; C; 0117; # LATIN CAPITAL LETTER E WITH DOT ABOVE
+0118; C; 0119; # LATIN CAPITAL LETTER E WITH OGONEK
+011A; C; 011B; # LATIN CAPITAL LETTER E WITH CARON
+011C; C; 011D; # LATIN CAPITAL LETTER G WITH CIRCUMFLEX
+011E; C; 011F; # LATIN CAPITAL LETTER G WITH BREVE
+0120; C; 0121; # LATIN CAPITAL LETTER G WITH DOT ABOVE
+0122; C; 0123; # LATIN CAPITAL LETTER G WITH CEDILLA
+0124; C; 0125; # LATIN CAPITAL LETTER H WITH CIRCUMFLEX
+0126; C; 0127; # LATIN CAPITAL LETTER H WITH STROKE
+0128; C; 0129; # LATIN CAPITAL LETTER I WITH TILDE
+012A; C; 012B; # LATIN CAPITAL LETTER I WITH MACRON
+012C; C; 012D; # LATIN CAPITAL LETTER I WITH BREVE
+012E; C; 012F; # LATIN CAPITAL LETTER I WITH OGONEK
+0130; F; 0069 0307; # LATIN CAPITAL LETTER I WITH DOT ABOVE
+0130; T; 0069; # LATIN CAPITAL LETTER I WITH DOT ABOVE
+0132; C; 0133; # LATIN CAPITAL LIGATURE IJ
+0134; C; 0135; # LATIN CAPITAL LETTER J WITH CIRCUMFLEX
+0136; C; 0137; # LATIN CAPITAL LETTER K WITH CEDILLA
+0139; C; 013A; # LATIN CAPITAL LETTER L WITH ACUTE
+013B; C; 013C; # LATIN CAPITAL LETTER L WITH CEDILLA
+013D; C; 013E; # LATIN CAPITAL LETTER L WITH CARON
+013F; C; 0140; # LATIN CAPITAL LETTER L WITH MIDDLE DOT
+0141; C; 0142; # LATIN CAPITAL LETTER L WITH STROKE
+0143; C; 0144; # LATIN CAPITAL LETTER N WITH ACUTE
+0145; C; 0146; # LATIN CAPITAL LETTER N WITH CEDILLA
+0147; C; 0148; # LATIN CAPITAL LETTER N WITH CARON
+0149; F; 02BC 006E; # LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
+014A; C; 014B; # LATIN CAPITAL LETTER ENG
+014C; C; 014D; # LATIN CAPITAL LETTER O WITH MACRON
+014E; C; 014F; # LATIN CAPITAL LETTER O WITH BREVE
+0150; C; 0151; # LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
+0152; C; 0153; # LATIN CAPITAL LIGATURE OE
+0154; C; 0155; # LATIN CAPITAL LETTER R WITH ACUTE
+0156; C; 0157; # LATIN CAPITAL LETTER R WITH CEDILLA
+0158; C; 0159; # LATIN CAPITAL LETTER R WITH CARON
+015A; C; 015B; # LATIN CAPITAL LETTER S WITH ACUTE
+015C; C; 015D; # LATIN CAPITAL LETTER S WITH CIRCUMFLEX
+015E; C; 015F; # LATIN CAPITAL LETTER S WITH CEDILLA
+0160; C; 0161; # LATIN CAPITAL LETTER S WITH CARON
+0162; C; 0163; # LATIN CAPITAL LETTER T WITH CEDILLA
+0164; C; 0165; # LATIN CAPITAL LETTER T WITH CARON
+0166; C; 0167; # LATIN CAPITAL LETTER T WITH STROKE
+0168; C; 0169; # LATIN CAPITAL LETTER U WITH TILDE
+016A; C; 016B; # LATIN CAPITAL LETTER U WITH MACRON
+016C; C; 016D; # LATIN CAPITAL LETTER U WITH BREVE
+016E; C; 016F; # LATIN CAPITAL LETTER U WITH RING ABOVE
+0170; C; 0171; # LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
+0172; C; 0173; # LATIN CAPITAL LETTER U WITH OGONEK
+0174; C; 0175; # LATIN CAPITAL LETTER W WITH CIRCUMFLEX
+0176; C; 0177; # LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
+0178; C; 00FF; # LATIN CAPITAL LETTER Y WITH DIAERESIS
+0179; C; 017A; # LATIN CAPITAL LETTER Z WITH ACUTE
+017B; C; 017C; # LATIN CAPITAL LETTER Z WITH DOT ABOVE
+017D; C; 017E; # LATIN CAPITAL LETTER Z WITH CARON
+017F; C; 0073; # LATIN SMALL LETTER LONG S
+0181; C; 0253; # LATIN CAPITAL LETTER B WITH HOOK
+0182; C; 0183; # LATIN CAPITAL LETTER B WITH TOPBAR
+0184; C; 0185; # LATIN CAPITAL LETTER TONE SIX
+0186; C; 0254; # LATIN CAPITAL LETTER OPEN O
+0187; C; 0188; # LATIN CAPITAL LETTER C WITH HOOK
+0189; C; 0256; # LATIN CAPITAL LETTER AFRICAN D
+018A; C; 0257; # LATIN CAPITAL LETTER D WITH HOOK
+018B; C; 018C; # LATIN CAPITAL LETTER D WITH TOPBAR
+018E; C; 01DD; # LATIN CAPITAL LETTER REVERSED E
+018F; C; 0259; # LATIN CAPITAL LETTER SCHWA
+0190; C; 025B; # LATIN CAPITAL LETTER OPEN E
+0191; C; 0192; # LATIN CAPITAL LETTER F WITH HOOK
+0193; C; 0260; # LATIN CAPITAL LETTER G WITH HOOK
+0194; C; 0263; # LATIN CAPITAL LETTER GAMMA
+0196; C; 0269; # LATIN CAPITAL LETTER IOTA
+0197; C; 0268; # LATIN CAPITAL LETTER I WITH STROKE
+0198; C; 0199; # LATIN CAPITAL LETTER K WITH HOOK
+019C; C; 026F; # LATIN CAPITAL LETTER TURNED M
+019D; C; 0272; # LATIN CAPITAL LETTER N WITH LEFT HOOK
+019F; C; 0275; # LATIN CAPITAL LETTER O WITH MIDDLE TILDE
+01A0; C; 01A1; # LATIN CAPITAL LETTER O WITH HORN
+01A2; C; 01A3; # LATIN CAPITAL LETTER OI
+01A4; C; 01A5; # LATIN CAPITAL LETTER P WITH HOOK
+01A6; C; 0280; # LATIN LETTER YR
+01A7; C; 01A8; # LATIN CAPITAL LETTER TONE TWO
+01A9; C; 0283; # LATIN CAPITAL LETTER ESH
+01AC; C; 01AD; # LATIN CAPITAL LETTER T WITH HOOK
+01AE; C; 0288; # LATIN CAPITAL LETTER T WITH RETROFLEX HOOK
+01AF; C; 01B0; # LATIN CAPITAL LETTER U WITH HORN
+01B1; C; 028A; # LATIN CAPITAL LETTER UPSILON
+01B2; C; 028B; # LATIN CAPITAL LETTER V WITH HOOK
+01B3; C; 01B4; # LATIN CAPITAL LETTER Y WITH HOOK
+01B5; C; 01B6; # LATIN CAPITAL LETTER Z WITH STROKE
+01B7; C; 0292; # LATIN CAPITAL LETTER EZH
+01B8; C; 01B9; # LATIN CAPITAL LETTER EZH REVERSED
+01BC; C; 01BD; # LATIN CAPITAL LETTER TONE FIVE
+01C4; C; 01C6; # LATIN CAPITAL LETTER DZ WITH CARON
+01C5; C; 01C6; # LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
+01C7; C; 01C9; # LATIN CAPITAL LETTER LJ
+01C8; C; 01C9; # LATIN CAPITAL LETTER L WITH SMALL LETTER J
+01CA; C; 01CC; # LATIN CAPITAL LETTER NJ
+01CB; C; 01CC; # LATIN CAPITAL LETTER N WITH SMALL LETTER J
+01CD; C; 01CE; # LATIN CAPITAL LETTER A WITH CARON
+01CF; C; 01D0; # LATIN CAPITAL LETTER I WITH CARON
+01D1; C; 01D2; # LATIN CAPITAL LETTER O WITH CARON
+01D3; C; 01D4; # LATIN CAPITAL LETTER U WITH CARON
+01D5; C; 01D6; # LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON
+01D7; C; 01D8; # LATIN CAPITAL LETTER U WITH DIAERESIS AND ACUTE
+01D9; C; 01DA; # LATIN CAPITAL LETTER U WITH DIAERESIS AND CARON
+01DB; C; 01DC; # LATIN CAPITAL LETTER U WITH DIAERESIS AND GRAVE
+01DE; C; 01DF; # LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON
+01E0; C; 01E1; # LATIN CAPITAL LETTER A WITH DOT ABOVE AND MACRON
+01E2; C; 01E3; # LATIN CAPITAL LETTER AE WITH MACRON
+01E4; C; 01E5; # LATIN CAPITAL LETTER G WITH STROKE
+01E6; C; 01E7; # LATIN CAPITAL LETTER G WITH CARON
+01E8; C; 01E9; # LATIN CAPITAL LETTER K WITH CARON
+01EA; C; 01EB; # LATIN CAPITAL LETTER O WITH OGONEK
+01EC; C; 01ED; # LATIN CAPITAL LETTER O WITH OGONEK AND MACRON
+01EE; C; 01EF; # LATIN CAPITAL LETTER EZH WITH CARON
+01F0; F; 006A 030C; # LATIN SMALL LETTER J WITH CARON
+01F1; C; 01F3; # LATIN CAPITAL LETTER DZ
+01F2; C; 01F3; # LATIN CAPITAL LETTER D WITH SMALL LETTER Z
+01F4; C; 01F5; # LATIN CAPITAL LETTER G WITH ACUTE
+01F6; C; 0195; # LATIN CAPITAL LETTER HWAIR
+01F7; C; 01BF; # LATIN CAPITAL LETTER WYNN
+01F8; C; 01F9; # LATIN CAPITAL LETTER N WITH GRAVE
+01FA; C; 01FB; # LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE
+01FC; C; 01FD; # LATIN CAPITAL LETTER AE WITH ACUTE
+01FE; C; 01FF; # LATIN CAPITAL LETTER O WITH STROKE AND ACUTE
+0200; C; 0201; # LATIN CAPITAL LETTER A WITH DOUBLE GRAVE
+0202; C; 0203; # LATIN CAPITAL LETTER A WITH INVERTED BREVE
+0204; C; 0205; # LATIN CAPITAL LETTER E WITH DOUBLE GRAVE
+0206; C; 0207; # LATIN CAPITAL LETTER E WITH INVERTED BREVE
+0208; C; 0209; # LATIN CAPITAL LETTER I WITH DOUBLE GRAVE
+020A; C; 020B; # LATIN CAPITAL LETTER I WITH INVERTED BREVE
+020C; C; 020D; # LATIN CAPITAL LETTER O WITH DOUBLE GRAVE
+020E; C; 020F; # LATIN CAPITAL LETTER O WITH INVERTED BREVE
+0210; C; 0211; # LATIN CAPITAL LETTER R WITH DOUBLE GRAVE
+0212; C; 0213; # LATIN CAPITAL LETTER R WITH INVERTED BREVE
+0214; C; 0215; # LATIN CAPITAL LETTER U WITH DOUBLE GRAVE
+0216; C; 0217; # LATIN CAPITAL LETTER U WITH INVERTED BREVE
+0218; C; 0219; # LATIN CAPITAL LETTER S WITH COMMA BELOW
+021A; C; 021B; # LATIN CAPITAL LETTER T WITH COMMA BELOW
+021C; C; 021D; # LATIN CAPITAL LETTER YOGH
+021E; C; 021F; # LATIN CAPITAL LETTER H WITH CARON
+0220; C; 019E; # LATIN CAPITAL LETTER N WITH LONG RIGHT LEG
+0222; C; 0223; # LATIN CAPITAL LETTER OU
+0224; C; 0225; # LATIN CAPITAL LETTER Z WITH HOOK
+0226; C; 0227; # LATIN CAPITAL LETTER A WITH DOT ABOVE
+0228; C; 0229; # LATIN CAPITAL LETTER E WITH CEDILLA
+022A; C; 022B; # LATIN CAPITAL LETTER O WITH DIAERESIS AND MACRON
+022C; C; 022D; # LATIN CAPITAL LETTER O WITH TILDE AND MACRON
+022E; C; 022F; # LATIN CAPITAL LETTER O WITH DOT ABOVE
+0230; C; 0231; # LATIN CAPITAL LETTER O WITH DOT ABOVE AND MACRON
+0232; C; 0233; # LATIN CAPITAL LETTER Y WITH MACRON
+023A; C; 2C65; # LATIN CAPITAL LETTER A WITH STROKE
+023B; C; 023C; # LATIN CAPITAL LETTER C WITH STROKE
+023D; C; 019A; # LATIN CAPITAL LETTER L WITH BAR
+023E; C; 2C66; # LATIN CAPITAL LETTER T WITH DIAGONAL STROKE
+0241; C; 0242; # LATIN CAPITAL LETTER GLOTTAL STOP
+0243; C; 0180; # LATIN CAPITAL LETTER B WITH STROKE
+0244; C; 0289; # LATIN CAPITAL LETTER U BAR
+0245; C; 028C; # LATIN CAPITAL LETTER TURNED V
+0246; C; 0247; # LATIN CAPITAL LETTER E WITH STROKE
+0248; C; 0249; # LATIN CAPITAL LETTER J WITH STROKE
+024A; C; 024B; # LATIN CAPITAL LETTER SMALL Q WITH HOOK TAIL
+024C; C; 024D; # LATIN CAPITAL LETTER R WITH STROKE
+024E; C; 024F; # LATIN CAPITAL LETTER Y WITH STROKE
+0345; C; 03B9; # COMBINING GREEK YPOGEGRAMMENI
+0370; C; 0371; # GREEK CAPITAL LETTER HETA
+0372; C; 0373; # GREEK CAPITAL LETTER ARCHAIC SAMPI
+0376; C; 0377; # GREEK CAPITAL LETTER PAMPHYLIAN DIGAMMA
+037F; C; 03F3; # GREEK CAPITAL LETTER YOT
+0386; C; 03AC; # GREEK CAPITAL LETTER ALPHA WITH TONOS
+0388; C; 03AD; # GREEK CAPITAL LETTER EPSILON WITH TONOS
+0389; C; 03AE; # GREEK CAPITAL LETTER ETA WITH TONOS
+038A; C; 03AF; # GREEK CAPITAL LETTER IOTA WITH TONOS
+038C; C; 03CC; # GREEK CAPITAL LETTER OMICRON WITH TONOS
+038E; C; 03CD; # GREEK CAPITAL LETTER UPSILON WITH TONOS
+038F; C; 03CE; # GREEK CAPITAL LETTER OMEGA WITH TONOS
+0390; F; 03B9 0308 0301; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
+0391; C; 03B1; # GREEK CAPITAL LETTER ALPHA
+0392; C; 03B2; # GREEK CAPITAL LETTER BETA
+0393; C; 03B3; # GREEK CAPITAL LETTER GAMMA
+0394; C; 03B4; # GREEK CAPITAL LETTER DELTA
+0395; C; 03B5; # GREEK CAPITAL LETTER EPSILON
+0396; C; 03B6; # GREEK CAPITAL LETTER ZETA
+0397; C; 03B7; # GREEK CAPITAL LETTER ETA
+0398; C; 03B8; # GREEK CAPITAL LETTER THETA
+0399; C; 03B9; # GREEK CAPITAL LETTER IOTA
+039A; C; 03BA; # GREEK CAPITAL LETTER KAPPA
+039B; C; 03BB; # GREEK CAPITAL LETTER LAMDA
+039C; C; 03BC; # GREEK CAPITAL LETTER MU
+039D; C; 03BD; # GREEK CAPITAL LETTER NU
+039E; C; 03BE; # GREEK CAPITAL LETTER XI
+039F; C; 03BF; # GREEK CAPITAL LETTER OMICRON
+03A0; C; 03C0; # GREEK CAPITAL LETTER PI
+03A1; C; 03C1; # GREEK CAPITAL LETTER RHO
+03A3; C; 03C3; # GREEK CAPITAL LETTER SIGMA
+03A4; C; 03C4; # GREEK CAPITAL LETTER TAU
+03A5; C; 03C5; # GREEK CAPITAL LETTER UPSILON
+03A6; C; 03C6; # GREEK CAPITAL LETTER PHI
+03A7; C; 03C7; # GREEK CAPITAL LETTER CHI
+03A8; C; 03C8; # GREEK CAPITAL LETTER PSI
+03A9; C; 03C9; # GREEK CAPITAL LETTER OMEGA
+03AA; C; 03CA; # GREEK CAPITAL LETTER IOTA WITH DIALYTIKA
+03AB; C; 03CB; # GREEK CAPITAL LETTER UPSILON WITH DIALYTIKA
+03B0; F; 03C5 0308 0301; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS
+03C2; C; 03C3; # GREEK SMALL LETTER FINAL SIGMA
+03CF; C; 03D7; # GREEK CAPITAL KAI SYMBOL
+03D0; C; 03B2; # GREEK BETA SYMBOL
+03D1; C; 03B8; # GREEK THETA SYMBOL
+03D5; C; 03C6; # GREEK PHI SYMBOL
+03D6; C; 03C0; # GREEK PI SYMBOL
+03D8; C; 03D9; # GREEK LETTER ARCHAIC KOPPA
+03DA; C; 03DB; # GREEK LETTER STIGMA
+03DC; C; 03DD; # GREEK LETTER DIGAMMA
+03DE; C; 03DF; # GREEK LETTER KOPPA
+03E0; C; 03E1; # GREEK LETTER SAMPI
+03E2; C; 03E3; # COPTIC CAPITAL LETTER SHEI
+03E4; C; 03E5; # COPTIC CAPITAL LETTER FEI
+03E6; C; 03E7; # COPTIC CAPITAL LETTER KHEI
+03E8; C; 03E9; # COPTIC CAPITAL LETTER HORI
+03EA; C; 03EB; # COPTIC CAPITAL LETTER GANGIA
+03EC; C; 03ED; # COPTIC CAPITAL LETTER SHIMA
+03EE; C; 03EF; # COPTIC CAPITAL LETTER DEI
+03F0; C; 03BA; # GREEK KAPPA SYMBOL
+03F1; C; 03C1; # GREEK RHO SYMBOL
+03F4; C; 03B8; # GREEK CAPITAL THETA SYMBOL
+03F5; C; 03B5; # GREEK LUNATE EPSILON SYMBOL
+03F7; C; 03F8; # GREEK CAPITAL LETTER SHO
+03F9; C; 03F2; # GREEK CAPITAL LUNATE SIGMA SYMBOL
+03FA; C; 03FB; # GREEK CAPITAL LETTER SAN
+03FD; C; 037B; # GREEK CAPITAL REVERSED LUNATE SIGMA SYMBOL
+03FE; C; 037C; # GREEK CAPITAL DOTTED LUNATE SIGMA SYMBOL
+03FF; C; 037D; # GREEK CAPITAL REVERSED DOTTED LUNATE SIGMA SYMBOL
+0400; C; 0450; # CYRILLIC CAPITAL LETTER IE WITH GRAVE
+0401; C; 0451; # CYRILLIC CAPITAL LETTER IO
+0402; C; 0452; # CYRILLIC CAPITAL LETTER DJE
+0403; C; 0453; # CYRILLIC CAPITAL LETTER GJE
+0404; C; 0454; # CYRILLIC CAPITAL LETTER UKRAINIAN IE
+0405; C; 0455; # CYRILLIC CAPITAL LETTER DZE
+0406; C; 0456; # CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+0407; C; 0457; # CYRILLIC CAPITAL LETTER YI
+0408; C; 0458; # CYRILLIC CAPITAL LETTER JE
+0409; C; 0459; # CYRILLIC CAPITAL LETTER LJE
+040A; C; 045A; # CYRILLIC CAPITAL LETTER NJE
+040B; C; 045B; # CYRILLIC CAPITAL LETTER TSHE
+040C; C; 045C; # CYRILLIC CAPITAL LETTER KJE
+040D; C; 045D; # CYRILLIC CAPITAL LETTER I WITH GRAVE
+040E; C; 045E; # CYRILLIC CAPITAL LETTER SHORT U
+040F; C; 045F; # CYRILLIC CAPITAL LETTER DZHE
+0410; C; 0430; # CYRILLIC CAPITAL LETTER A
+0411; C; 0431; # CYRILLIC CAPITAL LETTER BE
+0412; C; 0432; # CYRILLIC CAPITAL LETTER VE
+0413; C; 0433; # CYRILLIC CAPITAL LETTER GHE
+0414; C; 0434; # CYRILLIC CAPITAL LETTER DE
+0415; C; 0435; # CYRILLIC CAPITAL LETTER IE
+0416; C; 0436; # CYRILLIC CAPITAL LETTER ZHE
+0417; C; 0437; # CYRILLIC CAPITAL LETTER ZE
+0418; C; 0438; # CYRILLIC CAPITAL LETTER I
+0419; C; 0439; # CYRILLIC CAPITAL LETTER SHORT I
+041A; C; 043A; # CYRILLIC CAPITAL LETTER KA
+041B; C; 043B; # CYRILLIC CAPITAL LETTER EL
+041C; C; 043C; # CYRILLIC CAPITAL LETTER EM
+041D; C; 043D; # CYRILLIC CAPITAL LETTER EN
+041E; C; 043E; # CYRILLIC CAPITAL LETTER O
+041F; C; 043F; # CYRILLIC CAPITAL LETTER PE
+0420; C; 0440; # CYRILLIC CAPITAL LETTER ER
+0421; C; 0441; # CYRILLIC CAPITAL LETTER ES
+0422; C; 0442; # CYRILLIC CAPITAL LETTER TE
+0423; C; 0443; # CYRILLIC CAPITAL LETTER U
+0424; C; 0444; # CYRILLIC CAPITAL LETTER EF
+0425; C; 0445; # CYRILLIC CAPITAL LETTER HA
+0426; C; 0446; # CYRILLIC CAPITAL LETTER TSE
+0427; C; 0447; # CYRILLIC CAPITAL LETTER CHE
+0428; C; 0448; # CYRILLIC CAPITAL LETTER SHA
+0429; C; 0449; # CYRILLIC CAPITAL LETTER SHCHA
+042A; C; 044A; # CYRILLIC CAPITAL LETTER HARD SIGN
+042B; C; 044B; # CYRILLIC CAPITAL LETTER YERU
+042C; C; 044C; # CYRILLIC CAPITAL LETTER SOFT SIGN
+042D; C; 044D; # CYRILLIC CAPITAL LETTER E
+042E; C; 044E; # CYRILLIC CAPITAL LETTER YU
+042F; C; 044F; # CYRILLIC CAPITAL LETTER YA
+0460; C; 0461; # CYRILLIC CAPITAL LETTER OMEGA
+0462; C; 0463; # CYRILLIC CAPITAL LETTER YAT
+0464; C; 0465; # CYRILLIC CAPITAL LETTER IOTIFIED E
+0466; C; 0467; # CYRILLIC CAPITAL LETTER LITTLE YUS
+0468; C; 0469; # CYRILLIC CAPITAL LETTER IOTIFIED LITTLE YUS
+046A; C; 046B; # CYRILLIC CAPITAL LETTER BIG YUS
+046C; C; 046D; # CYRILLIC CAPITAL LETTER IOTIFIED BIG YUS
+046E; C; 046F; # CYRILLIC CAPITAL LETTER KSI
+0470; C; 0471; # CYRILLIC CAPITAL LETTER PSI
+0472; C; 0473; # CYRILLIC CAPITAL LETTER FITA
+0474; C; 0475; # CYRILLIC CAPITAL LETTER IZHITSA
+0476; C; 0477; # CYRILLIC CAPITAL LETTER IZHITSA WITH DOUBLE GRAVE ACCENT
+0478; C; 0479; # CYRILLIC CAPITAL LETTER UK
+047A; C; 047B; # CYRILLIC CAPITAL LETTER ROUND OMEGA
+047C; C; 047D; # CYRILLIC CAPITAL LETTER OMEGA WITH TITLO
+047E; C; 047F; # CYRILLIC CAPITAL LETTER OT
+0480; C; 0481; # CYRILLIC CAPITAL LETTER KOPPA
+048A; C; 048B; # CYRILLIC CAPITAL LETTER SHORT I WITH TAIL
+048C; C; 048D; # CYRILLIC CAPITAL LETTER SEMISOFT SIGN
+048E; C; 048F; # CYRILLIC CAPITAL LETTER ER WITH TICK
+0490; C; 0491; # CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+0492; C; 0493; # CYRILLIC CAPITAL LETTER GHE WITH STROKE
+0494; C; 0495; # CYRILLIC CAPITAL LETTER GHE WITH MIDDLE HOOK
+0496; C; 0497; # CYRILLIC CAPITAL LETTER ZHE WITH DESCENDER
+0498; C; 0499; # CYRILLIC CAPITAL LETTER ZE WITH DESCENDER
+049A; C; 049B; # CYRILLIC CAPITAL LETTER KA WITH DESCENDER
+049C; C; 049D; # CYRILLIC CAPITAL LETTER KA WITH VERTICAL STROKE
+049E; C; 049F; # CYRILLIC CAPITAL LETTER KA WITH STROKE
+04A0; C; 04A1; # CYRILLIC CAPITAL LETTER BASHKIR KA
+04A2; C; 04A3; # CYRILLIC CAPITAL LETTER EN WITH DESCENDER
+04A4; C; 04A5; # CYRILLIC CAPITAL LIGATURE EN GHE
+04A6; C; 04A7; # CYRILLIC CAPITAL LETTER PE WITH MIDDLE HOOK
+04A8; C; 04A9; # CYRILLIC CAPITAL LETTER ABKHASIAN HA
+04AA; C; 04AB; # CYRILLIC CAPITAL LETTER ES WITH DESCENDER
+04AC; C; 04AD; # CYRILLIC CAPITAL LETTER TE WITH DESCENDER
+04AE; C; 04AF; # CYRILLIC CAPITAL LETTER STRAIGHT U
+04B0; C; 04B1; # CYRILLIC CAPITAL LETTER STRAIGHT U WITH STROKE
+04B2; C; 04B3; # CYRILLIC CAPITAL LETTER HA WITH DESCENDER
+04B4; C; 04B5; # CYRILLIC CAPITAL LIGATURE TE TSE
+04B6; C; 04B7; # CYRILLIC CAPITAL LETTER CHE WITH DESCENDER
+04B8; C; 04B9; # CYRILLIC CAPITAL LETTER CHE WITH VERTICAL STROKE
+04BA; C; 04BB; # CYRILLIC CAPITAL LETTER SHHA
+04BC; C; 04BD; # CYRILLIC CAPITAL LETTER ABKHASIAN CHE
+04BE; C; 04BF; # CYRILLIC CAPITAL LETTER ABKHASIAN CHE WITH DESCENDER
+04C0; C; 04CF; # CYRILLIC LETTER PALOCHKA
+04C1; C; 04C2; # CYRILLIC CAPITAL LETTER ZHE WITH BREVE
+04C3; C; 04C4; # CYRILLIC CAPITAL LETTER KA WITH HOOK
+04C5; C; 04C6; # CYRILLIC CAPITAL LETTER EL WITH TAIL
+04C7; C; 04C8; # CYRILLIC CAPITAL LETTER EN WITH HOOK
+04C9; C; 04CA; # CYRILLIC CAPITAL LETTER EN WITH TAIL
+04CB; C; 04CC; # CYRILLIC CAPITAL LETTER KHAKASSIAN CHE
+04CD; C; 04CE; # CYRILLIC CAPITAL LETTER EM WITH TAIL
+04D0; C; 04D1; # CYRILLIC CAPITAL LETTER A WITH BREVE
+04D2; C; 04D3; # CYRILLIC CAPITAL LETTER A WITH DIAERESIS
+04D4; C; 04D5; # CYRILLIC CAPITAL LIGATURE A IE
+04D6; C; 04D7; # CYRILLIC CAPITAL LETTER IE WITH BREVE
+04D8; C; 04D9; # CYRILLIC CAPITAL LETTER SCHWA
+04DA; C; 04DB; # CYRILLIC CAPITAL LETTER SCHWA WITH DIAERESIS
+04DC; C; 04DD; # CYRILLIC CAPITAL LETTER ZHE WITH DIAERESIS
+04DE; C; 04DF; # CYRILLIC CAPITAL LETTER ZE WITH DIAERESIS
+04E0; C; 04E1; # CYRILLIC CAPITAL LETTER ABKHASIAN DZE
+04E2; C; 04E3; # CYRILLIC CAPITAL LETTER I WITH MACRON
+04E4; C; 04E5; # CYRILLIC CAPITAL LETTER I WITH DIAERESIS
+04E6; C; 04E7; # CYRILLIC CAPITAL LETTER O WITH DIAERESIS
+04E8; C; 04E9; # CYRILLIC CAPITAL LETTER BARRED O
+04EA; C; 04EB; # CYRILLIC CAPITAL LETTER BARRED O WITH DIAERESIS
+04EC; C; 04ED; # CYRILLIC CAPITAL LETTER E WITH DIAERESIS
+04EE; C; 04EF; # CYRILLIC CAPITAL LETTER U WITH MACRON
+04F0; C; 04F1; # CYRILLIC CAPITAL LETTER U WITH DIAERESIS
+04F2; C; 04F3; # CYRILLIC CAPITAL LETTER U WITH DOUBLE ACUTE
+04F4; C; 04F5; # CYRILLIC CAPITAL LETTER CHE WITH DIAERESIS
+04F6; C; 04F7; # CYRILLIC CAPITAL LETTER GHE WITH DESCENDER
+04F8; C; 04F9; # CYRILLIC CAPITAL LETTER YERU WITH DIAERESIS
+04FA; C; 04FB; # CYRILLIC CAPITAL LETTER GHE WITH STROKE AND HOOK
+04FC; C; 04FD; # CYRILLIC CAPITAL LETTER HA WITH HOOK
+04FE; C; 04FF; # CYRILLIC CAPITAL LETTER HA WITH STROKE
+0500; C; 0501; # CYRILLIC CAPITAL LETTER KOMI DE
+0502; C; 0503; # CYRILLIC CAPITAL LETTER KOMI DJE
+0504; C; 0505; # CYRILLIC CAPITAL LETTER KOMI ZJE
+0506; C; 0507; # CYRILLIC CAPITAL LETTER KOMI DZJE
+0508; C; 0509; # CYRILLIC CAPITAL LETTER KOMI LJE
+050A; C; 050B; # CYRILLIC CAPITAL LETTER KOMI NJE
+050C; C; 050D; # CYRILLIC CAPITAL LETTER KOMI SJE
+050E; C; 050F; # CYRILLIC CAPITAL LETTER KOMI TJE
+0510; C; 0511; # CYRILLIC CAPITAL LETTER REVERSED ZE
+0512; C; 0513; # CYRILLIC CAPITAL LETTER EL WITH HOOK
+0514; C; 0515; # CYRILLIC CAPITAL LETTER LHA
+0516; C; 0517; # CYRILLIC CAPITAL LETTER RHA
+0518; C; 0519; # CYRILLIC CAPITAL LETTER YAE
+051A; C; 051B; # CYRILLIC CAPITAL LETTER QA
+051C; C; 051D; # CYRILLIC CAPITAL LETTER WE
+051E; C; 051F; # CYRILLIC CAPITAL LETTER ALEUT KA
+0520; C; 0521; # CYRILLIC CAPITAL LETTER EL WITH MIDDLE HOOK
+0522; C; 0523; # CYRILLIC CAPITAL LETTER EN WITH MIDDLE HOOK
+0524; C; 0525; # CYRILLIC CAPITAL LETTER PE WITH DESCENDER
+0526; C; 0527; # CYRILLIC CAPITAL LETTER SHHA WITH DESCENDER
+0528; C; 0529; # CYRILLIC CAPITAL LETTER EN WITH LEFT HOOK
+052A; C; 052B; # CYRILLIC CAPITAL LETTER DZZHE
+052C; C; 052D; # CYRILLIC CAPITAL LETTER DCHE
+052E; C; 052F; # CYRILLIC CAPITAL LETTER EL WITH DESCENDER
+0531; C; 0561; # ARMENIAN CAPITAL LETTER AYB
+0532; C; 0562; # ARMENIAN CAPITAL LETTER BEN
+0533; C; 0563; # ARMENIAN CAPITAL LETTER GIM
+0534; C; 0564; # ARMENIAN CAPITAL LETTER DA
+0535; C; 0565; # ARMENIAN CAPITAL LETTER ECH
+0536; C; 0566; # ARMENIAN CAPITAL LETTER ZA
+0537; C; 0567; # ARMENIAN CAPITAL LETTER EH
+0538; C; 0568; # ARMENIAN CAPITAL LETTER ET
+0539; C; 0569; # ARMENIAN CAPITAL LETTER TO
+053A; C; 056A; # ARMENIAN CAPITAL LETTER ZHE
+053B; C; 056B; # ARMENIAN CAPITAL LETTER INI
+053C; C; 056C; # ARMENIAN CAPITAL LETTER LIWN
+053D; C; 056D; # ARMENIAN CAPITAL LETTER XEH
+053E; C; 056E; # ARMENIAN CAPITAL LETTER CA
+053F; C; 056F; # ARMENIAN CAPITAL LETTER KEN
+0540; C; 0570; # ARMENIAN CAPITAL LETTER HO
+0541; C; 0571; # ARMENIAN CAPITAL LETTER JA
+0542; C; 0572; # ARMENIAN CAPITAL LETTER GHAD
+0543; C; 0573; # ARMENIAN CAPITAL LETTER CHEH
+0544; C; 0574; # ARMENIAN CAPITAL LETTER MEN
+0545; C; 0575; # ARMENIAN CAPITAL LETTER YI
+0546; C; 0576; # ARMENIAN CAPITAL LETTER NOW
+0547; C; 0577; # ARMENIAN CAPITAL LETTER SHA
+0548; C; 0578; # ARMENIAN CAPITAL LETTER VO
+0549; C; 0579; # ARMENIAN CAPITAL LETTER CHA
+054A; C; 057A; # ARMENIAN CAPITAL LETTER PEH
+054B; C; 057B; # ARMENIAN CAPITAL LETTER JHEH
+054C; C; 057C; # ARMENIAN CAPITAL LETTER RA
+054D; C; 057D; # ARMENIAN CAPITAL LETTER SEH
+054E; C; 057E; # ARMENIAN CAPITAL LETTER VEW
+054F; C; 057F; # ARMENIAN CAPITAL LETTER TIWN
+0550; C; 0580; # ARMENIAN CAPITAL LETTER REH
+0551; C; 0581; # ARMENIAN CAPITAL LETTER CO
+0552; C; 0582; # ARMENIAN CAPITAL LETTER YIWN
+0553; C; 0583; # ARMENIAN CAPITAL LETTER PIWR
+0554; C; 0584; # ARMENIAN CAPITAL LETTER KEH
+0555; C; 0585; # ARMENIAN CAPITAL LETTER OH
+0556; C; 0586; # ARMENIAN CAPITAL LETTER FEH
+0587; F; 0565 0582; # ARMENIAN SMALL LIGATURE ECH YIWN
+10A0; C; 2D00; # GEORGIAN CAPITAL LETTER AN
+10A1; C; 2D01; # GEORGIAN CAPITAL LETTER BAN
+10A2; C; 2D02; # GEORGIAN CAPITAL LETTER GAN
+10A3; C; 2D03; # GEORGIAN CAPITAL LETTER DON
+10A4; C; 2D04; # GEORGIAN CAPITAL LETTER EN
+10A5; C; 2D05; # GEORGIAN CAPITAL LETTER VIN
+10A6; C; 2D06; # GEORGIAN CAPITAL LETTER ZEN
+10A7; C; 2D07; # GEORGIAN CAPITAL LETTER TAN
+10A8; C; 2D08; # GEORGIAN CAPITAL LETTER IN
+10A9; C; 2D09; # GEORGIAN CAPITAL LETTER KAN
+10AA; C; 2D0A; # GEORGIAN CAPITAL LETTER LAS
+10AB; C; 2D0B; # GEORGIAN CAPITAL LETTER MAN
+10AC; C; 2D0C; # GEORGIAN CAPITAL LETTER NAR
+10AD; C; 2D0D; # GEORGIAN CAPITAL LETTER ON
+10AE; C; 2D0E; # GEORGIAN CAPITAL LETTER PAR
+10AF; C; 2D0F; # GEORGIAN CAPITAL LETTER ZHAR
+10B0; C; 2D10; # GEORGIAN CAPITAL LETTER RAE
+10B1; C; 2D11; # GEORGIAN CAPITAL LETTER SAN
+10B2; C; 2D12; # GEORGIAN CAPITAL LETTER TAR
+10B3; C; 2D13; # GEORGIAN CAPITAL LETTER UN
+10B4; C; 2D14; # GEORGIAN CAPITAL LETTER PHAR
+10B5; C; 2D15; # GEORGIAN CAPITAL LETTER KHAR
+10B6; C; 2D16; # GEORGIAN CAPITAL LETTER GHAN
+10B7; C; 2D17; # GEORGIAN CAPITAL LETTER QAR
+10B8; C; 2D18; # GEORGIAN CAPITAL LETTER SHIN
+10B9; C; 2D19; # GEORGIAN CAPITAL LETTER CHIN
+10BA; C; 2D1A; # GEORGIAN CAPITAL LETTER CAN
+10BB; C; 2D1B; # GEORGIAN CAPITAL LETTER JIL
+10BC; C; 2D1C; # GEORGIAN CAPITAL LETTER CIL
+10BD; C; 2D1D; # GEORGIAN CAPITAL LETTER CHAR
+10BE; C; 2D1E; # GEORGIAN CAPITAL LETTER XAN
+10BF; C; 2D1F; # GEORGIAN CAPITAL LETTER JHAN
+10C0; C; 2D20; # GEORGIAN CAPITAL LETTER HAE
+10C1; C; 2D21; # GEORGIAN CAPITAL LETTER HE
+10C2; C; 2D22; # GEORGIAN CAPITAL LETTER HIE
+10C3; C; 2D23; # GEORGIAN CAPITAL LETTER WE
+10C4; C; 2D24; # GEORGIAN CAPITAL LETTER HAR
+10C5; C; 2D25; # GEORGIAN CAPITAL LETTER HOE
+10C7; C; 2D27; # GEORGIAN CAPITAL LETTER YN
+10CD; C; 2D2D; # GEORGIAN CAPITAL LETTER AEN
+13F8; C; 13F0; # CHEROKEE SMALL LETTER YE
+13F9; C; 13F1; # CHEROKEE SMALL LETTER YI
+13FA; C; 13F2; # CHEROKEE SMALL LETTER YO
+13FB; C; 13F3; # CHEROKEE SMALL LETTER YU
+13FC; C; 13F4; # CHEROKEE SMALL LETTER YV
+13FD; C; 13F5; # CHEROKEE SMALL LETTER MV
+1E00; C; 1E01; # LATIN CAPITAL LETTER A WITH RING BELOW
+1E02; C; 1E03; # LATIN CAPITAL LETTER B WITH DOT ABOVE
+1E04; C; 1E05; # LATIN CAPITAL LETTER B WITH DOT BELOW
+1E06; C; 1E07; # LATIN CAPITAL LETTER B WITH LINE BELOW
+1E08; C; 1E09; # LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE
+1E0A; C; 1E0B; # LATIN CAPITAL LETTER D WITH DOT ABOVE
+1E0C; C; 1E0D; # LATIN CAPITAL LETTER D WITH DOT BELOW
+1E0E; C; 1E0F; # LATIN CAPITAL LETTER D WITH LINE BELOW
+1E10; C; 1E11; # LATIN CAPITAL LETTER D WITH CEDILLA
+1E12; C; 1E13; # LATIN CAPITAL LETTER D WITH CIRCUMFLEX BELOW
+1E14; C; 1E15; # LATIN CAPITAL LETTER E WITH MACRON AND GRAVE
+1E16; C; 1E17; # LATIN CAPITAL LETTER E WITH MACRON AND ACUTE
+1E18; C; 1E19; # LATIN CAPITAL LETTER E WITH CIRCUMFLEX BELOW
+1E1A; C; 1E1B; # LATIN CAPITAL LETTER E WITH TILDE BELOW
+1E1C; C; 1E1D; # LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE
+1E1E; C; 1E1F; # LATIN CAPITAL LETTER F WITH DOT ABOVE
+1E20; C; 1E21; # LATIN CAPITAL LETTER G WITH MACRON
+1E22; C; 1E23; # LATIN CAPITAL LETTER H WITH DOT ABOVE
+1E24; C; 1E25; # LATIN CAPITAL LETTER H WITH DOT BELOW
+1E26; C; 1E27; # LATIN CAPITAL LETTER H WITH DIAERESIS
+1E28; C; 1E29; # LATIN CAPITAL LETTER H WITH CEDILLA
+1E2A; C; 1E2B; # LATIN CAPITAL LETTER H WITH BREVE BELOW
+1E2C; C; 1E2D; # LATIN CAPITAL LETTER I WITH TILDE BELOW
+1E2E; C; 1E2F; # LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE
+1E30; C; 1E31; # LATIN CAPITAL LETTER K WITH ACUTE
+1E32; C; 1E33; # LATIN CAPITAL LETTER K WITH DOT BELOW
+1E34; C; 1E35; # LATIN CAPITAL LETTER K WITH LINE BELOW
+1E36; C; 1E37; # LATIN CAPITAL LETTER L WITH DOT BELOW
+1E38; C; 1E39; # LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRON
+1E3A; C; 1E3B; # LATIN CAPITAL LETTER L WITH LINE BELOW
+1E3C; C; 1E3D; # LATIN CAPITAL LETTER L WITH CIRCUMFLEX BELOW
+1E3E; C; 1E3F; # LATIN CAPITAL LETTER M WITH ACUTE
+1E40; C; 1E41; # LATIN CAPITAL LETTER M WITH DOT ABOVE
+1E42; C; 1E43; # LATIN CAPITAL LETTER M WITH DOT BELOW
+1E44; C; 1E45; # LATIN CAPITAL LETTER N WITH DOT ABOVE
+1E46; C; 1E47; # LATIN CAPITAL LETTER N WITH DOT BELOW
+1E48; C; 1E49; # LATIN CAPITAL LETTER N WITH LINE BELOW
+1E4A; C; 1E4B; # LATIN CAPITAL LETTER N WITH CIRCUMFLEX BELOW
+1E4C; C; 1E4D; # LATIN CAPITAL LETTER O WITH TILDE AND ACUTE
+1E4E; C; 1E4F; # LATIN CAPITAL LETTER O WITH TILDE AND DIAERESIS
+1E50; C; 1E51; # LATIN CAPITAL LETTER O WITH MACRON AND GRAVE
+1E52; C; 1E53; # LATIN CAPITAL LETTER O WITH MACRON AND ACUTE
+1E54; C; 1E55; # LATIN CAPITAL LETTER P WITH ACUTE
+1E56; C; 1E57; # LATIN CAPITAL LETTER P WITH DOT ABOVE
+1E58; C; 1E59; # LATIN CAPITAL LETTER R WITH DOT ABOVE
+1E5A; C; 1E5B; # LATIN CAPITAL LETTER R WITH DOT BELOW
+1E5C; C; 1E5D; # LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON
+1E5E; C; 1E5F; # LATIN CAPITAL LETTER R WITH LINE BELOW
+1E60; C; 1E61; # LATIN CAPITAL LETTER S WITH DOT ABOVE
+1E62; C; 1E63; # LATIN CAPITAL LETTER S WITH DOT BELOW
+1E64; C; 1E65; # LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE
+1E66; C; 1E67; # LATIN CAPITAL LETTER S WITH CARON AND DOT ABOVE
+1E68; C; 1E69; # LATIN CAPITAL LETTER S WITH DOT BELOW AND DOT ABOVE
+1E6A; C; 1E6B; # LATIN CAPITAL LETTER T WITH DOT ABOVE
+1E6C; C; 1E6D; # LATIN CAPITAL LETTER T WITH DOT BELOW
+1E6E; C; 1E6F; # LATIN CAPITAL LETTER T WITH LINE BELOW
+1E70; C; 1E71; # LATIN CAPITAL LETTER T WITH CIRCUMFLEX BELOW
+1E72; C; 1E73; # LATIN CAPITAL LETTER U WITH DIAERESIS BELOW
+1E74; C; 1E75; # LATIN CAPITAL LETTER U WITH TILDE BELOW
+1E76; C; 1E77; # LATIN CAPITAL LETTER U WITH CIRCUMFLEX BELOW
+1E78; C; 1E79; # LATIN CAPITAL LETTER U WITH TILDE AND ACUTE
+1E7A; C; 1E7B; # LATIN CAPITAL LETTER U WITH MACRON AND DIAERESIS
+1E7C; C; 1E7D; # LATIN CAPITAL LETTER V WITH TILDE
+1E7E; C; 1E7F; # LATIN CAPITAL LETTER V WITH DOT BELOW
+1E80; C; 1E81; # LATIN CAPITAL LETTER W WITH GRAVE
+1E82; C; 1E83; # LATIN CAPITAL LETTER W WITH ACUTE
+1E84; C; 1E85; # LATIN CAPITAL LETTER W WITH DIAERESIS
+1E86; C; 1E87; # LATIN CAPITAL LETTER W WITH DOT ABOVE
+1E88; C; 1E89; # LATIN CAPITAL LETTER W WITH DOT BELOW
+1E8A; C; 1E8B; # LATIN CAPITAL LETTER X WITH DOT ABOVE
+1E8C; C; 1E8D; # LATIN CAPITAL LETTER X WITH DIAERESIS
+1E8E; C; 1E8F; # LATIN CAPITAL LETTER Y WITH DOT ABOVE
+1E90; C; 1E91; # LATIN CAPITAL LETTER Z WITH CIRCUMFLEX
+1E92; C; 1E93; # LATIN CAPITAL LETTER Z WITH DOT BELOW
+1E94; C; 1E95; # LATIN CAPITAL LETTER Z WITH LINE BELOW
+1E96; F; 0068 0331; # LATIN SMALL LETTER H WITH LINE BELOW
+1E97; F; 0074 0308; # LATIN SMALL LETTER T WITH DIAERESIS
+1E98; F; 0077 030A; # LATIN SMALL LETTER W WITH RING ABOVE
+1E99; F; 0079 030A; # LATIN SMALL LETTER Y WITH RING ABOVE
+1E9A; F; 0061 02BE; # LATIN SMALL LETTER A WITH RIGHT HALF RING
+1E9B; C; 1E61; # LATIN SMALL LETTER LONG S WITH DOT ABOVE
+1E9E; F; 0073 0073; # LATIN CAPITAL LETTER SHARP S
+1E9E; S; 00DF; # LATIN CAPITAL LETTER SHARP S
+1EA0; C; 1EA1; # LATIN CAPITAL LETTER A WITH DOT BELOW
+1EA2; C; 1EA3; # LATIN CAPITAL LETTER A WITH HOOK ABOVE
+1EA4; C; 1EA5; # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND ACUTE
+1EA6; C; 1EA7; # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE
+1EA8; C; 1EA9; # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
+1EAA; C; 1EAB; # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND TILDE
+1EAC; C; 1EAD; # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW
+1EAE; C; 1EAF; # LATIN CAPITAL LETTER A WITH BREVE AND ACUTE
+1EB0; C; 1EB1; # LATIN CAPITAL LETTER A WITH BREVE AND GRAVE
+1EB2; C; 1EB3; # LATIN CAPITAL LETTER A WITH BREVE AND HOOK ABOVE
+1EB4; C; 1EB5; # LATIN CAPITAL LETTER A WITH BREVE AND TILDE
+1EB6; C; 1EB7; # LATIN CAPITAL LETTER A WITH BREVE AND DOT BELOW
+1EB8; C; 1EB9; # LATIN CAPITAL LETTER E WITH DOT BELOW
+1EBA; C; 1EBB; # LATIN CAPITAL LETTER E WITH HOOK ABOVE
+1EBC; C; 1EBD; # LATIN CAPITAL LETTER E WITH TILDE
+1EBE; C; 1EBF; # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE
+1EC0; C; 1EC1; # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND GRAVE
+1EC2; C; 1EC3; # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
+1EC4; C; 1EC5; # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND TILDE
+1EC6; C; 1EC7; # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND DOT BELOW
+1EC8; C; 1EC9; # LATIN CAPITAL LETTER I WITH HOOK ABOVE
+1ECA; C; 1ECB; # LATIN CAPITAL LETTER I WITH DOT BELOW
+1ECC; C; 1ECD; # LATIN CAPITAL LETTER O WITH DOT BELOW
+1ECE; C; 1ECF; # LATIN CAPITAL LETTER O WITH HOOK ABOVE
+1ED0; C; 1ED1; # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND ACUTE
+1ED2; C; 1ED3; # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND GRAVE
+1ED4; C; 1ED5; # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
+1ED6; C; 1ED7; # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND TILDE
+1ED8; C; 1ED9; # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND DOT BELOW
+1EDA; C; 1EDB; # LATIN CAPITAL LETTER O WITH HORN AND ACUTE
+1EDC; C; 1EDD; # LATIN CAPITAL LETTER O WITH HORN AND GRAVE
+1EDE; C; 1EDF; # LATIN CAPITAL LETTER O WITH HORN AND HOOK ABOVE
+1EE0; C; 1EE1; # LATIN CAPITAL LETTER O WITH HORN AND TILDE
+1EE2; C; 1EE3; # LATIN CAPITAL LETTER O WITH HORN AND DOT BELOW
+1EE4; C; 1EE5; # LATIN CAPITAL LETTER U WITH DOT BELOW
+1EE6; C; 1EE7; # LATIN CAPITAL LETTER U WITH HOOK ABOVE
+1EE8; C; 1EE9; # LATIN CAPITAL LETTER U WITH HORN AND ACUTE
+1EEA; C; 1EEB; # LATIN CAPITAL LETTER U WITH HORN AND GRAVE
+1EEC; C; 1EED; # LATIN CAPITAL LETTER U WITH HORN AND HOOK ABOVE
+1EEE; C; 1EEF; # LATIN CAPITAL LETTER U WITH HORN AND TILDE
+1EF0; C; 1EF1; # LATIN CAPITAL LETTER U WITH HORN AND DOT BELOW
+1EF2; C; 1EF3; # LATIN CAPITAL LETTER Y WITH GRAVE
+1EF4; C; 1EF5; # LATIN CAPITAL LETTER Y WITH DOT BELOW
+1EF6; C; 1EF7; # LATIN CAPITAL LETTER Y WITH HOOK ABOVE
+1EF8; C; 1EF9; # LATIN CAPITAL LETTER Y WITH TILDE
+1EFA; C; 1EFB; # LATIN CAPITAL LETTER MIDDLE-WELSH LL
+1EFC; C; 1EFD; # LATIN CAPITAL LETTER MIDDLE-WELSH V
+1EFE; C; 1EFF; # LATIN CAPITAL LETTER Y WITH LOOP
+1F08; C; 1F00; # GREEK CAPITAL LETTER ALPHA WITH PSILI
+1F09; C; 1F01; # GREEK CAPITAL LETTER ALPHA WITH DASIA
+1F0A; C; 1F02; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND VARIA
+1F0B; C; 1F03; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND VARIA
+1F0C; C; 1F04; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND OXIA
+1F0D; C; 1F05; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND OXIA
+1F0E; C; 1F06; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND PERISPOMENI
+1F0F; C; 1F07; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI
+1F18; C; 1F10; # GREEK CAPITAL LETTER EPSILON WITH PSILI
+1F19; C; 1F11; # GREEK CAPITAL LETTER EPSILON WITH DASIA
+1F1A; C; 1F12; # GREEK CAPITAL LETTER EPSILON WITH PSILI AND VARIA
+1F1B; C; 1F13; # GREEK CAPITAL LETTER EPSILON WITH DASIA AND VARIA
+1F1C; C; 1F14; # GREEK CAPITAL LETTER EPSILON WITH PSILI AND OXIA
+1F1D; C; 1F15; # GREEK CAPITAL LETTER EPSILON WITH DASIA AND OXIA
+1F28; C; 1F20; # GREEK CAPITAL LETTER ETA WITH PSILI
+1F29; C; 1F21; # GREEK CAPITAL LETTER ETA WITH DASIA
+1F2A; C; 1F22; # GREEK CAPITAL LETTER ETA WITH PSILI AND VARIA
+1F2B; C; 1F23; # GREEK CAPITAL LETTER ETA WITH DASIA AND VARIA
+1F2C; C; 1F24; # GREEK CAPITAL LETTER ETA WITH PSILI AND OXIA
+1F2D; C; 1F25; # GREEK CAPITAL LETTER ETA WITH DASIA AND OXIA
+1F2E; C; 1F26; # GREEK CAPITAL LETTER ETA WITH PSILI AND PERISPOMENI
+1F2F; C; 1F27; # GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI
+1F38; C; 1F30; # GREEK CAPITAL LETTER IOTA WITH PSILI
+1F39; C; 1F31; # GREEK CAPITAL LETTER IOTA WITH DASIA
+1F3A; C; 1F32; # GREEK CAPITAL LETTER IOTA WITH PSILI AND VARIA
+1F3B; C; 1F33; # GREEK CAPITAL LETTER IOTA WITH DASIA AND VARIA
+1F3C; C; 1F34; # GREEK CAPITAL LETTER IOTA WITH PSILI AND OXIA
+1F3D; C; 1F35; # GREEK CAPITAL LETTER IOTA WITH DASIA AND OXIA
+1F3E; C; 1F36; # GREEK CAPITAL LETTER IOTA WITH PSILI AND PERISPOMENI
+1F3F; C; 1F37; # GREEK CAPITAL LETTER IOTA WITH DASIA AND PERISPOMENI
+1F48; C; 1F40; # GREEK CAPITAL LETTER OMICRON WITH PSILI
+1F49; C; 1F41; # GREEK CAPITAL LETTER OMICRON WITH DASIA
+1F4A; C; 1F42; # GREEK CAPITAL LETTER OMICRON WITH PSILI AND VARIA
+1F4B; C; 1F43; # GREEK CAPITAL LETTER OMICRON WITH DASIA AND VARIA
+1F4C; C; 1F44; # GREEK CAPITAL LETTER OMICRON WITH PSILI AND OXIA
+1F4D; C; 1F45; # GREEK CAPITAL LETTER OMICRON WITH DASIA AND OXIA
+1F50; F; 03C5 0313; # GREEK SMALL LETTER UPSILON WITH PSILI
+1F52; F; 03C5 0313 0300; # GREEK SMALL LETTER UPSILON WITH PSILI AND VARIA
+1F54; F; 03C5 0313 0301; # GREEK SMALL LETTER UPSILON WITH PSILI AND OXIA
+1F56; F; 03C5 0313 0342; # GREEK SMALL LETTER UPSILON WITH PSILI AND PERISPOMENI
+1F59; C; 1F51; # GREEK CAPITAL LETTER UPSILON WITH DASIA
+1F5B; C; 1F53; # GREEK CAPITAL LETTER UPSILON WITH DASIA AND VARIA
+1F5D; C; 1F55; # GREEK CAPITAL LETTER UPSILON WITH DASIA AND OXIA
+1F5F; C; 1F57; # GREEK CAPITAL LETTER UPSILON WITH DASIA AND PERISPOMENI
+1F68; C; 1F60; # GREEK CAPITAL LETTER OMEGA WITH PSILI
+1F69; C; 1F61; # GREEK CAPITAL LETTER OMEGA WITH DASIA
+1F6A; C; 1F62; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND VARIA
+1F6B; C; 1F63; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND VARIA
+1F6C; C; 1F64; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND OXIA
+1F6D; C; 1F65; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND OXIA
+1F6E; C; 1F66; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI
+1F6F; C; 1F67; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI
+1F80; F; 1F00 03B9; # GREEK SMALL LETTER ALPHA WITH PSILI AND YPOGEGRAMMENI
+1F81; F; 1F01 03B9; # GREEK SMALL LETTER ALPHA WITH DASIA AND YPOGEGRAMMENI
+1F82; F; 1F02 03B9; # GREEK SMALL LETTER ALPHA WITH PSILI AND VARIA AND YPOGEGRAMMENI
+1F83; F; 1F03 03B9; # GREEK SMALL LETTER ALPHA WITH DASIA AND VARIA AND YPOGEGRAMMENI
+1F84; F; 1F04 03B9; # GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA AND YPOGEGRAMMENI
+1F85; F; 1F05 03B9; # GREEK SMALL LETTER ALPHA WITH DASIA AND OXIA AND YPOGEGRAMMENI
+1F86; F; 1F06 03B9; # GREEK SMALL LETTER ALPHA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI
+1F87; F; 1F07 03B9; # GREEK SMALL LETTER ALPHA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
+1F88; F; 1F00 03B9; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI
+1F88; S; 1F80; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND PROSGEGRAMMENI
+1F89; F; 1F01 03B9; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND PROSGEGRAMMENI
+1F89; S; 1F81; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND PROSGEGRAMMENI
+1F8A; F; 1F02 03B9; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND VARIA AND PROSGEGRAMMENI
+1F8A; S; 1F82; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND VARIA AND PROSGEGRAMMENI
+1F8B; F; 1F03 03B9; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND VARIA AND PROSGEGRAMMENI
+1F8B; S; 1F83; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND VARIA AND PROSGEGRAMMENI
+1F8C; F; 1F04 03B9; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND OXIA AND PROSGEGRAMMENI
+1F8C; S; 1F84; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND OXIA AND PROSGEGRAMMENI
+1F8D; F; 1F05 03B9; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND OXIA AND PROSGEGRAMMENI
+1F8D; S; 1F85; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND OXIA AND PROSGEGRAMMENI
+1F8E; F; 1F06 03B9; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
+1F8E; S; 1F86; # GREEK CAPITAL LETTER ALPHA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
+1F8F; F; 1F07 03B9; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1F8F; S; 1F87; # GREEK CAPITAL LETTER ALPHA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1F90; F; 1F20 03B9; # GREEK SMALL LETTER ETA WITH PSILI AND YPOGEGRAMMENI
+1F91; F; 1F21 03B9; # GREEK SMALL LETTER ETA WITH DASIA AND YPOGEGRAMMENI
+1F92; F; 1F22 03B9; # GREEK SMALL LETTER ETA WITH PSILI AND VARIA AND YPOGEGRAMMENI
+1F93; F; 1F23 03B9; # GREEK SMALL LETTER ETA WITH DASIA AND VARIA AND YPOGEGRAMMENI
+1F94; F; 1F24 03B9; # GREEK SMALL LETTER ETA WITH PSILI AND OXIA AND YPOGEGRAMMENI
+1F95; F; 1F25 03B9; # GREEK SMALL LETTER ETA WITH DASIA AND OXIA AND YPOGEGRAMMENI
+1F96; F; 1F26 03B9; # GREEK SMALL LETTER ETA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI
+1F97; F; 1F27 03B9; # GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
+1F98; F; 1F20 03B9; # GREEK CAPITAL LETTER ETA WITH PSILI AND PROSGEGRAMMENI
+1F98; S; 1F90; # GREEK CAPITAL LETTER ETA WITH PSILI AND PROSGEGRAMMENI
+1F99; F; 1F21 03B9; # GREEK CAPITAL LETTER ETA WITH DASIA AND PROSGEGRAMMENI
+1F99; S; 1F91; # GREEK CAPITAL LETTER ETA WITH DASIA AND PROSGEGRAMMENI
+1F9A; F; 1F22 03B9; # GREEK CAPITAL LETTER ETA WITH PSILI AND VARIA AND PROSGEGRAMMENI
+1F9A; S; 1F92; # GREEK CAPITAL LETTER ETA WITH PSILI AND VARIA AND PROSGEGRAMMENI
+1F9B; F; 1F23 03B9; # GREEK CAPITAL LETTER ETA WITH DASIA AND VARIA AND PROSGEGRAMMENI
+1F9B; S; 1F93; # GREEK CAPITAL LETTER ETA WITH DASIA AND VARIA AND PROSGEGRAMMENI
+1F9C; F; 1F24 03B9; # GREEK CAPITAL LETTER ETA WITH PSILI AND OXIA AND PROSGEGRAMMENI
+1F9C; S; 1F94; # GREEK CAPITAL LETTER ETA WITH PSILI AND OXIA AND PROSGEGRAMMENI
+1F9D; F; 1F25 03B9; # GREEK CAPITAL LETTER ETA WITH DASIA AND OXIA AND PROSGEGRAMMENI
+1F9D; S; 1F95; # GREEK CAPITAL LETTER ETA WITH DASIA AND OXIA AND PROSGEGRAMMENI
+1F9E; F; 1F26 03B9; # GREEK CAPITAL LETTER ETA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
+1F9E; S; 1F96; # GREEK CAPITAL LETTER ETA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
+1F9F; F; 1F27 03B9; # GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1F9F; S; 1F97; # GREEK CAPITAL LETTER ETA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1FA0; F; 1F60 03B9; # GREEK SMALL LETTER OMEGA WITH PSILI AND YPOGEGRAMMENI
+1FA1; F; 1F61 03B9; # GREEK SMALL LETTER OMEGA WITH DASIA AND YPOGEGRAMMENI
+1FA2; F; 1F62 03B9; # GREEK SMALL LETTER OMEGA WITH PSILI AND VARIA AND YPOGEGRAMMENI
+1FA3; F; 1F63 03B9; # GREEK SMALL LETTER OMEGA WITH DASIA AND VARIA AND YPOGEGRAMMENI
+1FA4; F; 1F64 03B9; # GREEK SMALL LETTER OMEGA WITH PSILI AND OXIA AND YPOGEGRAMMENI
+1FA5; F; 1F65 03B9; # GREEK SMALL LETTER OMEGA WITH DASIA AND OXIA AND YPOGEGRAMMENI
+1FA6; F; 1F66 03B9; # GREEK SMALL LETTER OMEGA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI
+1FA7; F; 1F67 03B9; # GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI
+1FA8; F; 1F60 03B9; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND PROSGEGRAMMENI
+1FA8; S; 1FA0; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND PROSGEGRAMMENI
+1FA9; F; 1F61 03B9; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND PROSGEGRAMMENI
+1FA9; S; 1FA1; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND PROSGEGRAMMENI
+1FAA; F; 1F62 03B9; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND VARIA AND PROSGEGRAMMENI
+1FAA; S; 1FA2; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND VARIA AND PROSGEGRAMMENI
+1FAB; F; 1F63 03B9; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND VARIA AND PROSGEGRAMMENI
+1FAB; S; 1FA3; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND VARIA AND PROSGEGRAMMENI
+1FAC; F; 1F64 03B9; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND OXIA AND PROSGEGRAMMENI
+1FAC; S; 1FA4; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND OXIA AND PROSGEGRAMMENI
+1FAD; F; 1F65 03B9; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND OXIA AND PROSGEGRAMMENI
+1FAD; S; 1FA5; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND OXIA AND PROSGEGRAMMENI
+1FAE; F; 1F66 03B9; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
+1FAE; S; 1FA6; # GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI AND PROSGEGRAMMENI
+1FAF; F; 1F67 03B9; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1FAF; S; 1FA7; # GREEK CAPITAL LETTER OMEGA WITH DASIA AND PERISPOMENI AND PROSGEGRAMMENI
+1FB2; F; 1F70 03B9; # GREEK SMALL LETTER ALPHA WITH VARIA AND YPOGEGRAMMENI
+1FB3; F; 03B1 03B9; # GREEK SMALL LETTER ALPHA WITH YPOGEGRAMMENI
+1FB4; F; 03AC 03B9; # GREEK SMALL LETTER ALPHA WITH OXIA AND YPOGEGRAMMENI
+1FB6; F; 03B1 0342; # GREEK SMALL LETTER ALPHA WITH PERISPOMENI
+1FB7; F; 03B1 0342 03B9; # GREEK SMALL LETTER ALPHA WITH PERISPOMENI AND YPOGEGRAMMENI
+1FB8; C; 1FB0; # GREEK CAPITAL LETTER ALPHA WITH VRACHY
+1FB9; C; 1FB1; # GREEK CAPITAL LETTER ALPHA WITH MACRON
+1FBA; C; 1F70; # GREEK CAPITAL LETTER ALPHA WITH VARIA
+1FBB; C; 1F71; # GREEK CAPITAL LETTER ALPHA WITH OXIA
+1FBC; F; 03B1 03B9; # GREEK CAPITAL LETTER ALPHA WITH PROSGEGRAMMENI
+1FBC; S; 1FB3; # GREEK CAPITAL LETTER ALPHA WITH PROSGEGRAMMENI
+1FBE; C; 03B9; # GREEK PROSGEGRAMMENI
+1FC2; F; 1F74 03B9; # GREEK SMALL LETTER ETA WITH VARIA AND YPOGEGRAMMENI
+1FC3; F; 03B7 03B9; # GREEK SMALL LETTER ETA WITH YPOGEGRAMMENI
+1FC4; F; 03AE 03B9; # GREEK SMALL LETTER ETA WITH OXIA AND YPOGEGRAMMENI
+1FC6; F; 03B7 0342; # GREEK SMALL LETTER ETA WITH PERISPOMENI
+1FC7; F; 03B7 0342 03B9; # GREEK SMALL LETTER ETA WITH PERISPOMENI AND YPOGEGRAMMENI
+1FC8; C; 1F72; # GREEK CAPITAL LETTER EPSILON WITH VARIA
+1FC9; C; 1F73; # GREEK CAPITAL LETTER EPSILON WITH OXIA
+1FCA; C; 1F74; # GREEK CAPITAL LETTER ETA WITH VARIA
+1FCB; C; 1F75; # GREEK CAPITAL LETTER ETA WITH OXIA
+1FCC; F; 03B7 03B9; # GREEK CAPITAL LETTER ETA WITH PROSGEGRAMMENI
+1FCC; S; 1FC3; # GREEK CAPITAL LETTER ETA WITH PROSGEGRAMMENI
+1FD2; F; 03B9 0308 0300; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND VARIA
+1FD3; F; 03B9 0308 0301; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA
+1FD6; F; 03B9 0342; # GREEK SMALL LETTER IOTA WITH PERISPOMENI
+1FD7; F; 03B9 0308 0342; # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND PERISPOMENI
+1FD8; C; 1FD0; # GREEK CAPITAL LETTER IOTA WITH VRACHY
+1FD9; C; 1FD1; # GREEK CAPITAL LETTER IOTA WITH MACRON
+1FDA; C; 1F76; # GREEK CAPITAL LETTER IOTA WITH VARIA
+1FDB; C; 1F77; # GREEK CAPITAL LETTER IOTA WITH OXIA
+1FE2; F; 03C5 0308 0300; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND VARIA
+1FE3; F; 03C5 0308 0301; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA
+1FE4; F; 03C1 0313; # GREEK SMALL LETTER RHO WITH PSILI
+1FE6; F; 03C5 0342; # GREEK SMALL LETTER UPSILON WITH PERISPOMENI
+1FE7; F; 03C5 0308 0342; # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND PERISPOMENI
+1FE8; C; 1FE0; # GREEK CAPITAL LETTER UPSILON WITH VRACHY
+1FE9; C; 1FE1; # GREEK CAPITAL LETTER UPSILON WITH MACRON
+1FEA; C; 1F7A; # GREEK CAPITAL LETTER UPSILON WITH VARIA
+1FEB; C; 1F7B; # GREEK CAPITAL LETTER UPSILON WITH OXIA
+1FEC; C; 1FE5; # GREEK CAPITAL LETTER RHO WITH DASIA
+1FF2; F; 1F7C 03B9; # GREEK SMALL LETTER OMEGA WITH VARIA AND YPOGEGRAMMENI
+1FF3; F; 03C9 03B9; # GREEK SMALL LETTER OMEGA WITH YPOGEGRAMMENI
+1FF4; F; 03CE 03B9; # GREEK SMALL LETTER OMEGA WITH OXIA AND YPOGEGRAMMENI
+1FF6; F; 03C9 0342; # GREEK SMALL LETTER OMEGA WITH PERISPOMENI
+1FF7; F; 03C9 0342 03B9; # GREEK SMALL LETTER OMEGA WITH PERISPOMENI AND YPOGEGRAMMENI
+1FF8; C; 1F78; # GREEK CAPITAL LETTER OMICRON WITH VARIA
+1FF9; C; 1F79; # GREEK CAPITAL LETTER OMICRON WITH OXIA
+1FFA; C; 1F7C; # GREEK CAPITAL LETTER OMEGA WITH VARIA
+1FFB; C; 1F7D; # GREEK CAPITAL LETTER OMEGA WITH OXIA
+1FFC; F; 03C9 03B9; # GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI
+1FFC; S; 1FF3; # GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI
+2126; C; 03C9; # OHM SIGN
+212A; C; 006B; # KELVIN SIGN
+212B; C; 00E5; # ANGSTROM SIGN
+2132; C; 214E; # TURNED CAPITAL F
+2160; C; 2170; # ROMAN NUMERAL ONE
+2161; C; 2171; # ROMAN NUMERAL TWO
+2162; C; 2172; # ROMAN NUMERAL THREE
+2163; C; 2173; # ROMAN NUMERAL FOUR
+2164; C; 2174; # ROMAN NUMERAL FIVE
+2165; C; 2175; # ROMAN NUMERAL SIX
+2166; C; 2176; # ROMAN NUMERAL SEVEN
+2167; C; 2177; # ROMAN NUMERAL EIGHT
+2168; C; 2178; # ROMAN NUMERAL NINE
+2169; C; 2179; # ROMAN NUMERAL TEN
+216A; C; 217A; # ROMAN NUMERAL ELEVEN
+216B; C; 217B; # ROMAN NUMERAL TWELVE
+216C; C; 217C; # ROMAN NUMERAL FIFTY
+216D; C; 217D; # ROMAN NUMERAL ONE HUNDRED
+216E; C; 217E; # ROMAN NUMERAL FIVE HUNDRED
+216F; C; 217F; # ROMAN NUMERAL ONE THOUSAND
+2183; C; 2184; # ROMAN NUMERAL REVERSED ONE HUNDRED
+24B6; C; 24D0; # CIRCLED LATIN CAPITAL LETTER A
+24B7; C; 24D1; # CIRCLED LATIN CAPITAL LETTER B
+24B8; C; 24D2; # CIRCLED LATIN CAPITAL LETTER C
+24B9; C; 24D3; # CIRCLED LATIN CAPITAL LETTER D
+24BA; C; 24D4; # CIRCLED LATIN CAPITAL LETTER E
+24BB; C; 24D5; # CIRCLED LATIN CAPITAL LETTER F
+24BC; C; 24D6; # CIRCLED LATIN CAPITAL LETTER G
+24BD; C; 24D7; # CIRCLED LATIN CAPITAL LETTER H
+24BE; C; 24D8; # CIRCLED LATIN CAPITAL LETTER I
+24BF; C; 24D9; # CIRCLED LATIN CAPITAL LETTER J
+24C0; C; 24DA; # CIRCLED LATIN CAPITAL LETTER K
+24C1; C; 24DB; # CIRCLED LATIN CAPITAL LETTER L
+24C2; C; 24DC; # CIRCLED LATIN CAPITAL LETTER M
+24C3; C; 24DD; # CIRCLED LATIN CAPITAL LETTER N
+24C4; C; 24DE; # CIRCLED LATIN CAPITAL LETTER O
+24C5; C; 24DF; # CIRCLED LATIN CAPITAL LETTER P
+24C6; C; 24E0; # CIRCLED LATIN CAPITAL LETTER Q
+24C7; C; 24E1; # CIRCLED LATIN CAPITAL LETTER R
+24C8; C; 24E2; # CIRCLED LATIN CAPITAL LETTER S
+24C9; C; 24E3; # CIRCLED LATIN CAPITAL LETTER T
+24CA; C; 24E4; # CIRCLED LATIN CAPITAL LETTER U
+24CB; C; 24E5; # CIRCLED LATIN CAPITAL LETTER V
+24CC; C; 24E6; # CIRCLED LATIN CAPITAL LETTER W
+24CD; C; 24E7; # CIRCLED LATIN CAPITAL LETTER X
+24CE; C; 24E8; # CIRCLED LATIN CAPITAL LETTER Y
+24CF; C; 24E9; # CIRCLED LATIN CAPITAL LETTER Z
+2C00; C; 2C30; # GLAGOLITIC CAPITAL LETTER AZU
+2C01; C; 2C31; # GLAGOLITIC CAPITAL LETTER BUKY
+2C02; C; 2C32; # GLAGOLITIC CAPITAL LETTER VEDE
+2C03; C; 2C33; # GLAGOLITIC CAPITAL LETTER GLAGOLI
+2C04; C; 2C34; # GLAGOLITIC CAPITAL LETTER DOBRO
+2C05; C; 2C35; # GLAGOLITIC CAPITAL LETTER YESTU
+2C06; C; 2C36; # GLAGOLITIC CAPITAL LETTER ZHIVETE
+2C07; C; 2C37; # GLAGOLITIC CAPITAL LETTER DZELO
+2C08; C; 2C38; # GLAGOLITIC CAPITAL LETTER ZEMLJA
+2C09; C; 2C39; # GLAGOLITIC CAPITAL LETTER IZHE
+2C0A; C; 2C3A; # GLAGOLITIC CAPITAL LETTER INITIAL IZHE
+2C0B; C; 2C3B; # GLAGOLITIC CAPITAL LETTER I
+2C0C; C; 2C3C; # GLAGOLITIC CAPITAL LETTER DJERVI
+2C0D; C; 2C3D; # GLAGOLITIC CAPITAL LETTER KAKO
+2C0E; C; 2C3E; # GLAGOLITIC CAPITAL LETTER LJUDIJE
+2C0F; C; 2C3F; # GLAGOLITIC CAPITAL LETTER MYSLITE
+2C10; C; 2C40; # GLAGOLITIC CAPITAL LETTER NASHI
+2C11; C; 2C41; # GLAGOLITIC CAPITAL LETTER ONU
+2C12; C; 2C42; # GLAGOLITIC CAPITAL LETTER POKOJI
+2C13; C; 2C43; # GLAGOLITIC CAPITAL LETTER RITSI
+2C14; C; 2C44; # GLAGOLITIC CAPITAL LETTER SLOVO
+2C15; C; 2C45; # GLAGOLITIC CAPITAL LETTER TVRIDO
+2C16; C; 2C46; # GLAGOLITIC CAPITAL LETTER UKU
+2C17; C; 2C47; # GLAGOLITIC CAPITAL LETTER FRITU
+2C18; C; 2C48; # GLAGOLITIC CAPITAL LETTER HERU
+2C19; C; 2C49; # GLAGOLITIC CAPITAL LETTER OTU
+2C1A; C; 2C4A; # GLAGOLITIC CAPITAL LETTER PE
+2C1B; C; 2C4B; # GLAGOLITIC CAPITAL LETTER SHTA
+2C1C; C; 2C4C; # GLAGOLITIC CAPITAL LETTER TSI
+2C1D; C; 2C4D; # GLAGOLITIC CAPITAL LETTER CHRIVI
+2C1E; C; 2C4E; # GLAGOLITIC CAPITAL LETTER SHA
+2C1F; C; 2C4F; # GLAGOLITIC CAPITAL LETTER YERU
+2C20; C; 2C50; # GLAGOLITIC CAPITAL LETTER YERI
+2C21; C; 2C51; # GLAGOLITIC CAPITAL LETTER YATI
+2C22; C; 2C52; # GLAGOLITIC CAPITAL LETTER SPIDERY HA
+2C23; C; 2C53; # GLAGOLITIC CAPITAL LETTER YU
+2C24; C; 2C54; # GLAGOLITIC CAPITAL LETTER SMALL YUS
+2C25; C; 2C55; # GLAGOLITIC CAPITAL LETTER SMALL YUS WITH TAIL
+2C26; C; 2C56; # GLAGOLITIC CAPITAL LETTER YO
+2C27; C; 2C57; # GLAGOLITIC CAPITAL LETTER IOTATED SMALL YUS
+2C28; C; 2C58; # GLAGOLITIC CAPITAL LETTER BIG YUS
+2C29; C; 2C59; # GLAGOLITIC CAPITAL LETTER IOTATED BIG YUS
+2C2A; C; 2C5A; # GLAGOLITIC CAPITAL LETTER FITA
+2C2B; C; 2C5B; # GLAGOLITIC CAPITAL LETTER IZHITSA
+2C2C; C; 2C5C; # GLAGOLITIC CAPITAL LETTER SHTAPIC
+2C2D; C; 2C5D; # GLAGOLITIC CAPITAL LETTER TROKUTASTI A
+2C2E; C; 2C5E; # GLAGOLITIC CAPITAL LETTER LATINATE MYSLITE
+2C60; C; 2C61; # LATIN CAPITAL LETTER L WITH DOUBLE BAR
+2C62; C; 026B; # LATIN CAPITAL LETTER L WITH MIDDLE TILDE
+2C63; C; 1D7D; # LATIN CAPITAL LETTER P WITH STROKE
+2C64; C; 027D; # LATIN CAPITAL LETTER R WITH TAIL
+2C67; C; 2C68; # LATIN CAPITAL LETTER H WITH DESCENDER
+2C69; C; 2C6A; # LATIN CAPITAL LETTER K WITH DESCENDER
+2C6B; C; 2C6C; # LATIN CAPITAL LETTER Z WITH DESCENDER
+2C6D; C; 0251; # LATIN CAPITAL LETTER ALPHA
+2C6E; C; 0271; # LATIN CAPITAL LETTER M WITH HOOK
+2C6F; C; 0250; # LATIN CAPITAL LETTER TURNED A
+2C70; C; 0252; # LATIN CAPITAL LETTER TURNED ALPHA
+2C72; C; 2C73; # LATIN CAPITAL LETTER W WITH HOOK
+2C75; C; 2C76; # LATIN CAPITAL LETTER HALF H
+2C7E; C; 023F; # LATIN CAPITAL LETTER S WITH SWASH TAIL
+2C7F; C; 0240; # LATIN CAPITAL LETTER Z WITH SWASH TAIL
+2C80; C; 2C81; # COPTIC CAPITAL LETTER ALFA
+2C82; C; 2C83; # COPTIC CAPITAL LETTER VIDA
+2C84; C; 2C85; # COPTIC CAPITAL LETTER GAMMA
+2C86; C; 2C87; # COPTIC CAPITAL LETTER DALDA
+2C88; C; 2C89; # COPTIC CAPITAL LETTER EIE
+2C8A; C; 2C8B; # COPTIC CAPITAL LETTER SOU
+2C8C; C; 2C8D; # COPTIC CAPITAL LETTER ZATA
+2C8E; C; 2C8F; # COPTIC CAPITAL LETTER HATE
+2C90; C; 2C91; # COPTIC CAPITAL LETTER THETHE
+2C92; C; 2C93; # COPTIC CAPITAL LETTER IAUDA
+2C94; C; 2C95; # COPTIC CAPITAL LETTER KAPA
+2C96; C; 2C97; # COPTIC CAPITAL LETTER LAULA
+2C98; C; 2C99; # COPTIC CAPITAL LETTER MI
+2C9A; C; 2C9B; # COPTIC CAPITAL LETTER NI
+2C9C; C; 2C9D; # COPTIC CAPITAL LETTER KSI
+2C9E; C; 2C9F; # COPTIC CAPITAL LETTER O
+2CA0; C; 2CA1; # COPTIC CAPITAL LETTER PI
+2CA2; C; 2CA3; # COPTIC CAPITAL LETTER RO
+2CA4; C; 2CA5; # COPTIC CAPITAL LETTER SIMA
+2CA6; C; 2CA7; # COPTIC CAPITAL LETTER TAU
+2CA8; C; 2CA9; # COPTIC CAPITAL LETTER UA
+2CAA; C; 2CAB; # COPTIC CAPITAL LETTER FI
+2CAC; C; 2CAD; # COPTIC CAPITAL LETTER KHI
+2CAE; C; 2CAF; # COPTIC CAPITAL LETTER PSI
+2CB0; C; 2CB1; # COPTIC CAPITAL LETTER OOU
+2CB2; C; 2CB3; # COPTIC CAPITAL LETTER DIALECT-P ALEF
+2CB4; C; 2CB5; # COPTIC CAPITAL LETTER OLD COPTIC AIN
+2CB6; C; 2CB7; # COPTIC CAPITAL LETTER CRYPTOGRAMMIC EIE
+2CB8; C; 2CB9; # COPTIC CAPITAL LETTER DIALECT-P KAPA
+2CBA; C; 2CBB; # COPTIC CAPITAL LETTER DIALECT-P NI
+2CBC; C; 2CBD; # COPTIC CAPITAL LETTER CRYPTOGRAMMIC NI
+2CBE; C; 2CBF; # COPTIC CAPITAL LETTER OLD COPTIC OOU
+2CC0; C; 2CC1; # COPTIC CAPITAL LETTER SAMPI
+2CC2; C; 2CC3; # COPTIC CAPITAL LETTER CROSSED SHEI
+2CC4; C; 2CC5; # COPTIC CAPITAL LETTER OLD COPTIC SHEI
+2CC6; C; 2CC7; # COPTIC CAPITAL LETTER OLD COPTIC ESH
+2CC8; C; 2CC9; # COPTIC CAPITAL LETTER AKHMIMIC KHEI
+2CCA; C; 2CCB; # COPTIC CAPITAL LETTER DIALECT-P HORI
+2CCC; C; 2CCD; # COPTIC CAPITAL LETTER OLD COPTIC HORI
+2CCE; C; 2CCF; # COPTIC CAPITAL LETTER OLD COPTIC HA
+2CD0; C; 2CD1; # COPTIC CAPITAL LETTER L-SHAPED HA
+2CD2; C; 2CD3; # COPTIC CAPITAL LETTER OLD COPTIC HEI
+2CD4; C; 2CD5; # COPTIC CAPITAL LETTER OLD COPTIC HAT
+2CD6; C; 2CD7; # COPTIC CAPITAL LETTER OLD COPTIC GANGIA
+2CD8; C; 2CD9; # COPTIC CAPITAL LETTER OLD COPTIC DJA
+2CDA; C; 2CDB; # COPTIC CAPITAL LETTER OLD COPTIC SHIMA
+2CDC; C; 2CDD; # COPTIC CAPITAL LETTER OLD NUBIAN SHIMA
+2CDE; C; 2CDF; # COPTIC CAPITAL LETTER OLD NUBIAN NGI
+2CE0; C; 2CE1; # COPTIC CAPITAL LETTER OLD NUBIAN NYI
+2CE2; C; 2CE3; # COPTIC CAPITAL LETTER OLD NUBIAN WAU
+2CEB; C; 2CEC; # COPTIC CAPITAL LETTER CRYPTOGRAMMIC SHEI
+2CED; C; 2CEE; # COPTIC CAPITAL LETTER CRYPTOGRAMMIC GANGIA
+2CF2; C; 2CF3; # COPTIC CAPITAL LETTER BOHAIRIC KHEI
+A640; C; A641; # CYRILLIC CAPITAL LETTER ZEMLYA
+A642; C; A643; # CYRILLIC CAPITAL LETTER DZELO
+A644; C; A645; # CYRILLIC CAPITAL LETTER REVERSED DZE
+A646; C; A647; # CYRILLIC CAPITAL LETTER IOTA
+A648; C; A649; # CYRILLIC CAPITAL LETTER DJERV
+A64A; C; A64B; # CYRILLIC CAPITAL LETTER MONOGRAPH UK
+A64C; C; A64D; # CYRILLIC CAPITAL LETTER BROAD OMEGA
+A64E; C; A64F; # CYRILLIC CAPITAL LETTER NEUTRAL YER
+A650; C; A651; # CYRILLIC CAPITAL LETTER YERU WITH BACK YER
+A652; C; A653; # CYRILLIC CAPITAL LETTER IOTIFIED YAT
+A654; C; A655; # CYRILLIC CAPITAL LETTER REVERSED YU
+A656; C; A657; # CYRILLIC CAPITAL LETTER IOTIFIED A
+A658; C; A659; # CYRILLIC CAPITAL LETTER CLOSED LITTLE YUS
+A65A; C; A65B; # CYRILLIC CAPITAL LETTER BLENDED YUS
+A65C; C; A65D; # CYRILLIC CAPITAL LETTER IOTIFIED CLOSED LITTLE YUS
+A65E; C; A65F; # CYRILLIC CAPITAL LETTER YN
+A660; C; A661; # CYRILLIC CAPITAL LETTER REVERSED TSE
+A662; C; A663; # CYRILLIC CAPITAL LETTER SOFT DE
+A664; C; A665; # CYRILLIC CAPITAL LETTER SOFT EL
+A666; C; A667; # CYRILLIC CAPITAL LETTER SOFT EM
+A668; C; A669; # CYRILLIC CAPITAL LETTER MONOCULAR O
+A66A; C; A66B; # CYRILLIC CAPITAL LETTER BINOCULAR O
+A66C; C; A66D; # CYRILLIC CAPITAL LETTER DOUBLE MONOCULAR O
+A680; C; A681; # CYRILLIC CAPITAL LETTER DWE
+A682; C; A683; # CYRILLIC CAPITAL LETTER DZWE
+A684; C; A685; # CYRILLIC CAPITAL LETTER ZHWE
+A686; C; A687; # CYRILLIC CAPITAL LETTER CCHE
+A688; C; A689; # CYRILLIC CAPITAL LETTER DZZE
+A68A; C; A68B; # CYRILLIC CAPITAL LETTER TE WITH MIDDLE HOOK
+A68C; C; A68D; # CYRILLIC CAPITAL LETTER TWE
+A68E; C; A68F; # CYRILLIC CAPITAL LETTER TSWE
+A690; C; A691; # CYRILLIC CAPITAL LETTER TSSE
+A692; C; A693; # CYRILLIC CAPITAL LETTER TCHE
+A694; C; A695; # CYRILLIC CAPITAL LETTER HWE
+A696; C; A697; # CYRILLIC CAPITAL LETTER SHWE
+A698; C; A699; # CYRILLIC CAPITAL LETTER DOUBLE O
+A69A; C; A69B; # CYRILLIC CAPITAL LETTER CROSSED O
+A722; C; A723; # LATIN CAPITAL LETTER EGYPTOLOGICAL ALEF
+A724; C; A725; # LATIN CAPITAL LETTER EGYPTOLOGICAL AIN
+A726; C; A727; # LATIN CAPITAL LETTER HENG
+A728; C; A729; # LATIN CAPITAL LETTER TZ
+A72A; C; A72B; # LATIN CAPITAL LETTER TRESILLO
+A72C; C; A72D; # LATIN CAPITAL LETTER CUATRILLO
+A72E; C; A72F; # LATIN CAPITAL LETTER CUATRILLO WITH COMMA
+A732; C; A733; # LATIN CAPITAL LETTER AA
+A734; C; A735; # LATIN CAPITAL LETTER AO
+A736; C; A737; # LATIN CAPITAL LETTER AU
+A738; C; A739; # LATIN CAPITAL LETTER AV
+A73A; C; A73B; # LATIN CAPITAL LETTER AV WITH HORIZONTAL BAR
+A73C; C; A73D; # LATIN CAPITAL LETTER AY
+A73E; C; A73F; # LATIN CAPITAL LETTER REVERSED C WITH DOT
+A740; C; A741; # LATIN CAPITAL LETTER K WITH STROKE
+A742; C; A743; # LATIN CAPITAL LETTER K WITH DIAGONAL STROKE
+A744; C; A745; # LATIN CAPITAL LETTER K WITH STROKE AND DIAGONAL STROKE
+A746; C; A747; # LATIN CAPITAL LETTER BROKEN L
+A748; C; A749; # LATIN CAPITAL LETTER L WITH HIGH STROKE
+A74A; C; A74B; # LATIN CAPITAL LETTER O WITH LONG STROKE OVERLAY
+A74C; C; A74D; # LATIN CAPITAL LETTER O WITH LOOP
+A74E; C; A74F; # LATIN CAPITAL LETTER OO
+A750; C; A751; # LATIN CAPITAL LETTER P WITH STROKE THROUGH DESCENDER
+A752; C; A753; # LATIN CAPITAL LETTER P WITH FLOURISH
+A754; C; A755; # LATIN CAPITAL LETTER P WITH SQUIRREL TAIL
+A756; C; A757; # LATIN CAPITAL LETTER Q WITH STROKE THROUGH DESCENDER
+A758; C; A759; # LATIN CAPITAL LETTER Q WITH DIAGONAL STROKE
+A75A; C; A75B; # LATIN CAPITAL LETTER R ROTUNDA
+A75C; C; A75D; # LATIN CAPITAL LETTER RUM ROTUNDA
+A75E; C; A75F; # LATIN CAPITAL LETTER V WITH DIAGONAL STROKE
+A760; C; A761; # LATIN CAPITAL LETTER VY
+A762; C; A763; # LATIN CAPITAL LETTER VISIGOTHIC Z
+A764; C; A765; # LATIN CAPITAL LETTER THORN WITH STROKE
+A766; C; A767; # LATIN CAPITAL LETTER THORN WITH STROKE THROUGH DESCENDER
+A768; C; A769; # LATIN CAPITAL LETTER VEND
+A76A; C; A76B; # LATIN CAPITAL LETTER ET
+A76C; C; A76D; # LATIN CAPITAL LETTER IS
+A76E; C; A76F; # LATIN CAPITAL LETTER CON
+A779; C; A77A; # LATIN CAPITAL LETTER INSULAR D
+A77B; C; A77C; # LATIN CAPITAL LETTER INSULAR F
+A77D; C; 1D79; # LATIN CAPITAL LETTER INSULAR G
+A77E; C; A77F; # LATIN CAPITAL LETTER TURNED INSULAR G
+A780; C; A781; # LATIN CAPITAL LETTER TURNED L
+A782; C; A783; # LATIN CAPITAL LETTER INSULAR R
+A784; C; A785; # LATIN CAPITAL LETTER INSULAR S
+A786; C; A787; # LATIN CAPITAL LETTER INSULAR T
+A78B; C; A78C; # LATIN CAPITAL LETTER SALTILLO
+A78D; C; 0265; # LATIN CAPITAL LETTER TURNED H
+A790; C; A791; # LATIN CAPITAL LETTER N WITH DESCENDER
+A792; C; A793; # LATIN CAPITAL LETTER C WITH BAR
+A796; C; A797; # LATIN CAPITAL LETTER B WITH FLOURISH
+A798; C; A799; # LATIN CAPITAL LETTER F WITH STROKE
+A79A; C; A79B; # LATIN CAPITAL LETTER VOLAPUK AE
+A79C; C; A79D; # LATIN CAPITAL LETTER VOLAPUK OE
+A79E; C; A79F; # LATIN CAPITAL LETTER VOLAPUK UE
+A7A0; C; A7A1; # LATIN CAPITAL LETTER G WITH OBLIQUE STROKE
+A7A2; C; A7A3; # LATIN CAPITAL LETTER K WITH OBLIQUE STROKE
+A7A4; C; A7A5; # LATIN CAPITAL LETTER N WITH OBLIQUE STROKE
+A7A6; C; A7A7; # LATIN CAPITAL LETTER R WITH OBLIQUE STROKE
+A7A8; C; A7A9; # LATIN CAPITAL LETTER S WITH OBLIQUE STROKE
+A7AA; C; 0266; # LATIN CAPITAL LETTER H WITH HOOK
+A7AB; C; 025C; # LATIN CAPITAL LETTER REVERSED OPEN E
+A7AC; C; 0261; # LATIN CAPITAL LETTER SCRIPT G
+A7AD; C; 026C; # LATIN CAPITAL LETTER L WITH BELT
+A7B0; C; 029E; # LATIN CAPITAL LETTER TURNED K
+A7B1; C; 0287; # LATIN CAPITAL LETTER TURNED T
+A7B2; C; 029D; # LATIN CAPITAL LETTER J WITH CROSSED-TAIL
+A7B3; C; AB53; # LATIN CAPITAL LETTER CHI
+A7B4; C; A7B5; # LATIN CAPITAL LETTER BETA
+A7B6; C; A7B7; # LATIN CAPITAL LETTER OMEGA
+AB70; C; 13A0; # CHEROKEE SMALL LETTER A
+AB71; C; 13A1; # CHEROKEE SMALL LETTER E
+AB72; C; 13A2; # CHEROKEE SMALL LETTER I
+AB73; C; 13A3; # CHEROKEE SMALL LETTER O
+AB74; C; 13A4; # CHEROKEE SMALL LETTER U
+AB75; C; 13A5; # CHEROKEE SMALL LETTER V
+AB76; C; 13A6; # CHEROKEE SMALL LETTER GA
+AB77; C; 13A7; # CHEROKEE SMALL LETTER KA
+AB78; C; 13A8; # CHEROKEE SMALL LETTER GE
+AB79; C; 13A9; # CHEROKEE SMALL LETTER GI
+AB7A; C; 13AA; # CHEROKEE SMALL LETTER GO
+AB7B; C; 13AB; # CHEROKEE SMALL LETTER GU
+AB7C; C; 13AC; # CHEROKEE SMALL LETTER GV
+AB7D; C; 13AD; # CHEROKEE SMALL LETTER HA
+AB7E; C; 13AE; # CHEROKEE SMALL LETTER HE
+AB7F; C; 13AF; # CHEROKEE SMALL LETTER HI
+AB80; C; 13B0; # CHEROKEE SMALL LETTER HO
+AB81; C; 13B1; # CHEROKEE SMALL LETTER HU
+AB82; C; 13B2; # CHEROKEE SMALL LETTER HV
+AB83; C; 13B3; # CHEROKEE SMALL LETTER LA
+AB84; C; 13B4; # CHEROKEE SMALL LETTER LE
+AB85; C; 13B5; # CHEROKEE SMALL LETTER LI
+AB86; C; 13B6; # CHEROKEE SMALL LETTER LO
+AB87; C; 13B7; # CHEROKEE SMALL LETTER LU
+AB88; C; 13B8; # CHEROKEE SMALL LETTER LV
+AB89; C; 13B9; # CHEROKEE SMALL LETTER MA
+AB8A; C; 13BA; # CHEROKEE SMALL LETTER ME
+AB8B; C; 13BB; # CHEROKEE SMALL LETTER MI
+AB8C; C; 13BC; # CHEROKEE SMALL LETTER MO
+AB8D; C; 13BD; # CHEROKEE SMALL LETTER MU
+AB8E; C; 13BE; # CHEROKEE SMALL LETTER NA
+AB8F; C; 13BF; # CHEROKEE SMALL LETTER HNA
+AB90; C; 13C0; # CHEROKEE SMALL LETTER NAH
+AB91; C; 13C1; # CHEROKEE SMALL LETTER NE
+AB92; C; 13C2; # CHEROKEE SMALL LETTER NI
+AB93; C; 13C3; # CHEROKEE SMALL LETTER NO
+AB94; C; 13C4; # CHEROKEE SMALL LETTER NU
+AB95; C; 13C5; # CHEROKEE SMALL LETTER NV
+AB96; C; 13C6; # CHEROKEE SMALL LETTER QUA
+AB97; C; 13C7; # CHEROKEE SMALL LETTER QUE
+AB98; C; 13C8; # CHEROKEE SMALL LETTER QUI
+AB99; C; 13C9; # CHEROKEE SMALL LETTER QUO
+AB9A; C; 13CA; # CHEROKEE SMALL LETTER QUU
+AB9B; C; 13CB; # CHEROKEE SMALL LETTER QUV
+AB9C; C; 13CC; # CHEROKEE SMALL LETTER SA
+AB9D; C; 13CD; # CHEROKEE SMALL LETTER S
+AB9E; C; 13CE; # CHEROKEE SMALL LETTER SE
+AB9F; C; 13CF; # CHEROKEE SMALL LETTER SI
+ABA0; C; 13D0; # CHEROKEE SMALL LETTER SO
+ABA1; C; 13D1; # CHEROKEE SMALL LETTER SU
+ABA2; C; 13D2; # CHEROKEE SMALL LETTER SV
+ABA3; C; 13D3; # CHEROKEE SMALL LETTER DA
+ABA4; C; 13D4; # CHEROKEE SMALL LETTER TA
+ABA5; C; 13D5; # CHEROKEE SMALL LETTER DE
+ABA6; C; 13D6; # CHEROKEE SMALL LETTER TE
+ABA7; C; 13D7; # CHEROKEE SMALL LETTER DI
+ABA8; C; 13D8; # CHEROKEE SMALL LETTER TI
+ABA9; C; 13D9; # CHEROKEE SMALL LETTER DO
+ABAA; C; 13DA; # CHEROKEE SMALL LETTER DU
+ABAB; C; 13DB; # CHEROKEE SMALL LETTER DV
+ABAC; C; 13DC; # CHEROKEE SMALL LETTER DLA
+ABAD; C; 13DD; # CHEROKEE SMALL LETTER TLA
+ABAE; C; 13DE; # CHEROKEE SMALL LETTER TLE
+ABAF; C; 13DF; # CHEROKEE SMALL LETTER TLI
+ABB0; C; 13E0; # CHEROKEE SMALL LETTER TLO
+ABB1; C; 13E1; # CHEROKEE SMALL LETTER TLU
+ABB2; C; 13E2; # CHEROKEE SMALL LETTER TLV
+ABB3; C; 13E3; # CHEROKEE SMALL LETTER TSA
+ABB4; C; 13E4; # CHEROKEE SMALL LETTER TSE
+ABB5; C; 13E5; # CHEROKEE SMALL LETTER TSI
+ABB6; C; 13E6; # CHEROKEE SMALL LETTER TSO
+ABB7; C; 13E7; # CHEROKEE SMALL LETTER TSU
+ABB8; C; 13E8; # CHEROKEE SMALL LETTER TSV
+ABB9; C; 13E9; # CHEROKEE SMALL LETTER WA
+ABBA; C; 13EA; # CHEROKEE SMALL LETTER WE
+ABBB; C; 13EB; # CHEROKEE SMALL LETTER WI
+ABBC; C; 13EC; # CHEROKEE SMALL LETTER WO
+ABBD; C; 13ED; # CHEROKEE SMALL LETTER WU
+ABBE; C; 13EE; # CHEROKEE SMALL LETTER WV
+ABBF; C; 13EF; # CHEROKEE SMALL LETTER YA
+FB00; F; 0066 0066; # LATIN SMALL LIGATURE FF
+FB01; F; 0066 0069; # LATIN SMALL LIGATURE FI
+FB02; F; 0066 006C; # LATIN SMALL LIGATURE FL
+FB03; F; 0066 0066 0069; # LATIN SMALL LIGATURE FFI
+FB04; F; 0066 0066 006C; # LATIN SMALL LIGATURE FFL
+FB05; F; 0073 0074; # LATIN SMALL LIGATURE LONG S T
+FB06; F; 0073 0074; # LATIN SMALL LIGATURE ST
+FB13; F; 0574 0576; # ARMENIAN SMALL LIGATURE MEN NOW
+FB14; F; 0574 0565; # ARMENIAN SMALL LIGATURE MEN ECH
+FB15; F; 0574 056B; # ARMENIAN SMALL LIGATURE MEN INI
+FB16; F; 057E 0576; # ARMENIAN SMALL LIGATURE VEW NOW
+FB17; F; 0574 056D; # ARMENIAN SMALL LIGATURE MEN XEH
+FF21; C; FF41; # FULLWIDTH LATIN CAPITAL LETTER A
+FF22; C; FF42; # FULLWIDTH LATIN CAPITAL LETTER B
+FF23; C; FF43; # FULLWIDTH LATIN CAPITAL LETTER C
+FF24; C; FF44; # FULLWIDTH LATIN CAPITAL LETTER D
+FF25; C; FF45; # FULLWIDTH LATIN CAPITAL LETTER E
+FF26; C; FF46; # FULLWIDTH LATIN CAPITAL LETTER F
+FF27; C; FF47; # FULLWIDTH LATIN CAPITAL LETTER G
+FF28; C; FF48; # FULLWIDTH LATIN CAPITAL LETTER H
+FF29; C; FF49; # FULLWIDTH LATIN CAPITAL LETTER I
+FF2A; C; FF4A; # FULLWIDTH LATIN CAPITAL LETTER J
+FF2B; C; FF4B; # FULLWIDTH LATIN CAPITAL LETTER K
+FF2C; C; FF4C; # FULLWIDTH LATIN CAPITAL LETTER L
+FF2D; C; FF4D; # FULLWIDTH LATIN CAPITAL LETTER M
+FF2E; C; FF4E; # FULLWIDTH LATIN CAPITAL LETTER N
+FF2F; C; FF4F; # FULLWIDTH LATIN CAPITAL LETTER O
+FF30; C; FF50; # FULLWIDTH LATIN CAPITAL LETTER P
+FF31; C; FF51; # FULLWIDTH LATIN CAPITAL LETTER Q
+FF32; C; FF52; # FULLWIDTH LATIN CAPITAL LETTER R
+FF33; C; FF53; # FULLWIDTH LATIN CAPITAL LETTER S
+FF34; C; FF54; # FULLWIDTH LATIN CAPITAL LETTER T
+FF35; C; FF55; # FULLWIDTH LATIN CAPITAL LETTER U
+FF36; C; FF56; # FULLWIDTH LATIN CAPITAL LETTER V
+FF37; C; FF57; # FULLWIDTH LATIN CAPITAL LETTER W
+FF38; C; FF58; # FULLWIDTH LATIN CAPITAL LETTER X
+FF39; C; FF59; # FULLWIDTH LATIN CAPITAL LETTER Y
+FF3A; C; FF5A; # FULLWIDTH LATIN CAPITAL LETTER Z
+10400; C; 10428; # DESERET CAPITAL LETTER LONG I
+10401; C; 10429; # DESERET CAPITAL LETTER LONG E
+10402; C; 1042A; # DESERET CAPITAL LETTER LONG A
+10403; C; 1042B; # DESERET CAPITAL LETTER LONG AH
+10404; C; 1042C; # DESERET CAPITAL LETTER LONG O
+10405; C; 1042D; # DESERET CAPITAL LETTER LONG OO
+10406; C; 1042E; # DESERET CAPITAL LETTER SHORT I
+10407; C; 1042F; # DESERET CAPITAL LETTER SHORT E
+10408; C; 10430; # DESERET CAPITAL LETTER SHORT A
+10409; C; 10431; # DESERET CAPITAL LETTER SHORT AH
+1040A; C; 10432; # DESERET CAPITAL LETTER SHORT O
+1040B; C; 10433; # DESERET CAPITAL LETTER SHORT OO
+1040C; C; 10434; # DESERET CAPITAL LETTER AY
+1040D; C; 10435; # DESERET CAPITAL LETTER OW
+1040E; C; 10436; # DESERET CAPITAL LETTER WU
+1040F; C; 10437; # DESERET CAPITAL LETTER YEE
+10410; C; 10438; # DESERET CAPITAL LETTER H
+10411; C; 10439; # DESERET CAPITAL LETTER PEE
+10412; C; 1043A; # DESERET CAPITAL LETTER BEE
+10413; C; 1043B; # DESERET CAPITAL LETTER TEE
+10414; C; 1043C; # DESERET CAPITAL LETTER DEE
+10415; C; 1043D; # DESERET CAPITAL LETTER CHEE
+10416; C; 1043E; # DESERET CAPITAL LETTER JEE
+10417; C; 1043F; # DESERET CAPITAL LETTER KAY
+10418; C; 10440; # DESERET CAPITAL LETTER GAY
+10419; C; 10441; # DESERET CAPITAL LETTER EF
+1041A; C; 10442; # DESERET CAPITAL LETTER VEE
+1041B; C; 10443; # DESERET CAPITAL LETTER ETH
+1041C; C; 10444; # DESERET CAPITAL LETTER THEE
+1041D; C; 10445; # DESERET CAPITAL LETTER ES
+1041E; C; 10446; # DESERET CAPITAL LETTER ZEE
+1041F; C; 10447; # DESERET CAPITAL LETTER ESH
+10420; C; 10448; # DESERET CAPITAL LETTER ZHEE
+10421; C; 10449; # DESERET CAPITAL LETTER ER
+10422; C; 1044A; # DESERET CAPITAL LETTER EL
+10423; C; 1044B; # DESERET CAPITAL LETTER EM
+10424; C; 1044C; # DESERET CAPITAL LETTER EN
+10425; C; 1044D; # DESERET CAPITAL LETTER ENG
+10426; C; 1044E; # DESERET CAPITAL LETTER OI
+10427; C; 1044F; # DESERET CAPITAL LETTER EW
+10C80; C; 10CC0; # OLD HUNGARIAN CAPITAL LETTER A
+10C81; C; 10CC1; # OLD HUNGARIAN CAPITAL LETTER AA
+10C82; C; 10CC2; # OLD HUNGARIAN CAPITAL LETTER EB
+10C83; C; 10CC3; # OLD HUNGARIAN CAPITAL LETTER AMB
+10C84; C; 10CC4; # OLD HUNGARIAN CAPITAL LETTER EC
+10C85; C; 10CC5; # OLD HUNGARIAN CAPITAL LETTER ENC
+10C86; C; 10CC6; # OLD HUNGARIAN CAPITAL LETTER ECS
+10C87; C; 10CC7; # OLD HUNGARIAN CAPITAL LETTER ED
+10C88; C; 10CC8; # OLD HUNGARIAN CAPITAL LETTER AND
+10C89; C; 10CC9; # OLD HUNGARIAN CAPITAL LETTER E
+10C8A; C; 10CCA; # OLD HUNGARIAN CAPITAL LETTER CLOSE E
+10C8B; C; 10CCB; # OLD HUNGARIAN CAPITAL LETTER EE
+10C8C; C; 10CCC; # OLD HUNGARIAN CAPITAL LETTER EF
+10C8D; C; 10CCD; # OLD HUNGARIAN CAPITAL LETTER EG
+10C8E; C; 10CCE; # OLD HUNGARIAN CAPITAL LETTER EGY
+10C8F; C; 10CCF; # OLD HUNGARIAN CAPITAL LETTER EH
+10C90; C; 10CD0; # OLD HUNGARIAN CAPITAL LETTER I
+10C91; C; 10CD1; # OLD HUNGARIAN CAPITAL LETTER II
+10C92; C; 10CD2; # OLD HUNGARIAN CAPITAL LETTER EJ
+10C93; C; 10CD3; # OLD HUNGARIAN CAPITAL LETTER EK
+10C94; C; 10CD4; # OLD HUNGARIAN CAPITAL LETTER AK
+10C95; C; 10CD5; # OLD HUNGARIAN CAPITAL LETTER UNK
+10C96; C; 10CD6; # OLD HUNGARIAN CAPITAL LETTER EL
+10C97; C; 10CD7; # OLD HUNGARIAN CAPITAL LETTER ELY
+10C98; C; 10CD8; # OLD HUNGARIAN CAPITAL LETTER EM
+10C99; C; 10CD9; # OLD HUNGARIAN CAPITAL LETTER EN
+10C9A; C; 10CDA; # OLD HUNGARIAN CAPITAL LETTER ENY
+10C9B; C; 10CDB; # OLD HUNGARIAN CAPITAL LETTER O
+10C9C; C; 10CDC; # OLD HUNGARIAN CAPITAL LETTER OO
+10C9D; C; 10CDD; # OLD HUNGARIAN CAPITAL LETTER NIKOLSBURG OE
+10C9E; C; 10CDE; # OLD HUNGARIAN CAPITAL LETTER RUDIMENTA OE
+10C9F; C; 10CDF; # OLD HUNGARIAN CAPITAL LETTER OEE
+10CA0; C; 10CE0; # OLD HUNGARIAN CAPITAL LETTER EP
+10CA1; C; 10CE1; # OLD HUNGARIAN CAPITAL LETTER EMP
+10CA2; C; 10CE2; # OLD HUNGARIAN CAPITAL LETTER ER
+10CA3; C; 10CE3; # OLD HUNGARIAN CAPITAL LETTER SHORT ER
+10CA4; C; 10CE4; # OLD HUNGARIAN CAPITAL LETTER ES
+10CA5; C; 10CE5; # OLD HUNGARIAN CAPITAL LETTER ESZ
+10CA6; C; 10CE6; # OLD HUNGARIAN CAPITAL LETTER ET
+10CA7; C; 10CE7; # OLD HUNGARIAN CAPITAL LETTER ENT
+10CA8; C; 10CE8; # OLD HUNGARIAN CAPITAL LETTER ETY
+10CA9; C; 10CE9; # OLD HUNGARIAN CAPITAL LETTER ECH
+10CAA; C; 10CEA; # OLD HUNGARIAN CAPITAL LETTER U
+10CAB; C; 10CEB; # OLD HUNGARIAN CAPITAL LETTER UU
+10CAC; C; 10CEC; # OLD HUNGARIAN CAPITAL LETTER NIKOLSBURG UE
+10CAD; C; 10CED; # OLD HUNGARIAN CAPITAL LETTER RUDIMENTA UE
+10CAE; C; 10CEE; # OLD HUNGARIAN CAPITAL LETTER EV
+10CAF; C; 10CEF; # OLD HUNGARIAN CAPITAL LETTER EZ
+10CB0; C; 10CF0; # OLD HUNGARIAN CAPITAL LETTER EZS
+10CB1; C; 10CF1; # OLD HUNGARIAN CAPITAL LETTER ENT-SHAPED SIGN
+10CB2; C; 10CF2; # OLD HUNGARIAN CAPITAL LETTER US
+118A0; C; 118C0; # WARANG CITI CAPITAL LETTER NGAA
+118A1; C; 118C1; # WARANG CITI CAPITAL LETTER A
+118A2; C; 118C2; # WARANG CITI CAPITAL LETTER WI
+118A3; C; 118C3; # WARANG CITI CAPITAL LETTER YU
+118A4; C; 118C4; # WARANG CITI CAPITAL LETTER YA
+118A5; C; 118C5; # WARANG CITI CAPITAL LETTER YO
+118A6; C; 118C6; # WARANG CITI CAPITAL LETTER II
+118A7; C; 118C7; # WARANG CITI CAPITAL LETTER UU
+118A8; C; 118C8; # WARANG CITI CAPITAL LETTER E
+118A9; C; 118C9; # WARANG CITI CAPITAL LETTER O
+118AA; C; 118CA; # WARANG CITI CAPITAL LETTER ANG
+118AB; C; 118CB; # WARANG CITI CAPITAL LETTER GA
+118AC; C; 118CC; # WARANG CITI CAPITAL LETTER KO
+118AD; C; 118CD; # WARANG CITI CAPITAL LETTER ENY
+118AE; C; 118CE; # WARANG CITI CAPITAL LETTER YUJ
+118AF; C; 118CF; # WARANG CITI CAPITAL LETTER UC
+118B0; C; 118D0; # WARANG CITI CAPITAL LETTER ENN
+118B1; C; 118D1; # WARANG CITI CAPITAL LETTER ODD
+118B2; C; 118D2; # WARANG CITI CAPITAL LETTER TTE
+118B3; C; 118D3; # WARANG CITI CAPITAL LETTER NUNG
+118B4; C; 118D4; # WARANG CITI CAPITAL LETTER DA
+118B5; C; 118D5; # WARANG CITI CAPITAL LETTER AT
+118B6; C; 118D6; # WARANG CITI CAPITAL LETTER AM
+118B7; C; 118D7; # WARANG CITI CAPITAL LETTER BU
+118B8; C; 118D8; # WARANG CITI CAPITAL LETTER PU
+118B9; C; 118D9; # WARANG CITI CAPITAL LETTER HIYO
+118BA; C; 118DA; # WARANG CITI CAPITAL LETTER HOLO
+118BB; C; 118DB; # WARANG CITI CAPITAL LETTER HORR
+118BC; C; 118DC; # WARANG CITI CAPITAL LETTER HAR
+118BD; C; 118DD; # WARANG CITI CAPITAL LETTER SSUU
+118BE; C; 118DE; # WARANG CITI CAPITAL LETTER SII
+118BF; C; 118DF; # WARANG CITI CAPITAL LETTER VIYO
+#
+# EOF
</ins></span></pre></div>
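<p>The folding data above uses one record per line: a hexadecimal code point, a status letter (C or F in the rows shown), one or more hexadecimal mapped code points, and a trailing <code>#</code> comment with the character name. As a rough illustration only, a single line could be parsed as sketched below. The names <code>CaseFoldEntry</code> and <code>parseCaseFoldingLine</code> are hypothetical and are not part of this change; the sketch also assumes the leading <code>+</code> diff marker has already been removed.</p>

```cpp
#include <cstdint>
#include <ios>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one folding entry (illustration only, not WebKit code).
struct CaseFoldEntry {
    uint32_t codePoint;            // first field, hexadecimal
    char status;                   // second field, e.g. 'C' or 'F'
    std::vector<uint32_t> mapping; // third field, one or more hex code points
};

// Parses "<code>; <status>; <mapping>; # <name>".
// Returns std::nullopt for blank lines, comment-only lines, and malformed input.
inline std::optional<CaseFoldEntry> parseCaseFoldingLine(const std::string& line)
{
    // Everything from '#' onwards is a comment.
    const std::string data = line.substr(0, line.find('#'));

    // Split the remaining text on ';'.
    std::vector<std::string> fields;
    std::istringstream fieldStream(data);
    for (std::string field; std::getline(fieldStream, field, ';');)
        fields.push_back(field);
    if (fields.size() < 3)
        return std::nullopt;

    CaseFoldEntry entry { 0, 0, { } };

    // Field 0: the source code point.
    std::istringstream codeStream(fields[0]);
    if (!(codeStream >> std::hex >> entry.codePoint))
        return std::nullopt;

    // Field 1: the status letter, skipping surrounding whitespace.
    const std::size_t statusPos = fields[1].find_first_not_of(" \t");
    if (statusPos == std::string::npos)
        return std::nullopt;
    entry.status = fields[1][statusPos];

    // Field 2: one or more space-separated mapped code points.
    std::istringstream mappingStream(fields[2]);
    for (uint32_t value; mappingStream >> std::hex >> value;)
        entry.mapping.push_back(value);
    if (entry.mapping.empty())
        return std::nullopt;

    return entry;
}
```

<p>Rows with status F, such as the ligature entries, yield a mapping with more than one code point, which is why the mapping is kept as a vector rather than a single value.</p>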
<a id="trunkSourceJavaScriptCoreyarrYarrCanonicalizehfromrev197533trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodeh"></a>
<div class="copfile"><h4>Copied: trunk/Source/JavaScriptCore/yarr/YarrCanonicalize.h (from rev 197533, trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.h) (0 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrCanonicalize.h         (rev 0)
+++ trunk/Source/JavaScriptCore/yarr/YarrCanonicalize.h        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -0,0 +1,146 @@
</span><ins>+/*
+ * Copyright (C) 2012-2016 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef YarrCanonicalize_h
+#define YarrCanonicalize_h
+
+#include <stdint.h>
+#include <unicode/utypes.h>
+
+namespace JSC { namespace Yarr {
+
+// This set of data provides information for each UCS2 code point as to the set of code points
+// that it should match under the ES6 case-insensitive RegExp matching rules, specified in 21.2.2.8.2.
+// The non-Unicode tables are autogenerated using YarrCanonicalize.js into YarrCanonicalize.cpp.
+// The Unicode tables are autogenerated using the Python script generateYarrCanonicalizeUnicode
+// which creates YarrCanonicalizeUnicode.cpp.
+enum UCS2CanonicalizationType {
+ CanonicalizeUnique, // No canonically equal values, e.g. 0x0.
+ CanonicalizeSet, // Value indicates a set in characterSetInfo.
+ CanonicalizeRangeLo, // Value is positive delta to pair, e.g. 0x41 has value 0x20, -> 0x61.
+ CanonicalizeRangeHi, // Value is positive delta to pair, e.g. 0x61 has value 0x20, -> 0x41.
+ CanonicalizeAlternatingAligned, // Aligned consecutive pair, e.g. 0x1f4,0x1f5.
+ CanonicalizeAlternatingUnaligned, // Unaligned consecutive pair, e.g. 0x241,0x242.
+};
+struct CanonicalizationRange {
+ UChar32 begin;
+ UChar32 end;
+ UChar32 value;
+ UCS2CanonicalizationType type;
+};
+
+extern const size_t UCS2_CANONICALIZATION_RANGES;
+extern const UChar32* const ucs2CharacterSetInfo[];
+extern const CanonicalizationRange ucs2RangeInfo[];
+
+extern const size_t UNICODE_CANONICALIZATION_RANGES;
+extern const UChar32* const unicodeCharacterSetInfo[];
+extern const CanonicalizationRange unicodeRangeInfo[];
+
+enum class CanonicalMode { UCS2, Unicode };
+
+inline const UChar32* canonicalCharacterSetInfo(unsigned index, CanonicalMode canonicalMode)
+{
+ const UChar32* const* rangeInfo = canonicalMode == CanonicalMode::UCS2 ? ucs2CharacterSetInfo : unicodeCharacterSetInfo;
+ return rangeInfo[index];
+}
+
+// This is a binary search (O(log n)) over ~400-600 entries, so it should typically take about 9 compares.
+inline const CanonicalizationRange* canonicalRangeInfoFor(UChar32 ch, CanonicalMode canonicalMode = CanonicalMode::UCS2)
+{
+ const CanonicalizationRange* info = canonicalMode == CanonicalMode::UCS2 ? ucs2RangeInfo : unicodeRangeInfo;
+ size_t entries = canonicalMode == CanonicalMode::UCS2 ? UCS2_CANONICALIZATION_RANGES : UNICODE_CANONICALIZATION_RANGES;
+
+ while (true) {
+ size_t candidate = entries >> 1;
+ const CanonicalizationRange* candidateInfo = info + candidate;
+ if (ch < candidateInfo->begin)
+ entries = candidate;
+ else if (ch <= candidateInfo->end)
+ return candidateInfo;
+ else {
+ info = candidateInfo + 1;
+ entries -= (candidate + 1);
+ }
+ }
+}
+
+// Should only be called for characters that have one canonically matching value.
+inline UChar32 getCanonicalPair(const CanonicalizationRange* info, UChar32 ch)
+{
+ ASSERT(ch >= info->begin && ch <= info->end);
+ switch (info->type) {
+ case CanonicalizeRangeLo:
+ return ch + info->value;
+ case CanonicalizeRangeHi:
+ return ch - info->value;
+ case CanonicalizeAlternatingAligned:
+ return ch ^ 1;
+ case CanonicalizeAlternatingUnaligned:
+ return ((ch - 1) ^ 1) + 1;
+ default:
+ RELEASE_ASSERT_NOT_REACHED();
+ }
+ RELEASE_ASSERT_NOT_REACHED();
+ return 0;
+}
+
+// Returns true if no other code point can match this value in the given canonicalization mode.
+inline bool isCanonicallyUnique(UChar32 ch, CanonicalMode canonicalMode = CanonicalMode::UCS2)
+{
+ return canonicalRangeInfoFor(ch, canonicalMode)->type == CanonicalizeUnique;
+}
+
+// Returns true if values are equal, under the canonicalization rules.
+inline bool areCanonicallyEquivalent(UChar32 a, UChar32 b, CanonicalMode canonicalMode = CanonicalMode::UCS2)
+{
+ const CanonicalizationRange* info = canonicalRangeInfoFor(a, canonicalMode);
+ switch (info->type) {
+ case CanonicalizeUnique:
+ return a == b;
+ case CanonicalizeSet: {
+ for (const UChar32* set = canonicalCharacterSetInfo(info->value, canonicalMode); (a = *set); ++set) {
+ if (a == b)
+ return true;
+ }
+ return false;
+ }
+ case CanonicalizeRangeLo:
+ return (a == b) || (a + info->value == b);
+ case CanonicalizeRangeHi:
+ return (a == b) || (a - info->value == b);
+ case CanonicalizeAlternatingAligned:
+ return (a | 1) == (b | 1);
+ case CanonicalizeAlternatingUnaligned:
+ return ((a - 1) | 1) == ((b - 1) | 1);
+ }
+
+ RELEASE_ASSERT_NOT_REACHED();
+ return false;
+}
+
+} } // JSC::Yarr
+
+#endif
</ins></span></pre></div>
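<p>To make the lookup in YarrCanonicalize.h easier to follow, here is a minimal standalone sketch of the same idea: a binary search over sorted ranges, followed by the pair computation for each range type. It is not the WebKit code. The names <code>RangeType</code>, <code>Range</code>, <code>sampleRanges</code>, <code>findRange</code> and <code>canonicalPair</code> are hypothetical, the table is a small excerpt of rows taken from <code>ucs2RangeInfo</code>, and, unlike the header, the search returns <code>nullptr</code> for code points that fall in the gaps of this excerpt.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, trimmed-down versions of the types in YarrCanonicalize.h.
enum class RangeType { Unique, RangeLo, RangeHi, AlternatingAligned, AlternatingUnaligned };

struct Range {
    uint32_t begin;
    uint32_t end;
    uint32_t value;
    RangeType type;
};

// A few rows copied from ucs2RangeInfo. The real table has no gaps; this excerpt does.
constexpr Range sampleRanges[] = {
    { 0x0000, 0x0040, 0x0000, RangeType::Unique },
    { 0x0041, 0x005a, 0x0020, RangeType::RangeLo },
    { 0x005b, 0x0060, 0x0000, RangeType::Unique },
    { 0x0061, 0x007a, 0x0020, RangeType::RangeHi },
    { 0x0100, 0x012f, 0x0000, RangeType::AlternatingAligned },
    { 0x0139, 0x0148, 0x0000, RangeType::AlternatingUnaligned },
};
constexpr std::size_t sampleRangeCount = sizeof(sampleRanges) / sizeof(sampleRanges[0]);

// Binary search over the sorted, non-overlapping ranges.
// Returns nullptr when ch is not covered (only possible because of the gaps above).
inline const Range* findRange(uint32_t ch)
{
    const Range* info = sampleRanges;
    std::size_t entries = sampleRangeCount;
    while (entries) {
        std::size_t candidate = entries >> 1;
        const Range* candidateInfo = info + candidate;
        if (ch < candidateInfo->begin)
            entries = candidate;              // continue in the lower half
        else if (ch <= candidateInfo->end)
            return candidateInfo;             // ch lies inside this range
        else {
            info = candidateInfo + 1;         // continue in the upper half
            entries -= candidate + 1;
        }
    }
    return nullptr;
}

// Computes the single canonically matching partner of ch; Unique maps to itself.
inline uint32_t canonicalPair(const Range& info, uint32_t ch)
{
    assert(ch >= info.begin && ch <= info.end);
    switch (info.type) {
    case RangeType::RangeLo:
        return ch + info.value;               // e.g. 0x41 + 0x20 -> 0x61
    case RangeType::RangeHi:
        return ch - info.value;               // e.g. 0x61 - 0x20 -> 0x41
    case RangeType::AlternatingAligned:
        return ch ^ 1;                        // pairs start on even code points
    case RangeType::AlternatingUnaligned:
        return ((ch - 1) ^ 1) + 1;            // pairs start on odd code points
    case RangeType::Unique:
        break;
    }
    return ch;
}
```

<p>The two alternating cases differ only in where a pair starts: an aligned pair begins on an even code point, so flipping the low bit swaps the members, while an unaligned pair begins on an odd code point, so the value is shifted down by one before the bit flip and shifted back afterwards.</p>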
<a id="trunkSourceJavaScriptCoreyarrYarrCanonicalizeUCS2cppfromrev197533trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodecpp"></a>
<div class="copfile"><h4>Copied: trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.cpp (from rev 197533, trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.cpp) (0 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.cpp         (rev 0)
+++ trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.cpp        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -0,0 +1,464 @@
</span><ins>+/*
+ * Copyright (C) 2012-2013, 2015-2016 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+// DO NOT EDIT! - this file autogenerated by YarrCanonicalize.js
+
+#include "config.h"
+#include "YarrCanonicalize.h"
+
+namespace JSC { namespace Yarr {
+
+const UChar32 ucs2CharacterSet0[] = { 0x01c4, 0x01c5, 0x01c6, 0 };
+const UChar32 ucs2CharacterSet1[] = { 0x01c7, 0x01c8, 0x01c9, 0 };
+const UChar32 ucs2CharacterSet2[] = { 0x01ca, 0x01cb, 0x01cc, 0 };
+const UChar32 ucs2CharacterSet3[] = { 0x01f1, 0x01f2, 0x01f3, 0 };
+const UChar32 ucs2CharacterSet4[] = { 0x0392, 0x03b2, 0x03d0, 0 };
+const UChar32 ucs2CharacterSet5[] = { 0x0395, 0x03b5, 0x03f5, 0 };
+const UChar32 ucs2CharacterSet6[] = { 0x0398, 0x03b8, 0x03d1, 0 };
+const UChar32 ucs2CharacterSet7[] = { 0x0345, 0x0399, 0x03b9, 0x1fbe, 0 };
+const UChar32 ucs2CharacterSet8[] = { 0x039a, 0x03ba, 0x03f0, 0 };
+const UChar32 ucs2CharacterSet9[] = { 0x00b5, 0x039c, 0x03bc, 0 };
+const UChar32 ucs2CharacterSet10[] = { 0x03a0, 0x03c0, 0x03d6, 0 };
+const UChar32 ucs2CharacterSet11[] = { 0x03a1, 0x03c1, 0x03f1, 0 };
+const UChar32 ucs2CharacterSet12[] = { 0x03a3, 0x03c2, 0x03c3, 0 };
+const UChar32 ucs2CharacterSet13[] = { 0x03a6, 0x03c6, 0x03d5, 0 };
+const UChar32 ucs2CharacterSet14[] = { 0x1e60, 0x1e61, 0x1e9b, 0 };
+
+static const size_t UCS2_CANONICALIZATION_SETS = 15;
+const UChar32* const ucs2CharacterSetInfo[UCS2_CANONICALIZATION_SETS] = {
+ ucs2CharacterSet0,
+ ucs2CharacterSet1,
+ ucs2CharacterSet2,
+ ucs2CharacterSet3,
+ ucs2CharacterSet4,
+ ucs2CharacterSet5,
+ ucs2CharacterSet6,
+ ucs2CharacterSet7,
+ ucs2CharacterSet8,
+ ucs2CharacterSet9,
+ ucs2CharacterSet10,
+ ucs2CharacterSet11,
+ ucs2CharacterSet12,
+ ucs2CharacterSet13,
+ ucs2CharacterSet14,
+};
+
+const size_t UCS2_CANONICALIZATION_RANGES = 391;
+const CanonicalizationRange ucs2RangeInfo[UCS2_CANONICALIZATION_RANGES] = {
+ { 0x0000, 0x0040, 0x0000, CanonicalizeUnique },
+ { 0x0041, 0x005a, 0x0020, CanonicalizeRangeLo },
+ { 0x005b, 0x0060, 0x0000, CanonicalizeUnique },
+ { 0x0061, 0x007a, 0x0020, CanonicalizeRangeHi },
+ { 0x007b, 0x00b4, 0x0000, CanonicalizeUnique },
+ { 0x00b5, 0x00b5, 0x0009, CanonicalizeSet },
+ { 0x00b6, 0x00bf, 0x0000, CanonicalizeUnique },
+ { 0x00c0, 0x00d6, 0x0020, CanonicalizeRangeLo },
+ { 0x00d7, 0x00d7, 0x0000, CanonicalizeUnique },
+ { 0x00d8, 0x00de, 0x0020, CanonicalizeRangeLo },
+ { 0x00df, 0x00df, 0x0000, CanonicalizeUnique },
+ { 0x00e0, 0x00f6, 0x0020, CanonicalizeRangeHi },
+ { 0x00f7, 0x00f7, 0x0000, CanonicalizeUnique },
+ { 0x00f8, 0x00fe, 0x0020, CanonicalizeRangeHi },
+ { 0x00ff, 0x00ff, 0x0079, CanonicalizeRangeLo },
+ { 0x0100, 0x012f, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x0130, 0x0131, 0x0000, CanonicalizeUnique },
+ { 0x0132, 0x0137, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x0138, 0x0138, 0x0000, CanonicalizeUnique },
+ { 0x0139, 0x0148, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x0149, 0x0149, 0x0000, CanonicalizeUnique },
+ { 0x014a, 0x0177, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x0178, 0x0178, 0x0079, CanonicalizeRangeHi },
+ { 0x0179, 0x017e, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x017f, 0x017f, 0x0000, CanonicalizeUnique },
+ { 0x0180, 0x0180, 0x00c3, CanonicalizeRangeLo },
+ { 0x0181, 0x0181, 0x00d2, CanonicalizeRangeLo },
+ { 0x0182, 0x0185, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x0186, 0x0186, 0x00ce, CanonicalizeRangeLo },
+ { 0x0187, 0x0188, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x0189, 0x018a, 0x00cd, CanonicalizeRangeLo },
+ { 0x018b, 0x018c, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x018d, 0x018d, 0x0000, CanonicalizeUnique },
+ { 0x018e, 0x018e, 0x004f, CanonicalizeRangeLo },
+ { 0x018f, 0x018f, 0x00ca, CanonicalizeRangeLo },
+ { 0x0190, 0x0190, 0x00cb, CanonicalizeRangeLo },
+ { 0x0191, 0x0192, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x0193, 0x0193, 0x00cd, CanonicalizeRangeLo },
+ { 0x0194, 0x0194, 0x00cf, CanonicalizeRangeLo },
+ { 0x0195, 0x0195, 0x0061, CanonicalizeRangeLo },
+ { 0x0196, 0x0196, 0x00d3, CanonicalizeRangeLo },
+ { 0x0197, 0x0197, 0x00d1, CanonicalizeRangeLo },
+ { 0x0198, 0x0199, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x019a, 0x019a, 0x00a3, CanonicalizeRangeLo },
+ { 0x019b, 0x019b, 0x0000, CanonicalizeUnique },
+ { 0x019c, 0x019c, 0x00d3, CanonicalizeRangeLo },
+ { 0x019d, 0x019d, 0x00d5, CanonicalizeRangeLo },
+ { 0x019e, 0x019e, 0x0082, CanonicalizeRangeLo },
+ { 0x019f, 0x019f, 0x00d6, CanonicalizeRangeLo },
+ { 0x01a0, 0x01a5, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x01a6, 0x01a6, 0x00da, CanonicalizeRangeLo },
+ { 0x01a7, 0x01a8, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x01a9, 0x01a9, 0x00da, CanonicalizeRangeLo },
+ { 0x01aa, 0x01ab, 0x0000, CanonicalizeUnique },
+ { 0x01ac, 0x01ad, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x01ae, 0x01ae, 0x00da, CanonicalizeRangeLo },
+ { 0x01af, 0x01b0, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x01b1, 0x01b2, 0x00d9, CanonicalizeRangeLo },
+ { 0x01b3, 0x01b6, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x01b7, 0x01b7, 0x00db, CanonicalizeRangeLo },
+ { 0x01b8, 0x01b9, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x01ba, 0x01bb, 0x0000, CanonicalizeUnique },
+ { 0x01bc, 0x01bd, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x01be, 0x01be, 0x0000, CanonicalizeUnique },
+ { 0x01bf, 0x01bf, 0x0038, CanonicalizeRangeLo },
+ { 0x01c0, 0x01c3, 0x0000, CanonicalizeUnique },
+ { 0x01c4, 0x01c6, 0x0000, CanonicalizeSet },
+ { 0x01c7, 0x01c9, 0x0001, CanonicalizeSet },
+ { 0x01ca, 0x01cc, 0x0002, CanonicalizeSet },
+ { 0x01cd, 0x01dc, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x01dd, 0x01dd, 0x004f, CanonicalizeRangeHi },
+ { 0x01de, 0x01ef, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x01f0, 0x01f0, 0x0000, CanonicalizeUnique },
+ { 0x01f1, 0x01f3, 0x0003, CanonicalizeSet },
+ { 0x01f4, 0x01f5, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x01f6, 0x01f6, 0x0061, CanonicalizeRangeHi },
+ { 0x01f7, 0x01f7, 0x0038, CanonicalizeRangeHi },
+ { 0x01f8, 0x021f, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x0220, 0x0220, 0x0082, CanonicalizeRangeHi },
+ { 0x0221, 0x0221, 0x0000, CanonicalizeUnique },
+ { 0x0222, 0x0233, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x0234, 0x0239, 0x0000, CanonicalizeUnique },
+ { 0x023a, 0x023a, 0x2a2b, CanonicalizeRangeLo },
+ { 0x023b, 0x023c, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x023d, 0x023d, 0x00a3, CanonicalizeRangeHi },
+ { 0x023e, 0x023e, 0x2a28, CanonicalizeRangeLo },
+ { 0x023f, 0x0240, 0x2a3f, CanonicalizeRangeLo },
+ { 0x0241, 0x0242, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x0243, 0x0243, 0x00c3, CanonicalizeRangeHi },
+ { 0x0244, 0x0244, 0x0045, CanonicalizeRangeLo },
+ { 0x0245, 0x0245, 0x0047, CanonicalizeRangeLo },
+ { 0x0246, 0x024f, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x0250, 0x0250, 0x2a1f, CanonicalizeRangeLo },
+ { 0x0251, 0x0251, 0x2a1c, CanonicalizeRangeLo },
+ { 0x0252, 0x0252, 0x2a1e, CanonicalizeRangeLo },
+ { 0x0253, 0x0253, 0x00d2, CanonicalizeRangeHi },
+ { 0x0254, 0x0254, 0x00ce, CanonicalizeRangeHi },
+ { 0x0255, 0x0255, 0x0000, CanonicalizeUnique },
+ { 0x0256, 0x0257, 0x00cd, CanonicalizeRangeHi },
+ { 0x0258, 0x0258, 0x0000, CanonicalizeUnique },
+ { 0x0259, 0x0259, 0x00ca, CanonicalizeRangeHi },
+ { 0x025a, 0x025a, 0x0000, CanonicalizeUnique },
+ { 0x025b, 0x025b, 0x00cb, CanonicalizeRangeHi },
+ { 0x025c, 0x025c, 0xa54f, CanonicalizeRangeLo },
+ { 0x025d, 0x025f, 0x0000, CanonicalizeUnique },
+ { 0x0260, 0x0260, 0x00cd, CanonicalizeRangeHi },
+ { 0x0261, 0x0261, 0xa54b, CanonicalizeRangeLo },
+ { 0x0262, 0x0262, 0x0000, CanonicalizeUnique },
+ { 0x0263, 0x0263, 0x00cf, CanonicalizeRangeHi },
+ { 0x0264, 0x0264, 0x0000, CanonicalizeUnique },
+ { 0x0265, 0x0265, 0xa528, CanonicalizeRangeLo },
+ { 0x0266, 0x0266, 0xa544, CanonicalizeRangeLo },
+ { 0x0267, 0x0267, 0x0000, CanonicalizeUnique },
+ { 0x0268, 0x0268, 0x00d1, CanonicalizeRangeHi },
+ { 0x0269, 0x0269, 0x00d3, CanonicalizeRangeHi },
+ { 0x026a, 0x026a, 0x0000, CanonicalizeUnique },
+ { 0x026b, 0x026b, 0x29f7, CanonicalizeRangeLo },
+ { 0x026c, 0x026c, 0xa541, CanonicalizeRangeLo },
+ { 0x026d, 0x026e, 0x0000, CanonicalizeUnique },
+ { 0x026f, 0x026f, 0x00d3, CanonicalizeRangeHi },
+ { 0x0270, 0x0270, 0x0000, CanonicalizeUnique },
+ { 0x0271, 0x0271, 0x29fd, CanonicalizeRangeLo },
+ { 0x0272, 0x0272, 0x00d5, CanonicalizeRangeHi },
+ { 0x0273, 0x0274, 0x0000, CanonicalizeUnique },
+ { 0x0275, 0x0275, 0x00d6, CanonicalizeRangeHi },
+ { 0x0276, 0x027c, 0x0000, CanonicalizeUnique },
+ { 0x027d, 0x027d, 0x29e7, CanonicalizeRangeLo },
+ { 0x027e, 0x027f, 0x0000, CanonicalizeUnique },
+ { 0x0280, 0x0280, 0x00da, CanonicalizeRangeHi },
+ { 0x0281, 0x0282, 0x0000, CanonicalizeUnique },
+ { 0x0283, 0x0283, 0x00da, CanonicalizeRangeHi },
+ { 0x0284, 0x0286, 0x0000, CanonicalizeUnique },
+ { 0x0287, 0x0287, 0xa52a, CanonicalizeRangeLo },
+ { 0x0288, 0x0288, 0x00da, CanonicalizeRangeHi },
+ { 0x0289, 0x0289, 0x0045, CanonicalizeRangeHi },
+ { 0x028a, 0x028b, 0x00d9, CanonicalizeRangeHi },
+ { 0x028c, 0x028c, 0x0047, CanonicalizeRangeHi },
+ { 0x028d, 0x0291, 0x0000, CanonicalizeUnique },
+ { 0x0292, 0x0292, 0x00db, CanonicalizeRangeHi },
+ { 0x0293, 0x029d, 0x0000, CanonicalizeUnique },
+ { 0x029e, 0x029e, 0xa512, CanonicalizeRangeLo },
+ { 0x029f, 0x0344, 0x0000, CanonicalizeUnique },
+ { 0x0345, 0x0345, 0x0007, CanonicalizeSet },
+ { 0x0346, 0x036f, 0x0000, CanonicalizeUnique },
+ { 0x0370, 0x0373, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x0374, 0x0375, 0x0000, CanonicalizeUnique },
+ { 0x0376, 0x0377, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x0378, 0x037a, 0x0000, CanonicalizeUnique },
+ { 0x037b, 0x037d, 0x0082, CanonicalizeRangeLo },
+ { 0x037e, 0x037e, 0x0000, CanonicalizeUnique },
+ { 0x037f, 0x037f, 0x0074, CanonicalizeRangeLo },
+ { 0x0380, 0x0385, 0x0000, CanonicalizeUnique },
+ { 0x0386, 0x0386, 0x0026, CanonicalizeRangeLo },
+ { 0x0387, 0x0387, 0x0000, CanonicalizeUnique },
+ { 0x0388, 0x038a, 0x0025, CanonicalizeRangeLo },
+ { 0x038b, 0x038b, 0x0000, CanonicalizeUnique },
+ { 0x038c, 0x038c, 0x0040, CanonicalizeRangeLo },
+ { 0x038d, 0x038d, 0x0000, CanonicalizeUnique },
+ { 0x038e, 0x038f, 0x003f, CanonicalizeRangeLo },
+ { 0x0390, 0x0390, 0x0000, CanonicalizeUnique },
+ { 0x0391, 0x0391, 0x0020, CanonicalizeRangeLo },
+ { 0x0392, 0x0392, 0x0004, CanonicalizeSet },
+ { 0x0393, 0x0394, 0x0020, CanonicalizeRangeLo },
+ { 0x0395, 0x0395, 0x0005, CanonicalizeSet },
+ { 0x0396, 0x0397, 0x0020, CanonicalizeRangeLo },
+ { 0x0398, 0x0398, 0x0006, CanonicalizeSet },
+ { 0x0399, 0x0399, 0x0007, CanonicalizeSet },
+ { 0x039a, 0x039a, 0x0008, CanonicalizeSet },
+ { 0x039b, 0x039b, 0x0020, CanonicalizeRangeLo },
+ { 0x039c, 0x039c, 0x0009, CanonicalizeSet },
+ { 0x039d, 0x039f, 0x0020, CanonicalizeRangeLo },
+ { 0x03a0, 0x03a0, 0x000a, CanonicalizeSet },
+ { 0x03a1, 0x03a1, 0x000b, CanonicalizeSet },
+ { 0x03a2, 0x03a2, 0x0000, CanonicalizeUnique },
+ { 0x03a3, 0x03a3, 0x000c, CanonicalizeSet },
+ { 0x03a4, 0x03a5, 0x0020, CanonicalizeRangeLo },
+ { 0x03a6, 0x03a6, 0x000d, CanonicalizeSet },
+ { 0x03a7, 0x03ab, 0x0020, CanonicalizeRangeLo },
+ { 0x03ac, 0x03ac, 0x0026, CanonicalizeRangeHi },
+ { 0x03ad, 0x03af, 0x0025, CanonicalizeRangeHi },
+ { 0x03b0, 0x03b0, 0x0000, CanonicalizeUnique },
+ { 0x03b1, 0x03b1, 0x0020, CanonicalizeRangeHi },
+ { 0x03b2, 0x03b2, 0x0004, CanonicalizeSet },
+ { 0x03b3, 0x03b4, 0x0020, CanonicalizeRangeHi },
+ { 0x03b5, 0x03b5, 0x0005, CanonicalizeSet },
+ { 0x03b6, 0x03b7, 0x0020, CanonicalizeRangeHi },
+ { 0x03b8, 0x03b8, 0x0006, CanonicalizeSet },
+ { 0x03b9, 0x03b9, 0x0007, CanonicalizeSet },
+ { 0x03ba, 0x03ba, 0x0008, CanonicalizeSet },
+ { 0x03bb, 0x03bb, 0x0020, CanonicalizeRangeHi },
+ { 0x03bc, 0x03bc, 0x0009, CanonicalizeSet },
+ { 0x03bd, 0x03bf, 0x0020, CanonicalizeRangeHi },
+ { 0x03c0, 0x03c0, 0x000a, CanonicalizeSet },
+ { 0x03c1, 0x03c1, 0x000b, CanonicalizeSet },
+ { 0x03c2, 0x03c3, 0x000c, CanonicalizeSet },
+ { 0x03c4, 0x03c5, 0x0020, CanonicalizeRangeHi },
+ { 0x03c6, 0x03c6, 0x000d, CanonicalizeSet },
+ { 0x03c7, 0x03cb, 0x0020, CanonicalizeRangeHi },
+ { 0x03cc, 0x03cc, 0x0040, CanonicalizeRangeHi },
+ { 0x03cd, 0x03ce, 0x003f, CanonicalizeRangeHi },
+ { 0x03cf, 0x03cf, 0x0008, CanonicalizeRangeLo },
+ { 0x03d0, 0x03d0, 0x0004, CanonicalizeSet },
+ { 0x03d1, 0x03d1, 0x0006, CanonicalizeSet },
+ { 0x03d2, 0x03d4, 0x0000, CanonicalizeUnique },
+ { 0x03d5, 0x03d5, 0x000d, CanonicalizeSet },
+ { 0x03d6, 0x03d6, 0x000a, CanonicalizeSet },
+ { 0x03d7, 0x03d7, 0x0008, CanonicalizeRangeHi },
+ { 0x03d8, 0x03ef, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x03f0, 0x03f0, 0x0008, CanonicalizeSet },
+ { 0x03f1, 0x03f1, 0x000b, CanonicalizeSet },
+ { 0x03f2, 0x03f2, 0x0007, CanonicalizeRangeLo },
+ { 0x03f3, 0x03f3, 0x0074, CanonicalizeRangeHi },
+ { 0x03f4, 0x03f4, 0x0000, CanonicalizeUnique },
+ { 0x03f5, 0x03f5, 0x0005, CanonicalizeSet },
+ { 0x03f6, 0x03f6, 0x0000, CanonicalizeUnique },
+ { 0x03f7, 0x03f8, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x03f9, 0x03f9, 0x0007, CanonicalizeRangeHi },
+ { 0x03fa, 0x03fb, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x03fc, 0x03fc, 0x0000, CanonicalizeUnique },
+ { 0x03fd, 0x03ff, 0x0082, CanonicalizeRangeHi },
+ { 0x0400, 0x040f, 0x0050, CanonicalizeRangeLo },
+ { 0x0410, 0x042f, 0x0020, CanonicalizeRangeLo },
+ { 0x0430, 0x044f, 0x0020, CanonicalizeRangeHi },
+ { 0x0450, 0x045f, 0x0050, CanonicalizeRangeHi },
+ { 0x0460, 0x0481, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x0482, 0x0489, 0x0000, CanonicalizeUnique },
+ { 0x048a, 0x04bf, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x04c0, 0x04c0, 0x000f, CanonicalizeRangeLo },
+ { 0x04c1, 0x04ce, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x04cf, 0x04cf, 0x000f, CanonicalizeRangeHi },
+ { 0x04d0, 0x052f, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x0530, 0x0530, 0x0000, CanonicalizeUnique },
+ { 0x0531, 0x0556, 0x0030, CanonicalizeRangeLo },
+ { 0x0557, 0x0560, 0x0000, CanonicalizeUnique },
+ { 0x0561, 0x0586, 0x0030, CanonicalizeRangeHi },
+ { 0x0587, 0x109f, 0x0000, CanonicalizeUnique },
+ { 0x10a0, 0x10c5, 0x1c60, CanonicalizeRangeLo },
+ { 0x10c6, 0x10c6, 0x0000, CanonicalizeUnique },
+ { 0x10c7, 0x10c7, 0x1c60, CanonicalizeRangeLo },
+ { 0x10c8, 0x10cc, 0x0000, CanonicalizeUnique },
+ { 0x10cd, 0x10cd, 0x1c60, CanonicalizeRangeLo },
+ { 0x10ce, 0x1d78, 0x0000, CanonicalizeUnique },
+ { 0x1d79, 0x1d79, 0x8a04, CanonicalizeRangeLo },
+ { 0x1d7a, 0x1d7c, 0x0000, CanonicalizeUnique },
+ { 0x1d7d, 0x1d7d, 0x0ee6, CanonicalizeRangeLo },
+ { 0x1d7e, 0x1dff, 0x0000, CanonicalizeUnique },
+ { 0x1e00, 0x1e5f, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x1e60, 0x1e61, 0x000e, CanonicalizeSet },
+ { 0x1e62, 0x1e95, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x1e96, 0x1e9a, 0x0000, CanonicalizeUnique },
+ { 0x1e9b, 0x1e9b, 0x000e, CanonicalizeSet },
+ { 0x1e9c, 0x1e9f, 0x0000, CanonicalizeUnique },
+ { 0x1ea0, 0x1eff, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x1f00, 0x1f07, 0x0008, CanonicalizeRangeLo },
+ { 0x1f08, 0x1f0f, 0x0008, CanonicalizeRangeHi },
+ { 0x1f10, 0x1f15, 0x0008, CanonicalizeRangeLo },
+ { 0x1f16, 0x1f17, 0x0000, CanonicalizeUnique },
+ { 0x1f18, 0x1f1d, 0x0008, CanonicalizeRangeHi },
+ { 0x1f1e, 0x1f1f, 0x0000, CanonicalizeUnique },
+ { 0x1f20, 0x1f27, 0x0008, CanonicalizeRangeLo },
+ { 0x1f28, 0x1f2f, 0x0008, CanonicalizeRangeHi },
+ { 0x1f30, 0x1f37, 0x0008, CanonicalizeRangeLo },
+ { 0x1f38, 0x1f3f, 0x0008, CanonicalizeRangeHi },
+ { 0x1f40, 0x1f45, 0x0008, CanonicalizeRangeLo },
+ { 0x1f46, 0x1f47, 0x0000, CanonicalizeUnique },
+ { 0x1f48, 0x1f4d, 0x0008, CanonicalizeRangeHi },
+ { 0x1f4e, 0x1f50, 0x0000, CanonicalizeUnique },
+ { 0x1f51, 0x1f51, 0x0008, CanonicalizeRangeLo },
+ { 0x1f52, 0x1f52, 0x0000, CanonicalizeUnique },
+ { 0x1f53, 0x1f53, 0x0008, CanonicalizeRangeLo },
+ { 0x1f54, 0x1f54, 0x0000, CanonicalizeUnique },
+ { 0x1f55, 0x1f55, 0x0008, CanonicalizeRangeLo },
+ { 0x1f56, 0x1f56, 0x0000, CanonicalizeUnique },
+ { 0x1f57, 0x1f57, 0x0008, CanonicalizeRangeLo },
+ { 0x1f58, 0x1f58, 0x0000, CanonicalizeUnique },
+ { 0x1f59, 0x1f59, 0x0008, CanonicalizeRangeHi },
+ { 0x1f5a, 0x1f5a, 0x0000, CanonicalizeUnique },
+ { 0x1f5b, 0x1f5b, 0x0008, CanonicalizeRangeHi },
+ { 0x1f5c, 0x1f5c, 0x0000, CanonicalizeUnique },
+ { 0x1f5d, 0x1f5d, 0x0008, CanonicalizeRangeHi },
+ { 0x1f5e, 0x1f5e, 0x0000, CanonicalizeUnique },
+ { 0x1f5f, 0x1f5f, 0x0008, CanonicalizeRangeHi },
+ { 0x1f60, 0x1f67, 0x0008, CanonicalizeRangeLo },
+ { 0x1f68, 0x1f6f, 0x0008, CanonicalizeRangeHi },
+ { 0x1f70, 0x1f71, 0x004a, CanonicalizeRangeLo },
+ { 0x1f72, 0x1f75, 0x0056, CanonicalizeRangeLo },
+ { 0x1f76, 0x1f77, 0x0064, CanonicalizeRangeLo },
+ { 0x1f78, 0x1f79, 0x0080, CanonicalizeRangeLo },
+ { 0x1f7a, 0x1f7b, 0x0070, CanonicalizeRangeLo },
+ { 0x1f7c, 0x1f7d, 0x007e, CanonicalizeRangeLo },
+ { 0x1f7e, 0x1faf, 0x0000, CanonicalizeUnique },
+ { 0x1fb0, 0x1fb1, 0x0008, CanonicalizeRangeLo },
+ { 0x1fb2, 0x1fb7, 0x0000, CanonicalizeUnique },
+ { 0x1fb8, 0x1fb9, 0x0008, CanonicalizeRangeHi },
+ { 0x1fba, 0x1fbb, 0x004a, CanonicalizeRangeHi },
+ { 0x1fbc, 0x1fbd, 0x0000, CanonicalizeUnique },
+ { 0x1fbe, 0x1fbe, 0x0007, CanonicalizeSet },
+ { 0x1fbf, 0x1fc7, 0x0000, CanonicalizeUnique },
+ { 0x1fc8, 0x1fcb, 0x0056, CanonicalizeRangeHi },
+ { 0x1fcc, 0x1fcf, 0x0000, CanonicalizeUnique },
+ { 0x1fd0, 0x1fd1, 0x0008, CanonicalizeRangeLo },
+ { 0x1fd2, 0x1fd7, 0x0000, CanonicalizeUnique },
+ { 0x1fd8, 0x1fd9, 0x0008, CanonicalizeRangeHi },
+ { 0x1fda, 0x1fdb, 0x0064, CanonicalizeRangeHi },
+ { 0x1fdc, 0x1fdf, 0x0000, CanonicalizeUnique },
+ { 0x1fe0, 0x1fe1, 0x0008, CanonicalizeRangeLo },
+ { 0x1fe2, 0x1fe4, 0x0000, CanonicalizeUnique },
+ { 0x1fe5, 0x1fe5, 0x0007, CanonicalizeRangeLo },
+ { 0x1fe6, 0x1fe7, 0x0000, CanonicalizeUnique },
+ { 0x1fe8, 0x1fe9, 0x0008, CanonicalizeRangeHi },
+ { 0x1fea, 0x1feb, 0x0070, CanonicalizeRangeHi },
+ { 0x1fec, 0x1fec, 0x0007, CanonicalizeRangeHi },
+ { 0x1fed, 0x1ff7, 0x0000, CanonicalizeUnique },
+ { 0x1ff8, 0x1ff9, 0x0080, CanonicalizeRangeHi },
+ { 0x1ffa, 0x1ffb, 0x007e, CanonicalizeRangeHi },
+ { 0x1ffc, 0x2131, 0x0000, CanonicalizeUnique },
+ { 0x2132, 0x2132, 0x001c, CanonicalizeRangeLo },
+ { 0x2133, 0x214d, 0x0000, CanonicalizeUnique },
+ { 0x214e, 0x214e, 0x001c, CanonicalizeRangeHi },
+ { 0x214f, 0x215f, 0x0000, CanonicalizeUnique },
+ { 0x2160, 0x216f, 0x0010, CanonicalizeRangeLo },
+ { 0x2170, 0x217f, 0x0010, CanonicalizeRangeHi },
+ { 0x2180, 0x2182, 0x0000, CanonicalizeUnique },
+ { 0x2183, 0x2184, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x2185, 0x24b5, 0x0000, CanonicalizeUnique },
+ { 0x24b6, 0x24cf, 0x001a, CanonicalizeRangeLo },
+ { 0x24d0, 0x24e9, 0x001a, CanonicalizeRangeHi },
+ { 0x24ea, 0x2bff, 0x0000, CanonicalizeUnique },
+ { 0x2c00, 0x2c2e, 0x0030, CanonicalizeRangeLo },
+ { 0x2c2f, 0x2c2f, 0x0000, CanonicalizeUnique },
+ { 0x2c30, 0x2c5e, 0x0030, CanonicalizeRangeHi },
+ { 0x2c5f, 0x2c5f, 0x0000, CanonicalizeUnique },
+ { 0x2c60, 0x2c61, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x2c62, 0x2c62, 0x29f7, CanonicalizeRangeHi },
+ { 0x2c63, 0x2c63, 0x0ee6, CanonicalizeRangeHi },
+ { 0x2c64, 0x2c64, 0x29e7, CanonicalizeRangeHi },
+ { 0x2c65, 0x2c65, 0x2a2b, CanonicalizeRangeHi },
+ { 0x2c66, 0x2c66, 0x2a28, CanonicalizeRangeHi },
+ { 0x2c67, 0x2c6c, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x2c6d, 0x2c6d, 0x2a1c, CanonicalizeRangeHi },
+ { 0x2c6e, 0x2c6e, 0x29fd, CanonicalizeRangeHi },
+ { 0x2c6f, 0x2c6f, 0x2a1f, CanonicalizeRangeHi },
+ { 0x2c70, 0x2c70, 0x2a1e, CanonicalizeRangeHi },
+ { 0x2c71, 0x2c71, 0x0000, CanonicalizeUnique },
+ { 0x2c72, 0x2c73, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x2c74, 0x2c74, 0x0000, CanonicalizeUnique },
+ { 0x2c75, 0x2c76, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x2c77, 0x2c7d, 0x0000, CanonicalizeUnique },
+ { 0x2c7e, 0x2c7f, 0x2a3f, CanonicalizeRangeHi },
+ { 0x2c80, 0x2ce3, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x2ce4, 0x2cea, 0x0000, CanonicalizeUnique },
+ { 0x2ceb, 0x2cee, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0x2cef, 0x2cf1, 0x0000, CanonicalizeUnique },
+ { 0x2cf2, 0x2cf3, 0x0000, CanonicalizeAlternatingAligned },
+ { 0x2cf4, 0x2cff, 0x0000, CanonicalizeUnique },
+ { 0x2d00, 0x2d25, 0x1c60, CanonicalizeRangeHi },
+ { 0x2d26, 0x2d26, 0x0000, CanonicalizeUnique },
+ { 0x2d27, 0x2d27, 0x1c60, CanonicalizeRangeHi },
+ { 0x2d28, 0x2d2c, 0x0000, CanonicalizeUnique },
+ { 0x2d2d, 0x2d2d, 0x1c60, CanonicalizeRangeHi },
+ { 0x2d2e, 0xa63f, 0x0000, CanonicalizeUnique },
+ { 0xa640, 0xa66d, 0x0000, CanonicalizeAlternatingAligned },
+ { 0xa66e, 0xa67f, 0x0000, CanonicalizeUnique },
+ { 0xa680, 0xa69b, 0x0000, CanonicalizeAlternatingAligned },
+ { 0xa69c, 0xa721, 0x0000, CanonicalizeUnique },
+ { 0xa722, 0xa72f, 0x0000, CanonicalizeAlternatingAligned },
+ { 0xa730, 0xa731, 0x0000, CanonicalizeUnique },
+ { 0xa732, 0xa76f, 0x0000, CanonicalizeAlternatingAligned },
+ { 0xa770, 0xa778, 0x0000, CanonicalizeUnique },
+ { 0xa779, 0xa77c, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0xa77d, 0xa77d, 0x8a04, CanonicalizeRangeHi },
+ { 0xa77e, 0xa787, 0x0000, CanonicalizeAlternatingAligned },
+ { 0xa788, 0xa78a, 0x0000, CanonicalizeUnique },
+ { 0xa78b, 0xa78c, 0x0000, CanonicalizeAlternatingUnaligned },
+ { 0xa78d, 0xa78d, 0xa528, CanonicalizeRangeHi },
+ { 0xa78e, 0xa78f, 0x0000, CanonicalizeUnique },
+ { 0xa790, 0xa793, 0x0000, CanonicalizeAlternatingAligned },
+ { 0xa794, 0xa795, 0x0000, CanonicalizeUnique },
+ { 0xa796, 0xa7a9, 0x0000, CanonicalizeAlternatingAligned },
+ { 0xa7aa, 0xa7aa, 0xa544, CanonicalizeRangeHi },
+ { 0xa7ab, 0xa7ab, 0xa54f, CanonicalizeRangeHi },
+ { 0xa7ac, 0xa7ac, 0xa54b, CanonicalizeRangeHi },
+ { 0xa7ad, 0xa7ad, 0xa541, CanonicalizeRangeHi },
+ { 0xa7ae, 0xa7af, 0x0000, CanonicalizeUnique },
+ { 0xa7b0, 0xa7b0, 0xa512, CanonicalizeRangeHi },
+ { 0xa7b1, 0xa7b1, 0xa52a, CanonicalizeRangeHi },
+ { 0xa7b2, 0xff20, 0x0000, CanonicalizeUnique },
+ { 0xff21, 0xff3a, 0x0020, CanonicalizeRangeLo },
+ { 0xff3b, 0xff40, 0x0000, CanonicalizeUnique },
+ { 0xff41, 0xff5a, 0x0020, CanonicalizeRangeHi },
+ { 0xff5b, 0xffff, 0x0000, CanonicalizeUnique },
+};
+
+} } // JSC::Yarr
+
</ins></span></pre></div>
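A minimal sketch of how one row of the range table above might be consumed, written in JavaScript to match the generator script in this commit. The rows are copied verbatim from the table (0x0400 to 0x04cf). The object layout and the names `rangeInfo`, `findRange`, and `canonicalPartner` are assumptions made for illustration. The meaning given to each type is inferred from how the generator assigns it (`delta = hi - lo` for range pairs, adjacent pairs for the alternating types), not from the C++ consumer, which this diff does not show.

```javascript
// Rows copied from the generated table above; field names are hypothetical.
const rangeInfo = [
    { begin: 0x0400, end: 0x040f, value: 0x0050, type: "CanonicalizeRangeLo" },
    { begin: 0x0410, end: 0x042f, value: 0x0020, type: "CanonicalizeRangeLo" },
    { begin: 0x0430, end: 0x044f, value: 0x0020, type: "CanonicalizeRangeHi" },
    { begin: 0x0450, end: 0x045f, value: 0x0050, type: "CanonicalizeRangeHi" },
    { begin: 0x0460, end: 0x0481, value: 0x0000, type: "CanonicalizeAlternatingAligned" },
    { begin: 0x0482, end: 0x0489, value: 0x0000, type: "CanonicalizeUnique" },
    { begin: 0x048a, end: 0x04bf, value: 0x0000, type: "CanonicalizeAlternatingAligned" },
    { begin: 0x04c0, end: 0x04c0, value: 0x000f, type: "CanonicalizeRangeLo" },
    { begin: 0x04c1, end: 0x04ce, value: 0x0000, type: "CanonicalizeAlternatingUnaligned" },
    { begin: 0x04cf, end: 0x04cf, value: 0x000f, type: "CanonicalizeRangeHi" },
];

// Binary search over the sorted, contiguous rows.
// Returns undefined when ch lies outside this excerpt.
function findRange(ch) {
    let lo = 0;
    let hi = rangeInfo.length - 1;
    while (lo <= hi) {
        const mid = (lo + hi) >> 1;
        const info = rangeInfo[mid];
        if (ch < info.begin)
            hi = mid - 1;
        else if (ch > info.end)
            lo = mid + 1;
        else
            return info;
    }
    return undefined;
}

// Returns the other member of ch's case pair, or ch itself if it is unique.
function canonicalPartner(ch) {
    const info = findRange(ch);
    if (!info)
        return undefined;
    switch (info.type) {
    case "CanonicalizeUnique":
        return ch;
    case "CanonicalizeRangeLo":
        return ch + info.value; // low partner: the pair is value above
    case "CanonicalizeRangeHi":
        return ch - info.value; // high partner: the pair is value below
    case "CanonicalizeAlternatingAligned":
        return ch ^ 1; // pairs start on an even code unit
    case "CanonicalizeAlternatingUnaligned":
        return ((ch - 1) ^ 1) + 1; // pairs start on an odd code unit
    default:
        return undefined; // CanonicalizeSet: value indexes a character set, not handled here
    }
}
```

For `CanonicalizeSet` rows the value column is an index into the character-set array rather than a delta, so this sketch deliberately leaves that case out.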
<a id="trunkSourceJavaScriptCoreyarrYarrCanonicalizeUCS2jsfromrev197533trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodejs"></a>
<div class="copfile"><h4>Copied: trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.js (from rev 197533, trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.js) (0 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.js         (rev 0)
+++ trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUCS2.js        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -0,0 +1,193 @@
</span><ins>+/*
+ * Copyright (C) 2012, 2016 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
+ * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
+ * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+ * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
+ * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+function printHeader()
+{
+ var copyright = (
+ "/*" + "\n" +
+ " * Copyright (C) 2012-2013, 2015-2016 Apple Inc. All rights reserved." + "\n" +
+ " *" + "\n" +
+ " * Redistribution and use in source and binary forms, with or without" + "\n" +
+ " * modification, are permitted provided that the following conditions" + "\n" +
+ " * are met:" + "\n" +
+ " * 1. Redistributions of source code must retain the above copyright" + "\n" +
+ " * notice, this list of conditions and the following disclaimer." + "\n" +
+ " * 2. Redistributions in binary form must reproduce the above copyright" + "\n" +
+ " * notice, this list of conditions and the following disclaimer in the" + "\n" +
+ " * documentation and/or other materials provided with the distribution." + "\n" +
+ " *" + "\n" +
+ " * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY" + "\n" +
+ " * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE" + "\n" +
+ " * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR" + "\n" +
+ " * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR" + "\n" +
+ " * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL," + "\n" +
+ " * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO," + "\n" +
+ " * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR" + "\n" +
+ " * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY" + "\n" +
+ " * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT" + "\n" +
+ " * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE" + "\n" +
+ " * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. " + "\n" +
+ " */");
+
+ print(copyright);
+ print();
+ print("// DO NOT EDIT! - this file autogenerated by YarrCanonicalize.js");
+ print();
+ print('#include "config.h"');
+ print('#include "YarrCanonicalize.h"');
+ print();
+ print("namespace JSC { namespace Yarr {");
+ print();
+}
+
+function printFooter()
+{
+ print("} } // JSC::Yarr");
+ print();
+}
+
+// Helper function to convert a number to a fixed width hex representation of a UChar32.
+function hex(x)
+{
+ var s = Number(x).toString(16);
+ while (s.length < 4)
+ s = 0 + s;
+ return "0x" + s;
+}
+
+// See ES 6.0, 21.2.2.8.2 Step 3
+function canonicalize(ch)
+{
+ var u = String.fromCharCode(ch).toUpperCase();
+ if (u.length > 1)
+ return ch;
+ var cu = u.charCodeAt(0);
+ if (ch >= 128 && cu < 128)
+ return ch;
+ return cu;
+}
+
+var MAX_UCS2 = 0xFFFF;
+
+function createUCS2CanonicalGroups()
+{
+ var groupedCanonically = [];
+ // Pass 1: populate groupedCanonically - this is a mapping from canonicalized
+ // values back to the set of character codes that canonicalize to them.
+ for (var i = 0; i <= MAX_UCS2; ++i) {
+ var ch = canonicalize(i);
+ if (!groupedCanonically[ch])
+ groupedCanonically[ch] = [];
+ groupedCanonically[ch].push(i);
+ }
+
+ return groupedCanonically;
+}
+
+function createTables(prefix, maxValue, canonicalGroups)
+{
+ var prefixLower = prefix.toLowerCase();
+ var prefixUpper = prefix.toUpperCase();
+ var typeInfo = [];
+ var characterSetInfo = [];
+ // Pass 2: populate typeInfo & characterSetInfo. For every character calculate
+ // a typeInfo value, described by the types above, and a value payload.
+ for (cu in canonicalGroups) {
+ // The set of characters that canonicalize to cu
+ var characters = canonicalGroups[cu];
+
+ // If there is only one, it is unique.
+ if (characters.length == 1) {
+ typeInfo[characters[0]] = "CanonicalizeUnique:0";
+ continue;
+ }
+
+ // Sort the array.
+ characters.sort(function(x,y){return x-y;});
+
+ // If there are more than two characters, create an entry in characterSetInfo.
+ if (characters.length > 2) {
+ for (i in characters)
+ typeInfo[characters[i]] = "CanonicalizeSet:" + characterSetInfo.length;
+ characterSetInfo.push(characters);
+
+ continue;
+ }
+
+ // We have a pair; mark alternating ranges, otherwise track whether this is the low or high partner.
+ var lo = characters[0];
+ var hi = characters[1];
+ var delta = hi - lo;
+ if (delta == 1) {
+ var type = lo & 1 ? "CanonicalizeAlternatingUnaligned:0" : "CanonicalizeAlternatingAligned:0";
+ typeInfo[lo] = type;
+ typeInfo[hi] = type;
+ } else {
+ typeInfo[lo] = "CanonicalizeRangeLo:" + delta;
+ typeInfo[hi] = "CanonicalizeRangeHi:" + delta;
+ }
+ }
+
+ var rangeInfo = [];
+ // Pass 3: coalesce types into ranges.
+ for (var end = 0; end <= maxValue; ++end) {
+ var begin = end;
+ var type = typeInfo[end];
+ while (end < maxValue && typeInfo[end + 1] == type)
+ ++end;
+ rangeInfo.push({begin:begin, end:end, type:type});
+ }
+
+ for (i in characterSetInfo) {
+ var characters = ""
+ var set = characterSetInfo[i];
+ for (var j in set)
+ characters += hex(set[j]) + ", ";
+ print("const UChar32 " + prefixLower + "CharacterSet" + i + "[] = { " + characters + "0 };");
+ }
+ print();
+ print("static const size_t " + prefixUpper + "_CANONICALIZATION_SETS = " + characterSetInfo.length + ";");
+ print("const UChar32* const " + prefixLower + "CharacterSetInfo[" + prefixUpper + "_CANONICALIZATION_SETS] = {");
+ for (i in characterSetInfo)
+ print(" " + prefixLower + "CharacterSet" + i + ",");
+ print("};");
+ print();
+ print("const size_t " + prefixUpper + "_CANONICALIZATION_RANGES = " + rangeInfo.length + ";");
+ print("const CanonicalizationRange " + prefixLower + "RangeInfo[" + prefixUpper + "_CANONICALIZATION_RANGES] = {");
+ for (i in rangeInfo) {
+ var info = rangeInfo[i];
+ var typeAndValue = info.type.split(':');
+ print(" { " + hex(info.begin) + ", " + hex(info.end) + ", " + hex(typeAndValue[1]) + ", " + typeAndValue[0] + " },");
+ }
+ print("};");
+ print();
+}
+
+printHeader();
+
+createTables("UCS2", MAX_UCS2, createUCS2CanonicalGroups());
+
+printFooter();
+
</ins></span></pre></div>
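The three passes in the script above can be exercised on a small input. The following is a condensed sketch, not the script itself: `classifyRanges` is a hypothetical helper that returns the coalesced ranges instead of printing C++, it uses a `Map` in place of the sparse array, and it omits the `CanonicalizeSet` case. It is run only over ASCII (0x00 to 0x7f), where every group is either a singleton or an upper/lower pair.

```javascript
// Same rule as canonicalize() in the script above.
function canonicalize(ch) {
    const u = String.fromCharCode(ch).toUpperCase();
    if (u.length > 1)
        return ch;
    const cu = u.charCodeAt(0);
    if (ch >= 128 && cu < 128)
        return ch;
    return cu;
}

// Hypothetical helper: runs passes 1-3 for code units 0..maxValue.
function classifyRanges(maxValue) {
    // Pass 1: group code units by their canonical value.
    const groups = new Map();
    for (let i = 0; i <= maxValue; ++i) {
        const key = canonicalize(i);
        if (!groups.has(key))
            groups.set(key, []);
        groups.get(key).push(i); // pushed in ascending order, so already sorted
    }

    // Pass 2: tag every code unit with "type:value" (singletons and pairs only).
    const typeInfo = [];
    for (const characters of groups.values()) {
        if (characters.length === 1) {
            typeInfo[characters[0]] = "CanonicalizeUnique:0";
            continue;
        }
        if (characters.length > 2)
            throw new Error("sets are omitted from this sketch");
        const [lo, hi] = characters;
        const delta = hi - lo;
        if (delta === 1) {
            const type = lo & 1 ? "CanonicalizeAlternatingUnaligned:0" : "CanonicalizeAlternatingAligned:0";
            typeInfo[lo] = type;
            typeInfo[hi] = type;
        } else {
            typeInfo[lo] = "CanonicalizeRangeLo:" + delta;
            typeInfo[hi] = "CanonicalizeRangeHi:" + delta;
        }
    }

    // Pass 3: merge neighbouring code units that carry identical tags.
    const ranges = [];
    for (let end = 0; end <= maxValue; ++end) {
        const begin = end;
        const type = typeInfo[end];
        while (end < maxValue && typeInfo[end + 1] === type)
            ++end;
        ranges.push({ begin, end, type });
    }
    return ranges;
}

const asciiRanges = classifyRanges(0x7f);
```

Over ASCII the second and fourth ranges cover `A` to `Z` and `a` to `z` with a delta of 32, which corresponds to the `0x0041, 0x005a, 0x0020` and `0x0061, 0x007a, 0x0020` rows of the generated table once the value is passed through `hex()`.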
<a id="trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodecpp"></a>
<div class="delfile"><h4>Deleted: trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.cpp (197780 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.cpp        2016-03-08 18:19:51 UTC (rev 197780)
+++ trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.cpp        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -1,1182 +0,0 @@
</span><del>-/*
- * Copyright (C) 2012-2013, 2015-2016 Apple Inc. All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- * 1. Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in the
- * documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
- * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
- * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR
- * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
- * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
- * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
- * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
- * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-// DO NOT EDIT! - this file autogenerated by YarrCanonicalizeUnicode.js
-
-#include "config.h"
-#include "YarrCanonicalizeUnicode.h"
-
-namespace JSC { namespace Yarr {
-
-#include <stdint.h>
-
-const UChar32 ucs2CharacterSet0[] = { 0x01c4, 0x01c5, 0x01c6, 0 };
-const UChar32 ucs2CharacterSet1[] = { 0x01c7, 0x01c8, 0x01c9, 0 };
-const UChar32 ucs2CharacterSet2[] = { 0x01ca, 0x01cb, 0x01cc, 0 };
-const UChar32 ucs2CharacterSet3[] = { 0x01f1, 0x01f2, 0x01f3, 0 };
-const UChar32 ucs2CharacterSet4[] = { 0x0392, 0x03b2, 0x03d0, 0 };
-const UChar32 ucs2CharacterSet5[] = { 0x0395, 0x03b5, 0x03f5, 0 };
-const UChar32 ucs2CharacterSet6[] = { 0x0398, 0x03b8, 0x03d1, 0 };
-const UChar32 ucs2CharacterSet7[] = { 0x0345, 0x0399, 0x03b9, 0x1fbe, 0 };
-const UChar32 ucs2CharacterSet8[] = { 0x039a, 0x03ba, 0x03f0, 0 };
-const UChar32 ucs2CharacterSet9[] = { 0x00b5, 0x039c, 0x03bc, 0 };
-const UChar32 ucs2CharacterSet10[] = { 0x03a0, 0x03c0, 0x03d6, 0 };
-const UChar32 ucs2CharacterSet11[] = { 0x03a1, 0x03c1, 0x03f1, 0 };
-const UChar32 ucs2CharacterSet12[] = { 0x03a3, 0x03c2, 0x03c3, 0 };
-const UChar32 ucs2CharacterSet13[] = { 0x03a6, 0x03c6, 0x03d5, 0 };
-const UChar32 ucs2CharacterSet14[] = { 0x1e60, 0x1e61, 0x1e9b, 0 };
-
-static const size_t UCS2_CANONICALIZATION_SETS = 15;
-const UChar32* const ucs2CharacterSetInfo[UCS2_CANONICALIZATION_SETS] = {
- ucs2CharacterSet0,
- ucs2CharacterSet1,
- ucs2CharacterSet2,
- ucs2CharacterSet3,
- ucs2CharacterSet4,
- ucs2CharacterSet5,
- ucs2CharacterSet6,
- ucs2CharacterSet7,
- ucs2CharacterSet8,
- ucs2CharacterSet9,
- ucs2CharacterSet10,
- ucs2CharacterSet11,
- ucs2CharacterSet12,
- ucs2CharacterSet13,
- ucs2CharacterSet14,
-};
-
-const size_t UCS2_CANONICALIZATION_RANGES = 391;
-const CanonicalizationRange ucs2RangeInfo[UCS2_CANONICALIZATION_RANGES] = {
- { 0x0000, 0x0040, 0x0000, CanonicalizeUnique },
- { 0x0041, 0x005a, 0x0020, CanonicalizeRangeLo },
- { 0x005b, 0x0060, 0x0000, CanonicalizeUnique },
- { 0x0061, 0x007a, 0x0020, CanonicalizeRangeHi },
- { 0x007b, 0x00b4, 0x0000, CanonicalizeUnique },
- { 0x00b5, 0x00b5, 0x0009, CanonicalizeSet },
- { 0x00b6, 0x00bf, 0x0000, CanonicalizeUnique },
- { 0x00c0, 0x00d6, 0x0020, CanonicalizeRangeLo },
- { 0x00d7, 0x00d7, 0x0000, CanonicalizeUnique },
- { 0x00d8, 0x00de, 0x0020, CanonicalizeRangeLo },
- { 0x00df, 0x00df, 0x0000, CanonicalizeUnique },
- { 0x00e0, 0x00f6, 0x0020, CanonicalizeRangeHi },
- { 0x00f7, 0x00f7, 0x0000, CanonicalizeUnique },
- { 0x00f8, 0x00fe, 0x0020, CanonicalizeRangeHi },
- { 0x00ff, 0x00ff, 0x0079, CanonicalizeRangeLo },
- { 0x0100, 0x012f, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0130, 0x0131, 0x0000, CanonicalizeUnique },
- { 0x0132, 0x0137, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0138, 0x0138, 0x0000, CanonicalizeUnique },
- { 0x0139, 0x0148, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x0149, 0x0149, 0x0000, CanonicalizeUnique },
- { 0x014a, 0x0177, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0178, 0x0178, 0x0079, CanonicalizeRangeHi },
- { 0x0179, 0x017e, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x017f, 0x017f, 0x0000, CanonicalizeUnique },
- { 0x0180, 0x0180, 0x00c3, CanonicalizeRangeLo },
- { 0x0181, 0x0181, 0x00d2, CanonicalizeRangeLo },
- { 0x0182, 0x0185, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0186, 0x0186, 0x00ce, CanonicalizeRangeLo },
- { 0x0187, 0x0188, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x0189, 0x018a, 0x00cd, CanonicalizeRangeLo },
- { 0x018b, 0x018c, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x018d, 0x018d, 0x0000, CanonicalizeUnique },
- { 0x018e, 0x018e, 0x004f, CanonicalizeRangeLo },
- { 0x018f, 0x018f, 0x00ca, CanonicalizeRangeLo },
- { 0x0190, 0x0190, 0x00cb, CanonicalizeRangeLo },
- { 0x0191, 0x0192, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x0193, 0x0193, 0x00cd, CanonicalizeRangeLo },
- { 0x0194, 0x0194, 0x00cf, CanonicalizeRangeLo },
- { 0x0195, 0x0195, 0x0061, CanonicalizeRangeLo },
- { 0x0196, 0x0196, 0x00d3, CanonicalizeRangeLo },
- { 0x0197, 0x0197, 0x00d1, CanonicalizeRangeLo },
- { 0x0198, 0x0199, 0x0000, CanonicalizeAlternatingAligned },
- { 0x019a, 0x019a, 0x00a3, CanonicalizeRangeLo },
- { 0x019b, 0x019b, 0x0000, CanonicalizeUnique },
- { 0x019c, 0x019c, 0x00d3, CanonicalizeRangeLo },
- { 0x019d, 0x019d, 0x00d5, CanonicalizeRangeLo },
- { 0x019e, 0x019e, 0x0082, CanonicalizeRangeLo },
- { 0x019f, 0x019f, 0x00d6, CanonicalizeRangeLo },
- { 0x01a0, 0x01a5, 0x0000, CanonicalizeAlternatingAligned },
- { 0x01a6, 0x01a6, 0x00da, CanonicalizeRangeLo },
- { 0x01a7, 0x01a8, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x01a9, 0x01a9, 0x00da, CanonicalizeRangeLo },
- { 0x01aa, 0x01ab, 0x0000, CanonicalizeUnique },
- { 0x01ac, 0x01ad, 0x0000, CanonicalizeAlternatingAligned },
- { 0x01ae, 0x01ae, 0x00da, CanonicalizeRangeLo },
- { 0x01af, 0x01b0, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x01b1, 0x01b2, 0x00d9, CanonicalizeRangeLo },
- { 0x01b3, 0x01b6, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x01b7, 0x01b7, 0x00db, CanonicalizeRangeLo },
- { 0x01b8, 0x01b9, 0x0000, CanonicalizeAlternatingAligned },
- { 0x01ba, 0x01bb, 0x0000, CanonicalizeUnique },
- { 0x01bc, 0x01bd, 0x0000, CanonicalizeAlternatingAligned },
- { 0x01be, 0x01be, 0x0000, CanonicalizeUnique },
- { 0x01bf, 0x01bf, 0x0038, CanonicalizeRangeLo },
- { 0x01c0, 0x01c3, 0x0000, CanonicalizeUnique },
- { 0x01c4, 0x01c6, 0x0000, CanonicalizeSet },
- { 0x01c7, 0x01c9, 0x0001, CanonicalizeSet },
- { 0x01ca, 0x01cc, 0x0002, CanonicalizeSet },
- { 0x01cd, 0x01dc, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x01dd, 0x01dd, 0x004f, CanonicalizeRangeHi },
- { 0x01de, 0x01ef, 0x0000, CanonicalizeAlternatingAligned },
- { 0x01f0, 0x01f0, 0x0000, CanonicalizeUnique },
- { 0x01f1, 0x01f3, 0x0003, CanonicalizeSet },
- { 0x01f4, 0x01f5, 0x0000, CanonicalizeAlternatingAligned },
- { 0x01f6, 0x01f6, 0x0061, CanonicalizeRangeHi },
- { 0x01f7, 0x01f7, 0x0038, CanonicalizeRangeHi },
- { 0x01f8, 0x021f, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0220, 0x0220, 0x0082, CanonicalizeRangeHi },
- { 0x0221, 0x0221, 0x0000, CanonicalizeUnique },
- { 0x0222, 0x0233, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0234, 0x0239, 0x0000, CanonicalizeUnique },
- { 0x023a, 0x023a, 0x2a2b, CanonicalizeRangeLo },
- { 0x023b, 0x023c, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x023d, 0x023d, 0x00a3, CanonicalizeRangeHi },
- { 0x023e, 0x023e, 0x2a28, CanonicalizeRangeLo },
- { 0x023f, 0x0240, 0x2a3f, CanonicalizeRangeLo },
- { 0x0241, 0x0242, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x0243, 0x0243, 0x00c3, CanonicalizeRangeHi },
- { 0x0244, 0x0244, 0x0045, CanonicalizeRangeLo },
- { 0x0245, 0x0245, 0x0047, CanonicalizeRangeLo },
- { 0x0246, 0x024f, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0250, 0x0250, 0x2a1f, CanonicalizeRangeLo },
- { 0x0251, 0x0251, 0x2a1c, CanonicalizeRangeLo },
- { 0x0252, 0x0252, 0x2a1e, CanonicalizeRangeLo },
- { 0x0253, 0x0253, 0x00d2, CanonicalizeRangeHi },
- { 0x0254, 0x0254, 0x00ce, CanonicalizeRangeHi },
- { 0x0255, 0x0255, 0x0000, CanonicalizeUnique },
- { 0x0256, 0x0257, 0x00cd, CanonicalizeRangeHi },
- { 0x0258, 0x0258, 0x0000, CanonicalizeUnique },
- { 0x0259, 0x0259, 0x00ca, CanonicalizeRangeHi },
- { 0x025a, 0x025a, 0x0000, CanonicalizeUnique },
- { 0x025b, 0x025b, 0x00cb, CanonicalizeRangeHi },
- { 0x025c, 0x025c, 0xa54f, CanonicalizeRangeLo },
- { 0x025d, 0x025f, 0x0000, CanonicalizeUnique },
- { 0x0260, 0x0260, 0x00cd, CanonicalizeRangeHi },
- { 0x0261, 0x0261, 0xa54b, CanonicalizeRangeLo },
- { 0x0262, 0x0262, 0x0000, CanonicalizeUnique },
- { 0x0263, 0x0263, 0x00cf, CanonicalizeRangeHi },
- { 0x0264, 0x0264, 0x0000, CanonicalizeUnique },
- { 0x0265, 0x0265, 0xa528, CanonicalizeRangeLo },
- { 0x0266, 0x0266, 0xa544, CanonicalizeRangeLo },
- { 0x0267, 0x0267, 0x0000, CanonicalizeUnique },
- { 0x0268, 0x0268, 0x00d1, CanonicalizeRangeHi },
- { 0x0269, 0x0269, 0x00d3, CanonicalizeRangeHi },
- { 0x026a, 0x026a, 0x0000, CanonicalizeUnique },
- { 0x026b, 0x026b, 0x29f7, CanonicalizeRangeLo },
- { 0x026c, 0x026c, 0xa541, CanonicalizeRangeLo },
- { 0x026d, 0x026e, 0x0000, CanonicalizeUnique },
- { 0x026f, 0x026f, 0x00d3, CanonicalizeRangeHi },
- { 0x0270, 0x0270, 0x0000, CanonicalizeUnique },
- { 0x0271, 0x0271, 0x29fd, CanonicalizeRangeLo },
- { 0x0272, 0x0272, 0x00d5, CanonicalizeRangeHi },
- { 0x0273, 0x0274, 0x0000, CanonicalizeUnique },
- { 0x0275, 0x0275, 0x00d6, CanonicalizeRangeHi },
- { 0x0276, 0x027c, 0x0000, CanonicalizeUnique },
- { 0x027d, 0x027d, 0x29e7, CanonicalizeRangeLo },
- { 0x027e, 0x027f, 0x0000, CanonicalizeUnique },
- { 0x0280, 0x0280, 0x00da, CanonicalizeRangeHi },
- { 0x0281, 0x0282, 0x0000, CanonicalizeUnique },
- { 0x0283, 0x0283, 0x00da, CanonicalizeRangeHi },
- { 0x0284, 0x0286, 0x0000, CanonicalizeUnique },
- { 0x0287, 0x0287, 0xa52a, CanonicalizeRangeLo },
- { 0x0288, 0x0288, 0x00da, CanonicalizeRangeHi },
- { 0x0289, 0x0289, 0x0045, CanonicalizeRangeHi },
- { 0x028a, 0x028b, 0x00d9, CanonicalizeRangeHi },
- { 0x028c, 0x028c, 0x0047, CanonicalizeRangeHi },
- { 0x028d, 0x0291, 0x0000, CanonicalizeUnique },
- { 0x0292, 0x0292, 0x00db, CanonicalizeRangeHi },
- { 0x0293, 0x029d, 0x0000, CanonicalizeUnique },
- { 0x029e, 0x029e, 0xa512, CanonicalizeRangeLo },
- { 0x029f, 0x0344, 0x0000, CanonicalizeUnique },
- { 0x0345, 0x0345, 0x0007, CanonicalizeSet },
- { 0x0346, 0x036f, 0x0000, CanonicalizeUnique },
- { 0x0370, 0x0373, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0374, 0x0375, 0x0000, CanonicalizeUnique },
- { 0x0376, 0x0377, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0378, 0x037a, 0x0000, CanonicalizeUnique },
- { 0x037b, 0x037d, 0x0082, CanonicalizeRangeLo },
- { 0x037e, 0x037e, 0x0000, CanonicalizeUnique },
- { 0x037f, 0x037f, 0x0074, CanonicalizeRangeLo },
- { 0x0380, 0x0385, 0x0000, CanonicalizeUnique },
- { 0x0386, 0x0386, 0x0026, CanonicalizeRangeLo },
- { 0x0387, 0x0387, 0x0000, CanonicalizeUnique },
- { 0x0388, 0x038a, 0x0025, CanonicalizeRangeLo },
- { 0x038b, 0x038b, 0x0000, CanonicalizeUnique },
- { 0x038c, 0x038c, 0x0040, CanonicalizeRangeLo },
- { 0x038d, 0x038d, 0x0000, CanonicalizeUnique },
- { 0x038e, 0x038f, 0x003f, CanonicalizeRangeLo },
- { 0x0390, 0x0390, 0x0000, CanonicalizeUnique },
- { 0x0391, 0x0391, 0x0020, CanonicalizeRangeLo },
- { 0x0392, 0x0392, 0x0004, CanonicalizeSet },
- { 0x0393, 0x0394, 0x0020, CanonicalizeRangeLo },
- { 0x0395, 0x0395, 0x0005, CanonicalizeSet },
- { 0x0396, 0x0397, 0x0020, CanonicalizeRangeLo },
- { 0x0398, 0x0398, 0x0006, CanonicalizeSet },
- { 0x0399, 0x0399, 0x0007, CanonicalizeSet },
- { 0x039a, 0x039a, 0x0008, CanonicalizeSet },
- { 0x039b, 0x039b, 0x0020, CanonicalizeRangeLo },
- { 0x039c, 0x039c, 0x0009, CanonicalizeSet },
- { 0x039d, 0x039f, 0x0020, CanonicalizeRangeLo },
- { 0x03a0, 0x03a0, 0x000a, CanonicalizeSet },
- { 0x03a1, 0x03a1, 0x000b, CanonicalizeSet },
- { 0x03a2, 0x03a2, 0x0000, CanonicalizeUnique },
- { 0x03a3, 0x03a3, 0x000c, CanonicalizeSet },
- { 0x03a4, 0x03a5, 0x0020, CanonicalizeRangeLo },
- { 0x03a6, 0x03a6, 0x000d, CanonicalizeSet },
- { 0x03a7, 0x03ab, 0x0020, CanonicalizeRangeLo },
- { 0x03ac, 0x03ac, 0x0026, CanonicalizeRangeHi },
- { 0x03ad, 0x03af, 0x0025, CanonicalizeRangeHi },
- { 0x03b0, 0x03b0, 0x0000, CanonicalizeUnique },
- { 0x03b1, 0x03b1, 0x0020, CanonicalizeRangeHi },
- { 0x03b2, 0x03b2, 0x0004, CanonicalizeSet },
- { 0x03b3, 0x03b4, 0x0020, CanonicalizeRangeHi },
- { 0x03b5, 0x03b5, 0x0005, CanonicalizeSet },
- { 0x03b6, 0x03b7, 0x0020, CanonicalizeRangeHi },
- { 0x03b8, 0x03b8, 0x0006, CanonicalizeSet },
- { 0x03b9, 0x03b9, 0x0007, CanonicalizeSet },
- { 0x03ba, 0x03ba, 0x0008, CanonicalizeSet },
- { 0x03bb, 0x03bb, 0x0020, CanonicalizeRangeHi },
- { 0x03bc, 0x03bc, 0x0009, CanonicalizeSet },
- { 0x03bd, 0x03bf, 0x0020, CanonicalizeRangeHi },
- { 0x03c0, 0x03c0, 0x000a, CanonicalizeSet },
- { 0x03c1, 0x03c1, 0x000b, CanonicalizeSet },
- { 0x03c2, 0x03c3, 0x000c, CanonicalizeSet },
- { 0x03c4, 0x03c5, 0x0020, CanonicalizeRangeHi },
- { 0x03c6, 0x03c6, 0x000d, CanonicalizeSet },
- { 0x03c7, 0x03cb, 0x0020, CanonicalizeRangeHi },
- { 0x03cc, 0x03cc, 0x0040, CanonicalizeRangeHi },
- { 0x03cd, 0x03ce, 0x003f, CanonicalizeRangeHi },
- { 0x03cf, 0x03cf, 0x0008, CanonicalizeRangeLo },
- { 0x03d0, 0x03d0, 0x0004, CanonicalizeSet },
- { 0x03d1, 0x03d1, 0x0006, CanonicalizeSet },
- { 0x03d2, 0x03d4, 0x0000, CanonicalizeUnique },
- { 0x03d5, 0x03d5, 0x000d, CanonicalizeSet },
- { 0x03d6, 0x03d6, 0x000a, CanonicalizeSet },
- { 0x03d7, 0x03d7, 0x0008, CanonicalizeRangeHi },
- { 0x03d8, 0x03ef, 0x0000, CanonicalizeAlternatingAligned },
- { 0x03f0, 0x03f0, 0x0008, CanonicalizeSet },
- { 0x03f1, 0x03f1, 0x000b, CanonicalizeSet },
- { 0x03f2, 0x03f2, 0x0007, CanonicalizeRangeLo },
- { 0x03f3, 0x03f3, 0x0074, CanonicalizeRangeHi },
- { 0x03f4, 0x03f4, 0x0000, CanonicalizeUnique },
- { 0x03f5, 0x03f5, 0x0005, CanonicalizeSet },
- { 0x03f6, 0x03f6, 0x0000, CanonicalizeUnique },
- { 0x03f7, 0x03f8, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x03f9, 0x03f9, 0x0007, CanonicalizeRangeHi },
- { 0x03fa, 0x03fb, 0x0000, CanonicalizeAlternatingAligned },
- { 0x03fc, 0x03fc, 0x0000, CanonicalizeUnique },
- { 0x03fd, 0x03ff, 0x0082, CanonicalizeRangeHi },
- { 0x0400, 0x040f, 0x0050, CanonicalizeRangeLo },
- { 0x0410, 0x042f, 0x0020, CanonicalizeRangeLo },
- { 0x0430, 0x044f, 0x0020, CanonicalizeRangeHi },
- { 0x0450, 0x045f, 0x0050, CanonicalizeRangeHi },
- { 0x0460, 0x0481, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0482, 0x0489, 0x0000, CanonicalizeUnique },
- { 0x048a, 0x04bf, 0x0000, CanonicalizeAlternatingAligned },
- { 0x04c0, 0x04c0, 0x000f, CanonicalizeRangeLo },
- { 0x04c1, 0x04ce, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x04cf, 0x04cf, 0x000f, CanonicalizeRangeHi },
- { 0x04d0, 0x052f, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0530, 0x0530, 0x0000, CanonicalizeUnique },
- { 0x0531, 0x0556, 0x0030, CanonicalizeRangeLo },
- { 0x0557, 0x0560, 0x0000, CanonicalizeUnique },
- { 0x0561, 0x0586, 0x0030, CanonicalizeRangeHi },
- { 0x0587, 0x109f, 0x0000, CanonicalizeUnique },
- { 0x10a0, 0x10c5, 0x1c60, CanonicalizeRangeLo },
- { 0x10c6, 0x10c6, 0x0000, CanonicalizeUnique },
- { 0x10c7, 0x10c7, 0x1c60, CanonicalizeRangeLo },
- { 0x10c8, 0x10cc, 0x0000, CanonicalizeUnique },
- { 0x10cd, 0x10cd, 0x1c60, CanonicalizeRangeLo },
- { 0x10ce, 0x1d78, 0x0000, CanonicalizeUnique },
- { 0x1d79, 0x1d79, 0x8a04, CanonicalizeRangeLo },
- { 0x1d7a, 0x1d7c, 0x0000, CanonicalizeUnique },
- { 0x1d7d, 0x1d7d, 0x0ee6, CanonicalizeRangeLo },
- { 0x1d7e, 0x1dff, 0x0000, CanonicalizeUnique },
- { 0x1e00, 0x1e5f, 0x0000, CanonicalizeAlternatingAligned },
- { 0x1e60, 0x1e61, 0x000e, CanonicalizeSet },
- { 0x1e62, 0x1e95, 0x0000, CanonicalizeAlternatingAligned },
- { 0x1e96, 0x1e9a, 0x0000, CanonicalizeUnique },
- { 0x1e9b, 0x1e9b, 0x000e, CanonicalizeSet },
- { 0x1e9c, 0x1e9f, 0x0000, CanonicalizeUnique },
- { 0x1ea0, 0x1eff, 0x0000, CanonicalizeAlternatingAligned },
- { 0x1f00, 0x1f07, 0x0008, CanonicalizeRangeLo },
- { 0x1f08, 0x1f0f, 0x0008, CanonicalizeRangeHi },
- { 0x1f10, 0x1f15, 0x0008, CanonicalizeRangeLo },
- { 0x1f16, 0x1f17, 0x0000, CanonicalizeUnique },
- { 0x1f18, 0x1f1d, 0x0008, CanonicalizeRangeHi },
- { 0x1f1e, 0x1f1f, 0x0000, CanonicalizeUnique },
- { 0x1f20, 0x1f27, 0x0008, CanonicalizeRangeLo },
- { 0x1f28, 0x1f2f, 0x0008, CanonicalizeRangeHi },
- { 0x1f30, 0x1f37, 0x0008, CanonicalizeRangeLo },
- { 0x1f38, 0x1f3f, 0x0008, CanonicalizeRangeHi },
- { 0x1f40, 0x1f45, 0x0008, CanonicalizeRangeLo },
- { 0x1f46, 0x1f47, 0x0000, CanonicalizeUnique },
- { 0x1f48, 0x1f4d, 0x0008, CanonicalizeRangeHi },
- { 0x1f4e, 0x1f50, 0x0000, CanonicalizeUnique },
- { 0x1f51, 0x1f51, 0x0008, CanonicalizeRangeLo },
- { 0x1f52, 0x1f52, 0x0000, CanonicalizeUnique },
- { 0x1f53, 0x1f53, 0x0008, CanonicalizeRangeLo },
- { 0x1f54, 0x1f54, 0x0000, CanonicalizeUnique },
- { 0x1f55, 0x1f55, 0x0008, CanonicalizeRangeLo },
- { 0x1f56, 0x1f56, 0x0000, CanonicalizeUnique },
- { 0x1f57, 0x1f57, 0x0008, CanonicalizeRangeLo },
- { 0x1f58, 0x1f58, 0x0000, CanonicalizeUnique },
- { 0x1f59, 0x1f59, 0x0008, CanonicalizeRangeHi },
- { 0x1f5a, 0x1f5a, 0x0000, CanonicalizeUnique },
- { 0x1f5b, 0x1f5b, 0x0008, CanonicalizeRangeHi },
- { 0x1f5c, 0x1f5c, 0x0000, CanonicalizeUnique },
- { 0x1f5d, 0x1f5d, 0x0008, CanonicalizeRangeHi },
- { 0x1f5e, 0x1f5e, 0x0000, CanonicalizeUnique },
- { 0x1f5f, 0x1f5f, 0x0008, CanonicalizeRangeHi },
- { 0x1f60, 0x1f67, 0x0008, CanonicalizeRangeLo },
- { 0x1f68, 0x1f6f, 0x0008, CanonicalizeRangeHi },
- { 0x1f70, 0x1f71, 0x004a, CanonicalizeRangeLo },
- { 0x1f72, 0x1f75, 0x0056, CanonicalizeRangeLo },
- { 0x1f76, 0x1f77, 0x0064, CanonicalizeRangeLo },
- { 0x1f78, 0x1f79, 0x0080, CanonicalizeRangeLo },
- { 0x1f7a, 0x1f7b, 0x0070, CanonicalizeRangeLo },
- { 0x1f7c, 0x1f7d, 0x007e, CanonicalizeRangeLo },
- { 0x1f7e, 0x1faf, 0x0000, CanonicalizeUnique },
- { 0x1fb0, 0x1fb1, 0x0008, CanonicalizeRangeLo },
- { 0x1fb2, 0x1fb7, 0x0000, CanonicalizeUnique },
- { 0x1fb8, 0x1fb9, 0x0008, CanonicalizeRangeHi },
- { 0x1fba, 0x1fbb, 0x004a, CanonicalizeRangeHi },
- { 0x1fbc, 0x1fbd, 0x0000, CanonicalizeUnique },
- { 0x1fbe, 0x1fbe, 0x0007, CanonicalizeSet },
- { 0x1fbf, 0x1fc7, 0x0000, CanonicalizeUnique },
- { 0x1fc8, 0x1fcb, 0x0056, CanonicalizeRangeHi },
- { 0x1fcc, 0x1fcf, 0x0000, CanonicalizeUnique },
- { 0x1fd0, 0x1fd1, 0x0008, CanonicalizeRangeLo },
- { 0x1fd2, 0x1fd7, 0x0000, CanonicalizeUnique },
- { 0x1fd8, 0x1fd9, 0x0008, CanonicalizeRangeHi },
- { 0x1fda, 0x1fdb, 0x0064, CanonicalizeRangeHi },
- { 0x1fdc, 0x1fdf, 0x0000, CanonicalizeUnique },
- { 0x1fe0, 0x1fe1, 0x0008, CanonicalizeRangeLo },
- { 0x1fe2, 0x1fe4, 0x0000, CanonicalizeUnique },
- { 0x1fe5, 0x1fe5, 0x0007, CanonicalizeRangeLo },
- { 0x1fe6, 0x1fe7, 0x0000, CanonicalizeUnique },
- { 0x1fe8, 0x1fe9, 0x0008, CanonicalizeRangeHi },
- { 0x1fea, 0x1feb, 0x0070, CanonicalizeRangeHi },
- { 0x1fec, 0x1fec, 0x0007, CanonicalizeRangeHi },
- { 0x1fed, 0x1ff7, 0x0000, CanonicalizeUnique },
- { 0x1ff8, 0x1ff9, 0x0080, CanonicalizeRangeHi },
- { 0x1ffa, 0x1ffb, 0x007e, CanonicalizeRangeHi },
- { 0x1ffc, 0x2131, 0x0000, CanonicalizeUnique },
- { 0x2132, 0x2132, 0x001c, CanonicalizeRangeLo },
- { 0x2133, 0x214d, 0x0000, CanonicalizeUnique },
- { 0x214e, 0x214e, 0x001c, CanonicalizeRangeHi },
- { 0x214f, 0x215f, 0x0000, CanonicalizeUnique },
- { 0x2160, 0x216f, 0x0010, CanonicalizeRangeLo },
- { 0x2170, 0x217f, 0x0010, CanonicalizeRangeHi },
- { 0x2180, 0x2182, 0x0000, CanonicalizeUnique },
- { 0x2183, 0x2184, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x2185, 0x24b5, 0x0000, CanonicalizeUnique },
- { 0x24b6, 0x24cf, 0x001a, CanonicalizeRangeLo },
- { 0x24d0, 0x24e9, 0x001a, CanonicalizeRangeHi },
- { 0x24ea, 0x2bff, 0x0000, CanonicalizeUnique },
- { 0x2c00, 0x2c2e, 0x0030, CanonicalizeRangeLo },
- { 0x2c2f, 0x2c2f, 0x0000, CanonicalizeUnique },
- { 0x2c30, 0x2c5e, 0x0030, CanonicalizeRangeHi },
- { 0x2c5f, 0x2c5f, 0x0000, CanonicalizeUnique },
- { 0x2c60, 0x2c61, 0x0000, CanonicalizeAlternatingAligned },
- { 0x2c62, 0x2c62, 0x29f7, CanonicalizeRangeHi },
- { 0x2c63, 0x2c63, 0x0ee6, CanonicalizeRangeHi },
- { 0x2c64, 0x2c64, 0x29e7, CanonicalizeRangeHi },
- { 0x2c65, 0x2c65, 0x2a2b, CanonicalizeRangeHi },
- { 0x2c66, 0x2c66, 0x2a28, CanonicalizeRangeHi },
- { 0x2c67, 0x2c6c, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x2c6d, 0x2c6d, 0x2a1c, CanonicalizeRangeHi },
- { 0x2c6e, 0x2c6e, 0x29fd, CanonicalizeRangeHi },
- { 0x2c6f, 0x2c6f, 0x2a1f, CanonicalizeRangeHi },
- { 0x2c70, 0x2c70, 0x2a1e, CanonicalizeRangeHi },
- { 0x2c71, 0x2c71, 0x0000, CanonicalizeUnique },
- { 0x2c72, 0x2c73, 0x0000, CanonicalizeAlternatingAligned },
- { 0x2c74, 0x2c74, 0x0000, CanonicalizeUnique },
- { 0x2c75, 0x2c76, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x2c77, 0x2c7d, 0x0000, CanonicalizeUnique },
- { 0x2c7e, 0x2c7f, 0x2a3f, CanonicalizeRangeHi },
- { 0x2c80, 0x2ce3, 0x0000, CanonicalizeAlternatingAligned },
- { 0x2ce4, 0x2cea, 0x0000, CanonicalizeUnique },
- { 0x2ceb, 0x2cee, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x2cef, 0x2cf1, 0x0000, CanonicalizeUnique },
- { 0x2cf2, 0x2cf3, 0x0000, CanonicalizeAlternatingAligned },
- { 0x2cf4, 0x2cff, 0x0000, CanonicalizeUnique },
- { 0x2d00, 0x2d25, 0x1c60, CanonicalizeRangeHi },
- { 0x2d26, 0x2d26, 0x0000, CanonicalizeUnique },
- { 0x2d27, 0x2d27, 0x1c60, CanonicalizeRangeHi },
- { 0x2d28, 0x2d2c, 0x0000, CanonicalizeUnique },
- { 0x2d2d, 0x2d2d, 0x1c60, CanonicalizeRangeHi },
- { 0x2d2e, 0xa63f, 0x0000, CanonicalizeUnique },
- { 0xa640, 0xa66d, 0x0000, CanonicalizeAlternatingAligned },
- { 0xa66e, 0xa67f, 0x0000, CanonicalizeUnique },
- { 0xa680, 0xa69b, 0x0000, CanonicalizeAlternatingAligned },
- { 0xa69c, 0xa721, 0x0000, CanonicalizeUnique },
- { 0xa722, 0xa72f, 0x0000, CanonicalizeAlternatingAligned },
- { 0xa730, 0xa731, 0x0000, CanonicalizeUnique },
- { 0xa732, 0xa76f, 0x0000, CanonicalizeAlternatingAligned },
- { 0xa770, 0xa778, 0x0000, CanonicalizeUnique },
- { 0xa779, 0xa77c, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0xa77d, 0xa77d, 0x8a04, CanonicalizeRangeHi },
- { 0xa77e, 0xa787, 0x0000, CanonicalizeAlternatingAligned },
- { 0xa788, 0xa78a, 0x0000, CanonicalizeUnique },
- { 0xa78b, 0xa78c, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0xa78d, 0xa78d, 0xa528, CanonicalizeRangeHi },
- { 0xa78e, 0xa78f, 0x0000, CanonicalizeUnique },
- { 0xa790, 0xa793, 0x0000, CanonicalizeAlternatingAligned },
- { 0xa794, 0xa795, 0x0000, CanonicalizeUnique },
- { 0xa796, 0xa7a9, 0x0000, CanonicalizeAlternatingAligned },
- { 0xa7aa, 0xa7aa, 0xa544, CanonicalizeRangeHi },
- { 0xa7ab, 0xa7ab, 0xa54f, CanonicalizeRangeHi },
- { 0xa7ac, 0xa7ac, 0xa54b, CanonicalizeRangeHi },
- { 0xa7ad, 0xa7ad, 0xa541, CanonicalizeRangeHi },
- { 0xa7ae, 0xa7af, 0x0000, CanonicalizeUnique },
- { 0xa7b0, 0xa7b0, 0xa512, CanonicalizeRangeHi },
- { 0xa7b1, 0xa7b1, 0xa52a, CanonicalizeRangeHi },
- { 0xa7b2, 0xff20, 0x0000, CanonicalizeUnique },
- { 0xff21, 0xff3a, 0x0020, CanonicalizeRangeLo },
- { 0xff3b, 0xff40, 0x0000, CanonicalizeUnique },
- { 0xff41, 0xff5a, 0x0020, CanonicalizeRangeHi },
- { 0xff5b, 0xffff, 0x0000, CanonicalizeUnique },
-};
-
-const UChar32 unicodeCharacterSet0[] = { 0x0041, 0x0061, 0x1e9a, 0 };
-const UChar32 unicodeCharacterSet1[] = { 0x0046, 0x0066, 0xfb00, 0xfb01, 0xfb02, 0xfb03, 0xfb04, 0 };
-const UChar32 unicodeCharacterSet2[] = { 0x0048, 0x0068, 0x1e96, 0 };
-const UChar32 unicodeCharacterSet3[] = { 0x0049, 0x0069, 0x0131, 0 };
-const UChar32 unicodeCharacterSet4[] = { 0x004a, 0x006a, 0x01f0, 0 };
-const UChar32 unicodeCharacterSet5[] = { 0x0053, 0x0073, 0x00df, 0x017f, 0xfb05, 0xfb06, 0 };
-const UChar32 unicodeCharacterSet6[] = { 0x0054, 0x0074, 0x1e97, 0 };
-const UChar32 unicodeCharacterSet7[] = { 0x0057, 0x0077, 0x1e98, 0 };
-const UChar32 unicodeCharacterSet8[] = { 0x0059, 0x0079, 0x1e99, 0 };
-const UChar32 unicodeCharacterSet9[] = { 0x01c4, 0x01c5, 0x01c6, 0 };
-const UChar32 unicodeCharacterSet10[] = { 0x01c7, 0x01c8, 0x01c9, 0 };
-const UChar32 unicodeCharacterSet11[] = { 0x01ca, 0x01cb, 0x01cc, 0 };
-const UChar32 unicodeCharacterSet12[] = { 0x01f1, 0x01f2, 0x01f3, 0 };
-const UChar32 unicodeCharacterSet13[] = { 0x0386, 0x03ac, 0x1fb4, 0 };
-const UChar32 unicodeCharacterSet14[] = { 0x0389, 0x03ae, 0x1fc4, 0 };
-const UChar32 unicodeCharacterSet15[] = { 0x038f, 0x03ce, 0x1ff4, 0 };
-const UChar32 unicodeCharacterSet16[] = { 0x0391, 0x03b1, 0x1fb3, 0x1fb6, 0x1fb7, 0x1fbc, 0 };
-const UChar32 unicodeCharacterSet17[] = { 0x0392, 0x03b2, 0x03d0, 0 };
-const UChar32 unicodeCharacterSet18[] = { 0x0395, 0x03b5, 0x03f5, 0 };
-const UChar32 unicodeCharacterSet19[] = { 0x0397, 0x03b7, 0x1fc3, 0x1fc6, 0x1fc7, 0x1fcc, 0 };
-const UChar32 unicodeCharacterSet20[] = { 0x0398, 0x03b8, 0x03d1, 0 };
-const UChar32 unicodeCharacterSet21[] = { 0x0345, 0x0390, 0x0399, 0x03b9, 0x1fbe, 0x1fd2, 0x1fd3, 0x1fd6, 0x1fd7, 0 };
-const UChar32 unicodeCharacterSet22[] = { 0x039a, 0x03ba, 0x03f0, 0 };
-const UChar32 unicodeCharacterSet23[] = { 0x00b5, 0x039c, 0x03bc, 0 };
-const UChar32 unicodeCharacterSet24[] = { 0x03a0, 0x03c0, 0x03d6, 0 };
-const UChar32 unicodeCharacterSet25[] = { 0x03a1, 0x03c1, 0x03f1, 0x1fe4, 0 };
-const UChar32 unicodeCharacterSet26[] = { 0x03a3, 0x03c2, 0x03c3, 0 };
-const UChar32 unicodeCharacterSet27[] = { 0x03a5, 0x03b0, 0x03c5, 0x1f50, 0x1f52, 0x1f54, 0x1f56, 0x1fe2, 0x1fe3, 0x1fe6, 0x1fe7, 0 };
-const UChar32 unicodeCharacterSet28[] = { 0x03a6, 0x03c6, 0x03d5, 0 };
-const UChar32 unicodeCharacterSet29[] = { 0x03a9, 0x03c9, 0x1ff3, 0x1ff6, 0x1ff7, 0x1ffc, 0 };
-const UChar32 unicodeCharacterSet30[] = { 0x0535, 0x0565, 0x0587, 0 };
-const UChar32 unicodeCharacterSet31[] = { 0x0544, 0x0574, 0xfb13, 0xfb14, 0xfb15, 0xfb17, 0 };
-const UChar32 unicodeCharacterSet32[] = { 0x054e, 0x057e, 0xfb16, 0 };
-const UChar32 unicodeCharacterSet33[] = { 0x1e60, 0x1e61, 0x1e9b, 0 };
-const UChar32 unicodeCharacterSet34[] = { 0x1f00, 0x1f08, 0x1f80, 0x1f88, 0 };
-const UChar32 unicodeCharacterSet35[] = { 0x1f01, 0x1f09, 0x1f81, 0x1f89, 0 };
-const UChar32 unicodeCharacterSet36[] = { 0x1f02, 0x1f0a, 0x1f82, 0x1f8a, 0 };
-const UChar32 unicodeCharacterSet37[] = { 0x1f03, 0x1f0b, 0x1f83, 0x1f8b, 0 };
-const UChar32 unicodeCharacterSet38[] = { 0x1f04, 0x1f0c, 0x1f84, 0x1f8c, 0 };
-const UChar32 unicodeCharacterSet39[] = { 0x1f05, 0x1f0d, 0x1f85, 0x1f8d, 0 };
-const UChar32 unicodeCharacterSet40[] = { 0x1f06, 0x1f0e, 0x1f86, 0x1f8e, 0 };
-const UChar32 unicodeCharacterSet41[] = { 0x1f07, 0x1f0f, 0x1f87, 0x1f8f, 0 };
-const UChar32 unicodeCharacterSet42[] = { 0x1f20, 0x1f28, 0x1f90, 0x1f98, 0 };
-const UChar32 unicodeCharacterSet43[] = { 0x1f21, 0x1f29, 0x1f91, 0x1f99, 0 };
-const UChar32 unicodeCharacterSet44[] = { 0x1f22, 0x1f2a, 0x1f92, 0x1f9a, 0 };
-const UChar32 unicodeCharacterSet45[] = { 0x1f23, 0x1f2b, 0x1f93, 0x1f9b, 0 };
-const UChar32 unicodeCharacterSet46[] = { 0x1f24, 0x1f2c, 0x1f94, 0x1f9c, 0 };
-const UChar32 unicodeCharacterSet47[] = { 0x1f25, 0x1f2d, 0x1f95, 0x1f9d, 0 };
-const UChar32 unicodeCharacterSet48[] = { 0x1f26, 0x1f2e, 0x1f96, 0x1f9e, 0 };
-const UChar32 unicodeCharacterSet49[] = { 0x1f27, 0x1f2f, 0x1f97, 0x1f9f, 0 };
-const UChar32 unicodeCharacterSet50[] = { 0x1f60, 0x1f68, 0x1fa0, 0x1fa8, 0 };
-const UChar32 unicodeCharacterSet51[] = { 0x1f61, 0x1f69, 0x1fa1, 0x1fa9, 0 };
-const UChar32 unicodeCharacterSet52[] = { 0x1f62, 0x1f6a, 0x1fa2, 0x1faa, 0 };
-const UChar32 unicodeCharacterSet53[] = { 0x1f63, 0x1f6b, 0x1fa3, 0x1fab, 0 };
-const UChar32 unicodeCharacterSet54[] = { 0x1f64, 0x1f6c, 0x1fa4, 0x1fac, 0 };
-const UChar32 unicodeCharacterSet55[] = { 0x1f65, 0x1f6d, 0x1fa5, 0x1fad, 0 };
-const UChar32 unicodeCharacterSet56[] = { 0x1f66, 0x1f6e, 0x1fa6, 0x1fae, 0 };
-const UChar32 unicodeCharacterSet57[] = { 0x1f67, 0x1f6f, 0x1fa7, 0x1faf, 0 };
-const UChar32 unicodeCharacterSet58[] = { 0x1f70, 0x1fb2, 0x1fba, 0 };
-const UChar32 unicodeCharacterSet59[] = { 0x1f74, 0x1fc2, 0x1fca, 0 };
-const UChar32 unicodeCharacterSet60[] = { 0x1f7c, 0x1ff2, 0x1ffa, 0 };
-
-static const size_t UNICODE_CANONICALIZATION_SETS = 61;
-const UChar32* const unicodeCharacterSetInfo[UNICODE_CANONICALIZATION_SETS] = {
- unicodeCharacterSet0,
- unicodeCharacterSet1,
- unicodeCharacterSet2,
- unicodeCharacterSet3,
- unicodeCharacterSet4,
- unicodeCharacterSet5,
- unicodeCharacterSet6,
- unicodeCharacterSet7,
- unicodeCharacterSet8,
- unicodeCharacterSet9,
- unicodeCharacterSet10,
- unicodeCharacterSet11,
- unicodeCharacterSet12,
- unicodeCharacterSet13,
- unicodeCharacterSet14,
- unicodeCharacterSet15,
- unicodeCharacterSet16,
- unicodeCharacterSet17,
- unicodeCharacterSet18,
- unicodeCharacterSet19,
- unicodeCharacterSet20,
- unicodeCharacterSet21,
- unicodeCharacterSet22,
- unicodeCharacterSet23,
- unicodeCharacterSet24,
- unicodeCharacterSet25,
- unicodeCharacterSet26,
- unicodeCharacterSet27,
- unicodeCharacterSet28,
- unicodeCharacterSet29,
- unicodeCharacterSet30,
- unicodeCharacterSet31,
- unicodeCharacterSet32,
- unicodeCharacterSet33,
- unicodeCharacterSet34,
- unicodeCharacterSet35,
- unicodeCharacterSet36,
- unicodeCharacterSet37,
- unicodeCharacterSet38,
- unicodeCharacterSet39,
- unicodeCharacterSet40,
- unicodeCharacterSet41,
- unicodeCharacterSet42,
- unicodeCharacterSet43,
- unicodeCharacterSet44,
- unicodeCharacterSet45,
- unicodeCharacterSet46,
- unicodeCharacterSet47,
- unicodeCharacterSet48,
- unicodeCharacterSet49,
- unicodeCharacterSet50,
- unicodeCharacterSet51,
- unicodeCharacterSet52,
- unicodeCharacterSet53,
- unicodeCharacterSet54,
- unicodeCharacterSet55,
- unicodeCharacterSet56,
- unicodeCharacterSet57,
- unicodeCharacterSet58,
- unicodeCharacterSet59,
- unicodeCharacterSet60,
-};
-
-const size_t UNICODE_CANONICALIZATION_RANGES = 585;
-const CanonicalizationRange unicodeRangeInfo[UNICODE_CANONICALIZATION_RANGES] = {
- { 0x0000, 0x0040, 0x0000, CanonicalizeUnique },
- { 0x0041, 0x0041, 0x0000, CanonicalizeSet },
- { 0x0042, 0x0045, 0x0020, CanonicalizeRangeLo },
- { 0x0046, 0x0046, 0x0001, CanonicalizeSet },
- { 0x0047, 0x0047, 0x0020, CanonicalizeRangeLo },
- { 0x0048, 0x0048, 0x0002, CanonicalizeSet },
- { 0x0049, 0x0049, 0x0003, CanonicalizeSet },
- { 0x004a, 0x004a, 0x0004, CanonicalizeSet },
- { 0x004b, 0x0052, 0x0020, CanonicalizeRangeLo },
- { 0x0053, 0x0053, 0x0005, CanonicalizeSet },
- { 0x0054, 0x0054, 0x0006, CanonicalizeSet },
- { 0x0055, 0x0056, 0x0020, CanonicalizeRangeLo },
- { 0x0057, 0x0057, 0x0007, CanonicalizeSet },
- { 0x0058, 0x0058, 0x0020, CanonicalizeRangeLo },
- { 0x0059, 0x0059, 0x0008, CanonicalizeSet },
- { 0x005a, 0x005a, 0x0020, CanonicalizeRangeLo },
- { 0x005b, 0x0060, 0x0000, CanonicalizeUnique },
- { 0x0061, 0x0061, 0x0000, CanonicalizeSet },
- { 0x0062, 0x0065, 0x0020, CanonicalizeRangeHi },
- { 0x0066, 0x0066, 0x0001, CanonicalizeSet },
- { 0x0067, 0x0067, 0x0020, CanonicalizeRangeHi },
- { 0x0068, 0x0068, 0x0002, CanonicalizeSet },
- { 0x0069, 0x0069, 0x0003, CanonicalizeSet },
- { 0x006a, 0x006a, 0x0004, CanonicalizeSet },
- { 0x006b, 0x0072, 0x0020, CanonicalizeRangeHi },
- { 0x0073, 0x0073, 0x0005, CanonicalizeSet },
- { 0x0074, 0x0074, 0x0006, CanonicalizeSet },
- { 0x0075, 0x0076, 0x0020, CanonicalizeRangeHi },
- { 0x0077, 0x0077, 0x0007, CanonicalizeSet },
- { 0x0078, 0x0078, 0x0020, CanonicalizeRangeHi },
- { 0x0079, 0x0079, 0x0008, CanonicalizeSet },
- { 0x007a, 0x007a, 0x0020, CanonicalizeRangeHi },
- { 0x007b, 0x00b4, 0x0000, CanonicalizeUnique },
- { 0x00b5, 0x00b5, 0x0017, CanonicalizeSet },
- { 0x00b6, 0x00bf, 0x0000, CanonicalizeUnique },
- { 0x00c0, 0x00d6, 0x0020, CanonicalizeRangeLo },
- { 0x00d7, 0x00d7, 0x0000, CanonicalizeUnique },
- { 0x00d8, 0x00de, 0x0020, CanonicalizeRangeLo },
- { 0x00df, 0x00df, 0x0005, CanonicalizeSet },
- { 0x00e0, 0x00f6, 0x0020, CanonicalizeRangeHi },
- { 0x00f7, 0x00f7, 0x0000, CanonicalizeUnique },
- { 0x00f8, 0x00fe, 0x0020, CanonicalizeRangeHi },
- { 0x00ff, 0x00ff, 0x0079, CanonicalizeRangeLo },
- { 0x0100, 0x012f, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0130, 0x0130, 0x0000, CanonicalizeUnique },
- { 0x0131, 0x0131, 0x0003, CanonicalizeSet },
- { 0x0132, 0x0137, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0138, 0x0138, 0x0000, CanonicalizeUnique },
- { 0x0139, 0x0148, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x0149, 0x0149, 0x0173, CanonicalizeRangeLo },
- { 0x014a, 0x0177, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0178, 0x0178, 0x0079, CanonicalizeRangeHi },
- { 0x0179, 0x017e, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x017f, 0x017f, 0x0005, CanonicalizeSet },
- { 0x0180, 0x0180, 0x00c3, CanonicalizeRangeLo },
- { 0x0181, 0x0181, 0x00d2, CanonicalizeRangeLo },
- { 0x0182, 0x0185, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0186, 0x0186, 0x00ce, CanonicalizeRangeLo },
- { 0x0187, 0x0188, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x0189, 0x018a, 0x00cd, CanonicalizeRangeLo },
- { 0x018b, 0x018c, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x018d, 0x018d, 0x0000, CanonicalizeUnique },
- { 0x018e, 0x018e, 0x004f, CanonicalizeRangeLo },
- { 0x018f, 0x018f, 0x00ca, CanonicalizeRangeLo },
- { 0x0190, 0x0190, 0x00cb, CanonicalizeRangeLo },
- { 0x0191, 0x0192, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x0193, 0x0193, 0x00cd, CanonicalizeRangeLo },
- { 0x0194, 0x0194, 0x00cf, CanonicalizeRangeLo },
- { 0x0195, 0x0195, 0x0061, CanonicalizeRangeLo },
- { 0x0196, 0x0196, 0x00d3, CanonicalizeRangeLo },
- { 0x0197, 0x0197, 0x00d1, CanonicalizeRangeLo },
- { 0x0198, 0x0199, 0x0000, CanonicalizeAlternatingAligned },
- { 0x019a, 0x019a, 0x00a3, CanonicalizeRangeLo },
- { 0x019b, 0x019b, 0x0000, CanonicalizeUnique },
- { 0x019c, 0x019c, 0x00d3, CanonicalizeRangeLo },
- { 0x019d, 0x019d, 0x00d5, CanonicalizeRangeLo },
- { 0x019e, 0x019e, 0x0082, CanonicalizeRangeLo },
- { 0x019f, 0x019f, 0x00d6, CanonicalizeRangeLo },
- { 0x01a0, 0x01a5, 0x0000, CanonicalizeAlternatingAligned },
- { 0x01a6, 0x01a6, 0x00da, CanonicalizeRangeLo },
- { 0x01a7, 0x01a8, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x01a9, 0x01a9, 0x00da, CanonicalizeRangeLo },
- { 0x01aa, 0x01ab, 0x0000, CanonicalizeUnique },
- { 0x01ac, 0x01ad, 0x0000, CanonicalizeAlternatingAligned },
- { 0x01ae, 0x01ae, 0x00da, CanonicalizeRangeLo },
- { 0x01af, 0x01b0, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x01b1, 0x01b2, 0x00d9, CanonicalizeRangeLo },
- { 0x01b3, 0x01b6, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x01b7, 0x01b7, 0x00db, CanonicalizeRangeLo },
- { 0x01b8, 0x01b9, 0x0000, CanonicalizeAlternatingAligned },
- { 0x01ba, 0x01bb, 0x0000, CanonicalizeUnique },
- { 0x01bc, 0x01bd, 0x0000, CanonicalizeAlternatingAligned },
- { 0x01be, 0x01be, 0x0000, CanonicalizeUnique },
- { 0x01bf, 0x01bf, 0x0038, CanonicalizeRangeLo },
- { 0x01c0, 0x01c3, 0x0000, CanonicalizeUnique },
- { 0x01c4, 0x01c6, 0x0009, CanonicalizeSet },
- { 0x01c7, 0x01c9, 0x000a, CanonicalizeSet },
- { 0x01ca, 0x01cc, 0x000b, CanonicalizeSet },
- { 0x01cd, 0x01dc, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x01dd, 0x01dd, 0x004f, CanonicalizeRangeHi },
- { 0x01de, 0x01ef, 0x0000, CanonicalizeAlternatingAligned },
- { 0x01f0, 0x01f0, 0x0004, CanonicalizeSet },
- { 0x01f1, 0x01f3, 0x000c, CanonicalizeSet },
- { 0x01f4, 0x01f5, 0x0000, CanonicalizeAlternatingAligned },
- { 0x01f6, 0x01f6, 0x0061, CanonicalizeRangeHi },
- { 0x01f7, 0x01f7, 0x0038, CanonicalizeRangeHi },
- { 0x01f8, 0x021f, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0220, 0x0220, 0x0082, CanonicalizeRangeHi },
- { 0x0221, 0x0221, 0x0000, CanonicalizeUnique },
- { 0x0222, 0x0233, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0234, 0x0239, 0x0000, CanonicalizeUnique },
- { 0x023a, 0x023a, 0x2a2b, CanonicalizeRangeLo },
- { 0x023b, 0x023c, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x023d, 0x023d, 0x00a3, CanonicalizeRangeHi },
- { 0x023e, 0x023e, 0x2a28, CanonicalizeRangeLo },
- { 0x023f, 0x0240, 0x2a3f, CanonicalizeRangeLo },
- { 0x0241, 0x0242, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x0243, 0x0243, 0x00c3, CanonicalizeRangeHi },
- { 0x0244, 0x0244, 0x0045, CanonicalizeRangeLo },
- { 0x0245, 0x0245, 0x0047, CanonicalizeRangeLo },
- { 0x0246, 0x024f, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0250, 0x0250, 0x2a1f, CanonicalizeRangeLo },
- { 0x0251, 0x0251, 0x2a1c, CanonicalizeRangeLo },
- { 0x0252, 0x0252, 0x2a1e, CanonicalizeRangeLo },
- { 0x0253, 0x0253, 0x00d2, CanonicalizeRangeHi },
- { 0x0254, 0x0254, 0x00ce, CanonicalizeRangeHi },
- { 0x0255, 0x0255, 0x0000, CanonicalizeUnique },
- { 0x0256, 0x0257, 0x00cd, CanonicalizeRangeHi },
- { 0x0258, 0x0258, 0x0000, CanonicalizeUnique },
- { 0x0259, 0x0259, 0x00ca, CanonicalizeRangeHi },
- { 0x025a, 0x025a, 0x0000, CanonicalizeUnique },
- { 0x025b, 0x025b, 0x00cb, CanonicalizeRangeHi },
- { 0x025c, 0x025c, 0xa54f, CanonicalizeRangeLo },
- { 0x025d, 0x025f, 0x0000, CanonicalizeUnique },
- { 0x0260, 0x0260, 0x00cd, CanonicalizeRangeHi },
- { 0x0261, 0x0261, 0xa54b, CanonicalizeRangeLo },
- { 0x0262, 0x0262, 0x0000, CanonicalizeUnique },
- { 0x0263, 0x0263, 0x00cf, CanonicalizeRangeHi },
- { 0x0264, 0x0264, 0x0000, CanonicalizeUnique },
- { 0x0265, 0x0265, 0xa528, CanonicalizeRangeLo },
- { 0x0266, 0x0266, 0xa544, CanonicalizeRangeLo },
- { 0x0267, 0x0267, 0x0000, CanonicalizeUnique },
- { 0x0268, 0x0268, 0x00d1, CanonicalizeRangeHi },
- { 0x0269, 0x0269, 0x00d3, CanonicalizeRangeHi },
- { 0x026a, 0x026a, 0x0000, CanonicalizeUnique },
- { 0x026b, 0x026b, 0x29f7, CanonicalizeRangeLo },
- { 0x026c, 0x026c, 0xa541, CanonicalizeRangeLo },
- { 0x026d, 0x026e, 0x0000, CanonicalizeUnique },
- { 0x026f, 0x026f, 0x00d3, CanonicalizeRangeHi },
- { 0x0270, 0x0270, 0x0000, CanonicalizeUnique },
- { 0x0271, 0x0271, 0x29fd, CanonicalizeRangeLo },
- { 0x0272, 0x0272, 0x00d5, CanonicalizeRangeHi },
- { 0x0273, 0x0274, 0x0000, CanonicalizeUnique },
- { 0x0275, 0x0275, 0x00d6, CanonicalizeRangeHi },
- { 0x0276, 0x027c, 0x0000, CanonicalizeUnique },
- { 0x027d, 0x027d, 0x29e7, CanonicalizeRangeLo },
- { 0x027e, 0x027f, 0x0000, CanonicalizeUnique },
- { 0x0280, 0x0280, 0x00da, CanonicalizeRangeHi },
- { 0x0281, 0x0282, 0x0000, CanonicalizeUnique },
- { 0x0283, 0x0283, 0x00da, CanonicalizeRangeHi },
- { 0x0284, 0x0286, 0x0000, CanonicalizeUnique },
- { 0x0287, 0x0287, 0xa52a, CanonicalizeRangeLo },
- { 0x0288, 0x0288, 0x00da, CanonicalizeRangeHi },
- { 0x0289, 0x0289, 0x0045, CanonicalizeRangeHi },
- { 0x028a, 0x028b, 0x00d9, CanonicalizeRangeHi },
- { 0x028c, 0x028c, 0x0047, CanonicalizeRangeHi },
- { 0x028d, 0x0291, 0x0000, CanonicalizeUnique },
- { 0x0292, 0x0292, 0x00db, CanonicalizeRangeHi },
- { 0x0293, 0x029d, 0x0000, CanonicalizeUnique },
- { 0x029e, 0x029e, 0xa512, CanonicalizeRangeLo },
- { 0x029f, 0x02bb, 0x0000, CanonicalizeUnique },
- { 0x02bc, 0x02bc, 0x0173, CanonicalizeRangeHi },
- { 0x02bd, 0x0344, 0x0000, CanonicalizeUnique },
- { 0x0345, 0x0345, 0x0015, CanonicalizeSet },
- { 0x0346, 0x036f, 0x0000, CanonicalizeUnique },
- { 0x0370, 0x0373, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0374, 0x0375, 0x0000, CanonicalizeUnique },
- { 0x0376, 0x0377, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0378, 0x037a, 0x0000, CanonicalizeUnique },
- { 0x037b, 0x037d, 0x0082, CanonicalizeRangeLo },
- { 0x037e, 0x037e, 0x0000, CanonicalizeUnique },
- { 0x037f, 0x037f, 0x0074, CanonicalizeRangeLo },
- { 0x0380, 0x0385, 0x0000, CanonicalizeUnique },
- { 0x0386, 0x0386, 0x000d, CanonicalizeSet },
- { 0x0387, 0x0387, 0x0000, CanonicalizeUnique },
- { 0x0388, 0x0388, 0x0025, CanonicalizeRangeLo },
- { 0x0389, 0x0389, 0x000e, CanonicalizeSet },
- { 0x038a, 0x038a, 0x0025, CanonicalizeRangeLo },
- { 0x038b, 0x038b, 0x0000, CanonicalizeUnique },
- { 0x038c, 0x038c, 0x0040, CanonicalizeRangeLo },
- { 0x038d, 0x038d, 0x0000, CanonicalizeUnique },
- { 0x038e, 0x038e, 0x003f, CanonicalizeRangeLo },
- { 0x038f, 0x038f, 0x000f, CanonicalizeSet },
- { 0x0390, 0x0390, 0x0015, CanonicalizeSet },
- { 0x0391, 0x0391, 0x0010, CanonicalizeSet },
- { 0x0392, 0x0392, 0x0011, CanonicalizeSet },
- { 0x0393, 0x0394, 0x0020, CanonicalizeRangeLo },
- { 0x0395, 0x0395, 0x0012, CanonicalizeSet },
- { 0x0396, 0x0396, 0x0020, CanonicalizeRangeLo },
- { 0x0397, 0x0397, 0x0013, CanonicalizeSet },
- { 0x0398, 0x0398, 0x0014, CanonicalizeSet },
- { 0x0399, 0x0399, 0x0015, CanonicalizeSet },
- { 0x039a, 0x039a, 0x0016, CanonicalizeSet },
- { 0x039b, 0x039b, 0x0020, CanonicalizeRangeLo },
- { 0x039c, 0x039c, 0x0017, CanonicalizeSet },
- { 0x039d, 0x039f, 0x0020, CanonicalizeRangeLo },
- { 0x03a0, 0x03a0, 0x0018, CanonicalizeSet },
- { 0x03a1, 0x03a1, 0x0019, CanonicalizeSet },
- { 0x03a2, 0x03a2, 0x0000, CanonicalizeUnique },
- { 0x03a3, 0x03a3, 0x001a, CanonicalizeSet },
- { 0x03a4, 0x03a4, 0x0020, CanonicalizeRangeLo },
- { 0x03a5, 0x03a5, 0x001b, CanonicalizeSet },
- { 0x03a6, 0x03a6, 0x001c, CanonicalizeSet },
- { 0x03a7, 0x03a8, 0x0020, CanonicalizeRangeLo },
- { 0x03a9, 0x03a9, 0x001d, CanonicalizeSet },
- { 0x03aa, 0x03ab, 0x0020, CanonicalizeRangeLo },
- { 0x03ac, 0x03ac, 0x000d, CanonicalizeSet },
- { 0x03ad, 0x03ad, 0x0025, CanonicalizeRangeHi },
- { 0x03ae, 0x03ae, 0x000e, CanonicalizeSet },
- { 0x03af, 0x03af, 0x0025, CanonicalizeRangeHi },
- { 0x03b0, 0x03b0, 0x001b, CanonicalizeSet },
- { 0x03b1, 0x03b1, 0x0010, CanonicalizeSet },
- { 0x03b2, 0x03b2, 0x0011, CanonicalizeSet },
- { 0x03b3, 0x03b4, 0x0020, CanonicalizeRangeHi },
- { 0x03b5, 0x03b5, 0x0012, CanonicalizeSet },
- { 0x03b6, 0x03b6, 0x0020, CanonicalizeRangeHi },
- { 0x03b7, 0x03b7, 0x0013, CanonicalizeSet },
- { 0x03b8, 0x03b8, 0x0014, CanonicalizeSet },
- { 0x03b9, 0x03b9, 0x0015, CanonicalizeSet },
- { 0x03ba, 0x03ba, 0x0016, CanonicalizeSet },
- { 0x03bb, 0x03bb, 0x0020, CanonicalizeRangeHi },
- { 0x03bc, 0x03bc, 0x0017, CanonicalizeSet },
- { 0x03bd, 0x03bf, 0x0020, CanonicalizeRangeHi },
- { 0x03c0, 0x03c0, 0x0018, CanonicalizeSet },
- { 0x03c1, 0x03c1, 0x0019, CanonicalizeSet },
- { 0x03c2, 0x03c3, 0x001a, CanonicalizeSet },
- { 0x03c4, 0x03c4, 0x0020, CanonicalizeRangeHi },
- { 0x03c5, 0x03c5, 0x001b, CanonicalizeSet },
- { 0x03c6, 0x03c6, 0x001c, CanonicalizeSet },
- { 0x03c7, 0x03c8, 0x0020, CanonicalizeRangeHi },
- { 0x03c9, 0x03c9, 0x001d, CanonicalizeSet },
- { 0x03ca, 0x03cb, 0x0020, CanonicalizeRangeHi },
- { 0x03cc, 0x03cc, 0x0040, CanonicalizeRangeHi },
- { 0x03cd, 0x03cd, 0x003f, CanonicalizeRangeHi },
- { 0x03ce, 0x03ce, 0x000f, CanonicalizeSet },
- { 0x03cf, 0x03cf, 0x0008, CanonicalizeRangeLo },
- { 0x03d0, 0x03d0, 0x0011, CanonicalizeSet },
- { 0x03d1, 0x03d1, 0x0014, CanonicalizeSet },
- { 0x03d2, 0x03d4, 0x0000, CanonicalizeUnique },
- { 0x03d5, 0x03d5, 0x001c, CanonicalizeSet },
- { 0x03d6, 0x03d6, 0x0018, CanonicalizeSet },
- { 0x03d7, 0x03d7, 0x0008, CanonicalizeRangeHi },
- { 0x03d8, 0x03ef, 0x0000, CanonicalizeAlternatingAligned },
- { 0x03f0, 0x03f0, 0x0016, CanonicalizeSet },
- { 0x03f1, 0x03f1, 0x0019, CanonicalizeSet },
- { 0x03f2, 0x03f2, 0x0007, CanonicalizeRangeLo },
- { 0x03f3, 0x03f3, 0x0074, CanonicalizeRangeHi },
- { 0x03f4, 0x03f4, 0x0000, CanonicalizeUnique },
- { 0x03f5, 0x03f5, 0x0012, CanonicalizeSet },
- { 0x03f6, 0x03f6, 0x0000, CanonicalizeUnique },
- { 0x03f7, 0x03f8, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x03f9, 0x03f9, 0x0007, CanonicalizeRangeHi },
- { 0x03fa, 0x03fb, 0x0000, CanonicalizeAlternatingAligned },
- { 0x03fc, 0x03fc, 0x0000, CanonicalizeUnique },
- { 0x03fd, 0x03ff, 0x0082, CanonicalizeRangeHi },
- { 0x0400, 0x040f, 0x0050, CanonicalizeRangeLo },
- { 0x0410, 0x042f, 0x0020, CanonicalizeRangeLo },
- { 0x0430, 0x044f, 0x0020, CanonicalizeRangeHi },
- { 0x0450, 0x045f, 0x0050, CanonicalizeRangeHi },
- { 0x0460, 0x0481, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0482, 0x0489, 0x0000, CanonicalizeUnique },
- { 0x048a, 0x04bf, 0x0000, CanonicalizeAlternatingAligned },
- { 0x04c0, 0x04c0, 0x000f, CanonicalizeRangeLo },
- { 0x04c1, 0x04ce, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x04cf, 0x04cf, 0x000f, CanonicalizeRangeHi },
- { 0x04d0, 0x052f, 0x0000, CanonicalizeAlternatingAligned },
- { 0x0530, 0x0530, 0x0000, CanonicalizeUnique },
- { 0x0531, 0x0534, 0x0030, CanonicalizeRangeLo },
- { 0x0535, 0x0535, 0x001e, CanonicalizeSet },
- { 0x0536, 0x0543, 0x0030, CanonicalizeRangeLo },
- { 0x0544, 0x0544, 0x001f, CanonicalizeSet },
- { 0x0545, 0x054d, 0x0030, CanonicalizeRangeLo },
- { 0x054e, 0x054e, 0x0020, CanonicalizeSet },
- { 0x054f, 0x0556, 0x0030, CanonicalizeRangeLo },
- { 0x0557, 0x0560, 0x0000, CanonicalizeUnique },
- { 0x0561, 0x0564, 0x0030, CanonicalizeRangeHi },
- { 0x0565, 0x0565, 0x001e, CanonicalizeSet },
- { 0x0566, 0x0573, 0x0030, CanonicalizeRangeHi },
- { 0x0574, 0x0574, 0x001f, CanonicalizeSet },
- { 0x0575, 0x057d, 0x0030, CanonicalizeRangeHi },
- { 0x057e, 0x057e, 0x0020, CanonicalizeSet },
- { 0x057f, 0x0586, 0x0030, CanonicalizeRangeHi },
- { 0x0587, 0x0587, 0x001e, CanonicalizeSet },
- { 0x0588, 0x109f, 0x0000, CanonicalizeUnique },
- { 0x10a0, 0x10c5, 0x1c60, CanonicalizeRangeLo },
- { 0x10c6, 0x10c6, 0x0000, CanonicalizeUnique },
- { 0x10c7, 0x10c7, 0x1c60, CanonicalizeRangeLo },
- { 0x10c8, 0x10cc, 0x0000, CanonicalizeUnique },
- { 0x10cd, 0x10cd, 0x1c60, CanonicalizeRangeLo },
- { 0x10ce, 0x1d78, 0x0000, CanonicalizeUnique },
- { 0x1d79, 0x1d79, 0x8a04, CanonicalizeRangeLo },
- { 0x1d7a, 0x1d7c, 0x0000, CanonicalizeUnique },
- { 0x1d7d, 0x1d7d, 0x0ee6, CanonicalizeRangeLo },
- { 0x1d7e, 0x1dff, 0x0000, CanonicalizeUnique },
- { 0x1e00, 0x1e5f, 0x0000, CanonicalizeAlternatingAligned },
- { 0x1e60, 0x1e61, 0x0021, CanonicalizeSet },
- { 0x1e62, 0x1e95, 0x0000, CanonicalizeAlternatingAligned },
- { 0x1e96, 0x1e96, 0x0002, CanonicalizeSet },
- { 0x1e97, 0x1e97, 0x0006, CanonicalizeSet },
- { 0x1e98, 0x1e98, 0x0007, CanonicalizeSet },
- { 0x1e99, 0x1e99, 0x0008, CanonicalizeSet },
- { 0x1e9a, 0x1e9a, 0x0000, CanonicalizeSet },
- { 0x1e9b, 0x1e9b, 0x0021, CanonicalizeSet },
- { 0x1e9c, 0x1e9f, 0x0000, CanonicalizeUnique },
- { 0x1ea0, 0x1eff, 0x0000, CanonicalizeAlternatingAligned },
- { 0x1f00, 0x1f00, 0x0022, CanonicalizeSet },
- { 0x1f01, 0x1f01, 0x0023, CanonicalizeSet },
- { 0x1f02, 0x1f02, 0x0024, CanonicalizeSet },
- { 0x1f03, 0x1f03, 0x0025, CanonicalizeSet },
- { 0x1f04, 0x1f04, 0x0026, CanonicalizeSet },
- { 0x1f05, 0x1f05, 0x0027, CanonicalizeSet },
- { 0x1f06, 0x1f06, 0x0028, CanonicalizeSet },
- { 0x1f07, 0x1f07, 0x0029, CanonicalizeSet },
- { 0x1f08, 0x1f08, 0x0022, CanonicalizeSet },
- { 0x1f09, 0x1f09, 0x0023, CanonicalizeSet },
- { 0x1f0a, 0x1f0a, 0x0024, CanonicalizeSet },
- { 0x1f0b, 0x1f0b, 0x0025, CanonicalizeSet },
- { 0x1f0c, 0x1f0c, 0x0026, CanonicalizeSet },
- { 0x1f0d, 0x1f0d, 0x0027, CanonicalizeSet },
- { 0x1f0e, 0x1f0e, 0x0028, CanonicalizeSet },
- { 0x1f0f, 0x1f0f, 0x0029, CanonicalizeSet },
- { 0x1f10, 0x1f15, 0x0008, CanonicalizeRangeLo },
- { 0x1f16, 0x1f17, 0x0000, CanonicalizeUnique },
- { 0x1f18, 0x1f1d, 0x0008, CanonicalizeRangeHi },
- { 0x1f1e, 0x1f1f, 0x0000, CanonicalizeUnique },
- { 0x1f20, 0x1f20, 0x002a, CanonicalizeSet },
- { 0x1f21, 0x1f21, 0x002b, CanonicalizeSet },
- { 0x1f22, 0x1f22, 0x002c, CanonicalizeSet },
- { 0x1f23, 0x1f23, 0x002d, CanonicalizeSet },
- { 0x1f24, 0x1f24, 0x002e, CanonicalizeSet },
- { 0x1f25, 0x1f25, 0x002f, CanonicalizeSet },
- { 0x1f26, 0x1f26, 0x0030, CanonicalizeSet },
- { 0x1f27, 0x1f27, 0x0031, CanonicalizeSet },
- { 0x1f28, 0x1f28, 0x002a, CanonicalizeSet },
- { 0x1f29, 0x1f29, 0x002b, CanonicalizeSet },
- { 0x1f2a, 0x1f2a, 0x002c, CanonicalizeSet },
- { 0x1f2b, 0x1f2b, 0x002d, CanonicalizeSet },
- { 0x1f2c, 0x1f2c, 0x002e, CanonicalizeSet },
- { 0x1f2d, 0x1f2d, 0x002f, CanonicalizeSet },
- { 0x1f2e, 0x1f2e, 0x0030, CanonicalizeSet },
- { 0x1f2f, 0x1f2f, 0x0031, CanonicalizeSet },
- { 0x1f30, 0x1f37, 0x0008, CanonicalizeRangeLo },
- { 0x1f38, 0x1f3f, 0x0008, CanonicalizeRangeHi },
- { 0x1f40, 0x1f45, 0x0008, CanonicalizeRangeLo },
- { 0x1f46, 0x1f47, 0x0000, CanonicalizeUnique },
- { 0x1f48, 0x1f4d, 0x0008, CanonicalizeRangeHi },
- { 0x1f4e, 0x1f4f, 0x0000, CanonicalizeUnique },
- { 0x1f50, 0x1f50, 0x001b, CanonicalizeSet },
- { 0x1f51, 0x1f51, 0x0008, CanonicalizeRangeLo },
- { 0x1f52, 0x1f52, 0x001b, CanonicalizeSet },
- { 0x1f53, 0x1f53, 0x0008, CanonicalizeRangeLo },
- { 0x1f54, 0x1f54, 0x001b, CanonicalizeSet },
- { 0x1f55, 0x1f55, 0x0008, CanonicalizeRangeLo },
- { 0x1f56, 0x1f56, 0x001b, CanonicalizeSet },
- { 0x1f57, 0x1f57, 0x0008, CanonicalizeRangeLo },
- { 0x1f58, 0x1f58, 0x0000, CanonicalizeUnique },
- { 0x1f59, 0x1f59, 0x0008, CanonicalizeRangeHi },
- { 0x1f5a, 0x1f5a, 0x0000, CanonicalizeUnique },
- { 0x1f5b, 0x1f5b, 0x0008, CanonicalizeRangeHi },
- { 0x1f5c, 0x1f5c, 0x0000, CanonicalizeUnique },
- { 0x1f5d, 0x1f5d, 0x0008, CanonicalizeRangeHi },
- { 0x1f5e, 0x1f5e, 0x0000, CanonicalizeUnique },
- { 0x1f5f, 0x1f5f, 0x0008, CanonicalizeRangeHi },
- { 0x1f60, 0x1f60, 0x0032, CanonicalizeSet },
- { 0x1f61, 0x1f61, 0x0033, CanonicalizeSet },
- { 0x1f62, 0x1f62, 0x0034, CanonicalizeSet },
- { 0x1f63, 0x1f63, 0x0035, CanonicalizeSet },
- { 0x1f64, 0x1f64, 0x0036, CanonicalizeSet },
- { 0x1f65, 0x1f65, 0x0037, CanonicalizeSet },
- { 0x1f66, 0x1f66, 0x0038, CanonicalizeSet },
- { 0x1f67, 0x1f67, 0x0039, CanonicalizeSet },
- { 0x1f68, 0x1f68, 0x0032, CanonicalizeSet },
- { 0x1f69, 0x1f69, 0x0033, CanonicalizeSet },
- { 0x1f6a, 0x1f6a, 0x0034, CanonicalizeSet },
- { 0x1f6b, 0x1f6b, 0x0035, CanonicalizeSet },
- { 0x1f6c, 0x1f6c, 0x0036, CanonicalizeSet },
- { 0x1f6d, 0x1f6d, 0x0037, CanonicalizeSet },
- { 0x1f6e, 0x1f6e, 0x0038, CanonicalizeSet },
- { 0x1f6f, 0x1f6f, 0x0039, CanonicalizeSet },
- { 0x1f70, 0x1f70, 0x003a, CanonicalizeSet },
- { 0x1f71, 0x1f71, 0x004a, CanonicalizeRangeLo },
- { 0x1f72, 0x1f73, 0x0056, CanonicalizeRangeLo },
- { 0x1f74, 0x1f74, 0x003b, CanonicalizeSet },
- { 0x1f75, 0x1f75, 0x0056, CanonicalizeRangeLo },
- { 0x1f76, 0x1f77, 0x0064, CanonicalizeRangeLo },
- { 0x1f78, 0x1f79, 0x0080, CanonicalizeRangeLo },
- { 0x1f7a, 0x1f7b, 0x0070, CanonicalizeRangeLo },
- { 0x1f7c, 0x1f7c, 0x003c, CanonicalizeSet },
- { 0x1f7d, 0x1f7d, 0x007e, CanonicalizeRangeLo },
- { 0x1f7e, 0x1f7f, 0x0000, CanonicalizeUnique },
- { 0x1f80, 0x1f80, 0x0022, CanonicalizeSet },
- { 0x1f81, 0x1f81, 0x0023, CanonicalizeSet },
- { 0x1f82, 0x1f82, 0x0024, CanonicalizeSet },
- { 0x1f83, 0x1f83, 0x0025, CanonicalizeSet },
- { 0x1f84, 0x1f84, 0x0026, CanonicalizeSet },
- { 0x1f85, 0x1f85, 0x0027, CanonicalizeSet },
- { 0x1f86, 0x1f86, 0x0028, CanonicalizeSet },
- { 0x1f87, 0x1f87, 0x0029, CanonicalizeSet },
- { 0x1f88, 0x1f88, 0x0022, CanonicalizeSet },
- { 0x1f89, 0x1f89, 0x0023, CanonicalizeSet },
- { 0x1f8a, 0x1f8a, 0x0024, CanonicalizeSet },
- { 0x1f8b, 0x1f8b, 0x0025, CanonicalizeSet },
- { 0x1f8c, 0x1f8c, 0x0026, CanonicalizeSet },
- { 0x1f8d, 0x1f8d, 0x0027, CanonicalizeSet },
- { 0x1f8e, 0x1f8e, 0x0028, CanonicalizeSet },
- { 0x1f8f, 0x1f8f, 0x0029, CanonicalizeSet },
- { 0x1f90, 0x1f90, 0x002a, CanonicalizeSet },
- { 0x1f91, 0x1f91, 0x002b, CanonicalizeSet },
- { 0x1f92, 0x1f92, 0x002c, CanonicalizeSet },
- { 0x1f93, 0x1f93, 0x002d, CanonicalizeSet },
- { 0x1f94, 0x1f94, 0x002e, CanonicalizeSet },
- { 0x1f95, 0x1f95, 0x002f, CanonicalizeSet },
- { 0x1f96, 0x1f96, 0x0030, CanonicalizeSet },
- { 0x1f97, 0x1f97, 0x0031, CanonicalizeSet },
- { 0x1f98, 0x1f98, 0x002a, CanonicalizeSet },
- { 0x1f99, 0x1f99, 0x002b, CanonicalizeSet },
- { 0x1f9a, 0x1f9a, 0x002c, CanonicalizeSet },
- { 0x1f9b, 0x1f9b, 0x002d, CanonicalizeSet },
- { 0x1f9c, 0x1f9c, 0x002e, CanonicalizeSet },
- { 0x1f9d, 0x1f9d, 0x002f, CanonicalizeSet },
- { 0x1f9e, 0x1f9e, 0x0030, CanonicalizeSet },
- { 0x1f9f, 0x1f9f, 0x0031, CanonicalizeSet },
- { 0x1fa0, 0x1fa0, 0x0032, CanonicalizeSet },
- { 0x1fa1, 0x1fa1, 0x0033, CanonicalizeSet },
- { 0x1fa2, 0x1fa2, 0x0034, CanonicalizeSet },
- { 0x1fa3, 0x1fa3, 0x0035, CanonicalizeSet },
- { 0x1fa4, 0x1fa4, 0x0036, CanonicalizeSet },
- { 0x1fa5, 0x1fa5, 0x0037, CanonicalizeSet },
- { 0x1fa6, 0x1fa6, 0x0038, CanonicalizeSet },
- { 0x1fa7, 0x1fa7, 0x0039, CanonicalizeSet },
- { 0x1fa8, 0x1fa8, 0x0032, CanonicalizeSet },
- { 0x1fa9, 0x1fa9, 0x0033, CanonicalizeSet },
- { 0x1faa, 0x1faa, 0x0034, CanonicalizeSet },
- { 0x1fab, 0x1fab, 0x0035, CanonicalizeSet },
- { 0x1fac, 0x1fac, 0x0036, CanonicalizeSet },
- { 0x1fad, 0x1fad, 0x0037, CanonicalizeSet },
- { 0x1fae, 0x1fae, 0x0038, CanonicalizeSet },
- { 0x1faf, 0x1faf, 0x0039, CanonicalizeSet },
- { 0x1fb0, 0x1fb1, 0x0008, CanonicalizeRangeLo },
- { 0x1fb2, 0x1fb2, 0x003a, CanonicalizeSet },
- { 0x1fb3, 0x1fb3, 0x0010, CanonicalizeSet },
- { 0x1fb4, 0x1fb4, 0x000d, CanonicalizeSet },
- { 0x1fb5, 0x1fb5, 0x0000, CanonicalizeUnique },
- { 0x1fb6, 0x1fb7, 0x0010, CanonicalizeSet },
- { 0x1fb8, 0x1fb9, 0x0008, CanonicalizeRangeHi },
- { 0x1fba, 0x1fba, 0x003a, CanonicalizeSet },
- { 0x1fbb, 0x1fbb, 0x004a, CanonicalizeRangeHi },
- { 0x1fbc, 0x1fbc, 0x0010, CanonicalizeSet },
- { 0x1fbd, 0x1fbd, 0x0000, CanonicalizeUnique },
- { 0x1fbe, 0x1fbe, 0x0015, CanonicalizeSet },
- { 0x1fbf, 0x1fc1, 0x0000, CanonicalizeUnique },
- { 0x1fc2, 0x1fc2, 0x003b, CanonicalizeSet },
- { 0x1fc3, 0x1fc3, 0x0013, CanonicalizeSet },
- { 0x1fc4, 0x1fc4, 0x000e, CanonicalizeSet },
- { 0x1fc5, 0x1fc5, 0x0000, CanonicalizeUnique },
- { 0x1fc6, 0x1fc7, 0x0013, CanonicalizeSet },
- { 0x1fc8, 0x1fc9, 0x0056, CanonicalizeRangeHi },
- { 0x1fca, 0x1fca, 0x003b, CanonicalizeSet },
- { 0x1fcb, 0x1fcb, 0x0056, CanonicalizeRangeHi },
- { 0x1fcc, 0x1fcc, 0x0013, CanonicalizeSet },
- { 0x1fcd, 0x1fcf, 0x0000, CanonicalizeUnique },
- { 0x1fd0, 0x1fd1, 0x0008, CanonicalizeRangeLo },
- { 0x1fd2, 0x1fd3, 0x0015, CanonicalizeSet },
- { 0x1fd4, 0x1fd5, 0x0000, CanonicalizeUnique },
- { 0x1fd6, 0x1fd7, 0x0015, CanonicalizeSet },
- { 0x1fd8, 0x1fd9, 0x0008, CanonicalizeRangeHi },
- { 0x1fda, 0x1fdb, 0x0064, CanonicalizeRangeHi },
- { 0x1fdc, 0x1fdf, 0x0000, CanonicalizeUnique },
- { 0x1fe0, 0x1fe1, 0x0008, CanonicalizeRangeLo },
- { 0x1fe2, 0x1fe3, 0x001b, CanonicalizeSet },
- { 0x1fe4, 0x1fe4, 0x0019, CanonicalizeSet },
- { 0x1fe5, 0x1fe5, 0x0007, CanonicalizeRangeLo },
- { 0x1fe6, 0x1fe7, 0x001b, CanonicalizeSet },
- { 0x1fe8, 0x1fe9, 0x0008, CanonicalizeRangeHi },
- { 0x1fea, 0x1feb, 0x0070, CanonicalizeRangeHi },
- { 0x1fec, 0x1fec, 0x0007, CanonicalizeRangeHi },
- { 0x1fed, 0x1ff1, 0x0000, CanonicalizeUnique },
- { 0x1ff2, 0x1ff2, 0x003c, CanonicalizeSet },
- { 0x1ff3, 0x1ff3, 0x001d, CanonicalizeSet },
- { 0x1ff4, 0x1ff4, 0x000f, CanonicalizeSet },
- { 0x1ff5, 0x1ff5, 0x0000, CanonicalizeUnique },
- { 0x1ff6, 0x1ff7, 0x001d, CanonicalizeSet },
- { 0x1ff8, 0x1ff9, 0x0080, CanonicalizeRangeHi },
- { 0x1ffa, 0x1ffa, 0x003c, CanonicalizeSet },
- { 0x1ffb, 0x1ffb, 0x007e, CanonicalizeRangeHi },
- { 0x1ffc, 0x1ffc, 0x001d, CanonicalizeSet },
- { 0x1ffd, 0x2131, 0x0000, CanonicalizeUnique },
- { 0x2132, 0x2132, 0x001c, CanonicalizeRangeLo },
- { 0x2133, 0x214d, 0x0000, CanonicalizeUnique },
- { 0x214e, 0x214e, 0x001c, CanonicalizeRangeHi },
- { 0x214f, 0x215f, 0x0000, CanonicalizeUnique },
- { 0x2160, 0x216f, 0x0010, CanonicalizeRangeLo },
- { 0x2170, 0x217f, 0x0010, CanonicalizeRangeHi },
- { 0x2180, 0x2182, 0x0000, CanonicalizeUnique },
- { 0x2183, 0x2184, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x2185, 0x24b5, 0x0000, CanonicalizeUnique },
- { 0x24b6, 0x24cf, 0x001a, CanonicalizeRangeLo },
- { 0x24d0, 0x24e9, 0x001a, CanonicalizeRangeHi },
- { 0x24ea, 0x2bff, 0x0000, CanonicalizeUnique },
- { 0x2c00, 0x2c2e, 0x0030, CanonicalizeRangeLo },
- { 0x2c2f, 0x2c2f, 0x0000, CanonicalizeUnique },
- { 0x2c30, 0x2c5e, 0x0030, CanonicalizeRangeHi },
- { 0x2c5f, 0x2c5f, 0x0000, CanonicalizeUnique },
- { 0x2c60, 0x2c61, 0x0000, CanonicalizeAlternatingAligned },
- { 0x2c62, 0x2c62, 0x29f7, CanonicalizeRangeHi },
- { 0x2c63, 0x2c63, 0x0ee6, CanonicalizeRangeHi },
- { 0x2c64, 0x2c64, 0x29e7, CanonicalizeRangeHi },
- { 0x2c65, 0x2c65, 0x2a2b, CanonicalizeRangeHi },
- { 0x2c66, 0x2c66, 0x2a28, CanonicalizeRangeHi },
- { 0x2c67, 0x2c6c, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x2c6d, 0x2c6d, 0x2a1c, CanonicalizeRangeHi },
- { 0x2c6e, 0x2c6e, 0x29fd, CanonicalizeRangeHi },
- { 0x2c6f, 0x2c6f, 0x2a1f, CanonicalizeRangeHi },
- { 0x2c70, 0x2c70, 0x2a1e, CanonicalizeRangeHi },
- { 0x2c71, 0x2c71, 0x0000, CanonicalizeUnique },
- { 0x2c72, 0x2c73, 0x0000, CanonicalizeAlternatingAligned },
- { 0x2c74, 0x2c74, 0x0000, CanonicalizeUnique },
- { 0x2c75, 0x2c76, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x2c77, 0x2c7d, 0x0000, CanonicalizeUnique },
- { 0x2c7e, 0x2c7f, 0x2a3f, CanonicalizeRangeHi },
- { 0x2c80, 0x2ce3, 0x0000, CanonicalizeAlternatingAligned },
- { 0x2ce4, 0x2cea, 0x0000, CanonicalizeUnique },
- { 0x2ceb, 0x2cee, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0x2cef, 0x2cf1, 0x0000, CanonicalizeUnique },
- { 0x2cf2, 0x2cf3, 0x0000, CanonicalizeAlternatingAligned },
- { 0x2cf4, 0x2cff, 0x0000, CanonicalizeUnique },
- { 0x2d00, 0x2d25, 0x1c60, CanonicalizeRangeHi },
- { 0x2d26, 0x2d26, 0x0000, CanonicalizeUnique },
- { 0x2d27, 0x2d27, 0x1c60, CanonicalizeRangeHi },
- { 0x2d28, 0x2d2c, 0x0000, CanonicalizeUnique },
- { 0x2d2d, 0x2d2d, 0x1c60, CanonicalizeRangeHi },
- { 0x2d2e, 0xa63f, 0x0000, CanonicalizeUnique },
- { 0xa640, 0xa66d, 0x0000, CanonicalizeAlternatingAligned },
- { 0xa66e, 0xa67f, 0x0000, CanonicalizeUnique },
- { 0xa680, 0xa69b, 0x0000, CanonicalizeAlternatingAligned },
- { 0xa69c, 0xa721, 0x0000, CanonicalizeUnique },
- { 0xa722, 0xa72f, 0x0000, CanonicalizeAlternatingAligned },
- { 0xa730, 0xa731, 0x0000, CanonicalizeUnique },
- { 0xa732, 0xa76f, 0x0000, CanonicalizeAlternatingAligned },
- { 0xa770, 0xa778, 0x0000, CanonicalizeUnique },
- { 0xa779, 0xa77c, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0xa77d, 0xa77d, 0x8a04, CanonicalizeRangeHi },
- { 0xa77e, 0xa787, 0x0000, CanonicalizeAlternatingAligned },
- { 0xa788, 0xa78a, 0x0000, CanonicalizeUnique },
- { 0xa78b, 0xa78c, 0x0000, CanonicalizeAlternatingUnaligned },
- { 0xa78d, 0xa78d, 0xa528, CanonicalizeRangeHi },
- { 0xa78e, 0xa78f, 0x0000, CanonicalizeUnique },
- { 0xa790, 0xa793, 0x0000, CanonicalizeAlternatingAligned },
- { 0xa794, 0xa795, 0x0000, CanonicalizeUnique },
- { 0xa796, 0xa7a9, 0x0000, CanonicalizeAlternatingAligned },
- { 0xa7aa, 0xa7aa, 0xa544, CanonicalizeRangeHi },
- { 0xa7ab, 0xa7ab, 0xa54f, CanonicalizeRangeHi },
- { 0xa7ac, 0xa7ac, 0xa54b, CanonicalizeRangeHi },
- { 0xa7ad, 0xa7ad, 0xa541, CanonicalizeRangeHi },
- { 0xa7ae, 0xa7af, 0x0000, CanonicalizeUnique },
- { 0xa7b0, 0xa7b0, 0xa512, CanonicalizeRangeHi },
- { 0xa7b1, 0xa7b1, 0xa52a, CanonicalizeRangeHi },
- { 0xa7b2, 0xfaff, 0x0000, CanonicalizeUnique },
- { 0xfb00, 0xfb04, 0x0001, CanonicalizeSet },
- { 0xfb05, 0xfb06, 0x0005, CanonicalizeSet },
- { 0xfb07, 0xfb12, 0x0000, CanonicalizeUnique },
- { 0xfb13, 0xfb15, 0x001f, CanonicalizeSet },
- { 0xfb16, 0xfb16, 0x0020, CanonicalizeSet },
- { 0xfb17, 0xfb17, 0x001f, CanonicalizeSet },
- { 0xfb18, 0xff20, 0x0000, CanonicalizeUnique },
- { 0xff21, 0xff3a, 0x0020, CanonicalizeRangeLo },
- { 0xff3b, 0xff40, 0x0000, CanonicalizeUnique },
- { 0xff41, 0xff5a, 0x0020, CanonicalizeRangeHi },
- { 0xff5b, 0x103ff, 0x0000, CanonicalizeUnique },
- { 0x10400, 0x10427, 0x0028, CanonicalizeRangeLo },
- { 0x10428, 0x1044f, 0x0028, CanonicalizeRangeHi },
- { 0x10450, 0x1189f, 0x0000, CanonicalizeUnique },
- { 0x118a0, 0x118bf, 0x0020, CanonicalizeRangeLo },
- { 0x118c0, 0x118df, 0x0020, CanonicalizeRangeHi },
- { 0x118e0, 0x10ffff, 0x0000, CanonicalizeUnique },
-};
-
-} } // JSC::Yarr
-
</del></span></pre></div>
<a id="trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodeh"></a>
<div class="delfile"><h4>Deleted: trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.h (197780 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.h        2016-03-08 18:19:51 UTC (rev 197780)
+++ trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.h        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -1,144 +0,0 @@
</span><del>-/*
- * Copyright (C) 2012-2016 Apple Inc. All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- * 1. Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in the
- * documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
- * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
- * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR
- * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
- * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
- * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
- * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
- * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-#ifndef YarrCanonicalizeUnicode_h
-#define YarrCanonicalizeUnicode_h
-
-#include <stdint.h>
-#include <unicode/utypes.h>
-
-namespace JSC { namespace Yarr {
-
-// This set of data (autogenerated using YarrCanonicalizeUnicode.js into YarrCanonicalizeUnicode.cpp)
-// provides information for each UCS2 code point as to the set of code points that it should
-// match under the ES5.1 case insensitive RegExp matching rules, specified in 15.10.2.8.
-enum UCS2CanonicalizationType {
-    CanonicalizeUnique, // No canonically equal values, e.g. 0x0.
-    CanonicalizeSet, // Value indicates a set in characterSetInfo.
-    CanonicalizeRangeLo, // Value is positive delta to pair, E.g. 0x41 has value 0x20, -> 0x61.
-    CanonicalizeRangeHi, // Value is positive delta to pair, E.g. 0x61 has value 0x20, -> 0x41.
-    CanonicalizeAlternatingAligned, // Aligned consequtive pair, e.g. 0x1f4,0x1f5.
-    CanonicalizeAlternatingUnaligned, // Unaligned consequtive pair, e.g. 0x241,0x242.
-};
-struct CanonicalizationRange {
-    UChar32 begin;
-    UChar32 end;
-    UChar32 value;
-    UCS2CanonicalizationType type;
-};
-
-extern const size_t UCS2_CANONICALIZATION_RANGES;
-extern const UChar32* const ucs2CharacterSetInfo[];
-extern const CanonicalizationRange ucs2RangeInfo[];
-
-extern const size_t UNICODE_CANONICALIZATION_RANGES;
-extern const UChar32* const unicodeCharacterSetInfo[];
-extern const CanonicalizationRange unicodeRangeInfo[];
-
-enum class CanonicalMode { UCS2, Unicode };
-
-inline const UChar32* canonicalCharacterSetInfo(unsigned index, CanonicalMode canonicalMode)
-{
-    const UChar32* const* rangeInfo = canonicalMode == CanonicalMode::UCS2 ? ucs2CharacterSetInfo : unicodeCharacterSetInfo;
-    return rangeInfo[index];
-}
-
-// This searches in log2 time over ~400-600 entries, so should typically result in 9 compares.
-inline const CanonicalizationRange* canonicalRangeInfoFor(UChar32 ch, CanonicalMode canonicalMode = CanonicalMode::UCS2)
-{
-    const CanonicalizationRange* info = canonicalMode == CanonicalMode::UCS2 ? ucs2RangeInfo : unicodeRangeInfo;
-    size_t entries = canonicalMode == CanonicalMode::UCS2 ? UCS2_CANONICALIZATION_RANGES : UNICODE_CANONICALIZATION_RANGES;
-
-    while (true) {
-        size_t candidate = entries >> 1;
-        const CanonicalizationRange* candidateInfo = info + candidate;
-        if (ch < candidateInfo->begin)
-            entries = candidate;
-        else if (ch <= candidateInfo->end)
-            return candidateInfo;
-        else {
-            info = candidateInfo + 1;
-            entries -= (candidate + 1);
-        }
-    }
-}
-
-// Should only be called for characters that have one canonically matching value.
-inline UChar32 getCanonicalPair(const CanonicalizationRange* info, UChar32 ch)
-{
-    ASSERT(ch >= info->begin && ch <= info->end);
-    switch (info->type) {
-    case CanonicalizeRangeLo:
-        return ch + info->value;
-    case CanonicalizeRangeHi:
-        return ch - info->value;
-    case CanonicalizeAlternatingAligned:
-        return ch ^ 1;
-    case CanonicalizeAlternatingUnaligned:
-        return ((ch - 1) ^ 1) + 1;
-    default:
-        RELEASE_ASSERT_NOT_REACHED();
-    }
-    RELEASE_ASSERT_NOT_REACHED();
-    return 0;
-}
-
-// Returns true if no other UCS2 codepoint can match this value.
-inline bool isCanonicallyUnique(UChar32 ch, CanonicalMode canonicalMode = CanonicalMode::UCS2)
-{
-    return canonicalRangeInfoFor(ch, canonicalMode)->type == CanonicalizeUnique;
-}
-
-// Returns true if values are equal, under the canonicalization rules.
-inline bool areCanonicallyEquivalent(UChar32 a, UChar32 b, CanonicalMode canonicalMode = CanonicalMode::UCS2)
-{
-    const CanonicalizationRange* info = canonicalRangeInfoFor(a, canonicalMode);
-    switch (info->type) {
-    case CanonicalizeUnique:
-        return a == b;
-    case CanonicalizeSet: {
-        for (const UChar32* set = canonicalCharacterSetInfo(info->value, canonicalMode); (a = *set); ++set) {
-            if (a == b)
-                return true;
-        }
-        return false;
-    }
-    case CanonicalizeRangeLo:
-        return (a == b) || (a + info->value == b);
-    case CanonicalizeRangeHi:
-        return (a == b) || (a - info->value == b);
-    case CanonicalizeAlternatingAligned:
-        return (a | 1) == (b | 1);
-    case CanonicalizeAlternatingUnaligned:
-        return ((a - 1) | 1) == ((b - 1) | 1);
-    }
-
-    RELEASE_ASSERT_NOT_REACHED();
-    return false;
-}
-
-} } // JSC::Yarr
-
-#endif
</del></span></pre></div>
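<p>The lookup scheme in the deleted header can be sketched in JavaScript as follows. This is an illustrative model only, not the WebKit implementation: the <code>rangeInfo</code> table below is a hypothetical, heavily simplified stand-in for the generated tables, and the type names are represented as strings rather than enum values. It shows the binary search over sorted, non-overlapping ranges and how a character's canonical partner is derived from the range type.</p>

```javascript
// Hypothetical miniature range table (assumption: simplified for illustration;
// the generated tables in the deleted .cpp file contain hundreds of rows).
const rangeInfo = [
    { begin: 0x0000, end: 0x0040, value: 0x00, type: "CanonicalizeUnique" },
    { begin: 0x0041, end: 0x005a, value: 0x20, type: "CanonicalizeRangeLo" },
    { begin: 0x005b, end: 0x0060, value: 0x00, type: "CanonicalizeUnique" },
    { begin: 0x0061, end: 0x007a, value: 0x20, type: "CanonicalizeRangeHi" },
    { begin: 0x007b, end: 0x00ff, value: 0x00, type: "CanonicalizeUnique" },
    { begin: 0x0100, end: 0x012f, value: 0x00, type: "CanonicalizeAlternatingAligned" },
    { begin: 0x0130, end: 0x0138, value: 0x00, type: "CanonicalizeUnique" },
    { begin: 0x0139, end: 0x0148, value: 0x00, type: "CanonicalizeAlternatingUnaligned" },
    { begin: 0x0149, end: 0xffff, value: 0x00, type: "CanonicalizeUnique" },
];

// Binary search over the sorted ranges, mirroring the halving strategy of
// canonicalRangeInfoFor. Returns null if no range covers ch (the C++ version
// relies on the table covering every code point instead).
function canonicalRangeInfoFor(ch, table) {
    let base = 0;
    let entries = table.length;
    while (entries > 0) {
        const candidate = entries >> 1;
        const info = table[base + candidate];
        if (ch < info.begin)
            entries = candidate;
        else if (ch <= info.end)
            return info;
        else {
            base += candidate + 1;
            entries -= candidate + 1;
        }
    }
    return null;
}

// Computes the single canonical partner for a character in a paired range.
function getCanonicalPair(info, ch) {
    switch (info.type) {
    case "CanonicalizeRangeLo":
        return ch + info.value;
    case "CanonicalizeRangeHi":
        return ch - info.value;
    case "CanonicalizeAlternatingAligned":
        return ch ^ 1; // pair starts on an even code point
    case "CanonicalizeAlternatingUnaligned":
        return ((ch - 1) ^ 1) + 1; // pair starts on an odd code point
    default:
        throw new Error("character has no single canonical pair");
    }
}
```

<p>For example, <code>getCanonicalPair(canonicalRangeInfoFor(0x41, rangeInfo), 0x41)</code> yields <code>0x61</code> under this toy table. Storing one row per run of identically-typed code points keeps the table small enough for a logarithmic search.</p>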
<a id="trunkSourceJavaScriptCoreyarrYarrCanonicalizeUnicodejs"></a>
<div class="delfile"><h4>Deleted: trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.js (197780 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.js        2016-03-08 18:19:51 UTC (rev 197780)
+++ trunk/Source/JavaScriptCore/yarr/YarrCanonicalizeUnicode.js        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -1,221 +0,0 @@
</span><del>-/*
- * Copyright (C) 2012, 2016 Apple Inc. All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- * 1. Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in the
- * documentation and/or other materials provided with the distribution.
- *
- * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY
- * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
- * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR
- * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
- * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
- * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
- * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
- * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
-function printHeader()
-{
-    var copyright = (
-        "/*" + "\n" +
-        " * Copyright (C) 2012-2013, 2015-2016 Apple Inc. All rights reserved." + "\n" +
-        " *" + "\n" +
-        " * Redistribution and use in source and binary forms, with or without" + "\n" +
-        " * modification, are permitted provided that the following conditions" + "\n" +
-        " * are met:" + "\n" +
-        " * 1. Redistributions of source code must retain the above copyright" + "\n" +
-        " * notice, this list of conditions and the following disclaimer." + "\n" +
-        " * 2. Redistributions in binary form must reproduce the above copyright" + "\n" +
-        " * notice, this list of conditions and the following disclaimer in the" + "\n" +
-        " * documentation and/or other materials provided with the distribution." + "\n" +
-        " *" + "\n" +
-        " * THIS SOFTWARE IS PROVIDED BY APPLE INC. ``AS IS'' AND ANY" + "\n" +
-        " * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE" + "\n" +
-        " * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR" + "\n" +
-        " * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR" + "\n" +
-        " * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL," + "\n" +
-        " * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO," + "\n" +
-        " * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR" + "\n" +
-        " * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY" + "\n" +
-        " * OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT" + "\n" +
-        " * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE" + "\n" +
-        " * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. " + "\n" +
-        " */");
-
-    print(copyright);
-    print();
-    print("// DO NOT EDIT! - this file autogenerated by YarrCanonicalizeUnicode.js");
-    print();
-    print('#include "config.h"');
-    print('#include "YarrCanonicalizeUnicode.h"');
-    print();
-    print("namespace JSC { namespace Yarr {");
-    print();
-    print("#include <stdint.h>");
-    print();
-}
-
-function printFooter()
-{
-    print("} } // JSC::Yarr");
-    print();
-}
-
-// Helper function to convert a number to a fixed width hex representation of a UChar32.
-function hex(x)
-{
-    var s = Number(x).toString(16);
-    while (s.length < 4)
-        s = 0 + s;
-    return "0x" + s;
-}
-
-// See ES 6.0, 21.2.2.8.2 Steps 3
-function canonicalize(ch)
-{
-    var u = String.fromCharCode(ch).toUpperCase();
-    if (u.length > 1)
-        return ch;
-    var cu = u.charCodeAt(0);
-    if (ch >= 128 && cu < 128)
-        return ch;
-    return cu;
-}
-
-// See ES 6.0, 21.2.2.8.2 Step 2
-function canonicalizeUnicode(ch)
-{
-    if (ch < 128)
-        return canonicalize(ch);
-
-    return String.fromCodePoint(ch).toUpperCase().codePointAt(0);
-}
-
-var MAX_UCS2 = 0xFFFF;
-var MAX_UNICODE = 0x10FFFF;
-
-function createUCS2CanonicalGroups()
-{
-    var groupedCanonically = [];
-    // Pass 1: populate groupedCanonically - this is mapping from canonicalized
-    // values back to the set of character code that canonicalize to them.
-    for (var i = 0; i <= MAX_UCS2; ++i) {
-        var ch = canonicalize(i);
-        if (!groupedCanonically[ch])
-            groupedCanonically[ch] = [];
-        groupedCanonically[ch].push(i);
-    }
-
-    return groupedCanonically;
-}
-
-function createUnicodeCanonicalGroups()
-{
-    var groupedCanonically = [];
-    // Pass 1: populate groupedCanonically - this is mapping from canonicalized
-    // values back to the set of character code that canonicalize to them.
-    for (var i = 0; i <= MAX_UNICODE; ++i) {
-        var ch = canonicalizeUnicode(i);
-        if (!groupedCanonically[ch])
-            groupedCanonically[ch] = [];
-        groupedCanonically[ch].push(i);
-    }
-
-    return groupedCanonically;
-}
-
-function createTables(prefix, maxValue, canonicalGroups)
-{
-    var prefixLower = prefix.toLowerCase();
-    var prefixUpper = prefix.toUpperCase();
-    var typeInfo = [];
-    var characterSetInfo = [];
-    // Pass 2: populate typeInfo & characterSetInfo. For every character calculate
-    // a typeInfo value, described by the types above, and a value payload.
-    for (cu in canonicalGroups) {
-        // The set of characters that canonicalize to cu
-        var characters = canonicalGroups[cu];
-
-        // If there is only one, it is unique.
-        if (characters.length == 1) {
-            typeInfo[characters[0]] = "CanonicalizeUnique:0";
-            continue;
-        }
-
-        // Sort the array.
-        characters.sort(function(x,y){return x-y;});
-
-        // If there are more than two characters, create an entry in characterSetInfo.
-        if (characters.length > 2) {
-            for (i in characters)
-                typeInfo[characters[i]] = "CanonicalizeSet:" + characterSetInfo.length;
-            characterSetInfo.push(characters);
-
-            continue;
-        }
-
-        // We have a pair, mark alternating ranges, otherwise track whether this is the low or high partner.
-        var lo = characters[0];
-        var hi = characters[1];
-        var delta = hi - lo;
-        if (delta == 1) {
-            var type = lo & 1 ? "CanonicalizeAlternatingUnaligned:0" : "CanonicalizeAlternatingAligned:0";
-            typeInfo[lo] = type;
-            typeInfo[hi] = type;
-        } else {
-            typeInfo[lo] = "CanonicalizeRangeLo:" + delta;
-            typeInfo[hi] = "CanonicalizeRangeHi:" + delta;
-        }
-    }
-
-    var rangeInfo = [];
-    // Pass 3: coallesce types into ranges.
-    for (var end = 0; end <= maxValue; ++end) {
-        var begin = end;
-        var type = typeInfo[end];
-        while (end < maxValue && typeInfo[end + 1] == type)
-            ++end;
-        rangeInfo.push({begin:begin, end:end, type:type});
-    }
-
-    for (i in characterSetInfo) {
-        var characters = ""
-        var set = characterSetInfo[i];
-        for (var j in set)
-            characters += hex(set[j]) + ", ";
-        print("const UChar32 " + prefixLower + "CharacterSet" + i + "[] = { " + characters + "0 };");
-    }
-    print();
-    print("static const size_t " + prefixUpper + "_CANONICALIZATION_SETS = " + characterSetInfo.length + ";");
-    print("const UChar32* const " + prefixLower + "CharacterSetInfo[" + prefixUpper + "_CANONICALIZATION_SETS] = {");
-    for (i in characterSetInfo)
-        print(" " + prefixLower + "CharacterSet" + i + ",");
-    print("};");
-    print();
-    print("const size_t " + prefixUpper + "_CANONICALIZATION_RANGES = " + rangeInfo.length + ";");
-    print("const CanonicalizationRange " + prefixLower + "RangeInfo[" + prefixUpper + "_CANONICALIZATION_RANGES] = {");
-    for (i in rangeInfo) {
-        var info = rangeInfo[i];
-        var typeAndValue = info.type.split(':');
-        print(" { " + hex(info.begin) + ", " + hex(info.end) + ", " + hex(typeAndValue[1]) + ", " + typeAndValue[0] + " },");
-    }
-    print("};");
-    print();
-}
-
-printHeader();
-
-createTables("UCS2", MAX_UCS2, createUCS2CanonicalGroups());
-createTables("Unicode", MAX_UNICODE, createUnicodeCanonicalGroups());
-
-printFooter();
-
</del></span></pre></div>
<a id="trunkSourceJavaScriptCoreyarrYarrInterpretercpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/yarr/YarrInterpreter.cpp (197780 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrInterpreter.cpp        2016-03-08 18:19:51 UTC (rev 197780)
+++ trunk/Source/JavaScriptCore/yarr/YarrInterpreter.cpp        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -28,7 +28,7 @@
</span><span class="cx"> #include "YarrInterpreter.h"
</span><span class="cx">
</span><span class="cx"> #include "Yarr.h"
</span><del>-#include "YarrCanonicalizeUnicode.h"
</del><ins>+#include "YarrCanonicalize.h"
</ins><span class="cx"> #include &lt;wtf/BumpPointerAllocator.h&gt;
</span><span class="cx"> #include &lt;wtf/DataLog.h&gt;
</span><span class="cx"> #include &lt;wtf/text/CString.h&gt;
</span><span class="lines">@@ -377,9 +377,10 @@
</span><span class="cx"> continue;
</span><span class="cx">
</span><span class="cx"> if (pattern->m_ignoreCase) {
</span><del>- // The definition for canonicalize (see ES 6.0, 15.10.2.8) means that
- // unicode values are never allowed to match against ascii ones.
- if (isASCII(oldCh) || isASCII(ch)) {
</del><ins>+ // See ES 6.0, 21.2.2.8.2 for the definition of Canonicalize(). For non-Unicode
+ // patterns, Unicode values are never allowed to match against ASCII ones.
+ // For Unicode, we need to check all canonical equivalents of a character.
+ if (!unicode && (isASCII(oldCh) || isASCII(ch))) {
</ins><span class="cx"> if (toASCIIUpper(oldCh) == toASCIIUpper(ch))
</span><span class="cx"> continue;
</span><span class="cx"> } else if (areCanonicallyEquivalent(oldCh, ch, unicode ? CanonicalMode::Unicode : CanonicalMode::UCS2))
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreyarrYarrJITcpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/yarr/YarrJIT.cpp (197780 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrJIT.cpp        2016-03-08 18:19:51 UTC (rev 197780)
+++ trunk/Source/JavaScriptCore/yarr/YarrJIT.cpp        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -30,7 +30,7 @@
</span><span class="cx"> #include "LinkBuffer.h"
</span><span class="cx"> #include "Options.h"
</span><span class="cx"> #include "Yarr.h"
</span><del>-#include "YarrCanonicalizeUnicode.h"
</del><ins>+#include "YarrCanonicalize.h"
</ins><span class="cx">
</span><span class="cx"> #if ENABLE(YARR_JIT)
</span><span class="cx">
</span></span></pre></div>
<a id="trunkSourceJavaScriptCoreyarrYarrPatterncpp"></a>
<div class="modfile"><h4>Modified: trunk/Source/JavaScriptCore/yarr/YarrPattern.cpp (197780 => 197781)</h4>
<pre class="diff"><span>
<span class="info">--- trunk/Source/JavaScriptCore/yarr/YarrPattern.cpp        2016-03-08 18:19:51 UTC (rev 197780)
+++ trunk/Source/JavaScriptCore/yarr/YarrPattern.cpp        2016-03-08 18:35:58 UTC (rev 197781)
</span><span class="lines">@@ -28,7 +28,7 @@
</span><span class="cx"> #include "YarrPattern.h"
</span><span class="cx">
</span><span class="cx"> #include "Yarr.h"
</span><del>-#include "YarrCanonicalizeUnicode.h"
</del><ins>+#include "YarrCanonicalize.h"
</ins><span class="cx"> #include "YarrParser.h"
</span><span class="cx"> #include &lt;wtf/Vector.h&gt;
</span><span class="cx">
</span><span class="lines">@@ -68,9 +68,14 @@
</span><span class="cx">
</span><span class="cx"> void putChar(UChar32 ch)
</span><span class="cx"> {
</span><del>- // Handle ascii cases.
- if (isASCII(ch)) {
- if (m_isCaseInsensitive && isASCIIAlpha(ch)) {
</del><ins>+ if (!m_isCaseInsensitive) {
+ addSorted(ch);
+ return;
+ }
+
+ if (m_canonicalMode == CanonicalMode::UCS2 && isASCII(ch)) {
+ // Handle ASCII cases.
+ if (isASCIIAlpha(ch)) {
</ins><span class="cx"> addSorted(m_matches, toASCIIUpper(ch));
</span><span class="cx"> addSorted(m_matches, toASCIILower(ch));
</span><span class="cx"> } else
</span><span class="lines">@@ -78,16 +83,10 @@
</span><span class="cx"> return;
</span><span class="cx"> }
</span><span class="cx">
</span><del>- // Simple case, not a case-insensitive match.
- if (!m_isCaseInsensitive) {
- addSorted(m_matchesUnicode, ch);
- return;
- }
-
</del><span class="cx"> // Add multiple matches, if necessary.
</span><span class="cx"> const CanonicalizationRange* info = canonicalRangeInfoFor(ch, m_canonicalMode);
</span><span class="cx"> if (info->type == CanonicalizeUnique)
</span><del>- addSorted(m_matchesUnicode, ch);
</del><ins>+ addSorted(ch);
</ins><span class="cx"> else
</span><span class="cx"> putUnicodeIgnoreCase(ch, info);
</span><span class="cx"> }
</span></span></pre>
</div>
</div>
</body>
</html>